Jakob Bue Bjorner [1,2], Berend Terluin [3], Andrew Trigg [4], Jinxiang Hu [5], Keri J S Brady [6], Pip Griffiths [7].
1. QualityMetric Incorporated, LLC, Johnston, RI, USA. jbjorner@qualitymetric.com.
2. University of Copenhagen, Copenhagen, Denmark. jbjorner@qualitymetric.com.
3. Amsterdam University Medical Centers, Amsterdam, The Netherlands.
4. Adelphi Values, Bollington, UK.
5. University of Kansas Medical Center, Kansas City, KS, USA.
6. Department of Health Law, Policy & Management, Boston University School of Public Health, Boston, USA.
7. IQVIA France, Paris, France.
Abstract
PURPOSE: Thresholds for meaningful within-individual change (MWIC) are useful for interpreting patient-reported outcome measures (PROMs). Transition ratings (TR) have been recommended as anchors to establish MWIC. Traditional statistical methods for analyzing MWIC, such as mean change analysis, receiver operating characteristic (ROC) analysis, and predictive modeling, ignore floor/ceiling effects and measurement error in the PROM scores and the TR item. We present a novel approach to MWIC estimation for multi-item scales using longitudinal item response theory (LIRT). METHODS: A graded response LIRT model for baseline and follow-up PROM data was expanded to include a TR item measuring latent change. The LIRT threshold parameter for the TR established the MWIC threshold on the latent metric, from which the MWIC threshold for the observed PROM score was estimated. We compared the LIRT approach and traditional methods using an example data set with a baseline and three follow-up assessments that differed in the magnitude of score improvement, the variance of score improvement, and the baseline-follow-up score correlation. RESULTS: The LIRT model provided good fit to the data. LIRT estimates of the observed PROM MWIC varied between 3 and 4 points of score improvement. In contrast, results from traditional methods varied from 2 to 10 points and were strongly associated with the proportion of respondents reporting self-rated improvement. The best agreement between methods was seen when approximately 50% rated their health as improved. CONCLUSION: Results from traditional analyses of anchor-based MWIC are affected by study conditions. LIRT constitutes a promising and more robust analytic approach to identifying thresholds for MWIC.
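To make two of the traditional methods named above concrete, the following is a minimal sketch of a mean change estimate and an ROC-based estimate of MWIC from PROM change scores and a dichotomized TR anchor. It is not the authors' LIRT approach or their code. The function names `mean_change_mwic` and `roc_mwic`, the use of Youden's J to pick the ROC cutoff, and the toy data are all assumptions made for illustration.

```python
from statistics import mean


def mean_change_mwic(change, improved):
    """Mean PROM change score among respondents whose TR indicates improvement."""
    return mean(c for c, imp in zip(change, improved) if imp)


def roc_mwic(change, improved):
    """Change-score cutoff that maximizes Youden's J (sensitivity + specificity - 1).

    A respondent is classified as improved when change >= cutoff.
    Returns (cutoff, J). Ties are resolved in favor of the smallest cutoff.
    """
    n_pos = sum(1 for imp in improved if imp)
    n_neg = len(improved) - n_pos
    best_cut, best_j = None, float("-inf")
    for cut in sorted(set(change)):
        true_pos = sum(1 for c, imp in zip(change, improved) if imp and c >= cut)
        true_neg = sum(1 for c, imp in zip(change, improved) if not imp and c < cut)
        j = true_pos / n_pos + true_neg / n_neg - 1
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j


# Invented toy data (not from the study): follow-up minus baseline PROM scores
# and a dichotomized transition rating (1 = rated as improved).
change = [0, 1, 2, 3, 4, 5, 6, 7]
improved = [0, 0, 0, 0, 1, 1, 1, 1]

print(mean_change_mwic(change, improved))
print(roc_mwic(change, improved))
```

Even on this perfectly separated toy sample the two methods return different values, because the mean change estimate reflects the whole distribution of change among improved respondents while the ROC cutoff sits at the boundary between the groups. Both operate on observed scores only, so neither accounts for measurement error in the PROM or the TR item.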