Ron D Hays, Karen L Spritzer, Steven P Reise.
Abstract
The reliable change index has been used to evaluate the significance of individual change in health-related quality of life. We estimate reliable change for two measures (physical function and emotional distress) in the Patient-Reported Outcomes Measurement Information System (PROMIS®) 29-item health-related quality of life measure (PROMIS-29 v2.1). Using two waves of data collected 3 months apart in a longitudinal observational study of chronic low back pain and chronic neck pain patients receiving chiropractic care, and simulations, we compare estimates of reliable change from classical test theory fixed standard errors with item response theory standard errors from the graded response model. We find that unless true change in the PROMIS physical function and emotional distress scales is substantial, classical test theory estimates of significant individual change are much more optimistic than estimates of change based on item response theory.
Keywords: PROMIS®; individual change; responders to treatment
Year: 2021 PMID: 34118008 PMCID: PMC8437927 DOI: 10.1007/s11336-021-09774-1
Source DB: PubMed Journal: Psychometrika ISSN: 0033-3123 Impact factor: 2.500
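The comparison described in the abstract can be illustrated with a minimal sketch. It assumes the conventional reliable change statistic: the score difference divided by the standard error of the difference, which is √2 × SEM when a single classical test theory SEM is used and √(SE₁² + SE₂²) when person-specific item response theory standard errors are used. The function names, the example scores, the SD of 8.0, and the standard errors of 4.5 are hypothetical; only the reliability of 0.86 comes from the physical function table note in this record.

```python
import math


def rci_ctt(score_t1, score_t2, sd, reliability):
    """Reliable change z statistic using one fixed classical test theory SEM."""
    sem = sd * math.sqrt(1.0 - reliability)
    return (score_t2 - score_t1) / (math.sqrt(2.0) * sem)


def rci_irt(score_t1, score_t2, se_t1, se_t2):
    """Reliable change z statistic using person-specific IRT standard errors."""
    return (score_t2 - score_t1) / math.sqrt(se_t1 ** 2 + se_t2 ** 2)


def classify(z, critical=1.96):
    """Label the direction of change; 1.96 assumes a two-tailed test at p < .05."""
    if z <= -critical:
        return "lower"
    if z >= critical:
        return "higher"
    return "same"


# Hypothetical person whose score rises by 10 points between the two waves.
z_ctt = rci_ctt(40.0, 50.0, sd=8.0, reliability=0.86)  # sd is an assumption
z_irt = rci_irt(40.0, 50.0, se_t1=4.5, se_t2=4.5)      # SEs are assumptions

print("CTT:", round(z_ctt, 2), classify(z_ctt))
print("IRT:", round(z_irt, 2), classify(z_irt))
```

With these illustrative inputs the same 10-point change is flagged as significant under the fixed standard error but not under the larger person-specific standard errors, which is the direction of disagreement the abstract reports.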
Physical functioning graded response model item parameters
| Item | Slope | Category thresholds | |||
|---|---|---|---|---|---|
| PFA11: Are you able to do chores such as vacuuming or yard work? | 4.72 | ||||
| PFA21: Are you able to go up and down stairs at a normal pace? | 3.93 | ||||
| PFA23: Are you able to go for a walk of at least 15 minutes? | 3.79 | ||||
| PFA53: Are you able to run errands and shop? | 4.29 | ||||
HealthMeasures is the official information and distribution center for PROMIS®.
PROMIS item parameters are available from help@healthmeasures.net.
Emotional distress graded response model item parameters
| Item | Slope | Category thresholds | |||
|---|---|---|---|---|---|
| EDANX01: I felt fearful | 3.60 | 0.34 | 1.09 | 1.96 | 2.70 |
| EDANX40: I found it hard to focus on anything other than my anxiety | 3.88 | 0.49 | 1.26 | 2.11 | 2.90 |
| EDANX41: My worries overwhelmed me | 3.66 | 0.36 | 1.03 | 1.78 | 2.62 |
| EDANX53: I felt uneasy | 3.66 | 0.60 | 1.56 | 2.50 | |
| EDDEP04: I felt worthless | 4.26 | 0.40 | 0.98 | 1.70 | 2.44 |
| EDDEP06: I felt helpless | 4.14 | 0.35 | 0.92 | 1.68 | 2.47 |
| EDDEP29: I felt depressed | 4.34 | 0.60 | 1.43 | 2.27 | |
| EDDEP41: I felt hopeless | 4.45 | 0.56 | 1.07 | 1.78 | 2.53 |
Item parameters above were estimated using the dataset analyzed in this paper. The intraclass correlation between the expected a posteriori standard deviations (EAP SDs) based on these parameters and the average of the EAP SDs for the depression and anxiety scales was 0.92. PROMIS item parameters are available from help@healthmeasures.net.
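To show how a slope and a set of category thresholds translate into response probabilities, here is a minimal sketch of the graded response model in its usual logistic form. It uses the EDANX01 row from the table above (slope 3.60, thresholds 0.34, 1.09, 1.96, 2.70) and assumes the thresholds are on the standardized theta metric. The function names are hypothetical, and the information function is the standard Fisher information for a polytomous item rather than anything specific to the authors' software.

```python
import math


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def grm_cumulative(theta, slope, thresholds):
    """P(response >= k) for k = 0..m, with the boundary values 1 and 0 added."""
    return [1.0] + [logistic(slope * (theta - b)) for b in thresholds] + [0.0]


def grm_category_probs(theta, slope, thresholds):
    """Probability of each response category as differences of adjacent cumulatives."""
    cum = grm_cumulative(theta, slope, thresholds)
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]


def grm_item_information(theta, slope, thresholds):
    """Fisher information: sum over categories of (dP_k/dtheta)^2 / P_k."""
    cum = grm_cumulative(theta, slope, thresholds)
    weights = [p * (1.0 - p) for p in cum]  # logistic derivative divided by the slope
    info = 0.0
    for k in range(len(cum) - 1):
        p_k = cum[k] - cum[k + 1]
        if p_k > 0.0:
            info += (slope * (weights[k] - weights[k + 1])) ** 2 / p_k
    return info


# EDANX01 ("I felt fearful") parameters as listed in the table above.
slope = 3.60
thresholds = [0.34, 1.09, 1.96, 2.70]

probs = grm_category_probs(1.0, slope, thresholds)
print([round(p, 3) for p in probs])
print(round(grm_item_information(1.0, slope, thresholds), 2))
```

Information is high near the thresholds and low far below them, which is why standard errors based on item response theory differ from person to person while the classical test theory standard error is a single value.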
Fig. 1 Physical functioning scale information curve
Percentage of individuals classified as worse, same, and better based on change from baseline to 3 months later for physical function using two-tailed and one-tailed significance tests
| Reliable change index | Worse | Same | Better |
|---|---|---|---|
| Two-tailed | | | |
| Classical test theory | 173 (9%) | 1425 (78%) | 236 (13%) |
| Item response theory | 56 (3%) | 1677 (91%) | 101 (6%) |
| One-tailed | | | |
| Classical test theory | 196 (11%) | 1366 (74%) | 272 (15%) |
| Item response theory | 112 (6%) | 1539 (84%) | 183 (10%) |
SEM = SD × √(1 − reliability). Reliability = 0.86. IRT standard errors: mean = 3.52 (range 1.92–6.88); mean = 3.61 (range 1.92–6.98)
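The note above implies a smallest score change that can reach significance under each approach. The sketch below assumes a two-tailed critical value of 1.96 and assumes the two sets of IRT standard errors in the note belong to the two assessments. Pairing mean with mean and extreme with extreme is only an illustration, not a description of any actual respondent. The sample SD needed for the classical test theory version is not shown in this record, so that function takes it as an argument and is not evaluated.

```python
import math

Z_TWO_TAILED = 1.96  # assumed critical value (two-tailed, p < .05)


def min_change_ctt(sd, reliability, z=Z_TWO_TAILED):
    """Smallest significant change with one fixed SEM = SD * sqrt(1 - reliability)."""
    sem = sd * math.sqrt(1.0 - reliability)
    return z * math.sqrt(2.0) * sem


def min_change_irt(se_first, se_second, z=Z_TWO_TAILED):
    """Smallest significant change given a person's two IRT standard errors."""
    return z * math.sqrt(se_first ** 2 + se_second ** 2)


# Standard errors taken from the physical function table note.
at_mean = min_change_irt(3.52, 3.61)
at_smallest = min_change_irt(1.92, 1.92)
at_largest = min_change_irt(6.88, 6.98)

print(round(at_smallest, 2), round(at_mean, 2), round(at_largest, 2))
```

Under these assumptions the change required for significance ranges from roughly 5 points for the most precisely measured respondents to roughly 19 points for the least precisely measured, whereas a fixed SEM applies one threshold to everyone.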
Cross-tabulation of change groups based on item response theory (columns) and classical test theory (rows) standard errors for physical function
| Classical test theory | Item response theory | | | |
|---|---|---|---|---|
| | Worse | Same | Better | Total |
| Two-tailed | | | | |
| Worse | **47** | 126 | 0 | 173 |
| Same | 9 | **1404** | 12 | 1425 |
| Better | 0 | 147 | **89** | 236 |
| Total | 56 | 1677 | 101 | 1834 |
| One-tailed | | | | |
| Worse | **98** | 98 | 0 | 196 |
| Same | 14 | **1328** | 24 | 1366 |
| Better | 0 | 113 | **159** | 272 |
| Total | 112 | 1539 | 183 | 1834 |
Bold indicates agreement between classical test theory and item response theory.
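As a consistency check on the two-tailed physical function cross-tabulation, the agreement cells can be derived from the row totals and the off-diagonal counts and then verified against the column totals. This is a minimal sketch that assumes rows are classical test theory groups, columns are item response theory groups, and the cell on the diagonal of each row is the agreement cell. The variable names are hypothetical.

```python
groups = ["worse", "same", "better"]

# Two-tailed physical function counts: (CTT row, IRT column) -> n, off-diagonal only.
off_diagonal = {
    ("worse", "same"): 126, ("worse", "better"): 0,
    ("same", "worse"): 9, ("same", "better"): 12,
    ("better", "worse"): 0, ("better", "same"): 147,
}
row_totals = {"worse": 173, "same": 1425, "better": 236}
col_totals = {"worse": 56, "same": 1677, "better": 101}
n_total = 1834

# Agreement cell = row total minus the other cells in that row.
diagonal = {
    g: row_totals[g] - sum(n for (r, _), n in off_diagonal.items() if r == g)
    for g in groups
}

# Each column must then add up to its reported total.
for g in groups:
    column_sum = diagonal[g] + sum(n for (_, c), n in off_diagonal.items() if c == g)
    assert column_sum == col_totals[g], g

agreement = sum(diagonal.values()) / n_total
print(diagonal, round(agreement, 3))
```

The derived cells reproduce the reported column totals exactly, and about 84% of respondents receive the same classification from both methods under the two-tailed test.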
Means (standard deviations) of change scores by 9 subgroups formed by cross-tabulation of item response theory (columns) and classical test theory (rows) change group in physical function
| Classical test theory | Item response theory | | |
|---|---|---|---|
| | Worse | Same | Better |
| Two-tailed | | | |
| Worse | | | NA |
| Same | | | 6.78 (0.31) |
| Better | NA | 9.83 (1.53) | |
| One-tailed | | | |
| Worse | | | NA |
| Same | | | 5.49 (0.40) |
| Better | NA | 9.04 (0.77) | |
NA not applicable because there were no observations in these cells
Bold indicates cells where classical test theory and item response theory agree.
Percentage of individuals classified as worse, same, and better based on change from baseline to 3 months later for emotional distress using two-tailed and one-tailed significance tests
| Reliable change index | Worse | Same | Better |
|---|---|---|---|
| Two-tailed | | | |
| Classical test theory | 290 (16%) | 1255 (68%) | 289 (16%) |
| Item response theory | 90 (5%) | 1651 (90%) | 93 (5%) |
| One-tailed | | | |
| Classical test theory | 324 (18%) | 1175 (64%) | 335 (18%) |
| Item response theory | 143 (8%) | 1558 (85%) | 133 (7%) |
SEM = SD × √(1 − reliability). Reliability = 0.93; 1.95; 1.96. IRT standard errors: mean = 4.02 (range 2.21–6.79); mean = 4.01 (range 2.21–6.52)
Cross-tabulation of change groups based on item response theory (columns) and classical test theory (rows) standard errors for emotional distress
| Classical test theory | Item response theory | | | |
|---|---|---|---|---|
| | Worse | Same | Better | Total |
| Two-tailed | | | | |
| Worse | **90** | 200 | 0 | 290 |
| Same | 0 | **1255** | 0 | 1255 |
| Better | 0 | 196 | **93** | 289 |
| Total | 90 | 1651 | 93 | 1834 |
| One-tailed | | | | |
| Worse | **143** | 181 | 0 | 324 |
| Same | 0 | **1175** | 0 | 1175 |
| Better | 0 | 202 | **133** | 335 |
| Total | 143 | 1558 | 133 | 1834 |
Bold indicates agreement between classical test theory and item response theory.
Means (standard deviations) of change scores by 9 subgroups formed by cross-tabulation of item response theory (columns) and classical test theory (rows) change group for emotional distress
| Classical test theory | Item response theory | | |
|---|---|---|---|
| | Worse | Same | Better |
| Two-tailed | | | |
| Worse | | | NA |
| Same | NA | | NA |
| Better | NA | 7.33 (1.69) | |
| One-tailed | | | |
| Worse | | | NA |
| Same | NA | | NA |
| Better | NA | 6.54 (1.53) | |
NA not applicable because there were no observations in these cells
Bold indicates cells where classical test theory and item response theory agree.
Number (percent) of people in different physical function and emotional distress change categories according to item response theory
| | Definitely worse | Probably worse | Same | Probably better | Definitely better |
|---|---|---|---|---|---|
| Physical function | 56 (3%) | 56 (3%) | 1539 (84%) | 82 (4%) | 101 (6%) |
| Emotional distress | 90 (5%) | 53 (3%) | 1558 (85%) | 40 (2%) | 93 (5%) |
The definitely worse and definitely better groups are defined as significant change according to item response theory standard errors and a two-tailed test. The probably worse and probably better groups are defined as change that is significant according to a one-tailed test but not the two-tailed test.
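The five-category scheme in this table can be sketched as a simple rule applied to each person's reliable change z statistic. The cutoffs of 1.96 and 1.645 assume the two-tailed and one-tailed tests are both run at p < .05, and the sketch assumes the statistic is oriented so that positive values mean improvement. The function name is hypothetical.

```python
def change_category(z, z_two_tailed=1.96, z_one_tailed=1.645):
    """Map a reliable change z statistic to one of five change categories.

    Assumes positive z means improvement. For a scale where higher scores are
    worse, such as emotional distress, reverse the sign before calling.
    """
    if z <= -z_two_tailed:
        return "definitely worse"
    if z <= -z_one_tailed:
        return "probably worse"
    if z >= z_two_tailed:
        return "definitely better"
    if z >= z_one_tailed:
        return "probably better"
    return "same"


for z in (-2.3, -1.8, 0.4, 1.7, 2.1):
    print(z, change_category(z))
```

Because the one-tailed cutoff is less stringent, everyone in a definitely group would also pass the one-tailed test. That is why the probably counts in the table equal the one-tailed counts minus the two-tailed counts, for example 112 − 56 = 56 for probably worse in physical function.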