Véronique Sébille, Lisa M Lix, Olawale F Ayilara, Tolulope T Sajobi, A Cecile J W Janssens, Richard Sawatzky, Mirjam A G Sprangers, Mathilde G E Verdam.
Abstract
PURPOSE: This work is part of an international, interdisciplinary initiative to synthesize research on response shift in results of patient-reported outcome measures. The objective is to critically examine current response shift methods. We additionally propose advancing new methods that address the limitations of extant methods.
Keywords: Methods; Operationalization; Patient-reported outcomes; Response shift
Year: 2021 PMID: 33595827 PMCID: PMC8602164 DOI: 10.1007/s11136-020-02755-4
Source DB: PubMed Journal: Qual Life Res ISSN: 0962-9343 Impact factor: 4.147
Response shift methods: Description, definition, and operationalization
| Method | Description | Definition | Operationalization |
|---|---|---|---|
| Then-test method (a design method*) | The then-test is an additional measurement at the posttest occasion. Respondents complete the same measure as they did at pretest and posttest, but now with the instruction to re-evaluate their level of pretest functioning. Group-level analysis; can accommodate subgroup analysis. | Discrepancy between observed and target change**: Yes, using the operationalization of “observed change” (posttest minus pretest) and “target change” (posttest minus then-test) | Recalibration: pretest minus then-test scores |
| Appraisal (a design method*) | Changes in cognitive appraisal can be operationalized by the repeated administration of the QoL Appraisal Profile (QOLAP), QOLAP version 2 or the Brief Appraisal Profile [ Group-level analysis; can accommodate subgroup analysis. | Discrepancy between observed and target change**: Yes, using the operationalization of “observed change” (observed QoL change) and “target change” (expected QoL change that is explained by relevant changes in health and other standard predictors of QoL) | |
| Semi-structured interview (a qualitative method) | Interview questions directed at eliciting respondents’ verbalizations of possible response shift effects. Individual-level analysis. | Discrepancy between observed and target change**: Yes, dependent on the questions, interviews may elicit reflections on observed change (pretest–posttest) and change in the target construct (reflections on that change where respondents replace earlier verbalizations by new ones claiming the latter are more true) | - would you have rated the level of your HRQOL in the same way at (name reference period) if asked at that time rather than now (in retrospect)? - does the response level “a ‘good’ day” (physically/socially/emotionally/cognitively) mean a different thing now as opposed to (name reference period)? - are some things more or less important for you now? - has the meaning of HRQOL changed for you? - are different things important to you now? (Questions, in part, taken and adapted from Beeken et al. [ |
| Schedule for the Evaluation of Individual Quality of Life (SEIQoL) (an individualized method) | The SEIQoL asks respondents to nominate the five most relevant domains to their HRQoL. They then assess their current functioning for each domain using a VAS ranging from best to worst possible functioning. Patients then rank the relative importance of each domain by allocating 100 points to the five domains, using a pie chart disc (judgment analysis can also be used). The SEIQoL generates an overall index score, which is the sum of all five domain products (multiplication of each domain’s weight by its corresponding level). If the SEIQoL is administered at two points in time, response shift can be assessed. Group- and individual-level analysis. | Discrepancy between observed and target change**: No | Recalibration: No. Reprioritization: difference in intra-class correlation coefficients between domain weights. Reconceptualization: change in frequency and content of the nominated domains over time. |
| Vignettes (a preference-based method) | Patients are asked to rate one or more anchoring vignettes, describing a particular (hypothetical) health state at different points in time (e.g., from poor to excellent). Group-level analysis; can accommodate subgroup analysis. | Discrepancy between observed and target change**: No | Reprioritization: mean change in vignette ratings. Adjusting: Not applicable. |
| Structural Equation Modeling (SEM) (a Latent Variable Method) | Requires a longitudinal dataset with at least 2 measurement occasions. Uses the factor-analytic framework to operationalize response shift in terms of change in specific model parameters; initially developed at domain (including multiple item)-level, can accommodate individual item-level analysis. Group-level analysis; can accommodate subgroup analysis. | Discrepancy between observed and target change**: Yes, using the operationalization of “observed change” (observed change in scores) and “target change” (change in the unobserved latent variables) | Uniform recalibration: intercepts. Reprioritization: values of factor loadings. Reconceptualization: pattern of factor loadings. Non-uniform recalibration: residual variances. |
| Item Response Theory (IRT)/ Rasch Measurement Theory (RMT) (a Latent Variable Method) | Requires a longitudinal dataset with at least 2 measurement occasions. Response shift is indicated by change in discrimination power (one parameter per item) and difficulty parameter (p-1 parameters for an item with p response categories). Group-level analysis; can accommodate subgroup analysis. | Discrepancy between observed and target change**: Yes, using the operationalization of “observed change” (observed change in item’s responses) and “target change” (change in the unobserved latent variables) | Recalibration: items’ difficulties. Reprioritization: discrimination power. |
| Relative Importance Analysis | Requires a longitudinal dataset with at most 2 measurement occasions and the a priori identification of two independent groups. Two test procedures were proposed: (1) changes in discriminant analysis/logistic regression coefficients over time, and (2) changes in the rank ordering of the domains over time. Uses the logistic regression or discriminant analysis framework to operationalize response shift in terms of change in the relative importance of component domains over time, in one group relative to a reference group. Group-level analysis; can accommodate subgroup analysis. | Discrepancy between observed and target change**: No | Reprioritization: statistically significant change in relative importance of a domain between two time points. |
| Classification and Regression Trees (CART) | Requires a longitudinal dataset with at least 2 measurement occasions and baseline and clinical time-varying explanatory variables to recursively partition the data into homogeneous subgroups (nodes) with respect to the change in the PROM scores. Uses the CART framework to operationalize response shift in terms of discrepancy between clinical status and change in outcome or change in the relative importance of component domains. Group-level analysis; can accommodate subgroup analysis. | Discrepancy between observed and target change**: No | Recalibration: inconsistent changes in PROM scores and clinical status. Reprioritization: change in the order of importance of each domain over time. |
| Random Forest Regression | Requires a longitudinal dataset with at least 2 measurement occasions and two groups. Evaluates changes in the relative contribution of HRQOL domains to the prediction of an outcome over time in each group. The relative importance of each domain is assessed using the average variable importance (AVI), which is the relative contribution of a domain to the prediction of an outcome in a CART averaged across several bootstrap samples. The change in the AVI for each component domain in predicting a global QOL score over time for each group is examined. Response shift is indicated by crossing curves. Group-level analysis; can accommodate subgroup analysis. | Discrepancy between observed and target change**: No | |
| Mixed Models and Growth Mixture Models | Requires a longitudinal dataset with at least 3 measurement occasions. Uses mixed models (from which the residuals are obtained, e.g., observed minus predicted HRQoL scores) followed by growth mixture models (from which latent classes of homogeneous growth trajectories of centered residuals are identified). Response shift is indicated by change in centered residuals over time. Group-level analysis; can accommodate subgroup analysis. | Discrepancy between observed and target change** | Can detect a general response shift effect. Discrepancy between observed and predicted scores (centered residuals having a pattern of fluctuation over time deviating from zero). |
Recalibration: change in one’s internal standards; Reprioritization: change in one’s values; Reconceptualization: change in one’s definition of the target construct. Uniform recalibration: change in all response options in the same direction and to the same extent which will affect the observed variables' mean scores; Non-uniform recalibration: "stretch or shrink" of the scale which will also affect the observed variables' variance and the covariance between them
Detecting: how the method detects response shift. Adjusting: how the method provides change scores that accommodate, or adjust for, response shift. Explaining: how the method can explain response shift
*Design methods require study design changes (e.g., extra measures) needed to detect one or more types of response shift
**Discrepancy between observed and target change: discrepancy between observed change (e.g., change in PROM scores) and target change (i.e., change in the construct that the PROM scores intend to measure). Response shift is assumed to have occurred when observed change is not fully explained by target change
*** Appraisal: The original QOLAP distinguishes among: Recalibration (changes in standard of comparison for assessing one's experience), Reprioritization (changes in strategies for sampling experience), and Reconceptualization (changes in the frame of reference). However, the amount of residual variance explained by changes in appraisal and identified as response shift has not been translated back into the three types of response shift yet
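The then-test quantities defined in the table above can be illustrated with a minimal Python sketch. It assumes paired scores per respondent and group-level means; the function name `then_test_summary` and the example scores are hypothetical and are not taken from the source.

```python
from statistics import mean


def then_test_summary(pretest, thentest, posttest):
    """Return group-level mean changes from paired pretest, then-test and posttest scores."""
    if not (len(pretest) == len(thentest) == len(posttest)):
        raise ValueError("all three score lists must have the same length")
    if not pretest:
        raise ValueError("at least one respondent is required")

    # Observed change: posttest minus pretest.
    observed = mean(post - pre for pre, post in zip(pretest, posttest))
    # Target change: posttest minus then-test.
    target = mean(post - then for then, post in zip(thentest, posttest))
    # Recalibration: pretest minus then-test.
    recalibration = mean(pre - then for pre, then in zip(pretest, thentest))
    return {
        "observed_change": observed,
        "target_change": target,
        "recalibration": recalibration,
    }


# Hypothetical scores for three respondents (illustrative only).
summary = then_test_summary(
    pretest=[60, 50, 70],
    thentest=[50, 40, 60],
    posttest=[65, 55, 75],
)
print(summary)
```

By construction, target change minus observed change equals the recalibration estimate, so the three quantities carry two independent pieces of information.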
Response shift methods: Assumptions and alternative explanations
| Method | Assumptions | Alternative explanations |
|---|---|---|
| Then-test (a design method*) | Differences between mean pretest and then-test scores can also be due to response biases such as effort justification, and social desirability responding. Given the need for retrospection, this method is also prone to recall bias and implicit theories of change** | |
| Appraisal (a design method*) | The operationalization of appraisal (e.g., health worries, concerns, goals, mood, and spirituality) does not distinguish among appraisal of HRQoL, HRQoL itself, adaptation, and response shift. Given the need to retrospect on the way respondents completed questionnaire items, this method is prone to response bias such as recall bias and social desirability responding | |
| Semi-structured interview (a qualitative method) | Recall bias and implicit theories of change** can be introduced if interview questions ask to reflect on the past. Respondents may indicate change that could be interpreted as response shift but which in fact is enforced by the interview context (e.g., response biases such as demand characteristics, social desirability responding). Response shift may remain undetected when respondents are not capable of reflection or verbalization | |
| Schedule for the Evaluation of Individual Quality of Life (SEIQoL) (an individualized method) | Change in weights (reprioritization) may be an artifact of the calculation method as they need to add up to 100. A decrease in the relative importance of one cue implies increases in the relative importance of other cues. Change in domain content (reconceptualization) may be caused by forgetting to nominate a domain previously mentioned (recall bias), not listing a domain that has improved, mentioning a different domain due to implicit theory of change** or mentioning a similar domain at a different level of abstraction. If used at the individual level, changes in ranking or content of domains may be attributed to chance fluctuations, such as changes in mood or just measurement error | |
| Vignettes (a preference-based method) | The majority of the sample shows response shift in the same domain and same direction | If vignettes describe health states outside respondents’ experience and knowledge, change in ratings over time may be caused by factors that are irrelevant to the vignettes |
| Structural Equation Modeling (SEM) (a Latent Variable Method) | Misspecification of the measurement model (e.g., ignoring multidimensionality). Inter-relations between the different forms of response shift: reprioritization may in fact reflect non-uniform recalibration and vice versa. Change in residual variances (non-uniform recalibration) can also be due to change in intercepts (uniform recalibration) or in factor loadings (reprioritization) going in different directions | |
| Item Response Theory (IRT)/ Rasch Measurement Theory (RMT) (a Latent Variable Method) | Misspecification of the measurement model (e.g., ignoring multidimensionality). Inter-relations between the different forms of response shift: reprioritization may in fact reflect non-uniform recalibration and vice versa. Differential change in difficulty parameters (non-uniform recalibration) can also be due to uniform recalibration (or reprioritization for IRT) response shifts going in different directions | |
| Relative Importance Analysis | Relative importance of component domains is sensitive to non-normal data distributions and multi-collinearity when the analysis is conducted using discriminant analysis and logistic regression, respectively, leading to false rank ordering of the domains and false detection of reprioritization response shift. Change in relative importance weights or ranks may be due to the existence of more than two observed subgroups (i.e., heterogeneity due to presence of latent groups) | |
| Classification and Regression Trees (CART) | The clinical criterion is measured without any measurement error | This method might be prone to model overfitting leading to false detection of response shift |
| Random Forest Regression | Random forest models are prone to overfitting leading to false detection of response shift, when not cross-validated. When the autocorrelation within each explanatory domain over time is ignored, this might affect the estimated importance of each domain (i.e., average variable importance) and possibly the detection of response shift. The choice of average variable importance metric can affect the rank ordering of the component domains at each occasion | |
| Mixed Models and Growth Mixture Models | Misspecification of the mixed model for predictions (e.g., misspecified predictors, interactions, covariance structure) might lead to inaccurate trajectories for the residuals from which response shift is deduced. Non-monotonic trajectory patterns of residuals may be attributable to other phenomena, such as cognitive impairment | |
The assumptions are based on specific literature (references included in the text) and general methodological knowledge. Recalibration: change in one’s internal standards; Reprioritization: change in one’s values; Reconceptualization: change in one’s definition of the target construct. Uniform recalibration: change in all response options in the same direction and to the same extent which will affect the observed variables' means; non-uniform recalibration: "stretch or shrink" of the scale which will also affect the observed variables' variance and the covariance between them
*Design methods require study design changes (e.g., extra measures) needed to detect one or more types of response shift
**Implicit theories of change: the current state of an attribute or belief is assessed and a theory of stability or change is invoked
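The SEIQoL index and the weight artifact noted in the table above can be sketched in a few lines of Python. The sketch assumes weights are the 100 allocated points and rescales them to proportions so that the index stays on the 0 to 100 level scale; that rescaling, the function names, and the example domains and values are illustrative assumptions rather than the published scoring procedure.

```python
def seiqol_index(levels, weights):
    """Sum of weight x level over the nominated domains (weights rescaled to proportions)."""
    if set(levels) != set(weights):
        raise ValueError("levels and weights must cover the same domains")
    if abs(sum(weights.values()) - 100) > 1e-9:
        raise ValueError("weights must add up to 100 points")
    return sum(weights[d] / 100 * levels[d] for d in levels)


def weight_changes(weights_t1, weights_t2):
    """Per-domain change in weight between two occasions (missing domains count as 0)."""
    domains = sorted(set(weights_t1) | set(weights_t2))
    return {d: weights_t2.get(d, 0) - weights_t1.get(d, 0) for d in domains}


# Hypothetical respondent (illustrative values only).
levels_t1 = {"family": 80, "health": 50, "work": 60, "friends": 70, "leisure": 40}
weights_t1 = {"family": 30, "health": 30, "work": 20, "friends": 10, "leisure": 10}
weights_t2 = {"family": 30, "health": 20, "work": 20, "friends": 15, "leisure": 15}

index_t1 = seiqol_index(levels_t1, weights_t1)
changes = weight_changes(weights_t1, weights_t2)
print(index_t1, changes)
```

Because both sets of weights add up to 100, the changes always sum to zero: the 10-point drop for "health" forces an increase elsewhere, which is why a shift in weights alone is ambiguous evidence of reprioritization.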
Response shift methods for inter-individual variation
| Method | Description | Challenges |
|---|---|---|
| Mixed Models and Growth Mixture Models | Using mixed models (from which the residuals are obtained) followed by growth mixture models (from which latent classes of homogeneous growth trajectories of centered residuals are identified). Response shift is indicated by change in centered residuals showing a pattern of fluctuation over time | Mixed model misspecifications can bias predictions and, hence, residuals. Potential contributing factors other than response shift might influence the discrepancies in the direction of the centered residuals, e.g., cognitive impairment [ |
| Structural Equation Model (SEM) with covariates | Testing covariate effects directly within SEM to investigate their effects on response shift in longitudinal data | Fitting multiple models to the data requires a sufficiently large sample size to provide adequate statistical power to detect covariate effects and ensure sufficient heterogeneity to identify subgroups that may experience different types of response shift [ |
| Structural Equation Model with stratification | Stratified SEM analysis, according to an a priori known source of heterogeneity, e.g., disease activity in inflammatory bowel disease | Only a small number of measured covariates can be investigated at the same time [ |
| Mixed Models and Growth Mixture Models and Structural Equation Model | Combination of Mixed Models and Growth Mixture Models and SEM to detect response shift and address potential heterogeneity in the types of response shift. Can be applied when the sources of heterogeneity are unknown a priori. They can then be inferred from data using a latent class approach | This approach results in multi-step analyses with a cascade of statistical manipulations that can raise concerns. These include mixed model misspecifications (e.g., predictions can be affected by a misspecified covariance structure) and ignoring the classification uncertainty of the latent classes in SEM. The feasibility and performance of recommendations coming from mixture modeling are unknown in SEM for response shift analyses [ |
| Latent Variable Mixture Models (LVMMs) | LVMMs examine heterogeneity when there is no prior information on measured covariates that may contribute to patient differences. Heterogeneous samples are stratified into groups that are similar by specifying latent classes in the measurement model | These models need to be extended to longitudinal data by examining the possibility of latent classes with different over-time constraints on measurement model parameters that represent different types of response shift (or no response shift). The computational resources required to estimate a large number of model parameters may need to be secured [ |
Response shift methods for multiple time points
| Method | Description | Challenges |
|---|---|---|
| SEM | Extension of Oort’s [ | Large sample sizes are needed to accommodate many parameters and to avoid model overfitting [ |
| Longitudinal three-mode SEM model with Kronecker product restrictions | Extension of Oort’s [ | Model parameters are assumed to change proportionally over time; accordingly, the model is best suited to data with a fixed interval between measurement occasions. Evaluation of model fit is complicated by multiplicative constraints [ |
| Mixed models | Mixed models can accommodate multiple measurement occasions to explore reprioritization response shift by evaluating changes over time in the importance of component domains to an overall outcome, e.g., HRQoL (i.e., significant interaction effects with time) | This approach can be impacted by strong correlations among predictor variables. Such multi-collinearity can be checked and accounted for to avoid unreliable and unstable estimates of regression coefficients and hence spurious findings of reprioritization [ |
| Bayesian joint growth models | Bayesian joint growth models with random occasion-specific parameters for both the latent variable and item parameters to investigate time effects on the occasion-specific item parameters and on the latent variable simultaneously | Specifications of proper prior distributions for the latent variable and for the item parameters are needed, which might be difficult because we do not usually have a clear idea of their a priori distributions [ |
SEM Structural Equation Modeling, IRT Item Response Theory, RMT Rasch Measurement Theory
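The SEM rows in the tables above tie each type of response shift to a model parameter. One common way to write this down is the longitudinal factor model below. The notation is an assumption introduced here for illustration and does not appear in the source: $x_t$ are the observed scores at occasion $t$, $\xi_t$ the latent variables with mean $\kappa_t$ and covariance $\Phi_t$, $\tau_t$ the intercepts, $\Lambda_t$ the factor loadings, and $\Theta_t$ the residual covariance matrix.

```latex
% Assumed notation, not from the source; t = 1, 2 indexes measurement occasions.
\begin{aligned}
x_t &= \tau_t + \Lambda_t \xi_t + \varepsilon_t, \qquad \operatorname{Cov}(\varepsilon_t) = \Theta_t, \\
\mu_t &= \tau_t + \Lambda_t \kappa_t, \qquad \Sigma_t = \Lambda_t \Phi_t \Lambda_t^{\top} + \Theta_t .
\end{aligned}
```

Read against the operationalization given for SEM, a change in $\tau_t$ across occasions corresponds to uniform recalibration, a change in the values of $\Lambda_t$ to reprioritization, a change in the pattern of $\Lambda_t$ to reconceptualization, and a change in the residual variances on the diagonal of $\Theta_t$ to non-uniform recalibration. Change in $\kappa_t$ then plays the role of target change, while change in $\mu_t$ is the observed change.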