Rita Faria, Manuel Gomes, David Epstein, Ian R White.
Abstract
Missing data are a frequent problem in cost-effectiveness analysis (CEA) within a randomised controlled trial. Inappropriate methods to handle missing data can lead to misleading results and ultimately can affect the decision of whether an intervention is good value for money. This article provides practical guidance on how to handle missing data in within-trial CEAs following a principled approach: (i) the analysis should be based on a plausible assumption for the missing data mechanism, i.e. whether the probability that data are missing is independent of or dependent on the observed and/or unobserved values; (ii) the method chosen for the base-case should fit with the assumed mechanism; and (iii) sensitivity analysis should be conducted to explore to what extent the results change with the assumption made. This approach is implemented in three stages, which are described in detail: (1) descriptive analysis to inform the assumption on the missing data mechanism; (2) how to choose between alternative methods given their underlying assumptions; and (3) methods for sensitivity analysis. The case study illustrates how to apply this approach in practice, including software code. The article concludes with recommendations for practice and suggestions for future research.
Year: 2014 PMID: 25069632 PMCID: PMC4244574 DOI: 10.1007/s40273-014-0193-3
Source DB: PubMed Journal: Pharmacoeconomics ISSN: 1170-7690 Impact factor: 4.981
Number and proportion of individuals with complete data by treatment allocation
| Complete at | Surgery, n (%) | Medical management, n (%) |
|---|---|---|
| Year 1 | 134 (75%) | 147 (82%) |
| Year 2 | 121 (68%) | 134 (75%) |
| Year 3 | 112 (63%) | 119 (66%) |
| Year 4 | 114 (64%) | 118 (66%) |
| Year 5 | 115 (65%) | 113 (63%) |
| All years | 88 (49%) | 84 (47%) |
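
A tabulation of this kind can be sketched in a few lines. The example below is a minimal illustration, not the code used in the article: the record layout (`arm`, `cost`, `qaly` keyed by year, with `None` marking a missing value) and the toy data are assumptions made for the example.

```python
from collections import defaultdict

def complete_by_arm(records, years):
    """Count individuals with complete cost and QALY data, by arm and year.

    `records` is a hypothetical layout: one dict per individual with an
    "arm" label and "cost"/"qaly" dicts keyed by follow-up year, where
    None marks a missing value.
    Returns {arm: {year: (n_complete, proportion_complete)}}.
    """
    totals = defaultdict(int)
    counts = defaultdict(lambda: defaultdict(int))
    for rec in records:
        arm = rec["arm"]
        totals[arm] += 1
        for year in years:
            has_cost = rec["cost"].get(year) is not None
            has_qaly = rec["qaly"].get(year) is not None
            if has_cost and has_qaly:
                counts[arm][year] += 1
    return {
        arm: {y: (counts[arm][y], counts[arm][y] / totals[arm]) for y in years}
        for arm in totals
    }

# Toy data (invented for illustration only).
records = [
    {"arm": "surgery", "cost": {1: 500.0, 2: None}, "qaly": {1: 0.8, 2: 0.7}},
    {"arm": "surgery", "cost": {1: 300.0, 2: 250.0}, "qaly": {1: 0.9, 2: 0.9}},
    {"arm": "medical", "cost": {1: None, 2: 100.0}, "qaly": {1: 0.6, 2: 0.6}},
    {"arm": "medical", "cost": {1: 120.0, 2: 90.0}, "qaly": {1: 0.7, 2: None}},
]
summary = complete_by_arm(records, years=[1, 2])
for arm, by_year in summary.items():
    for year, (n, prop) in by_year.items():
        print(f"{arm} year {year}: {n} ({prop:.0%})")
```
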
Fig. 1 Pattern of missing data. Black shading represents missing data for one or more individuals (arrayed along the horizontal axis) on a particular variable (arrayed along the vertical axis); grey shading represents observed data. a Pattern of missing data on costs. b Pattern of missing data on health-related quality of life (EQ-5D). GP general practitioner
Logistic regression for missingness of costs and quality-adjusted life-years on baseline variables
| Baseline variable | Missing data on costs: odds ratio (95 % CI) | Missing data on QALYs: odds ratio (95 % CI) |
|---|---|---|
| Treatment allocation | 1.04 (0.68–1.59) | 1.04 (0.68–1.58) |
| Gender | 1.29 (0.81–2.04) | 1.10 (0.70–1.74) |
| BMI | 1.01 (0.96–1.06) | 1.01 (0.96–1.06) |
| Age | 0.99 (0.97–1.00) | 0.99 (0.97–1.00) |
| EQ-5D at baseline | 0.38** (0.16–0.90) | 0.46* (0.19–1.09) |
QALYs quality-adjusted life-years
* Indicates statistical significance at 0.10
** Indicates statistical significance at 0.05
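
To make the idea of regressing missingness on a baseline variable concrete, the following is a simplified sketch rather than the article's analysis: it fits a logistic regression with a single covariate by Newton-Raphson, whereas the table above comes from a model with several baseline variables. The function name and the toy data are hypothetical.

```python
import math

def fit_missingness_logit(x, missing, iterations=25):
    """Logistic regression of a missingness indicator on one covariate.

    Fits logit P(missing) = b0 + b1 * x by Newton-Raphson and returns the
    odds ratio for x with a Wald 95 % confidence interval.
    """
    b0 = b1 = 0.0
    se1 = float("nan")
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, missing):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p            # score for the intercept
            g1 += (yi - p) * xi     # score for the slope
            h00 += w                # entries of the information matrix
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + inverse(information) * score
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
        # Slope variance is the matching diagonal entry of the inverse
        # information (evaluated just before this step; identical at convergence).
        se1 = math.sqrt(h00 / det)
    return {
        "odds_ratio": math.exp(b1),
        "ci": (math.exp(b1 - 1.96 * se1), math.exp(b1 + 1.96 * se1)),
    }

# Toy data (invented): a binary covariate, 10/40 missing when x = 0
# and 20/40 missing when x = 1.
x = [0] * 40 + [1] * 40
missing = [1] * 10 + [0] * 30 + [1] * 20 + [0] * 20
result = fit_missingness_logit(x, missing)
print(result)
```

With a single binary covariate the fitted odds ratio equals the cross-product ratio of the 2 × 2 table, which gives a convenient check on the sketch. An odds ratio whose interval excludes 1 suggests that missingness depends on that variable, which argues against MCAR.
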
Fig. 2 Comparison of the distribution of imputed values (imputation number 1 to 10) with the observed data (imputation number 0) for quality-adjusted life-years and costs in years 1 and 5. Individual values are represented by dots; the width of a row of dots represents the frequency of values in the distribution. QALYs quality-adjusted life-years
Results of different methods to handle missing data
| Outcome | Statistic | Complete case analysis with seemingly unrelated regression model | Multiple imputation of costs and QALYs followed by seemingly unrelated regression model | Mixed model with adjustment for baseline EQ-5D |
|---|---|---|---|---|
| Difference in costs (£) | Mean | 1,668 | 1,305 | 1,338 |
| | SE | 268 | 255 | 253 |
| | 95 % CI | 1,142–2,194 | 805–1,806 | 843–1,833 |
| Difference in QALYs adjusted for baseline EQ-5D | Mean | 0.301 | 0.244 | 0.227 |
| | SE | 0.106 | 0.098 | 0.100 |
| | 95 % CI | 0.093–0.508 | 0.052–0.437 | 0.031–0.422 |
| ICER | £/QALY | 5,547 | 5,340 | 5,903 |
| Probability that surgery is cost effective at the threshold of £20,000 per QALY gained | | 0.98 | 0.96 | 0.94 |
ICER incremental cost-effectiveness ratio, QALYs quality-adjusted life-years, SE standard error of the mean
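
As a worked illustration of how the ICER relates to the mean differences in the table above, the sketch below divides the difference in costs by the difference in QALYs and also computes the incremental net benefit at £20,000 per QALY. The function names are hypothetical, and because the inputs are the rounded means from the table, the ratios come out close to, but not identical with, the reported ICERs.

```python
def icer(delta_cost, delta_qaly):
    """Incremental cost-effectiveness ratio: extra cost per extra QALY."""
    if delta_qaly == 0:
        raise ValueError("ICER is undefined when the QALY difference is zero")
    return delta_cost / delta_qaly

def incremental_net_benefit(delta_cost, delta_qaly, threshold):
    """Monetary value of the QALY gain at `threshold`, minus the extra cost."""
    return threshold * delta_qaly - delta_cost

# Rounded mean differences (cost in £, QALYs) taken from the table above.
methods = {
    "complete case": (1668, 0.301),
    "multiple imputation": (1305, 0.244),
    "mixed model": (1338, 0.227),
}
for name, (d_cost, d_qaly) in methods.items():
    ratio = icer(d_cost, d_qaly)
    inb = incremental_net_benefit(d_cost, d_qaly, threshold=20000)
    print(f"{name}: ICER about £{ratio:,.0f} per QALY, net benefit about £{inb:,.0f}")
```

A positive incremental net benefit at the chosen threshold corresponds to an ICER below that threshold when the QALY difference is positive. The probability of cost effectiveness reported in the table cannot be recovered from these means alone, since it also depends on the joint uncertainty in costs and QALYs.
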
Fig. 3 Sensitivity analysis: data are missing not at random for QALYs or for costs. Note: imputed costs between years 2 and 5 are increased by 10 %; imputed QALYs between years 2 and 5 are reduced by 10 %. The probability that surgery is cost effective is stable at values close to 1 even if the imputed costs are increased only for the individuals with missing data randomised to the surgery group. Changes in imputed QALYs have an impact on the probability of cost effectiveness if the shift is implemented only in patients with missing data randomised to the surgery group, but the probability remains above 50 % throughout all scenarios. QALY quality-adjusted life-year
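
The shift described in the caption can be sketched as a rescaling that touches only values that were imputed, optionally restricted to one trial arm. This is a minimal illustration under assumed data structures (parallel lists of values, missingness flags and arm labels), not the article's code; all names are hypothetical.

```python
def shift_imputed(values, was_missing, scale, arms=None, only_arm=None):
    """Rescale imputed values to mimic a missing-not-at-random departure.

    values      : one completed (post-imputation) value per individual
    was_missing : True where the value was imputed rather than observed
    scale       : e.g. 1.10 to raise imputed costs by 10 %,
                  0.90 to lower imputed QALYs by 10 %
    arms        : arm label per individual, needed only if only_arm is set
    only_arm    : if given, apply the shift in that arm alone
    Observed values are always left unchanged.
    """
    shifted = []
    for i, (value, imputed) in enumerate(zip(values, was_missing)):
        in_scope = only_arm is None or arms[i] == only_arm
        shifted.append(value * scale if (imputed and in_scope) else value)
    return shifted

# Toy data (invented for illustration only).
costs = [100.0, 200.0, 300.0]
was_missing = [False, True, True]
arms = ["surgery", "surgery", "medical"]

both_arms = shift_imputed(costs, was_missing, scale=1.10)
surgery_only = shift_imputed(costs, was_missing, 1.10, arms=arms, only_arm="surgery")
print(both_arms, surgery_only)
```

In a multiple imputation workflow the same shift would be applied within each imputed dataset before the analysis model is refitted and the results are pooled.
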
Recommendations for practice
| Recommendation | Comments |
|---|---|
| Stage 1: Descriptive analysis | |
| 1.1 Conduct descriptive analysis of the data: • Proportion of missing data by trial group at each follow-up period • Missing data pattern • Association between missingness and baseline variables • Association between missingness and observed outcomes | Report the descriptive analysis that was conducted to inform the assumption on the missing data mechanism |
| 1.2 Discuss among the trial team (trialists, clinicians, trial management group, etc.) the possible mechanisms and reasons for missing data | |
| 1.3 Make an assumption on the missing data mechanism based on the information collected in 1.1 and 1.2 | Note that the descriptive analysis can distinguish between MCAR, CD-MCAR and MAR, but it cannot rule out MNAR |
| 1.4 State the assumption on the missing data mechanism and justify the choice of assumption | |
| 1.5 Report HR-QOL, resource use and costs (if applicable) by treatment group prior to imputation | |
| Stage 2: Choosing and implementing a method to handle missing data | |
| 2.1. Choose a method to handle the missing data in accordance with the assumed missing data mechanism | Complete case analysis (with the baseline covariates related to missingness included in the analysis model) for CD-MCAR; MI or a likelihood-based model for MAR; IPW for monotonic missing data under MCAR, CD-MCAR or MAR |
| 2.2. State up front any other assumptions required for the analysis | e.g. whether missing data in individual resource use items are assumed to be zero |
| 2.3. Include all randomised individuals with follow-up data | Individuals with data only at baseline may be excluded from the base case but should be included in a scenario to make the analysis truly intention-to-treat |
| 2.4. Impute missing baseline covariates with mean imputation or MI | MI is more complex, and may be less efficient, than mean imputation |
| 2.5. MI seems the most widely applicable method of analysis: • The imputation model should include all covariates related to missingness, all covariates related to outcomes and any variable included in the analysis model • MI should be implemented separately by treatment allocation • The number of imputations should exceed the percentage of missing data • Predictive mean matching and/or transformations in MICE can help with CEA data that are not normally distributed • Costs can be imputed at a resource use level or as costs • QALYs can be imputed at HR-QOL domain level, at the index score level or as QALYs | MI can be implemented with chained equations (MI-MICE) or by joint modelling (MI-JM), which assumes multivariate normality. The current evidence base does not allow for strict recommendations for one approach over another |
| 2.6. Likelihood-based models are a sensible alternative to MI but can be more difficult to implement | Likelihood-based models avoid the imputation step but only covariates allowed for the analysis model can be included. They can be difficult to implement when costs or health outcomes are disaggregated |
| 2.7. IPW methods are useful if the missing data pattern is monotonic | IPW avoids the imputation step but its reliability is dependent on the model specification |
| 2.8. Other ad hoc methods (e.g. complete case, mean imputation or last-value carried forward) should be avoided | They cannot incorporate the uncertainty inherent in missing data, and often make implausible assumptions about the missing data mechanism |
| 2.9. The method chosen to handle missing data can be validated by comparing results with an alternative method that makes the same assumption on the missing data mechanism (e.g. likelihood-based model vs. MI with the same covariates) | If using MI, the imputation model can be validated by comparing the distribution of observed and imputed data |
| 2.10. If using MI, report resource use, HR-QOL scores (if imputed at this level), costs and QALYs by treatment group after imputation. Results after imputation should be compared with the descriptive analysis pre-imputation | |
| Stage 3: Sensitivity analysis to the MAR assumption | |
| 3.1. Sensitivity analysis explores the robustness of the results to alternative assumptions on the missing data mechanism: • The methods proposed here (weighting approach or an additive shift of imputed values) are straightforward and informative | Pattern mixture and selection models can be difficult to implement |
| 3.2. Interpret the results of the sensitivity analysis in light of the understanding of the disease and the trial context (see 1.2.) | Does the allocation decision (i.e. is the intervention likely to be cost effective?) change given plausible changes in the assumption on the missing data mechanism? |
CD-MCAR covariate-dependent missing completely at random, CEA cost-effectiveness analysis, HR-QOL health-related quality of life, IPW inverse probability weighting, MAR missing at random, MCAR missing completely at random, MI multiple imputation, MI-JM MI: joint modelling, MI-MICE MI: chained equations, MNAR missing not at random, QALYs quality-adjusted life-years
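
For readers implementing MI as in recommendation 2.5, the estimates from each imputed dataset are usually combined with Rubin's rules. The sketch below is a generic illustration of that pooling step, not code from the article; the function name and the example numbers are hypothetical.

```python
import math
from statistics import mean, variance

def pool_rubin(estimates, standard_errors):
    """Combine per-imputation estimates with Rubin's rules.

    estimates       : point estimate from each imputed dataset
    standard_errors : matching standard errors
    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    if m < 2 or m != len(standard_errors):
        raise ValueError("need at least two imputations with matching SEs")
    pooled = mean(estimates)
    within = mean([se ** 2 for se in standard_errors])  # average sampling variance
    between = variance(estimates)                       # variance across imputations (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)

# Invented numbers for illustration only: three imputations.
pooled_estimate, pooled_se = pool_rubin([1.0, 2.0, 3.0], [1.0, 1.0, 1.0])
print(pooled_estimate, round(pooled_se, 3))
```

The between-imputation term is what lets MI carry the extra uncertainty due to missing data into the standard error, which single imputation methods such as mean imputation cannot do.
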
| Missing data are a frequent problem in cost-effectiveness analysis within a randomised clinical trial. |
| Different methods of handling missing data can yield different results and affect decisions on the value for money of healthcare interventions. |
| The choice of method should be grounded in the assumed missing data mechanism, which in turn should be informed by the available evidence. |
| The impact of alternative assumptions about the missing data mechanism should be carefully assessed in sensitivity analysis. |