Ping-Tee Tan1, Suzie Cro2, Eleanor Van Vogt3, Matyas Szigeti3, Victoria R Cornelius3.
Abstract
BACKGROUND: Missing data are common in randomised controlled trials (RCTs) and can bias results if not handled appropriately. A statistically valid analysis under the primary missing-data assumptions should be conducted, followed by sensitivity analysis under alternative justified assumptions to assess the robustness of results. Controlled Multiple Imputation (MI) procedures, including delta-based and reference-based approaches, have been developed for analysis under missing-not-at-random assumptions. However, it is unclear how often these methods are used, how they are reported, and what their impact is on trial results. This review evaluates the current use and reporting of MI and controlled MI in RCTs.
Keywords: Controlled multiple imputation; Missing data; Multiple imputation; Randomised controlled trials; Sensitivity analysis
Year: 2021 PMID: 33858355 PMCID: PMC8048273 DOI: 10.1186/s12874-021-01261-6
Source DB: PubMed Journal: BMC Med Res Methodol ISSN: 1471-2288 Impact factor: 4.615
Options (not exhaustive) for reference-based and delta-based multiple imputation with a continuous outcome
| Imputation option | Handling of missing outcome data |
|---|---|
| Last mean carried forward (LMCF) | Impute assuming all unobserved participants stayed at the mean value of their respective randomised group after their last observed time point. |
| Copy reference (CR) | Impute assuming all unobserved participants behaved like the specified reference group for the entire duration of the study. |
| Copy increments in reference (CIR) | Impute assuming all unobserved participants followed the mean increments observed in the specified reference group after their last observed time point. |
| Jump to reference (J2R) | Impute assuming all unobserved participants jumped to the behaviour of the specified reference group after their last observed time point. |
| Delta | Impute assuming all unobserved participants had a poorer or better response than those observed, by adding or subtracting a “delta” parameter to the expected value of the MAR-imputed values. “Delta” can be applied in all treatment groups, in only one group, or may vary by treatment group or another specified factor. |
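
The options in the table above differ mainly in the mean profile assumed for a participant after their last observed time point. The following is a minimal sketch of that mean structure only, assuming a continuous outcome measured at fixed visits and known arm-level visit means. The function names (`mean_profile`, `delta_adjust`) and the example numbers are hypothetical, and the sketch omits the covariance structure and the conditional draws that a full reference-based MI procedure would use.

```python
from typing import List, Sequence


def mean_profile(method: str, own: Sequence[float],
                 ref: Sequence[float], last: int) -> List[float]:
    """Assumed mean profile for a participant who drops out after visit `last`.

    own  : visit means of the participant's own randomised arm
    ref  : visit means of the specified reference arm
    last : 0-based index of the last observed visit
    """
    if len(own) != len(ref) or not 0 <= last < len(own):
        raise ValueError("own/ref must have equal length and last must be a valid index")
    if method == "CR":
        # Copy reference: reference-arm means for the whole study.
        return list(ref)
    head = list(own[: last + 1])  # own-arm means up to the last observed visit
    if method == "J2R":
        # Jump to reference: reference-arm means after dropout.
        tail = list(ref[last + 1:])
    elif method == "CIR":
        # Copy increments in reference: own mean at dropout plus reference increments.
        tail = [own[last] + (r - ref[last]) for r in ref[last + 1:]]
    elif method == "LMCF":
        # Last mean carried forward: stay at the own-arm mean at dropout.
        tail = [own[last]] * (len(own) - last - 1)
    else:
        raise ValueError(f"unknown method: {method}")
    return head + tail


def delta_adjust(mar_imputed: Sequence[float], delta: float) -> List[float]:
    """Delta-based adjustment: shift MAR-imputed values by a fixed delta."""
    return [value + delta for value in mar_imputed]


# Hypothetical visit means for an active arm and a reference (control) arm.
active = [10.0, 8.0, 6.0, 4.0]
control = [10.0, 9.0, 8.5, 8.0]
for name in ("LMCF", "CR", "CIR", "J2R"):
    print(name, mean_profile(name, active, control, last=1))
print("Delta", delta_adjust([6.0, 4.0], delta=1.5))
```

With these made-up means, a participant last seen at the second visit is assigned post-dropout means of 8.5 and 8.0 under J2R, 7.5 and 7.0 under CIR, and 8.0 and 8.0 under LMCF, which illustrates how the assumptions order from most to least conservative for the active arm in this example.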
Fig. 1 PRISMA flow diagram
Fig. 2 Articles with multiple imputation or controlled multiple imputation in Lancet and NEJM (2014 to 2019)
Key characteristics of included RCTs and methods for handling missing data
| Characteristic | The Lancet, n | The Lancet, % | NEJM, n | NEJM, % | Total, n | Total, % |
|---|---|---|---|---|---|---|
| Behaviour (diet, exercise, cognitive) | 5 | 10% | 4 | 6% | 9 | 8% |
| Medical device | 6 | 12% | 1 | 1% | 7 | 6% |
| Diagnostic | 2 | 4% | 0 | 0% | 2 | 2% |
| Drug | 19 | 37% | 47 | 70% | 66 | 56% |
| Health service strategies | 5 | 10% | 2 | 3% | 7 | 6% |
| Psychological | 5 | 10% | 0 | 0% | 5 | 4% |
| Surgical | 9 | 18% | 13 | 19% | 22 | 19% |
| < 100 | 2 | 4% | 4 | 6% | 6 | 5% |
| 100 to 499 | 24 | 47% | 20 | 30% | 44 | 37% |
| 500 to 999 | 16 | 31% | 21 | 31% | 37 | 31% |
| 1000 to 4999 | 8 | 16% | 14 | 21% | 22 | 19% |
| ≥ 5000 | 1 | 2% | 8 | 12% | 9 | 8% |
| < 10% | 21 | 41% | 33 | 49% | 54 | 46% |
| 10 to 19% | 17 | 33% | 23 | 34% | 40 | 34% |
| 20 to 29% | 6 | 12% | 4 | 6% | 10 | 8% |
| ≥ 30% | 3 | 6% | 4 | 6% | 7 | 6% |
| not clear | 4 | 8% | 3 | 4% | 7 | 6% |
| Binary | 20 | 39% | 34 | 51% | 54 | 46% |
| Continuous | 27 | 53% | 30 | 45% | 57 | 48% |
| Count | 4 | 8% | 0 | 0% | 4 | 3% |
| Time-to-event | 0 | 0% | 3 | 4% | 3 | 3% |
| Yes | 5 | 10% | 6 | 9% | 11 | 9% |
| No | 46 | 90% | 61 | 91% | 107 | 91% |
| MAR MIᵃ | 22 | 44% | 21 | 31% | 43 | 36% |
| Controlled MI | 1 | 2% | 1 | 1% | 2 | 2% |
| Complete case | 19 | 38% | 39 | 57% | 58 | 49% |
| Single imputation | 5 | 10% | 5 | 7% | 10 | 8% |
| Last observation carried forward | 3 | 6% | 2 | 3% | 5 | 4% |
| Yes | 36 | 71% | 59 | 88% | 95 | 81% |
| No | 15 | 29% | 8 | 12% | 23 | 19% |
| MAR MI | 28 | 68% | 42 | 54% | 70 | 59% |
| Controlled MI | 3 | 7% | 11 | 14% | 14 | 12% |
| Complete case | 6 | 15% | 10 | 13% | 16 | 13% |
| Single imputation | 1 | 2% | 8 | 10% | 9 | 8% |
| Last observation carried forward | 1 | 2% | 6 | 8% | 7 | 6% |
| Others | 2 | 5% | 1 | 1% | 3 | 3% |
| Reference-based | 2 | 40% | 5 | 45% | 7 | 44% |
| Delta-based | 3 | 60% | 6 | 55% | 9 | 56% |
Data presented as n and %. ᵃ One trial using MI under MAR used a hybrid of MI under MAR and worst observation carried forward (single imputation).
ᵇ 23 trials used two or more different statistical methods for sensitivity analysis.
Percentages are rounded to 0 decimal places so may not sum exactly to 100%
Reporting of methods for MI under MAR
| MI under MAR feature | n | % |
|---|---|---|
| Binary | 52 | 47% |
| Continuous | 52 | 47% |
| Count | 4 | 4% |
| Time-to-event | 2 | 1% |
| Not stated | 47 | 43% |
| Multiple Imputation using Chained Equations (MICE/FCS) | 35 | 32% |
| Specified Imputation Model Type(s) within MICE/FCS | 12 | 11% |
| MCMC MI/algorithm/method | 8 | 7% |
| Regression based MI | 7 | 6% |
| Specified imputation Model Type within regression based MI | 4 | 4% |
| PMM | 5 | 5% |
| MVN imputation | 4 | 4% |
| MVN imputation (non-monotone missing patterns) and regression MI model (monotone patterns) | 1 | 1% |
| MICE (non-monotone missing patterns) and regression MI model (monotone patterns) | 1 | 1% |
| MCMC (non-monotone missing patterns) and PMM (monotone patterns) | 1 | 1% |
| Propensity score MI | 1 | 1% |
| Imputation model incl. all variables in analysis model only | 6 | 5% |
| Imputation model incl. all variables in analysis model + auxiliary variables | 37 | 34% |
| Imputation model did not include all variables in analysis modelᵃ | 9 | 8% |
| Imputation model incl. all variables in analysis model + auxiliary variables | 8 | 7% |
| 5 | 9 | 8% |
| 10 | 8 | 7% |
| 11–20 | 25 | 23% |
| 21–50 | 17 | 15% |
| 100 | 8 | 7% |
| 200 | 1 | 1% |
| 1000 | 2 | 2% |
| Not stated | 40 | 36% |
| IVEware software | 1 | 1% |
| MICE (R) | 3 | 3% |
| Proc MI (SAS) | 3 | 3% |
| Proc MI and Proc MIANALYZE (SAS) | 5 | 5% |
| Proc MIANALYZE (SAS) | 2 | 2% |
| Realcom Impute | 1 | 1% |
| Ice (Stata) | 2 | 2% |
| MICE (Stata) | 1 | 1% |
| mi impute (Stata) | 5 | 5% |
| mi impute and mi estimate (Stata) | 1 | 1% |
| Missing data module in SPSS 24ᵇ | 2 | 2% |
| Not stated | 84 | 76% |
| Yesᶜ | 25 | 23% |
| Noᵈ | 1 | 1% |
| Not stated | 84 | 76% |
| Primary | 2 | 13% |
| Sensitivity | 14 | 87% |
| 1 | 1% | |
ᵃ 9 trials did not include all variables in the analysis model in the imputation model but did include auxiliary variables. ᵇ One trial specified that the Multiple Imputation-Automatic method was used. ᶜ Explicitly stated (n = 18) or inferable from specified software or reference (n = 7). ᵈ One trial presented the overall 95% confidence interval using the mean of the values for the lower and upper confidence limits. Percentages are rounded to 0 decimal places so may not sum exactly to 100%.
Reporting of methods for controlled MI
| Controlled MI feature | n | % |
|---|---|---|
| Binary | 4 | 25% |
| Continuous | 9 | 56% |
| Time-to-event | 3 | 19% |
| Delta-based MI | 9 | 56% |
| Reference based MI | 7 | 44% |
| MICEᵉ | 2 | 22% |
| MVN imputation (non-monotone missing patterns) and regression MI model (monotone patterns) | 1 | 11% |
| Kaplan-Meier MI (KMMI) | 2ᵇ | 22% |
| ANCOVA MI | 1 | 11% |
| Cox model MI | 1† | 11% |
| Not stated | 3 | 27% |
| MCMC or random draws from a normal distribution with mean equal to subject’s own baseline valueᶠ | 1 | 14% |
| Linear MMRM | 2 | 29% |
| Kaplan-Meier MI (KMMI) | 1 | 14% |
| Not stated | 3 | 43% |
| 9 | 56% | |
| Imputation model incl. all variables in analysis model only | 5 | 31% |
| Imputation model incl. all variables in analysis model + auxiliary variables | 3 | 19% |
| Imputation model did not include all variables in analysis modelᵍ | 1 | 6% |
| 7 | 44% | |
| Imputation model incl. all variables in analysis model + auxiliary variables | 2 | 13% |
| 13 | 81% | |
| 5 | 2 | 13% |
| 20 | 1 | 6% |
| 100 | 6 | 38% |
| 1000 | 4 | 25% |
| Not stated | 3 | 19% |
| Proc MI and Proc MIANALYZE (SAS) | 2 | 13% |
| Proc MIXED and Proc MIANALYZE (SAS) | 2 | 13% |
| Not stated | 12 | 75% |
| Yesʰ | 9 | 56% |
| Noᶜ | 1 | 6% |
| Not stated | 6 | 38% |
| Primary | 2 | 13% |
| Sensitivity | 14 | 87% |
| Median (range) | 3 | (1–48) |
ᵃ Denominator for variables is 16 unless otherwise indicated. ᵇ One trial used both KMMI and Cox model MI in two separate sensitivity analyses. ᶜ One trial reported using a modified version of Rubin’s rules, “the overall average estimated event rate difference and average estimated variance” (did not incorporate any between-imputation variability in the variance calculation). ᵈ N = 13. Not clear for 1/14 trials using controlled MI in sensitivity analysis. ᵉ No further details available on types of models utilised within MICE. ᶠ Missing data during the on-treatment period were imputed “using the MI SAS procedure (using Markov Chain Monte Carlo)” and values missing during the post-treatment period were “multiply imputed using random draws from a normal distribution where the mean was equal to subject’s own baseline value.” ᵍ One trial did not include all variables in analysis model and included auxiliary variables in the imputation model. ʰ Explicitly stated (n = 5) or inferable from specified software or reference (n = 4). Percentages are rounded to 0 decimal places so may not sum exactly to 100%.
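
The footnote on the modified version of Rubin’s rules describes a pooled variance that left out between-imputation variability. As a rough illustration of why that matters, the sketch below pools a set of per-imputation estimates with the standard Rubin’s rules total variance, T = W + (1 + 1/m)B, and optionally with the within-imputation variance W alone. The function name `pool_rubin`, its `include_between` flag, and the input numbers are hypothetical and are not taken from any reviewed trial.

```python
from statistics import mean, variance
from typing import Sequence, Tuple


def pool_rubin(estimates: Sequence[float], variances: Sequence[float],
               include_between: bool = True) -> Tuple[float, float]:
    """Pool m per-imputation estimates and their variances.

    Returns (pooled estimate, pooled variance). With include_between=False
    the between-imputation component is dropped, mimicking the modified
    rule described in the footnote (illustrative only).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    q_bar = mean(estimates)        # pooled point estimate
    within = mean(variances)       # W: average within-imputation variance
    between = variance(estimates)  # B: sample variance of estimates (m - 1 denominator)
    total = within + (1 + 1 / m) * between if include_between else within
    return q_bar, total


# Made-up estimates from m = 3 imputed datasets.
est = [1.0, 2.0, 3.0]
var = [0.5, 0.5, 0.5]
print(pool_rubin(est, var))                         # standard Rubin's rules
print(pool_rubin(est, var, include_between=False))  # within-imputation variance only
```

In this toy input the pooled estimate is the same under both rules, but the variance without the between-imputation term is smaller, so confidence intervals built from it would be too narrow whenever the imputed datasets disagree.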