Jonathan W Bartlett, Rachael A Hughes.
Abstract
Multiple imputation has become one of the most popular approaches for handling missing data in statistical analyses. Part of this success is due to Rubin's simple combination rules. These give frequentist valid inferences when the imputation and analysis procedures are so-called congenial and the embedding model is correctly specified, but may not do so otherwise. Roughly speaking, congeniality concerns whether the imputation and analysis models make compatible assumptions about the data. In practice, imputation models and analysis procedures are often not congenial, so tests may not have the correct size and confidence interval coverage may deviate from the advertised level. We examine a number of recent proposals which combine bootstrapping with multiple imputation and determine which are valid under uncongeniality and model misspecification. Imputation followed by bootstrapping generally does not result in valid variance estimates under uncongeniality or misspecification, whereas certain bootstrap followed by imputation methods do. We recommend a particular computationally efficient variant of bootstrapping followed by imputation.
Keywords: Multiple imputation; bootstrap; congeniality
Year: 2020 PMID: 32605503 PMCID: PMC7682506 DOI: 10.1177/0962280220932189
Source DB: PubMed Journal: Stat Methods Med Res ISSN: 0962-2802 Impact factor: 3.021
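The abstract refers to Rubin's combination rules without stating them. The following is a minimal sketch of the standard form of those rules for a scalar parameter, not code from the paper: the function name `rubin_pool` and its arguments are hypothetical, and the degrees of freedom use Rubin's original large-sample formula.

```python
import math
from statistics import mean, variance


def rubin_pool(estimates, variances):
    """Pool M completed-data estimates with Rubin's rules (scalar case).

    estimates: point estimate from each imputed dataset
    variances: squared standard error from each imputed dataset
    Returns (pooled estimate, total variance, degrees of freedom).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")

    q_bar = mean(estimates)      # pooled point estimate
    w = mean(variances)          # within-imputation variance
    b = variance(estimates)      # between-imputation variance (divisor m - 1)
    t = w + (1 + 1 / m) * b      # total variance

    # Rubin's large-sample degrees of freedom; infinite if estimates agree exactly.
    if b > 0:
        df = (m - 1) * (1 + w / ((1 + 1 / m) * b)) ** 2
    else:
        df = math.inf
    return q_bar, t, df


q, t, df = rubin_pool([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(q, t, df)
```

A confidence interval would then be formed as the pooled estimate plus or minus a t quantile (with `df` degrees of freedom) times the square root of `t`. It is this total-variance estimate whose validity the abstract says can fail under uncongeniality or misspecification.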
Median confidence interval width and coverage for the subgroup analysis (uncongenial) and heteroscedastic errors (misspecification) scenarios.
| Method | No. of imputations | No. of bootstraps | Subgroup analysis: median CI width | Subgroup analysis: CI cov. (%) | Heteroscedastic errors: median CI width | Heteroscedastic errors: CI cov. (%) |
|---|---|---|---|---|---|---|
| MI Rubin | 10 | | 0.0142 | 98.2 | 0.0126 | 91.3 |
| MI boot Rubin | 10 | 200 | 0.0143 | 98.1 | 0.0129 | 92.1 |
| MI boot pooled percentile | 10 | 200 | 0.0131 | 97.7 | 0.0117 | 89.2 |
| Boot MI percentile | 10 | 200 | 0.0109 | 94.9 | 0.0144 | 95.0 |
| Boot MI percentile | 1 | 200 | 0.0139 | 98.4 | 0.0167 | 97.7 |
| von Hippel | 2 | 200 | 0.0108 | 95.0 | 0.0144 | 94.1 |
CI, confidence interval; CI cov., confidence interval coverage; MI, multiple imputation.
Median confidence interval width and coverage for the omitted interaction (uncongenial) and moderate non-normality (misspecification) scenarios.
| Method | No. of imputations | No. of bootstraps | Omitted interaction: median CI width | Omitted interaction: CI cov. (%) | Moderate non-normality: median CI width | Moderate non-normality: CI cov. (%) |
|---|---|---|---|---|---|---|
| MI Rubin | 10 | | 0.0146 | 97.3 | 0.0119 | 94.6 |
| MI boot Rubin | 10 | 200 | 0.0146 | 97.2 | 0.0120 | 94.7 |
| MI boot pooled percentile | 10 | 200 | 0.0135 | 95.4 | 0.0108 | 93.1 |
| Boot MI percentile | 10 | 200 | 0.0128 | 94.2 | 0.0118 | 95.4 |
| Boot MI percentile | 1 | 200 | 0.0159 | 98.0 | 0.0143 | 98.1 |
| von Hippel | 2 | 200 | 0.0127 | 94.0 | 0.0117 | 95.1 |
CI, confidence interval; CI cov., confidence interval coverage; MI, multiple imputation.
Median confidence interval width and coverage under MAR (congenial and correctly specified) and jump to reference (uncongenial and correctly specified) imputation, from 10,000 simulations.
| Method | No. of imputations | No. of bootstraps | Time (s) | MAR (congenial): median CI width | MAR (congenial): CI cov. (%) | Jump to reference (uncongenial): median CI width | Jump to reference (uncongenial): CI cov. (%) |
|---|---|---|---|---|---|---|---|
| MI Rubin | 10 | | 0.05 | 0.286 | 95.08 | 0.251 | 99.78 |
| MI boot Rubin | 10 | 1000 | 13.6 | 0.286 | 95.04 | 0.251 | 99.78 |
| MI boot pooled percentile | 10 | 1000 | 13.7 | 0.260 | 93.07 | 0.237 | 99.63 |
| Boot MI percentile | 10 | 1000 | 36.8 | 0.278 | 95.56 | 0.157 | 96.06 |
| Boot MI percentile | 1 | 1000 | 3.9 | 0.332 | 98.47 | 0.211 | 99.40 |
| von Hippel | 2 | 1000 | 7.6 | 0.272 | 95.29 | 0.151 | 95.26 |
Times shown indicate median execution time for each method on one dataset. MAR, missing at random; CI, confidence interval; CI cov., confidence interval coverage; MI, multiple imputation.
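The "Boot MI percentile" rows above bootstrap the incomplete data first and impute within each bootstrap sample. A minimal sketch of that ordering follows, under stated assumptions: the B bootstrap-level estimates are taken to be the averages of the M within-sample estimates, and the interval is their empirical percentile range. The names `boot_mi_percentile`, `hot_deck_impute` and `mean_estimate` are hypothetical, and the hot-deck imputer is only a toy stand-in for a real imputation model.

```python
import random
from statistics import mean


def hot_deck_impute(sample, rng):
    """Toy imputer (assumption): fill each None with a random observed value."""
    observed = [x for x in sample if x is not None]
    if not observed:
        raise ValueError("bootstrap sample contains no observed values")
    return [x if x is not None else rng.choice(observed) for x in sample]


def mean_estimate(completed):
    """Analysis step for this illustration: the sample mean."""
    return mean(completed)


def boot_mi_percentile(data, impute, estimate, n_boot=200, n_imp=2,
                       alpha=0.05, seed=0):
    """Bootstrap the incomplete data, impute each resample, take percentiles."""
    rng = random.Random(seed)
    n = len(data)
    boot_estimates = []
    for _ in range(n_boot):
        # 1. Resample the incomplete data with replacement.
        sample = [data[rng.randrange(n)] for _ in range(n)]
        # 2. Impute the resample n_imp times and average the estimates.
        ests = [estimate(impute(sample, rng)) for _ in range(n_imp)]
        boot_estimates.append(mean(ests))

    # 3. Percentile interval from the sorted bootstrap-level estimates.
    boot_estimates.sort()
    lo_idx = max(0, min(n_boot - 1, round(alpha / 2 * n_boot)))
    hi_idx = max(0, min(n_boot - 1, round((1 - alpha / 2) * n_boot) - 1))
    point = mean(boot_estimates)
    return point, (boot_estimates[lo_idx], boot_estimates[hi_idx])


toy = [2.1, None, 3.4, 2.8, None, 3.0, 2.5, 3.9, None, 2.2, 3.1, 2.7]
point, (lo, hi) = boot_mi_percentile(toy, hot_deck_impute, mean_estimate)
print(point, lo, hi)
```

The cost scales with the product of the two counts, which matches the pattern in the timing column: for the same number of bootstraps, the 10-imputation variant takes several times longer than the 1- or 2-imputation variants.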