Evangelos Kontopantelis, Ian R White, Matthew Sperrin, Iain Buchan.
Abstract
BACKGROUND: Multiple imputation is frequently used to deal with missing data in healthcare research. Although it is known that the outcome should be included in the imputation model when imputing missing covariate values, it is not known whether the outcome itself should be imputed. Similarly, no clear recommendations exist on: the utility of incorporating a secondary outcome, if available, in the imputation model; the level of protection offered when data are missing not at random; the implications of the dataset size and missingness levels.
Keywords: Imputed outcome; Missing data; Missingness; Multiple imputation
Year: 2017; PMID: 28068910; PMCID: PMC5220613; DOI: 10.1186/s12874-016-0281-5
Source DB: PubMed; Journal: BMC Med Res Methodol; ISSN: 1471-2288; Impact factor: 4.615
Fig. 1 Data structure
Analysis methods
| Model | Description |
|---|---|
| A | complete case analysis (no multiple imputation [mi]) |
| B | no outcome imputation, outcome not included in mi model |
| Cᵃ | no outcome imputation, outcome included in mi model |
| Dᵃ | outcome imputed and included in mi model |
| Eᵃ | outcome imputed and included in mi model, but cases where it was imputed are then deleted |
| Fᵃ | as in C, but also including a second correlated outcome in the mi model |
| Gᵃ | as in D, but also including a second correlated outcome in the mi model |
| H | as in D, but the mi and analysis models do not include the covariate |

ᵃ Main models of interest; other models provided for comparison purposes
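
The eight approaches differ only in a handful of yes/no choices, which can be made explicit as a small lookup structure. The sketch below is illustrative only: the class name, field names and flag encoding are hypothetical, and model C is read here as "outcome included in the mi model but not imputed", which is an interpretation of the table rather than something the source spells out in code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Approach:
    """Hypothetical encoding of one analysis approach (field names are assumptions)."""
    uses_mi: bool = True                 # False only for complete case analysis
    outcome_in_mi_model: bool = True     # outcome used as a predictor in the mi model
    outcome_imputed: bool = False        # missing outcome values are themselves imputed
    drop_imputed_outcomes: bool = False  # delete rows with imputed outcome before analysis
    second_outcome: bool = False         # second correlated outcome added to the mi model
    covariate_included: bool = True      # covariate present in mi and analysis models
    main_interest: bool = False          # flagged as a main model of interest


APPROACHES = {
    "A": Approach(uses_mi=False, outcome_in_mi_model=False),
    "B": Approach(outcome_in_mi_model=False),
    "C": Approach(main_interest=True),
    "D": Approach(outcome_imputed=True, main_interest=True),
    "E": Approach(outcome_imputed=True, drop_imputed_outcomes=True, main_interest=True),
    "F": Approach(second_outcome=True, main_interest=True),
    "G": Approach(outcome_imputed=True, second_outcome=True, main_interest=True),
    "H": Approach(outcome_imputed=True, covariate_included=False),
}

# Labels of the models flagged as being of main interest.
main_models = [label for label, a in APPROACHES.items() if a.main_interest]
print(main_models)
```

Encoding the approaches this way makes the pairings visible: F is C plus a second outcome, G is D plus a second outcome, and E is D followed by deletion of the rows whose outcome was imputed.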
Performance results for exposure E, datasets of 1,000 observationsᵃ
| Mechanism | Measure | % miss | A | B | Cᵇ | Dᵇ | Eᵇ | Fᵇ | Gᵇ | H |
|---|---|---|---|---|---|---|---|---|---|---|
| MCAR | Mean biasᶜ | 20 | −0.019 | −0.107 | −0.020 | −0.025 | −0.030 | −0.021 | −0.023 | −0.427 |
| | | 40 | −0.038 | −0.209 | −0.041 | −0.045 | −0.020 | −0.041 | −0.045 | −0.437 |
| | | 60 | −0.084 | −0.301 | −0.064 | −0.056 | −0.060 | −0.067 | −0.055 | −0.440 |
| | | 80 | 0.134 | −0.409 | −0.109 | −0.092 | −0.145 | −0.126 | −0.097 | −0.458 |
| | Mean errorᶜ | 20 | 0.218 | 0.208 | 0.200 | 0.201 | 0.199 | 0.199 | 0.199 | 0.430 |
| | | 40 | 0.290 | 0.268 | 0.232 | 0.231 | 0.229 | 0.232 | 0.233 | 0.444 |
| | | 60 | 0.475 | 0.361 | 0.306 | 0.313 | 0.308 | 0.309 | 0.301 | 0.466 |
| | | 80 | 1.016 | 0.503 | 0.584 | 0.503 | 0.543 | 0.560 | 0.489 | 0.525 |
| | Coverage | 20 | 0.950 | 0.941 | 0.943 | 0.945 | 0.949 | 0.947 | 0.943 | 0.489 |
| | | 40 | 0.962 | 0.913 | 0.963 | 0.957 | 0.959 | 0.962 | 0.949 | 0.605 |
| | | 60 | 0.965 | 0.907 | 0.959 | 0.939 | 0.955 | 0.960 | 0.943 | 0.718 |
| | | 80 | 0.992 | 0.957 | 0.989 | 0.956 | 0.986 | 0.988 | 0.965 | 0.823 |
| | Power | 20 | 0.720 | 0.688 | 0.771 | 0.761 | 0.777 | 0.769 | 0.770 | 0.274 |
| | | 40 | 0.462 | 0.425 | 0.604 | 0.593 | 0.644 | 0.610 | 0.597 | 0.181 |
| | | 60 | 0.228 | 0.213 | 0.367 | 0.413 | 0.394 | 0.373 | 0.433 | 0.163 |
| | | 80 | 0.065 | 0.062 | 0.068 | 0.180 | 0.098 | 0.074 | 0.169 | 0.124 |
| MAR | Mean biasᶜ | 20 | −0.019 | −0.125 | −0.030 | −0.035 | −0.030 | −0.030 | −0.035 | −0.425 |
| | | 40 | −0.014 | −0.227 | −0.046 | −0.065 | −0.044 | −0.050 | −0.062 | −0.441 |
| | | 60 | −0.046 | −0.308 | −0.063 | −0.064 | −0.049 | −0.059 | −0.062 | −0.446 |
| | | 80 | −0.054 | −0.350 | −0.069 | −0.119 | −0.080 | −0.073 | −0.128 | −0.408 |
| | Mean errorᶜ | 20 | 0.224 | 0.215 | 0.200 | 0.200 | 0.203 | 0.198 | 0.197 | 0.428 |
| | | 40 | 0.290 | 0.282 | 0.228 | 0.227 | 0.234 | 0.229 | 0.229 | 0.448 |
| | | 60 | 0.491 | 0.367 | 0.314 | 0.312 | 0.303 | 0.314 | 0.305 | 0.467 |
| | | 80 | 1.070 | 0.463 | 0.593 | 0.501 | 0.546 | 0.601 | 0.502 | 0.494 |
| | Coverage | 20 | 0.941 | 0.925 | 0.955 | 0.954 | 0.956 | 0.955 | 0.953 | 0.504 |
| | | 40 | 0.958 | 0.896 | 0.953 | 0.956 | 0.950 | 0.958 | 0.948 | 0.561 |
| | | 60 | 0.955 | 0.889 | 0.963 | 0.948 | 0.971 | 0.964 | 0.946 | 0.706 |
| | | 80 | 0.994 | 0.943 | 0.978 | 0.954 | 0.978 | 0.978 | 0.946 | 0.836 |
| | Power | 20 | 0.708 | 0.678 | 0.779 | 0.763 | 0.771 | 0.774 | 0.779 | 0.271 |
| | | 40 | 0.448 | 0.372 | 0.583 | 0.564 | 0.606 | 0.587 | 0.593 | 0.193 |
| | | 60 | 0.224 | 0.195 | 0.350 | 0.391 | 0.373 | 0.360 | 0.404 | 0.161 |
| | | 80 | 0.010 | 0.052 | 0.050 | 0.147 | 0.077 | 0.045 | 0.143 | 0.123 |
| MNAR | Mean biasᶜ | 20 | −0.026 | −0.150 | −0.047 | −0.044 | −0.054 | −0.047 | −0.046 | −0.425 |
| | | 40 | −0.092 | −0.302 | −0.135 | −0.113 | −0.097 | −0.139 | −0.121 | −0.454 |
| | | 60 | −0.086 | −0.334 | −0.050 | −0.022 | −0.027 | −0.052 | −0.021 | −0.450 |
| | | 80 | 0.038 | −0.478 | −0.253 | −0.283 | −0.314 | −0.316 | −0.275 | −0.484 |
| | Mean errorᶜ | 20 | 0.227 | 0.228 | 0.207 | 0.207 | 0.209 | 0.208 | 0.207 | 0.431 |
| | | 40 | 0.375 | 0.371 | 0.302 | 0.293 | 0.292 | 0.304 | 0.293 | 0.472 |
| | | 60 | 0.590 | 0.411 | 0.366 | 0.368 | 0.361 | 0.371 | 0.355 | 0.479 |
| | | 80 | 1.283 | 0.741 | 0.841 | 0.737 | 0.773 | 0.881 | 0.711 | 0.654 |
| | Coverage | 20 | 0.954 | 0.923 | 0.940 | 0.945 | 0.941 | 0.944 | 0.943 | 0.554 |
| | | 40 | 0.957 | 0.927 | 0.958 | 0.950 | 0.961 | 0.954 | 0.949 | 0.708 |
| | | 60 | 0.973 | 0.962 | 0.967 | 0.951 | 0.964 | 0.968 | 0.946 | 0.746 |
| | | 80 | 0.997 | 1.000 | 0.993 | 0.990 | 0.999 | 0.995 | 0.986 | 0.921 |
| | Power | 20 | 0.652 | 0.592 | 0.714 | 0.712 | 0.694 | 0.706 | 0.713 | 0.237 |
| | | 40 | 0.307 | 0.223 | 0.383 | 0.398 | 0.420 | 0.389 | 0.399 | 0.153 |
| | | 60 | 0.209 | 0.148 | 0.330 | 0.368 | 0.366 | 0.327 | 0.382 | 0.152 |
| | | 80 | 0.015 | 0.010 | 0.041 | 0.088 | 0.039 | 0.030 | 0.096 | 0.092 |
ᵃ Analysis model A: complete case analysis (no multiple imputation [mi]); B: no outcome imputation, outcome not included in mi model; C: no outcome imputation, outcome included in mi model; D: outcome imputed and included in mi model; E: outcome imputed and included in mi model, but observations where it was imputed are then deleted; F: as in C, but also including a second correlated outcome in the mi model; G: as in D, but also including a second correlated outcome in the mi model; H: as in D, but the mi and analysis models do not include the covariate
ᵇ Main models of interest; other models provided for comparison purposes
ᶜ Reported on the log-odds scale and based on a true effect of log(2)
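
The four performance measures in the table can be computed from the per-replicate estimates of a simulation study. The following is a minimal sketch under stated assumptions, not the authors' code: it assumes each replicate yields a log-odds estimate and a standard error, uses a normal-approximation 95% interval, takes "mean error" to be the mean absolute error (as in the Fig. 3 caption), and defines power as the proportion of intervals excluding zero. The function and key names are hypothetical.

```python
import math

TRUE_EFFECT = math.log(2)  # true log-odds effect stated in the table footnote
Z_95 = 1.959963984540054   # standard normal quantile for a two-sided 95% interval


def performance(estimates, std_errors, true_effect=TRUE_EFFECT):
    """Summarise simulation replicates (assumed definitions, see lead-in)."""
    n = len(estimates)
    if n == 0 or n != len(std_errors):
        raise ValueError("need equally long, non-empty lists")

    covered = 0      # intervals that contain the true effect
    significant = 0  # intervals that exclude zero
    for est, se in zip(estimates, std_errors):
        lower, upper = est - Z_95 * se, est + Z_95 * se
        if lower <= true_effect <= upper:
            covered += 1
        if lower > 0 or upper < 0:
            significant += 1

    return {
        "mean_bias": sum(e - true_effect for e in estimates) / n,
        "mean_abs_error": sum(abs(e - true_effect) for e in estimates) / n,
        "coverage": covered / n,
        "power": significant / n,
    }


# Four made-up replicates, purely for illustration.
result = performance([0.60, 0.75, 0.20, 0.90], [0.20, 0.25, 0.30, 0.20])
print(result)
```

Under these definitions a method can show near-zero mean bias and still have a large mean absolute error, because positive and negative deviations cancel in the bias but not in the absolute error.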
Performance results for exposure E, datasets of 10,000 observationsᵃ
| Mechanism | Measure | % miss | A | B | Cᵇ | Dᵇ | Eᵇ | Fᵇ | Gᵇ | H |
|---|---|---|---|---|---|---|---|---|---|---|
| MCAR | Mean biasᶜ | 20 | 0.001 | −0.089 | −0.003 | −0.008 | −0.006 | −0.003 | −0.006 | −0.413 |
| | | 40 | −0.005 | −0.181 | −0.016 | −0.024 | −0.009 | −0.016 | −0.021 | −0.419 |
| | | 60 | −0.017 | −0.268 | −0.027 | −0.035 | −0.017 | −0.028 | −0.032 | −0.421 |
| | | 80 | −0.021 | −0.343 | −0.023 | −0.036 | −0.033 | −0.027 | −0.032 | −0.418 |
| | Mean errorᶜ | 20 | 0.064 | 0.094 | 0.059 | 0.059 | 0.060 | 0.059 | 0.059 | 0.413 |
| | | 40 | 0.088 | 0.182 | 0.071 | 0.072 | 0.071 | 0.070 | 0.071 | 0.419 |
| | | 60 | 0.135 | 0.268 | 0.094 | 0.094 | 0.088 | 0.094 | 0.095 | 0.421 |
| | | 80 | 0.270 | 0.343 | 0.150 | 0.144 | 0.148 | 0.148 | 0.143 | 0.418 |
| | Coverage | 20 | 0.962 | 0.792 | 0.959 | 0.965 | 0.948 | 0.959 | 0.959 | 0.000 |
| | | 40 | 0.946 | 0.439 | 0.954 | 0.946 | 0.958 | 0.954 | 0.952 | 0.000 |
| | | 60 | 0.943 | 0.291 | 0.956 | 0.931 | 0.965 | 0.958 | 0.939 | 0.007 |
| | | 80 | 0.952 | 0.394 | 0.952 | 0.929 | 0.950 | 0.949 | 0.929 | 0.137 |
| | Power | 20 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.983 |
| | | 40 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.935 |
| | | 60 | 0.974 | 0.982 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.772 |
| | | 80 | 0.518 | 0.635 | 0.949 | 0.943 | 0.956 | 0.946 | 0.944 | 0.547 |
| MAR | Mean biasᶜ | 20 | 0.005 | −0.101 | −0.005 | −0.012 | −0.009 | −0.005 | −0.013 | −0.412 |
| | | 40 | 0.001 | −0.204 | −0.031 | −0.043 | −0.032 | −0.031 | −0.042 | −0.419 |
| | | 60 | −0.015 | −0.275 | −0.038 | −0.046 | −0.028 | −0.037 | −0.046 | −0.423 |
| | | 80 | 0.014 | −0.349 | −0.058 | −0.068 | −0.058 | −0.058 | −0.072 | −0.415 |
| | Mean errorᶜ | 20 | 0.064 | 0.104 | 0.059 | 0.058 | 0.060 | 0.058 | 0.058 | 0.412 |
| | | 40 | 0.087 | 0.204 | 0.072 | 0.075 | 0.074 | 0.072 | 0.075 | 0.419 |
| | | 60 | 0.135 | 0.275 | 0.096 | 0.098 | 0.092 | 0.095 | 0.097 | 0.423 |
| | | 80 | 0.304 | 0.349 | 0.160 | 0.159 | 0.153 | 0.158 | 0.161 | 0.415 |
| | Coverage | 20 | 0.965 | 0.727 | 0.963 | 0.967 | 0.962 | 0.965 | 0.964 | 0.000 |
| | | 40 | 0.952 | 0.334 | 0.958 | 0.940 | 0.947 | 0.951 | 0.935 | 0.000 |
| | | 60 | 0.948 | 0.237 | 0.950 | 0.924 | 0.953 | 0.950 | 0.918 | 0.004 |
| | | 80 | 0.944 | 0.356 | 0.935 | 0.918 | 0.943 | 0.941 | 0.926 | 0.161 |
| | Power | 20 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.987 |
| | | 40 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.944 |
| | | 60 | 0.977 | 0.982 | 0.999 | 1.000 | 1.000 | 1.000 | 1.000 | 0.792 |
| | | 80 | 0.486 | 0.613 | 0.905 | 0.897 | 0.920 | 0.912 | 0.902 | 0.537 |
| MNAR | Mean biasᶜ | 20 | 0.003 | −0.125 | −0.023 | −0.021 | −0.024 | −0.023 | −0.024 | −0.411 |
| | | 40 | −0.006 | −0.250 | −0.091 | −0.072 | −0.080 | −0.091 | −0.081 | −0.417 |
| | | 60 | −0.003 | −0.288 | −0.016 | 0.010 | 0.005 | −0.017 | 0.003 | −0.420 |
| | | 80 | −0.026 | −0.358 | −0.186 | −0.161 | −0.186 | −0.182 | −0.176 | −0.427 |
| | Mean errorᶜ | 20 | 0.067 | 0.128 | 0.063 | 0.063 | 0.066 | 0.063 | 0.063 | 0.411 |
| | | 40 | 0.112 | 0.250 | 0.113 | 0.102 | 0.107 | 0.113 | 0.107 | 0.417 |
| | | 60 | 0.150 | 0.288 | 0.103 | 0.105 | 0.106 | 0.104 | 0.103 | 0.420 |
| | | 80 | 0.456 | 0.367 | 0.253 | 0.237 | 0.258 | 0.252 | 0.241 | 0.428 |
| | Coverage | 20 | 0.952 | 0.643 | 0.948 | 0.947 | 0.935 | 0.949 | 0.947 | 0.000 |
| | | 40 | 0.960 | 0.349 | 0.893 | 0.908 | 0.886 | 0.887 | 0.893 | 0.002 |
| | | 60 | 0.952 | 0.394 | 0.955 | 0.944 | 0.937 | 0.961 | 0.947 | 0.017 |
| | | 80 | 0.967 | 0.943 | 0.981 | 0.916 | 0.971 | 0.981 | 0.916 | 0.360 |
| | Power | 20 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.973 |
| | | 40 | 0.998 | 0.985 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.826 |
| | | 60 | 0.942 | 0.898 | 0.999 | 0.997 | 1.000 | 0.996 | 0.999 | 0.738 |
| | | 80 | 0.256 | 0.094 | 0.358 | 0.527 | 0.394 | 0.364 | 0.519 | 0.331 |
ᵃ Analysis model A: complete case analysis (no multiple imputation [mi]); B: no outcome imputation, outcome not included in mi model; C: no outcome imputation, outcome included in mi model; D: outcome imputed and included in mi model; E: outcome imputed and included in mi model, but observations where it was imputed are then deleted; F: as in C, but also including a second correlated outcome in the mi model; G: as in D, but also including a second correlated outcome in the mi model; H: as in D, but the mi and analysis models do not include the covariate
ᵇ Main models of interest; other models provided for comparison purposes
ᶜ Reported on the log-odds scale and based on a true effect of log(2)
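
The three missingness mechanisms in the tables (MCAR, MAR, MNAR) differ in what the probability of a value being missing is allowed to depend on. The sketch below illustrates that distinction only; it does not reproduce the study's data-generating process. The variable roles (`x` as the value that may go missing, `y` as a fully observed binary outcome), the 1.5/0.5 multipliers and the function names are all assumptions made for illustration.

```python
import random


def missing_probability(mechanism, rate, x, y):
    """Probability that x is set to missing under an illustrative mechanism."""
    if mechanism == "MCAR":
        # Completely at random: depends on nothing in the data.
        return rate
    if mechanism == "MAR":
        # At random: depends only on the fully observed outcome y.
        return min(1.0, rate * (1.5 if y == 1 else 0.5))
    if mechanism == "MNAR":
        # Not at random: depends on the (possibly unobserved) value x itself.
        return min(1.0, rate * (1.5 if x > 0 else 0.5))
    raise ValueError(f"unknown mechanism: {mechanism}")


def apply_missingness(rows, mechanism, rate, seed=0):
    """Return a copy of rows [(x, y), ...] with some x replaced by None."""
    rng = random.Random(seed)
    result = []
    for x, y in rows:
        p = missing_probability(mechanism, rate, x, y)
        result.append((None if rng.random() < p else x, y))
    return result


# Illustrative data: standard normal x, fair-coin binary y.
data_rng = random.Random(1)
rows = [(data_rng.gauss(0, 1), data_rng.randint(0, 1)) for _ in range(5000)]

for mechanism in ("MCAR", "MAR", "MNAR"):
    masked = apply_missingness(rows, mechanism, rate=0.4)
    share = sum(x is None for x, _ in masked) / len(masked)
    print(mechanism, round(share, 3))
```

With these multipliers the realised overall missingness for MAR and MNAR need not equal `rate` exactly; a real simulation would calibrate the mechanism to hit the target percentage.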
Fig. 2 Mean bias and 95% confidence intervals for exposure E in datasets of 1,000 (top) and 10,000 observations (bottom)
Fig. 3 Mean absolute error and 95% confidence intervals for exposure E in datasets of 1,000 (top) and 10,000 observations (bottom)
Fig. 4 Coverage and 95% confidence intervals for exposure E in datasets of 1,000 (top) and 10,000 observations (bottom)
Fig. 5 Power and 95% confidence intervals for exposure E in datasets of 1,000 (top) and 10,000 observations (bottom)