R P Cornish, J Macleod, J R Carpenter, K Tilling.
Abstract
BACKGROUND: When an outcome variable is missing not at random (MNAR: the probability of missingness depends on the outcome values), estimates of the effect of an exposure on this outcome are often biased. We investigated the extent of this bias and examined whether it can be reduced by incorporating proxy outcomes, obtained through linkage to administrative data, as auxiliary variables in multiple imputation (MI).
Keywords: ALSPAC; Bias; Breastfeeding; Data linkage; IQ; Missing data; Multiple imputation; Simulation study
Year: 2017 PMID: 29270206 PMCID: PMC5735815 DOI: 10.1186/s12982-017-0068-0
Source DB: PubMed Journal: Emerg Themes Epidemiol ISSN: 1742-7622
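The MNAR mechanism described in the abstract can be illustrated with a small simulation. This is a minimal sketch and not the authors' code: the sample size, the single binary exposure, the effect size of 0.2, the baseline observation probability of 0.20 and the linear form of the observation probability are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(2017)

n = 400_000        # assumed sample size, large so that bias exceeds noise
true_effect = 0.2  # assumed exposure effect on the outcome (in SD units)

# Hypothetical binary exposure and a continuous outcome that depends on it.
exposure = rng.binomial(1, 0.5, size=n)
outcome = true_effect * exposure + rng.normal(0.0, 1.0, size=n)

# MNAR mechanism: Pr(outcome observed) rises by 0.10 per SD of the outcome,
# around a baseline of 0.20 (roughly 80% missing). Clipping keeps it in [0, 1].
outcome_std = (outcome - outcome.mean()) / outcome.std()
p_observed = np.clip(0.20 + 0.10 * outcome_std, 0.0, 1.0)
observed = rng.random(n) < p_observed


def mean_difference(y, x):
    """Difference in mean outcome between exposed and unexposed groups."""
    return y[x == 1].mean() - y[x == 0].mean()


full_data_estimate = mean_difference(outcome, exposure)
complete_records_estimate = mean_difference(outcome[observed], exposure[observed])

print(f"proportion observed:       {observed.mean():.3f}")
print(f"full-data estimate:        {full_data_estimate:.3f}")
print(f"complete-records estimate: {complete_records_estimate:.3f}")
```

Under these assumptions the complete-records estimate is pulled towards zero, because subjects with low outcome values are less likely to be observed and this selection acts more strongly in the group with the lower mean. The size of the bias in this toy setting is not comparable with the values reported in the tables, which come from a different data-generating model.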
Table 1 Scenarios investigated in the simulations (each investigated with 20, 40, 60 and 80% missing outcome data)
| Factor 3: Change in Pr(IQ observed) for one SD increase in IQ | Factor 2: Correlation between IQ and linked attainment score (KS4) | |||||||
|---|---|---|---|---|---|---|---|---|
| 0.1 | 0.3 | 0.5 | 0.7 | 0.9 | ||||
| Main set of scenarios (each at 20, 40, 60, 80% missing IQ (factor 1)): 64 scenarios | ||||||||
| 0 | ✓ | |||||||
| 0.05 | ✓ | ✓ | ✓ | ✓ | ✓ | |||
| 0.1 | ✓ | ✓ | ✓ | ✓ | ✓ | |||
| 0.2 | ✓ | ✓ | ✓ | ✓ | ✓ | |||
| Secondary sets of scenarios (each at 20, 40, 60, 80% missing IQ (factor 1)): 36 scenarios | ||||||||
| Missing linked data | Factor 5: Change in Pr(KS4 observed) for one SD increase in KS4 | Factor 4: Association between IQ and Pr(IQ observed) dependent on breastfeeding? | ||||||
| No | – | Yes | 0.1 (a) | ✓ | ✓ | ✓ | ✓ | ✓ |
| Yes, 20% | − 0.1 | No | 0.1 | ✓ | ✓ | |||
| + 0.1 | 0.1 | ✓ | ✓ | |||||
(a) In the baseline breastfeeding group; the coefficient is reduced by 0.025 for each consecutive breastfeeding group
Table 2 Availability of additional linked attainment data (KS2 and KS3) according to presence or absence of KS4 data among the 13,975 ALSPAC subjects included in this analysis
| KS4 data available | KS3 data available | KS2 data available: Yes | KS2 data available: No |
|---|---|---|---|
| Yes (n = 11,414) | Yes | 9152 | 339 |
| Yes (n = 11,414) | No | 1511 | 412 |
| No | Yes | 79 | 10 |
| No | No | 473 | 1999 |
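The cell counts in the availability table can be checked against the totals quoted in the caption and the KS4 row label. The short sketch below only re-adds the published counts; the variable names are illustrative.

```python
import numpy as np

# Cell counts from the table: rows are KS3 (yes, no), columns are KS2 (yes, no).
ks4_yes = np.array([[9152, 339],
                    [1511, 412]])
ks4_no = np.array([[79, 10],
                   [473, 1999]])

n_ks4_yes = int(ks4_yes.sum())
n_ks4_no = int(ks4_no.sum())
n_total = n_ks4_yes + n_ks4_no

# Subjects with no linked attainment data at all: KS4 = no, KS3 = no, KS2 = no.
n_no_linked = int(ks4_no[1, 1])
n_any_linked = n_total - n_no_linked

print(n_ks4_yes, n_ks4_no, n_total, n_any_linked)
```

The KS4 = yes cells sum to 11,414 and all eight cells sum to 13,975, matching the figures in the table. By subtraction, 2561 subjects had no KS4 data and 11,976 had at least one linked attainment score.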
Table 3 Relationship between duration of breastfeeding and IQ at age 15
| Model | Duration of breastfeeding | Complete records analysis (n = 4152): difference in mean IQ (95% CI) | MI using only KS4 attainment (b) (n = 13,975): difference in mean IQ (95% CI) | MI using only KS4: gain in precision (d) (%) | MI using only KS4: FMI (e) (%) | MI using KS2, KS3 and KS4 attainment (c) (n = 13,975): difference in mean IQ (95% CI) | MI using KS2, KS3 and KS4: gain in precision (d) (%) | MI using KS2, KS3 and KS4: FMI (e) (%) |
|---|---|---|---|---|---|---|---|---|
| Unadjusted results | Never/< 1 month | – | – | – | – | – | – | – |
| | 1 to < 3 months | 1.9 (0.6, 3.2) | 3.2 (2.2, 4.3) | 47 | 60 | 3.4 (2.4, 4.4) | 65 | 55 |
| | 3 to < 6 months | 5.1 (4.0, 6.3) | 6.6 (5.6, 7.6) | 36 | 58 | 6.8 (5.9, 7.8) | 43 | 56 |
| | 6 months + | 7.5 (6.6, 8.5) | 9.3 (8.5, 10.1) | 38 | 59 | 9.6 (8.8, 10.3) | 58 | 53 |
| Adjusted (a) results | Never/< 1 month | – | – | – | – | – | – | – |
| | 1 to < 3 months | 0.8 (−0.4, 2.0) | 1.3 (0.3, 2.4) | 36 | 63 | 1.5 (0.5, 2.4) | 54 | 58 |
| | 3 to < 6 months | 2.6 (1.5, 3.7) | 3.2 (2.2, 4.2) | 26 | 61 | 3.4 (2.4, 4.3) | 27 | 62 |
| | 6 months + | (2.5, 4.4) | 4.2 (3.4, 5.0) | 36 | 58 | 4.4 (3.6, 5.2) | 39 | 57 |
(a) Adjusted for sex, maternal and paternal education, occupational social class, parity, maternal age, ethnicity, family adversity index, smoking in pregnancy and housing tenure during pregnancy
(b) IQ was predicted from KS4 points cubed (the best-fitting fractional polynomial of degree 1) plus all other factors. The imputation model for IQ also included an interaction between KS4 points cubed and mother’s education
(c) IQ was additionally predicted from KS2 points squared and KS3 points squared
(d) Relative to complete records analysis
(e) FMI: fraction of missing information
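The pooled estimates and the fraction of missing information (FMI) reported for the MI analyses are the kind of quantities produced by Rubin's rules. The sketch below shows one common formulation; it is an assumption that the authors' software used these formulas, and the five input estimates are hypothetical numbers rather than results from the paper.

```python
import numpy as np


def pool_rubin(estimates, variances):
    """Pool m point estimates and their squared standard errors."""
    q = np.asarray(estimates, dtype=float)
    u = np.asarray(variances, dtype=float)
    m = q.size

    pooled = q.mean()                 # pooled point estimate
    within = u.mean()                 # average within-imputation variance
    between = q.var(ddof=1)           # between-imputation variance
    total = within + (1.0 + 1.0 / m) * between

    # Relative increase in variance due to missingness, and degrees of freedom.
    # (Assumes between > 0; identical estimates would need special handling.)
    r = (1.0 + 1.0 / m) * between / within
    df = (m - 1) * (1.0 + 1.0 / r) ** 2

    # Large-m approximation (lambda) and a finite-m version of the FMI.
    lam = (1.0 + 1.0 / m) * between / total
    fmi = (r + 2.0 / (df + 3.0)) / (1.0 + r)
    return {"estimate": pooled, "total_variance": total, "lambda": lam, "fmi": fmi}


# Hypothetical results from m = 5 imputed datasets (not taken from the paper).
result = pool_rubin([1.0, 1.2, 0.8, 1.1, 0.9], [0.04] * 5)
print(result)
```

For these made-up inputs the between-imputation variance is 0.025, the total variance is 0.07, and roughly 43% to 47% of the information is attributed to the missing data, depending on which of the two FMI expressions is used.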
Table 4 Results when IQ is simulated as MAR (factor 3 in scenarios)
| Scenario (factors 1 and 2) | Complete records: estimate (empirical SE) | Complete records: % bias | Complete records: MSE | MI including KS4: estimate (empirical SE) | MI including KS4: % bias | MI including KS4: MSE | MI including KS4: gain in precision (a) (%) | MI including KS4: FMI (%) |
|---|---|---|---|---|---|---|---|---|
| IQ 20% missing | 0.1005 (0.033) | 0.5 | 0.001 | 0.1004 (0.031) | 0.3 | 0.001 | 10 | 15 |
| | 0.1990 (0.030) | −0.5 | 0.0009 | 0.1993 (0.029) | −0.3 | 0.0008 | 11 | 13 |
| | 0.3006 (0.025) | 0.2 | 0.0006 | 0.3002 (0.024) | 0.1 | 0.0006 | 7 | 13 |
| IQ 40% missing | 0.0994 (0.038) | −0.6 | 0.001 | 0.1004 (0.034) | 0.3 | 0.001 | 22 | 30 |
| Correlation(IQ:KS4) = 0.7 | 0.1988 (0.035) | −0.6 | 0.001 | 0.1996 (0.033) | −0.2 | 0.001 | 17 | 28 |
| | 0.3004 (0.030) | 0.1 | 0.0009 | 0.3004 (0.027) | 0.1 | 0.0008 | 17 | 29 |
| IQ 60% missing | 0.1005 (0.049) | 0.5 | 0.002 | 0.1009 (0.042) | 1.1 | 0.002 | 34 | 50 |
| Correlation(IQ:KS4) = 0.7 | 0.1975 (0.042) | −1.3 | 0.002 | 0.1980 (0.037) | −0.8 | 0.001 | 33 | 47 |
| | 0.2988 (0.037) | −0.4 | 0.001 | 0.3002 (0.032) | 0.1 | 0.001 | 33 | 48 |
| IQ 80% missing | 0.1040 (0.073) | 4.0 | 0.005 | 0.1050 (0.061) | 4.8 | 0.004 | 41 | 83 |
| Correlation(IQ:KS4) = 0.7 | 0.2009 (0.062) | 0.4 | 0.004 | 0.2000 (0.052) | 0 | 0.003 | 40 | 81 |
| | 0.3011 (0.056) | 0.4 | 0.003 | 0.3022 (0.046) | 0.8 | 0.002 | 47 | 81 |
MSE: mean squared error; FMI: fraction of missing information
(a) Relative to complete records analysis
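The MAR results above compare a complete-records analysis with MI that includes a linked proxy as an auxiliary variable. The sketch below shows the mechanics of such an imputation under stated assumptions: a single binary exposure, a proxy with correlation 0.7 to the outcome, missingness that depends only on the exposure, and a normal linear imputation model with approximate posterior draws. None of these choices is taken from the authors' simulation code.

```python
import numpy as np

rng = np.random.default_rng(42)
n, true_effect, rho, m = 50_000, 0.2, 0.7, 20   # all assumed values

exposure = rng.binomial(1, 0.5, size=n)
outcome = true_effect * exposure + rng.normal(size=n)
outcome_std = (outcome - outcome.mean()) / outcome.std()
# Proxy constructed so that its correlation with the outcome is about rho.
proxy = rho * outcome_std + np.sqrt(1.0 - rho**2) * rng.normal(size=n)

# MAR: missingness depends only on the fully observed exposure (~40% missing).
observed = rng.random(n) < np.where(exposure == 1, 0.5, 0.7)

# Imputation model: outcome ~ intercept + exposure + proxy, fitted to observed rows.
design = np.column_stack([np.ones(n), exposure, proxy])
z_obs, y_obs = design[observed], outcome[observed]
xtx_inv = np.linalg.inv(z_obs.T @ z_obs)
beta_hat = xtx_inv @ z_obs.T @ y_obs
resid = y_obs - z_obs @ beta_hat
df = z_obs.shape[0] - z_obs.shape[1]
n_missing = int((~observed).sum())


def effect_and_variance(y, x):
    """Difference in means and its estimated sampling variance."""
    y1, y0 = y[x == 1], y[x == 0]
    return y1.mean() - y0.mean(), y1.var(ddof=1) / y1.size + y0.var(ddof=1) / y0.size


estimates, variances = [], []
for _ in range(m):
    # Draw imputation-model parameters from their approximate posterior.
    sigma2 = resid @ resid / rng.chisquare(df)
    beta = rng.multivariate_normal(beta_hat, sigma2 * xtx_inv)
    completed = outcome.copy()
    noise = np.sqrt(sigma2) * rng.normal(size=n_missing)
    completed[~observed] = design[~observed] @ beta + noise  # overwrite missing values
    est, var = effect_and_variance(completed, exposure)
    estimates.append(est)
    variances.append(var)

# Rubin's rules: pooled estimate and total variance.
mi_estimate = np.mean(estimates)
mi_variance = np.mean(variances) + (1.0 + 1.0 / m) * np.var(estimates, ddof=1)
cr_estimate, cr_variance = effect_and_variance(outcome[observed], exposure[observed])

print(f"complete records: {cr_estimate:.3f} (variance {cr_variance:.6f})")
print(f"MI with proxy:    {mi_estimate:.3f} (variance {mi_variance:.6f})")
```

Because the data here are MAR and the imputation model matches the data-generating model, both analyses should recover an effect near 0.2. Any difference between the two variances in a single run is subject to Monte Carlo noise, which is why the source averages over many simulated datasets.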
Table 5 Results for IQ MNAR: difference in Pr(IQ observed) = 0.10 for a one-SD increase in IQ (factor 3)
| Scenario (factors 1 and 2) | Complete records: estimate (empirical SE) | Complete records: % bias | Complete records: MSE | MI including KS4: estimate (empirical SE) | MI including KS4: % bias | MI including KS4: MSE | MI including KS4: gain in precision (%) | MI including KS4: FMI (%) |
|---|---|---|---|---|---|---|---|---|
| IQ 20% missing | 0.08 (0.034) | −17 | 0.001 | 0.08 (0.034) | −16 | 0.001 | −0.1 | 24 |
| Correlation(IQ:KS4) = 0.1 | 0.17 (0.030) | −14 | 0.002 | 0.17 (0.030) | −14 | 0.002 | 1 | 21 |
| | 0.26 (0.025) | −12 | 0.002 | 0.26 (0.025) | −12 | 0.002 | 0.2 | 21 |
| IQ 20% missing | As above | | | 0.08 (0.034) | −15 | 0.001 | 2 | 23 |
| Correlation(IQ:KS4) = 0.3 | | | | 0.18 (0.030) | −13 | 0.001 | 4 | 20 |
| | | | | 0.27 (0.025) | −11 | 0.002 | 1 | 20 |
| IQ 20% missing | As above | | | 0.09 (0.033) | −13 | 0.001 | 6 | 20 |
| Correlation(IQ:KS4) = 0.5 | | | | 0.18 (0.029) | −11 | 0.001 | 8 | 18 |
| | | | | 0.27 (0.025) | −9 | 0.001 | 4 | 18 |
| IQ 20% missing | As above | | | 0.09 (0.032) | −9 | 0.001 | 14 | 16 |
| Correlation(IQ:KS4) = 0.7 | | | | 0.19 (0.028) | −7 | 0.001 | 14 | 13 |
| | | | | 0.28 (0.024) | −6 | 0.001 | 9 | 13 |
| IQ 20% missing | As above | | | 0.10 (0.031) | −4 | 0.0009 | 27 | 7 |
| Correlation(IQ:KS4) = 0.9 | | | | 0.19 (0.027) | −3 | 0.0008 | 21 | 6 |
| | | | | 0.29 (0.023) | −2 | 0.0006 | 18 | 6 |
| IQ 40% missing | 0.07 (0.041) | −29 | 0.003 | 0.07 (0.041) | −28 | 0.002 | −0.2 | 45 |
| Correlation(IQ:KS4) = 0.1 | 0.16 (0.035) | −20 | 0.003 | 0.16 (0.035) | −20 | 0.003 | −0.5 | 41 |
| | 0.26 (0.029) | −15 | 0.003 | 0.26 (0.029) | −14 | 0.003 | 0.5 | 42 |
| IQ 40% missing | As above | | | 0.07 (0.040) | −26 | 0.002 | 4 | 43 |
| Correlation(IQ:KS4) = 0.3 | | | | 0.16 (0.035) | −19 | 0.003 | 2 | 40 |
| | | | | 0.26 (0.029) | −14 | 0.002 | 3 | 40 |
| IQ 40% missing | As above | | | 0.08 (0.039) | −23 | 0.002 | 12 | 39 |
| Correlation(IQ:KS4) = 0.5 | | | | 0.17 (0.034) | −16 | 0.002 | 8 | 36 |
| | | | | 0.27 (0.028) | −12 | 0.002 | 9 | 37 |
| IQ 40% missing | As above | | | 0.08 (0.037) | −17 | 0.002 | 27 | 32 |
| Correlation(IQ:KS4) = 0.7 | | | | 0.18 (0.032) | −11 | 0.002 | 20 | 28 |
| | | | | 0.28 (0.027) | −8 | 0.001 | 22 | 29 |
| IQ 40% missing | As above | | | 0.09 (0.032) | −7 | 0.001 | 56 | 16 |
| Correlation(IQ:KS4) = 0.9 | | | | 0.19 (0.029) | −5 | 0.0009 | 44 | 14 |
| | | | | 0.29 (0.025) | −3 | 0.0007 | 50 | 14 |
| IQ 60% missing | 0.03 (0.049) | −74 | 0.008 | 0.03 (0.050) | −73 | 0.008 | −1 | 65 |
| Correlation(IQ:KS4) = 0.1 | 0.10 (0.043) | −49 | 0.011 | 0.10 (0.043) | −49 | 0.011 | −1 | 62 |
| | 0.19 (0.037) | −36 | 0.013 | 0.19 (0.037) | −35 | 0.013 | −1 | 63 |
| IQ 60% missing | As above | | | 0.03 (0.049) | −68 | 0.007 | 0.6 | 64 |
| Correlation(IQ:KS4) = 0.3 | | | | 0.11 (0.042) | −46 | 0.01 | 1 | 61 |
| | | | | 0.20 (0.037) | −33 | 0.011 | 2 | 61 |
| IQ 60% missing | As above | | | 0.04 (0.047) | −58 | 0.006 | 8 | 60 |
| Correlation(IQ:KS4) = 0.5 | | | | 0.12 (0.041) | −39 | 0.008 | 11 | 57 |
| | | | | 0.22 (0.035) | −28 | 0.008 | 12 | 57 |
| IQ 60% missing | As above | | | 0.06 (0.044) | −42 | 0.004 | 28 | 52 |
| Correlation(IQ:KS4) = 0.7 | | | | 0.14 (0.037) | −28 | 0.004 | 31 | 48 |
| | | | | 0.24 (0.032) | −20 | 0.005 | 32 | 48 |
| IQ 60% missing | As above | | | 0.08 (0.036) | −17 | 0.002 | 84 | 30 |
| Correlation(IQ:KS4) = 0.9 | | | | 0.18 (0.032) | −11 | 0.001 | 83 | 27 |
| | | | | 0.28 (0.027) | −8 | 0.001 | 86 | 28 |
| IQ 80% missing | −0.14 (0.068) | −237 | 0.06 | −0.14 (0.069) | −236 | 0.06 | −3 | 86 |
| Correlation(IQ:KS4) = 0.1 | −0.13 (0.062) | −165 | 0.11 | −0.13 (0.062) | −164 | 0.11 | −0.5 | 85 |
| | −0.05 (0.052) | −116 | 0.12 | −0.05 (0.053) | −115 | 0.12 | −1 | 85 |
| IQ 80% missing | As above | | | −0.12 (0.068) | −223 | 0.05 | −0.3 | 85 |
| Correlation(IQ:KS4) = 0.3 | | | | −0.11 (0.060) | −155 | 0.1 | 6 | 84 |
| | | | | −0.03 (0.051) | −109 | 0.11 | 5 | 84 |
| IQ 80% missing | As above | | | −0.09 (0.065) | −194 | 0.04 | 9 | 84 |
| Correlation(IQ:KS4) = 0.5 | | | | −0.07 (0.056) | −134 | 0.08 | 21 | 81 |
| | | | | 0.02 (0.048) | −94 | 0.08 | 19 | 82 |
| IQ 80% missing | As above | | | −0.04 (0.059) | −143 | 0.02 | 36 | 79 |
| Correlation(IQ:KS4) = 0.7 | | | | 0.002 (0.050) | −99 | 0.04 | 56 | 76 |
| | | | | 0.09 (0.043) | −70 | 0.05 | 50 | 77 |
| IQ 80% missing | As above | | | 0.04 (0.044) | −58 | 0.006 | 140 | 59 |
| Correlation(IQ:KS4) = 0.9 | | | | 0.12 (0.038) | −41 | 0.008 | 170 | 55 |
| | | | | 0.21 (0.033) | −29 | 0.009 | 156 | 57 |
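The performance measures in the simulation tables (% bias, empirical SE, MSE and gain in precision) can be computed from the per-dataset estimates as sketched below. Two points are assumptions rather than statements from the source: gain in precision is taken to be the percentage increase in 1/variance relative to the complete-records analysis, and the true coefficients are taken to be 0.1, 0.2 and 0.3 for the three rows in each scenario, which is what the reported % bias values imply.

```python
import numpy as np


def percent_bias(estimates, true_value):
    """Bias of the mean estimate as a percentage of the true value."""
    return 100.0 * (np.mean(estimates) - true_value) / true_value


def empirical_se(estimates):
    """Standard deviation of the estimates across simulated datasets."""
    return np.std(estimates, ddof=1)


def mean_squared_error(estimates, true_value):
    """Average squared distance of the estimates from the true value."""
    return np.mean((np.asarray(estimates, dtype=float) - true_value) ** 2)


def gain_in_precision(se_reference, se_new):
    """Assumed definition: percentage increase in precision (1 / variance)."""
    return 100.0 * ((se_reference / se_new) ** 2 - 1.0)


# Rounded values reported for 80% missing IQ, first coefficient:
# complete-records empirical SE 0.068 versus MI (correlation 0.9) SE 0.044.
gain = gain_in_precision(0.068, 0.044)
print(round(gain))   # close to the reported 140

# Mean estimate -0.14 against an inferred true value of 0.1.
bias = percent_bias([-0.14], 0.1)
print(round(bias))   # close to the reported -237
```

Recomputing from the rounded table entries gives about 139% and −240%, against reported values of 140% and −237%. The small discrepancies are what rounding of the published estimates and SEs would produce, which supports but does not prove the assumed definitions.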
Table 6 Results when the linked attainment score is MNAR with 20% of linked data missing (correlation between linked attainment score and IQ = 0.7), for different values of the difference in Pr(KS4 observed) for a one-SD increase in KS4 (diff Pr(KS4obs); factor 5 in scenarios)
| Scenario [in each case: IQ MNAR (diff Pr(IQ obs) = 0.10), correlation(IQ:KS4) = 0.7, linked attainment = 20% missing] | Complete records (a): estimate (empirical SE) | Complete records (a): % bias | Complete records (a): MSE | MI including KS4: estimate (empirical SE) | MI including KS4: % bias | MI including KS4: MSE | MI including KS4: gain in precision (%) | MI including KS4: FMI (%) |
|---|---|---|---|---|---|---|---|---|
| IQ 20% missing | 0.08 (0.034) | −17 | 0.001 | 0.09 (0.033) | −7 | 0.001 | 8 | 17 |
| Diff Pr(KS4obs) = −0.10 | 0.17 (0.030) | −14 | 0.002 | 0.19 (0.030) | −7 | 0.001 | 6 | 14 |
| | 0.26 (0.025) | −12 | 0.002 | 0.28 (0.026) | −6 | 0.001 | 7 | 15 |
| IQ 20% missing | As above | | | 0.09 (0.033) | −12 | 0.001 | 7 | 18 |
| Diff Pr(KS4obs) = +0.10 | | | | 0.18 (0.030) | −11 | 0.001 | 7 | 15 |
| | | | | 0.27 (0.025) | −9 | 0.001 | 9 | 15 |
| IQ 40% missing | 0.07 (0.041) | −29 | 0.003 | 0.09 (0.037) | −15 | 0.002 | 16 | 34 |
| Diff Pr(KS4obs) = −0.10 | 0.16 (0.035) | −20 | 0.003 | 0.18 (0.034) | −10 | 0.002 | 16 | 31 |
| | 0.26 (0.029) | −15 | 0.003 | 0.28 (0.027) | −8 | 0.001 | 17 | 32 |
| IQ 40% missing | As above | | | 0.08 (0.037) | −20 | 0.002 | 22 | 35 |
| Diff Pr(KS4obs) = +0.10 | | | | 0.17 (0.034) | −15 | 0.002 | 15 | 32 |
| | | | | 0.27 (0.028) | −11 | 0.002 | 13 | 32 |
| IQ 60% missing | 0.03 (0.049) | −74 | 0.008 | 0.06 (0.043) | −43 | 0.004 | 32 | 55 |
| Diff Pr(KS4obs) = −0.10 | 0.10 (0.043) | −49 | 0.01 | 0.14 (0.039) | −30 | 0.005 | 24 | 52 |
| | 0.19 (0.037) | −36 | 0.01 | 0.24 (0.032) | −21 | 0.005 | 29 | 52 |
| IQ 60% missing | As above | | | 0.05 (0.042) | −50 | 0.004 | 24 | 55 |
| Diff Pr(KS4obs) = +0.10 | | | | 0.13 (0.037) | −35 | 0.006 | 35 | 52 |
| | | | | 0.23 (0.031) | −25 | 0.006 | 33 | 52 |
| IQ 80% missing | −0.14 (0.068) | −237 | 0.06 | −0.06 (0.060) | −162 | 0.03 | 28 | 81 |
| Diff Pr(KS4obs) = −0.10 | −0.13 (0.062) | −165 | 0.11 | −0.02 (0.054) | −111 | 0.05 | 30 | 78 |
| | −0.05 (0.052) | −116 | 0.12 | 0.07 (0.047) | −78 | 0.06 | 25 | 80 |
| IQ 80% missing | As above | | | −0.06 (0.060) | −156 | 0.03 | 37 | 80 |
| Diff Pr(KS4obs) = +0.10 | | | | −0.01 (0.053) | −108 | 0.05 | 36 | 77 |
| | | | | 0.07 (0.045) | −76 | 0.05 | 30 | 79 |
(a) The complete records results are the same as those in Table 5 and are repeated here for comparison
Fig. 1 DAG illustrating a scenario in which inclusion of a proxy may increase bias