Manisha Desai, Denise A Esserman, Marilie D Gammon, Mary B Terry.
Abstract
BACKGROUND: In molecular epidemiology studies, biospecimen data are collected, often to evaluate the synergistic effect of a biomarker and another feature on an outcome. Typically, biomarker data are collected on only a proportion of the subjects eligible for study, leading to a missing data problem. Missing data methods, however, are not customarily incorporated into analyses. Instead, complete-case (CC) analyses are performed, which can result in biased and inefficient estimates.
Year: 2011 PMID: 21978450 PMCID: PMC3217865 DOI: 10.1186/1742-5573-8-5
Source DB: PubMed Journal: Epidemiol Perspect Innov ISSN: 1742-5573
Description of Scenarios Used in Two Sets of Simulation Studies.
| Table | Set | Scenario | Median % Missing X1 | Nature of Missing | Auxiliary Relationship | Type of Missing Variable |
|---|---|---|---|---|---|---|
| 3a. Impact of Auxiliary Relationship Under Condition 1 | 1 | A | 22% | Condition 1 | None | Binary |
| | 1 | B | 22% | Condition 1 | Moderate | Binary |
| | 1 | C | 22% | Condition 1 | Strong | Binary |
| 3b. Impact of Auxiliary Relationship Under Condition 2 | 1 | D | 20% | Condition 2 | None | Continuous |
| | 1 | E | 20% | Condition 2 | Moderate | Continuous |
| | 1 | F | 20% | Condition 2 | Strong | Continuous |
| 3c. Impact of Auxiliary Relationship Under Condition 3 | 1 | G | 20% | Condition 3 | None | Continuous |
| | 1 | H | 20% | Condition 3 | Moderate | Continuous |
| | 1 | I | 20% | Condition 3 | Strong | Continuous |
| 4a. Impact of Conditions Under Set of Auxiliary Variables with Varying Strength When Missing Genotype Data | 2 | J | 21% | Condition 1 | Realistic | Binary |
| | 2 | K | 24% | Condition 3 | Realistic | Binary |
| 4b. Impact of Conditions Under Set of Auxiliary Variables with Varying Strength When Missing Exposure Data | 2 | L | 21% | Condition 1 | Realistic | Binary |
| | 2 | M | 21% | Condition 2 | Realistic | Ordinal |
(a) Condition 1: X1 is 12.2 times more likely to be missing if X1 = 1.
(b) Condition 2: Extreme values of X1 are more likely to be missing. The probability of missingness is a quadratic function of X1: the log odds of missing X1 is $\gamma_0 + \gamma_1 X_1 + \gamma_2 X_1^2$, where $\gamma_1 = -1$ and $\gamma_2 = 2$.
(c) Condition 3: A 1-unit increase in X1 corresponds to a 7.4 times decrease in the probability of missing for controls, but a 7.4 times increase for cases.
(d) Condition 1: Those with fast metabolizing genotype are 12.2 times more likely to be missing data on genotype. Missingness is also related to other observed covariates (mammogram, education, race, breastfeeding, oral contraceptive use, hormone therapy use, and smoking status) as informed by the real data set for all subjects.
(e) Condition 3: Exposed controls with fast genotype are 7.4 times more likely to be missing genotype than those without, while exposed cases with fast genotype are 7.4 times less likely to be missing genotype. Unexposed subjects with fast genotype are 2.7 times more likely to be missing genotype. Missingness is also related to other observed covariates (mammogram, education, race, breastfeeding, oral contraceptive use, hormone therapy use, and smoking status) as informed by the real data set for all subjects.
(f) Condition 2: Extreme values of exposure are more likely to be missing. The probability of missingness is a quadratic function of X1: the log odds of missing X1 is $\gamma_0 + \gamma_1 X_1 + \gamma_2 X_1^2$, where $\gamma_1 = -3$ and $\gamma_2 = 1$. Missingness is also related to other observed covariates (mammogram, education, race, breastfeeding, oral contraceptive use, hormone therapy use, and smoking status) as informed by the real data set for all subjects.
(1) Strong: Those with X1 = 1 have Z values (SD = 1) that are 3 units higher on average than those with X1 = 0.
(2) Strong: Average correlation between X1 and Z is 0.97.
(3) None: X1 and Z are independent variables.
(4) Moderate: Those with X1 = 1 have Z values (SD = 1) that are 1 unit higher on average than those with X1 = 0.
(5) Strong: Those with X1 = 1 have Z values (SD = 1) that are 4 units higher on average than those with X1 = 0.
(6) Moderate: Average correlation between X1 and Z is 0.57.
(7) Realistic: Auxiliary variables are informed by the real data set and include all variables used to generate the missing data mechanism of genotype (case-control status, exposure, mammogram, education, race, breastfeeding behavior, oral contraceptive use, hormone therapy use, and smoking status) as well as those that relate to genotype (exposure, race, breastfeeding behavior), a subset of the former.
(8) Realistic: Auxiliary variables are informed by the real data set and include all variables used to generate the missing data mechanism of exposure (case-control status, genotype, mammogram, education, race, breastfeeding behavior, oral contraceptive use, hormone therapy use, and smoking status) as well as those that relate to exposure (genotype, education, race, oral contraceptive use, hormone therapy use, and smoking status), a subset of the former.
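The quadratic missingness mechanism of Condition 2 can be illustrated with a minimal Python sketch. The slope values follow the footnote for the simulated continuous X1 (γ1 = -1, γ2 = 2); the intercept `gamma0 = -3.0` and the function name are assumptions made only for illustration, since the footnotes do not report the intercept.

```python
import math

def prob_missing_quadratic(x1, gamma0=-3.0, gamma1=-1.0, gamma2=2.0):
    """P(X1 is missing) when the log odds are gamma0 + gamma1*x1 + gamma2*x1**2.

    gamma0 is a hypothetical value; gamma1 and gamma2 follow the footnote.
    """
    log_odds = gamma0 + gamma1 * x1 + gamma2 * x1 ** 2
    return 1.0 / (1.0 + math.exp(-log_odds))

# The log odds are lowest at x1 = -gamma1 / (2 * gamma2) = 0.25 and rise
# toward both tails, so extreme values are the most likely to be missing.
for x1 in (-2.0, -1.0, 0.25, 1.0, 2.0):
    print(f"x1 = {x1:5.2f}  P(missing) = {prob_missing_quadratic(x1):.3f}")
```

In a simulation, the intercept would typically be tuned until the overall missing fraction reaches the target (about 20% in the scenarios above).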
Description of Relationship Between Variable with Missing Data (Genotype or Exposure) and Set of Auxiliary Variables.
| Auxiliary Variable | Genotype: OR | Genotype: P-value | Exposure: OR | Exposure: P-value |
|---|---|---|---|---|
| Level of average alcohol exposure | 0.90 | P = 0.06 | NA | NA |
| Race | | P < 0.001 | | P < 0.001 |
| Reference (Caucasian) | 1.00 | | 1.00 | |
| African American | 4.14 | | 0.43 | |
| Other | 3.60 | | 0.24 | |
| Breastfed for at least 6 months | | P < 0.001 | | |
| Reference (No) | 1.00 | | | |
| Yes | 0.68 | | | |
| Education | | | | P < 0.001 |
| Reference (less than high school) | | | 1.00 | |
| Completed high school and some college | | | 1.52 | |
| College graduate or more | | | 1.68 | |
| Use of oral contraceptives | | | | P = 0.005 |
| Reference (No) | | | 1.00 | |
| Yes | | | 1.32 | |
| Use of hormone therapy | | | | P = 0.018 |
| Reference (No) | | | 1.00 | |
| Yes | | | 1.29 | |
| Smoking Status | | | | P < 0.001 |
| Reference (Never smoked) | | | 1.00 | |
| Past smoker | | | 1.63 | |
| Current smoker | | | 1.67 | |
| Genotype | NA | NA | | P = 0.001 |
| Reference (No mutation) | | | 1.00 | |
| Has mutation | | | 0.73 | |
Impact of Auxiliary Relationship Under Conditions 1, 2, and 3.
| Scenario | Variable | Method | Mean β | Mean SE | Mean Bias | MSE | RelMSE | Coverage |
|---|---|---|---|---|---|---|---|---|
| a. Condition 1 | | | | | | | | |
| A: No Auxiliary | X1 (X2 = 0) (True effect = 1.0) | Full | 0.998 | 0.211 | -0.002 | 0.043 | 0.403 | 95.2 (93.9,96.5) |
| | | CC | 0.999 | 0.323 | -0.001 | 0.106 | 1.000 | 95.3 (94.0,96.6) |
| | | MI | 1.106 | 0.321 | 0.106 | 0.101 | 0.951 | 95.4 (94.1,96.7) |
| | X1 (X2 = 1) (True effect = 2.5) | Full | 2.542 | 0.378 | 0.042 | 0.161 | 0.031 | 95.2 (94.0,96.6) |
| | | CC | 2.890 | 11.419 | 0.390 | 5.271 | 1.000 | 95.7 (94.4,97.0) |
| | | MI | 2.330 | 0.615 | -0.170 | 0.317 | 0.060 | 92.8 (91.2,94.4) |
| | Interaction (True effect = 1.5) | Full | 1.544 | 0.434 | 0.044 | 0.195 | 0.037 | 96.0 (94.8,97.2) |
| | | CC | 1.891 | 11.500 | 0.391 | 5.321 | 1.000 | 96.4 (95.2,97.6) |
| | | MI | 1.223 | 0.690 | -0.277 | 0.382 | 0.072 | 94.6 (93.2,96.0) |
| B: Moderate | X1 (X2 = 0) (True effect = 1.0) | Full | 0.986 | 0.211 | -0.014 | 0.043 | 0.410 | 96.1 (94.9,97.3) |
| | | CC | 0.984 | 0.323 | -0.016 | 0.105 | 1.000 | 95.7 (94.4,97.0) |
| | | MI | 1.157 | 0.312 | 0.157 | 0.110 | 1.039 | 94.9 (93.5,96.3) |
| | X1 (X2 = 1) (True effect = 2.5) | Full | 2.53 | 0.376 | 0.03 | 0.149 | 0.066 | 95.9 (94.7,97.1) |
| | | CC | 2.698 | 4.867 | 0.198 | 2.28 | 1.000 | 96.1 (94.9,97.3) |
| | | MI | 2.438 | 0.600 | -0.062 | 0.251 | 0.110 | 96.3 (95.1,97.5) |
| | Interaction (True effect = 1.5) | Full | 1.544 | 0.432 | 0.044 | 0.195 | 0.080 | 95.7 (94.4,97.0) |
| | | CC | 1.714 | 4.949 | 0.214 | 2.437 | 1.000 | 96.8 (95.7,97.9) |
| | | MI | 1.281 | 0.671 | -0.219 | 0.317 | 0.130 | 96.7 (95.6,97.8) |
| C: Strong Auxiliary | X1 (X2 = 0) (True effect = 1.0) | Full | 0.997 | 0.211 | -0.003 | 0.046 | 0.403 | 95.4 (94.1,96.7) |
| | | CC | 0.999 | 0.324 | -0.001 | 0.113 | 1.000 | 94.4 (93.0,95.8) |
| | | MI | 1.027 | 0.223 | 0.027 | 0.051 | 0.447 | 95.3 (94.0,96.6) |
| | X1 (X2 = 1) (True effect = 2.5) | Full | 2.546 | 0.378 | 0.046 | 0.158 | 0.027 | 96.0 (94.8,97.2) |
| | | CC | 2.968 | 12.65 | 0.468 | 5.941 | 1.000 | 96.9 (95.8,98.0) |
| | | MI | 2.555 | 0.407 | 0.055 | 0.176 | 0.030 | 96.5 (95.0,97.4) |
| | Interaction (True effect = 1.5) | Full | 1.549 | 0.434 | 0.049 | 0.200 | 0.034 | 96.1 (94.9,97.3) |
| | | CC | 1.969 | 12.730 | 0.469 | 5.971 | 1.000 | 96.6 (95.5,97.7) |
| | | MI | 1.528 | 0.462 | 0.028 | 0.205 | 0.034 | 96.2 (95.0,97.4) |
| b. Condition 2 | | | | | | | | |
| Scenario | Variable | Method | Mean β | Mean SE | Mean Bias | MSE | RelMSE | Coverage |
| D: No Auxiliary | X1 (X2 = 0) (True effect = 1.0) | Full | 1.008 | 0.108 | 0.008 | 0.011 | 0.496 | 95.6 (94.3,96.9) |
| | | CC | 1.013 | 0.146 | 0.013 | 0.023 | 1.000 | 93.8 (92.3,95.3) |
| | | MI | 1.090 | 0.145 | 0.090 | 0.029 | 1.254 | 92.0 (90.3,93.7) |
| | X1 (X2 = 1) (True effect = 2.5) | Full | 2.551 | 0.280 | 0.051 | 0.083 | 0.869 | 95.6 (94.3,96.9) |
| | | CC | 2.550 | 0.299 | 0.050 | 0.096 | 1.000 | 95.4 (94.1,96.7) |
| | | MI | 2.357 | 0.293 | -0.143 | 0.085 | 0.888 | 93.1 (91.5,94.7) |
| | Interaction (True effect = 1.5) | Full | 1.543 | 0.300 | 0.043 | 0.093 | 0.804 | 94.9 (93.5,96.3) |
| | | CC | 1.537 | 0.333 | 0.037 | 0.116 | 1.000 | 95.5 (94.2,96.8) |
| | | MI | 1.267 | 0.325 | -0.233 | 0.126 | 1.084 | 90.2 (88.4,92.0) |
| E: Moderate | X1 (X2 = 0) (True effect = 1.0) | Full | 1.011 | 0.108 | 0.010 | 0.012 | 0.533 | 95.4 (94.1,96.7) |
| | | CC | 1.009 | 0.146 | 0.009 | 0.023 | 1.000 | 94.5 (93.1,95.9) |
| | | MI | 1.163 | 0.142 | 0.163 | 0.046 | 2.039 | 81.4 (79.0,83.8) |
| | X1 (X2 = 1) (True effect = 2.5) | Full | 2.558 | 0.280 | 0.058 | 0.090 | 0.911 | 94.2 (92.8,95.6) |
| | | CC | 2.555 | 0.298 | 0.055 | 0.099 | 1.000 | 94.5 (93.5,96.3) |
| | | MI | 2.571 | 0.296 | 0.071 | 0.085 | 0.855 | 96.7 (95.6,97.8) |
| | Interaction (True effect = 1.5) | Full | 1.547 | 0.300 | 0.047 | 0.103 | 0.845 | 94.9 (93.5,96.3) |
| | | CC | 1.545 | 0.333 | 0.045 | 0.122 | 1.000 | 94.7 (93.3,96.1) |
| | | MI | 1.408 | 0.328 | -0.092 | 0.102 | 0.832 | 95.1 (93.8,96.4) |
| F: Strong | X1 (X2 = 0) (True effect = 1.0) | Full | 1.002 | 0.108 | 0.002 | 0.012 | 0.518 | 94.5 (93.1,95.9) |
| | | CC | 1.004 | 0.146 | 0.004 | 0.024 | 1.000 | 94.2 (92.8,95.6) |
| | | MI | 1.053 | 0.114 | 0.053 | 0.016 | 0.693 | 93.0 (91.4,94.6) |
| | X1 (X2 = 1) (True effect = 2.5) | Full | 2.530 | 0.278 | 0.030 | 0.079 | 0.860 | 95.2 (93.9,96.5) |
| | | CC | 2.530 | 0.297 | 0.030 | 0.092 | 1.000 | 95.0 (93.6,96.4) |
| | | MI | 2.573 | 0.280 | 0.073 | 0.084 | 0.908 | 95.7 (93.8,96.4) |
| | Interaction (True effect = 1.5) | Full | 1.527 | 0.298 | 0.027 | 0.092 | 0.788 | 95.0 (93.6,96.4) |
| | | CC | 1.526 | 0.331 | 0.026 | 0.117 | 1.000 | 94.9 (93.5,96.3) |
| | | MI | 1.520 | 0.303 | 0.020 | 0.092 | 0.791 | 95.1 (93.8,96.4) |
| c. Condition 3 | | | | | | | | |
| Scenario | Variable | Method | Mean β | Mean SE | Mean Bias | MSE | RelMSE | Coverage |
| G: No Auxiliary | X1 (X2 = 0) (True effect = 1.0) | Full | 1.006 | 0.108 | 0.006 | 0.011 | 0.066 | 96.2 (95.0,97.4) |
| | | CC | 0.608 | 0.119 | -0.392 | 0.168 | 1.000 | 10.2 (8.3,12.1) |
| | | MI | 0.680 | 0.119 | -0.320 | 0.116 | 0.691 | 23.9 (21.3,26.5) |
| | X1 (X2 = 1) (True effect = 2.5) | Full | 2.549 | 0.279 | 0.049 | 0.085 | 0.693 | 94.6 (93.2,96.0) |
| | | CC | 2.322 | 0.288 | -0.178 | 0.122 | 1.000 | 85.3 (83.1,87.5) |
| | | MI | 1.974 | 0.274 | -0.526 | 0.324 | 2.654 | 49.9 (46.8,53.0) |
| | Interaction (True effect = 1.5) | Full | 1.543 | 0.299 | 0.043 | 0.097 | 0.640 | 95.1 (93.8,96.4) |
| | | CC | 1.714 | 0.312 | 0.214 | 0.152 | 1.000 | 92.3 (90.6,94.0) |
| | | MI | 1.294 | 0.297 | -0.206 | 0.093 | 0.609 | 93.5 (92.0,95.0) |
| H: Moderate | X1 (X2 = 0) (True effect = 1.0) | Full | 1.003 | 0.108 | 0.004 | 0.012 | 0.073 | 94.5 (93.1,95.9) |
| | | CC | 0.610 | 0.119 | -0.390 | 0.167 | 1.000 | 11.9 (9.9,13.9) |
| | | MI | 0.793 | 0.116 | -0.207 | 0.057 | 0.340 | 54.4 (51.3,57.5) |
| | X1 (X2 = 1) (True effect = 2.5) | Full | 2.535 | 0.277 | 0.035 | 0.085 | 0.657 | 93.5 (92.0,95.0) |
| | | CC | 2.309 | 0.287 | -0.191 | 0.129 | 1.000 | 85.1 (82.9,87.3) |
| | | MI | 2.222 | 0.279 | -0.278 | 0.141 | 1.093 | 80.9 (78.5,83.3) |
| | Interaction (True effect = 1.5) | Full | 1.531 | 0.298 | 0.031 | 0.096 | 0.658 | 94.3 (92.9,95.7) |
| | | CC | 1.699 | 0.310 | 0.198 | 0.146 | 1.000 | 91.4 (89.7,93.1) |
| | | MI | 1.428 | 0.301 | -0.072 | 0.074 | 0.508 | 95.4 (94.1,96.7) |
| I: Strong Auxiliary | X1 (X2 = 0) (True effect = 1.0) | Full | 0.999 | 0.108 | -0.001 | 0.012 | 0.069 | 95.4 (94.1,96.7) |
| | | CC | 0.603 | 0.118 | -0.397 | 0.172 | 1.000 | 11.1 (9.2,13.0) |
| | | MI | 0.986 | 0.109 | -0.014 | 0.012 | 0.072 | 94.8 (93.4,96.2) |
| | X1 (X2 = 1) (True effect = 2.5) | Full | 2.533 | 0.277 | 0.033 | 0.081 | 0.635 | 95.0 (93.6,96.4) |
| | | CC | 2.303 | 0.287 | -0.197 | 0.128 | 1.000 | 84.2 (81.9,86.5) |
| | | MI | 2.533 | 0.278 | 0.033 | 0.081 | 0.628 | 95.2 (93.9,96.5) |
| | Interaction (True effect = 1.5) | Full | 1.533 | 0.298 | 0.033 | 0.091 | 0.638 | 95.3 (94.0,96.6) |
| | | CC | 1.700 | 0.310 | 0.200 | 0.142 | 1.000 | 92.8 (91.2,94.4) |
| | | MI | 1.547 | 0.299 | 0.047 | 0.091 | 0.640 | 95.8 (94.6,97.0) |
Results From Fitting Full, Complete-Case, and Multiple Imputation Models to 1000 Simulated Data Sets With a Sample Size of 1000 Where the Covariate of Interest and as a Result the Interaction Term Were Missing for Approximately 20% of Subjects.
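The performance columns in these tables can be computed from per-replicate estimates along the following lines. This is a sketch under stated assumptions rather than the authors' code: the function names are hypothetical, MSE is taken as the mean squared deviation from the true effect, RelMSE as the ratio to the complete-case MSE (which is why every CC row reads 1.000), and the interval beside each coverage value as a Wald interval on the coverage proportion. The last assumption is at least consistent with the tabulated values: 95.2% over 1000 replicates gives (93.9, 96.5).

```python
import math

def coverage_interval(coverage_pct, n_sims, z=1.96):
    """Wald interval, in percent, for an empirical coverage percentage."""
    p = coverage_pct / 100.0
    half_width = 100.0 * z * math.sqrt(p * (1.0 - p) / n_sims)
    return coverage_pct - half_width, coverage_pct + half_width

def summarize(estimates, std_errors, true_beta, z=1.96):
    """Summaries over simulation replicates for one coefficient and one method."""
    n = len(estimates)
    mean_beta = sum(estimates) / n
    mse = sum((b - true_beta) ** 2 for b in estimates) / n
    # A replicate "covers" when its 95% interval contains the true effect.
    hits = sum(abs(b - true_beta) <= z * se
               for b, se in zip(estimates, std_errors))
    return {
        "mean_beta": mean_beta,
        "mean_se": sum(std_errors) / n,
        "mean_bias": mean_beta - true_beta,
        "mse": mse,
        "coverage_pct": 100.0 * hits / n,
    }

def relative_mse(mse_method, mse_cc):
    """MSE of a method relative to the complete-case MSE."""
    return mse_method / mse_cc

low, high = coverage_interval(95.2, 1000)
print(f"({low:.1f}, {high:.1f})")
```

Because a Wald interval is not constrained to [0, 100], coverage values close to 0% or 100% can produce limits outside that range, as seen in a few rows of the tables.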
Impact of Nature of Missingness With Auxiliary Relationship Based on Data from the LIBCSP.
| Scenario | Variable | Method | Mean β | Mean SE | Mean Bias | MSE | RelMSE | Coverage |
|---|---|---|---|---|---|---|---|---|
| a. Subjects Missing Data on Genotype Under Conditions 1 and 3 | | | | | | | | |
| J: | Effect of Fast Metabolizing Genotype For Unexposed | Full | 1.001 | 0.155 | 0.001 | 0.025 | 0.738 | 95.1 (93.8,96.4) |
| | | CC | 0.999 | 0.182 | -0.001 | 0.033 | 1.000 | 94.3 (92.9,95.7) |
| | | MI | 0.989 | 0.182 | -0.011 | 0.032 | 0.954 | 95.0 (93.6,96.4) |
| | Effect of Fast Metabolizing Genotype For Exposed | Full | 2.508 | 0.157 | 0.008 | 0.025 | 0.664 | 94.3 (92.9,95.7) |
| | | CC | 2.519 | 0.197 | 0.019 | 0.037 | 1.000 | 96.4 (95.2,97.6) |
| | | MI | 2.090 | 0.193 | -0.410 | 0.190 | 5.144 | 42.4 (39.3,45.5) |
| | Interaction Effect | Full | 1.507 | 0.220 | 0.007 | 0.049 | 0.684 | 94.7 (93.3,96.1) |
| | | CC | 1.520 | 0.268 | 0.020 | 0.072 | 1.000 | 95.5 (94.2,96.8) |
| | | MI | 1.101 | 0.265 | -0.399 | 0.211 | 2.945 | 70.0 (67.2,72.8) |
| K: | Effect of Fast Metabolizing Genotype For Unexposed | Full | 1.003 | 0.155 | 0.003 | 0.022 | 0.666 | 96.1 (94.9,97.3) |
| | | CC | 1.000 | 0.185 | 0.0002 | 0.034 | 1.000 | 95.6 (94.3,96.9) |
| | | MI | 1.004 | 0.191 | 0.004 | 0.032 | 0.978 | 95.6 (94.3,96.9) |
| | Effect of Fast Metabolizing Genotype For Exposed | Full | 2.509 | 0.157 | 0.010 | 0.024 | 0.024 | 95.9 (94.7,97.1) |
| | | CC | 3.463 | 0.233 | 0.963 | 0.983 | 1.000 | 0.2 (-0.1,0.5) |
| | | MI | 2.611 | 0.274 | 0.111 | 0.041 | 0.042 | 99.8 (99.5,100.1) |
| | Interaction Effect | Full | 1.507 | 0.220 | 0.007 | 0.048 | 0.047 | 95.9 (94.7,97.1) |
| | | CC | 2.463 | 0.298 | 0.963 | 1.012 | 1.000 | 9.5 (7.7,11.3) |
| | | MI | 1.608 | 0.340 | 0.108 | 0.074 | 0.073 | 99.2 (98.6,99.8) |
| b. Subjects Missing Data on Alcohol Exposure Under Conditions 1 and 2 | | | | | | | | |
| Scenario | Variable | Method | Mean β | Mean SE | Mean Bias | MSE | RelMSE | Coverage |
| L: | Effect of Fast Metabolizing Genotype For Unexposed | Full | 1.016 | 0.155 | 0.016 | 0.025 | 0.962 | 94.7 (93.3,96.1) |
| | | CC | 1.015 | 0.158 | 0.015 | 0.026 | 1.000 | 95.4 (94.1,96.7) |
| | | MI | 1.308 | 0.150 | 0.308 | 0.115 | 4.502 | 45.6 (42.5,48.7) |
| | Effect of Fast Metabolizing Genotype For Exposed | Full | 2.504 | 0.156 | 0.004 | 0.022 | 0.737 | 96.2 (95.0,97.4) |
| | | CC | 2.503 | 0.189 | 0.003 | 0.039 | 1.000 | 94.1 (92.6,95.6) |
| | | MI | 2.449 | 0.183 | -0.051 | 0.026 | 0.835 | 96.9 (95.8,98.0) |
| | Interaction Effect | Full | 1.488 | 0.220 | -0.012 | 0.048 | 0.845 | 95.2 (93.9,96.5) |
| | | CC | 1.487 | 0.247 | -0.013 | 0.056 | 1.000 | 96.4 (95.2,97.6) |
| | | MI | 1.141 | 0.247 | -0.359 | 0.172 | 3.055 | 71.1 (68.3,73.9) |
| M: | Effect of Alcohol Exposure for Those without Fast Genotype (True effect = 0.5) | Full | 0.499 | 0.075 | -0.001 | 0.006 | 0.622 | 95.5 (94.2,96.8) |
| | | CC | 0.501 | 0.095 | 0.001 | 0.009 | 1.000 | 94.6 (93.2,96.0) |
| | | MI | 0.560 | 0.095 | 0.060 | 0.012 | 1.348 | 91.4 (89.7,93.1) |
| | Effect of Alcohol Exposure for Those with Fast Genotype (True effect = 2.0) | Full | 2.008 | 0.176 | 0.008 | 0.032 | 0.780 | 94.4 (93.0,95.8) |
| | | CC | 2.006 | 0.202 | 0.006 | 0.041 | 1.000 | 95.1 (93.8,96.4) |
| | | MI | 1.621 | 0.199 | -0.379 | 0.169 | 4.183 | 50.5 (47.4,53.6) |
| | Interaction Effect (True effect = 1.5) | Full | 1.509 | 0.192 | 0.009 | 0.037 | 0.743 | 95.2 (93.9,96.5) |
| | | CC | 1.505 | 0.223 | 0.005 | 0.050 | 1.000 | 95.6 (94.3,96.9) |
| | | MI | 1.061 | 0.217 | -0.439 | 0.219 | 4.392 | 44.5 (41.4,47.6) |
Results From Fitting Full, Complete-Case, and Multiple Imputation Models to 1000 Simulated Data Sets With a Sample Size of 2058 Where the Covariate of Interest and as a Result the Interaction Term Were Missing for Approximately 20% of Subjects.
Results From Fitting Complete-Case and Multiple Imputation Models to Data from the Long Island Breast Cancer Study Project [19] Assessing the Effect of a Gene-Environment Interaction where m = 10
| Genotype/Alcohol Status | OR (CC) (a) | OR (MI) (a) | % Change in β Coefficient |
|---|---|---|---|
| Slow-Intermediate/Non-alcohol consumer | 1.00 | 1.00 | |
| Fast/Non-alcohol consumer | 1.18 | 1.14 | 22.11% |
| Slow-Intermediate/< 15 grams | 1.16 | 1.11 | 25.38% |
| Fast/< 15 grams | 0.92 | 0.95 | 30.89% |
| Slow-Intermediate/15-30 grams | 1.49 | 1.27 | 40.23% |
| Fast/15-30 grams | 2.32 | 1.68 | 38.68% |
| Slow-Intermediate/30+ grams | 0.72 | 0.77 | 17.40% |
| Fast/30+ grams | 0.98 | 0.86 | > 100% |
CC = Complete Case; CI = Confidence Interval; MI = Multiple Imputation; OR = Odds Ratio
(a) Estimates are adjusted for age at diagnosis, education, race, caloric intake, smoking status, and body mass index.
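The table does not state how the percent change in the β coefficient was computed. One plausible reading, sketched below as an assumption, is the absolute change in the log odds ratio from the CC fit to the MI fit, relative to the CC log odds ratio. The function name is hypothetical. Because the published odds ratios are rounded to two decimals, this only approximates the tabulated percentages, and the approximation is poorest when the CC odds ratio is close to 1.

```python
import math

def pct_change_in_beta(or_cc, or_mi):
    """Percent change in beta = log(OR) when moving from the CC to the MI fit.

    Assumed definition: 100 * |beta_MI - beta_CC| / |beta_CC|.
    """
    beta_cc = math.log(or_cc)
    beta_mi = math.log(or_mi)
    return 100.0 * abs(beta_mi - beta_cc) / abs(beta_cc)

# Fast/15-30 grams row: OR 2.32 under CC and 1.68 under MI.
print(round(pct_change_in_beta(2.32, 1.68), 1))
```

For the Fast/30+ grams row (0.98 versus 0.86), the CC log odds ratio is nearly zero, so the relative change is far above 100%, in line with the "> 100%" entry.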
Impact of Percentage Missing Under Conditions 1, 2, and 3.
| Scenario | Variable* | Method | Mean β | Mean SE | Mean Bias | MSE | RelMSE | Coverage |
|---|---|---|---|---|---|---|---|---|
| a. Condition 1 (footnote a) | | | | | | | | |
| A1: 20% missing | X1 (X2 = 0) (True effect = 1.0) | Full | 1.002 | 0.211 | 0.002 | 0.044 | 0.543 | 95.0 (93.6,96.4) |
| | | CC | 1.000 | 0.284 | -0.000 | 0.081 | 1.000 | 95.8 (94.6,97.0) |
| | | MI | 1.062 | 0.230 | 0.062 | 0.052 | 0.644 | 95.9 (94.7,97.1) |
| | X1 (X2 = 1) (True effect = 2.5) | Full | 2.549 | 0.378 | 0.049 | 0.166 | 0.068 | 95.4 (94.1,96.7) |
| | | CC | 2.713 | 4.822 | 0.213 | 2.436 | 1.000 | 97.0 (95.9,98.1) |
| | | MI | 2.543 | 0.426 | 0.043 | 0.178 | 0.072 | 96.8 (95.7,97.9) |
| | Interaction (True effect = 1.5) | Full | 1.547 | 0.434 | 0.047 | 0.204 | 0.082 | 95.7 (94.4,97.0) |
| | | CC | 1.713 | 4.894 | 0.213 | 2.503 | 1.000 | 96.7 (95.6,97.8) |
| | | MI | 1.482 | 0.483 | -0.018 | 0.205 | 0.082 | 96.1 (94.9,97.3) |
| A2: 30% missing | X1 (X2 = 0) (True effect = 1.0) | Full | 1.003 | 0.211 | 0.003 | 0.044 | 0.358 | 94.6 (93.2,96.0) |
| | | CC | 0.992 | 0.345 | -0.008 | 0.122 | 1.000 | 95.0 (93.6,96.4) |
| | | MI | 1.095 | 0.245 | 0.095 | 0.065 | 0.531 | 94.9 (93.5,96.3) |
| | X1 (X2 = 1) (True effect = 2.5) | Full | 2.558 | 0.379 | 0.058 | 0.146 | 0.014 | 97.2 (96.2,98.2) |
| | | CC | 3.288 | 25.207 | 0.788 | 10.689 | 1.000 | 98.5 (97.7,99.3) |
| | | MI | 2.557 | 0.462 | 0.057 | 0.178 | 0.017 | 98.3 (97.5,99.1) |
| | Interaction (True effect = 1.5) | Full | 1.555 | 0.435 | 0.055 | 0.195 | 0.018 | 96.1 (94.9,97.3) |
| | | CC | 2.295 | 25.290 | 0.795 | 10.904 | 0.018 | 96.1 (94.9,97.3) |
| | | MI | 1.462 | 0.515 | -0.038 | 0.219 | 0.020 | 97.8 (96.9,98.7) |
| A3: 40% missing | X1 (X2 = 0) (True effect = 1.0) | Full | 1.020 | 0.211 | 0.020 | 0.045 | 0.258 | 95.6 (94.3,96.8) |
| | | CC | 0.991 | 0.411 | -0.009 | 0.175 | 1.000 | 95.5 (94.2,96.8) |
| | | MI | 1.137 | 0.263 | 0.137 | 0.083 | 0.474 | 94.1 (92.6,95.6) |
| | X1 (X2 = 1) (True effect = 2.5) | Full | 2.540 | 0.377 | 0.040 | 0.140 | 0.006 | 96.6 (95.5,97.7) |
| | | CC | 4.121 | 64.794 | 1.621 | 23.920 | 1.000 | 96.5 (95.4,97.6) |
| | | MI | 2.536 | 0.489 | 0.036 | 0.177 | 0.007 | 97.5 (96.5,98.5) |
| | Interaction (True effect = 1.5) | Full | 1.520 | 0.433 | 0.020 | 0.191 | 0.008 | 96.0 (94.8,97.2) |
| | | CC | 3.131 | 64.890 | 1.631 | 24.283 | 1.000 | 97.0 (95.9,98.1) |
| | | MI | 1.399 | 0.538 | -0.101 | 0.220 | 0.009 | 97.2 (96.2,98.2) |
| b. Condition 2 (footnote b) | | | | | | | | |
| Scenario | Variable | Method | Mean β | Mean SE | Mean Bias | MSE | RelMSE | Coverage |
| A4: 20% missing | X1 (X2 = 0) (True effect = 1.0) | Full | 1.007 | 0.108 | 0.007 | 0.012 | 0.590 | 95.5 (94.2,96.8) |
| | | CC | 1.010 | 0.147 | 0.010 | 0.021 | 1.000 | 95.9 (94.7,97.1) |
| | | MI | 1.059 | 0.115 | 0.059 | 0.017 | 0.820 | 92.5 (90.9,94.1) |
| | X1 (X2 = 1) (True effect = 2.5) | Full | 2.552 | 0.279 | 0.052 | 0.089 | 0.909 | 93.7 (92.2,95.2) |
| | | CC | 2.548 | 0.298 | 0.048 | 0.098 | 1.000 | 93.4 (91.9,94.9) |
| | | MI | 2.593 | 0.281 | 0.093 | 0.095 | 0.966 | 94.5 (93.1,95.9) |
| | Interaction (True effect = 1.5) | Full | 1.544 | 0.300 | 0.044 | 0.102 | 0.872 | 93.6 (92.1,95.1) |
| | | CC | 1.538 | 0.332 | 0.038 | 0.117 | 1.000 | 94.0 (92.5,95.5) |
| | | MI | 1.534 | 0.304 | 0.034 | 0.101 | 0.863 | 94.0 (92.5,95.5) |
| A5: 30% missing | X1 (X2 = 0) (True effect = 1.0) | Full | 1.003 | 0.108 | 0.003 | 0.012 | 0.437 | 94.5 (93.1,95.9) |
| | | CC | 1.008 | 0.168 | 0.008 | 0.027 | 1.000 | 94.6 (93.2,96.0) |
| | | MI | 1.079 | 0.118 | 0.079 | 0.020 | 0.741 | 92.4 (90.8,94.0) |
| | X1 (X2 = 1) (True effect = 2.5) | Full | 2.532 | 0.277 | 0.032 | 0.078 | 0.798 | 95.7 (94.4,97.0) |
| | | CC | 2.529 | 0.317 | 0.029 | 0.098 | 1.000 | 96.2 (95.0,97.4) |
| | | MI | 2.612 | 0.284 | 0.112 | 0.091 | 0.928 | 95.6 (94.3,96.9) |
| | Interaction (True effect = 1.5) | Full | 1.529 | 0.297 | 0.029 | 0.091 | 0.731 | 95.4 (94.1,96.7) |
| | | CC | 1.521 | 0.359 | 0.021 | 0.125 | 1.000 | 94.8 (93.4,96.2) |
| | | MI | 1.533 | 0.308 | 0.033 | 0.093 | 0.743 | 95.4 (94.1,96.7) |
| A6: 40% missing | X1 (X2 = 0) (True effect = 1.0) | Full | 1.005 | 0.108 | 0.005 | 0.011 | 0.330 | 95.8 (94.6,97.0) |
| | | CC | 1.007 | 0.197 | 0.007 | 0.035 | 1.000 | 95.9 (94.7,97.1) |
| | | MI | 1.109 | 0.123 | 0.109 | 0.026 | 0.749 | 87.5 (85.5,89.5) |
| | X1 (X2 = 1) (True effect = 2.5) | Full | 2.551 | 0.279 | 0.051 | 0.081 | 0.620 | 95.7 (94.4,97.0) |
| | | CC | 2.563 | 0.355 | 0.063 | 0.131 | 1.000 | 96.2 (95.0,97.4) |
| | | MI | 2.675 | 0.295 | 0.175 | 0.112 | 0.852 | 94.3 (92.9,95.7) |
| | Interaction (True effect = 1.5) | Full | 1.547 | 0.300 | 0.047 | 0.094 | 0.563 | 95.1 (93.8,96.4) |
| | | CC | 1.557 | 0.406 | 0.057 | 0.168 | 1.000 | 95.3 (94.0,96.6) |
| | | MI | 1.565 | 0.319 | 0.065 | 0.098 | 0.588 | 96.9 (95.8,98.0) |
| c. Condition 3 (footnote c) | | | | | | | | |
| Scenario | Variable | Method | Mean β | Mean SE | Mean Bias | MSE | RelMSE | Coverage |
| A7: 20% missing | X1 (X2 = 0) (True effect = 1.0) | Full | 1.012 | 0.108 | 0.012 | 0.012 | 0.076 | 94.7 (93.3,96.1) |
| | | CC | 0.616 | 0.119 | -0.384 | 0.162 | 1.000 | 11.9 (9.9,13.9) |
| | | MI | 0.999 | 0.109 | -0.001 | 0.012 | 0.077 | 94.3 (92.9,95.7) |
| | X1 (X2 = 1) (True effect = 2.5) | Full | 2.549 | 0.279 | 0.049 | 0.080 | 0.667 | 96.1 (94.9,97.3) |
| | | CC | 2.320 | 0.289 | -0.180 | 0.120 | 1.000 | 87.1 (85.0,89.2) |
| | | MI | 2.551 | 0.280 | 0.051 | 0.080 | 0.666 | 96.2 (95.0,97.4) |
| | Interaction (True effect = 1.5) | Full | 1.537 | 0.300 | 0.037 | 0.091 | 0.635 | 96.1 (94.9,97.3) |
| | | CC | 1.704 | 0.313 | 0.204 | 0.143 | 1.000 | 92.6 (91.0,94.2) |
| | | MI | 1.552 | 0.301 | 0.052 | 0.092 | 0.642 | 95.7 (94.4,97.0) |
| A8: 30% missing | X1 (X2 = 0) (True effect = 1.0) | Full | 1.006 | 0.108 | 0.006 | 0.012 | 0.027 | 95.0 (93.6,96.4) |
| | | CC | 0.331 | 0.127 | -0.669 | 0.466 | 1.000 | 0.3 (0,0.6) |
| | | MI | 0.983 | 0.110 | -0.017 | 0.013 | 0.028 | 95.0 (93.6,96.4) |
| | X1 (X2 = 1) (True effect = 2.5) | Full | 2.528 | 0.277 | 0.028 | 0.075 | 0.179 | 95.8 (94.6,97.0) |
| | | CC | 1.927 | 0.299 | -0.573 | 0.419 | 1.000 | 48.6 (45.5,51.7) |
| | | MI | 2.492 | 0.277 | -0.008 | 0.072 | 0.171 | 95.6 (94.3,96.9) |
| | Interaction (True effect = 1.5) | Full | 1.522 | 0.297 | 0.022 | 0.084 | 0.743 | 96.0 (94.8,97.2) |
| | | CC | 1.596 | 0.325 | 0.096 | 0.113 | 1.000 | 95.7 (94.4,97.0) |
| | | MI | 1.509 | 0.298 | 0.009 | 0.081 | 0.713 | 96.1 (94.9,97.3) |
| A9: 40% missing | X1 (X2 = 0) (True effect = 1.0) | Full | 1.007 | 0.108 | 0.007 | 0.011 | 0.012 | 95.8 (94.6,97.0) |
| | | CC | 0.027 | 0.139 | -0.973 | 0.966 | 1.000 | 0.0 (0.0,0.0) |
| | | MI | 0.968 | 0.110 | -0.032 | 0.013 | 0.013 | 94.1 (92.6,95.6) |
| | X1 (X2 = 1) (True effect = 2.5) | Full | 2.537 | 0.277 | 0.037 | 0.079 | 0.092 | 95.6 (94.3,96.9) |
| | | CC | 1.627 | 0.312 | -0.873 | 0.864 | 1.000 | 23.1 (20.5,25.7) |
| | | MI | 2.461 | 0.276 | -0.039 | 0.073 | 0.085 | 95.4 (94.1,96.7) |
| | Interaction (True effect = 1.5) | Full | 1.531 | 0.298 | 0.031 | 0.089 | 0.683 | 96.1 (94.9,97.3) |
| | | CC | 1.600 | 0.342 | 0.100 | 0.130 | 1.000 | 95.2 (93.9,96.5) |
| | | MI | 1.493 | 0.297 | -0.007 | 0.081 | 0.627 | 96.9 (95.8,98.0) |
Results From Fitting Full, Complete-Case, and Multiple Imputation Models to 1000 Simulated Data Sets With a Sample Size of 1000 Where the Covariate of Interest and as a Result the Interaction Term Were Missing for Some Subjects and the Auxiliary Information Was Strong.
(a) Condition 1: X1 is 7.4 times more likely to be missing if X1 = 1, where X1 is binary.
(b) Condition 2: Extreme values of X1 are more likely to be missing. The probability of missingness is a quadratic function of X1: the log odds of missing X1 is $\gamma_0 + \gamma_1 X_1 + \gamma_2 X_1^2$, where $\gamma_1 = -1$ and $\gamma_2 = 2$, and X1 is continuous.
(c) Condition 3: A 1-unit increase in X1 corresponds to a 7.4 times decrease in the probability of missing for controls, but a 7.4 times increase for cases, where X1 is continuous.
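The multipliers quoted in the footnotes (2.7, 7.4, and 12.2) coincide with exp(1), exp(2), and exp(2.5), which suggests they are odds multipliers from a logistic missingness model. The sketch below illustrates Condition 3 under that reading. The function names, the intercept `gamma0 = -1.5`, and the interpretation of "7.4 times" as an odds ratio are assumptions made for illustration, not details given in the source.

```python
import math

def prob_missing_condition3(x1, is_case, gamma0=-1.5, gamma1=2.0):
    """P(X1 is missing) with a slope on x1 that flips sign by case status.

    Cases: log odds rise by gamma1 per unit of x1 (odds multiplied by exp(2)).
    Controls: log odds fall by gamma1 per unit of x1 (odds divided by exp(2)).
    gamma0 is a hypothetical intercept.
    """
    slope = gamma1 if is_case else -gamma1
    return 1.0 / (1.0 + math.exp(-(gamma0 + slope * x1)))

def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)

# Odds multipliers implied by log-odds coefficients of 1.0, 2.0, and 2.5.
for coef in (1.0, 2.0, 2.5):
    print(coef, round(math.exp(coef), 1))  # prints 2.7, 7.4, 12.2
```

Because missingness here depends on both X1 and case status, a complete-case analysis drops subjects differentially by outcome, which matches the poor CC coverage reported under Condition 3.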