Donna L Coffman, Jiangxiu Zhou, Xizhen Cai.
Abstract
BACKGROUND: Causal effect estimation with observational data is subject to bias due to confounding, which is often controlled for using propensity scores. One unresolved issue in propensity score estimation is how to handle missing values in covariates.
Keywords: Causal inference; Generalized boosted models; Missing data; Propensity scores
Year: 2020 PMID: 32586271 PMCID: PMC7318364 DOI: 10.1186/s12874-020-01053-4
Source DB: PubMed Journal: BMC Med Res Methodol ISSN: 1471-2288 Impact factor: 4.615
Fig. 1 Complete data generation model
Percentages of missingness on x1 based on x5 and Y
| Group | x5 | Y | % of observations | % missing on x1 (25% overall) | % missing on x1 (50% overall) |
|---|---|---|---|---|---|
| 1 | 0 | 0 | 25 | 10 | 20 |
| 2 | 0 | 1 | 25 | 20 | 40 |
| 3 | 1 | 0 | 25 | 30 | 60 |
| 4 | 1 | 1 | 25 | 40 | 80 |
| Total | | | | 25 | 50 |
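The group-specific rates in the table above can be turned into a missingness-generating step. The following is a minimal sketch, not the authors' code: it assumes x5 and Y are the binary indicators that define the four groups, uses the 25%-overall column, and draws hypothetical toy data (independent fair coin flips for x5 and Y, standard normal x1). The names `impose_missingness` and `overall_missing_rate` are illustrative.

```python
import random

# Probability that x1 is missing, keyed by (x5, Y); values copied from the
# 25%-overall column of the table above.
P_MISSING_X1_25 = {(0, 0): 0.10, (0, 1): 0.20, (1, 0): 0.30, (1, 1): 0.40}
# Share of observations in each (x5, Y) group, also from the table.
GROUP_SHARE = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}


def overall_missing_rate(p_missing, group_share):
    """Weighted average of the group-specific missingness rates."""
    return sum(group_share[g] * p_missing[g] for g in p_missing)


def impose_missingness(rows, p_missing, rng):
    """Return copies of rows where x1 is set to None with probability p_missing[(x5, Y)]."""
    out = []
    for row in rows:
        new_row = dict(row)  # copy so the complete data stay untouched
        if rng.random() < p_missing[(row["x5"], row["Y"])]:
            new_row["x1"] = None
        out.append(new_row)
    return out


rng = random.Random(2020)
# Hypothetical toy data (an assumption, not the paper's data generation model).
data = [
    {"x1": rng.gauss(0, 1), "x5": rng.randint(0, 1), "Y": rng.randint(0, 1)}
    for _ in range(20000)
]
incomplete = impose_missingness(data, P_MISSING_X1_25, rng)
observed_rate = sum(r["x1"] is None for r in incomplete) / len(incomplete)

print("expected overall rate:", round(overall_missing_rate(P_MISSING_X1_25, GROUP_SHARE), 3))
print("simulated overall rate:", round(observed_rate, 3))
```

Because the missingness probability depends on Y as well as on x5, the weighted average of the four group rates is what reproduces the "Total" row; swapping in the 50%-overall column only requires a different probability dictionary.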
Percentages of missingness on x2 based on x6 and Y
| Group | x6 | Y | % of observations | % missing on x2 (25% overall) | % missing on x2 (50% overall) |
|---|---|---|---|---|---|
| 1 | 0 | 0 | 30 | 25 | 50 |
| 2 | 0 | 1 | 40 | 30 | 60 |
| 3 | 1 | 0 | 20 | 20 | 40 |
| 4 | 1 | 1 | 10 | 15 | 30 |
| Total | | | | 25 | 50 |
Percentages of missingness on x3 based on x8 and Y
| Group | x8 | Y | % of observations | % missing on x3 (25% overall) | % missing on x3 (50% overall) |
|---|---|---|---|---|---|
| 1 | 0 | 0 | 40 | 30 | 60 |
| 2 | 0 | 1 | 30 | 25 | 50 |
| 3 | 1 | 0 | 10 | 15 | 30 |
| 4 | 1 | 1 | 20 | 20 | 40 |
| Total | | | | 25 | 50 |
Percentages of missingness on x4 based on x9 and Y
| Group | x9 | Y | % of observations | % missing on x4 (25% overall) | % missing on x4 (50% overall) |
|---|---|---|---|---|---|
| 1 | 0 | 0 | 20 | 15 | 30 |
| 2 | 0 | 1 | 30 | 25 | 50 |
| 3 | 1 | 0 | 30 | 35 | 70 |
| 4 | 1 | 1 | 20 | 20 | 40 |
| Total | | | | 25 | 50 |
Simulation results for scenario A, n = 500, 25% missing. The first two rows use the complete data; each following block of seven rows corresponds to one missingness mechanism.
| Method | True confounders | | | Leave x1 out | | | Add x5 | | | Leave x1 out + Add x5 | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | Bias (SD) | SE | RMSE | Bias (SD) | SE | RMSE | Bias (SD) | SE | RMSE | Bias (SD) | SE | RMSE | |
| complete | −0.001 (0.041) | 0.068 | 0.041 | 0.036 (0.043) | 0.067 | 0.056 | −0.001 (0.043) | 0.070 | 0.043 | 0.042 (0.043) | 0.068 | 0.060 | |
| comGBM | 0.029 (0.042) | 0.067 | 0.050 | – | – | – | – | – | – | – | – | – | |
| SI + pe + pu | 0.000 (0.049) | 0.068 | 0.041 | 0.036 (0.049) | 0.067 | 0.061 | 0.000 (0.051) | 0.070 | 0.051 | 0.042 (0.050) | 0.068 | 0.065 | |
| SI + pe | −0.003 (0.049) | 0.068 | 0.041 | 0.035 (0.048) | 0.067 | 0.059 | −0.002 (0.050) | 0.070 | 0.050 | 0.041 (0.049) | 0.068 | 0.064 | |
| TMI | 0.105 (0.065) | 0.072 | 0.113 | 0.131 (0.064) | 0.071 | 0.146 | 0.109 (0.066) | 0.074 | 0.127 | 0.138 (0.066) | 0.072 | 0.153 | |
| MI | −0.001 (0.045) | 0.068 | 0.041 | 0.036 (0.046) | 0.067 | 0.058 | −0.001 (0.047) | 0.070 | 0.047 | 0.042 (0.047) | 0.068 | 0.063 | |
| MIMP | −0.001 (0.045) | 0.069 | 0.041 | 0.035 (0.046) | 0.067 | 0.058 | −0.001 (0.047) | 0.070 | 0.047 | 0.042 (0.047) | 0.068 | 0.063 | |
| GBM | 0.067 (0.050) | 0.066 | 0.079 | – | – | – | – | – | – | – | – | – | |
| GBM + SI + pe | 0.035 (0.047) | 0.067 | 0.054 | – | – | – | – | – | – | – | – | – | |
| SI + pe + pu | 0.000 (0.048) | 0.068 | 0.041 | 0.036 (0.048) | 0.067 | 0.060 | 0.000 (0.050) | 0.070 | 0.050 | 0.042 (0.049) | 0.068 | 0.065 | |
| SI + pe | 0.000 (0.048) | 0.068 | 0.041 | 0.036 (0.047) | 0.067 | 0.059 | 0.000 (0.050) | 0.070 | 0.050 | 0.042 (0.048) | 0.068 | 0.064 | |
| TMI | 0.102 (0.060) | 0.070 | 0.110 | 0.129 (0.060) | 0.069 | 0.142 | 0.109 (0.062) | 0.071 | 0.125 | 0.136 (0.062) | 0.070 | 0.149 | |
| MI | 0.000 (0.044) | 0.068 | 0.041 | 0.036 (0.045) | 0.067 | 0.058 | 0.000 (0.046) | 0.070 | 0.046 | 0.042 (0.046) | 0.068 | 0.062 | |
| MIMP | 0.000 (0.045) | 0.069 | 0.041 | 0.037 (0.045) | 0.067 | 0.058 | 0.000 (0.046) | 0.070 | 0.046 | 0.043 (0.046) | 0.068 | 0.063 | |
| GBM | 0.066 (0.048) | 0.066 | 0.078 | – | – | – | – | – | – | – | – | – | |
| GBM + SI + pe | 0.036 (0.046) | 0.067 | 0.055 | – | – | – | – | – | – | – | – | – | |
| SI + pe + pu | 0.000 (0.049) | 0.068 | 0.041 | 0.036 (0.049) | 0.067 | 0.061 | 0.001 (0.051) | 0.070 | 0.051 | 0.043 (0.049) | 0.068 | 0.065 | |
| SI + pe | −0.001 (0.048) | 0.068 | 0.041 | 0.035 (0.048) | 0.067 | 0.059 | −0.001 (0.050) | 0.070 | 0.050 | 0.042 (0.049) | 0.068 | 0.065 | |
| TMI | 0.104 (0.068) | 0.073 | 0.112 | 0.129 (0.068) | 0.072 | 0.146 | 0.111 (0.071) | 0.075 | 0.132 | 0.137 (0.069) | 0.073 | 0.153 | |
| MI | 0.000 (0.045) | 0.068 | 0.041 | 0.037 (0.046) | 0.067 | 0.059 | 0.001 (0.047) | 0.070 | 0.047 | 0.043 (0.047) | 0.068 | 0.064 | |
| MIMP | 0.002 (0.045) | 0.069 | 0.041 | 0.038 (0.045) | 0.067 | 0.059 | 0.001 (0.047) | 0.070 | 0.047 | 0.043 (0.046) | 0.068 | 0.063 | |
| GBM | 0.069 (0.050) | 0.067 | 0.080 | – | – | – | – | – | – | – | – | – | |
| GBM + SI + pe | 0.037 (0.046) | 0.067 | 0.055 | – | – | – | – | – | – | – | – | – | |
| SI + pe + pu | 0.003 (0.048) | 0.068 | 0.041 | 0.037 (0.048) | 0.066 | 0.061 | 0.004 (0.049) | 0.069 | 0.049 | 0.043 (0.048) | 0.067 | 0.064 | |
| SI + pe | −0.002 (0.049) | 0.068 | 0.041 | 0.035 (0.049) | 0.067 | 0.060 | −0.002 (0.050) | 0.070 | 0.050 | 0.041 (0.050) | 0.068 | 0.065 | |
| TMI | 0.118 (0.062) | 0.071 | 0.125 | 0.144 (0.062) | 0.070 | 0.157 | 0.122 (0.064) | 0.073 | 0.138 | 0.151 (0.064) | 0.071 | 0.164 | |
| MI | 0.003 (0.045) | 0.068 | 0.041 | 0.040 (0.045) | 0.067 | 0.060 | 0.003 (0.046) | 0.070 | 0.046 | 0.046 (0.046) | 0.068 | 0.065 | |
| MIMP | 0.003 (0.045) | 0.068 | 0.041 | 0.040 (0.046) | 0.067 | 0.061 | 0.003 (0.046) | 0.070 | 0.046 | 0.046 (0.046) | 0.068 | 0.065 | |
| GBM | 0.074 (0.049) | 0.066 | 0.085 | – | – | – | – | – | – | – | – | – | |
| GBM + SI + pe | 0.036 (0.047) | 0.067 | 0.055 | – | – | – | – | – | – | – | – | – | |
Note. complete: logistic regression with complete data before introducing missingness; comGBM: GBM with complete data before introducing missingness; SI + pe + pu: single imputation + prediction error + parameter uncertainty; SI + pe: single imputation + prediction error; TMI: treatment mean imputation; MI: multiple imputation (m = 20); MIMP: multiple imputation missingness pattern (m = 20); GBM: GBM with incomplete data; GBM + SI + pe: GBM after single imputation + prediction error; SD: standard deviation; SE: standard error; RMSE: root mean squared error
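The Bias, SD, and RMSE columns can be read through their usual simulation-study definitions. The sketch below assumes those standard definitions (bias as the mean estimate minus the true effect, SD as the spread of estimates across replicates, RMSE as the root mean squared deviation from the truth); it is not taken from the authors' code, and the true effect and replicate draws are hypothetical. Under these definitions RMSE² = bias² + SD², which matches, for example, the "complete" row under "Leave x1 out": bias 0.036 and SD 0.043 give an RMSE of about 0.056.

```python
import math
import random


def performance(estimates, true_effect):
    """Bias, empirical SD, and RMSE of effect estimates across simulation replicates."""
    n = len(estimates)
    mean_est = sum(estimates) / n
    bias = mean_est - true_effect
    # Divide by n (not n - 1) so that RMSE^2 = bias^2 + SD^2 holds exactly.
    sd = math.sqrt(sum((e - mean_est) ** 2 for e in estimates) / n)
    rmse = math.sqrt(sum((e - true_effect) ** 2 for e in estimates) / n)
    return bias, sd, rmse


rng = random.Random(1)
true_effect = 0.5  # hypothetical value, chosen only for illustration
# Hypothetical replicates: estimates centred 0.036 above the truth with SD 0.043.
estimates = [rng.gauss(true_effect + 0.036, 0.043) for _ in range(1000)]
bias, sd, rmse = performance(estimates, true_effect)

print("bias:", round(bias, 3), "SD:", round(sd, 3), "RMSE:", round(rmse, 3))
# Decomposition applied to the reported bias and SD of the example row.
print("RMSE implied by bias 0.036 and SD 0.043:", round(math.sqrt(0.036**2 + 0.043**2), 3))
```

The SE column is not reproduced here; it is presumably the average of the estimated standard errors across replicates, which requires the per-replicate variance estimates.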
Simulation results for scenario G, n = 500, 25% missing. The first two rows use the complete data; each following block of seven rows corresponds to one missingness mechanism.
| Method | True confounders | | | Leave x1 out | | | Add x5 | | | Leave x1 out + Add x5 | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | Bias (SD) | SE | RMSE | Bias (SD) | SE | RMSE | Bias (SD) | SE | RMSE | Bias (SD) | SE | RMSE | |
| complete | −0.014 (0.044) | 0.071 | 0.046 | 0.037 (0.043) | 0.068 | 0.057 | −0.016 (0.045) | 0.073 | 0.048 | 0.040 (0.043) | 0.069 | 0.059 | |
| comGBM | 0.029 (0.044) | 0.067 | 0.053 | – | – | – | – | – | – | – | – | – | |
| SI + pe + pu | −0.011 (0.052) | 0.071 | 0.053 | 0.040 (0.049) | 0.068 | 0.063 | −0.012 (0.054) | 0.072 | 0.055 | 0.043 (0.050) | 0.068 | 0.066 | |
| SI + pe | −0.013 (0.051) | 0.071 | 0.053 | 0.039 (0.049) | 0.068 | 0.063 | −0.014 (0.053) | 0.072 | 0.055 | 0.042 (0.049) | 0.068 | 0.065 | |
| TMI | 0.096 (0.064) | 0.076 | 0.115 | 0.131 (0.061) | 0.073 | 0.145 | 0.096 (0.067) | 0.078 | 0.117 | 0.134 (0.063) | 0.074 | 0.148 | |
| MI | −0.011 (0.048) | 0.071 | 0.049 | 0.039 (0.046) | 0.068 | 0.060 | −0.013 (0.050) | 0.073 | 0.052 | 0.042 (0.047) | 0.069 | 0.063 | |
| MIMP | −0.011 (0.048) | 0.072 | 0.049 | 0.039 (0.046) | 0.068 | 0.060 | −0.013 (0.050) | 0.073 | 0.052 | 0.042 (0.047) | 0.069 | 0.063 | |
| GBM | 0.060 (0.051) | 0.067 | 0.079 | – | – | – | – | – | – | – | – | – | |
| GBM + SI + pe | 0.033 (0.049) | 0.067 | 0.059 | – | – | – | – | – | – | – | – | – | |
| SI + pe + pu | 0.000 (0.051) | 0.070 | 0.051 | 0.048 (0.048) | 0.067 | 0.068 | −0.002 (0.053) | 0.072 | 0.053 | 0.052 (0.049) | 0.068 | 0.071 | |
| SI + pe | −0.001 (0.050) | 0.070 | 0.050 | 0.048 (0.047) | 0.067 | 0.067 | −0.003 (0.052) | 0.072 | 0.052 | 0.051 (0.048) | 0.068 | 0.070 | |
| TMI | 0.113 (0.057) | 0.073 | 0.127 | 0.148 (0.055) | 0.070 | 0.158 | 0.116 (0.059) | 0.074 | 0.130 | 0.152 (0.056) | 0.071 | 0.162 | |
| MI | −0.001 (0.047) | 0.071 | 0.047 | 0.048 (0.045) | 0.068 | 0.066 | −0.002 (0.048) | 0.072 | 0.048 | 0.051 (0.046) | 0.068 | 0.069 | |
| MIMP | −0.001 (0.047) | 0.071 | 0.047 | 0.048 (0.045) | 0.068 | 0.066 | −0.002 (0.048) | 0.072 | 0.048 | 0.052 (0.046) | 0.069 | 0.069 | |
| GBM | 0.066 (0.050) | 0.067 | 0.083 | – | – | – | – | – | – | – | – | – | |
| GBM + SI + pe | 0.039 (0.047) | 0.067 | 0.061 | – | – | – | – | – | – | – | – | – | |
| SI + pe + pu | −0.009 (0.051) | 0.070 | 0.052 | 0.040 (0.050) | 0.068 | 0.064 | −0.011 (0.053) | 0.072 | 0.054 | 0.043 (0.050) | 0.068 | 0.066 | |
| SI + pe | −0.009 (0.051) | 0.070 | 0.052 | 0.041 (0.049) | 0.068 | 0.064 | −0.010 (0.053) | 0.072 | 0.054 | 0.045 (0.050) | 0.068 | 0.067 | |
| TMI | 0.090 (0.069) | 0.078 | 0.113 | 0.125 (0.066) | 0.074 | 0.141 | 0.093 (0.071) | 0.080 | 0.117 | 0.128 (0.067) | 0.076 | 0.144 | |
| MI | −0.008 (0.046) | 0.071 | 0.047 | 0.041 (0.046) | 0.068 | 0.062 | −0.010 (0.048) | 0.073 | 0.049 | 0.044 (0.046) | 0.069 | 0.064 | |
| MIMP | −0.008 (0.047) | 0.072 | 0.048 | 0.042 (0.046) | 0.068 | 0.062 | −0.010 (0.049) | 0.073 | 0.050 | 0.045 (0.047) | 0.069 | 0.065 | |
| GBM | 0.058 (0.051) | 0.067 | 0.077 | – | – | – | – | – | – | – | – | – | |
| GBM + SI + pe | 0.036 (0.048) | 0.067 | 0.060 | – | – | – | – | – | – | – | – | – | |
| SI + pe + pu | −0.006 (0.050) | 0.070 | 0.050 | 0.040 (0.048) | 0.067 | 0.062 | −0.007 (0.052) | 0.071 | 0.052 | 0.042 (0.048) | 0.068 | 0.064 | |
| SI + pe | −0.012 (0.051) | 0.071 | 0.052 | 0.039 (0.049) | 0.068 | 0.063 | −0.014 (0.053) | 0.072 | 0.055 | 0.042 (0.050) | 0.069 | 0.065 | |
| TMI | 0.113 (0.060) | 0.075 | 0.128 | 0.147 (0.057) | 0.072 | 0.158 | 0.113 (0.062) | 0.077 | 0.129 | 0.151 (0.059) | 0.073 | 0.162 | |
| MI | −0.007 (0.047) | 0.071 | 0.048 | 0.043 (0.045) | 0.068 | 0.062 | −0.009 (0.048) | 0.072 | 0.049 | 0.046 (0.046) | 0.069 | 0.065 | |
| MIMP | −0.007 (0.047) | 0.071 | 0.048 | 0.043 (0.046) | 0.068 | 0.063 | −0.009 (0.049) | 0.073 | 0.050 | 0.046 (0.046) | 0.069 | 0.065 | |
| GBM | 0.068 (0.050) | 0.067 | 0.084 | – | – | – | – | – | – | – | – | – | |
| GBM + SI + pe | 0.035 (0.048) | 0.067 | 0.059 | – | – | – | – | – | – | – | – | – | |
Note. complete: logistic regression with complete data before introducing missingness; comGBM: GBM with complete data before introducing missingness; SI + pe + pu: single imputation + prediction error + parameter uncertainty; SI + pe: single imputation + prediction error; TMI: treatment mean imputation; MI: multiple imputation (m = 20); MIMP: multiple imputation missingness pattern (m = 20); GBM: GBM with incomplete data; GBM + SI + pe: GBM after single imputation + prediction error; SD: standard deviation; SE: standard error; RMSE: root mean squared error
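In the two tables above, the TMI rows show the largest bias of the listed methods. As an illustration of what such a method does, the sketch below assumes that treatment mean imputation replaces each missing covariate value with the observed mean of that covariate among units in the same treatment group. That reading is an assumption based on the method's name, and the function name, the treatment column `T`, and the toy rows are all hypothetical.

```python
from statistics import mean


def treatment_mean_impute(rows, covariate, treatment="T"):
    """Fill missing (None) values of `covariate` with the observed mean in the same treatment group.

    Raises statistics.StatisticsError if a treatment group has no observed values.
    """
    group_means = {}
    for arm in {row[treatment] for row in rows}:
        observed = [
            row[covariate]
            for row in rows
            if row[treatment] == arm and row[covariate] is not None
        ]
        group_means[arm] = mean(observed)
    # Build new rows so the incomplete input is left unchanged.
    return [
        {**row, covariate: group_means[row[treatment]] if row[covariate] is None else row[covariate]}
        for row in rows
    ]


# Hypothetical toy data: T is the treatment indicator, x1 a covariate with gaps.
rows = [
    {"T": 1, "x1": 2.0},
    {"T": 1, "x1": 4.0},
    {"T": 1, "x1": None},
    {"T": 0, "x1": 1.0},
    {"T": 0, "x1": None},
]
imputed = treatment_mean_impute(rows, "x1")
print([r["x1"] for r in imputed])
```

Every imputed unit in a treatment group receives the same value, so the filled-in covariate carries no unit-level information beyond treatment status.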
Simulation results for scenario G with the correct logistic model (25% missing). The first row uses the complete data; each following block of five rows corresponds to one missingness mechanism.
| Method | Bias (SD) | SE | RMSE | Bias (SD) | SE | RMSE | Bias (SD) | SE | RMSE | |
|---|---|---|---|---|---|---|---|---|---|---|
| complete | −0.002 (0.045) | 0.070 | 0.045 | 0.000 (0.030) | 0.049 | 0.030 | 0.001 (0.014) | 0.022 | 0.014 | |
| SI + pe + pu | −0.002 (0.053) | 0.070 | 0.053 | −0.001 (0.035) | 0.049 | 0.035 | 0.000 (0.016) | 0.022 | 0.016 | |
| SI + pe | −0.003 (0.052) | 0.070 | 0.052 | 0.000 (0.035) | 0.049 | 0.035 | 0.000 (0.016) | 0.022 | 0.016 | |
| TMI | −0.013 (0.062) | 0.076 | 0.063 | −0.009 (0.041) | 0.053 | 0.042 | −0.008 (0.018) | 0.023 | 0.020 | |
| MI | −0.002 (0.048) | 0.071 | 0.048 | −0.001 (0.033) | 0.049 | 0.033 | 0.000 (0.015) | 0.022 | 0.014 | |
| MIMP | −0.002 (0.048) | 0.071 | 0.049 | −0.001 (0.033) | 0.050 | 0.033 | 0.000 (0.015) | 0.022 | 0.014 | |
| SI + pe + pu | 0.009 (0.052) | 0.070 | 0.053 | 0.009 (0.035) | 0.049 | 0.036 | 0.010 (0.016) | 0.022 | 0.019 | |
| SI + pe | 0.008 (0.051) | 0.070 | 0.052 | 0.010 (0.034) | 0.049 | 0.035 | 0.011 (0.016) | 0.022 | 0.019 | |
| TMI | 0.019 (0.057) | 0.073 | 0.060 | 0.021 (0.039) | 0.051 | 0.044 | 0.023 (0.018) | 0.023 | 0.029 | |
| MI | 0.008 (0.047) | 0.070 | 0.048 | 0.009 (0.031) | 0.049 | 0.033 | 0.010 (0.015) | 0.022 | 0.017 | |
| MIMP | 0.008 (0.048) | 0.071 | 0.048 | 0.009 (0.032) | 0.049 | 0.034 | 0.009 (0.015) | 0.022 | 0.017 | |
| SI + pe + pu | −0.001 (0.052) | 0.070 | 0.052 | 0.003 (0.036) | 0.049 | 0.036 | 0.002 (0.016) | 0.022 | 0.016 | |
| SI + pe | 0.000 (0.052) | 0.070 | 0.052 | 0.003 (0.035) | 0.049 | 0.035 | 0.002 (0.016) | 0.022 | 0.016 | |
| TMI | −0.019 (0.064) | 0.077 | 0.067 | −0.015 (0.043) | 0.054 | 0.046 | −0.015 (0.019) | 0.024 | 0.024 | |
| MI | 0.000 (0.047) | 0.071 | 0.048 | 0.003 (0.032) | 0.049 | 0.033 | 0.002 (0.015) | 0.022 | 0.014 | |
| MIMP | 0.001 (0.047) | 0.071 | 0.046 | 0.004 (0.033) | 0.050 | 0.033 | 0.005 (0.015) | 0.022 | 0.015 | |
| SI + pe + pu | 0.003 (0.050) | 0.070 | 0.050 | 0.002 (0.035) | 0.049 | 0.035 | 0.000 (0.016) | 0.022 | 0.016 | |
| SI + pe | −0.003 (0.052) | 0.070 | 0.052 | 0.002 (0.035) | 0.049 | 0.035 | 0.000 (0.016) | 0.022 | 0.016 | |
| TMI | 0.002 (0.061) | 0.076 | 0.061 | 0.003 (0.042) | 0.052 | 0.042 | −0.004 (0.018) | 0.023 | 0.018 | |
| MI | 0.001 (0.047) | 0.071 | 0.048 | 0.002 (0.033) | 0.049 | 0.032 | 0.000 (0.015) | 0.022 | 0.015 | |
| MIMP | 0.001 (0.047) | 0.071 | 0.047 | 0.002 (0.033) | 0.049 | 0.033 | 0.000 (0.015) | 0.022 | 0.014 | |
Note. complete: logistic regression with complete data before introducing missingness; SI + pe + pu: single imputation + prediction error + parameter uncertainty; SI + pe: single imputation + prediction error; TMI: treatment mean imputation; MI: multiple imputation (m = 20); MIMP: multiple imputation missingness pattern (m = 20); SD: standard deviation; SE: standard error; RMSE: root mean squared error