Shaun R Seaman, Ian R White, Andrew J Copas, Leah Li.
Abstract
Two approaches commonly used to deal with missing data are multiple imputation (MI) and inverse-probability weighting (IPW). IPW is also used to adjust for unequal sampling fractions. MI is generally more efficient than IPW but more complex. Whereas IPW requires only a model for the probability that an individual has complete data (a univariate outcome), MI needs a model for the joint distribution of the missing data (a multivariate outcome) given the observed data. Inadequacies in either model may lead to important bias if large amounts of data are missing. A third approach combines MI and IPW to give a doubly robust estimator. A fourth approach (IPW/MI) combines MI and IPW but, unlike doubly robust methods, imputes only isolated missing values and uses weights to account for remaining larger blocks of unimputed missing data, such as would arise, e.g., in a cohort study subject to sample attrition, and/or unequal sampling fractions. In this article, we examine the performance, in terms of bias and efficiency, of IPW/MI relative to MI and IPW alone and investigate whether the Rubin's rules variance estimator is valid for IPW/MI. We prove that the Rubin's rules variance estimator is valid for IPW/MI for linear regression with an imputed outcome, we present simulations supporting the use of this variance estimator in more general settings, and we demonstrate that IPW/MI can have advantages over alternatives. IPW/MI is applied to data from the National Child Development Study.
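The pooling step that the abstract refers to as Rubin's rules can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that each of the M imputed datasets has already been analysed (in IPW/MI, by a weighted analysis) to give one point estimate and one variance estimate, and the function name `rubin_pool` is hypothetical.

```python
from statistics import mean, variance


def rubin_pool(estimates, variances):
    """Pool per-imputation results with Rubin's rules (illustrative sketch).

    estimates: point estimates, one per imputed dataset
    variances: the corresponding estimated variances
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")
    pooled = mean(estimates)        # average of the M point estimates
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance, divisor M - 1
    total = within + (1 + 1 / m) * between
    return pooled, total
```

For example, `rubin_pool([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])` pools three imputations; the square root of the returned total variance is the pooled standard error.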
Year: 2011 PMID: 22050039 PMCID: PMC3412287 DOI: 10.1111/j.1541-0420.2011.01666.x
Source DB: PubMed Journal: Biometrics ISSN: 0006-341X Impact factor: 2.571
Mean parameter estimate (“mean”), square root of mean estimated variance (“aSE”), and empirical SE (“eSE”) for four parameters and 10 analysis methods. The true value of θ = (θ0, θ2, θ3, θ23) is (−3, 0.5, 0.5, 1).
| Method | θ0 Mean | θ0 aSE | θ0 eSE | θ2 Mean | θ2 aSE | θ2 eSE | θ3 Mean | θ3 aSE | θ3 eSE | θ23 Mean | θ23 aSE | θ23 eSE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| True | −3.000 | | | .500 | | | .500 | | | 1.000 | | |
| CC/CC | −2.995 | .080 | .079 | .090 | .081 | .087 | .200 | .080 | .086 | 1.005 | .082 | .091 |
| CC/IPW | −2.993 | .082 | .079 | .199 | .092 | .091 | .200 | .086 | .089 | 1.004 | .094 | .100 |
| CC/MI | −2.994 | .075 | .076 | .202 | .081 | .081 | .201 | .079 | .083 | 1.004 | .084 | .086 |
| IPW/CC | −2.993 | .102 | .101 | .382 | .110 | .112 | .495 | .109 | .114 | 1.008 | .114 | .119 |
| IPW/IPW | −2.990 | .106 | .104 | .489 | .120 | .124 | .494 | .112 | .117 | 1.006 | .121 | .132 |
| IPW/MI | −2.992 | .097 | .096 | .498 | .105 | .105 | .497 | .104 | .107 | 1.006 | .110 | .113 |
| MI/MI | −3.000 | .089 | .081 | .503 | .092 | .087 | .497 | .090 | .088 | 1.006 | .092 | .082 |
| MI*/MI | −2.998 | .092 | .085 | .498 | .095 | .093 | .496 | .094 | .094 | .749 | .100 | .083 |
| MI/MI* | −2.999 | .108 | .101 | .100 | .088 | .054 | .099 | .088 | .051 | .391 | .091 | .055 |
| IPW/MI* | −2.998 | .107 | .100 | .492 | .119 | .122 | .495 | .117 | .115 | .776 | .131 | .127 |
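The three summaries reported for each parameter and method can be computed from simulation replicates along these lines. This is a sketch under stated assumptions: it takes "eSE" to be the sample standard deviation of the estimates across replicates, which the source does not spell out, and the function name `summarise_replicates` is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def summarise_replicates(estimates, variance_estimates):
    """Summarise one parameter under one method across simulation replicates.

    estimates: point estimate from each replicate
    variance_estimates: estimated variance from each replicate
    Returns (mean, aSE, eSE).
    """
    if len(estimates) != len(variance_estimates) or len(estimates) < 2:
        raise ValueError("need matching lengths and at least two replicates")
    mean_estimate = mean(estimates)
    a_se = sqrt(mean(variance_estimates))  # square root of mean estimated variance
    e_se = stdev(estimates)                # assumed: SD of estimates across replicates
    return mean_estimate, a_se, e_se
```

When the variance estimator is valid, aSE and eSE should agree apart from Monte Carlo error, which is the comparison the aSE and eSE columns support.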
Mean parameter estimate (mean), square root of mean estimated variance (aSE), and empirical SE (eSE) for five parameters and 10 analysis methods. Results for θ2 are omitted because, apart from Monte Carlo error, they are the same as for θ3. The true value of θ = (θ0, θ2, θ3, θ4, θ23) is (0, 0.5, 0.5, 0.5, 1).
| Method | θ0 Mean | θ0 aSE | θ0 eSE | θ3 Mean | θ3 aSE | θ3 eSE | θ4 Mean | θ4 aSE | θ4 eSE | θ23 Mean | θ23 aSE | θ23 eSE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| True | .000 | | | .500 | | | .500 | | | 1.000 | | |
| CC/CC | .238 | .060 | .056 | .196 | .061 | .065 | .183 | .060 | .064 | .992 | .064 | .077 |
| IPW/IPW | .020 | .095 | .102 | .485 | .103 | .113 | .479 | .108 | .124 | .990 | .108 | .119 |
| IPW/MI | .002 | .075 | .075 | .495 | .084 | .084 | .490 | .092 | .089 | 1.001 | .089 | .088 |
| MI*/MI | −.086 | .051 | .061 | .663 | .100 | .129 | .372 | .071 | .072 | .976 | .079 | .117 |
| MI*/MI* | −.087 | .051 | .060 | .674 | .100 | .126 | .337 | .077 | .081 | .970 | .080 | .112 |
| IPW/MI* | −.003 | .078 | .076 | .504 | .086 | .091 | .427 | .096 | .089 | .978 | .092 | .095 |
| IPWe/MI | .003 | .061 | .060 | .497 | .081 | .083 | .491 | .089 | .087 | 1.001 | .088 | .089 |
Log odds ratios (LOR) and standard errors (SE) for predictors of high blood glucose. Binary predictors are gestational age < 38 weeks, preeclampsia, smoking during pregnancy, prepregnancy BMI ≥ 25 kg/m², and manual socioeconomic position (SEP) at birth. Ordinal and continuous predictors are birth weight for gestational age (tertile), BMI at age 45 (kg/m²), and waist circumference at age 45 (cm). Adjustment was also made for sex and family history of diabetes.
| Predictor | CC/MI LOR | CC/MI SE | IPW/MI LOR | IPW/MI SE | MI/MI LOR | MI/MI SE |
|---|---|---|---|---|---|---|
| Short gestation | 0.46 | 0.22 | 0.48 | 0.23 | 0.44 | 0.20 |
| Preeclampsia | 0.46 | 0.27 | 0.55 | 0.27 | 0.47 | 0.25 |
| Mother overweight | 0.29 | 0.15 | 0.36 | 0.16 | 0.18 | 0.12 |
| Smoke in pregnancy | 0.02 | 0.14 | 0.04 | 0.14 | 0.04 | 0.14 |
| Manual SEP | 0.37 | 0.17 | 0.44 | 0.18 | 0.39 | 0.17 |
| Birth weight | −0.31 | 0.09 | −0.31 | 0.09 | −0.32 | 0.09 |
| BMI age 45 | 0.04 | 0.02 | 0.02 | 0.02 | 0.03 | 0.02 |
| Waist size age 45 | 0.07 | 0.01 | 0.07 | 0.01 | 0.07 | 0.01 |
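To read a log odds ratio and its SE on the odds-ratio scale, one common approach is a Wald interval. The sketch below is illustrative only: it assumes a normal approximation (intervals after multiple imputation are often based on a t reference distribution instead), the record does not report such intervals, and the function name `odds_ratio_with_ci` is hypothetical.

```python
from math import exp
from statistics import NormalDist


def odds_ratio_with_ci(lor, se, level=0.95):
    """Convert a log odds ratio and its SE to an odds ratio with a Wald interval.

    Assumes the estimate is approximately normal on the log-odds scale.
    Returns (odds ratio, lower limit, upper limit).
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    return exp(lor), exp(lor - z * se), exp(lor + z * se)
```

For instance, `odds_ratio_with_ci(0.46, 0.22)` would apply this to the CC/MI estimate for short gestation.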