Vanina Héraud-Bousquet, Christine Larsen, James Carpenter, Jean-Claude Desenclos, Yann Le Strat.
Abstract
BACKGROUND: Multiple imputation, as usually implemented, assumes that data are Missing At Random (MAR), meaning that the missing data mechanism, given the observed data, is independent of the unobserved data. To explore the sensitivity of the inferences to departures from the MAR assumption, we applied the method proposed by Carpenter et al. (2007). This approach aims to approximate inferences under a Missing Not At Random (MNAR) mechanism by reweighting estimates obtained after multiple imputation, where the weights depend on the assumed degree of departure from the MAR assumption.
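The reweighting idea can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes the unnormalized weight of imputed dataset m is exp(−δ × sum of the imputed values in dataset m), which is the form commonly attributed to Carpenter et al. (2007); the sign convention for δ, the function names and the toy numbers are all assumptions made for the example.

```python
import numpy as np

def normalized_weights(imputed_sums, delta):
    """Normalized weights w_m for M imputed datasets.

    imputed_sums[m] is the sum of the imputed values of the partially
    observed variable in dataset m. ASSUMPTION: the unnormalized weight
    is exp(-delta * imputed_sums[m]); delta = 0 corresponds to MAR.
    """
    log_w = -delta * np.asarray(imputed_sums, dtype=float)
    log_w -= log_w.max()              # stabilise before exponentiating
    w = np.exp(log_w)
    return w / w.sum()

def reweighted_estimate(estimates, weights):
    """Weighted average of the per-dataset estimates."""
    return float(np.sum(np.asarray(weights) * np.asarray(estimates, dtype=float)))

# Toy illustration with made-up numbers (NOT the study data).
rng = np.random.default_rng(0)
sums = rng.integers(340, 481, size=1000)        # hypothetical sums of imputed values
betas = 0.5 + 0.001 * (sums - sums.mean())      # hypothetical per-dataset coefficients

w_mar = normalized_weights(sums, 0.0)           # uniform weights under MAR
w_mnar = normalized_weights(sums, 0.15)         # weights under an assumed departure

print(reweighted_estimate(betas, w_mar), reweighted_estimate(betas, w_mnar))
```

With δ = 0 every dataset receives weight 1/M and the reweighted estimate reduces to the ordinary average over imputations; as |δ| grows, the weight concentrates on a few datasets, which is why the choice of δ matters.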
Year: 2012 PMID: 22681630 PMCID: PMC3537570 DOI: 10.1186/1471-2288-12-73
Source DB: PubMed Journal: BMC Med Res Methodol ISSN: 1471-2288 Impact factor: 4.615
Multivariate logistic regression of factors associated with severe liver disease
| Factors | Patients (n = 4 343) | % SLD | % missing data | Complete Case (n = 2 130) aOR* (95% CI*) | Multiple Imputation (n = 4 343) aOR* (95% CI*) |
| Period of inclusion | | | | | |
| 2001-2003 | 2330 | 7.0 | | | |
| 2004-2007 | 2013 | 9.5 | | | |
| Sex | | | | | |
| Female | 993 | 4.2 | | 1.0 | 1.0 |
| Male | 3350 | 9.3 | | 1.8 [1.1,3.0] | 2.0 [1.4,2.9] |
| Age | | | | | |
| ≤ 40 years | 2435 | 3.9 | | 1.0 | 1.0 |
| > 40 years | 1908 | 13.6 | | 2.2 [1.5,3.3] | 2.3 [1.7,3.1] |
| Time between 1st HCV + test and referral | | | | | |
| < 1 year | 1728 | 6.7 | | | |
| ≥ 1 year | 2163 | 8.7 | | | |
| Missing | 452 | 11.5 | 10.4 | | |
| Duration of HCV infection at referral† | | | | | |
| < 18 years | 1709 | 3.0 | | 1.0 | 1.0 |
| ≥ 18 years | 2002 | 12.5 | | 3.1 [2.0,5.1] | 2.6 [1.8,3.7] |
| Missing | 632 | 8.2 | 14.6 | | |
| History of excessive alcohol intake‡ | | | | | |
| No | 2015 | 4.5 | | 1.0 | 1.0 |
| Yes | 1847 | 13.2 | | 2.6 [1.8,3.7] | 2.8 [2.2,3.7] |
| Missing | 481 | 4.4 | 11.1 | | |
| HBsAg status | | | | | |
| Negative | 3570 | 8.3 | | 1.0 | |
| Positive | 89 | 13.5 | | 2.4 [1.0,5.9] | |
| Missing | 684 | 6.7 | 15.7 | | |
| HIV serostatus | | | | | |
| Negative | 3342 | 8.2 | | | 1.0 |
| Positive | 294 | 14.0 | | | 1.8 [1.2,2.6] |
| Missing | 707 | 5.7 | 16.3 | | |
| HCV genotype 3 | | | | | |
| No | 2083 | 7.2 | | 1.0 | 1.0 |
| Yes | 1117 | 10.3 | | 1.5 [1.1,2.0] | 1.6 [1.3,2.1] |
| Missing | 1143 | 7.8 | 26.3 | | |
*aOR, adjusted Odds Ratio; CI, confidence interval; SLD, severe liver disease (cirrhosis, hepatocellular carcinoma); HCV, hepatitis C virus; HBsAg, hepatitis B surface antigen; HIV, human immunodeficiency virus.
† Time from suspected year of infection to year of referral to the reference centre. Suspected year of HCV infection is defined as year of the last HCV negative test performed during the drug-use period or year of first drug injection.
‡ >210 g/week for women and >280 g/week for men.
Complete case and multiple imputation analyses were applied to a population of HCV-RNA positive drug users newly referred in hepatology reference centres in France, 2001–2007.
Multivariate regression to explain the missing indicator of genotype 3 using covariates
| Covariates | Coefficient | SE* | P* |
| Severe liver disease † | −0.05 | 0.13 | 0.72 |
| Age | 0.16 | 0.09 | 0.06 |
| Sex | 0.04 | 0.08 | 0.63 |
| Disease duration ‡ | 0.19 | 0.08 | 0.04 |
| Delay of referral 4 | 0.05 | 0.08 | 0.53 |
| Alcohol consumption 5 | −0.005 | 0.07 | 0.94 |
| HIV serostatus | 0.02 | 0.14 | 0.90 |
| HBsAg status | −0.14 | 0.23 | 0.52 |
* P, p-value; SE, standard error.
† Cirrhosis or hepatocellular carcinoma.
‡ Time from suspected year of infection to year of referral to the reference centre. Suspected year of HCV infection is defined as year of the last HCV negative test performed during the drug-use period or year of first drug injection.
4 Time between testing and first referral.
5 >210 g/week for women and >280 g/week for men.
Figure 1. Graphical determination of a delta value for the variable genotype 3. Left panel: histogram of the sum of genotype 3 imputed values in each dataset, for M = 1000 imputed datasets; the extreme values of this sum are 340 in imputed dataset no. 921 and 480 in imputed dataset no. 771. Right panel: normalized weights (wm) for each imputed dataset according to δ.
Figure 2. Normalized weights (w) for each imputed dataset according to δ for the variable genotype 3. The hatched zone delineates values of δ corresponding to maximum normalized weights equal to 0.5.
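The graphical rule behind this figure, keeping δ within the range where no single imputed dataset carries a normalized weight above 0.5, can also be checked numerically. The sketch below is illustrative only: it assumes unnormalized weights of the form exp(−δ × sum of imputed values), and the function names, the grid of δ values and the five toy sums are hypothetical.

```python
import numpy as np

def max_normalized_weight(imputed_sums, delta):
    """Largest normalized weight across imputed datasets for a given delta.

    ASSUMPTION: unnormalized weight = exp(-delta * imputed_sums[m]).
    """
    log_w = -delta * np.asarray(imputed_sums, dtype=float)
    log_w -= log_w.max()              # largest unnormalized weight becomes 1
    w = np.exp(log_w)
    return float(w.max() / w.sum())

def admissible_deltas(imputed_sums, grid, cap=0.5):
    """Values of delta on the grid whose maximum normalized weight is <= cap."""
    return [float(d) for d in grid if max_normalized_weight(imputed_sums, d) <= cap]

# Toy sums of imputed values for five datasets (made up, NOT the study data).
sums = np.array([340.0, 380.0, 410.0, 440.0, 480.0])
grid = np.linspace(-0.2, 0.2, 401)    # hypothetical search grid for delta

ok = admissible_deltas(sums, grid)
print(min(ok), max(ok))               # bounds of the admissible range on this grid
```

Because the largest weight grows as δ moves away from 0 in either direction, the admissible values form an interval around δ = 0; with very few datasets, as in this toy example, that interval is narrow.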
Figure 3. Analysis of the variable genotype 3 with δ = 0.15. Left panel: normalized weights (wm) versus the estimated logistic regression coefficient of genotype 3 in imputed dataset m, for each imputed dataset. The dashed line represents the mean of these coefficients over the 1000 imputed datasets. Right panel: running estimate, calculated as the moving average of the coefficient estimates according to the number of imputed datasets. On the right axis is plotted the 'rug' of the 1000 estimates, one per imputed dataset. The dashed line represents the mean of the coefficients over the 1000 imputed datasets.
Multivariate analysis for the complete case (CC), multiple imputation (MI) and sensitivity analysis (SA), with M = 1000 imputed datasets
| Variable | Missing values (%) | CC aOR [95% CI] | CC SE | CC CV (%) | MI aOR [95% CI] | MI SE | MI CV (%) | VRMI (%) | δ | SA aOR* [95% CI] | SA SE | SA CV (%) | VRSA (%) |
| Alcohol consumption | 11.1 | 2.32 [1.66,3.23] | 0.39 | 17 | 2.82 [2.18,3.66] | 0.37 | 13 | 21.86 | −0.40 | 2.86 [2.21,3.70] | 0.37 | 13 | 1.29 |
| Genotype 3 | 26.3 | 1.51 [1.10,2.07] | 0.24 | 16 | 1.66 [1.27,2.16] | 0.23 | 14 | 9.70 | 0.15 | 1.60 [1.23,2.06] | 0.21 | 13 | 3.56 |
| HIV serostatus | 16.3 | 1.56 [0.92,2.62] | 0.41 | 27 | 1.80 [1.24,2.61] | 0.34 | 19 | 15.52 | 0.70 | 1.91 [1.32,2.76] | 0.36 | 19 | 6.12 |
Note. aOR, odds ratio adjusted on sex, age, duration of HCV infection, alcohol consumption and HIV serostatus; aOR*, odds ratio obtained from the MI adjusted odds ratio estimates.
CI, confidence interval; CV, coefficient of variation of the aOR; VRMI, variation rate of the aOR between the CC and MI analyses; VRSA, variation rate of the aOR between the MI and sensitivity analyses.
Covariates included in the model were: sex, age, duration of HCV infection, alcohol consumption, genotype 3 and HIV serostatus. Sensitivity analysis: the weighting process was applied to alcohol consumption, genotype 3 and HIV serostatus independently.
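The two summary measures in the table can be approximated from its rounded entries. The formulas below are assumptions, since the source does not define them: CV is taken as 100 × SE / aOR, and the variation rate as the absolute relative change between two aORs, in percent. Because the published aORs are rounded, the recomputed values only approximate the reported ones.

```python
def coefficient_of_variation(aor, se):
    """CV in percent. ASSUMPTION: CV = 100 * SE / aOR."""
    return 100.0 * se / aor

def variation_rate(aor_ref, aor_new):
    """Variation rate in percent.

    ASSUMPTION: absolute relative change of aor_new with respect to aor_ref.
    """
    return 100.0 * abs(aor_new - aor_ref) / aor_ref

# Alcohol consumption row, using the rounded values printed in the table.
cv_cc = coefficient_of_variation(2.32, 0.39)   # complete case aOR and SE
vr_mi = variation_rate(2.32, 2.82)             # complete case vs multiple imputation
vr_sa = variation_rate(2.82, 2.86)             # multiple imputation vs sensitivity analysis

print(round(cv_cc, 1), round(vr_mi, 1), round(vr_sa, 1))
```

For this row the recomputed CV rounds to the reported 17%, and the variation rates are close to the reported 21.86% and 1.29%; the gaps are what one would expect from rounding the aORs to two decimals.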
Figure 4. Normalized weights (w) for each imputed dataset according to δ for the variables alcohol consumption and HIV serostatus. Left panel: the interval for δ is restricted to [−0.4;0.4]. Right panel: the interval for δ is [−0.4;0.7].
Figure 5. Variation rate according to δ after sensitivity analysis (VRSA) for genotype 3, alcohol consumption and HIV serostatus. The black points correspond to the VRSA calculated for the value of δ retained for each variable (genotype 3 δ = 0.15, alcohol δ = −0.4 and HIV δ = 0.7).