Jinhui Ma, Noori Akhtar-Danesh, Lisa Dolovich, Lehana Thabane.
Abstract
BACKGROUND: Attrition, which leads to missing data, is a common problem in cluster randomized trials (CRTs), where groups of patients rather than individuals are randomized. Standard multiple imputation (MI) strategies may not be appropriate for imputing missing data from CRTs because they assume independent data. In this paper, under the assumptions of missing completely at random and covariate dependent missing, we used a simulation study to compare six MI strategies that account for the intra-cluster correlation for missing binary outcomes in CRTs with the standard imputation strategies and the complete case analysis approach.
Year: 2011 PMID: 21324148 PMCID: PMC3055218 DOI: 10.1186/1471-2288-11-18
Source DB: PubMed Journal: BMC Med Res Methodol ISSN: 1471-2288 Impact factor: 4.615
Kappa statistics for different imputation strategies when missingness is completely at random
| Imputation level | Imputation strategy | 5% missing | 10% missing | 15% missing | 20% missing | 30% missing | 50% missing |
|---|---|---|---|---|---|---|---|
| Within cluster | Logistic regression | 0.954 | 0.913 | | | | |
| | Propensity score | 0.953 | 0.910 | 0.865 | 0.820 | 0.730 | 0.549 |
| | MCMC¹ | 0.954 | 0.913 | 0.869 | 0.825 | 0.737 | 0.561 |
| Across cluster | Propensity score | 0.954 | 0.912 | 0.868 | 0.828 | 0.738 | 0.556 |
| | Random-effects logistic regression | 0.955 | 0.914 | 0.871 | 0.830 | 0.741 | 0.562 |
| | Fixed-effects logistic regression | 0.956 | 0.911 | 0.866 | 0.821 | 0.732 | 0.554 |
| Ignore cluster | Logistic regression | 0.954 | 0.907 | 0.861 | 0.814 | 0.722 | 0.537 |
| | Propensity score | 0.952 | 0.902 | 0.854 | 0.804 | 0.707 | 0.512 |
| | MCMC¹ | 0.953 | 0.906 | 0.859 | 0.811 | 0.717 | 0.530 |
Note:
1. MCMC = Markov chain Monte Carlo. For the MCMC methods, imputed values are rounded to 1 if they are greater than or equal to 0.5 and to 0 otherwise.
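The rounding rule in the note and the kappa statistic reported in the table can be illustrated with a minimal Python sketch. It assumes that kappa here means Cohen's kappa between the imputed binary outcomes and the corresponding true outcomes; the function names and the toy data are hypothetical and are not taken from the study.

```python
def round_imputed(values, threshold=0.5):
    """Map continuous imputed values to 0/1: 1 if value >= threshold, else 0."""
    return [1 if v >= threshold else 0 for v in values]


def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length sequences of binary (0/1) labels."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    # Observed proportion of agreement.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance, from the marginal proportions of 1s.
    p_a = sum(a) / n
    p_b = sum(b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters are constant")
    return (observed - expected) / (1 - expected)


# Toy data (hypothetical): true outcomes and continuous imputed values.
true_outcome = [1, 0, 1, 1, 0, 0, 1, 0]
imputed_raw = [0.8, 0.2, 0.5, 0.4, 0.1, 0.6, 0.9, 0.3]

imputed = round_imputed(imputed_raw)  # 0.5 is rounded up to 1
kappa = cohen_kappa(true_outcome, imputed)
print(imputed, round(kappa, 3))
```

In this toy example six of the eight values agree and the chance agreement is 0.5, so kappa is 0.5. Values near 1, as in the 5% column of the table, indicate almost perfect agreement.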
Estimated treatment effects for different imputation strategies when missingness is completely at random
OR⁴ and 95% CI⁵ for complete data: GEE² 1.14 (0.76 1.70); RE³ 1.12 (0.72 1.76). Cells below give the OR and 95% CI at each percentage of missingness.

| Imputation level | Imputation strategy | Analysis model | 5% missing | 10% missing | 15% missing | 20% missing | 30% missing | 50% missing |
|---|---|---|---|---|---|---|---|---|
| Within cluster | Logistic regression | GEE | 1.14 (0.75 1.73) | 1.14 (0.76 1.72) | | | | |
| | | RE | 1.13 (0.71 1.79) | 1.13 (0.71 1.78) | | | | |
| | Propensity score | GEE | 1.14 (0.75 1.74) | 1.14 (0.75 1.73) | 1.14 (0.74 1.75) | 1.14 (0.73 1.77) | 1.14 (0.72 1.82) | 1.17 (0.68 2.01) |
| | | RE | 1.12 (0.70 1.80) | 1.12 (0.70 1.79) | 1.12 (0.69 1.81) | 1.12 (0.68 1.84) | 1.12 (0.66 1.90) | 1.14 (0.61 2.14) |
| | MCMC¹ | GEE | 1.14 (0.75 1.72) | 1.13 (0.75 1.70) | 1.13 (0.75 1.71) | 1.12 (0.74 1.71) | 1.12 (0.72 1.73) | 1.11 (0.69 1.79) |
| | | RE | 1.12 (0.71 1.78) | 1.11 (0.70 1.76) | 1.11 (0.70 1.77) | 1.11 (0.69 1.78) | 1.10 (0.67 1.79) | 1.10 (0.64 1.87) |
| Across cluster | Propensity score | GEE | 1.14 (0.77 1.69) | 1.14 (0.77 1.68) | 1.14 (0.78 1.68) | 1.14 (0.78 1.67) | 1.15 (0.79 1.68) | 1.16 (0.77 1.74) |
| | | RE | 1.18 (0.88 1.59) | 1.18 (0.88 1.59) | 1.18 (0.87 1.60) | 1.18 (0.87 1.61) | 1.18 (0.85 1.64) | 1.19 (0.80 1.77) |
| | Random-effects logistic regression | GEE | 1.15 (0.78 1.69) | 1.16 (0.79 1.70) | 1.17 (0.80 1.72) | 1.18 (0.80 1.74) | 1.21 (0.80 1.81) | 1.25 (0.75 2.06) |
| | | RE | 1.14 (0.74 1.74) | 1.15 (0.76 1.75) | 1.17 (0.77 1.76) | 1.18 (0.78 1.78) | 1.21 (0.79 1.85) | 1.25 (0.75 2.08) |
| | Fixed-effects logistic regression | GEE | 1.14 (0.76 1.71) | 1.15 (0.76 1.73) | 1.15 (0.76 1.75) | 1.16 (0.75 1.78) | 1.16 (0.74 1.84) | 1.18 (0.69 2.01) |
| | | RE | 1.13 (0.72 1.77) | 1.13 (0.72 1.79) | 1.14 (0.72 1.82) | 1.14 (0.71 1.84) | 1.15 (0.69 1.91) | 1.16 (0.63 2.13) |
| Ignore cluster | Logistic regression | GEE | 1.14 (0.78 1.68) | 1.14 (0.79 1.66) | 1.15 (0.80 1.65) | 1.15 (0.80 1.65) | 1.16 (0.82 1.64) | 1.16 (0.82 1.65) |
| | | RE | 1.13 (0.74 1.73) | 1.14 (0.75 1.71) | 1.14 (0.77 1.70) | 1.15 (0.78 1.68) | 1.16 (0.81 1.66) | 1.16 (0.82 1.66) |
| | Propensity score | GEE | 1.14 (0.78 1.67) | 1.14 (0.79 1.66) | 1.15 (0.80 1.65) | 1.15 (0.81 1.64) | 1.15 (0.82 1.61) | 1.16 (0.82 1.62) |
| | | RE | 1.13 (0.74 1.72) | 1.14 (0.76 1.70) | 1.14 (0.77 1.68) | 1.15 (0.79 1.67) | 1.15 (0.81 1.64) | 1.16 (0.82 1.63) |
| | MCMC¹ | GEE | 1.14 (0.78 1.68) | 1.14 (0.78 1.66) | 1.14 (0.79 1.65) | 1.14 (0.80 1.63) | 1.14 (0.81 1.61) | 1.15 (0.83 1.59) |
| | | RE | 1.13 (0.74 1.73) | 1.13 (0.75 1.70) | 1.14 (0.77 1.68) | 1.14 (0.78 1.66) | 1.14 (0.80 1.63) | 1.15 (0.82 1.60) |
| Complete case analysis | | GEE | 1.14 (0.76 1.70) | 1.14 (0.76 1.71) | 1.14 (0.76 1.72) | 1.15 (0.76 1.72) | 1.15 (0.76 1.75) | 1.16 (0.74 1.81) |
| | | RE | 1.12 (0.72 1.76) | 1.13 (0.72 1.76) | 1.13 (0.72 1.77) | 1.13 (0.72 1.78) | 1.14 (0.72 1.81) | 1.15 (0.71 1.87) |
Note:
1. MCMC = Markov chain Monte Carlo. For the MCMC methods, imputed values are rounded to 1 if they are greater than or equal to 0.5 and to 0 otherwise.
2. GEE = Generalized estimating equations
3. RE = Random-effects logistic regression
4. OR = Odds ratio
5. CI = Confidence interval
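To compare the precision of the intervals in the table, it can help to move to the log-odds scale, where a Wald-type interval is symmetric around the estimate. The sketch below assumes the reported intervals are Wald intervals of the form exp(log OR ± 1.96 × SE), which this page does not state. The function names are hypothetical, and the two inputs are copied from the table (complete data with GEE, and ignore-cluster MCMC with GEE at 50% missingness).

```python
import math

Z_95 = 1.959963984540054  # 97.5th percentile of the standard normal


def implied_se(lower, upper, z=Z_95):
    """Standard error of log(OR) implied by a Wald-type CI (assumption)."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def wald_ci(odds_ratio, se, z=Z_95):
    """Rebuild the CI on the OR scale from the OR and the SE of log(OR)."""
    log_or = math.log(odds_ratio)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


# Complete data, GEE: OR 1.14 (0.76 1.70)
se_complete = implied_se(0.76, 1.70)
# Ignore cluster, MCMC, GEE, 50% missing: OR 1.15 (0.83 1.59)
se_ignore = implied_se(0.83, 1.59)

# Round trip: the rebuilt interval should match the reported one up to rounding.
lo, hi = wald_ci(1.14, se_complete)
print(round(se_complete, 3), round(se_ignore, 3), round(lo, 2), round(hi, 2))
```

Under this assumption the implied standard error for the ignore-cluster MCMC row at 50% missingness is smaller than the complete-data one, which matches the narrower interval shown in the table even though half of the outcomes were imputed.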
Kappa statistics for different imputation strategies when missingness is covariate dependent
| Imputation level | Imputation strategy | 5% missing | 10% missing | 15% missing | 20% missing | 30% missing | 50% missing |
|---|---|---|---|---|---|---|---|
| Within cluster | Logistic regression | 0.949 | 0.902 | | | | |
| | Propensity score | 0.947 | 0.899 | 0.850 | 0.801 | 0.706 | 0.524 |
| | MCMC¹ | 0.948 | 0.901 | 0.854 | 0.806 | 0.714 | 0.535 |
| Across cluster | Propensity score | 0.949 | 0.903 | 0.853 | 0.805 | 0.713 | 0.529 |
| | Random-effects logistic regression | 0.951 | 0.908 | 0.859 | 0.808 | 0.717 | 0.538 |
| | Fixed-effects logistic regression | 0.949 | 0.899 | 0.850 | 0.801 | 0.707 | 0.528 |
| Ignore cluster | Logistic regression | 0.947 | 0.895 | 0.844 | 0.793 | 0.695 | 0.508 |
| | Propensity score | 0.945 | 0.891 | 0.839 | 0.787 | 0.688 | 0.495 |
| | MCMC¹ | 0.946 | 0.893 | 0.841 | 0.790 | 0.691 | 0.501 |
Note:
1. MCMC = Markov chain Monte Carlo. For the MCMC methods, imputed values are rounded to 1 if they are greater than or equal to 0.5 and to 0 otherwise.
Figure 1. Kappa statistics for different imputation strategies when missingness is covariate dependent.
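The two missingness mechanisms compared on this page differ in whether the chance of a missing outcome depends on an observed covariate. A minimal sketch of the distinction follows; the logistic form, the intercept and slope values, the binary covariate, and all function names are illustrative assumptions, not the parameters of the study's simulation.

```python
import math
import random


def mcar_mask(n, rate, rng):
    """Missing completely at random: every outcome has the same chance of being missing."""
    return [rng.random() < rate for _ in range(n)]


def covariate_dependent_mask(covariate, intercept, slope, rng):
    """Covariate dependent missingness: P(missing) is logistic in the covariate."""
    mask = []
    for x in covariate:
        p_missing = 1.0 / (1.0 + math.exp(-(intercept + slope * x)))
        mask.append(rng.random() < p_missing)
    return mask


def missing_rate(mask, covariate, level):
    """Share of missing outcomes among units whose covariate equals `level`."""
    flags = [m for m, c in zip(mask, covariate) if c == level]
    return sum(flags) / len(flags)


rng = random.Random(2011)            # fixed seed so the sketch is reproducible
x = [i % 2 for i in range(10000)]    # hypothetical binary covariate

mcar = mcar_mask(len(x), 0.30, rng)
cdm = covariate_dependent_mask(x, intercept=-2.0, slope=1.5, rng=rng)

for name, mask in (("MCAR", mcar), ("covariate dependent", cdm)):
    print(name, round(missing_rate(mask, x, 0), 2), round(missing_rate(mask, x, 1), 2))
```

Under the first mechanism the missing rate is about the same at both covariate levels. Under the second it is clearly higher where the covariate equals 1.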
Estimated treatment effects for different imputation strategies when missingness is covariate dependent
OR⁴ and 95% CI⁵ for complete data: GEE² 1.14 (0.76 1.70); RE³ 1.12 (0.72 1.76). Cells below give the OR and 95% CI at each percentage of missingness.

| Imputation level | Imputation strategy | Analysis model | 5% missing | 10% missing | 15% missing | 20% missing | 30% missing | 50% missing |
|---|---|---|---|---|---|---|---|---|
| Within cluster | Logistic regression | GEE | 1.14 (0.76 1.72) | 1.14 (0.76 1.72) | | | | |
| | | RE | 1.12 (0.71 1.78) | 1.13 (0.71 1.78) | | | | |
| | Propensity score | GEE | 1.14 (0.75 1.72) | 1.14 (0.75 1.73) | 1.14 (0.74 1.75) | 1.14 (0.73 1.78) | 1.15 (0.71 1.84) | 1.18 (0.68 2.04) |
| | | RE | 1.12 (0.70 1.79) | 1.12 (0.70 1.79) | 1.12 (0.69 1.82) | 1.12 (0.68 1.86) | 1.12 (0.65 1.93) | 1.15 (0.61 2.18) |
| | MCMC¹ | GEE | 1.13 (0.75 1.71) | 1.13 (0.75 1.70) | 1.13 (0.74 1.71) | 1.12 (0.74 1.72) | 1.12 (0.72 1.74) | 1.12 (0.69 1.80) |
| | | RE | 1.11 (0.70 1.77) | 1.11 (0.70 1.76) | 1.11 (0.69 1.77) | 1.11 (0.69 1.78) | 1.10 (0.67 1.81) | 1.10 (0.64 1.88) |
| Across cluster | Propensity score | GEE | 1.14 (0.77 1.68) | 1.14 (0.77 1.67) | 1.14 (0.78 1.67) | 1.14 (0.79 1.67) | 1.15 (0.79 1.67) | 1.15 (0.76 1.72) |
| | | RE | 1.18 (0.88 1.59) | 1.18 (0.87 1.59) | 1.18 (0.87 1.60) | 1.18 (0.86 1.61) | 1.18 (0.85 1.64) | 1.17 (0.78 1.76) |
| | Random-effects logistic regression | GEE | 1.15 (0.78 1.69) | 1.16 (0.80 1.70) | 1.18 (0.81 1.72) | 1.19 (0.81 1.75) | 1.22 (0.81 1.83) | 1.31 (0.83 2.06) |
| | | RE | 1.14 (0.75 1.74) | 1.16 (0.77 1.74) | 1.18 (0.79 1.76) | 1.19 (0.80 1.78) | 1.22 (0.80 1.86) | 1.31 (0.83 2.05) |
| | Fixed-effects logistic regression | GEE | 1.14 (0.76 1.71) | 1.15 (0.76 1.73) | 1.15 (0.76 1.76) | 1.16 (0.75 1.79) | 1.17 (0.73 1.86) | 1.17 (0.67 2.04) |
| | | RE | 1.13 (0.72 1.77) | 1.14 (0.72 1.79) | 1.14 (0.71 1.83) | 1.15 (0.71 1.86) | 1.15 (0.68 1.94) | 1.15 (0.61 2.18) |
| Ignore cluster | Logistic regression | GEE | 1.14 (0.78 1.67) | 1.14 (0.79 1.65) | 1.15 (0.80 1.64) | 1.15 (0.81 1.64) | 1.16 (0.83 1.63) | 1.15 (0.81 1.63) |
| | | RE | 1.13 (0.74 1.72) | 1.14 (0.76 1.70) | 1.15 (0.78 1.68) | 1.15 (0.80 1.67) | 1.16 (0.82 1.65) | 1.15 (0.81 1.63) |
| | Propensity score | GEE | 1.14 (0.78 1.67) | 1.14 (0.79 1.65) | 1.15 (0.81 1.64) | 1.15 (0.82 1.63) | 1.15 (0.83 1.61) | 1.15 (0.82 1.62) |
| | | RE | 1.13 (0.75 1.72) | 1.14 (0.77 1.69) | 1.15 (0.79 1.67) | 1.15 (0.80 1.66) | 1.15 (0.82 1.63) | 1.15 (0.82 1.62) |
| | MCMC¹ | GEE | 1.14 (0.78 1.67) | 1.14 (0.79 1.65) | 1.15 (0.80 1.63) | 1.15 (0.81 1.62) | 1.15 (0.82 1.59) | 1.13 (0.82 1.57) |
| | | RE | 1.13 (0.74 1.72) | 1.14 (0.77 1.69) | 1.14 (0.78 1.67) | 1.15 (0.80 1.65) | 1.15 (0.81 1.61) | 1.13 (0.82 1.57) |
| Complete case analysis | | GEE | 1.14 (0.76 1.70) | 1.14 (0.76 1.71) | 1.14 (0.76 1.72) | 1.15 (0.76 1.73) | 1.15 (0.75 1.75) | 1.15 (0.73 1.80) |
| | | RE | 1.13 (0.72 1.75) | 1.13 (0.72 1.76) | 1.13 (0.72 1.77) | 1.14 (0.72 1.78) | 1.14 (0.72 1.80) | 1.15 (0.71 1.85) |
Note:
1. MCMC = Markov chain Monte Carlo. For the MCMC methods, imputed values are rounded to 1 if they are greater than or equal to 0.5 and to 0 otherwise.
2. GEE = Generalized estimating equations
3. RE = Random-effects logistic regression
4. OR = Odds ratio
5. CI = Confidence interval
Figure 2. Treatment effect estimated from generalized estimating equations when 30% of the data are covariate dependent missing.
Figure 3. Treatment effect estimated from random-effects logistic regression when 30% of the data are covariate dependent missing.