Jinhui Ma, Parminder Raina, Joseph Beyene, Lehana Thabane.
Abstract
BACKGROUND: This simulation study compares the accuracy and efficiency of population-averaged models (generalized estimating equations, GEE) and cluster-specific models (random-effects logistic regression, RELR) for analyzing data from cluster randomized trials (CRTs) with missing binary responses.
Year: 2013 PMID: 23343209 PMCID: PMC3560270 DOI: 10.1186/1471-2288-13-9
Source DB: PubMed Journal: BMC Med Res Methodol ISSN: 1471-2288 Impact factor: 4.615
Figure 1. Schematic overview of the simulation study. Abbreviations: MI, multiple imputation; GEE, generalized estimating equations; RELR, random-effects logistic regression; SB, standardized bias; ESE, empirical standard error; RMSE, root mean square error; obs., observations.
Comparison of empirical standard error
| m | n | % missing | GEE, no MI | RELR, no MI | GEE, standard MI | RELR, standard MI | GEE, within-cluster MI | RELR, within-cluster MI |
|---|---|---|---|---|---|---|---|---|
| 5 | 500 | 0% | 0.07 | 0.10 | | | | |
| | | 15% | 0.08 | 0.11 | 0.08 | 0.07 | 0.08 | 0.08 |
| | | 30% | 0.08 | 0.12 | 0.08 | 0.08 | 0.10 | 0.09 |
| | | 0% | 0.15 | 0.12 | | | | |
| | | 15% | 0.15 | 0.13 | 0.13 | 0.12 | 0.16 | 0.14 |
| | | 30% | 0.15 | 0.15 | 0.12 | 0.11 | 0.16 | 0.15 |
| | | 0% | 0.30 | 0.15 | | | | |
| | | 15% | 0.30 | 0.16 | 0.26 | 0.24 | 0.30 | 0.28 |
| | | 30% | 0.30 | 0.16 | 0.22 | 0.20 | 0.30 | 0.29 |
| 20 (L-design) | 50 | 0% | 0.11 | 0.17 | | | | |
| | | 15% | 0.11 | 0.17 | 0.12 | 0.12 | 0.13 | 0.13 |
| | | 30% | 0.12 | 0.19 | 0.12 | 0.13 | 0.15 | 0.16 |
| | | 0% | 0.17 | 0.31 | | | | |
| | | 15% | 0.17 | 0.34 | 0.16 | 0.16 | 0.18 | 0.19 |
| | | 30% | 0.18 | 0.39 | 0.15 | 0.16 | 0.20 | 0.21 |
| | | 0% | 0.22 | 0.18 | | | | |
| | | 15% | 0.23 | 0.21 | 0.20 | 0.22 | 0.23 | 0.26 |
| | | 30% | 0.23 | 0.22 | 0.18 | 0.19 | NA | NA |
| 30 (L-design) | 30 | 0% | 0.15 | 0.28 | | | | |
| | | 15% | 0.16 | 0.33 | 0.15 | 0.15 | 0.17 | 0.18 |
| | | 30% | 0.17 | 0.37 | 0.15 | 0.15 | NA | NA |
| | | 0% | 0.19 | 0.33 | | | | |
| | | 15% | 0.20 | 0.38 | 0.18 | 0.19 | NA | NA |
| | | 30% | 0.20 | 0.42 | 0.17 | 0.18 | NA | NA |
| | | 0% | 0.26 | 0.38 | | | | |
| | | 15% | 0.26 | 0.40 | 0.23 | 0.27 | NA | NA |
| | | 30% | 0.26 | 0.44 | 0.21 | 0.23 | NA | NA |
Empirical standard error is defined as the average of the standard errors of the estimated treatment effects across all simulation replications. The empirical standard errors obtained with 0% missing data serve as the reference for those obtained with 15% or 30% missing data.
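As a minimal sketch of the definition above, the empirical standard error can be computed by averaging the standard errors reported in each replication. The function name and the values in `se_hat` are hypothetical and chosen only for illustration; they are not output from the study.

```python
def empirical_standard_error(std_errors):
    """Average the estimated standard errors across simulation replications."""
    if not std_errors:
        raise ValueError("at least one replication is required")
    return sum(std_errors) / len(std_errors)

# Hypothetical standard errors of the treatment effect from four replications.
se_hat = [0.10, 0.12, 0.08, 0.10]
ese = empirical_standard_error(se_hat)
print(f"empirical standard error: {ese:.2f}")
```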
Note:
1. m: Number of clusters per trial arm.
2. n: Number of subjects per cluster.
3. ρ: Intracluster correlation coefficient.
4. VIF: Variance inflation factor, i.e. 1 + (n − 1)ρ.
5. Standard MI: Standard multiple imputation using the logistic regression method.
6. Within-cluster MI: Within-cluster multiple imputation using the logistic regression method, which is not applicable (NA) for some L-designs of cluster randomized trials.
7. GEE: Generalized estimating equations.
8. RELR: Random-effects logistic regression.
9. For CRTs with 5 clusters per arm, modified standard errors are provided.
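The variance inflation factor in the notes depends on cluster size and the intracluster correlation, so with n subjects per cluster it takes the usual design-effect form 1 + (n − 1)ρ. The sketch below evaluates it for the three cluster sizes in the table; the value ρ = 0.01 is a hypothetical choice for illustration, not a value taken from the study.

```python
def variance_inflation_factor(cluster_size, icc):
    """Design effect 1 + (cluster_size - 1) * icc for equal-sized clusters."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie in [0, 1]")
    return 1.0 + (cluster_size - 1) * icc

# Cluster sizes n from the three designs; icc = 0.01 is an assumed value.
for n in (500, 50, 30):
    print(n, round(variance_inflation_factor(n, icc=0.01), 2))
```

Under this assumed ρ, the design with 500 subjects per cluster has by far the largest inflation, which is why a small number of large clusters carries less information than the raw sample size suggests.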
Figure 2. Comparison of empirical standard error.
Comparison of standardized bias
| m | n | % missing | GEE, no MI | RELR, no MI | GEE, standard MI | RELR, standard MI | GEE, within-cluster MI | RELR, within-cluster MI |
|---|---|---|---|---|---|---|---|---|
| 5 | 500 | 0% | 0.02 | 0.73 | | | | |
| | | 15% | 0.03 | 0.71 | 0.02 | 0.17 | 0.03 | 0.15 |
| | | 30% | 0.01 | 0.63 | 0.00 | 0.18 | 0.00 | 0.08 |
| | | 0% | 0.01 | 0.34 | | | | |
| | | 15% | 0.00 | 0.33 | 0.00 | 0.02 | 0.00 | 0.03 |
| | | 30% | 0.00 | 0.32 | 0.00 | 0.01 | 0.00 | 0.03 |
| | | 0% | 0.02 | 0.15 | | | | |
| | | 15% | 0.02 | 0.15 | 0.02 | 0.08 | 0.03 | 0.10 |
| | | 30% | 0.02 | 0.14 | 0.01 | 0.05 | 0.02 | 0.09 |
| 20 (L-design) | 50 | 0% | 0.04 | 0.38 | | | | |
| | | 15% | 0.04 | 0.37 | 0.03 | 0.04 | 0.06 | 0.01 |
| | | 30% | 0.04 | 0.36 | 0.03 | 0.05 | 0.08 | 0.01 |
| | | 0% | 0.01 | 0.26 | | | | |
| | | 15% | 0.00 | 0.24 | 0.00 | 0.09 | 0.03 | 0.11 |
| | | 30% | 0.02 | 0.13 | 0.01 | 0.06 | 0.03 | 0.12 |
| | | 0% | 0.02 | 0.20 | | | | |
| | | 15% | 0.01 | 0.19 | 0.01 | 0.15 | 0.05 | 0.16 |
| | | 30% | 0.01 | 0.19 | 0.01 | 0.10 | NA | NA |
| 30 (L-design) | 30 | 0% | 0.02 | 0.33 | | | | |
| | | 15% | 0.02 | 0.32 | 0.02 | 0.12 | 0.02 | 0.15 |
| | | 30% | 0.01 | 0.14 | 0.00 | 0.06 | NA | NA |
| | | 0% | 0.01 | 0.23 | | | | |
| | | 15% | 0.01 | 0.23 | 0.01 | 0.18 | NA | NA |
| | | 30% | 0.02 | 0.23 | 0.02 | 0.13 | NA | NA |
| | | 0% | 0.01 | 0.16 | | | | |
| | | 15% | 0.00 | 0.15 | 0.00 | 0.14 | NA | NA |
| | | 30% | 0.01 | 0.15 | 0.00 | 0.16 | NA | NA |
Standardized bias is defined as the difference between the expectation of the estimator and the true parameter, divided by the standard deviation of the estimator. The standardized biases obtained with 0% missing data serve as the reference for those obtained with 15% or 30% missing data.
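The definition above can be sketched in a few lines, with the expectation and standard deviation of the estimator approximated by the mean and sample standard deviation of the estimates across replications. The function name, the use of the k − 1 denominator, and the example values are assumptions made for illustration only.

```python
import math

def standardized_bias(estimates, true_value):
    """(mean of estimates - true value) / sample SD of the estimates."""
    k = len(estimates)
    if k < 2:
        raise ValueError("at least two replications are required")
    mean_est = sum(estimates) / k
    # Sample variance with a k - 1 denominator (an assumption of this sketch).
    variance = sum((b - mean_est) ** 2 for b in estimates) / (k - 1)
    return (mean_est - true_value) / math.sqrt(variance)

# Hypothetical treatment-effect estimates from five replications.
estimates = [0.9, 1.1, 1.0, 1.2, 1.3]
sb = standardized_bias(estimates, true_value=1.0)
print(f"standardized bias: {sb:.3f}")
```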
Figure 3. Comparison of standardized bias.
Comparison of root mean squared error
| m | n | % missing | GEE, no MI | RELR, no MI | GEE, standard MI | RELR, standard MI | GEE, within-cluster MI | RELR, within-cluster MI |
|---|---|---|---|---|---|---|---|---|
| 5 | 500 | 0% | 0.07 | 0.10 | | | | |
| | | 15% | 0.08 | 0.10 | 0.08 | 0.06 | 0.08 | 0.06 |
| | | 30% | 0.08 | 0.11 | 0.08 | 0.07 | 0.09 | 0.08 |
| | | 0% | 0.14 | 0.17 | | | | |
| | | 15% | 0.14 | 0.17 | 0.15 | 0.15 | 0.15 | 0.15 |
| | | 30% | 0.15 | 0.17 | 0.15 | 0.15 | 0.15 | 0.15 |
| | | 0% | 0.31 | 0.34 | | | | |
| | | 15% | 0.31 | 0.34 | 0.31 | 0.32 | 0.31 | 0.33 |
| | | 30% | 0.31 | 0.34 | 0.31 | 0.32 | 0.31 | 0.33 |
| 20 (L-design) | 50 | 0% | 0.11 | 0.13 | | | | |
| | | 15% | 0.11 | 0.13 | 0.12 | 0.12 | 0.12 | 0.12 |
| | | 30% | 0.12 | 0.14 | 0.14 | 0.12 | 0.13 | 0.13 |
| | | 0% | 0.18 | 0.20 | | | | |
| | | 15% | 0.18 | 0.21 | 0.18 | 0.19 | 0.18 | 0.19 |
| | | 30% | 0.19 | 0.20 | 0.19 | 0.20 | 0.19 | 0.20 |
| | | 0% | 0.24 | 0.26 | | | | |
| | | 15% | 0.24 | 0.27 | 0.24 | 0.26 | 0.24 | 0.27 |
| | | 30% | 0.25 | 0.27 | 0.25 | 0.26 | NA | NA |
| 30 (L-design) | 30 | 0% | 0.15 | 0.17 | | | | |
| | | 15% | 0.16 | 0.18 | 0.16 | 0.16 | 0.15 | 0.17 |
| | | 30% | 0.16 | 0.17 | 0.16 | 0.17 | NA | NA |
| | | 0% | 0.20 | 0.21 | | | | |
| | | 15% | 0.20 | 0.22 | 0.20 | 0.22 | NA | NA |
| | | 30% | 0.20 | 0.23 | 0.21 | 0.22 | NA | NA |
| | | 0% | 0.27 | 0.30 | | | | |
| | | 15% | 0.27 | 0.30 | 0.28 | 0.33 | NA | NA |
| | | 30% | 0.28 | 0.30 | 0.28 | 0.31 | NA | NA |
Root mean squared error is defined as the square root of the mean squared error, which is the average squared difference between the estimated treatment effect and the true parameter. The root mean squared errors obtained with 0% missing data serve as the reference for those obtained with 15% or 30% missing data.
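A minimal sketch of this definition follows: square each replication's error against the true parameter, average, and take the square root. The function name and the example estimates are hypothetical and serve only to illustrate the arithmetic.

```python
import math

def root_mean_squared_error(estimates, true_value):
    """Square root of the average squared error across replications."""
    if not estimates:
        raise ValueError("at least one replication is required")
    mse = sum((b - true_value) ** 2 for b in estimates) / len(estimates)
    return math.sqrt(mse)

# Hypothetical treatment-effect estimates from five replications.
estimates = [0.9, 1.1, 1.0, 1.2, 1.3]
rmse = root_mean_squared_error(estimates, true_value=1.0)
print(f"RMSE: {rmse:.3f}")
```

Because the mean squared error equals the squared bias plus the variance of the estimates (computed with a denominator equal to the number of replications), this single number reflects both accuracy and precision.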
Figure 4. Comparison of root mean squared error.
Comparison of coverage probability
| m | n | % missing | GEE, no MI | RELR, no MI | GEE, standard MI | RELR, standard MI | GEE, within-cluster MI | RELR, within-cluster MI |
|---|---|---|---|---|---|---|---|---|
| 5 | 500 | 0% | 0.91 | 0.96 | | | | |
| | | 15% | 0.92 | 0.97 | 0.93 | 0.97 | 1.00 | 0.99 |
| | | 30% | 0.93 | 0.97 | 0.95 | 0.98 | 1.00 | 0.99 |
| | | 0% | 0.92 | 0.79 | | | | |
| | | 15% | 0.92 | 0.81 | 0.90 | 0.87 | 0.95 | 0.91 |
| | | 30% | 0.94 | 0.84 | 0.88 | 0.84 | 0.98 | 0.93 |
| | | 0% | 0.91 | 0.49 | | | | |
| | | 15% | 0.91 | 0.52 | 0.89 | 0.83 | 0.93 | 0.89 |
| | | 30% | 0.93 | 0.52 | 0.83 | 0.77 | 0.96 | 0.90 |
| 20 (L-design) | 50 | 0% | 0.94 | 0.98 | | | | |
| | | 15% | 0.94 | 0.98 | 0.93 | 0.95 | 0.96 | 0.97 |
| | | 30% | 0.94 | 0.98 | 0.92 | 0.96 | 0.98 | 0.98 |
| | | 0% | 0.93 | 0.91 | | | | |
| | | 15% | 0.93 | 0.92 | 0.90 | 0.89 | 0.94 | 0.94 |
| | | 30% | 0.93 | 0.93 | 0.87 | 0.88 | 0.95 | 0.96 |
| | | 0% | 0.93 | 0.78 | | | | |
| | | 15% | 0.93 | 0.82 | 0.89 | 0.88 | 0.93 | 0.93 |
| | | 30% | 0.92 | 0.83 | 0.85 | 0.85 | NA | NA |
| 30 (L-design) | 30 | 0% | 0.95 | 0.95 | | | | |
| | | 15% | 0.96 | 0.96 | 0.93 | 0.93 | 0.97 | 0.96 |
| | | 30% | 0.95 | 0.96 | 0.91 | 0.92 | NA | NA |
| | | 0% | 0.95 | 0.91 | | | | |
| | | 15% | 0.95 | 0.93 | 0.92 | 0.92 | NA | NA |
| | | 30% | 0.95 | 0.94 | 0.89 | 0.90 | NA | NA |
| | | 0% | 0.94 | 0.79 | | | | |
| | | 15% | 0.94 | 0.81 | 0.90 | 0.89 | NA | NA |
| | | 30% | 0.94 | 0.85 | 0.85 | 0.85 | NA | NA |
Coverage probability is defined as the proportion of simulation replications in which the nominal 95% confidence interval contains the true treatment effect. The coverage probabilities obtained with 0% missing data serve as the reference for those obtained with 15% or 30% missing data.
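The coverage calculation can be sketched as counting how often each replication's interval contains the true effect. This sketch assumes a Wald-type interval (estimate ± 1.96 × standard error); the study may have constructed its intervals differently, and the function name and example values are hypothetical.

```python
def coverage_probability(estimates, std_errors, true_value, z=1.96):
    """Share of replications whose interval estimate +/- z * SE covers the truth."""
    if len(estimates) != len(std_errors) or not estimates:
        raise ValueError("need equally long, non-empty inputs")
    covered = 0
    for est, se in zip(estimates, std_errors):
        lower, upper = est - z * se, est + z * se
        if lower <= true_value <= upper:
            covered += 1
    return covered / len(estimates)

# Hypothetical estimates and standard errors from four replications.
estimates = [0.9, 1.1, 1.0, 1.5]
std_errors = [0.1, 0.1, 0.1, 0.2]
coverage = coverage_probability(estimates, std_errors, true_value=1.0)
print(f"coverage: {coverage:.2f}")
```

Coverage well below the nominal 0.95 indicates intervals that are too narrow or centered on a biased estimate, while coverage near 1.00 suggests overly wide intervals.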
Figure 5. Comparison of coverage probability.