Mulugeta Gebregziabher, Leonard Egede, Gregory E Gilbert, Kelly Hunt, Paul J Nietert, Patrick Mauldin.
Abstract
BACKGROUND: With the current focus on personalized medicine, patient/subject-level inference is often of key interest in translational research. As a result, random effects models (REM) are becoming popular for patient-level inference. However, for very large data sets, it can be difficult to fit REM with commonly available statistical software such as SAS, because the models require inordinate amounts of computer time and memory beyond what is available, which prevents model convergence. For example, in a retrospective cohort study of over 800,000 Veterans with type 2 diabetes with longitudinal data over 5 years, fitting REM via generalized linear mixed modeling with the standard procedures currently available in SAS (e.g. PROC GLIMMIX) was very difficult, and the same problems exist in Stata's gllamm or R's lme packages. Thus, this study proposes and assesses the performance of a meta-regression approach and compares it with methods based on sampling of the full data.
DATA: We use both simulated and real data from a national cohort of Veterans with type 2 diabetes (n=890,394), which was created by linking multiple patient and administrative files, resulting in a cohort with longitudinal data collected over 5 years.
METHODS AND
Year: 2012 PMID: 23095325 PMCID: PMC3542162 DOI: 10.1186/1471-2288-12-163
Source DB: PubMed Journal: BMC Med Res Methodol ISSN: 1471-2288 Impact factor: 4.615
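The meta-regression idea in the abstract can be illustrated with a small pooling sketch. It assumes the model is fit separately within subgroups of the data (for example, one fit per VISN) and that the per-group coefficients are then combined with a DerSimonian-Laird random-effects estimator. The abstract is truncated before the methods, so the estimator, the function name `pool_random_effects`, and the input values are all assumptions made for illustration, not the authors' implementation.

```python
import math


def pool_random_effects(estimates, std_errors):
    """Pool per-group estimates with the DerSimonian-Laird estimator.

    Returns (pooled_estimate, pooled_se, tau2), where tau2 is the
    estimated between-group variance.
    """
    k = len(estimates)
    if k < 2 or k != len(std_errors):
        raise ValueError("need at least two estimates with matching standard errors")

    variances = [se ** 2 for se in std_errors]
    weights = [1.0 / v for v in variances]  # inverse-variance (fixed-effect) weights
    sum_w = sum(weights)

    # Fixed-effect pooled estimate and Cochran's Q heterogeneity statistic.
    fixed = sum(w * b for w, b in zip(weights, estimates)) / sum_w
    q = sum(w * (b - fixed) ** 2 for w, b in zip(weights, estimates))

    # Method-of-moments estimate of between-group variance, truncated at zero.
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects weights add tau2 to each within-group variance.
    re_weights = [1.0 / (v + tau2) for v in variances]
    sum_re = sum(re_weights)
    pooled = sum(w * b for w, b in zip(re_weights, estimates)) / sum_re
    pooled_se = math.sqrt(1.0 / sum_re)
    return pooled, pooled_se, tau2


# Hypothetical per-group coefficients and standard errors (not from the paper).
betas = [0.42, 0.47, 0.51, 0.44]
ses = [0.03, 0.04, 0.05, 0.03]
pooled, pooled_se, tau2 = pool_random_effects(betas, ses)
z = 1.96  # approximate 97.5th percentile of the standard normal
print(f"pooled = {pooled:.3f}, 95% CI = ({pooled - z * pooled_se:.3f}, {pooled + z * pooled_se:.3f})")
```

Each group-level fit only needs that group's rows in memory, which is the practical appeal when a single fit on the full cohort does not converge.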
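Characteristics of study population for the full (n=890,394) and sampled cohorts
| Characteristic | Full cohort | | 25% sample | | 10% sample | | 5% sample | | 1% sample | | By VISN | |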
| Non-Hispanic White: % (n) | 62 | (547,645) | 61 | (138,470) | 62 | (55,489) | 62 | (27,853) | 62 | (5,529) | 62 | (547,645) |
| Non-Hispanic Black: % (n) | 12 | (107,935) | 12 | (27,317) | 12 | (10,941) | 12 | (5,406) | 12 | (1,097) | 12 | (107,935) |
| Hispanic: % (n) | 14 | (123,558) | 14 | (31,062) | 14 | (12,481) | 14 | (6,148) | 14 | (1,285) | 13 | (123,558) |
| Other: % (n) | 12 | (111,256) | 13 | (28,151) | 12 | (11,089) | 12 | (5,593) | 12 | (1,089) | 13 | (111,256) |
| Male: % (n) | 98 | (869,508) | 98 | (219,708) | 98 | (87,921) | 98 | (43,947) | 98 | (8,794) | 98 | (869,508) |
| Married: % (n) | 65 | (574,307) | 64 | (145,060) | 65 | (58,222) | 64 | (29,002) | 65 | (5,853) | 64 | (574,307) |
| Disability (mean % & sd) | 12 | (0.03) | 12 | (0.06) | 12 | (0.09) | 12 | (0.13) | 13 | (0.30) | 12 | (0.63) |
| Northeast | 12 | (103,056) | 12 | (25,994) | 11 | (10,274) | 12 | (5,272) | 12 | (1,074) | - | (103,056) |
| Mid-Atlantic | 23 | (201,058) | 22 | (50,579) | 23 | (20,328) | 23 | (10,230) | 23 | (2,000) | - | (201,058) |
| Midwest | 21 | (184,348) | 21 | (46,940) | 21 | (18,658) | 21 | (9,368) | 20 | (1,827) | - | (184,348) |
| South | 30 | (265,450) | 30 | (66,988) | 30 | (26,759) | 29 | (13,189) | 30 | (2,707) | - | (265,450) |
| West | 15 | (136,482) | 15 | (34,499) | 16 | (13,981) | 15 | (6,941) | 16 | (1,392) | - | (136,482) |
| Urban Residence | 62 | (548,786) | 61 | (138,339) | 61 | (55,324) | 62 | (27,701) | 61 | (5,513) | 61 | (548,786) |
| Rural Residence | 38 | (341,608) | 39 | (85,612) | 39 | (34,676) | 38 | (17,299) | 39 | (3,487) | 39 | (341,608) |
| Mean HbA1c (mean % & sd) | 7.4 | (0.002) | 7.4 | (0.003) | 7.4 | (0.005) | 7.4 | (0.007) | 7.4 | (0.016) | 7.5 | (0.030) |
| Mean HbA1c<8%: % (n) | 73 | (703,596) | 73 | (177,751) | 73 | (71,195) | 73 | (34,498) | 71 | (7,112) | 70 | (703,596) |
| No Comorbidities | 57 | (507,320) | 57 | (128,326) | 57 | (51,178) | 57 | (25,506) | 57 | (5,143) | 57 | (507,320) |
| 1 Comorbidity | 28 | (248,898) | 28 | (62,961) | 28 | (25,309) | 28 | (12,655) | 27 | (2,456) | 28 | (248,898) |
| 2 Comorbidities | 11 | (95,542) | 11 | (23,998) | 11 | (9,706) | 11 | (4,898) | 11 | (1,022) | 11 | (95,542) |
| 3+ Comorbidities | 4 | (38,634) | 4 | (9,715) | 4 | (3,807) | 4 | (1,941) | 4 | (379) | 4 | (38,634) |
-: Not applicable due to sampling by VISN or aggregation by VISN.
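The sampled cohorts in the table above can be thought of as draws of whole patients, so that each sampled patient keeps all of their longitudinal records. The following is a minimal sketch under that assumption; the source does not describe the sampling procedure in detail, and the function `sample_patients`, the field names, and the toy data are hypothetical.

```python
import random


def sample_patients(records, fraction, seed=0):
    """Return all rows belonging to a simple random sample of patients."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    # Sample at the patient level so repeated measures stay together.
    patient_ids = sorted({row["patient_id"] for row in records})
    n_keep = max(1, round(fraction * len(patient_ids)))
    kept = set(random.Random(seed).sample(patient_ids, n_keep))
    return [row for row in records if row["patient_id"] in kept]


# Toy data: 100 hypothetical patients with 5 yearly HbA1c rows each.
records = [
    {"patient_id": pid, "year": year, "hba1c": 7.0 + 0.01 * (pid % 10)}
    for pid in range(100)
    for year in range(2002, 2007)
]

for fraction in (0.25, 0.10):
    subset = sample_patients(records, fraction)
    n_patients = len({row["patient_id"] for row in subset})
    print(f"{fraction:.0%} sample: {n_patients} patients, {len(subset)} rows")
```

Sampling rows instead of patients would break up each patient's trajectory, which matters for models with patient-level random effects.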
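Parameter estimates, 95% confidence intervals, and standard errors for intercept, race, and comorbidity in the linear mixed model (LMM*) of HbA1c using sampling and random effects meta-regression, for Veterans with type 2 diabetes (2002-2006)
| Parameter | Sample (%) | Intercept | Non-Hispanic Black | Hispanic | Other | 1 Comorbidity | 2 Comorbidities | 3+ Comorbidities |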
| β (95% CI) | 100 | 7.54 (7.52, 7.55) | 0.46 (0.45, 0.46) | 0.29 (0.28, 0.30) | 0.25 (0.23, 0.25) | 0.01 (0.01, 0.02) | 0.04 (0.04, 0.05) | 0.11 (0.11, 0.13) |
| | 25 | 7.59 (7.55, 7.61) | 0.46 (0.44, 0.47) | 0.31 (0.28, 0.32) | 0.24 (0.22, 0.25) | 0.01 (0.00, 0.02) | 0.02 (0.01, 0.04) | 0.10 (0.08, 0.13) |
| | 10 | 7.54 (7.48, 7.58) | 0.47 (0.44, 0.48) | 0.30 (0.26, 0.32) | 0.26 (0.23, 0.27) | 0.03 (0.02, 0.05) | 0.08 (0.07, 0.12) | 0.08 (0.05, 0.13) |
| | 5 | 7.54 (7.48, 7.62) | 0.44 (0.41, 0.47) | 0.28 (0.23, 0.32) | 0.27 (0.24, 0.30) | 0.03 (0.01, 0.06) | 0.05 (0.02, 0.09) | 0.13 (0.08, 0.18) |
| SE | 100 | 0.0115 | 0.005 | 0.007 | 0.005 | 0.0037 | 0.0054 | 0.0079 |
| | 25 | 0.0115 | 0.005 | 0.007 | 0.005 | 0.0037 | 0.0054 | 0.0080 |
| | 10 | 0.0115 | 0.005 | 0.007 | 0.005 | 0.0037 | 0.0054 | 0.0080 |
| | 5 | 0.0116 | 0.005 | 0.0069 | 0.005 | 0.0037 | 0.0053 | 0.0079 |
| Parameter | Sample (%) | Intercept | Non-Hispanic Black | Hispanic | Other | 1 Comorbidity | 2 Comorbidities | 3+ Comorbidities |
| β (95% CI) | 25 | 7.61 (7.57, 7.63) | 0.47 (0.45, 0.48) | 0.28 (0.26, 0.29) | 0.26 (0.24, 0.27) | 0.01 (0, 0.02) | 0.03 (0.02, 0.05) | 0.11 (0.11, 0.15) |
| | 10 | 7.58 (7.53, 7.63) | 0.46 (0.43, 0.48) | 0.28 (0.25, 0.30) | 0.26 (0.23, 0.28) | 0.0 (-0.01, 0.02) | 0.05 (0.03, 0.08) | 0.16 (0.13, 0.2) |
| | 5 | 7.61 (7.54, 7.68) | 0.38 (0.35, 0.41) | 0.30 (0.26, 0.35) | 0.25 (0.21, 0.28) | 0.02 (0.0, 0.05) | 0.05 (0.02, 0.09) | 0.09 (0.05, 0.15) |
| SE | 25 | 0.0111 | 0.0049 | 0.0068 | 0.005 | 0.0037 | 0.0054 | 0.0079 |
| | 10 | 0.0111 | 0.0049 | 0.0068 | 0.0049 | 0.0037 | 0.0054 | 0.0078 |
| | 5 | 0.0111 | 0.0050 | 0.0069 | 0.005 | 0.0037 | 0.0053 | 0.0079 |
| Parameter** | Sample (%) | Intercept | Non-Hispanic Black | Hispanic | Other | 1 Comorbidity | 2 Comorbidities | 3+ Comorbidities |
| β (95% CI) | 100 | 7.58 (7.54, 7.62) | 0.45 (0.41, 0.49) | 0.08 (0.04, 0.12) | 0.23 (0.19, 0.27) | 0.01 (-0.04, 0.05) | 0.03 (-0.01, 0.07) | 0.09 (0.05, 0.13) |
*-Independent variables used in fitting the linear mixed model were: linear time; race (non-Hispanic white reference, indicator variables); sex (female reference); marital status (single reference); service disability percentage; residence status (urban/rural, rural reference); VISN region (Northeast, Mid-Atlantic, South, Midwest, and West); and number of comorbidities (1, 2, or 3+; none reference).
** Veterans Integrated Service Networks (VISNs) 13 and 14 are excluded in all these models.
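The final row of the table above (Parameter**, sample 100) reports intervals but no standard errors. If those intervals are symmetric Wald intervals, which is an assumption and not something the table states, the implied standard error can be recovered from the interval width. The helper name `se_from_ci` is hypothetical.

```python
from statistics import NormalDist


def se_from_ci(lower, upper, level=0.95):
    """Standard error implied by a symmetric Wald confidence interval."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    return (upper - lower) / (2 * z)


# Non-Hispanic Black coefficient in the last row of the table: 0.45 (0.41, 0.49).
se = se_from_ci(0.41, 0.49)
print(f"implied SE = {se:.4f}")  # prints: implied SE = 0.0204
```

Because the published bounds are rounded to two decimals, the result is only approximate.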
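Parameter estimates (95% CI) and standard errors for intercept, race, and comorbidity in the generalized linear mixed model (GLMM†) for binary HbA1c using sampling and random effects meta-regression, for Veterans with type 2 diabetes (2002-2006)
| Parameter | Sample (%) | Intercept | Non-Hispanic Black | Hispanic | Other | 1 Comorbidity | 2 Comorbidities | 3+ Comorbidities |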
| β (95% CI) | 100 | -0.94 (-0.98, -0.91) | 0.62 (-0.02, -0.01) | 0.45 (0.43, 0.48) | 0.36 (0.35, 0.38) | 0.07 (0.06, 0.08) | 0.15 (0.13, 0.17) | 0.27 (0.24, 0.29) |
| | 25 | -1.93 (-2.17, -1.69) | 1.27 (1.17, 1.37) | 1.07 (0.93, 1.20) | 0.87 (0.77, 0.97) | 0.12 (0.05, 0.21) | 0.19 (0.06, 0.31) | 0.39 (0.19, 0.58) |
| | 10 | -2.48 (-2.93, -2.04) | 1.39 (1.20, 1.58) | 1.27 (1.01, 1.53) | 0.97 (0.78, 1.15) | 0.16 (0.01, 0.31) | 0.36 (0.14, 0.59) | 0.44 (0.07, 0.80) |
| | 5 | -2.16 (-2.87, -1.45) | 1.69 (1.40, 1.99) | 1.10 (0.70, 1.50) | 1.05 (0.75, 1.35) | 0.39 (0.16, 0.62) | 0.34 (0.09, 0.80) | 0.33 (-0.24, 0.90) |
| SE | 100 | 0.0182 | 0.0076 | 0.0105 | 0.0078 | 0.0053 | 0.0085 | 0.0125 |
| | 25 | 0.1221 | 0.0514 | 0.0691 | 0.0512 | 0.0402 | 0.0621 | 0.3896 |
| | 10 | 0.2269 | 0.0965 | 0.1308 | 0.0959 | 0.0748 | 0.1163 | 0.1858 |
| | 5 | 0.3608 | 0.1505 | 0.2038 | 0.1520 | 0.1175 | 0.1821 | 0.2905 |
| Parameter | Sample (%) | Intercept | Non-Hispanic Black | Hispanic | Other | 1 Comorbidity | 2 Comorbidities | 3+ Comorbidities |
| β (95% CI) | 25 | -1.83 (-2.07, -1.59) | 1.25 (-0.01, 0.01) | 0.90 (0.76, 1.03) | 0.88 (0.78, 0.98) | 0.10 (0.02, 0.18) | 0.30 (0.17, 0.42) | 0.75 (0.56, 0.94) |
| | 10 | -2.34 (-2.78, -1.89) | 1.47 (1.29, 1.66) | 1.03 (0.78, 1.29) | 1.00 (0.81, 1.19) | 0.07 (-0.08, 0.21) | 0.24 (0.02, 0.47) | 0.69 (0.34, 1.05) |
| | 5 | -2.65 (-3.35, -1.95) | 1.44 (1.14, 1.74) | 1.74 (1.34, 2.15) | 1.10 (0.81, 1.40) | 0.20 (-0.03, 0.43) | 0.16 (-0.19, 0.52) | 0.80 (0.22, 1.39) |
| SE | 25 | 0.1209 | 0.0514 | 0.0690 | 0.0512 | 0.0401 | 0.0620 | 0.0984 |
| | 10 | 0.2272 | 0.0955 | 0.1282 | 0.0949 | 0.0745 | 0.1154 | 0.1813 |
| | 5 | 0.3561 | 0.1531 | 0.2060 | 0.1505 | 0.1175 | 0.1817 | 0.2984 |
| Parameter** | Sample (%) | Intercept | Non-Hispanic Black | Hispanic | Other | 1 Comorbidity | 2 Comorbidities | 3+ Comorbidities |
| β (95% CI) | 100 | -0.93 (-0.99, -0.87) | 0.58 (0.52, 0.64) | 0.11 (0.05, 0.17) | 0.32 (0.26, 0.38) | 0.07 (0.01, 0.13) | 0.14 (0.08, 0.20) | 0.25 (0.19, 0.31) |
†-Independent variables used in fitting the generalized linear mixed model with a binomial distribution and a logit link function were: linear time; race (non-Hispanic white reference, indicator variables); sex (female reference); marital status (single reference); service disability percentage; residence status (urban/rural, rural reference); and number of comorbidities (1, 2, or 3+; none reference).
** Veterans Integrated Service Networks (VISNs) 13 and 14 are excluded in all these models.
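Because the GLMM uses a logit link, the coefficients in the table above are on the log-odds scale, and exponentiating a coefficient and its interval bounds gives an odds ratio with its interval. The short sketch below applies this to one row of the table. The table does not say which level of the binary HbA1c outcome is modeled, so the direction of the effect is left as coded by the authors, and the helper name `to_odds_ratio` is hypothetical.

```python
import math


def to_odds_ratio(beta, lower, upper):
    """Convert a logit-scale coefficient and its CI bounds to the odds-ratio scale."""
    return math.exp(beta), math.exp(lower), math.exp(upper)


# Non-Hispanic Black coefficient in the last row of the table: 0.58 (0.52, 0.64).
odds_ratio, or_lower, or_upper = to_odds_ratio(0.58, 0.52, 0.64)
print(f"OR = {odds_ratio:.2f} (95% CI {or_lower:.2f}, {or_upper:.2f})")
# prints: OR = 1.79 (95% CI 1.68, 1.90)
```

In a model with random effects, this is a subject-specific (conditional) odds ratio, not a population-averaged one.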
Figure 1. LMM parameter estimates and pooled 95% confidence bounds for random effects meta-regression (intercept, race) without Veterans Integrated Service Networks (VISNs) 13 and 14. *- Independent variables used in fitting the model were: linear time; race (non-Hispanic white reference, indicator variables); sex (female reference); service disability percentage; marital status (single reference); residence status (urban/rural, rural reference); VISN region (Northeast, Mid-Atlantic, South, Midwest, and West; South reference); and number of comorbidities (1, 2, or 3+; none reference).
Figure 2. GLMM parameter estimates and pooled 95% confidence bounds for random effects meta-regression (intercept, race) without Veterans Integrated Service Networks (VISNs) 13 and 14. *- Independent variables used in fitting the model were: linear time; race (non-Hispanic white reference, indicator variables); sex (female reference); service disability percentage; marital status (single reference); residence status (urban/rural, rural reference); VISN region (Northeast, Mid-Atlantic, South, Midwest, and West; South reference); and number of comorbidities (1, 2, or 3+; none reference).
Figure 3. Akaike information criterion (AIC) and Bayesian information criterion (BIC) for LMM (top two) and GLMM (bottom two). *- Independent variables used in fitting the model were: linear time; race (non-Hispanic white reference, indicator variables); sex (female reference); service disability percentage; marital status (single reference); residence status (urban/rural, rural reference); VISN region (Northeast, Mid-Atlantic, South, Midwest, and West; South reference); and number of comorbidities (1, 2, or 3+; none reference).
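For reference, the two criteria compared in Figure 3 have simple closed forms, sketched below with their standard textbook definitions. How the number of parameters and the number of observations are counted for mixed models varies between software packages, and the source does not say which convention was used, so the inputs here are hypothetical.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln n - 2 ln L (lower is better)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical fit: log-likelihood of -1234.5 with 12 parameters on 9,000 observations.
print(f"AIC = {aic(-1234.5, 12):.1f}")
print(f"BIC = {bic(-1234.5, 12, 9000):.1f}")
```

Both criteria depend on the log-likelihood of the data actually fitted, so values from fits on different samples or subsets are not directly comparable.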