Nick Birk, Mika Matsuzaki, Teresa T Fung, Yanping Li, Carolina Batis, Meir J Stampfer, Megan Deitchler, Walter C Willett, Wafaie W Fawzi, Sabri Bromage, Sanjay Kinra, Shilpa N Bhupathiraju, Erin Lake.
Abstract
BACKGROUND: The prevalence of type 2 diabetes has increased substantially in India over the past 3 decades. Undiagnosed diabetes presents a public health challenge, especially in rural areas, where access to laboratory testing for diagnosis may not be readily available.
Keywords: GDQS; GLMM; LASSO; cluster-correlation; diabetes; machine learning; mixed model; prediabetes; random forest; survey
Year: 2021; PMID: 34689190; PMCID: PMC8542097; DOI: 10.1093/jn/nxab281
Source DB: PubMed; Journal: J Nutr; ISSN: 0022-3166; Impact factor: 4.687
FIGURE 1. Flowchart of study sample size.
Selected characteristics of participants at wave 3 in the Andhra Pradesh Child and Parent Study[1]
| Characteristics | No prediabetes[2] | Prediabetes or diabetes[3] | Overall |
|---|---|---|---|
| Participants | 4440 (79) | 1215 (21) | 5655 (100) |
| Age, y | 34.6 ± 13 | 41.1 ± 14 | 35.6 ± 14 |
| Women | 2099 (47.3) | 534 (44.0) | 2633 (46.6) |
| Ever use of tobacco[4] | 1060 (23.9) | 378 (31.1) | 1438 (25.4) |
| Alcoholic beverage consumption, g/d | 240 ± 701 | 344 ± 956 | 262 ± 764 |
| Unable to walk[5] | 254 (5.7) | 150 (12.3) | 404 (7.1) |
| Use of rations card | 3117 (70.2) | 687 (56.5) | 3804 (67.3) |
| Time spent in sedentary activities, h/d | 5.51 ± 3.4 | 5.79 ± 3.6 | 5.57 ± 3.4 |
| Global Diet Quality Score | 19.1 ± 3.6 | 18.9 ± 3.7 | 19.0 ± 3.6 |
[1] n = 5655. Values are means ± SDs or n (%).
[2] Absence of prediabetes was defined as a fasting blood glucose concentration <100 mg/dL.
[3] Prediabetes and/or diabetes includes individuals with a fasting blood glucose concentration ≥100 mg/dL.
[4] Tobacco use is defined as having reported ever smoking, chewing, or snuffing tobacco products.
[5] Unable to walk responses exclude reasons related to shortness of breath.
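The fasting glucose cutoff in the footnotes above can be written as a small labeling rule. The following is a minimal sketch, not the study's code: the names `PREDIABETES_CUTOFF_MG_DL`, `has_prediabetes_or_diabetes`, and `prevalence` are hypothetical and introduced only for illustration.

```python
# Cutoff taken from the table footnotes: fasting blood glucose >= 100 mg/dL
# counts as prediabetes or diabetes; < 100 mg/dL counts as no prediabetes.
PREDIABETES_CUTOFF_MG_DL = 100.0  # hypothetical constant name


def has_prediabetes_or_diabetes(fasting_glucose_mg_dl: float) -> bool:
    """Return True when fasting glucose meets the >= 100 mg/dL definition."""
    return fasting_glucose_mg_dl >= PREDIABETES_CUTOFF_MG_DL


def prevalence(glucose_values) -> float:
    """Fraction of a non-empty list of glucose readings at or above the cutoff."""
    labels = [has_prediabetes_or_diabetes(g) for g in glucose_values]
    return sum(labels) / len(labels)
```

Applied to the counts in the table, the same rule yields the reported split: 1215 of 5655 participants at or above the cutoff is about 21%.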
Performance metrics of select models for predicting prediabetes[1]
| Algorithm | Predictors | Train AUC (95% CI) | AUC (95% CI) | Sensitivity | Specificity |
|---|---|---|---|---|---|
| Random guessing | NA | 0.531 (0.512, 0.551) | 0.556 (0.473, 0.639) | 0.811 | 0.246 |
| GLM | Age | 0.635 (0.616, 0.653) | 0.702 (0.629, 0.774) | 0.774 | 0.570 |
| GLM | GDQS | 0.515 (0.495, 0.534) | 0.511 (0.423, 0.598) | 0.547 | 0.488 |
| GLM | Age and GDQS | 0.636 (0.617, 0.654) | 0.709 (0.640, 0.779) | 0.774 | 0.575 |
| GLM | Age, GDQS food groups, hours sedentary, alcoholic beverage consumption, unable to walk, use of rations card, sex, tobacco use | 0.654 (0.635, 0.672) | 0.716 (0.645, 0.787) | 0.755 | 0.600 |
| GLMM | Age and GDQS, family random intercept | 0.878 (0.867, 0.889) | 0.710 (0.640, 0.779) | 0.755 | 0.599 |
| GLMM | Age and GDQS food groups, family random intercept | 0.873 (0.861, 0.884) | 0.711 (0.642, 0.781) | 0.793 | 0.594 |
| GLMM | Age, GDQS food groups, hours sedentary, alcoholic beverage consumption, unable to walk, use of rations card, sex, tobacco use, family random intercept | 0.872 (0.861, 0.883) | 0.722 (0.652, 0.792) | 0.717 | 0.662 |
| LASSO | Age, GDQS food groups, hours sedentary, alcoholic beverage consumption, unable to walk, use of rations card, sex, tobacco use | 0.644 (0.625, 0.663) | 0.705 (0.633, 0.776) | 0.774 | 0.580 |
| Elastic net (α = 1) | Age, GDQS food groups, hours sedentary, alcoholic beverage consumption, unable to walk, use of rations card, sex, tobacco use | 0.641 (0.626, 0.659) | 0.700 (0.627, 0.772) | 0.774 | 0.570 |
| Random forest | Age, GDQS food groups, hours sedentary, alcoholic beverage consumption, unable to walk, use of rations card, sex, tobacco use | 1.000 (1.000, 1.000) | 0.705 (0.633, 0.776) | 0.774 | 0.517 |
AUC, area under the receiver operating characteristic curve; GDQS, Global Diet Quality Score; GLM, generalized linear model; GLMM, generalized linear mixed model; LASSO, least absolute shrinkage and selection operator; NA, not applicable.
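The AUC, sensitivity, and specificity columns above are standard classification metrics. The sketch below shows one way to compute them from predicted probabilities, assuming binary labels (1 = prediabetes or diabetes) and a fixed probability threshold. The function names, the toy data, and the threshold of 0.5 are illustrative assumptions; the record does not state how the study chose its operating thresholds or computed its confidence intervals.

```python
def auc(scores, labels):
    """AUC as the probability that a randomly chosen positive case receives
    a higher score than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(scores, labels, threshold):
    """Classify as positive when score >= threshold, then return
    (sensitivity, specificity) = (TP / (TP + FN), TN / (TN + FP))."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Toy data, invented purely for illustration (not study data).
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.4, 0.2]
toy_labels = [1, 1, 1, 0, 0, 0]

toy_auc = auc(toy_scores, toy_labels)
toy_sens, toy_spec = sensitivity_specificity(toy_scores, toy_labels, 0.5)
```

The pairwise definition makes the random forest row easier to read: a train AUC of 1.000 means every positive training case outscored every negative one, while the second AUC column for the same model is 0.705.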
FIGURE 2. Receiver operating characteristic curves for the random guessing model (A), GLM with Global Diet Quality Score alone (B), GLM with all covariates (C), least absolute shrinkage and selection operator with all covariates (D), random forest with all covariates (E), and generalized linear mixed model with all covariates (F). GLM, generalized linear model.