Lei Zhang, Xianwen Shang, Subhashaan Sreedharan, Xixi Yan, Jianbin Liu, Stuart Keel, Jinrong Wu, Wei Peng, Mingguang He.
Abstract
BACKGROUND: Conventional models for the prediction of diabetes could be updated by incorporating the increasing amount of available health data and new risk prediction methodology.
Keywords: cohort study; diabetes; machine learning; risk prediction
Year: 2020 PMID: 32720912 PMCID: PMC7420582 DOI: 10.2196/16850
Source DB: PubMed Journal: JMIR Med Inform
Figure 1. Cumulative incidence of diabetes, stratified by age groups in men and women, and stratified by BMI groups in men and women.
Comparison of model performance between logistic regression and machine-learning models.
| Duration | Logistic regression: AUC^a (range) | Logistic regression: RMSE^b | Gradient boosting machine: AUC (range) | Gradient boosting machine: RMSE | Deep learning: AUC (range) | Deep learning: RMSE | Random forest: AUC (range) | Random forest: RMSE |
|---|---|---|---|---|---|---|---|---|
| 3 years | 0.7401 (0.7262-0.7541) | 0.1203 | 0.7927 (0.7803-0.8051) | 0.1197 | 0.7769 (0.7639-0.7899) | 0.1244 | 0.7868 (0.7742-0.7993) | 0.1198 |
| 5 years | 0.7192 (0.7084-0.7301) | 0.1633 | 0.7769 (0.7673-0.7864) | 0.1620 | 0.7610 (0.7566-0.7762) | 0.1667 | 0.7769 (0.7612-0.7804) | 0.1622 |
| 7 years | 0.6990 (0.6901-0.7077) | 0.2087 | 0.7589 (0.751-0.7668) | 0.2063 | 0.7526 (0.7446-0.7606) | 0.2099 | 0.7531 (0.7452-0.761) | 0.2066 |
| 10 years | 0.6885 (0.6801-0.6961) | 0.2318 | 0.7491 (0.7426-0.7570) | 0.2314 | 0.7374 (0.7339-0.7486) | 0.2435 | 0.7439 (0.7365-0.7510) | 0.2318 |
^a AUC: area under the receiver operating characteristic curve.
^b RMSE: root mean squared error.
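The two metrics reported in the model comparison can be illustrated with a minimal sketch. It assumes binary outcomes (1 = developed diabetes) and model-predicted probabilities; the helper names `auc` and `rmse` and the toy data are hypothetical and are not taken from the study, whose actual evaluation code is not shown here.

```python
import math

def auc(y_true, y_score):
    """AUC as the share of (positive, negative) pairs in which the positive
    case receives the higher score; tied pairs count as half a win."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def rmse(y_true, y_score):
    """Root mean squared error between 0/1 outcomes and predicted probabilities."""
    squared = [(y - s) ** 2 for y, s in zip(y_true, y_score)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical toy data, for illustration only.
y_true = [0, 0, 1, 0, 1, 0]
y_score = [0.02, 0.10, 0.35, 0.40, 0.80, 0.05]

print(f"AUC  = {auc(y_true, y_score):.4f}")
print(f"RMSE = {rmse(y_true, y_score):.4f}")
```

A higher AUC indicates better discrimination between people who do and do not develop diabetes, while a lower RMSE indicates predicted probabilities that sit closer to the observed outcomes. The pairwise loop is quadratic in sample size, so a cohort-scale evaluation would normally use a rank-based implementation instead.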
Figure 2. Ranked contribution to the variance of diabetes prediction by various models. (+ increasing risk; - decreasing risk; * being male increases risk compared with being female; # being born overseas increases diabetes risk compared with being born in Australia; § having private insurance decreases risk compared with having no private insurance; $ being in major cities increases risk compared with being in inner or outside regional areas; ‡ having Asian or other ancestry increases diabetes risk compared with having Australian ancestry). GBM: gradient boosting machine.
Model-predicted probability of diabetes onset in three scenarios compared with their respective status quo scenarios.
| Scenario | Year | Baseline scenario | Scenario with hypothetical BMI change | Test statistic | P value |
|---|---|---|---|---|---|
| Scenario 1^a | Year 3 | 3.04% | 1.54% | 6611.97 (93,288) | <.001 |
| Scenario 1^a | Year 5 | 5.81% | 2.89% | 7957.43 (93,288) | <.001 |
| Scenario 1^a | Year 7 | 10.62% | 4.68% | 12,120.59 (93,288) | <.001 |
| Scenario 1^a | Year 10 | 13.43% | 6.22% | 12,732.71 (93,288) | <.001 |
| Scenario 2^b | Year 3 | 1.93% | 1.02% | 15,401.27 (267,658) | <.001 |
| Scenario 2^b | Year 5 | 3.68% | 1.94% | 17,086.55 (267,658) | <.001 |
| Scenario 2^b | Year 7 | 6.41% | 2.98% | 23,460.63 (267,658) | <.001 |
| Scenario 2^b | Year 10 | 8.26% | 3.93% | 24,604.81 (267,658) | <.001 |
| Scenario 3^c | Year 3 | 1.93% | 0.77% | 20,856.85 (267,658) | <.001 |
| Scenario 3^c | Year 5 | 3.68% | 1.50% | 22,630.22 (267,658) | <.001 |
| Scenario 3^c | Year 7 | 6.41% | 2.14% | 31,002.83 (267,658) | <.001 |
| Scenario 3^c | Year 10 | 8.26% | 2.79% | 33,214.27 (267,658) | <.001 |
^a Scenario 1: “obese” individuals become “overweight.”
^b Scenario 2: “obese” individuals become “overweight” and “overweight” individuals reach a “healthy” BMI.
^c Scenario 3: all “obese” and “overweight” individuals reach a “healthy” BMI.
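To make the size of the predicted changes easier to compare across scenarios, the sketch below converts the year-10 percentages from the scenario table into absolute and relative reductions. It assumes that the three row groups in the table correspond to Scenarios 1, 2, and 3 in order; the function name `reductions` is hypothetical, and these derived figures are simple arithmetic on the tabulated values, not results reported by the authors.

```python
# Year-10 predicted probabilities of diabetes onset (%), copied from the
# scenario table as (baseline, hypothetical BMI change).
# Assumption: the row groups map to Scenarios 1-3 in the order they appear.
year10 = {
    "Scenario 1": (13.43, 6.22),
    "Scenario 2": (8.26, 3.93),
    "Scenario 3": (8.26, 2.79),
}

def reductions(baseline, scenario):
    """Return (absolute reduction in percentage points, relative reduction)."""
    absolute = baseline - scenario
    relative = absolute / baseline
    return absolute, relative

for name, (baseline, scenario) in year10.items():
    absolute, relative = reductions(baseline, scenario)
    print(f"{name}: {absolute:.2f} percentage points, {relative:.1%} relative")
```

Scenario 1 starts from a different baseline than Scenarios 2 and 3, so the relative reduction is the more comparable of the two figures across scenarios.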