Yang Wu, Haofei Hu, Jinlin Cai, Runtian Chen, Xin Zuo, Heng Cheng, Dewen Yan.
Abstract
Identifying individuals at high risk for incident diabetes could help target the delivery of intervention programs. We aimed to develop a personalized nomogram predicting the 3-year risk of diabetes among Chinese adults. This retrospective cohort study included 32,312 participants without diabetes at baseline. All participants were randomly stratified into a training cohort (n = 16,219) and a validation cohort (n = 16,093). The least absolute shrinkage and selection operator (LASSO) model was used to construct the nomogram and derive a formula for diabetes probability. Receiver operating characteristic (ROC) curve analysis and decision curve analysis, each with 500 bootstrap resamples, were used to assess the nomogram's discrimination and clinical use, respectively. In the training and validation cohorts, 155 and 141 participants developed diabetes, respectively. The area under the curve (AUC) of the nomogram was 0.9125 (95% CI, 0.8887-0.9364) in the training cohort and 0.9030 (95% CI, 0.8747-0.9313) in the validation cohort. In external validation with 12,545 Japanese participants, the AUC was 0.8488 (95% CI, 0.8126-0.8850). Both internal and external validation showed that the nomogram had excellent prediction performance. In conclusion, we developed and validated a personalized nomogram that predicts the 3-year risk of incident diabetes among Chinese adults and identifies individuals at high risk of developing diabetes.
Year: 2020 PMID: 33303841 PMCID: PMC7729957 DOI: 10.1038/s41598-020-78716-1
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Flowchart of study participants.
Baseline characteristics of the training and validation cohorts.
| Characteristic | Training cohort | Validation cohort | Standardized difference | P value |
|---|---|---|---|---|
| Participants | 16,219 | 16,093 | ||
| Age (year) | 43.15 ± 12.65 | 43.10 ± 12.59 | 0.00 (− 0.02, 0.03) | 0.747 |
| Gender | | | 0.00 (− 0.02, 0.02) | 0.790 |
| Male | 10,527 (64.91%) | 10,468 (65.05%) | ||
| Female | 5692 (35.09%) | 5625 (34.95%) | ||
| BMI (kg/m2) | 23.56 ± 3.28 | 23.54 ± 3.32 | 0.01 (− 0.01, 0.03) | 0.527 |
| SBP (mmHg) | 119.74 ± 15.73 | 119.85 ± 15.94 | 0.01 (− 0.01, 0.03) | 0.526 |
| DBP (mmHg) | 74.97 ± 10.53 | 74.94 ± 10.48 | 0.00 (− 0.02, 0.03) | 0.758 |
| FPG (mmol/L) | 4.97 ± 0.62 | 4.97 ± 0.62 | 0.01 (− 0.01, 0.03) | 0.528 |
| TG (mmol/L) | 1.17 (0.80–1.75) | 1.17 (0.80–1.75) | 0.00 (− 0.02, 0.02) | 0.860 |
| HDL-C (mmol/L) | 1.34 ± 0.31 | 1.34 ± 0.30 | 0.01 (− 0.01, 0.03) | 0.329 |
| LDL-C (mmol/L) | 2.74 ± 0.68 | 2.74 ± 0.69 | 0.00 (− 0.02, 0.02) | 0.804 |
| ALT (U/L) | 19.60 (13.80–29.60) | 19.60 (13.80–29.30) | 0.01 (− 0.02, 0.03) | 0.837 |
| BUN (mmol/L) | 4.71 ± 1.17 | 4.70 ± 1.16 | 0.01 (− 0.01, 0.03) | 0.264 |
| Scr (umol/L) | 72.17 ± 15.24 | 72.30 ± 15.22 | 0.01 (− 0.01, 0.03) | 0.457 |
| Smoking status | | | 0.00 (− 0.02, 0.02) | 0.804 |
| Never | 12,240 (75.47%) | 12,164 (75.59%) | ||
| Ever/Current | 3979 (24.53%) | 3929 (24.41%) | ||
| Drinking status | | | 0.01 (− 0.02, 0.03) | 0.621 |
| Never | 13,018 (80.26%) | 12,952 (80.48%) | ||
| Ever/Current | 3201 (19.74%) | 3141 (19.52%) | ||
| Family history | | | 0.00 (− 0.02, 0.03) | 0.700 |
| No | 15,302 (94.35%) | 15,199 (94.44%) | ||
| Yes | 917 (5.65%) | 894 (5.56%) |
Values are n (%), mean ± SD, or median (interquartile range).
BMI, Body mass index; SBP, Systolic blood pressure; DBP, Diastolic blood pressure; FPG, Fasting plasma glucose; TG, Triglyceride; HDL-C, High-density lipoprotein cholesterol; LDL-C, Low-density lipoprotein cholesterol; ALT, Alanine aminotransferase; BUN, Blood urea nitrogen; Scr, Serum creatinine; Family history, Family history of diabetes.
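The table does not state how the standardized differences were calculated. As a rough illustration only, the sketch below uses one common definition for continuous variables: the absolute difference in means divided by the pooled standard deviation. The function name and the choice of formula are assumptions, not the authors' documented method.

```python
import math


def standardized_difference(mean_a, sd_a, mean_b, sd_b):
    """Absolute standardized mean difference using a pooled SD.

    Assumed definition (not confirmed by the source):
    |mean_a - mean_b| / sqrt((sd_a^2 + sd_b^2) / 2)
    """
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return abs(mean_a - mean_b) / pooled_sd


# Age row of the baseline table: 43.15 ± 12.65 (training) vs 43.10 ± 12.59 (validation)
age_diff = standardized_difference(43.15, 12.65, 43.10, 12.59)
# BMI row: 23.56 ± 3.28 vs 23.54 ± 3.32
bmi_diff = standardized_difference(23.56, 3.28, 23.54, 3.32)
print(f"age: {age_diff:.2f}, BMI: {bmi_diff:.2f}")
```

Under this assumed definition, both values are far below 0.1, in line with the near-zero differences reported in the table.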
Baseline characteristics for the training and validation cohorts by incident diabetes status.
| Characteristic | Training: No diabetes | Training: Incident diabetes | Training: P value | Validation: No diabetes | Validation: Incident diabetes | Validation: P value |
|---|---|---|---|---|---|---|
| Participants | 16,064 | 155 | | 15,952 | 141 | |
| Age (year) | 43.03 ± 12.60 | 55.34 ± 12.68 | < 0.001 | 42.98 ± 12.52 | 56.57 ± 12.88 | < 0.001 |
| Gender | | | < 0.001 | | | < 0.001 |
| Male | 10,399 (64.73%) | 128 (82.58%) | | 10,355 (64.91%) | 113 (80.14%) | |
| Female | 5665 (35.27%) | 27 (17.42%) | | 5597 (35.09%) | 28 (19.86%) | |
| BMI (kg/m2) | 23.54 ± 3.27 | 26.27 ± 3.17 | < 0.001 | 23.51 ± 3.31 | 26.30 ± 3.39 | < 0.001 |
| SBP (mmHg) | 119.61 ± 15.67 | 132.81 ± 16.30 | < 0.001 | 119.76 ± 15.88 | 129.99 ± 19.58 | < 0.001 |
| DBP (mmHg) | 74.91 ± 10.50 | 81.14 ± 11.17 | < 0.001 | 74.90 ± 10.47 | 78.69 ± 10.67 | < 0.001 |
| FPG (mmol/L) | 4.96 ± 0.61 | 6.03 ± 0.69 | < 0.001 | 4.96 ± 0.61 | 6.01 ± 0.70 | < 0.001 |
| TG (mmol/L) | 1.16 (0.80–1.74) | 1.83 (1.24–2.67) | < 0.001 | 1.16 (0.80–1.74) | 1.69 (1.11–2.60) | < 0.001 |
| HDL-C (mmol/L) | 1.34 ± 0.30 | 1.35 ± 0.79 | 0.709 | 1.34 ± 0.30 | 1.29 ± 0.30 | 0.071 |
| LDL-C (mmol/L) | 2.74 ± 0.68 | 2.92 ± 0.65 | < 0.001 | 2.74 ± 0.69 | 2.81 ± 0.71 | 0.202 |
| ALT (U/L) | 19.50 (13.70–29.40) | 26.70 (19.00–43.90) | < 0.001 | 19.50 (13.80–29.10) | 27.10 (18.90–40.60) | < 0.001 |
| BUN (mmol/L) | 4.71 ± 1.17 | 5.15 ± 1.43 | < 0.001 | 4.69 ± 1.16 | 5.16 ± 1.33 | < 0.001 |
| Scr (umol/L) | 72.15 ± 15.21 | 74.77 ± 17.83 | 0.033 | 72.28 ± 15.20 | 74.81 ± 16.80 | 0.049 |
| Smoking status | | | < 0.001 | | | < 0.001 |
| Never | 12,150 (75.63%) | 90 (58.06%) | | 12,084 (75.75%) | 80 (56.74%) | |
| Ever/Current | 3914 (24.37%) | 65 (41.94%) | | 3868 (24.25%) | 61 (43.26%) | |
| Drinking status | | | 0.012 | | | 0.335 |
| Never | 12,906 (80.34%) | 112 (72.26%) | | 12,834 (80.45%) | 118 (83.69%) | |
| Ever/Current | 3158 (19.66%) | 43 (27.74%) | | 3118 (19.55%) | 23 (16.31%) | |
| Family history | | | 0.139 | | | 0.124 |
| No | 15,160 (94.37%) | 142 (91.61%) | | 15,070 (94.47%) | 129 (91.49%) | |
| Yes | 904 (5.63%) | 13 (8.39%) | | 882 (5.53%) | 12 (8.51%) | |
Values are n (%), mean ± SD, or median (interquartile range).
SD, Standard deviation; BMI, Body mass index; SBP, Systolic blood pressure; DBP, Diastolic blood pressure; FPG, Fasting plasma glucose; TG, Triglyceride; HDL-C, High-density lipoprotein cholesterol; LDL-C, Low-density lipoprotein cholesterol; ALT, Alanine aminotransferase; BUN, Blood urea nitrogen; Scr, Serum creatinine; Family history, Family history of diabetes.
Risk predictors for incident diabetes in the univariate and multivariate analysis.
| Variable | Univariate (OR, 95% CI, P) | Multivariate (OR, 95% CI, P) |
|---|---|---|
| Age(year) | 1.066 (1.058, 1.075) < 0.00001 | 1.047 (1.036, 1.058) < 0.00001 |
| Gender | | |
| Male | 1.0 (Ref) | 1.0 (Ref) |
| Female | 0.421 (0.314, 0.564) < 0.00001 | 0.675 (0.451, 1.009) 0.05506 |
| BMI (kg/m2) | 1.238 (1.202, 1.274) < 0.00001 | 1.122 (1.077, 1.168) < 0.00001 |
| SBP (mmHg) | 1.039 (1.033, 1.046) < 0.00001 | 1.008 (0.999, 1.018) 0.07860 |
| DBP (mmHg) | 1.042 (1.032, 1.052) < 0.00001 | 0.994 (0.980, 1.009) 0.42703 |
| FPG (mmol/L) | 13.925 (11.487, 16.882) < 0.00001 | 8.564 (6.978, 10.509) < 0.00001 |
| TG (mmol/L) | 1.304 (1.238, 1.373) < 0.00001 | 1.069 (0.994, 1.150) 0.07091 |
| HDL-C (mmol/L) | 0.831 (0.567, 1.216) 0.34028 | 1.515 (1.101, 2.086) 0.01085 |
| LDL-C (mmol/L) | 1.303 (1.115, 1.524) 0.00090 | 0.858 (0.722, 1.020) 0.08233 |
| ALT (U/L) | 1.010 (1.007, 1.012) < 0.00001 | 1.008 (1.004, 1.011) 0.00016 |
| BUN (mmol/L) | 1.343 (1.232, 1.464) < 0.00001 | 1.026 (0.924, 1.139) 0.63007 |
| Scr (umol/L) | 1.011 (1.004, 1.018) 0.00368 | 0.992 (0.982, 1.002) 0.10641 |
| Smoking status | | |
| Never | 1.0 (Ref) | 1.0 (Ref) |
| Ever/Current | 2.308 (1.831, 2.910) < 0.00001 | 1.527 (1.158, 2.014) 0.00271 |
| Drinking status | | |
| Never | 1.0 (Ref) | 1.0 (Ref) |
| Ever/Current | 1.177 (0.894, 1.550) 0.24580 | 0.822 (0.606, 1.115) 0.20821 |
| Family history | | |
| No | 1.0 (Ref) | 1.0 (Ref) |
| Yes | 1.561 (1.034, 2.359) 0.03421 | 1.902 (1.219, 2.967) 0.00461 |
BMI, Body mass index; SBP, Systolic blood pressure; DBP, Diastolic blood pressure; FPG, Fasting plasma glucose; TG, Triglyceride; HDL-C, High-density lipoprotein cholesterol; LDL-C, Low-density lipoprotein cholesterol; ALT, Alanine aminotransferase; BUN, Blood urea nitrogen; Scr, Serum creatinine; Family history, Family history of diabetes.
OR, Odds ratio; CI, Confidence interval; Ref, Reference.
Figure 2. Risk predictor selection using the LASSO logistic regression model. (A) Selection of the optimal tuning parameter (lambda) in the LASSO model with fivefold cross-validation by the minimum criterion. The area under the receiver operating characteristic curve was plotted versus log (lambda). Dotted vertical lines were drawn at the optimal values given by the minimum criterion and by 1 SE of the minimum criterion; (B) LASSO coefficient profiles of the 15 predictors. A coefficient profile plot was developed against the log (lambda) sequence. A vertical line was drawn at the value selected with fivefold cross-validation, where the optimal lambda resulted in 5 predictors with nonzero coefficients (lambda = 0.003).
Prediction performance of the nomogram for the risk of diabetes.
| Cohort | AUC | 95% CI lower | 95% CI upper | Best threshold | Specificity (%) | Sensitivity (%) | Accuracy (%) | PPV (%) | NPV (%) | PLR | NLR | DOR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Training cohort | 0.9125 | 0.8887 | 0.9364 | 0.0072 | 80.11 | 89.03 | 80.20 | 4.14 | 99.87 | 4.4764 | 0.1369 | 32.6967 |
| Validation cohort | 0.9030 | 0.8747 | 0.9313 | − 4.8295 | 82.30 | 85.11 | 82.33 | 4.08 | 99.84 | 4.8091 | 0.1810 | 26.5756 |
AUC, Area under curve; CI, Confidence interval; PPV, Positive predictive value; NPV, Negative predictive value; PLR, Positive likelihood ratio; NLR, Negative likelihood ratio; DOR, Diagnostic odds ratio.
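The derived columns in the table above follow from sensitivity, specificity, and the cohort's incidence through standard definitions. The sketch below is a minimal illustration of that arithmetic; the function name is hypothetical. Because the inputs are the rounded table values, small differences in the last digits are expected.

```python
def diagnostic_metrics(sensitivity, specificity, prevalence):
    """Derive accuracy, PPV, NPV, PLR, NLR and DOR from three proportions (0-1)."""
    # Cell proportions of the 2x2 table, expressed as fractions of the cohort
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)

    plr = sensitivity / (1 - specificity)   # positive likelihood ratio
    nlr = (1 - sensitivity) / specificity   # negative likelihood ratio
    return {
        "accuracy": tp + tn,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "plr": plr,
        "nlr": nlr,
        "dor": plr / nlr,                   # diagnostic odds ratio
    }


# Training cohort: sensitivity 89.03%, specificity 80.11%, 155 events among 16,219
training = diagnostic_metrics(0.8903, 0.8011, 155 / 16219)
for name, value in training.items():
    print(f"{name}: {value:.4f}")
```

The low PPV alongside a very high NPV reflects the low 3-year incidence (under 1%) rather than poor discrimination.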
Formula for diabetes risk from the LASSO model:
Model = − 23.14183 + 0.03224 × age (year) + 0.10645 × BMI (kg/m2) + 0.01388 × SBP (mmHg) + 2.24841 × FPG (mmol/L) + 0.09444 × TG (mmol/L).
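The formula gives a linear score, not a probability. Since Figure 2 describes a LASSO logistic regression model, one plausible reading is that the score is a log-odds value converted to a probability with the logistic function. The sketch below makes that assumption explicit; the function names and the example participant are hypothetical.

```python
import math

# Intercept and coefficients copied from the published formula
INTERCEPT = -23.14183
COEFFICIENTS = {
    "age": 0.03224,   # years
    "bmi": 0.10645,   # kg/m2
    "sbp": 0.01388,   # mmHg
    "fpg": 2.24841,   # mmol/L
    "tg": 0.09444,    # mmol/L
}


def linear_predictor(age, bmi, sbp, fpg, tg):
    """Evaluate the published 'Model' expression."""
    values = {"age": age, "bmi": bmi, "sbp": sbp, "fpg": fpg, "tg": tg}
    return INTERCEPT + sum(COEFFICIENTS[k] * values[k] for k in COEFFICIENTS)


def predicted_probability(age, bmi, sbp, fpg, tg):
    """ASSUMPTION: the score is a logit, so apply the logistic function."""
    score = linear_predictor(age, bmi, sbp, fpg, tg)
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical participant, for illustration only
risk = predicted_probability(age=50, bmi=26.0, sbp=130, fpg=6.0, tg=1.8)
print(f"Predicted 3-year risk: {risk:.2%}")
```

With a coefficient of 2.24841 per mmol/L, FPG dominates the score, consistent with its large odds ratio in the multivariate analysis.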
Figure 3. Nomogram to predict the risk of diabetes for Chinese adults. The patient's value for each risk predictor is plotted on the appropriate scale, and a vertical line is drawn from that value to the top Points scale to obtain the corresponding score. All scores are summed to obtain the total points score. The total points score is plotted on the bottom Total Points scale. The corresponding value shows the predicted probability of incident diabetes.
Figure 4. Bootstrap resampling validation (times = 500) to confirm the stability of the nomogram's prediction performance in the training cohort (A) and validation cohort (B).
Figure 5. Comparison between predicted and observed 3-year diabetes incidence by decile of the nomogram's predicted risk score in the training cohort.
Values of sensitivity, specificity and predictive values of the nomogram scores at different cut-off values.
| Predicted probability | Specificity (%) | Sensitivity (%) | Accuracy (%) | PPV (%) | NPV (%) | PLR | NLR | DOR |
|---|---|---|---|---|---|---|---|---|
| ≥ 0.05 | 95.61 | 61.29 | 95.28 | 11.86 | 99.61 | 13.95 | 0.40 | 34.44 |
| ≥ 0.10 | 97.43 | 43.87 | 96.92 | 14.14 | 99.45 | 17.06 | 0.58 | 29.62 |
| ≥ 0.15 | 98.62 | 32.90 | 98.00 | 18.75 | 99.35 | 23.92 | 0.68 | 35.15 |
| ≥ 0.20 | 99.10 | 24.52 | 98.39 | 20.88 | 99.27 | 27.35 | 0.76 | 35.91 |
| ≥ 0.25 | 99.60 | 16.77 | 98.80 | 28.57 | 99.20 | 41.46 | 0.84 | 49.61 |
| ≥ 0.30 | 99.78 | 12.26 | 98.95 | 35.19 | 99.16 | 56.26 | 0.88 | 63.98 |
| ≥ 0.35 | 99.88 | 7.74 | 99.00 | 37.50 | 99.12 | 62.18 | 0.92 | 67.32 |
| ≥ 0.40 | 99.94 | 3.87 | 99.02 | 37.50 | 99.08 | 62.18 | 0.96 | 64.65 |
| ≥ 0.45 | 99.98 | 1.29 | 99.04 | 40.00 | 99.06 | 69.09 | 0.99 | 69.98 |
| ≥ 0.50 | 99.99 | 1.29 | 99.04 | 50.00 | 99.06 | 103.64 | 0.99 | 104.98 |
PPV, Positive predictive value; NPV, Negative predictive value; PLR, Positive likelihood ratio; NLR, Negative likelihood ratio; DOR, Diagnostic odds ratio.
Figure 6. Decision curve analysis of the LASSO model for 3-year diabetes risk in the training cohort (A) and validation cohort (B). The black line represents the net benefit when no participant is assumed to develop diabetes ("no treatment" line), while the light gray line represents the net benefit when all participants are assumed to develop diabetes ("all treatment" line). The distance of the model curve from these two lines indicates the clinical utility of the model: the farther the model curve lies from the black and light gray lines, the better the clinical use of the nomogram. (Using bootstraps with 500 resamples.)
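Each point on a decision curve is a net benefit at one threshold probability. The sketch below uses the standard net-benefit definition from decision curve analysis: true positives per participant minus false positives per participant, weighted by the odds of the threshold. The counts are approximate reconstructions from the reported sensitivity and specificity at the ≥ 0.05 cut-off in the training cohort, not figures taken from the source, and the function names are hypothetical.

```python
def net_benefit(true_positives, false_positives, n, threshold):
    """Standard net benefit: TP/n - FP/n * threshold / (1 - threshold)."""
    weight = threshold / (1 - threshold)
    return true_positives / n - weight * false_positives / n


def net_benefit_treat_all(events, n, threshold):
    """'All treatment' strategy: everyone is classified as high risk."""
    return net_benefit(events, n - events, n, threshold)


# Training cohort: 16,219 participants, 155 incident cases
n, events = 16219, 155
threshold = 0.05

# Approximate counts back-calculated from the reported
# sensitivity (61.29%) and specificity (95.61%) at this cut-off
tp = round(0.6129 * events)
fp = round((1 - 0.9561) * (n - events))

model_nb = net_benefit(tp, fp, n, threshold)
all_nb = net_benefit_treat_all(events, n, threshold)
none_nb = 0.0  # 'no treatment' strategy has zero net benefit by definition

print(f"model: {model_nb:.4f}, treat all: {all_nb:.4f}, treat none: {none_nb:.4f}")
```

At this threshold the model's net benefit is positive while the "all treatment" strategy is negative, which is the pattern a decision curve shows when the model adds clinical value.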
Figure 7. ROC curves of the nomogram in the external validation cohort (A) and the overall population of the original study (B).