Yunzhen Ye, Yu Xiong, Qiongjie Zhou, Jiangnan Wu, Xiaotian Li, Xirong Xiao.
Abstract
BACKGROUND: Gestational diabetes mellitus (GDM) contributes to adverse pregnancy and birth outcomes. In recent decades, extensive research has been devoted to the early prediction of GDM by various methods. Machine learning methods are flexible prediction algorithms with potential advantages over conventional regression.
Year: 2020 PMID: 32626780 PMCID: PMC7306091 DOI: 10.1155/2020/4168340
Source DB: PubMed Journal: J Diabetes Res Impact factor: 4.011
Figure 1. Study profile and analysis pipeline.
Baseline characteristics.
| Characteristic | Development set: GDM | Development set: Control | P value | Validation set: GDM | Validation set: Control | P value |
|---|---|---|---|---|---|---|
| Maternal age (years) | | | | | | |
| <20 | 1 (0.44%) | 7 (0.26%) | <0.001 | 1 (0.11%) | 0 | <0.001 |
| 20-34.9 | 2051 (90.95%) | 2522 (95.35%) | | 856 (92.34%) | 1105 (94.20%) | |
| 35-40 | 159 (7.05%) | 88 (3.33%) | | 51 (5.50%) | 50 (4.26%) | |
| ≥40 | 9 (0.40%) | 3 (0.11%) | | 5 (0.54%) | 1 (0.09%) | |
| BMI (kg/m²) | | | | | | |
| <25 | 1290 (57.2%) | 1920 (72.6%) | <0.001 | 559 (60.30%) | 853 (72.7%) | <0.001 |
| 25-29.9 | 450 (20.0%) | 269 (10.2%) | | 161 (17.40%) | 110 (9.4%) | |
| >30 | 80 (3.5%) | 26 (1.0%) | | 29 (3.10%) | 7 (0.6%) | |
| Education status | | | | | | |
| <College | 740 (33.9%) | 957 (35.2%) | 0.41 | 372 (37.2%) | 383 (34.8%) | 0.28 |
| College | 1142 (52.4%) | 1363 (50.1%) | 0.08 | 491 (49.1%) | 539 (49.0%) | 0.97 |
| >College | 209 (9.6%) | 299 (11.0%) | 0.12 | 99 (9.9%) | 132 (12.0%) | 0.12 |
| Smoking | 30 (1.4%) | 31 (1.1%) | 0.44 | 14 (1.4%) | 15 (1.4%) | 0.85 |
| Nulliparous | 1840 (81.60%) | 2194 (82.95%) | 0.44 | 763 (82.31%) | 973 (82.95%) | 0.67 |
| Prior macrosomia | 22 (1.0%) | 15 (0.57%) | 0.10 | 10 (1.08%) | 3 (0.26%) | 0.02 |
| Prior preterm delivery | 22 (1.0%) | 17 (0.64%) | 0.18 | 7 (0.76%) | 10 (0.85%) | 0.80 |
| Prior GDM | 20 (0.89%) | 0 | <0.001 | 12 (1.30%) | 0 | <0.001 |
| Family history of diabetes | 21 (0.93%) | 9 (0.34%) | 0.008 | 14 (1.51%) | 3 (0.36%) | 0.001 |
| 3-Triglyceride | 1.67 ± 0.79 | 1.44 ± 0.59 | <0.001 | 1.67 ± 0.85 | 1.40 ± 0.55 | <0.001 |
| Uric acid | 213.30 ± 45.25 | 202.71 ± 39.88 | <0.001 | 213.19 ± 44.93 | 203.24 ± 41.29 | <0.001 |
| Glycosylated hemoglobin | 5.21 ± 0.43 | 5.03 ± 0.38 | <0.001 | 5.19 ± 0.42 | 5.01 ± 0.35 | <0.001 |
| Alkaline phosphatase | 67.85 ± 36.53 | 66.94 ± 35.94 | 0.008 | 68.06 ± 37.75 | 67.32 ± 36.85 | 0.07 |
| Total cholesterol | 4.71 ± 0.77 | 4.60 ± 0.76 | <0.001 | 4.70 ± 0.86 | 4.65 ± 0.77 | 0.18 |
| Lactic dehydrogenase | 152.92 ± 37.80 | 153.00 ± 34.67 | 0.18 | 153.19 ± 34.77 | 152.38 ± 33.85 | 0.18 |
| Fasting blood glucose | 4.59 ± 0.67 | 4.34 ± 0.46 | <0.001 | 4.59 ± 0.74 | 4.32 ± 0.40 | <0.001 |
| AFP concentration | 42.82 ± 17.66 | 44.53 ± 17.32 | <0.001 | 42.56 ± 16.19 | 44.72 ± 17.55 | 0.001 |
| Fibrinogen | 3.77 ± 0.64 | 3.63 ± 0.59 | <0.001 | 3.72 ± 0.61 | 3.62 ± 0.62 | <0.001 |
| High-density lipoprotein | 1.08 ± 0.22 | 1.05 ± 0.21 | <0.001 | 1.08 ± 0.22 | 1.06 ± 0.22 | 0.04 |
Data are n (%) or mean ± SD. P values indicate differences between groups, calculated using the two-sample Wilcoxon rank-sum (Mann-Whitney) test for continuous variables and the Pearson χ² test or ANOVA for categorical variables, with trend tests if appropriate. The “missing” category was not included in statistical tests. For characteristics that had no “missing” category, the data were 100% complete. Maternal age was defined as age at recruitment into the study. Maternal BMI was recorded in mid-pregnancy, when Down's syndrome screening was performed.
Figure 2. Discrimination and calibration metrics of the machine learning and logistic regression models in the validation cohort. The AUC (a) and mean absolute error (b) are presented for each model as mean and 95% confidence interval.
Figure 3. Contribution of the predictor variables in GBDT and the logistic regression model. (a) Importance of the predictor variables in the GBDT model in the validation cohort. (b) Importance of the predictor variables in the logistic model with restricted cubic spline in the validation cohort. (c) Partial plot of the effects of fasting blood glucose (GLU, mmol/L), glycosylated hemoglobin (HbA1c, %), triglyceride (TG, mmol/L), and maternal BMI (kg/m²) on the risk of GDM across different values in the GBDT model. (d) Partial plot of the effects of glycosylated hemoglobin (HbA1c, %), high-density lipoprotein (HDL, mmol/L), fasting blood glucose (GLU, mmol/L), and triglyceride (TG, mmol/L) on the risk of GDM across different values in the logistic model with restricted cubic spline.
Figure 4. Predictive performance of the GBDT model. (a) AUCs of the development and validation cohorts. (b) Calibration curve of the GBDT model, showing the relationship between observed and predicted GDM. Data are presented as mean and 95% CIs.
Figure 5. Predictive values for participants with and without GDM in the cohorts. (a) Distribution of the predictive value and its relationship with observed GDM in the combined cohort (development and validation cohorts). (b) Distribution of predictive values in the development and validation cohorts. Data are presented as median and interquartile range.
Performance of the cutoff points of 0.3 and 0.7 for the GBDT model in predicting GDM.
| Cutoff points | Development cohort | Validation cohort |
|---|---|---|
| 0.3 | | |
| Negative predictive value | 82.40% (79.90%-84.70%) | 74.10% (69.50%-78.20%) |
| Positive predictive value | 51.3% (49.80%-52.90%) | 52.60% (50.20%-54.90%) |
| Sensitivity | 92% (90.80%-93.10%) | 90% (88.00%-91.70%) |
| Specificity | 30% (28.30%-31.80%) | 26% (23.50%-28.70%) |
| Positive likelihood ratio | 1.31 (1.29-1.34) | 1.22 (1.18-1.26) |
| Negative likelihood ratio | 0.27 (0.11-0.42) | 0.39 (0.17-0.60) |
| 0.7 | | |
| Negative predictive value | 59.30% (57.80%-60.70%) | 56.1% (53.90%-58.30%) |
| Positive predictive value | 86.60% (82.90%-89.60%) | 93.2% (88.20%-96.10%) |
| Sensitivity | 16% (14.50%-17.60%) | 15% (12.90%-17.30%) |
| Specificity | 98% (97.40%-98.50%) | 99% (98.20%-99.40%) |
| Positive likelihood ratio | 8 (7.72-8.28) | 15 (14.38-15.61) |
| Negative likelihood ratio | 0.86 (0.84-0.88) | 0.86 (0.83-0.89) |
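The likelihood ratios in the table above follow from the reported sensitivity and specificity through the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The short Python sketch below recomputes them from the rounded point estimates in the table as a consistency check. The function name and the dictionary layout are illustrative assumptions, not part of the study's analysis code, and the confidence intervals are not reproduced.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for sensitivity and specificity given as fractions."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


# Rounded point estimates copied from the GDM cutoff table:
# (cutoff, cohort) -> (sensitivity, specificity)
reported = {
    ("0.3", "development"): (0.92, 0.30),
    ("0.3", "validation"): (0.90, 0.26),
    ("0.7", "development"): (0.16, 0.98),
    ("0.7", "validation"): (0.15, 0.99),
}

for (cutoff, cohort), (sens, spec) in reported.items():
    lr_pos, lr_neg = likelihood_ratios(sens, spec)
    print(f"cutoff {cutoff}, {cohort}: LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```

Because the inputs are already rounded, the recomputed ratios can differ from the published ones in the last digit; for the values in this table they agree to two decimals.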
Performance of the cutoff points of 0.3 and 0.7 for the GBDT model in predicting adverse pregnancy outcomes.
| Cutoff points | Development cohort | Validation cohort |
|---|---|---|
| 0.3 | | |
| Negative predictive value | 100% (92.7%-100%) | 52.4% (32.4%-71.7%) |
| Positive predictive value | 50.1% (48.7%-51.5%) | 49.9% (47.8%-52.1%) |
| Sensitivity | 100% (99.8%-100%) | 99.0% (98.3%-99.5%) |
| Specificity | 2% (1.5%-2.6%) | 1% (0.6%-1.9%) |
| Positive likelihood ratio | 1.02 (1.01-1.03) | 1 (0.99-1.01) |
| Negative likelihood ratio | 0 (0-nan) | 1 (0.15-1.85) |
| 0.7 | | |
| Negative predictive value | 51.4% (50.0%-52.8%) | 50.9% (48.7%-53.0%) |
| Positive predictive value | 83.0% (76.1%-88.2%) | 79.2% (66.5%-88.0%) |
| Sensitivity | 5% (4.2%-6.0%) | 4% (3.0%-5.4%) |
| Specificity | 99% (98.5%-99.3%) | 99% (98.1%-99.4%) |
| Positive likelihood ratio | 5 (4.57-5.43) | 4 (3.33-4.67) |
| Negative likelihood ratio | 0.96 (0.95-0.97) | 0.97 (0.96-0.98) |
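One way to read the likelihood ratios in this table is through the odds form of Bayes' rule: post-test odds = pre-test odds × likelihood ratio. The sketch below applies that rule to the development-cohort ratios. The pre-test probability of 0.10 is a purely hypothetical value chosen for illustration (the source does not report one here), and the function name is likewise an assumption.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Update a probability with a likelihood ratio via the odds form of Bayes' rule."""
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Hypothetical pre-test probability, for illustration only (not from the study).
pre_test = 0.10

# Development-cohort likelihood ratios from the adverse-outcome table.
examples = {
    "cutoff 0.3, positive result (LR+ = 1.02)": 1.02,
    "cutoff 0.7, positive result (LR+ = 5)": 5.0,
    "cutoff 0.7, negative result (LR- = 0.96)": 0.96,
}

for label, lr in examples.items():
    p = post_test_probability(pre_test, lr)
    print(f"{label}: post-test probability = {p:.3f}")
```

A ratio close to 1, as at the 0.3 cutoff, leaves the probability essentially unchanged, whereas the positive ratio at the 0.7 cutoff shifts it noticeably upward.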