Zuoquan Zhong, Shiming Sun, Jingfan Weng, Hanlin Zhang, Hui Lin, Jing Sun, Miaohong Pan, Hangyuan Guo, Jufang Chi.
Abstract
Background: In recent years, the prevalence of type 2 diabetes mellitus (T2DM) has increased annually. The major complication of T2DM is cardiovascular disease (CVD). CVD is the main cause of death in T2DM patients, particularly those with comorbid acute coronary syndrome (ACS). Although risk prediction models using multivariate logistic regression are available to assess the probability of new-onset ACS development in T2DM patients, none have been established using machine learning (ML).
Keywords: acute coronary syndrome; machine learning; nomogram; random forest; type 2 diabetes mellitus
Year: 2022 PMID: 36148336 PMCID: PMC9486471 DOI: 10.3389/fpubh.2022.947204
Source DB: PubMed Journal: Front Public Health ISSN: 2296-2565
Figure 1. Workflow diagram: The initial dataset was randomly split into a training dataset and a testing dataset in a 70:30 ratio. The different machine learning algorithms were trained using k-fold cross-validation (k = 5). ACS, acute coronary syndrome.
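
The workflow in Figure 1 can be sketched in a few lines of standard-library Python. This is only an illustration of a random 70:30 split followed by 5-fold cross-validation on the training portion; the function names `split_indices` and `k_fold`, the seed, and the sample count of 100 are assumptions made for the example, not details taken from the study.

```python
import random


def split_indices(n_samples, test_fraction=0.30, seed=0):
    """Shuffle sample indices and split them into (train, test) lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold(indices, k=5):
    """Yield (fit, validation) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]  # k roughly equal, disjoint folds
    for i, validation in enumerate(folds):
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield fit, validation


# Hypothetical dataset of 100 samples: 70 for training, 30 held out for testing.
train, test = split_indices(100)
folds = list(k_fold(train, k=5))
```

Each model would be fitted on `fit` and scored on `validation` five times, and the held-out `test` indices would be used only once, for the final comparison.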
Patient characteristics.
| Variable | Training dataset (n = 380) | Testing dataset (n = 141) | P-value |
|---|---|---|---|
| Sex, n (%) | | | |
| 1 = Male | 237 (62) | 80 (57) | 0.242 |
| 2 = Female | 143 (38) | 61 (43) | |
| Age, years | 64.3 ± 12.3 | 65.1 ± 12.0 | 0.54 |
| Smoking, n (%) | | | |
| 1 = YES | 135 (36) | 49 (35) | 0.869 |
| 0 = NO | 245 (64) | 92 (65) | |
| Drinking, n (%) | | | |
| 1 = YES | 135 (36) | 36 (26) | 0.031 |
| 0 = NO | 245 (64) | 105 (74) | |
| Respiratory rate, breaths/min | 18.8 ± 1.8 | 19.2 ± 5.7 | 0.231 |
| Heart rate, beats/min | 82.4 ± 14.0 | 80.3 ± 14.2 | 0.127 |
| SBP, mmHg | 138.6 ± 21.1 | 138.4 ± 18.2 | 0.932 |
| DBP, mmHg | 81.0 ± 12.1 | 79.5 ± 10.7 | 0.199 |
| Killip, n (%) | | | |
| 1 | 340 (89) | 127 (90) | 0.898 |
| 2 | 28 (7) | 9 (7) | |
| 3 | 9 (3) | 3 (2) | |
| 4 | 3 (1) | 2 (1) | |
| Hypertension, n (%) | | | |
| 1 = YES | 228 (60) | 76 (54) | 0.21 |
| 0 = NO | 152 (40) | 65 (46) | |
| Hyperlipidemia, n (%) | | | |
| 1 = YES | 153 (40) | 41 (29) | 0.019 |
| 0 = NO | 227 (60) | 100 (71) | |
| Family history of CVD, n (%) | | | |
| 1 = YES | 41 (11) | 9 (6) | 0.129 |
| 0 = NO | 339 (89) | 132 (94) | |
| AST, U/L | 49.0 ± 83.6 | 47.1 ± 83.2 | 0.818 |
| LDH, U/L | 268.7 ± 242.1 | 264.2 ± 239.2 | 0.849 |
| TBIL, μmol/L | 12.6 ± 7.4 | 11.9 ± 5.1 | 0.217 |
| Total protein, g/L | 65.2 ± 6.4 | 11.9 ± 15.1 | 0.371 |
| Albumin, g/L | 38.5 ± 4.8 | 38.6 ± 3.8 | 0.921 |
| Globulin, g/L | 26.7 ± 4.5 | 27.1 ± 4.0 | 0.324 |
| A/G | 1.5 ± 1.2 | 1.5 ± 0.3 | 0.408 |
| Urea, mmol/L | 6.0 ± 3.4 | 5.8 ± 2.4 | 0.555 |
| Creatinine, μmol/L | 78.2 ± 67.7 | 72.9 ± 32.3 | 0.367 |
| Uric acid, μmol/L | 322.2 ± 108.5 | 308.1 ± 96.3 | 0.179 |
| Total cholesterol, mmol/L | 4.5 ± 1.3 | 4.5 ± 1.1 | 0.513 |
| Triglyceride, mmol/L | 1.9 ± 2.0 | 1.5 ± 0.9 | 0.003 |
| HDL, mmol/L | 1.1 ± 0.3 | 1.1 ± 0.4 | 0.297 |
| LDL, mmol/L | 2.7 ± 0.9 | 2.7 ± 0.9 | 0.881 |
| Apo A1, g/L | 1.1 ± 0.2 | 1.1 ± 0.3 | 0.595 |
| Apo B, g/L | 1.0 ± 0.3 | 1.0 ± 0.3 | 0.415 |
| Apo B/Apo A1 | 0.9 ± 0.3 | 0.9 ± 0.3 | 0.328 |
| Fasting blood glucose, mmol/L | 9.8 ± 4.2 | 9.2 ± 3.0 | 0.108 |
| HBDH, U/L | 206.3 ± 210.9 | 210.3 ± 232.2 | 0.854 |
| CKMB, U/L | 26.2 ± 47.1 | 19.9 ± 26.2 | 0.054 |
| Homocysteine, μmol/L | 11.2 ± 5.2 | 12.7 ± 18.0 | 0.128 |
| C-reactive protein, mg/L | 14.7 ± 27.2 | 14.1 ± 28.5 | 0.836 |
| Neutrophils, ×10^12/L | 4.3 ± 2.4 | 4.0 ± 2.1 | 0.319 |
| Lymphocyte, ×10^12/L | 1.7 ± 0.8 | 1.7 ± 0.6 | 0.688 |
| Neutrophils/lymphocyte | 3.3 ± 3.8 | 2.9 ± 2.8 | 0.312 |
| HbA1c, % | 8.4 ± 2.2 | 8.5 ± 2.0 | 0.609 |
| TyG | 9.3 ± 0.8 | 9.1 ± 0.6 | 0.014 |
Normally distributed variables are presented as mean ± standard deviation (SD); categorical variables are presented as counts (percentages). Abbreviations: SBP, systolic blood pressure; DBP, diastolic blood pressure; CVD, cardiovascular disease; AST, aspartate aminotransferase; LDH, lactate dehydrogenase; TBIL, total bilirubin; A/G, albumin-globulin ratio; HDL, high-density lipoprotein; LDL, low-density lipoprotein; Apo A1, apolipoprotein A1; Apo B, apolipoprotein B; Apo B/Apo A1, apolipoprotein B-apolipoprotein A1 ratio; HBDH, α-hydroxybutyrate dehydrogenase; CKMB, creatine kinase MB; Neutrophils/lymphocyte, neutrophil-lymphocyte ratio; TyG, triglyceride-glucose index, ln [fasting TG (mg/dL) × FBG (mg/dL)/2].
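
The TyG definition in the footnote expects both inputs in mg/dL, whereas the table reports triglyceride and fasting blood glucose in mmol/L. The sketch below shows one way to apply the formula to mmol/L inputs. The function name `tyg_index` is hypothetical, and the conversion factors (88.57 for triglycerides, 18.0 for glucose) are the conventional approximations, assumed here rather than taken from the article.

```python
import math

# Conventional mmol/L -> mg/dL conversion factors (assumed, not from the article).
TG_MMOL_TO_MG_DL = 88.57
GLUCOSE_MMOL_TO_MG_DL = 18.0


def tyg_index(tg_mmol_l, fbg_mmol_l):
    """TyG = ln[fasting TG (mg/dL) x FBG (mg/dL) / 2], from inputs in mmol/L."""
    tg_mg_dl = tg_mmol_l * TG_MMOL_TO_MG_DL
    fbg_mg_dl = fbg_mmol_l * GLUCOSE_MMOL_TO_MG_DL
    return math.log(tg_mg_dl * fbg_mg_dl / 2)


# Illustrative input: triglyceride 1.9 mmol/L and fasting glucose 9.8 mmol/L.
example = tyg_index(1.9, 9.8)
```

Because the logarithm is nonlinear, applying the formula to group means does not reproduce the mean TyG reported in the table; the index is meant to be computed per patient.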
Multivariate logistic regression analysis.
| Variable | Coefficient | OR (95% CI) | P-value |
|---|---|---|---|
| (Intercept) | −7.689 | | <0.0001 |
| Family history of CVD | 2.116 | 8.302 (3.566–19.326) | <0.0001 |
| | | | |
| 0 = NO | Reference | | |
| 1 = YES | 0.598 | 1.819 (0.994–3.327) | 0.0523 |
| | | | |
| 0 = NO | Reference | | |
| 1 = YES | −1.170 | 0.310 (0.163–0.592) | 0.0004 |
| Age | 0.066 | 3.261 (2.075–5.127) | <0.0001 |
| Neutrophils | 0.175 | 1.488 (1.090–2.031) | 0.0122 |
| Killip | 1.686 | 157.060 (9.545–2,584.400) | 0.0004 |
| AST <40 | Reference | | |
| AST 40-200 | 2.147 | 8.557 (3.721–19.676) | <0.0001 |
| AST ≥ 200 | 3.862 | 47.548 (4.852–466.010) | 0.0009 |
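
In a logistic regression, the odds ratio for a one-unit change in a predictor is the exponential of its coefficient. The sketch below applies that relationship to three categorical rows of the table and compares the result with the reported odds ratio. The helper `odds_ratio` and the dictionary layout are illustrative assumptions. The continuous rows (Age, Neutrophils, Killip) are left out because their reported odds ratios may be expressed per a different unit change.

```python
import math


def odds_ratio(coefficient):
    """Odds ratio for a one-unit increase in a logistic regression predictor."""
    return math.exp(coefficient)


# (coefficient, reported odds ratio) pairs copied from the table above.
reported = {
    "Family history of CVD": (2.116, 8.302),
    "AST 40-200": (2.147, 8.557),
    "AST >= 200": (3.862, 47.548),
}

# Recompute each odds ratio from its coefficient.
recomputed = {name: odds_ratio(beta) for name, (beta, _) in reported.items()}

# Relative gap between recomputed and reported values (rounding of the
# published coefficients is expected to leave a small difference).
relative_gap = {
    name: abs(recomputed[name] - reported[name][1]) / reported[name][1]
    for name in reported
}
```

The recomputed values agree with the reported ones to within the rounding of the published coefficients.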
Figure 2. Nomogram developed to predict new-onset ACS in T2DM patients. ACS, acute coronary syndrome; CVD, cardiovascular disease; AST, aspartate aminotransferase.
Figure 3. ROC curves for the training set (A) and the testing set (B) using different machine learning algorithms. The legend gives the area under the receiver operating characteristic curve for each algorithm with 95% confidence intervals. LR, logistic regression; LASSO, least absolute shrinkage and selection operator; KNN, K-nearest neighbor; SVM, support vector machine; XGBoost, extreme gradient boosting; ANN, artificial neural networks.
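
The area under the ROC curve reported in the legend can be read as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. The sketch below computes it directly from that definition. The function name `roc_auc` and the toy labels and scores are assumptions for illustration, and confidence intervals are not computed.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    correct = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return correct / (len(positives) * len(negatives))


# Toy example: 1 = ACS, 0 = no ACS, with hypothetical predicted probabilities.
toy_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise form is quadratic in the number of cases, which is acceptable for a cohort of a few hundred patients; a rank-based formulation would scale better.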
Comparison of the performance of the different machine learning models.
| Model | Training dataset | | | | | Testing dataset | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| LR | 0.86 | 0.95 | 0.77 | 0.90 | 0.83 | 0.80 | 0.86 | 0.70 | 0.76 | 0.73 |
| LR with LASSO | 0.78 | 0.86 | 0.59 | 0.85 | 0.69 | 0.74 | 0.80 | 0.59 | 0.77 | 0.67 |
| KNN | 0.93 | | 0.84 | 0.99 | 0.91 | 0.80 | 0.88 | 0.64 | 0.89 | 0.74 |
| SVM linear | 0.89 | 0.92 | 0.83 | 0.89 | 0.86 | 0.82 | 0.90 | 0.75 | 0.83 | 0.78 |
| SVM radial | | | 0.98 | | | 0.83 | 0.92 | 0.76 | 0.84 | 0.80 |
| Decision tree | 0.87 | 0.92 | 0.80 | 0.88 | 0.84 | 0.82 | 0.88 | 0.75 | 0.83 | 0.78 |
| Random forest | 0.87 | 0.86 | 0.91 | 0.77 | 0.84 | | | | | |
| Extreme gradient boosting | 0.97 | | | 0.99 | 0.97 | 0.88 | | 0.81 | | 0.86 |
| Neural network | 0.76 | 0.71 | 0.49 | 0.89 | 0.63 | 0.68 | 0.70 | 0.38 | 0.80 | 0.52 |
The model with best performance is given in bold. LR, logistic regression; KNN, K-nearest neighbor; SVM, support vector machine.