Pablo Juan-Salvadores1,2, Cesar Veiga2, Víctor Alfonso Jiménez Díaz1,2,3, Alba Guitián González4, Cristina Iglesia Carreño4, Cristina Martínez Reglero5, José Antonio Baz Alonso2,3, Francisco Caamaño Isorna6,7, Andrés Iñiguez Romo2,4.
Abstract
Coronary artery disease is a chronic disease that is more prevalent in the elderly. However, several studies have shown a rising incidence in young subjects over recent decades. Predicting major adverse cardiac events (MACE) in very young patients has a significant impact on medical decision-making after coronary angiography and on the selection of treatment. Different approaches have been developed to identify patients at higher risk of adverse outcomes once their coronary anatomy is known. This is a prognostic study of combined data from patients ≤40 years old undergoing coronary angiography (n = 492). We evaluated whether different machine learning (ML) approaches could predict MACE more effectively than a traditional statistical method, logistic regression (LR). For long-term follow-up (60 ± 27 months), the most effective model was random forest (RF), with an area under the curve (AUC) of 0.79 (95% CI 0.69-0.88), compared with 0.66 for LR (95% CI 0.53-0.78; p = 0.021). At 1-year follow-up, RF achieved an AUC of 0.80 (95% CI 0.71-0.89) vs. 0.50 for LR (95% CI 0.33-0.66; p < 0.001). These results support the hypothesis that ML methods can improve both the identification of patients at risk of MACE and prediction relative to traditional statistical techniques, even with a small sample size. Applying ML techniques to detect MACE risk in very young patients after coronary angiography could help tailor upfront follow-up strategies to each patient's risk and support appropriate allocation of health resources.
Keywords: acute coronary syndrome; coronary angiography; coronary artery disease; machine learning; major adverse cardiovascular events; prediction models; young patient
Year: 2022 PMID: 35204511 PMCID: PMC8870965 DOI: 10.3390/diagnostics12020422
Source DB: PubMed Journal: Diagnostics (Basel) ISSN: 2075-4418
Figure 1. Study flow chart.
Results of the K-fold cross-validation.
| LR2 | NB | LDA | RF | MLP | SVM | LR1 |
|---|---|---|---|---|---|---|
| 0.91 | 0.39 | 0.73 | 0.82 | 0.91 | 0.91 | 0.82 |
| 0.86 | 0.55 | 0.82 | 0.82 | 0.82 | 0.77 | 0.77 |
| 0.77 | 0.32 | 0.73 | 0.90 | 0.80 | 0.70 | 0.90 |
| 1.00 | 0.65 | 0.50 | 0.95 | 0.90 | 0.95 | 1.00 |
| 0.95 | 1.00 | 0.95 | 0.90 | 0.90 | 1.00 | 0.95 |
| 0.70 | 0.20 | 0.80 | 0.50 | 0.65 | 0.70 | 0.75 |
| 0.95 | 0.95 | 0.50 | 0.95 | 1.00 | 0.85 | 1.00 |
| 0.25 | 0.00 | 0.00 | 0.80 | 0.55 | 0.20 | 0.70 |
| 0.80 | 0.35 | 0.65 | 0.70 | 0.70 | 0.75 | 0.75 |
| 0.85 | 0.75 | 0.73 | 0.55 | 0.90 | 0.95 | 0.80 |
NB: Naive Bayes. LDA: Linear discriminant analysis. RF: Random forest. MLP: Multi-layer Perceptron. LR: Logistic regression using Lasso (LR1) and Ridge Regression (LR2). SVM: Support Vector Machine.
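The per-fold values in the cross-validation table are easier to compare once summarized per classifier. The following is a minimal sketch that computes the mean and sample standard deviation of each column. It assumes each row of the table is one fold; the source does not state which metric the values represent, and these summary statistics are not reported in the source. The names `fold_scores` and `summary` are illustrative only.

```python
from statistics import mean, stdev

# Column order as given in the table header.
classifiers = ["LR2", "NB", "LDA", "RF", "MLP", "SVM", "LR1"]

# One row per fold (assumption), values copied from the table.
fold_scores = [
    [0.91, 0.39, 0.73, 0.82, 0.91, 0.91, 0.82],
    [0.86, 0.55, 0.82, 0.82, 0.82, 0.77, 0.77],
    [0.77, 0.32, 0.73, 0.90, 0.80, 0.70, 0.90],
    [1.00, 0.65, 0.50, 0.95, 0.90, 0.95, 1.00],
    [0.95, 1.00, 0.95, 0.90, 0.90, 1.00, 0.95],
    [0.70, 0.20, 0.80, 0.50, 0.65, 0.70, 0.75],
    [0.95, 0.95, 0.50, 0.95, 1.00, 0.85, 1.00],
    [0.25, 0.00, 0.00, 0.80, 0.55, 0.20, 0.70],
    [0.80, 0.35, 0.65, 0.70, 0.70, 0.75, 0.75],
    [0.85, 0.75, 0.73, 0.55, 0.90, 0.95, 0.80],
]

# Transpose rows (folds) into per-classifier score lists.
per_classifier = {
    name: [row[i] for row in fold_scores]
    for i, name in enumerate(classifiers)
}

# Mean and sample standard deviation across folds for each classifier.
summary = {
    name: (mean(values), stdev(values))
    for name, values in per_classifier.items()
}

# Print classifiers from highest to lowest mean score.
for name, (m, s) in sorted(summary.items(), key=lambda kv: -kv[1][0]):
    print(f"{name:>4}  mean={m:.3f}  sd={s:.3f}")
```

The spread across folds is worth reading alongside the mean: with a small sample, a single unfavorable fold can pull a classifier's average down considerably.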
Clinical characteristics of patients ≤40 years undergoing coronary angiography.
| Variables | Overall (n = 492) |
|---|---|
| Women | 60 (12.2%) |
| Follow-up time (months) | 60 ± 27 |
| Body mass index | 28 ± 5 |
| Hypertension | 113 (23.0%) |
| Diabetes mellitus | 43 (8.7%) |
| Smoking | 381 (77.4%) |
| Dyslipidemia | 217 (44.1%) |
| Family history of CAD 1 | 132 (26.8%) |
| Previous revascularization | 62 (12.6%) |
| Cocaine | 52 (10.6%) |
| Alcohol abuse | 52 (10.5%) |
| Cannabis | 56 (11.4%) |
| Peripheral artery disease | 7 (1.4%) |
| Congestive heart failure | 3 (0.6%) |
| Previous stroke | 3 (0.6%) |
| Atrial fibrillation | 3 (0.6%) |
| Renal failure | 27 (5.5%) |
| Depression | 44 (8.9%) |
| Total cholesterol (mg/dL) | 194 ± 53 |
| LDL-cholesterol (mg/dL) | 124 ± 48 |
| HDL-cholesterol (mg/dL) | 39 ± 11 |
| Triglycerides (mg/dL) | 162 ± 114 |
| Creatinine (mg/dL) | 1.28 ± 1.8 |
| Glucose (mg/dL) | 107 ± 44 |
| LVEF 2 (%) | 55 ± 9 |
| Hospitalization days | 6 ± 7 |
1 CAD, coronary artery disease; 2 LVEF, left ventricular ejection fraction.
Prediction Ability of the Reference Model (LR2, logistic regression with ridge regularization) and five Machine Learning Models, measured in terms of AUC-ROC, Sensitivity, Specificity, Accuracy, and Precision values to predict MACE, using 8-fold cross-validation of 123 patients from the testing dataset.
| CLS | AUC | p-value | Sensitivity (95% CI) | Specificity (95% CI) | Accuracy (95% CI) | Precision (95% CI) |
|---|---|---|---|---|---|---|
| Prediction Ability Long-Term Follow-Up | | | | | | |
| LR2 | 0.66 | --- | 0.59 | 0.79 | 0.74 | 0.46 |
| NB | 0.73 | 0.193 | 0.97 | 0.03 | 0.25 | 0.23 |
| LDA | 0.62 | 0.167 | 0.55 | 0.74 | 0.70 | 0.40 |
| RF | 0.79 | 0.021 | 0.69 | 0.70 | 0.70 | 0.42 |
| MLP | 0.63 | 0.143 | 0.48 | 0.82 | 0.74 | 0.45 |
| SVM | 0.64 | 0.689 | 0.38 | 0.85 | 0.74 | 0.44 |
| LR1 | 0.68 | 0.009 | 0.59 | 0.80 | 0.74 | 0.47 |
CLS, classifiers; AUC, Area Under the Curve; CI, confidence interval; LR, Logistic Regression; NB, Naive Bayes; LDA, Linear Discriminant Analysis; RF, Random Forest; MLP, Multi-layer Perceptron; SVM, Support Vector Machine.
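The four threshold-based metrics in the table all derive from one confusion matrix, which is why high sensitivity can coexist with low precision when positives are a minority. The sketch below shows the relationships. The counts (17 true positives, 12 false negatives, 74 true negatives, 20 false positives; 123 patients in total) are hypothetical: they are chosen only because they reproduce the rounded values of the reference-model row for long-term follow-up (sensitivity 0.59, specificity 0.79, accuracy 0.74, precision 0.46) and are not reported in the source.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity, accuracy and precision from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),            # share of true MACE cases detected
        "specificity": tn / (tn + fp),            # share of non-MACE cases correctly cleared
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "precision": tp / (tp + fp),              # share of positive predictions that are correct
    }


# Hypothetical counts (not from the source), summing to 123 patients.
example = classification_metrics(tp=17, fn=12, tn=74, fp=20)
rounded = {name: round(value, 2) for name, value in example.items()}

for name, value in rounded.items():
    print(f"{name}: {value:.2f}")
```

With roughly one positive case for every three negatives, a specificity near 0.8 still produces more false positives than true positives, which keeps precision below 0.5.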
Figure 2. (a) Areas under the receiver operating characteristic (ROC) and (b) precision/recall (PR) curves for machine-learning models. AUC, Area Under the Curve; LDA, Linear Discriminant Analysis; MLP, Multi-layer Perceptron; NB, Naive Bayes; RF, Random Forest; LR, Logistic Regression; SVM, Support Vector Machine.
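As a reminder of what the area under the ROC curve measures, the sketch below computes it directly as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, counting ties as one half. This is the standard rank-based definition, not necessarily how the authors computed their values. The labels and scores are hypothetical toy data, not patient data.

```python
def auc_from_scores(labels, scores):
    """AUC as P(score of a positive > score of a negative), ties counted as 0.5."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = event occurred, 0 = no event.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.5, 0.3, 0.2, 0.1]

toy_auc = auc_from_scores(labels, scores)
print(f"AUC = {toy_auc:.2f}")
```

Under this definition an AUC of 0.50 means the model ranks positives above negatives no better than chance, which is how the 1-year LR result in the abstract should be read.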
Prediction Ability of the Reference Model (LR2, logistic regression with ridge regularization) and five Machine Learning Models, measured in terms of AUC-ROC, Sensitivity, Specificity, Accuracy and Precision values to predict MACE, using 8-fold cross-validation of 123 patients from the testing dataset.
| CLS | AUC | p-value | Sensitivity (95% CI) | Specificity (95% CI) | Accuracy (95% CI) | Precision (95% CI) |
|---|---|---|---|---|---|---|
| Prediction Ability One-Year Follow-Up | | | | | | |
| LR2 | 0.50 | --- | 0.25 | 0.81 | 0.74 | 0.17 |
| NB | 0.47 | 0.741 | 0.75 | 0.06 | 0.15 | 0.11 |
| LDA | 0.49 | 0.970 | 0.37 | 0.78 | 0.73 | 0.21 |
| RF | 0.80 | <0.001 | 0.75 | 0.72 | 0.72 | 0.29 |
| MLP | 0.56 | 0.159 | 0.25 | 0.84 | 0.76 | 0.19 |
| SVM | 0.45 | 0.271 | 0.06 | 0.87 | 0.76 | 0.07 |
| LR1 | 0.61 | 0.066 | 0.56 | 0.52 | 0.53 | 0.15 |
CLS, classifiers; AUC, Area Under the Curve; CI, confidence interval; LR, Logistic Regression; NB, Naive Bayes; LDA, Linear Discriminant Analysis; RF, Random Forest; MLP, Multi-layer Perceptron; SVM, Support Vector Machine.
Figure 3. (a) Relevance of the 15 most important clinical variables extracted from the Random Forest classifier to long-term follow-up. (b) SHAP analysis for those 15 most important clinical variables.