Ioannis A Kakadiaris, Michalis Vrigkas, Albert A Yen, Tatiana Kuznetsova, Matthew Budoff, Morteza Naghavi.
Abstract
Background: Studies have demonstrated that the current US guidelines, based on the American College of Cardiology/American Heart Association (ACC/AHA) Pooled Cohort Equations Risk Calculator, may underestimate the risk of atherosclerotic cardiovascular disease (CVD) in certain high-risk individuals, thereby missing opportunities for intensive therapy and for preventing CVD events. Similarly, the guidelines may overestimate risk in low-risk populations, resulting in unnecessary statin therapy. We used Machine Learning (ML) to tackle this problem.

Methods and Results: We developed an ML Risk Calculator based on Support Vector Machines (SVMs) using a 13-year follow-up data set from MESA (the Multi-Ethnic Study of Atherosclerosis) of 6459 participants who were free of atherosclerotic CVD at baseline. We provided identical input to both risk calculators and compared their performance. We then used the FLEMENGHO study (the Flemish Study of Environment, Genes and Health Outcomes) to validate the model in an external cohort. The ACC/AHA Risk Calculator, based on a 7.5% 10-year risk threshold, recommended statin therapy for 46.0% of participants. Despite this high proportion, 23.8% of the 480 "Hard CVD" events occurred in those not recommended a statin, resulting in sensitivity 0.76, specificity 0.56, and AUC 0.71. In contrast, the ML Risk Calculator recommended statin therapy for only 11.4% of participants, and only 14.4% of "Hard CVD" events occurred in those not recommended a statin, resulting in sensitivity 0.86, specificity 0.95, and AUC 0.92. Similar results were found for prediction of "All CVD" events.

Conclusions: The ML Risk Calculator outperformed the ACC/AHA Risk Calculator by recommending less drug therapy, yet missing fewer events. Additional studies are underway to validate the ML model in other cohorts and to explore its ability in short-term CVD risk prediction.
Keywords: Artificial intelligence; Machine learning; atherosclerosis; cardiovascular disease prevention; cardiovascular disease risk factors; cardiovascular risk; clinical decision support; prediction statistics; statin
Year: 2018 PMID: 30571498 PMCID: PMC6404456 DOI: 10.1161/JAHA.118.009476
Source DB: PubMed Journal: J Am Heart Assoc ISSN: 2047-9980 Impact factor: 5.501
Figure 1. Overview of ML approach. For each ML model, we divided the study population 50/50 into training and prediction subset cohorts. Next, we augmented the training subset using NEATER and trained the SVM prediction model. During prediction, each sample in the prediction cohort was analyzed and classified. Then, the cohorts switched places (ie, prediction becomes training, and vice versa) and the process was repeated. The overall iterative process was repeated 10 times for each ML model, and the results were averaged. CVD indicates cardiovascular disease; FLEMENGHO study, the Flemish Study of Environment, Genes and Health Outcomes; HDL, high‐density lipoprotein; MESA, the Multi‐Ethnic Study of Atherosclerosis; NEATER, a method for the filtering of oversampled data using non‐cooperative game theory; SVM, Support Vector Machine.
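The split-and-swap procedure described in the Figure 1 caption can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' implementation: `train`, `predict`, `score`, and `augment` are hypothetical callables, the NEATER step is represented only by an identity placeholder, and a one-feature threshold rule stands in for the SVM so that the example needs nothing beyond the standard library.

```python
import random
from statistics import mean


def repeated_swap_validation(samples, labels, train, predict, score,
                             augment=lambda X, y: (X, y), repeats=10, seed=0):
    """Repeat a 50/50 split with role swapping and average all fold scores."""
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    scores = []
    for _ in range(repeats):
        rng.shuffle(indices)
        half = len(indices) // 2
        folds = (indices[:half], indices[half:])
        # Each half serves once as the training cohort and once as the
        # prediction cohort (the "cohorts switch places" step).
        for train_idx, test_idx in (folds, folds[::-1]):
            X_train = [samples[i] for i in train_idx]
            y_train = [labels[i] for i in train_idx]
            # Placeholder for the NEATER augmentation (identity by default).
            X_train, y_train = augment(X_train, y_train)
            model = train(X_train, y_train)
            predicted = [predict(model, samples[i]) for i in test_idx]
            actual = [labels[i] for i in test_idx]
            scores.append(score(actual, predicted))
    return mean(scores)


# Hypothetical stand-ins for the classifier and the metric.
def train_threshold(X, y):
    """Learn a cut-off halfway between the two class means."""
    pos = [x for x, label in zip(X, y) if label == 1]
    neg = [x for x, label in zip(X, y) if label == 0]
    if not pos or not neg:  # degenerate split: fall back to the overall mean
        return mean(X)
    return (mean(pos) + mean(neg)) / 2


def predict_threshold(threshold, x):
    return 1 if x >= threshold else 0


def accuracy(actual, predicted):
    return mean([1 if a == p else 0 for a, p in zip(actual, predicted)])


# Toy data (hypothetical): one feature, two balanced classes.
samples = list(range(20))
labels = [0] * 10 + [1] * 10
result = repeated_swap_validation(samples, labels, train_threshold,
                                  predict_threshold, accuracy)
print(f"mean accuracy over 10 repeats x 2 folds: {result:.2f}")
```

Passing a real classifier and an oversampling routine through `train` and `augment` would keep the same control flow; only the placeholders change.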
Baseline Characteristics of Study Population and Subgroups of Interest
| | All (N=6459) | Hard CVD (n=480) | All CVD (n=976) | ACC/AHA <9.75% 13‐y risk (n=3487) | ACC/AHA ≥9.75% 13‐y risk (n=2972) | ML: Low Risk (13‐y) (n=5724) | ML: High Risk (13‐y) (n=735) |
|---|---|---|---|---|---|---|---|
| Age, y | 61.3±9.6 | 65.8±8.9 | 65.7±8.7 | 55.3±6.9 | 68.4±7.1 | 60.6±9.6 | 66.4±8.2 |
| Male, n% | 3060 (47.4%) | 282 (58.7%) | 590 (60.4%) | 1254 (36.0%) | 1806 (60.8%) | 2601 (45.4%) | 459 (62.4%) |
| Female, n% | 3399 (52.6%) | 198 (41.3%) | 386 (39.6%) | 2233 (64.0%) | 1166 (39.2%) | 3123 (54.6%) | 276 (37.6%) |
| Ethnicity, n% | |||||||
| White | 2484 (38.5%) | 187 (39.0%) | 413 (42.3%) | 1439 (41.2%) | 1045 (35.2%) | 2197 (38.4%) | 287 (39.0%) |
| Asian | 767 (11.9%) | 35 (7.3%) | 67 (6.9%) | 446 (12.8%) | 321 (10.8%) | 697 (12.2%) | 70 (9.5%) |
| Black | 1780 (27.5%) | 138 (28.7%) | 282 (28.9%) | 794 (22.8%) | 986 (33.2%) | 1573 (27.5%) | 207 (28.2%) |
| Hispanic | 1428 (22.1%) | 120 (25.0%) | 214 (21.9%) | 808 (23.2%) | 620 (20.8%) | 1257 (21.9%) | 171 (23.3%) |
| Total cholesterol, mg/dL | 194.4±35.8 | 194.6±34.2 | 193.2±37.3 | 194.9±34.9 | 193.8±36.8 | 194.7±36.6 | 192.3±30.9 |
| High‐density lipoprotein cholesterol, mg/dL | 50.9±14.8 | 47.8±13.9 | 48.1±13.6 | 52.6±15.0 | 48.8±14.3 | 51.5±15.0 | 46.5±12.3 |
| Systolic blood pressure, mm Hg | 125.9±21.1 | 136.3±22.2 | 134.5±21.7 | 116.9±16.4 | 136.6±21.1 | 124.7±20.9 | 135.8±20.0 |
| Hypertension, n% | 2351 (36.4%) | 243 (50.6%) | 510 (52.2%) | 735 (21.1%) | 1616 (54.4%) | 1974 (34.5%) | 377 (51.3%) |
| Diabetes mellitus, n% | 729 (11.3%) | 107 (22.3%) | 217 (22.2%) | 127 (3.6%) | 602 (20.3%) | 653 (11.4%) | 76 (10.3%) |
| Smoking, n% | |||||||
| Current smoking | 869 (13.5%) | 92 (19.2%) | 169 (17.3%) | 387 (11.1%) | 482 (16.2%) | 762 (13.3%) | 107 (14.6%) |
| Prior smoking | 2365 (36.6%) | 180 (37.5%) | 419 (42.9%) | 1192 (34.2%) | 1173 (39.5%) | 2073 (36.2%) | 292 (39.7%) |
| Never | 3225 (49.9%) | 208 (43.3%) | 388 (39.8%) | 1908 (54.7%) | 1317 (44.3%) | 2889 (50.5%) | 336 (45.7%) |
| Family history heart attack, n% | 2593 (40.1%) | 239 (49.8%) | 482 (49.4%) | 1364 (39.1%) | 1229 (41.3%) | 2231 (39.0%) | 362 (49.2%) |
| Coronary artery calcification, Agatston | 138.8±408.6 | 316.1±577.2 | 389.9±759.0 | 41.0±160.8 | 253.4±555.2 | 121.0±380.1 | 274.0±563.7 |
| hsCRP, mg/L | 3.8±5.8 | 4.3±5.9 | 4.5±6.8 | 3.6±5.2 | 4.0±6.4 | 3.8±5.8 | 3.7±5.6 |
Continuous variables are expressed as mean±SD. Categorical variables are presented as absolute numbers and frequencies.
The American College of Cardiology/American Heart Association (ACC/AHA) Risk Calculator does not use these variables; therefore, they were not included in the Machine Learning CVD predictive models.
High sensitivity C‐reactive protein (hsCRP) is also expressed as a geometric mean with 90% confidence interval since this variable is not normally distributed.
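For a right-skewed measurement such as hsCRP, the geometric mean is less affected by a few large values than the arithmetic mean. The short sketch below illustrates the difference; the values in `hscrp_mg_per_l` are hypothetical and are not taken from the study.

```python
import math
from statistics import geometric_mean, mean

# Hypothetical, right-skewed hsCRP values in mg/L (illustration only).
hscrp_mg_per_l = [0.4, 0.9, 1.8, 3.5, 12.0, 28.0]

arithmetic = mean(hscrp_mg_per_l)
geometric = geometric_mean(hscrp_mg_per_l)

# Equivalent definition: exponentiate the mean of the log-transformed values.
geometric_via_logs = math.exp(mean(math.log(v) for v in hscrp_mg_per_l))

print(f"arithmetic mean: {arithmetic:.2f} mg/L")
print(f"geometric mean:  {geometric:.2f} mg/L")
```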
Risk Calculator Comparison: Sensitivity‐Specificity‐Other Performance Metrics
| Event | Model | Sn (95% CI) | P | Sp (95% CI) | P | FN | FP | TP | TN | Acc (95% CI) | P | NRI (95% CI) | P |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Male | |||||||||||||
| Hard CVD | ACC/AHA Risk Calculator | 0.86±0.1 (0.81–0.90) | – | 0.44±0.1 (0.42–0.46) | – | 40 | 1564 | 242 | 1214 | 0.48±0.1 (0.46–0.49) | – | – | – |
| | ML Risk Calculator | 0.90±0.1 (0.86–0.94) | ≤0.001 | 0.93±0.1 (0.92–0.94) | ≤0.001 | 27 | 204 | 255 | 2574 | 0.92±0.1 (0.91–0.93) | ≤0.001 | 0.53 (0.51–0.55) | ≤0.001 |
| All CVD | ACC/AHA Risk Calculator | 0.84±0.1 (0.81–0.87) | – | 0.47±0.1 (0.45–0.49) | – | 96 | 1312 | 494 | 1158 | 0.54±0.1 (0.52–0.56) | – | – | – |
| | ML Risk Calculator | 0.97±0.1 (0.96–0.99) | ≤0.001 | 0.82±0.1 (0.80–0.84) | ≤0.001 | 15 | 443 | 575 | 2027 | 0.85±0.1 (0.84–0.86) | ≤0.001 | 0.48 (0.46–0.50) | ≤0.001 |
| Female | |||||||||||||
| Hard CVD | ACC/AHA Risk Calculator | 0.63±0.1 (0.56–0.69) | – | 0.67±0.1 (0.66–0.69) | – | 74 | 1042 | 124 | 2159 | 0.67±0.1 (0.66–0.69) | – | – | – |
| | ML Risk Calculator | 0.79±0.1 (0.72–0.84) | ≤0.001 | 0.96±0.1 (0.95–0.97) | ≤0.001 | 42 | 120 | 156 | 3081 | 0.95±0.1 (0.94–0.96) | ≤0.001 | 0.45 (0.43–0.47) | ≤0.001 |
| All CVD | ACC/AHA Risk Calculator | 0.62±0.1 (0.57–0.67) | – | 0.69±0.1 (0.68–0.71) | – | 146 | 926 | 240 | 2087 | 0.68±0.1 (0.67–0.70) | – | – | – |
| | ML Risk Calculator | 0.93±0.1 (0.90–0.95) | ≤0.001 | 0.92±0.1 (0.91–0.93) | ≤0.001 | 28 | 247 | 358 | 2766 | 0.92±0.1 (0.91–0.93) | ≤0.001 | 0.54 (0.52–0.55) | ≤0.001 |
| All | |||||||||||||
| Hard CVD | ACC/AHA Risk Calculator | 0.76±0.1 (0.72–0.80) | – | 0.56±0.1 (0.55–0.58) | – | 114 | 2606 | 366 | 3373 | 0.58±0.1 (0.57–0.59) | – | – | – |
| | ML Risk Calculator | 0.86±0.1 (0.82–0.89) | ≤0.001 | 0.95±0.1 (0.94–0.96) | ≤0.001 | 69 | 324 | 411 | 5655 | 0.94±0.1 (0.93–0.95) | ≤0.001 | 0.49 (0.48–0.50) | ≤0.001 |
| All CVD | ACC/AHA Risk Calculator | 0.75±0.1 (0.72–0.78) | – | 0.59±0.1 (0.56–0.61) | – | 242 | 2238 | 734 | 3245 | 0.62±0.1 (0.60–0.63) | – | – | – |
| | ML Risk Calculator | 0.96±0.1 (0.94–0.97) | ≤0.001 | 0.87±0.1 (0.86–0.88) | ≤0.001 | 43 | 690 | 933 | 4793 | 0.89±0.1 (0.88–0.89) | ≤0.001 | 0.50 (0.48–0.51) | ≤0.001 |
ACC/AHA indicates American College of Cardiology/American Heart Association; CI, confidence interval; CVD, cardiovascular disease; FN, false negatives; FP, false positives; ML, Machine Learning; NRI, net reclassification improvement; Sn, sensitivity; Sp, specificity; TN, true negatives; TP, true positives.
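The sensitivity, specificity, and accuracy columns follow directly from the FN, FP, TP, and TN counts. The sketch below recomputes them for the "All, Hard CVD" rows of the table. The `Confusion` class and `two_category_nri` function are illustrative names, and the NRI formula used here (change in sensitivity plus change in specificity) is the standard two-category form, assumed rather than confirmed to be the authors' exact computation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Confusion-matrix counts for one risk calculator."""
    fn: int
    fp: int
    tp: int
    tn: int

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.fn + self.fp + self.tp + self.tn)


def two_category_nri(old: Confusion, new: Confusion) -> float:
    """Two-category NRI: gain in sensitivity plus gain in specificity."""
    return ((new.sensitivity - old.sensitivity)
            + (new.specificity - old.specificity))


# Counts copied from the "All, Hard CVD" rows of the table above.
acc_aha = Confusion(fn=114, fp=2606, tp=366, tn=3373)
ml = Confusion(fn=69, fp=324, tp=411, tn=5655)

for name, c in (("ACC/AHA", acc_aha), ("ML", ml)):
    print(f"{name}: Sn={c.sensitivity:.2f} Sp={c.specificity:.2f} "
          f"Acc={c.accuracy:.2f}")
print(f"NRI (from exact counts): {two_category_nri(acc_aha, ml):.3f}")
```

Computed from the exact counts, this NRI is about 0.475, slightly below the tabulated 0.49. The tabulated value matches the same formula applied to the rounded Sn and Sp figures (0.10 + 0.39).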
Figure 2. Receiver operating characteristic (ROC) curves for prediction of (A) “Hard CVD” events and (B) “All CVD” events comparing the ML Risk Calculator (blue) with the American College of Cardiology/American Heart Association (ACC/AHA) Risk Calculator (red). AUC indicates area under the curve; CVD, cardiovascular disease; ML, Machine Learning.
Figure 3. Risk calculator comparison: statin eligibility and missed treatment opportunities. Pie graphs illustrate the performance comparison between ML Risk Calculator and ACC/AHA Risk Calculator for predicting “Hard CVD” events (left) and “All CVD” events (right). ACC/AHA indicates American College of Cardiology/American Heart Association; CBG, coronary bypass grafts; CHD, coronary heart disease; CHF, congestive heart failures; CVD, cardiovascular disease; MI, myocardial infarction; PTCA, percutaneous transluminal coronary angioplasties; PVD, peripheral vascular diseases; TIA, transient ischemic attacks.
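The two quantities summarized in Figure 3 can be derived from the same confusion-matrix counts, assuming that a high-risk classification corresponds to a statin recommendation (as the abstract describes). In this sketch, `treatment_summary` is an illustrative helper name, and the counts are the "All, Hard CVD" rows of the performance table.

```python
def treatment_summary(fn: int, fp: int, tp: int, tn: int) -> dict:
    """Share of participants recommended a statin and share of events missed."""
    total = fn + fp + tp + tn
    events = tp + fn
    return {
        # Everyone classified high risk (true or false positive) is eligible.
        "recommended_statin": (tp + fp) / total,
        # Events in participants classified low risk are missed opportunities.
        "events_missed": fn / events,
    }


acc_aha = treatment_summary(fn=114, fp=2606, tp=366, tn=3373)
ml = treatment_summary(fn=69, fp=324, tp=411, tn=5655)

for name, s in (("ACC/AHA", acc_aha), ("ML", ml)):
    print(f"{name}: statin recommended for {s['recommended_statin']:.1%}, "
          f"events missed {s['events_missed']:.1%}")
```

These proportions correspond to the 46.0% versus 11.4% statin recommendations and the 23.8% versus 14.4% missed "Hard CVD" events reported in the abstract.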
External Validation: Risk Calculator Comparison between Models Trained and Tested on “White Race” MESA Cohort and Models Trained on “White Race” MESA Cohort and Tested on “White Race” FLEMENGHO Cohort
| Cohort | Model | Sn (95% CI) | P | Sp (95% CI) | P | FN | FP | TP | TN | Acc (95% CI) | P | NRI (95% CI) | P |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Male | |||||||||||||
| Train and test on White Race MESA | ACC/AHA Risk Calculator | 0.85±0.1 (0.77–0.91) | – | 0.45±0.1 (0.42–0.48) | – | 16 | 602 | 91 | 488 | 0.48±0.1 (0.46–0.51) | – | – | – |
| | ML Risk Calculator | 0.90±0.1 (0.82–0.95) | ≤0.001 | 0.98±0.1 (0.97–0.99) | ≤0.001 | 11 | 22 | 96 | 1068 | 0.97±0.1 (0.96–0.98) | 0.002 | 0.58 (0.55–0.60) | ≤0.001 |
| Train on White Race MESA and test on FLEMENGHO | ACC/AHA Risk Calculator | 0.74±0.1 (0.66–0.80) | – | 0.55±0.1 (0.5–0.59) | – | 41 | 234 | 114 | 283 | 0.59±0.1 (0.55–0.63) | – | – | – |
| | ML Risk Calculator | 0.86±0.1 (0.80–0.91) | ≤0.001 | 0.82±0.1 (0.79–0.85) | ≤0.001 | 21 | 92 | 134 | 425 | 0.83±0.1 (0.80–0.86) | 0.001 | 0.39 (0.36–0.43) | 0.004 |
| Female | |||||||||||||
| Train and test on White Race MESA | ACC/AHA Risk Calculator | 0.58±0.1 (0.49–0.68) | – | 0.72±0.1 (0.70–0.75) | – | 34 | 336 | 46 | 871 | 0.71±0.1 (0.69–0.74) | – | – | – |
| | ML Risk Calculator | 0.77±0.1 (0.65–0.85) | ≤0.001 | 0.94±0.1 (0.93–0.96) | ≤0.001 | 19 | 68 | 61 | 1139 | 0.93±0.1 (0.92–0.95) | 0.002 | 0.41 (0.40–0.46) | ≤0.001 |
| Train on White Race MESA and test on FLEMENGHO | ACC/AHA Risk Calculator | 0.48±0.1 (0.39–0.57) | – | 0.82±0.1 (0.78–0.85) | – | 57 | 103 | 53 | 463 | 0.76±0.1 (0.73–0.79) | – | – | – |
| | ML Risk Calculator | 0.56±0.1 (0.47–0.66) | ≤0.001 | 0.91±0.1 (0.88–0.93) | ≤0.001 | 48 | 52 | 62 | 514 | 0.85±0.1 (0.82–0.88) | 0.002 | 0.17 (0.14–0.20) | 0.004 |
| All | |||||||||||||
| Train and test on White Race MESA | ACC/AHA Risk Calculator | 0.73±0.1 (0.66–0.79) | – | 0.59±0.1 (0.57–0.61) | – | 50 | 938 | 137 | 1359 | 0.60±0.1 (0.58–0.62) | – | – | – |
| | ML Risk Calculator | 0.84±0.1 (0.78–0.89) | ≤0.001 | 0.96±0.1 (0.95–0.97) | ≤0.001 | 30 | 90 | 157 | 2207 | 0.95±0.1 (0.94–0.96) | 0.001 | 0.48 (0.46–0.50) | ≤0.001 |
| Train on White Race MESA and test on FLEMENGHO | ACC/AHA Risk Calculator | 0.63±0.1 (0.57–0.69) | – | 0.69±0.1 (0.66–0.72) | – | 98 | 337 | 167 | 746 | 0.68±0.1 (0.65–0.70) | – | – | – |
| | ML Risk Calculator | 0.74±0.1 (0.68–0.79) | ≤0.001 | 0.87±0.1 (0.85–0.89) | ≤0.001 | 69 | 144 | 196 | 939 | 0.84±0.1 (0.82–0.86) | 0.001 | 0.29 (0.27–0.31) | 0.004 |
ACC/AHA, American College of Cardiology/American Heart Association; CI, confidence interval; FLEMENGHO, Flemish Study of Environment, Genes and Health Outcomes; FN, false negatives; FP, false positives; MESA, Multi‐Ethnic Study of Atherosclerosis; NRI, net reclassification improvement; Sn, sensitivity; Sp, specificity; TN, true negatives; TP, true positives.