Yan Liu1,2, Haoxing Ren3, Hanna Fanous1, Xuming Dai4, Hope M Wolf5, Tyrone C Wade2, Cassandra J Ramm2, George A Stouffer2.
Abstract
Background: Coronary artery disease (CAD) costs healthcare billions of dollars annually and is the leading cause of death despite available noninvasive diagnostic tools. Objective: This study aims to examine the usefulness of machine learning in predicting hemodynamically significant CAD using routine demographics, clinical factors, and laboratory data.
Keywords: Artificial intelligence; Coronary artery disease; Health care efficiency; Machine learning
Year: 2022 PMID: 35720674 PMCID: PMC9204796 DOI: 10.1016/j.cvdhj.2022.02.002
Source DB: PubMed Journal: Cardiovasc Digit Health J ISSN: 2666-6936
Baseline patient characteristics
| Characteristic | Data (n = 185) |
|---|---|
| Age (years) ± SD | 72 ± 11 |
| Sex, n (%) | |
| Male | 119 (64%) |
| Female | 66 (36%) |
| CAD risk factors, n (%) | |
| BMI, mean ± SD | 32 ± 8 |
| DM | 126 (68%) |
| CKD | 121 (65%) |
| HTN | 179 (96%) |
| Stroke | 37 (20%) |
| RAS | 11 (5.9%) |
| PAD | 55 (30%) |
| sCAD | 123 (66%) |
| HFrEF | 41 (22%) |
| HFpEF | 52 (28%) |
BMI = body mass index; CAD = coronary artery disease; CKD = chronic kidney disease; DM = diabetes mellitus; HFpEF = heart failure with preserved ejection fraction; HFrEF = heart failure with reduced ejection fraction; HTN = hypertension; PAD = peripheral arterial disease; RAS = renal arterial stenosis; sCAD = suspected CAD (by calcium score/computed tomography chest without contrast).
Figure 1. Machine learning algorithm. Using a random forest model, we performed 10 runs of 5-fold stratified cross-validation for performance evaluation. For each run, the dataset was randomly divided into 5 equal folds, each with approximately the same class proportions. Five validation experiments were then performed, with each fold used in turn as the validation set and the remaining 4 folds as the training set. This process was repeated 10 times, resulting in 50 trained random forest models, and the average validation performance across all models was reported.
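The splitting scheme in this caption can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the authors' code: the function names, the round-robin assignment, the seed, and the toy labels are all assumptions made for the example, and the model-fitting step is left out.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_folds, rng):
    """Split sample indices into n_folds folds with similar class proportions."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's samples round-robin so every fold gets a similar share.
        for position, idx in enumerate(indices):
            folds[position % n_folds].append(idx)
    return folds


def repeated_stratified_cv(labels, n_folds=5, n_repeats=10, seed=0):
    """Yield (train_indices, validation_indices) for every fold of every run."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        folds = stratified_folds(labels, n_folds, rng)
        for k in range(n_folds):
            validation = sorted(folds[k])
            train = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
            yield train, validation


# Toy labels (hypothetical, not study data): 20 negatives and 10 positives.
toy_labels = [0] * 20 + [1] * 10
splits = list(repeated_stratified_cv(toy_labels))
print(len(splits))  # 10 runs x 5 folds = 50 train/validation pairs
```

In a full pipeline, one model would be trained on each `train` index set and scored on the matching `validation` set, and the 50 scores would then be averaged. Libraries such as scikit-learn offer an equivalent splitter (`RepeatedStratifiedKFold`); whether the authors used it is not stated here.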
Figure 2. Performance of random forest model in predicting hemodynamically significant coronary artery disease (CAD) by multiclass strategy. A: Receiver operating characteristic (ROC) curves for the multiclass strategy in predicting “No significant CAD” (left upper panel), “hemodynamically significant 1-vessel CAD” (right upper panel), “hemodynamically significant 2-vessel CAD” (left lower panel), and “hemodynamically significant 3-plus-vessel CAD” (right lower panel). B: The performance of the machine learning model was assessed by sensitivity, specificity, precision, and F1 score. AUC = area under the curve; Sdv = standard deviation.
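The four metrics named in panel B can be computed per class by treating one class as positive and all others as negative. The sketch below assumes that one-vs-rest reading; the function name, the integer class coding (0 = no significant CAD, 1 to 3 = number of vessels), and the toy labels are illustrative assumptions, not study data.

```python
def one_vs_rest_metrics(y_true, y_pred, positive_class):
    """Sensitivity, specificity, precision, and F1 for one class against the rest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive_class and p == positive_class)
    fn = sum(1 for t, p in pairs if t == positive_class and p != positive_class)
    fp = sum(1 for t, p in pairs if t != positive_class and p == positive_class)
    tn = sum(1 for t, p in pairs if t != positive_class and p != positive_class)

    def safe_div(numerator, denominator):
        # Return 0.0 instead of raising when a class never occurs or is never predicted.
        return numerator / denominator if denominator else 0.0

    sensitivity = safe_div(tp, tp + fn)  # also called recall
    specificity = safe_div(tn, tn + fp)
    precision = safe_div(tp, tp + fp)
    f1 = safe_div(2 * precision * sensitivity, precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical labels for illustration only.
toy_true = [0, 0, 1, 1, 2, 3, 3, 0]
toy_pred = [0, 1, 1, 1, 2, 3, 0, 0]
for cls in sorted(set(toy_true)):
    print(cls, one_vs_rest_metrics(toy_true, toy_pred, cls))
```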
Data point importance ranking from learned models in single-class and multiclass experiments
| Single-class experiments | | | Multiclass experiments | | |
|---|---|---|---|---|---|
| Data points | Importance mean | Importance Sdv | Data points | Importance mean | Importance Sdv |
| sCAD | 0.27 | 0.035 | Age | 0.18 | 0.023 |
| Age | 0.14 | 0.026 | HFrEF | 0.15 | 0.029 |
| BMI | 0.14 | 0.026 | Most recent eGFR | 0.14 | 0.025 |
| Most recent eGFR | 0.13 | 0.021 | sCAD | 0.13 | 0.023 |
| Most recent EF | 0.07 | 0.016 | BMI | 0.12 | 0.036 |
| PAD | 0.07 | 0.020 | Most recent EF | 0.09 | 0.014 |
| Previous cath | 0.07 | 0.022 | HFpEF | 0.07 | 0.040 |
| Anemia | 0.03 | 0.020 | PAD | 0.04 | 0.011 |
| Sex | 0.03 | 0.022 | Pulmonary edema | 0.04 | 0.015 |
| CKD | 0.02 | 0.019 | Previous cath | 0.03 | 0.008 |
| Stroke | 0.02 | 0.015 | Sex | 0.004 | 0.013 |
| Pulmonary edema | 0.003 | 0.009 | Anemia | 0.003 | 0.011 |
| RAS | 0.002 | 0.004 | CKD | 0.003 | 0.010 |
| DM | 0 | 0 | Stroke | 0.002 | 0.009 |
| HTN | 0 | 0 | HTN | 0 | 0 |
| HFrEF | 0 | 0 | DM | 0 | 0 |
| Most recent creatinine | 0 | 0 | Most recent creatinine | 0 | 0 |
| HFpEF | 0 | 0 | RAS | 0 | 0 |
BMI = body mass index; Cath = catheterization; CKD = chronic kidney disease; DM = diabetes mellitus; EF = ejection fraction; HFpEF = heart failure with preserved ejection fraction; HFrEF = heart failure with reduced ejection fraction; HTN = hypertension; PAD = peripheral arterial disease; RAS = renal arterial stenosis; sCAD = suspected coronary artery disease; Sdv = standard deviation.
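One plausible way to arrive at a table of this shape is to collect the feature-importance vector from each trained model, then take the mean and standard deviation per feature and sort by the mean. The sketch below assumes that procedure and uses the sample standard deviation; the function name and the importance values are hypothetical and do not reproduce the reported numbers.

```python
from statistics import mean, stdev


def rank_importances(per_model_importances):
    """Return (feature, mean, sdv) tuples sorted by mean importance, highest first."""
    features = list(per_model_importances[0].keys())
    summary = []
    for name in features:
        values = [model[name] for model in per_model_importances]
        # Sample standard deviation; the source does not say which variant was used.
        summary.append((name, mean(values), stdev(values)))
    return sorted(summary, key=lambda row: row[1], reverse=True)


# Hypothetical importances from three models (illustration only).
toy_importances = [
    {"Age": 0.5, "BMI": 0.3, "Sex": 0.2},
    {"Age": 0.4, "BMI": 0.4, "Sex": 0.2},
    {"Age": 0.6, "BMI": 0.2, "Sex": 0.2},
]
ranking = rank_importances(toy_importances)
for name, avg, sd in ranking:
    print(f"{name}: mean={avg:.2f}, sdv={sd:.3f}")
```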
Figure 3. Performance of random forest model in predicting hemodynamically significant coronary artery disease (CAD) by single-class strategy. A: Receiver operating characteristic (ROC) curves for the single-class strategy in predicting any hemodynamically significant CAD. B: The performance of the machine learning model was assessed by sensitivity, specificity, precision, and F1 score. AUC = area under the curve; Sdv = standard deviation.
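For the binary (single-class) setting, the AUC of a ROC curve equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it that way; the function name, labels, and scores are hypothetical, and it is not a description of how the figure was produced.

```python
def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of positive and negative scores."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = significant CAD) and predicted probabilities.
toy_y = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_y, toy_scores))  # 0.75
```

The pairwise loop is quadratic in the number of samples, which is fine for a cohort of this size; a rank-based formulation would scale better for large datasets.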