Chayakrit Krittanawong, Hafeez Ul Hassan Virk, Sripal Bangalore, Zhen Wang, Kipp W Johnson, Rachel Pinotti, HongJu Zhang, Scott Kaplin, Bharat Narasimhan, Takeshi Kitai, Usman Baber, Jonathan L Halperin, W H Wilson Tang.
Abstract
Several machine learning (ML) algorithms have been increasingly utilized for cardiovascular disease prediction. We aim to assess and summarize the overall predictive ability of ML algorithms in cardiovascular diseases. A comprehensive search strategy was designed and executed within the MEDLINE, Embase, and Scopus databases from database inception through March 15, 2019. The primary outcome was a composite of the predictive ability of ML algorithms for coronary artery disease, heart failure, stroke, and cardiac arrhythmias. Of 344 total studies identified, 103 cohorts, with a total of 3,377,318 individuals, met our inclusion criteria. For the prediction of coronary artery disease, boosting algorithms had a pooled area under the curve (AUC) of 0.88 (95% CI 0.84-0.91), and custom-built algorithms had a pooled AUC of 0.93 (95% CI 0.85-0.97). For the prediction of stroke, support vector machine (SVM) algorithms had a pooled AUC of 0.92 (95% CI 0.81-0.97), boosting algorithms had a pooled AUC of 0.91 (95% CI 0.81-0.96), and convolutional neural network (CNN) algorithms had a pooled AUC of 0.90 (95% CI 0.83-0.95). There were too few studies per algorithm to apply meta-analytic methods to heart failure and cardiac arrhythmias, and the confidence intervals of the different methods overlap, showing no difference; even so, SVM may outperform other algorithms in these areas. The predictive ability of ML algorithms in cardiovascular diseases is promising, particularly for SVM and boosting algorithms. However, there is heterogeneity among ML algorithms in terms of multiple parameters. This information may assist clinicians in interpreting data and implementing optimal algorithms for their dataset.
Year: 2020 PMID: 32994452 PMCID: PMC7525515 DOI: 10.1038/s41598-020-72685-1
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Proposed quality assessment of ML research for clinical practice.
Clarity of algorithms; Propose new algorithms; Select the proper algorithms; Compare alternative algorithms
Reliable database/center; Number of databases/centers; Number of samples (patients/images); Type and diversity of data
Manuscript with sufficient supplementary information; Letter to editor, short article, abstract; Report baseline characteristics of patients
Comparison to expert clinicians; Comparison to validated clinical risk models
Assessment of outcome based on standard medical taxonomy; External validation cohort
Report both discrimination and calibration metrics; Report one or more of the following: sensitivity, specificity, positive, negative cases, balanced accuracy
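As one illustration of the last row, the sketch below computes sensitivity, specificity, and balanced accuracy from thresholded predictions, together with one discrimination metric (AUC, obtained by pairwise ranking) and one calibration-sensitive metric (the Brier score). The function name `summarize_predictions`, the 0.5 threshold, and the toy data in the usage note are assumptions made for illustration; they are not part of the proposed assessment.

```python
from typing import Dict, Sequence


def summarize_predictions(
    y_true: Sequence[int], y_prob: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Summarize a binary classifier (hypothetical helper, not from the paper).

    y_true holds 0/1 labels; y_prob holds predicted probabilities of class 1.
    """
    if len(y_true) != len(y_prob):
        raise ValueError("y_true and y_prob must have the same length")

    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    # Confusion-matrix counts at the chosen threshold.
    tp = sum(p >= threshold for p in pos)
    fn = len(pos) - tp
    tn = sum(p < threshold for p in neg)
    fp = len(neg) - tn

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)

    # AUC: probability that a random positive is scored above a random
    # negative, counting ties as one half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    auc = wins / (len(pos) * len(neg))

    # Brier score: mean squared difference between probability and outcome.
    brier = sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "auc": auc,
        "brier": brier,
    }
```

For example, `summarize_predictions([1, 1, 1, 1, 0, 0, 0, 0], [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1])` reports all of these metrics for a made-up set of eight patients, so discrimination and calibration-related figures can be read side by side.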
Figure 1. Study design. This flow chart illustrates the selection process for published reports.
Characteristics of the included studies.
| First author | Analytic model | Sample | Indication | Imaging | Comparison | Database |
|---|---|---|---|---|---|---|
| Alickovic et al. (2016) | RF | 47 | Arrhythmia detection | ECG | Five ECG signal patterns from MIT-BIH (normal (N), premature ventricular complex (PVC), atrial premature contraction (APC), right bundle branch block (RBBB), and left bundle branch block (LBBB)) and four ECG patterns from the St. Petersburg Institute of Cardiological Technics 12-lead Arrhythmia Database (N, APC, PVC, and RBBB) | St. Petersburg and MIT-BIH databases |
| Au-Yeung et al. (2018) | 5sRF, 10sRF, SVM | 788 | Ventricular arrhythmia | ICD data | | SCD-HeFT study |
| Hill et al. (2018) | Logistic-linear regression, SVM, RF | 2,994,837 | Development of AF/flutter in the general population | Clinical data | ML compared with conventional linear statistical methods | UK Clinical Practice Research Datalink (CPRD), 01-01-2006 to 31-12-2016 |
| Kotu et al. (2015) | k-NN, SVM, RF | 54 | Arrhythmic risk stratification of post-MI patients | Cardiac MRI | Low LVEF and scar versus textural features of scar | Single center |
| Ming-Zher Poh et al. (2018) | CNN | 149,048 | AF | ECG | | Several publicly accessible PPG repositories, including the MIMIC-III critical care database, the Vortal data set from healthy volunteers, and the IEEE-TBME PPG Respiratory Rate Benchmark data set |
| Xiaoyan Xu et al. (2018) | CNN | 25 | AF | ECG | MIT-BIH Atrial Fibrillation database | MIT-BIH Atrial Fibrillation Database |
| Araki et al. (2016) | SVM classifier with five different kernel sets | 15 | Plaque rupture prediction | IVUS | (a) 40 MHz catheter utilizing iMap (Boston Scientific, Marlborough, MA, USA) with 2,865 frames per patient (42,975 frames) and (b) linear probe B-mode carotid ultrasound (Toshiba scanner, Japan) | Single center |
| Araki et al. (2016) | SVM combined with PCA | 19 | Coronary risk assessment | IVUS | | Single center |
| Arsanjani et al. (2013) | boosting algorithm | 1,181 | Perfusion SPECT in CAD | Perfusion SPECT | 2 experts, combined supine/prone TPD | Single center |
| Baumann et al. (2017) | Custom-built algorithm | 258 | ctFFR in detecting relevant lesions | Invasive FFR determination of relevant lesions | | the MACHINE Registry |
| Coenen (2018) | Custom-built algorithm | 351 | Invasive FFR / Computational flow dynamics based (CFD-FFR) | CT angiography | Invasive FFR / Computational flow dynamics based (CFD-FFR) | 5 centers in Europe, Asia, and the United States |
| Dey et al. (2015) | boosting algorithm | 37 | Coronary CTA in ischemic heart disease patients to predict impaired myocardial flow reserve | CCTA | Clinical stenosis grading | Single center |
| Eisenberg et al. (2018) | boosting algorithm | 1925 | MPI in CAD | SPECT | Human visual analysis | The ReFiNE registry |
| Freiman et al. (2017) | Custom-built algorithm | 115 | CCTA in coronary artery stenosis | CCTA | Cardiac image analysis | The MICCAI 2012 challenge |
| Guner et al. (2010) | CNN | 243 | Stable CAD | Myocardial perfusion SPECT (MPS) | SPECT evaluation and human–computer interaction. One expert reader with 10 years of experience and six nuclear medicine residents with two to four years of experience in nuclear cardiology took part in the study | Single center |
| Hae et al. (2018) | Logistic-linear regression, SVM, RF, boosting algorithm | 1,132 | Prediction of FFR in stable and unstable angina patients | FFR, CCTA | | Single center |
| Han et al. (2017) | Logistic-linear regression | 252 | Physiologically significant CAD | CCTA and invasive fractional flow reserve (FFR) | | The DeFACTO study |
| Hu (Xiuhua) et al. (2018) | Custom-built algorithm | 105 | Intermediate coronary artery lesions | CCTA | CCTA-FFR vs Invasive angiography FFR | Single center |
| Hu et al. (2018) | Boosting algorithm | 1861 | MPI in CAD | SPECT | True early reperfusion | Multicenter REFINE SPECT registry |
| Wei et al. (2014) | Custom-built algorithm | 83 | Noncalcified plaques (NCPs) detection on CCTA | CCTA | | Single center |
| Kranthi et al. (2017) | Boosting algorithm | 85,945 | CCTA in CAD | CCTA | 66 available parameters (34 clinical parameters, 32 laboratory parameters) | Single center |
| Madan et al. (2013) | SVM | 407 | Urinary proteome in CAD | Global proteomic profile analysis of urinary proteome | | Indian Atherosclerosis Research Study |
| Zellweger et al. (2018) | Custom-built algorithm | 987 | CAD evaluation | N/A | Framingham scores | The Ludwigshafen Risk and Cardiovascular Health Study (LURIC) |
| Moshrik Abd alamir et al. (2018) | Custom-built algorithm | 923 | ED patients with chest pain: CTA analysis | CT angiography | | Single center |
| Nakajima et al. (2017) | CNN | 1,001 | Previous myocardial infarction and coronary revascularization | SPECT | Expert consensus interpretations | Japanese multicenter study |
| Song et al. (2014) | SVM | 208 | Risk prediction in ACS | N/A | | Single center |
| VanHouten et al. (2014) | Logistic-linear regression, RF | 20,078 | Risk prediction in ACS | N/A | | Single center |
| Xiao et al. (2018) | CNN | 15 | Ischemic ST change in ambulatory ECG | ECG | | Long-Term ST Database (LTST database) from PhysioNet |
| Yoneyama et al. (2017) | CNN | 59 | Detecting culprit coronary arteries | CCTA and myocardial perfusion SPECT | | Single center |
| Abouzari et al. (2009) | CNN | 300 | SDH post-surgery outcome prediction | CT head | | Single center |
| Alexander Roederer et al. (2014) | Logistic-linear regression | 81 | SAH-Vasospasm prediction | Passively obtained clinical data | | Single center |
| Arslan et al. (2016) | Logistic-linear regression, SVM, boosting algorithm | 80 | Ischemic stroke | EMR | | Single center |
| Atanassova et al. (2008) | CNN | 54 | Major stroke | Diastolic BP | 2 CNNs compared | Single center |
| Barriera et al. (2018) | CNN | 284 | Stroke (ICH and ischemic stroke) | CT head | Stroke neurologists reading CT | Single center |
| Beecy et al. (2017) | CNN | 114 | Stroke | CT head | Expert consensus interpretations | Single center |
| Dharmasaroja et al. (2013) | CNN | 194 | Stroke/intracranial hemorrhage | CT head | Thrombolysis after ischemic stroke | Single center |
| Fodeh et al. (2018) | SVM | 1834 | Atraumatic ICH | EHR review | | Single center |
| Gottrup et al. (2005) | kNN, Custom-built algorithm | 14 | Acute ischemic stroke | MRI | Applicability of highly flexible instance-based methods | Single center |
| Ho et al. (2016) | SVM, RF, and GBRT models | 105 | Acute ischemic stroke | MRI | Classification models for the problem of unknown time-since-stroke | Single center |
| Knight-Greenfield et al. (2018) | CNN | 114 | Stroke | CT head | Expert consensus interpretations | Single center |
| Ramos et al. (2018) | SVM, RF, Logistic-linear regression, CNN | 317 | SAH | CT Head | Delayed cerebral ischemia in SAH detection | Single center |
| Süt et al. (2012) | MLP neural networks | 584 | Stroke mortality | EMR data | Selected variables using univariate statistical analyses | N/A |
| Paula De Toledo et al. (2009) | Logistic-linear regression | 441 | SAH | CT Head | Algorithms used were C4.5, fast decision tree learner, partial decision trees, repeated incremental pruning to produce error reduction, nearest neighbor with generalization, and ripple down rule learner | Multicenter Register |
| Thorpe et al. (2018) | decision tree | 66 | Stroke | Transcranial Doppler | Velocity Curvature Index (VCI) vs Velocity Asymmetry Index (VAI) | Single center |
| Williamson et al. (2019) | Boosting algorithm, RF | 483 | Risk stratification in SAH | | True poor outcomes | Single center |
| Xie et al. (2019) | Boosting algorithm | 512 | Predict Patient Outcome in Acute Ischemic Stroke | CT Head and clinical parameters | Feature selections were performed using a greedy algorithm | Single center |
| Andjelkovic et al. (2014) | CNN | 193 | HF in congenital heart disease | Echocardiography | | Single center |
| Blecker et al. (2018) | Logistic-linear regression | 37,229 | ADHF | Early ID of patients at risk of readmission for ADHF | 4 algorithms tested | Single center |
| Gleeson et al. (2016) | Custom-built algorithm | 534 | HF | Echocardiography and ECG | Data mining was applied to discover novel ECG and echocardiographic markers of risk | Single center |
| Golas et al. (2018) | Logistic-linear regression, boosting algorithm, CNN | 11,510 | HF | EHR | Heart failure patients to predict 30-day readmissions | Several hospitals in the Partners Healthcare System |
| Mortazavi et al. (2016) | Random forests, boosting, combined algorithms or logistic regression | 1653 | HF | Surveys to hospital examinations | | Tele-HF trial |
| Frizzell et al | Random forest and gradient-boosted algorithms | 56,477 | HF | EHR | Traditional statistical methods | GWTG-HF registry |
| Kasper Rossing et al. (2016) | SVM | 33 | HFpEF | Urinary proteomic analysis | | Heart failure clinic (single center) |
| Kiljanek et al. (2009) | RF | 1587 | HF | Clinical diagnosis | Development of congestive heart failure after NSTEMI | CRUSADE registry |
| Liu et al. (2016) | Boosting algorithm, Logistic-linear regression | 526 | HF | Medical data, blood test, and echocardiographic imaging | Predicting mortality in HF | Single center |
SVM support vector machine, RF random forest, CNN convolutional neural network, kNN k-nearest neighbors, PCA principal component analysis, GBRT gradient boosted regression trees, MLP multilayer perceptron, EHR electronic health record, HF heart failure, HFpEF heart failure with preserved ejection fraction, ADHF acute decompensated heart failure, SAH subarachnoid hemorrhage, SDH subdural hematoma, ICH intracerebral hemorrhage, CAD coronary artery disease, ACS acute coronary syndrome, CCTA coronary computed tomography angiography, FFR fractional flow reserve, IVUS intravascular ultrasound, ICD implantable cardioverter-defibrillator, AF atrial fibrillation, ECG electrocardiogram.
Figure 2. ROC curves comparing different machine learning models for CAD prediction. The prediction in CAD was associated with a pooled AUC of 0.87 (95% CI 0.76–0.93) for CNN, a pooled AUC of 0.88 (95% CI 0.84–0.91) for boosting algorithms, and a pooled AUC of 0.93 (95% CI 0.85–0.97) for others (custom-built algorithms).
Figure 3. ROC curves comparing different machine learning models for stroke prediction. The prediction in stroke was associated with a pooled AUC of 0.90 (95% CI 0.83–0.95) for CNN, a pooled AUC of 0.92 (95% CI 0.81–0.97) for SVM algorithms, and a pooled AUC of 0.91 (95% CI 0.81–0.96) for boosting algorithms.
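The text here does not state how the pooled AUCs and confidence intervals in Figures 2 and 3 were computed. As a rough sketch of one common approach (an assumption, not necessarily the authors' method), per-study AUCs can be pooled on the logit scale with a DerSimonian-Laird random-effects model, and two pooled intervals can then be checked for overlap. The function names `pool_auc` and `intervals_overlap` are hypothetical, and standard errors are back-calculated from each study's reported 95% CI.

```python
import math
from typing import Sequence, Tuple

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def expit(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def pool_auc(studies: Sequence[Tuple[float, float, float]]) -> Tuple[float, float, float]:
    """Pool (auc, ci_low, ci_high) triples; return pooled (auc, ci_low, ci_high).

    Assumes every value lies strictly between 0 and 1.
    """
    if not studies:
        raise ValueError("need at least one study")

    # Work on the logit scale, where the intervals are roughly symmetric.
    y = [logit(auc) for auc, _, _ in studies]
    se = [(logit(hi) - logit(lo)) / (2 * Z95) for _, lo, hi in studies]

    # Fixed-effect weights and Cochran's Q.
    w = [1.0 / s ** 2 for s in se]
    sw = sum(w)
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))

    # DerSimonian-Laird estimate of between-study variance (floored at 0).
    k = len(studies)
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 and c > 0 else 0.0

    # Random-effects weights, pooled estimate, and its standard error.
    w_re = [1.0 / (s ** 2 + tau2) for s in se]
    y_re = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se_re = math.sqrt(1.0 / sum(w_re))

    return expit(y_re), expit(y_re - Z95 * se_re), expit(y_re + Z95 * se_re)


def intervals_overlap(a: Tuple[float, float], b: Tuple[float, float]) -> bool:
    """True when two closed intervals (low, high) share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]
```

Applied to the stroke intervals reported above, `intervals_overlap((0.81, 0.97), (0.81, 0.96))` returns `True` for SVM versus boosting, which matches the abstract's point that overlapping intervals do not establish a difference between methods. Overlap is only a coarse screen; it is not a formal test of equality.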