George Bazoukis1, Stavros Stavrakis2, Jiandong Zhou3,4, Sandeep Chandra Bollepalli5, Gary Tse6, Qingpeng Zhang3,4, Jagmeet P Singh7, Antonis A Armoundas8,9.
Abstract
Machine learning (ML) algorithms "learn" information directly from data, and their performance improves proportionally with the number of high-quality samples. The aim of our systematic review is to present the state of the art regarding the implementation of ML techniques in the management of heart failure (HF) patients. We manually searched MEDLINE and Cochrane databases as well as the reference lists of the relevant review studies and included studies. Our search retrieved 122 relevant studies. These studies mainly refer to (a) the role of ML in the classification of HF patients into distinct categories which may require a different treatment strategy, (b) discrimination of HF patients from the healthy population or other diseases, (c) prediction of HF outcomes, (d) identification of HF patients from electronic records and identification of HF patients with similar characteristics who may benefit from a similar treatment strategy, (e) supporting the extraction of important data from clinical notes, and (f) prediction of outcomes in HF populations with implantable devices (left ventricular assist device, cardiac resynchronization therapy). We concluded that ML techniques may play an important role in the efficient construction of methodologies for diagnosis, management, and prediction of outcomes in HF patients.
Keywords: Deep learning; Heart failure; Machine learning
Year: 2021 PMID: 32720083 PMCID: PMC7384870 DOI: 10.1007/s10741-020-10007-3
Source DB: PubMed Journal: Heart Fail Rev ISSN: 1382-4147 Impact factor: 4.214
Fig. 1 Areas of application of machine learning in the management of heart failure patients
Comparison of machine learning algorithms with traditional methods in the management of heart failure
| Author | Year | Outcome | Machine learning models | Conventional methods | Conclusion |
|---|---|---|---|---|---|
| **Classification of HF patients** | | | | | |
| Austin PC | 2013 | Discrimination of HFpEF vs HFrEF | AUC values: regression tree 0.683; bagged regression tree 0.733; random forest 0.751; boosted regression tree, depth 1 0.752, depth 2 0.768, depth 3 0.772, depth 4 0.774 | AUC value: LR 0.780 | Conventional LR performed at least as well as modern methods |
| **CRT response** | | | | | |
| Kalscheur MM | 2018 | All-cause mortality or HF hospitalization in CRT recipients | AUC values: RF 0.74 (95% CI 0.72–0.76); SVM trained with sequential minimal optimization 0.67 (95% CI 0.65–0.68) | AUC value: multivariate LR 0.67 (95% CI 0.65–0.69) | The improvement in AUC for the RF model was statistically significant compared with the other models |
| **Data extraction** | | | | | |
| Zhang R | 2018 | HF information (NYHA class) extraction from clinical notes | RF | LR | ML-based methods outperformed a rule-based method; the best ML method was an RF |
| **HF diagnosis** | | | | | |
| Nirschl JJ | 2018 | HF diagnosis using biopsy images | AUC values: RF 0.952; deep learning 0.974 | AUC value: pathologists 0.75 | ML models outperformed conventional methods |
| Rasmy L | 2018 | HF diagnosis | AUC value: recurrent NN 0.822 | AUC value: LR 0.766 | ML outperformed conventional methods |
| Son CS | 2012 | HF diagnosis | Rough-set-based decision-making model: accuracy 97.5%, SENS 97.2%, SPE 97.7%, PPV 97.2%, NPV 97.7%, AUC 97.5% | LR-based decision-making model: accuracy 88.7%, SENS 90.1%, SPE 87.5%, PPV 85.3%, NPV 91.7%, AUC 88.8% | ML models outperformed conventional methods |
| Wu J | 2010 | HF diagnosis | Boosting with a less strict cut-off performed better than SVM | The highest median AUC (0.77) was observed for LR with the Bayesian information criterion | LR and boosting were both superior to SVM |
| **Identification of HF patients** | | | | | |
| Blecker S | 2016 | Identification of HF patients | ML using notes and imaging reports: (development set) AUC 99%, SENS 92%, PPV 80%; (validation set) AUC 97%, SENS 84%, PPV 80% | LR using structured data: (development set) AUC 96%, SENS 78%, PPV 80%; (validation set) AUC 95%, SENS 76%, PPV 80% | ML models improved identification of HF patients |
| Blecker S | 2018 | Identification of HF hospitalization | ML using both data types: (development set) AUC 99%, SENS 98%, PPV 43%; (validation set) AUC 99%, SENS 98%, PPV 34% | LR using structured data, notes, and imaging reports: (development set) AUC 96%, SENS 98%, PPV 14%; (validation set) AUC 96%, SENS 98%, PPV 15% | ML models performed better in identifying decompensated HF |
| Choi E | 2017 | Predicting HF diagnosis from EHR | AUC values (12-month observation window): NN model 0.777; MLP with 1 hidden layer 0.765; SVM 0.743; K-NN 0.730 | AUC value (12-month observation window): LR 0.747 | ML models performed better in detecting incident HF with a short observation window of 12–18 months |
| **Prediction of outcomes** | | | | | |
| Austin PC | 2012 | 30-day mortality | AUC values: regression tree 0.674; bagged trees 0.713; random forests 0.752; boosted trees, depth 1 0.769, depth 2 0.788, depth 3 0.801, depth 4 0.811 | AUC value: LR 0.773 | Ensemble methods from the data mining and ML literature increase the predictive performance of regression trees but may not offer clear advantages over conventional LR models |
| Austin PC | 2010 | In-hospital mortality | AUC values: regression trees 0.620–0.651 | AUC values: LR 0.747–0.775 | LR predicted in-hospital mortality in patients hospitalized with HF more accurately than regression trees |
| Awan SE | 2019 | 30-day readmissions | AUC values: MLP 0.62; weighted random forest 0.55; weighted decision trees 0.53; weighted SVM 0.54 | AUC value: LR 0.58 | The proposed MLP-based approach was superior to the other ML and regression techniques |
| Fonarow GC | 2005 | In-hospital mortality | AUC values: CART model, derivation cohort 68.7%, validation cohort 66.8% | AUC values: LR model, derivation cohort 75.9%, validation cohort 75.7% | The accuracy of the CART model was modestly lower than that of the more complicated LR model |
| Frizzell JD | 2016 | 30-day readmissions | C-statistics: tree-augmented naive Bayesian network 0.618; RF 0.607; gradient boosting 0.614; least absolute shrinkage and selection operator (LASSO) models 0.618 | C-statistic: LR 0.624 | ML methods showed limited predictive ability |
| Golas SB | 2018 | 30-day readmissions | AUC values: gradient boosting 0.650 ± 0.011; maxout networks 0.695 ± 0.016; deep unified networks 0.705 ± 0.015 | AUC value: LR 0.664 ± 0.015 | Deep learning techniques performed better than the traditional techniques |
| Hearn J | 2018 | Clinical deterioration (need for mechanical circulatory support, listing for heart transplantation, or all-cause mortality) | AUC values: ppVo2 0.800 (0.753–0.838); staged LASSO 0.827 (0.785–0.867); staged NN 0.835 (0.795–0.880); BxB LASSO 0.816 (0.767–0.866); BxB NN 0.842 (0.794–0.882) | AUC value: CPET risk score 0.759 (0.709–0.799) | The NN incorporating breath-by-breath (BxB) data achieved the best performance |
| Kwon JM | 2019 | In-hospital mortality | AUC values: deep learning 0.913; RF 0.835 | AUC values: LR 0.835; MAGGIC score 0.806; GWTG score 0.783 | The echocardiography-based deep learning model predicted in-hospital mortality among HF patients more accurately than existing prediction models |
| Phillips KT | 2005 | Mortality | AUC values: nearest neighbor 0.823; NN 0.802; decision tree 0.4975 | AUC value: stepwise LR 0.734 | Data mining methods outperformed multiple logistic regression and traditional epidemiological methods |
| Mortazavi BJ | 2016 | HF readmissions | C-statistic: boosting 0.678 | C-statistic: LR 0.543 | Boosting improved the C-statistic by 24.9% over LR |
| Myers J | 2014 | Cardiovascular death | AUC values: artificial NN 0.72; Cox PH models 0.69 | AUC value: LR 0.70 | An artificial NN model slightly improves upon conventional methods |
| Panahiazar M | 2015 | 5-year mortality | AUC values: RF 62% (baseline set), 72% (extended set); decision tree 50% (baseline set), 50% (extended set); SVM 55% (baseline set), 38% (extended set); AdaBoost 61% (baseline set), 68% (extended set) | AUC values: LR 61% (baseline set), 73% (extended set) | LR and RF returned the most accurate models |
| Subramanian D | 2011 | 1-year mortality | Ensemble model using gentle boosting with 10-fold cross-validation: 84% | Multivariate LR model using time-series cytokine measurements: 81% | The ensemble model showed significantly better performance |
| Taslimitehrani V | 2016 | 5-year survival | CPXR (log): precision 0.721, recall 0.615, accuracy 0.809; SVM: precision 0.2, recall 0.5, accuracy 0.66 | LR: precision 0.513, recall 0.506, accuracy 0.717 | CPXR outperformed logistic regression, SVM, random forest, and AdaBoost |
| Turgeman L | 2016 | Hospital readmissions | AUC values: NN 0.589 (train), 0.639 (test); naïve Bayes 0.699 (train), 0.676 (test); SVM 0.768 (train), 0.643 (test); CART decision tree 0.529 (train), 0.556 (test); C5 ensemble 0.714 (train), 0.693 (test); CHAID decision tree 0.671 (train), 0.691 (test) | AUC values: LR 0.642 (train), 0.699 (test) | A mixed-ensemble model combining a boosted C5.0 base classifier with an SVM secondary classifier controlled classification error for the minority class |
| Wong W | 2003 | Mortality (365-day models) | AUC values: MLP 69%; radial basis function 67% | AUC value: LR 60% | NNs were able to outperform LR in terms of sample prediction |
| Yu S | 2015 | 30-day HF readmissions | AUC values: linear SVM 0.65; polynomial SVM 0.61; Cox PH 0.63 | AUC value: industry-standard method (LACE) 0.56 | The ML models performed better than the standard method |
| Zhang J | 2013 | Death or hospitalization | AUC value: decision trees 79.7% | AUC value: LR 73.8% | Decision trees tended to perform better than LR models |
| Zhu K | 2015 | 30-day readmissions | AUC values: RF 0.577; SVM 0.560; conditional LR 1 0.576; conditional LR 2 0.608; conditional LR 3 0.615 | AUC values: standard LR 0.547; stepwise LR 0.539 | LR combined with ML outperformed standard classification models |
| Zolfaghar K | 2013 | HF readmissions | AUC value: Multicare Health System model, RF 62.25% | AUC values: Multicare Health System model, LR 63.78%; Yale model, LR 59.72% | The ML random forest model did not outperform the traditional LR model |
AUC area under the receiver operating characteristic curve, BxB breath-by-breath, CART classification and regression tree, CPET cardiopulmonary exercise test, CRT cardiac resynchronization therapy, EHR electronic health record, HF heart failure, HFpEF heart failure with preserved ejection fraction, HFrEF heart failure with reduced ejection fraction, LR logistic regression, ML machine learning, MLP multilayer perceptron, NN neural network, NPV negative predictive value, NYHA New York Heart Association, PH proportional hazards, PPV positive predictive value, ppVo2 predicted peak oxygen uptake, RF random forest, SENS sensitivity, SPE specificity, SVM support vector machine
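Most of the head-to-head comparisons in the table above reduce to the same procedure: fit an ML model and a conventional LR model on the same cohort, then compare AUC on held-out data. The sketch below illustrates that procedure on synthetic data with scikit-learn; the dataset, features, and model settings are illustrative assumptions, not taken from any of the cited studies.

```python
# Minimal sketch of an LR-vs-RF AUC comparison on a synthetic "cohort".
# All data and hyperparameters here are hypothetical, for illustration only.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic cohort: 1,000 patients, 20 clinical features, binary outcome.
X, y = make_classification(n_samples=1000, n_features=20, n_informative=8,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

results = {}
for name, model in [("LR", LogisticRegression(max_iter=1000)),
                    ("RF", RandomForestClassifier(n_estimators=200,
                                                  random_state=0))]:
    model.fit(X_train, y_train)
    # AUC is computed from predicted probabilities, not hard class labels.
    probs = model.predict_proba(X_test)[:, 1]
    results[name] = roc_auc_score(y_test, probs)

print({k: round(v, 3) for k, v in results.items()})
```

As several of the tabulated studies found, which model "wins" on AUC depends heavily on the cohort and feature set, so a single split like this is only a starting point; the studies above typically add cross-validation and confidence intervals before declaring a difference significant.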