Subhi J Al'Aref1, Gurpreet Singh1, Alexander R van Rosendael1, Kranthi K Kolli1, Xiaoyue Ma1, Gabriel Maliakal1, Mohit Pandey1, Benjamin C Lee1, Jing Wang1, Zhuoran Xu1, Yiye Zhang2, James K Min1, S Chiu Wong3, Robert M Minutello3.
Abstract
Background: The ability to accurately predict the occurrence of in-hospital death after percutaneous coronary intervention is important for clinical decision-making. We sought to utilize the New York Percutaneous Coronary Intervention Reporting System to elucidate the determinants of in-hospital mortality in patients undergoing percutaneous coronary intervention across New York State.

Methods and Results: We examined 479 804 patients undergoing percutaneous coronary intervention between 2004 and 2012, utilizing traditional and advanced machine learning algorithms to determine the most significant predictors of in-hospital mortality. The entire data set was randomly split into a training set (80%) and a testing set (20%). Tuned hyperparameters were used to generate a trained model, and the performance of the model was independently evaluated on the testing set by plotting a receiver operating characteristic curve and reporting the area under the curve (AUC) with the associated 95% CIs. Mean age was 65.2±11.9 years and 68.5% were women. There were 2549 in-hospital deaths within the patient population. A boosted ensemble algorithm (AdaBoost) had optimal discrimination with AUC of 0.927 (95% CI 0.923-0.929) compared with AUC of 0.913 for XGBoost (95% CI 0.906-0.919, P=0.02), AUC of 0.892 for Random Forest (95% CI 0.889-0.896, P<0.01), and AUC of 0.908 for logistic regression (95% CI 0.907-0.910, P<0.01). The 2 most significant predictors were age and ejection fraction.

Conclusions: A big data approach that utilizes advanced machine learning algorithms identifies new associations among risk factors and provides high accuracy for the prediction of in-hospital mortality in patients undergoing percutaneous coronary intervention.
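The evaluation step described in the Methods (a random 80/20 split, then AUC with a 95% CI on the held-out 20%) can be sketched in plain Python. This is only an illustration under stated assumptions: the data are synthetic, the scores stand in for the output of a trained model, and the percentile bootstrap is one common way to obtain a CI, since the abstract does not say how the intervals were computed. The names `auc` and `bootstrap_auc_ci` are hypothetical helpers.

```python
import random


def auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Tied scores share the average of the ranks they span.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_auc_ci(labels, scores, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (an assumed method, not the paper's)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # a resample needs both classes for AUC to exist
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    return stats[int(alpha / 2 * n_boot)], stats[int((1 - alpha / 2) * n_boot) - 1]


# Synthetic stand-in data: 1 = in-hospital death, score = model output.
rng = random.Random(42)
labels = [1 if rng.random() < 0.1 else 0 for _ in range(1000)]
scores = [rng.gauss(1.0 if y else 0.0, 1.0) for y in labels]

# Random 80/20 split; a real pipeline would fit and tune on train_idx only.
indices = list(range(len(labels)))
rng.shuffle(indices)
cut = int(0.8 * len(indices))
train_idx, test_idx = indices[:cut], indices[cut:]

test_labels = [labels[i] for i in test_idx]
test_scores = [scores[i] for i in test_idx]
point = auc(test_labels, test_scores)
low, high = bootstrap_auc_ci(test_labels, test_scores)
print(f"test AUC {point:.3f} (95% CI {low:.3f}-{high:.3f})")
```

The synthetic event rate (about 10%) is far higher than the study's 2549 deaths in 479 804 patients; it is chosen only so that the small test set contains both classes.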
Keywords: big data analytics; in‐hospital mortality; machine learning; percutaneous coronary intervention
Year: 2019 PMID: 30834806 PMCID: PMC6474922 DOI: 10.1161/JAHA.118.011160
Source DB: PubMed Journal: J Am Heart Assoc ISSN: 2047-9980 Impact factor: 5.501
Baseline Characteristics of the Study Population
| Variable | All Patients (n=479 804) |
|---|---|
| Age (y), mean±SD | 65.2±11.9 |
| Male sex (%) | 151 349 (31.5%) |
| White ethnicity (%) | 385 984 (80.4%) |
| Ejection fraction, mean±SD | 50.6±14.5 |
| BMI (kg/m2), mean±SD | 29.4±5.9 |
| Median CCS class (IQR) | 3 [2, 4] |
| Previous PCI (%) | |
| 1 | 115 200 (24%) |
| 2 | 45 153 (9.4%) |
| 3 or more | 35 456 (7.4%) |
| History of cerebrovascular disease (%) | 39 434 (8.2%) |
| History of peripheral vascular disease (%) | 37 647 (7.8%) |
| History of heart failure (%) | 19 279 (4%) |
| History of malignant ventricular arrhythmia (%) | 2769 (0.6%) |
| History of COPD (%) | 74 423 (15.5%) |
| History of diabetes mellitus (%) | 161 771 (33.7%) |
| History of renal failure on dialysis (%) | 10 456 (2.2%) |
| History of previous CABG (%) | 79 075 (16.5%) |
| Hemodynamic instability (%) | 2363 (0.5%) |
| ST‐segment elevation on ECG | 49 084 (10.2%) |
BMI indicates body mass index; CABG, coronary artery bypass graft; CCS, Canadian Cardiovascular Society; COPD, chronic obstructive pulmonary disease; IQR, interquartile range (25th and 75th percentile); PCI, percutaneous coronary intervention.
Figure 1. Feature importance ranking. This figure lists the relative importance of clinical and angiographic variables in the machine learning–based model developed to predict in‐hospital mortality after percutaneous coronary intervention. The ranking is shown for the model with the highest area under the curve, AdaBoost; see Figure S4 for the feature importance ranking with SD across 5‐fold cross‐validation. BMI indicates body mass index; CABG, coronary artery bypass grafting; CCS, Canadian Cardiovascular Society; CVA, cerebrovascular accident; MI, myocardial infarction; PCI, percutaneous coronary intervention; RCA, right coronary artery.
Figure 2. Receiver operating characteristic curves. In this study, we trained 4 models: (1) AdaBoost, (2) XGBoost, (3) Logistic Regression, and (4) Random Forest. We performed 5‐fold cross‐validation on the data set for each model. The area under the curve for each model is reported as mean±SD. AdaBoost had the best performance for prediction of in‐hospital mortality after percutaneous coronary intervention.
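The mean±SD reporting in Figure 2 follows from scoring each of the 5 held-out folds separately and then summarizing the 5 AUC values. A minimal sketch of that bookkeeping is given below, with several assumptions: the data are synthetic, `fit_centroid_scorer` is a hypothetical toy stand-in for the actual models (it only learns the direction in which a single feature separates the classes), and the sample SD is used because the caption does not state which SD was reported.

```python
import random
import statistics


def auc(labels, scores):
    """AUC as the chance that a positive outranks a negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and deal the indices into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_centroid_scorer(xs, ys):
    """Toy model (assumption): orient the feature so higher means higher risk."""
    mean_pos = statistics.mean(x for x, y in zip(xs, ys) if y == 1)
    mean_neg = statistics.mean(x for x, y in zip(xs, ys) if y == 0)
    direction = 1.0 if mean_pos >= mean_neg else -1.0
    return lambda x: direction * x


def cross_validated_auc(xs, ys, k=5, seed=0):
    """Fit on k-1 folds, score the held-out fold, return mean and sample SD."""
    aucs = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        scorer = fit_centroid_scorer([xs[i] for i in train], [ys[i] for i in train])
        fold_scores = [scorer(xs[i]) for i in held_out]
        aucs.append(auc([ys[i] for i in held_out], fold_scores))
    return statistics.mean(aucs), statistics.stdev(aucs)


# Synthetic stand-in data with one informative feature.
rng = random.Random(1)
ys = [1 if rng.random() < 0.2 else 0 for _ in range(500)]
xs = [rng.gauss(1.5 * y, 1.0) for y in ys]

mean_auc, sd_auc = cross_validated_auc(xs, ys, k=5)
print(f"5-fold AUC: {mean_auc:.3f}±{sd_auc:.3f}")
```

Fitting inside the loop, on the training folds only, is the point of the design: the held-out fold never influences the scorer that is evaluated on it.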
Summary of the Brier Scores Evaluating the Calibration of the Machine Learning Models (AdaBoost, XGBoost, and Random Forest) as Well as That of Logistic Regression
| Model | Brier Score |
|---|---|
| AdaBoost | 0.159±0.031 |
| XGBoost | 0.494±0.091 |
| Random Forest | 0.084±0.001 |
| Logistic Regression | 0.173±0.045 |
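For readers unfamiliar with the metric in the table above, the Brier score is the mean squared difference between the predicted probability and the observed 0/1 outcome, so lower values indicate better calibration. The sketch below shows the computation on made-up predictions; `brier_score` and `constant_forecast_brier` are hypothetical helper names, and the only study figures used are the death and patient counts from the abstract, which give the score of a constant forecast equal to the overall event rate as a point of reference.

```python
def brier_score(outcomes, probabilities):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(outcomes) != len(probabilities):
        raise ValueError("outcomes and probabilities must have equal length")
    return sum((p - y) ** 2 for y, p in zip(outcomes, probabilities)) / len(outcomes)


def constant_forecast_brier(events, total):
    """Brier score of always predicting the event rate p, which equals p * (1 - p)."""
    p = events / total
    return p * (1 - p)


# Made-up predictions for four patients (1 = in-hospital death).
example = brier_score([1, 0, 0, 1], [0.9, 0.1, 0.2, 0.6])
print(f"example Brier score: {example:.3f}")

# Reference value from the counts in the abstract: 2549 deaths in 479 804 patients.
reference = constant_forecast_brier(2549, 479804)
print(f"constant-forecast reference: {reference:.4f}")
```

A perfect forecaster scores 0 and a forecaster that is always confidently wrong scores 1.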
Figure 3. Trend comparison between model outcome and clinically important features. Clinically important categorical and continuous features were plotted to understand their underlying trends in relation to the model outcome. A, Continuous variables are plotted in a joint scatter and regression plot, which shows the underlying trend between each variable and the model outcome. Normalized values for each variable are plotted on the x‐axis. B, Categorical variables are plotted using box plots. BMI indicates body mass index; CCS, Canadian Cardiovascular Society; PCI, percutaneous coronary intervention.