Sandeep Chandra Bollepalli1, Ashish Kumar Sahani2, Naved Aslam3, Bishav Mohan3, Kanchan Kulkarni1, Abhishek Goyal3, Bhupinder Singh3, Gurbhej Singh3, Ankit Mittal3, Rohit Tandon3, Shibba Takkar Chhabra3, Gurpreet S Wander3, Antonis A Armoundas1,4.
Abstract
Risk stratification at the time of hospital admission is of paramount significance in triaging patients and providing timely care. In the present study, we aim to predict multiple clinical outcomes from the data recorded during admission to a cardiac care unit using an optimized machine learning method. This study involves a total of 11,498 patients admitted to a cardiac care unit over two years. Patient demographics, admission type (emergency or outpatient), patient history, lab tests, and comorbidities were used to predict various outcomes. We employed a fully connected neural network architecture and optimized the models for various subsets of input features. Using 10-fold cross-validation, our optimized machine learning model predicted mortality with a mean area under the receiver operating characteristic curve (AUC) of 0.967 (95% confidence interval (CI): 0.963-0.972), heart failure with an AUC of 0.838 (CI: 0.825-0.851), ST-segment elevation myocardial infarction with an AUC of 0.832 (CI: 0.821-0.842), and pulmonary embolism with an AUC of 0.802 (CI: 0.764-0.840). It estimated the duration of stay (DOS) with a mean absolute error of 2.543 days (CI: 2.499-2.586) on data with a mean and median DOS of 6.35 and 5.0 days, respectively. Further, we objectively quantified the importance of each feature and its correlation with the clinical assessment of the corresponding outcome. The proposed method accurately predicts various cardiac outcomes and can be used as a clinical decision support system to provide timely care and optimize hospital resources.
Keywords: STEMI; duration of stay; heart failure; machine learning; mortality; pulmonary embolism
Year: 2022 PMID: 35204333 PMCID: PMC8871182 DOI: 10.3390/diagnostics12020241
Source DB: PubMed Journal: Diagnostics (Basel) ISSN: 2075-4418
Baseline patient characteristics.
| Total Subjects: 11,498 | Mean (Standard Deviation) or Proportion (%) | Median Value (Interquartile Range) | Missing Values (%) |
|---|---|---|---|
| Age (year) | 60.81 (13.47) | 62.00 (17) | 0.00 |
| Gender (male %) | 63.58 | | 0.00 |
| Locality (urban %) | 75.84 | | 0.00 |
| Admission type (emergency %) | 67.81 | | 0.00 |
| Duration of stay (days) | 6.35 (4.56) | 5.00 (5) | 0.00 |
| Mortality (expiry %) | 9.40 | | 0.00 |
| Smoking | 5.06 | | 0.00 |
| Alcohol | 6.77 | | 0.00 |
| Diabetes mellitus | 30.99 | | 0.00 |
| Hypertension | 47.70 | | 0.00 |
| Prior coronary artery disease | 66.69 | | 0.00 |
| Prior cardiomyopathy | 14.33 | | 0.00 |
| Chronic kidney disease | 8.66 | | 0.00 |
| Hemoglobin (g/dL) | 12.32 (2.31) | 12.50 (3.1) | 1.81 |
| Total lymphocyte count (K/uL) | 11.41 (7.08) | 10.00 (5.3) | 1.98 |
| Platelets (K/uL) | 238.38 (103.11) | 226.00 (116) | 2.04 |
| Glucose (mmol/L) | 160.47 (82.67) | 134.00 (88) | 5.28 |
| Urea (mg/dL) | 47.82 (40.57) | 34.00 (29) | 1.69 |
| Creatinine (mg/dL) | 1.30 (1.16) | 0.93 (0.6) | 1.76 |
| Brain natriuretic peptide (pg/mL) | 785.96 (988.89) | 432.00 (934) | 59.91 |
| Raised cardiac enzymes | 20.26 | | 0.00 |
| Ejection fraction | 44.13 (13.42) | 44.00 (28) | 10.51 |
| Severe anemia | 1.79 | | 0.00 |
| Anemia | 16.69 | | 0.00 |
| Stable angina | 9.08 | | 0.00 |
| Acute coronary syndrome | 37.16 | | 0.00 |
| ST-segment elevation myocardial infarction | 14.62 | | 0.00 |
| Atypical chest pain | 3.07 | | 0.00 |
| Heart failure (HF) | 26.75 | | 0.00 |
| HF with reduced ejection fraction | 14.19 | | 0.00 |
| HF with normal ejection fraction | 12.63 | | 0.00 |
| Valvular | 3.41 | | 0.00 |
| Complete heart block | 2.61 | | 0.00 |
| Sick sinus syndrome | 0.70 | | 0.00 |
| Acute kidney injury | 20.51 | | 0.00 |
| Cerebrovascular accident infarct | 2.83 | | 0.00 |
| Cerebrovascular accident bleed | 0.42 | | 0.00 |
| Atrial fibrillation | 4.87 | | 0.00 |
| Ventricular tachycardia | 3.13 | | 0.00 |
| Paroxysmal supraventricular tachycardia | 0.74 | | 0.00 |
| Congenital | 1.13 | | 0.00 |
| Urinary tract infection | 5.87 | | 0.00 |
| Neuro cardiogenic syncope | 0.97 | | 0.00 |
| Orthostatic | 0.82 | | 0.00 |
| Infective endocarditis | 0.16 | | 0.00 |
| Deep-vein thrombosis | 1.37 | | 0.00 |
| Cardiogenic shock | 6.78 | | 0.00 |
| Shock | 5.64 | | 0.00 |
| Pulmonary embolism | 1.46 | | 0.00 |
| Chest infection | 2.33 | | 0.00 |
Performance of the proposed method in terms of area under the receiver operating characteristic curve (AUC) for predicting mortality, heart failure, ST-segment elevation myocardial infarction (STEMI), and pulmonary embolism, and in terms of mean absolute error (MAE) for estimating the duration of stay, for various sets of input features. FS1 comprises all the features. Features with a cumulative importance of less than 1% are excluded from FS1 to form FS2. The most significant feature from FS2 is removed to form FS3. Similarly, FS4, FS5, FS6, and FS7 are formed by excluding the most significant feature from the corresponding supersets FS3, FS4, FS5, and FS6, respectively. Optimal performance (highlighted in bold) is obtained on feature set 2 (FS2) by excluding redundant features.
| Feature set | Mortality, AUC (95% CI) | Heart failure, AUC (95% CI) | STEMI, AUC (95% CI) | Pulmonary embolism, AUC (95% CI) | Duration of stay, MAE in days (95% CI) |
|---|---|---|---|---|---|
| FS1 | 0.955 (0.947–0.963) | 0.833 (0.819–0.846) | 0.832 (0.824–0.839) | 0.779 (0.733–0.826) | 2.561 (2.526–2.596) |
| FS2 | **0.967 (0.963–0.972)** | **0.838 (0.825–0.851)** | **0.832 (0.821–0.842)** | **0.802 (0.764–0.840)** | **2.543 (2.499–2.586)** |
| FS3 | 0.952 (0.946–0.958) | 0.795 (0.783–0.807) | 0.790 (0.778–0.801) | 0.737 (0.688–0.786) | 2.572 (2.528–2.616) |
| FS4 | 0.938 (0.929–0.947) | 0.767 (0.755–0.779) | 0.731 (0.714–0.748) | 0.630 (0.580–0.680) | 2.623 (2.579–2.667) |
| FS5 | 0.922 (0.912–0.933) | 0.725 (0.715–0.734) | 0.678 (0.666–0.691) | 0.621 (0.585–0.658) | 2.642 (2.598–2.685) |
| FS6 | 0.911 (0.901–0.922) | 0.707 (0.696–0.718) | 0.647 (0.632–0.662) | 0.597 (0.557–0.636) | 2.651 (2.608–2.695) |
| FS7 | 0.907 (0.899–0.915) | 0.670 (0.657–0.684) | 0.624 (0.615–0.633) | 0.589 (0.543–0.636) | 2.694 (2.650–2.737) |
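The nested feature sets described in the caption can be sketched in a few lines. This is only an illustration of the stated procedure, not the authors' code: the function name `build_feature_sets`, the feature names, and the importance values are all hypothetical, and the source does not say how the importance scores themselves were computed.

```python
def build_feature_sets(importance, threshold=0.01, n_sets=7):
    """Return nested feature sets FS1..FSn from a {feature: importance} mapping."""
    total = sum(importance.values())
    # FS1: every feature, ordered from most to least important.
    ranked = sorted(importance, key=importance.get, reverse=True)
    feature_sets = [ranked]

    # FS2: drop the weakest features while their cumulative share of the
    # total importance stays below the threshold (1% in the caption).
    dropped, running = set(), 0.0
    for name in reversed(ranked):
        running += importance[name] / total
        if running >= threshold:
            break
        dropped.add(name)
    current = [f for f in ranked if f not in dropped]
    feature_sets.append(current)

    # FS3 onward: remove the single most important remaining feature each time.
    while len(feature_sets) < n_sets and len(current) > 1:
        current = current[1:]
        feature_sets.append(current)
    return feature_sets


# Hypothetical importance scores, made up for illustration only.
importance = {
    "ejection_fraction": 0.30, "age": 0.25, "creatinine": 0.20, "urea": 0.12,
    "hemoglobin": 0.08, "glucose": 0.045, "locality": 0.004, "alcohol": 0.001,
}
feature_sets = build_feature_sets(importance)
for i, fs in enumerate(feature_sets, start=1):
    print(f"FS{i}: {fs}")
```

Each set would then be used to retrain and re-evaluate the model, which is how a table like the one above compares performance across FS1 to FS7.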
Figure 1. Optimal receiver operating characteristic curve of the mortality classifier using the optimal feature set (FS2) as input. The proposed model achieved an AUC of 0.967 (95% CI: 0.963–0.972), which is superior to the AUC of the classifier using all features (FS1) as input.
Figure 2. Optimal receiver operating characteristic curve of the heart failure classifier using the optimal feature set (FS2) as input. The proposed model achieved an AUC of 0.838 (95% CI: 0.825–0.851), which is superior to the AUC of the classifier using all features (FS1) as input.
Figure 3. Optimal receiver operating characteristic curve of the ST-segment elevation myocardial infarction (STEMI) classifier using the optimal feature set (FS2) as input. The proposed model achieved an AUC of 0.832 (95% CI: 0.821–0.842), which is comparable to the AUC of the classifier using all features (FS1) as input.
Figure 4. Optimal receiver operating characteristic curve of the pulmonary embolism classifier using the optimal feature set (FS2) as input. The proposed model achieved an AUC of 0.802 (95% CI: 0.764–0.840), which is superior to the AUC of the classifier using all features (FS1) as input.
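As a rough illustration of how a mean AUC and its 95% CI might be obtained from 10-fold cross-validation, the sketch below computes a rank-based AUC and summarizes per-fold values with a Student's t interval. The source does not state how its intervals were derived, so the t-based interval is an assumption; the names `roc_auc` and `mean_with_ci` and the fold values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def roc_auc(labels, scores):
    """AUC as the probability that a positive case outranks a negative one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # Each positive/negative pair scores 1 for a win and 0.5 for a tie.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_with_ci(fold_values, t_crit=2.262):
    """Mean and 95% CI bounds from per-fold values.

    t_crit = 2.262 is the two-sided 95% Student's t value for 9 degrees
    of freedom, i.e. 10 folds (an assumed interval method).
    """
    m = mean(fold_values)
    half_width = t_crit * stdev(fold_values) / sqrt(len(fold_values))
    return m, m - half_width, m + half_width


# Toy example: 3 of the 4 positive/negative pairs are ranked correctly.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])

# Made-up per-fold AUCs, for illustration only.
fold_aucs = [0.80, 0.82, 0.79, 0.81, 0.83, 0.80, 0.78, 0.82, 0.81, 0.80]
m, lo, hi = mean_with_ci(fold_aucs)
print(f"AUC {m:.3f} (95% CI: {lo:.3f}-{hi:.3f})")
```

The pairwise AUC is quadratic in the number of cases, which is fine for a sketch but would normally be replaced by a sorted-rank implementation on a cohort of this size.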
Figure 5. (A) The mean predicted duration of stay along with the 95% confidence intervals versus the actual duration of stay. (B) The absolute value of the mean prediction error along with the 95% confidence intervals versus the actual duration of stay. The proposed model achieved a mean absolute error (MAE) of 2.543 days (95% CI: 2.499–2.586), which is superior to the MAE of the model using all features (FS1) as input.
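The quantities behind a plot like Figure 5 can be sketched as follows: an overall MAE, plus the mean absolute error grouped by the actual duration of stay. The function names and the sample values are hypothetical, and the confidence intervals shown in the figure are not reproduced here.

```python
from collections import defaultdict
from statistics import mean


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted stay (days)."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def abs_error_by_actual(actual, predicted):
    """Mean absolute error for each distinct actual duration of stay."""
    groups = defaultdict(list)
    for a, p in zip(actual, predicted):
        groups[a].append(abs(a - p))
    return {a: mean(errs) for a, errs in sorted(groups.items())}


# Made-up stays in days, for illustration only.
actual = [2, 2, 5, 5, 12]
predicted = [3.0, 4.0, 5.5, 4.5, 8.0]

mae = mean_absolute_error(actual, predicted)
by_actual = abs_error_by_actual(actual, predicted)
print(mae, by_actual)
```

Grouping the error this way shows whether a low overall MAE hides larger errors for unusually short or long stays.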