Saqib Ejaz Awan, Mohammed Bennamoun, Ferdous Sohel, Frank Mario Sanfilippo, Girish Dwivedi.
Abstract
AIMS: Machine learning (ML) is widely believed to be able to learn complex hidden interactions from data and has the potential to predict events such as heart failure (HF) readmission and death. Recent studies have reported conflicting results, likely because they did not account for the class imbalance problem commonly seen in medical data. We developed a new ML approach to predict 30-day HF readmission or death and compared the performance of this model with other commonly used prediction models.
Keywords: Heart failure; Machine learning; Prediction; Readmission
Year: 2019 PMID: 30810291 PMCID: PMC6437443 DOI: 10.1002/ehf2.12419
Source DB: PubMed Journal: ESC Heart Fail ISSN: 2055-5822
Figure 1. Cohort identification flow chart. The final cohort contains 10 757 HF patients admitted to WA hospitals in 2003–2008. HF, heart failure; WA, Western Australia.
Table 1. Characteristics of HF patients in the study cohort
| Characteristic | Alive and non‐readmitted within 30 days, count (%) | Readmitted or dead within 30 days, count (%) |
|---|---|---|
| Total number of HF patients | 8211 (76.3) | 2546 (23.7) |
| Age (years), mean (SD) | 81.1 (7.6) | 83.1 (7.6) |
| Male (%) | 4028 (49.0) | 1247 (49.0) |
| Indigenous status: Aboriginal or Torres Strait Islander (%) | 141 (1.7) | 35 (1.4) |
| History of heart failure | 3642 (44.3) | 1422 (55.8) |
| Length of stay (days), mean (SD) | 10.39 (15.9) | 16.22 (46.8) |
| Co‐morbidities (%) | | |
| Ischaemic heart disease | 4506 (54.9) | 1457 (57.2) |
| Hypertension | 5497 (66.9) | 1751 (68.8) |
| Atrial fibrillation | 3398 (41.4) | 1102 (43.3) |
| Diabetes | 2458 (29.9) | 806 (31.6) |
| Chronic obstructive pulmonary disease | 2240 (27.3) | 783 (30.7) |
| Peripheral vascular disease | 1547 (18.8) | 549 (21.5) |
| Stroke | 1014 (12.3) | 366 (14.4) |
| Dementia | 545 (6.6) | 270 (10.6) |
| Depression | 691 (8.4) | 272 (10.7) |
| Cancer | 2811 (34.2) | 934 (36.7) |
| Chronic kidney disease | 2027 (24.7) | 795 (31.2) |
| Cardiogenic shock | 68 (0.8) | 30 (1.1) |
| Cardiomyopathy | 344 (4.2) | 115 (4.5) |
| SEIFA (%) | | |
| 5th quintile (least disadvantage) | 552 (6.7) | 182 (7.1) |
| 4th quintile | 1398 (17.0) | 439 (17.2) |
| 3rd quintile | 1450 (17.6) | 474 (18.6) |
| 2nd quintile | 1810 (22.0) | 555 (21.8) |
| 1st quintile (highest disadvantage) | 3001 (36.5) | 896 (35.2) |
| ARIA (%) | | |
| Major cities of Australia | 4247 (51.7) | 1334 (52.4) |
| Inner regional Australia | 2514 (30.6) | 692 (27.1) |
| Outer regional Australia | 897 (10.9) | 327 (12.8) |
| Remote Australia | 330 (4.0) | 118 (4.6) |
| Very remote Australia | 223 (2.7) | 75 (2.9) |
| History of drugs in the last 6 months (%) | | |
| No supply of BB or RAASi | 2298 (28.0) | 731 (28.7) |
| 1 or more supplies of RAASi only | 3249 (39.6) | 1097 (43.1) |
| 1 or more supplies of BB only | 741 (9.0) | 205 (8.0) |
| 1 or more supplies of both BB and RAASi | 1518 (18.5) | 411 (16.1) |
| At least one visit to health professionals in the last 6 months | | |
| GP | 6929 (84.4) | 2131 (83.7) |
| Specialist | 3980 (48.8) | 1079 (42.4) |
| Diagnostic | 6556 (79.8) | 2028 (79.6) |
| Allied Health | 1384 (16.8) | 353 (13.9) |
| At least one emergency admission in the last 6 months | 3591 (43.7) | 1303 (51.1) |
| Charlson Comorbidity Index score, mean (SD) | 4.2 (3.0) | 4.8 (3.1) |
| At least two supplies of ATC drugs in the last 6 months | | |
| Alimentary tract and metabolism | 4868 (59.3) | 1660 (65.2) |
| Blood and blood forming organs | 3835 (46.7) | 1183 (46.5) |
| Cardiovascular system | 7190 (87.6) | 2251 (88.4) |
| Dermatologicals | 730 (8.9) | 253 (9.9) |
| Genito urinary system and sex hormones | 388 (4.7) | 123 (4.8) |
| Systemic hormonal preparations, excluding sex hormones and insulins | 974 (11.9) | 356 (14.0) |
| Antiinfectives for systemic use | 3101 (37.8) | 1071 (42.1) |
| Antineoplastic and immunomodulating agents | 276 (3.4) | 111 (4.3) |
| Musculo‐skeletal system | 2390 (29.1) | 770 (30.2) |
| Nervous system | 4883 (59.5) | 1674 (65.7) |
| Antiparasitic products, insecticides, and repellents | 243 (2.9) | 79 (3.1) |
| Respiratory system | 1977 (24.0) | 616 (24.2) |
| Sensory organs | 1950 (23.7) | 656 (25.8) |
| Readmission or death within 30 days (whole cohort) | 2546 (23.6) | |
| Readmission within 30 days (emergency only, whole cohort) | 1121 (10.4) | |
| Death within 30 days (whole cohort) | 1574 (14.6) | |
ARIA, Accessibility/Remoteness Index of Australia; ATC, Anatomical Therapeutic Chemical Index; BB, beta‐blocker; GP, general practitioner; HF, heart failure; RAASi, renin–angiotensin–aldosterone system inhibitor; SD, standard deviation; SEIFA, Socio‐Economic Indexes for Areas.
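The class imbalance in the table above (8211 patients without the outcome versus 2546 with it) can be made concrete with a short calculation. The sketch below uses only the two group totals from the table. The inverse-frequency weighting formula is a common heuristic and is an assumption here; the source does not state how the weighted models were weighted.

```python
# Group totals taken from the cohort characteristics table.
counts = {"no_event": 8211, "event": 2546}

total = sum(counts.values())
event_rate = counts["event"] / total

# Accuracy of a trivial model that always predicts "no event".
majority_accuracy = counts["no_event"] / total

# Inverse-frequency ("balanced") class weights: total / (n_classes * class_count).
# Assumption: illustrative heuristic, not the authors' documented scheme.
weights = {label: total / (len(counts) * n) for label, n in counts.items()}

print(f"cohort size: {total}")
print(f"event rate: {event_rate:.3f}")
print(f"majority-class accuracy: {majority_accuracy:.3f}")
print({label: round(w, 3) for label, w in weights.items()})
```

Under this heuristic the minority (event) class receives roughly three times the weight of the majority class, which is one way a learner can be discouraged from always predicting "no event".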
Figure 2. Our three‐layer multilayer perceptron network with one hidden layer containing 50 hidden nodes. The figure labels the node weights, the node biases, the input feature vector, and the binary output.
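The architecture in the caption (input layer, one hidden layer of 50 nodes, binary output) can be sketched as a forward pass in NumPy. This is a minimal illustration, not the authors' implementation: the input size `n_features`, the tanh and sigmoid activations, the 0.5 threshold, and the random parameters are all assumptions, since the caption specifies only the layer structure and the hidden-layer width.

```python
import numpy as np

rng = np.random.default_rng(0)

n_features = 20  # hypothetical input size; not stated in the caption
n_hidden = 50    # hidden-layer width from the caption

# Randomly initialised parameters stand in for trained weights and biases.
W1 = rng.normal(scale=0.1, size=(n_features, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_hidden, 1))
b2 = np.zeros(1)


def sigmoid(z):
    """Logistic function mapping real values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def forward(x):
    """Return a predicted event probability for each row of x."""
    h = np.tanh(x @ W1 + b1)             # hidden layer (activation assumed)
    return sigmoid(h @ W2 + b2).ravel()  # single output node


x = rng.normal(size=(4, n_features))  # four synthetic feature vectors
p = forward(x)
y_hat = (p >= 0.5).astype(int)        # binary output via an assumed threshold
print(p.shape, y_hat)
```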
Table 2. Performance measures of our prediction models developed for the HF cohort
| Model | AUC | AUPRC | Accuracy (%) | Sensitivity (%) | Specificity (%) |
|---|---|---|---|---|---|
| LACE | 0.551 | 0.448 | 59.85 | 45.54 | 64.80 |
| Logistic regression | 0.576 | 0.455 | 62.26 | 48.37 | 66.85 |
| Random forests | 0.501 | 0.319 | 76.39 | 0.52 | 99.75 |
| Weighted random forests | 0.548 | 0.386 | 76.22 | 21.71 | 88.07 |
| Decision trees | 0.520 | 0.367 | 66.97 | 22.84 | 81.22 |
| Weighted decision trees | 0.528 | 0.379 | 64.18 | 31.44 | 74.16 |
| Support‐vector machines | 0.528 | 0.367 | 71.80 | 16.03 | 89.76 |
| Weighted support‐vector machines | 0.535 | 0.377 | 65.36 | 31.39 | 75.78 |
| Multilayer perceptron | 0.628 | 0.461 | 64.93 | 48.42 | 70.01 |
AUC, area under the receiver operating characteristic curve; AUPRC, area under the precision–recall curve; LACE, length of stay (L), acuity of admission (A), Charlson Comorbidity Index score (C), and number of emergency visits in the last 6 months (E).
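The table above pairs high accuracy with very low sensitivity for some models. The sketch below shows how sensitivity, specificity, and accuracy follow from confusion-matrix counts, and that accuracy is a prevalence-weighted mix of the other two. The counts are hypothetical, chosen only to resemble an imbalanced test split with about 24% events; they are not taken from the study.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion counts."""
    sensitivity = tp / (tp + fn)  # share of true events that were flagged
    specificity = tn / (tn + fp)  # share of non-events correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical split: 1000 patients, 237 events, model almost never flags anyone.
tp, fn, tn, fp = 1, 236, 761, 2
sens, spec, acc = binary_metrics(tp, fn, tn, fp)

# Accuracy equals sensitivity and specificity weighted by event prevalence.
prevalence = (tp + fn) / (tp + fn + tn + fp)
recombined = sens * prevalence + spec * (1 - prevalence)

print(f"sensitivity={sens:.4f} specificity={spec:.4f} accuracy={acc:.4f}")
print(f"recombined accuracy={recombined:.4f}")
```

Because most patients do not experience the outcome, a model that predicts "no event" for nearly everyone still scores well on accuracy, which is why the sensitivity and AUC columns are more informative here.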
Table 3. Performance of the prediction models of Frizzell et al. [10] reproduced on our HF cohort, compared with the results Frizzell et al. [10] reported on their own data (extracted from their paper)
| Model | AUC (Frizzell et al.) | AUC (our cohort) | Sensitivity (%) (our cohort) | Specificity (%) (our cohort) |
|---|---|---|---|---|
| Logistic regression | 0.62 | 0.56 | 33.18 | 79.92 |
| Random forests | 0.61 | 0.50 | 1.71 | 99.00 |
| LASSO regression | 0.62 | 0.64 | 0.51 | 100 |
| Multilayer perceptron | — | 0.62 | 48.42 | 70.01 |
AUC, area under the receiver operating characteristic curve; HF, heart failure; LASSO, least absolute shrinkage and selection operator; LR, logistic regression; RF, random forest.
Note the slight difference in the results of LR and RF models on our cohort between Tables 2 and 3. The models in Table 3 were developed based on specifications given by Frizzell et al. (page 2 of the supplementary appendix), while those in Table 2 are based on our own devised models.