Giulia Lorenzoni, Stefano Santo Sabato, Corrado Lanera, Daniele Bottigliengo, Clara Minto, Honoria Ocagli, Paola De Paolis, Dario Gregori, Sabino Iliceto, Franco Pisanò.
Abstract
The present study aims to compare the performance of eight Machine Learning Techniques (MLTs) in the prediction of hospitalization among patients with heart failure, using data from the Gestione Integrata dello Scompenso Cardiaco (GISC) study. The GISC project is an ongoing study that takes place in the region of Puglia, Southern Italy. Patients with a diagnosis of heart failure are enrolled in a long-term assistance program that includes the adoption of an online platform for data sharing between general practitioners and cardiologists working in hospitals and community health districts. Logistic regression, generalized linear model net (GLMN), classification and regression tree, random forest, adaboost, logitboost, support vector machine, and neural networks were applied to evaluate the feasibility of such techniques in predicting hospitalization of 380 patients enrolled in the GISC study, using data about demographic characteristics, medical history, and clinical characteristics of each patient. The MLTs were compared both without and with missing data imputation. Overall, models trained without missing data imputation showed higher predictive performance. The GLMN showed better performance in predicting hospitalization than the other MLTs, with an average accuracy, positive predictive value, and negative predictive value of 81.2%, 87.5%, and 75%, respectively. The present findings suggest that MLTs may represent a promising opportunity to predict hospital admission of heart failure patients by exploiting health care information generated by the contact of such patients with the health care system.
Keywords: heart failure; hospitalization; machine learning techniques
Year: 2019 PMID: 31450546 PMCID: PMC6780582 DOI: 10.3390/jcm8091298
Source DB: PubMed Journal: J Clin Med ISSN: 2077-0383 Impact factor: 4.241
Sample characteristics. Continuous data are reported as first quartile/median/third quartile; categorical data are reported as percentage (absolute number).
| | Not Hospitalized | Hospitalized | p-value |
|---|---|---|---|
| Gender: Female | 54% (92) | 60% (125) | 0.29 |
| Age | 72.0/78.0/83.0 | 73.0/79.0/83.0 | 0.357 |
| BMI | 25.78/29.33/33.21 | 25.49/29.37/34.75 | 0.99 |
| AMI | 12% (21) | 12% (26) | 0.993 |
| HF etiology—ischemic cardiomyopathy | 15% (25) | 22% (47) | 0.058 |
| HF etiology—dilated cardiomyopathy | 9% (16) | 10% (21) | 0.847 |
| HF etiology—valvulopathy | 18% (30) | 21% (45) | 0.357 |
| COPD | 26% (45) | 45% (94) | <0.001 |
| Anemia | 15% (25) | 23% (48) | 0.045 |
| Comorbidities | 39% (67) | 48% (101) | 0.09 |
| Heart rate | 75.0/90.0/100.0 | 80.0/90.0/94.25 | 0.098 |
| BNP | 850/1335/3000 | 1178/2228/3680 | <0.001 |
| Pulmonary pressure | 35/40/47 | 35/41.5/52 | 0.051 |
| NYHA class | | | 0.914 |
| 2 | 24% (39) | 26% (53) | |
| 3 | 67% (107) | 66% (136) | |
| 4 | 9% (14) | 8% (16) | |
| Creatinine | 0.800/1.000/1.208 | 0.810/1.070/1.450 | 0.021 |
| Mean years between clinical examinations | 0.625/1.600/2.900 | 0.900/1.800/2.900 | 0.281 |
BMI: body mass index; AMI: acute myocardial infarction; HF: heart failure; COPD: chronic obstructive pulmonary disease; BNP: B-type natriuretic peptide; NYHA: New York Heart Association.
Performance of generalized linear model net (GLMN), logistic regression (LR), classification and regression tree (CART), random forest (RF), adaboost (AB), logitboost (LB), support vector machine (SVM), and neural network (NN) obtained with complete case (CC) analysis. The values represent sensitivity, positive predictive value (PPV), negative predictive value (NPV), specificity and accuracy averaged over the values obtained on each resample.
| Technique | Sensitivity | PPV | NPV | Specificity | Accuracy | AUC |
|---|---|---|---|---|---|---|
| GLMN | 77.8 | 87.5 | 75 | 85.7 | 81.2 | 80.6 |
| LR | 54.7 | 51.6 | 64.9 | 61.9 | 58.9 | 64.6 |
| CART | 44.3 | 61.6 | 65.4 | 78.1 | 63.5 | 58.6 |
| RF | 54.9 | 73.0 | 72.7 | 85.6 | 72.6 | 69.1 |
| AB | 57.3 | 63.8 | 70.8 | 74.4 | 67.1 | 64.4 |
| LB | 66.7 | 66.7 | 57.1 | 51.1 | 62.5 | 65.4 |
| SVM | 57.3 | 69.0 | 72.2 | 79.4 | 69.9 | 69.5 |
| NN | 61.6 | 62.8 | 72.4 | 73.1 | 68.2 | 67.7 |
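The measures in this table all derive from the four cells of a confusion matrix, computed on each resample and then averaged. The following is a minimal sketch of that arithmetic, not the authors' code: the function names and the confusion-matrix counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fn, fp, tn):
    """Return the five confusion-matrix measures as percentages.

    tp/fn/fp/tn are the counts of true positives, false negatives,
    false positives, and true negatives for one resample.
    """
    total = tp + fn + fp + tn
    return {
        "sensitivity": 100 * tp / (tp + fn),   # hospitalized patients correctly flagged
        "ppv": 100 * tp / (tp + fp),           # flagged patients who were hospitalized
        "npv": 100 * tn / (tn + fn),           # unflagged patients who were not hospitalized
        "specificity": 100 * tn / (tn + fp),   # non-hospitalized patients correctly cleared
        "accuracy": 100 * (tp + tn) / total,
    }


def average_over_resamples(per_resample):
    """Average each measure over a list of per-resample metric dictionaries."""
    keys = per_resample[0].keys()
    return {k: sum(m[k] for m in per_resample) / len(per_resample) for k in keys}


# Hypothetical counts (tp, fn, fp, tn) for two resamples; NOT study data.
resamples = [(7, 2, 1, 6), (6, 3, 2, 5)]
per_resample = [classification_metrics(*counts) for counts in resamples]
averaged = average_over_resamples(per_resample)

for name, value in averaged.items():
    print(f"{name}: {value:.1f}")
```

The sketch does not guard against empty denominators (for example, a resample with no predicted positives), which a real evaluation would need to handle.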
Performance of GLMN, LR, CART, RF, AB, LB, SVM, and NN obtained with M-I analysis. The values represent sensitivity, positive predictive value (PPV), negative predictive value (NPV), specificity and accuracy averaged over the values obtained on each resample.
| Technique | Sensitivity | PPV | NPV | Specificity | Accuracy | AUC |
|---|---|---|---|---|---|---|
| GLMN | 26.5 | 66.0 | 59.5 | 68.1 | 60.3 | 62.8 |
| LR | 54.7 | 57.9 | 65.2 | 68.1 | 62.1 | 64.1 |
| CART | 40.0 | 56.6 | 60.9 | 74.3 | 58.9 | 57.2 |
| RF | 50.6 | 64.2 | 65.7 | 76.7 | 65.0 | 66.7 |
| AB | 56.5 | 62.1 | 67.5 | 72.4 | 65.3 | 68.0 |
| LB | 50.0 | 61.2 | 64.8 | 72.5 | 62.5 | 58.9 |
| SVM | 66.5 | 57.7 | 69.2 | 60.5 | 63.2 | 63.6 |
| NN | 28.8 | 58.2 | 59.1 | 83.3 | 58.9 | 61.9 |
Performance of GLMN, LR, CART, RF, AB, LB, SVM, and NN obtained with KNN-I analysis. The values represent sensitivity, positive predictive value (PPV), negative predictive value (NPV), specificity and accuracy averaged over the values obtained on each resample.
| Technique | Sensitivity | PPV | NPV | Specificity | Accuracy | AUC |
|---|---|---|---|---|---|---|
| GLMN | 24.1 | 64.8 | 59.4 | 89.5 | 60.3 | 62.4 |
| LR | 54.1 | 57.6 | 64.9 | 68.1 | 61.8 | 63.2 |
| CART | 45.3 | 54.4 | 61.2 | 69.5 | 58.7 | 57.8 |
| RF | 50.6 | 64.2 | 65.7 | 76.7 | 65.0 | 66.7 |
| AB | 53.5 | 60.2 | 65.3 | 71.0 | 63.2 | 65.4 |
| LB | 60.7 | 60.3 | 68.8 | 67.9 | 65.0 | 64.2 |
| SVM | 53.5 | 57.2 | 64.3 | 67.6 | 61.3 | 62.2 |
| NN | 55.9 | 58.5 | 65.9 | 68.1 | 62.6 | 64.1 |
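To make the contrast between complete case analysis and K-nearest-neighbor imputation concrete, the sketch below shows one simple way each could work on a small numeric table. It is illustrative only: the excerpt does not describe the study's imputation settings, so the distance (root mean squared difference over shared columns), the choice `k=2`, the absence of feature scaling, the function names, and the toy rows are all assumptions.

```python
import math


def complete_cases(rows):
    """Complete case analysis: keep only rows with no missing value (None)."""
    return [row for row in rows if all(value is not None for value in row)]


def knn_impute(rows, k=2):
    """Fill each missing value with the mean of that column over the k nearest rows.

    Distance between two rows is the root mean squared difference over the
    columns observed in both rows (an assumed, simplified choice).
    """
    imputed = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for other_index, other in enumerate(rows):
                # A donor must be a different row with column j observed.
                if other_index == i or other[j] is None:
                    continue
                shared = [c for c in range(len(row))
                          if row[c] is not None and other[c] is not None]
                if not shared:
                    continue
                dist = math.sqrt(
                    sum((row[c] - other[c]) ** 2 for c in shared) / len(shared)
                )
                candidates.append((dist, other[j]))
            if not candidates:
                continue  # no usable donor: leave the value missing
            candidates.sort(key=lambda pair: pair[0])
            nearest = candidates[:k]
            imputed[i][j] = sum(v for _, v in nearest) / len(nearest)
    return imputed


# Toy rows (age, creatinine); values are hypothetical, NOT study data.
rows = [[70, 1.0], [72, 1.2], [90, 2.0], [71, None]]
print(len(complete_cases(rows)))   # rows that survive complete case analysis
print(knn_impute(rows, k=2)[3])    # last row after imputation
```

Complete case analysis shrinks the sample, whereas imputation keeps every patient but introduces estimated values. Which trade-off predicts better is an empirical question, which is what these tables compare.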
Agreement between the classes predicted by pairs of machine learning techniques (MLTs) with complete case (CC) analysis. The values represent the point estimates of Cohen’s Kappa index along with their 95% CIs.
| | NN | LB | SVM | LR | AB | CART | RF |
|---|---|---|---|---|---|---|---|
| GLMN | 0.8 | 1 | 0.75 | 0.75 | 0.75 | 0.8 | 0.77 |
| NN | _ | 0.77 | 0.51 | 0.51 | 0.51 | 0.92 | 1 |
| LB | _ | _ | 0.54 | 0.54 | 0.54 | 0.73 | 0.69 |
| SVM | _ | _ | _ | 1 | 1 | 0.55 | 0.51 |
| LR | _ | _ | _ | _ | 1 | 0.55 | 0.51 |
| AB | _ | _ | _ | _ | _ | 0.55 | 0.51 |
| CART | _ | _ | _ | _ | _ | _ | 0.92 |
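Cohen's Kappa measures how often two classifiers agree beyond what their marginal class frequencies would produce by chance. A minimal sketch of the point estimate follows; the function name and the example predictions are hypothetical, and the confidence interval is not computed here.

```python
from collections import Counter


def cohen_kappa(pred_a, pred_b):
    """Cohen's Kappa for two equally long sequences of predicted class labels."""
    if len(pred_a) != len(pred_b) or len(pred_a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(pred_a)
    # Observed agreement: fraction of patients given the same class by both models.
    observed = sum(a == b for a, b in zip(pred_a, pred_b)) / n
    # Chance agreement: product of the two models' marginal class frequencies.
    freq_a = Counter(pred_a)
    freq_b = Counter(pred_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / n ** 2
    if expected == 1.0:
        return 1.0  # both models always predict the same single class
    return (observed - expected) / (1 - expected)


# Hypothetical predictions (1 = hospitalized, 0 = not) from two models; NOT study data.
model_a = [1, 1, 0, 0]
model_b = [1, 0, 0, 0]
print(cohen_kappa(model_a, model_b))
```

A value of 1 means the two models assign every patient to the same class, which can happen even when neither model is accurate.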
Covariates identified by the MLTs trained with CC to have predictive value in identifying patients that had a hospitalization. The symbol “X” denotes that the covariate had predictive value, whereas an empty cell denotes that the covariate had no predictive value. The symbol “_” was used for the MLTs for which it was not possible to identify covariates that had a predictive impact.
| GLMN | LR | CART | RF | AB | LB | SVM | NN | |
|---|---|---|---|---|---|---|---|---|
| Gender (female vs. male) | _ | _ | _ | _ | ||||
| Age | _ | _ | _ | _ | ||||
| BMI | _ | _ | _ | _ | ||||
|
| _ | _ | _ | _ | ||||
| AMI (yes vs. no) | X | X | X | X | _ | _ | _ | _ |
| HF etiology–ischemic cardiomyopathy (yes vs. no) | X | X | X | X | _ | _ | _ | _ |
| HF etiology–dilated cardiomyopathy (yes vs. no) | _ | _ | _ | _ | ||||
| HF etiology–valvulopathy (yes vs. no) | _ | _ | _ | _ | ||||
| COPD (yes vs. no) | X | X | _ | _ | _ | _ | ||
| Anemia (yes vs. no) | X | _ | _ | _ | _ | |||
| Comorbidities (yes vs. no) | X | X | X | X | _ | _ | _ | _ |
|
| _ | _ | _ | _ | ||||
| Heart rate | X | _ | _ | _ | _ | |||
| BNP | X | X | _ | _ | _ | _ | ||
| Pulmonary pressure | X | X | X | _ | _ | _ | _ | |
| NYHA class | X | _ | _ | _ | _ | |||
| Creatinine | X | X | X | _ | _ | _ | _ | |
| Mean years between clinical examinations | X | X | X | _ | _ | _ | _ |
BMI: body mass index; AMI: acute myocardial infarction; HF: heart failure; COPD: chronic obstructive pulmonary disease; BNP: B-type natriuretic peptide; NYHA: New York Heart Association.
Coefficients’ distributions (median and 95% CI) for 10,000 bootstrap repetitions of the model found to have the best performance, i.e., GLMN (alpha = 0.005, lambda = 1/6).
| | 95% CI lower limit | Median | 95% CI upper limit |
|---|---|---|---|
| Gender (female vs. male) | 0.80 | 0.98 | 1.19 |
| Age | 0.99 | 1 | 1.02 |
| BMI | 0.98 | 1 | 1.01 |
| AMI (yes vs. no) | 1.08 | 1.41 | 1.74 |
| HF etiology—ischemic cardiomyopathy (yes vs. no) | 1.05 | 1.31 | 1.57 |
| HF etiology—dilated cardiomyopathy (yes vs. no) | 0.73 | 1 | 1.36 |
| HF etiology—valvulopathy (yes vs. no) | 0.71 | 0.90 | 1.15 |
| COPD (yes vs. no) | 1 | 1.22 | 1.49 |
| Anemia (yes vs. no) | 0.96 | 1.19 | 1.40 |
| Comorbidities (yes vs. no) | 1.12 | 1.34 | 1.44 |
| Heart rate | 0.99 | 1 | 1 |
| BNP | 1 | 1 | 1 |
| Pulmonary pressure | 1 | 1.01 | 1.02 |
| NYHA class | 0.72 | 0.91 | 1.14 |
| Creatinine | 1.01 | 1.21 | 1.40 |
| Mean years between clinical examinations | 0.99 | 1.08 | 1.17 |
BMI: body mass index; AMI: acute myocardial infarction; HF: heart failure; COPD: chronic obstructive pulmonary disease; BNP: B-type natriuretic peptide; NYHA: New York Heart Association.
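The median and 95% limits in this table summarize how an estimate varies across bootstrap repetitions. The sketch below illustrates the percentile bootstrap in general terms and is not the authors' procedure. As a stand-in for refitting the penalized model on every resample, it uses an unadjusted odds ratio with a 0.5 continuity correction. The function names, the 1,000 repetitions, and the toy data are all assumptions.

```python
import random
import statistics


def odds_ratio(pairs):
    """Unadjusted odds ratio from (exposed, hospitalized) pairs.

    Adds 0.5 to every cell (Haldane-Anscombe correction) so that a resample
    with an empty cell does not cause a division by zero.
    """
    a = sum(1 for e, h in pairs if e and h) + 0.5
    b = sum(1 for e, h in pairs if e and not h) + 0.5
    c = sum(1 for e, h in pairs if not e and h) + 0.5
    d = sum(1 for e, h in pairs if not e and not h) + 0.5
    return (a * d) / (b * c)


def bootstrap_percentile_ci(data, statistic, n_boot=1000, level=0.95, seed=0):
    """Return (lower limit, median, upper limit) of a statistic over bootstrap resamples."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_boot):
        # Resample patients with replacement, keeping the original sample size.
        resample = [data[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(resample))
    estimates.sort()
    tail = (1 - level) / 2
    lower = estimates[int(tail * (n_boot - 1))]
    upper = estimates[int((1 - tail) * (n_boot - 1))]
    return lower, statistics.median(estimates), upper


# Toy data: (exposed, hospitalized) flags for 40 hypothetical patients; NOT study data.
data = [(1, 1)] * 12 + [(1, 0)] * 6 + [(0, 1)] * 8 + [(0, 0)] * 14
lower, median, upper = bootstrap_percentile_ci(data, odds_ratio)
print(f"{lower:.2f} / {median:.2f} / {upper:.2f}")
```

An interval whose lower limit stays above 1 indicates that the association holds across most resamples. In the table this is the pattern for AMI, ischemic cardiomyopathy, comorbidities, and creatinine.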