Hirmand Nouraei, Hooman Nouraei, Simon W Rabkin.
Abstract
Heart failure with preserved ejection fraction (HFpEF) is a heterogeneous condition affecting nearly half of all patients with heart failure (HF). Artificial intelligence methodologies can be useful to identify patient subclassifications with important clinical implications. We compared different machine learning (ML) techniques and their clustering capabilities in defining meaningful subsets of patients with HFpEF. Three unsupervised clustering strategies, hierarchical clustering, K-prototype, and partitioning around medoids (PAM), were used to identify distinct clusters in patients with HFpEF, based on a wide range of demographic, laboratory, and clinical parameters. The study population had a median age of 77 years, with a female majority, and moderate diastolic dysfunction. Hierarchical clustering produced six groups, but two were too small (two and seven cases) to be clinically meaningful. The K-prototype method produced clusters in which several clinical and biochemical features did not show statistically significant differences, and there was significant overlap between the clusters. The PAM methodology provided the best group separation and identified six mutually exclusive groups (HFpEF1-6) with statistically significant differences in patient characteristics and outcomes. In summary, three unsupervised ML clustering strategies, hierarchical clustering, K-prototype, and PAM, were compared on a mixed dataset of patients with HFpEF containing clinical and numerical data. The PAM method identified six distinct subsets of patients with HFpEF with different long-term outcomes or mortality. By comparison, the two other clustering algorithms, hierarchical clustering and K-prototype, performed less well.
Keywords: cluster analysis; heart failure; preserved ejection fraction; unsupervised machine learning
Year: 2022 PMID: 35447735 PMCID: PMC9033031 DOI: 10.3390/bioengineering9040175
Source DB: PubMed Journal: Bioengineering (Basel) ISSN: 2306-5354
Figure 1. Dendrogram for hierarchical clustering (n = 196). Each color represents one cluster (cluster 1: green, cluster 2: red, cluster 3: blue, cluster 4: pink, cluster 5: yellow, cluster 6: black).
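As a rough illustration of how a dendrogram is cut into a fixed number of clusters, the following minimal sketch merges the two closest clusters until `k` remain and then flags clusters that are too small to interpret. The average-linkage rule, the function name `average_linkage_clusters`, the size threshold, and the toy values are all assumptions made for illustration; the source does not state which linkage or dissimilarity measure was used.

```python
from itertools import combinations

def average_linkage_clusters(dist, k):
    """Merge the two closest clusters (average linkage) until k remain.

    dist is a symmetric n x n matrix (list of lists) of pairwise
    dissimilarities. Returns a sorted list of clusters, each a sorted
    list of subject indices.
    """
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        # Average pairwise dissimilarity between members of a and b.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > k:
        # Find the closest pair among the current clusters.
        p, q = min(combinations(range(len(clusters)), 2),
                   key=lambda pq: linkage(clusters[pq[0]], clusters[pq[1]]))
        merged = sorted(clusters[p] + clusters[q])
        clusters = [c for idx, c in enumerate(clusters) if idx not in (p, q)]
        clusters.append(merged)
    return sorted(clusters)

# Toy example with hypothetical one-dimensional values (not study data).
values = [1.0, 1.2, 5.0, 5.1, 9.0]
dist = [[abs(a - b) for b in values] for a in values]
clusters = average_linkage_clusters(dist, 3)

# Flag clusters below an arbitrary, illustrative size threshold.
small = [c for c in clusters if len(c) < 2]
```

In this toy run the isolated value ends up alone in its own cluster, which mirrors the problem reported for the hierarchical method, where two of the six groups contained only two and seven cases.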
Summary of clusters using the hierarchical method (n = 196).
| Variable | Cluster 1 | Cluster 2 | Cluster 3 | Cluster 4 | Cluster 5 | Cluster 6 | p-value |
|---|---|---|---|---|---|---|---|
| Number of subjects | 23 | 2 | 46 | 49 | 7 | 69 | |
| Age (years) | 80 | 76 | 78 | 68 | 83 | 83 | <0.001 |
| Male (%) | 13 | 0 | 57 | 73 | 86 | 23 | <0.001 |
| Atrial fibrillation (%) | 17 | 0 | 24 | 14 | 43 | 35 | 0.100 |
| Hypertension (%) | 65 | 100 | 83 | 73 | 86 | 68 | 0.410 |
| Dyslipidemia (%) | 48 | 50 | 72 | 63 | 29 | 32 | <0.001 |
| Diabetes (%) | 9 | 0 | 43 | 27 | 14 | 16 | 0.006 |
| Coronary artery disease (%) | 17 | 50 | 57 | 43 | 29 | 26 | 0.007 |
| Chronic kidney disease (%) | 4 | 0 | 22 | 12 | 29 | 26 | 0.153 |
| Stroke or transient ischemic attack (%) | 0 | 50 | 7 | 4 | 14 | 14 | 0.046 |
| Obstructive sleep apnea (%) | 0 | 0 | 13 | 8 | 14 | 6 | 0.447 |
| Lung disease (%) | 17 | 0 | 13 | 4 | 0 | 6 | 0.260 |
| Body mass index (kg/m²) | 23.1 | 30.6 | 25.9 | 28.1 | 38.5 | 24.6 | 0.004 |
| Systolic blood pressure (mmHg) | 139 | 142.5 | 136.5 | 130 | 123 | 134 | 0.055 |
| Low-density lipoprotein (mmol/L) | 2.4 | 2.7 | 1.6 | 2.0 | 2.2 | 2.1 | <0.001 |
| Serum creatinine (µmol/L) | 69 | 81.5 | 94.5 | 90 | 94 | 87 | 0.012 |
| HbA1c (%) | 5.8 | 9.1 | 6 | 6 | 5.8 | 5.8 | 0.111 |
| Left ventricular ejection fraction (%) | 60 | 60 | 55 | 60 | 57 | 60 | <0.001 |
| Right ventricle diameter (mm) | 30 | 37 | 35 | 34 | 38 | 36 | <0.001 |
| Left atrial volume index (mL/m²) | 35 | 30.5 | 39.5 | 33 | 53 | 47.4 | <0.001 |
| Left ventricle end-diastolic diameter index (mm/m²) | 25 | 28.5 | 26 | 24 | 22.5 | 28.1 | <0.001 |
| Mitral valve E/A ratio | 0.7 | 0.8 | 1.1 | 0.9 | 3.7 | 1.2 | <0.001 |
| Average E/e’ ratio | 12.8 | 10.3 | 15.1 | 8.8 | 21.6 | 16.7 | <0.001 |
| Elevated filling pressure (%) | 21.7 | 0 | 80 | 26.5 | 100 | 91.3 | <0.001 |
| Diastolic dysfunction, moderate (%) | 8.7 | 0 | 71.7 | 24.5 | 57.1 | 72.5 | <0.001 |
| Diastolic dysfunction, severe (%) | 0 | 0 | 4.3 | 0 | 42.9 | 11.6 | |
| Meta-Analysis Global Group in Chronic Heart Failure (MAGGIC) risk score | 23.5 | 21.5 | 24 | 24 | 13 | 23 | 0.780 |
| Heart failure exacerbation (%) | 17.4 | 0 | 28.3 | 8.1 | 85.7 | 46.4 | <0.001 |
| Cardiovascular mortality (%) | 8.7 | 0 | 2.2 | 2 | 57.1 | 7.2 | <0.001 |
| All-cause mortality (%) | 8.7 | 0 | 10.9 | 4.1 | 71.4 | 21.7 | <0.001 |
| Composite endpoints (%) | 17.4 | 0 | 32.6 | 12.2 | 85.7 | 52.2 | <0.001 |
A p-value < 0.05 is considered statistically significant.
Figure 2. t-Distributed stochastic neighbor embedding (t-SNE) used to show the local structures of the clusters produced by K-prototype on a two-dimensional plot. Each point represents one study subject.
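For readers unfamiliar with K-prototype, the sketch below shows the standard idea behind it: each subject is assigned to the nearest prototype, where distance is the squared Euclidean distance on numeric features plus a weight `gamma` times the number of categorical mismatches, and prototypes are then updated with means and modes. The function names, the value of `gamma`, the starting prototypes, and the toy records are hypothetical; the source does not report these settings.

```python
from collections import Counter

def kproto_distance(x_num, x_cat, p_num, p_cat, gamma):
    """Squared Euclidean distance on numeric features plus gamma times
    the number of categorical mismatches."""
    numeric = sum((a - b) ** 2 for a, b in zip(x_num, p_num))
    mismatches = sum(a != b for a, b in zip(x_cat, p_cat))
    return numeric + gamma * mismatches

def k_prototypes(nums, cats, init, gamma=1.0, max_iter=20):
    """nums/cats: per-subject numeric and categorical feature lists.
    init: indices of the subjects used as starting prototypes."""
    protos = [(list(nums[i]), list(cats[i])) for i in init]
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest prototype for every subject.
        new_labels = [
            min(range(len(protos)),
                key=lambda c: kproto_distance(x, y, protos[c][0],
                                              protos[c][1], gamma))
            for x, y in zip(nums, cats)
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        # Update step: numeric means and categorical modes per cluster.
        for c in range(len(protos)):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:
                continue  # keep the old prototype for an empty cluster
            mean = [sum(nums[i][d] for i in members) / len(members)
                    for d in range(len(nums[0]))]
            mode = [Counter(cats[i][d] for i in members).most_common(1)[0][0]
                    for d in range(len(cats[0]))]
            protos[c] = (mean, mode)
    return labels, protos

# Hypothetical standardized age plus (sex, diabetes); not study data.
nums = [[-1.0], [-0.9], [1.0], [1.1]]
cats = [["F", "no"], ["F", "no"], ["M", "yes"], ["M", "no"]]
labels, protos = k_prototypes(nums, cats, init=[0, 2])
```

Because numeric and categorical terms are simply added, the choice of `gamma` and the scaling of the numeric features determine how much each feature type drives the partition, which is one reason results can differ from distance-based methods on the same mixed data.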
Summary of clusters using the K-prototype method (n = 196).
| Variable | Cluster 1 | Cluster 2 | Cluster 3 | Cluster 4 | p-value |
|---|---|---|---|---|---|
| Number of subjects | 61 | 9 | 53 | 73 | |
| Age (years) | 80 | 83 | 84 | 69 | <0.001 |
| Male (%) | 48 | 78 | 13 | 60 | <0.001 |
| Atrial fibrillation (%) | 36 | 33 | 25 | 15 | 0.043 |
| Hypertension (%) | 67 | 78 | 77 | 75 | 0.603 |
| Dyslipidemia (%) | 56 | 33 | 38 | 59 | 0.065 |
| Diabetes (%) | 33 | 11 | 17 | 23 | 0.183 |
| Coronary artery disease (%) | 44 | 22 | 34 | 34 | 0.445 |
| Chronic kidney disease (%) | 25 | 33 | 25 | 8 | 0.029 |
| Stroke or transient ischemic attack (%) | 11 | 22 | 8 | 5 | 0.294 |
| Lung disease (%) | 7 | 11 | 9 | 8 | 0.934 |
| Obstructive sleep apnea (%) | 10 | 11 | 2 | 10 | 0.328 |
| Body mass index (kg/m²) | 26.6 | 38.5 | 23.1 | 27.3 | <0.001 |
| Systolic blood pressure (mmHg) | 137 | 115 | 135 | 132 | 0.022 |
| Low-density lipoprotein (mmol/L) | 1.8 | 2.1 | 2.1 | 2.2 | <0.034 |
| Serum creatinine (µmol/L) | 90 | 94 | 89 | 88 | 0.119 |
| HbA1c (%) | 5.9 | 5.8 | 5.8 | 5.9 | 0.651 |
| B-type natriuretic peptide (pg/mL) | 282 | 817 | 128 | 78 | <0.001 |
| Meta-Analysis Global Group in Chronic Heart Failure (MAGGIC) risk score | 23 | 13 | 24 | 19 | <0.001 |
| Left ventricular ejection fraction (%) | 55 | 57 | 60 | 60 | <0.001 |
| | 76 | 89 | 85 | 53 | <0.001 |
| Mild to moderate aortic stenosis (%) | 14 | 11 | 22 | 8 | 0.089 |
| Mild to moderate aortic regurgitation (%) | 22 | 44 | 57 | 11 | <0.001 |
| Mild to moderate tricuspid regurgitation (%) | 68 | 89 | 84 | 47 | <0.001 |
| Right ventricle diameter (mm) | 36 | 38 | 34 | 34 | <0.001 |
| Left atrial volume index (mL/m²) | 44 | 53 | 40 | 33 | <0.001 |
| Left ventricle end-diastolic diameter index (mm/m²) | 27 | 22.5 | 28 | 25 | <0.001 |
| Mitral valve E/A ratio | 1.2 | 2.1 | 0.8 | 0.9 | <0.001 |
| Average E/e’ ratio | 15.8 | 21.6 | 16.7 | 9.3 | <0.001 |
| Pulmonary artery pressure (mmHg) | 34 | 48 | 31 | 26 | <0.001 |
| Elevated filling pressure (%) | 98 | 100 | 74 | 23 | <0.001 |
| Diastolic dysfunction, moderate (%) | 80 | 67 | 60 | 19 | <0.001 |
| Diastolic dysfunction, severe (%) | 15 | 33 | 0 | 1 | |
| Heart failure exacerbation (%) | 41 | 89 | 34 | 11 | <0.001 |
| Cardiovascular mortality (%) | 5 | 67 | 4 | 3 | <0.001 |
| All-cause mortality (%) | 21 | 78 | 9 | 5 | <0.001 |
| Composite endpoint (%) | 48 | 89 | 38 | 14 | <0.001 |
Figure 3. t-Distributed stochastic neighbor embedding (t-SNE) used to show the local structures of the clusters produced by PAM on a two-dimensional plot. Each point represents one study subject.
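To make the PAM approach more concrete, here is a minimal sketch of partitioning around medoids on mixed-type records. It assumes a Gower-style dissimilarity (range-scaled absolute difference for numeric features, simple mismatch for categorical ones), which the source does not confirm, and it uses a simplified swap search without the usual BUILD initialization. The function names and the toy records are hypothetical.

```python
def gower_matrix(rows, is_numeric):
    """Pairwise Gower-style dissimilarity for mixed-type rows.
    Numeric features contribute |a - b| / range; categorical ones 0 or 1."""
    n, m = len(rows), len(is_numeric)
    ranges = []
    for d in range(m):
        if is_numeric[d]:
            col = [r[d] for r in rows]
            ranges.append((max(col) - min(col)) or 1.0)  # avoid dividing by 0
        else:
            ranges.append(None)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            total = 0.0
            for d in range(m):
                if is_numeric[d]:
                    total += abs(rows[i][d] - rows[j][d]) / ranges[d]
                else:
                    total += float(rows[i][d] != rows[j][d])
            dist[i][j] = dist[j][i] = total / m
    return dist

def total_cost(dist, medoids):
    # Sum of each subject's dissimilarity to its nearest medoid.
    return sum(min(dist[i][m] for m in medoids) for i in range(len(dist)))

def pam(dist, k):
    """Simplified swap search: start from the first k subjects as medoids
    and accept any medoid/non-medoid swap that lowers the total cost,
    repeating until no swap helps."""
    n = len(dist)
    medoids = list(range(k))
    best = total_cost(dist, medoids)
    improved = True
    while improved:
        improved = False
        for pos in range(k):
            for cand in range(n):
                if cand in medoids:
                    continue
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                cost = total_cost(dist, trial)
                if cost < best - 1e-12:
                    medoids, best, improved = trial, cost, True
    labels = [min(range(k), key=lambda c: dist[i][medoids[c]])
              for i in range(n)]
    return medoids, labels

# Hypothetical records: (age, body mass index, sex, diabetes); not study data.
rows = [
    (68, 28.0, "M", "no"), (70, 27.5, "M", "no"),
    (83, 24.0, "F", "no"), (84, 24.5, "F", "no"),
    (82, 38.0, "M", "yes"), (83, 39.0, "M", "yes"),
]
dist = gower_matrix(rows, [True, True, False, False])
medoids, labels = pam(dist, 3)
```

Each medoid is an actual subject, so a cluster can be summarized by a representative patient record, and every subject belongs to exactly one cluster, consistent with the mutually exclusive groups described for PAM.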