Moumita Bhattacharya1, Dai-Yin Lu2,3,4,5, Ioannis Ventoulis2, Gabriela V Greenland2,5, Hulya Yalcin2, Yufan Guan2, Joseph E Marine2, Jeffrey E Olgin5, Stefan L Zimmerman6, Theodore P Abraham2,5, M Roselle Abraham2,5, Hagit Shatkay1.
Abstract
BACKGROUND: Hypertrophic cardiomyopathy (HCM) patients have a high incidence of atrial fibrillation (AF) and increased stroke risk, even with low CHA2DS2-VASc (congestive heart failure, hypertension, age, diabetes, previous stroke/transient ischemic attack) scores. Hence, there is a need to understand the pathophysiology of AF/stroke in HCM. In this retrospective study, we develop and apply a data-driven, machine learning-based method to identify AF cases, and clinical/imaging features associated with AF, using electronic health record data.
Year: 2021 PMID: 34169259 PMCID: PMC8209373 DOI: 10.1016/j.cjco.2021.01.016
Source DB: PubMed Journal: CJC Open ISSN: 2589-790X
Figure 1. Schematic illustrating the Hypertrophic Cardiomyopathy Atrial Fibrillation (HCM-AF)-Risk Model.
Eighteen variables identified as most informative for AF by our feature-selection method
| Variable | Variable type | P value | Polychoric correlation (association with AF) |
|---|---|---|---|
| Left atrial diameter, cm (+) | Continuous | 0.00000000001 | 0.316 |
| Heart rate at peak stress, bpm (–) | Continuous | 0.0000000001 | –0.288 |
| Age, y (+) | Continuous | 0.0000004 | 0.219 |
| Exercise metabolic equivalents (–) | Continuous | 0.0000014 | –0.154 |
| Septal myectomy (+) | Nominal | 0.0000021 | 0.353 |
| Exercise time, s (–) | Continuous | 0.0000031 | –0.225 |
| Diuretic treatment (+) | Nominal | 0.0000046 | 0.251 |
| Percentage of max heart rate achieved at peak exercise, % (–) | Continuous | 0.00058 | –0.156 |
| Heart rate recovery at 1 min post-exercise, bpm (–) | Continuous | 0.00072 | –0.205 |
| LV-LGE on CMR (presence +) | Nominal | 0.00092 | 0.269 |
| E/A (+) | Continuous | 0.001 | 0.091 |
| NYHA functional class (+) | Nominal | 0.0032 | 0.205 |
| E/e′ (+) | Continuous | 0.0033 | 0.157 |
| LV global longitudinal peak systolic strain rate, 1/s (+) | Continuous | 0.029 | 0.120 |
| Dyspnea on exertion (presence +) | Nominal | 0.035 | 0.198 |
| ABPR during exercise test in follow-up visit (+) | Continuous | 0.051 | 0.156 |
| Diastolic blood pressure at peak exercise, mm Hg (–) | Continuous | 0.053 | –0.105 |
| LV global longitudinal early diastolic strain rate, 1/s (–) | Continuous | 0.056 | –0.106 |
ABPR, abnormal blood pressure response; bpm, beats per minute; AF, atrial fibrillation; CMR, cardiac magnetic resonance imaging; E/A, ratio of early diastolic mitral flow velocity to the late diastolic mitral flow velocity; E/e', ratio of early diastolic mitral flow velocity to the early diastolic mitral septal annulus motion velocity; LGE, late gadolinium enhancement; LV, left ventricle; NYHA, New York Heart Association.
Figure 2. Methods for addressing data imbalance. The illustration shows our classification scheme for combining oversampling and undersampling. The topmost layer represents the entire training set, which comprises a majority of No-atrial fibrillation (AF) records (left) and the minority of AF records (right). The majority class in the training set (No-AF) is randomly undersampled such that the No-AF to AF record ratio is 2:1. The minority class (AF) is oversampled using Synthetic Minority Oversampling Technique (SMOTE) to generate synthetic new AF-like records, doubling the number of AF records. The resulting set forms a balanced training set, containing the same number of AF and No-AF records.
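The balancing scheme in this caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names (`undersample`, `smote_like`, `balance`), the neighbour count `k = 5`, the two toy features, and the record counts are all assumptions, and the interpolation step is a simplified stand-in for SMOTE rather than a full library version.

```python
import random

def undersample(majority, n_keep, rng):
    """Randomly keep n_keep majority-class records (without replacement)."""
    return rng.sample(majority, n_keep)

def smote_like(minority, n_new, k, rng):
    """Create n_new synthetic records, each interpolated between a minority
    record and one of its k nearest minority neighbours (SMOTE's core idea).
    Requires at least 2 minority records."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (r for r in minority if r is not base),
            key=lambda r: sq_dist(base, r),
        )[:k]
        mate = rng.choice(neighbours)
        t = rng.random()  # position along the segment base -> mate
        synthetic.append(tuple(b + t * (m - b) for b, m in zip(base, mate)))
    return synthetic

def balance(no_af, af, k=5, seed=0):
    """Undersample No-AF to a 2:1 ratio, then double AF with synthetic records."""
    rng = random.Random(seed)
    kept_no_af = undersample(no_af, 2 * len(af), rng)   # No-AF : AF = 2 : 1
    af_doubled = af + smote_like(af, len(af), k, rng)   # AF count doubled
    return kept_no_af, af_doubled

# Toy training set (hypothetical): two numeric features per record.
toy_rng = random.Random(1)
no_af = [(toy_rng.gauss(41, 7), toy_rng.gauss(52, 16)) for _ in range(100)]
af = [(toy_rng.gauss(45, 8), toy_rng.gauss(58, 13)) for _ in range(20)]

kept, af_balanced = balance(no_af, af)
print(len(kept), len(af_balanced))  # 40 40
```

Undersampling to 2:1 before doubling the minority class is what makes the two classes end up equal in size; every synthetic record lies on a line segment between two real AF records.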
Figure 3. Flow chart indicating selection of atrial fibrillation (AF) and No-AF cases in the hypertrophic cardiomyopathy (HCM) cohort.
Figure 4. Age distribution of hypertrophic cardiomyopathy (HCM) patients with atrial fibrillation (AF) in the HCM cohort. AF prevalence increases with age in HCM.
Demographic and clinical feature values of the HCM cohort shown for patients with/without AF
| Variable | No-AF (n = 640) | AF (n = 191) | P value |
|---|---|---|---|
| Age, y | 52 ± 16 | 58 ± 13 | <0.001 |
| Male sex | 399 (62) | 120 (63) | 0.9 |
| Body mass index, kg/m2 | 29 ± 6 | 30 ± 7 | 0.1 |
| HCM type | | | 0.02 |
| Non-obstructive | 203 (32) | 56 (30) | |
| Labile obstructive | 244 (38) | 57 (30) | |
| Obstructive | 191 (30) | 77 (41) | |
| NYHA class | | | <0.001 |
| I | 376 (59) | 82 (43) | |
| II-III | 264 (41) | 109 (57) | |
| Angina | 260 (41) | 68 (36) | 0.2 |
| Family history of HCM | 125 (20) | 37 (19) | 0.9 |
| ICD implantation | 42 (7) | 32 (17) | <0.001 |
| Syncope | 120 (19) | 41 (22) | 0.5 |
| Family history of sudden cardiac death | 159 (25) | 44 (23) | 0.7 |
| Non-sustained VT | 62 (10) | 29 (15) | 0.05 |
| Septal wall thickness ≥30 mm | 55 (9) | 8 (4) | 0.07 |
| Medications | | | |
| Beta-blocker | 435 (68) | 157 (82) | <0.001 |
| Calcium channel blocker | 176 (28) | 66 (35) | 0.07 |
| RAS blockade | 150 (23) | 49 (26) | 0.6 |
| Disopyramide | 15 (2) | 15 (8) | 0.001 |
| Echocardiography | | | |
| Left atrial diameter, mm | 41 ± 7 | 45 ± 8 | <0.001 |
| Maximal septal wall thickness, mm | 21 ± 5 | 21 ± 5 | 0.1 |
| LV ejection fraction, % | 66 ± 8 | 64 ± 8 | 0.1 |
| E/A | 1.3 ± 0.6 | 1.7 ± 1.3 | <0.001 |
| E/e′ | 18 ± 11 | 22 ± 12 | <0.001 |
| Rest LVOT peak gradient, mm Hg | 28 ± 32 | 32 ± 30 | 0.2 |
| Stress LVOT peak gradient, mm Hg | 67 ± 53 | 72 ± 54 | 0.3 |
| LV-GLS, % | –16.1 ± 3.7 | –15.1 ± 3.8 | 0.003 |
| LV-SR_S | –1.00 ± 0.20 | –0.93 ± 0.20 | <0.001 |
| LV-SR_E | 1.14 ± 0.35 | 1.06 ± 0.31 | <0.001 |
| Moderate or severe MR | 70 (11) | 31 (16) | 0.06 |
| Cardiac magnetic resonance (n = 608) | | | |
| LV mass, g | 163 ± 69 | 172 ± 62 | 0.2 |
| LGE presence | 300 (63) | 107 (81) | <0.001 |
| LGE (% of LV mass) | 11 ± 12 | 16 ± 13 | 0.004 |
| Positron emission tomography (n = 145) | | | |
| Global rest MBF, mL/min/g | 0.93 ± 0.26 | 1.04 ± 0.54 | 0.2 |
| Global stress MBF, mL/min/g | 2.14 ± 0.65 | 2.01 ± 0.75 | 0.3 |
| Myocardial flow reserve | 2.44 ± 0.85 | 2.30 ± 0.66 | 0.3 |
| Summed difference scores | 5.4 ± 4.9 | 4.3 ± 4.8 | 0.2 |
| Exercise parameters | | | |
| Treadmill exercise time, s | 566 ± 202 | 471 ± 193 | <0.001 |
| Metabolic equivalents | 10.3 ± 4.2 | 8.8 ± 3.7 | <0.001 |
| Rest heart rate, bpm | 65 ± 13 | 65 ± 14 | 0.5 |
| Rest systolic blood pressure, mm Hg | 133 ± 22 | 131 ± 18 | 0.5 |
| Rest diastolic blood pressure, mm Hg | 77 ± 11 | 77 ± 12 | 0.9 |
| Stress heart rate, bpm | 145 ± 28 | 131 ± 26 | <0.001 |
| Stress systolic blood pressure, mm Hg | 161 ± 36 | 153 ± 37 | 0.02 |
| Stress diastolic blood pressure, mm Hg | 82 ± 18 | 77 ± 17 | 0.003 |
| ABPR | 179 (31) | 70 (43) | 0.009 |
| Stroke | 7 (1) | 5 (3) | 0.2 |
| Heart failure hospitalization | 27 (5) | 16 (10) | 0.04 |
| VT/VF | 18 (3) | 11 (7) | 0.1 |
| Death | 13 (2) | 13 (8) | 0.003 |
Values are n (%), unless otherwise indicated.
ABPR, abnormal blood pressure response to exercise; AF, atrial fibrillation; bpm, beats per minute; E/A, ratio of early diastolic mitral flow velocity to the late diastolic mitral flow velocity; E/e', ratio of early diastolic mitral flow velocity to the early diastolic mitral septal annulus motion velocity; HCM, hypertrophic cardiomyopathy; ICD, implantable cardioverter defibrillator; LGE, late gadolinium enhancement; LV, left ventricular; LV-GLS, LV peak global longitudinal systolic strain; LVOT, LV outflow tract; LV-SR_E, LV peak global longitudinal early diastolic strain rate; LV-SR_S, peak global longitudinal systolic strain rate; MBF, myocardial blood flow; MR, mitral regurgitation; NYHA, New York Heart Association; RAS blockade, angiotensin-converting enzyme inhibitor, angiotensin II receptor blocker; VT/VF, ventricular tachycardia/ventricular fibrillation.
Figure 5. Distribution of left atrium (LA) size in hypertrophic cardiomyopathy patients with/without atrial fibrillation (AF). Significant overlap exists in LA diameter values in the AF and No-AF groups, but mean values for LA diameter were significantly higher in the AF group, compared with the No-AF group (P < 0.001). Each dot represents a patient; mean ± 1.96 standard deviations is presented.
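As a worked illustration of the overlap described in this caption, the mean ± 1.96 SD ranges can be computed from the left atrial diameter values reported in the cohort table (41 ± 7 mm for No-AF, 45 ± 8 mm for AF). The helper name `interval` is hypothetical, and the calculation is only arithmetic on the published summary statistics, not a reanalysis of patient data.

```python
def interval(mean, sd, z=1.96):
    """Return the (lower, upper) bounds of mean ± z standard deviations."""
    return mean - z * sd, mean + z * sd

# Left atrial diameter, mm (mean ± SD from the cohort table)
no_af_range = interval(41, 7)
af_range = interval(45, 8)

# The shared region runs from the larger lower bound to the smaller upper bound.
overlap = (max(no_af_range[0], af_range[0]), min(no_af_range[1], af_range[1]))

print(f"No-AF:   {no_af_range[0]:.2f} to {no_af_range[1]:.2f} mm")  # 27.28 to 54.72
print(f"AF:      {af_range[0]:.2f} to {af_range[1]:.2f} mm")        # 29.32 to 60.68
print(f"Overlap: {overlap[0]:.2f} to {overlap[1]:.2f} mm")          # 29.32 to 54.72
```

The two ranges share most of their span, which is consistent with the caption's point that LA diameter alone does not cleanly separate the groups even though the means differ.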
Comparison of performance between the simple baseline logistic regression classifier (Baseline), approaches used for addressing data imbalance (Random undersampling), and the classifier resulting from our combination of undersampling and oversampling, trained on datasets represented via the 18 features identified by our feature selection method (HCM-AF-Risk Model)
| Performance measure | Baseline | Random undersampling | HCM-AF-Risk Model |
|---|---|---|---|
| Sensitivity | 0.41 (±0.04) | 0.43 (±0.04) | |
| Specificity | 0.90 (±0.02) | 0.70 (±0.03) | |
| AUC (C-index) | 0.79 (±0.04) | 0.77 (±0.02) | |
Standard deviation is shown in parentheses; highest values are shown in boldface.
AUC, area under the receiver operating characteristic curve; HCM-AF, hypertrophic cardiomyopathy atrial fibrillation.
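For readers less familiar with the measures in this table, the sketch below shows how sensitivity and specificity are obtained from binary labels and predictions, and how a mean and standard deviation might be summarized across repeated evaluation runs. The labels, predictions, and per-run values are invented toy numbers, and the function name is hypothetical; none of it reproduces the study's results.

```python
from statistics import mean, stdev

def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    Labels: 1 = AF, 0 = No-AF."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

# Toy example: 4 AF records and 6 No-AF records.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")  # 0.75, 0.67

# Summarizing hypothetical per-run sensitivities as mean (±SD).
run_sensitivities = [0.70, 0.75, 0.80]
print(f"{mean(run_sensitivities):.2f} (±{stdev(run_sensitivities):.2f})")  # 0.75 (±0.05)
```

A classifier trained on imbalanced data can score high specificity simply by predicting the majority class, which is why both measures are reported side by side.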
Figure 6. Receiver operating characteristic (ROC) curve for Hypertrophic Cardiomyopathy Atrial Fibrillation (HCM-AF)-Risk Model. The ROC curve depicts the performance of the HCM-AF-Risk Model that combines the undersampling and oversampling approaches. The false-positive rate is shown on the x-axis, and the true-positive rate is indicated on the y-axis. AUC, area under the curve.
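An ROC curve of the kind described in this caption is traced by sweeping the decision threshold over the classifier's scores, and the AUC is the area under the resulting curve. The following is a minimal sketch on invented toy scores; `roc_points` and `auc` are hypothetical helper names, and the trapezoidal rule is one common way to compute the area, not necessarily the one used in the study.

```python
def roc_points(y_true, scores):
    """Return (false-positive rate, true-positive rate) pairs obtained by
    lowering the decision threshold through every distinct score."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Records with tied scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )

# Toy example: 1 = AF, 0 = No-AF; scores are predicted AF probabilities.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

curve = roc_points(y_true, scores)
print(f"AUC = {auc(curve):.3f}")  # AUC = 0.889
```

In this toy case 8 of the 9 AF/No-AF pairs are ranked correctly, which matches the AUC of 8/9; this pairwise reading is why the AUC is also reported as the C-index.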
Comparison of performance attained by HCM-AF-Risk Model based on 4 feature sets
| Feature set #/originating study, features | Sensitivity | Specificity | C-index/AUC |
|---|---|---|---|
| 1/FHS | 0.53 | 0.60 | 0.60 |
| 2/ARIC | 0.57 | 0.63 | 0.68 |
| 3/CHARGE-AF Consortium | 0.54 (±0.10) | 0.60 | 0.61 |
| 4/HCM-AF-Risk Model (our study) | | | |
The variables associated with the 4 feature sets that were used to represent the data for training our model are indicated in italics in this footnote. Set 1 shows our model performance when trained on 6 attributes identified as informative for AF prediction in the study by Schnabel et al. (C-index/AUC = 0.78), conducted using the FHS dataset; the predictors PR interval by EKG and significant cardiac murmur were not recorded in our dataset. Set 2 shows the performance attained by our model when trained on 10 features reported as predictive of AF in the study by Chamberlain et al. using the ARIC dataset (C-index/AUC = 0.76); the predictors precordial murmur and LVH by EKG were not recorded in our dataset. Set 3 includes 11 risk factors for AF identified in the study by Alonso et al. in the CHARGE-AF Consortium (C-index/AUC = 0.76). Set 4 shows the performance achieved by our model based on the 18 features identified as predictive by our feature-selection approach. The difference between the performance attained when the representation is based on our feature set (Set 4) and those attained when the representation is based on the 3 other sets, is highly statistically significant (P < 0.001). Standard deviation is shown in parentheses. The highest values are shown in boldface.
AF, atrial fibrillation; ARIC, Atherosclerosis Risk in Communities Study; AUC, area under the curve; CHARGE-AF, Cohorts for Heart and Aging Research in Genomic Epidemiology-Atrial Fibrillation; EKG, electrocardiogram; FHS, Framingham Heart Study; HCM-AF, hypertrophic cardiomyopathy atrial fibrillation; PR interval, the time from the onset of the P wave to the start of the QRS complex on electrocardiography; LVH, left ventricular hypertrophy.
Comparison of clinical/imaging features associated with AF, VArs, and HF in HCM patients, identified by the HCM-AF-Risk Model (current work), the HCM-VAr-Risk Model, and the HCM-HF-Risk Model
Column groups: AF; VAr, HCM-VAr-Risk Model (22 predictive variables), sensitivity = 0.73, specificity = 0.76; HF.

| Variables associated with AF in HCM | P value | Polychoric correlation with AF | Variables associated with VAr (VT/VF) in HCM | P value | Polychoric correlation with VAr | Variables associated with HF in HCM | P value | Polychoric correlation with HF |
|---|---|---|---|---|---|---|---|---|
| Exercise time, s (–) | 3 x 10–6 | –0.225 | Exercise time, s (–) | 7 x 10–3 | –0.167 | Exercise time, s (–) | 4.7 x 10–7 | –0.346 |
| Exercise METs (–) | 1 x 10–6 | –0.154 | Exercise METs (–) | 1 x 10–2 | –0.131 | Exercise METs (–) | < 1 x 10–9 | –0.579 |
| Age, y (+) | 4 x 10–7 | 0.219 | Age, y (–) | 3 x 10–2 | –0.15 | Sex (male +) | > 1 x 10–9 | 0.401 |
| E/e′ (+) | 3 x 10–3 | 0.157 | E/e′ (+) | 6 x 10–2 | 0.167 | LV-LGE % of LV mass (+) | 3 x 10–2 | 0.191 |
| LV global longitudinal peak systolic strain rate, 1/s (+) | 2.9 x 10–2 | 0.12 | LV global longitudinal peak systolic strain rate, 1/s (+) | 3 x 10–3 | 0.171 | History of syncope (+) | 4.7 x 10–2 | 0.157 |
| LV global longitudinal early diastolic strain rate, 1/s (–) | 5 x 10–2 | –0.106 | LV global longitudinal early diastolic strain rate, 1/s (–) | 1 x 10–3 | –0.213 | History of smoking (+) | 2.5 x 10–2 | 0.148 |
| HR at peak exercise stress, bpm (–) | 1 x 10–10 | –0.288 | NSVT (presence +) | 5 x 10–4 | 0.994 | HR at peak exercise stress, bpm (–) | < 1 x 10–9 | –0.447 |
| LV-LGE (presence +) | 9 x 10–4 | 0.269 | VT induced by NIPS during follow-up (presence +) | 1 x 10–2 | 0.667 | LV-LGE (presence +) | 3.6 x 10–2 | 0.159 |
| HRR at 1 min post-exercise, bpm (–) | 7 x 10–4 | –0.205 | HCM type (non-obstructive +) | 1 x 10–3 | 0.366 | HRR at 1 min post-exercise, bpm (–) | < 1 x 10–9 | –0.411 |
| Dyspnea on exertion (+) | 3 x 10–2 | 0.198 | Peak stress LVOT gradient, mm Hg (–) | 1 x 10–5 | –0.273 | Dyspnea on exertion (+) | < 1 x 10–9 | 0.668 |
| % of max HR at peak exercise, % (–) | 5 x 10–4 | –0.156 | Unexplained syncope (presence +) | 3 x 10–4 | 0.264 | % of max HR at peak exercise, % (+) | 1 x 10–7 | 0.546 |
| Septal myectomy (+) | 2.1 x 10–6 | 0.353 | LV global longitudinal peak systolic strain, % (+) | 3 x 10–2 | 0.235 | Presyncope (+) | 4.2 x 10–2 | 0.146 |
| Left atrial diameter, cm (+) | 1 x 10–11 | 0.316 | SBP before exercise test, mm Hg (–) | 1 x 10–3 | –0.232 | Late diastolic filling velocity (A), cm/s (+) | 6 x 10–3 | 0.136 |
| Diuretic treatment (+) | 4.6 x 10–6 | 0.251 | ECHO LVEF, % (–) | 1 x 10–2 | –0.198 | Family history of HCM (–) | 4.9 x 10–2 | –0.132 |
| NYHA functional class (+) | 3 x 10–3 | 0.205 | Family history of HCM (presence +) | 6 x 10–2 | 0.195 | LV end-diastolic volume, ml (–) | 2.3 x 10–2 | –0.111 |
| ABPR during exercise test in follow-up visit (presence +) | 5 x 10–2 | 0.156 | IVS/PW ratio (+) | 1 x 10–2 | 0.195 | LV end systolic volume, ml (–) | 2.7 x 10–2 | –0.096 |
| DBP at peak exercise, mm Hg (–) | 5 x 10–2 | –0.105 | DBP before exercise test, mm Hg (–) | 1 x 10–2 | –0.177 | Peak stress LVOT gradient, mm Hg (+) | 4.8 x 10–2 | 0.0714 |
| E/A (+) | 1 x 10–3 | 0.091 | Maximal IVS thickness, mm (+) | 3 x 10–3 | 0.125 | | | |
| | | | Peak rest LVOT gradient, mm Hg (–) | 4 x 10–2 | –0.119 | | | |
| | | | Body mass index, kg/m2 (–) | 3 x 10–2 | –0.115 | | | |
| | | | Family history of SCD (presence +) | 5 x 10–2 | 0.097 | | | |
| | | | Statin use (–) | 6 x 10–2 | –0.052 | | | |
The HCM-AF-Risk Model (current work), the HCM-VAr-Risk Model, and the HCM-HF-Risk Model were developed using similar methods and the same HCM patient dataset.
ABPR, abnormal blood pressure response; AF, atrial flutter or atrial fibrillation of any duration before 1st clinic visit and/or during follow up; bpm, beats per minute; DBP, diastolic blood pressure; E/A: ratio of early diastolic mitral flow velocity to the late diastolic mitral flow velocity; ECHO, echocardiogram; E/e': ratio of early diastolic mitral flow velocity to the early diastolic mitral septal annulus motion; HCM, hypertrophic cardiomyopathy; HF, heart failure (≥NYHA class III symptoms and/or HF hospitalization during follow-up); HR, heart rate; HRR, heart rate recovery; IVS, interventricular septum; IVS/PW, ratio of maximal thickness of interventricular septum and maximal thickness of posterior wall of left ventricle; LV, left ventricle; LVEF, LV ejection fraction; LV-LGE, late gadolinium enhancement in the LV myocardium by cardiac magnetic resonance imaging; LVOT, LV outflow tract; MET, metabolic equivalent; NIPS, non-invasive programmed stimulation; NSVT, non-sustained ventricular tachycardia; NYHA, New York Heart Association; SBP, systolic blood pressure; SCD, sudden cardiac death; VAr, sustained ventricular tachycardia (≥ 30 s) or ventricular fibrillation before 1st clinic visit and/or during follow-up; VF, ventricular fibrillation; VT, ventricular tachycardia.