Zeineb Bouzid, Ziad Faramand, Richard E Gregg, Stephanie O Frisch, Christian Martin-Gill, Samir Saba, Clifton Callaway, Ervin Sejdić, Salah Al-Zaiti.
Abstract
Background: Classical ST-T waveform changes on the standard 12-lead ECG have limited sensitivity in detecting acute coronary syndrome (ACS) in the emergency department. Numerous novel ECG features have been proposed to augment clinicians' decision-making during patient evaluation, yet their clinical utility remains unclear.

Methods and Results: This was an observational study of consecutive patients evaluated for suspected ACS (Cohort 1 n=745, age 59±17, 42% female, 15% ACS; Cohort 2 n=499, age 59±16, 49% female, 18% ACS). Out of 554 temporal-spatial ECG waveform features, we used domain knowledge to select a subset of 65 physiology-driven features that are mechanistically linked to myocardial ischemia and compared their performance with that of a subset of 229 data-driven features selected by multiple machine learning algorithms. We then used random forest to select a final subset of the 73 most important ECG features that had both a data-driven and a physiology-driven basis for ACS prediction and compared their performance with that of clinical experts. On the testing set, a regularized logistic regression classifier based on the 73 hybrid features yielded a stable model that outperformed clinical experts in predicting ACS, with 10% to 29% of cases reclassified correctly. Metrics of nondipolar electrical dispersion (ie, circumferential ischemia), ventricular activation time (ie, transmural conduction delays), QRS and T axes and angles (ie, global remodeling), and principal component analysis ratio of ECG waveforms (ie, regional heterogeneity) played an important role in the improved reclassification performance.

Conclusions: We identified a subset of novel ECG features predictive of ACS with a fully interpretable model highly adaptable to clinical decision support applications.

Registration: URL: https://www.clinicaltrials.gov; Unique Identifier: NCT04237688.
Keywords: ECG; acute coronary syndrome; dimensionality reduction; ischemia; machine learning
Year: 2021 PMID: 33459029 PMCID: PMC7955430 DOI: 10.1161/JAHA.120.017871
Source DB: PubMed Journal: J Am Heart Assoc ISSN: 2047-9980 Impact factor: 5.501
Figure 1. Computation of ECG features.
This figure shows the computation of 554 features from each 12‐lead ECG. A, Duration, amplitude, and area of various waveform deflections are calculated from the median beat of each of the 12 leads. B, The 12 median beats are superimposed, and global intervals and subintervals are computed. C, Principal component analysis (PCA) on time‐voltage data is performed on the orthogonal leads I, II, V1–V6 to compute PCA ratios of the eigenvalues of various ECG waveforms. D, Axes, angles, loops, and gradients of QRS and T vectors from xy, xz, yz, and xyz planes are computed. aVL indicates augmented vector left; and aVR, augmented vector right.
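As a rough illustration of panel C, the sketch below computes a PCA ratio from time-voltage samples. It assumes the ratio is defined as the second eigenvalue of the lead covariance matrix divided by the first, which is a common convention but is not spelled out in this record. The function names (`covariance_matrix`, `top_eigenpair`, `pca_ratio`) and the two-lead toy input are hypothetical; the study itself uses leads I, II, and V1–V6.

```python
import math

def covariance_matrix(leads):
    """Population covariance between leads (one list of samples per lead)."""
    n = len(leads[0])
    centered = []
    for lead in leads:
        mean = sum(lead) / n
        centered.append([v - mean for v in lead])
    return [[sum(a * b for a, b in zip(ci, cj)) / n for cj in centered]
            for ci in centered]

def mat_vec(matrix, vec):
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

def top_eigenpair(matrix, iterations=500):
    """Power iteration for the dominant eigenpair of a symmetric PSD matrix.

    Assumes the start vector is not orthogonal to the dominant eigenvector.
    """
    size = len(matrix)
    vec = [float(i + 1) for i in range(size)]
    norm = math.sqrt(sum(x * x for x in vec))
    vec = [x / norm for x in vec]
    for _ in range(iterations):
        nxt = mat_vec(matrix, vec)
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            return 0.0, vec  # zero matrix: no variance left
        vec = [x / norm for x in nxt]
    value = sum(v * w for v, w in zip(vec, mat_vec(matrix, vec)))
    return value, vec

def pca_ratio(leads):
    """Assumed definition: second eigenvalue divided by the first."""
    cov = covariance_matrix(leads)
    lam1, v1 = top_eigenpair(cov)
    if lam1 == 0.0:
        raise ValueError("signals have no variance")
    size = len(cov)
    # Deflate the first component, then find the next dominant eigenvalue.
    deflated = [[cov[i][j] - lam1 * v1[i] * v1[j] for j in range(size)]
                for i in range(size)]
    lam2, _ = top_eigenpair(deflated)
    return lam2 / lam1

# Toy input: two uncorrelated "leads" with variances 4 and 1.
ratio = pca_ratio([[2, -2, 2, -2], [1, 1, -1, -1]])
```

Under this assumed definition, a ratio near 0 means one component explains almost all of the waveform variance, and larger values indicate a more complex, less dipolar waveform.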
Figure 2. Flow diagram of the feature selection steps used in this study.
This diagram summarizes the steps used to select features using domain knowledge, data‐driven algorithms, and the hybrid combination of both approaches. AUROC indicates area under the receiver operating characteristic curve; K, number of features in each step; LASSO, least absolute shrinkage and selection operator; and RFE, recursive features elimination.
Baseline Study Characteristics
| | Cohort 1 (N=745) (Training Set) | Cohort 2 (N=499) (Testing Set) |
|---|---|---|
| Demographics | ||
| Age in y | 59±17 | 59±16 |
| Sex (female) | 317 (42%) | 243 (49%) |
| Race (Black) | 301 (40%) | 202 (40%) |
| Past medical history | ||
| Hypertension | 519 (69%) | 329 (66%) |
| Diabetes mellitus | 196 (26%) | 132 (26%) |
| Old myocardial infarction | 205 (27%) | 122 (24%) |
| Known coronary artery disease | 248 (33%) | 179 (36%) |
| Known heart failure | 130 (17%) | 74 (15%) |
| Prior PCI/CABG | 207 (28%) | 124 (25%) |
| Clinical presentation | ||
| Chest pain | 665 (89%) | 454 (91%) |
| Shortness of breath | 250 (34%) | 234 (47%) |
| Normal sinus rhythm | 648 (87%) | 442 (88%) |
| Atrial fibrillation | 71 (9%) | 46 (9%) |
| Course of hospitalization | ||
| Length of stay, median [interquartile range] | 2.3 [1.0–3.0] | 1.2 [0.6–2.5] |
| Confirmed ACS (all events) | 114 (15.3%) | 92 (18.4%) |
| Non−ST‐segment elevation‐ACS | 83 (11.1%) | 74 (14.8%) |
| Treated by primary PCI/CABG | 74 (10%) | 65 (13%) |
| 30‐d cardiovascular death | 33 (4.4%) | 24 (4.8%) |
ACS indicates acute coronary syndrome; CABG, coronary artery bypass graft; and PCI, primary percutaneous coronary intervention.
Figure 3. Classification performance using LR and ANN classifiers.
These plots show the performance of (A) logistic regression (LR) and (B) artificial neural network (ANN) classifiers on training data (Cohort 1) and testing data (Cohort 2) using all available ECG features (k=554), the data‐driven subset of ECG features (k=229), or the physiology‐driven subset of ECG features (k=65). P values are based on the nonparametric method of DeLong. AUC indicates area under the curve.
Overlap in Features Between Data‐Driven and Human‐Expert Techniques
| 12‐Lead ECG Component | Features Selected by Human Expert | Features Selected by Data‐Driven Techniques | Overlap in Features | Features Overlooked by Clinicians |
|---|---|---|---|---|
| ECG normalization (k=2) | 2 | 2 | Age and sex | … |
| P duration, amplitude, or area (k=72) | 0 | 25 | … | Lead‐specific P duration and amplitude |
| PR interval metrics (k=26) | 1 | 11 | Global PR interval | Lead‐specific PR interval |
| Q duration or amplitude (k=24) | 0 | 10 | … | Lead‐specific Q wave presence |
| R duration or amplitude (k=48) | 0 | 23 | … | Lead‐specific R amplitude |
| S duration or amplitude (k=48) | 0 | 16 | … | S amplitude in precordial leads |
| Other QRS complex metrics (k=74) | 1 | 31 | Global QRS duration | QRS notch; ventricular activation time; lead‐specific QRS duration or area |
| Selvester Score (k=19) | 1 | 0 | Total scar size | … |
| ST amplitude, duration, or slope (k=72) | 12 | 31 | Lead‐specific ST amplitude | Lead‐specific ST duration and slope |
| ST deviation morphology (k=14) | 0 | 7 | … | Presence of concaved ST deviation |
| T duration, amplitude, or area (k=76) | 14 | 33 | Lead‐specific T amplitude, T‐to‐R relative amplitude | Lead‐specific T duration or area; presence of notched T wave |
| QT interval and subintervals (k=23) | 4 | 12 | Global QTc, Tpeak−Tend | Lead‐specific QT interval |
| QRS axis (k=12) | 1 | 7 | Frontal plane QRS axis | Horizontal and spatial QRS axis |
| T axis (k=11) | 4 | 6 | T axis in frontal, horizontal, and spatial planes | … |
| QRS and T vector angles (k=5) | 2 | 3 | QRS‐T angle and TCRT | … |
| T loop morphology (k=6) | 4 | 4 | T asymmetry and dispersion | … |
| Principal components analysis (k=16) | 16 | 6 | Principal component analysis ratio of J, T, and ST‐T | … |
| Noise signal (k=8) | 3 | 2 | Noise and baseline wander | … |
Figure 4. Classification performance using different subsets of novel ECG features.
These plots show the performance of logistic regression (LR) classifiers on testing data (Cohort 2) for predicting (A) acute coronary syndrome (ACS) and (B) non–ST‐segment elevation‐acute coronary syndrome (NSTE‐ACS) using data‐driven subset of ECG features (k=229), physiology‐driven subset of ECG features (k=65), or hybrid subset with novel features (k=73). AUC indicates area under the curve.
Diagnostic Accuracy Measures of Machine Learning Classifiers Against Gold Standard Reference on the Testing Set (n=499)
| | Clinical Experts Interpretation | Commercial Software Read | Novel ECG Features (LR73) |
|---|---|---|---|
| Predicting Any ACS Event | |||
| Sensitivity | 0.40 (0.30–0.51) | 0.25 (0.17–0.35) | 0.72 (0.61–0.81) |
| Specificity | 0.94 (0.92–0.96) | 0.98 (0.96–0.99) | 0.73 (0.68–0.77) |
| Positive predictive value | 0.63 (0.51–0.73) | 0.79 (0.62–0.90) | 0.38 (0.33–0.42) |
| Negative predictive value | 0.88 (0.86–0.89) | 0.85 (0.83–0.87) | 0.92 (0.89–0.94) |
| NRI index | Reference | … | 0.10 (−0.02–0.23) |
| | … | Reference | 0.21 (0.10–0.32) |
| Predicting Non−ST‐segment elevation‐ACS event | |||
| Sensitivity | 0.26 (0.16–0.37) | 0.12 (0.06–0.22) | 0.72 (0.60–0.82) |
| Specificity | 0.94 (0.92–0.97) | 0.98 (0.96–0.99) | 0.68 (0.63–0.72) |
| Positive predictive value | 0.46 (0.33–0.60) | 0.60 (0.35–0.80) | 0.29 (0.25–0.33) |
| Negative predictive value | 0.87 (0.85–0.89) | 0.85 (0.84–0.87) | 0.93 (0.90–0.95) |
| NRI index | Reference | … | 0.19 (0.04–0.33) |
| | … | Reference | 0.29 (0.15–0.42) |
ACS indicates acute coronary syndrome; NRI, net reclassification improvement index; and LR73, logistic regression model based on the 73 hybrid features.
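The measures in this table are arithmetically linked, and the sketch below shows one way to check them for the "any ACS event" rows. It derives the predictive values from the rounded sensitivity, specificity, and testing-set prevalence (92 of 499) using Bayes' rule, and computes NRI with the standard two-category simplification (change in sensitivity plus change in specificity). Whether the authors computed NRI exactly this way is an assumption, and the function names are hypothetical. Because the inputs are rounded to two decimals, the results land within about 0.01 of the tabulated values rather than matching them exactly.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

def binary_nri(sens_new, spec_new, sens_ref, spec_ref):
    """Two-category NRI: net gain among events plus net gain among nonevents."""
    return (sens_new - sens_ref) + (spec_new - spec_ref)

# Rounded values taken from the "Predicting Any ACS Event" rows.
prevalence = 92 / 499                      # confirmed ACS in the testing set
ppv, npv = predictive_values(0.72, 0.73, prevalence)       # LR73
nri_vs_experts = binary_nri(0.72, 0.73, 0.40, 0.94)        # reference: clinical experts
nri_vs_software = binary_nri(0.72, 0.73, 0.25, 0.98)       # reference: commercial software
```

The same check illustrates the trade-off visible in the table: the LR73 model gains far more in sensitivity than it loses in specificity, which is what produces a positive NRI despite the lower positive predictive value.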
Figure 5. Importance rank of subset of novel ECG features for the task of NSTE‐ACS detection.
This plot shows the feature importance ranking obtained using a random forest model on a hybrid data set including novel ECG features with prehospital ECG data after excluding patients with STEMI. AUC indicates area under the curve; NDPV, nondipolar voltage; NSTE‐ACS, non–ST‐segment elevation‐acute coronary syndrome; PCA, principal component analysis; RMS, root mean square; and STEMI, ST‐segment–elevation myocardial infarction.