Dongdong Zhang, Samuel Yang, Xiaohui Yuan, Ping Zhang.
Abstract
The electrocardiogram (ECG) is a widely used, reliable, non-invasive approach for cardiovascular disease diagnosis. With the rapid growth of ECG examinations and the shortage of cardiologists, accurate and automatic diagnosis of ECG signals has become a hot research topic. In this paper, we developed a deep neural network for automatic classification of cardiac arrhythmias from 12-lead ECG recordings. Experiments on a public 12-lead ECG dataset showed the effectiveness of our method. The proposed model achieved an average F1 score of 0.813. The deep model outperformed 4 machine learning methods trained on extracted expert features. In addition, deep models trained on single-lead ECGs performed worse than the model using all 12 leads simultaneously. Among the 12 leads, the best-performing single leads were I, aVR, and V5. Finally, we employed the SHapley Additive exPlanations (SHAP) method to interpret the model's behavior at both the patient level and the population level.
Keywords: Artificial Intelligence; Clinical Finding; Medicine
Year: 2021 PMID: 33981967 PMCID: PMC8082080 DOI: 10.1016/j.isci.2021.102373
Source DB: PubMed Journal: iScience ISSN: 2589-0042
Figure 1. Comparison of existing models for automatic diagnosis of ECG abnormalities
(A) Two-stage traditional methods using feature engineering; (B) end-to-end deep learning methods.
Figure 2. Deep neural network architecture for cardiac arrhythmia diagnosis
Our deep neural network accepts raw ECG inputs (12 leads, duration of 30 s, sampling rate of 500 Hz), utilizes 1D CNNs to extract deep features, and outputs the prediction results for 9 diagnostic classes.
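The input size stated in the caption implies a fixed-length signal per lead: 30 s at 500 Hz gives 15,000 samples, so each recording is a 12 × 15,000 array. The sketch below shows one way to bring recordings of varying length to that shape. The helper name `fix_length` and the choice to keep the most recent samples and zero-pad at the front are assumptions made for illustration; the caption specifies only the final input size, not the cropping or padding strategy.

```python
N_LEADS = 12
DURATION_S = 30
SAMPLING_RATE_HZ = 500
N_SAMPLES = DURATION_S * SAMPLING_RATE_HZ  # 15000 samples per lead


def fix_length(recording, n_samples=N_SAMPLES):
    """Crop or zero-pad every lead so it holds exactly n_samples values.

    Assumption (not from the source): keep the last n_samples values of a
    long recording and pad a short recording with zeros at the front.
    """
    fixed = []
    for lead in recording:
        lead = list(lead)[-n_samples:]              # keep the most recent samples
        padding = [0.0] * (n_samples - len(lead))   # empty if already long enough
        fixed.append(padding + lead)
    return fixed


# Toy recordings: constant signals of 10 s and 40 s on all 12 leads.
short = [[0.1] * 5000 for _ in range(N_LEADS)]
long = [[0.1] * 20000 for _ in range(N_LEADS)]

print(len(fix_length(short)), len(fix_length(short)[0]))  # 12 15000
print(len(fix_length(long)), len(fix_length(long)[0]))    # 12 15000
```

A network with 1D convolutions would then consume this array with the 12 leads as input channels and emit one score for each of the 9 diagnostic classes.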
Twelve-lead model performance averaged on 10-fold tests
| CA type | Precision | Recall | F1 | AUC | Accuracy |
|---|---|---|---|---|---|
| SNR | 0.814 | 0.800 | 0.805 | 0.974 | 0.948 |
| AF | 0.920 | 0.918 | 0.919 | 0.988 | 0.971 |
| IAVB | 0.868 | 0.865 | 0.864 | 0.987 | 0.974 |
| LBBB | 0.844 | 0.894 | 0.866 | 0.980 | 0.991 |
| RBBB | 0.911 | 0.942 | 0.926 | 0.987 | 0.959 |
| PAC | 0.756 | 0.720 | 0.735 | 0.949 | 0.952 |
| PVC | 0.869 | 0.839 | 0.851 | 0.976 | 0.971 |
| STD | 0.808 | 0.826 | 0.814 | 0.971 | 0.953 |
| STE | 0.603 | 0.504 | 0.535 | 0.923 | 0.974 |
| AVG | 0.821 | 0.812 | 0.813 | 0.970 | 0.966 |
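As a reading aid for the table above, the following sketch computes precision, recall, and F1 for a single class, assuming each class is scored as a binary one-vs-rest problem (which the per-class accuracy column suggests, though the page does not state it). It also checks that the AVG F1 entry equals the unweighted mean of the nine per-class F1 scores. The function name `precision_recall_f1` and the toy labels are hypothetical.

```python
def precision_recall_f1(y_true, y_pred):
    """Binary precision, recall, and F1 for one class (labels are 0 or 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy one-vs-rest labels for a single class (hypothetical data).
print(precision_recall_f1([1, 1, 0, 0], [1, 0, 1, 0]))  # (0.5, 0.5, 0.5)

# Per-class F1 scores copied from the table above.
f1_by_class = {
    "SNR": 0.805, "AF": 0.919, "IAVB": 0.864, "LBBB": 0.866, "RBBB": 0.926,
    "PAC": 0.735, "PVC": 0.851, "STD": 0.814, "STE": 0.535,
}
macro_f1 = sum(f1_by_class.values()) / len(f1_by_class)
print(round(macro_f1, 3))  # 0.813
```

Because the reported values are averaged over 10 folds, the F1 in a given row need not equal the harmonic mean of that row's averaged precision and recall exactly.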
Figure 3. F1 score comparison of machine learning models and end-to-end deep learning models
Comparison of single-lead model performance measured by F1 score
| CA type | I | II | III | aVR | aVL | aVF | V1 | V2 | V3 | V4 | V5 | V6 | All |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SNR | 0.705 | 0.682 | 0.602 | 0.712 | 0.604 | 0.663 | 0.657 | 0.694 | 0.710 | 0.717 | 0.731 | 0.721 | 0.805 |
| AF | 0.914 | 0.927 | 0.911 | 0.913 | 0.908 | 0.924 | 0.913 | 0.915 | 0.922 | 0.910 | 0.905 | 0.919 | |
| IAVB | 0.843 | 0.853 | 0.818 | 0.842 | 0.808 | 0.830 | 0.860 | 0.816 | 0.842 | 0.840 | 0.864 | ||
| LBBB | 0.778 | 0.783 | 0.825 | 0.802 | 0.737 | 0.860 | 0.860 | 0.804 | 0.759 | 0.813 | 0.789 | 0.866 | |
| RBBB | 0.859 | 0.802 | 0.804 | 0.845 | 0.815 | 0.796 | 0.886 | 0.852 | 0.828 | 0.827 | 0.840 | 0.926 | |
| PAC | 0.723 | 0.709 | 0.688 | 0.698 | 0.719 | 0.730 | 0.689 | 0.692 | 0.680 | 0.715 | 0.702 | 0.735 | |
| PVC | 0.813 | 0.821 | 0.846 | 0.818 | 0.792 | 0.836 | 0.788 | 0.842 | 0.835 | 0.838 | 0.818 | 0.809 | 0.851 |
| STD | 0.695 | 0.790 | 0.627 | 0.793 | 0.573 | 0.711 | 0.615 | 0.652 | 0.702 | 0.753 | 0.781 | 0.757 | 0.814 |
| STE | 0.433 | 0.406 | 0.312 | 0.435 | 0.251 | 0.338 | 0.293 | 0.417 | 0.477 | 0.485 | 0.497 | 0.535 | |
| AVG | 0.765 | 0.755 | 0.712 | 0.765 | 0.695 | 0.726 | 0.741 | 0.758 | 0.762 | 0.763 | 0.769 | 0.762 | 0.813 |
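The abstract's statement that leads I, aVR, and V5 perform best can be checked against the AVG row. The sketch below ranks the twelve single-lead average F1 scores as they appear in that row and compares the best one with the 12-lead average F1 of 0.813 reported in the abstract. The variable names are illustrative only.

```python
# Average single-lead F1 scores, taken from the AVG row in lead order.
avg_f1 = {
    "I": 0.765, "II": 0.755, "III": 0.712, "aVR": 0.765, "aVL": 0.695,
    "aVF": 0.726, "V1": 0.741, "V2": 0.758, "V3": 0.762, "V4": 0.763,
    "V5": 0.769, "V6": 0.762,
}
TWELVE_LEAD_F1 = 0.813  # average F1 of the 12-lead model (from the abstract)

# Sort leads by score, best first; ties keep their original lead order.
ranked = sorted(avg_f1, key=avg_f1.get, reverse=True)
top3 = ranked[:3]
print(top3)  # ['V5', 'I', 'aVR']

# Gap between the 12-lead model and the best single-lead model.
gap = round(TWELVE_LEAD_F1 - avg_f1[top3[0]], 3)
print(gap)  # 0.044
```

Even the best single lead (V5) trails the 12-lead model by about 0.044 in average F1, consistent with the claim that single-lead models perform worse than the model using all 12 leads.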
Figure 4. Interpretability of the deep learning model at both the patient level and population level using SHAP values
Figure 5. Explanation of the model's prediction results for several ECG instances from different patients
The features with high contribution (i.e., high SHAP values) are highlighted in orange. Only the last 10 s of the top 2 most influential leads are displayed due to limited space.
Figure 6. Population-level interpretation by calculating the contribution rate of ECG leads toward diagnostic classes in the 12-lead deep model
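The caption does not define how the contribution rate is computed. One plausible reading, shown below purely as an assumption, is to sum the absolute SHAP values of each lead over time and over all patients for a given diagnostic class, then normalize so the lead shares sum to 1. The function name `lead_contribution_rates`, the nested-list layout, and the toy values are all hypothetical.

```python
def lead_contribution_rates(shap_values):
    """Share of total absolute SHAP attribution carried by each lead.

    shap_values[patient][lead][t] holds the SHAP value of one time step
    for a single diagnostic class. This layout and the normalization are
    assumptions for illustration, not the method described in the source.
    """
    n_leads = len(shap_values[0])
    totals = [0.0] * n_leads
    for patient in shap_values:
        for i, lead in enumerate(patient):
            totals[i] += sum(abs(v) for v in lead)  # sign-independent magnitude
    grand_total = sum(totals)
    return [t / grand_total if grand_total else 0.0 for t in totals]


# Toy SHAP values: 2 patients, 3 leads, 2 time steps each.
toy = [
    [[1.0, -1.0], [0.0, 0.0], [2.0, 0.0]],
    [[0.0, 0.0], [0.0, 0.0], [-4.0, 0.0]],
]
print(lead_contribution_rates(toy))  # [0.25, 0.0, 0.75]
```

Repeating this for each of the 9 diagnostic classes would yield a lead-by-class matrix of shares, which is the kind of summary a population-level figure could display.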