Hisaki Makimoto, Moritz Höckmann, Tina Lin, David Glöckner, Shqipe Gerguri, Lukas Clasen, Jan Schmidt, Athena Assadi-Schmidt, Alexandru Bejinariu, Patrick Müller, Stephan Angendohr, Mehran Babady, Christoph Brinkmeyer, Asuka Makimoto, Malte Kelm.
Abstract
Artificial intelligence (AI) is developing rapidly in the medical technology field, particularly in image analysis. ECG diagnosis is a form of image analysis in the sense that cardiologists assess waveforms presented in a 2-dimensional image. We hypothesized that an AI using a convolutional neural network (CNN) may also recognize ECG images and patterns accurately. We used the PTB ECG database, consisting of 289 ECGs including 148 myocardial infarction (MI) cases, to develop a CNN to recognize MI in ECGs. Our CNN model, which has a 6-layer architecture, was trained on the training-set ECGs. Our CNN and 10 physicians were then tested on the test-set ECGs, and their MI recognition capability was compared using the F1 measure (harmonic mean of precision and recall) and accuracy. The F1 and accuracy of our CNN were significantly higher (83 ± 4%, 81 ± 4%) than those of the physicians (70 ± 7%, 67 ± 7%; P < 0.0001 for both). Furthermore, elimination of the Goldberger leads or ECG image compression down to quarter resolution did not significantly decrease the recognition capability. Deep learning with a simple CNN for image analysis may achieve a capability comparable to that of physicians in recognizing MI on ECG. Further investigation is warranted for the use of AI in ECG image assessment.
Year: 2020 PMID: 32439873 PMCID: PMC7242480 DOI: 10.1038/s41598-020-65105-x
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Concept and structures of neural networks. (a) A concept of artificial intelligence (AI). AI encloses machine learning and deep learning as methodologies. Deep learning includes neural networks as architectures. (b) The convolutional neural network (CNN) structure that was used to analyze the optimal number of ECG leads. Six convolutional layers with ReLU activation and max-pooling layers were followed by a linear output layer feeding into a sigmoid. (c) The CNN structure that was used to evaluate the effect of ECG image-quality reduction. The fourth max-pooling layer was omitted, as compared to Fig. 1b, because of the pixel number of the input ECGs.
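The caption links the omitted pooling layer to the pixel number of the input. As a rough illustration of that constraint, the sketch below tracks the feature-map size through six convolution blocks. The 3 × 3 unpadded convolutions, the 2 × 2 pooling, and the function name `trace_shapes` are assumptions made for this example; the caption does not state the kernel or pooling sizes, so the resulting numbers are not the paper's.

```python
def trace_shapes(width, height, n_conv=6, skip_pool=(), kernel=3, pool=2):
    """Return the (width, height) of the feature map after each conv block.

    Assumes unpadded ("valid") stride-1 convolutions and non-overlapping
    max-pooling. These settings are illustrative, not taken from the paper.
    """
    shapes = []
    for block in range(1, n_conv + 1):
        # A valid convolution trims (kernel - 1) pixels from each dimension.
        width, height = width - kernel + 1, height - kernel + 1
        # Max-pooling divides each dimension by the pool size (floor).
        if block not in skip_pool:
            width, height = width // pool, height // pool
        if width < 1 or height < 1:
            raise ValueError(f"feature map vanished at block {block}")
        shapes.append((width, height))
    return shapes


# Full-size input (744 x 368) survives six conv + pool blocks.
print(trace_shapes(744, 368)[-1])

# A smaller input (282 x 184) only survives if one pooling step is skipped.
print(trace_shapes(282, 184, skip_pool={4})[-1])
```

Under these assumed settings, a 282 × 184 input shrinks to nothing at the sixth block when every block pools, whereas skipping the fourth pooling step leaves a usable feature map. This is the kind of size constraint the caption alludes to.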
Diagnosis in the PTB Database.
| Group | Diagnosis | N = 289 |
|---|---|---|
| Myocardial infarction (MI) | | 148 |
| | Anterior MI | 72 |
| | Inferior MI | 74 |
| | Posterior MI | 15 |
| | Septal MI | 30 |
| | Lateral MI | 55 |
| Non-MI | Healthy control | 52 |
| | Cardiomyopathy | 15 |
| | Bundle branch block | 15 |
| | Dysrhythmia | 14 |
| | Hypertrophy | 7 |
| | Valvular heart disease | 6 |
| | Myocarditis | 4 |
| | Heart failure | 3 |
| | Angina pectoris | 3 |
| | Others | 22 |
Figure 2. ECG data and dataset preparation. (a) Target ECG regions for the present study. Region A (744 × 368) was extracted for our deep learning processes. Region B (744 × 368) was used for over-sampling in order to resolve the data imbalance (see text for details). (b) Flowchart of dataset preparation. For each of the test and validation sets, 25 ECGs were randomly selected. For data balancing, 10 or 17 ECGs were randomly selected in each group (over-sampling, see Fig. 2a) and added to construct the training sets. (c) ECGs with lead reduction. The 9-lead ECG consisted of leads I, II, III, and V1–6. The 7-lead ECG consisted of leads I, II, III, V1–2, and V5–6. (d) Based on the 9-lead ECG, we compressed the image quality to 4 different levels: full quality (744 × 368), half quality (398 × 260), quarter quality (282 × 184), and square shape (224 × 224).
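The dataset preparation in panel (b) can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes that the 25 test and 25 validation ECGs are drawn per class, that the over-sampled ECGs come from the remaining training ECGs, and that over-sampling means adding the region B crop of a selected ECG. The function name `build_splits` and the tuple layout are hypothetical.

```python
import random


def build_splits(mi_ids, non_mi_ids, n_holdout=25,
                 extra_mi=10, extra_non_mi=17, seed=0):
    """Split ECG IDs into train/validation/test and over-sample the training set.

    Each sample is a tuple (ecg_id, region, label), where region "A" is the
    main crop and region "B" is the extra crop used for over-sampling.
    Per-class hold-out and sampling from the training remainder are
    assumptions of this sketch.
    """
    rng = random.Random(seed)
    splits = {"train": [], "validation": [], "test": []}
    groups = ((1, mi_ids, extra_mi), (0, non_mi_ids, extra_non_mi))
    for label, ids, n_extra in groups:
        ids = list(ids)
        rng.shuffle(ids)
        test = ids[:n_holdout]
        validation = ids[n_holdout:2 * n_holdout]
        train = ids[2 * n_holdout:]
        splits["test"] += [(i, "A", label) for i in test]
        splits["validation"] += [(i, "A", label) for i in validation]
        splits["train"] += [(i, "A", label) for i in train]
        # Over-sampling: add the region B crop of randomly chosen training ECGs.
        splits["train"] += [(i, "B", label) for i in rng.sample(train, n_extra)]
    return splits


# Class sizes from the PTB diagnosis table: 148 MI and 141 non-MI ECGs.
splits = build_splits([f"mi_{k}" for k in range(148)],
                      [f"non_mi_{k}" for k in range(141)])
print({name: len(samples) for name, samples in splits.items()})
```

Under this reading the arithmetic balances: 148 − 50 + 10 = 108 MI samples and 141 − 50 + 17 = 108 non-MI samples in the training set, which would explain why 10 and 17 ECGs were added to the respective groups.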
Figure 3. Comparisons of recognition capabilities. Recognition capabilities are shown in 2 representative metrics (F1 measure = harmonic mean of sensitivity and positive predictive value; accuracy = degree to which the result of the CNN prediction conforms to the correct classification). (a) The comparison between the convolutional neural network (CNN) and human cardiologists is shown. The recognition capability of the 6-layer CNN was significantly higher than that of human cardiologists (F1 measure 0.788 ± 0.056 vs 0.699 ± 0.068, P < 0.0001; accuracy 0.788 ± 0.052 vs 0.67 ± 0.067, P < 0.0001). (b) The impact of ECG lead reduction. The 6-layer CNN with limb-lead ECGs showed significantly lower F1 measure as compared to that with 12-lead and 9-lead ECGs (0.721 ± 0.062 vs 0.823 ± 0.043, P = 0.0022; 0.721 ± 0.062 vs 0.814 ± 0.051, P = 0.0058, respectively). The ECG lead reduction did not significantly affect accuracy. (c) The impact of ECG image quality. The recognition capability of the 6-layer CNN showed no significant differences with reduced ECG image quality.
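The metrics named in the caption and tabulated in this section all derive from the four cells of a binary confusion matrix. The sketch below shows the standard definitions. The function name `binary_metrics` and the example counts are hypothetical and are not results from the study.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from confusion counts.

    tp/fn: MI cases classified as MI / non-MI.
    tn/fp: non-MI cases classified as non-MI / MI.
    No guard against empty denominators is included in this sketch.
    """
    sensitivity = tp / (tp + fn)   # recall
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)           # positive predictive value (precision)
    npv = tn / (tn + fn)           # negative predictive value
    # F1 is the harmonic mean of sensitivity and PPV.
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
        "accuracy": accuracy,
    }


# Hypothetical counts for a 50-ECG test set (25 MI, 25 non-MI).
for name, value in binary_metrics(tp=21, fp=6, tn=19, fn=4).items():
    print(f"{name}: {value:.3f}")
```

Because F1 ignores true negatives while accuracy counts them, the two can diverge when sensitivity and specificity trade off against each other, as in the limb-lead column of the lead-reduction table.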
Metrics of recognition capability according to number of leads.
| Metric | 12 lead | 9 lead | 7 lead | Limb lead | Precordial lead | P value |
|---|---|---|---|---|---|---|
| Sensitivity | 0.86 ± 0.07 | 0.88 ± 0.07 | 0.75 ± 0.10 | 0.65 ± 0.09 | 0.81 ± 0.09 | <0.0001 |
| Specificity | 0.76 ± 0.07 | 0.71 ± 0.10 | 0.81 ± 0.10 | 0.86 ± 0.07 | 0.74 ± 0.09 | 0.0047 |
| PPV | 0.79 ± 0.05 | 0.76 ± 0.07 | 0.80 ± 0.08 | 0.82 ± 0.06 | 0.76 ± 0.06 | 0.13 |
| NPV | 0.85 ± 0.07 | 0.87 ± 0.08 | 0.77 ± 0.07 | 0.71 ± 0.05 | 0.80 ± 0.06 | <0.0001 |
| F1 | 0.82 ± 0.04 | 0.81 ± 0.05 | 0.77 ± 0.07 | 0.72 ± 0.06 | 0.78 ± 0.05 | 0.0021 |
| Accuracy | 0.81 ± 0.04 | 0.80 ± 0.06 | 0.78 ± 0.07 | 0.75 ± 0.05 | 0.77 ± 0.04 | 0.10 |
| Loss | 1.56 ± 0.31 | 1.50 ± 0.19 | 1.41 ± 0.23 | 1.44 ± 0.18 | 1.41 ± 0.20 | 0.55 |
| AUC | 0.88 ± 0.05 | 0.88 ± 0.04 | 0.86 ± 0.05 | 0.85 ± 0.05 | 0.87 ± 0.04 | 0.45 |
| Accuracy (training) | 0.89 ± 0.06 | 0.91 ± 0.05 | 0.88 ± 0.06 | 0.88 ± 0.04 | 0.88 ± 0.03 | 0.39 |
| AUC (training) | 0.97 ± 0.03 | 0.98 ± 0.02 | 0.96 ± 0.03 | 0.96 ± 0.03 | 0.97 ± 0.01 | 0.29 |
| Accuracy (validation) | 0.85 ± 0.06 | 0.85 ± 0.06 | 0.82 ± 0.05 | 0.82 ± 0.03 | 0.82 ± 0.03 | 0.40 |
| AUC (validation) | 0.93 ± 0.05 | 0.93 ± 0.05 | 0.91 ± 0.06 | 0.92 ± 0.05 | 0.92 ± 0.05 | 0.89 |
Metrics of recognition capability according to image quality.
| Metric | Full | Half | Quarter | Square | P value |
|---|---|---|---|---|---|
| Sensitivity | 0.88 ± 0.07 | 0.84 ± 0.10 | 0.77 ± 0.10 | 0.72 ± 0.10 | 0.0021 |
| Specificity | 0.71 ± 0.10 | 0.71 ± 0.12 | 0.82 ± 0.10 | 0.82 ± 0.09 | 0.022 |
| PPV | 0.76 ± 0.07 | 0.75 ± 0.08 | 0.82 ± 0.09 | 0.81 ± 0.09 | 0.19 |
| NPV | 0.87 ± 0.08 | 0.82 ± 0.09 | 0.79 ± 0.07 | 0.75 ± 0.07 | 0.016 |
| F1 | 0.81 ± 0.05 | 0.78 ± 0.05 | 0.79 ± 0.06 | 0.76 ± 0.07 | 0.17 |
| Accuracy | 0.80 ± 0.06 | 0.77 ± 0.05 | 0.79 ± 0.05 | 0.77 ± 0.06 | 0.59 |
| Loss | 1.50 ± 0.19 | 1.77 ± 0.28 | 1.48 ± 0.18 | 1.46 ± 0.22 | 0.0082 |
| AUC | 0.88 ± 0.04 | 0.87 ± 0.05 | 0.87 ± 0.05 | 0.85 ± 0.05 | 0.48 |
| Accuracy (training) | 0.91 ± 0.04 | 0.92 ± 0.04 | 0.86 ± 0.05 | 0.83 ± 0.06 | 0.0003 |
| AUC (training) | 0.98 ± 0.02 | 0.99 ± 0.02 | 0.95 ± 0.04 | 0.93 ± 0.05 | 0.0004 |
| Accuracy (validation) | 0.85 ± 0.06 | 0.81 ± 0.07 | 0.80 ± 0.07 | 0.79 ± 0.07 | 0.17 |
| AUC (validation) | 0.93 ± 0.05 | 0.92 ± 0.04 | 0.87 ± 0.07 | 0.85 ± 0.08 | 0.016 |
Figure 4. Heatmaps of last convolution layer activations. The heatmaps of the last convolution layer activity show which regions of the ECG the convolutional neural network (CNN) relied on to distinguish MI from non-MI. The CNN focused on the red-colored zones. (a) Three cases with MI, in which both the CNN and cardiologists achieved accurate classification. The CNN focused on the elevated ST-T segments. (b) Two cases without MI, in which both the CNN and cardiologists made the correct classification. The CNN distinguished these ECGs as non-MI based mainly on the QRS complex in the precordial leads. (c) Four cases with MI, in which only the CNN achieved accurate classification. The CNN seemed to focus on the ST-T segments, mainly in the precordial leads and partially in the limb leads. These MI cases did not show typical ST-T segment changes (elevation/depression) or large Q waves, which made recognition of MI based on ECG alone challenging.