Juan Carlos Carrillo-Alarcón, Luis Alberto Morales-Rosales, Héctor Rodríguez-Rángel, Mariana Lobato-Báez, Antonio Muñoz, Ignacio Algredo-Badillo.
Abstract
The electrocardiogram records the heart's electrical activity and generates a significant amount of data. Analyzing these data helps to detect diseases and disorders by classifying abnormalities in the heart's bio-signals. In unbalanced-data contexts, where the classes are not equally represented, optimizing and configuring classification models is highly complex, which is reflected in the use of computational resources. Moreover, the performance of electrocardiogram classification depends on the approach and on the parameter estimation used to generate a model with high accuracy, sensitivity, and precision. Previous works have proposed hybrid approaches, but only a few implemented parameter optimization; instead, they generally tuned parameters empirically at the data level or the algorithm level. Hence, a scheme that includes sensitivity metrics on a higher precision and accuracy scale deserves special attention. In this article, a metaheuristic optimization approach for parameter estimation in arrhythmia classification from unbalanced data is presented. We selected an unbalanced subset of the MIT-BIH Arrhythmia Database to classify eight types of arrhythmia. To tackle the unbalanced class problem, we combined clustering-based undersampling (data level) with a feature selection method (algorithm level). To explore parameter estimation and improve the classification of our model, we compared two metaheuristic approaches based on differential evolution and particle swarm optimization. The final results showed an accuracy of 99.95%, an F1 score of 99.88%, a sensitivity of 99.87%, a precision of 99.89%, and a specificity of 99.99%, which are high even in the presence of unbalanced data.
Keywords: arrhythmia; electrocardiogram (ECG); machine learning; signal processing; unbalanced
Year: 2020 PMID: 32498271 PMCID: PMC7308921 DOI: 10.3390/s20113139
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. The ECG waves, segments, and intervals [4].
Classes selected from the MIT-BIH Arrhythmia Database.
| Class | Name | Number of Samples |
|---|---|---|
| N | Normal beat | 74,758 |
| L | Left bundle branch block beat | 8072 |
| R | Right bundle branch block beat | 7255 |
| V | Premature ventricular contraction beat | 7123 |
| ! | Ventricular flutter wave beat | 472 |
| j | Nodal (junctional) escape beat | 229 |
| a | Aberrated atrial premature beat | 150 |
| E | Ventricular escape beat | 106 |
Imbalance ratio (IR) values between majority and minority classes.
| Minority Class | Majority Class N | Majority Class L | Majority Class R | Majority Class V |
|---|---|---|---|---|
| ! | 158.4 | 17.1 | 15.4 | 15.1 |
| j | 326.5 | 35.2 | 31.7 | 31.1 |
| a | 498.4 | 53.8 | 48.4 | 47.5 |
| E | 705.3 | 76.2 | 68.4 | 67.2 |
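The IR values can be reproduced from the class counts listed for the MIT-BIH subset. The following is a minimal sketch, assuming IR is the number of majority-class samples divided by the number of minority-class samples (an assumption that agrees with the tabulated values); the function and variable names are illustrative and do not come from the article.

```python
# Sample counts copied from the class table of the selected MIT-BIH subset.
majority = {"N": 74758, "L": 8072, "R": 7255, "V": 7123}
minority = {"!": 472, "j": 229, "a": 150, "E": 106}


def imbalance_ratios(majority, minority):
    """Return {minority_class: {majority_class: IR}} rounded to one decimal.

    Assumption: IR = majority count / minority count.
    """
    return {
        minor: {
            major: round(n_major / n_minor, 1)
            for major, n_major in majority.items()
        }
        for minor, n_minor in minority.items()
    }


ir = imbalance_ratios(majority, minority)
for minor, row in ir.items():
    print(minor, row)
```

The same function can be applied to the class counts of any undersampled dataset to check how far the imbalance has been reduced.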
Figure 2. Metaheuristic approach diagram.
Figure 3. Diagram of preprocessing.
Figure 4. (A) Signal before filtering. (B) Signal after the combination of high-pass and low-pass filtering. (C) Signal after the wavelet filtering process.
Figure 5. (A) Signal before normalization. (B) Signal after normalization.
Figure 6. Heartbeat segmentation example.
Figure 7. Self-organizing map structure [42].
Figure 8. Clustering processes of majority classes.
Comparison test to select the number of subsegments. Abbreviations: HL: combination of high-pass and low-pass filtering; DE: differential evolution.
| | DE-HL 50 Features: Accuracy | DE-HL 50 Features: F1 Score | DE-HL 25 Features: Accuracy | DE-HL 25 Features: F1 Score |
|---|---|---|---|---|
| Mean | 99.92 | 99.77 | 99.92 | 99.76 |
| SD | 0.032 | 0.073 | 0.047 | 0.094 |
Figure 9. Feature extraction from segments.
Representation vector. Abbreviations: SOM: self-organizing map.
| 1 | 2 | 3 | 4 | 5 | 6 | 7 ... 56 |
|---|---|---|---|---|---|---|
| SOM map size | Class L percentage | Class N percentage | Class V percentage | Class R percentage | Number of neurons | Features to select |
Experimental configuration. Abbreviations: HL: combination of high-pass and low-pass filtering; PSO: particle swarm optimization (parameters: population n, maximum and minimum inertia weights, acceleration constants 1 and 2); DE: differential evolution (parameters: crossover rate Cr, mutation factor F).
| Experiment 1 | Experiment 2 | Experiment 3 | Experiment 4 |
|---|---|---|---|
| HL Filtering | HL Filtering | Wavelet Filtering | Wavelet Filtering |
| Map Size: 2–10 | Map Size: 2–10 | Map Size: 2–10 | Map Size: 2–10 |
Figure 10. Artificial neural network structure.
Experimental results of ten tests, in percentages. Abbreviations: SD: standard deviation.
| Test | Experiment 1 | Experiment 2 | Experiment 3 | Experiment 4 |
|---|---|---|---|---|
| 1 | 99.74 | 99.74 | 99.50 | **99.69** |
| 2 | 99.84 | 99.70 | 99.56 | 99.47 |
| 3 | 99.73 | 99.74 | 99.30 | 99.61 |
| 4 | 99.76 | 99.73 | 99.51 | 99.68 |
| 5 | 99.82 | 99.76 | 99.65 | 99.52 |
| 6 | 99.61 | 99.65 | 99.27 | 99.56 |
| 7 | 99.26 | **99.88** | 99.62 | 99.54 |
| 8 | **99.93** | 99.78 | 99.14 | 99.68 |
| 9 | 99.62 | 99.87 | **99.81** | 99.66 |
| 10 | 99.41 | 99.83 | 99.62 | 99.66 |
| Mean | 99.67 | 99.77 | 99.50 | 99.61 |
| SD | 0.205 | 0.073 | 0.204 | 0.079 |
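The summary rows of this table can be checked with a few lines of Python. The sketch below uses ten values for experiment 2, taking the best run as 99.88 from the best-solutions table; that ordering and the use of the sample standard deviation (n − 1 in the denominator) are assumptions, although they do reproduce the reported mean of 99.77 and SD of 0.073.

```python
from statistics import mean, stdev

# Ten test results for experiment 2, in percent (assumed column values;
# the best run, 99.88, is taken from the best-solutions table).
exp2 = [99.74, 99.70, 99.74, 99.73, 99.76, 99.65, 99.88, 99.78, 99.87, 99.83]

avg = mean(exp2)
sd = stdev(exp2)  # sample standard deviation (divides by n - 1)

print(f"mean = {avg:.2f}, SD = {sd:.3f}")
```

Using `statistics.pstdev` instead (population SD, dividing by n) would give a slightly smaller value, so the choice of estimator matters when comparing against reported dispersion figures.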
Figure 11. Means and standard deviations of ten tests for the four experiments.
Best solutions of the experiments over ten tests.
| | Experiment 1 | Experiment 2 | Experiment 3 | Experiment 4 |
|---|---|---|---|---|
| F1 score | 99.93% | 99.88% | 99.81% | 99.69% |
| Accuracy | 99.95% | 99.95% | 99.94% | 99.88% |
| Sensitivity | 99.96% | 99.87% | 99.78% | 99.85% |
| Specificity | 99.99% | 99.99% | 99.99% | 99.98% |
| Precision | 99.89% | 99.89% | 99.83% | 99.53% |
| SOM map size | 2 | 2 | 2 | 4 |
| Samples of class a | 150 | 150 | 150 | 150 |
| Samples of class E | 106 | 106 | 106 | 106 |
| Samples of class j | 229 | 229 | 229 | 229 |
| Samples of class ! | 472 | 472 | 472 | 472 |
| Samples of class L | 4272 | 2784 | 3299 | 1269 |
| Samples of class N | 113 | 77 | 159 | 2160 |
| Samples of class V | 817 | 73 | 300 | 921 |
| Samples of class R | 2167 | 1802 | 1887 | 2463 |
| Total samples | 8326 | 5693 | 6602 | 7770 |
| Number of neurons | 207 | 500 | 126 | 86 |
| Number of selected features | 23 | 44 | 27 | 43 |
| Selected features | 2-4,8,9,14,17,24–26, | 1–5,7–13,15,17, | 1,6,8,11,12,14–18, | 1–5,7–9,11,12,14–18, |
Figure 12. (A) Scatter plot of instances in the full dataset. (B) Scatter plot of instances in the experiment 1 dataset. (C) Scatter plot of instances in the experiment 2 dataset. (D) Scatter plot of instances in the experiment 3 dataset. (E) Scatter plot of instances in the experiment 4 dataset.
IR values comparison between the full dataset and the dataset of the best solution of experiment 2.
| Minority Class | Full dataset: N | Full dataset: L | Full dataset: R | Full dataset: V | Experiment 2: N | Experiment 2: L | Experiment 2: R | Experiment 2: V |
|---|---|---|---|---|---|---|---|---|
| ! | 158.4 | 17.1 | 15.4 | 15.1 | 0.2 | 5.9 | 3.8 | 0.2 |
| j | 326.5 | 35.2 | 31.7 | 31.1 | 0.3 | 12.2 | 7.9 | 0.3 |
| a | 498.4 | 53.8 | 48.4 | 47.5 | 0.5 | 18.6 | 12.0 | 0.5 |
| E | 705.3 | 76.2 | 68.4 | 67.2 | 0.7 | 26.3 | 17.0 | 0.7 |
Confusion matrix obtained by the best solution of experiment 2 (rows: true label; columns: predicted label).
| True Label | a | E | j | ! | L | N | V | R |
|---|---|---|---|---|---|---|---|---|
| a | 150 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| E | 0 | 106 | 0 | 1 | 0 | 0 | 0 | 0 |
| j | 0 | 0 | 228 | 0 | 0 | 0 | 0 | 0 |
| ! | 0 | 0 | 0 | 470 | 0 | 0 | 0 | 0 |
| L | 0 | 0 | 0 | 1 | 2784 | 0 | 0 | 0 |
| N | 0 | 0 | 0 | 0 | 0 | 77 | 0 | 0 |
| V | 0 | 0 | 0 | 0 | 0 | 0 | 73 | 0 |
| R | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1802 |
Example of how to compute TP, TN, FP, and FN for class “!”.
Example of how to compute TP, TN, FP, and FN for class “R”.
Results of TP, TN, FP, and FN from the confusion matrix of Table 9.
| | a | E | j | ! | L | N | V | R |
|---|---|---|---|---|---|---|---|---|
| TP | 150 | 106 | 228 | 470 | 2784 | 77 | 73 | 1802 |
| FP | 0 | 0 | 1 | 2 | 0 | 0 | 0 | 0 |
| FN | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 |
| TN | 5543 | 5586 | 5464 | 5221 | 2908 | 5616 | 5620 | 3890 |
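The per-class counts can be derived mechanically from the confusion matrix of the best solution of experiment 2. The sketch below assumes that rows hold the true labels and columns the predicted labels, in the class order a, E, j, !, L, N, V, R; under that assumption, off-diagonal entries in a class's row are its false negatives and off-diagonal entries in its column are its false positives. The function name is illustrative.

```python
labels = ["a", "E", "j", "!", "L", "N", "V", "R"]

# Confusion matrix of the best solution of experiment 2
# (assumed layout: rows = true label, columns = predicted label).
cm = [
    [150, 0, 0, 0, 0, 0, 0, 0],
    [0, 106, 0, 1, 0, 0, 0, 0],
    [0, 0, 228, 0, 0, 0, 0, 0],
    [0, 0, 0, 470, 0, 0, 0, 0],
    [0, 0, 0, 1, 2784, 0, 0, 0],
    [0, 0, 0, 0, 0, 77, 0, 0],
    [0, 0, 0, 0, 0, 0, 73, 0],
    [0, 0, 1, 0, 0, 0, 0, 1802],
]


def per_class_counts(cm, labels):
    """Return {label: {"TP", "FP", "FN", "TN"}} using one-vs-rest counting."""
    total = sum(sum(row) for row in cm)
    counts = {}
    for i, label in enumerate(labels):
        tp = cm[i][i]                          # diagonal entry
        fn = sum(cm[i]) - tp                   # rest of the class's row
        fp = sum(row[i] for row in cm) - tp    # rest of the class's column
        tn = total - tp - fn - fp              # everything else
        counts[label] = {"TP": tp, "FP": fp, "FN": fn, "TN": tn}
    return counts


counts = per_class_counts(cm, labels)
for label in labels:
    print(label, counts[label])
```

For class “!”, for example, the two off-diagonal entries in its column (one from row E, one from row L) are counted as false positives.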
Performance of the best solution of experiment 2, in percentages.
| Metric | a | E | j | ! | L | N | V | R |
|---|---|---|---|---|---|---|---|---|
| F1 score | 100 | 99.53 | 99.78 | 99.79 | 99.98 | 100 | 100 | 99.97 |
| Accuracy | 100 | 99.98 | 99.98 | 99.96 | 99.98 | 100 | 100 | 99.98 |
| Sensitivity | 100 | 99.07 | 100 | 100 | 99.96 | 100 | 100 | 99.94 |
| Specificity | 100 | 100 | 99.98 | 99.96 | 100 | 100 | 100 | 100 |
| Precision | 100 | 100 | 99.56 | 99.58 | 100 | 100 | 100 | 100 |
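The per-class metrics follow from the TP, FP, FN, and TN counts through the standard definitions. The sketch below is illustrative: the helper names are hypothetical, and the aggregation (unweighted mean over the eight classes for F1, sensitivity, specificity, and precision; correctly classified samples over all samples for overall accuracy) is an assumption, although it does reproduce the overall figures quoted in the abstract.

```python
# (TP, FP, FN, TN) per class for the best solution of experiment 2.
counts = {
    "a": (150, 0, 0, 5543),
    "E": (106, 0, 1, 5586),
    "j": (228, 1, 0, 5464),
    "!": (470, 2, 0, 5221),
    "L": (2784, 0, 1, 2908),
    "N": (77, 0, 0, 5616),
    "V": (73, 0, 0, 5620),
    "R": (1802, 0, 1, 3890),
}


def class_metrics(tp, fp, fn, tn):
    """Standard one-vs-rest metrics, in percent."""
    return {
        "f1": 100 * 2 * tp / (2 * tp + fp + fn),
        "accuracy": 100 * (tp + tn) / (tp + fp + fn + tn),
        "sensitivity": 100 * tp / (tp + fn),
        "specificity": 100 * tn / (tn + fp),
        "precision": 100 * tp / (tp + fp),
    }


per_class = {label: class_metrics(*c) for label, c in counts.items()}

# Assumed aggregation: unweighted (macro) mean over the classes.
macro = {
    name: sum(m[name] for m in per_class.values()) / len(per_class)
    for name in ("f1", "sensitivity", "specificity", "precision")
}

# Overall accuracy: correctly classified samples over all samples.
total_samples = sum(counts["a"])  # TP + FP + FN + TN is the same for every class
overall_accuracy = 100 * sum(c[0] for c in counts.values()) / total_samples

for label, m in per_class.items():
    print(label, {k: round(v, 2) for k, v in m.items()})
print({k: round(v, 2) for k, v in macro.items()}, round(overall_accuracy, 2))
```

Note that the mean of the per-class accuracies is higher than the overall accuracy, because each per-class accuracy also counts the true negatives of the other seven classes.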
Figure 13. Student's t statistic for the proposed approach in experiment 2.
Comparison of the proposed method with the state of the art.
| Work | Accuracy | Sensitivity | Precision | F1 Score | No. Arrhythmias |
|---|---|---|---|---|---|
| | 93.31 | 45.49 | 46.45 | 45.48 | 2 |
| | 97.14 | 97.54 | 97.64 | - | 3 |
| | 99.1 | 97.9 | - | - | 5 |
| | 93.72 | 26.85 | 85.43 | - | 5 |
| | 98.4 | - | - | - | 2 |
| | 99.26 | 99.26 | 99.30 | 99.27 | 8 |
| Proposed approach | 99.95 | 99.87 | 99.89 | 99.88 | 8 |
Comparison of the proposed method with other works using the AAMI standard.
| Work | Accuracy | Sensitivity | Precision | F1 Score |
|---|---|---|---|---|
| | 99.1 | 97.9 | - | - |
| | 93.72 | 26.85 | 85.43 | - |
| Proposed approach | 99.19 | 98.79 | 98.80 | 98.79 |