Fei Yang, Jiazhi Du, Jiying Lang, Weigang Lu, Lei Liu, Changlong Jin, Qinma Kang.
Abstract
Electrocardiogram (ECG) signals are critical to the classification of cardiac arrhythmia with machine learning methods. In practice, ECG datasets usually contain multiple missing values due to faults or distortion. Unfortunately, many established classification algorithms require a complete matrix as input. It is therefore necessary to impute the missing data to improve classification on datasets with missing values. In this paper, we compare the main methods for estimating missing values in electrocardiogram data, including the "Zero method", "Mean method", "PCA-based method", and "RPCA-based method", and then propose a novel KNN-based classification algorithm, a modified kernel Difference-Weighted KNN classifier (MKDF-WKNN), which is suited to imbalanced datasets. The experimental results on the UCI database indicate that the "RPCA-based method" can successfully handle missing values in the arrhythmia dataset regardless of how many values are missing, and that the proposed MKDF-WKNN is superior to other state-of-the-art algorithms such as KNN, DS-WKNN, DF-WKNN, and KDF-WKNN on uneven datasets, where class imbalance degrades classification accuracy.
Year: 2020 PMID: 32685521 PMCID: PMC7327608 DOI: 10.1155/2020/7141725
Source DB: PubMed Journal: Biomed Res Int Impact factor: 3.411
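The two simplest imputation baselines named in the abstract can be illustrated with a minimal sketch. It assumes that the "Zero method" fills each missing entry with 0 and the "Mean method" fills it with the mean of the observed values in the same feature column; the record does not spell out the authors' exact definitions, and the function names `impute_zero` and `impute_mean`, as well as the toy matrix, are hypothetical.

```python
import math

def is_missing(value):
    """Treat None and NaN as missing entries (an assumption of this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))

def impute_zero(rows):
    """Replace every missing entry with 0.0."""
    return [[0.0 if is_missing(v) else float(v) for v in row] for row in rows]

def impute_mean(rows):
    """Replace every missing entry with the mean of the observed values in its column."""
    n_cols = len(rows[0])
    col_means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if not is_missing(row[j])]
        # A column with no observed value falls back to 0.0 (assumption).
        col_means.append(sum(observed) / len(observed) if observed else 0.0)
    return [
        [col_means[j] if is_missing(v) else float(v) for j, v in enumerate(row)]
        for row in rows
    ]

# Toy feature matrix: rows are records, columns are features.
X = [
    [1.0, float("nan"), 3.0],
    [4.0, 5.0, None],
    [None, 7.0, 9.0],
]
X_zero = impute_zero(X)
X_mean = impute_mean(X)
```

Either function returns a complete matrix that a classifier requiring fully observed input can consume. The PCA-based and RPCA-based methods are not sketched here because the record gives no detail on how they are configured.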
Figure 1. A classic ECG signal sample.
Figure 2. An example of classification on an uneven sample.
Class description of the UCI cardiac arrhythmia database.
| Class | Class name | Number of instances |
|---|---|---|
| 1 | Normal | 245 |
| 2 | Ischemic changes (coronary artery disease) | 44 |
| 3 | Old anterior myocardial infarction | 15 |
| 4 | Old inferior myocardial infarction | 15 |
| 5 | Sinus tachycardia | 13 |
| 6 | Sinus bradycardia | 25 |
| 7 | Ventricular premature contraction (PVC) | 3 |
| 8 | Supraventricular premature contraction | 2 |
| 9 | Left bundle branch block | 9 |
| 10 | Right bundle branch block | 50 |
| 11 | First-degree atrioventricular (AV) block | 0 |
| 12 | Second-degree AV block | 0 |
| 13 | Third-degree AV block | 0 |
| 14 | Left ventricular hypertrophy | 4 |
| 15 | Atrial fibrillation or flutter | 5 |
| 16 | Others | 22 |
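The degree of imbalance in the class table above can be quantified with a short calculation. The sketch below copies the instance counts from the table; the variable names are illustrative, and the "majority share" is simply the accuracy a classifier would reach by always predicting the largest class, which may be a useful reference point when reading accuracy figures.

```python
# Instance counts per class, copied from the class description table.
class_counts = {
    1: 245, 2: 44, 3: 15, 4: 15, 5: 13, 6: 25, 7: 3, 8: 2,
    9: 9, 10: 50, 11: 0, 12: 0, 13: 0, 14: 4, 15: 5, 16: 22,
}

total = sum(class_counts.values())

# Largest class and the fraction of all instances it covers.
majority_class = max(class_counts, key=class_counts.get)
majority_share = class_counts[majority_class] / total

# Classes that have no instances at all in the dataset.
empty_classes = [c for c, n in class_counts.items() if n == 0]

# Ratio between the largest and the smallest non-empty class.
nonzero = [n for n in class_counts.values() if n > 0]
imbalance_ratio = max(nonzero) / min(nonzero)

print(f"total instances: {total}")
print(f"majority class {majority_class} share: {majority_share:.1%}")
print(f"empty classes: {empty_classes}")
print(f"largest/smallest non-empty class ratio: {imbalance_ratio}")
```

With these counts the Normal class covers a little over half of the 452 instances, while three classes are empty and several others have fewer than ten instances.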
Figure 3. The block diagram of the experimental procedure.
Comparison of the classification accuracy (%) achieved w.r.t. the imputation methods and classifiers. Column headers give the proportion of missing data.
| Classifier | Imputation | 0.3% | 10% | 20% | 30% | 40% | 50% | 60% | 70% |
|---|---|---|---|---|---|---|---|---|---|
| KNN | X0 | 59.07 | 55.75 | 54.65 | 54.87 | 53.98 | 54.20 | 53.98 | 54.42 |
| KNN | MEAN | 58.19 | 57.08 | 55.53 | 55.75 | 54.42 | 53.98 | 53.76 | 54.65 |
| KNN | PCA | 58.19 | 57.96 | 55.53 | 56.19 | 54.65 | 54.65 | 53.76 | 53.76 |
| KNN | RPCA | 58.85 | 57.52 | 56.86 | 57.30 | 55.09 | 54.42 | 54.42 | 55.09 |
| DS-WKNN | X0 | 60.84 | 59.96 | 55.75 | 54.42 | 52.88 | 53.76 | 52.43 | 51.99 |
| DS-WKNN | MEAN | 61.50 | 59.29 | 58.19 | 56.42 | 55.31 | 54.20 | 53.54 | 53.32 |
| DS-WKNN | PCA | 61.06 | 60.18 | 58.41 | 58.63 | 55.31 | 55.31 | 53.32 | 50.88 |
| DS-WKNN | RPCA | 59.96 | 58.63 | 58.63 | 59.29 | 54.20 | 55.53 | 57.08 | 54.42 |
| DF-WKNN | X0 | 70.58 | 66.15 | 63.27 | 59.73 | 58.85 | 56.86 | 56.19 | 54.20 |
| DF-WKNN | MEAN | 71.90 | 69.03 | 68.14 | 65.49 | 61.73 | 59.29 | 59.29 | 56.19 |
| DF-WKNN | PCA | 70.35 | 69.03 | 66.59 | 65.49 | 61.28 | 61.28 | 55.53 | 50.44 |
| DF-WKNN | RPCA | 70.80 | 67.92 | 65.93 | 65.49 | 62.17 | 61.06 | 60.40 | 57.08 |
| KDF-WKNN | X0 | 70.35 | 66.15 | 63.27 | 60.18 | 58.19 | 56.86 | 56.19 | 54.65 |
| KDF-WKNN | MEAN | 71.68 | 69.03 | 67.92 | 65.71 | 61.73 | 59.29 | 59.29 | 56.19 |
| KDF-WKNN | PCA | 70.58 | 68.81 | 67.04 | 65.93 | 61.50 | 61.50 | 55.09 | 51.77 |
| KDF-WKNN | RPCA | 70.80 | 67.70 | 65.93 | 64.60 | 61.95 | 61.73 | 60.18 | 56.86 |
| MKDF-WKNN | X0 | 71.90 | 67.04 | 63.72 | 59.96 | 60.18 | 58.85 | 57.30 | 53.98 |
| MKDF-WKNN | MEAN | 73.01 | 70.58 | 68.14 | 66.37 | 62.17 | 58.41 | 59.51 | 54.87 |
| MKDF-WKNN | PCA | 71.68 | 69.91 | 69.03 | 66.37 | 60.62 | 60.62 | 55.53 | 49.56 |
| MKDF-WKNN | RPCA | 71.90 | 70.58 | 67.70 | 66.15 | 64.38 | 60.84 | 61.28 | 56.42 |
Figure 4. Comparison of the classification accuracy (%) of MKDF-WKNN using four missing value imputation methods for different proportions of missing data.
Figure 5. Comparison of the average classification accuracy (%) of KNN, DS-WKNN, DF-WKNN, KDF-WKNN, and MKDF-WKNN.
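An average accuracy per classifier can be recomputed from the accuracy table as a rough cross-check. The sketch below averages only the RPCA rows over the eight missing-data proportions; this choice is an assumption, since the record does not state how the averages shown in Figure 5 were aggregated, so the numbers need not match the figure.

```python
from statistics import mean

# RPCA rows of the accuracy table (%), one value per missing-data proportion:
# 0.3%, 10%, 20%, 30%, 40%, 50%, 60%, 70%.
rpca_accuracy = {
    "KNN":       [58.85, 57.52, 56.86, 57.30, 55.09, 54.42, 54.42, 55.09],
    "DS-WKNN":   [59.96, 58.63, 58.63, 59.29, 54.20, 55.53, 57.08, 54.42],
    "DF-WKNN":   [70.80, 67.92, 65.93, 65.49, 62.17, 61.06, 60.40, 57.08],
    "KDF-WKNN":  [70.80, 67.70, 65.93, 64.60, 61.95, 61.73, 60.18, 56.86],
    "MKDF-WKNN": [71.90, 70.58, 67.70, 66.15, 64.38, 60.84, 61.28, 56.42],
}

# Unweighted mean across proportions for each classifier.
average_accuracy = {clf: mean(vals) for clf, vals in rpca_accuracy.items()}

# Classifier with the highest mean accuracy under RPCA imputation.
best_classifier = max(average_accuracy, key=average_accuracy.get)

for clf, avg in average_accuracy.items():
    print(f"{clf:10s} {avg:.2f}")
print("best:", best_classifier)
```

Under this particular aggregation, MKDF-WKNN has the highest mean accuracy among the five classifiers.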