Tiantian Xie1,2, Runchuan Li1,2, Shengya Shen3, Xingjin Zhang1,2,4, Bing Zhou1,2, Zongmin Wang1,2.
Abstract
Premature ventricular contraction (PVC) is one of the most common arrhythmias seen in clinical practice. Because of its variability and susceptibility, patients may be at risk at any time, so rapid and accurate classification of PVC is of great significance for treatment. To address this problem, this paper proposes a method that combines selected features with a random forest classifier to identify PVC. The RR intervals (pre_RR and post_RR), the R amplitude, and the QRS area are chosen as features because they discriminate PVC well. The method was validated on the MIT-BIH arrhythmia database and achieved good results. Compared with other methods, its accuracy is significantly improved.
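The two RR-interval features named in the abstract can be illustrated with a minimal sketch. It assumes R-peak positions are already available as sample indices and uses the 360 Hz sampling rate of the MIT-BIH arrhythmia database; the function name `rr_features` and the example peak positions are hypothetical and not taken from the paper.

```python
from typing import Dict, List

FS_HZ = 360.0  # sampling rate of the MIT-BIH arrhythmia database


def rr_features(r_peaks: List[int], fs: float = FS_HZ) -> List[Dict[str, float]]:
    """Return pre_RR and post_RR (in seconds) for each beat that has both neighbours."""
    feats = []
    for i in range(1, len(r_peaks) - 1):
        pre_rr = (r_peaks[i] - r_peaks[i - 1]) / fs   # distance to the previous R peak
        post_rr = (r_peaks[i + 1] - r_peaks[i]) / fs  # distance to the next R peak
        feats.append({"beat_index": i, "pre_RR": pre_rr, "post_RR": post_rr})
    return feats


# Hypothetical R-peak sample indices: the third beat arrives early
# and is followed by a longer pause.
peaks = [0, 360, 576, 1080, 1440]
for row in rr_features(peaks):
    print(row)
```

A beat with a short pre_RR followed by a long post_RR, like the third beat in the example, is the kind of timing pattern these two features are meant to expose to the classifier.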
Year: 2019 PMID: 31687121 PMCID: PMC6800940 DOI: 10.1155/2019/5787582
Source DB: PubMed Journal: J Healthc Eng ISSN: 2040-2295 Impact factor: 2.682
Detailed division of the datasets.
| Dataset | Recordings | Division method | Non-V | V | Total |
|---|---|---|---|---|---|
| DS1 | 101, 106, 108, 109, 112, 114, 115, 116, 118, 119, 122, 124, 201, 203, 205, 207, 208, 209, 215, 220, 223, 230 | Training | 47573 | 3648 | 51221 |
| DS2 | 100, 103, 105, 111, 113, 117, 121, 123, 200, 202, 210, 212, 213, 214, 219, 221, 222, 228, 231, 232, 233, 234 | Test | 46539 | 3157 | 49696 |
| DS1 + DS2 | 44 recordings | — | 94112 | 6805 | 100917 |
Note. Non-V represents non-PVC type, and V represents PVC type.
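The split in the table can be written down as a small lookup, which also makes it easy to check that the per-class counts add up to the reported totals. This is only a sketch: the record lists and counts are copied from the table, while the names `DS1`, `DS2`, `COUNTS`, and `split_of` are illustrative.

```python
# Record numbers copied from the dataset-division table.
DS1 = [101, 106, 108, 109, 112, 114, 115, 116, 118, 119, 122,
       124, 201, 203, 205, 207, 208, 209, 215, 220, 223, 230]
DS2 = [100, 103, 105, 111, 113, 117, 121, 123, 200, 202, 210,
       212, 213, 214, 219, 221, 222, 228, 231, 232, 233, 234]

# Beat counts per class, also copied from the table.
COUNTS = {
    "DS1": {"non_v": 47573, "v": 3648},
    "DS2": {"non_v": 46539, "v": 3157},
}


def split_of(record: int) -> str:
    """Return 'train' for DS1 records and 'test' for DS2 records."""
    if record in DS1:
        return "train"
    if record in DS2:
        return "test"
    raise ValueError(f"record {record} is not part of DS1 or DS2")


total_non_v = sum(c["non_v"] for c in COUNTS.values())
total_v = sum(c["v"] for c in COUNTS.values())
print(len(DS1), len(DS2), total_non_v, total_v, total_non_v + total_v)
```

Splitting by record rather than by beat keeps all beats of one recording on the same side of the train/test boundary.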
Figure 1. Frame diagram of PVC identification.
Comparison of R waves marked in MIT-BIH and R waves detected by “Pan and Tompkins.”
| Records | All | Correct | Wrong | Missed | Se | P + (PPV) |
|---|---|---|---|---|---|---|
| 100 | 2273 | 2272 | 0 | 1 | 0.9996 | 1 |
| 101 | 1865 | 1865 | 5 | 0 | 1 | 0.9973 |
| 103 | 2084 | 2084 | 0 | 0 | 1 | 1 |
| 105 | 2572 | 2570 | 44 | 2 | 0.9992 | 0.9832 |
| 106 | 2027 | 2018 | 3 | 9 | 0.9956 | 0.9985 |
| 108 | 1763 | 1746 | 61 | 17 | 0.9904 | 0.9662 |
| 109 | 2532 | 2532 | 7 | 0 | 1 | 0.9972 |
| 111 | 2124 | 2123 | 4 | 1 | 0.9995 | 0.9981 |
| 112 | 2539 | 2539 | 3 | 0 | 1 | 0.9988 |
| 113 | 1795 | 1794 | 0 | 1 | 0.9994 | 1 |
| 114 | 1879 | 1878 | 4 | 1 | 0.9995 | 0.9979 |
| 115 | 1953 | 1953 | 0 | 0 | 1 | 1 |
| 116 | 2412 | 2391 | 3 | 21 | 0.9913 | 0.9987 |
| 117 | 1535 | 1535 | 1 | 0 | 1 | 0.9993 |
| 118 | 2278 | 2278 | 12 | 0 | 1 | 0.9948 |
| 119 | 1987 | 1987 | 0 | 0 | 1 | 1 |
| 121 | 1863 | 1863 | 3 | 0 | 1 | 0.9984 |
| 122 | 2476 | 2476 | 0 | 0 | 1 | 1 |
| 123 | 1518 | 1515 | 0 | 3 | 0.9980 | 1 |
| 124 | 1619 | 1618 | 0 | 1 | 0.9994 | 1 |
| 200 | 2601 | 2600 | 63 | 1 | 0.9996 | 0.9763 |
| 201 | 1963 | 1948 | 0 | 15 | 0.9924 | 1 |
| 202 | 2136 | 2128 | 0 | 8 | 0.9962 | 1 |
| 203 | 2980 | 2967 | 89 | 13 | 0.9956 | 0.9709 |
| 205 | 2656 | 2653 | 0 | 3 | 0.9989 | 1 |
| 207 | 2332 | 2135 | 22 | 197 | 0.9155 | 0.9898 |
| 208 | 2955 | 2941 | 10 | 14 | 0.9953 | 0.9966 |
| 209 | 3005 | 3005 | 4 | 0 | 1 | 0.9987 |
| 210 | 2650 | 2604 | 15 | 46 | 0.9826 | 0.9943 |
| 212 | 2748 | 2748 | 0 | 0 | 1 | 1 |
| 213 | 3251 | 3250 | 0 | 1 | 0.9997 | 1 |
| 214 | 2262 | 2259 | 6 | 3 | 0.9987 | 0.9974 |
| 215 | 3363 | 3363 | 1 | 0 | 1 | 0.9997 |
| 219 | 2154 | 2154 | 0 | 0 | 1 | 1 |
| 220 | 2048 | 2048 | 0 | 0 | 1 | 1 |
| 221 | 2427 | 2422 | 0 | 5 | 0.9979 | 1 |
| 222 | 2483 | 2482 | 5 | 1 | 0.9996 | 0.9980 |
| 223 | 2605 | 2604 | 2 | 1 | 0.9996 | 0.9992 |
| 228 | 2053 | 2051 | 166 | 2 | 0.9990 | 0.9251 |
| 230 | 2256 | 2256 | 1 | 0 | 1 | 0.9996 |
| 231 | 1571 | 1570 | 0 | 1 | 0.9994 | 1 |
| 232 | 1780 | 1780 | 14 | 0 | 1 | 0.9922 |
| 233 | 3079 | 3073 | 1 | 6 | 0.9981 | 0.9997 |
| 234 | 2753 | 2751 | 0 | 2 | 0.9993 | 1 |
| Total | | | | | | |
Note. The columns give, in order: the record name; the number of R waves annotated in MIT-BIH; the number of correctly detected R waves; the number of falsely detected R waves; the number of missed R waves; the sensitivity (Se); and the positive predictive value (P+, PPV). Following the AAMI EC38 standard, a detection is counted as successful when the detected QRS complex lies within 150 ms of the manual annotation.
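The per-record figures in this table follow from the counts of correct, wrong, and missed detections. The sketch below shows one way to obtain those counts with a 150 ms tolerance and then compute Se and PPV. It assumes sorted sample indices at 360 Hz and a simple greedy one-to-one matching; the function names are hypothetical, and the matching rule is an assumption rather than the authors' exact procedure.

```python
from typing import List, Tuple


def match_detections(annotated: List[int], detected: List[int],
                     fs: float = 360.0, tol_s: float = 0.150) -> Tuple[int, int, int]:
    """Greedily pair detections with annotations; return (correct, wrong, missed)."""
    tol = tol_s * fs  # tolerance in samples (54 samples at 360 Hz)
    annotated, detected = sorted(annotated), sorted(detected)
    i = j = correct = 0
    while i < len(annotated) and j < len(detected):
        diff = detected[j] - annotated[i]
        if abs(diff) <= tol:
            correct += 1          # detection lies within the tolerance window
            i += 1
            j += 1
        elif diff < 0:
            j += 1                # detection precedes every remaining annotation: false detection
        else:
            i += 1                # no detection close enough to this annotation: missed beat
    return correct, len(detected) - correct, len(annotated) - correct


def se_ppv(correct: int, wrong: int, missed: int) -> Tuple[float, float]:
    """Sensitivity = correct / (correct + missed); PPV = correct / (correct + wrong)."""
    return correct / (correct + missed), correct / (correct + wrong)


# Counts for record 105, copied from the table above.
se, ppv = se_ppv(correct=2570, wrong=44, missed=2)
print(round(se, 4), round(ppv, 4))
```

With the counts of record 105 this reproduces the tabulated Se of 0.9992 and PPV of 0.9832.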
Figure 2. Annotation of heartbeat features.
Figure 3. Feature optimization: (a) sorting result of feature importance; (b) effect of different quantitative characteristics on the results.
Figure 4. Training model based on random forest (M represents the number of trees, and Tree-M represents the Mth tree).
Algorithm 1. Algorithm description of decision tree.
Figure 5. Flow chart of attribute division.
Classification of confusion matrix.
| | Forecast category: N | Forecast category: V |
|---|---|---|
| Actual category: N | TN | FP |
| Actual category: V | FN | TP |
Note. N represents non-PVC type, and V represents PVC type.
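The evaluation measures reported in the result tables (Acc, PPV, Se, Sp) can all be derived from the four cells of this confusion matrix. The sketch below uses the standard textbook definitions, which are assumed here because the formulas themselves do not appear in this excerpt; the function name and the example counts are hypothetical.

```python
from typing import Dict


def confusion_metrics(tn: int, fp: int, fn: int, tp: int) -> Dict[str, float]:
    """Return Acc, PPV, Se and Sp in percent, with V (PVC) as the positive class."""
    total = tn + fp + fn + tp
    return {
        "Acc": 100.0 * (tp + tn) / total,  # share of all beats classified correctly
        "PPV": 100.0 * tp / (tp + fp),     # share of predicted V beats that are truly V
        "Se": 100.0 * tp / (tp + fn),      # share of true V beats that were found
        "Sp": 100.0 * tn / (tn + fp),      # share of true N beats that were kept as N
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
for name, value in confusion_metrics(tn=90, fp=10, fn=5, tp=95).items():
    print(f"{name}: {value:.2f}")
```

On a test set where non-PVC beats dominate, Acc is driven mostly by the N class, which is why PPV and Se are reported alongside it.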
Effect of different n_estimators values on classification results.
| Parameter | DS1 | | |
|---|---|---|---|
| | Acc (%) | PPV (%) | |
| 10 | 97.97 | 97.81 | 95.95 |
| 60 | 98.18 | 97.57 | 96.37 |
| 100 | 98.22 | 97.71 | 96.44 |
| | | | |
| 150 | 98.17 | 97.50 | 96.33 |
Comparison of experimental results of 6 classifiers.
| Classifier | Acc (%) | PPV (%) | Se (%) | Sp (%) | |
|---|---|---|---|---|---|
| KNN | 95.04 | 96.79 | 97.27 | 97.26 | 94.55 |
| LR | 95.82 | 96.10 | 97.46 | 97.45 | 95.10 |
| NB | 93.59 | 95.70 | 96.05 | 96.06 | 92.11 |
| MLP | 96.22 | 97.26 | 97.77 | 97.02 | 94.99 |
| DT | 94.62 | 96.38 | 96.18 | 96.36 | 92.54 |
| | | | | | |
Impact of unbalanced datasets on experiments.
| Proportion | Acc (%) | PPV (%) | Se (%) | Sp (%) | |
|---|---|---|---|---|---|
| 13 : 1 | 77.34 | 75.01 | 80.10 | 78.21 | 58.32 |
| 8 : 1 | 84.20 | 84.38 | 85.00 | 86.92 | 71.93 |
| 4 : 1 | 89.47 | 90.63 | 92.50 | 92.08 | 84.59 |
| 1 : 1 | 96.38 | 95.46 | 97.88 | 97.56 | 95.45 |
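The excerpt does not say how the different class proportions were produced. One common way to obtain a given Non-V : V ratio is random undersampling of the majority class, sketched below purely as an assumption; the function name, the label encoding, and the seed are illustrative and not taken from the paper.

```python
import random
from typing import List


def undersample(labels: List[str], ratio: float, seed: int = 0) -> List[int]:
    """Keep every 'V' beat and at most ratio * n_V randomly chosen non-V beats.

    Returns the sorted indices of the beats that are kept.
    """
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    v_idx = [i for i, y in enumerate(labels) if y == "V"]
    n_idx = [i for i, y in enumerate(labels) if y != "V"]
    keep_n = min(len(n_idx), int(ratio * len(v_idx)))
    return sorted(v_idx + rng.sample(n_idx, keep_n))


# Hypothetical label vector with a 13 : 1 imbalance, reduced to 4 : 1.
labels = ["N"] * 130 + ["V"] * 10
kept = undersample(labels, ratio=4)
print(sum(labels[i] == "N" for i in kept), sum(labels[i] == "V" for i in kept))
```

Oversampling the minority class or using class weights would be alternative ways to reach the same proportions.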
Comparison with other methods in the literature.
| Methods | Recordings | Acc (%) | PPV (%) | Se (%) | Sp (%) | |
|---|---|---|---|---|---|---|
| Shyu et al. [ | 111, 115, 116, 119, 221, 230, 231 | 97.04 | — | 99.02 | — | — |
| Li et al. [ | 111, 115, 116, 119, 221, 230, 231 | 98.0 | 66.0 | 99.7 | — | — |
| Zarei et al. [ | 111, 115, 116, 119, 221, 230, 231 | 99.35 | 90.94 | 100 | 99.33 | 99.33 |
| Proposed method | 111, 115, 116, 119, 221, 230, 231 | | | | | |
| Lim [ | 115, 116, 119, 221, 230, 231 | 99.8 | — | 99.2 | — | — |
| Li et al. [ | 115, 116, 119, 221, 230, 231 | 99.7 | 79.0 | 99.6 | — | — |
| Zarei et al. [ | 115, 116, 119, 221, 230, 231 | 99.74 | 96.65 | 100 | 99.72 | 99.72 |
| Proposed method | 115, 116, 119, 221, 230, 231 | | | | | |
| Llamedo and Martínez [ | All recordings of DS2 | 98.16 | 87.97 | 82.94 | 99.21 | 82.15 |
| Zhang et al. [ | All recordings of DS2 | 98.63 | 92.75 | 85.48 | 99.54 | 85.02 |
| Zarei et al. [ | All recordings of DS2 | 98.77 | 86.48 | 96.12 | 98.96 | 95.08 |
| Zhou et al. [ | All recordings of DS2 | 99.41 | 93.55 | 97.59 | 99.54 | 97.13 |
| Proposed method | All recordings of DS2 | | | | | |