Abstract
In the past decade, disease classification and biomarker discovery have become increasingly important in biological and medical research. ECGs are a comparatively low-cost, noninvasive means of screening for and diagnosing heart disease. With the development of personal ECG monitors, large numbers of ECGs are recorded and stored; fast and efficient algorithms are therefore needed to analyze the data and make diagnoses. In this paper, an efficient and easy-to-interpret procedure for cardiac disease classification is developed through novel feature extraction methods and a comparison of classifiers. Motivated by the observation that the distributions of various ECG measures in the diseased group are often skewed, heavy-tailed, or multimodal, we characterize these distributions by sample quantiles, which outperform sample means. Three classifiers are compared, applied both to all features and to features reduced in dimension by PCA: stepwise discriminant analysis (SDA), SVM, and LASSO logistic regression. SDA applied to the PCA-reduced features is found to be the most stable and effective procedure, with sensitivity, specificity, and accuracy of 89.68%, 84.62%, and 88.52%, respectively.
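The idea of summarizing a skewed per-beat measurement by sample quantiles rather than by its mean can be illustrated with a minimal sketch. The percentile levels below are the ones that appear in the feature names on this page (p1 through p99); whether the authors used exactly this set, and the interpolation rule, are assumptions. The function names and the QT values are hypothetical and chosen only for illustration.

```python
from statistics import mean

# Percentile levels seen in the feature names on this page; the complete
# set used by the authors is an assumption.
LEVELS = (1, 5, 10, 25, 75, 90, 95, 99)


def percentile(values, p):
    """Return the p-th percentile (0-100) using linear interpolation."""
    if not values:
        raise ValueError("need at least one value")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * p / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


def quantile_features(name, values, levels=LEVELS):
    """Map one per-beat measurement series to named quantile features."""
    return {f"{name}_p{p}": percentile(values, p) for p in levels}


# Hypothetical per-beat QT intervals (ms) with a heavy right tail.
qt = [400, 402, 398, 405, 401, 399, 403, 400, 520, 540]
features = quantile_features("QT-int", qt)
qt_mean = mean(qt)

for key, value in features.items():
    print(f"{key}: {value:.1f}")
print(f"mean: {qt_mean:.1f}")
```

In this toy series the two long beats pull the mean well above the typical value, while the lower quantiles stay near 400 ms and the upper quantiles isolate the tail, which is the kind of distributional detail a single mean cannot carry.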
Year: 2015 PMID: 26688816 PMCID: PMC4672117 DOI: 10.1155/2015/680381
Source DB: PubMed Journal: Biomed Res Int Impact factor: 3.411
Number of cases in the training and testing data sets according to their class of diagnosis.
| Data set | Diagnosis class | Number of cases |
|---|---|---|
| Training | No disease | 26 |
| Training | Disease | 98 |
| Testing | No disease | 26 |
| Testing | Disease | 96 |
Definition of the nine waveforms.
| Waveform | Definition |
|---|---|
| Up-P | Waveform from the start of the P wave to the peak of the P wave |
| Down-P | Waveform from the peak of the P wave to the end of the P wave |
| PR | Waveform from the end of the P wave to the start of the QRS wave |
| Up-R | Waveform from the start of the QRS wave to the peak of the R wave |
| Down-R | Waveform from the peak of the R wave to the end of the QRS wave |
| ST | Waveform from the end of the QRS wave to the start of the T wave |
| Up-T | Waveform from the start of the T wave to the peak of the T wave |
| Down-T | Waveform from the peak of the T wave to the end of the T wave |
| TP | Waveform from the end of the T wave of the current beat to the start of the P wave of the next beat |
Figure 1. Sample distributions of the PR interval, the QT interval, the slope of the Up-T waveform, and the slope of the Down-T waveform for both healthy and diseased subjects.
Major quantile features in the first eight principal components.
| Principal components | Major quantile features | Contribution (63.60%) |
|---|---|---|
| PC1 | QT-int_p95, Down-T-slo_p95, QT-int_p90, | 22.62% |
| PC2 | Down-T-slo_p25, Down-T-slo_p5, Down-T-slo_p10, Down-T-slo_p75, | 10.8% |
| PC3 | Up-R-slo_p99, QRS-amp_p99, Up-T-slo_p99, T-amp_p99, Up-T-slo_p95 | 9.36% |
| PC4 | Up-P-slo_p1, P-amp_p75, PR-slo_p75, P-amp_p90, | 7.12% |
| PC5 | TP-slo_p10, TP-slo_p5, TP-slo_p25, RR-int_p90, | 5.84% |
| PC6 | PR-int_p75, PR-int_p90, PR-int_p95, PR-int_p99, PR-int_p25 | 4.53% |
| PC7 | Up-R-slo_p25, Up-R-slo_p1, P-int_p25, Down-R-slo_p25, T-amp_p99 | 3.32% |
| PC8 | Up-R-slo_p1, Down-P-slo_p90, Down-P-slo_p75, Up-R-slo_p5, Down-P-slo_p95 | 3.00% |
Note: “-int” represents the length of the indicated interval, “-slo” represents the slope of the indicated waveform, and “-amp” represents the amplitude of the indicated wave.
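The naming convention in the note above can be made concrete with a small parser. This is only a sketch: the `_pNN` suffix is read as the NN-th sample percentile, which is an interpretation of the feature names rather than something the note states, and the `QuantileFeature` type and `parse_feature` function are hypothetical names introduced here.

```python
import re
from typing import NamedTuple


class QuantileFeature(NamedTuple):
    segment: str     # wave, interval, or waveform, e.g. "QT", "Down-T"
    measure: str     # "interval", "slope", or "amplitude"
    percentile: int  # assumed meaning of the "_pNN" suffix


# Suffix meanings taken from the table note.
MEASURES = {"int": "interval", "slo": "slope", "amp": "amplitude"}
PATTERN = re.compile(r"^(?P<segment>.+)-(?P<kind>int|slo|amp)_p(?P<pct>\d+)$")


def parse_feature(name):
    """Split a feature name such as 'Down-T-slo_p95' into its parts."""
    match = PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized feature name: {name!r}")
    return QuantileFeature(
        match["segment"], MEASURES[match["kind"]], int(match["pct"])
    )


print(parse_feature("QT-int_p95"))
print(parse_feature("Down-T-slo_p25"))
print(parse_feature("QRS-amp_p99"))
```

Because the segment part may itself contain a hyphen (as in "Down-T"), the pattern anchors on the last `-int`, `-slo`, or `-amp` before the percentile suffix rather than splitting on the first hyphen.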
Major features selected by stepwise discriminant analysis.
| Major features |
|---|
| T wave type, |
Note: “-int” represents the length of the indicated interval, “-slo” represents the slope of the indicated waveform, and “-amp” represents the amplitude of the indicated wave.
Figure 2. A flow chart of the classification procedure.
Classification results of the different methods on the test set of cases.
| Data set | Method | Sensitivity | Specificity | Accuracy |
|---|---|---|---|---|
| Mean | SDA | 82.29% | 73.08% | 80.33% |
| Mean | SVM | 85.57% | 61.54% | 80.49% |
| Mean | LLR | 92.71% | 34.61% | 80.33% |
| Quantile | SDA | 89.58% | 73.04% | 86.66% |
| Quantile | SVM | 86.6% | 73.07% | 83.74% |
| Quantile | LLR | 86.46% | 69.23% | 82.79% |
| Mean + PCA | SDA | 87.5% | 73.08% | 84.43% |
| Mean + PCA | SVM | 89.7% | 50% | 81.3% |
| Mean + PCA | LLR | 89.58% | 38.46% | 78.69% |
| Quantile + PCA | SDA | 89.68% | 84.62% | 88.52% |
| Quantile + PCA | SVM | 89.68% | 76.92% | 86.99% |
| Quantile + PCA | LLR | 94.79% | 53.85% | 86.70% |