Abril Valeria Uriarte-Arcia, Itzamá López-Yáñez, Cornelio Yáñez-Márquez.
Abstract
Pattern recognition and classification are two key topics in computer science. This paper presents a novel method for pattern classification. The proposed method combines a hybrid associative classifier (Clasificador Híbrido Asociativo con Traslación, CHAT, in Spanish), a coding technique for output patterns called one-hot vector, and majority voting during the classification step. The method is termed CHAT One-Hot Majority (CHAT-OHM). Its performance is validated by comparing the accuracy of CHAT-OHM with that of other well-known classification algorithms. During the experimental phase, the classifier was applied to four datasets related to the medical field. The results also show that the proposed method outperforms the original CHAT in classification accuracy.
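To illustrate two of the ingredients named in the abstract, one-hot coding of output patterns and majority voting, a small Python sketch may help. It is not the authors' CHAT-OHM implementation: the function names `one_hot` and `majority_vote`, the toy class labels, and the tie-breaking rule are all assumptions made for illustration.

```python
from collections import Counter


def one_hot(label, classes):
    """Return a one-hot vector: 1 at the position of `label`, 0 elsewhere."""
    if label not in classes:
        raise ValueError(f"unknown class label: {label!r}")
    return [1 if c == label else 0 for c in classes]


def majority_vote(votes):
    """Return the most frequent label among `votes`.

    Ties go to the label that appears first in `votes`
    (an assumption of this sketch, not a rule taken from the paper).
    """
    if not votes:
        raise ValueError("no votes given")
    return Counter(votes).most_common(1)[0][0]


# Hypothetical two-class problem.
classes = ["class 1", "class 2"]
encoded = one_hot("class 2", classes)
winner = majority_vote(["class 1", "class 2", "class 2"])
```

Here `encoded` marks the second class with a single 1, and `winner` is the label that received the most votes.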
Year: 2014 PMID: 24752287 PMCID: PMC3994097 DOI: 10.1371/journal.pone.0095715
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Characteristics of the datasets used in the experimental phase.
| Dataset | Instances | Attributes | Missing Values |
| Breast Cancer | 683 | 9 | Yes |
| Haberman’s Survival | 306 | 3 | No |
| Liver Disorders | 345 | 6 | No |
| Hepatitis Disease | 155 | 19 | Yes |
Accuracy comparison with the original method (%), original data.
| Datasets | ||||
| Algorithm | Breast Cancer | Haberman’s Survival | Hepatitis Disease | Liver Disorders |
| CHAT | 63.10 | 65.95 | 68.19 | 55.63 |
| CHAT-OHM | 95.00 | 66.36 | 84.96 | 61.45 |
Classification accuracy comparison (%), original data.
| Dataset | ||||
| Algorithm | Breast Cancer | Haberman’s Survival | Hepatitis Disease | Liver Disorders |
| | 95.05 | 74.02 | 89.8 | 65.96 |
| | | 71.73 | 87.68 | 56.85 |
| | 94.88 | 72.33 | 88.98 | 66.44 |
| | 95.00 | 66.36 | 84.96 | 61.45 |
| | 85.43 | 73.87 | 77.54 | 56.57 |
| | 91.98 | 72.97 | 88.99 | 56.06 |
| | 95.69 | 71.90 | 88.40 | 59.11 |
| | 65.01 | 73.53 | 79.38 | 57.98 |
| | 95.75 | 65.77 | 81.03 | 62.22 |
| | 92.14 | 71.90 | 89.70 | 60.80 |
| | 94.73 | 73.28 | 89.49 | 65.29 |
| | 65.01 | 73.53 | 79.38 | 57.98 |
| | 96.26 | | 87.50 | 54.89 |
| | 90.32 | 73.74 | 78.00 | 56.96 |
| | 96.47 | 67.94 | | |
| | 94.74 | 64.48 | 85.32 | 64.10 |
| | 96.36 | 73.75 | 85.78 | 65.06 |
| | 96.87 | 73.33 | 88.83 | 57.98 |
| | 65.01 | 73.53 | 79.38 | 57.98 |
| | 91.08 | 73.82 | 78.09 | 63.53 |
Classification accuracy comparison (%), normalized data.
| Dataset | ||||
| Algorithm | Breast Cancer | Haberman’s Survival | Hepatitis Disease | Liver Disorders |
| | 95.05 | 74.02 | 89.80 | 67.72 |
| | | 71.73 | 87.68 | 56.62 |
| | 94.80 | 72.43 | 88.92 | 67.15 |
| | 95.52 | 62.45 | 89.52 | 58.5 |
| | N/A | N/A | N/A | N/A |
| | 91.98 | 72.97 | 89.05 | 56.28 |
| | 95.69 | 71.90 | 88.46 | 58.83 |
| | 90.00 | 73.01 | 88.99 | 64.17 |
| | 95.46 | 65.58 | 81.03 | 63.25 |
| | 92.14 | 71.90 | 89.95 | 60.74 |
| | 94.73 | 73.28 | 89.49 | 64.29 |
| | 65.01 | 73.53 | 79.38 | 57.98 |
| | 96.11 | 74.66 | 87.36 | 55.42 |
| | N/A | N/A | N/A | N/A |
| | 96.33 | 67.81 | | |
| | 94.79 | 64.96 | 83.70 | 62.75 |
| | 96.36 | 73.75 | 85.78 | 64.81 |
| | 96.88 | 73.53 | 88.70 | 57.90 |
| | 65.01 | 73.53 | 79.38 | 57.98 |
| | 96.87 | | 85.03 | 65.86 |
Number of outliers per dataset.
| Dataset | Total Instances | Class 1 Instances | Class 2 Instances | Class 1 Outliers | Class 2 Outliers | Total Outliers | Outliers % |
| Breast Cancer | 683 | 444 | 239 | 11 | 11 | 22 | 3.22 |
| Liver Disorders | 345 | 145 | 200 | 71 | 55 | 126 | 36.52 |
| Haberman’s Survival | 306 | 225 | 81 | 25 | 59 | 84 | 27.45 |
| Hepatitis Disease | 155 | 32 | 123 | 14 | 13 | 27 | 17.41 |
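The "Total Outliers" and "Outliers %" columns appear to follow from the other columns: the total is the sum of the per-class outlier counts, and the percentage is that total divided by the number of instances. A minimal Python check, assuming this reading of the columns (the helper name `outlier_summary` is hypothetical), using the first row of the table:

```python
def outlier_summary(total_instances, class1_outliers, class2_outliers):
    """Return (total outliers, outliers as a percentage of all instances)."""
    total_outliers = class1_outliers + class2_outliers
    percent = 100.0 * total_outliers / total_instances
    return total_outliers, percent


# First row of the table: 683 instances, 11 + 11 outliers.
total, percent = outlier_summary(683, 11, 11)
print(total, round(percent, 2))  # 22 3.22
```

The same calculation can be applied to the remaining rows; small differences in the last decimal may arise depending on whether the reported percentages were rounded or truncated.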
Classification accuracy comparison (%), data without outliers.
| Dataset | ||||
| Algorithm | Breast Cancer | Haberman’s Survival | Hepatitis Disease | Liver Disorders |
| | 98.41 | 94.84 | 96.24 | 83.44 |
| | 99.46 | 87.59 | 93.72 | 76.98 |
| | 97.84 | 93.39 | 96.40 | 84.43 |
| | 98.95 | 90.44 | 88.94 | 69.23 |
| | 86.79 | 90.60 | 86.72 | 64.44 |
| | 93.78 | 89.52 | 95.25 | 66.53 |
| | 97.16 | 88.32 | 94.31 | 77.73 |
| | 65.51 | 90.12 | 86.03 | 66.21 |
| | 99.35 | 93.70 | 87.71 | 75.13 |
| | 96.41 | 91.00 | 95.71 | 66.95 |
| | 97.25 | 92.41 | 96.32 | 80.11 |
| | 65.51 | 90.12 | 86.03 | 66.21 |
| | 97.93 | | | 58.54 |
| | 92.24 | 90.93 | 87.72 | 65.44 |
| | 98.55 | 93.79 | 96.10 | |
| | 97.75 | 92.85 | 93.65 | 81.98 |
| | 98.46 | 93.26 | 96.62 | 79.03 |
| | | 91.24 | 92.63 | 66.21 |
| | 65.51 | 90.12 | 86.03 | 66.21 |
| | 92.90 | 90.39 | 85.40 | 73.89 |
Classification accuracy comparison (%), normalized data without outliers.
| Dataset | ||||
| Algorithm | Breast Cancer | Haberman’s Survival | Hepatitis Disease | Liver Disorders |
| | 98.41 | 94.84 | 96.24 | 75.58 |
| | 99.47 | 87.59 | 93.72 | 66.80 |
| | 97.70 | 93.75 | 96.40 | 75.63 |
| | 97.69 | 90.53 | 89.18 | 74.13 |
| | N/A | N/A | N/A | N/A |
| | 93.78 | 89.52 | 96.33 | 61.41 |
| | 97.17 | 88.32 | 94.31 | 68.28 |
| | 94.49 | 90.12 | 93.02 | 69.28 |
| | 99.35 | 93.70 | 87.71 | 69.19 |
| | 96.41 | 91.00 | 95.71 | 63.85 |
| | 97.25 | 92.42 | 96.32 | 72.20 |
| | 65.51 | 90.12 | 86.03 | 62.10 |
| | 97.79 | | | 56.85 |
| | N/A | N/A | N/A | N/A |
| | 98.70 | 93.75 | 96.39 | |
| | 97.65 | 92.77 | 91.67 | 71.97 |
| | 98.46 | 93.26 | 96.92 | 71.92 |
| | 99.50 | 91.33 | 92.63 | 62.05 |
| | 65.51 | 90.12 | 86.03 | 62.10 |
| | | 93.63 | 86.97 | 72.10 |
Comparison of classification improvement (%) for the Breast Cancer dataset.
| Breast Cancer | |||
| Algorithm | Normalization | Without Outliers | Without Outliers Normalized |
| | 0.00 | 3.36 | 3.36 |
| | 0.00 | 2.12 | 2.13 |
| | −0.08 | 2.96 | 2.82 |
| | 0.52 | 3.95 | 2.69 |
| | N/A | 1.36 | N/A |
| | 0.00 | 1.80 | 1.80 |
| | 0.00 | 1.47 | 1.48 |
| | | 0.50 | |
| | −0.29 | 3.60 | 3.60 |
| | 0.00 | | 4.27 |
| | 0.00 | 2.52 | 2.52 |
| | 0.00 | 0.50 | 0.50 |
| | −0.15 | 1.67 | 1.53 |
| | N/A | 1.92 | N/A |
| | −0.14 | 2.08 | 2.23 |
| | 0.05 | 3.01 | 2.91 |
| | 0.00 | 2.10 | 2.10 |
| | 0.01 | 2.63 | 2.63 |
| | 0.00 | 0.50 | 0.50 |
| | 5.79 | 1.82 | 8.8 |
Comparison of classification improvement (%) for the Haberman’s Survival dataset.
| Haberman’s Survival | |||
| Algorithm | Normalization | Without Outliers | Without Outliers Normalized |
| | 0.00 | 20.82 | 20.82 |
| | 0.00 | 15.86 | 15.86 |
| | 0.10 | 21.06 | 21.42 |
| | N/A | 16.73 | N/A |
| | 0.00 | 16.55 | 16.55 |
| | 0.00 | 16.52 | 16.42 |
| | −0.52 | 16.59 | 16.59 |
| | −0.19 | 27.93 | 27.93 |
| | 0.00 | 19.10 | 19.10 |
| | 0.00 | 19.13 | 19.13 |
| | 0.00 | 16.59 | 16.59 |
| | −0.14 | 20.45 | 20.40 |
| | N/A | 17.29 | N/A |
| | −0.13 | 25.85 | 25.81 |
| | 0.48 | | |
| | 0.00 | 19.51 | 19.51 |
| | 0.00 | 17.91 | 18.00 |
| | 0.00 | 16.59 | 16.59 |
| | | 16.57 | 19.81 |
| | −3.91 | 24.08 | 24.17 |
Comparison of classification improvement (%) for the Hepatitis Disease dataset.
| Hepatitis Disease | |||
| Algorithm | Normalization | Without Outliers | Without Outliers Normalized |
| | 0.00 | 6.44 | 6.44 |
| | 0.00 | 6.04 | 6.04 |
| | −0.06 | 7.42 | 7.42 |
| | 4.56 | 3.98 | 4.22 |
| | N/A | 9.18 | N/A |
| | 0.06 | 6.26 | 7.34 |
| | 0.06 | 5.91 | 5.91 |
| | | 6.65 | |
| | 0.00 | 6.68 | 6.68 |
| | 0.25 | 6.01 | 6.01 |
| | 0.00 | 6.83 | 6.83 |
| | 0.00 | 6.65 | 6.65 |
| | −0.14 | 10.13 | 9.52 |
| | N/A | 9.72 | N/A |
| | −0.07 | 5.49 | 5.78 |
| | −1.62 | 8.33 | 6.35 |
| | 0.00 | | 11.14 |
| | −0.13 | 3.80 | 3.80 |
| | 0.00 | 6.65 | 6.65 |
| | 6.94 | 7.31 | 8.88 |
Comparison of classification improvement (%) for the Liver Disorders dataset.
| Liver Disorders | |||
| Algorithm | Normalization | Without Outliers | Without Outliers Normalized |
| | 1.76 | 17.48 | 9.62 |
| | −0.23 | | 9.95 |
| | 0.71 | 17.99 | 9.19 |
| | −2.95 | 7.78 | |
| | N/A | 7.87 | N/A |
| | 0.22 | 10.47 | 5.35 |
| | −0.28 | 18.62 | 9.17 |
| | | 8.23 | 11.30 |
| | 1.03 | 12.91 | 6.97 |
| | −0.06 | 6.15 | 3.05 |
| | −1.00 | 14.82 | 6.91 |
| | 0.00 | 8.23 | 4.12 |
| | 0.53 | 3.65 | 1.96 |
| | N/A | 8.48 | N/A |
| | −0.68 | 18.92 | 8.82 |
| | −1.35 | 17.88 | 7.87 |
| | −0.25 | 13.97 | 6.86 |
| | −0.08 | 8.23 | 4.07 |
| | 0.00 | 8.23 | 4.12 |
| | 2.33 | 10.36 | 8.57 |
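The complete rows of the improvement tables are consistent with a simple difference, in percentage points, between the accuracy obtained under a given condition and the accuracy obtained on the original data. A minimal Python sketch under that assumption (the helper name `improvement` is hypothetical), using the Liver Disorders values from the first row of the four accuracy tables:

```python
def improvement(condition_accuracy, original_accuracy):
    """Difference in percentage points, rounded to two decimals."""
    return round(condition_accuracy - original_accuracy, 2)


# First-row Liver Disorders accuracies (%) taken from the accuracy tables.
original = 65.96  # original data
conditions = {
    "Normalization": 67.72,
    "Without Outliers": 83.44,
    "Without Outliers Normalized": 75.58,
}

gains = {name: improvement(acc, original) for name, acc in conditions.items()}
print(gains)
```

This reproduces the 1.76, 17.48, and 9.62 reported in the first row of the Liver Disorders improvement table.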