Mei-Ling Huang1, Yung-Hsiang Hung1, W M Lee2, R K Li2, Bo-Ru Jiang1.
Abstract
Support vector machines (SVMs) perform well on classification and prediction and are widely used in disease diagnosis and medical assistance. However, an SVM by itself works well only on two-class classification problems. This study combines feature selection with SVM recursive feature elimination (SVM-RFE) to investigate classification accuracy on multiclass problems, using the Dermatology and Zoo databases. The Dermatology dataset contains 33 feature variables, 1 class variable, and 366 testing instances; the Zoo dataset contains 16 feature variables, 1 class variable, and 101 testing instances. The feature variables in the two datasets were sorted in descending order of explanatory power, and different feature sets were selected by SVM-RFE to explore classification accuracy. The Taguchi method was then combined with the SVM classifier to optimize the parameters C and γ and so increase multiclass classification accuracy. The experimental results show that classification accuracy can exceed 95% for the Dermatology and Zoo databases after SVM-RFE feature selection and Taguchi parameter optimization.
Year: 2014 PMID: 25295306 PMCID: PMC4175386 DOI: 10.1155/2014/795624
Source DB: PubMed Journal: ScientificWorldJournal ISSN: 1537-744X
Feature information for Dermatology and Zoo databases.
| | Dermatology | Zoo |
|---|---|---|
| Dataset characteristics | Multivariate | Multivariate |
| Attribute characteristics | Categorical, integer | Categorical, integer |
| Associated tasks | Classification | Classification |
| Area | Life | Life |
| Number of instances | 366 | 101 |
| Number of attributes | 33 | 16 |
| Number of classes | 6 | 7 |
Attributes of Dermatology database.
| ID | Attribute |
|---|---|
| V1 | Erythema |
| V2 | Scaling |
| V3 | Definite borders |
| V4 | Itching |
| V5 | Koebner phenomenon |
| V6 | Polygonal papules |
| V7 | Follicular papules |
| V8 | Oral mucosal involvement |
| V9 | Knee and elbow involvement |
| V10 | Scalp involvement |
| V11 | Family history |
| V12 | Melanin incontinence |
| V13 | Eosinophils in the infiltrate |
| V14 | PNL infiltrate |
| V15 | Fibrosis of the papillary dermis |
| V16 | Exocytosis |
| V17 | Acanthosis |
| V18 | Hyperkeratosis |
| V19 | Parakeratosis |
| V20 | Clubbing of the rete ridges |
| V21 | Elongation of the rete ridges |
| V22 | Thinning of the suprapapillary epidermis |
| V23 | Spongiform pustule |
| V24 | Munro microabscess |
| V25 | Focal hypergranulosis |
| V26 | Disappearance of the granular layer |
| V27 | Vacuolisation and damage of basal layer |
| V28 | Spongiosis |
| V29 | Saw-tooth appearance of retes |
| V30 | Follicular horn plug |
| V31 | Perifollicular parakeratosis |
| V32 | Inflammatory mononuclear infiltrate |
| V33 | Band-like infiltrate |
| V34 | Age |
Attributes of Zoo database.
| ID | Attribute |
|---|---|
| V1 | Hair |
| V2 | Feathers |
| V3 | Eggs |
| V4 | Milk |
| V5 | Airborne |
| V6 | Aquatic |
| V7 | Predator |
| V8 | Toothed |
| V9 | Backbone |
| V10 | Breathes |
| V11 | Venomous |
| V12 | Fins |
| V13 | Legs |
| V14 | Tail |
| V15 | Domestic |
| V16 | Cat-size |
Figure 1. Research framework.
Classification accuracy comparison.
| Dermatology database | | | | | Zoo database | | | | |
|---|---|---|---|---|---|---|---|---|---|
| C | γ = 1 | γ = 3 | γ = 10 | γ = 12 | C | γ = 0.1 | γ = 5 | γ = 10 | γ = 12 |
| 1 | 52.57% | 95.18% | 94.08% | 94.22% | 1 | 71.18% | 78.09% | 62.36% | 40.64% |
| 10 | 52.57% | 96.04% | 97.94% | 97.93% | 10 | 71.18% | 96.00% | 91.00% | 85.09% |
| 50 | 52.57% | 96.31% | 96.86% | 96.58% | 50 | 71.18% | 96.09% | 96.00% | 96.00% |
| 100 | 52.57% | 96.31% | 96.32% | 96.03% | 100 | 71.18% | 96.09% | 96.09% | 96.00% |
Factor level configuration of LS-SVM parameter design.
| Dermatology database | | | | Zoo database | | | |
|---|---|---|---|---|---|---|---|
| Control factor | Level 1 | Level 2 | Level 3 | Control factor | Level 1 | Level 2 | Level 3 |
| A (C) | 10 | 50 | 100 | A (C) | 5 | 10 | 50 |
| B (γ) | 2.4 | 5 | 10 | B (γ) | 0.08 | 4 | 11 |
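The two three-level factors above give nine level combinations, which is the run order used in the experiment tables that follow. A minimal sketch of how those runs could be enumerated is shown here; the dictionary layout and the function name `design_runs` are illustrative assumptions, not part of the original study, and the level values are copied from the table.

```python
from itertools import product

# Factor levels copied from the factor-level table (A and B as labeled there).
FACTOR_LEVELS = {
    "dermatology": {"A": [10, 50, 100], "B": [2.4, 5, 10]},
    "zoo": {"A": [5, 10, 50], "B": [0.08, 4, 11]},
}

def design_runs(levels):
    """Enumerate all 3 x 3 level combinations in the order (1,1), (1,2), ..., (3,3)."""
    runs = []
    for number, (a, b) in enumerate(product(range(1, 4), repeat=2), start=1):
        runs.append({
            "number": number,
            "A_level": a,
            "B_level": b,
            "A": levels["A"][a - 1],  # parameter value at level a
            "B": levels["B"][b - 1],  # parameter value at level b
        })
    return runs

for run in design_runs(FACTOR_LEVELS["zoo"]):
    print(run)
```

Each run would then be evaluated several times (five observations per run in the tables) before the signal-to-noise ratio is computed.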
Summary of experiment data of Dermatology database.
| Number | Control factor A | Control factor B | Observation 1 | Observation 2 | Observation 3 | Observation 4 | Observation 5 | Average | SN ratio |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 1 | 0.9631 | 0.9701 | 0.9697 | 0.9627 | 0.9614 | 0.9654 | −0.3060 |
| 2 | 1 | 2 | 0.9686 | 0.9749 | 0.9653 | 0.9621 | 0.9732 | 0.9688 | −0.2755 |
| 3 | 1 | 3 | 0.9795 | 0.9847 | 0.9848 | 0.9838 | 0.9735 | 0.9813 | −0.1647 |
| 4 | 2 | 1 | 0.9630 | 0.9615 | 0.9581 | 0.9599 | 0.9668 | 0.9619 | −0.3379 |
| 5 | 2 | 2 | 0.9687 | 0.9721 | 0.9704 | 0.9707 | 0.9626 | 0.9689 | −0.2746 |
| 6 | 2 | 3 | 0.9685 | 0.9748 | 0.9744 | 0.9712 | 0.9707 | 0.9719 | −0.2475 |
| 7 | 3 | 1 | 0.9671 | 0.9689 | 0.9648 | 0.9668 | 0.9645 | 0.9664 | −0.2967 |
| 8 | 3 | 2 | 0.9741 | 0.9704 | 0.9797 | 0.9799 | 0.9767 | 0.9762 | −0.2098 |
| 9 | 3 | 3 | 0.9625 | 0.9633 | 0.9642 | 0.9678 | 0.9619 | 0.9639 | −0.3191 |
(A1 = 10, A2 = 50, A3 = 100; B1 = 2.4, B2 = 5, B3 = 10).
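The last column of the experiment table is consistent with the standard Taguchi larger-the-better signal-to-noise ratio, SN = −10 log10((1/n) Σ 1/y_i²). The record does not state the formula, so treating it as the one used is an assumption; the sketch below checks it against two tabulated rows. The function name `sn_larger_is_better` is illustrative.

```python
import math

def sn_larger_is_better(observations):
    """Larger-the-better SN ratio: -10 * log10(mean of 1 / y^2)."""
    mean_inverse_square = sum(1.0 / (y * y) for y in observations) / len(observations)
    return -10.0 * math.log10(mean_inverse_square)

# Experiment 1 of the Dermatology table (A level 1, B level 1); tabulated SN is -0.3060.
dermatology_run_1 = [0.9631, 0.9701, 0.9697, 0.9627, 0.9614]

# Experiment 4 of the Zoo table (A level 2, B level 1); tabulated SN is -2.9571.
zoo_run_4 = [0.7118, 0.6766, 0.7368, 0.7256, 0.7109]

print(sn_larger_is_better(dermatology_run_1))
print(sn_larger_is_better(zoo_run_4))
```

Both values should agree with the tabulated ones to about three decimal places. Because accuracy is a quantity to be maximized, a ratio closer to zero indicates a better and more stable parameter setting.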
Summary of experiment data of Zoo database.
| Number | Control factor A | Control factor B | Observation 1 | Observation 2 | Observation 3 | Observation 4 | Observation 5 | Average | SN ratio |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 1 | 0.9513 | 0.9673 | 0.9435 | 0.9567 | 0.9546 | 0.9547 | −0.4037 |
| 2 | 1 | 2 | 0.9600 | 0.9616 | 0.9588 | 0.9611 | 0.9608 | 0.9605 | −0.3504 |
| 3 | 1 | 3 | 0.7809 | 0.7833 | 0.7820 | 0.7679 | 0.7811 | 0.7790 | −2.1694 |
| 4 | 2 | 1 | 0.7118 | 0.6766 | 0.7368 | 0.7256 | 0.7109 | 0.7123 | −2.9571 |
| 5 | 2 | 2 | 0.9600 | 0.9612 | 0.9604 | 0.9519 | 0.9440 | 0.9555 | −0.3960 |
| 6 | 2 | 3 | 0.8900 | 0.8947 | 0.9214 | 0.9050 | 0.9190 | 0.9060 | −0.8598 |
| 7 | 3 | 1 | 0.7118 | 0.7398 | 0.7421 | 0.7495 | 0.7203 | 0.7327 | −2.7064 |
| 8 | 3 | 2 | 0.9610 | 0.9735 | 0.9709 | 0.9752 | 0.9661 | 0.9693 | −0.2709 |
| 9 | 3 | 3 | 0.9600 | 0.9723 | 0.9707 | 0.9509 | 0.9763 | 0.9660 | −0.3013 |
(A1 = 5, A2 = 10, A3 = 50; B1 = 0.08, B2 = 4, B3 = 11).
Average of each factor at all levels.
| Dermatology | | | | | Zoo | | | | |
|---|---|---|---|---|---|---|---|---|---|
| Control factor | Level 1 | Level 2 | Level 3 | Difference | Control factor | Level 1 | Level 2 | Level 3 | Difference |
| A | −0.2487 | −0.2867 | −0.2752 | 0.0380 | A | −0.9745 | −1.4043 | −1.0929 | 0.4298 |
| B | −0.3135 | −0.2533 | −0.2438 | 0.0697 | B | −2.0224 | −0.3391 | −1.1102 | 1.6833 |
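The level averages above can be reproduced by averaging the SN ratios of the three experiments that share a factor level. The sketch below does this for the Dermatology data and picks the level with the highest mean SN ratio for each factor. The SN values are copied from the Dermatology experiment table; the helper name `main_effects` and the data layout are illustrative assumptions.

```python
from statistics import mean

# SN ratios from the Dermatology experiment table, keyed by (A level, B level).
SN_RATIO = {
    (1, 1): -0.3060, (1, 2): -0.2755, (1, 3): -0.1647,
    (2, 1): -0.3379, (2, 2): -0.2746, (2, 3): -0.2475,
    (3, 1): -0.2967, (3, 2): -0.2098, (3, 3): -0.3191,
}

def main_effects(sn_ratio, axis):
    """Mean SN ratio per level of one factor (axis 0 = factor A, axis 1 = factor B)."""
    return {
        level: mean(v for key, v in sn_ratio.items() if key[axis] == level)
        for level in (1, 2, 3)
    }

effects_a = main_effects(SN_RATIO, 0)
effects_b = main_effects(SN_RATIO, 1)

# Larger-the-better: choose the level with the highest mean SN ratio.
best_a = max(effects_a, key=effects_a.get)
best_b = max(effects_b, key=effects_b.get)

# "Difference" column: spread between the best and worst level of a factor.
range_a = max(effects_a.values()) - min(effects_a.values())
range_b = max(effects_b.values()) - min(effects_b.values())

print(effects_a, best_a, range_a)
print(effects_b, best_b, range_b)
```

For these data the selected levels are A1 and B3, which correspond to the values 10 and 10 in the experiment footnote. The larger difference for factor B suggests it has the stronger influence on the SN ratio.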
Figure 2. Main effect plots for SN ratio of Dermatology database.
Figure 3. Main effect plots for SN ratio of Zoo database.
Classification performance comparison of Dermatology database.
| Methods | Dimensions | C | γ | Accuracy |
|---|---|---|---|---|
| SVM | 33 | 100 | 5 | 95.10% ± 0.0096 |
| SVM-RFE | 23 | 50 | 2.4 | 89.28% ± 0.0139 |
| SVM-RFE-Taguchi | 23 | 10 | 10 | 95.38% ± 0.0098 |
Figure 4. Classification performance comparison of Dermatology database.
Classification performance comparison of Zoo database.
| Methods | Dimensions | C | γ | Accuracy |
|---|---|---|---|---|
| SVM | 16 | 10 | 11 | 89% ± 0.0314 |
| SVM-RFE | 6 | 50 | 0.08 | 92% ± 0.0199 |
| SVM-RFE-Taguchi | 12 | 5 | 4 | 97% ± 0.0396 |
Figure 5. Classification performance comparison of Zoo database.
Comparison of classification accuracy in related literature.
| Author | Method | Accuracy |
|---|---|---|
| Dermatology database | | |
| Xie et al. (2005) | FOut_SVM | 91.74% |
| Srinivasa et al. (2006) | FCM_SVM | 83.30% |
| Ren et al. (2006) | LDA_SVM | 72.09% |
| Our Method (2014) | SVM-RFE-Taguchi | 95.38% |
| Zoo database | | |
| Xie et al. (2005) | FOut_SVM | 88.24% |
| He (2006) | NFPH_k-modes | 92.08% |
| Golzari et al. (2009) | Fuzzy_AIRS | 94.96% |
| Our Method (2014) | SVM-RFE-Taguchi | 97.00% |