Shokoufeh Aalaei, Hadi Shahraki, Alireza Rowhanimanesh, Saeid Eslami.
Abstract
OBJECTIVES: This study addresses feature selection for breast cancer diagnosis. The proposed method is a wrapper approach that combines GA-based feature selection with a PS-classifier. The experimental results show that the proposed model is comparable to other models on the Wisconsin breast cancer datasets.
Keywords: Breast cancer; Classification; Feature selection; Data mining
Year: 2016 PMID: 27403253 PMCID: PMC4923467
Source DB: PubMed Journal: Iran J Basic Med Sci ISSN: 2008-3866 Impact factor: 2.699
Wisconsin breast cancer datasets (18)
| Dataset | No. of attributes | No. of instances | No. of classes |
|---|---|---|---|
| Wisconsin breast cancer (WBC) | 11 | 699 | 2 |
| Wisconsin diagnosis breast cancer (WDBC) | 32 | 569 | 2 |
| Wisconsin prognosis breast cancer (WPBC) | 34 | 198 | 2 |
Wisconsin breast cancer (WBC) attributes (20)
| # | Attribute | Domain |
|---|---|---|
| 1 | Sample code number | Id number |
| 2 | Clump thickness | 1 – 10 |
| 3 | Uniformity of cell size | 1 – 10 |
| 4 | Uniformity of cell shape | 1 – 10 |
| 5 | Marginal adhesion | 1 – 10 |
| 6 | Single epithelial cell size | 1 – 10 |
| 7 | Bare nuclei | 1 – 10 |
| 8 | Bland chromatin | 1 – 10 |
| 9 | Normal nucleoli | 1 – 10 |
| 10 | Mitoses | 1 – 10 |
| 11 | Class | (2 for benign, 4 for malignant) |
Figure 1. Generating initial population
Figure 2. Separating two classes with one hyperplane
Figure 3. Proposed feature selection flowchart
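The wrapper approach named in the abstract (a GA searching over feature subsets, with a classifier scoring each subset) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the bit-mask encoding, tournament selection, one-point crossover, elitism, and all parameter values are assumptions, and `toy_fitness` is a hypothetical stand-in for the classifier accuracy that a real wrapper would compute.

```python
import random


def init_population(n_features, pop_size, rng):
    """Random bit-mask chromosomes; each mask keeps at least one feature."""
    population = []
    while len(population) < pop_size:
        mask = [rng.randint(0, 1) for _ in range(n_features)]
        if any(mask):
            population.append(mask)
    return population


def ga_select_features(n_features, fitness, pop_size=20, generations=30,
                       mutation_rate=0.05, seed=0):
    """Wrapper-style GA: fitness(mask) scores a feature subset (higher is better)."""
    rng = random.Random(seed)
    population = init_population(n_features, pop_size, rng)
    best = max(population, key=fitness)
    for _ in range(generations):
        next_gen = [best[:]]  # elitism: carry the best mask forward unchanged
        while len(next_gen) < pop_size:
            # Tournament selection of two parents (assumed scheme).
            p1 = max(rng.sample(population, 3), key=fitness)
            p2 = max(rng.sample(population, 3), key=fitness)
            # One-point crossover.
            cut = rng.randint(1, n_features - 1)
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation.
            child = [1 - b if rng.random() < mutation_rate else b for b in child]
            if any(child):  # reject the empty feature subset
                next_gen.append(child)
        population = next_gen
        best = max(population, key=fitness)
    return best


def toy_fitness(mask):
    """Hypothetical score: rewards positions 2, 5, 7 and penalizes subset size.

    In a real wrapper this would be the validation accuracy of a classifier
    trained on the features selected by `mask`.
    """
    useful = {2, 5, 7}
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in useful)
    return hits - 0.1 * sum(mask)


best_mask = ga_select_features(n_features=9, fitness=toy_fitness)
```

Because the best mask of each generation is copied into the next one, the best score never decreases from one generation to the next in this sketch.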
Selected features after applying the feature selection method
| Dataset | Selected features |
|---|---|
| WBC | 3,6,8,9 |
| WDBC | 1,2,6,8,12,14,18,19,21,22,25,26,27,29 |
| WPBC | 1,4,5,6,7,10,11,13,15,16,18,23,24,25,28,29 |
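To illustrate how a selected-feature list like those in the table might be applied, the sketch below projects a feature vector onto the listed positions. The source does not state whether the indices are 1-based or whether they count the ID and class columns, so 1-based indexing over the feature vector is an assumption here; `project` and `example_row` are hypothetical names, and the row holds placeholder values rather than real data.

```python
# Selected-feature lists copied from the table above.
SELECTED = {
    "WBC": [3, 6, 8, 9],
    "WDBC": [1, 2, 6, 8, 12, 14, 18, 19, 21, 22, 25, 26, 27, 29],
    "WPBC": [1, 4, 5, 6, 7, 10, 11, 13, 15, 16, 18, 23, 24, 25, 28, 29],
}


def project(row, indices, one_based=True):
    """Keep only the listed positions of `row` (assumed 1-based by default)."""
    offset = 1 if one_based else 0
    return [row[i - offset] for i in indices]


# Placeholder vector whose values make the kept positions easy to read.
example_row = [10, 20, 30, 40, 50, 60, 70, 80, 90]
reduced = project(example_row, SELECTED["WBC"])
```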
Sensitivity, specificity, and accuracy of the 3 classifiers with and without feature selection (FS) on the WBC dataset
| Classifier | Accuracy without FS | Accuracy with FS | Specificity without FS | Specificity with FS | Sensitivity without FS | Sensitivity with FS |
|---|---|---|---|---|---|---|
| PSO | 96.2 | 96.9 | 96.4 | 97.5 | 96.5 | 97.7 |
| GA | 96 | 96.6 | 96.5 | 96.6 | 96.5 | 97.1 |
| ANN | 96.8 | 96.7 | 95.2 | 97.2 | 94.9 | 97.2 |
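For readers unfamiliar with the three measures reported in these tables, the sketch below computes them from confusion-matrix counts using the standard definitions. It assumes the positive class is the malignant one, which the source does not state explicitly; `diagnostic_metrics` is a hypothetical helper, and the counts are illustrative, not taken from the study.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard definitions from confusion-matrix counts.

    tp/fn: positive cases classified correctly/incorrectly
    tn/fp: negative cases classified correctly/incorrectly
    """
    total = tp + fn + tn + fp
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "accuracy": (tp + tn) / total,   # overall fraction correct
    }


# Illustrative counts only (not from the paper).
m = diagnostic_metrics(tp=90, fn=10, tn=180, fp=20)
```

The tables report these quantities as percentages, so each value returned here would be multiplied by 100 for comparison.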
Sensitivity, specificity, and accuracy of the 3 classifiers with and without feature selection (FS) on the WDBC dataset
| Classifier | Accuracy without FS | Accuracy with FS | Specificity without FS | Specificity with FS | Sensitivity without FS | Sensitivity with FS |
|---|---|---|---|---|---|---|
| PSO | 96.4 | 97.2 | 93.1 | 95.6 | 98.6 | 98 |
| GA | 96.1 | 96.6 | 92.9 | 93.7 | 97.8 | 97.5 |
| ANN | 96.5 | 97.3 | 96 | 95.1 | 98.2 | 98.4 |
Sensitivity, specificity, and accuracy of the 3 classifiers with and without feature selection (FS) on the WPBC dataset
| Classifier | Accuracy without FS | Accuracy with FS | Specificity without FS | Specificity with FS | Sensitivity without FS | Sensitivity with FS |
|---|---|---|---|---|---|---|
| PSO | 77.8 | 78.2 | 88.5 | 92.9 | 32.0 | 33.3 |
| GA | 76.3 | 78.1 | 90.2 | 92.8 | 26.9 | 31.0 |
| ANN | 77.4 | 79.2 | 94.4 | 96.3 | 28.3 | 33 |
Comparison of the proposed method with results reported in other papers on the WBC dataset
| Classifier | CART | AR+NN | RS-SVM | SVM | Graph-based | ANN (this study) | PS-classifier (this study) | GA-classifier (this study) |
|---|---|---|---|---|---|---|---|---|
| Classification accuracy | 96.9 | 97.4 | 96.8 | 96.5 | 96.4 | 96.7 | 96.9 | 96.6 |
Comparison of the proposed method with results reported in other papers on the WDBC dataset
| Classifier | CART | RBF_FS | FRNN_FS | FS_SFS | ANN (this study) | PS-classifier (this study) | GA-classifier (this study) |
|---|---|---|---|---|---|---|---|
| Classification accuracy | 94.7 | 96.05 | 95.88 | 93.0 | 97.3 | 97.2 | 96.6 |
Comparison of the proposed method with results reported in other papers on the WPBC dataset
| Classifier | CART | Naïve Bayes-ReliefF | Naïve Bayes-Fisher Filtering | ANN (this study) | PS-classifier (this study) | GA-classifier (this study) |
|---|---|---|---|---|---|---|
| Classification accuracy | 73.3 | 77.74 | 75.25 | 79.2 | 78.2 | 78.1 |