Lingyun Gao, Mingquan Ye, Changrong Wu.
Abstract
Intelligent optimization algorithms are well suited to complex nonlinear problems and offer good flexibility and adaptability. In this paper, the FCBF (Fast Correlation-Based Feature selection) method is used to filter out irrelevant and redundant features in order to improve the quality of cancer classification. Classification is then performed with an SVM (Support Vector Machine) optimized by PSO (Particle Swarm Optimization) combined with ABC (Artificial Bee Colony), an approach denoted PA-SVM. The proposed PA-SVM method is applied to nine cancer datasets, including five outcome-prediction datasets and a protein dataset of ovarian cancer. Comparison with other classification methods demonstrates the effectiveness and robustness of the proposed PA-SVM method in handling various types of data for cancer classification.
Keywords: ABC; PSO; SVM; cancer classification; intelligent optimization
Year: 2017 PMID: 29186052 PMCID: PMC6149693 DOI: 10.3390/molecules22122086
Source DB: PubMed Journal: Molecules ISSN: 1420-3049 Impact factor: 4.411
Reduced attributes by FCBF.
| Datasets | Original Attributes | Reduced Attributes |
|---|---|---|
| Breast cancer | 24,481 | 92 |
| Lung cancer | 2880 | 6 |
| NervSys | 7129 | 28 |
| Prostate cancer | 12,600 | 27 |
| Colon cancer | 2000 | 14 |
| Leukemia | 12,582 | 97 |
| Ovarian cancer | 15,154 | 30 |
| DLBCL1 ¹ | 7129 | 73 |
| DLBCL2 | 7129 | 27 |
¹ DLBCL represents diffuse large B-cell lymphoma.
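
The filtering step behind these reductions can be illustrated with a small sketch. It follows the general FCBF idea of ranking features by symmetrical uncertainty (SU) with the class and then discarding features that are redundant with a more relevant one. It is not the authors' implementation: the function names (`entropy`, `symmetrical_uncertainty`, `fcbf`), the `threshold` default, and the assumption that features are already discretized are all illustrative choices.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(x, y) = 2 * I(x; y) / (H(x) + H(y)), a value in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0
    hxy = entropy(list(zip(x, y)))   # joint entropy H(x, y)
    info_gain = hx + hy - hxy        # mutual information I(x; y)
    return 2.0 * info_gain / (hx + hy)


def fcbf(features, labels, threshold=0.0):
    """Return indices of selected features.

    features: list of columns, each a list of discrete values
              (continuous expression data is assumed to be discretized first).
    """
    # Relevance: keep features whose SU with the class exceeds the threshold,
    # ordered from most to least relevant.
    su_class = [symmetrical_uncertainty(col, labels) for col in features]
    candidates = sorted(
        (i for i, su in enumerate(su_class) if su > threshold),
        key=lambda i: su_class[i],
        reverse=True,
    )
    # Redundancy: drop a feature when an already selected (more relevant)
    # feature is at least as correlated with it as the class is.
    selected = []
    for i in candidates:
        redundant = any(
            symmetrical_uncertainty(features[j], features[i]) >= su_class[i]
            for j in selected
        )
        if not redundant:
            selected.append(i)
    return selected
```

On a toy input with one informative column, an inverted copy of it, and an uninformative column, only the first column is kept, which mirrors on a small scale how thousands of attributes shrink to a few dozen.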
Classification accuracy of different methods.
| Datasets | No. of Attributes | LIBSVM (%) | GA-SVM ¹ (%) | PSO-SVM ² (%) | ABC-SVM ³ (%) | PA-SVM ⁴ (%) |
|---|---|---|---|---|---|---|
| Breast cancer | 92 | 83.51 | 87.63 | 87.63 | | |
| Lung cancer | 6 | 64.10 | 66.67 | 74.36 | | |
| NervSys | 28 | 88.33 | 90 | 90 | | |
| Prostate cancer | 27 | 90.48 | | | | |
| Colon cancer | 14 | 85.48 | 90.32 | 90.32 | | |
| Leukemia | 97 | | | | | |
| Ovarian cancer | 30 | | | | | |
| DLBCL1 | 73 | 98.70 | 98.70 | 98.70 | 98.70 | |
| DLBCL2 | 27 | 81.03 | 82.76 | 84.48 | 82.76 | |
¹ GA-SVM combines a genetic algorithm with SVM; ² PSO-SVM combines particle swarm optimization with SVM; ³ ABC-SVM uses the artificial bee colony method to optimize SVM; ⁴ PA-SVM combines particle swarm optimization with artificial bee colony to optimize SVM. Bold values in the table represent the optimal value.
Figure 1. Cancer classification accuracy (%) of different methods.
Details of cancer datasets.
| Datasets | Samples | No. of Attributes | Classes | Labels |
|---|---|---|---|---|
| Breast cancer | 97 | 24,481 | 2 | outcome prediction |
| Lung cancer | 39 | 2880 | 2 | outcome prediction |
| NervSys | 60 | 7129 | 2 | outcome prediction |
| Prostate cancer | 21 | 12,600 | 2 | outcome prediction |
| Colon cancer | 62 | 2000 | 2 | cancer or not |
| Leukemia | 72 | 12,582 | 3 | multi-category |
| Ovarian cancer | 253 | 15,154 | 2 | protein data |
| DLBCL1 | 77 | 7129 | 2 | two-category |
| DLBCL2 | 58 | 7129 | 2 | outcome prediction |
Figure 2. The flowchart of the PSO algorithm.
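
The standard PSO loop can be sketched in a few lines. This is a minimal global-best PSO under stated assumptions, not the authors' implementation: the function name `pso_minimize`, the inertia and acceleration values (`w`, `c1`, `c2`), and the clamping of positions to the bounds are illustrative. In an SVM-tuning setting the `objective` would plausibly be a cross-validated error over SVM hyperparameters, but that mapping is an assumption here; the example uses a toy quadratic instead.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iter=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO; returns (best_position, best_value)."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Initialize particles uniformly inside the bounds, with zero velocity.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                    # personal best positions
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]   # global best so far

    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull to personal best + pull to global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move and clamp the coordinate to its bounds.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_objective(p):
    """Hypothetical stand-in objective with its minimum at (1, -2)."""
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


best, best_val = pso_minimize(toy_objective, [(-5.0, 5.0), (-5.0, 5.0)])
```

The fixed `seed` makes runs repeatable, which helps when comparing parameter settings.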
Figure 3. The flowchart of the ABC algorithm.
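
The three phases of a basic ABC cycle (employed bees, onlooker bees, scout bees) can be sketched as follows. This is a simplified illustration of the general algorithm, not the authors' implementation: the function name `abc_minimize`, the `limit` value, the fitness transform, and the choice to let every stalled source be re-scouted in the same cycle are assumptions.

```python
import random


def abc_minimize(objective, bounds, n_sources=10, n_iter=300, limit=20, seed=0):
    """Basic artificial bee colony; returns (best_position, best_value)."""
    rng = random.Random(seed)
    dim = len(bounds)

    def random_source():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def neighbour(i):
        # Perturb one coordinate of source i relative to a random other source.
        k = rng.choice([j for j in range(n_sources) if j != i])
        d = rng.randrange(dim)
        lo, hi = bounds[d]
        cand = sources[i][:]
        phi = rng.uniform(-1.0, 1.0)
        cand[d] = min(max(cand[d] + phi * (cand[d] - sources[k][d]), lo), hi)
        return cand

    def try_improve(i):
        # Greedy selection: keep the candidate only if it is better.
        cand = neighbour(i)
        val = objective(cand)
        if val < costs[i]:
            sources[i], costs[i], trials[i] = cand, val, 0
        else:
            trials[i] += 1

    sources = [random_source() for _ in range(n_sources)]
    costs = [objective(s) for s in sources]
    trials = [0] * n_sources
    b = min(range(n_sources), key=lambda i: costs[i])
    best, best_val = sources[b][:], costs[b]

    for _ in range(n_iter):
        # Employed bees: one local search per food source.
        for i in range(n_sources):
            try_improve(i)
        # Onlooker bees: choose sources with probability proportional to fitness.
        fitness = [1.0 / (1.0 + c) if c >= 0 else 1.0 + abs(c) for c in costs]
        for _ in range(n_sources):
            i = rng.choices(range(n_sources), weights=fitness)[0]
            try_improve(i)
        # Memorize the best source before any source is abandoned.
        b = min(range(n_sources), key=lambda i: costs[i])
        if costs[b] < best_val:
            best, best_val = sources[b][:], costs[b]
        # Scout bees: replace sources that failed to improve too many times.
        for i in range(n_sources):
            if trials[i] > limit:
                sources[i] = random_source()
                costs[i] = objective(sources[i])
                trials[i] = 0
    return best, best_val


def toy_objective(p):
    """Hypothetical stand-in objective with its minimum at (1, -2)."""
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


best, best_val = abc_minimize(toy_objective, [(-5.0, 5.0), (-5.0, 5.0)])
```

The scout phase is what distinguishes ABC from a purely greedy local search: abandoning stalled sources reintroduces diversity, while the separately stored best solution is never lost.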
Figure 4. The overall framework of the proposed PA-SVM algorithm.