Davies Segera, Mwangi Mbuthia, Abraham Nyete.
Abstract
Finding an optimal set of discriminative features remains a crucial but challenging task in biomedical science. The task becomes harder when either of two scenarios arises: a high-dimensional dataset or a dataset with a small sample size. The first scenario challenges existing machine learning approaches because the search space for the most relevant feature subset is too large to explore quickly with minimal computational resources. The second leaves too few samples to learn from. Many hybrid metaheuristic approaches (i.e., approaches combining multiple search algorithms) have been proposed in the literature to address these challenges, and they perform very well compared with their standalone counterparts. Still, better hybrid approaches can be achieved if the individual metaheuristics within a hybrid algorithm are improved before hybridization. Motivated by this, we propose a new hybrid Excited- (E-) Adaptive Cuckoo Search- (ACS-) Intensification Dedicated Grey Wolf Optimization (IDGWO), i.e., EACSIDGWO. In EACSIDGWO, the step size of the ACS and the nonlinear control strategy of the parameter a⃗ of the IDGWO are made adaptive via the concept of the complete voltage and current responses of a direct current (DC) excited resistor-capacitor (RC) circuit. Since the population has a higher diversity at the early stages of the proposed EACSIDGWO algorithm, both the ACS and the IDGWO are jointly involved in local exploitation. To enhance mature convergence at the later stages of the proposed algorithm, the role of the ACS is switched to global exploration while the IDGWO continues to conduct local exploitation.
To demonstrate that the proposed algorithm learns well from few instances and selects an optimal feature subset from information-rich biomedical data while maintaining high classification accuracy, the EACSIDGWO is employed to solve the feature selection problem. As a feature selector, the EACSIDGWO is tested on six standard biomedical datasets from the University of California at Irvine (UCI) repository. The experimental results are compared with state-of-the-art feature selection techniques, including binary ant-colony optimization (BACO), binary genetic algorithm (BGA), binary particle swarm optimization (BPSO), and the extended binary cuckoo search algorithm (EBCSA). The results show that the EACSIDGWO is comprehensively superior in tackling the feature selection problem, which demonstrates the capability of the proposed algorithm in solving complex real-world problems. The superiority of the proposed algorithm is further supported by ranking methods and statistical analysis.
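The abstract ties the adaptive step size and the control parameter to the complete response of a DC-excited RC circuit. As a reminder of what that response looks like, here is a minimal Python sketch of the textbook formulas v(t) = Vs + (V0 − Vs)·e^(−t/τ) and i(t) = ((Vs − V0)/R)·e^(−t/τ). The function `decaying_schedule`, its 2-to-0 range (borrowed from the standard GWO, where the control parameter decreases from 2 to 0), and the `time_constants` parameter are illustrative assumptions; the exact update rules used by EACSIDGWO are not given in this excerpt.

```python
import math


def rc_voltage(t, v_initial, v_source, tau):
    """Complete voltage response of a DC-excited RC circuit (textbook form)."""
    return v_source + (v_initial - v_source) * math.exp(-t / tau)


def rc_current(t, v_initial, v_source, resistance, tau):
    """Capacitor current of the same circuit: it decays exponentially to zero."""
    return (v_source - v_initial) / resistance * math.exp(-t / tau)


def decaying_schedule(iteration, max_iter, start=2.0, end=0.0, time_constants=5.0):
    """Hypothetical mapping of the voltage response onto a control parameter.

    Assumption: the run spans `time_constants` time constants, so the
    parameter has almost settled at `end` by the last iteration.
    """
    tau = max_iter / time_constants
    return rc_voltage(iteration, start, end, tau)


if __name__ == "__main__":
    for it in (0, 25, 50, 75, 100):
        print(it, round(decaying_schedule(it, 100), 4))
```

With these assumed settings the schedule starts at 2, falls quickly at first, and then flattens as it approaches 0, which is the nonlinear shape an exponential response provides.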
Year: 2020 PMID: 32908920 PMCID: PMC7450338 DOI: 10.1155/2020/8506365
Source DB: PubMed Journal: Biomed Res Int Impact factor: 3.411
Algorithm 1: Pseudocode for the standard CS.
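The pseudocode itself is not reproduced here. For orientation, the following is a minimal Python sketch of the standard cuckoo search as it is usually described: Lévy-flight proposals, replacement of a randomly chosen nest when the new solution is better, and abandonment of a fraction pa of the worst nests. The function names, default parameter values, the use of Mantegna's method to draw Lévy steps, and the scalar box bounds are assumptions made for illustration, not details taken from the paper.

```python
import math
import random


def levy_step(rng, beta=1.5):
    """Draw one heavy-tailed step using Mantegna's method."""
    sigma_u = (
        math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
        / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
    ) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # guard against division by zero
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def cuckoo_search(objective, dim, lower, upper, n_nests=15, pa=0.25,
                  alpha=0.01, max_iter=200, seed=0):
    """Minimise `objective` over the box [lower, upper]^dim."""
    rng = random.Random(seed)
    scale = alpha * (upper - lower)

    def clip(x):
        return min(max(x, lower), upper)

    def random_nest():
        return [rng.uniform(lower, upper) for _ in range(dim)]

    nests = [random_nest() for _ in range(n_nests)]
    fitness = [objective(nest) for nest in nests]

    for _ in range(max_iter):
        # Each cuckoo proposes a new solution by a Levy flight.
        for i in range(n_nests):
            candidate = [clip(x + scale * levy_step(rng)) for x in nests[i]]
            f_new = objective(candidate)
            j = rng.randrange(n_nests)  # compare against a random nest
            if f_new < fitness[j]:
                nests[j], fitness[j] = candidate, f_new

        # Abandon a fraction pa of the worst nests and rebuild them at random.
        n_abandon = int(pa * n_nests)
        worst = sorted(range(n_nests), key=fitness.__getitem__, reverse=True)
        for k in worst[:n_abandon]:
            nests[k] = random_nest()
            fitness[k] = objective(nests[k])

    best = min(range(n_nests), key=fitness.__getitem__)
    return nests[best], fitness[best]


if __name__ == "__main__":
    solution, value = cuckoo_search(lambda x: sum(v * v for v in x),
                                    dim=3, lower=-5.0, upper=5.0)
    print(solution, value)
```

Because a nest is only overwritten by a better candidate and the best nest is never among those abandoned, the best fitness in this sketch cannot get worse from one iteration to the next.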
Algorithm 3: Pseudocode for the EACSIDGWO (binary version).
Considered biomedical datasets.
| Dataset | Number of features | Number of cases |
|---|---|---|
| Breast Cancer Wisconsin (prognosis) | 33 | 198 |
| Breast Cancer Wisconsin (diagnostic) | 30 | 569 |
| SPECTF Heart | 44 | 267 |
| Ovarian Cancer | 4000 | 216 |
| CNS | 7129 | 60 |
| Colon | 2000 | 62 |
Selection of suitable kernel functions.
| Dataset | Kernel function |
|---|---|
| Breast Cancer Wisconsin (prognosis) | Radial basis function (RBF) |
| Breast Cancer Wisconsin (diagnostic) | Radial basis function (RBF) |
| SPECTF Heart | Radial basis function (RBF) |
| Ovarian Cancer | Linear function |
| CNS | Linear function |
| Colon | Linear function |
Selection of parameter values for the considered approaches.
| Algorithm | Parameter values |
|---|---|
| EACSIDGWO | stepMax = 1, |
| EBCS | |
| BACO | Γinitial = 0.1, |
| BGA | |
| BPSO | |
Experimental results for the Ovarian Cancer dataset.
| Algorithm | Max_Acc | Min_Acc | Avg_Acc | Max_NFeat | Min_NFeat | Avg_NFeat |
|---|---|---|---|---|---|---|
| EACSIDGWO | | | | | | |
| EBCS | | 0.991 | 0.991 | 1855 | 1747 | 1811.6 |
| BACO | | | | | | |
| BGA | | 0.991 | 0.991 | 1830 | 1755 | 1887.3 |
| BPSO | | | | 1913 | 1777 | 1857 |
Values in bold represent the best result, and values in italic denote the worst in each column, respectively.
Experimental results for the Breast Cancer Wisconsin (Diagnostic) dataset.
| Algorithm | Max_Acc | Min_Acc | Avg_Acc | Max_NFeat | Min_NFeat | Avg_NFeat |
|---|---|---|---|---|---|---|
| EACSIDGWO | 0.977 | | | | | |
| EBCS | | | 0.973 | 4 | | 3.1 |
| BACO | | | | | | |
| BGA | 0.975 | 0.965 | 0.972 | 6 | | 3.6 |
| BPSO | | 0.963 | 0.974 | | | 5.4 |
Values in bold represent the best result, and values in italic denote the worst in each column, respectively.
Experimental results for the Breast Cancer Wisconsin (Prognosis) dataset.
| Algorithm | Max_Acc | Min_Acc | Avg_Acc | Max_NFeat | Min_NFeat | Avg_NFeat |
|---|---|---|---|---|---|---|
| EACSIDGWO | | | | | | |
| EBCS | 0.874 | 0.828 | 0.856 | 8 | 4 | 6.2 |
| BACO | | | | | | |
| BGA | 0.874 | 0.793 | 0.843 | 10 | 4 | 6.5 |
| BPSO | | | 0.821 | 11 | 4 | 8.3 |
Values in bold represent the best result, and values in italic denote the worst in each column, respectively.
Experimental results for the SPECTF Heart dataset.
| Algorithm | Max_Acc | Min_Acc | Avg_Acc | Max_NFeat | Min_NFeat | Avg_NFeat |
|---|---|---|---|---|---|---|
| EACSIDGWO | | | | | | |
| EBCS | 0.873 | 0.846 | 0.861 | 8 | 5 | 6.2 |
| BACO | | | | | | |
| BGA | | 0.846 | 0.866 | 11 | 4 | 8.4 |
| BPSO | 0.865 | 0.846 | 0.854 | | 9 | 10.9 |
Values in bold represent the best result, and values in italic denote the worst in each column, respectively.
Experimental results for the CNS dataset.
| Algorithm | Max_Acc | Min_Acc | Avg_Acc | Max_NFeat | Min_NFeat | Avg_NFeat |
|---|---|---|---|---|---|---|
| EACSIDGWO | | | | | | |
| EBCS | | 0.667 | 0.667 | 3490 | 3391 | 3446.7 |
| BACO | | | | | 3432 | |
| BGA | 0.683 | 0.667 | 0.668 | 3566 | | 3489.7 |
| BPSO | | 0.667 | 0.667 | 3547 | 3359 | 3474.3 |
Values in bold represent the best result, and values in italic denote the worst in each column, respectively.
Experimental results for the Colon dataset.
| Algorithm | Max_Acc | Min_Acc | Avg_Acc | Max_NFeat | Min_NFeat | Avg_NFeat |
|---|---|---|---|---|---|---|
| EACSIDGWO | | | | | | |
| EBCS | 0.903 | 0.871 | 0.887 | | | |
| BACO | 0.903 | 0.871 | | 1002 | 932 | 976 |
| BGA | | 0.871 | 0.882 | 1003 | 944 | 962.8 |
| BPSO | | | 0.879 | 1003 | 933 | 971.2 |
Values in bold represent the best result, and values in italic denote the worst in each column, respectively.
Using Wilcoxon's rank-sum test at p = 0.05 to compare EACSIDGWO with other algorithms.
| Dataset | Wilcoxon's rank-sum test | EBCS vs EACSIDGWO | BACO vs EACSIDGWO | BGA vs EACSIDGWO | BPSO vs EACSIDGWO |
|---|---|---|---|---|---|
| Ovarian Cancer | p-value | 0.000181651 | 0.000181651 | 0.000182672 | 0.000181651 |
| | h-value | 1.000000000 | 1.000000000 | 1.000000000 | 1.000000000 |
| | z-value | 3.743255786 | 3.743255786 | 3.741848283 | 3.743255786 |
| Breast Cancer Wisconsin (diagnostic) | p-value | 0.022591996 | 0.000146767 | 0.017044126 | 0.000582314 |
| | h-value | 1.000000000 | 1.000000000 | 1.000000000 | 1.000000000 |
| | z-value | 2.28026466 | 3.796476695 | 2.38575448 | 3.439721266 |
| Breast Cancer Wisconsin (prognosis) | p-value | 0.000730466 | 0.0001707 | 0.00073729 | 0.000174624 |
| | h-value | 1.000000000 | 1.0000000 | 1.00000000 | 1.000000000 |
| | z-value | 3.377881495 | 3.758843896 | 3.375323463 | 3.753152986 |
| SPECTF Heart | p-value | 0.000321376 | 0.000176611 | 0.000176611 | 0.000177611 |
| | h-value | 1.000000000 | 1.000000000 | 1.000000000 | 1.000000000 |
| | z-value | 3.597430949 | 3.750317207 | 3.750317207 | 3.748901726 |
| CNS | p-value | 0.000182672 | 0.000182672 | 0.000182672 | 0.000182672 |
| | h-value | 1.000000000 | 1.000000000 | 1.000000000 | 1.000000000 |
| | z-value | 3.741848283 | 3.741848283 | 3.741848283 | 3.741848283 |
| Colon | p-value | 0.000182672 | 0.000182672 | 0.000182672 | 0.000181651 |
| | h-value | 1.000000000 | 1.000000000 | 1.000000000 | 1.000000000 |
| | z-value | 3.741848283 | 3.741848283 | 3.741848283 | 3.743255786 |
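In each dataset block above, the first and third rows of values are consistent with a two-sided p-value and a z statistic linked by the normal approximation of the rank-sum test, p = erfc(|z|/√2). The Python sketch below computes that approximation with a 0.5 continuity correction and no tie correction. The two ten-element samples are hypothetical: they were chosen because complete separation of two groups of ten reproduces z ≈ 3.741848 and p ≈ 0.000183, a pair that recurs in the table. The number of runs per algorithm is not stated in this excerpt, so treat the match as an observation rather than a confirmed description of the experiment.

```python
import math


def rank_sum_z(sample_a, sample_b):
    """Normal-approximation z for the Wilcoxon rank-sum test.

    Assumes no tied values; applies a 0.5 continuity correction.
    """
    pooled = sorted([(v, 0) for v in sample_a] + [(v, 1) for v in sample_b])
    # Sum of the ranks (1-based) held by the first sample.
    w = sum(rank for rank, (_, group) in enumerate(pooled, start=1) if group == 0)
    n1, n2 = len(sample_a), len(sample_b)
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    diff = w - mean
    if diff == 0:
        return 0.0
    return (diff - math.copysign(0.5, diff)) / sd


def two_sided_p(z):
    """Two-sided p-value of a standard normal statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


if __name__ == "__main__":
    # Hypothetical, fully separated samples of ten values each.
    group_a = [float(i) for i in range(1, 11)]
    group_b = [float(i) for i in range(11, 21)]
    z = rank_sum_z(group_a, group_b)
    print(abs(z), two_sided_p(z))
```

The slightly different pair in the table (z = 3.743255786, p = 0.000181651) satisfies the same erfc relation, which would fit a tie-corrected variance, although the excerpt does not say so.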
Overall ranking of considered algorithms.
| Algorithm | Measure | Ovarian Cancer | Breast Cancer Wisconsin (diagnostic) | Breast Cancer Wisconsin (prognosis) | SPECTF Heart | CNS | Colon | Sum of ranks | Overall rank | Total sum | Final rank |
|---|---|---|---|---|---|---|---|---|---|---|---|
| EACSIDGWO | Max_Acc | 1 | 2 | 1 | 1 | 1 | 1 | 7 | 1 | 37 | 1 |
| | Min_Acc | 1 | 1 | 1 | 1 | 1 | 1 | 6 | 1 | | |
| | Avg_Acc | 1 | 1 | 1 | 1 | 1 | 1 | 6 | 1 | | |
| | Max_NFeat | 1 | 1 | 1 | 1 | 1 | 1 | 6 | 1 | | |
| | Min_NFeat | 1 | 1 | 1 | 1 | 1 | 1 | 6 | 1 | | |
| | Avg_NFeat | 1 | 1 | 1 | 1 | 1 | 1 | 6 | 1 | | |
| EBCS | Max_Acc | 2 | 1 | 2 | 3 | 3 | 2 | 13 | 2 | 84 | 2 |
| | Min_Acc | 2 | 1 | 2 | 2 | 2 | 2 | 11 | 2 | | |
| | Avg_Acc | 2 | 2 | 2 | 3 | 3 | 2 | 14 | 2 | | |
| | Max_NFeat | 3 | 2 | 2 | 2 | 2 | 4 | 15 | 2 | | |
| | Min_NFeat | 2 | 1 | 2 | 3 | 3 | 5 | 16 | 2 | | |
| | Avg_NFeat | 2 | 2 | 2 | 2 | 2 | 5 | 15 | 2 | | |
| BACO | Max_Acc | 2 | 4 | 4 | 4 | 2 | 2 | 18 | 4 | 138 | 5 |
| | Min_Acc | 3 | 4 | 5 | 3 | 3 | 2 | 20 | 4 | | |
| | Avg_Acc | 3 | 5 | 5 | 5 | 4 | 4 | 26 | 5 | | |
| | Max_NFeat | 5 | 4 | 5 | 4 | 5 | 2 | 25 | 5 | | |
| | Min_NFeat | 5 | 2 | 3 | 5 | 3 | 2 | 20 | 3 | | |
| | Avg_NFeat | 5 | 5 | 5 | 5 | 5 | 4 | 29 | 5 | | |
| BGA | Max_Acc | 2 | 3 | 2 | 1 | 2 | 3 | 13 | 2 | 95 | 3 |
| | Min_Acc | 2 | 2 | 5 | 2 | 2 | 2 | 15 | 3 | | |
| | Avg_Acc | 2 | 4 | 3 | 2 | 2 | 3 | 16 | 3 | | |
| | Max_NFeat | 2 | 3 | 3 | 3 | 4 | 3 | 18 | 3 | | |
| | Min_NFeat | 3 | 1 | 2 | 2 | 4 | 4 | 16 | 2 | | |
| | Avg_NFeat | 3 | 3 | 3 | 3 | 3 | 2 | 17 | 3 | | |
| BPSO | Max_Acc | 2 | 1 | 3 | 3 | 3 | 3 | 15 | 3 | 110 | 4 |
| | Min_Acc | 3 | 3 | 2 | 2 | 2 | 3 | 15 | 3 | | |
| | Avg_Acc | 3 | 2 | 4 | 4 | 3 | 5 | 21 | 4 | | |
| | Max_NFeat | 4 | 4 | 4 | 4 | 3 | 3 | 22 | 4 | | |
| | Min_NFeat | 4 | 1 | 2 | 4 | 2 | 3 | 16 | 2 | | |
| | Avg_NFeat | 3 | 4 | 4 | 4 | 3 | 3 | 21 | 4 | | |
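The aggregation behind the final ranking can be checked with a few lines of Python. The per-measure sums below are copied from the "Sum of ranks" column of the ranking table; the variable and function names are illustrative, and a lower total is taken to mean a better algorithm, as the table implies.

```python
# Sum of ranks per measure, in the order:
# Max_Acc, Min_Acc, Avg_Acc, Max_NFeat, Min_NFeat, Avg_NFeat
rank_sums = {
    "EACSIDGWO": [7, 6, 6, 6, 6, 6],
    "EBCS": [13, 11, 14, 15, 16, 15],
    "BACO": [18, 20, 26, 25, 20, 29],
    "BGA": [13, 15, 16, 18, 16, 17],
    "BPSO": [15, 15, 21, 22, 16, 21],
}


def final_ranking(sums_by_algorithm):
    """Return (totals, final_rank), ranking by ascending total sum of ranks."""
    totals = {name: sum(values) for name, values in sums_by_algorithm.items()}
    ordered = sorted(totals, key=totals.get)
    final_rank = {name: position for position, name in enumerate(ordered, start=1)}
    return totals, final_rank


totals, final_rank = final_ranking(rank_sums)

if __name__ == "__main__":
    for name in sorted(final_rank, key=final_rank.get):
        print(final_rank[name], name, totals[name])
```

The totals and final ranks computed this way agree with the "Total sum" and final-rank columns reported in the table.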