Zhiwei Ye1, Yi Xu1, Qiyi He1, Mingwei Wang1, Wanfang Bai2, Hongwei Xiao3.
Abstract
With the rapid development of the Internet of Things (IoT), the curse of dimensionality has become increasingly common. Feature selection (FS) aims to eliminate irrelevant and redundant features from datasets. Particle swarm optimization (PSO) is an efficient metaheuristic algorithm that has been successfully applied to obtain an optimal feature subset containing the essential information in an acceptable time. However, on high-dimensional datasets it easily falls into local optima because of constant parameter values and insufficient population diversity. In this paper, an FS method based on adaptive PSO with leadership learning (APSOLL) is proposed. An adaptive updating strategy replaces the constant parameters, and a leadership learning strategy provides valid population diversity. Experimental results on 10 UCI datasets show that APSOLL has better exploration and exploitation capabilities than PSO, grey wolf optimizer (GWO), Harris hawks optimization (HHO), flower pollination algorithm (FPA), salp swarm algorithm (SSA), linear PSO (LPSO), and hybrid PSO and differential evolution (HPSO-DE). On average, fewer than 8% of the features in the original datasets are selected, and in most cases the resulting feature subsets are more effective than those generated by 6 traditional FS methods (analysis of variance (ANOVA), Chi-Squared (CHI2), Pearson, Spearman, Kendall, and Mutual Information (MI)).
Year: 2022 PMID: 36072739 PMCID: PMC9441366 DOI: 10.1155/2022/1825341
Source DB: PubMed Journal: Comput Intell Neurosci
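As an illustration of the PSO-based FS idea summarized in the abstract, the following minimal Python sketch runs a plain PSO wrapper with a linearly decreasing inertia weight on synthetic data. It is not APSOLL: the adaptive parameter update and the leadership learning strategy are not specified in this record, so neither is reproduced. The 1-nearest-neighbour classifier, the 0.5 selection threshold, the weight `alpha`, the coefficient values, and every function name are assumptions made for the example.

```python
import numpy as np

THRESHOLD = 0.5  # assumed: a feature is selected when its position exceeds 0.5


def knn_accuracy(X_train, y_train, X_test, y_test):
    """1-nearest-neighbour accuracy (stand-in classifier, an assumption)."""
    dist = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    pred = y_train[dist.argmin(axis=1)]
    return float((pred == y_test).mean())


def fitness(position, data, alpha=0.7):
    """Weighted sum of accuracy and feature reduction; alpha is an assumption."""
    mask = position > THRESHOLD
    if not mask.any():  # an empty subset cannot be classified
        return 0.0
    X_tr, y_tr, X_te, y_te = data
    acc = knn_accuracy(X_tr[:, mask], y_tr, X_te[:, mask], y_te)
    return alpha * acc + (1 - alpha) * (1 - mask.sum() / mask.size)


def pso_feature_selection(data, n_particles=20, n_iter=30,
                          w_max=0.9, w_min=0.4, c1=2.0, c2=2.0, seed=0):
    """Plain PSO over positions in [0, 1]; returns (feature mask, best fitness)."""
    rng = np.random.default_rng(seed)
    dim = data[0].shape[1]
    pos = rng.random((n_particles, dim))
    vel = np.zeros((n_particles, dim))
    pbest = pos.copy()
    pbest_fit = np.array([fitness(p, data) for p in pos])
    g = pbest_fit.argmax()
    gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    for t in range(n_iter):
        # Inertia weight decreases linearly from w_max to w_min.
        w = w_max - (w_max - w_min) * t / max(n_iter - 1, 1)
        r1, r2 = rng.random((2, n_particles, dim))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, 0.0, 1.0)  # keep positions inside the bounds
        fit = np.array([fitness(p, data) for p in pos])
        improved = fit > pbest_fit
        pbest[improved] = pos[improved]
        pbest_fit[improved] = fit[improved]
        g = pbest_fit.argmax()
        if pbest_fit[g] > gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    return gbest > THRESHOLD, float(gbest_fit)


def make_toy_data(seed=0, n=120, dim=30, informative=3):
    """Synthetic two-class data in which only the first few features matter."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n)
    X = rng.normal(size=(n, dim))
    X[:, :informative] += 3.0 * y[:, None]
    half = n // 2
    return X[:half], y[:half], X[half:], y[half:]


if __name__ == "__main__":
    mask, best = pso_feature_selection(make_toy_data())
    print("selected features:", np.flatnonzero(mask))
    print("best fitness:", round(best, 4))
```

Thresholding a continuous position is only one common way to map particles to feature subsets; a binary PSO with a transfer function would be an equally plausible choice.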
Figure 1. The framework of leadership learning.
Figure 2. The flowchart of APSOLL.
Details of datasets.
| Dataset | Number of features | Number of instances | Number of classes |
|---|---|---|---|
| MIC | 124 | 1700 | 7 |
| Urban | 147 | 507 | 9 |
| SCADI | 205 | 69 | 6 |
| Arrhythmia | 279 | 452 | 13 |
| Madelon | 500 | 2600 | 2 |
| Isolet5 | 617 | 1559 | 26 |
| MF | 649 | 2000 | 10 |
| PD | 754 | 756 | 2 |
| CNAE-9 | 857 | 1080 | 9 |
| QSAR | 1024 | 1687 | 2 |
Parameter settings of different metaheuristic algorithms.
| Algorithms | Parameters | Values |
|---|---|---|
| Common settings | Number of iterations | |
| | Population size | |
| | The upper limit of particle position | |
| | The lower limit of particle position | lb |
| GWO | Correlation coefficient | |
| PSO | Acceleration factor | |
| | Inertia weight | |
| HHO | Levy component | |
| FPA | Acceleration factor | |
| | Levy component | |
| | Switch probability | |
| SSA | Convergence factor | |
| LPSO | Acceleration factor | |
| | Upper limit of inertia weight | |
| | Lower limit of inertia weight | |
| HPSO-DE | Acceleration factor | |
| | Crossover rate | |
| | Scaling factor | |
| | Predefined generation | |
| | Inertia weight | |
| APSOLL | Inertia weight | |
Figure 3. The average convergence curves of different metaheuristic algorithms for datasets below 500 dimensions.
Figure 4. The average convergence curves of different metaheuristic algorithms for datasets above 500 dimensions.
Figure 5. The average number of selected features for datasets below 500 dimensions by different FS methods based on metaheuristic algorithms.
Figure 6. The average number of selected features for datasets above 500 dimensions by different FS methods based on metaheuristic algorithms.
Comparisons between APSOLL and other metaheuristic algorithms for datasets below 500 dimensions.
| Datasets | Method | Fit (std.) | Sfit | Acc (std.) | Sacc | #F (std.) | Sf | Time |
|---|---|---|---|---|---|---|---|---|
| MIC | GWO | 93.28 (0.27) | + | 91.03 (0.40) | + | 1.80 (0.65) | = | 125.37 |
| | PSO | 87.04 (0.96) | + | 91.03 (0.48) | + | 27.40 (3.49) | + | 220.42 |
| | HHO | 91.83 (1.76) | + | 89.08 (2.60) | + | 2.13 (2.42) | = | 162.52 |
| | FPA | 82.66 (0.53) | + | 90.63 (0.75) | + | 44.2 (2.50) | + | 217.76 |
| | SSA | 82.68 (0.87) | + | 90.65 (0.78) | + | 44.2 (3.23) | + | 122.10 |
| | LPSO | 86.89 (0.89) | + | 91.08 (0.54) | + | 28.13 (3.48) | + | 220.95 |
| | HPSO-DE | 92.99 (0.38) | + | 90.83 (0.45) | + | 2.43 (1.09) | + | 133.59 |
| | APSOLL | 93.65 (0.35) | | 91.40 (0.56) | | 1.33 (0.47) | | 122.10 |
| Urban | GWO | 87.21 (4.12) | = | 85.03 (16.03) | = | 11.33 (2.70) | + | 96.00 |
| | PSO | 65.57 (5.57) | + | 64.97 (13.90) | + | 48.53 (5.89) | + | 160.65 |
| | HHO | 83.78 (3.18) | + | 79.43 (14.26) | + | 8.93 (5.41) | = | 94.06 |
| | FPA | 57.64 (2.87) | + | 58.26 (10.28) | + | 64.40 (6.52) | + | 163.10 |
| | SSA | 58.10 (3.92) | + | 58.17 (10.39) | + | 61.83 (5.88) | + | 162.84 |
| | LPSO | 62.53 (4.90) | + | 60.41 (11.97) | + | 47.80 (5.17) | + | 163.89 |
| | HPSO-DE | 86.06 (1.20) | = | 82.24 (14.30) | = | 7.40 (2.11) | = | 47.19 |
| | APSOLL | 86.60 (2.18) | | 82.84 (20.46) | | 6.83 (1.91) | | 75.32 |
| SCADI | GWO | 95.13 (2.04) | = | 95.40 (2.88) | = | 11.23 (7.49) | = | 29.19 |
| | PSO | 86.43 (3.22) | + | 93.33 (4.19) | + | 60.87 (8.10) | + | 124.32 |
| | HHO | 91.95 (3.61) | + | 90.63 (4.51) | + | 10.23 (7.14) | = | 24.98 |
| | FPA | 81.80 (3.48) | + | 92.38 (4.19) | + | 87.90 (7.17) | + | 147.73 |
| | SSA | 81.05 (3.59) | + | 90.79 (4.59) | + | 85.47 (8.11) | + | 152.42 |
| | LPSO | 86.01 (3.12) | + | 92.54 (4.20) | + | 59.90 (6.65) | + | 100.95 |
| | HPSO-DE | 94.38 (2.38) | + | 93.65 (3.33) | + | 8.07 (2.89) | = | 23.31 |
| | APSOLL | 97.04 (1.63) | | 97.22 (2.35) | | 6.92 (2.75) | | 33.66 |
| Arrhythmia | GWO | 78.11 (1.31) | + | 72.33 (1.70) | + | 23.48 (5.65) | + | 161.93 |
| | PSO | 67.50 (1.28) | + | 68.97 (1.98) | + | 100.23 (8.12) | + | 164.50 |
| | HHO | 74.48 (1.94) | + | 65.29 (3.20) | + | 11.40 (10.72) | = | 127.69 |
| | FPA | 62.73 (1.07) | + | 65.59 (1.97) | + | 122.57 (7.68) | + | 160.16 |
| | SSA | 62.56 (1.30) | + | 65.39 (1.77) | + | 122.93 (6.44) | + | 159.65 |
| | LPSO | 67.92 (1.39) | + | 68.77 (1.80) | + | 95.03 (6.31) | + | 167.47 |
| | HPSO-DE | 75.49 (0.86) | + | 66.96 (1.39) | + | 12.87 (2.50) | = | 80.80 |
| | APSOLL | 80.82 (1.45) | | 74.14 (1.75) | | 10.08 (3.95) | | 113.36 |
| Madelon | GWO | 90.28 (1.00) | + | 89.71 (1.17) | + | 42.00 (7.01) | + | 310.38 |
| | PSO | 74.72 (1.12) | + | 82.04 (1.18) | + | 211.73 (12.36) | + | 327.80 |
| | HHO | 81.44 (3.82) | + | 78.95 (3.48) | + | 216.47 (10.49) | + | 399.71 |
| | FPA | 75.08 (0.86) | + | 77.52 (1.16) | + | 236.67 (9.62) | + | 320.63 |
| | SSA | 70.06 (1.21) | + | 77.53 (1.57) | + | 242.17 (8.97) | + | 322.84 |
| | LPSO | 75.07 (1.18) | + | 82.94 (1.58) | + | 63.70 (37.22) | + | 325.15 |
| | HPSO-DE | 79.98 (1.72) | + | 73.64 (2.56) | + | 26.13 (5.06) | + | 301.25 |
| | APSOLL | 92.44 (0.44) | | 90.65 (0.62) | | 16.92 (4.75) | | 259.51 |
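The Fit column is not defined in this record. As a hedged consistency check, the short sketch below tests one hypothesis: that Fit is a weighted sum of accuracy and feature reduction, Fit = α·Acc + (1 − α)·100·(1 − #F/N), where N is the total number of features from the dataset table. The form and the weight α = 0.7 are assumptions chosen here because they come close to the reported APSOLL averages; they are not stated by the source, and only the five APSOLL rows above are checked.

```python
ALPHA = 0.7  # assumed weight, not given in the source

# dataset: (total features N, reported Fit, reported Acc, reported #F) for APSOLL
apsoll_rows = {
    "MIC": (124, 93.65, 91.40, 1.33),
    "Urban": (147, 86.60, 82.84, 6.83),
    "SCADI": (205, 97.04, 97.22, 6.92),
    "Arrhythmia": (279, 80.82, 74.14, 10.08),
    "Madelon": (500, 92.44, 90.65, 16.92),
}


def hypothesised_fit(acc, n_selected, n_total, alpha=ALPHA):
    """Assumed fitness form: weighted accuracy plus weighted feature reduction (in %)."""
    return alpha * acc + (1 - alpha) * 100.0 * (1 - n_selected / n_total)


# Absolute gap between the hypothesised value and the reported Fit, per dataset.
gaps = {
    name: abs(hypothesised_fit(acc, k, n) - fit)
    for name, (n, fit, acc, k) in apsoll_rows.items()
}

for name, gap in gaps.items():
    print(f"{name:<11} gap = {gap:.3f}")
```

A small gap on these rows is compatible with the hypothesis but does not confirm it, since the table reports averages over runs and other weightings were not ruled out.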
Comparisons between APSOLL and other metaheuristic algorithms for datasets above 500 dimensions.
| Datasets | Method | Fit (std.) | Sfit | Acc (std.) | Sacc | #F (std.) | Sf | Time |
|---|---|---|---|---|---|---|---|---|
| Isolet5 | GWO | 89.66 (1.01) | + | 91.23 (1.38) | = | 86.53 (9.14) | + | 212.36 |
| | PSO | 78.10 (1.04) | + | 87.31 (1.44) | + | 268.07 (10.71) | + | 219.31 |
| | HHO | 82.61 (1.71) | + | 81.60 (1.83) | + | 92.57 (26.83) | + | 283.60 |
| | FPA | 74.45 (0.89) | + | 83.50 (1.34) | + | 287.73 (10.48) | + | 211.13 |
| | SSA | 74.25 (1.02) | + | 83.60 (1.45) | + | 293.53 (8.69) | + | 207.08 |
| | LPSO | 78.61 (0.98) | + | 87.79 (1.40) | + | 264.42 (10.19) | + | 215.30 |
| | HPSO-DE | 81.72 (1.08) | + | 76.42 (1.68) | + | 36.53 (5.12) | − | 215.85 |
| | APSOLL | 91.37 (0.49) | | 91.08 (0.55) | | 48.92 (2.36) | | 219.14 |
| MF | GWO | 96.63 (0.54) | + | 97.77 (0.54) | = | 39.27 (6.89) | + | 225.82 |
| | PSO | 86.86 (0.72) | + | 97.19 (0.54) | + | 241.73 (13.51) | + | 274.84 |
| | HHO | 94.04 (0.95) | + | 94.98 (0.98) | + | 52.93 (13.66) | + | 303.36 |
| | FPA | 84.31 (0.53) | + | 96.47 (0.60) | + | 286.13 (6.77) | + | 281.36 |
| | SSA | 84.39 (0.53) | + | 96.67 (0.71) | + | 287.33 (9.16) | + | 275.93 |
| | LPSO | 87.18 (0.63) | + | 97.19 (0.61) | + | 234.7 (8.29) | + | 267.64 |
| | HPSO-DE | 94.05 (0.53) | + | 93.84 (0.77) | + | 35.57 (4.65) | + | 224.65 |
| | APSOLL | 97.71 (0.33) | | 98.07 (0.53) | | 20.25 (1.23) | | 228.91 |
| PD | GWO | 85.54 (2.25) | + | 80.88 (3.54) | = | 27.00 (9.84) | + | 187.67 |
| | PSO | 71.54 (1.62) | + | 74.60 (2.11) | + | 268.4 (11.07) | + | 185.17 |
| | HHO | 86.78 (1.28) | + | 81.60 (1.90) | + | 8.43 (6.53) | = | 127.75 |
| | FPA | 68.77 (1.31) | + | 74.60 (2.24) | + | 338.03 (12.81) | + | 174.27 |
| | SSA | 68.59 (1.44) | + | 74.23 (1.99) | + | 336 (13.24) | + | 173.38 |
| | LPSO | 71.38 (1.96) | + | 74.48 (2.60) | + | 270.43 (14.95) | + | 185.49 |
| | HPSO-DE | 86.88 (0.87) | + | 83.26 (1.39) | = | 35.20 (4.53) | + | 197.66 |
| | APSOLL | 88.44 (0.85) | | 83.92 (1.24) | | 7.58 (2.22) | | 152.82 |
| CNAE-9 | GWO | 86.28 (1.22) | + | 88.80 (1.53) | − | 167.83 (20.93) | + | 203.26 |
| | PSO | 77.41 (1.75) | + | 88.25 (2.47) | − | 409.80 (14.03) | + | 197.09 |
| | HHO | 74.04 (1.83) | + | 79.55 (4.12) | + | 332.23 (80.68) | + | 269.84 |
| | FPA | 73.91 (1.36) | + | 83.79 (1.99) | + | 420.70 (15.22) | + | 185.77 |
| | SSA | 73.74 (1.85) | + | 83.80 (2.51) | + | 425.57 (12.44) | + | 183.18 |
| | LPSO | 77.69 (1.27) | + | 88.79 (1.92) | − | 412.63 (14.06) | + | 195.26 |
| | HPSO-DE | 69.52 (1.87) | + | 77.60 (2.43) | + | 422.40 (13.83) | + | 200.46 |
| | APSOLL | 87.35 (0.55) | | 85.03 (0.93) | | 61.83 (5.38) | | 210.71 |
| QSAR | GWO | 92.45 (0.54) | + | 93.21 (0.66) | = | 95.57 (9.18) | + | 236.35 |
| | PSO | 82.20 (0.62) | + | 92.01 (0.68) | + | 416.70 (17.62) | + | 327.26 |
| | HHO | 92.68 (0.45) | + | 90.35 (0.83) | + | 19.10 (12.23) | − | 227.42 |
| | FPA | 80.13 (0.49) | + | 91.16 (0.80) | + | 466.93 (10.49) | + | 323.93 |
| | SSA | 80.06 (0.48) | + | 91.37 (0.71) | + | 474.40 (11.67) | + | 320.06 |
| | LPSO | 82.40 (0.55) | + | 92.02 (0.59) | + | 410.13 (14.90) | + | 323.23 |
| | HPSO-DE | 92.44 (0.27) | + | 91.14 (0.41) | + | 46.30 (6.15) | + | 207.01 |
| | APSOLL | 94.10 (0.55) | | 93.11 (0.74) | | 36.83 (7.84) | | 231.28 |
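The abstract's statement that fewer than 8% of the original features are selected can be checked against the two comparison tables. The sketch below divides the average number of features selected by APSOLL (the #F column) by the total feature count from the dataset table; the variable names are ours, and the numbers are copied from the tables as given. On these values the largest share is the one for Isolet5, just under 8%, and the mean across the 10 datasets is roughly 4%.

```python
# Total number of features per dataset (from the dataset details table).
n_features = {
    "MIC": 124, "Urban": 147, "SCADI": 205, "Arrhythmia": 279, "Madelon": 500,
    "Isolet5": 617, "MF": 649, "PD": 754, "CNAE-9": 857, "QSAR": 1024,
}

# Average number of features selected by APSOLL (the #F column, APSOLL rows).
apsoll_selected = {
    "MIC": 1.33, "Urban": 6.83, "SCADI": 6.92, "Arrhythmia": 10.08, "Madelon": 16.92,
    "Isolet5": 48.92, "MF": 20.25, "PD": 7.58, "CNAE-9": 61.83, "QSAR": 36.83,
}

# Share of the original features that is kept, per dataset.
ratios = {name: apsoll_selected[name] / n_features[name] for name in n_features}
mean_ratio = sum(ratios.values()) / len(ratios)
max_ratio = max(ratios.values())

for name, ratio in ratios.items():
    print(f"{name:<11} {ratio:6.2%}")
print(f"mean share: {mean_ratio:.2%}, largest share: {max_ratio:.2%}")
```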
Figure 7. The classification accuracy of 6 traditional FS methods in selecting different numbers of features for datasets below 500 dimensions.
Figure 8. The classification accuracy of 6 traditional FS methods in selecting different numbers of features for datasets above 500 dimensions.
The optimal classification accuracy, number of selected features, and CPU time in comparison to traditional methods.
| Datasets | Metric | ANOVA | CHI2 | Pearson | Spearman | Kendall | MI | APSOLL |
|---|---|---|---|---|---|---|---|---|
| MIC | Acc (%) | 90.39 | 94.31 | 90.39 | 90.39 | 90.39 | 90.39 | 92.55 |
| | #F | 31 | 6 | 29 | 56 | 59 | 19 | 2 |
| | Time | 3.38 | 3.24 | 4.03 | 13.04 | 8.15 | 164.07 | 123.69 |
| Urban | Acc (%) | 83.01 | 83.66 | 63.40 | 63.40 | 63.40 | 82.35 | 85.62 |
| | #F | 22 | 62 | 3 | 3 | 3 | 31 | 4 |
| | Time | 2.12 | 3.36 | 2.96 | 15.32 | 8.61 | 161.59 | 91.62 |
| SCADI | Acc (%) | 95.24 | 95.24 | 85.71 | 90.47 | 85.71 | 100 | 100 |
| | #F | 23 | 34 | 94 | 118 | 107 | 46 | 7 |
| | Time | 0.79 | 2.19 | 2.02 | 19.56 | 12.49 | 143.46 | 38.08 |
| Arrhythmia | Acc (%) | 63.97 | 63.24 | 63.24 | 63.97 | 63.24 | 63.97 | 75.74 |
| | #F | 55 | 12 | 8 | 17 | 13 | 4 | 6 |
| | Time | 2.94 | 2.34 | 8.09 | 56.24 | 40.11 | 675.94 | 127.73 |
| Madelon | Acc (%) | 89.87 | 89.36 | 71.41 | 71.41 | 71.41 | 79.36 | 91.15 |
| | #F | 17 | 13 | 499 | 499 | 499 | 48 | 17 |
| | Time | 38.03 | 40.01 | 49.72 | 223.64 | 137.44 | 2833.54 | 169.32 |
| Isolet5 | Acc (%) | 86.97 | 85.47 | 84.83 | 85.26 | 85.26 | 87.82 | 92.08 |
| | #F | 245 | 289 | 378 | 351 | 223 | 204 | 46 |
| | Time | 62.04 | 61.61 | 57.49 | 561.71 | 197.98 | 8484.77 | 218.44 |
| MF | Acc (%) | 98.33 | 98.83 | 94.83 | 94.00 | 94.33 | 93.83 | 98.50 |
| | #F | 622 | 402 | 482 | 386 | 411 | 629 | 21 |
| | Time | 43.79 | 44.31 | 50.45 | 312.71 | 188.62 | 5533.33 | 105.40 |
| PD | Acc (%) | 79.30 | 87.67 | 80.18 | 81.06 | 81.06 | 78.41 | 85.90 |
| | #F | 4 | 140 | 16 | 24 | 25 | 70 | 9 |
| | Time | 61.75 | 62.94 | 62.16 | 395.49 | 234.74 | 2488.38 | 154.63 |
| CNAE-9 | Acc (%) | 89.81 | 88.27 | 85.49 | 85.49 | 85.49 | 89.51 | 85.80 |
| | #F | 213 | 142 | 855 | 855 | 855 | 647 | 64 |
| | Time | 66.91 | 84.32 | 63.66 | 462.64 | 237.18 | 7087.23 | 221.19 |
| QSAR | Acc (%) | 91.52 | 91.72 | 91.72 | 91.72 | 91.72 | 91.72 | 93.88 |
| | #F | 110 | 105 | 984 | 984 | 984 | 461 | 33 |
| | Time | 126.37 | 135.70 | 125.28 | 1058.36 | 350.85 | 8411.24 | 221.18 |
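To make the accuracy comparison with the traditional methods easier to read, the sketch below tallies, per dataset, whether the APSOLL accuracy is above, equal to, or below the best accuracy among the 6 traditional methods. It uses only the Acc (%) rows of the table above; the win/tie/loss labels are our own convention, not terminology from the source. On these values the tally is 5 wins, 1 tie (SCADI, shared with MI), and 4 losses.

```python
# Acc (%) per dataset, in the order: ANOVA, CHI2, Pearson, Spearman, Kendall, MI, APSOLL.
accuracy = {
    "MIC":        [90.39, 94.31, 90.39, 90.39, 90.39, 90.39, 92.55],
    "Urban":      [83.01, 83.66, 63.40, 63.40, 63.40, 82.35, 85.62],
    "SCADI":      [95.24, 95.24, 85.71, 90.47, 85.71, 100.0, 100.0],
    "Arrhythmia": [63.97, 63.24, 63.24, 63.97, 63.24, 63.97, 75.74],
    "Madelon":    [89.87, 89.36, 71.41, 71.41, 71.41, 79.36, 91.15],
    "Isolet5":    [86.97, 85.47, 84.83, 85.26, 85.26, 87.82, 92.08],
    "MF":         [98.33, 98.83, 94.83, 94.00, 94.33, 93.83, 98.50],
    "PD":         [79.30, 87.67, 80.18, 81.06, 81.06, 78.41, 85.90],
    "CNAE-9":     [89.81, 88.27, 85.49, 85.49, 85.49, 89.51, 85.80],
    "QSAR":       [91.52, 91.72, 91.72, 91.72, 91.72, 91.72, 93.88],
}


def outcome(row):
    """Compare APSOLL (last entry) with the best traditional method in the row."""
    best_traditional, apsoll = max(row[:-1]), row[-1]
    if apsoll > best_traditional:
        return "win"
    return "tie" if apsoll == best_traditional else "loss"


outcomes = {name: outcome(row) for name, row in accuracy.items()}
tally = {label: list(outcomes.values()).count(label) for label in ("win", "tie", "loss")}
print(outcomes)
print(tally)
```

Accuracy is only one axis of the table; the #F and Time rows would need the same kind of tally before drawing an overall conclusion.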