Nebojsa Bacanin, Miodrag Zivkovic, Fadi Al-Turjman, K. Venkatachalam, Pavel Trojovský, Ivana Strumberger, Timea Bezdan.
Abstract
Deep learning has recently been applied with great success in a wide range of domains, such as visual and face recognition, natural language processing, speech recognition, and handwriting identification. Convolutional neural networks, which belong to the family of deep learning models, are a subtype of artificial neural network inspired by the complex structure of the human brain and are often used for image classification tasks. One of the biggest challenges in all deep neural networks is overfitting, which occurs when a model performs well on the training data but fails to make accurate predictions on new data. Several regularization methods have been introduced to prevent overfitting. In the research presented in this manuscript, the overfitting challenge is tackled by selecting a proper value for the dropout regularization parameter using a swarm intelligence approach. Although swarm algorithms have already been applied successfully in this domain, the available literature indicates that their potential has not been fully investigated. Finding the optimal dropout value manually is a challenging and time-consuming task; this research therefore proposes an automated framework based on a hybridized sine cosine algorithm for tackling this major deep learning issue. The first experiment was conducted over four benchmark datasets: MNIST, CIFAR-10, Semeion, and USPS, while the second experiment was performed on a brain tumor magnetic resonance imaging classification task. The obtained experimental results are compared to those generated by several similar approaches. Overall, the proposed method outperforms the other state-of-the-art methods included in the comparative analysis in terms of classification error and accuracy.
Year: 2022 PMID: 35440609 PMCID: PMC9016213 DOI: 10.1038/s41598-022-09744-2
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
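The paper's core idea is to let a swarm algorithm search for a good dropout probability instead of hand-tuning it. As a rough illustration only (this is not the authors' exact hybridized OBSCA-FS; the objective function, population size, iteration budget, and control parameters below are placeholder assumptions), a minimal sine cosine algorithm searching a one-dimensional dropout value in [0, 1] could look like:

```python
import math
import random

def sca_search(objective, lb=0.0, ub=1.0, pop_size=10, max_iter=50, a=2.0, seed=0):
    """Minimal 1-D sine cosine algorithm (SCA) sketch.

    Each agent holds a candidate dropout probability in [lb, ub];
    the destination point is the best candidate found so far.
    """
    rng = random.Random(seed)
    pop = [rng.uniform(lb, ub) for _ in range(pop_size)]
    best = min(pop, key=objective)
    for t in range(max_iter):
        r1 = a - t * a / max_iter  # linearly decreasing step amplitude
        for i, x in enumerate(pop):
            r2 = rng.uniform(0, 2 * math.pi)
            r3 = rng.uniform(0, 2)
            r4 = rng.random()
            # Oscillate toward (or around) the best solution via sin/cos.
            step = r1 * (math.sin(r2) if r4 < 0.5 else math.cos(r2)) * abs(r3 * best - x)
            pop[i] = min(ub, max(lb, x + step))
        cand = min(pop, key=objective)
        if objective(cand) < objective(best):
            best = cand
    return best

# Toy objective: pretend validation error is minimized near dropout = 0.5.
dropout = sca_search(lambda p: (p - 0.5) ** 2)
```

In the real framework the objective would be the CNN's validation error after training with the candidate dropout value, which is what makes the search expensive and the iteration budget small.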
Control parameter summary.

| Parameter | Value |
|---|---|
| Population size | 49 |
| Maximum iteration number | 500 |
| FA's absorption coefficient | 1 |
| FA's attractiveness parameter at | 1 |
| FA's randomization parameter | Changes according to Eq. ( |
| FA's initial | 0.5 |
Result comparison of different well-known metaheuristics on CEC2019 benchmark functions.
| Function | Stats | EHOI | EHO | SCA | SSA | GOA | WOA | BBO | MFO | PSO | FA | OBSCA-FS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CEC01 | Mean | 4.76E+04 | 1.35E+07 | 9.83E+09 | 3.21E+09 | 1.61E+10 | 1.03E+10 | 3.52E+10 | 7.17E+09 | 6.75E+11 | 7.43E+04 | |
| | Std | 2.14E+03 | 7.91E+06 | 6.95E+09 | 1.42E+09 | 8.99E+09 | 9.14E+09 | 2.32E+10 | 8.69E+09 | 2.34E+11 | 4.49E+03 | 4.21E+03 |
| CEC02 | Mean | 1.70E+01 | 1.72E+01 | 1.75E+01 | 1.73E+01 | 1.74E+01 | 1.73E+01 | 8.87E+01 | 1.74E+01 | 8.56E+02 | 2.85E+01 | |
| | Std | 3.66E−16 | 7.29E−15 | 5.19E−03 | 6.55E−05 | 3.23E−02 | 1.95E−03 | 2.45E+01 | 4.17E−15 | 3.87E+02 | 3.21E+02 | 5.32E+01 |
| CEC03 | Mean | 1.27E+01 | 1.27E+01 | 1.27E+01 | 1.27E+01 | 1.27E+01 | 1.27E+01 | 1.27E+01 | 1.27E+01 | 1.27E+01 | 1.27E+01 | 1.27E+01 |
| | Std | 3.95E−16 | 7.44E−16 | 3.25E−04 | 3.11E−15 | 6.47E−04 | 7.94E−06 | 5.25E−07 | 4.38E−05 | 4.12E−04 | 5.22E−01 | 4.03E−01 |
| CEC04 | Mean | 1.28E+01 | 1.55E+01 | 8.32E+02 | 3.25E+01 | 1.51E+02 | 2.65E+02 | 6.95E+01 | 1.38E+02 | 6.92E+01 | 3.89E+01 | |
| | Std | 4.26E+00 | 8.52E+00 | 3.85E+02 | 1.09E+01 | 1.13E+02 | 1.39E+02 | 2.99E+01 | 1.15E+02 | 5.43E+01 | 2.32E−01 | 1.19E+00 |
| CEC05 | Mean | 1.05E+00 | 1.07E+00 | 2.23E+00 | 1.35E+00 | 1.33E+00 | 1.67E+00 | 1.31E+00 | 1.13E+00 | 1.55E+00 | 1.13E+00 | |
| | Std | 3.25E−03 | 2.41E−02 | 7.81E−02 | 2.33E−01 | 1.21E−01 | 3.86E−02 | 9.63E−02 | 6.56E−02 | 1.18E−01 | 4.26E−02 | 2.17E−02 |
| CEC06 | Mean | 8.33E+00 | 9.45E+00 | 1.04E+01 | 3.79E+00 | 6.19E+00 | 9.14E+00 | 5.78E+00 | 4.92E+00 | 1.03E+01 | 1.05E+01 | |
| | Std | 6.23E−01 | 1.31E+00 | 8.15E+00 | 1.23E+00 | 1.33E+00 | 1.05E+00 | 2.99E−01 | 2.13E+00 | 3.35E+00 | 6.20E−01 | 4.46E−02 |
| CEC07 | Mean | 1.42E+02 | 1.81E+02 | 6.38E+02 | 2.89E+02 | 2.87E+02 | 4.53E+02 | 4.92E+00 | 3.19E+02 | 5.97E+02 | 4.91E+02 | |
| | Std | 1.13E+02 | 1.51E+02 | 2.78E+02 | 2.25E+02 | 1.75E+02 | 2.25E+02 | 1.21E+00 | 2.15E+02 | 1.89E+02 | 1.23E+02 | 8.36E+01 |
| CEC08 | Mean | 3.15E+00 | 5.77E+00 | 5.08E+00 | 5.49E+00 | 5.75E+00 | 4.81E+00 | 5.45E+00 | 5.10E+00 | 2.78E+00 | 2.83E+00 | |
| | Std | 9.15E−02 | 1.44E+00 | 7.29E−01 | 7.83E−01 | 5.14E−01 | 7.29E−01 | 1.03E+00 | 5.62E−01 | 7.33E−01 | 8.99E−01 | 9.13E−01 |
| CEC09 | Mean | 2.29E+00 | 2.41E+00 | 8.75E+01 | 2.38E+00 | 2.45E+00 | 5.16E+00 | 3.75E+00 | 2.46E+00 | 2.65E+00 | 4.95E+00 | |
| | Std | 5.55E−03 | 2.18E−02 | 5.63E+01 | 5.33E−02 | 6.41E−02 | 5.29E−01 | 3.14E−01 | 6.76E−02 | 8.45E−02 | 2.83E−01 | 1.54E−02 |
| CEC10 | Mean | 1.92E+01 | 2.11E+01 | 2.08E+01 | 2.03E+01 | 2.00E+01 | 2.05E+01 | 2.07E+01 | 2.02E+01 | 2.06E+01 | 2.02E+01 | |
| | Std | 3.49E+00 | 7.29E+00 | 6.45E+00 | 8.19E+00 | 6.67E+00 | 3.52E−01 | 7.13E+00 | 6.66E−01 | 9.81E+02 | 9.13E−02 | 1.56E−02 |
Friedman ranks of the compared methods over the 10 CEC2019 instances.
| Function | EHOI | EHO | SCA | SSA | GOA | WOA | BBO | MFO | PSO | FA | OBSCA-FS |
|---|---|---|---|---|---|---|---|---|---|---|---|
| CEC01 | 2 | 4 | 7 | 5 | 9 | 8 | 10 | 6 | 11 | 3 | 1 |
| CEC02 | 4 | 4 | 8 | 4 | 7 | 4 | 10 | 4 | 11 | 9 | 1 |
| CEC03 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 |
| CEC04 | 2 | 3 | 11 | 4 | 9 | 10 | 7 | 8 | 6 | 5 | 1 |
| CEC05 | 2.5 | 2.5 | 11 | 6 | 8 | 10 | 7 | 5 | 9 | 4 | 1 |
| CEC06 | 6 | 8 | 11 | 2 | 5 | 7 | 4 | 3 | 10 | 9 | 1 |
| CEC07 | 3 | 4 | 11 | 5 | 6 | 8 | 2 | 7 | 10 | 9 | 1 |
| CEC08 | 1 | 4 | 11 | 7 | 8 | 10 | 5 | 9 | 6 | 2 | 3 |
| CEC09 | 2 | 3 | 11 | 4 | 5 | 9 | 8 | 6 | 7 | 10 | 1 |
| CEC10 | 2 | 9 | 11 | 3 | 4.5 | 7 | 4.5 | 7 | 10 | 7 | 1 |
| Average | 3.05 | 4.75 | 9.8 | 4.6 | 6.75 | 7.9 | 6.35 | 6.1 | 8.6 | 6.4 | 1.7 |
| Rank | 2 | 4 | 11 | 3 | 8 | 9 | 6 | 5 | 10 | 7 | 1 |
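The average ranks above come from ranking all algorithms per benchmark function (lower error ranks first; ties share the mean of the tied positions, as in the 2.5 and 4.5 entries) and then averaging down each column. A small self-contained sketch with made-up scores for three hypothetical algorithms:

```python
def friedman_ranks(scores):
    """scores[f][a] = result of algorithm a on function f (lower is better).

    Returns the average rank of each algorithm; tied values share the
    mean of the rank positions they would jointly occupy.
    """
    n_funcs, n_algs = len(scores), len(scores[0])
    totals = [0.0] * n_algs
    for row in scores:
        order = sorted(range(n_algs), key=lambda a: row[a])
        ranks = [0.0] * n_algs
        i = 0
        while i < n_algs:
            j = i
            # Extend j over the run of tied values starting at position i.
            while j + 1 < n_algs and row[order[j + 1]] == row[order[i]]:
                j += 1
            mean_rank = (i + j) / 2 + 1  # average of 1-based positions i..j
            for k in range(i, j + 1):
                ranks[order[k]] = mean_rank
            i = j + 1
        for a in range(n_algs):
            totals[a] += ranks[a]
    return [t / n_funcs for t in totals]

# Hypothetical errors of 3 algorithms on 2 functions; note the tie in row 1.
avg = friedman_ranks([[0.1, 0.3, 0.3], [0.2, 0.5, 0.4]])  # [1.0, 2.75, 2.25]
```

The final "Rank" row of the table is simply the ordering of these column averages.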
Friedman and Iman–Davenport test results (α = 0.05).

| Friedman value | χ² critical value | p value | Iman–Davenport value | F critical value | p value |
|---|---|---|---|---|---|
| 5.12E+01 | 1.83E+01 | 1.11E−16 | 9.46E+00 | 1.93E+00 | 1.11E−13 |
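The Iman–Davenport statistic is a less conservative correction of the Friedman χ² statistic, F_F = (N−1)·χ²_F / (N(k−1) − χ²_F), where N is the number of problems and k the number of algorithms. Plugging in the reported setup (N = 10 functions, k = 11 algorithms, χ²_F ≈ 51.2) recovers the reported F value up to rounding of the inputs:

```python
def iman_davenport(chi2_f, n_problems, k_algorithms):
    """Iman-Davenport correction of the Friedman statistic.

    Distributed as F with (k - 1) and (k - 1)(N - 1) degrees of freedom.
    """
    return (n_problems - 1) * chi2_f / (n_problems * (k_algorithms - 1) - chi2_f)

ff = iman_davenport(51.2, n_problems=10, k_algorithms=11)  # ~9.44, vs. 9.46 reported
```

The tiny p-values in the table mean both tests reject the null hypothesis that all algorithms perform equally, which is what licenses the post-hoc Holm procedure below.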
Holm’s step-down procedure result.
| Comparison | p value | Rank (i) | 0.05/(k−i) | 0.1/(k−i) |
|---|---|---|---|---|
| OBSCA-FS versus SCA | 2.37E−08 | 0 | 0.005000 | 0.01000 |
| OBSCA-FS versus PSO | 1.46E−06 | 1 | 0.005556 | 0.01111 |
| OBSCA-FS versus WOA | 1.46E−05 | 2 | 0.006250 | 0.01250 |
| OBSCA-FS versus GOA | 3.31E−04 | 3 | 0.007143 | 0.01429 |
| OBSCA-FS versus FA | 7.66E−04 | 4 | 0.008333 | 0.01667 |
| OBSCA-FS versus BBO | 8.59E−04 | 5 | 0.010000 | 0.02000 |
| OBSCA-FS versus MFO | 1.50E−03 | 6 | 0.012500 | 0.02500 |
| OBSCA-FS versus EHO | 1.98E−02 | 7 | 0.016667 | 0.03333 |
| OBSCA-FS versus SSA | 2.52E−02 | 8 | 0.025000 | 0.05000 |
| OBSCA-FS versus EHOI | 1.81E−01 | 9 | 0.050000 | 0.10000 |
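Holm's step-down procedure sorts the p-values from smallest to largest and compares the i-th smallest against α/(k−i); the first non-rejection stops all further rejections. A minimal sketch using three of the p-values from the table above (note: with only 3 comparisons the thresholds differ from the full table, where k = 10):

```python
def holm_step_down(p_values, alpha=0.05):
    """Return the comparisons rejected by Holm's step-down procedure.

    p_values: dict mapping comparison name -> unadjusted p-value.
    The i-th smallest p-value (i = 0, 1, ...) is tested against
    alpha / (k - i); the first failure stops the procedure.
    """
    k = len(p_values)
    rejected = []
    for i, (name, p) in enumerate(sorted(p_values.items(), key=lambda kv: kv[1])):
        if p < alpha / (k - i):
            rejected.append(name)
        else:
            break  # step-down: once one test fails, all larger p-values fail
    return rejected

rejected = holm_step_down({
    "OBSCA-FS vs SCA": 2.37e-08,   # smallest: tested against 0.05/3
    "OBSCA-FS vs SSA": 2.52e-02,   # tested against 0.05/2 = 0.025, fails
    "OBSCA-FS vs EHOI": 1.81e-01,  # never reached
})
```

In the full table the same logic rejects every comparison except OBSCA-FS versus EHOI at α = 0.05, since 1.81E−01 exceeds its 0.05 threshold.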
Figure 1. Convergence speed graphs for the 10 CEC2019 benchmark functions, directly comparing the proposed OBSCA-FS against the other metaheuristics.
Figure 2. Example instances from the MNIST, Semeion, and USPS datasets.
Figure 3. Example instances from the CIFAR-10 dataset.
Figure 4. Proposed methodology for dropout regularization.
The CNN hyperparameter settings and adjustments used in the simulations.

| Dataset | Learning rate | Momentum | Weight decay | Dropout range |
|---|---|---|---|---|
| CIFAR-10 | 0.001 | 0.9 | 0.004 | [0, 1] |
| MNIST | 0.01 | 0.9 | 0.00005 | [0, 1] |
| Semeion | 0.001 | 0.9 | 0.00005 | [0, 1] |
| USPS | 0.01 | 0.9 | 0.00005 | [0, 1] |
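Whatever dropout probability p the search selects from [0, 1] is then applied during CNN training. In the common "inverted dropout" formulation (a framework-independent sketch, not necessarily the exact variant used by the authors' Caffe setup), each activation is zeroed with probability p and the survivors are scaled by 1/(1−p) so the expected activation is unchanged and no rescaling is needed at test time:

```python
import random

def inverted_dropout(activations, p, rng):
    """Zero each activation with probability p; scale survivors by
    1/(1-p) so the expected value of each unit is preserved."""
    keep = 1.0 - p
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(42)
out = inverted_dropout([1.0] * 10000, p=0.5, rng=rng)
mean_out = sum(out) / len(out)  # close to 1.0 in expectation
```

Real CNNs would use a library dropout layer; the point here is only why a dropout value near 0 regularizes weakly and a value near 1 destroys the signal, which is what makes the choice of p worth optimizing.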
Configuration of the datasets used in experiments.
| Dataset | Training set samples (batch size) | Validation set samples (batch size) | Testing set samples (batch size) | Epochs |
|---|---|---|---|---|
| CIFAR-10 | 20,000 (100) | 30,000 (100) | 10,000 (100) | 4000 |
| MNIST | 20,000 (64) | 40,000 (100) | 10,000 (100) | 10,000 |
| Semeion | 200 (2) | 400 (400) | 993 (993) | 10,000 |
| USPS | 2406 (32) | 4885 (977) | 2007 (2007) | 10,000 |
Configuration of control parameters of the metaheuristics that were implemented and included in the comparative analysis.

| Algorithm | Control parameters and their values |
|---|---|
| BA | |
| CS | |
| PSO | |
| EHO | |
| WOA | Initial value of |
| SSA | |
| GOA | |
| BBO | |
| FA | |
| SCA | |
Comparative results of the proposed OBSCA-FS method and other metaheuristic approaches in terms of mean classification accuracy.
| Method | MNIST acc. (%) | Dropout | Semeion acc. (%) | Dropout | USPS acc. (%) | Dropout | CIFAR-10 acc. (%) | Dropout |
|---|---|---|---|---|---|---|---|---|
| Caffe | 99.07 | 0 | 97.62 | 0 | 95.80 | 0 | 71.47 | 0 |
| Dropout Caffe | 99.18 | 0.5 | 98.14 | 0.5 | 96.21 | 0.5 | 72.08 | 0.5 |
| BA | 99.14 | 0.491 | 98.35 | 0.692 | 96.45 | 0.762 | 71.49 | 0.633 |
| CS | 99.14 | 0.489 | 98.21 | 0.544 | 96.31 | 0.715 | 71.21 | 0.669 |
| PSO | 99.16 | 0.493 | 97.79 | 0.371 | 96.33 | 0.725 | 71.51 | 0.621 |
| EHO | 99.13 | 0.475 | 98.11 | 0.481 | 96.24 | 0.682 | 71.15 | 0.705 |
| WOA | 99.15 | 0.489 | 98.23 | 0.561 | 96.32 | 0.722 | 71.23 | 0.685 |
| SSA | 99.19 | 0.499 | 98.31 | 0.642 | 96.41 | 0.753 | 71.58 | 0.529 |
| GOA | 99.16 | 0.492 | 98.15 | 0.513 | 96.15 | 0.481 | 70.95 | 0.849 |
| BBO | 99.13 | 0.474 | 98.16 | 0.515 | 96.17 | 0.483 | 71.08 | 0.768 |
| FA | 99.18 | 0.495 | 98.29 | 0.619 | 96.42 | 0.758 | 71.55 | 0.583 |
| SCA | 99.17 | 0.496 | 98.25 | 0.580 | 96.29 | 0.705 | 71.54 | 0.597 |
| OBSCA-FS | | 0.524 | | 0.722 | | 0.838 | | 0.394 |
Figure 5. Convergence graphs of the average classification error on the MNIST, CIFAR-10, Semeion, and USPS datasets for OBSCA-FS, SCA, and WOA.
Friedman ranks of the compared methods over the 4 CNN classification instances.
| Dataset | Caffe | Dropout Caffe | BA | CS | PSO | EHO | WOA | SSA | GOA | BBO | FA | SCA | OBSCA-FS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MNIST | 13 | 3.5 | 9.5 | 9.5 | 6.5 | 11.5 | 8 | 2 | 6.5 | 11.5 | 3.5 | 5 | 1 |
| Semeion | 13 | 10 | 2 | 7 | 12 | 11 | 6 | 3 | 9 | 8 | 4 | 5 | 1 |
| USPS | 13 | 10 | 2 | 7 | 5 | 9 | 6 | 4 | 12 | 11 | 3 | 8 | 1 |
| CIFAR-10 | 8 | 2 | 7 | 10 | 6 | 11 | 9 | 3 | 13 | 12 | 4 | 5 | 1 |
| Average | 11.75 | 6.375 | 5.125 | 8.375 | 7.375 | 10.625 | 7.25 | 3 | 10.125 | 10.625 | 3.625 | 5.75 | 1 |
| Rank | 13 | 6 | 4 | 9 | 8 | 11 | 7 | 2 | 10 | 12 | 3 | 5 | 1 |
Figure 6. The CNN structure utilized for the MRI dataset.
Control parameters' values for the metaheuristic methods included in the experiments.

| Metaheuristic | Parameters' values |
|---|---|
| GA | |
| FA | |
| mFA | |
| BA | |
| EHO | |
| WOA | |
| SCA | |
Comparative analysis of MRI tumor grade classification.
| Approach | Accuracy (%) | Dropout |
|---|---|---|
| SVM + RFE | 71.2 | |
| Vanilla preprocessing + shallow CNN | 91.4 | |
| CNN LeNet-5 | 74.9 | |
| VGG19 | 92.6 | |
| DenseNet | 92.7 | |
| CNN + GA | 94.9 | 0.33 |
| CNN + mFA | 96.9 | 0.39 |
| CNN + BA | 95.6 | 0.37 |
| CNN + EHO | 94.8 | 0.31 |
| CNN + WOA | 95.5 | 0.36 |
| CNN + HHO | 96.5 | 0.38 |
| CNN + eHHO | 98.3 | 0.41 |
| CNN + FA | 96.1 | 0.37 |
| CNN + SCA | 96.8 | 0.40 |
| CNN + OBSCA-FS | | |
Figure 7. Diversity of the best solutions over 10 runs on the MRI dataset.
Figure 8. Confusion matrices for OBSCA-FS and eHHO, the two approaches that achieved the best results on the MRI dataset.