Olaide N Oyelade, Absalom E Ezugwu.
Abstract
The design of neural architectures to address the challenge of detecting abnormalities in histopathology images can leverage the gains made in the field of neural architecture search (NAS). A NAS model consists of a search space, a search strategy and an evaluation strategy. The approach supports the automation of deep learning (DL) based networks such as convolutional neural networks (CNN). Automating CNN architecture engineering in this way allows the best-performing network to be found for learning classification problems in specific domains and datasets. However, NAS is often limited by the potential solutions represented in the search space and by the search strategy. This narrows the possibility of obtaining best-performing networks for challenging tasks such as the classification of breast cancer in digital histopathological samples. This study proposes a NAS model with a novel search-space initialization algorithm and a new search strategy. We designed a block-based stochastic categorical-to-binary (BSCB) algorithm for generating potential CNN solutions into the search space. We also applied and investigated the performance of a new bio-inspired optimization algorithm, the Ebola optimization search algorithm (EOSA), as the search strategy. The evaluation strategy was achieved through computation of the loss function, architectural latency and accuracy. The results obtained using images from the BACH and BreakHis databases showed that our approach obtained best-performing architectures, with the top-5 architectures yielding a significant detection rate. The top-1 CNN architecture demonstrated state-of-the-art performance in terms of classification accuracy. The NAS strategy applied in this study and the resulting candidate architectures provide researchers with the most suitable network configurations for use with digital histopathology images.
Year: 2021 PMID: 34620891 PMCID: PMC8497552 DOI: 10.1038/s41598-021-98978-7
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
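The evaluation strategy described in the abstract combines the loss function, architectural latency and classification accuracy. Below is a minimal sketch of how such a composite fitness score could be formed; the function name `score_candidate` and the weighting scheme are illustrative assumptions, not the paper's actual formulation.

```python
# Hypothetical sketch of a multi-objective NAS evaluation score combining
# validation loss, accuracy and architectural latency, as described in the
# abstract. The weights and the formula are assumptions for illustration;
# the paper's actual evaluation strategy may differ.
def score_candidate(val_loss: float,
                    val_accuracy: float,
                    latency_s: float,
                    w_loss: float = 1.0,
                    w_acc: float = 1.0,
                    w_lat: float = 0.1) -> float:
    """Higher is better: reward accuracy, penalize loss and latency."""
    return w_acc * val_accuracy - w_loss * val_loss - w_lat * latency_s

# Example: a candidate with 0.65 accuracy, 1.3 loss and 93.6 s latency
print(score_candidate(val_loss=1.3, val_accuracy=0.65, latency_s=93.59))
```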
A description of notation and coefficients used in Eqs. (1)–(7).
| Symbols | Descriptions |
|---|---|
| π | Recruitment rate of susceptible human individuals |
| η | Decay rate of Ebola virus in the environment |
| α | Rate of hospitalization of infected individuals |
| Γ | Disease-induced death rate of human individuals |
| β1 | Contact rate of infectious human individuals |
| β2 | Contact rate of pathogen individuals/environment |
| β3 | Contact rate of deceased human individuals |
| β4 | Contact rate of recovered human individuals |
| γ | Recovery rate of human individuals |
| τ | Natural death rate of human individuals |
| δ | Rate of burial of deceased human individuals |
| ϑ | Rate of vaccination of individuals |
| ϖ | Rate of response to hospital treatment |
| μ | Rate of response to vaccination |
| ξ | Rate of quarantine of infected individuals |
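Eqs. (1)–(7) themselves are not reproduced in this record. For orientation only, the block below gives a hedged sketch of the kind of susceptible-compartment rate equation that symbols such as these typically parameterize; it is an assumed, generic compartmental form, not the paper's model.

```latex
% Illustrative compartmental form only; NOT the paper's actual Eqs. (1)-(7).
\frac{dS}{dt} = \pi - \big(\beta_1 I + \beta_2 P + \beta_3 D + \beta_4 R\big)\,S - \tau S
% S: susceptible, I: infectious, P: pathogens in the environment,
% D: deceased, R: recovered individuals.
```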
Summary of the reviewed studies.
| References | Search space | NAS search and optimization method | Evaluation strategy |
|---|---|---|---|
| Cortes et al. | Simple network grown incrementally | Adaptive structural learning (AdaNet) | Binary classification accuracy |
| Negrinho and Gordon | Tree-structured search space | MCTS and SMBO | Training and validation |
| Wang et al. | AlexNet and LeNet hyperparameters | Hyperband algorithm and Bayesian optimization | Classification accuracy |
| Huang et al. | Global architecture | Greedy search approach | Mean prediction accuracy |
| Weng et al. | Primitive operations and intermediate nodes | DARTS | Measuring loss and accuracy |
| Erivaldo et al. | Random CNN architecture initialization | PSO search strategy | Cross-entropy loss and velocity computation |
| Liu et al. | Residual blocks | GA search strategy | Fitness function for image quality measurement |
| Garg et al. | Hierarchical structure using DAG | Differentiable architecture search | Surrogate approach |
| Krishna et al. | NASBench-101 search | Reinforcement learning | Actor-critic algorithms |
| Calisto and Yeun | Basic operations and corresponding hyperparameters | Evolutionary algorithm | Classification accuracy and hyperparameter reduction |
| Wang et al. | Cell-based representation | Divide-and-conquer (DC) approach | k-means-based clustered evaluation |
| Cassimon et al. | Cell-based representation | Reinforcement learning | Multi-objective evaluation |
| Fan et al. | Hybrids of cell-based representation | Gradient-descent-based neural architecture optimization (NAO) | Minimization of regression and reconstruction losses, and dropout rates |
| Dai et al. | AdaNet: hierarchical structure | Gradient-descent-based using momentum | Maximizing classification accuracy |
| Gheshlaghi et al. | Cell-based representation of primitive operations | Gradient-based approach for binary gate method | Training from scratch |
| Chen et al. | Basic operations | Reinforcement learning using LSTM | HyperNet-based accuracy evaluator and hardware performance predictor |
| Chen and Li | Weight-sharing strategy from a major super-network | Evolutionary algorithm method | Commonalities among best-performing architectures |
| Guo et al. | Basic operations | Inference model learning from Pareto frontier parameters | Model performance and computational cost |
| Zhang et al. | Basic operations | Reinforcement learning and evolutionary algorithm | Minimization of loss function |
| Hu et al. | | Attention-guided differentiable mechanism | Classification accuracy |
| Xu et al. | Super-network | Partially connected DARTS | Error rates of searched networks |
| Ru et al. | Graph-like search spaces | Bayesian optimization | Performance evaluation of motifs |
| Fu et al. | Basic operations | Reinforcement learning and LSTM | Quantified-parameter evaluation mechanism |
| Lin et al. | A single randomly initialized network | Inference budgets model | Zero-shot approach |
| Liu et al. | A SuperNet | Particle swarm optimization | |
| Liang et al. | DAG-based FPNs | One-shot search strategy | Detection accuracy |
Figure 1 The proposed EOSA-NAS model consisting of four components: the search space, the EOSA-NAS search strategy, the evaluation strategy, and the breast cancer detection module using the top-5 and top-1 CNN architectures.
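Read as a procedure, the four components of Figure 1 form a simple loop: initialize candidates into the search space, iterate the EOSA search strategy, evaluate each candidate, and keep the top architectures for the detection module. The sketch below is a hedged outline of that flow; the function names (`bscb_initialize`, `eosa_step`, `evaluate`) are placeholders, not the authors' API.

```python
# Hedged outline of the EOSA-NAS flow in Figure 1. All callables are
# placeholders standing in for the paper's components, not a real API.
def run_eosa_nas(bscb_initialize, eosa_step, evaluate,
                 population_size=50, epochs=5, top_k=5):
    # 1. Search space: BSCB initialization of candidate CNN encodings.
    population = bscb_initialize(population_size)
    # 2-3. Search + evaluation: EOSA proposes new candidates, each scored
    #      by loss, latency and accuracy through the evaluation strategy.
    for _ in range(epochs):
        population = eosa_step(population, fitness=evaluate)
    # 4. Detection module: keep the top-k architectures for full training.
    return sorted(population, key=evaluate, reverse=True)[:top_k]
```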
Categorization of parameters based on the block encoding scheme for representing the hyperparameters of a convolutional neural network.
| (Min, max) no. of blocks in BSCBE | Block category | CNN hyperparameter | Notational representation | Lower bound | Upper bound |
|---|---|---|---|---|---|
| (1, 1) | General hyperparameter block | Batch size/mode | Gb | 0 | 2 |
| | | Learning rate | Gα | 0 | 8 |
| | | Optimization algorithm | Go | 0 | 7 |
| | | Epoch | Ge | 1 | 2 |
| (0, 1) | Input zero-padding block | Whether to zero-pad inputs or not | Zα | 0 | 1 |
| (1, N) | Convolutional layer block | Number of conv-pool blocks | C | 1 | 6 |
| | | Number of convolutional blocks in Cl | C | 0 | 2 |
| | | Choice of activation function per convolutional layer | C | 0 | 2 |
| | | Number of kernels | C | 3 | 10 |
| | | Kernel size | C | 0 | 10 |
| | | Pool size | C | 0 | 2 |
| | | Pool operation type | C | 0 | 1 |
| | | Weight regularization operation | C | 0 | 2 |
| (1, 2) | Fully connected block | Number of dense (fully connected) layers | F | 0 | 1 |
| | | Activation function for the layer | F | 0 | 2 |
| | | Use of dropout layer | F | 2.0 | 2.2 |
| | | Weight regularization operation | F | 0 | 2 |
| (1, 1) | Loss function block | Choice of loss function | LF | 0 | 2 |
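As a hedged illustration of the block categories in the table above, a candidate encoding could be represented as a small set of typed records before being flattened into the binary BSCBE string; the class and field names below are assumptions chosen for readability, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import List

# Illustrative sketch of the block categories in the table above.
# Field names are assumed; the paper's BSCBE encoding is binary-coded.
@dataclass
class GeneralBlock:          # (1, 1) general hyperparameter block
    batch_mode: int          # index in [0, 2]
    learning_rate_idx: int   # index in [0, 8]
    optimizer_idx: int       # index in [0, 7]
    epoch_idx: int           # index in [1, 2]

@dataclass
class ConvPoolBlock:         # one of up to N (1, N) convolutional blocks
    n_conv_layers: int
    activation_idx: int
    n_kernels_idx: int
    kernel_size_idx: int
    pool_size_idx: int
    pool_type_idx: int
    regularizer_idx: int

@dataclass
class CandidateCNN:
    general: GeneralBlock
    zero_pad_input: bool               # (0, 1) input zero-padding block
    conv_blocks: List[ConvPoolBlock]   # (1, N) convolutional layer blocks
    n_dense_layers: int                # (1, 2) fully connected block
    loss_fn_idx: int                   # (1, 1) loss function block
```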
Figure 2 A generic representation of an encoded CNN architecture based on the parameters covered by the search space.
A summary of the formulas for computing hyperparameter values and the corresponding search spaces using the proposed encoding scheme.
| Hyperparameter | Formula | Hyperparameter search space |
|---|---|---|
| Gb | | [0, 1, 3] |
| Gα | | [1 × 10⁻⁵, 5 × 10⁻⁵, 1 × 10⁻⁴, 5 × 10⁻⁴, 1 × 10⁻³, 5 × 10⁻³, 1 × 10⁻², 5 × 10⁻², 1 × 10⁻¹, 5 × 10⁻¹] |
| Go | | [0 => "SGD", 1 => "Adam", 2 => "RMSprop", 3 => "Adagrad", 4 => "Nesterov", 5 => "Adadelta", 6 => "Adamax", 7 => "Momentum"] |
| Ge | | 5 |
| IZ | | [0, 1] |
| C | | [1, 3, 5, 7, 9, 11] |
| C | | [1, 2, 3] |
| C | | [0 => "ReLU", …] |
| C | | [8, 16, 32, 64, 128, 256, 512, 1024] |
| C | | [1, 3, 5, 7, 9, 11] |
| C | | [2, 3, 4] |
| C | | [Max pooling, Average pooling] |
| C | | [L1, L2, L1L2] |
| F | | [1, 2] |
| F | | [0 => "Softmax", …] |
| F | | [0.35, 0.4, 0.45, 0.5] |
| F | | [L1, L2, L1L2] |
| LF | | [categorical cross-entropy, sparse cross-entropy] |
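The value lists above can be collected into a lookup table from which the encoded indices are resolved. Below is a minimal sketch, assuming plain Python dictionaries and a uniform random draw; entries whose lists are truncated in the table (the activation-function lists) are omitted, and the sampling scheme is an assumption, not the paper's BSCB procedure.

```python
import random

# Search-space value lists transcribed from the table above
# (truncated entries are omitted rather than guessed).
SEARCH_SPACE = {
    "learning_rate": [1e-5, 5e-5, 1e-4, 5e-4, 1e-3, 5e-3, 1e-2, 5e-2, 1e-1, 5e-1],
    "optimizer": ["SGD", "Adam", "RMSprop", "Adagrad", "Nesterov",
                  "Adadelta", "Adamax", "Momentum"],
    "n_conv_pool_blocks": [1, 3, 5, 7, 9, 11],
    "n_kernels": [8, 16, 32, 64, 128, 256, 512, 1024],
    "kernel_size": [1, 3, 5, 7, 9, 11],
    "pool_size": [2, 3, 4],
    "pool_type": ["Max pooling", "Average pooling"],
    "regularizer": ["L1", "L2", "L1L2"],
    "dropout_rate": [0.35, 0.4, 0.45, 0.5],
    "loss": ["categorical cross-entropy", "sparse cross-entropy"],
}

def sample_hyperparameters(space=SEARCH_SPACE):
    """Draw one value per hyperparameter (uniform draw for illustration)."""
    return {name: random.choice(values) for name, values in space.items()}

print(sample_hyperparameters())
```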
Notation and descriptions of the variables and parameters used in experimenting with the EOSA optimization algorithm.
| Symbols | Descriptions | Range |
|---|---|---|
| Epoch | Number of iterations for the EOSA algorithm | 5 |
| Population | Number of neural architectures in the search space | 50 |
| π | Recruitment rate of susceptible human individuals | Variable |
| ŋ | Decay rate of Ebola virus in the environment | (0, ∞) |
| α | Rate of hospitalization of infected individuals | (0, 1) |
| Γ | Disease-induced death rate of human individuals | [0.4, 0.9] |
| β1 | Contact rate of infectious human individuals | Variable |
| β2 | Contact rate of pathogen individuals/environment | Variable |
| β3 | Contact rate of deceased human individuals | Variable |
| β4 | Contact rate of recovered human individuals | Variable |
| γ | Recovery rate of human individuals | (0, 1) |
| τ | Natural death rate of human individuals | (0, 1) |
| δ | Rate of burial of deceased human individuals | (0, 1) |
| ϑ | Rate of vaccination of individuals | (0, 1) |
| ϖ | Rate of response to hospital treatment | (0, 1) |
| μ | Rate of response to vaccination | (0, 1) |
| ξ | Rate of quarantine of infected individuals | (0, 1) |
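As a hedged illustration, the experiment-level settings listed above could be bundled into a single configuration object; `EOSAConfig` is a hypothetical container, not the authors' implementation, and the single values given for the bounded rates are assumptions within the stated ranges.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical container for the EOSA experiment settings listed above.
# Parameters marked "Variable" in the table are left as None; the values
# chosen for the bounded rates are illustrative assumptions only.
@dataclass
class EOSAConfig:
    epochs: int = 5                           # number of EOSA iterations
    population: int = 50                      # architectures in the search space
    recruitment_rate: Optional[float] = None  # pi (variable)
    hospitalization_rate: float = 0.5         # alpha, in (0, 1) -- assumed
    disease_death_rate: float = 0.65          # Gamma, in [0.4, 0.9] -- assumed
    recovery_rate: float = 0.5                # gamma, in (0, 1) -- assumed
    quarantine_rate: float = 0.5              # xi, in (0, 1) -- assumed
```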
Figure 3 Sample images from the BACH dataset showing (a) normal, (b) benign, (c) in situ carcinoma, and (d) invasive carcinoma cases.
Figure 4 Sample images from the BreakHis dataset showing (a) adenosis, (b) ductal carcinoma, (c) mucinous carcinoma, and (d) papillary carcinoma malignant cases. Each column shows the samples for (a)–(d) at 40X, 100X, 200X, and 400X magnification, respectively. H&E staining colors the nuclei dark purple (hematoxylin) and the cytoplasm light pink (eosin).
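A minimal sketch of loading such histopathology images with Keras is shown below. It assumes the images have already been arranged into one folder per class; the directory name, patch size and batch size are illustrative assumptions, and the actual BACH/BreakHis folder layouts would first need to be restructured into that form.

```python
import tensorflow as tf

# Minimal sketch: load histopathology images resized to a fixed size,
# assuming images were pre-sorted into one folder per class
# (e.g. data/normal, data/benign, data/in_situ, data/invasive).
train_ds = tf.keras.utils.image_dataset_from_directory(
    "data/",
    labels="inferred",
    label_mode="categorical",
    image_size=(224, 224),   # assumed patch size for illustration
    batch_size=32,
    validation_split=0.2,
    subset="training",
    seed=42,
)
```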
Standard and CEC benchmark functions used for the experimentation in evaluating the performances of EOSA, ABC, WOA, PSO and GA.
| ID | Function name | Model of the function |
|---|---|---|
| F1 | Ackley | |
| F2 | Alpine | |
| F3 | Brown | |
| F4 | Bent Cigar | |
| F5 | Dixon and Price | |
| F6 | Discus Function | |
| F7 | Levy | |
| F8 | Powell | |
| F9 | Quartic | |
| F10 | Rastrigin | |
| F11 | SR-F27 | Shifted and Rotated Rastrigin’s Function |
| F12 | Wavy 1 | |
| F13 | Zakharov | |
| F14 | Salomon | |
| F15 | Weierstrass Function |
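The "Model of the function" column did not survive extraction. For reference, the standard forms of two of the listed functions are given below; these are the commonly used definitions and may differ slightly from the exact variants used in the experiments.

```latex
% Standard reference forms only; the paper's exact variants may differ.
% Ackley (F1):
f(\mathbf{x}) = -20\exp\!\Big(-0.2\sqrt{\tfrac{1}{n}\sum_{i=1}^{n} x_i^2}\Big)
                - \exp\!\Big(\tfrac{1}{n}\sum_{i=1}^{n}\cos(2\pi x_i)\Big) + 20 + e
% Rastrigin (F10):
f(\mathbf{x}) = 10n + \sum_{i=1}^{n}\big(x_i^2 - 10\cos(2\pi x_i)\big)
```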
Comparison of best, worst, mean, median and standard deviation (stdev) values for the EOSA, ABC, WOA, PSO, and GA metaheuristic algorithms on the classical benchmark and IEEE CEC functions, over 500 epochs with a population size of 100.
| Functions | Metrics | EOSA | ABC | WOA | PSO | GA |
|---|---|---|---|---|---|---|
| F1 | Best | 0.046591 | 0.046596 | 0.046571 | 9.94223 | |
| Worst | 0.046588 | 20.8892 | 0.046596 | 19.83618 | ||
| Mean | 0.046465 | 19.30266 | 0.046596 | 0.046571 | 10.40362 | |
| Median | 0.046512 | 19.15063 | 0.046596 | 0.046571 | 10.1534 | |
| Stdev | 0.000107 | 0.948262 | 5.20E−18 | 5.55E−18 | 0.938523 | |
| F2 | Best | 0.0028 | 0.002748 | 0.002769 | 39.73652 | |
| Worst | 0.002768 | 245.4735 | 0.002769 | 184.0994 | ||
| Mean | 0.002608 | 33.16789 | 0.002748 | 0.002769 | 44.36342 | |
| Median | 0.002607 | 7.26278 | 0.002748 | 0.002769 | 42.07979 | |
| Stdev | 4.68E−05 | 52.19852 | 3.69E−19 | 2.82E−19 | 10.53887 | |
| F3 | Best | 0.000417 | 0.000416 | 0.000414 | 921.248 | |
| Worst | 1498.884 | 0.000416 | 0.000414 | 1269.038 | ||
| Mean | 0.00011 | 294.4233 | 0.000416 | 0.000414 | 938.3754 | |
| Median | 8.86E−05 | 203.1162 | 0.000416 | 0.000414 | 929.879 | |
| Stdev | 4.55E−05 | 227.7159 | 6.23E−20 | 7.86E−20 | 30.31403 | |
| F4 | Best | 2.49E−12 | 2.45E−12 | 2.49E−12 | 4.13E + 09 | |
| Worst | 2.48E−12 | 2.57E + 11 | 2.49E−12 | 1.34E + 11 | ||
| Mean | 2.05E−12 | 2.05E + 11 | 2.45E−12 | 2.49E−12 | 5.68E + 09 | |
| Median | 2.18E−12 | 2.01E + 11 | 2.45E−12 | 2.49E−12 | 4.45E + 09 | |
| Stdev | 3.79E−13 | 1.3E + 10 | 3.03E−28 | 4.04E−28 | 7.3E + 09 | |
| F5 | Best | 2.78E−12 | 2.80E−12 | 2.79E−12 | 395.2324 | |
| Worst | 2.86E−12 | 43,618,954 | 2.80E−12 | 194,298 | ||
| Mean | 1.17E−12 | 161,597.3 | 2.80E−12 | 2.79E−12 | 2351.452 | |
| Median | 9.35E−13 | 1152.776 | 2.80E−12 | 2.79E−12 | 423.69 | |
| Stdev | 4.16E−13 | 2,214,592 | 4.04E−28 | 3.03E−28 | 12,218 | |
| F6 | Best | 1.02E−10 | 1.02E−10 | 1.02E−10 | 6952.905 | |
| Worst | 1,342,862 | 195,495.6 | ||||
| Mean | 7.19E−11 | 263,974.3 | 1.02E−10 | 1.02E−10 | 14,746.75 | |
| Median | 7.20E−11 | 253,737.4 | 1.02E-10 | 1.02E−10 | 8375.828 | |
| Stdev | 2.03E−11 | 63,079.51 | 1.62E−26 | 1.81E−26 | 21,265.92 | |
| F7 | Best | 0.000248 | 0.000248 | 0.000251 | 41.79268 | |
| Worst | 0.000253 | 1479.208 | 0.000251 | 823.37 | ||
| Mean | 0.0002 | 106.1467 | 0.000248 | 0.000251 | 58.77442 | |
| Median | 0.000228 | 15.67991 | 0.000248 | 0.000251 | 47.54116 | |
| Stdev | 6.14E−05 | 232.7978 | 4.20E−20 | 4.74E−20 | 50.30075 | |
| F8 | Best | 1.98E−05 | 2.41E−05 | 2.31E−05 | 0.009794 | |
| Worst | 24.42778 | 2.41E−05 | 2.31E−05 | 5.436187 | ||
| Mean | 1.32E−05 | 0.345815 | 2.41E−05 | 2.31E−05 | 0.038439 | |
| Median | 1.14E−05 | 0.005065 | 2.41E−05 | 2.31E−05 | 0.013349 | |
| Stdev | 4.71E−06 | 1.7694 | 3.22E−21 | 4.40E−21 | 0.279212 | |
| F9 | Best | 1.38E−10 | 1.40E−10 | 1.39E−10 | 30,500.52 | |
| Worst | 1.40E−10 | 3.68E + 09 | 1.40E−10 | 1.13E + 09 | ||
| Mean | 9.97E−11 | 2.53E + 09 | 1.40E−10 | 1.39E−10 | 4,511,122 | |
| Median | 1.06E−10 | 2.44E + 09 | 1.40E−10 | 1.39E−10 | 144,930.2 | |
| Stdev | 3.33E−11 | 2.26E + 08 | 1.29E−26 | 1.94E−26 | 54,440,104 | |
| F10 | Best | 0.000471 | 0.000474 | 0.000475 | 745.3493 | |
| Worst | 0.000475 | 1599.605 | 0.000475 | 1278.155 | ||
| Mean | 0.000287 | 444.8808 | 0.000474 | 0.000475 | 772.7753 | |
| Median | 0.00028 | 315.5723 | 0.000474 | 0.000475 | 760.9054 | |
| Stdev | 0.000134 | 271.854 | 7.32E−20 | 8.40E−20 | 45.18663 | |
| F11 | Best | 0.000331 | 0.000333 | 0.00033 | 1654.473 | |
| Worst | 2490.439 | 0.000333 | 0.00033 | 2194.09 | ||
| Mean | 0.000326 | 1912.671 | 0.000333 | 0.00033 | 1676.178 | |
| Median | 0.000325 | 1851.11 | 0.000333 | 0.00033 | 1664.138 | |
| Stdev | 3.15E−06 | 159.2676 | 4.88E−20 | 3.79E−20 | 45.46335 | |
| F12 | Best | 2.00E−29 | 1.82E−29 | 2.01E−29 | 112,016.4 | |
| Worst | 1.96E−29 | 2.76E + 24 | 2.01E−29 | 1.92E + 24 | ||
| Mean | 5.07E−30 | 1.12E + 22 | 1.82E−29 | 2.01E−29 | 8.29E + 21 | |
| Median | 2.13E−30 | 8.34E + 17 | 1.82E−29 | 2.01E−29 | 140,116 | |
| Stdev | 4.89E−30 | 1.42E + 23 | 2.70E−45 | 3.22E−45 | 1.06E + 23 | |
| F13 | Best | 0.303455 | 0.30833 | 0.307142 | 2.686451 | |
| Worst | 2.842985 | 0.306802 | 0.307142 | 2.778775 | ||
| Mean | 0.304267 | 1.791644 | 0.261832 | 0.307142 | 2.686881 | |
| Median | 0.304119 | 1.67436 | 0.245368 | 0.307142 | 2.686451 | |
| Stdev | 0.00089 | 0.256993 | 0.023673 | 3.61E−17 | 0.005805 | |
| F14 | Best | 2.46E−05 | 2.45E−05 | 2.44E−05 | 412.1038 | |
| Worst | 25,843.77 | 2.45E−05 | 13,787.81 | |||
| Mean | 1.74E−05 | 21,080.93 | 2.45E−05 | 2.44E−05 | 580.0391 | |
| Median | 1.99E−05 | 20,736.46 | 2.45E−05 | 2.44E−05 | 459.1532 | |
| Stdev | 6.69E−06 | 1251.021 | 3.22E−21 | 2.20E−21 | 760.3748 | |
| F15 | Best | 0.005899 | 0.005866 | 0.005885 | 14.62603 | |
| Worst | 0.005876 | 130.4765 | 0.005885 | 97.42765 | ||
| Mean | 0.005761 | 30.95515 | 0.005866 | 0.005885 | 16.72031 | |
| Median | 0.005757 | 7.717655 | 0.005866 | 0.005885 | 15.10426 | |
| Stdev | 4.67E−05 | 39.61793 | 6.07E−19 | 8.67E−19 | 6.393344 |
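The summary statistics reported in the table (best, worst, mean, median, standard deviation) can be reproduced from raw per-run fitness values with a few lines of NumPy. A minimal sketch, assuming minimization (lower fitness is better) and illustrative input values:

```python
import numpy as np

def summarize_runs(fitness_values):
    """Best/worst/mean/median/stdev over repeated runs (minimization)."""
    v = np.asarray(fitness_values, dtype=float)
    return {
        "best": v.min(),
        "worst": v.max(),
        "mean": v.mean(),
        "median": np.median(v),
        "stdev": v.std(),
    }

print(summarize_runs([0.0466, 0.0465, 0.0464]))  # illustrative values only
```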
Comparison of best, worst, mean, median and standard deviation (stdev) values for the EOSA, ABC, WOA, PSO, and GA metaheuristic algorithms on the constrained IEEE CEC-2017 benchmark functions, over 500 epochs with a population size of 100.
| Functions | Metrics | EOSA | ABC | WOA | PSO | GA |
|---|---|---|---|---|---|---|
| Best | 2.78E−11 | 2.78E−11 | 2.78E−11 | 6,500,451 | ||
| Stdev | 8.44E + 08 | 2.78E−11 | 3.88E−27 | 3.10E + 08 | ||
| Median | 2.75E−11 | 4.97E + 09 | 3.39E−27 | 2.78E−11 | 17,405,089 | |
| Best | 2.49E−12 | 2.45E−12 | 2.49E−12 | 4.17E + 09 | ||
| Stdev | 9.11E−17 | 1.30E + 10 | 2.45E−12 | 2.83E−28 | 7.53E + 09 | |
| Median | 2.48E−12 | 2.02E + 11 | 4.64E−28 | 2.49E−12 | 4.44E + 09 | |
| Best | 1.02E−10 | 1.02E−10 | 1.03E−10 | 8666.065 | ||
| Stdev | 3.19E−14 | 124,317.8 | 1.02E−10 | 2.13E−26 | 23,773.8 | |
| Median | 1.01E−10 | 251,561 | 1.36E−26 | 1.03E−10 | 12,804.61 | |
| Best | 3.71E−12 | 3.70E−12 | 3.73E−12 | 1,099,091 | ||
| Stdev | 1.35E−16 | 9.64E + 09 | 3.70E−12 | 7.88E−28 | 2.26E + 09 | |
| Median | 3.71E−12 | 8.50E + 10 | 5.65E−28 | 3.73E−12 | 5,359,283 | |
| Best | 0.045719 | 0.045711 | 0.045704 | 18.25292 | ||
| Stdev | 9.25E−07 | 0.957905 | 0.045711 | 5.90E−18 | 0.531952 | |
| Median | 0.045669 | 20.02451 | 5.90E−18 | 0.045704 | 18.40464 | |
| Best | 0.001299 | 0.001302 | 0.001298 | 618.1048 | ||
| Stdev | 8.59E−09 | 31.98589 | 0.001298 | 1.63E−19 | 6.159877 | |
| Median | 0.001299 | 710.5576 | 1.52E−19 | 0.001298 | 619.0061 | |
| Best | 0.000228 | 0.000227 | 0.000226 | 761.7748 | ||
| Stdev | 7.57E−09 | 141.8918 | 0.000227 | 3.25E−20 | 66.00672 | |
| Median | 0.000224 | 2526.215 | 4.07E−20 | 0.000226 | 766.6583 | |
| Best | 0.000345 | 0.000344 | 0.000343 | 1557.367 | ||
| Stdev | 4.04E−08 | 156.4804 | 0.000344 | 3.52E−20 | 44.58426 | |
| Median | 0.000343 | 1756.355 | 3.79E−20 | 0.000343 | 1567.959 | |
| Best | 0.00033 | 0.000334 | 0.000332 | 1657.835 | ||
| Stdev | 1.49E−08 | 158.7789 | 0.000334 | 4.88E−20 | 45.19802 | |
| Median | 0.000333 | 1857.348 | 4.34E−20 | 0.000332 | 1671.562 | |
| Best | 2.17E−05 | 2.19E−05 | 2.18E−05 | 22,425.71 | ||
| Stdev | 2.54E−09 | 3268.026 | 2.19E−05 | 1.86E−21 | 1633.723 | |
| Median | 2.16E−05 | 21,565.89 | 2.20E−21 | 2.18E−05 | 22,826.18 |
Figure 5 Convergence curves of the EOSA optimization algorithm on the F1–F15 standard benchmark functions.
Figure 6 Comparison of convergence curves of the performance of the EOSA, ABC, WOA, PSO, and GA optimization algorithms on all standard benchmark functions applied in this study.
Figure 7 Comparison of convergence curves of the performance of the EOSA, ABC, WOA, PSO, and GA optimization algorithms on all standard benchmark functions applied in this study.
Comparison of parameters for the best five (5) initial neural network configurations (solutions) generated for the search space.
| Parameters | Top-1 | Top-2 | Top-3 | Top-4 | Top-5 |
|---|---|---|---|---|---|
| Dataset batching | Random sample size | Half of dataset | Random sample size | Half of dataset | Random sample size |
| Zero padding | Yes | Yes | Yes | Yes | Yes |
| No. of conv-pool blocks | 2 | 3 | 2 | 3 | 6 |
| Details of Convolution layers | [1Convo, 'relu', 32, 9, 2, 'Avg', 'L1'], [3Convo, 'relu', 64, 9, 2, 'Avg', 'L1'] | ([3Convo, 0.005, 'Adagrad', 3], True, [2, 'relu', 32, 3, 2, 'Max', 'L1'], [4, 'relu', 64, 3, 2, 'Avg', 'L1'], [4, 'relu', 128, 3, 2, 'Avg', 'None'], | [1Convo, 'relu', 32, 9, 2, 'Avg', 'L1'], [3Convo, 'relu', 64, 9, 2, 'Avg', 'L1'] | [2Convo, 'relu', 32, 3, 2, 'Max', 'None'], [4, 'relu', 64, 3, 2, 'Avg', 'None'], [4, 'relu', 128, 3, 2, 'Max', 'L1'] | [3Convo, 'relu', 32, 9, 2, 'Max', 'L1'], [2, 'relu', 64, 1, 2, 'Avg', 'None'], [3, 'relu', 128, 11, 2, 'Max', 'None'], [1, 'relu', 256, 9, 2, 'Avg', 'L1'], [2, 'relu', 512, 7, 2, 'Max', 'None'], [3, 'relu', 1024, 3, 2, 'Avg', 'None'] |
| Pool size | 2 × 2 | 2 × 2 | 2 × 2 | 2 × 2 | 2 × 2 |
| Filters size | 9 × 9, 9 × 9 | 3 × 3, 3 × 3, 3 × 3 | 9 × 9, 9 × 9 | 3 × 3, 3 × 3, 3 × 3 | 9 × 9, 1 × 1, 11 × 11, 9 × 9, 7 × 7, 3 × 3 |
| Filter count | 32 × 32, 64 × 64 | 32 × 32, 64 × 64, 128 × 128 | 32 × 32, 64 × 64 | 32 × 32, 64 × 64, 128 × 128 | 32 × 32, 64 × 64 |
| No. FC layers | 2 | 3 | 2 | 3 | 1 |
| Dense layer activation function and dropout rate | Softmax and 0.48 | Softmax and 0.5 and L1 | Softmax and 0.5 | Softmax and 0.45 and L1 | Softmax and 0.47 and L1 |
| Learning rate | 0.05 | 0.005 | 0.05 | 0.005 | 1e-05 |
| Optimizer | RMSprop | Adagrad | RMSprop | Adagrad | Adam |
| Classifier | Categorical crossentropy | Categorical crossentropy | Categorical crossentropy | Categorical crossentropy | Categorical crossentropy |
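As a hedged illustration, a Keras build of roughly the Top-1 configuration in the table (two conv-pool blocks with 1 and 3 convolutional layers, 32 and 64 filters, 9×9 kernels, ReLU, 2×2 average pooling, L1 regularization, zero padding, two dense layers with a 0.48 dropout and softmax output, RMSprop at 0.05, categorical cross-entropy) might look like the sketch below. The input shape, number of classes, hidden dense width and L1 factor are assumptions, not values from the paper.

```python
import tensorflow as tf
from tensorflow.keras import layers, regularizers

# Hedged sketch of the Top-1 configuration from the table above.
# Input size, class count, hidden width and L1 factor are assumed.
def build_top1(input_shape=(224, 224, 3), n_classes=4):
    model = tf.keras.Sequential([tf.keras.Input(shape=input_shape),
                                 layers.ZeroPadding2D(padding=1)])
    # Two conv-pool blocks: (1 conv, 32 filters) and (3 convs, 64 filters),
    # 9x9 kernels, ReLU, 2x2 average pooling, L1 weight regularization.
    for n_convs, filters in [(1, 32), (3, 64)]:
        for _ in range(n_convs):
            model.add(layers.Conv2D(filters, (9, 9), activation="relu",
                                    padding="same",
                                    kernel_regularizer=regularizers.l1(1e-4)))
        model.add(layers.AveragePooling2D(pool_size=(2, 2)))
    model.add(layers.Flatten())
    model.add(layers.Dense(256, activation="relu"))   # assumed hidden width
    model.add(layers.Dropout(0.48))
    model.add(layers.Dense(n_classes, activation="softmax"))
    model.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.05),
                  loss="categorical_crossentropy", metrics=["accuracy"])
    return model
```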
Figure 8 Neural network architectures of the Top-5 networks generated for the search space.
Performance comparison for training the five (5) best-performing CNN architectures from the EOSA-NAS algorithm, using the best, mean, median, worst and standard deviation (stdev) values for accuracy, the worst, median and best values for loss, and the computation time (latency), over the 250 epochs of EOSA.
| Architecture | Accuracy: Best | Accuracy: Mean | Accuracy: Median | Accuracy: Worst | Accuracy: Stdev | Loss: Worst | Loss: Median | Loss: Best | Latency |
|---|---|---|---|---|---|---|---|---|---|
| Top-5 | 0.551 | 0.313 | 0.332 | 0.030 | 0.247 | 2.79E + 09 | 1.84 | 1.84 | 12.87 |
| Top-4 | 0.573 | 0.376 | 0.359 | 0.111 | 0.097 | 9.13E + 08 | 3.16 | 1.31 | 12.52 |
| Top-3 | 0.613 | 0.354 | 0.326 | 0.136 | 0.137 | 5.1E + 09 | 2.21 | 1.318 | 21.26 |
| Top-2 | 0.627 | 0.396 | 0.350 | 0.098 | 0.051 | 26,261,178 | 2.21 | 1.231 | 39.21 |
| Top-1 | 0.655 | 0.415 | 0.417 | 0.147 | 0.150 | 23,565.56 | 11,137.88 | 1.297 | 93.59 |
Figure 9 A radar plot showing the performance comparison of the top-5 best-performing network architectures from the EOSA-NAS algorithm based on mean, median, worst, and best accuracy values.
Performance comparison for prediction by the four (4) best-performing CNN architectures from the EOSA-NAS algorithm using AUC, precision, recall, sensitivity, specificity, accuracy and loss after full training for 60, 70 and 100 epochs.
| Architectures | F1-score | Precision | Sensitivity | Specificity | Recall | Accuracy | Kappa |
|---|---|---|---|---|---|---|---|
| Top-4 | 0 | 0.0 | – | – | 0 | 0.24 | – |
| Top-2 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 |
| Top-3 | 0 | 0 | – | 0 | 0.1 | 0.25 | 0 |
| Top-1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 |
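The metrics in this table can be computed from predicted and true labels with scikit-learn. A minimal sketch for the binary case is shown below (specificity is derived from the confusion matrix, since scikit-learn has no direct function for it); the multi-class setting of the paper would additionally need an averaging mode for precision, recall and F1.

```python
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, cohen_kappa_score, confusion_matrix)

def classification_report_row(y_true, y_pred):
    """F1, precision, sensitivity (recall), specificity, accuracy, kappa."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()  # binary case
    return {
        "f1": f1_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred),
        "sensitivity": recall_score(y_true, y_pred),
        "specificity": tn / (tn + fp),
        "accuracy": accuracy_score(y_true, y_pred),
        "kappa": cohen_kappa_score(y_true, y_pred),
    }

print(classification_report_row([0, 1, 1, 0, 1], [0, 1, 0, 0, 1]))
```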
Figure 10 Plot of the accuracy and loss values for training the Top-1, Top-2, and Top-3 architectures, which were optimized using the EOSA-NAS model, showing their performance after sixty (60) training epochs.
Figure 11 Neural network architecture of the Top-1 architecture optimized using the EOSA-NAS model, which represents the overall best-performing architecture after one hundred (100) training epochs.
Figure 12 Plot of the accuracy and loss values for training the Top-1 architecture optimized using the EOSA-NAS model, which represents the overall best-performing architecture after one hundred (100) training epochs.
Comparison of NAS-based CNN design with state-of-the-art canonical CNN design approach for detection and classification of breast cancer using histopathology images.
| References | Methods | Performance | Dataset |
|---|---|---|---|
| Zheng et al. | Nucleus-guided CNN | Accuracy 96.4%, Sensitivity 0.955, Specificity 0.964 | Images from Motic (Xiamen) Medical Diagnostic Systems |
| Nejad et al. | CNN + Data augmentation | Detection rate 77.5% | BreakHis database |
| Araújo et al. | CNN + Support Vector Machine | Accuracy of 77.8%, sensitivity of 95.6% | Bioimaging 2015 breast histology classification challenge |
| Han et al. | Structured deep learning model + Data augmentation | 93.2% accuracy | BreakHis database |
| Saha et al. | Handcrafted features + CNN | 92% precision, 88% recall and 90% | MITOS-ATYPIA-14, ICPR-2012, and AMIDA-13 datasets |
| Zhu et al. | Squeeze-Excitation-Pruning (SEP) + CNN | Accuracy of 87.5% | BreakHis and BACH datasets |
| Xie et al. | Inception_V3 and Inception_ResNet_V2 | Accuracy 96.84% | BreakHis |
| Kandel and Castelli | CNN | AUC of 95.46% | PatchCamelyon |
| Hägele et al. | CNN + explanation method | Improved AUC by 5% | BRCA |
| This study | EOSA-NAS CNN | Accuracy 100% | BreakHis and BACH databases |
Figure 13 Comparison of the CNN architecture designed using the EOSA-NAS model with state-of-the-art CNN architectures applied to the detection of breast cancer in histopathology images.