Essam H Houssein, Marwa M Emam, Abdelmgeid A Ali.
Abstract
Breast cancer is the second leading cause of death in women; therefore, effective early detection of this cancer can reduce its mortality rate. Breast cancer detection and classification in the early phases of development may allow for optimal therapy. Convolutional neural networks (CNNs) have enhanced tumor detection and classification efficiency in medical imaging compared to traditional approaches. This paper proposes a novel classification model for breast cancer diagnosis based on a hybridized CNN and an improved optimization algorithm, along with transfer learning, to help radiologists detect abnormalities efficiently. The marine predators algorithm (MPA) is the optimization algorithm we used, and we improve it using the opposition-based learning strategy to cope with the inherent weaknesses of the original MPA. The improved marine predators algorithm (IMPA) is used to find the best values for the hyperparameters of the CNN architecture. The proposed method uses a pretrained CNN model called ResNet50 (residual network). This model is hybridized with the IMPA algorithm, resulting in an architecture called IMPA-ResNet50. Our evaluation is performed on two mammographic datasets: the mammographic image analysis society (MIAS) dataset and the curated breast imaging subset of DDSM (CBIS-DDSM). The proposed model was compared with other state-of-the-art approaches. The obtained results showed that the proposed model outperforms the compared state-of-the-art approaches in classification performance, achieving 98.32% accuracy, 98.56% sensitivity, and 98.68% specificity on the CBIS-DDSM dataset and 98.88% accuracy, 97.61% sensitivity, and 98.40% specificity on the MIAS dataset.
To evaluate the performance of the IMPA in finding the optimal values for the hyperparameters of the ResNet50 architecture, it was compared to four other optimization algorithms: the gravitational search algorithm (GSA), Harris hawks optimization (HHO), the whale optimization algorithm (WOA), and the original MPA algorithm. These counterpart algorithms were also hybridized with the ResNet50 architecture, producing models named GSA-ResNet50, HHO-ResNet50, WOA-ResNet50, and MPA-ResNet50, respectively. The results indicated that the proposed IMPA-ResNet50 achieved better performance than its counterparts.
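The opposition-based learning (OBL) strategy used to improve the MPA can be illustrated with a short sketch: for each candidate solution, its "opposite" within the search bounds is also evaluated, and the better of the pair is kept. The fitness function and bounds below are toy placeholders, not the paper's actual CNN-training objective.

```python
import numpy as np

def opposition_based_refine(population, fitness, lb, ub):
    """For each candidate x, form its opposite x_opp = lb + ub - x
    and keep whichever of the pair has the lower (better) fitness."""
    opposites = lb + ub - population              # element-wise opposition
    keep = fitness(population) <= fitness(opposites)
    return np.where(keep[:, None], population, opposites)

# Toy example: minimize the sphere function over [-5, 5]^2
rng = np.random.default_rng(0)
lb, ub = np.array([-5.0, -5.0]), np.array([5.0, 5.0])
pop = rng.uniform(lb, ub, size=(6, 2))
sphere = lambda x: np.sum(x**2, axis=1)
refined = opposition_based_refine(pop, sphere, lb, ub)
```

By construction, each refined candidate is at least as good as the original, which is the mechanism OBL uses to speed up convergence.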
Keywords: Breast cancer classification; Convolutional neural network; Deep learning; Hyperparameters optimization; Marine predators algorithm; Opposition-based learning; Transfer learning
Year: 2022 PMID: 35698722 PMCID: PMC9175533 DOI: 10.1007/s00521-022-07445-5
Source DB: PubMed Journal: Neural Comput Appl ISSN: 0941-0643 Impact factor: 5.102
Fig. 1 CNN standard architecture [7]
Hyperparameter description of CNN
| Hyperparameter-name | Description |
|---|---|
| Learning rate | The initial learning rate of the CNN architecture is one of the most significant hyperparameters affecting output performance. When the learning rate is low, the model requires more iterations to converge |
| Number of hidden layer units | Expanding the number of hidden layer units increases the model's capacity but reduces computational efficiency |
| Batch size | It refers to the number of sub-samples sent to the network for parameter updates |
| Dropout rate | Dropout is a regularization approach that reduces overfitting, thereby improving validation accuracy and generalization power |
| Activation function | Activation functions allow DL techniques to learn nonlinear decision boundaries |
| Number of epochs | It is the number of times the entire training data is taken through the training process |
Fig. 2 Block diagram of the standard process for hyperparameter optimization in a CNN using meta-heuristic algorithms
Fig. 3 The proposed IMPA-ResNet50 architecture block-diagram phases
The data augmentation approaches and their ranges
| Data-augmentation technique | Range |
|---|---|
| Shearing | 0.1 |
| Zooming | 0.1 |
| Width shift | 0.3 |
| Height shift | 0.3 |
| Rotation | 15 |
| Featurewise center | True |
| Featurewise standard deviation normalization | True |
| Fill mode | Reflect |
| Vertical flip | True |
| Horizontal flip | True |
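The augmentation settings in the table above map directly onto the keyword arguments of Keras's `ImageDataGenerator`. The argument names below follow the standard Keras API; treating this as the authors' exact configuration is an assumption.

```python
# Augmentation settings from the table above, as keyword arguments
# for Keras's ImageDataGenerator (standard Keras API names).
AUGMENTATION_KWARGS = dict(
    shear_range=0.1,
    zoom_range=0.1,
    width_shift_range=0.3,
    height_shift_range=0.3,
    rotation_range=15,
    featurewise_center=True,
    featurewise_std_normalization=True,
    fill_mode="reflect",
    vertical_flip=True,
    horizontal_flip=True,
)

# With TensorFlow installed, the generator would be built as:
# from tensorflow.keras.preprocessing.image import ImageDataGenerator
# datagen = ImageDataGenerator(**AUGMENTATION_KWARGS)
# datagen.fit(x_train)  # featurewise_* options need training-set statistics
```

Note that the two `featurewise_*` options require calling `fit` on the training data first, since they normalize using dataset-wide statistics.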
Specifications of CBIS-DDSM dataset
| Dataset | Class | No. of training samples | No. of test samples |
|---|---|---|---|
| CBIS-DDSM | Benign | 1824 | 783 |
| Cancer | 1873 | 803 |
Specifications of MIAS dataset
| Dataset | No. of training samples | No. of test samples | Total images |
|---|---|---|---|
| MIAS | 904 | 386 | 1290 |
| MIAS (normal category) | – | – | 830 |
| MIAS (abnormal category) | – | – | 460 |
Fig. 4 MIAS mammography categories [70]
Parameter settings for IMPA
| Parameter | Value |
|---|---|
| Maximum iteration numbers | 50 |
| Population size | 30 |
| Dimension | 8 |
| Learning rate | [1e−7, 1e−3] |
| Batch size | [1,64] |
| Dropout rate | [0.1,0.9] |
| Number of neurons | [50,500] |
| Maximum number of ResNet50 training epochs | 30 |
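Given the bounds in the table above, a plausible sketch of how a continuous IMPA position vector could be decoded into concrete CNN hyperparameters is shown below. The paper does not spell out its encoding (nor how all 8 search dimensions are assigned), so this 4-dimensional mapping is an assumption for illustration.

```python
import numpy as np

# Bounds from the parameter-settings table (learning rate, batch size,
# dropout rate, number of neurons). This decoding scheme is an assumption,
# not the paper's documented encoding.
BOUNDS = {
    "learning_rate": (1e-7, 1e-3),
    "batch_size": (1, 64),
    "dropout_rate": (0.1, 0.9),
    "num_neurons": (50, 500),
}

def decode(position):
    """Map a continuous position in [0, 1]^4 to concrete hyperparameters."""
    out = {}
    for (name, (lo, hi)), p in zip(BOUNDS.items(), position):
        value = lo + p * (hi - lo)               # linear scaling to bounds
        if name in ("batch_size", "num_neurons"):
            value = int(round(value))            # integer-valued hyperparameters
        out[name] = value
    return out

hp = decode(np.array([0.5, 0.5, 0.5, 0.5]))
```

Each candidate decoded this way would then be scored by training the ResNet50 head with those hyperparameters and reading off the validation accuracy as fitness.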
Results of the proposed IMPA-ResNet50 on the CBIS-DDSM dataset
| Metrics | IMPA-ResNet50 (%) |
|---|---|
| Accuracy | 98.32 |
| Sensitivity | 96.61 |
| Specificity | 98.56 |
| Precision | 98.68 |
| F-score | 97.65 |
| AUC | 97.88 |
Comparison between the proposed IMPA-ResNet50 model and ResNet50 model on the CBIS-DDSM dataset
| Metrics | ResNet50 (%) | IMPA-ResNet50 (%) | Improvement (%) |
|---|---|---|---|
| Accuracy | 90.11 | 98.32 | 8.21 |
| Sensitivity | 89.80 | 96.61 | 6.81 |
| Specificity | 90.33 | 98.56 | 8.23 |
| Precision | 89.01 | 98.68 | 9.67 |
| F1-score | 90.00 | 97.65 | 7.65 |
| AUC | 91.88 | 97.88 | 6.00 |
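The metrics reported in these tables follow the standard confusion-matrix definitions; a minimal sketch:

```python
def classification_metrics(tp, tn, fp, fn):
    """Standard binary-classification metrics used in the tables above."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)          # recall / true-positive rate
    specificity = tn / (tn + fp)          # true-negative rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return accuracy, sensitivity, specificity, precision, f1

# Toy counts (not from the paper)
acc, sens, spec, prec, f1 = classification_metrics(tp=90, tn=85, fp=10, fn=15)
```

Sensitivity measures how many cancer cases are caught, while specificity measures how many benign cases are correctly cleared; reporting both guards against a model that trivially favors one class.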
Comparison between the proposed IMPA-ResNet50 model and other related studies on the CBIS-DDSM dataset
| References | No. of images | Dataset | Model | Accuracy (%) | Sensitivity (%) | Specificity (%) | Precision (%) | F1-score (%) | AUC (%) |
|---|---|---|---|---|---|---|---|---|---|
| [ | 5316 | DDSM | ResNet50 | 97.35 | – | – | – | – | 97.0 |
| [ | 5316 | DDSM | VGG16 | 97.12 | – | – | – | – | 98.0 |
| [ | 600 | DDSM | CNN | 96.7 | – | – | – | – | – |
| [ | 2400 | DDSM | CNN-YOLO | 97.0 | 93.20 | 94.00 | – | – | 96.45 |
| [ | 2620 | CBIS-DDSM | Fine-tuned ResNet50 | 93.15 | 93.83 | 92.17 | – | – | 95.0 |
| [ | 5272 | CBIS-DDSM | ResNet50 | 87.2 | 86.04 | 89.40 | – | – | 95.00 |
| [ | 5272 | CBIS-DDSM | Fine-tuned AlexNet | 87.20 | 86.2 | 87.7 | 88.0 | 87.1 | 94.00 |
| [ | 3568 | CBIS-DDSM | ResNet50 | 96.6 | 92.95 | 88.60 | – | – | 93.4 |
| [ | 11,562 | DDSM | DCNN | 92.80 | – | – | – | – | – |
| [ | 2781 | CBIS-DDSM | AdaBoost | 90.91 | 82.96 | 98.38 | 86.00 | – | 98.32 |
| Proposed | 5283 | CBIS-DDSM | IMPA-ResNet50 | 98.32 | 96.61 | 98.56 | 98.68 | 97.65 | 97.88 |
Results of the proposed IMPA-ResNet50 on the MIAS dataset
| Metrics | IMPA-ResNet50 (%) |
|---|---|
| Accuracy | 98.88 |
| Sensitivity | 97.61 |
| Specificity | 98.40 |
| Precision | 98.30 |
| F-score | 97.10 |
| AUC | 99.24 |
Comparison between the proposed IMPA-ResNet50 model and the ResNet50 model on the MIAS dataset
| Metrics | ResNet50 (%) | IMPA-ResNet50 (%) | Improvement (%) |
|---|---|---|---|
| Accuracy | 87.50 | 98.88 | 11.38 |
| Sensitivity | 88.10 | 97.61 | 9.51 |
| Specificity | 86.12 | 98.40 | 12.28 |
| Precision | 87.32 | 98.30 | 10.98 |
| F1-score | 87.88 | 97.10 | 9.22 |
| AUC | 89.01 | 99.24 | 10.23 |
Comparison between the proposed IMPA-ResNet50 model and other related studies on the MIAS dataset
| References | No. of images | Dataset | Model | Accuracy (%) | Sensitivity (%) | Specificity (%) | Precision (%) | F1-score (%) | AUC (%) |
|---|---|---|---|---|---|---|---|---|---|
| [ | 322 | MIAS | ResNet50 | 98.23 | – | – | – | – | 99.0 |
| [ | 1288 | MIAS | Fine-tuned DCNN | 95.4 | 96.60 | 92.10 | – | – | 99.00 |
| [ | 322 | MIAS | CNN-GCN | 96.10 ± 1.60 | 96.20 ± 2.90 | 96.00 ± 2.30 | – | – | – |
| [ | 330 | MIAS | DenseNet201 | 92.73 | 94.58 | 91.67 | – | – | – |
| [ | 322 | MIAS | CNN | 82.68 | 82.73 | 82.71 | – | – | – |
| [ | 322 | MIAS | CNN | 89.47 | 90.71 | 90.50 | – | – | – |
| Proposed | 1290 | MIAS | IMPA-ResNet50 | 98.88 | 97.61 | 98.40 | 98.30 | 97.10 | 99.24 |
Comparison between the proposed IMPA-ResNet50 model with the MPA-ResNet50, GSA-ResNet50, HHO-ResNet50, and WOA-ResNet50 models
| Dataset | Classification model | Accuracy (%) | Sensitivity (%) | Specificity (%) | Precision (%) | F-score (%) |
|---|---|---|---|---|---|---|
| CBIS-DDSM | IMPA-ResNet50 | 98.32 | 96.61 | 98.56 | 98.68 | 97.65 |
| MPA-ResNet50 | 95.95 | 93.03 | 95.28 | 94.22 | 93.85 | |
| GSA-ResNet50 | 95.48 | 94.16 | 95.00 | 95.00 | 94.00 | |
| HHO-ResNet50 | 94.55 | 93.12 | 94.84 | 94.12 | 94.50 | |
| WOA-ResNet50 | 94.13 | 93.10 | 94.00 | 94.00 | 94.00 | |
| MIAS | IMPA-ResNet50 | 98.88 | 97.61 | 98.40 | 98.30 | 97.10 |
| MPA-ResNet50 | 94.95 | 94.03 | 94.28 | 94.22 | 94.85 | |
| GSA-ResNet50 | 94.38 | 94.00 | 93.38 | 94.16 | 94.00 | |
| HHO-ResNet50 | 94.30 | 93.50 | 94.18 | 93.69 | 94.00 | |
| WOA-ResNet50 | 93.38 | 93.00 | 93.00 | 93.00 | 93.00 |