Sidratul Montaha1, Sami Azam2, A K M Rakibul Haque Rafid1, Md Zahid Hasan1, Asif Karim2, Khan Md Hasib3, Shobhit K Patel4, Mirjam Jonkman2, Zubaer Ibna Mannan5.
Abstract
Interpreting medical images with a computer-aided diagnosis (CAD) system is difficult because of the complex structure of cancerous lesions across imaging modalities, the high degree of resemblance between classes, dissimilar characteristics within a class, the scarcity of medical data, and the presence of artifacts and noise. This study addresses these challenges by developing a shallow convolutional neural network (CNN) whose configuration is tuned through an ablation study that alters the layer structure and hyper-parameters, combined with a suitable augmentation technique. Eight medical datasets covering different modalities are investigated, and the proposed model, named MNet-10, yields high accuracy on all of them at low computational complexity. The impact of photometric and geometric augmentation techniques on the different datasets is also evaluated. The mammogram dataset was selected for the ablation study because it is one of the most challenging imaging modalities. Before the model is built, the dataset is augmented using both approaches. A base CNN model is constructed first and applied to the non-augmented and the two augmented mammogram datasets; the highest accuracy is obtained with the photometric dataset. The architecture and hyper-parameters of the model are therefore determined by performing an ablation study on the base model using the photometrically augmented mammogram dataset. Afterward, the robustness of the network and the impact of the different augmentation techniques are assessed by training the model on the remaining seven datasets. We obtain a test accuracy of 97.34% on the mammogram, 98.43% on the skin cancer, 99.54% on the brain tumor magnetic resonance imaging (MRI), 97.29% on the COVID chest X-ray, 96.31% on the tympanic membrane, 99.82% on the chest computed tomography (CT) scan, and 98.75% on the breast cancer ultrasound datasets with photometric augmentation, and 96.76% on the breast cancer microscopic biopsy dataset with geometric augmentation. Moreover, several elastic deformation augmentation methods are explored with the proposed model on all the datasets to evaluate their effectiveness. Finally, VGG16, InceptionV3, and ResNet50 are trained on the best-performing augmented datasets, and their performance consistency is compared with that of the MNet-10 model. The findings may aid future researchers in medical data analysis involving ablation studies and augmentation techniques.
Keywords: ablation study; deep learning models; geometric augmentation; medical image; photometric augmentation; shallow CNN
Year: 2022 PMID: 36052321 PMCID: PMC9424498 DOI: 10.3389/fmed.2022.924979
Source DB: PubMed Journal: Front Med (Lausanne) ISSN: 2296-858X
FIGURE 1. Challenges of medical datasets, illustrated with breast mammography. (A) Very small ROI. (B) Presence of artifacts. (C) Dissimilarity within the same class. (D) Similarity between different classes.
FIGURE 2. Datasets used in this research.
FIGURE 3. Overview of the proposed methodology.
Peak signal-to-noise ratio (PSNR) values of different photometric augmentation techniques.
| Augmentation technique | Skin cancer dermoscopy images | Breast cancer mammogram | Tympanic membrane otoscopic images | COVID chest X-ray images | Breast cancer ultrasound images | Breast cancer microscopic biopsy images | Brain tumor MRI images | Chest CT scan |
| Hue | 15.63 | – | 18.72 | – | – | 14.50 | – | – |
| Saturation | 16.48 | – | 14.25 | – | – | 15.71 | – | – |
| Noise | 13.79 | 9.52 | 12.72 | 12.50 | 13.36 | 12.47 | 10.30 | 9.62 |
| HE | 14.72 | 16.39 | 15.19 | 17.19 | 14.25 | 14.42 | 14.51 | 13.19 |
| Brightness high | 29.04 | 30.72 | 29.95 | 29.42 | 30.82 | 29.16 | 29.85 | 29.72 |
| Brightness low | 29.12 | 30.23 | 29.19 | 29.15 | 30.61 | 29.31 | 29.17 | 30.35 |
| Contrast high | 34.42 | 31.68 | 31.55 | 31.66 | 32.75 | 31.75 | 32.01 | 34.07 |
| Contrast low | 33.59 | 31.89 | 31.54 | 30.71 | 31.11 | 30.11 | 33.04 | 35.36 |
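PSNR compares an augmented image with its original: the less the pixels change, the higher the value. The following is a minimal NumPy sketch, assuming 8-bit images (peak value 255) and the standard definition PSNR = 10·log10(MAX²/MSE). The source does not state the implementation or units behind the table, and the `psnr` helper and the flat test image are illustrative only.

```python
import numpy as np

def psnr(original: np.ndarray, augmented: np.ndarray, max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    original = original.astype(np.float64)
    augmented = augmented.astype(np.float64)
    mse = np.mean((original - augmented) ** 2)
    if mse == 0:
        return float("inf")  # identical images have no noise
    return 10.0 * np.log10(max_value ** 2 / mse)

# Hypothetical example: a flat gray image with brightness raised by 10 levels.
image = np.full((64, 64), 100, dtype=np.uint8)
brighter = np.clip(image.astype(np.int16) + 10, 0, 255).astype(np.uint8)
value = psnr(image, brighter)
print(f"PSNR after brightness shift: {value:.2f} dB")
```

A uniform shift of 10 gray levels gives a mean squared error of 100, so the result depends only on the size of the shift, not on the image content.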
FIGURE 4. Geometric and photometric augmentation techniques.
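The two augmentation families differ in what they leave untouched: photometric transforms change pixel intensities but keep every pixel in place, while geometric transforms move pixels but keep their intensities. The sketch below illustrates this with NumPy. Brightness and contrast appear in the PSNR table; the flip and rotation are assumed examples of geometric transforms, and all function names and parameter values are hypothetical rather than the authors' settings.

```python
import numpy as np

def adjust_brightness(image: np.ndarray, delta: float) -> np.ndarray:
    """Photometric: add a constant offset to every pixel (delta may be negative)."""
    return np.clip(image.astype(np.float64) + delta, 0, 255).astype(np.uint8)

def adjust_contrast(image: np.ndarray, factor: float) -> np.ndarray:
    """Photometric: scale each pixel's deviation from the mean intensity."""
    mean = image.mean()
    scaled = (image.astype(np.float64) - mean) * factor + mean
    return np.clip(scaled, 0, 255).astype(np.uint8)

def horizontal_flip(image: np.ndarray) -> np.ndarray:
    """Geometric: mirror the image left to right."""
    return image[:, ::-1]

def rotate_90(image: np.ndarray) -> np.ndarray:
    """Geometric: rotate the image 90 degrees counter-clockwise."""
    return np.rot90(image)

# Hypothetical grayscale image used only to exercise the functions.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(4, 6), dtype=np.uint8)

augmented = {
    "brightness_high": adjust_brightness(image, 30),
    "brightness_low": adjust_brightness(image, -30),
    "contrast_high": adjust_contrast(image, 1.5),
    "contrast_low": adjust_contrast(image, 0.5),
    "horizontal_flip": horizontal_flip(image),
    "rotate_90": rotate_90(image),
}
for name, result in augmented.items():
    print(name, result.shape)
```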
Description of the original and augmented datasets.
| Breast ultrasound image dataset | ||||
| Class | Original | Balanced | Photometric | Geometric |
| Benign | 440 | – | 1760 | 1760 |
| Malignant | 207 | – | 828 | 828 |
| Normal | 133 | – | 532 | 532 |
| Total | 780 | – | 3120 | 3120 |
| COVID chest X-ray image dataset | ||||
| Class | Original | Balanced | Photometric | Geometric |
| COVID | 3616 | 1400 | 5600 | 5600 |
| Lung opacity | 6012 | 1500 | 6000 | 6000 |
| Normal | 10192 | 1600 | 6400 | 6400 |
| Viral pneumonia | 1345 | 1345 | 5380 | 5380 |
| Total | 21165 | 5845 | 23380 | 23380 |
| Breast mammogram dataset | ||||
| Class | Original | Balanced | Photometric | Geometric |
| Benign calc | 398 | – | 1592 | 1592 |
| Benign mass | 417 | – | 1668 | 1668 |
| Malignant calc | 300 | – | 1200 | 1200 |
| Malignant mass | 344 | – | 1376 | 1376 |
| Total | 1459 | – | 5836 | 5836 |
| Skin cancer dermoscopy image dataset | ||||
| Class | Original | Balanced | Photometric | Geometric |
| Benign | 1800 | – | 7200 | 7200 |
| Malignant | 1497 | – | 5988 | 5988 |
| Total | 3297 | – | 13188 | 13188 |
| Tympanic membrane image dataset | ||||
| Class | Original | Balanced | Photometric | Geometric |
| AOM | 119 | – | 476 | 595 |
| CSOM | 63 | – | 252 | 315 |
| Earwax | 140 | – | 560 | 700 |
| Normal | 533 | 250 | 1000 | 800 |
| Total | 855 | 572 | 2288 | 2288 |
| Brain tumor MRI dataset | ||||
| Class | Original | Balanced | Photometric | Geometric |
| Glioma tumor | 926 | – | 3704 | 3704 |
| Meningioma tumor | 937 | – | 3748 | 3748 |
| No tumor | 500 | – | 2000 | 2000 |
| Pituitary tumor | 901 | – | 3604 | 3604 |
| Total | 3264 | – | 13056 | 13056 |
| Breast cancer microscopic biopsy image dataset | ||||
| Class | Original | Balanced | Photometric | Geometric |
| Benign | 547 | – | 2188 | 2188 |
| Malignant | 1146 | – | 4584 | 4584 |
| Total | 1693 | – | 6772 | 6772 |
| Chest CT-scan dataset | ||||
| Class | Original | Balanced | Photometric | Geometric |
| Left lower lobe of adenocarcinoma | 195 | – | 780 | 780 |
| Large cell carcinoma of left hilum | 115 | – | 460 | 460 |
| Normal | 148 | – | 592 | 592 |
| Squamous cell carcinoma of left hilum | 155 | – | 620 | 620 |
| Total | 613 | – | 2452 | 2452 |
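The rows for the COVID, lung opacity, normal, and viral pneumonia classes show the two-step preparation most clearly: larger classes are first reduced to a per-class target, and each class is then expanded to four times its balanced size. The sketch below reproduces that arithmetic with plain Python. The factor of 4 is read off the table (for example 5600 / 1400), and the `augmented_counts` helper is hypothetical.

```python
def augmented_counts(class_counts, caps=None, factor=4):
    """Cap each class at its target size, then multiply by the augmentation factor."""
    caps = caps or {}
    balanced = {
        name: min(count, caps.get(name, count))
        for name, count in class_counts.items()
    }
    augmented = {name: count * factor for name, count in balanced.items()}
    return balanced, augmented

# Original class sizes and balancing targets as listed in the table.
covid_original = {
    "COVID": 3616,
    "Lung opacity": 6012,
    "Normal": 10192,
    "Viral pneumonia": 1345,
}
covid_caps = {"COVID": 1400, "Lung opacity": 1500, "Normal": 1600}

balanced, augmented = augmented_counts(covid_original, covid_caps)
print(sum(covid_original.values()), sum(balanced.values()), sum(augmented.values()))
```

The three totals match the Original, Balanced, and Photometric totals in the table for these classes.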
FIGURE 5. The architecture of the base model.
FIGURE 6. The architecture of the proposed MNet-10 model after ablation study.
Ablation study on layer configurations and activation functions.
| Case study 1: changing convolution and maxpool layer | |||||||
| Configuration no. | No. of convolution layer | No. of pooling layer | Time complexity | Epoch × training time | Test accuracy (%) | Finding | |
| 1 | 5 | 5 | 66M | 79 × 54s | 89.55 | Modest accuracy | |
| 2 | 4 | 4 | 64M | 75 × 54s | 93.36 | Highest accuracy | |
| 3 | 3 | 3 | 62M | 79 × 54s | 86.27 | Lowest accuracy | |
| 4 | 6 | 6 | 64M | 84 × 56s | 91.15 | Modest accuracy | |
| 5 | 7 | 7 | – | – | – | Error | |
| Case study 2: changing kernel size | |||||||
| Configuration no. | Kernel size | Time complexity | Epoch × training time | Test accuracy (%) | Finding | ||
| 1 | 3 × 3 | 64M | 72 × 54s | 93.36 | Near highest accuracy | ||
| 2 | 2 × 2 | 28M | 78 × 55s | 93.07 | Highest accuracy | ||
| 3 | 5 × 5 | 178M | 82 × 55s | 93.47 | Highest accuracy | ||
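| Case study 3: changing number of filters | |||||||
| Configuration no. | No. of filters | Time complexity | Epoch × training time | Test accuracy (%) | Finding | ||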
| 1 | 64 → 64 → 64 → 64 | 28M | 75 × 54s | 93.36 | Modest accuracy | ||
| 2 | 32 → 32 → 32 → 32 | 14M | 83 × 53s | 91.22 | Accuracy dropped | ||
| 3 | 32 → 32 → 64 → 64 | 16M | 79 × 53s | 94.51 | Accuracy improved | ||
| 4 | 16 → 32 → 32 → 64 | 10M | 71 × 51s | 94.75 | Highest accuracy | ||
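| Case study 4: changing type of pooling layer | |||||||
| Configuration no. | Type of pooling layer | Time complexity | Epoch × training time | Test accuracy (%) | Finding | ||
| 1 | Max | 10M | 66 × 51s | 94.75 | Highest accuracy | ||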
| 2 | Average | 10M | 71 × 52s | 94.75 | Highest accuracy | ||
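| Case study 5: changing activation function | |||||||
| Configuration no. | Activation function | Time complexity | Epoch × training time | Test accuracy (%) | Finding | ||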
| 1 | PReLU | 10M | 71 × 55s | 96.52 | Highest accuracy | ||
| 2 | ReLU | 10M | 66 × 51s | 94.75 | Previous accuracy | ||
| 3 | Leaky ReLU | 10M | 78 × 59s | 95.66 | Accuracy improved | ||
| 4 | Tanh | 10M | 78 × 60s | 94.2 | Accuracy dropped | ||
| 5 | ELU | 10M | 78 × 57s | 96.17 | Accuracy improved | ||
Ablation study on model hyper-parameters, loss function, and flatten layer.
| Case study 6: changing batch size | |||||
| Configuration no. | Batch size | Time complexity | Epoch × training time | Test accuracy (%) | Finding |
| 1 | 16 | 10M | 71 × 59s | 95.84 | Accuracy dropped |
| 2 | 32 | 10M | 68 × 56s | 96.83 | Highest accuracy |
| 3 | 64 | 10M | 71 × 55s | 96.52 | Previous accuracy |
| 4 | 128 | 10M | 78 × 51s | 96.28 | Accuracy dropped |
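| Case study 7: changing flatten layer | |||||
| Configuration no. | Flatten layer type | Time complexity | Epoch × training time | Test accuracy (%) | Finding |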
| 1 | Flatten | 10M | 68 × 56s | 96.83 | Highest accuracy |
| 2 | Global max pooling | 10M | 75 × 56s | 96.47 | Accuracy dropped |
| 3 | Global average pooling | 10M | 83 × 58s | 96.38 | Accuracy dropped |
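| Case study 8: changing loss function | |||||
| Configuration no. | Loss function | Time complexity | Epoch × training time | Test accuracy (%) | Finding |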
| 1 | Binary crossentropy | 10M | 82 × 56s | 88.57 | Accuracy dropped |
| 2 | Categorical crossentropy | 10M | 68 × 56s | 96.83 | Highest accuracy |
| 3 | Mean squared error | 10M | 73 × 55s | 87.62 | Accuracy dropped |
| 4 | Mean absolute error | 10M | 92 × 56s | 74.80 | Accuracy dropped |
| 5 | Mean squared logarithmic error | 10M | 68 × 56s | 95.81 | Accuracy dropped |
| 6 | Kullback Leibler divergence | 10M | 78 × 56s | 96.04 | Accuracy dropped |
| Case study 9: changing optimizer | |||||
| Configuration no. | Optimizer | Time complexity | Epoch × training time | Test accuracy (%) | Finding |
| 1 | Adam | 10M | 68 × 56s | 96.83 | Previous accuracy |
| 2 | Nadam | 10M | 74 × 56s | 97.15 | Highest accuracy |
| 3 | SGD | 10M | 87 × 61s | 92.68 | Accuracy dropped |
| 4 | Adamax | 10M | 89 × 58s | 95.75 | Accuracy dropped |
| 5 | RMSprop | 10M | 91 × 59s | 90.82 | Accuracy dropped |
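| Case study 10: changing learning rate | |||||
| Configuration no. | Learning rate | Time complexity | Epoch × training time | Test accuracy (%) | Finding |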
| 1 | 0.01 | 10M | 92 × 55s | 91.46 | Accuracy dropped |
| 2 | 0.007 | 10M | 87 × 56s | 95.85 | Accuracy dropped |
| 3 | 0.001 | 10M | 74 × 56s | 97.15 | Previous accuracy |
| 4 | 0.0007 | 10M | 65 × 57s | 97.34 | Highest accuracy |
| 5 | 0.0001 | 10M | 68 × 57s | 97.28 | Accuracy improved |
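The "Previous accuracy" entries suggest that each case study starts from the best configuration of the one before it, varies a single factor, and keeps the winner. The sketch below illustrates such a sequential, one-factor-at-a-time search, assuming that reading is correct. The `sequential_ablation` function is hypothetical, and `evaluate` stands in for training the model and measuring test accuracy; here it simply looks up the accuracies reported for the batch size, optimizer, and learning rate case studies.

```python
def sequential_ablation(base, search_space, evaluate):
    """Vary one factor at a time, keeping the best option before moving on."""
    config = dict(base)
    best_score = evaluate(config)
    for name, options in search_space.items():
        for option in options:
            candidate = {**config, name: option}
            score = evaluate(candidate)
            if score > best_score:
                config, best_score = candidate, score
    return config, best_score

# Test accuracies (%) reported in the tables for the three case studies.
reported = {
    (16, "Adam", 0.001): 95.84, (32, "Adam", 0.001): 96.83,
    (64, "Adam", 0.001): 96.52, (128, "Adam", 0.001): 96.28,
    (32, "Nadam", 0.001): 97.15, (32, "SGD", 0.001): 92.68,
    (32, "Adamax", 0.001): 95.75, (32, "RMSprop", 0.001): 90.82,
    (32, "Nadam", 0.01): 91.46, (32, "Nadam", 0.007): 95.85,
    (32, "Nadam", 0.0007): 97.34, (32, "Nadam", 0.0001): 97.28,
}

def evaluate(config):
    # Stand-in for a full training run: look up the reported accuracy.
    key = (config["batch_size"], config["optimizer"], config["learning_rate"])
    return reported[key]

base = {"batch_size": 64, "optimizer": "Adam", "learning_rate": 0.001}
search_space = {
    "batch_size": [16, 32, 64, 128],
    "optimizer": ["Adam", "Nadam", "SGD", "Adamax", "RMSprop"],
    "learning_rate": [0.01, 0.007, 0.001, 0.0007, 0.0001],
}
best_config, best_score = sequential_ablation(base, search_space, evaluate)
print(best_config, best_score)
```

A search of this kind evaluates far fewer configurations than a full grid, at the cost of possibly missing interactions between factors.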
FIGURE 7. Visualization of resulting time complexity (measured in millions and scaled into range 0–100) and test accuracy (measured in percentage) of all the ablation case studies.
Results for the breast mammogram, skin cancer, chest X-ray, tympanic membrane, brain tumor MRI, chest cancer CT-scan, breast cancer microscopic biopsy image, and breast cancer ultrasound image datasets.
| (1) Breast mammogram dataset | |||||||||||
| Experiment | T_acc | T_loss | Val_acc | V_loss | Te_acc | Te_loss | Precision | Recall | Specificity | F1_score | AUC |
| Before augmentation | 76.13 | 0.48 | 69.07 | 0.49 | 66.84 | 0.46 | 66.76 | 66.83 | 79.13 | 70.26 | 66.95 |
| Geometric | 95.35 | 0.12 | 94.08 | 0.23 | 90.32 | 0.24 | 90.79 | 90.39 | 96.37 | 90.59 | 90.43 |
| (2) Skin cancer dataset | |||||||||||
| Experiment | T_acc | T_loss | Val_acc | V_loss | Te_acc | Te_loss | Precision | Recall | Specificity | F1_score | AUC |
| Before augmentation | 90.13 | 0.28 | 89.37 | 0.54 | 88.86 | 0.28 | 88.34 | 88.64 | 93.25 | 88.49 | 88.92 |
| Geometric | 97.68 | 0.18 | 97.40 | 0.48 | 97.82 | 0.1872 | 97.71 | 97.76 | 99.06 | 97.73 | 97.91 |
| (3) Chest X-ray dataset | |||||||||||
| Experiment | T_acc | T_loss | Val_acc | V_loss | Te_acc | Te_loss | Precision | Recall | Specificity | F1_score | AUC |
| Before augmentation | 76.15 | 0.10 | 75.02 | 0.32 | 73.66 | 0.10 | 73.36 | 73.65 | 84.56 | 73.47 | 73.77 |
| Geometric | 98.39 | 0.09 | 98.16 | 0.23 | 94.81 | 0.09 | 94.53 | 94.54 | 97.05 | 95.53 | 94.95 |
| (4) Tympanic membrane dataset | |||||||||||
| Experiment | T_acc | T_loss | Val_acc | V_loss | Te_acc | Te_loss | Precision | Recall | Specificity | F1_score | AUC |
| Before augmentation | 65.15 | 0.82 | 64.52 | 0.08 | 64.37 | 0.08 | 63.82 | 64.06 | 75.48 | 63.94 | 64.41 |
| Geometric | 98.02 | 0.07 | 88.85 | 0.54 | 92.10 | 0.04 | 86.99 | 89.10 | 96.55 | 88.04 | 92.23 |
| (5) Brain tumor MRI dataset | |||||||||||
| Experiment | T_acc | T_loss | Val_acc | V_loss | Te_acc | Te_loss | Precision | Recall | Specificity | F1_score | AUC |
| Before augmentation | 90.13 | 0.28 | 84.07 | 0.49 | 82.36 | 0.366 | 82.27 | 82.56 | 89.13 | 82.41 | 82.44 |
| Geometric | 98.18 | 0.06 | 98.62 | 0.06 | 98.93 | 0.05 | 98.93 | 99.0 | 99.63 | 98.97 | 99.04 |
| (6) Chest cancer CT-scan dataset | |||||||||||
| Experiment | T_acc | T_loss | Val_acc | V_loss | Te_acc | Te_loss | Precision | Recall | Specificity | F1_score | AUC |
| Before augmentation | 65.15 | 0.82 | 64.52 | 0.08 | 64.37 | 0.08 | 63.82 | 64.26 | 81.45 | 64.03 | 64.41 |
| Geometric | 98.78 | 0.03 | 97.75 | 0.10 | 97.56 | 0.10 | 97.63 | 97.63 | 99.16 | 97.63 | 97.84 |
| (7) Breast cancer microscopic biopsy image dataset | |||||||||||
| Experiment | T_acc | T_loss | Val_acc | V_loss | Te_acc | Te_loss | Precision | Recall | Specificity | F1_score | AUC |
| Before augmentation | 91.15 | 0.40 | 85.02 | 0.42 | 83.66 | 0.10 | 83.36 | 83.65 | 84.56 | 83.65 | 83.80 |
| Photometric | 95.06 | 0.06 | 93.87 | 0.22 | 93.50 | 0.15 | 92.06 | 93.08 | 95.86 | 92.57 | 93.63 |
| (8) Breast cancer ultrasound image dataset | |||||||||||
| Experiment | T_acc | T_loss | Val_acc | V_loss | Te_acc | Te_loss | Precision | Recall | Specificity | F1_score | AUC |
| Before augmentation | 76.15 | 0.10 | 75.02 | 0.32 | 73.66 | 0.10 | 73.36 | 73.65 | 84.56 | 73.47 | 73.81 |
| Geometric | 98.39 | 0.12 | 98.16 | 0.23 | 95.38 | 0.09 | 95.53 | 95.54 | 97.05 | 95.53 | 95.55 |
| P | 98.95 |
The results include training accuracy (T_acc), training loss (T_loss), validation accuracy (Val_acc), validation loss (V_loss), test accuracy (Te_acc), test loss (Te_loss), precision, recall, specificity, F1 score, and area under the curve (AUC).
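Precision, recall, specificity, and F1 score can all be derived from a confusion matrix. The sketch below computes them per class and then averages across classes (macro-averaging). The averaging scheme is an assumption, since the source does not state how multi-class values were aggregated, and the `classification_metrics` helper and the 3-class matrix are hypothetical. The code also assumes every class occurs and is predicted at least once, so no denominator is zero.

```python
import numpy as np

def classification_metrics(confusion: np.ndarray) -> dict:
    """Macro-averaged metrics from a confusion matrix (rows: true, columns: predicted)."""
    confusion = confusion.astype(np.float64)
    tp = np.diag(confusion)               # correctly predicted samples per class
    fp = confusion.sum(axis=0) - tp       # predicted as the class but belonging elsewhere
    fn = confusion.sum(axis=1) - tp       # belonging to the class but predicted elsewhere
    tn = confusion.sum() - tp - fp - fn   # everything else
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": tp.sum() / confusion.sum(),
        "precision": precision.mean(),
        "recall": recall.mean(),
        "specificity": specificity.mean(),
        "f1_score": f1.mean(),
    }

# Hypothetical 3-class confusion matrix, not taken from the source.
confusion = np.array([
    [8, 1, 1],
    [2, 7, 1],
    [0, 1, 9],
])
metrics = classification_metrics(confusion)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```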
FIGURE 8. Accuracy curves for all the eight medical image datasets trained on proposed MNet-10.
Results of VGG16, ResNet50, InceptionV3, and MNet-10 on breast mammogram, skin cancer, chest X-ray, tympanic membrane, brain tumor MRI, chest cancer CT scan, and breast cancer ultrasound image with photometric augmentation techniques and breast cancer microscopic biopsy image with geometric augmentation techniques.
| Datasets | Metric | VGG16 | ResNet50 | InceptionV3 | Proposed model |
| Breast mammogram dataset | Test accuracy | 90.10 | 63.82 | 88.24 | 97.34 |
| | F1 score | 89.47 | 59.92 | 88.15 | 97.12 |
| | AUC | 91.38 | 63.97 | 89.32 | 97.47 |
| | Specificity | 93.61 | 68.45 | 93.49 | 98.95 |
| Skin cancer dermoscopy dataset | Test accuracy | 90.68 | 82.71 | 92.19 | 98.43 |
| | F1 score | 87.18 | 81.35 | 90.26 | 98.40 |
| | AUC | 92.04 | 82.11 | 93.84 | 98.65 |
| | Specificity | 94.12 | 86.09 | 96.17 | 99.32 |
| COVID chest X-ray dataset | Test accuracy | 93.74 | 78.80 | 89.87 | 97.29 |
| | F1 score | 92.36 | 75.63 | 86.95 | 97.31 |
| | AUC | 95.27 | 76.29 | 90.32 | 97.42 |
| | Specificity | 95.41 | 83.93 | 92.40 | 99.09 |
| Tympanic membrane dataset | Test accuracy | 89.99 | 55.78 | 94.26 | 96.31 |
| | F1 score | 89.57 | 54.83 | 93.81 | 96.34 |
| | AUC | 91.68 | 55.91 | 95.07 | 96.48 |
| | Specificity | 92.16 | 65.74 | 95.43 | 98.74 |
| Brain tumor MRI dataset | Test accuracy | 97.63 | 78.93 | 92.49 | 99.54 |
| | F1 score | 96.25 | 76.20 | 91.83 | 99.56 |
| | AUC | 97.84 | 79.58 | 94.28 | 99.71 |
| | Specificity | 97.14 | 83.45 | 95.03 | 99.84 |
| Chest cancer CT-scan dataset | Test accuracy | 98.78 | 81.71 | 96.74 | 99.82 |
| | F1 score | 98.05 | 80.34 | 93.72 | 99.84 |
| | AUC | 99.47 | 81.92 | 98.03 | 99.90 |
| | Specificity | 99.12 | 88.24 | 97.91 | 99.91 |
| Breast cancer microscopic biopsy image dataset | Test accuracy | 91.85 | 80.35 | 92.47 | 96.76 |
| | F1 score | 89.30 | 80.11 | 90.34 | 96.29 |
| | AUC | 93.53 | 82.45 | 93.70 | 96.84 |
| | Specificity | 93.41 | 86.26 | 94.18 | 98.53 |
| Breast cancer ultrasound image dataset | Test accuracy | 96.43 | 85.61 | 93.43 | 98.75 |
| | F1 score | 96.18 | 83.57 | 93.18 | 97.61 |
| | AUC | 97.10 | 87.04 | 94.35 | 97.59 |
| | Specificity | 98.75 | 91.73 | 96.83 | 99.18 |
The results include test accuracy, specificity, F1 score, and area under the curve (AUC) statistical values.
Results of Wilcoxon signed-rank test.
| Pairwise model comparison | p-value | Test outcome |
| Proposed model MNet-10 vs. VGG16 | 0.003 | Significant |
| Proposed model MNet-10 vs. ResNet50 | 0.003 | Significant |
| Proposed model MNet-10 vs. InceptionV3 | 0.003 | Significant |
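The Wilcoxon signed-rank test ranks the absolute paired differences between two models and asks how likely the observed imbalance between positive and negative ranks would be by chance. The sketch below is an exact two-sided version in plain Python, applied to the eight test accuracies of MNet-10 and VGG16 from the comparison table. Which values the authors paired and which test variant they used are not stated in the source, so this is an illustration of the technique and is not expected to reproduce the tabulated value; the function name is hypothetical.

```python
from itertools import product

def wilcoxon_signed_rank(x, y):
    """Exact two-sided Wilcoxon signed-rank test for paired samples.

    Returns (statistic, p_value) with statistic = min(W+, W-).
    Zero differences are dropped; tied absolute differences share an average rank.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    total = sum(ranks)
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    statistic = min(w_plus, total - w_plus)
    # Under the null hypothesis every sign pattern is equally likely,
    # so enumerate all 2^n patterns and count those at least as extreme.
    extreme = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if min(w, total - w) <= statistic + 1e-12:
            extreme += 1
    return statistic, extreme / 2 ** n

# Test accuracies (%) per dataset, in the row order of the comparison table.
mnet10 = [97.34, 98.43, 97.29, 96.31, 99.54, 99.82, 96.76, 98.75]
vgg16 = [90.10, 90.68, 93.74, 89.99, 97.63, 98.78, 91.85, 96.43]
statistic, p_value = wilcoxon_signed_rank(mnet10, vgg16)
print(statistic, p_value)
```

Enumerating sign patterns is only practical for small samples; for larger ones a normal approximation or a library routine would be the usual choice.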