Gelan Ayana, Jinhyung Park, Se-Woon Choe.
Abstract
Despite great achievements in classifying mammographic breast-mass images via deep learning (DL), obtaining large amounts of training data and ensuring generalization across different datasets with robust, well-optimized algorithms remain challenging. ImageNet-based transfer learning (TL) and patch classifiers have been utilized to address these challenges; however, researchers have been unable to achieve the performance required for DL to be used as a standalone tool. In this study, we propose a novel multi-stage TL from ImageNet and cancer-cell-line-image pre-trained models to classify mammographic breast masses as either benign or malignant. We trained our model on three public datasets: the Digital Database for Screening Mammography (DDSM), INbreast, and the Mammographic Image Analysis Society (MIAS) database. In addition, a mixed dataset of the images from these three datasets was used to train the model. We obtained average five-fold cross-validation AUCs of 1, 0.9994, 0.9993, and 0.9998 for the DDSM, INbreast, MIAS, and mixed datasets, respectively. Moreover, the performance improvement of our method over the patch-based method was statistically significant (p = 0.0029). Furthermore, our patchless approach outperformed patch- and whole-image-based methods, improving test accuracy by 8% (99.34% vs. 91.41%) on the INbreast dataset. The proposed method addresses the need for a large training dataset and reduces the computational burden of training and deploying mammography-based deep-learning models for early diagnosis of breast cancer.
Keywords: cancer cell line; classification; mammogram; multi-stage transfer learning; patchless
Year: 2022 PMID: 35267587 PMCID: PMC8909211 DOI: 10.3390/cancers14051280
Source DB: PubMed Journal: Cancers (Basel) ISSN: 2072-6694 Impact factor: 6.639
Mammogram datasets summary.
| Characteristics | DDSM | INbreast | MIAS |
|---|---|---|---|
| Origin | USA | Portugal | UK |
| Age | Yes | Yes | No |
| Number of cases | 2620 | 115 | 161 |
| Views | MLO and CC | MLO and CC | MLO |
| Number of images | 10,480 | 410 | 322 |
| Resolution | 8 and 16 bits/pixel | 14 bits/pixel | 8 bits/pixel |
| Benign: malignant ratio | 0.65:0.35 | 0.72:0.28 | 0.84:0.16 |
| Lesion type | All types of lesions | All types of lesions | All types of lesions |
| Annotation | Pixel level annotation | Annotation including label of individual finding | Center and ROI |
| Breast density information | Yes | Yes | Yes |
DDSM: Digital Database for Screening Mammography, MIAS: Mammographic Image Analysis Society, USA: United States of America, UK: United Kingdom, MLO: mediolateral oblique, CC: craniocaudal, ROI: region of interest.
Dataset categories.
| Dataset | Category | Sub-Category | Dataset Size | Validation | Test |
|---|---|---|---|---|---|
| DDSM | Benign | - | 3582 | 1194 | 1194 |
| DDSM | Malignant | - | 4293 | 1431 | 1431 |
| INbreast | Benign | - | 1512 | 504 | 504 |
| INbreast | Malignant | - | 3066 | 1022 | 1022 |
| MIAS | Benign | - | 1422 | 474 | 474 |
| MIAS | Malignant | - | 864 | 288 | 288 |
| Mixed | Benign | DDSM | 3582 | 1194 | 1194 |
| Mixed | Benign | INbreast | 1512 | 504 | 504 |
| Mixed | Benign | MIAS | 1422 | 474 | 474 |
| Mixed | Benign | Total | 6516 | 2172 | 2172 |
| Mixed | Malignant | DDSM | 4293 | 1431 | 1431 |
| Mixed | Malignant | INbreast | 3066 | 1022 | 1022 |
| Mixed | Malignant | MIAS | 864 | 288 | 288 |
| Mixed | Malignant | Total | 8223 | 2741 | 2741 |
DDSM: Digital Database for Screening Mammography, MIAS: Mammographic Image Analysis Society.
Figure 1. Cancer cell line image acquisition and pre-processing.
Figure 2. Different images formed after augmentation (right-MLO benign images from DDSM).
Figure 3. The proposed EfficientNetB2-based patchless multi-stage transfer-learning method for mammography breast-mass image classification. FC: fully connected; TL: transfer learning; Conv: convolution; BN: batch normalization; Act.: activation; AP: average pooling; DO: dropout; SM: softmax; GAP: global average pooling.
Sizes of the additional model layers.
| Layer Type | Input | Output |
|---|---|---|
| Input Layer | 16 × 227 × 227 × 3 | 16 × 227 × 227 × 3 |
| EfficientNetB2 (loaded from Keras with classifier and input layers removed) | 16 × 227 × 227 × 3 | 16 × 7 × 7 × 1408 |
| Global Average Pooling | 16 × 7 × 7 × 1408 | 16 × 1408 |
| Fully Connected Layer1 with L2 | 16 × 1408 | 16 × 1024 |
| Fully Connected Layer2 | 16 × 1024 | 16 × 8 |
| Fully Connected Layer3 | 16 × 8 | 16 × 8 |
| Softmax | 16 × 8 | 16 × 2 |
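The shape flow in the table can be traced with a few lines of plain Python. This is a sketch only; the helper functions simply propagate tensor shapes and are not part of the authors' code:

```python
# Trace the tensor shapes through the custom head listed in the table.
# Batch size 16; the EfficientNetB2 backbone emits a 16 x 7 x 7 x 1408 feature map.

def global_average_pool(shape):
    # (N, H, W, C) -> (N, C): averages over the two spatial dimensions
    n, h, w, c = shape
    return (n, c)

def dense(shape, units):
    # (N, F) -> (N, units): fully connected layer
    n, _ = shape
    return (n, units)

shape = (16, 7, 7, 1408)            # EfficientNetB2 output
shape = global_average_pool(shape)  # -> (16, 1408)
shape = dense(shape, 1024)          # FC layer 1 (with L2 regularization)
shape = dense(shape, 8)             # FC layer 2
shape = dense(shape, 8)             # FC layer 3
shape = dense(shape, 2)             # softmax over {benign, malignant}
print(shape)  # (16, 2)
```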
Training results with early stopping versus a fixed number of epochs.
| Dataset | Training Condition | Validation Accuracy (%) | Loss | Stopping Epoch |
|---|---|---|---|---|
| DDSM | Early stop with patience = 5 | 100 | 0.65 | 150 |
| DDSM | Early stop with patience = 5 | 99.97 | 0.64 | 150 |
| DDSM | Fixed epoch of 150 | 100 | 0.05 | 150 |
| INbreast | Early stop with patience = 5 | 99.93 | 0.078 | 150 |
| INbreast | Early stop with patience = 5 | 99.93 | 0.08 | 150 |
| INbreast | Fixed epoch of 150 | 99.93 | 0.078 | 150 |
| MIAS | Early stop with patience = 5 | 99.92 | 0.87 | 150 |
| MIAS | Early stop with patience = 5 | 99.92 | 0.87 | 150 |
| MIAS | Fixed epoch of 150 | 99.92 | 0.86 | 150 |
| Mixed | Early stop with patience = 5 | 99.95 | 0.05 | 133 |
| Mixed | Early stop with patience = 5 | 99.98 | 0.07 | 150 |
| Mixed | Fixed epoch of 150 | 99.98 | 0.07 | 150 |
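The patience-based early stopping compared in the table can be sketched as a minimal loop. This is plain Python for illustration; `train_with_early_stopping` is a hypothetical helper, not the authors' code:

```python
def train_with_early_stopping(val_losses, patience=5, max_epochs=150):
    """Return the epoch at which training stops, monitoring validation
    loss with the given patience (epochs without improvement)."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if loss < best:
            best, wait = loss, 0   # improvement: reset the patience counter
        else:
            wait += 1
            if wait >= patience:
                return epoch       # stopped early
    return min(len(val_losses), max_epochs)  # ran to the epoch cap

# A loss curve that plateaus after epoch 3 triggers the stop at epoch 8.
print(train_with_early_stopping([0.9, 0.5, 0.3] + [0.3] * 147))  # 8
```

A strictly decreasing loss curve never triggers the stop, so training runs to the fixed 150-epoch cap, matching most rows of the table.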
Performance summary of patchless multi-stage transfer learning based on EfficientNetB2 architecture with Adagrad optimizer.
| Dataset | F1 | AUC | Test Accuracy | Sensitivity | Specificity |
|---|---|---|---|---|---|
| DDSM | 1 | 1 | 1 | 1 | 1 |
| INbreast | 0.9995 | 0.9994 | 0.9993 | 0.9996 | 0.9992 |
| MIAS | 0.9989 | 0.9993 | 0.9992 | 0.9987 | 1 |
| Mixed | 0.9998 | 0.9998 | 0.9998 | 1 | 0.9997 |
AUC: area under the receiver operating characteristic curve; DDSM: Digital Database for Screening Mammography; MIAS: Mammographic Image Analysis Society database.
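The reported metrics follow from the binary confusion matrix, with malignant treated as the positive class. A minimal sketch (the function name and counts below are illustrative, not from the paper; AUC is omitted because it requires per-image scores rather than counts):

```python
def metrics(tp, fp, tn, fn):
    """F1, accuracy, sensitivity, and specificity from confusion-matrix
    counts, treating malignant as the positive class."""
    sensitivity = tp / (tp + fn)                 # true-positive rate
    specificity = tn / (tn + fp)                 # true-negative rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return f1, accuracy, sensitivity, specificity

# Hypothetical counts for illustration only (not taken from the paper):
f1, acc, sens, spec = metrics(tp=1020, fp=2, tn=502, fn=2)
```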
Figure 4. The learning curves of the proposed patchless deep-learning method for mammography breast-mass image classification on the (a) DDSM dataset; (b) INbreast dataset; (c) MIAS dataset; and (d) mixed dataset. train acc: training accuracy; val acc: validation accuracy; train loss: training loss; val loss: validation loss.
Results of robustness analysis of the proposed system across different CNN models and optimizers.
| Dataset | CNN-Optimizer Combination | F1-Score | AUC | Test Accuracy | Sensitivity | Specificity |
|---|---|---|---|---|---|---|
| DDSM | EfficientNetB2-Adagrad | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| DDSM | EfficientNetB2-Adam | 0.99993 | 0.99993 | 0.99992 | 1.0 | 0.99986 |
| DDSM | EfficientNetB2-SGD | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| DDSM | ResNet50-Adagrad | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| DDSM | ResNet50-Adam | 0.94108 | 0.89991 | 0.90898 | 0.79983 | 1.0 |
| DDSM | ResNet50-SGD | 0.99986 | 0.99986 | 0.99984 | 1.0 | 0.99972 |
| DDSM | InceptionV3-Adagrad | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| DDSM | InceptionV3-Adam | 0.88227 | 0.8 | 0.81809 | 0.6 | 1.0 |
| DDSM | InceptionV3-SGD | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| INbreast | EfficientNetB2-Adagrad | 0.99951 | 0.99941 | 0.99934 | 0.99960 | 0.99921 |
| INbreast | EfficientNetB2-Adam | 0.99872 | 0.99802 | 0.99829 | 0.99722 | 0.99882 |
| INbreast | EfficientNetB2-SGD | 0.99664 | 0.99637 | 0.99554 | 0.99880 | 0.99393 |
| INbreast | ResNet50-Adagrad | 0.99892 | 0.99821 | 0.99855 | 0.99722 | 0.99921 |
| INbreast | ResNet50-Adam | 0.97055 | 0.96968 | 0.96371 | 0.98730 | 0.95209 |
| INbreast | ResNet50-SGD | 0.99647 | 0.99486 | 0.99528 | 0.99365 | 0.99608 |
| INbreast | InceptionV3-Adagrad | 0.99793 | 0.99764 | 0.99724 | 0.99880 | 0.99647 |
| INbreast | InceptionV3-Adam | 0.99786 | 0.99603 | 0.99711 | 0.99285 | 0.99921 |
| INbreast | InceptionV3-SGD | 0.99892 | 0.99852 | 0.99855 | 0.99841 | 0.99863 |
| MIAS | EfficientNetB2-Adagrad | 0.99896 | 0.99936 | 0.99921 | 0.99873 | 1.0 |
| MIAS | EfficientNetB2-Adam | 0.99860 | 0.99874 | 0.99895 | 0.99957 | 0.99791 |
| MIAS | EfficientNetB2-SGD | 0.99310 | 0.99564 | 0.99475 | 0.99199 | 0.99930 |
| MIAS | ResNet50-Adagrad | 0.99193 | 0.99242 | 0.99396 | 0.99873 | 0.98611 |
| MIAS | ResNet50-Adam | 0.95908 | 0.96780 | 0.96825 | 0.96962 | 0.96597 |
| MIAS | ResNet50-SGD | 0.99235 | 0.99365 | 0.99422 | 0.99536 | 0.99236 |
| MIAS | InceptionV3-Adagrad | 0.99614 | 0.99645 | 0.99711 | 0.99915 | 0.99375 |
| MIAS | InceptionV3-Adam | 0.99450 | 0.99608 | 0.99580 | 0.99494 | 0.99722 |
| MIAS | InceptionV3-SGD | 0.99476 | 0.99554 | 0.99606 | 0.99831 | 0.99236 |
| Mixed | EfficientNetB2-Adagrad | 0.99985 | 0.99985 | 0.99983 | 1.0 | 0.99970 |
| Mixed | EfficientNetB2-Adam | 0.99919 | 0.99913 | 0.99910 | 0.99935 | 0.99890 |
| Mixed | EfficientNetB2-SGD | 0.99926 | 0.99926 | 0.99918 | 0.99990 | 0.99861 |
| Mixed | ResNet50-Adagrad | 0.99905 | 0.99893 | 0.99894 | 0.99889 | 0.99897 |
| Mixed | ResNet50-Adam | 0.93016 | 0.88472 | 0.89688 | 0.77956 | 0.98986 |
| Mixed | ResNet50-SGD | 0.99766 | 0.99737 | 0.99739 | 0.99714 | 0.99759 |
| Mixed | InceptionV3-Adagrad | 0.99828 | 0.99806 | 0.99808 | 0.99788 | 0.99824 |
| Mixed | InceptionV3-Adam | 0.88094 | 0.79390 | 0.81700 | 0.59410 | 0.99365 |
| Mixed | InceptionV3-SGD | 0.99821 | 0.99797 | 0.99800 | 0.99769 | 0.99824 |
SGD: stochastic gradient descent; CNN: convolutional neural network; AUC: area under the receiver operating characteristic curve; DDSM: Digital Database for Screening Mammography; MIAS: Mammographic Image Analysis Society database.
Comparison of the proposed multistage transfer learning against conventional transfer learning.
| Model | Dataset Type | CNN Architecture | Optimizer | Time (h) | Five-Fold Cross Validation Test Accuracy (%) |
|---|---|---|---|---|---|
| Best-practice conventional TL | DDSM | ResNet50 | Adam | 1.85 | 85.723 |
| Best-practice conventional TL | INbreast | ResNet50 | Adam | 1.82 | 83.566 |
| Best-practice conventional TL | MIAS | ResNet50 | Adam | 1.81 | 90.670 |
| Best-practice conventional TL | Mixed | ResNet50 | Adam | 1.86 | 86.335 |
| Multistage TL with the same setup as CTL | DDSM | ResNet50 | Adam | 1.71 | 90.898 |
| Multistage TL with the same setup as CTL | INbreast | ResNet50 | Adam | 1.71 | 96.371 |
| Multistage TL with the same setup as CTL | MIAS | ResNet50 | Adam | 1.69 | 96.825 |
| Multistage TL with the same setup as CTL | Mixed | ResNet50 | Adam | 1.72 | 89.688 |
| Multistage TL with our best model | DDSM | EfficientNetB2 | Adagrad | 1.60 | 100 |
| Multistage TL with our best model | INbreast | EfficientNetB2 | Adagrad | 1.52 | 99.934 |
| Multistage TL with our best model | MIAS | EfficientNetB2 | Adagrad | 1.50 | 99.921 |
| Multistage TL with our best model | Mixed | EfficientNetB2 | Adagrad | 1.62 | 99.983 |
CNN: convolutional neural network; TL: transfer learning; CTL: conventional transfer learning; DDSM: Digital Database for Screening Mammography; MIAS: Mammographic Image Analysis Society database; h: hour.
Comparison of the proposed patchless multistage transfer learning method against the patch and whole-image classifier.
| Fold Number | Patch and Whole-Image Classifier Accuracy (%) | Patch and Whole-Image Classifier Time (h) | Proposed Patchless Method Accuracy (%) | Proposed Patchless Method Time (h) |
|---|---|---|---|---|
| Fold 1 | 98.165 | 2.17 | 99.213 | 1.71 |
| Fold 2 | 77.129 | 2.13 | 99.737 | 1.76 |
| Fold 3 | 87.614 | 2.11 | 99.344 | 1.74 |
| Fold 4 | 94.695 | 2.09 | 99.279 | 1.73 |
| Fold 5 | 99.476 | 1.72 | 99.017 | 1.73 |
| Average | 91.416 | 2.04 | 99.344 | 1.73 |
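With matched per-fold accuracies like those above, a paired comparison reduces to a t statistic on the fold-wise differences. The sketch below illustrates only the mechanics in plain Python; the abstract's p = 0.0029 comes from the authors' own statistical analysis, and a full test such as `scipy.stats.ttest_rel` would also return a p-value:

```python
from math import sqrt

# Per-fold test accuracies from the comparison table above.
patch =     [98.165, 77.129, 87.614, 94.695, 99.476]
patchless = [99.213, 99.737, 99.344, 99.279, 99.017]

# Paired t statistic on the per-fold differences.
diffs = [b - a for a, b in zip(patch, patchless)]
n = len(diffs)
mean = sum(diffs) / n                                  # mean improvement
sd = sqrt(sum((d - mean) ** 2 for d in diffs) / (n - 1))  # sample std. dev.
t = mean / (sd / sqrt(n))
print(round(mean, 3), round(t, 3))  # 7.902 1.866
```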
Comparison of the proposed multistage transfer learning method with the state-of-the-art mammographic breast cancer classification methods.
| Paper | Application | Image Dataset | Dataset Size | Model Validation | CNN Model | AUC | Accuracy (%) |
|---|---|---|---|---|---|---|---|
| Al-masni et al. [ | Classification | DDSM | 600 with augmentation | 5-fold CV | CNN, F-CNN | 0.9645 | 96.33 |
| Al-antari et al. [ | Classification | DDSM, INbreast | 9240 DDSM and 2266 INbreast with augmentation | 5-fold CV | CNN, ResNet50, InceptionResNet-V2 | CNN = 0.945, | CNN = 94.5, |
| Ribli et al. [ | Classification | DDSM, SUD, INbreast | 2949 with augmentation | NA | Faster RCNN | 0.95 | NA |
| Chougrad et al. [ | Classification | DDSM, BCDR, INbreast, mixed, MIAS | 6116 with augmentation | 5-fold CV | Deep CNN | 0.98 on DDSM, 0.96 on BCDR, 0.97 on INbreast, and 0.99 on MIAS | 97.35 on DDSM, 96.67 on BCDR, 95.50 on INbreast, and 98.23 on MIAS |
| Lotter et al. [ | Classification | DDSM | 10,480 with augmentation | CV by patient | Wide ResNet | 0.92 | NA |
| Dhungel et al. [ | Classification | INbreast | 410 without augmentation | 5-fold CV | CNN, RF, BO | 0.69–0.76 MUI, 0.8–0.91 MS | Maximum of 95% |
| Saraswathi & Srinivasan [ | Classification | MIAS | 322 without augmentation | 10-fold CV | FCRN | NA | 94.7 |
| The proposed method | Classification | DDSM, INbreast, MIAS, mixed | 13,128 DDSM, 7632 INbreast, 3816 MIAS, and 24,576 mixed | 5-fold CV | EfficientNetB2 | 1 on DDSM, 0.9995 on INbreast, 0.9989 on MIAS, and 0.9998 on mixed | 100 on DDSM, 99.93 on INbreast, 99.92 on MIAS, and 99.98 on mixed |
CNN: convolutional neural network; AUC: area under the receiver operating characteristic curve; CV: cross-validation; DDSM: Digital Database for Screening Mammography; SUD: Semmelweis University dataset; F-CNN: Fourier convolutional neural network; NA: not available; Faster RCNN: faster region-based convolutional neural network; BCDR: Breast Cancer Digital Repository; MIAS: Mammographic Image Analysis Society database; RF: random forest; MUI: minimal user intervention; MS: manual set-up; BO: Bayesian optimization; FCRN: fully complex-valued relaxation neural network.
Figure 5. Examples of missing images from the (a) INbreast and (b) MIAS datasets.