Muhammad Firoz Mridha, Md Abdul Hamid, Muhammad Mostafa Monowar, Ashfia Jannat Keya, Abu Quwsar Ohi, Md Rashedul Islam, Jong-Myon Kim.
Abstract
Breast cancer is now the most frequently diagnosed cancer in women, and its incidence is gradually increasing. Fortunately, there is a good chance of recovery if it is identified and treated at an early stage. Several researchers have therefore developed deep-learning-based automated methods, valued for their efficiency and accuracy, that predict the growth of cancer cells from medical imaging modalities. To date, only a few review studies on breast cancer diagnosis are available, and they summarize a limited set of existing studies without addressing emerging architectures and modalities. This review focuses on the evolving architectures of deep learning for breast cancer detection. It presents existing deep-learning-based architectures, analyzes the strengths and limitations of the existing studies, examines the datasets used, and reviews image pre-processing techniques. It also reviews diverse imaging modalities, performance metrics and results, challenges, and research directions for future researchers.
Keywords: breast cancer diagnosis; image pre-processing; imaging modalities; neural networks
Year: 2021 PMID: 34885225 PMCID: PMC8656730 DOI: 10.3390/cancers13236116
Source DB: PubMed Journal: Cancers (Basel) ISSN: 2072-6694 Impact factor: 6.639
Figure 1. A taxonomy of deep-learning-based breast cancer diagnosis.
A comparison of existing surveys based on breast cancer diagnosis. The first two columns identify the survey; the last six columns indicate the deep-learning architectures it covers.
| Ref. | Year | Taxonomy | Datasets | Imaging Modalities | Evaluation Metrics | Challenges | ANN | Autoencoder | DBN | CNN | ELM | GAN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| [ | 2017 | ✗ | ✓ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ |
| [ | 2018 | ✗ | ✓ | ✗ | ✗ | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ | ✗ |
| [ | 2018 | ✗ | ✓ | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ |
| [ | 2019 | ✗ | ✗ | ✓ | ✗ | ✓ | ✓ | ✗ | ✗ | ✓ | ✗ | ✗ |
| [ | 2019 | ✗ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ | ✗ |
| [ | 2019 | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✓ | ✓ | ✓ | ✗ | ✓ |
| [ | 2020 | ✗ | ✗ | ✗ | ✗ | ✓ | ✓ | ✗ | ✗ | ✓ | ✗ | ✗ |
| [ | 2020 | ✗ | ✓ | ✓ | ✗ | ✓ | ✓ | ✗ | ✗ | ✓ | ✗ | ✗ |
| [ | 2020 | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✓ | ✗ | ✓ | ✗ | ✗ |
| [ | 2020 | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ |
| [ | 2021 | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ |
| [ | 2021 | ✗ | ✓ | ✓ | ✓ | ✓ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ |
| [ | 2021 | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✓ | ✗ | ✓ | ✗ | ✓ |
| Ours | - | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
Figure 2. The number of published deep-learning-based breast cancer studies in the past 6 years and the current year.
Figure 3. A sample illustration of an ANN with multiple hidden layers for breast cancer diagnosis.
State-of-the-art studies based on ANN architecture.
| Reference | Dataset | Architecture | Category | Strength | Limitation |
|---|---|---|---|---|---|
| Abbass [ | WBCD | MPANN | BrC diagnosis | Better generalization | Absence of feature engineering |
| Karabatak and Ince [ | WBCD | AR+NN | BrC diagnosis | Reducing feature dimensions | Inadequate model evaluation |
| Rouhi et al. [ | MIAS, DDSM | ANN | Mammography, image segmentation | Correctly identify small mass lesion | Insufficient images |
| Jafari-Marandi et al. [ | WDBC | LS-SOED | BrC diagnosis | Driven to better decision-making | Inclusion of the dataset’s missing values |
| Becker et al. [ | BCDR | ANN | Mammography, BrC detection | Correctly identify small mass lesion | Insufficient images |
Figure 4. An illustration of the autoencoder model for breast cancer diagnosis.
State-of-the-art studies based on autoencoder architecture.
| Reference | Dataset | Architecture | Category | Strength | Limitation |
|---|---|---|---|---|---|
| Xu et al. [ | PD | SSAE + Softmax | Histopathology, nuclei patch classification, unsupervised learning | High-level feature learning | Limited images |
| Xu et al. [ | PLOS 2018 | SSAE + Softmax | Histopathology, nuclei detection, unsupervised Learning | Lower computation time | Imbalanced data |
| Cheng et al. [ | PD | SDAE | Ultrasound, supervised learning, breast-lesion classification | Adequate model evaluation | Absence of model comparison |
| Kadam et al. [ | WDBC | FE-SSAE-SM | Feature ensemble learning, BrC classification | Adequate evaluation | Absence of data-preprocessing techniques |
| Feng et al. [ | BCC | SDAE + Softmax | Histopathology, nuclei classification, unsupervised feature learning | Utilizing robust features of breast cancer nuclei | Insufficient images |
PD = private dataset, FE-SSAE-SM = feature ensemble learning based on stacked sparse autoencoders and softmax regression model.
Figure 5. An illustration of the DBN model for breast cancer diagnosis.
State-of-the-art studies based on DBN architecture.
| Reference | Dataset | Architecture | Category | Strength | Limitation |
|---|---|---|---|---|---|
| Abdel-Zaher and Eldeib [ | WBCD | DBN-NN | Unsupervised learning, supervised learning, BrC classification | Tested on several train-test partition | May suffer from overfitting issue |
| Zhang et al. [ | PD | PGBM | Ultrasound (shear-wave elastography (SWE)), feature extraction, classifying breast tumor | Utilized a different ultrasound technique | Higher training time |
| Dhungel et al. [ | DDSM-BCRP, INbreast | DBN | Mammography, segmentation of masses, structured learning | Can learn complex features | Inadequate model evaluation |
| Dhungel et al. [ | DDSM-BCRP, INbreast | CRF | Mammography, segmentation of masses, structured output learning | Significantly faster model | Inadequate model evaluation |
| Al-antari et al. [ | DDSM | DBN | Mammography, automatic mass detection | Feature engineering | Higher error rate for confusing benign with malignant |
| Khademi and Nedialkov [ | WDBC, WOBC | DBN | Breast cancer diagnosis | The integration of microarray and clinical data | Comparison with ML models instead of other DL models |
Figure 6. An illustration of a CNN-based model for breast cancer diagnosis.
State-of-the-art studies based on De-novo CNN.
| Reference | Dataset | Architecture | Category | Strength | Limitation |
|---|---|---|---|---|---|
| Arevalo et al. [ | BCDR | CNN (UDM) | Mammography, mass lesion classification | Comparison with a pre-trained model | Simple architecture |
| Spanhol et al. [ | BreakHis | CNN (UDM) | Histopathology, image classification | Used high-resolution histopathological images | For training, only small patches of the images are used. |
| Albarqouni et al. [ | Crowdsourcing | AggNet | Histopathology, mitosis detection | Tested with a benchmark dataset | Unreliable (crowd) annotations |
| Xu et al. [ | PD | DCNN (COM) | Histopathology, image segmentation and classification | Can learn complex features | Insufficient images |
| Kooi et al. [ | PD | CNN | Mammography, breast mass lesion classification | Focused on the detection of solid, malignant lesions including architectural distortions | Absence of benign lesions in training set |
| Araújo et al. [ | BICBH | CNN (UDM) | Histopathology, image-wise classification, patch-wise classification | Multi-class classification | Limited images |
| Samala et al. [ | PD | DLCNN (UDM) | Digital breast tomosynthesis, recognition of microcalcification | Learns complex patterns | Limited images |
| Ting et al. [ | MIAS | CNNI-BCC (UDM) | Mammography, breast-lesion classification | Feature-wise data augmentation | Limited images |
| Yan et al. [ | PD | CNN+RNN | Histopathology, pathological image classification | Released a larger and more diverse dataset | Lack of data pre-processing |
| Wang et al. [ | BreakHis | CNN (UDM) | Histopathology, BrC binary classification, deep feature fusion, and enhanced routing | Classification is conducted for different magnification factors | Absence of image pre-processing |
State-of-the-art studies based on a TL-based CNN.
| Reference | Dataset | Architecture | Category | Strength | Limitation |
|---|---|---|---|---|---|
| Huynh et al. [ | PD | AlexNet | Mammography, feature extraction, breast-mass classification | Automatic lesion segmentation | Inadequate model evaluation |
| Samala et al. [ | DDSM | CNN (FTM-ML) | Mammography, mass classification | Multi-task transfer learning | Absence of model comparison |
| Chougrad et al. [ | DDSM, BCDR, INbreast | VGG16, ResNet50 and Inception v3 (FTM-ML) | Mammography, mass-lesion classification | Merged three datasets | Inadequate model evaluation |
| Xie et al. [ | BreakHis | CNN (FTM-LL) | Histopathology, multi-class classification, clustering analysis | Solved the unbalanced distribution of samples | Lack of image pre-processing |
| Mendel et al. [ | PD | CNN (FTM-ML) | Mammography, digital breast tomosynthesis, classification | Leave-one-out step-wise feature selection was used to eliminate redundant features. | Lack of training data |
| Kumar et al. [ | BreakHis | VGGNet-16 (FTM-ML) | Histopathology, feature extraction, image classification | Analysis of effects of image pre-processing | Accuracy is influenced by magnification |
| Yu et al. [ | PD | CNN (FTM-ML) | Histopathology, image classification | Images are collected via the internet. | The quality of the images could be inadequate. |
| Hu et al. [ | PD | CNN (FTM-ML) | MRI, feature extraction | Pre-processing, large dataset, and extended training times are not required | Issue of class imbalance |
State-of-the-art studies based on RL-based CNN.
| Reference | Dataset | Architecture | Category | Strength | Limitation |
|---|---|---|---|---|---|
| Toğaçar et al. [ | BreakHis | BreastNet | Histopathology, BrC diagnosis | Can be used in all microscopic images at different magnification rates | Absence of data pre-processing |
| Gour et al. [ | BreakHis | ResHist | Histopathology, classification of benign and malignant | Preserves the global information of histopathological images | Consumes a lot of processing power |
| Hu et al. [ | BreakHis | myResNet-34 | Histopathology, malignancy-and-benign classification | Automatic target image generation | Significant rate of misclassification |
| Singh et al. [ | PD | ResNet | Mammography, digital breast tomosynthesis, multi-class classification | The approach is simple and can be applied in different imaging | Only patch-level images are used to train the model. |
| Li et al. [ | PD, INbreast | ResNet50 | Mammographic density classification | Combination of deep residual networks with integrated dilated convolutions and attention methods | Imbalanced classes |
Figure 7. A basic extreme learning machine architecture.
State-of-the-art studies based on ELM architecture.
| Reference | Dataset | Architecture | Category | Strength | Limitation |
|---|---|---|---|---|---|
| Lahoura et al. [ | WBCD | ELM | Feature selection, cloud environment, BrC diagnosis | Consideration of feature engineering | Absence of image pre-processing technique |
| Wang et al. [ | PD (mammograms) | ELM | Mass detection, feature extraction, clustering | Feature fusion | Insufficient data |
| Nemissi et al. [ | WBCD | ELM | BrC diagnosis, genetic algorithm | Higher generalization performance | Inadequate evaluation |
| Ronoud and Asadi [ | WDBC | ELM (DBN+ELM+BP) | BrC diagnosis, ensemble approach | Parameter tuning | |
| Wang et al. [ | BreaKHis, ImageNet | ELM (ICELM) | Feature extraction, double-step deep transfer learning, BrC diagnosis | A novel method | Not an end-to-end architecture |
| Toprak [ | WBCD | ELM | Detection and characterization of benign and malignant types | ELM is superior to other methods in performance and speed | Imbalanced classes |
| Muduli et al. [ | WBCD | ELM | Classification of breast masses, feature extraction and reduction | The generalization performance is improved | There is a possibility of data loss |
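The ELM studies above share one core idea: hidden-layer weights are drawn at random and left untrained, and only the output weights are solved in closed form. The following is a minimal sketch of that idea, assuming NumPy, a `tanh` activation, and a Moore-Penrose pseudo-inverse solution. The class name `ELM`, the synthetic two-class data, and all sizes are illustrative assumptions and do not reproduce any of the cited models or datasets.

```python
import numpy as np


class ELM:
    """Single-hidden-layer extreme learning machine (illustrative sketch)."""

    def __init__(self, n_hidden=30, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Random, fixed projection followed by a nonlinearity.
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        n_features = X.shape[1]
        # Hidden weights and biases are sampled once and never updated.
        self.W = self.rng.normal(size=(n_features, self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        # One-hot targets; y must hold integer class labels 0..K-1.
        T = np.eye(int(y.max()) + 1)[y]
        # Output weights via least squares (pseudo-inverse), no backpropagation.
        self.beta = np.linalg.pinv(H) @ T
        return self

    def predict(self, X):
        return np.argmax(self._hidden(X) @ self.beta, axis=1)


# Synthetic, well-separated two-class data (hypothetical, not WBCD).
data_rng = np.random.default_rng(1)
X = np.vstack([data_rng.normal(-2.0, 1.0, (100, 9)),
               data_rng.normal(2.0, 1.0, (100, 9))])
y = np.array([0] * 100 + [1] * 100)

model = ELM(n_hidden=30).fit(X, y)
train_acc = float((model.predict(X) == y).mean())
print(f"training accuracy: {train_acc:.3f}")
```

Because training reduces to one linear solve, fitting is fast, which is consistent with the speed advantage reported in the table. The sketch has no regularization, so it says nothing about the overfitting behaviour of the cited methods.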
Figure 8. A cGAN architecture: generator G (top) and discriminator D (bottom).
State-of-the-art studies based on GAN architecture.
| Reference | Dataset | Architecture | Category | Strength | Limitation |
|---|---|---|---|---|---|
| Guan and Loew [ | DDSM | GAN | Image augmentation, BrC diagnosis | Sufficient images | GAN is only used as image generator |
| Shams et al. [ | WBCD | GAN (DiaGRAM) | BrC diagnosis | Enhanced feature learning | |
| Thuy and Hoang [ | BreaKHis | GAN (styleGAN, Pix2Pix) | Image augmentation | Feature extraction with VGG16 and VGG19 | Generated images contain noise that affects the classifier's accuracy |
| Singh et al. [ | DDSM, INbreast | GAN (cGAN) | Breast-tumor segmentation | Works well on limited training samples | Tumor segmentation from full-mammograms has a low accuracy |
| Fan et al. [ | PD (DCE-MRI images) | GAN (SRGAN) | Image augmentation, BrC diagnosis | Generated super resolution ADC images | There is no conventional medical process that uses ADC images |
| Swiecicki et al. [ | PD | GAN | Digital breast tomosynthesis, image completion, abnormality detection | Able to identify suspicious regions without the need for training images containing abnormalities | Inadequate model evaluation |
| Tien et al. [ | PD | GAN | Computed tomography image-quality improvement | Can convert blurred images into clear images | Only for chest region |
PD (DCE-MRI) = private dataset of dynamic contrast-enhanced magnetic resonance imaging, ADC = apparent diffusion coefficient.
Detailed information of publicly available datasets.
| SL | Dataset Name | Category | No. of Images | Classes | Image Format | Resolution | URL |
|---|---|---|---|---|---|---|---|
| 1 | DDSM [ | Mammograms | 10,480 | Benign, cancer, normal, benign without callback (bwc) | .JPEG | 16-bit | |
| 2 | MIAS [ | Mammograms | 322 | Benign, malignant, normal | .PGM | 8 bit | |
| 3 | mini-MIAS [ | Mammograms | 322 | Calcification, circumscribed masses, spiculated masses, other/ill-defined masses, architectural distortion, asymmetry, normal | .PGM | 1024 × 1024 pixels | |
| 4 | CBIS-DDSM [ | Mammograms | 1644 | Normal, benign, and malignant | .DICOM | 16-bit | |
| 5 | INBreast [ | Mammograms | 410 | Benign, malignant, normal | .DICOM | 14-bit | |
| 6 | UPMC | Tomography and mammograms | - - | Hamartoma, invasive ductal carcinoma (IDC), asymmetry, lobular carcinoma, papilloma, calcifications | .DICOM | - | |
| 7 | BICBH [ | Histology images | 259 | Normal, benign, in situ carcinoma, and invasive carcinoma | .TIFF | - - | |
| 8 | BreakHis [ | Histology images | 7909 | Benign and malignant | .PNG | 8-bit | |
| 9 | BCC [ | Histology images | 58 | Malignant, benign | .TIFF | 896 × 768 pixels, 768 × 512 pixels | |
| 10 | BACH [ | Histology images | 400 | Normal, benign, in situ carcinoma, invasive carcinoma | .TIFF | 2048 × 1536 pixels | |
| 11 | TUPAC16 [ | Histology images | 500 | - - | .SVS | 50 k × 50 k pixels | |
| 12 | IDC [ | Histology images | 162 | Invasive ductal carcinoma (IDC), non-IDC | .PNG | - - | |
| 13 | MITOS-ATYPIA 14 | Histology images | - | Mitosis and nuclear atypia | .TIFF | 1539 × 1376 pixels, 1663 × 1485 pixels | |
| 14 | DMR-IR | Infrared Images | - - | - - | - - | 640 × 480 pixels | |
| 15 | BCDR [ | Mammograms and ultrasound | - - | Benign, malignant, normal | .DICOM | 720 × 1167 | |
| 16 | TCGA | Mammograms | 88 | - - | .DICOM | - - | |
| 17 | BancoWeb LAPIMO [ | Mammograms | 1473 | Benign, malignant, normal | .TIFF | 12 bits | |
| 18 | PLOS 2018 | Histology images | 537 | Nuclear, non-nuclear | .TIFF | 2200 × 2200 pixels | |
| 19 | WBCD or WBCO [ | Multivariate | 699 | Benign, malignant | - - | - - | |
| 20 | WDBC [ | Multivariate | 569 | Malignant, benign | - - | - - | |
| 21 | Histopathological images [ | Histology images | 3771 | Normal, benign, in situ carcinoma and invasive carcinoma | - - | 2048 × 1536 pixels | |
The advantages of image pre-processing methods used in previous studies are presented.
| Pre-Processing Method | Methodology | Advantages | References |
|---|---|---|---|
| Image augmentation | Geometric transformations such as rotation and flipping | To prevent the problem of overfitting. To address the issue of class imbalance in training. For improved interpretation of HP images, the network can learn lesions from several perspectives, much like a pathologist does in real life. | [ |
| Image augmentation | Insert noise/distortion (Gaussian noise, barrel or pincushion transforms) | Allows for the robust training of NNs. It can predict with greater accuracy even when images are noisy. | |
| Image augmentation | Patch-creation methods (patches with 50% overlapping, no overlapping, or randomly selected patches) | It can retain the image aspect ratio, architecture, or shape of the lesion, as well as subjective information. It improves the classifier’s performance while decreasing the likelihood of false negatives. It can decrease the possibility of information loss. | |
| Image augmentation | Synthetic minority over-sampling technique (SMOTE) | To solve the class imbalance problem before training NNs, this method increases the number of samples in the minority class. | |
| ROI extraction | Methods such as region growing, nuclei segmentation, the Otsu method, and the Markov random model were utilized. | Increases the amount of positive and negative image samples available. Assists the neural network (NN) in learning better representations of abnormal areas and decreases the likelihood of overfitting. Reduces calculation time and resource use. | [ |
| Scaling | Gaussian pyramid, bi-cubic interpolation, bilinear interpolation | The image must be resized before being provided as input to the NN. Carefully chosen interpolation algorithms can prevent information loss while mapping to the new pixel grid. Along with resizing, the Gaussian pyramid can assist to increase the number of images. | [ |
| Normalization and enhancement | Histogram equalization, adaptive mean, median filters, log transforms, CLAHE method, Wiener filter, multi-threshold peripheral equalization algorithm. | Normalize the image’s low-value and high-value intensity/contrast. Adaptive filters reduce noise by taking into account mean, variation, and spatial correlations. Reduces the effects of image blurring and impulsive noise in ultrasound images. Multi-threshold peripheral equalization enhances and removes irrelevant information from mammograms. On the normalized image, ANN typically performs better. It aids in the reduction in loss during backpropagation. | [ |
| Remove artifacts | Using binary images and thresholding the pixel intensity, cropping borders, extracting larger regions, using a geometric parabola around the rib cage. | Non-breast areas (labels, wedges, white strips/borders, opaque markers, lungs, thorax, chest wall, and pectoral muscle) in mammograms, US, and MRI can be reduced. | [ |
| Stain normalization or removal | Stain normalization | To make variable color (due to H&E staining of histology images) uniform across all images for certain patients. As a consequence, the NN will not be distracted by variations in brightness and color staining and will produce superior classification results for multiclass BrC. The contrast, intensity, and color characteristics of the source images are almost identical to those of the reference image. | [ |
| Stain normalization or removal | Color deconvolution | To extract hematoxylin-eosin (H&E) staining intensities from HP images and to transform them into optical density space images without being considerably affected. It decreases image dimensionality, consumes fewer resources, and improves classification performance. It maintains textural information in histology images that is related to stain colors. | |
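Two of the pre-processing methods in the table, geometric augmentation and patch creation with or without overlap, can be sketched in a few lines. The example below assumes NumPy and a single-channel image stored as a 2-D array. The function names `augment` and `extract_patches`, the 8 × 8 toy image, and the patch size are hypothetical and chosen only for illustration.

```python
import numpy as np


def augment(image):
    """Return the four 90-degree rotations of an image and their horizontal flips."""
    variants = []
    for k in range(4):
        rotated = np.rot90(image, k)          # rotate by k * 90 degrees
        variants.append(rotated)
        variants.append(np.fliplr(rotated))   # mirror left-to-right
    return variants


def extract_patches(image, size, overlap=0.5):
    """Slide a size x size window over the image with a fractional overlap."""
    stride = max(1, int(size * (1 - overlap)))
    height, width = image.shape[:2]
    patches = []
    for top in range(0, height - size + 1, stride):
        for left in range(0, width - size + 1, stride):
            patches.append(image[top:top + size, left:left + size])
    return patches


# Toy 8 x 8 "image" standing in for a mammogram or histology tile.
image = np.arange(64).reshape(8, 8)

augmented = augment(image)
overlapping = extract_patches(image, size=4, overlap=0.5)   # stride 2
disjoint = extract_patches(image, size=4, overlap=0.0)      # stride 4

print(len(augmented), len(overlapping), len(disjoint))
```

Overlapping patches yield more training samples from the same image than disjoint ones, which is one way the table's "increase the number of samples" advantage arises. Real pipelines would typically also handle image borders and color channels, which this sketch omits.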
Figure 9. Representative H&E stained images from the BreakHis dataset.
Figure 10. Breast cancer mammogram images from the DDSM dataset: (a) normal, (b) benign (not cancer), and (c) cancer.
An overview of datasets, references, methods, evaluation metrics, and accuracy.
| Dataset | Reference | Evaluation Metrics | Methods | Accuracy |
|---|---|---|---|---|
| MIAS, DDSM | Rouhi et al. [ | Sensitivity, specificity, accuracy, AUC | ANN | 90.16%, 96.47% |
| DDSM | Kumar et al. [ | Accuracy | ANN | 100% |
| BCC | Feng et al. [ | Precision, recall, F1-score, accuracy, mean execution time | Autoencoder | 98.27% |
| - | Wu et al. [ | Accuracy, sensitivity, specificity | Autoencoder | 95.45% |
| WBCO, WDBC | Ronoud and Asadi [ | Accuracy, sensitivity, specificity, NPV, PPV | DBN+ELM | 99.75% |
| DDSM | Mandala and Di [ | Sensitivity, specificity, accuracy | DBN | 93% |
| DDSM, PD (DBT) | Samala et al. [ | ROC, AUC | CNN (UDM) | 0.93 (AUC) |
| BreakHis | Spanhol et al. [ | Accuracy, patient score, patient-recognition rate, image-recognition rate | CNN (UDM) | 90% |
| MIAS | Ting et al. [ | Accuracy, sensitivity, specificity | CNN (UDM) | 90.50% |
| PD | Yan et al. [ | Accuracy, sensitivity, AUC | CNN (UDM) | 91.3% |
| BreakHis | Han et al. [ | Recognition rates, accuracy | CNN (COM) | 96.9% |
| BreakHis | Kumar et al. [ | Accuracy, F1-score | CNN (TL) | 97.01% |
| BreakHis | Toğaçar et al. [ | Accuracy, sensitivity, specificity, precision, and F1-score | CNN (RL) | 98.70% |
| BreaKHis | Hu et al. [ | Precision, recall, accuracy, F1-Score | CNN (RL) | 91% |
| BreaKHis | Wang et al. [ | Accuracy, sensitivity | ELM | 98.18% |
| WBCD | Lahoura et al. [ | Accuracy, recall, precision, F1-score | ELM | 98.68% |
| DDSM, INbreast | Shams et al. [ | Accuracy, AUC | GAN+CNN | 89%, 93.5% |
| DDSM, INbreast | Singh et al. [ | Accuracy | cGAN | 80% |
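Most of the evaluation metrics named in this table (accuracy, sensitivity or recall, specificity, precision or PPV, NPV, and F1-score) derive from the four counts of a binary confusion matrix. The sketch below shows the standard definitions in plain Python. The function name `binary_metrics` and the example counts are hypothetical and are not taken from any study listed above.

```python
def _ratio(numerator, denominator):
    """Divide safely; return NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def binary_metrics(tp, fp, tn, fn):
    """Compute common diagnostic metrics from confusion-matrix counts."""
    sensitivity = _ratio(tp, tp + fn)   # recall, true-positive rate
    specificity = _ratio(tn, tn + fp)   # true-negative rate
    precision = _ratio(tp, tp + fp)     # positive predictive value (PPV)
    npv = _ratio(tn, tn + fn)           # negative predictive value
    accuracy = _ratio(tp + tn, tp + fp + tn + fn)
    f1 = _ratio(2 * tp, 2 * tp + fp + fn)  # harmonic mean of precision and recall
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "npv": npv,
        "f1": f1,
    }


# Hypothetical counts: 100 malignant and 100 benign cases.
metrics = binary_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With imbalanced classes, which several of the reviewed studies list as a limitation, accuracy alone can look high while sensitivity remains poor, so reporting the metrics together gives a fuller picture.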
A research directive for architectural selection for BrC diagnosis.
| Architecture/Policy | Architecture Strength | Architecture Limitation | Research Direction |
|---|---|---|---|
| ANN | Driven to better decision-making. | Not suitable for extracting spatial information | Architectures suitable for extracting spatial information are required. |
| Autoencoder | Excellent for condensing feature information. | Requires separate feature classification system, along with fine-tuned multi-stage training strategies. | Implementation of a single-stage training platform is required. |
| DBN | Requires little training data. | DBN is not well optimized for image-recognition tasks. | Requires stronger fusion with convolutional architectures. |
| Transfer learning | Strong weight initialization results in achieving better accuracy from minimal training data. | Requires intuition of feature relation between pre-trained dataset and target dataset. | Transfer learning strategies should be implemented based on data relevancy. |
| Residual learning | Enables generalization of deeper architectures by auto-calibration on unnecessary features. | Requires batch normalization, which adds computational complexity. | Unnecessary and heavy residuals should be avoided. |
| ELM | Faster learning capability with the advantage of avoiding vanishing gradients. | Hard to solve underfitting and overfitting issues. Additionally, ELM is not well suited to image-classification tasks. | Need to move towards deep-learning strategies or extensively improve architectural perspectives. |
| GAN | Excellent for data distribution learning and generating synthetic data. | Training sometimes falls into local minima. Additionally, it may cause excessive focus on fixed data patterns. | Should be implemented for generating synthetic datasets and increasing the capability of cancer classifier models. |