Parita Oza1, Paawan Sharma1, Samir Patel1, Festus Adedoyin2, Alessandro Bruno2.
Abstract
Deep learning has become increasingly common in medical imaging research. Findings show that the performance of supervised deep learning methods depends heavily on the size of the training set, which expert radiologists must annotate manually, a tiring and time-consuming task. As a result, most freely accessible biomedical image datasets are small. Privacy and legal constraints make large medical image datasets even harder to assemble. Consequently, many supervised deep learning models are prone to overfitting and generalize poorly. One of the most popular ways to mitigate this problem is data augmentation. The technique enlarges the training set by applying various transformations and has been reported to improve model performance on new data. This article surveys data augmentation techniques applied to mammogram images and aims to provide insight into both basic and deep learning-based augmentation techniques.
Keywords: data augmentation; deep learning; mammograms; medical imaging
Year: 2022 PMID: 35621905 PMCID: PMC9147240 DOI: 10.3390/jimaging8050141
Source DB: PubMed Journal: J Imaging ISSN: 2313-433X
Figure 1. (a) The ideal trend, with training and validation error decreasing almost simultaneously. (b) The undesired effect of overfitting: the training error keeps decreasing while the validation error suddenly increases.
Figure 2. Methods to tackle overfitting.
Figure 3. The PRISMA flow diagram and the selection method.
Figure 4. Example of images after applying geometric transformation.
Figure 5. An example mammogram patch and a sample of patches generated from it by geometric transformation [12].
Figure 6. Mammograms after applying random erasing.
Figure 7. Example of data augmentation based on various filters and noise.
Figure 8. Given mask image (A); normal mammogram image (B); generated mammogram image with synthetic mask (C) [55].
Figure 9. Private dataset: given mask image (A); normal mammogram image (B); generated mammogram image with synthetic mask (C) [55].
Figure 10. Randomly sampled examples of original and synthetic mammograms [58].
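Figures 6 and 7 refer to random erasing and noise-based augmentation. The following is a minimal NumPy sketch of both, not the implementation used in any surveyed article. It assumes a single-channel image stored as a float array with values in [0, 1]; the function names, the `sigma` and `max_fraction` parameters, and their defaults are hypothetical choices made for illustration.

```python
import numpy as np


def add_gaussian_noise(image, sigma=0.05, rng=None):
    """Add zero-mean Gaussian noise and clip back to the assumed [0, 1] range."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = image + rng.normal(0.0, sigma, size=image.shape)
    return np.clip(noisy, 0.0, 1.0)


def random_erase(image, max_fraction=0.25, fill=0.0, rng=None):
    """Overwrite one random rectangle with `fill`.

    The rectangle spans at most `max_fraction` of each image side (an
    illustrative default). Returns the erased copy and the rectangle as
    (top, left, height, width) so the caller can check it against a lesion mask.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape[:2]
    # Generator.integers excludes the upper bound, hence the "+ 1".
    eh = int(rng.integers(1, max(2, int(h * max_fraction) + 1)))
    ew = int(rng.integers(1, max(2, int(w * max_fraction) + 1)))
    top = int(rng.integers(0, h - eh + 1))
    left = int(rng.integers(0, w - ew + 1))
    erased = image.copy()
    erased[top:top + eh, left:left + ew] = fill
    return erased, (top, left, eh, ew)
```

Returning the erased rectangle is a design choice for this sketch: if the rectangle covers the region of interest, the original label may no longer hold, so a pipeline may want to reject or relabel such samples.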
Summary of basic and advanced image augmentation techniques.
| Sr No. | DA Technique | Sub Category | Label Preserving | Strength | Limitation |
|---|---|---|---|---|---|
| 1 | Geometric | Flipping | No | Good solutions for positional bias present | Additional memory, Transformation |
| | | Cropping | Not always | | |
| | | Rotation | Not always | | |
| | | Translation | Yes | | |
| 2 | Noise Injection [ | - | Yes | Allows model to learn more robust | Difficult to decide amount of noise |
| 3 | Kernel Filters [ | - | Yes | Good to generate sharpen and | Similar to CNN mechanism |
| 4 | Mixing Images [ | - | No | - | Makes not much sense from human |
| 5 | Random Erasing [ | - | Not always | Analogous to dropout regularization. | Some manual intervention may be |
| 6 | Adversarial Training [ | - | Yes | Help to illustrate weak decision boundaries | Less explored |
| 7 | Generative Adversarial Network [ | - | Yes | GANs generate data that looks similar to | Harder to train, Generating results |
| 8 | Neural Style Transfer [ | - | - | Improves the generalization ability of | Efforts needed to select style, |
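The geometric sub-categories in the table above (flipping, rotation, translation) can be sketched with plain NumPy array operations. This is an illustrative sketch under stated assumptions, not the procedure of any surveyed article: images are 2-D arrays, rotations are restricted to multiples of 90 degrees, translated-in pixels are filled with a constant, and all function names are hypothetical.

```python
import numpy as np


def flip_horizontal(image):
    """Mirror the image left to right."""
    return image[:, ::-1]


def rotate_right_angle(image, k=1):
    """Rotate counter-clockwise by k * 90 degrees."""
    return np.rot90(image, k)


def translate(image, dy, dx, fill=0.0):
    """Shift the image by (dy, dx) pixels, filling exposed pixels with `fill`."""
    h, w = image.shape[:2]
    out = np.full_like(image, fill)
    if abs(dy) >= h or abs(dx) >= w:
        return out  # shifted entirely out of frame
    src_y = slice(max(0, -dy), min(h, h - dy))
    dst_y = slice(max(0, dy), min(h, h + dy))
    src_x = slice(max(0, -dx), min(w, w - dx))
    dst_x = slice(max(0, dx), min(w, w + dx))
    out[dst_y, dst_x] = image[src_y, src_x]
    return out


def eight_variants(image):
    """Four right-angle rotations, each with and without a horizontal flip."""
    variants = []
    for k in range(4):
        rotated = np.rot90(image, k)
        variants.append(rotated)
        variants.append(rotated[:, ::-1])
    return variants
```

For detection or segmentation tasks, the same transformation has to be applied to the bounding boxes or masks as to the image; otherwise the annotation no longer matches the pixels.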
Summary of articles using Image augmentation.
| Ref. | Task Performed | Model | Dataset | Model Performance | Data Augmentation Approach |
|---|---|---|---|---|---|
| [ | AD detection | Deep CNN | Private | AUC: 0.83 ± 0.14 | Rotation by 90, 180 and 270 degrees, |
| [ | AD detection | Deep CNN | MIAS, DDSM, INBreast | Accuracy: 93.75% | Rotation, flipping, shear, scaling, etc. |
| [ | Mass detection | Faster R-CNN | CBIS-DDSM | Sensitivity: 0.833 ± 0.038 | Horizontal and Vertical Flipping |
| [ | Mass detection | mr2NST | mammograms from | - | Neural Style Transfer |
| [ | BI-RADS Classification | AlexNet | INBreast | Accuracy: 83.4 | Image co-registration |
| [ | Tumor detection | Modified AlexNet | MIAS | 95.70% | Scaling, horizontal flip, |
| [ | Mass Classification | InceptionV3 and ResNet50 | DDSM | Accuracy: | Geometric Transformation |
| [ | Mammogram classification | Pre-trained CNN Architectures | Private | - | Reflection and Rotation |
| [ | BI-RADS classification | CNN | MIAS | Accuracy: 83.6% | Flip, rotation, shift and zoom |
| [ | Mammogram | Pre-trained CNN Architectures | MIAS | Accuracy: 99.01% | Gaussian blurring, horizontal flipping, |
| [ | Mass detection | Google Inception-V3 | INBreast | ROC: 0.86 | Gaussian noise, Flipping, |
| [ | Mass Classification | VGG based DCNN | INBreast, CBIS, BCRP | - | Elastic deformations |
| [ | Mass Classification | DCNN | MIAS, INBreast, DDSM | Conventional DA techniques: 88% | GAN |
| [ | Mass Classification | AlexNet, InceptionV3 | INBreast, CBIS-DDSM | Accuracy: | Rotation, flipping, shearing |
| [ | Lesion | ResNet50, VGG16, VGG19 | CBIS-DDSM | Accuracy: 90.4% | Geometric transformation, |
| [ | Abnormality | Meta Learning, ResNet101 | CBIS-DDSM | Accuracy: | Geometric transformations |
| [ | Mammogram | VGGNet, GoogleNet, ResNet | CBIS-DDSM, MIAS | AUC: 0.932 | Geometric transformations |
| [ | Mammogram | Residual Networks | INBreast | Specificity: 0.89 | Rotation, Translation |
| [ | Mass detection | InceptionV3 | INBreast | ROC: 0.91 | Geometric transformations, |
| [ | Mammogram Classification | AlexNet, ResNet | Private | - | Geometric transformations |
| [ | AD detection | AlexNet, SVM | CBIS-DDSM, DDSM, MIAS | Accuracy: 92 | Geometric transformations, TTA |
| [ | Mammogram detection and | YOLO | INBreast | Accuracy: 89.6 | Rotation, Flipping |
| [ | Build datasets of breast | AlexNet, DenseNet, | INBreast | - | Rotation, Flipping |
| [ | Mass Detection | Faster R-CNN | OMI-DB | TPR: | Horizontal Flipping |
| [ | Breast cancer diagnosis | Pre-trained CNN | CBIS-DDSM, BCDR, | F1 Score for MIAS 0.907 ± 0.150 | - |
| [ | Breast cancer classification | DCNN | MIAS | Accuracy: 90.50 | Feature wise data augmentation |
| [ | Mass Classification | CNN | DDSM | - | cycle GAN |
| [ | Masses Discrimination | GoogleNet | DDSM | Accuracy: 90.38% | Flipping, Cropped-ROI, Gaussian noise |
| [ | Image Classification | VGG-16/19 | Mini MIAS | - | Crossover technique |
| [ | Mass Image Synthesis | GAN | DDSM, Private | - | Contextual Information Based on GANs |
| [ | Mass Detection | One-Stage Object Detection | INBreast | Recall: INBreast: 0.93 | Elastic Deformation |
| [ | Mass Detection | Fully Convolutional Network | CBIS-DDSM | 0.8040 PAUC | Adversarial Learning |
| [ | Breast Cancer Classification | Deep CNN | MIAS, DDSM, | Accuracy: | Geometric Transformations, |
| [ | Mass Detection | Contrastive Learning, | INBreast, Private | - | Geometric Transformations |
| [ | Mass Classification | Deep CNN | Private | 0.760 ± 0.015 for 80% labeled data | Virtual Adversarial Training |
| [ | Mass Detection | Eight Object Detection Models | OPTIMAM, INBreast, | Out of eight models, DETR [ | Cutout and RandConv |
| [ | BI-RADS Classification | EfficientNet-B2 | Private | Macro F1 score: 0.595 | Transparency Strategy |
| [ | Mass Detection | Pre-trained CNNs, | BCDR | Accuracy: 84% | Geometric Transformations |
| [ | Lesion Detection | YOLOv4 | INBreast | Sensitivity: 93% by NCA | Geometric Transformations |
| [ | Mammogram Density Classification | DenseNet201, ResNet50 | MIAS | Accuracy: | Geometric Transformations |
| [ | Mass Segmentation | U-Net | DDSM | Sensitivity: 92.32% | Geometric Transformations |
| [ | Breast Cancer Detection | Pre-trained CNNs | MIAS | Accuracy: | Geometric Transformations |
Articles with pre and post augmentation dataset size and model performance.
| Ref. | Pre-Augmentation Dataset Size | Post-Augmentation Dataset Size | Post-Augmentation Model Performance |
|---|---|---|---|
| [ | 280 (Mammograms) | 345,000 ROIs | - |
| [ | 5136 ROIs (MIAS), | 49,724 ROIs (MIAS), | - |
| [ | - | 8 new labels per image | - |
| [ | 374 | 1560 samples | Accuracy improved by more than 33% |
| [ | 322 | 2576 | - |
| [ | 3290 | 26,320 | - |
| [ | - | - | Rise in validation accuracy from |
| [ | 322 | 9000 | - |
| [ | - | - | Increased AUC from 0.78 to 0.86 |
| [ | - | - | Improved FPI 3.509 (CBIS), 1.864 (BCRP) |
| [ | - | - | Rise in accuracy from 0.6026 to 0.8670 |
| [ | 1798 | Single image to be augmented into 546 images | Rise in accuracy from 69.85% to 94% |
| [ | 5257 | 104,795 | - |
| [ | - | - | Rise in accuracy from 78.92% to 80.56% |
| [ | - | - | Improvement in sensitivity from 0.786 to 0.913 |
| [ | - | - | Improvement in auROC from 0.62 to 0.73 |
| [ | 215 ROI | 3006 ROI | - |
| [ | 107 | 428 | - |
| [ | 106 | 7632 | - |
| [ | 221 (Patches) | 1768 Patches | - |
| [ | - | - | Improvement in accuracy by 1.4% |
| [ | - | Dataset is expanded by 24 times | - |
| [ | - | - | Improvement in accuracy by 1.47%, |
| [ | - | - | Improvement in detection rate by 5.03% |
| [ | 322 (MIAS), 1500 (DDSM), 410 (INBreast) | 3200 (MIAS), 28,800 (DDSM), 2240 (INBreast) | - |
| [ | 25,373 (Training Samples) | 28,000 (Training Samples) | - |
| [ | 106 | 1080 | - |
| [ | 7989 | 48,659 ROI | - |
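The pre- and post-augmentation sizes in the table above can be turned into an expansion factor (post size divided by pre size), which makes the rows easier to compare. The short sketch below does this arithmetic for the rows that report a single pre/post pair with no change of unit between the two columns; the selection of rows and the helper name `expansion_factor` are choices made for this illustration.

```python
# (pre, post) dataset sizes copied from the table above.
# Rows that mix units (e.g. mammograms vs. ROIs) or lack a size are omitted.
ROWS = [
    (322, 2576),
    (3290, 26320),
    (322, 9000),
    (5257, 104795),
    (215, 3006),
    (107, 428),
    (106, 7632),
    (221, 1768),
    (25373, 28000),
    (106, 1080),
]


def expansion_factor(pre, post):
    """Return how many times larger the augmented dataset is."""
    if pre <= 0:
        raise ValueError("pre-augmentation size must be positive")
    return post / pre


if __name__ == "__main__":
    for pre, post in ROWS:
        print(f"{pre:>6} -> {post:>7}  x{expansion_factor(pre, post):.2f}")
```

Several rows come out as exact integer multiples (for example 322 to 2576 is a factor of 8, and 107 to 428 a factor of 4). This would be consistent with applying a fixed set of transformations to every image, although the table itself does not state how each factor was obtained.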
Figure 11. Train and test-time data augmentation.
Figure 12. Test-time data augmentation framework.
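Test-time augmentation (TTA), named in Figures 11 and 12, applies transformations to each test image and aggregates the model's predictions over the transformed views. The figures themselves are not reproduced here, so the sketch below is only one plausible reading: it assumes flips as the transformations and a plain average as the aggregation. `tta_predict` and `toy_predict` are hypothetical names, and `toy_predict` is a stand-in for a trained classifier, not a real model.

```python
import numpy as np


def tta_predict(predict_fn, image):
    """Average predict_fn's class probabilities over the image and its flips."""
    views = [
        image,               # original
        image[:, ::-1],      # horizontal flip
        image[::-1, :],      # vertical flip
        image[::-1, ::-1],   # both flips
    ]
    probs = np.stack([np.asarray(predict_fn(v), dtype=float) for v in views])
    return probs.mean(axis=0)


def toy_predict(image):
    """Hypothetical classifier: scores an image by how bright its left half is."""
    half = image.shape[1] // 2
    left = image[:, :half].mean()
    right = image[:, half:].mean()
    p = left / (left + right + 1e-12)
    return np.array([p, 1.0 - p])
```

With the toy classifier, an image that is bright only on the left gets a confident prediction on its own, while the TTA average over flips is close to an even split. The example is deliberately extreme to show how averaging damps a prediction that depends on orientation.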