Yong Joon Suh, Jaewon Jung, Bum-Joo Cho.
Abstract
Mammography plays an important role in breast cancer screening among women, and artificial intelligence has enabled the automated detection of diseases on medical images. This study aimed to develop a deep learning model that detects breast cancer in digital mammograms of various densities and to compare its performance with that of previous studies. From 1501 subjects who underwent digital mammography between February 2007 and May 2015, craniocaudal and mediolateral view mammograms were included and concatenated for each breast, ultimately producing 3002 merged images. Two convolutional neural networks were trained to detect any malignant lesion on the merged images. Performance was tested using 301 merged images from 284 subjects and compared with a meta-analysis of 12 previous deep learning studies. The mean area under the receiver operating characteristic curve (AUC) for detecting breast cancer in each merged mammogram was 0.952 ± 0.005 for DenseNet-169 and 0.954 ± 0.020 for EfficientNet-B5. Performance for malignancy detection decreased as breast density increased (density A, mean AUC = 0.984 vs. density D, mean AUC = 0.902 for DenseNet-169). When patients' age was used as a covariate for malignancy detection, performance showed little change (mean AUC, 0.953 ± 0.005). The mean sensitivity and specificity of DenseNet-169 (87% and 88%, respectively) surpassed the pooled values (81% and 82%, respectively) obtained in the meta-analysis. Deep learning could work efficiently in screening for breast cancer in digital mammograms of various densities, with the greatest benefit in breasts with lower parenchymal density.
Keywords: artificial intelligence; breast cancer; breast density; convolutional neural network; deep learning; mammography
Year: 2020 PMID: 33172076 PMCID: PMC7711783 DOI: 10.3390/jpm10040211
Source DB: PubMed Journal: J Pers Med ISSN: 2075-4426
Figure 1. Flow diagram for the subject enrollment in this study.
Data composition for digital mammograms in the training and testing datasets.
| Category | Whole Dataset (Breast) | Whole Dataset (Patient) | Training Set (Breast) | Training Set (Patient) | Test Set (Breast) | Test Set (Patient) |
|---|---|---|---|---|---|---|
| Overall | 3002 | 1501 | 2701 | 1484 | 301 | 284 |
| Non-malignant | 2465 | 1496 | 2218 | 1427 | 247 | 235 |
| Malignant | 537 | 532 | 483 | 478 | 54 | 54 |
| Breast density A | 152 | 76 | 132 | 74 | 20 | 18 |
| Breast density B | 594 | 297 | 532 | 292 | 62 | 57 |
| Breast density C | 1560 | 780 | 1405 | 774 | 155 | 149 |
| Breast density D | 696 | 348 | 632 | 344 | 64 | 60 |
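The breast-level counts in this table can be cross-checked with a few lines of arithmetic. The sketch below is only an illustrative consistency check, not part of the study's pipeline; the variable and function names are hypothetical. It verifies that training and test counts add up to the whole dataset, that the four density categories partition each split, and it computes the share of malignant breasts per split.

```python
# Breast-level counts copied from the table above: (whole, training, test).
overall = (3002, 2701, 301)
non_malignant = (2465, 2218, 247)
malignant = (537, 483, 54)
density = {
    "A": (152, 132, 20),
    "B": (594, 532, 62),
    "C": (1560, 1405, 155),
    "D": (696, 632, 64),
}


def splits_add_up(row):
    """True when training + test equals the whole-dataset count."""
    whole, train, test = row
    return train + test == whole


rows = [overall, non_malignant, malignant, *density.values()]
all_consistent = all(splits_add_up(r) for r in rows)

# The density categories should sum to the overall count in every split.
density_totals = tuple(sum(col) for col in zip(*density.values()))

# Share of malignant breasts in each split, as a percentage.
prevalence = [round(100 * m / n, 1) for m, n in zip(malignant, overall)]

print(all_consistent, density_totals == overall, prevalence)
# True True [17.9, 17.9, 17.9]
```

The malignant share is about 17.9% in the whole, training, and test sets alike. The patient-level columns do not add up in the same way (1484 + 284 exceeds 1501), which suggests that the two breasts of some subjects fell into different splits.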
Performances of deep learning models for breast cancer detection in mammograms by breast density.
| Breast Density | Model | Accuracy (%) | Sensitivity (%) | Specificity (%) | PPV (%) | NPV (%) | AUC |
|---|---|---|---|---|---|---|---|
| Overall | DenseNet-169 | 88.1 ± 0.2 | 87.0 ± 0.0 | 88.4 ± 0.2 | 62.1 ± 0.5 | 96.9 ± 0.0 | 0.952 ± 0.005 |
| Overall | EfficientNet-B5 | 87.9 ± 4.7 | 88.3 ± 4.7 | 87.9 ± 4.7 | 62.1 ± 9.9 | 97.2 ± 1.3 | 0.954 ± 0.020 |
| A | DenseNet-169 | 95.0 ± 0.0 | 100.0 ± 0.0 | 92.9 ± 0.0 | 85.7 ± 0.0 | 100.0 ± 0.0 | 0.984 ± 0.007 |
| A | EfficientNet-B5 | 96.7 ± 2.9 | 100.0 ± 0.0 | 95.3 ± 4.1 | 90.5 ± 8.3 | 100.0 ± 0.0 | 0.988 ± 0.012 |
| B | DenseNet-169 | 96.2 ± 4.1 | 97.0 ± 5.3 | 96.1 ± 3.9 | 85.3 ± 14.3 | 99.3 ± 1.2 | 0.962 ± 0.041 |
| B | EfficientNet-B5 | 95.2 ± 4.3 | 97.0 ± 5.3 | 94.8 ± 4.1 | 81.0 ± 12.9 | 99.3 ± 1.2 | 0.990 ± 0.009 |
| C | DenseNet-169 | 86.4 ± 6.2 | 87.7 ± 4.3 | 86.2 ± 6.7 | 58.8 ± 13.5 | 97.0 ± 1.1 | 0.950 ± 0.014 |
| C | EfficientNet-B5 | 81.9 ± 5.1 | 84.0 ± 5.7 | 81.5 ± 5.2 | 49.6 ± 9.1 | 96.0 ± 1.6 | 0.940 ± 0.016 |
| D | DenseNet-169 | 84.3 ± 5.4 | 83.3 ± 5.8 | 84.6 ± 5.3 | 51.0 ± 11.5 | 96.5 ± 1.3 | 0.902 ± 0.033 |
| D | EfficientNet-B5 | 85.9 ± 10.9 | 86.7 ± 11.5 | 85.8 ± 10.8 | 58.4 ± 28.2 | 97.1 ± 2.5 | 0.925 ± 0.055 |
PPV, positive predictive value; NPV, negative predictive value; AUC, area under the receiver operating characteristic curve.
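The PPV and NPV columns depend on the share of malignant cases as well as on sensitivity and specificity. The following sketch applies Bayes' rule to the overall DenseNet-169 means, assuming the test-set prevalence of 54 malignant breasts out of 301; the function name is hypothetical and this is an illustration, not the authors' evaluation code.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary test via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Overall DenseNet-169 means from the table; prevalence assumed from the
# test set (54 malignant breasts out of 301).
ppv, npv = predictive_values(0.870, 0.884, 54 / 301)
print(f"PPV = {100 * ppv:.1f}%, NPV = {100 * npv:.1f}%")
# PPV = 62.1%, NPV = 96.9%
```

Under that assumption the computed values agree with the reported overall PPV and NPV for DenseNet-169. The same relation explains why PPV stays well below sensitivity and specificity when fewer than one in five breasts is malignant.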
Figure 2. Heatmap of confusion matrix for breast cancer detection by the best performing (a) DenseNet-169 and (b) EfficientNet-B5.
Figure 3. Receiver operating characteristic curves for detecting breast cancer on digital mammograms by the best performing (a) DenseNet-169 and (b) EfficientNet-B5.
Figure 4. Gradient-weighted class activation mapping for mammograms having breast cancer by (a) DenseNet-169 and (b) EfficientNet-B5.
Forest plot of the previous studies showing the pooled (a) sensitivity and (b) specificity of deep learning algorithms for breast cancer detection in mammograms. CI, confidence interval.
| (a) Sensitivity | Estimate | 95% CI |
|---|---|---|
| Ragab (2019) | 0.86 | 0.79–0.91 |
| Rodriguez-Ruiz (2019) | 0.86 | 0.78–0.92 |
| Gastounioti (2018) | 0.81 | 0.72–0.88 |
| Kim (2018) | 0.76 | 0.72–0.79 |
| Becker (2017) | 0.71 | 0.63–0.79 |
| Teare (2017) | 0.91 | 0.86–0.95 |
| Akselrod-Ballin (2019) | 0.80 | 0.79–0.81 |
| Cai (2019) | 0.89 | 0.86–0.92 |
| Al-Masni (2018) | 0.99 | 0.96–1.00 |
| Casti (2017) | 0.84 | 0.64–0.95 |
| Sun (2017) | 0.81 | 0.79–0.83 |
| Wang (2016) | 0.89 | 0.81–0.94 |
| Pooled sensitivity | 0.81 | 0.80–0.82 |
| I² | 0.927 | |

| (b) Specificity | Estimate | 95% CI |
|---|---|---|
| Ragab (2019) | 0.88 | 0.82–0.92 |
| Rodriguez-Ruiz (2019) | 0.79 | 0.72–0.86 |
| Gastounioti (2018) | 0.98 | 0.96–0.99 |
| Kim (2018) | 0.90 | 0.88–0.92 |
| Becker (2017) | 0.70 | 0.62–0.77 |
| Teare (2017) | 0.80 | 0.76–0.84 |
| Akselrod-Ballin (2019) | 0.82 | 0.80–0.83 |
| Cai (2019) | 0.87 | 0.83–0.90 |
| Al-Masni (2018) | 1.00 | 0.98–1.00 |
| Casti (2017) | 0.77 | 0.55–0.92 |
| Sun (2017) | 0.72 | 0.70–0.74 |
| Wang (2016) | 0.90 | 0.82–0.95 |
| Pooled specificity | 0.82 | 0.81–0.82 |
| I² | 0.967 | |
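The I² values summarize how much of the spread between studies exceeds what sampling error alone would produce, using I² = (Q − df) / Q with Cochran's Q. The sketch below shows the calculation for the twelve sensitivity estimates in panel (a). It is a crude approximation under stated assumptions: standard errors are back-calculated from the reported 95% CI widths on the raw proportion scale, and fixed-effect inverse-variance weights are used. The record does not state the pooling model or software behind the reported figures, so this is not expected to reproduce 0.81 or 0.927 exactly.

```python
# (estimate, lower, upper) for sensitivity, in the order listed in panel (a).
sensitivity_rows = [
    (0.86, 0.79, 0.91), (0.86, 0.78, 0.92), (0.81, 0.72, 0.88),
    (0.76, 0.72, 0.79), (0.71, 0.63, 0.79), (0.91, 0.86, 0.95),
    (0.80, 0.79, 0.81), (0.89, 0.86, 0.92), (0.99, 0.96, 1.00),
    (0.84, 0.64, 0.95), (0.81, 0.79, 0.83), (0.89, 0.81, 0.94),
]

Z95 = 1.96  # normal quantile for a two-sided 95% interval


def heterogeneity(rows):
    """Return (pooled estimate, Cochran's Q, I^2) with inverse-variance weights."""
    estimates = [est for est, _, _ in rows]
    # Assumption: SE is approximated as CI width / (2 * 1.96).
    variances = [((hi - lo) / (2 * Z95)) ** 2 for _, lo, hi in rows]
    weights = [1 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, estimates))
    df = len(rows) - 1
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return pooled, q, i2


pooled, q, i2 = heterogeneity(sensitivity_rows)
print(f"pooled = {pooled:.3f}, Q = {q:.1f}, I^2 = {i2:.3f}")
```

Even with this rough approximation, I² comes out far above the 0.75 level often read as considerable heterogeneity, which is in line with the high values reported in the table.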