Tengku Muhammad Hanis, Md Asiful Islam, Kamarul Imran Musa.
Abstract
In this meta-analysis, we aimed to estimate the diagnostic accuracy of machine learning models applied to digital mammograms and tomosynthesis for breast cancer classification, and to assess the factors affecting their diagnostic accuracy. We searched for related studies in Web of Science, Scopus, PubMed, Google Scholar and Embase. The studies were screened in two stages to exclude unrelated studies and duplicates. In total, 36 studies containing 68 machine learning models were included in this meta-analysis. The area under the curve (AUC), hierarchical summary receiver operating characteristics (HSROC) curve, pooled sensitivity and pooled specificity were estimated using a bivariate Reitsma model. The overall AUC, pooled sensitivity and pooled specificity were 0.90 (95% CI: 0.85-0.90), 0.83 (95% CI: 0.78-0.87) and 0.84 (95% CI: 0.81-0.87), respectively. Three covariates were statistically significant: country (p = 0.003), source (p = 0.002) and classifier (p = 0.016). The type of data covariate was not statistically significant (p = 0.121). Deeks' linear regression test indicated publication bias among the included studies (p = 0.002). Thus, the results should be interpreted with caution.
Keywords: breast cancer; diagnostic accuracy; machine learning; mammography; meta-analysis
Year: 2022 PMID: 35885548 PMCID: PMC9320089 DOI: 10.3390/diagnostics12071643
Source DB: PubMed Journal: Diagnostics (Basel) ISSN: 2075-4418
Figure 1. Flow diagram of the study selection process.
Characteristics of included studies.
| Study | ID | Country | Source | Size of Dataset | Train/Validation/Test Split | Type of Data | Classifier | Prediction Class | TP | TN | FP | FN | Accuracy |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Abdolmaleki 2006 | 1 | Iran | Primary data | 122 cases | 82/-/40 | DM | NN | Benign-Malignant | 16 | 14 | 8 | 2 | 0.75 |
| Acharyau 2008 | 2 | USA | DDSM | 360 images | 270/-/90 | DM | NN | Normal-Benign-Malignant | 55 | 28 | 2 | 5 | 0.97 |
| | 3 | USA | DDSM | 360 images | 270/-/90 | DM | GMM | Normal-Benign-Malignant | 57 | 29 | 1 | 3 | 0.98 |
| Al-antari 2020 | 4 | USA | DDSM | 600 images | 420/60/120 | DM | DL | Benign-Malignant | 59 | 59 | 1 | 1 | 0.98 |
| | 5 | Portugal | INbreast | 410 images | 78/12/22 | DM | DL | Benign-Malignant | 14 | 6 | 2 | 0 | 0.95 |
| Alfifi 2020 | 6 | UK | MIAS | 200 images | NE | DM | DL | Normal-Benign-Malignant | 124 | 66 | 7 | 3 | 0.95 |
| | 7 | UK | MIAS | 200 images | NE | DM | Tree-based | Normal-Benign-Malignant | 102 | 54 | 29 | 15 | 0.78 |
| | 8 | UK | MIAS | 200 images | NE | DM | KNN | Normal-Benign-Malignant | 99 | 50 | 32 | 19 | 0.74 |
| Al-hiary 2012 | 9 | Jordan | Primary data | NE | NE | DM | NN | Normal-Cancer | 14 | 15 | 1 | 2 | 0.91 |
| Al-masni 2018 | 10 | USA | DDSM | 2400 images | 1920/-/480 | DM | NN | Benign-Malignant | 240 | 226 | 14 | 0 | 0.97 |
| Bandeira-diniz 2018 | 11 | USA | DDSM | 2482 images | 1990/-/492 | DM | DL | Non-mass-Mass | 2418 | 4306 | 442 | 225 | 0.91 |
| | 12 | USA | DDSM | 2482 images | 1990/-/492 | DM | DL | Non-mass-Mass | 1774 | 5615 | 210 | 188 | 0.95 |
| Barkana 2017 | 13 | USA | DDSM | 2173 images | 1451/-/722 | DM | NN | Benign-Malignant | 325 | 270 | 70 | 57 | 0.82 |
| | 14 | USA | DDSM | 2173 images | 1451/-/722 | DM | SVM | Benign-Malignant | 318 | 278 | 62 | 64 | 0.83 |
| Biswas 2019 | 15 | UK | MIAS | 322 images | 226/48/48 | DM | NN | Normal-Abnormal | 32 | 12 | 3 | 1 | 0.92 |
| Cai 2019 | 16 | China | Primary data | 990 images | 891/-/99 | DM | SVM | Benign-Malignant | 48 | 39 | 6 | 6 | 0.89 |
| Chen 2019a | 17 | China | Primary data | 81 cases | NE | DM | Tree-based | Benign-Malignant | 31 | 30 | 11 | 9 | 0.75 |
| Chen 2019b | 18 | USA | Primary data | 275 cases | 10-folds cross validation | DM | SVM | Benign-Malignant | 102 | 104 | 37 | 32 | 0.75 |
| | 19 | USA | Primary data | 275 cases | 10-folds cross validation | DM | SVM | Benign-Malignant | 103 | 114 | 27 | 31 | 0.79 |
| Danala 2018 | 20 | USA | Primary data | 111 cases | LOO-CV | DM | DL | Benign-Malignant | 63 | 24 | 9 | 15 | 0.78 |
| | 21 | USA | Primary data | 111 cases | LOO-CV | DM | DL | Benign-Malignant | 55 | 21 | 12 | 23 | 0.68 |
| Daniellopez-cabrera 2020 | 22 | UK | mini-MIAS | 322 images | NE | DM | DL | Normal-Abnormal | 31 | 101 | 2 | 4 | 0.97 |
| | 23 | UK | mini-MIAS | 322 images | NE | DM | DL | Benign-Malignant | 14 | 28 | 3 | 1 | 0.91 |
| Fathy 2019 | 24 | USA | DDSM | 3932 images | 2517/629/786 | DM | DL | Normal-Abnormal | 389 | 325 | 71 | 1 | 0.91 |
| Girija 2019 | 25 | UK | mini-MIAS | 322 images | NE | DM | Tree-based | Normal-Abnormal | 266 | 48 | 4 | 4 | 0.98 |
| | 26 | UK | mini-MIAS | 322 images | NE | DM | Tree-based | Benign-Malignant | 200 | 55 | 6 | 9 | 0.94 |
| Jebamony 2020 | 27 | UK | mini-MIAS | 294 images | 203/-/91 | DM | NN | Benign-Malignant | 33 | 41 | 12 | 5 | 0.85 |
| | 28 | UK | mini-MIAS | 294 images | 203/-/91 | DM | SVM | Benign-Malignant | 37 | 49 | 4 | 1 | 0.96 |
| Junior 2010 | 29 | UK | mini-MIAS | 428 ROIs | 320/-/108 | DM | NN | Normal-Abnormal | 16 | 69 | 5 | 18 | 0.79 |
| | 30 | UK | mini-MIAS | 428 ROIs | 320/-/108 | DM | SVM | Normal-Abnormal | 20 | 80 | 1 | 7 | 0.93 |
| Kanchanamani 2016 | 31 | UK | MIAS | 322 images | NE | DM | SVM | Normal-Abnormal | 46 | 120 | 24 | 0 | 0.87 |
| | 32 | UK | MIAS | 322 images | NE | DM | Bayes-based | Normal-Abnormal | 30 | 94 | 50 | 16 | 0.65 |
| | 33 | UK | MIAS | 322 images | NE | DM | DL | Normal-Abnormal | 23 | 101 | 43 | 23 | 0.65 |
| | 34 | UK | MIAS | 322 images | NE | DM | KNN | Normal-Abnormal | 28 | 112 | 32 | 18 | 0.74 |
| | 35 | UK | MIAS | 322 images | NE | DM | LDA | Normal-Abnormal | 28 | 112 | 32 | 18 | 0.74 |
| | 36 | UK | MIAS | 322 images | NE | DM | SVM | Benign-Malignant | 58 | 53 | 2 | 7 | 0.93 |
| | 37 | UK | MIAS | 322 images | NE | DM | Bayes-based | Benign-Malignant | 50 | 20 | 35 | 15 | 0.58 |
| | 38 | UK | MIAS | 322 images | NE | DM | DL | Benign-Malignant | 29 | 29 | 26 | 36 | 0.48 |
| | 39 | UK | MIAS | 322 images | NE | DM | KNN | Benign-Malignant | 41 | 25 | 30 | 24 | 0.55 |
| | 40 | UK | MIAS | 322 images | NE | DM | LDA | Benign-Malignant | 38 | 33 | 22 | 27 | 0.59 |
| Kim 2018 | 41 | Korea | Primary data | 29,107 images | 26631/1238/1238 | DM | DL | Normal-Abnormal | 471 | 548 | 71 | 148 | 0.82 |
| Mao 2019 | 42 | China | Primary data | 173 cases | 138/-/35 | DM | SVM | Benign-Malignant | 13 | 14 | 1 | 7 | 0.80 |
| | 43 | China | Primary data | 173 cases | 138/-/35 | DM | Logistic | Benign-Malignant | 17 | 14 | 1 | 3 | 0.89 |
| | 44 | China | Primary data | 173 cases | 138/-/35 | DM | KNN | Benign-Malignant | 8 | 14 | 1 | 12 | 0.83 |
| | 45 | China | Primary data | 173 cases | 138/-/35 | DM | Bayes-based | Benign-Malignant | 9 | 13 | 2 | 11 | 0.78 |
| Miao 2015 | 46 | USA | MMD | 830 cases | 10-folds cross validation | DM | SVM | Benign-Malignant | 381 | 399 | 28 | 22 | 0.94 |
| Miao 2013 | 47 | USA | MMD | 830 cases | NE | DM | NN | Benign-Malignant | 360 | 384 | 43 | 43 | 0.90 |
| Milosevic 2015 | 48 | UK | MIAS | 300 images | 5-folds cross validation | DM | SVM | Normal-Abnormal | 23 | 163 | 24 | 90 | 0.62 |
| | 49 | UK | MIAS | 300 images | 5-folds cross validation | DM | KNN | Normal-Abnormal | 44 | 138 | 49 | 69 | 0.61 |
| | 50 | UK | MIAS | 300 images | 5-folds cross validation | DM | Bayes-based | Normal-Abnormal | 53 | 113 | 74 | 60 | 0.55 |
| | 51 | Serbia | Primary data | 300 images | 5-folds cross validation | DM | SVM | Normal-Abnormal | 121 | 130 | 20 | 29 | 0.84 |
| | 52 | Serbia | Primary data | 300 images | 5-folds cross validation | DM | KNN | Normal-Abnormal | 84 | 79 | 71 | 66 | 0.54 |
| | 53 | Serbia | Primary data | 300 images | 5-folds cross validation | DM | Bayes-based | Normal-Abnormal | 114 | 118 | 32 | 36 | 0.77 |
| Nithya 2012 | 54 | USA | DDSM | 250 images | 200/-/50 | DM | NN | Normal-Abnormal | 23 | 24 | 2 | 1 | 0.94 |
| Nusantara 2016 | 55 | UK | MIAS | 322 images | 291/-/31 | DM | KNN | Normal-Abnormal | 10 | 20 | 0 | 1 | 0.97 |
| Palantei 2017 | 56 | UK | MIAS | NE | NE | DM | SVM | Normal-Abnormal | 9 | 21 | 4 | 0 | 0.88 |
| Paramkusham 2018 | 57 | USA | DDSM | 148 images | 126/-/22 | DM | SVM | Benign-Malignant | 10 | 10 | 1 | 1 | 0.91 |
| Roseline 2018 | 58 | UK | MIAS | NE | NE | DM | KNN | Benign-Malignant | 49 | 60 | 4 | 2 | 0.95 |
| Shah 2015 | 59 | UK | MIAS | 320 images | NE | DM | NN | Normal-Abnormal | 54 | 49 | 2 | 3 | 0.95 |
| | 60 | UK | MIAS | 320 images | NE | DM | NN | Benign-Malignant | 24 | 22 | 2 | 6 | 0.85 |
| Shivhare 2020 | 61 | USA, UK | DDSM, MIAS | NE | NE | DM | NN | Benign-Malignant | 12 | 16 | 2 | 3 | 0.85 |
| | 62 | USA, UK | DDSM, MIAS | NE | NE | DM | DL | Benign-Malignant | 1 | 17 | 1 | 14 | 0.55 |
| | 63 | USA, UK | DDSM, MIAS | NE | NE | DM | SVM | Benign-Malignant | 0 | 18 | 0 | 15 | 0.55 |
| Singh 2018 | 64 | UK | MIAS | 139 ROIs | 69/28/42 | DM | NN | Benign-Malignant | 25 | 14 | 1 | 2 | 0.93 |
| Venkata 2019 | 65 | NA | NA | 110 images | 80/-/30 | DM | Logistic regression | Benign-Malignant | 14 | 14 | 1 | 1 | 0.93 |
| Wang 2017 | 66 | UK | mini-MIAS | 200 images | 10-folds cross validation | DM | NN | Normal-Abnormal | 92 | 92 | 8 | 8 | 0.92 |
| Wutsqa 2017 | 67 | UK | MIAS | 120 cases | 96/-/24 | DM | NN | Normal-Abnormal | 14 | 8 | 0 | 2 | 0.92 |
| Yousefi 2018 | 68 | USA | Primary data | 87 images | NE | Tomosynthesis | Tree-based | Benign-Malignant | 11 | 13 | 2 | 2 | 0.87 |
DM = digital mammogram; NN = neural network; GMM = Gaussian mixture model; DL = deep learning; KNN = k-nearest neighbor; SVM = support vector machine; LDA = linear discriminant analysis; ROIs = regions of interest; LOO-CV = leave-one-out cross validation; NE = not clearly explained; NA = not available; TP = true positive; TN = true negative; FP = false positive; FN = false negative; DDSM = Digital Database for Screening Mammography; MIAS = Mammographic Image Analysis Society; MMD = mammographic mass database.
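The TP, TN, FP and FN counts in the table are the inputs from which per-model sensitivity, specificity and the diagnostic odds ratio (DOR) are derived. The following is a minimal sketch of that arithmetic, not the authors' code; the function name `diagnostic_metrics` and the 0.5 continuity correction applied to the DOR when a cell is zero are assumptions made for illustration.

```python
def diagnostic_metrics(tp, tn, fp, fn):
    """Per-model accuracy measures from a 2x2 confusion matrix."""
    sensitivity = tp / (tp + fn)            # true positive rate
    specificity = tn / (tn + fp)            # true negative rate
    accuracy = (tp + tn) / (tp + tn + fp + fn)

    # Assumption: add 0.5 to every cell when any cell is zero,
    # so that the diagnostic odds ratio stays finite.
    if 0 in (tp, tn, fp, fn):
        tp, tn, fp, fn = (v + 0.5 for v in (tp, tn, fp, fn))
    dor = (tp * tn) / (fp * fn)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "dor": dor,
    }


# Model ID 1 (Abdolmaleki 2006): TP = 16, TN = 14, FP = 8, FN = 2
m = diagnostic_metrics(16, 14, 8, 2)
print({k: round(v, 3) for k, v in m.items()})
```

For model ID 1 the computed accuracy is 0.75, which agrees with the Accuracy column for that row.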
Figure 2. Sensitivity and specificity of machine learning models in the study.
Figure 3. The diagnostic odds ratio of machine learning models in the study.
Figure 4. Hierarchical summary receiver operating characteristics (HSROC) curve for overall machine learning models in the study.
Likelihood ratio tests comparing each bivariate meta-regression model with the null model.
| Model | Covariate | χ² statistic (df) | p-value |
|---|---|---|---|
| Model 1 | Country | 19.55 (6) | 0.003 * |
| Model 2 | Source | 31.10 (12) | 0.002 * |
| Model 3 | Type of data | 4.23 (2) | 0.121 |
| Model 4 | Classifier | 30.32 (16) | 0.016 * |
* Significance at p < 0.05.
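The p-values in this table can be checked from the χ² statistics and their degrees of freedom. The sketch below uses the closed-form upper-tail probability of the chi-square distribution, which holds only for even degrees of freedom (all four models here have even df). The helper name `chi2_sf_even_df` is hypothetical, and this is a check of the reported arithmetic, not the authors' analysis code.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with even df.

    For df = 2k: P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form is only valid for positive even df")
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# (chi-square statistic, df) as reported in the table above
lrt = {
    "Country": (19.55, 6),
    "Source": (31.10, 12),
    "Type of data": (4.23, 2),
    "Classifier": (30.32, 16),
}
p_values = {name: chi2_sf_even_df(stat, df) for name, (stat, df) in lrt.items()}
for name, p in p_values.items():
    print(f"{name}: p = {p:.3f}")
```

Rounded to three decimals, the results match the tabulated p-values (0.003, 0.002, 0.121 and 0.016).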
Post hoc pairwise comparisons for the covariates country, source of data and classifier.
| Comparisons | dAUC (95% CI) | p-value |
|---|---|---|
| Country | | |
| USA vs. UK | 0.051 (0.006, 0.127) | 0.035 * |
| USA vs. others 1 | 0.095 (0.044, 0.191) | 0.001 ** |
| UK vs. others 1 | 0.044 (−0.034, 0.131) | 0.241 |
| Source of data | | |
| Primary data vs. DDSM | — † | — † |
| Primary data vs. MIAS 2 | −0.062 (−0.127, 0.023) | 0.152 |
| DDSM vs. MIAS 2 | — † | — † |
| Classifier | | |
| NN vs. DL | — † | — † |
| NN vs. Tree-based | 0.003 (−0.071, 0.138) | 0.946 |
| NN vs. KNN | 0.157 (0.026, 0.325) | 0.010 |
| NN vs. SVM | 0.033 (−0.034, 0.074) | 0.337 |
| NN vs. Bayes-based | 0.252 (0.119, 0.379) | <0.001 ** |
| DL vs. Tree-based | −0.016 (−0.122, 0.117) | 0.690 |
| DL vs. KNN | — † | — † |
| DL vs. SVM | — † | — † |
| DL vs. Bayes-based | — † | — † |
| Tree-based vs. KNN | 0.153 (−0.023, 0.333) | 0.082 |
| Tree-based vs. SVM | 0.030 (−0.101, 0.099) | 0.578 |
| Tree-based vs. Bayes-based | 0.249 (0.073, 0.395) | 0.007 ** |
| KNN vs. SVM | −0.123 (−0.300, −0.004) | 0.044 * |
| KNN vs. Bayes-based | 0.096 (−0.121, 0.265) | 0.404 |
| SVM vs. Bayes-based | 0.219 (0.094, 0.350) | <0.001 ** |
* Significance at p < 0.05; ** significance after Bonferroni correction; † non-convergence; 1 others: Iran, Portugal, Jordan, China, Korea and Serbia; 2 mini-MIAS and MIAS databases were combined into one group; dAUC = difference of the area under the curve; DDSM = Digital Database for Screening Mammography; MIAS = Mammographic Image Analysis Society; NN = neural network; DL = deep learning; KNN = k-nearest neighbor; SVM = support vector machine.
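As an illustration of the Bonferroni marker, the sketch below applies the correction to the three country comparisons. The number of comparisons the authors corrected for is not stated here, so treating the three country pairs as one family (threshold 0.05 / 3) is an assumption, and `bonferroni_flags` is a hypothetical helper name.

```python
def bonferroni_flags(p_values, alpha=0.05):
    """Flag comparisons that remain significant after Bonferroni correction."""
    # Assumption: every comparison passed in belongs to one family of tests.
    threshold = alpha / len(p_values)
    return {name: p <= threshold for name, p in p_values.items()}


# p-values for the country comparisons as reported in the table above
country = {
    "USA vs. UK": 0.035,
    "USA vs. others": 0.001,
    "UK vs. others": 0.241,
}
flags = bonferroni_flags(country)
print(flags)
```

Under this assumption only USA vs. others stays significant, which is consistent with the single ** marker in the country rows.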
Figure 5. Hierarchical summary receiver operating characteristics (HSROC) curve for each subgroup analysis in the study.
Figure 6. Deeks’ funnel plot.
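Deeks' funnel plot sets the log diagnostic odds ratio of each model against the inverse square root of its effective sample size (ESS), and the accompanying test fits a regression weighted by ESS. The sketch below computes only the weighted slope and intercept for four models taken from the characteristics table. It is illustrative: the subset, the 0.5 continuity correction and the function names are assumptions, and it does not reproduce the reported p = 0.002, which needs all 68 models and a test on the slope.

```python
import math


def ln_dor(tp, tn, fp, fn):
    """Log diagnostic odds ratio."""
    # Assumption: 0.5 continuity correction when any cell is zero.
    if 0 in (tp, tn, fp, fn):
        tp, tn, fp, fn = (v + 0.5 for v in (tp, tn, fp, fn))
    return math.log((tp * tn) / (fp * fn))


def effective_sample_size(tp, tn, fp, fn):
    """ESS = 4 * n_pos * n_neg / (n_pos + n_neg)."""
    n_pos, n_neg = tp + fn, tn + fp
    return 4 * n_pos * n_neg / (n_pos + n_neg)


def deeks_regression(models):
    """Weighted least squares of ln(DOR) on 1/sqrt(ESS), with weights ESS."""
    ws = [effective_sample_size(*m) for m in models]
    xs = [1 / math.sqrt(w) for w in ws]
    ys = [ln_dor(*m) for m in models]
    sw = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / sw
    y_bar = sum(w * y for w, y in zip(ws, ys)) / sw
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# (TP, TN, FP, FN) for model IDs 1, 4, 41 and 51
subset = [(16, 14, 8, 2), (59, 59, 1, 1), (471, 548, 71, 148), (121, 130, 20, 29)]
slope, intercept = deeks_regression(subset)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
```

A slope that differs clearly from zero suggests small-study effects, which is how asymmetry in the funnel plot is usually read.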
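Quality assessment of the included studies according to the QUADAS-2 tool.
| Study | Risk of bias: patient selection | Risk of bias: index test | Risk of bias: reference standard | Risk of bias: flow and timing | Applicability concerns: patient selection | Applicability concerns: index test | Applicability concerns: reference standard | Overall quality |
|---|---|---|---|---|---|---|---|---|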
| Abdolmaleki 2006 | Low | Unclear | Low | Low | Low | Low | Low | Good |
| Acharyau 2008 | High | Unclear | Low | Unclear | Low | Low | Low | Good |
| Al-antari 2020 | Low | Unclear | Unclear | Low | Unclear | Low | Unclear | Moderate |
| Alfifi 2020 | Unclear | Unclear | Unclear | Unclear | Low | Low | Unclear | Moderate |
| Al-hiary 2012 | High | Low | Unclear | Unclear | Unclear | Low | Unclear | Moderate |
| Al-masni 2018 | Low | Unclear | Low | Unclear | Low | Low | Low | Moderate |
| Bandeira-diniz 2018 | High | Low | Low | Unclear | Low | Low | Low | Good |
| Barkana 2017 | Unclear | Unclear | Low | Unclear | Unclear | Low | Low | Moderate |
| Biswas 2019 | Unclear | Unclear | Unclear | Unclear | Unclear | Low | Unclear | Moderate |
| Cai 2019 | Low | Low | Low | Low | Low | Low | Low | Moderate |
| Chen 2019a | Low | Unclear | Low | Low | Low | Low | Low | Moderate |
| Chen 2019b | Low | Low | Low | Low | Low | Low | Low | Good |
| Danala 2018 | Low | Low | Low | Low | Low | Low | Low | Good |
| Daniellopez-cabrera 2020 | Unclear | Unclear | Unclear | Unclear | Low | Low | Unclear | Good |
| Fathy 2019 | High | Low | Low | Unclear | Low | Low | Low | Poor |
| Girija 2019 | Unclear | Low | Unclear | Unclear | Low | Low | Low | Good |
| Jebamony 2020 | Unclear | Unclear | Unclear | High | Low | Low | Unclear | Moderate |
| Junior 2010 | High | Unclear | Unclear | High | Low | Low | Unclear | Moderate |
| Kanchanamani 2016 | Unclear | Unclear | Unclear | Unclear | Low | Low | Unclear | Moderate |
| Kim 2018 | Unclear | Low | Low | Low | Low | Low | Low | Moderate |
| Mao 2019 | Low | Unclear | Low | Low | Low | Low | Low | Moderate |
| Miao 2015 | Unclear | Unclear | Unclear | High | Low | Low | Unclear | Moderate |
| Miao 2013 | Low | Low | Unclear | High | Low | Low | Unclear | Moderate |
| Milosevic 2015 | Low | Unclear | Unclear | Unclear | Low | Low | Unclear | Moderate |
| Nithya 2012 | Unclear | Unclear | Low | Unclear | Low | Low | Low | Moderate |
| Nusantara 2016 | Unclear | Low | Unclear | Unclear | Low | Low | Low | Moderate |
| Palantei 2017 | High | Unclear | Unclear | Unclear | Low | Low | Unclear | Poor |
| Paramkusham 2018 | Unclear | Unclear | Low | Unclear | Low | Low | Low | Moderate |
| Roseline 2018 | Unclear | Unclear | Unclear | High | Low | Low | Unclear | Moderate |
| Shah 2015 | Unclear | Unclear | Unclear | Unclear | Low | Low | Unclear | Good |
| Shivhare 2020 | Unclear | Unclear | Unclear | High | Low | Low | Unclear | Good |
| Singh 2018 | Unclear | Unclear | Low | Low | Low | Low | Low | Moderate |
| Venkata 2019 | Unclear | Unclear | Unclear | Unclear | Unclear | Low | Unclear | Moderate |
| Wang 2017 | High | Unclear | Unclear | Unclear | Low | Low | Unclear | Moderate |
| Wutsqa 2017 | High | Unclear | Unclear | Unclear | Low | Low | Unclear | Moderate |
| Yousefi 2018 | Unclear | Unclear | Low | Unclear | Low | Low | Low | Moderate |