Arthur Marka, Joi B Carter, Ermal Toto, Saeed Hassanpour.
Abstract
BACKGROUND: Computer-aided diagnosis of skin lesions is a growing area of research, but its application to nonmelanoma skin cancer (NMSC) is relatively under-studied. The purpose of this review is to synthesize the research that has been conducted on automated detection of NMSC using digital images and to assess the quality of evidence for the diagnostic accuracy of these technologies.
Keywords: Artificial intelligence; Basal cell carcinoma; Computer-aided diagnosis; Image analysis; Machine learning; Nonmelanoma skin cancer; Squamous cell carcinoma
Year: 2019 PMID: 30819133 PMCID: PMC6394090 DOI: 10.1186/s12880-019-0307-7
Source DB: PubMed Journal: BMC Med Imaging ISSN: 1471-2342 Impact factor: 1.930
Fig. 1 PRISMA flow diagram of study selection. Abbreviation: PRISMA, Preferred Reporting Items for Systematic Reviews and Meta-Analyses
Levels of evidence^a
| Level of evidence | Definition |
|---|---|
| 1 | Independent, blinded comparison of the classifier with a biopsy-proven standard among a large number of consecutive lesions suspected of being the target condition |
| 2 | Independent, blinded comparison of the classifier with a biopsy-proven standard among a small number of consecutive lesions suspected of being the target condition |
| 3 | Independent, blinded comparison of the classifier with a biopsy-proven standard among non-consecutive lesions suspected of being the target condition |
| 4 | Non-independent comparison of the classifier with a biopsy-proven standard among obvious examples of the target condition plus benign lesions |
| 5 | Non-independent comparison of the classifier with a standard of uncertain validity |
^a Modified from Simel and Rennie [12]
Overview of literature search
| Source | Target NMSC | Digital Image Modality | Database | Algorithm | Outcome | Quality Rating^a |
|---|---|---|---|---|---|---|
| Abbas, 2016 [ | BCC; CSCC | Dermoscopic | 30 BCCs, 30 CSCCs, 300 various other lesions (30% of dataset used for training and 70% for testing)^b | ANN | AUROC: 0.92 (BCC), 0.95 (CSCC); Sensitivity: 97% (BCC), 97% (CSCC); Specificity: 68% (BCC), 80% (CSCC) | 5 |
| Ballerini, 2012 [ | BCC; CSCC | Non-dermoscopic | 239 BCCs, 88 CSCCs, 633 benign lesions (3-fold cross-validation) | k-NN^c | Accuracy: 89.7%^d; Sensitivity: 89.9%^d; Specificity: 89.6%^d | 3 |
| Chang, 2013 [ | BCC; CSCC | Non-dermoscopic | 110 BCCs, 20 CSCCs, 639 various other lesions (leave-one-out cross-validation)^b | MSVM | Sensitivity: 90% (BCC), 80% (CSCC) | 2 |
| Cheng, 2011 [ | BCC | Dermoscopic | 59 BCCs, 152 benign lesions (leave-one-out cross-validation) | ANN | AUROC: 0.967 | 4 |
| Cheng, 2012 [ | BCC | Dermoscopic | 263 BCCs, 226 benign lesions (10-fold cross-validation) | ANN^c | AUROC: 0.846 | 4 |
| Cheng, 2013 [ | BCC | Dermoscopic | 35 BCCs, 79 benign lesions (leave-one-out cross-validation) | ANN | AUROC: 0.902 | 4 |
| Cheng, 2013 [ | BCC | Dermoscopic | 350 BCCs, 350 benign lesions (10-fold cross-validation) | ANN^c | AUROC: 0.981 | 2 |
| Choudhury, 2015 [ | BCC; CSCC | Dermoscopic; Non-dermoscopic | 359 BCCs, CSCCs, MMs, and AKs (40 from each class randomly chosen for training; remainder used for testing)^b | MSVM^c | Accuracy: 94.6% (BCC), 92.9% (CSCC) | 5 |
| Chuang, 2011 [ | BCC | Non-dermoscopic | 84 BCCs, 235 benign lesions (3-fold cross-validation) | ANN | Accuracy: 95.0%; Sensitivity: 94.4%; Specificity: 95.2% | 3 |
| Dorj, 2018 [ | BCC; CSCC | Non-dermoscopic | Training: 728 BCCs, 777 CSCCs, 768 MMs, and 712 AKs; Testing: 193 BCCs, 200 CSCCs, 190 MMs, and 185 AKs | ANN | Accuracy: 91.8% (BCC), 95.1% (CSCC); Sensitivity: 97.7% (BCC), 96.9% (CSCC); Specificity: 86.7% (BCC), 94.1% (CSCC) | 5 |
| Esteva, 2017 [ | BCC; CSCC | Dermoscopic; Non-dermoscopic | Training: 127463 various lesions (9-fold cross-validation); Testing: 450 BCCs and CSCCs, 257 SKs | ANN | AUROC: 0.96 | 3 |
| Ferris, 2015 [ | BCC; CSCC | Dermoscopic | 11 BCCs, 3 CSCCs, 39 MMs, 120 benign lesions (half used for training and half for testing) | Decision forest classifier | Sensitivity: 78.6% | 2 |
| Fujisawa, 2018 [ | BCC; CSCC | Non-dermoscopic | Training: 974 BCCs, 840 CSCCs, 3053 various other lesions; Testing: 249 BCCs, 189 CSCCs, 704 various other lesions^b | ANN | Sensitivity: 80.3% (BCC), 82.5% (CSCC) | 3 |
| Guvenc, 2013 [ | BCC | Dermoscopic | 68 BCCs, 131 benign lesions (no cross-validation) | Logistic regression | Accuracy: 96.5%; AUROC: 0.988 | 4 |
| Han, 2018 [ | BCC; CSCC | Non-dermoscopic | Training: 19398 various lesions; Testing: 499 BCCs, 211 CSCCs, 2018 various other lesions^b,e | ANN | AUROC: 0.96 (BCC), 0.91 (CSCC); Sensitivity: 88.8% (BCC), 90.2% (CSCC); Specificity: 91.7% (BCC), 80.0% (CSCC) | 3 |
| Immagulate, 2015 [ | BCC; CSCC | Non-dermoscopic | 100 BCCs, 100 CSCCs, 100 AKs, 100 SKs, 100 nevi (10-fold cross-validation) | MSVM^c | Accuracy: 93% | 5 |
| Kefel, 2012 [ | BCC | Dermoscopic | 49 BCCs, 153 benign lesions (leave-one-out cross-validation) | ANN | AUROC: 0.925 | 4 |
| Kefel, 2016 [ | BCC | Dermoscopic | Training: 100 BCCs, 254 benign lesions; Testing: 304 BCCs, 720 benign lesions | Logistic regression | AUROC: 0.878 | 2 |
| Kharazmi, 2011 [ | BCC | Dermoscopic | 299 BCCs, 360 benign lesions (no cross-validation) | Random forest classifier | AUROC: 0.903 | 4 |
| Kharazmi, 2016 [ | BCC | Dermoscopic | 299 BCCs, 360 benign lesions (no cross-validation) | Random forest classifier | AUROC: 0.965 | 4 |
| Kharazmi, 2017 [ | BCC | Dermoscopic | Training: 149 BCCs, 300 benign lesions; Testing: 150 BCCs, 300 benign lesions | ANN | AUROC: 0.911; Sensitivity: 85.3%; Specificity: 94.0% | 3 |
| Kharazmi, 2018 [ | BCC | Dermoscopic | 295 BCCs; 369 benign lesions (10-fold cross-validation) | Random forest classifier | AUROC: 0.832; Sensitivity: 74.9%; Specificity: 77.8% | 3 |
| Lee, 2018 [ | BCC | Non-dermoscopic | Training: 463 BCCs, 1914 various lesions; Testing: 51 BCCs, 950 various lesions^b | ANN | Sensitivity: 91% | 3 |
| Maurya, 2014 [ | BCC; CSCC | Dermoscopic; Non-dermoscopic | 84 BCCs, 101 CSCCs, 77 MMs, 101 AKs (75 from each class used for training; remainder used for testing)^b | MSVM | Accuracy: 83.3% (BCC), 84.1% (CSCC) | 5 |
| Mishra, 2017 [ | BCC | Dermoscopic | 305 BCCs, 718 benign lesions (leave-one-out cross-validation) | Logistic regression | Accuracy: 72%^f | 3 |
| Møllersen, 2015 [ | BCC; CSCC | Dermoscopic | Training: 37 MMs, 169 various lesions^g; Testing: 71 BCCs, 7 CSCCs, 799 various lesions^b | Hybrid model of linear and quadratic classifiers^c | Sensitivity: 100%; Specificity: 12% | 2 |
| Shakya, 2012 [ | CSCC | Dermoscopic | 53 CSCCs, 53 SKs (no cross-validation) | Logistic regression | AUROC: 0.991 | 4 |
| Shimizu, 2014 [ | BCC | Dermoscopic | 69 BCCs, 105 MMs, 790 benign lesions (10-fold cross-validation)^b | Layered model of linear classifiers^c | Sensitivity: 82.6% | 3 |
| Shoieb, 2016 [ | BCC | Non-dermoscopic | Training: 84 NMSCs, 119 MMs; Testing: 64 BCCs, 72 MMs, 74 eczema, 31 impetigo | MSVM | Accuracy: 96.2%; Specificity: 96.0%; Sensitivity: 88.9% | 5 |
| Stoecker, 2009 [ | BCC | Dermoscopic | 42 BCCs, 168 various lesions (leave-one-out cross-validation)^b | ANN | AUROC: 0.951 | 2 |
| Sumithra, 2015 [ | CSCC | Non-dermoscopic | 31 CSCCs, 31 MMs, 33 SKs, 26 bullae, 20 shingles (70% used for training; remainder used for testing)^b | Hybrid model of MSVM and k-NN classifiers^c | F-measure: 0.581 | 5 |
| Upadhyay, 2018 [ | BCC; CSCC | Non-dermoscopic | 239 BCCs, 88 CSCCs, 973 various lesions (24 from each class used for training; remainder used for testing)^b | ANN | Accuracy: 96.6% (BCC), 81.2% (CSCC); Sensitivity: 96.8% (BCC), 80.5% (CSCC) | 3 |
| Wahab, 2003 [ | BCC | Non-dermoscopic | 54 BCCs, 54 DLE, 54 AV (34 from each class used for training; remainder used for testing) | ANN | Sensitivity: 90% | 5 |
| Wahba, 2017 [ | BCC | Dermoscopic | 29 BCCs, 27 nevi (46 total used for training and 10 for testing) | MSVM | Accuracy: 100%; Sensitivity: 100%; Specificity: 100% | 5 |
| Wahba, 2018 [ | BCC | Dermoscopic | 300 BCCs, 300 MMs, 300 nevi, 300 SKs (5-fold cross-validation)^b | MSVM | AUROC: Sensitivity: 100%; Specificity: 100% | 3 |
| Yap, 2018 [ | BCC | Dermoscopic; Non-dermoscopic | 647 BCCs, 2270 various lesions (5-fold cross-validation)^b | ANN | Accuracy: 91.8%; Sensitivity: 90.6%; Specificity: 92.3% | 3 |
| Zhang, 2017 [ | BCC | Dermoscopic | 132 BCCs, 132 nevi, 132 SKs, 132 psoriasis (80% used for training; remainder used for testing) | ANN | Accuracy: 92.4%^d; Sensitivity: 85%^d; Specificity: 94.8%^d | 3 |
| Zhang, 2018 [ | BCC | Dermoscopic | 132 BCCs, 132 nevi, 132 SKs, 132 psoriasis (10-fold cross-validation) | ANN | Accuracy: 94.3%^d; Sensitivity: 88.2%^d; Specificity: 96.1%^d | 3 |
| Zhou, 2017 [ | BCC | Dermoscopic | Training: 154 BCCs, 10,262 benign lesions; Testing: 50 BCCs, 1100 benign lesions | ANN | Accuracy: 96.8%^d; Sensitivity: 38%; Specificity: 99.5%^d | 3 |
Abbreviations: AK, Actinic keratosis; ANN, Artificial neural network; AUROC, Area under receiver operating characteristic; BCC, Basal cell carcinoma; CSCC, Cutaneous squamous cell carcinoma; k-NN, k-nearest neighbors; MM, Malignant melanoma; NMSC, Non-melanoma skin cancer; MSVM, Multiclass support vector machine; SK, Seborrheic keratosis
^a Quality rating modified from Simel and Rennie [12]
^b Competitive set included both benign and malignant lesions
^c Study tested multiple classifiers; only the classifier that achieved the highest performance has been reported
^d Figures are derived from confusion matrix values and represent pooled BCC and CSCC classification in studies that tested both
^e Total test set was derived from three different datasets (“Asan,” “Hallym,” “Edinburgh”), one of which was chronologically assorted and partitioned such that the oldest 90% of images were used for training and the newest 10% for testing [23]. However, the number of lesions was provided only for the unpartitioned Asan dataset. Thus, we have estimated the total number of test lesions as 10% of the individual lesion classes in the unpartitioned Asan dataset plus all lesions in the Hallym and Edinburgh datasets
^f Figure represents approximation from histogram
^g Training set represents figures provided in a previous study by the experimenters [58]. The classifier has not been retrained [20]
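Footnote d notes that some accuracy, sensitivity, and specificity figures were derived from confusion matrix values, with BCC and CSCC pooled where both were tested. The following is a minimal Python sketch of one way such a derivation could be done; it is not taken from the review. The function names, the dictionary layout (true label, then predicted label, then count), and all counts are hypothetical and do not come from any included study.

```python
from typing import Dict, Set, Tuple


def pool_to_binary(
    confusion: Dict[str, Dict[str, int]], targets: Set[str]
) -> Tuple[int, int, int, int]:
    """Collapse a multiclass confusion matrix into (tp, fn, fp, tn).

    Any class listed in `targets` counts as positive, so a BCC lesion
    predicted as CSCC is still a true positive in the pooled view.
    """
    tp = fn = fp = tn = 0
    for true_label, row in confusion.items():
        for predicted_label, count in row.items():
            truly_positive = true_label in targets
            called_positive = predicted_label in targets
            if truly_positive and called_positive:
                tp += count
            elif truly_positive:
                fn += count
            elif called_positive:
                fp += count
            else:
                tn += count
    return tp, fn, fp, tn


def diagnostic_metrics(tp: int, fn: int, fp: int, tn: int) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity from 2x2 counts."""
    positives = tp + fn
    negatives = fp + tn
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative lesion")
    return {
        "accuracy": (tp + tn) / (positives + negatives),
        "sensitivity": tp / positives,
        "specificity": tn / negatives,
    }


# Hypothetical counts, for illustration only.
example = {
    "BCC": {"BCC": 40, "CSCC": 5, "benign": 5},
    "CSCC": {"BCC": 4, "CSCC": 14, "benign": 2},
    "benign": {"BCC": 6, "CSCC": 4, "benign": 90},
}
pooled = pool_to_binary(example, {"BCC", "CSCC"})
metrics = diagnostic_metrics(*pooled)
```

With these made-up counts the pooled 2x2 table is 63 true positives, 7 false negatives, 10 false positives, and 90 true negatives. Pooling this way counts confusion between BCC and CSCC as correct, which is one reason pooled figures can exceed per-class ones.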
QUADAS-2 summary
| Source | Risk of Bias: Patient Selection | Risk of Bias: Index Test | Risk of Bias: Reference Standard | Risk of Bias: Flow and Timing | Applicability Concerns: Patient Selection | Applicability Concerns: Index Test | Applicability Concerns: Reference Standard |
|---|---|---|---|---|---|---|---|
| Abbas, 2016 [ | High | Low | High | Low | Low | High | High |
| Ballerini, 2012 [ | High | Low | Low | Low | Low | Low | Low |
| Chang, 2013 [ | Low | Low | Low | Low | Low | Low | Low |
| Cheng, 2011 [ | Low | Low | Low | Low | High | High | Low |
| Cheng, 2012 [ | Low | Low | Low | Low | High | High | Low |
| Cheng, 2013 [ | Low | Low | Low | Low | High | High | Low |
| Cheng, 2013 [ | Unclear | Low | Low | Low | Low | Low | Low |
| Choudhury, 2015 [ | High | Low | High | Low | High | High | High |
| Chuang, 2011 [ | High | Low | Low | Low | Low | Low | Low |
| Dorj, 2018 [ | High | Low | High | Low | Low | High | High |
| Esteva, 2017 [ | High | Low | Low | Low | Low | Low | Low |
| Ferris, 2015 [ | Low | Low | Low | Low | Low | High | Low |
| Fujisawa, 2018 [ | High | Low | Low | Low | Low | High | Low |
| Guvenc, 2013 [ | High | High | Low | Low | High | High | Low |
| Han, 2018 [ | High | Low | Low | Low | Low | High | Low |
| Immagulate, 2015 [ | High | Low | High | Low | Low | High | High |
| Kefel, 2012 [ | High | Low | Low | Low | High | High | Low |
| Kefel, 2016 [ | Low | Low | Low | Low | Low | Low | Low |
| Kharazmi, 2011 [ | High | High | Low | Low | High | High | Low |
| Kharazmi, 2016 [ | High | High | Low | Low | High | High | Low |
| Kharazmi, 2017 [ | High | Low | Low | Low | Low | Low | Low |
| Kharazmi, 2018 [ | High | Low | Low | Low | Low | Low | Low |
| Lee, 2018 [ | High | Low | Low | Low | Low | High | Low |
| Maurya, 2014 [ | High | Low | High | Low | High | High | High |
| Mishra, 2017 [ | Unclear | Low | Low | Low | Low | Low | Low |
| Møllersen, 2015 [ | Low | Low | Low | Low | Low | Low | Low |
| Shakya, 2012 [ | High | High | Low | Low | High | High | Low |
| Shimizu, 2014 [ | High | Low | Low | Low | Low | High | Low |
| Shoieb, 2016 [ | High | Low | High | Low | Low | High | High |
| Stoecker, 2009 [ | Low | Low | Low | Low | Low | High | Low |
| Sumithra, 2015 [ | High | Low | High | Low | High | High | High |
| Upadhyay, 2018 [ | High | Low | Low | Low | Low | Low | Low |
| Wahab, 2003 [ | High | Low | High | Low | Low | High | High |
| Wahba, 2017 [ | High | Low | High | Low | Low | High | High |
| Wahba, 2018 [ | High | Low | Low | Low | Low | Low | Low |
| Yap, 2018 [ | High | Low | Low | Low | Low | High | Low |
| Zhang, 2017 [ | High | Low | Low | Low | Low | Low | Low |
| Zhang, 2018 [ | High | Low | Low | Low | Low | Low | Low |
| Zhou, 2017 [ | High | Low | Low | Low | Low | Low | Low |
Abbreviation: QUADAS-2, The Quality Assessment of Diagnostic Accuracy Studies [13]
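The ratings in the QUADAS-2 summary can be tallied per domain to see where concerns concentrate. The sketch below is a minimal illustration in Python, using only the first three rows of the table as input. The variable names, domain labels, and the `tally` function are illustrative assumptions, not part of the review.

```python
from collections import Counter
from typing import Dict, List

# Column order as in the QUADAS-2 summary: four risk-of-bias domains,
# then three applicability domains.
DOMAINS = [
    "bias: patient selection",
    "bias: index test",
    "bias: reference standard",
    "bias: flow and timing",
    "applicability: patient selection",
    "applicability: index test",
    "applicability: reference standard",
]

# First three rows of the table, copied as-is.
RATINGS: Dict[str, List[str]] = {
    "Abbas, 2016": ["High", "Low", "High", "Low", "Low", "High", "High"],
    "Ballerini, 2012": ["High", "Low", "Low", "Low", "Low", "Low", "Low"],
    "Chang, 2013": ["Low", "Low", "Low", "Low", "Low", "Low", "Low"],
}


def tally(ratings: Dict[str, List[str]]) -> Dict[str, Counter]:
    """Count how often each rating appears in each QUADAS-2 domain."""
    counts = {domain: Counter() for domain in DOMAINS}
    for study, row in ratings.items():
        if len(row) != len(DOMAINS):
            raise ValueError(f"{study}: expected {len(DOMAINS)} ratings")
        for domain, rating in zip(DOMAINS, row):
            counts[domain][rating] += 1
    return counts


summary = tally(RATINGS)
for domain in DOMAINS:
    print(domain, dict(summary[domain]))
```

Extending `RATINGS` with the remaining rows would give the full per-domain counts. A `Counter` is used so that ratings such as "Unclear" are counted without any special handling.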