Vincent Dick (1), Christoph Sinz (1), Martina Mittlböck (2), Harald Kittler (1), Philipp Tschandl (1).
Affiliations: (1) ViDIR Group, Department of Dermatology, Medical University of Vienna, Vienna, Austria. (2) Center for Medical Statistics, Informatics and Intelligent Systems, Medical University of Vienna, Vienna, Austria.
Abstract
IMPORTANCE: Recent advances in machine learning have raised expectations that computer-aided diagnosis will become the standard for the diagnosis of melanoma.
OBJECTIVE: To critically review the current literature and compare the diagnostic accuracy of computer-aided diagnosis with that of human experts.
DATA SOURCES: The MEDLINE, arXiv, and PubMed Central databases were searched to identify eligible studies published between January 1, 2002, and December 31, 2018.
STUDY SELECTION: Studies that reported on the accuracy of automated systems for melanoma were selected. Search terms included melanoma, diagnosis, detection, computer aided, and artificial intelligence.
DATA EXTRACTION AND SYNTHESIS: Risk of bias was evaluated using the QUADAS-2 tool, and quality assessment was based on predefined criteria. Data were analyzed from February 1 to March 10, 2019.
MAIN OUTCOMES AND MEASURES: Summary estimates of sensitivity and specificity and summary receiver operating characteristic curves were the primary outcomes.
RESULTS: The literature search yielded 1694 potentially eligible studies, of which 132 were included and 70 offered sufficient information for a quantitative analysis. Most studies came from the field of computer science. Prospective clinical studies were rare. Combining the results for automated systems gave a melanoma sensitivity of 0.74 (95% CI, 0.66-0.80) and a specificity of 0.84 (95% CI, 0.79-0.88). Sensitivity was lower in studies that used independent test sets than in those that did not (0.51; 95% CI, 0.34-0.69 vs 0.82; 95% CI, 0.77-0.86; P < .001); however, the specificity was similar (0.83; 95% CI, 0.71-0.91 vs 0.85; 95% CI, 0.80-0.88; P = .67). Compared with dermatologists' diagnosis, computer-aided diagnosis showed similar sensitivity and a specificity that was 10 percentage points lower, but the difference was not statistically significant. Studies were heterogeneous, and substantial risk of bias was found in all but 4 of the 70 studies included in the quantitative analysis.
CONCLUSIONS AND RELEVANCE: Although the accuracy of computer-aided diagnosis for melanoma detection is comparable to that of experts, the real-world applicability of these systems is unknown and potentially limited owing to overfitting and the risk of bias of the studies at hand.
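For readers less familiar with the two primary outcome measures, the following is a minimal sketch of how sensitivity and specificity are computed for a single study from its confusion-matrix counts. The function name and the counts are hypothetical and are not taken from any included study; the counts were chosen only so that the ratios equal 0.74 and 0.84. The summary estimates in the abstract come from a meta-analytic model across studies, not from a simple ratio like this.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp: melanomas correctly flagged      fn: melanomas missed
    tn: benign lesions correctly cleared fp: benign lesions flagged
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one melanoma and one benign lesion")
    sensitivity = tp / (tp + fn)  # share of melanomas detected
    specificity = tn / (tn + fp)  # share of benign lesions correctly cleared
    return sensitivity, specificity


# Hypothetical counts for illustration only (50 melanomas, 200 benign lesions).
sens, spec = sensitivity_specificity(tp=37, fn=13, tn=168, fp=32)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

A difference stated in percentage points, such as the 10-point gap in specificity mentioned above, is the plain difference between two such proportions multiplied by 100.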
Authors: Andreas Blum; Friedrich A Bahmer; Jürgen Bauer; Ralph P Braun; Brigitte Coras-Stepanek; Teresa Deinlein; Thomas Eigentler; Christine Fink; Claus Garbe; Holger A Haenssle; Rainer Hofmann-Wellenhof; Harald Kittler; Jürgen Kreusch; Hubert Pehamberger; Hans Schulz; H Peter Soyer; Wilhelm Stolz; Philipp Tschandl; Iris Zalaudek Journal: Hautarzt Date: 2019-11 Impact factor: 0.751
Authors: C Muñoz-López; C Ramírez-Cornejo; M A Marchetti; S S Han; P Del Barrio-Díaz; A Jaque; P Uribe; D Majerson; M Curi; C Del Puerto; F Reyes-Baraona; R Meza-Romero; J Parra-Cares; P Araneda-Ortega; M Guzmán; R Millán-Apablaza; M Nuñez-Mora; K Liopyris; C Vera-Kellet; C Navarrete-Dechent Journal: J Eur Acad Dermatol Venereol Date: 2020-11-22 Impact factor: 6.166
Authors: Albert T Young; Kristen Fernandez; Jacob Pfau; Rasika Reddy; Nhat Anh Cao; Max Y von Franque; Arjun Johal; Benjamin V Wu; Rachel R Wu; Jennifer Y Chen; Raj P Fadadu; Juan A Vasquez; Andrew Tam; Michael J Keiser; Maria L Wei Journal: NPJ Digit Med Date: 2021-01-21
Authors: Julia Höhn; Achim Hekler; Eva Krieghoff-Henning; Jakob Nikolas Kather; Jochen Sven Utikal; Friedegund Meier; Frank Friedrich Gellrich; Axel Hauschild; Lars French; Justin Gabriel Schlager; Kamran Ghoreschi; Tabea Wilhelm; Heinz Kutzner; Markus Heppt; Sebastian Haferkamp; Wiebke Sondermann; Dirk Schadendorf; Bastian Schilling; Roman C Maron; Max Schmitt; Tanja Jutzi; Stefan Fröhling; Daniel B Lipka; Titus Josef Brinker Journal: J Med Internet Res Date: 2021-07-02 Impact factor: 5.428