Julia K Winkler (1), Christine Fink (1), Ferdinand Toberer (1), Alexander Enk (1), Teresa Deinlein (2), Rainer Hofmann-Wellenhof (2), Luc Thomas (3), Aimilios Lallas (4), Andreas Blum (5), Wilhelm Stolz (6), Holger A Haenssle (1).
Affiliations: (1) Department of Dermatology, University of Heidelberg, Heidelberg, Germany. (2) Department of Dermatology and Venerology, Medical University of Graz, Graz, Austria. (3) Department of Dermatology, Lyon Sud University Hospital, Hospices Civils de Lyon, Pierre Bénite, France. (4) First Department of Dermatology, Aristotle University of Thessaloniki, Thessaloniki, Greece. (5) Public, Private, and Teaching Practice, Konstanz, Germany. (6) Department of Dermatology, Allergology and Environmental Medicine II, Klinik Thalkirchnerstraße, Munich, Germany.
Abstract
IMPORTANCE: Deep learning convolutional neural networks (CNNs) have shown a performance at the level of dermatologists in the diagnosis of melanoma. Accordingly, further exploring the potential limitations of CNN technology before broadly applying it is of special interest.
OBJECTIVE: To investigate the association between gentian violet surgical skin markings in dermoscopic images and the diagnostic performance of a CNN approved for use as a medical device in the European market.
DESIGN AND SETTING: A cross-sectional analysis was conducted from August 1, 2018, to November 30, 2018, using a CNN architecture trained with more than 120 000 dermoscopic images of skin neoplasms and corresponding diagnoses. The association of gentian violet skin markings in dermoscopic images with the performance of the CNN was investigated in 3 image sets of 130 melanocytic lesions each (107 benign nevi, 23 melanomas).
EXPOSURES: The same lesions were sequentially imaged with and without the application of a gentian violet surgical skin marker and then evaluated by the CNN for their probability of being a melanoma. In addition, the markings were removed by manually cropping the dermoscopic images to focus on the melanocytic lesion.
MAIN OUTCOMES AND MEASURES: Sensitivity, specificity, and area under the curve (AUC) of the receiver operating characteristic (ROC) curve for the CNN's diagnostic classification in unmarked, marked, and cropped images.
RESULTS: In all, 130 melanocytic lesions (107 benign nevi and 23 melanomas) were imaged. In unmarked lesions, the CNN achieved a sensitivity of 95.7% (95% CI, 79%-99.2%) and a specificity of 84.1% (95% CI, 76.0%-89.8%). The ROC AUC was 0.969. In marked lesions, an increase in melanoma probability scores was observed that resulted in a sensitivity of 100% (95% CI, 85.7%-100%) and a significantly reduced specificity of 45.8% (95% CI, 36.7%-55.2%, P < .001). The ROC AUC was 0.922. Cropping images led to the highest sensitivity of 100% (95% CI, 85.7%-100%), specificity of 97.2% (95% CI, 92.1%-99.0%), and ROC AUC of 0.993. Heat maps created by vanilla gradient descent backpropagation indicated that the blue markings were associated with the increased false-positive rate.
CONCLUSIONS AND RELEVANCE: This study's findings suggest that skin markings significantly interfered with the CNN's correct diagnosis of nevi by increasing the melanoma probability scores and consequently the false-positive rate. A predominance of skin markings in melanoma training images may have induced the CNN's association of markings with a melanoma diagnosis. Accordingly, these findings suggest that skin markings should be avoided in dermoscopic images intended for analysis by a CNN.
TRIAL REGISTRATION: German Clinical Trial Register (DRKS) Identifier: DRKS00013570.
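To make the reported sensitivity and specificity figures easier to check, the following is a minimal sketch of how such proportions and their 95% confidence intervals could be computed. Two things here are assumptions, not statements from the abstract: the per-condition counts in `COUNTS` are back-calculated from the reported percentages (for example, 22 of 23 melanomas for 95.7%), and the Wilson score interval is used because the abstract does not name its interval method. The names `wilson_interval`, `summarize`, and `COUNTS` are hypothetical.

```python
from math import sqrt


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    z2 = z * z
    denom = 1.0 + z2 / total
    center = (p + z2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z2 / (4 * total * total)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half), min(1.0, center + half)


# ASSUMPTION: counts inferred from the reported percentages, with
# 23 melanomas and 107 benign nevi per image set. They are not given
# explicitly in the abstract.
COUNTS = {
    "unmarked": {"tp": 22, "melanomas": 23, "tn": 90, "nevi": 107},
    "marked": {"tp": 23, "melanomas": 23, "tn": 49, "nevi": 107},
    "cropped": {"tp": 23, "melanomas": 23, "tn": 104, "nevi": 107},
}


def summarize(counts):
    """Return (estimate, ci_low, ci_high) for sensitivity and specificity."""
    sens = counts["tp"] / counts["melanomas"]
    spec = counts["tn"] / counts["nevi"]
    return {
        # Sensitivity: share of melanomas classified as melanoma.
        "sensitivity": (sens, *wilson_interval(counts["tp"], counts["melanomas"])),
        # Specificity: share of benign nevi classified as benign.
        "specificity": (spec, *wilson_interval(counts["tn"], counts["nevi"])),
    }


if __name__ == "__main__":
    for condition, counts in COUNTS.items():
        for metric, (est, lo, hi) in summarize(counts).items():
            print(f"{condition:9s} {metric:11s} "
                  f"{est * 100:5.1f}% (95% CI, {lo * 100:.1f}%-{hi * 100:.1f}%)")
```

Under these assumed counts, the false-positive rate is simply 1 minus specificity, so the drop in specificity from 84.1% to 45.8% in marked images corresponds to the false-positive rate rising from roughly 16% to roughly 54%.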