Brigid Betz-Stablein [1,2], Brian D'Alessandro [3], Uyen Koh [2], Elsemieke Plasmeijer [1,4], Monika Janda [5], Scott W Menzies [6,7], Rainer Hofmann-Wellenhof [8], Adele C Green [1,9], H Peter Soyer [2,10].
1. QIMR Berghofer Medical Research Institute, Cancer and Population Studies, Brisbane, Queensland, Australia.
2. The University of Queensland Diamantina Institute, The University of Queensland, Dermatology Research Centre, Brisbane, Queensland, Australia.
3. Canfield Scientific Inc., Fairfield, New Jersey, USA.
4. Netherlands Cancer Institute, Dermatology Department, Amsterdam, The Netherlands.
5. Centre of Health Services Research, Faculty of Medicine, The University of Queensland, Brisbane, Queensland, Australia.
6. Sydney Medical School, The University of Sydney, Camperdown, New South Wales, Australia.
7. Sydney Melanoma Diagnostic Centre, Royal Prince Alfred Hospital, Camperdown, New South Wales, Australia.
8. Department of Dermatology, Medical University of Graz, Graz, Austria.
9. CRUK Manchester Institute and University of Manchester, Manchester Academic Health Sciences Centre, Manchester, United Kingdom.
10. Dermatology Department, Princess Alexandra Hospital, Brisbane, Queensland, Australia.
Abstract
BACKGROUND: The number of naevi on a person is the strongest risk factor for melanoma; however, naevus counts are highly variable because of inconsistent methodology and poor inter-rater agreement. Machine learning has been shown to be a valuable tool for image classification in dermatology.

OBJECTIVES: To test whether automated, reproducible naevus counts are possible through the combination of convolutional neural networks (CNNs) and three-dimensional (3D) total body imaging.

METHODS: Total body images from a study of naevi in the general population were used as the training (82 subjects; 57,742 lesions) and testing (10 subjects; 4,868 lesions) datasets for the development of a CNN. Lesions were labelled as naevi or not ("non-naevi") by a senior dermatologist, whose labels served as the gold standard. Performance of the CNN was assessed using sensitivity, specificity, and Cohen's kappa, and was evaluated at both the lesion level and the person level.

RESULTS: Lesion-level analysis comparing the automated counts to the gold standard showed a sensitivity and specificity of 79% (76-83%) and 91% (90-92%), respectively, for lesions ≥2 mm, and 84% (75-91%) and 91% (88-94%) for lesions ≥5 mm. Cohen's kappa was 0.56 (0.53-0.59), indicating moderate agreement, for naevi ≥2 mm, and 0.72 (0.63-0.80), indicating substantial agreement, for naevi ≥5 mm. For the 10 individuals in the test set, person-level agreement was assessed categorically, with 70% agreement between the automated and gold-standard counts. Agreement was lower in subjects with numerous seborrhoeic keratoses.

CONCLUSION: Automated naevus counts that agree reasonably well with those of an expert clinician are possible through the combination of 3D total body photography and CNNs. Such an algorithm may provide a faster, more reproducible alternative to traditional in-person total body naevus counts.
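The three lesion-level metrics named in the methods can be illustrated with a minimal Python sketch. It assumes each lesion has a binary gold-standard label and a binary automated label, summarised as a 2x2 confusion table. The function name `binary_agreement` and the counts used below are hypothetical and are not taken from the study data; the study's own software and confidence-interval method are not described in the abstract.

```python
def binary_agreement(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Sensitivity, specificity, and Cohen's kappa from a 2x2 confusion table.

    tp: naevi (gold standard) that the classifier also called naevi
    fp: non-naevi that the classifier called naevi
    fn: naevi that the classifier called non-naevi
    tn: non-naevi that the classifier also called non-naevi
    """
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)

    # Observed agreement: proportion of lesions where both raters agree.
    p_observed = (tp + tn) / n
    # Agreement expected by chance, from the marginal totals of each rater.
    p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (p_observed - p_expected) / (1 - p_expected)

    return {"sensitivity": sensitivity, "specificity": specificity, "kappa": kappa}


# Hypothetical counts chosen only for illustration (not the study's data).
result = binary_agreement(tp=80, fp=90, fn=20, tn=910)
print(result)
```

With these made-up counts, sensitivity and specificity are both high while kappa is only moderate. This reflects how kappa discounts the agreement expected by chance when non-naevi greatly outnumber naevi.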
Authors: Katie J Lee; Brigid Betz-Stablein; Mitchell S Stark; Monika Janda; Aideen M McInerney-Leo; Liam J Caffery; Nicole Gillespie; Tatiane Yanes; H Peter Soyer Journal: Front Med (Lausanne) Date: 2022-01-17