Precision Diagnosis Of Melanoma And Other Skin Lesions From Digital Images.

Abhishek Bhattacharya^1, Albert Young^1,2, Andrew Wong^1,2, Simone Stalling^1, Maria Wei^3,4, Dexter Hadley^1,2,3.

Abstract

Melanoma will account for an estimated 73,000 new cases and 9,000 deaths this year, yet precise diagnosis remains a serious problem. Without early detection and preventative care, melanoma can quickly spread from a once localized skin lesion (Stage IA 5-year survival rate is 97%) to become fatal (Stage IV 5-year survival rate is 10-20%). There is no biomarker for melanoma in clinical use, and the current diagnostic criteria for skin lesions remain subjective and imprecise. Accurate diagnosis of melanoma relies on a histopathologic gold standard; thus, aggressive excision of melanocytic skin lesions has been the mainstay of treatment. It is estimated that 36 biopsies are performed for every melanoma confirmed by pathology among excised lesions. Misdiagnosing melanoma carries significant morbidity: progression of the disease after a false negative prediction, or the risks of unnecessary surgery after a false positive prediction. Every year, poor diagnostic precision adds an estimated $673 million in overall cost to manage the disease. Currently, manual dermatoscopic imaging is the standard of care in selecting atypical skin lesions for biopsy, and at best it achieves 90% sensitivity but only 59% specificity when performed by an expert dermatologist. Many computer vision (CV) algorithms perform better than dermatologists in classifying skin lesions, although not significantly so in clinical practice. Meanwhile, open source deep learning (DL) techniques in CV have been gaining dominance since 2012 for image classification, and today DL can outperform humans in classifying millions of digital images with error rates below 5%. Moreover, DL algorithms are readily run on commoditized hardware and have a strong online community of developers supporting their rapid adoption. In this work, we performed a successful pilot study to show proof of concept for DL-based prediction of skin pathology from images.
However, DL algorithms must be trained on very large labelled datasets of images to achieve high accuracy. Here, we begin to assemble a large imageset of skin lesions from the UCSF and the San Francisco Veterans Affairs Medical Center (VAMC) dermatology clinics that are well characterized by their underlying pathology, on which to train DL algorithms. We hypothesize that, if trained on sufficient data, our approach will significantly outperform general dermatologists in predicting skin lesion pathology. We posit that our work will allow for precision diagnosis of melanoma from widely available digital photography, which may optimize the management of the disease by decreasing unnecessary office visits and the significant morbidity and cost of melanoma misdiagnosis.

Year: 2017 PMID: 28815132 PMCID: PMC5543387

Source DB: PubMed Journal: AMIA Jt Summits Transl Sci Proc

Background and Significance

Melanoma misdiagnosis is a significant public health problem

Over the last two decades the number of patients in the United States diagnosed with melanoma has steadily risen to make it the fifth most common cancer in the nation. This year alone, an estimated 73,000 new cases and 9,000 deaths are expected to occur due to the disease [1]. Although early stages are highly survivable (Stage IA 5-year survival rate is 97%), without early detection and preventative care, melanoma can quickly spread and become fatal (Stage IV 5-year survival rate is 10-20%) [2, 3]. Poor diagnostic precision adds an estimated $673 million in overall cost to the management of the disease [4-6].

Melanoma pathophysiology and staging

Melanoma typically arises in pigment-producing cells known as melanocytes that have undergone adverse genetic mutation, most frequently attributed to ultraviolet (UV) radiation exposure [7-9]. Aside from UV exposure, some rare hereditary mutations in genes such as CDKN2A, CDK4, and MC1R can also indicate a high risk of developing familial melanoma (melanoma in patients with a family history of the disease) [10]. The progression of the disease is best characterized both clinically and histopathologically, and without proper diagnosis and management it can rapidly progress from stage 0 (melanoma in situ) to stage 4 (metastatic melanoma) [11].

Accurate clinical diagnosis of melanoma can be challenging

The ABCDE method for visually assessing pigmented skin lesions for malignancy [12] outlines Asymmetry, Border irregularity, Color variegation, Diameter (>6 mm), and Evolution as clinical features to follow [13]. However, the diagnostic accuracy of melanoma by the unaided eye is disappointing [14, 15]. The melanoma yield on biopsy of suspicious lesions is only 1 in 36 [16]. Dermoscopy [17] facilitates visualization of morphological features that are not discernible by examination with the naked eye [18], and it enables better diagnosis than the unaided eye [19-21], with an improvement in diagnostic sensitivity of 10–30% [22]. However, dermoscopy may actually lower diagnostic accuracy in the hands of inexperienced dermatologists [23-26], since this method requires a great deal of experience to differentiate skin lesions [27]. Experts achieve 90% sensitivity and 59% specificity, while this performance significantly worsens with inexperience and drops to 62%-63% for general practitioners [28, 29]. Currently available computer vision (CV) algorithms perform only marginally better in practice, with no significant improvement in diagnosis relative to a dermatologist [30]. Histopathology remains the gold standard for accurate melanoma diagnosis [31], although the rate of discordant readings between pathologists can be high: when 11 expert pathologists reviewed 37 ‘classic’ melanocytic lesions there was total agreement in only 30% of cases; other studies report up to a 50% discordance rate among pathologists [32-36]. Thus the diagnostic accuracy of melanoma remains problematic independent of the method used for diagnosis.
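To make the sensitivity and specificity figures above concrete, the following minimal sketch computes the expected outcome of screening a cohort with the expert-level performance quoted in the text (90% sensitivity, 59% specificity). The cohort size of 1,000 lesions and the 5% melanoma prevalence are hypothetical values chosen only for illustration; they do not come from the paper, and the function names are likewise illustrative.

```python
def expected_counts(n_lesions, prevalence, sensitivity, specificity):
    """Expected confusion-matrix counts for a screened cohort."""
    melanomas = n_lesions * prevalence
    benign = n_lesions - melanomas
    tp = melanomas * sensitivity   # melanomas correctly flagged
    fn = melanomas - tp            # melanomas missed
    tn = benign * specificity      # benign lesions correctly cleared
    fp = benign - tn               # benign lesions flagged unnecessarily
    return tp, fn, tn, fp


def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, and positive predictive value (PPV)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)
    return sensitivity, specificity, ppv


# ASSUMPTION: 1,000 lesions at 5% prevalence is a made-up cohort.
tp, fn, tn, fp = expected_counts(1000, 0.05, sensitivity=0.90, specificity=0.59)
sens, spec, ppv = diagnostic_metrics(tp, fn, tn, fp)
print(f"missed melanomas: {fn:.1f}, unnecessary flags: {fp:.1f}, PPV: {ppv:.3f}")
```

Under these assumed numbers, most flagged lesions are benign even though sensitivity is high, which is the trade-off between false negatives and false positives that the text describes.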

Deep learning is emerging because of Big Data

Deep learning (DL) emerged from the traditional neural network paradigm of artificial intelligence that was developed in the 1980s to computationally model neuronal activity in the brain. An artificial neuron is modeled to fire around an activation threshold (or bias) in response to differentially weighted inputs. However, interpreting complex signals and patterns requires sophisticated models of computational neurons that are chained together to propagate signals, much like the visual system in the brain interprets light signals through successive stages of processing (retina, V1, V2, etc.) in order to classify objects. Today, the most useful neural network models are composed of thousands of multi-layered artificial neurons that are parameterized by exponentially more biases and weights, which require massive datasets to estimate. However, once these networks are trained on sufficiently large, high-quality labeled datasets, they generally outperform other machine learning methods. The computationally intensive process of accurately estimating their parameters by training on massive datasets constitutes the paradigm of DL. Furthermore, the exponential growth in computational power and the recent emergence of GPU computation, together with the abundance of large datasets to train on, make DL applications more practical now than ever before.
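The artificial neuron described above can be sketched in a few lines. This is an illustrative toy, not the architecture used in this work: the sigmoid activation, the layer sizes, and all function names are assumptions made for the example. It shows a neuron computing a weighted sum of its inputs plus a bias, layers of neurons chained so that one layer's outputs feed the next, and how the number of weights and biases grows with network size.

```python
import math
import random


def sigmoid(z):
    """Smooth activation: near 0 well below the threshold, near 1 well above."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum of inputs plus bias, then activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(z)


def forward(inputs, network):
    """Chain layers: each layer's outputs become the next layer's inputs."""
    for weight_rows, biases in network:
        inputs = [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]
    return inputs


def count_parameters(network):
    """Total number of weights and biases that training must estimate."""
    return sum(len(b) + sum(len(w) for w in rows) for rows, b in network)


def random_layer(n_in, n_out):
    """Layer with random weights and zero biases (untrained, for illustration)."""
    rows = [[random.gauss(0, 1) for _ in range(n_in)] for _ in range(n_out)]
    return rows, [0.0] * n_out


random.seed(0)
network = [random_layer(4, 3), random_layer(3, 1)]  # 4 inputs -> 3 hidden -> 1 output
output = forward([0.5, -1.2, 0.3, 0.8], network)
print(output, count_parameters(network))
```

Even this tiny network has 19 parameters; networks with thousands of neurons per layer have far more, which is why the text emphasizes the need for massive training datasets.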

Deep Learning facilitates the most accurate image classification

Big data, cheap computation, and better algorithms are making breakthroughs in artificial intelligence such as deep learning (DL) more possible now than ever before [37]. Today, DL is being applied to a variety of tasks with extraordinary results [38-53], but the most remarkable progress has been made in the field of computer vision (CV). ImageNet holds an annual Large Scale Visual Recognition Challenge (ILSVRC) competition [54, 55] for teams to classify 1.2 million images of objects into 1,000 categories. In 2010, all teams used traditional CV algorithms with accuracy rates <71.8%. Only incremental progress was made until 2012, when Alex Krizhevsky et al. submitted the AlexNet [56] DL approach with an 83.6% accuracy rate that far outperformed the competition. By ILSVRC 2013, all other participants embraced DL, and Google won ILSVRC 2014 with its original Inception model architecture [57]. Since then, Microsoft first outperformed humans with > 95% accuracy rates classifying the ImageNet dataset [58], and Google has subsequently leap-frogged to lead image classification performance with its latest open Inception v3 model [59] that achieves > 96.5% accuracy rates. This proposal is innovative because it leverages the impressive performance of general state-of-the-art deep learning models to improve the current standards of skin lesion diagnosis.

Open source DL frameworks are becoming popular

This project will utilize popular open source deep learning frameworks such as Caffe [60], Theano [61, 62], and Torch [63]. All of these frameworks have contributed to numerous publications and are used everywhere from academia to industry. Many of the state-of-the-art algorithms that have won computer vision competitions in the past are published in public repositories tied to each of these frameworks. For example, Caffe, the deep learning framework developed out of the Berkeley Vision Lab, has its own “Model Zoo” where researchers and community members can publish and share pre-trained models. The repository not only includes the first model used to win ImageNet in 2012 [58], but also the latest networks Google uses for its own large scale image classification tasks [57]. The active community of developers supporting these open tools will be a valuable resource for fine-tuning pre-existing models as well as for building the novel DL architecture to accurately diagnose skin lesion pathology as we propose here.

Results

We performed a successful pilot study to show proof of concept for DL-based prediction of skin pathology from publicly available images.

Publicly available digital images of skin lesions

Google crawls the Internet to catalog every available image online, and we searched Google Images for “melanocytic skin lesion”. The majority of images of melanocytic skin lesions came from the DermNet Skin Atlas [64], the largest independent photo dermatology source dedicated to online medical education. However, only low resolution watermarked images (Figure 2) were freely available to download, while high resolution images were available at $50 per image. We scraped the freely available low quality images from DermNet and labeled them with the DermNet-assigned diagnoses. In all we identified 275 images labeled by DermNet, comprising 170 atypical lesions (42 atypical nevi + 128 malignant melanoma) and 105 benign lesions (25 halo nevi + 80 melanocytic nevi).

Figure 2:

Example malignant melanoma photo from the DermNet Skin Atlas

Deep Learning features of digital images

We used a technique called transfer learning to train a DL algorithm to classify skin lesions as typical or atypical. Transfer learning is applicable to our limited dataset on standard personal computer hardware because it reuses a DL reference model for feature extraction, with subsequent classification of this imageset by traditional machine learning algorithms. We utilized the BVLC AlexNet model from the Caffe Model Zoo to reduce the pixels of each digital image to 4,096 features. We used the Python programming language with the open source scikit-learn library to project the 4,096 features learned from the DL algorithm into 2 dimensions with the t-SNE method in order to visualize the relationship among the 275 images (Figure 3). We found that even with the low quality watermarked DermNet images, we were able to identify structure in the relatedness of images. Specifically, nevi clustered together, as did images with hair, as well as different subsets of melanoma. Finally, to assess the predictive power of the features extracted by DL, we used a support vector machine (SVM) classifier trained on the 275 DermNet images with parameter optimization by cross validation. We found an area under the curve (AUC) of 0.83 after 10-fold cross validation of the SVM (Figure 4); an AUC of 0.80 to 0.90 represents a “good” measure of accuracy by most standards. We expect that 0.83 is the lower bound on the accuracy of our approach, as we will increase our predictive power by training more sophisticated models with much more data.
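The analysis steps described above (2-D projection of the extracted features, then an SVM evaluated by 10-fold cross validation) can be sketched with scikit-learn as follows. This is not the authors' code. As a loud assumption, the 4,096 AlexNet features per image are replaced by synthetic random vectors with an arbitrary class signal, so the printed AUC says nothing about the reported 0.83. The linear kernel, the grid of `C` values, and the 3-fold inner search are also assumptions, since the text does not specify them.

```python
import numpy as np
from sklearn.manifold import TSNE
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# ASSUMPTION: synthetic stand-in for the 4,096 CNN features of each image.
# Class sizes follow the text: (42 + 128) atypical and (25 + 80) benign lesions.
n_atypical, n_benign, n_features = 42 + 128, 25 + 80, 4096
y = np.concatenate([np.ones(n_atypical, dtype=int), np.zeros(n_benign, dtype=int)])
X = rng.normal(size=(y.size, n_features))
X[y == 1, :100] += 0.5  # arbitrary signal so the toy problem is learnable

# Step 1: project the high-dimensional features to 2-D for visualization.
embedding = TSNE(n_components=2, random_state=0).fit_transform(X)

# Step 2: SVM whose regularization strength C is tuned by an inner cross validation.
model = GridSearchCV(
    make_pipeline(StandardScaler(), SVC(kernel="linear")),
    param_grid={"svc__C": [0.01, 0.1, 1.0]},
    scoring="roc_auc",
    cv=StratifiedKFold(n_splits=3, shuffle=True, random_state=0),
)

# Step 3: 10-fold outer cross validation. Every image is scored by a model
# that never saw it during fitting; the pooled scores give a single AUC.
outer_cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_predict(model, X, y, cv=outer_cv, method="decision_function")
auc = roc_auc_score(y, scores)

print(f"t-SNE embedding shape: {embedding.shape}")
print(f"10-fold cross-validated AUC on synthetic data: {auc:.2f}")
```

Nesting the parameter search inside the outer folds keeps the tuning step from seeing the held-out images, which avoids an optimistically biased AUC on a dataset this small.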

Figure 3:

t-SNE plot of 275 images

Figure 4:

Receiver operating characteristic curve for predicting skin pathology from 275 images with 10-fold cross validation.

Conclusion

At this time, computers cannot replace an experienced clinician’s intuition. However, with proficient training on sufficient high-quality data, CV algorithms will eventually match, if not exceed, the clinical diagnostic accuracy of dermatologists. Applying DL to large, well-characterized, prospectively collected clinical imagesets may indeed yield new diagnostic tools to more accurately diagnose skin cancer. Precision diagnostics of melanoma may serve as a first step in significantly reducing the mortality rate and improving the overall management of the disease.
