Kajsa Møllersen, Maciel Zortea, Thomas R Schopf, Herbert Kirchesch, Fred Godtliebsen.
Abstract
Melanoma is the deadliest form of skin cancer, and early detection is crucial for patient survival. Computer systems can assist in melanoma detection, but are not widespread in clinical practice. In 2016, an open challenge in the classification of dermoscopic images of skin lesions was announced. A training set of 900 images with corresponding class labels and semi-automatic/manual segmentation masks was released for the challenge. An independent test set of 379 images, of which 75 were of melanomas, was used to rank the participants. This article demonstrates the impact of ranking criteria, segmentation method and classifier, and highlights the clinical perspective. We compare five different measures of diagnostic accuracy by analysing the resulting rankings of the computer systems in the challenge. The choice of performance measure had a great impact on the ranking: systems ranked among the top three by one measure dropped to the bottom half when the performance measure was changed. Nevus Doctor, a computer system previously developed by the authors, was used to participate in the challenge and to investigate the impact of segmentation and classifier. We compare the diagnostic accuracy obtained with an automatic versus the semi-automatic/manual segmentation. The unexpectedly small impact of the segmentation method suggests that making the automatic segmentation resemble the semi-automatic/manual segmentation more closely will not substantially improve diagnostic accuracy. A small set of similar classification algorithms was used to investigate the impact of the classifier on diagnostic accuracy. The variability in diagnostic accuracy across classifiers was larger than across segmentation methods, suggesting a focus for future investigations. From a clinical perspective, misclassifying a melanoma as benign has a far greater cost than misclassifying a benign lesion as melanoma.
For computer systems to have clinical impact, their performance should be ranked by a high-sensitivity measure.
Year: 2017 PMID: 29267358 PMCID: PMC5739481 DOI: 10.1371/journal.pone.0190112
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Rankings of the participants that were ranked first by at least one measure. Each column corresponds to the participant ranked first by one of the five measures; the cell in that measure's row gives the measure's value, while the remaining cells give that participant's rank under the other measures.
| Segmentation | Automatic (25 participants) | | | | | | | Manual (18 participants) | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Participant | A | B | C | C | D | E | F | A | E | C | E | F |
| Av. precision | 0.64 | 3 | 11 | 11 | 2 | 6 | 8 | 0.62 | 3 | 7 | 3 | 2 |
| AUC of ROC | 3 | 0.83 | 6 | 6 | 4 | 8 | 5 | 4 | 0.81 | 5 | 1 | 2 |
| SE = 95% | 11 | 8 | 0.39 | 1 | 6 | 13 | 7 | 11 | 2 | 0.32 | 2 | 3 |
| SE = 98% | 15 | 14 | 1 | 0.33 | 4 | 13 | 5 | 11 | 1 | 5 | 0.29 | 2 |
| SE = 99% | 11 | 9 | 3 | 3 | 0.25 | 12 | 10 | 8 | 7 | 2 | 7 | 0.25 |
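The rank reversals in the table can be reproduced with a small sketch. The implementations and score vectors below are illustrative assumptions, not the challenge data or the organisers' evaluation code: two hypothetical systems swap order depending on whether they are ranked by AUC of the ROC or by specificity at near-perfect sensitivity.

```python
# Illustrative sketch (not the challenge evaluation code) of how a ranking
# can flip between AUC and a high-sensitivity measure.
# Labels and scores are invented; 1 = melanoma, 0 = benign.

def auc(labels, scores):
    """AUC of the ROC, computed as the probability that a melanoma
    receives a higher score than a benign lesion (ties count half)."""
    pos = [s for l, s in zip(labels, scores) if l]
    neg = [s for l, s in zip(labels, scores) if not l]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def spec_at_se(labels, scores, min_se):
    """Highest specificity over all thresholds with sensitivity >= min_se."""
    pos = [s for l, s in zip(labels, scores) if l]
    neg = [s for l, s in zip(labels, scores) if not l]
    best = 0.0
    for t in sorted(set(scores)):
        se = sum(s >= t for s in pos) / len(pos)
        sp = sum(s < t for s in neg) / len(neg)
        if se >= min_se:
            best = max(best, sp)
    return best

labels = [1, 1, 1, 0, 0, 0]
sys_a = [0.9, 0.85, 0.1, 0.8, 0.7, 0.6]   # ranks most lesions well, misses one melanoma badly
sys_b = [0.6, 0.55, 0.5, 0.9, 0.58, 0.4]  # weaker overall, but no melanoma scored very low

print(auc(labels, sys_a), auc(labels, sys_b))          # system A wins on AUC
print(spec_at_se(labels, sys_a, 0.99),
      spec_at_se(labels, sys_b, 0.99))                 # system B wins at SE = 99%
```

Under these invented scores, system A has the higher AUC, yet at 99% sensitivity system B retains some specificity while A has none, the same kind of reversal seen between rows of the table.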
Fig 1. Segmentation by different methods.
Blue line: automatic segmentation. Red line: manual segmentation. Image ISIC_000276 of the ISIC Archive, shared under the CC-0 license.
Fig 2. ROC curves for different segmentation methods.
ROC curves for Nevus Doctor with automatic and semi-automatic/manual segmentation.
Scores for Nevus Doctor.
| Segmentation: | Automatic | Manual |
|---|---|---|
| Average precision | 0.35 | 0.37 |
| AUC of the ROC | 0.64 | 0.67 |
| SE = 95% | 0.23 | 0.19 |
| SE = 98% | 0.13 | 0.17 |
| SE = 99% | 0.12 | 0.17 |
High-sensitivity measures (specificity at the given sensitivity, SE) for Nevus Doctor using different classifiers.
| Automatic segmentation | | | | |
|---|---|---|---|---|
| | LDA | QDA | dLDA | dQDA |
| SE = 95% | 0.14 | 0.23 | 0.28 | 0.32 |
| SE = 98% | 0.07 | 0.13 | 0.22 | 0.27 |
| SE = 99% | 0.05 | 0.12 | 0.21 | 0.24 |
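The "d" variants in the table denote discriminant analysis with diagonal covariance matrices: dLDA and dQDA restrict each (pooled or per-class) covariance estimate to its diagonal, which often stabilises estimation when features are many and training images few. The following is a minimal sketch of a diagonal-covariance classifier in the spirit of dQDA, on invented toy data; it is not the authors' implementation, and the helper names are assumptions.

```python
# Hedged sketch of a diagonal-covariance Gaussian classifier (dQDA-style):
# per-class feature means and per-class *diagonal* variances only.
# For a dLDA-style classifier, the variances would instead be pooled
# across classes. Toy data; not the authors' implementation.
import math

def fit_dqda(X, y):
    """Estimate per-class means, diagonal variances, and priors."""
    model = {}
    for c in set(y):
        rows = [x for x, label in zip(X, y) if label == c]
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        variances = [sum((v - m) ** 2 for v in col) / n
                     for col, m in zip(zip(*rows), means)]
        model[c] = (means, variances, n / len(y))
    return model

def predict_dqda(model, x):
    """Pick the class with the highest diagonal-Gaussian log-likelihood
    plus log prior (independent features given the class)."""
    def log_post(c):
        means, variances, prior = model[c]
        ll = math.log(prior)
        for v, m, var in zip(x, means, variances):
            ll += -0.5 * (math.log(2 * math.pi * var) + (v - m) ** 2 / var)
        return ll
    return max(model, key=log_post)

# Invented two-feature example: class 0 near the origin, class 1 near (5, 5).
X = [[0.0, 0.0], [0.1, 0.2], [-0.1, 0.1],
     [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]]
y = [0, 0, 0, 1, 1, 1]
model = fit_dqda(X, y)
print(predict_dqda(model, [0.05, 0.1]), predict_dqda(model, [5.0, 5.0]))
```

Dropping the off-diagonal covariance terms trades some flexibility for far fewer parameters per class, which is one plausible reason the table shows the diagonal variants ahead of full LDA/QDA on this task.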
Fig 3. Different colour.
Example of different colour calibration and/or a different light source in the ISIC Archive. Images ISIC_0000023 and ISIC_0000037 of the ISIC Archive, shared under the CC-0 license.