Manuel Martin-Gonzalez, Carlos Azcarraga, Alba Martin-Gil, Carlos Carpena-Torres, Pedro Jaen.
Abstract
(1) Background: The purpose of this study was to evaluate the sensitivity, specificity, and accuracy of the quantusSKIN system, a new clinical tool based on deep learning, in distinguishing benign skin lesions from melanoma in a hospital population.
Keywords: artificial intelligence; deep learning; melanoma; oncology; skin cancer
Year: 2022 PMID: 35409575 PMCID: PMC8997631 DOI: 10.3390/ijerph19073892
Source DB: PubMed Journal: Int J Environ Res Public Health ISSN: 1660-4601 Impact factor: 3.390
Figure 1. Representative images of different skin lesions diagnosed by the quantusSKIN system: true positive (TP), true negative (TN), false negative (FN), and false positive (FP), together with two images not recommended for analysis (right column): hair covering the skin lesion (upper) and a skin lesion occupying the full image dimensions (lower).
Demographic characteristics of the participants in the study and their skin lesion locations.
| Parameter | Nevus Group | Melanoma Group |
|---|---|---|
| Sample (n) | 177 | 55 |
| Age (years) | 40.91 ± 17.83 | 60.53 ± 18.39 |
| Gender (F/M) | 121/56 | 30/25 |
| Skin lesion location (n) | | |
| Scalp | 0 (0.0%) | 1 (1.8%) |
| Face | 5 (2.8%) | 6 (10.9%) |
| Neck | 2 (1.1%) | 2 (3.6%) |
| Trunk | 144 (81.4%) | 26 (47.3%) |
| Upper extremity | 10 (5.7%) | 6 (10.9%) |
| Lower extremity | 10 (5.7%) | 11 (20.0%) |
| Hand | 1 (0.6%) | 0 (0.0%) |
| Foot | 4 (2.3%) | 1 (1.8%) |
| Vulvar skin | 1 (0.6%) | 1 (1.8%) |
| Foreskin | 0 (0.0%) | 1 (1.8%) |
Figure 2. Receiver operating characteristic (ROC) curves obtained from the analysis of the 232 testing images with the quantusSKIN system before retraining with the additional 339 training images (black) and after retraining (red).
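A ROC curve is traced by sweeping the diagnostic threshold and recording sensitivity and specificity at each value. The two threshold rules named in the efficacy table (maximum F1 score, and maximum sensitivity subject to specificity > 0.800) can then be read off the same sweep. The following is a minimal Python sketch of that idea, not the authors' implementation: the function names and the ten toy scores are hypothetical, and it assumes that a score at or above the threshold is called melanoma.

```python
from typing import List, Optional, Tuple

# One operating point: (threshold, sensitivity, specificity, F1)
Point = Tuple[float, float, float, float]


def operating_points(scores: List[float], labels: List[int]) -> List[Point]:
    """Use every distinct score as a threshold; score >= threshold means 'melanoma'.

    Assumes labels are 1 (melanoma) or 0 (benign) and that both classes occur.
    """
    points = []
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        sensitivity = tp / (tp + fn)
        specificity = tn / (tn + fp)
        f1 = 2 * tp / (2 * tp + fp + fn)
        points.append((t, sensitivity, specificity, f1))
    return points


def threshold_max_f1(points: List[Point]) -> float:
    """Threshold whose operating point has the highest F1 score."""
    return max(points, key=lambda p: p[3])[0]


def threshold_max_sensitivity(points: List[Point],
                              min_specificity: float = 0.800) -> Optional[float]:
    """Most sensitive threshold among those with specificity above the floor."""
    eligible = [p for p in points if p[2] > min_specificity]
    if not eligible:
        return None
    return max(eligible, key=lambda p: p[1])[0]


# Hypothetical toy data (scores in %, 1 = melanoma); not taken from the study.
scores = [10, 20, 30, 40, 50, 60, 70, 80, 90, 95]
labels = [0, 0, 0, 1, 1, 0, 1, 0, 1, 1]

points = operating_points(scores, labels)
print("max-F1 threshold:", threshold_max_f1(points))
print("max-sensitivity threshold with specificity > 0.800:",
      threshold_max_sensitivity(points))
```

On the toy data the two rules select different thresholds, which mirrors the trade-off in the table: the F1 rule accepts lower specificity in exchange for higher sensitivity.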
Efficacy of the quantusSKIN system for melanoma diagnosis after retraining, in terms of sensitivity, specificity, accuracy, PPV, NPV, F1 score, and F2 score, at the optimum diagnostic threshold established with two different error metrics. Efficacy data for other deep learning algorithms recently reported in the scientific literature are also summarized.
| Error Metric/Study | Diagnostic Threshold (%) | Sensitivity | Specificity | Accuracy | PPV | NPV | F1 Score | F2 Score |
|---|---|---|---|---|---|---|---|---|
| Maximum F1 score | 53.51 | 0.782 | 0.763 | 0.767 | 0.506 | 0.918 | 0.614 | 0.705 |
| Specificity > 0.800 and maximum sensitivity | 67.33 | 0.691 | 0.802 | 0.776 | 0.521 | 0.893 | 0.594 | 0.648 |
| Haenssle et al. [ | - | 0.950 | 0.825 | - | - | - | - | - |
| Brinker et al. [ | - | 0.682 | - | - | - | - | - | - |
| Kaur et al. [ | - | 0.830 | 0.839 | 0.830 | - | - | - | - |
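All of the metrics in the table follow from the four confusion-matrix counts. The sketch below shows the standard definitions in Python. The counts passed in the usage lines are not stated in the source: they are back-calculated here, as an assumption, from the reported sensitivity and specificity and the group sizes (55 melanomas, 177 nevi), and the function names are illustrative.

```python
from typing import Dict


def f_beta(tp: int, fp: int, fn: int, beta: float) -> float:
    """F-beta score from counts; beta > 1 weights sensitivity above PPV."""
    b2 = beta * beta
    return (1 + b2) * tp / ((1 + b2) * tp + b2 * fn + fp)


def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard diagnostic metrics from a 2x2 confusion matrix."""
    return {
        "sensitivity": tp / (tp + fn),            # melanomas correctly flagged
        "specificity": tn / (tn + fp),            # nevi correctly cleared
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "ppv": tp / (tp + fp),                    # positive predictive value
        "npv": tn / (tn + fn),                    # negative predictive value
        "f1": f_beta(tp, fp, fn, beta=1.0),
        "f2": f_beta(tp, fp, fn, beta=2.0),
    }


# Assumed counts, inferred from the reported rates (not given in the source):
# threshold 53.51%: 43 of 55 melanomas and 135 of 177 nevi classified correctly
max_f1_row = diagnostic_metrics(tp=43, fp=42, tn=135, fn=12)
# threshold 67.33%: 38 of 55 melanomas and 142 of 177 nevi classified correctly
spec_row = diagnostic_metrics(tp=38, fp=35, tn=142, fn=17)

for name, row in (("Maximum F1 score", max_f1_row),
                  ("Specificity > 0.800", spec_row)):
    print(name, {k: round(v, 3) for k, v in row.items()})
```

Under these assumed counts, the computed values agree with both quantusSKIN rows of the table to three decimal places. The low PPV alongside a high NPV reflects the class imbalance in the test set, where nevi outnumber melanomas roughly three to one.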