Xin Shen, Lisheng Wei, Shaoyu Tang.
Abstract
Dermoscopic images pose several challenges for automated classification: large intra-class differences, small inter-class differences, low contrast, and small, unbalanced datasets. To address these problems, this paper proposes a dermoscopic image classification method based on an ensemble of fine-tuned convolutional neural networks. The fully connected layers of three pretrained models (Xception, ResNet50, and Vgg16) are reconstructed, and the models are then transfer-learned and fine-tuned on the official ISIC 2016 Challenge skin dataset. The outputs of the three base models are combined with a weighted-fusion ensemble strategy to obtain a final prediction of whether a dermoscopic image indicates malignancy. The experimental results show that the ensemble model achieves an accuracy of 86.91%, a precision of 85.67%, a recall of 84.03%, and an F1-score of 84.84%. All four metrics are better than those of the three base models and of several classical methods, demonstrating the effectiveness and feasibility of the proposed method.
Keywords: classification; deep learning; dermoscopic image; ensemble learning; fine-tuning; transfer learning
Year: 2022 PMID: 35684768 PMCID: PMC9185225 DOI: 10.3390/s22114147
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.847
Figure 1. Flowchart of the proposed algorithm.
Distribution of two types of dermoscopic images in the experimental dataset.
| Dataset | Benign | Malignant | Total |
|---|---|---|---|
| Training | 584 | 137 | 721 |
| Validation | 145 | 34 | 179 |
| Test | 304 | 75 | 379 |
Figure 2. Image preprocessing.
Figure 3. Data augmentation.
Figure 4. Structure diagram of the Xception model.
Figure 5. The building blocks of ResNet50.
Figure 6. Structure diagram of the Vgg16 model.
Figure 7. Improvement of the model structure.
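The improved structure replaces each backbone's original classifier head with newly built fully connected layers before transfer learning and fine-tuning. A minimal Keras sketch of this idea is shown below; the specific head sizes (a 256-unit dense layer, 0.5 dropout) and the frozen-backbone first phase are illustrative assumptions, not the authors' exact configuration, and in practice `weights="imagenet"` would be used instead of `weights=None`.

```python
# Hypothetical sketch of reconstructing the fully connected head of a
# pretrained backbone (here Xception) for binary dermoscopy classification.
import tensorflow as tf


def build_finetune_model(input_shape=(224, 224, 3)):
    # Load the backbone without its original ImageNet classifier head.
    # weights=None here only to keep the sketch offline; use "imagenet"
    # for actual transfer learning.
    base = tf.keras.applications.Xception(
        include_top=False, weights=None, input_shape=input_shape)
    base.trainable = False  # freeze for the initial transfer-learning phase

    # Newly reconstructed fully connected layers (illustrative sizes).
    x = tf.keras.layers.GlobalAveragePooling2D()(base.output)
    x = tf.keras.layers.Dense(256, activation="relu")(x)
    x = tf.keras.layers.Dropout(0.5)(x)
    out = tf.keras.layers.Dense(1, activation="sigmoid")(x)  # malignant prob.
    return tf.keras.Model(base.input, out)
```

Fine-tuning would then unfreeze some or all backbone layers and continue training at a small learning rate.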
Confusion matrix.
| Prediction | Label: True | Label: False |
|---|---|---|
| Positive | TP | FP |
| Negative | FN | TN |
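The four evaluation metrics reported in the paper follow directly from these confusion-matrix counts. A minimal sketch (the function name is ours, not the paper's):

```python
# Compute accuracy, precision, recall, and F1-score from the counts of
# true positives, false positives, false negatives, and true negatives.
def metrics_from_confusion(tp, fp, fn, tn):
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)          # of predicted malignant, how many are
    recall = tp / (tp + fn)             # of actual malignant, how many found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return accuracy, precision, recall, f1
```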
Parameter settings.
| Parameters | Values |
|---|---|
| Experimental platform | Google Colab |
| Language | Python |
| Experiment framework | TensorFlow |
| Optimizer | Adam |
| Loss function | Focal Loss |
| Epochs | 50 |
| Learning rate (initial) | 1 × 10⁻⁴ |
| Batch size | 24 |
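The choice of focal loss fits the imbalanced training split (137 malignant vs. 584 benign images): it down-weights easy, well-classified examples so training concentrates on hard, minority-class samples. A plain-Python sketch of the standard binary focal loss (Lin et al.'s formulation; γ = 2 and α = 0.25 are the common defaults, not necessarily the paper's settings):

```python
import math


def binary_focal_loss(y_true, p_pred, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss: -alpha_t * (1 - p_t)^gamma * log(p_t)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)     # clip for numerical stability
        pt = p if y == 1 else 1.0 - p       # probability of the true class
        a = alpha if y == 1 else 1.0 - alpha
        total += -a * (1.0 - pt) ** gamma * math.log(pt)
    return total / len(y_true)
```

With γ = 0 the expression reduces to (α-weighted) binary cross-entropy; larger γ suppresses the loss contribution of confident, easy examples.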
Figure 8Accuracy curves of the three basic models.
Figure 9Loss curves of the three basic models.
Figure 10Confusion matrix for the three base models.
Accuracy of different model combinations.
| Combination Mode | Accuracy |
|---|---|
| {0.4A, 0.3B, 0.3C} | 84.78% |
| {0.3A, 0.4B, 0.3C} | 85.30% |
| {0.3A, 0.5B, 0.2C} | |
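Each combination mode assigns fixed weights to the three base models' predicted malignancy probabilities and thresholds the weighted sum. A minimal sketch, using the {0.3A, 0.5B, 0.2C} weights from the table as the default (the function name and 0.5 threshold are our assumptions):

```python
# Weighted-fusion ensemble: combine per-image malignancy probabilities
# from three base models with fixed weights, then threshold to a label.
def weighted_fusion(probs_a, probs_b, probs_c,
                    weights=(0.3, 0.5, 0.2), threshold=0.5):
    wa, wb, wc = weights
    fused = [wa * a + wb * b + wc * c
             for a, b, c in zip(probs_a, probs_b, probs_c)]
    return [1 if p >= threshold else 0 for p in fused]
```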
Comparison of results of different fusion methods.
| Fusion Method | Accuracy |
|---|---|
| Output category voting | 84.57% |
| Output category probability average | 85.65% |
| Weighted fusion | 86.91% |
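The two baseline fusion methods in the table differ in what they combine: hard voting aggregates the base models' predicted labels, while probability averaging aggregates their predicted probabilities before thresholding. A sketch of both (function names are ours):

```python
# Hard voting: majority vote over three models' 0/1 predicted labels.
def vote_fusion(labels_a, labels_b, labels_c):
    return [1 if (a + b + c) >= 2 else 0
            for a, b, c in zip(labels_a, labels_b, labels_c)]


# Probability averaging: mean of the three predicted probabilities,
# then a 0.5 threshold.
def average_fusion(probs_a, probs_b, probs_c, threshold=0.5):
    return [1 if (a + b + c) / 3 >= threshold else 0
            for a, b, c in zip(probs_a, probs_b, probs_c)]
```

Averaging retains confidence information that voting discards, which is consistent with its higher accuracy in the table; weighting the average further lets the stronger base model contribute more.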
Comparison of experimental results between the basic model and the ensemble model.
| Model | Accuracy | Precision | Recall | F1-Score |
|---|---|---|---|---|
| Xception | 80.56% | 83.38% | 82.15% | 82.76% |
| ResNet50 | 83.89% | 84.75% | 81.86% | 83.28% |
| Vgg16 | 79.44% | 81.55% | 80.11% | 80.82% |
| Ensemble (weighted fusion) | 86.91% | 85.67% | 84.03% | 84.84% |
Comparison of ISIC 2016 Challenge competition results.
| Method | Accuracy |
|---|---|
| GUMED | 85.5% |
| GTDL | 81.3% |
| BF_TB | 83.4% |
| ThrunLab | 78.6% |
| Jordan Yap | 84.4% |
| Proposed method | 86.91% |
Performance comparison of the proposed method and other studies’ algorithms.
| Method | Model | Accuracy | Precision | Recall | F1-Score |
|---|---|---|---|---|---|
| Kaur R. [ | LCNet | 81.41% | 81.88% | 81.30% | 81.05% |
| Zhang J. [ | SDL | 86.28% | 68.10% | - | - |
| Al-Masni, M. A. [ | Inception-ResNet-v2 | 81.79% | - | 81.80% | 82.59% |
| Tang P. [ | GP-CNN-DTEL | 86.30% | 72.80% | 32.00% | - |
| Proposed method | Ensemble | 86.91% | 85.67% | 84.03% | 84.84% |