Suboh Alkhushayni, Du'a Al-Zaleq, Luwis Andradi, Patrick Flynn.
Abstract
Skin cancer, including its less common form, melanoma, affects a wide variety of people. Because it is usually first detected by visual inspection, it is a good candidate for the application of machine learning. With early detection being key to good outcomes, any method that can improve the diagnostic accuracy of dermatologists and oncologists is of significant interest. By comparing existing machine learning implementations on public datasets and on several datasets we sought to create, we attempted to build a more accurate model that can be readily adapted for use in clinical settings. We tested combinations of models, including convolutional neural networks (CNNs), together with various data manipulation steps, such as applying Gaussian functions and trimming images, to improve accuracy. We also built more traditional models, including support vector classification, K-nearest neighbor, Naïve Bayes, random forest, and gradient boosting, and compared them with our CNN-based models. The results indicated that the CNN-based models significantly outperformed the other models we built. Partial results of this work were presented at the CSET Presentations for Research Month at Minnesota State University, Mankato.
Year: 2022 PMID: 35573163 PMCID: PMC9095410 DOI: 10.1155/2022/2839162
Source DB: PubMed Journal: J Skin Cancer ISSN: 2090-2913
Figure 1: Example with and without melanoma.
Comparison of machine learning algorithms from the related work section.
| Article title and author | Method | Accuracy | Summarization |
|---|---|---|---|
| M-skin doctor: A mobile-enabled system for early melanoma skin cancer detection using a support vector machine | SVM | 0.80 | Aleem et al. introduced a mobile-enabled system for early detection of melanoma skin cancer using a support vector machine (SVM). |
| Melanoma detection by analysis of clinical images using a convolutional neural network | CNN | 0.81 | Clinical images (not taken with a dermatoscope) were preprocessed to remove noise and illumination effects and then fed into a convolutional neural network trained on many samples. |
| Expert-level diagnosis of nonpigmented skin cancer by combined convolutional neural networks | CNN | 0.735 | Tschandl et al. note that CNNs achieve expert-level accuracy in diagnosing pigmented skin cancer, but that the most common types of skin cancer are nonpigmented and hard to diagnose. The study therefore compares the accuracy of a CNN-based classifier on nonpigmented skin cancer with that of physicians with different levels of experience. The proposed approach has two main steps: neural network diagnosis and human rating. |
| The impact of patient clinical information on automated skin cancer detection | ResNet-50 | 0.788 | The article compares various methods of training a model to recognize cancer in images and the considerations involved, particularly for unsupervised training. The most interesting point is that if control images are taken with a different camera or dermatoscope, the model may learn to separate images by subtle device-specific differences rather than by the cancer itself. The article also describes in detail one potential source of training images, the International Skin Imaging Collaboration. |
| | ResNet-101 | 0.757 | |
| | GoogleNet | 0.779 | |
| | MobileNet | 0.762 | |
| | VGGNet-13 | 0.746 | |
| | VGGNet-19 | 0.750 | |
Figure 2: Basic CNN proposed model.
Model one architecture.
| Layer | Size | Output shape |
|---|---|---|
| Input shape | (256,256,3) | |
| Convolutional 2D + ReLu | 16(3 | (256,256,16) |
| Max pooling + ReLu | (2 | (128,128,16) |
| Convolutional 2D + ReLu | 32(3 | (128,128,32) |
| Max pooling + ReLu | (2 | (64,64,32) |
| Convolutional 2D + ReLu | 64(3 | (64,64,64) |
| Max pooling + ReLu | (2 | (32,32,64) |
| Fully connected + ReLu | 512 neurons | 1 |
| Fully connected + sigmoid | 1 | 1 |
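The output shapes listed for the convolution and pooling layers follow from simple shape arithmetic. The sketch below reproduces them under two assumptions that the table does not state: each convolution uses "same" padding with stride 1, and each pooling layer halves the height and width. The function names are illustrative only.

```python
def conv2d_same(shape, filters):
    """Output shape of a 2D convolution with 'same' padding, stride 1 (assumed)."""
    height, width, _channels = shape
    return (height, width, filters)


def max_pool(shape, pool=2):
    """Output shape of max pooling with stride equal to the pool size (assumed)."""
    height, width, channels = shape
    return (height // pool, width // pool, channels)


def model_one_shapes(input_shape=(256, 256, 3), filter_counts=(16, 32, 64)):
    """Return the shape after each convolution and pooling layer, in order."""
    shapes = []
    shape = input_shape
    for filters in filter_counts:
        shape = conv2d_same(shape, filters)
        shapes.append(shape)
        shape = max_pool(shape)
        shapes.append(shape)
    return shapes


if __name__ == "__main__":
    for layer_shape in model_one_shapes():
        print(layer_shape)
```

Under these assumptions, the last pooling layer yields a 32 × 32 × 64 feature map, that is, 65,536 values feeding the fully connected layers.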
Figure 3: CNN model two architecture from the summary() function.
Figure 4: CNN model three architecture from the summary() function.
Figure 5: Before and after Gaussian function use.
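The source does not say which Gaussian operation, kernel size, or standard deviation was applied to the images. As a minimal sketch, assuming the step is a Gaussian blur, the following pure-Python code smooths a small grayscale image stored as a list of rows. The `radius` and `sigma` defaults are placeholders, not the authors' settings.

```python
import math


def gaussian_kernel(radius, sigma):
    """Return a normalized 1D Gaussian kernel of length 2 * radius + 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(image, kernel):
    """Convolve every row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    width = len(image[0])
    blurred = []
    for row in image:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, weight in enumerate(kernel):
                src = min(max(x + k - radius, 0), width - 1)
                acc += weight * row[src]
            new_row.append(acc)
        blurred.append(new_row)
    return blurred


def transpose(image):
    """Swap rows and columns of a list-of-rows image."""
    return [list(column) for column in zip(*image)]


def gaussian_blur(image, radius=2, sigma=1.0):
    """Separable Gaussian blur: filter the rows, then the columns."""
    kernel = gaussian_kernel(radius, sigma)
    rows_done = blur_rows(image, kernel)
    return transpose(blur_rows(transpose(rows_done), kernel))
```

Because the 2D Gaussian is separable, filtering rows and then columns gives the same result as a full 2D kernel at lower cost.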
List of parameters used for configuration of the traditional method of machine learning.
| Model | Parameters |
|---|---|
| SVC | |
| KNN | algorithm = auto, n_neighbors = 15, weights = distance |
| RF | criterion = entropy, max_features = auto, n_estimators = 15 |
| Gradient boosting | max_depth = 2, n_estimators = 50 |
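The parameter names in this table resemble those of scikit-learn, although the source does not name the library. To illustrate what the KNN setting `weights = distance` means, here is a minimal pure-Python sketch of a distance-weighted nearest-neighbor vote. The function `knn_predict` and the toy data are hypothetical, and the handling of exact matches is a simplification.

```python
import math
from collections import defaultdict


def knn_predict(train_points, train_labels, query, n_neighbors=15):
    """Classify `query` by an inverse-distance-weighted vote of its nearest neighbors."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = defaultdict(float)
    for distance, label in distances[:n_neighbors]:
        if distance == 0.0:
            return label  # simplification: an exact match decides the vote
        votes[label] += 1.0 / distance  # closer neighbors count for more
    return max(votes, key=votes.get)


if __name__ == "__main__":
    # Toy 2D features, not taken from the study's data.
    points = [(0.1, 0.0), (2.0, 0.0), (0.0, 2.0)]
    labels = ["melanoma", "benign", "benign"]
    print(knn_predict(points, labels, (0.0, 0.0), n_neighbors=3))
```

With uniform weights the two benign neighbors would win the vote in this toy case; with distance weighting the single very close melanoma sample dominates.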
Figure 6: SVC confusion matrix, classification report, and accuracy.
Figure 7: KNN confusion matrix, classification report, and accuracy.
Figure 8: GNB confusion matrix, classification report, and accuracy.
Figure 9: RF confusion matrix, classification report, and accuracy.
Figure 10: GB confusion matrix, classification report, and accuracy.
Figure 11: CNN model one accuracy and loss.
Figure 12: CNN model two accuracy fluctuation.
Figure 13: CNN model three accuracy fluctuation.
Figure 14: CNN model four accuracy and loss.
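Figures 6 to 10 report a confusion matrix, a classification report, and an accuracy value for each traditional model. As a rough sketch of how such numbers relate to one another for a binary benign/melanoma task, the code below derives them from true and predicted labels. The function name and the label encoding (1 for melanoma, 0 for benign) are assumptions for illustration, not the authors' code.

```python
def binary_metrics(true_labels, predicted_labels, positive=1):
    """Return confusion-matrix counts plus accuracy, precision, and recall."""
    tp = fp = tn = fn = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if predicted == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    total = tp + fp + tn + fn
    return {
        "tp": tp, "fp": fp, "tn": tn, "fn": fn,
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


if __name__ == "__main__":
    truth = [1, 1, 1, 0, 0, 0, 0, 0]
    predicted = [1, 1, 0, 0, 0, 0, 0, 1]
    print(binary_metrics(truth, predicted))
```

Recall on the melanoma class is worth reading alongside accuracy, since a missed melanoma (a false negative) is the costlier error in a screening setting.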
Our CNN models and their accuracy.
| Model | Accuracy % with all benign |
|---|---|
| Model 1 | 98.23 |
| Model 2 | 98.23 |
| Model 3 | 98.25 |
| Model 4—VGGNet-16 | 98.22 |
Figure 15: Comparison of different tested models and their overall accuracy.
Execution time comparison for machine learning algorithms.
| Algorithm | Time taken for training (s) | Time taken for classification (s) |
|---|---|---|
| SVM | 493.871434 | 0.154563 |
| Naïve Bayes | 0.453653 | 0.140434 |
| Random forest with 2 trees | 0.6941016 | 0.081072 |
| Random forest with 5 trees | 1.056321 | 0.126287 |
| Random forest with 10 trees | 1.520123 | 0.186565 |
| Random forest with 20 trees | 2.312458 | 0.349988 |
| Random forest with 50 trees | 4.965323 | 0.788574 |
| CNN | 6.358789 | 0.047896 |
| KNN with | 0 | 0.065487 |
| KNN with | 0 | 0.210213 |
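The source does not describe how the training and classification times were measured. One common approach, sketched here under that caveat, is to wrap each call with a monotonic wall-clock timer. `timed` and `MajorityClassifier` are hypothetical names; the toy classifier only stands in for a real model so that the example runs on its own.

```python
import time


def timed(func, *args, **kwargs):
    """Call `func` and return (result, elapsed seconds) from a monotonic clock."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


class MajorityClassifier:
    """Toy stand-in model: always predicts the most common training label."""

    def fit(self, features, labels):
        self.label_ = max(set(labels), key=labels.count)
        return self

    def predict(self, features):
        return [self.label_ for _ in features]


if __name__ == "__main__":
    features = [[0.0], [1.0], [2.0]]
    labels = ["benign", "benign", "melanoma"]
    model = MajorityClassifier()
    _, train_seconds = timed(model.fit, features, labels)
    predictions, classify_seconds = timed(model.predict, features)
    print(f"training: {train_seconds:.6f} s, classification: {classify_seconds:.6f} s")
```

Timing training and classification separately, as the table does, makes it possible to tell models that are slow to fit apart from models that are slow to use.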
Additional comparison for machine learning algorithms from the related work section.
| Article title and author | Method | Accuracy | Summarization |
|---|---|---|---|
| Skins cancer identification system of HAMl0000 skin cancer dataset using convolutional neural network | CNN | 0.78 | Nugroho et al. set out to build a skin cancer identification system to support decision making. The proposed system is based on a convolutional neural network (CNN) with three stages: a convolutional layer, a pooling layer, and a fully connected layer. The convolutional layer produces feature maps from the image, with the rectified linear unit (ReLU) as the activation function. The pooling layer reduces the size of the representation and speeds up computation; this layer mainly contributes to the ability to recognize an object. The fully connected layer transforms the data dimension and connects the previous layer to the next layer. |
| Recent advances in deep learning applied to skin cancer detection | | | This article summarizes how machine learning and image processing can help dermatologists identify skin cancers more rapidly, in particular melanomas (the deadliest form of skin cancer). Because of rising healthcare costs, a shortage of qualified professionals, and limited access to relevant medical tools, the number of melanomas diagnosed at a late stage has been rising. The article explores solutions to this problem and makes three major arguments: (1) machine learning algorithms applied to images (particularly models that combine several learning methods) can be at least as effective as dermatologists at diagnosing skin cancers, provided a good image is given; (2) these algorithms need to work with clinical image data (i.e., from standard cameras) rather than only with medical imaging devices; and (3) there is a significant lack of data for testing and training, particularly data with relevant metadata (patient age, race, diseases, etc.) associated with each image. |
| A convolutional neural network framework for accurate skin cancer detection | DenseNet201 | 0.95 | Another analysis was performed on the HAM10000 dataset using a DenseNet201 neural network and image augmentation. It showed that this may be an effective model for the task, owing to its high classification accuracy and low rate of false negatives. |