Parvathaneni Naga Srinivasu1, Jalluri Gnana SivaSai2, Muhammad Fazal Ijaz3, Akash Kumar Bhoi4, Wonjoon Kim5, James Jin Kang6.
Abstract
Deep learning models are efficient at learning the features that help understand complex patterns precisely. This study proposes a computerized process for classifying skin disease through a deep-learning-based MobileNet V2 and Long Short-Term Memory (LSTM). The MobileNet V2 model proved efficient, with better accuracy, and can run on lightweight computational devices. The proposed model maintains stateful information for precise predictions. A grey-level co-occurrence matrix is used to assess the progress of diseased growth. The performance has been compared against other state-of-the-art models such as Fine-Tuned Neural Networks (FTNN), a Convolutional Neural Network (CNN), Very Deep Convolutional Networks for Large-Scale Image Recognition developed by the Visual Geometry Group (VGG), and a convolutional neural network architecture extended with a few changes. The HAM10000 dataset is used, and the proposed method outperformed the other methods with more than 85% accuracy. Its robustness in recognizing the affected region much faster, with almost half the computations of the conventional MobileNet model, results in minimal computational effort. Furthermore, a mobile application is designed for instant and proper action: it helps patients and dermatologists identify the type of disease from an image of the affected region at the initial stage of the skin disease. These findings suggest that the proposed system can help general practitioners efficiently and effectively diagnose skin conditions, thereby reducing further complications and morbidity.
Keywords: Convolutional Neural Network (CNN); Long Short-Term Memory (LSTM); MobileNet; MobileNet V2; deep learning; grey-level correlation; mobile platform; neural network; skin disease
Year: 2021 PMID: 33919583 PMCID: PMC8074091 DOI: 10.3390/s21082852
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Related work on machine and deep learning approaches for image classification.
| Reference | Approach | Objective | Challenges of the Approach |
|---|---|---|---|
| [ | Morphological Operations | Morphological operations use dilation and erosion, applied through a structuring element, to identify image features that help determine abnormality. | Identifying the optimal threshold is crucial, and morphological operations are not suitable for analyzing the growth of the diseased region. Applying structuring elements for skin disease classification does not yield accurate results. |
| [ | K-Nearest Neighborhood | The KNN-based model classifies data without a training phase, using feature selection and similarity matching to categorize the data; a distance measure identifies the correlation among the selected features. | In a KNN-based classification model, the accuracy of the outcome depends directly on the quality of the underlying data. Additionally, for larger sample sizes, the prediction time can be significantly high. The KNN model is also sensitive to inappropriate features in the data. |
| [ | Genetic Algorithm | The genetic algorithm takes a probabilistic approach, randomly selecting the initial population and performing crossover and mutation operations until a suitable number of segments is reached. | The genetic algorithm does not guarantee the globally best solution and takes considerable time to converge. |
| [ | Support Vector Machine | The Support Vector Machine handles high-dimensional data efficiently with minimal memory consumption. | The Support Vector Machine approach is not appropriate for noisy image data, and identifying the feature-based parameters is a challenging task. |
| [ | Artificial Neural Networks | Artificial Neural Networks efficiently recognize non-linear associations between dependent and independent parameters by storing the data across the network nodes. | Artificial Neural Network models can handle contexts with an inadequate understanding of the problem. However, the approach risks missing the image's spatial features, and vanishing and exploding gradients are a significant concern. |
| [ | Convolutional Neural Networks | Convolutional Neural Network models automatically select the essential features. The CNN model stores the training data across network nodes as multi-layer perceptrons rather than in auxiliary memory. | The CNN approach fails to interpret an object's magnitude and size. Additionally, the model needs extensive training for a reasonable outcome, alongside challenges such as spatial invariance among the pixel data. |
| [ | Fully Convolutional Residual Network | The Fully Convolutional Residual Network uses encoder and decoder layers that exploit high-level and low-level features to classify objects in the image. | The Fully Convolutional Residual Network handles overfitting and the degradation problem efficiently. However, the model is complex to design and execute in real time, and adding batch normalization makes the architecture more intricate. |
| [ | Fine-tuned Neural Networks | The Fine-Tuned Neural Network handles novel problems with pre-trained data through inception and update stages. | In the FTNN approach, when elements are fed with new weights, previously associated weights are forgotten, which may affect the outcome. |
| [ | Gray Level Co-occurrence Matrix (GLCM) | The Gray Level Co-occurrence Matrix (GLCM) is a statistical approach that classifies objects by analyzing the spatial association among pixels based on pixel texture. | The GLCM approach needs considerable computational effort, and its characteristics are not invariant to rotation and texture changes. |
| [ | Bayesian classification | The Bayesian classification-based approach efficiently handles discrete and continuous data by ignoring inappropriate features in both binary and multi-class classification. | The Bayesian classifier is not suitable for unsupervised data classification, fails when the independence assumption among predictors does not hold, and is widely regarded as a poor probabilistic model. |
| [ | Decision Tree | Decision Tree-based models handle both continuous and discrete data, performing prediction through a rule-based approach, and have proven productive in managing non-linear parameters. | In Decision Tree models, a small change in the input data can produce an exponentially different outcome, making the model unstable. Overfitting is another issue associated with decision-tree-based models. |
| [ | Ensemble models | Ensemble models combine two or more robust algorithms and have proven to be better prediction models, efficiently analyzing both linear and complex data patterns. | Ensemble models suffer from overfitting and fail to cope with unknown discrepancies. Combining models also reduces the interpretability of the approach. |
| [ | Deep Neural Networks | Deep Neural Network-based models work with structured and unstructured data, can still operate on unlabeled data, and can yield a better outcome. | Models such as the Inception V3 model [ |
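The GLCM entry above, and the abstract's use of a grey-level co-occurrence matrix to assess diseased growth, can be illustrated with a minimal NumPy sketch. This assumes a single horizontal offset (0, 1) and a pre-quantized grey-level image; the function name and example patch are illustrative, not the paper's implementation:

```python
import numpy as np

def glcm(image, levels):
    """Grey-level co-occurrence matrix for the horizontal offset
    (0, 1): counts how often grey level i appears immediately to
    the left of grey level j."""
    m = np.zeros((levels, levels), dtype=np.int64)
    for i, j in zip(image[:, :-1].ravel(), image[:, 1:].ravel()):
        m[i, j] += 1
    return m

# Tiny 4-level example patch.
patch = np.array([[0, 0, 1],
                  [1, 2, 3],
                  [3, 2, 0]])
print(glcm(patch, 4))
```

Texture statistics such as contrast or homogeneity, which can track changes in a lesion's texture over time, are then computed from this matrix.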
Figure 1. The architecture of the MobileNet V2 model.
Figure 2. The architecture of the LSTM component.
Figure 3. The architecture of the proposed model with MobileNet V2 and LSTM.
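The LSTM component in Figure 2 is what lets the proposed model maintain stateful information across steps. A minimal single-cell sketch in NumPy (random weights and illustrative dimensions, not the paper's trained layer):

```python
import numpy as np

def lstm_step(x, h, c, W, U, b):
    """One LSTM step: input (i), forget (f), and output (o) gates
    plus a candidate (g) are computed from input x and previous
    hidden state h; cell state c carries long-term information."""
    z = W @ x + U @ h + b              # stacked pre-activations, shape (4*n,)
    n = h.size
    i = 1 / (1 + np.exp(-z[:n]))       # input gate
    f = 1 / (1 + np.exp(-z[n:2*n]))    # forget gate
    o = 1 / (1 + np.exp(-z[2*n:3*n]))  # output gate
    g = np.tanh(z[3*n:])               # candidate values
    c = f * c + i * g                  # updated cell state
    h = o * np.tanh(c)                 # new hidden state
    return h, c

rng = np.random.default_rng(0)
d, n = 8, 4                            # input and hidden sizes (illustrative)
W = rng.normal(size=(4 * n, d))
U = rng.normal(size=(4 * n, n))
b = np.zeros(4 * n)
h, c = np.zeros(n), np.zeros(n)
for x in rng.normal(size=(5, d)):      # unroll over 5 feature vectors
    h, c = lstm_step(x, h, c, W, U, b)
print(h.shape)
```

In the proposed architecture, the per-image feature vectors produced by MobileNet V2 would play the role of the inputs `x`.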
The configuration information of the proposed model.
| Implementation Configuration Parameters |
|---|
| Model: Torch Vision, Mobilenet-V2 |
| Base learning rate: 0.1 |
| Learning rate policy: Step-Wise (Reduced by a factor of 10 every 30/3 epochs) |
| Momentum: 0.95 |
| Weight decay: 0.0001 |
| Cycle Length: 10 |
| PCT-Start: 0.9 |
| Batch size: 50 |
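Under the configuration above, the step-wise policy divides the base learning rate of 0.1 by a factor of 10 on a fixed epoch schedule. A small sketch of that policy (the table's "30/3" notation is ambiguous, so the drop interval is left as a parameter, shown here with a 30-epoch step):

```python
def stepwise_lr(epoch, base_lr=0.1, factor=10, step=30):
    """Learning rate after `epoch` epochs under a step-wise policy:
    divided by `factor` every `step` epochs."""
    return base_lr / factor ** (epoch // step)

for e in (0, 29, 30, 60):
    print(e, stepwise_lr(e))
```

The remaining parameters in the table (momentum 0.95, weight decay 0.0001, batch size 50) would be passed to the optimizer and data loader in the usual way.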
Figure 4. Images of the various image classes from the HAM10000 dataset: (A) Melanocytic Nevi, (B) Benign Keratosis-like Lesions, (C) Dermatofibroma, (D) Vascular Lesions, (E) Actinic Keratoses and Intraepithelial Carcinoma, (F) Basal Cell Carcinoma, (G) Melanoma, and (H) normal skin.
Figure 5. Classification confidence and resultant output images with regular training.
Figure 6. Resultant outcomes after optimizing the learning rate.
Figure 7. Classification confidence and resultant output images of the final model.
Figure 8. The training accuracy, validation accuracy, and learning rate of the final model.
The performance metrics of the various approaches.
| Algorithms | Sensitivity | Specificity | Accuracy | JSI | MCC |
|---|---|---|---|---|---|
| HARIS [ | 78.21 | 83.00 | 77.00 | 83.01 | 77.00 |
| FTNN [ | 79.54 | 84.00 | 79.00 | 84.00 | 79.00 |
| CNN [ | 80.41 | 85.00 | 80.00 | 85.16 | 80.00 |
| VGG19 [ | 82.46 | 87.00 | 81.00 | 86.71 | 81.00 |
| MobileNet V1 [ | 84.04 | 89.00 | 82.00 | 88.21 | 83.00 |
| MobileNet V2 [ | 86.41 | 90.00 | 84.00 | 89.95 | 84.00 |
| MobileNet V2-LSTM | 88.24 | 92.00 | 85.34 | 91.07 | 86.00 |
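All of the metrics in the table above can be derived from a binary confusion matrix. A sketch of the standard definitions (sensitivity = TP/(TP+FN), specificity = TN/(TN+FP), Jaccard similarity index = TP/(TP+FP+FN), and the usual Matthews correlation coefficient; the counts are illustrative, not the paper's):

```python
import math

def metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, accuracy, Jaccard similarity index
    (JSI), and Matthews correlation coefficient (MCC) from a binary
    confusion matrix."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    acc = (tp + tn) / (tp + fp + tn + fn)
    jsi = tp / (tp + fp + fn)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return sens, spec, acc, jsi, mcc

print(metrics(85, 8, 92, 15))  # illustrative counts
```

For the multi-class HAM10000 setting, these would typically be computed per class in a one-vs-rest fashion and then averaged.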
Figure 9. The performance of the MobileNet V2-LSTM model.
Figure 10. The comparative analysis of the MobileNet V2-LSTM model.
The performance of the various algorithms.
| Algorithm | Sensitivity | Specificity | Accuracy |
|---|---|---|---|
| LICU [ | 81.0 | 97.0 | 91.2 |
| SegNet [ | 80.1 | 95.4 | 91.6 |
| U-Net [ | 67.2 | 97.2 | 90.1 |
| Yuan (CDNN) [ | 82.5 | 96.8 | 91.8 |
| DT&RF [ | 87.7 | 99.0 | 97.3 |
| MobileNet V2-LSTM | 92.24 | 95.1 | 90.21 |
The progress of the disease growth.
| Algorithm | Disease Core (DC) | Whole Disease Area (WD) | Enhanced Disease (ED) | Confidence (Mean Value) |
|---|---|---|---|---|
| HARIS [ | 8.854 | 12.475 | 3.621 | 0.92 |
| FTNN [ | 8.903 | 12.522 | 3.619 | 0.91 |
| CNN [ | 8.894 | 12.498 | 3.604 | 0.89 |
| MobileNet V2-LSTM | 8.912 | 12.546 | 3.633 | 0.93 |
Figure 11. The progress of the disease growth.
The training accuracy, validation accuracy, and learning rate.
| Algorithm | Training Accuracy | Validation Accuracy | Learning Rate |
|---|---|---|---|
| VGG16 [ | 83.39 | 81.89 | 2.88 |
| AlexNet [ | 96.89 | 95.78 | 3.47 |
| MobileNet [ | 97.64 | 96.32 | 3.98 |
| ResNet-50 [ | 98.73 | 94.23 | 3.75 |
| MobileNet V2-LSTM | 93.89 | 90.72 | 4.20 |
Figure 12. The hyperparameters of the proposed model.
The execution times of the various approaches.
| Algorithm | Execution Time (s) |
|---|---|
| CNN [ | 151.23 |
| VGG19 [ | 128.51 |
| MobileNet V1 [ | 126.98 |
| MobileNet V2 [ | 105.92 |
| MobileNet V2-LSTM | 101.87 |
Figure 13. The execution time of MobileNet V2 with LSTM and other approaches.
Figure 14. The framework of the proposed mobile application: the doctor module, the user module, a REST API for database connectivity, and the database.
Figure 15. The mobile framework incorporating MobileNet V2 with LSTM.
Figure 16. The interfaces of the mobile application for gathering the user's data and presenting the prediction result.