
Differential Diagnosis of Rosacea Using Machine Learning and Dermoscopy.

Lan Ge1, Yaoying Li1, Yaguang Wu1, Ziwei Fan2, Zhiqiang Song1.   

Abstract

Introduction: Rosacea is a common chronic inflammatory disease of the face whose diagnosis is based mainly on symptoms and physical signs. Because its symptoms and signs overlap with those of other inflammatory skin diseases, young and inexperienced doctors often misdiagnose or miss it in clinical practice. We analyzed the results of skin physiology testing and dermatoscopy with machine learning methods to identify the characteristics that differentiate rosacea from other common facial inflammatory skin diseases, with the aim of improving the accuracy of the clinical and differential diagnosis of rosacea.
Methods: A total of 495 patients who were jointly diagnosed by two experienced doctors were included. Basic data, clinical symptoms, physiological skin detection, and dermatoscopy results were collected, and the clinical characteristics of rosacea and other common facial inflammatory diseases were summarized according to the descriptive analysis results. The model was established using a machine learning method and compared with the judgment results of young and inexperienced doctors to verify whether the model can improve the accuracy of clinical diagnosis and differential diagnosis of rosacea.
Results: The proportions of yellow and red halos, vascular polygons, and follicular pustules shown by dermatoscopy, and the melanin index from physiological skin detection, differed significantly between rosacea and other common facial inflammatory diseases (all P < 0.01). Among the machine learning algorithms tested, the GBM (Gradient Boosting Machine) algorithm performed best, with an error rate of 5.48% in the validation set. In the final man-machine comparison, the accuracy of the GBM model for the classification of skin disease was significantly higher than that of young and inexperienced doctors.
Conclusion: Dermatoscopy combined with machine learning can effectively improve the diagnosis and differential diagnosis accuracy of rosacea and other facial inflammatory skin diseases.
© 2022 Ge et al.

Keywords:  dermatoscope; machine learning; rosacea

Year:  2022        PMID: 35935599      PMCID: PMC9354760          DOI: 10.2147/CCID.S373534

Source DB:  PubMed          Journal:  Clin Cosmet Investig Dermatol        ISSN: 1178-7015


Background

Rosacea is a chronic relapsing inflammatory disease that is commonly seen on the face of women aged 20–50. It mainly involves facial nerves and vessels as well as the sebaceous gland unit of the hair follicle. The main clinical manifestations of rosacea are intermittent flushing, persistent erythema, papules, pustules, and telangiectasia. Hypertrophy and eye changes have also been reported in a few patients. However, the clinical presentation of acne is similar to that of rosacea, with acne commonly appearing on the face and shoulders, and skin changes including intermittent flushing, papules, and pustules.1–3
The diagnosis of rosacea is clinically challenging. The pathogenesis of rosacea is still unclear, and relatively specific biological markers are lacking. Moreover, the clinical manifestations are diverse and can be induced or aggravated by many factors. Improper treatment, such as topical hormones, and coexisting diseases can complicate the diagnosis, so that rosacea overlaps with other inflammatory diseases that cause facial erythema.4 Although histopathology may assist in differential diagnosis, it should not be used as a routine diagnostic method because of facial cosmetic concerns.5,6 Some studies have shown that using dermatoscopy to detect changes in vascular polygons around the hair follicles can effectively help improve the diagnosis of rosacea.7–10
Over recent years, the application of machine learning in the medical field has developed rapidly. Combined with skin imaging data, machine learning has good prospects for application in the screening, diagnosis, and evaluation of skin diseases.11–15 The main idea of the GBM algorithm is to build each new base learner along the gradient descent direction of the loss function of the previously established base learners and to integrate these base learners, so that the overall loss function of the model decreases continuously and the model continuously improves.
In this study, we used machine learning techniques to analyze the results of skin physiological monitoring and dermatoscopy to find out relevant indicators that have relative specificity and sensitivity to the clinical diagnosis of rosacea in order to establish a mathematical model that can accurately identify rosacea and help clinical doctors to more quickly and accurately diagnose rosacea.

Materials and Methods

Research Object

A total of 495 participants, comprising patients with facial diseases and healthy controls, all jointly assessed by two experienced doctors, were included in this study. Of these, 450 were used for medical statistics and machine learning modeling: 350 patients diagnosed with facial diseases (150 with rosacea, 100 with acne, and 100 with facial dermatitis, of whom 30 had seborrheic dermatitis, 30 atopic dermatitis, and 40 contact dermatitis) and 100 normal controls. Machine learning modeling was performed with a training set to validation set ratio of 7:3. Another 45 patients (15 with rosacea, 15 with acne, and 15 with facial dermatitis, with 5 cases each of contact, atopic, and seborrheic dermatitis) were included to validate the model.

Skin Physiological Detection and Dermatoscopy

Dermatoscopy

After cleaning, the skin lesions were fully exposed and examined with the CBS-908 dermatoscope from Boshi (China). Pictures were taken first in polarized light mode and then in non-polarized light mode (50 times). At the same time, new structural patterns were observed.

Skin Physiological Detection

The MPA580 multi-probe skin tester made by CK from Germany was used. No skincare products were applied to the skin after faces were cleaned. Patients were required to stay indoors to rest for 30 minutes. The indoor environment had no direct sunlight, no windows, and no ventilation. Room temperature was controlled at 25–28 ℃ and relative humidity at 50–60%. Percutaneous water loss, cuticle moisture content, pH, lipid, and erythema value in the central forehead, left cheek, right cheek, and jaw of the patients were measured. The average value of the measurement of four facial parts was taken as the final value, and the values of the measurements were compared.

Clinical Data Collection

Basic data were collected, and each patient underwent skin physiology and dermatoscopy when they visited the doctor. Skin physiology examination included transepidermal water loss (TEWL), water content, elasticity, pH, melanin, erythema, lactate, and lipid, which were used as continuous variables. Dermatoscopy included the assessment of blood vessels, hair follicles, vellus, and scales, which were used as dichotomous variables (positive/negative).

Statistical Analysis

SPSS 26.0 software was used for statistical analysis. For continuous variables, Brown-Forsythe and Welch tests were used for homogeneity test of variance, and Games-Howell tests were adopted to compare differences between groups. For categorical variables, the Chi-square test was used to compare differences between groups. In the machine learning part, we used the H2O machine learning platform. H2O is a fully open-source, distributed in-memory machine learning platform with linear scalability. The platform supports the most widely used statistical and machine learning algorithms, including GLM (Generalized Linear Model), GBM (Gradient Boosting Machine), XGBoost (eXtreme Gradient Boosting), DeepLearning, StackEnsemble, GLRM (Generalized Low Rank Models), and more. We took 70% of the patients as the training set and 30% as the validation set, and used 5-fold cross-validation to validate the model. The results of skin physiology detection were input as continuous variables, and the results of dermoscopy were input as categorical variables. In view of the limited total sample size, a deep learning model would be prone to overfitting, so we used the AutoML method of the H2O platform and selected four algorithms (GLM, GBM, GLRM, and XGBoost) for modeling.16–18
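The 7:3 split and 5-fold cross-validation described above can be sketched in plain Python. This is only an illustration and not the H2O implementation: the function names, the fixed random seed, and the choice to run cross-validation inside the training portion are assumptions, and the split sizes it produces need not match those obtained by the authors.

```python
import random


def train_valid_split(n_samples, train_fraction=0.7, seed=0):
    """Shuffle sample indices and split them into training and validation lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


def k_fold_indices(indices, k=5):
    """Yield (fit, holdout) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]  # k roughly equal, disjoint folds
    for i in range(k):
        holdout = folds[i]
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield fit, holdout


# 450 modeling samples, as in the study; everything else here is hypothetical.
train_idx, valid_idx = train_valid_split(450)
print("training:", len(train_idx), "validation:", len(valid_idx))
for fit, holdout in k_fold_indices(train_idx, k=5):
    print("fit on", len(fit), "samples, hold out", len(holdout))
```

Each sample appears in exactly one holdout fold, so every training sample is scored once by a model that did not see it during fitting.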

Results

Skin Physiological Detection Result

Among the 450 participants used for modeling, 54 were males (12%) and 396 were females (88%), aged between 13 and 65 (Table 1). The Games-Howell test was used to compare pairwise differences between the rosacea group and the three other groups, namely the acne group, the dermatitis group, and the normal group. The skin physiological detection results revealed that most of the measurement indicators could not precisely differentiate patients with rosacea from those with other inflammatory facial diseases (Table 2). However, the mean melanin index was lower in the rosacea group than in the inflammatory disease groups, with significant differences between the rosacea group and the acne group as well as between the rosacea group and the dermatitis group (p < 0.05), while no significant difference was found between the rosacea group and the normal group (p = 0.411). We believed that the melanin index might be a potential indicator to differentiate rosacea from other facial inflammatory skin diseases.
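Table 1

Patients’ Characteristics

| Group | N | Mean Age | Male | Female |
|---|---|---|---|---|
| Rosacea | 150 | 33.8 | 6 | 144 |
| Acne | 100 | 24.0 | 10 | 90 |
| Dermatitis | 100 | 32.0 | 21 | 79 |
| Normal | 100 | 32.9 | 17 | 83 |
| Total | 450 | 31.0 | 54 | 396 |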
Table 2

Difference Between Rosacea and the Others (Skin Physiological Testing)

| Dependent Variable | (I) Group | (J) Group | Mean Difference (I-J) | Std. Error | Sig. | 95% CI Lower Bound | 95% CI Upper Bound |
|---|---|---|---|---|---|---|---|
| TEWL | Rosacea | Acne | −1.02167 | 0.83947 | 0.617 | −3.1955 | 1.1522 |
| | | Dermatitis | −8.57867* | 1.16688 | 0.000 | −11.6102 | −5.5471 |
| | | Normal | 5.37273* | 0.61081 | 0.000 | 3.7918 | 6.9536 |
| Water Content | Rosacea | Acne | 3.30167 | 1.40564 | 0.091 | −0.3393 | 6.9427 |
| | | Dermatitis | 14.18857* | 1.93523 | 0.000 | 9.1599 | 19.2172 |
| | | Normal | −2.05463 | 1.10151 | 0.246 | −4.9037 | 0.7945 |
| Elasticity | Rosacea | Acne | −1.88310 | 1.20672 | 0.404 | −5.0081 | 1.2419 |
| | | Dermatitis | 5.81910* | 1.30409 | 0.000 | 2.4393 | 9.1989 |
| | | Normal | −8.24770* | 1.04352 | 0.000 | −10.9472 | −5.5482 |
| pH | Rosacea | Acne | −0.00973 | 0.05962 | 0.998 | −0.1641 | 0.1446 |
| | | Dermatitis | −0.05575 | 0.05486 | 0.740 | −0.1977 | 0.0862 |
| | | Normal | 0.14057* | 0.05220 | 0.038 | 0.0055 | 0.2756 |
| Melanin Index | Rosacea | Acne | −32.97333* | 5.21301 | 0.000 | −46.4641 | −19.4826 |
| | | Dermatitis | −25.83333* | 6.32132 | 0.000 | −42.2213 | −9.4454 |
| | | Normal | −7.29333 | 4.71281 | 0.411 | −19.4835 | 4.8969 |
| Erythema Index | Rosacea | Acne | 43.72000* | 10.72458 | 0.000 | 15.9804 | 71.4596 |
| | | Dermatitis | 24.62000 | 12.53441 | 0.205 | −7.8222 | 57.0622 |
| | | Normal | 168.04000* | 9.49993 | 0.000 | 143.4458 | 192.6342 |
| Lactate | Rosacea | Acne | 4.28667 | 6.35319 | 0.907 | −12.1577 | 20.7311 |
| | | Dermatitis | −21.83333* | 7.14199 | 0.013 | −40.3392 | −3.3274 |
| | | Normal | 28.12667* | 5.08517 | 0.000 | 14.9707 | 41.2827 |
| Lipid | Rosacea | Acne | −15.26530* | 3.48958 | 0.000 | −24.2984 | −6.2322 |
| | | Dermatitis | 9.41300* | 3.21180 | 0.019 | 1.1037 | 17.7223 |
| | | Normal | −11.31200* | 2.89567 | 0.001 | −18.8019 | −3.8221 |

Note: *The mean difference is significant at the 0.05 level.


Dermatoscopy Result

The results of dermatoscopy were compared between groups using the Chi-square test (Table 3), which showed significant differences (p < 0.05) between the rosacea group and the other three groups in terms of vascular polygons as well as yellow and red halos around hair follicles and pustules, especially vascular polygons, whose positive rate in the rosacea group reached 100%, while the rate in the acne group, the dermatitis group, and the normal group was 8%, 4%, and 1%, respectively.
Table 3

Difference Between Rosacea and the Others (Dermoscopy)

| Feature | Result | Rosacea | Acne | Dermatitis | Normal | Total |
|---|---|---|---|---|---|---|
| Dotted Vessels | Negative | 150a | 5b | 35c | 100a | 290 |
| | Positive | 0a | 95b | 65c | 0a | 160 |
| Linear Vessels | Negative | 74a, b | 57b | 40a | 96c | 267 |
| | Positive | 76a, b | 43b | 60a | 4c | 183 |
| Vascular Polygons | Negative | 0a | 92b | 96b, c | 99c | 287 |
| | Positive | 150a | 8b | 4b, c | 1c | 163 |
| Large Vessels | Negative | 131a | 94a | 88a | 100b | 413 |
| | Positive | 19a | 6a | 12a | 0b | 37 |
| Branching Vessels | Negative | 137a | 98b | 86a | 100b | 421 |
| | Positive | 13a | 2b | 14a | 0b | 29 |
| Perifollicular Light Yellow Halos | Negative | 89a | 81b | 62a | 96c | 328 |
| | Positive | 61a | 19b | 38a | 4c | 122 |
| Perifollicular Yellowish Red Halos | Negative | 73a | 95b | 100c | 100c | 368 |
| | Positive | 77a | 5b | 0c | 0c | 82 |
| Follicular Plugs | Negative | 135a | 0b | 68c | 96a | 299 |
| | Positive | 15a | 100b | 32c | 4a | 151 |
| Follicular Pustules | Negative | 113a | 90b | 97c | 100c | 400 |
| | Positive | 37a | 10b | 3c | 0c | 50 |
| White Vellus | Negative | 54a | 33a | 72b | 96c | 255 |
| | Positive | 96a | 67a | 28b | 4c | 195 |
| Dense White Vellus | Negative | 133a | 97b | 93a, b | 98b | 421 |
| | Positive | 17a | 3b | 7a, b | 2b | 29 |
| Dense Black Vellus | Negative | 147a, b, c | 100c | 95b | 100a, c | 442 |
| | Positive | 3a, b, c | 0c | 5b | 0a, c | 8 |
| Thicken Black Vellus | Negative | 130a | 87a | 81a | 100b | 398 |
| | Positive | 20a | 13a | 19a | 0b | 52 |
| White Scales | Negative | 125a | 85a | 34b | 100c | 344 |
| | Positive | 25a | 15a | 66b | 0c | 106 |
| Yellow-white Scales | Negative | 150a | 94b | 81c | 100a | 425 |
| | Positive | 0a | 6b | 19c | 0a | 25 |

Notes: Each subscript letter denotes a subset of group categories whose column proportions do not differ significantly from each other at the 0.05 level.
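As a worked illustration of the Chi-square comparison, the sketch below computes the Pearson statistic for the vascular polygon counts in Table 3. It is a plain-Python approximation of the overall test only; the pairwise column-proportion comparisons behind the subscript letters are not reproduced, and the function name is hypothetical.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of row and column.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Vascular polygon counts from Table 3; columns: rosacea, acne, dermatitis, normal.
vascular_polygons = [
    [0, 92, 96, 99],  # negative
    [150, 8, 4, 1],   # positive
]
statistic, dof = chi_square_statistic(vascular_polygons)
CRITICAL_005_DF3 = 7.815  # chi-square critical value for alpha = 0.05 with 3 degrees of freedom
print(f"chi2 = {statistic:.1f}, dof = {dof}, significant = {statistic > CRITICAL_005_DF3}")

# Positive rate per group, matching the percentages quoted in the text.
positive_rates = [pos / (pos + neg) for neg, pos in zip(*vascular_polygons)]
print([f"{rate:.0%}" for rate in positive_rates])
```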

Machine Learning Modeling Result

First, we divided the data into training and validation sets at a ratio of 7:3 and carried out dichotomous (two-class) modeling of rosacea versus the other three groups. After inputting the results of skin physiology and dermatoscopy, we found that regardless of which algorithm was used, the AUC of the dichotomous model (a performance index for evaluating a dichotomous model; the closer to 1, the better) was above 0.99 (Table 4), so the model could differentiate rosacea from other facial skin diseases well. Among the input variables, vascular polygons had the highest weight, which was consistent with our previous medical statistical results. Since the results of the dichotomous model were satisfactory, we next tried a four-class model, that is, a model that identifies each of the four groups: rosacea, acne, facial dermatitis, and normal controls. In the four-class modeling, the Gradient Boosting Machine (GBM) algorithm had the lowest log loss16,20–22 and was therefore the highest-rated model (Table 5). The results for the training set and validation set are presented as confusion matrices. The error rate of this model was 0 in the training set and 5.48% in the validation set (Figure 1).
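To make the AUC index concrete, the following sketch computes it as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The labels and scores are invented for illustration and are not data from this study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical example: 1 = rosacea, 0 = other; scores are predicted probabilities of rosacea.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
print(f"AUC = {roc_auc(labels, scores):.3f}")
```

An AUC of 1 means every positive case is scored above every negative case, and 0.5 corresponds to random ranking.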
Table 4

Machine Learning Results (2-Class)

| Model_id | auc | Logloss | aucpr | Mean_Per_Class_Error | rmse | mse |
|---|---|---|---|---|---|---|
| GBM | 0.999665071 | 0.0445088442 | 0.999841939 | 0.00739234 | 0.109975 | 0.012094 |
| DRF | 0.998277511 | 0.0907498920 | 0.999215288 | 0.00956937 | 0.137097 | 0.018795 |
| XRT | 0.997440191 | 0.0940268715 | 0.998933925 | 0.00956937 | 0.140503 | 0.019741 |
| GLM | 0.994593301 | 0.0707419256 | 0.998084156 | 0.01717703 | 0.124012 | 0.0153791 |

Table 5

Machine Learning Results (4-Class)

| Model_id | Mean_Per_Class_Error | Logloss | rmse | mse |
|---|---|---|---|---|
| GBM | 0.043374741 | 0.141469056 | 0.194828814 | 0.037958267 |
| XRT | 0.04699793 | 0.250739964 | 0.258719995 | 0.066936036 |
| DRF | 0.054244306 | 0.250188913 | 0.261512948 | 0.068389022 |
| GLM | 0.75 | 1.3489618 | 0.738613167 | 0.545549411 |
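The log loss and mean per-class error columns in Tables 4 and 5 follow standard definitions, which can be sketched as below. This is a minimal illustration under the usual textbook formulas, not the H2O source code; the function names and the toy inputs are hypothetical.

```python
import math


def multiclass_log_loss(true_labels, predicted_probs, eps=1e-15):
    """Mean negative log of the probability assigned to the true class."""
    total = 0.0
    for label, probs in zip(true_labels, predicted_probs):
        p = min(max(probs[label], eps), 1.0 - eps)  # clip to avoid log(0)
        total -= math.log(p)
    return total / len(true_labels)


def mean_per_class_error(confusion):
    """Average misclassification rate per class; confusion[i][j] counts class i predicted as j."""
    errors = [1.0 - row[i] / sum(row) for i, row in enumerate(confusion)]
    return sum(errors) / len(errors)


# Toy 4-class baseline: every class gets probability 0.25 for every sample.
uniform_probs = [[0.25, 0.25, 0.25, 0.25]] * 4
print(f"log loss of uniform predictions: {multiclass_log_loss([0, 1, 2, 3], uniform_probs):.4f}")

# Toy confusion matrix where every sample is predicted as class 0.
always_class_0 = [[10, 0, 0, 0], [10, 0, 0, 0], [10, 0, 0, 0], [10, 0, 0, 0]]
print(f"mean per-class error: {mean_per_class_error(always_class_0):.2f}")
```

For reference, uniform four-class predictions give a log loss of ln 4 (about 1.386), and always predicting a single class gives a mean per-class error of 0.75; the GLM row of Table 5 lies close to these baseline values.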
Figure 1

The confusion matrix of the 4-class GBM model. (A) The training set. (B) The validation set.

The main idea of the GBM algorithm is to build each new learner along the gradient descent direction of the loss function of the previously established learners and to integrate these learners, so that the overall loss function of the model declines continuously and the model continuously improves. Our results showed that, based on the results of skin physiological detection and dermatoscopy, both the dichotomous and the four-class machine learning models could accurately carry out the differential diagnosis of rosacea. In particular, the four-class model could effectively differentiate rosacea, acne, and dermatitis patients, improving the accuracy of diagnosis among different facial skin diseases and helping clinicians make a diagnosis.
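The boosting idea described above can be sketched in a few lines. The example below is a deliberately simplified illustration, not the H2O GBM used in this study: it assumes a single numeric feature, squared-error loss (whose negative gradient is simply the residual), and one-split decision stumps as base learners; all names and the toy data are hypothetical.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression stump to the residuals by least squares."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # keep at least one point on each side
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((r - left_mean) ** 2 for r in left) + sum((r - right_mean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def fit_gbm(xs, ys, n_rounds=50, learning_rate=0.1):
    """Gradient boosting for squared error: each stump fits the current residuals."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    learners, losses = [], []
    for _ in range(n_rounds):
        # Negative gradient of 0.5 * (y - f)^2 with respect to f is the residual y - f.
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        learners.append(stump)
        predictions = [p + learning_rate * stump(x) for x, p in zip(xs, predictions)]
        losses.append(sum((y - p) ** 2 for y, p in zip(ys, predictions)) / len(ys))

    def predict(x):
        return base + learning_rate * sum(stump(x) for stump in learners)

    return predict, losses


# Toy data: the target switches from 0 to 1 when the feature exceeds 4.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
predict, losses = fit_gbm(xs, ys)
print(f"training loss: first round {losses[0]:.4f}, last round {losses[-1]:.6f}")
print(f"predict(2) = {predict(2):.3f}, predict(7) = {predict(7):.3f}")
```

The training loss shrinks with each added stump, which is the "continuously declining loss" behaviour the text refers to; a classifier such as the one in this study would use a classification loss (for example log loss) in place of squared error.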

Man-Machine Comparison Test

To confirm the practical value of the machine learning classification model in clinical practice, we collected another 45 patients (15 with rosacea, 15 with acne, and 15 with dermatitis) for GBM model prediction and invited three resident doctors to diagnose them. The overall accuracy of the GBM model prediction was 84.4% (38/45). By group, the accuracy was 93.3% (14/15) for rosacea, 73.3% (11/15) for acne, and 86.7% (13/15) for dermatitis. The accuracy was therefore over 90% in the rosacea group and over 80% in the dermatitis group, while it was somewhat lower in the acne group. The overall accuracy of the three doctors was 35.6% (16/45), 37.8% (17/45), and 37.8% (17/45), respectively. The detailed results of the comparison between the three doctors and machine learning are shown in Table 6. The accuracy of the machine learning model was thus considerably higher than that of young and inexperienced doctors. The results also indicate that traditional skin physiological detection and dermatoscopy combined with machine learning could effectively improve the efficiency of clinicians in the differential diagnosis of rosacea, which is conducive to the diagnosis and treatment of patients.
Table 6

Human-Machine Comparison Results (accuracy by actual diagnosis)

| Actual Diagnosis | Doctor A | Doctor B | Doctor C | GBM Model |
|---|---|---|---|---|
| Acne | 5/15 | 6/15 | 5/15 | 11/15 |
| Rosacea | 5/15 | 4/15 | 6/15 | 14/15 |
| Dermatitis | 5/15 | 7/15 | 6/15 | 13/15 |
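The overall accuracies can be recomputed from the per-group counts in Table 6, as in the short sketch below. The dictionary layout and function name are assumptions made for illustration. Note that Doctor A's per-group counts in Table 6 sum to 15/45, whereas the text reports 16/45; the sketch uses the table values as given.

```python
CASES_PER_GROUP = 15

# Correctly diagnosed cases per group, copied from Table 6.
correct = {
    "Doctor A": {"Acne": 5, "Rosacea": 5, "Dermatitis": 5},
    "Doctor B": {"Acne": 6, "Rosacea": 4, "Dermatitis": 7},
    "Doctor C": {"Acne": 5, "Rosacea": 6, "Dermatitis": 6},
    "GBM Model": {"Acne": 11, "Rosacea": 14, "Dermatitis": 13},
}


def overall_accuracy(per_group):
    """Total correct diagnoses divided by the total number of cases."""
    return sum(per_group.values()) / (CASES_PER_GROUP * len(per_group))


for rater, per_group in correct.items():
    total_correct = sum(per_group.values())
    total_cases = CASES_PER_GROUP * len(per_group)
    print(f"{rater}: {total_correct}/{total_cases} = {overall_accuracy(per_group):.1%}")
```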

Discussion

Over recent years, with the changes in modern lifestyle, especially the booming development of various skin cosmetic treatments, inflammatory facial skin diseases, which mainly manifest as facial flushing, papules, and pustules, have become increasingly frequent in clinical practices. The most common inflammatory facial skin diseases include rosacea, seborrheic dermatitis, contact dermatitis, atopic dermatitis, and acne. However, the diagnosis of skin diseases is mostly symptomatic, and the clinical manifestations of these inflammatory facial skin diseases in different periods are very similar and “overlapping”, so there are often misdiagnosis and mistreatment. In the present study, we used dermatoscopy and skin physiological detection to examine patients with common inflammatory facial diseases, including rosacea, manifested by erythema, papules, and pustules. Besides, we established a model through comprehensive analysis with machine learning method to detect the characteristics that can accurately differentiate rosacea from other common inflammatory facial skin diseases, aiming to improve the accuracy of clinical and differential diagnosis of rosacea. In the present study, we first described and analyzed the characteristics and forms of common facial skin diseases from the perspective of conventional medical statistics. 
We found that the melanin index in physiological skin detection and, in dermatoscopy, vascular polygons, yellow and red halos around hair follicles, and follicular pustules were all valuable indicators in the differential diagnosis of rosacea, which is consistent with a previous study.19 That study included 115 patients, including 25 rosacea patients, all of whom had positive dermoscopic results for vascular polygons.17 However, because some patients with other facial skin diseases are also positive for vascular polygons, it was necessary to find a more optimized method that builds on the vascular polygon results to improve the efficiency of our differential diagnosis. In our study, the positive rate of vascular polygons reached 100% in the rosacea group, while the rates in the acne group, the dermatitis group, and the normal group were 8%, 4%, and 1%, respectively; light yellow and yellowish red perifollicular halos were positive in 61 and 77 of the 150 rosacea patients, respectively (Table 3). Therefore, we believe that vascular polygons, as well as light yellow and yellowish red halos around hair follicles, could be used as indicators in the diagnosis and differential diagnosis of rosacea to distinguish rosacea patients from patients with other facial skin diseases. In the process of machine learning modeling, we found that log loss scores were relatively poor (logloss > 1) when only the results of physiological skin detection were included in the model, and reliable models could be obtained only after dermatoscopy results were also included. Therefore, dermatoscopy is indispensable for the differential diagnosis of rosacea.
In addition, by examining the factors with high weights in the dichotomous and four-class models, we found that, apart from vascular polygons, the indicators identified by conventional statistics (the melanin index, yellow and red halos, and follicular pustules) were not ranked very high, while keratotic plugs, erythema, punctate vessels, transepidermal water loss (TEWL), and water content ranked the highest, which is a very interesting phenomenon. TEWL and water content are two important indicators reflecting skin barrier function. Combined with clinical experience and the results of our analysis, there was no significant difference in these two indicators between the rosacea group and the acne group and between the rosacea group and the normal group. However, in the dermatitis group, TEWL was significantly increased, while water content was significantly decreased. Therefore, we believe that the skin barrier function of primary rosacea does not significantly differ from that of normal people, and there is no serious barrier damage as we previously thought. The damage to skin barrier function in patients with rosacea is significantly different from that of patients with atopic dermatitis, contact dermatitis, and seborrheic dermatitis. Based on this, it is reasonable to believe that if patients with rosacea also show abnormal physiological indicators, there might be other causes of barrier function damage, such as drugs for internal use (eg, glucocorticoids) or other diseases, such as rosacea associated with seborrheic dermatitis. The case of primary rosacea associated with other diseases is also worth exploring.
If we want to optimize the model in the future and make the included indicators more concise and precise, we can start from these factors with high weights. This would facilitate the application of the model in clinical practice to assist doctors in the differential diagnosis of various facial skin diseases and their complications so that patients can receive the correct treatment. Finally, the highlight of this study is that we collected the testing results of 45 cases and carried out the man-machine comparison study. Previous research related to skin diseases has mainly focused on machine learning for the differential diagnosis of skin cancer and the corresponding man-machine comparison,23 so our study adds to this field several man-machine comparison results for the differential diagnosis of inflammatory facial diseases, thus expanding the application of machine learning technology in skin diseases.

Conclusion

In conclusion, dermatoscopy combined with machine learning revealed better sensitivity and specificity for the diagnosis of rosacea and could effectively improve the diagnosis rate of inexperienced doctors for rosacea. Of course, due to the complexity of the current model, it is difficult to promote it in clinical practice, so our team plans to expand the sample size further so as to optimize the model and make it applicable in clinical practices in the future.
  21 in total

1.  An interpretation for the ROC curve and inference using GLM procedures.

Authors:  M S Pepe
Journal:  Biometrics       Date:  2000-06       Impact factor: 2.571

2.  Dermoscopy of early stage mycosis fungoides.

Authors:  A Lallas; Z Apalla; I Lefaki; T Tzellos; A Karatolias; E Sotiriou; E Lazaridou; D Ioannides; I Zalaudek; G Argenziano
Journal:  J Eur Acad Dermatol Venereol       Date:  2012-03-09       Impact factor: 6.166

Review 3.  Rosacea: I. Etiology, pathogenesis, and subtype classification.

Authors:  Glen H Crawford; Michelle T Pelle; William D James
Journal:  J Am Acad Dermatol       Date:  2004-09       Impact factor: 11.527

4.  Estimation of elimination half-lives of organic chemicals in humans using gradient boosting machine.

Authors:  Jing Lu; Dong Lu; Xiaochen Zhang; Yi Bi; Keguang Cheng; Mingyue Zheng; Xiaomin Luo
Journal:  Biochim Biophys Acta       Date:  2016-05-20

5.  Ros-NET: A deep convolutional neural network for automatic identification of rosacea lesions.

Authors:  Hamidullah Binol; Alisha Plotner; Jennifer Sopkovich; Benjamin Kaffenberger; Muhammad Khalid Khan Niazi; Metin N Gurcan
Journal:  Skin Res Technol       Date:  2019-12-17       Impact factor: 2.365

6.  Comparison of the accuracy of human readers versus machine-learning algorithms for pigmented skin lesion classification: an open, web-based, international, diagnostic study.

Authors:  Philipp Tschandl; Noel Codella; Bengü Nisa Akay; Giuseppe Argenziano; Ralph P Braun; Horacio Cabo; David Gutman; Allan Halpern; Brian Helba; Rainer Hofmann-Wellenhof; Aimilios Lallas; Jan Lapins; Caterina Longo; Josep Malvehy; Michael A Marchetti; Ashfaq Marghoob; Scott Menzies; Amanda Oakley; John Paoli; Susana Puig; Christoph Rinner; Cliff Rosendahl; Alon Scope; Christoph Sinz; H Peter Soyer; Luc Thomas; Iris Zalaudek; Harald Kittler
Journal:  Lancet Oncol       Date:  2019-06-12       Impact factor: 41.316

7.  A Gradient Boosting Machine for Hierarchically Clustered Data.

Authors:  Patrick J Miller; Daniel B McArtor; Gitta H Lubke
Journal:  Multivariate Behav Res       Date:  2017-01-18       Impact factor: 5.923

8.  Data augmentation in dermatology image recognition using machine learning.

Authors:  St Lt Pushkar Aggarwal
Journal:  Skin Res Technol       Date:  2019-05-29       Impact factor: 2.365

Review 9.  Sarcoidosis: a comprehensive review and update for the dermatologist: part I. Cutaneous disease.

Authors:  Adele Haimovic; Miguel Sanchez; Marc A Judson; Stephen Prystowsky
Journal:  J Am Acad Dermatol       Date:  2012-05       Impact factor: 11.527

Review 10.  Dermoscopy in general dermatology.

Authors:  Iris Zalaudek; Giuseppe Argenziano; Alessandro Di Stefani; Gerardo Ferrara; Ashfaq A Marghoob; Rainer Hofmann-Wellenhof; H Peter Soyer; Ralph Braun; Helmut Kerl
Journal:  Dermatology       Date:  2006       Impact factor: 5.366
