
Artificial Intelligence Confirming Treatment Success: The Role of Gender- and Age-Specific Scales in Performance Evaluation.

Anastasia Georgievskaya

Abstract

SUMMARY: In plastic surgery and cosmetic dermatology, photographic data are an invaluable element of research and clinical practice. Additionally, the use of before and after images is a standard documentation method for procedures, and these images are particularly useful in consultations for effective communication with the patient. Artificial intelligence (AI)-based approaches have produced significant results in medical dermatology, plastic surgery, and antiaging procedures in recent years, with applications ranging from skin cancer screening and 3D face reconstruction to the prediction of biological and perceived age. The increasing use of AI and computer vision methods is due to their noninvasive nature and their potential to provide remote diagnostics. This is especially helpful in instances where traveling to a physical office is complicated, as we have experienced in recent years with the global coronavirus pandemic. However, one question remains: how should the results of AI-based analysis be presented to enable personalization? In this paper, the author investigates the benefit of using gender- and age-specific scales to present skin parameter scores calculated using AI-based systems when analyzing image data.
Copyright © 2022 The Author. Published by Wolters Kluwer Health, Inc. on behalf of the American Society of Plastic Surgeons. All rights reserved.


Year:  2021        PMID: 36170434      PMCID: PMC9512241          DOI: 10.1097/PRS.0000000000009671

Source DB:  PubMed          Journal:  Plast Reconstr Surg        ISSN: 0032-1052            Impact factor:   5.169


The importance of the visual component of cosmetic products’ effects and aesthetic treatments goes beyond protocoling. Advances in artificial intelligence (AI) and machine learning are reshaping the future of the aesthetics and antiaging domains.[1-11] Such digital innovations are recognized within cosmetic dermatology and are used to complement traditional skin evaluation methods. Our appearance has a significant impact on our self-perception and on how we are perceived by others,[12] both of which contribute significantly to our life satisfaction, overall well-being,[13] and confidence.[14] Clinical assessment, sensorial data, biophysical evaluation, and imaging methods[15] are used as criteria to assess skin condition and measure the effect of interventions. In the aesthetic field, expert-level grading is quite well established, but it still uses a limited number of grading categories and can inherit the biases of the expert.[16,17] In practice, it is challenging to arrange patient evaluation by sensorial and biophysical measurements, since these require physical attendance at a dedicated research center; the same applies to laboratory photography. However, as the smartphone has become an increasingly versatile computing device and an advanced imaging system in our pocket,[18] we can anticipate a rapid rise in the large-scale delivery of computer vision-based selfie diagnostics to patients (Table 1).
Table 1.

Comparison of Automation Potential and Application of AI Algorithms with Regard to Different Skin Analysis Methods

Group of Methods | Data Type | Automation Potential with AI Methods | Potential to Deliver Tech at Scale to End Users | Limitations
Sensors | Time series | Medium | Medium | Results might be hard to reproduce[19]
Biophysical | Time series | Medium | Low | Requires special equipment; datasets have a limited number of data points
Clinical evaluation | Classes of severity | Medium | High | Expert grading might be biased[17,20]
Imaging | Images | High | High | Requires data standardization[21]
Although selfie photographs are an essential attribute of social media, it remains difficult to capture strictly standardized images.[16] Data quality is crucial to the performance of computer vision algorithms; therefore, dedicated image-quality control methods should be applied to achieve an adequate level of accuracy in the analysis.[22] As an example, in Haut.AI’s research group, we use our in-house LIQA (live image-quality assurance) software, which analyzes image quality live, as the image is being taken, and provides instant feedback to the user.

Once a standardized, high-quality image has been captured and processed with computer vision methods, the models produce predictions. The nature of these predictions, or scores, varies with the type of algorithm. Generally, the main tasks in biomedical imaging are object detection, segmentation of the region of interest, classification, and ranking.[23,24] For object detection, the output can be binary (object detected or not detected) or quantitative (a count of objects, eg, pigmentation spots). For segmentation, the algorithm provides a mask of the target object, whereas in classification it assigns one or more classes to an image. A ranking model orders the images based on a partial ranking obtained from an annotation.

The results of computer vision-based systems can be presented as absolute values. Although such scores are very useful for statistical research, they are not necessarily easy for the doctor or the patient to interpret. To organize and format the data so that scores from different subjects or different conditions can be compared, a normalization technique must be applied. Normalization allows data to be presented across a specific range, for example, between 0 and 100.
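The min-max normalization described here can be sketched as follows. This is an illustrative snippet only: the score values are hypothetical, and the actual Haut.AI implementation is not public.

```python
import numpy as np

def normalize_to_range(scores, lo=0.0, hi=100.0):
    """Min-max normalize absolute scores onto a relative [lo, hi] scale,
    using the extreme values observed in the reference dataset.
    Assumes at least two distinct values in `scores`."""
    scores = np.asarray(scores, dtype=float)
    s_min, s_max = scores.min(), scores.max()
    return lo + (scores - s_min) / (s_max - s_min) * (hi - lo)

# Hypothetical absolute scores from a reference dataset
absolute = [3.2, 7.8, 5.1, 9.4, 4.0]
relative = normalize_to_range(absolute)  # values now span 0 to 100
```

Because the scale is anchored to the extremes observed in the dataset, the normalization can simply be rerun whenever new high-quality image data are added to the reference collection.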
Normalization transforms the scores’ absolute values into relative values. The absolute scores are calculated on a diverse dataset of relatively large size and can be normalized to relative values based on the extremes of the value distribution in the dataset. In this way, results can be compared with the general population represented in the dataset, and the normalization can be repeated as new high-quality image data are collected.

AI-based face analysis technology should aid both the doctor’s and the patient’s understanding of the patient’s skin and face condition, acting as an assistant for the doctor and an educator for the patient. In a study of the perception of using AI to interpret radiology data,[25] the main benefits to patients were found to be actionable information, a second opinion, and preparation for the consultation. Patients’ perception and acceptance of AI technology for interpreting their photographic data will largely depend on how comprehensive, useful, and personalized it is.

The differences in skin properties between genders are well known.[26,27] In this work, the author analyzes the distribution of skin parameter scores in the Beauty.AI dataset, an anonymized collection of skin data derived from selfies. The aim of this study is to analyze the difference in the distribution of skin parameters in men and women of different ages and to explore whether the same scoring scales can be used for different age and gender groups.

METHODS

Image Processing

Image-quality analysis and standardization were performed by Haut.AI’s Image Quality software.[28] First, the system checked the position of the face and the head’s rotation angle on the x (yaw), y (pitch), and z (roll) axes, based on more than 400 predicted facial keypoints. Then, the system standardized the face image by aligning the center of the face, calculated from the geometry of the facial features. Next, the following image-quality metrics were analyzed: face illumination, background illumination, presence of shadows and glares on the face, presence of visual noise, image resolution, and the resolution of the bounding box containing only the face area. Finally, the system eliminated all pixels not related to skin, both to anonymize the image and to remove artifacts that could affect the calculation of skin parameters by the computer vision algorithms.

To calculate the skin parameters for the face, we used Haut.AI’s Skin Metrics Report software.[28] This computer vision-based software was trained on a total of three million data points, including synthetic and real image data. It uses neural network algorithms and computer vision methods, and it calculates 136 face and skin attributes. The following skin parameters were selected for analysis in this study: Sagging, Dark Circles, Eye Bags, Wrinkles, Pores, Uniformness, Acne, Pigmentation, Redness, and Translucency.
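A simplified quality gate in the spirit of the pipeline described above might look like the sketch below. All thresholds and the function itself are hypothetical, since the actual acceptance criteria of the Image Quality software are not published.

```python
# Hypothetical acceptance thresholds (degrees / pixels / 8-bit brightness);
# the real limits used by the software are not public.
MAX_YAW, MAX_PITCH, MAX_ROLL = 15.0, 15.0, 10.0
MIN_FACE_BOX = 256  # minimum face bounding-box side, in pixels

def passes_quality_gate(yaw, pitch, roll, face_box_side, mean_face_brightness):
    """Return True if a selfie meets pose and quality criteria analogous
    to those described in the text (illustrative thresholds only)."""
    if abs(yaw) > MAX_YAW or abs(pitch) > MAX_PITCH or abs(roll) > MAX_ROLL:
        return False  # head rotated too far on the x/y/z axes
    if face_box_side < MIN_FACE_BOX:
        return False  # face-region resolution too low
    if not (60 <= mean_face_brightness <= 200):
        return False  # under- or over-exposed face illumination
    return True
```

In a real pipeline, each rejection branch would feed back to the user as live guidance (for example, "turn your face toward the camera" or "move to better lighting") rather than a silent failure.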

Dataset

The author used the private Beauty.AI dataset,[29] which constitutes a collection of fully anonymized skin data derived from selfie pictures. Beauty.AI is a research project established in 2015 with the objective to develop algorithms for evaluating human appearance. Participants submitted their photos using Beauty.AI’s project mobile application and consented to the processing of those images for the evaluation of the algorithms. As no protected health information was accessed and all photographs were fully anonymized, institutional review board approval was not required. The Beauty.AI dataset includes 17,700 selfies captured with smartphone cameras. From this dataset, the images of 433 subjects whose selfie photographs had sufficient image quality were selected. The images contained key facial landmarks, adequate illumination and head position, no face occlusion, no distortion and no image noise. The inclusion criteria were set to filter out photographs that lacked at least one quality requirement. The individual typology angle[30] of subjects in the dataset ranged from −42° to 60°. The subjects were 18–67 years of age, with a mean age of 29.2 years. The age information was provided by the subjects. The dataset included 244 female and 189 male subjects.

Results Visualization

To visualize the results observed in the dataset, the boxplot function from the matplotlib[31] plotting library for the Python[32] programming language was used.

Statistical Analysis

To compare the skin metrics for different subject groups, the Mann–Whitney U test[32,33] from the scipy[34] analytics library for the Python programming language was used.
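The group comparison can be reproduced on synthetic data with scipy’s `mannwhitneyu`; the score distributions below are hypothetical stand-ins with the same group sizes as the study, not the actual data.

```python
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(42)
# Hypothetical skin-parameter scores for 244 women and 189 men
scores_f = rng.normal(90, 5, 244)
scores_m = rng.normal(78, 6, 189)

# Two-sided Mann-Whitney U test comparing the two distributions
stat, p_value = mannwhitneyu(scores_f, scores_m, alternative="two-sided")
significant = p_value < 0.05
```

The Mann-Whitney U test is a sensible choice here because it is rank-based and does not assume the scores are normally distributed, which bounded 0-100 scales generally are not.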

RESULTS

All participants were divided into six groups: women 18–30 years old (154 subjects), men 18–30 years old (123 subjects), women 30–45 years old (72 subjects), men 30–45 years old (45 subjects), women 45 years old and older (18 subjects), and men 45 years old and older (21 subjects). The gender classes were balanced, whereas the age classes were biased toward younger ages. Haut.AI’s Skin Metrics Report software provides numeric output for the skin parameters on a scale of 0 to 100, where 0 is the worst score observed in the dataset and 100 is the best. The observed distribution is presented in Figure, Supplemental Digital Content 1, which shows the distribution of the facial skin parameters in the different age and gender groups. Color references for the groups: light pink for women 18–30 years old, light blue for men 18–30 years old, magenta for women 30–45 years old, blue for men 30–45 years old, fuchsia for women 45 years old and older, and dark blue for men 45 years old and older. The yellow line plot on the graphs connects the median values of each criterion for every age group, http://links.lww.com/PRS/F373.
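The six-way split described above can be expressed as a small helper function. Note that the treatment of the boundary ages (exactly 30 or 45 years) is an assumption, as the paper does not specify how ties are assigned; half-open intervals are used here.

```python
def demographic_group(gender, age):
    """Assign a subject to one of the six gender/age groups used in the study.
    Boundary handling (half-open intervals) is an assumption."""
    if age < 18:
        raise ValueError("subjects in the study were 18 or older")
    if age < 30:
        band = "18-30"
    elif age < 45:
        band = "30-45"
    else:
        band = "45+"
    return f"{gender} {band}"

# Example assignments
g1 = demographic_group("Women", 25)  # young female group
g2 = demographic_group("Men", 50)    # oldest male group
```

Grouping subjects this way before normalization is what makes per-group scales possible: each subject’s score is compared only against minima and maxima from the same band.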

Gender Differences

The summarized results are presented in Table 2. First, we compared all female subjects with all male subjects. The difference in the distributions was significant for all parameters except Dark Circles and Acne. The Sagging, Wrinkles, Uniformness, Pores, Pigmentation, and Redness scores were higher in women than in men, whereas the Eye Bags and Translucency scores were higher in men.
Table 2.

Mann–Whitney U Test Results for Face Parameter Scores in Different Gender Groups of the Same Age*

Parameter | Median Diff. F−M, All Ages | p | Median Diff. F−M, <30 y | p | Median Diff. F−M, 30–45 y | p | Median Diff. F−M, 45+ y | p
Sagging | 18.5 | <0.001 | 15.0 | <0.001 | 18.0 | <0.001 | 30.0 | <0.001
Dark circles | 0.0 | 0.11 | 1.0 | 0.31 | −1.0 | 0.25 | 6.5 | 0.08
Eye bags | −5.0 | <0.001 | −4.0 | <0.001 | −3.5 | 0.02 | −2.5 | 0.09
Wrinkles | 2.0 | <0.001 | 1.0 | <0.001 | 4.5 | 0.05 | 4.0 | 0.49
Pores | 12.0 | <0.001 | 12.0 | <0.001 | 14.0 | <0.001 | 17.0 | 0.04
Uniformness | 10.0 | <0.001 | 10.0 | <0.001 | 5.0 | 0.07 | 20.0 | 0.01
Acne | 0.0 | 0.34 | 1.5 | 0.12 | −2.0 | 0.03 | −2.0 | 0.05
Pigmentation | 1.0 | 0.02 | 1.0 | 0.01 | 0.0 | 0.35 | 0.0 | 0.21
Redness | 3.0 | 0.01 | 5.0 | 0.02 | 1.0 | 0.33 | 5.5 | 0.19
Translucency | −4.0 | <0.001 | −3.5 | 0.02 | −2.0 | 0.03 | −1.5 | 0.27

*p values below 0.05 indicate statistical significance.

The same pattern was observed when comparing the 18- to 30-year-old female and male groups. In the 30- to 45-year-old groups, the Sagging, Wrinkles, and Pores scores were higher for women than for men, whereas the Eye Bags, Acne, and Translucency scores were higher for men; for the other parameters, the difference was not statistically significant. In the groups older than 45 years, the difference in scores was statistically significant only for the Sagging, Pores, Uniformness, and Acne parameters; for women, the Sagging, Pores, and Uniformness scores were noticeably higher.

Age Groups Difference for Women

The results for differences in the parameter scores were analyzed for the different female (Table 3) and male (Table 4) age groups. We observed a statistically significant difference in the scores for the Sagging, Eye Bags, Wrinkles, Pores and Uniformness parameters in the group of women 18−30 years old when compared with the group of women 30−45 years old.
Table 3.

Mann−Whitney U Test Results for Face Parameter Scores for Women of Different Age Groups*

Parameter | Median, F All | Median, F <30 | Median, F 30−45 | Median, F 45+ | p, <30 vs 30−45 | p, <30 vs 45+ | p, 30−45 vs 45+
Sagging | 77 | 79 | 71 | 68 | 0.03 | 0.14 | 0.34
Dark circles | 61 | 62 | 60 | 63.5 | 0.13 | 0.47 | 0.31
Eye bags | 57 | 60 | 53.5 | 50.5 | <0.001 | <0.001 | 0.05
Wrinkles | 97 | 97 | 95.5 | 76 | <0.001 | <0.001 | <0.001
Pores | 90 | 92 | 89 | 85 | 0.01 | 0.03 | 0.34
Uniformness | 55 | 55 | 50 | 45 | 0.01 | 0.01 | 0.13
Acne | 97 | 97 | 96 | 98 | 0.39 | 0.05 | 0.03
Pigmentation | 97 | 97 | 96 | 95 | 0.07 | 0.01 | 0.06
Redness | 83 | 85 | 82 | 79.5 | 0.09 | 0.03 | 0.12
Translucency | 36 | 36 | 35 | 41.5 | 0.26 | 0.06 | 0.02

*p values below 0.05 indicate statistical significance.

Table 4.

Mann−Whitney U Test Results for Face Parameter Scores in Men of Different Age Groups*

Parameter | Median, M All | Median, M <30 | Median, M 30−45 | Median, M 45+ | p, <30 vs 30−45 | p, <30 vs 45+ | p, 30−45 vs 45+
Sagging | 59 | 64 | 53 | 38 | <0.001 | <0.001 | 0.37
Dark circles | 61 | 61 | 61 | 57 | 0.12 | 0.02 | 0.17
Eye bags | 62 | 64 | 57 | 53 | 0.01 | <0.001 | 0.11
Wrinkles | 95 | 96 | 91 | 72 | 0.01 | <0.001 | 0.02
Pores | 78 | 80 | 75 | 68 | 0.02 | 0.01 | 0.19
Uniformness | 45 | 45 | 45 | 25 | 0.32 | <0.001 | <0.001
Acne | 97 | 95.5 | 98 | 100 | 0.01 | <0.001 | 0.01
Pigmentation | 96 | 96 | 96 | 95 | 0.46 | 0.31 | 0.35
Redness | 80 | 80 | 81 | 74 | 0.44 | 0.03 | 0.04
Translucency | 40 | 39.5 | 37 | 43 | 0.48 | 0.14 | 0.15

*p values below 0.05 indicate statistical significance.

A statistically significant difference was observed for the same parameters when comparing the 18- to 30-year-old group with the group older than 45 years, and additionally for the Pigmentation and Redness parameters. When comparing the 30- to 45-year-old female group with the group older than 45 years, a statistically significant difference was observed for the Eye Bags, Wrinkles, Acne, and Translucency parameters. In all cases where statistically significant differences were observed, the parameter values declined with age.

Age Groups Difference for Men

A statistically significant difference was observed in the group of men 18−30 years old when compared with the group 30−45 years old for the Sagging, Eye Bags, Wrinkles, Pores, and Acne parameters. When comparing the group of men 18–30 years old with the group of men older than 45 years old, a statistically significant difference was observed for the same parameters mentioned above, as well as for Dark Circles, Uniformness and Redness. When comparing the 30- to 45-year-old group with the group older than 45 years old, a statistically significant difference was identified for the Wrinkles, Uniformness, Acne, and Redness parameters. For most of the skin parameters, the scores declined with age when statistically significant results were observed.

DISCUSSION

Gender-specific scales for selected skin parameters have been discussed and suggested by several researchers in the past.[26,35] Moreover, based on the author’s previously published work,[10,11] the accuracy of AI systems, including those for skin analysis, can benefit from population-specific algorithms and biomarkers, which has also been confirmed by other research groups.[36,37]

Of all the skin parameters reviewed in this study, the least variability for both genders was observed for Dark Circles, Eye Bags, Acne, Pigmentation, and Redness. This finding suggests that the same scale can be used for men and women of different ages to compare the results of different individuals. Subjects from different age groups tend to display considerable variation in their Sagging, Wrinkles, Pores, Uniformness, Redness, and Translucency features. These observations indicate that the variability between individuals is large within different age and gender groups, and they suggest that the individual’s age and gender group should be taken into consideration when developing scales. In practice, this means that the absolute scores for skin parameters would be normalized to the minimum and maximum scores observed in each demographic group.

Facial aging is characterized by changes in the face shape and in the texture and color of the skin.[38,39] Uniformness reflects how flawless the skin is and is represented by a combination of color and texture. Change in the Sagging parameter reflects changes in the skin’s elasticity, fat, muscle tissues, and the facial skeleton.[40] A deterioration in the Pores parameter can indicate a loss of skin elasticity. The Pearson correlation coefficients between chronological age and the Uniformness, Sagging, and Pores parameters were −0.42, −0.40, and −0.35, respectively; the correlation between chronological age and the Wrinkles score was −0.56.
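Age correlations of this kind can be computed with `scipy.stats.pearsonr`. The data below are synthetic stand-ins constructed to decline with age, not the study’s dataset.

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(1)
# Hypothetical data: 433 subjects aged 18-67, with a score that declines with age
age = rng.uniform(18, 67, 433)
score = 70 - 0.4 * age + rng.normal(0, 8, 433)

# Pearson correlation between chronological age and the score
r, p = pearsonr(age, score)  # r is clearly negative for this construction
```

A negative r, as reported for Uniformness, Sagging, Pores, and Wrinkles, simply means that higher chronological age is associated with lower (worse) scores on the 0-100 scale.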
It is notable that the difference in the median value between the 30- to 45-year-old group and the group 45 years and older was greatest for the Wrinkles parameter in both the male and female groups. For most of the skin parameter criteria used in this study, we observed significant variability among subjects. This variability between individuals can be explained by the heterogeneity of the studied population: the sample covers people of different races, geographies, and climatic conditions.[41,42] A genetic predisposition in the pace of aging[43,44] and differences in lifestyle among individuals[45,46] are also worth emphasizing, and some previous studies suggest that people who look younger than their chronological age may also have better overall health.[47]

Computer vision methods can become a powerful tool for measuring the effectiveness of topically administered treatments and surgical interventions. Combined with mobile delivery, image-based face and skin analysis will allow remote patient screening and monitoring. The main value for the patient lies in the second opinion, remote prescreening, and a greater understanding of the skin’s current condition. The benefits for the doctor are the time saved by providing remote consultations, a preliminary report on the patient’s current skin profile, and being better informed when selecting the appropriate procedure. Selfie-based mobile image analysis is a cost-efficient tool, since it does not require any expensive equipment, and software requires a lower level of investment than hardware. Another benefit is that the analysis can be performed in a secure cloud environment, with results retrieved within 2 to 3 seconds. The accessibility of selfies allows a larger number of data points to be collected, providing more insight into the skin’s dynamics.
The implementation of selfie-based analysis as a new instrument will require an investment in patient education. First of all, the criteria for which images are appropriate for analysis should be clearly communicated to the patient. In this work, the author did not address the development of Fitzpatrick skin type-specific and ethnicity-specific groups,[48-51] but plans to extend the research to these areas. Gender- and age-specific scales allow patients’ scores to be compared with their reference groups. As a result, realistic expectations can be set about the skin results achievable within a patient’s demographic group. Such realistic goals allow positive reinforcement and inspire patient optimism, both of which play important roles in the treatment process.[52]
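As a sketch of how gender- and age-specific scales work in practice, the snippet below normalizes a single absolute score against the minimum and maximum of its own demographic reference group. All values are invented for illustration; the study’s per-group extremes are not published.

```python
def normalize_within_group(score, group_scores):
    """Map an absolute score onto a 0-100 scale relative to the minimum
    and maximum observed within the subject's own gender/age group."""
    lo, hi = min(group_scores), max(group_scores)
    return 100.0 * (score - lo) / (hi - lo)

# Hypothetical reference distributions for one parameter in two groups
ref = {
    "Women 18-30": [60, 72, 85, 97],
    "Men 45+": [25, 40, 55, 72],
}

# The same absolute score reads very differently against each reference group
rel_young_f = normalize_within_group(72, ref["Women 18-30"])
rel_older_m = normalize_within_group(72, ref["Men 45+"])
```

This is the practical payoff of demographic-specific scales: an absolute score that is merely average among young women can be the best observed value among older men, which is exactly the comparison against a relevant reference group that the text argues supports realistic expectation-setting.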
References (33 in total; the first 10 are shown below)

Review 1.  Artificial Intelligence in Dermatology-Where We Are and the Way to the Future: A Review.

Authors:  Daniel T Hogarty; John C Su; Kevin Phan; Mohamed Attia; Mohammed Hossny; Saeid Nahavandi; Patricia Lenane; Fergal J Moloney; Anousha Yazdabadi
Journal:  Am J Clin Dermatol       Date:  2019-07-05       Impact factor: 7.403

Review 2.  The skin aging exposome.

Authors:  Jean Krutmann; Anne Bouloc; Gabrielle Sore; Bruno A Bernard; Thierry Passeron
Journal:  J Dermatol Sci       Date:  2016-09-28       Impact factor: 4.563

Review 3.  Climate and skin function: an overview.

Authors:  Babu Singh; Howard Maibach
Journal:  Skin Res Technol       Date:  2013-03-25       Impact factor: 2.365

4.  Evaluating age-related changes of some facial signs among men of four different ethnic groups.

Authors:  F Flament; D Amar; C Feltin; R Bazin
Journal:  Int J Cosmet Sci       Date:  2018-10       Impact factor: 2.970

Review 5.  Changes in the Facial Skeleton With Aging: Implications and Clinical Applications in Facial Rejuvenation.

Authors:  Bryan Mendelson; Chin-Ho Wong
Journal:  Aesthetic Plast Surg       Date:  2012-05-12       Impact factor: 2.326

6.  Measurement of hydration in the stratum corneum with the MoistureMeter and comparison with the Corneometer.

Authors:  Esko Alanen; Jouni Nuutinen; Kirsi Nicklén; Tapani Lahtinen; Jukka Mönkkönen
Journal:  Skin Res Technol       Date:  2004-02       Impact factor: 2.365

7.  Influence of facial skin attributes on the perceived age of Caucasian women.

Authors:  A Nkengne; C Bertin; G N Stamatas; A Giron; A Rossi; N Issachar; B Fertil
Journal:  J Eur Acad Dermatol Venereol       Date:  2008-06-06       Impact factor: 6.166

Review 8.  Psychology of the Facelift Patient.

Authors:  David Sarcu; Peter Adamson
Journal:  Facial Plast Surg       Date:  2017-06-01       Impact factor: 1.446

9.  Genetic polymorphisms and skin aging: the identification of population genotypic groups holds potential for personalized treatments.

Authors:  Jordi Naval; Vicente Alonso; Miquel Angel Herranz
Journal:  Clin Cosmet Investig Dermatol       Date:  2014-07-01

10.  Population Specific Biomarkers of Human Aging: A Big Data Study Using South Korean, Canadian, and Eastern European Patient Populations.

Authors:  Polina Mamoshina; Kirill Kochetov; Evgeny Putin; Franco Cortese; Alexander Aliper; Won-Suk Lee; Sung-Min Ahn; Lee Uhn; Neil Skjodt; Olga Kovalchuk; Morten Scheibye-Knudsen; Alex Zhavoronkov
Journal:  J Gerontol A Biol Sci Med Sci       Date:  2018-10-08       Impact factor: 6.053

