Validation of a Unilateral Cleft Lip Surgical Outcomes Evaluation Scale for Surgeons and Laypersons.

Alex Campbell¹, Carolina Restrepo¹, Gaurav Deshpande¹, Caroline Tredway¹, Sarah M Bernstein¹, Rachel Patzer¹, Lisa Wendby¹, Bjorn Schonmeyr¹.

Abstract

BACKGROUND: A standardized evaluation tool is needed for the assessment of surgical outcomes in cleft lip surgery. Current scales for evaluating unilateral cleft lip/nose (UCL/N) aesthetic outcomes are limited in their reliability, ease of use, and application. The Unilateral Cleft Lip Surgical Outcomes Evaluation (UCL SOE) scale measures symmetry of 4 components and sums these for a total score. The purpose of this study was to validate the SOE as a reliable tool for use by both surgeons and laypersons.
METHODS: Twenty-one participants (9 surgeons and 12 laypeople) used the SOE to evaluate 25 sets of randomly selected presurgical and postsurgical standardized photographs of UCL/N patients. Interrater reliability for surgeons and laypeople was determined using an intraclass correlation coefficient (ICC).
RESULTS: Individual surgeons and laypeople both reached an ICC in the "fair to good" range (ICC = 0.42 and 0.59, respectively). Averaging 2 evaluators improved the ICC to 0.58 in the surgeon group and to 0.74 in the laypeople group. Averaging 3 evaluators increased the ICC for surgeons to the "good" range (ICC = 0.71) and the ICC for laypeople to the "very good" range (ICC = 0.82).
CONCLUSIONS: Surgeon and layperson raters can reliably use the SOE to assess the aesthetic results after surgical repair of UCL/N, and improved reliability and reproducibility are achieved by averaging the scores of multiple reviewers.

Year: 2017 PMID: 29062644 PMCID: PMC5640349 DOI: 10.1097/GOX.0000000000001472

Source DB: PubMed Journal: Plast Reconstr Surg Glob Open ISSN: 2169-7574

INTRODUCTION

The goal of multidisciplinary cleft care is complete rehabilitation and normalization of a child. The objective of cleft lip repair is to minimize all stigmata of the cleft-related deformity and to free the patient from the psychological implications of impaired facial appearance. Many techniques and protocols exist in cleft surgery, and there is a central need for institutions, hospitals, and organizations involved in cleft care to systematically measure and compare outcomes to guide best practices. A central element of the overall outcome after cleft lip repair is the aesthetic appearance of the lip and nose. Previous studies assessing outcomes in cleft patients have focused on satisfaction with treatment, dental arch relationship, and speech, but evaluation of facial appearance has received limited attention.[1-7] The majority of studies have focused on assessing and differentiating among different surgical techniques or assessing the treatment outcome among different centers.[8-11] Assessment of the appearance of the cleft-related deformity and the impact of surgical treatment is a critical component of the quality of life outcome for this patient pool.[12]

Researchers have created and tested scales, but proposed methods vary widely in their study design, patient population, treatment stage, type of cleft, evaluators’ education, and familiarity with the patients and techniques.[13-16] The limited availability of reliable, valid, and meaningful ways of measuring aesthetic results inhibits our ability to determine the best strategies for treatment.[17] To make any accurate determination of the efficacy of the numerous treatment methods currently used, we need a uniform and simple assessment tool that provides objective evaluations of the results of treatment.[18] Al-Omari et al.[19] undertook a review of grading systems of the appearance of cleft deformities, and the review concluded that, because existing study designs vary extensively, “an internationally agreed objective method of assessment … is required”.

Subjective or qualitative assessments using standardized photographs analyzed by a panel of judges have been shown by Asher-McDade et al.[20] to provide valid, reliable, and reproducible ratings of cleft patients. The Asher-McDade scale uses a standardized method to evaluate 4 nasolabial components (nasal form, nose symmetry, nasal profile, and vermilion border) and grades each component on a 5-point scale. It introduced numerous advances in the evaluation of surgical outcomes. These include a validated scale based on subjective evaluation of standardized 2D images, summing scores of the individual components for a final evaluation score, and improving reliability by averaging scores of multiple reviewers. However, numerous weaknesses limit its relevance for measuring the aesthetic outcome after a primary cleft lip repair at the scale necessary to provide comprehensive outcomes evaluations in global cleft care. The scale is designed to evaluate patients in late childhood/early adolescence after years of multidisciplinary treatment, rather than focusing on the aesthetic results of a single surgical intervention. The scale looks at 3 separate components of outcomes on the nose and does not place significant emphasis on critical elements of the lip repair such as the philtrum, white roll, lateral lip, and balance of Cupid’s bow. For “very good” reliability, individual patients must each be scored by multiple calibrated expert raters, which is inconvenient and impractical.

The Asher-McDade scale has been utilized in the Eurocleft, Americleft, Scandcleft, and CSAG studies, with each patient evaluated by multiple experts in cleft care and interrater reliability in the “moderate to good” range (Tables 1, 2). Very little was learned about aesthetic outcomes in general, and no insights were offered regarding merits and pitfalls of various surgical techniques.[21-24] Authors in the Americleft study summarized that “ideally, more standardized and objective assessment methods will be developed to improve the accuracy and reliability of evaluations of nasolabial aesthetic outcomes.”

Table 1.

Interpretation of the ICC

Table 2.

Summary of Various Studies Using Asher-McDade Methodology

Responding to the need for improved outcomes measures, a group of surgeons at Operation Smile experienced in cleft care participated in the development of the Unilateral Cleft Lip Surgical Outcomes Evaluation (UCL SOE) scale (Figs. 1, 2). The UCL SOE scores symmetry of 4 individual anthropomorphic components of the cleft repair (Cupid’s bow, lateral lip, nose, and free vermillion). Each element is scored on a 3-point scale: 2 (excellent), 1 (mild asymmetry), 0 (unsatisfactory). The scores of the 4 individual components are then summed for a total score of 0 (lowest) to 8 (highest). The purpose of this study was to validate the UCL SOE as a reliable tool for use by both surgeons and laypersons.

Fig. 1.

The UCL SOE scores symmetry of 4 individual anthropomorphic components of the cleft repair (Cupid’s bow, lateral lip, nose, and free vermillion).

Fig. 2.

Each element is scored on a 3-point scale: 2 (excellent), 1 (mild asymmetry), 0 (unsatisfactory). The scores of the 4 individual components are then summed for a total score of 0 (lowest) to 8 (highest).
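The scoring rule described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic (4 components, each scored 0, 1, or 2, summed to a total of 0 to 8); the class name, field names, and validation behavior are hypothetical and are not part of the published scale.

```python
from dataclasses import dataclass

# Labels for the 3-point scale as given in the text.
COMPONENT_LABELS = {2: "excellent", 1: "mild asymmetry", 0: "unsatisfactory"}


@dataclass(frozen=True)
class UclSoeRating:
    """One evaluator's rating of one case (hypothetical container)."""
    cupids_bow: int
    lateral_lip: int
    nose: int
    free_vermillion: int

    def __post_init__(self):
        # Reject anything outside the 0/1/2 scale.
        for name, value in vars(self).items():
            if value not in COMPONENT_LABELS:
                raise ValueError(f"{name} must be 0, 1, or 2, got {value!r}")

    def total(self) -> int:
        """Sum of the 4 component scores: 0 (lowest) to 8 (highest)."""
        return self.cupids_bow + self.lateral_lip + self.nose + self.free_vermillion


rating = UclSoeRating(cupids_bow=2, lateral_lip=1, nose=2, free_vermillion=1)
print(rating.total())  # 6
```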

MATERIALS AND METHODS

All study subjects were patients admitted for primary cheiloplasty at the Guwahati Comprehensive Cleft Care Center in Assam, India, between 2011 and 2014. All patients’ parents or guardians signed an informed consent form allowing the use of their medical records and photographs for research. All required forms and signatures were submitted to the institutional review board, and ethical approval was granted. Patients ranged in age from 6 to 24 months, and exclusion criteria included prior surgeries, known congenital syndromes, or other craniofacial abnormalities. From this group, 25 sets of standardized frontal and basal photographs taken with a Nikon digital single-lens reflex camera were randomly selected and deidentified. Photographs were cropped to minimize the portion of the face or body not affected by the cleft. All photographs were formatted to be of uniform length and width.

Two sets of evaluators were recruited to participate in the study. The first set consisted of 9 experienced cleft surgeons. All surgeons hold current plastic surgery board certification and international credentialing with Operation Smile and have extensive experience in cleft lip and palate repair. The second set consisted of 12 laypersons recruited from nonsurgical staff at the Guwahati Comprehensive Cleft Care Center and medical students at Emory University. Demographic information of the evaluators can be found in Table 3.

Table 3.

Demographic Information of the Surgeon and Layperson Evaluators

Written and video instructions on how to use the UCL SOE were provided to all evaluators, and a 1-hour teaching session was provided to describe the scale and perform practice cases. Evaluators were provided with frontal and basal photographs for each of the 25 cases. On-table, immediate postoperative photographs were utilized. Evaluators recorded their impressions of each of the 4 components of the scale in the provided table adjacent to the pictures. The evaluations were then e-mailed back to the researchers for analysis.

Interexaminer reliability of the scale was calculated using the intraclass correlation coefficient (ICC). The ICC was calculated for the ratings of individual surgeons, for the ratings of individual laypeople, and for the ratings of all individual evaluators combined. The ICC was also calculated for the ratings of averages of 2 and 3 randomly grouped surgeons and for the ratings of averages of 2 and 3 randomly grouped laymen. For the ICCs for averages of 2, random sets of 2 were averaged, and then the ICC of these sets of 2 was calculated; this was done 4 times to ensure there was little variation. The same method was used for calculating the ICC for averages of 3. Means of the scores and their components were calculated for both groups of surgeons and laypeople. Differences in average scoring pattern were assessed using a paired t-test. Regressions were calculated for average surgeon versus average layman score (total and each of the 4 components) for each of the 25 pictures. Distributions of the total scores and each of the 4 components were calculated for surgeons and laymen. To compare the distributions between the 2 groups, a correction was made to account for the differing numbers of surgeons and laymen completing the evaluations. The surgeons’ distributions were multiplied by the number of laymen[13] divided by the number of surgeons.[10]
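The reliability procedure described above can be illustrated with a minimal Python sketch. The text does not state which form of the ICC was used, so the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1), is an assumption here. The function names, the handling of leftover raters, and the synthetic data are likewise hypothetical and are not study data.

```python
import random
from statistics import fmean


def icc_2_1(ratings):
    """ICC(2,1) for a list of rows: one row per case, one column per rater.

    The choice of ICC(2,1) is an assumption; the source does not name the form.
    """
    n, k = len(ratings), len(ratings[0])
    grand = fmean(v for row in ratings for v in row)
    row_means = [fmean(row) for row in ratings]
    col_means = [fmean(row[j] for row in ratings) for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)  # between cases
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)  # between raters
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_err = ss_total - ss_rows - ss_cols                   # residual
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def icc_of_group_means(ratings, group_size, rng):
    """Average random, non-overlapping groups of raters, then compute the ICC
    with each group mean treated as one rater. Raters left over after
    grouping are dropped, which is an assumption of this sketch."""
    k = len(ratings[0])
    order = list(range(k))
    rng.shuffle(order)
    n_groups = k // group_size
    groups = [order[i * group_size:(i + 1) * group_size] for i in range(n_groups)]
    averaged = [[fmean(row[j] for j in g) for g in groups] for row in ratings]
    return icc_2_1(averaged)


# Synthetic total scores (0 to 8) for 25 cases and 9 raters; not study data.
rng = random.Random(0)
demo = []
for _ in range(25):
    true_total = rng.randint(0, 8)
    demo.append([min(8, max(0, true_total + rng.randint(-2, 2))) for _ in range(9)])

print(f"single rater: {icc_2_1(demo):.2f}")
# Repeat each random grouping 4 times, as the Methods describe.
print("averages of 2:", [round(icc_of_group_means(demo, 2, rng), 2) for _ in range(4)])
print("averages of 3:", [round(icc_of_group_means(demo, 3, rng), 2) for _ in range(4)])
```

Repeating the grouping with different random draws, as in the last two lines, gives a quick check of how much the averaged ICC depends on which raters happen to be grouped together.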

RESULTS

The ICCs for individual and averaged ratings in each group were calculated and are listed in Table 4.

Table 4.

ICCs for Individual and Averaged Ratings in Each Group

Statisticians consider an ICC of greater than 0.40 to be a “moderate” correlation for health care research, an ICC above 0.60 to be a “good” correlation, and an ICC above 0.80 to be a “very good” correlation.[25,26] Individual surgeons and laymen both reached an ICC in the “moderate” range for the total score (ICC = 0.42 and 0.59, respectively). ICCs were lower among both surgeons and laymen for individual components. ICCs of all the evaluators combined did not decrease below the ICCs of either group, which suggests that the laymen and the surgeons exhibited similar patterns of scoring. Averaging 2 evaluators improved the ICC for the total score to 0.58 in the surgeon group and to 0.74 in the laypeople group. Averaging 2 evaluators in both the surgeon and the layman groups also produced ICCs in the “moderate to good” range for individual components of scoring (ICCs = 0.50–0.74). Averaging 3 evaluators increased the ICC for the total score judged by surgeons to the “good” range (ICC = 0.71) and the ICC for laypeople to the “very good” range (ICC = 0.82). Averaging 3 evaluators in both the surgeon and the layman groups produced ICCs in the “moderate to very good” range in all individual components of scoring (ICCs = 0.44–0.82). The plot of the average surgeon versus the average layman total score for each picture, along with the regression for this plot, is found in Figure 3. The R² for the regression is 0.86, showing a strong correlation between surgeon and layman scoring patterns. The slope (X variable coefficient) of 0.82 and the intercept of 1.48 suggest that as the outcome score increases, the surgeons and the laymen show more agreement.
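As a rough cross-check of the averaged ICCs reported above, the Spearman-Brown formula predicts the reliability of the mean of k raters from the single-rater reliability: ICC_k = k·ICC_1 / (1 + (k − 1)·ICC_1). This is a standard psychometric formula and not the method used in the study, which averaged randomly grouped raters empirically; the sketch below only shows that the reported values are close to what the formula would predict. The function name is hypothetical.

```python
def spearman_brown(single_rater_icc: float, k: int) -> float:
    """Predicted reliability of the mean of k raters (Spearman-Brown)."""
    return k * single_rater_icc / (1 + (k - 1) * single_rater_icc)


# Single-rater and averaged ICCs for the total score, as reported in the text.
reported = {
    "surgeons": (0.42, {2: 0.58, 3: 0.71}),
    "laypeople": (0.59, {2: 0.74, 3: 0.82}),
}

for group, (icc_single, averaged) in reported.items():
    for k, observed in averaged.items():
        predicted = spearman_brown(icc_single, k)
        print(f"{group}, k={k}: predicted {predicted:.2f}, reported {observed:.2f}")
# surgeons, k=2: predicted 0.59, reported 0.58
# surgeons, k=3: predicted 0.68, reported 0.71
# laypeople, k=2: predicted 0.74, reported 0.74
# laypeople, k=3: predicted 0.81, reported 0.82
```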

Fig. 3.

The plot of the average surgeon versus the average layman total score for each picture, along with the regression. An R² of 0.86 shows a strong correlation between surgeon and layman scoring patterns.

Plots were also created of the average surgeon versus the average layman component scores, along with the regressions for those plots (see figure, Supplemental Digital Content 1, which displays plots and regressions of the average surgeon versus the average layman total score for Cupid’s bow, lateral lip, nose, and free vermillion, http://links.lww.com/PRSGO/A525). The R² values for the Cupid’s bow, lateral lip, and nose regressions show fairly strong correlation between surgeon and layman scoring patterns (R² = 0.80, 0.72, and 0.79, respectively), but the free vermillion regression shows a poorer correlation (R² = 0.56). As with the total score, all regressions show an increase in surgeon and layman score agreement as the outcome score increases (X variables = 0.64–0.82; intercepts = 0.33–0.66). The distributions of the total scores for each group of evaluators (Fig. 4) and of each of the 4 components for each group of evaluators also demonstrate strong correlation between surgeon and layman ratings (see figure, Supplemental Digital Content 2, which displays distributions of Cupid’s bow, lateral lip, nose, and free vermillion for each group of evaluators, http://links.lww.com/PRSGO/A526).
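The statement that agreement improves as the outcome score increases can be made concrete with the regression reported for the total score (slope 0.82, intercept 1.48). The sketch below assumes the fitted line has the form y = 0.82x + 1.48, with x the average score of one group and y the predicted average score of the other; the text does not say which group is on which axis, so the roles of x and y, and the function name, are assumptions.

```python
def predicted_gap(x: float, slope: float = 0.82, intercept: float = 1.48) -> float:
    """Difference between the fitted line and perfect agreement (y = x)."""
    return slope * x + intercept - x


for score in (0, 2, 4, 6, 8):
    print(f"total score {score}: predicted gap {predicted_gap(score):.2f}")
# total score 0: predicted gap 1.48
# total score 2: predicted gap 1.12
# total score 4: predicted gap 0.76
# total score 6: predicted gap 0.40
# total score 8: predicted gap 0.04
```

Under these assumptions the fitted line meets the line of perfect agreement at about 1.48 / 0.18 ≈ 8.2, just above the maximum total of 8, so the predicted gap between the two groups shrinks steadily as the score rises.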

Fig. 4.

Distributions of the total scores for each group of evaluators.
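The correction applied before comparing the two groups' distributions (multiplying the surgeons' counts by the number of laymen divided by the number of surgeons) can be sketched as follows. The function and parameter names are hypothetical, the rater counts are left as parameters rather than fixed, and the example scores are made up for illustration.

```python
from collections import Counter


def rescaled_counts(scores, n_own_raters, n_other_raters):
    """Count how often each total score (0 to 8) occurs, then scale the counts
    by n_other_raters / n_own_raters so that two groups with different numbers
    of raters can be compared on the same footing."""
    counts = Counter(scores)
    factor = n_other_raters / n_own_raters
    return {score: counts.get(score, 0) * factor for score in range(0, 9)}


# Made-up surgeon totals; 9 surgeons and 12 laypersons as named in the Methods.
surgeon_totals = [8, 8, 6, 5, 8, 7]
print(rescaled_counts(surgeon_totals, n_own_raters=9, n_other_raters=12))
```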

DISCUSSION

Oral clefts are among the most widely known and common craniofacial anomalies, yet there is currently no widely utilized assessment tool to evaluate the aesthetic result after primary UCL repair. Thus, there is relatively little evidence on aesthetic results after UCL repair to guide best techniques and protocols. Many techniques and protocols exist in cleft surgery, and there is a central need for institutions, hospitals, and organizations involved in cleft care to systematically measure and compare outcomes to guide best practices. The UCL SOE builds on the concepts and work of prior authors to quickly, efficiently, and consistently score the aesthetic result of a primary UCL repair.[27,28] The intent for the UCL SOE is to be an intuitive, easy to use, and reliable outcomes evaluation tool for use by experts and laypersons. In structuring the scale, researchers utilized numerous concepts and components that had been validated through prior studies. Precise measures that accurately capture aesthetic results after surgical treatment for cleft lip are an immediate necessity. Without these, accurate comparison among surgeons, techniques, presurgical interventions, and treatment protocols cannot be made. Long-term assessments that include 3-dimensional data from computed tomographic scans, animated recordings from videos, and clinical examination by experts undoubtedly allow superior discrimination in assessing results. However, these are expensive, labor intensive, and require significant patient participation. Furthermore, the vast majority of cleft surgery takes place in developing regions, with charities treating cleft lip and cleft palate performing more than 150,000 operations per year. Standardized photographs are currently the most commonly used medium for outcomes analysis worldwide. They are easy to use, widely available as part of the medical record, and reproducible. 
Numerous studies have successfully used 2-dimensional photographs for assessment of the cleft deformity.[20,29-31] Operation Smile utilizes specialized patient imaging technicians to capture and organize standardized photographs of all patients at each stage of intervention. Complete sets of patient photographs (frontal and basal) were available for all patients in this study, and views were consistent among patients being evaluated. Most studies evaluating facial appearance in cleft patients have involved panels of health care professionals with experience in cleft care. Relatively few studies have used laypersons as evaluators, and even fewer have compared the evaluations of laypersons with those of surgeons. Those that have done so have yielded inconsistent results, with some studies indicating that laypeople and surgeons agree on grading and others suggesting varying degrees of disagreement between expert and nonexpert reviewers.[30,32,33] The potential to use laypersons as evaluators holds great appeal, as surgeons and other medical professionals with expertise in cleft care are relatively limited in their numbers, time, and willingness to perform large numbers of evaluations. A recent study by Tse et al.[34] demonstrated the power of crowdsourcing to complete massive numbers of layperson assessments on an unprecedented scale as a convenient, rapid, and reliable means of assessing the aesthetic outcome of treatment for UCL. The UCL SOE overcomes many shortcomings of prior scales and is useful for large-scale outcomes evaluations. This study demonstrates that surgeon and layperson raters can reliably use the UCL SOE to assess the aesthetic result of UCL/N repair. According to these ICC calculations, individual raters can assess the total aesthetic result with “moderate” reliability, and improved reliability and reproducibility are achieved by averaging the scores of multiple reviewers.
Results also demonstrate that surgical expertise is not necessary to use the UCL SOE reliably and that “very good” reliability (ICC = 0.82) is achieved when the scores of 3 layperson reviewers are pooled and averaged. Combined with the Unilateral Cleft Lip Cleft Severity Scale (UCL CSS), validated in a parallel study, the UCL SOE allows surgeons and laypersons to grade the preoperative severity of the UCL/N deformity and the final aesthetic result after primary surgical repair.[35] This has significant implications for the ability to conduct outcomes studies evaluating and comparing various surgeons, centers, techniques, and protocols. These tools have additional value for tracking patient results over time and for monitoring surgical development during training and practice. The ability to objectively measure UCL surgical outcomes will provide insight into the factors that contribute to differences in outcomes among patients. Future studies will be able to use the scale to investigate factors that may contribute to differences in surgical outcomes. Broad implementation of these tools will gather the data necessary to create a “bell curve” of expected aesthetic results after cleft lip repair, allowing significant advancements in quality assurance as well as identification of best practices that lead to superior results.

PATIENT CONSENT

Parents or guardians provided written consent for the use of the patients’ images.

31 in total

1. The intrajudge reliability of the perceptual rating of cleft palate speech before and after pharyngeal flap surgery: the effect of judges and speech samples.

Authors: K H Keuning; G H Wieneke; P H Dejonckere
Journal: Cleft Palate Craniofac J Date: 1999-07

2. A simple assessment method for auditing multi-centre unilateral cleft lip repairs.

Authors: J B Kim; P Strike; M C Cadier
Journal: J Plast Reconstr Aesthet Surg Date: 2010-05-15 Impact factor: 2.740

3. A new index for assessing surgical outcome in unilateral cleft lip and palate subjects aged five: reproducibility and validity.

Authors: N E Atack; I S Hathorn; G Semb; T Dowell; J R Sandy
Journal: Cleft Palate Craniofac J Date: 1997-05

4. The Goslon Yardstick: a new system of assessing dental arch relationships in children with unilateral clefts of the lip and palate.

Authors: M Mars; D A Plint; W J Houston; O Bergland; G Semb
Journal: Cleft Palate J Date: 1987-10

Review 5. Measuring quality of life in cleft lip and palate patients: currently available patient-reported outcomes measures.

Authors: Donna A Eckstein; Rebecca L Wu; Takintope Akinbiyi; Lester Silver; Peter J Taub
Journal: Plast Reconstr Surg Date: 2011-11 Impact factor: 4.730

6. The Clinical Standards Advisory Group (CSAG) Cleft Lip and Palate Study.

Authors: J Sandy; A Williams; S Mildinhall; T Murphy; D Bearn; B Shaw; D Sell; B Devlin; J Murray
Journal: Br J Orthod Date: 1998-02

7. Assessment of deformities of the lip and nose in cleft lip alveolus and palate patients by a rating scale.

Authors: B R Rajanikanth; Krishna Shama Rao; S M Sharma; B Rajendra Prasad
Journal: J Maxillofac Oral Surg Date: 2011-10-18

8. A six-center international study of treatment outcome in patients with clefts of the lip and palate: Part 4. Assessment of nasolabial appearance.

Authors: C Asher-McDade; V Brattström; E Dahl; J McWilliam; K Mølsted; D A Plint; B Prahl-Andersen; G Semb; W C Shaw; R P The
Journal: Cleft Palate Craniofac J Date: 1992-09

9. A panel based assessment of early versus no nasal correction of the cleft lip nose.

Authors: P D Cussons; M S Murison; A E Fernandez; R W Pigott
Journal: Br J Plast Surg Date: 1993-01

10. Assessment of the cleft nasal deformity using a regression equation.

Authors: Soo Chan Kim; Ki Chang Nam; Dong Kyun Rah; Eun Jong Cha; Deok Won Kim
Journal: Cleft Palate Craniofac J Date: 2008-07-30
