Alexandra E Rojek (1), Raman Khanna (2), Joanne W L Yim (3), Rebekah Gardner (4), Sarah Lisker (1,5), Karen E Hauer (1), Catherine Lucey (1), Urmimala Sarkar (6,7).
1. University of California, San Francisco School of Medicine, San Francisco, CA, USA.
2. Division of Hospital Medicine, University of California, San Francisco, School of Medicine, San Francisco, CA, USA.
3. Health Informatics, UCSF Health, University of California, San Francisco, San Francisco, CA, USA.
4. Warren Alpert Medical School of Brown University, Providence, RI, USA.
5. UCSF Center for Vulnerable Populations, San Francisco, CA, USA.
6. University of California, San Francisco School of Medicine, San Francisco, CA, USA. urmimala.sarkar@ucsf.edu.
7. UCSF Center for Vulnerable Populations, San Francisco, CA, USA. urmimala.sarkar@ucsf.edu.
Abstract
BACKGROUND: In varied educational settings, narrative evaluations have revealed systematic and deleterious differences in language describing women and those underrepresented in their fields. In medicine, limited qualitative studies show differences in narrative language by gender and under-represented minority (URM) status.
OBJECTIVE: To identify and enumerate text descriptors in a database of medical student evaluations using natural language processing, and identify differences by gender and URM status in descriptions.
DESIGN: An observational study of core clerkship evaluations of third-year medical students, including data on student gender, URM status, clerkship grade, and specialty.
PARTICIPANTS: A total of 87,922 clerkship evaluations from core clinical rotations at two medical schools in different geographic areas.
MAIN MEASURES: We employed natural language processing to identify differences in the text of evaluations for women compared to men and for URM compared to non-URM students.
KEY RESULTS: We found that of the ten most common words, such as "energetic" and "dependable," none differed by gender or URM status. Of the 37 words that differed by gender, 62% represented personal attributes, such as "lovely" appearing more frequently in evaluations of women (p < 0.001), while 19% represented competency-related behaviors, such as "scientific" appearing more frequently in evaluations of men (p < 0.001). Of the 53 words that differed by URM status, 30% represented personal attributes, such as "pleasant" appearing more frequently in evaluations of URM students (p < 0.001), and 28% represented competency-related behaviors, such as "knowledgeable" appearing more frequently in evaluations of non-URM students (p < 0.001).
CONCLUSIONS: Many words and phrases reflected students' personal attributes rather than competency-related behaviors, suggesting a gap in implementing competency-based evaluation of students. We observed a significant difference in narrative evaluations associated with gender and URM status, even among students receiving the same grade. This finding raises concern for implicit bias in narrative evaluation, consistent with prior studies, and suggests opportunities for improvement.
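The abstract does not state which statistical test or text-processing pipeline produced the reported p-values. As an illustration only, a word-level comparison of this kind could be sketched as below, assuming a simple regex tokenizer, a per-evaluation "contains the word" count, and a Pearson chi-square test on the resulting 2x2 table. The function names and the toy sentences are hypothetical and are not taken from the study.

```python
import math
import re
from collections import Counter


def doc_frequency(evaluations):
    """Count how many evaluations contain each word at least once."""
    counts = Counter()
    for text in evaluations:
        # Assumption: lowercase alphabetic tokens; a set() so each
        # evaluation contributes at most 1 to a word's count.
        counts.update(set(re.findall(r"[a-z]+", text.lower())))
    return counts


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic and p-value (1 df) for [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0, 1.0  # degenerate table: no evidence of a difference
    chi2 = n * (a * d - b * c) ** 2 / denom
    # With 1 degree of freedom, the upper tail is erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


def compare_word(word, group_a, group_b):
    """Test whether `word` appears in a different share of two groups' evaluations."""
    fa = doc_frequency(group_a)[word]
    fb = doc_frequency(group_b)[word]
    return chi_square_2x2(fa, len(group_a) - fa, fb, len(group_b) - fb)


# Toy, invented evaluations (NOT study data), for illustration only.
group_a = ["A lovely student", "Lovely and kind"]
group_b = ["Very knowledgeable", "Scientific approach"]
chi2, p = compare_word("lovely", group_a, group_b)
```

In a real analysis, a correction for testing many words at once and stratification by grade would also need to be considered; neither is shown here.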
Keywords:
medical education; medical education—assessment/evaluation; medical student and residency education