Peter Yeates, Marc Moreau, Kevin Eva. P. Yeates is clinical lecturer in medical education, Centre for Respiratory Medicine and Allergy, Institute of Inflammation and Repair, University of Manchester, and specialist registrar, Respiratory and General Internal Medicine, Health Education North West, Manchester, United Kingdom. M. Moreau is assistant dean for admissions, Faculty of Medicine and Dentistry, and professor, Division of Orthopaedic Surgery, University of Alberta, Edmonton, Alberta, Canada. K. Eva is senior scientist, Centre for Health Education Scholarship, and professor and director of educational research and scholarship, Department of Medicine, University of British Columbia, Vancouver, British Columbia, Canada.
Abstract
PURPOSE: Laboratory studies have shown that performance assessment judgments can be biased by "contrast effects": assessors' scores become more positive, for example, when the assessed performance is preceded by relatively weak candidates. The authors queried whether this effect occurs in real, high-stakes performance assessments despite their increased formality and use of behavioral descriptors. METHOD: Data were obtained from the 2011 United Kingdom Foundation Programme clinical assessment and the 2008 University of Alberta Multiple Mini-Interview. Candidate scores were compared with scores for immediately preceding candidates and for progressively more distant candidates. In addition, average scores for the three preceding candidates were calculated. Relationships between these variables were examined using linear regression. RESULTS: Negative relationships were observed between index scores and both immediately preceding and recent scores for all exam formats. Relationships were stronger between index scores and the average of the three preceding scores. These effects persisted even after examiners had judged several performances, explaining up to 11% of observed variance on some occasions. CONCLUSIONS: These findings suggest that contrast effects do influence examiner judgments in high-stakes performance-based assessments. Although the observed effect was smaller than that observed in experimentally controlled laboratory studies, this is to be expected given that real-world data lessen the strength of the "intervention" by virtue of less distinct differences between candidates. Although it is possible that the format of circuit-based exams reduces examiners' susceptibility to these influences, the finding of a persistent effect after examiners had judged several candidates suggests that the potential influence on candidate scores should not be ignored.
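The analysis described in the Method can be illustrated with a brief simulation. The sketch below (a hypothetical illustration, not the study's actual data or code; the candidate count, score scale, and contrast-effect strength are all assumed values) generates a sequence of candidate scores in which a strong preceding performance pulls the current score down, then regresses each index score on (a) the immediately preceding score and (b) the mean of the three preceding scores, mirroring the paper's two comparisons:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical simulation: each candidate has an underlying quality,
# and the examiner's score is pulled away from the preceding candidate's
# score (a contrast effect). All numbers here are illustrative assumptions.
n = 500
baseline = 70.0
contrast_weight = 0.15  # assumed strength of the contrast effect
true_quality = rng.normal(baseline, 8, n)

scores = true_quality.copy()
for i in range(1, n):
    # A preceding score above baseline lowers the current score, and vice versa.
    scores[i] -= contrast_weight * (scores[i - 1] - baseline)
scores += rng.normal(0, 2, n)  # examiner noise

def slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    xc = x - x.mean()
    return float(np.sum(xc * (y - y.mean())) / np.sum(xc * xc))

# (a) Index score vs. immediately preceding score.
b_prev = slope(scores[:-1], scores[1:])

# (b) Index score vs. mean of the three preceding scores.
prev3 = np.array([scores[i - 3:i].mean() for i in range(3, n)])
b_prev3 = slope(prev3, scores[3:])

print(f"slope vs preceding score:         {b_prev:+.3f}")
print(f"slope vs mean of preceding three: {b_prev3:+.3f}")
```

With a contrast effect present, both slopes come out negative, which is the signature the study looked for in the real assessment data; in its absence (contrast_weight = 0) they hover near zero.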