Josef Bartels (1), Christopher John Mooney (2), Robert Thompson Stone (3). 1. Family Medicine, WWAMI Region Practice & Research Network, Boise, ID, USA. 2. Office of Medical Education, University of Rochester School of Medicine and Dentistry, Rochester, NY, USA. 3. Neurology, University of Rochester School of Medicine and Dentistry, Rochester, NY, USA.
Abstract
BACKGROUND: Medical school evaluations typically rely on both language-based narrative descriptions and psychometrically converted numeric scores to convey performance to the grading committee. We evaluated the inter-rater reliability and correlation of numeric versus narrative evaluations for students on their Neurology Clerkship. DESIGN/METHODS: Fifty Neurology Clerkship in-training evaluation reports completed by residents and faculty members at the University of Rochester School of Medicine were separated into narrative and numeric components. Five Clerkship grading committee members retrospectively assigned new narrative scores (NNS) while blinded to the original numeric scores (ONS). We calculated intra-class correlation coefficients (ICC) and their associated confidence intervals for the ONS and the NNS. In addition, we calculated the correlation between ONS and NNS. RESULTS: The ICC was greater for the NNS (ICC = .88; 95% CI = .70-.94) than for the ONS (ICC = .62; 95% CI = .40-.77). The Pearson correlation coefficient showed that the ONS and NNS were highly correlated (r = .81). CONCLUSIONS: Narrative evaluations converted to scores by a small group of experienced graders are at least as reliable as numeric scoring by individual evaluators. This could allow evaluators to focus their efforts on writing richer narratives of greater value to trainees.
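The two statistics named in the methods can be illustrated with a minimal sketch. The abstract does not state which ICC form or software the authors used, so the code below assumes the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1). The function names `icc2_1` and `pearson_r` and the scores in the demo are hypothetical and are not study data.

```python
import numpy as np


def icc2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: array-like of shape (n_subjects, k_raters) with no missing values.
    Assumption: this ICC form is chosen for illustration; the abstract does
    not say which form the authors computed.
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()
    row_means = x.mean(axis=1)  # one mean per rated student
    col_means = x.mean(axis=0)  # one mean per rater

    # Sums of squares from a two-way ANOVA without replication.
    ss_rows = k * np.sum((row_means - grand) ** 2)
    ss_cols = n * np.sum((col_means - grand) ** 2)
    ss_err = np.sum((x - grand) ** 2) - ss_rows - ss_cols

    msr = ss_rows / (n - 1)             # between-subject mean square
    msc = ss_cols / (k - 1)             # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))  # residual mean square

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def pearson_r(a, b):
    """Pearson correlation between two equal-length score vectors."""
    return float(np.corrcoef(a, b)[0, 1])


if __name__ == "__main__":
    # Hypothetical scores: 4 students (rows) rated by 3 graders (columns).
    narrative_scores = [[3, 4, 3], [5, 5, 4], [2, 2, 3], [4, 4, 4]]
    print("ICC(2,1):", icc2_1(narrative_scores))

    # Hypothetical per-student mean scores from two scoring methods.
    numeric_means = [3.2, 4.8, 2.1, 4.0]
    narrative_means = np.mean(narrative_scores, axis=1)
    print("Pearson r:", pearson_r(numeric_means, narrative_means))
```

In practice a statistics package that also reports confidence intervals would be the more likely choice; the sketch only shows how the point estimates are formed.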
Authors: Nicholas D Hartman; David E Manthey; Lindsay C Strowd; Nicholas M Potisek; Andrea Vallevand; Janet Tooze; Jon Goforth; Kimberly McDonough; Kim L Askew Journal: Med Sci Educ Date: 2021-05-27
Authors: Matthew Kelleher; Benjamin Kinnear; Dana R Sall; Danielle E Weber; Bailey DeCoursey; Jennifer Nelson; Melissa Klein; Eric J Warm; Daniel J Schumacher Journal: Perspect Med Educ Date: 2021-09-02