Rose Hatala1, Adam P Sawatsky, Nancy Dudek, Shiphra Ginsburg, David A Cook. 1. R. Hatala is associate professor of medicine, Faculty of Medicine, and director, Clinical Educator Fellowship, Centre for Health Education Scholarship, University of British Columbia, Vancouver, British Columbia, Canada. A.P. Sawatsky is assistant professor of medicine and senior associate consultant, Division of General Internal Medicine, Mayo Clinic College of Medicine, Rochester, Minnesota. N. Dudek is associate professor, Faculty of Medicine, University of Ottawa, Ottawa, Ontario, Canada. S. Ginsburg is professor, Department of Medicine, Faculty of Medicine, University of Toronto, scientist, Wilson Centre for Research in Education, University Health Network/University of Toronto, and staff physician, Mount Sinai Hospital, Toronto, Ontario, Canada. D.A. Cook is professor of medicine and medical education, associate director, Mayo Clinic Online Learning, and consultant, Division of General Internal Medicine, Mayo Clinic College of Medicine, Rochester, Minnesota.
Abstract
PURPOSE: In-training evaluation reports (ITERs) constitute an integral component of medical student and postgraduate physician trainee (resident) assessment. ITER narrative comments have received less attention than the numeric scores. The authors sought both to determine what validity evidence informs the use of narrative comments from ITERs for assessing medical students and residents and to identify evidence gaps. METHOD: Reviewers searched for relevant English-language studies in MEDLINE, EMBASE, Scopus, and ERIC (last search June 5, 2015), and in reference lists and author files. They included all original studies that evaluated ITERs for qualitative assessment of medical students and residents. Working in duplicate, they selected articles for inclusion, evaluated quality, and abstracted information on validity evidence using Kane's framework (inferences of scoring, generalization, extrapolation, and implications). RESULTS: Of 777 potential articles, 22 met inclusion criteria. The scoring inference is supported by studies showing that rich narratives are possible, that changing the prompt can stimulate more robust narratives, and that comments vary by context. Generalization is supported by studies showing that narratives reach thematic saturation and that analysts make consistent judgments. Extrapolation is supported by favorable relationships between ITER narratives and numeric scores from ITERs and non-ITER performance measures, and by studies confirming that narratives reflect constructs deemed important in clinical work. Evidence supporting implications is scant. CONCLUSIONS: The use of ITER narratives for trainee assessment is generally supported, except that evidence is lacking for implications and decisions. Future research should seek to confirm implicit assumptions and evaluate the impact of decisions.