Rebecca E Scully (1), Shanley B Deal (2), Michael J Clark (3), Katherine Yang (3), Greg Wnuk (4), Douglas S Smink (1), Jonathan P Fryer (5), Jordan D Bohnen (6), Ezra N Teitelbaum (5), Shari L Meyerson (7), Andreas H Meier (8), Paul G Gauger (4), Rishindra M Reddy (4), Daniel E Kendrick (9), Michael Stern (10), David T Hughes (10), Jeffrey G Chipman (11), Jitesh A Patel (7), Adnan Alseidi (2), Brian C George (4).
1. Department of Surgery, Brigham and Women's Hospital, Boston, Massachusetts.
2. Department of Surgery, Virginia Mason Medical Center, Seattle, Washington.
3. Consulting for Statistics, Computing, and Analytics, University of Michigan, Ann Arbor, Michigan.
4. Center for Surgical Training and Research, Department of Surgery, University of Michigan, Ann Arbor, Michigan.
5. Department of Surgery, Northwestern Memorial Hospital, Chicago, Illinois.
6. Department of Surgery, Massachusetts General Hospital, Boston, Massachusetts.
7. Department of Surgery, University of Kentucky Medical Center, Lexington, Kentucky.
8. Department of Surgery, SUNY Upstate University Hospital, Syracuse, New York.
9. University Hospitals Case Western Reserve, Cleveland, Ohio; Center for Surgical Training and Research, Department of Surgery, University of Michigan, Ann Arbor, Michigan.
10. Department of Surgery, University of Michigan, Ann Arbor, Michigan.
11. Department of Surgery, University of Minnesota, Minneapolis, Minnesota.
Abstract
OBJECTIVE: We examined the impact of video editing and of rater expertise in surgical resident evaluation on operative performance ratings of surgical trainees.
DESIGN: Randomized independent review of intraoperative video.
SETTING: Operative video was captured at a single tertiary hospital in Boston, MA.
PARTICIPANTS: Six common general surgery procedures performed by 6 attending-trainee dyads were video recorded. Full-length and condensed versions (n = 12 videos) were then reviewed by 13 independent surgeon raters (5 evaluation experts, 8 nonexperts) using a crossed design. Trainee performance was rated using the Operative Performance Rating Scale, the System for Improving and Measuring Procedural Learning (SIMPL) Performance scale, the Zwisch scale, and the ten Cate scale. Ratings were standardized before being compared using Bayesian mixed models, with raters and videos treated as random effects.
RESULTS: Editing had no significant effect on the Operative Performance Rating Scale Overall Performance score (-0.10, p = 0.30), the SIMPL Performance scale (0.13, p = 0.71), the Zwisch scale (-0.12, p = 0.27), or the ten Cate scale (-0.13, p = 0.29). Likewise, rater expertise (evaluation expert vs. nonexpert) had no significant effect on the same scales (-0.16, p = 0.32; 0.18, p = 0.74; 0.25, p = 0.81; and 0.25, p = 0.17, respectively).
CONCLUSIONS: Operative performance assessment scores differ little whether raters view condensed or full-length videos, and whether or not raters are experts in surgical resident evaluation. Future validation studies of operative performance assessment scales may be facilitated by using nonexpert surgeon raters to review videos condensed under a standardized protocol.
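The analysis pipeline described above (standardizing ratings within each scale, then fitting Bayesian mixed models with crossed random effects for raters and videos) can be sketched as follows. This is a minimal illustration, not the authors' actual code: the column names, the 0/1 coding of the editing and expertise indicators, and the use of the bambi/ArviZ stack are all assumptions.

```python
# Minimal sketch of the abstract's analysis: z-score ratings within each
# scale, then fit one Bayesian mixed model per scale with raters and
# videos as crossed random intercepts. All names here are hypothetical.
import pandas as pd
import bambi as bmb
import arviz as az

# Hypothetical long-format data: one row per (rater, video, scale) rating,
# with 'edited' and 'expert' coded as 0/1 indicators.
df = pd.read_csv("ratings.csv")  # columns: rater, video, scale, edited, expert, score

# Standardize ratings within each scale so the four scales are on a
# comparable footing before modeling.
df["z_score"] = df.groupby("scale")["score"].transform(
    lambda s: (s - s.mean()) / s.std()
)

# Fixed effects for editing and rater expertise; crossed random
# intercepts for rater and video, as described in the abstract.
for scale, sub in df.groupby("scale"):
    model = bmb.Model("z_score ~ edited + expert + (1|rater) + (1|video)", sub)
    idata = model.fit(draws=2000, chains=4)
    print(scale)
    print(az.summary(idata, var_names=["edited", "expert"]))
```

Treating raters and videos as random effects lets the model absorb systematic rater leniency and per-video (case) difficulty, so the posterior estimates for editing and expertise reflect those factors rather than idiosyncrasies of particular raters or cases.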