Simon L Conti 1,2,3, William Brubaker 1, Benjamin I Chung 1, Mario Sofer 4, Ryan S Hsi 5, Rajesh Shinghal 6, Christopher S Elliott 1,7, Thomas Caruso 3,8, John T Leppert 1,2.

1. Department of Urology, Stanford University School of Medicine, Stanford, California.
2. Veterans Affairs Palo Alto Health Care System, Palo Alto, California.
3. Johns Hopkins School of Education, Baltimore, Maryland.
4. Tel Aviv Sourasky Medical Center, Tel Aviv, Israel.
5. Department of Urologic Surgery, Vanderbilt University Medical Center, Nashville, Tennessee.
6. Palo Alto Medical Foundation, Palo Alto, California.
7. Division of Urology, Santa Clara Valley Medical Center, Santa Clara, California.
8. Department of Anesthesia, Stanford University School of Medicine, Stanford, California.
Abstract
OBJECTIVES: We sought to validate the use of crowdsourced surgical video assessment in the evaluation of urology residents performing flexible ureteroscopic laser lithotripsy.

METHODS: We collected video feeds from 30 intrarenal ureteroscopic laser lithotripsy cases in which residents (postgraduate year [PGY] 2 through 6) handled the ureteroscope. The video feeds were annotated so that they represented overall performance and contained the portions of the procedure being scored. Videos were submitted to a commercially available surgical video evaluation platform (Crowd-Sourced Assessment of Technical Skills). We used a validated ureteroscopic laser lithotripsy global assessment tool, modified to include only those domains that could be evaluated on the captured video. Videos were evaluated by crowd workers recruited through Amazon's Mechanical Turk platform as well as by five endourology-trained experts. Mean scores were calculated, and intraclass correlation coefficients (ICCs) were computed for the expert domain and total scores. ICCs were estimated using a linear mixed-effects model. Spearman rank correlation coefficients were calculated to measure the strength of the relationships between the crowd mean and expert average scores.

RESULTS: A total of 30 videos were reviewed 2488 times by 487 crowd workers and five expert endourologists. ICCs between expert raters were all below the commonly accepted threshold for agreement (0.30), with the overall score having an ICC of <0.001. For individual domains, the crowd scores did not correlate with expert scores, except for the stone retrieval domain (ρ = 0.60, p = 0.015). In addition, crowdsourced scores were negatively correlated with PGY level (ρ = −0.44, p = 0.019).

CONCLUSIONS: There is poor agreement between experts and poor correlation between expert and crowd scores when evaluating video feeds of ureteroscopic laser lithotripsy. The use of intraoperative video of ureteroscopy with laser lithotripsy to assess resident trainee skills does not appear reliable. This is further supported by the lack of correlation between crowd scores and advancing PGY level.
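The statistical approach described in the METHODS (ICC estimated from a linear mixed-effects model, plus Spearman rank correlations between crowd and expert means) can be illustrated with a minimal sketch. The snippet below uses randomly generated placeholder scores, not study data; the column names (video, rater, score) and the crowd-score array are hypothetical, and the design (30 videos, 5 expert raters) follows the abstract. It computes ICC(1) as the between-video variance divided by total variance from a random-intercept model.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy.stats import spearmanr

# Hypothetical long-format expert data: one row per (video, rater) score.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "video": np.repeat(np.arange(30), 5),   # 30 videos
    "rater": np.tile(np.arange(5), 30),     # 5 expert raters
    "score": rng.normal(15, 3, size=150),   # placeholder total scores
})

# ICC(1) via a random-intercept mixed model: score ~ 1 + (1 | video).
model = smf.mixedlm("score ~ 1", df, groups=df["video"]).fit()
var_video = float(model.cov_re.iloc[0, 0])  # between-video variance
var_resid = model.scale                     # residual (within-video) variance
icc = var_video / (var_video + var_resid)
print(f"ICC = {icc:.3f}")

# Spearman rank correlation between per-video crowd mean and expert mean.
crowd_mean = rng.normal(15, 2, size=30)     # placeholder crowd mean scores
expert_mean = df.groupby("video")["score"].mean().to_numpy()
rho, p = spearmanr(crowd_mean, expert_mean)
print(f"Spearman rho = {rho:.2f}, p = {p:.3f}")
```

With real data, an ICC near zero (as reported for the overall expert score) would indicate that almost none of the score variance is attributable to differences between videos, i.e., the expert raters do not agree.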