Ariel K Dubin (1), Roger Smith (2), Danielle Julian (2), Alyssa Tanaka (2), Patricia Mattingly (3)
1. Department of Obstetrics and Gynecology, Columbia University Medical Center, New York, New York. Electronic address: akhdubin@gmail.com.
2. Florida Hospital Nicholson Center, Celebration, Florida.
3. Department of Obstetrics and Gynecology, Columbia University Medical Center, New York, New York.
Abstract
STUDY OBJECTIVE: To determine whether performance assessment by robotic virtual reality (VR) simulators differs from assessment by human reviewers using a validated tool. Current surgical education relies heavily on simulation. Several assessment tools are available to the trainee, including the robotic simulator's own assessment metrics and the Global Evaluative Assessment of Robotic Skills (GEARS) metrics, both of which have been independently validated. GEARS is a rating scale through which human evaluators score trainees' performance on 6 domains: depth perception, bimanual dexterity, efficiency, force sensitivity, autonomy, and robotic control. Each domain is scored on a 5-point Likert scale with anchors. We used 2 common robotic simulators, the dV-Trainer (dVT; Mimic Technologies Inc., Seattle, WA) and the da Vinci Skills Simulator (dVSS; Intuitive Surgical, Sunnyvale, CA), to compare simulator performance metrics with GEARS scores for a basic robotic task on each simulator.
DESIGN: A prospective single-blinded randomized study.
SETTING: A surgical education and training center.
PARTICIPANTS: Surgeons and surgeons in training.
INTERVENTIONS: Demographic information was collected, including sex, age, level of training, specialty, and previous surgical and simulator experience. Subjects performed 2 trials of ring and rail 1 (RR1) on each of the 2 simulators (dVSS and dVT) after undergoing randomization and warm-up exercises. Simulator performance on the second RR1 trial was recorded, and the deidentified videos were sent to human reviewers, who scored them using GEARS. Eight simulator assessment metrics were identified, and each was paired with a similar performance metric in the GEARS tool. For each pair, Spearman's rho was calculated between the GEARS evaluation scores and the simulator assessment scores to measure their level of correlation.
MEASUREMENTS AND MAIN RESULTS: Seventy-four subjects were enrolled in this randomized study; 9 were excluded for missing or incomplete data. There was a strong correlation between the GEARS score and the simulator metric score for time to complete versus efficiency, time to complete versus total score, economy of motion versus depth perception, and overall score versus total score, with rho coefficients greater than or equal to 0.70; these were significant (p < .0001). Pairs with weak correlation (rho ≥0.30) were bimanual dexterity versus economy of motion, efficiency versus master workspace range, bimanual dexterity versus master workspace range, and robotic control versus instrument collisions.
CONCLUSION: On basic VR tasks, several simulator metrics are well matched with GEARS scores assigned by human reviewers, but others are not. Identifying these matches and mismatches can improve the training and assessment process when using robotic surgical simulators.
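The pairing analysis described in the abstract can be illustrated with a minimal sketch of Spearman's rho for one metric pair. This is not the authors' analysis code. The function names and the sample scores are hypothetical, and tied values are given average ranks, which is an assumption about how ties on a 5-point Likert scale would be handled.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to the end of the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    # Hypothetical data, not taken from the study: one GEARS domain score
    # (1-5 Likert) and one simulator metric score per subject.
    gears_efficiency = [1, 2, 2, 3, 4, 5, 5]
    simulator_time_score = [35, 50, 48, 60, 72, 90, 88]
    rho = spearman_rho(gears_efficiency, simulator_time_score)
    print(f"Spearman rho = {rho:.2f}")
```

Working on ranks rather than raw values suits this comparison because GEARS domains are ordinal Likert scores, while simulator metrics are on arbitrary numeric scales. The sketch does not compute a p-value.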