Joshua J Bagley1, Brian Piazza2, Michelle D Lazarus3,4, Edward J Fox1, Xiang Zhan5. 1. Department of Orthopaedics and Rehabilitation, Penn State Milton S. Hershey Medical Center, Hershey, Pennsylvania. 2. Department of Orthopedics and Sports Medicine, Billings Clinic, Billings, Montana. 3. Department of Anatomy and Developmental Biology, Centre for Human Anatomy Education, Monash University, Clayton, Victoria, Australia. 4. Monash Centre for Scholarship in Health Education (MCSHE), Faculty of Medicine, Nursing and Health Sciences, Monash University, Clayton, Victoria, Australia. 5. Department of Public Health Sciences, Penn State College of Medicine, Hershey, Pennsylvania.
Abstract
BACKGROUND: Medical knowledge and technical skills are foundations of surgical competency. The American Board of Orthopaedic Surgery (ABOS) and the Resident Review Committee for Orthopaedic Surgery recently mandated simulation training, listing 17 surgical skills modules to improve residents' technical skills. However, there is no established tool to measure the effectiveness of these modules. The Global Rating Index for Technical Skills (GRITS) tool has been previously validated for evaluating general surgery residents. The aim of this study was to determine whether the GRITS tool is valid, practical, and reliable in evaluating the skills of orthopaedic residents in a simulation setting, whether the outcomes correlate with performance in the operating room, and to what extent these simulation modules are valued by residents.
METHODS: Simulation performance was assessed longitudinally on 5 residents using the GRITS assessment through postgraduate years (PGY) 1 to 5 (n = 25 evaluations) in a simulated volar forearm approach using cadaveric specimens. An additional 20 PGY-1 residents were evaluated cross-sectionally in this same time frame. Written, open-ended feedback on the simulation experience was sought and analyzed via thematic analysis. For correlative data, evaluations (n = 65) of a variety of authentic surgical procedures were compiled on PGY-2 through PGY-5 orthopaedic residents and compared with the simulated experiences.
RESULTS: GRITS scores were averaged for each group of residents, and validity and reliability were assessed using R software. PGY-1 residents' mean GRITS evaluation score (expressed as a value from 1 to 5) was 3.4. Longitudinally, this mean score increased over PGY years 2 to 5 to 4.4, 4.7, 4.9, and 4.8, respectively. Of the parameters measured by GRITS, the lowest average scores were for "flow of operation" and "time and motion" across all levels, although these did improve over PGY years 2 to 5. Findings were consistent between simulation and "real-world" procedures. Open-ended responses evaluating the module were positive.
CONCLUSIONS: Our study suggests that the GRITS tool shows promise as an effective and reliable method for assessing orthopaedic residents' technical skills based on an ABOS module system.
Medical knowledge and technical skills are the foundation of general and specialty surgical competency. Although there are well-established methods for evaluating medical knowledge via written and oral board examinations provided by the American Board of Orthopaedic Surgery (ABOS), the Board relies on residency programs to teach residents the technical skills necessary to become independent, skilled surgeons. Over the course of 5 years, orthopaedic surgery residents actively participate in operating rooms alongside their attending surgeon(s) in an effort to master the essential knowledge and technical skills. Relatively recent changes in the healthcare landscape, such as growing concerns about patient safety, cost burdens, time pressures in busy operating rooms, and work hour restrictions, have decreased these resident education opportunities. As a result, ensuring competency in these necessary skills is increasingly challenging[1,2].
With the added pressure to produce well-trained surgeons despite increasingly limited practice opportunities, the ABOS (in association with the Resident Review Committee for Orthopaedic Surgery) recently mandated the introduction of simulation training into orthopaedic residency programs in an effort to improve surgical skill competencies. The ABOS initiated an educational framework that includes 17 surgical skills modules “to improve surgical skills by establishing goals and objectives and assessment metrics, providing training in skills used in the initial management of injured patients and basic operative skills to prepare residents to participate in surgical procedures”[3]. Although the module topics are outlined, how these simulation modules are executed and taught is left entirely to the individual residency programs. Furthermore, recommendations for validated approaches to assessing competency in these orthopaedic skill-based modules are not provided. How best to run and effectively assess these modules therefore remains unknown.
The urgency of identifying effective approaches to assessing orthopaedic residents' skills is amplified in modern residency training programs because of a greater focus on achieving competencies rather than simply relying on exposure or “practice hours”[4,5]. Various assessment tools have been used over the years to assess surgical residents' technical skills[2,6]. These evaluation methods include formal technical skills examinations in surgical skills laboratories, videotaped case reviews, motion analysis, questionnaires, and observational assessments by trained surgeons[2,4,6-9]. These tools, however, have yet to be assessed in the modern simulation modules proposed by the ABOS.
One promising tool for surgical skill assessment is the Global Rating Index for Technical Skills (GRITS) scale (Fig. 1). Doyle et al. found this tool to be valid and reliable in the assessment of general surgery residents' technical skills, suggesting that it may be applicable to other surgical contexts[2]. It remains unknown, however, whether this tool is applicable to other surgical specialties, such as orthopaedics. Thus, the aim of this study was to determine whether the GRITS tool is a valid, practical, and reliable means of evaluating the skills of orthopaedic surgical residents in an ABOS-recommended simulation setting and whether those scores correlate with performance in the authentic operating room. An additional objective of this study was to gauge the usefulness of the simulation module from the residents' perspectives. We hypothesized that the mean GRITS score would correlate with the postgraduate year (PGY) level and that a resident's GRITS score would increase with further training. Furthermore, we asked, “What is the impact of these simulations on residents' perspectives of usefulness and value?”
Fig. 1
Global Rating Index for Technical Skills. (Reproduced, with permission, from: Doyle JD, Webber EM, Sidhu RS. A universal global rating scale for the evaluation of technical skills in the operating room. Am J Surg. 2007;193[5]:551-5.)
Methods
Participants
This study used the GRITS assessment to evaluate orthopaedic residents' soft-tissue handling in a simulated volar forearm approach using cadaveric specimens. Five residents were evaluated each spring from postgraduate year (PGY) 1 through PGY-5, providing longitudinal data. An additional 20 PGY-1 residents were evaluated over this time period, providing additional cross-sectional data. All residents were from the same residency program, which accepts 5 residents per year, and no subset was used. Before beginning the laboratory test, participants were provided with relevant articles describing the forearm approaches on which they would be evaluated.
ABOS Soft-Tissue Module Simulation
A single orthopaedic attending (E.J.F.) facilitated each session and, on completion, graded each resident (n = 25 residents) using the GRITS assessment. Each resident was graded on a scale from 1 to 5 in 7 categories (respect for tissue, time and motion, instrument handling/knowledge, flow of operation, knowledge of specific procedure, use of assistants, and communication skills), for a maximum total score of 35 points. The total score was then divided by 7 to calculate a mean evaluation score for each resident. After session completion, each participant provided feedback regarding their confidence in their ability to perform the task and the overall impact of the simulation on their residency training.
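The scoring arithmetic described above can be illustrated with a minimal Python sketch. This is not the authors' code: the category keys are hypothetical identifiers chosen here for illustration, and the example ratings are made up.

```python
# Hypothetical illustration of the GRITS scoring arithmetic.
# Category keys and example ratings are assumptions, not study data.
GRITS_CATEGORIES = (
    "respect_for_tissue",
    "time_and_motion",
    "instrument_handling_knowledge",
    "flow_of_operation",
    "knowledge_of_specific_procedure",
    "use_of_assistants",
    "communication_skills",
)


def mean_grits_score(ratings: dict[str, int]) -> float:
    """Return the total of the 7 category ratings divided by 7."""
    missing = [c for c in GRITS_CATEGORIES if c not in ratings]
    if missing:
        raise ValueError(f"missing categories: {missing}")
    for category in GRITS_CATEGORIES:
        if not 1 <= ratings[category] <= 5:
            raise ValueError(f"{category} must be rated from 1 to 5")
    total = sum(ratings[c] for c in GRITS_CATEGORIES)  # maximum is 35
    return total / len(GRITS_CATEGORIES)


# Made-up ratings for one resident; the total is 25 out of 35.
example = {
    "respect_for_tissue": 4,
    "time_and_motion": 3,
    "instrument_handling_knowledge": 4,
    "flow_of_operation": 3,
    "knowledge_of_specific_procedure": 4,
    "use_of_assistants": 3,
    "communication_skills": 4,
}
print(mean_grits_score(example))
```

Dividing by 7 keeps the mean evaluation score on the same 1-to-5 scale as the individual categories, which is how the scores are reported in the Results.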
Authentic Surgical Resident Training
In addition to the simulated cadaver surgery, and over the same 5-year period, 65 evaluations were completed using the GRITS scale during a variety of surgical procedures such as lumbar laminectomy and fusion, total knee arthroplasty, closed reduction and percutaneous pinning of pediatric supracondylar humerus fractures, tibial intramedullary nailing, and many others. Unlike the simulated soft-tissue module with a single assessor, various orthopaedic attending surgeons (n = 23) assessed these “real-world” surgical procedures (n = 65 evaluations). Resident participants (n = 22) in this part of the study were PGY-2 through PGY-5, some of whom (n = 5) participated in the simulation module portion of the study.
Thematic Analysis
Open-ended feedback was analyzed using applied thematic analysis where text was coded into a codebook to identify key themes[10]. Coding in NVivo (version 12 plus) was undertaken on simulation participants' responses to the following postmodule questions: “Please share your opinion of how this session helped and/or hindered your residency education” and “Please share any input regarding ideas to improve this session and why these changes would help you.” Over 1982 words from 25 participants were analyzed using open coding by M.D.L. Themes were reviewed by all coauthors, and discrepancies in data were rectified before the final analysis.
GRITS Statistical Analysis
The GRITS scores were averaged for each group of residents in each of the 7 categories for comparison of different PGY levels. Raw p values were calculated from 2-sample comparisons using the Wilcoxon signed-rank test. To account for multiple testing when comparing all years together, we used the Bonferroni correction to obtain adjusted p values. Validity was assessed using the Pearson correlation coefficient for both the simulated and “real-world” experiences. Reliability was determined using Cronbach's alpha for each setting as well. All statistical analyses were conducted in R[11].
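For readers unfamiliar with these statistics, the following Python sketch shows the textbook forms of the Bonferroni correction, the Pearson correlation coefficient, and Cronbach's alpha. It is an illustration only: the study's analyses were run in R, the function names are hypothetical, all inputs are made up, and the layout assumed for Cronbach's alpha (one row per subject, one column per item) is an assumption about how the scores were arranged.

```python
from math import sqrt
from statistics import mean, variance


def bonferroni(raw_p_values):
    """Multiply each raw p value by the number of comparisons, capped at 1."""
    m = len(raw_p_values)
    return [min(1.0, p * m) for p in raw_p_values]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cronbach_alpha(rows):
    """Cronbach's alpha; rows holds one list of item scores per subject."""
    k = len(rows[0])  # number of items (columns)
    item_variances = [variance(col) for col in zip(*rows)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Made-up inputs, for illustration only.
adjusted = bonferroni([0.01, 0.04, 0.5])
r = pearson_r([1, 2, 3, 4], [2.0, 2.9, 4.1, 5.0])
alpha = cronbach_alpha([[3, 4, 3], [4, 5, 4], [2, 3, 3], [5, 5, 4]])
print(adjusted, round(r, 3), round(alpha, 3))
```

Sample variances (n - 1 in the denominator) are used throughout; the ratio in Cronbach's alpha is the same with population variances, provided the choice is applied consistently.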
Results
Residents were evaluated using the GRITS assessment tool in 2 educational contexts: a cadaveric volar forearm surgical simulation and the authentic surgical environment. In the simulation model, most participants were evaluated at a single timepoint (n = 20), with an additional 5 residents evaluated longitudinally across multiple timepoints during the 5-year residency training. In the authentic surgical environment, 65 total evaluations were collected on residents (n = 22 residents) of differing levels of training.
During the PGY-1 year in the simulation model, the residents' mean evaluation score (expressed as a value from 1 to 5) was 3.4. Longitudinally, this mean score increased over the PGY-2, PGY-3, PGY-4, and PGY-5 years to 4.4, 4.7, 4.9, and 4.8, respectively (Fig. 2). Raw p values comparing each year against one another showed statistically significant differences only when comparing PGY-1 against all other years. When comparing scores across all years together, scores varied significantly by PGY level (adjusted p-value = 4.77e−07). A Pearson correlation also demonstrated the mean evaluation score and PGY level to be highly correlated (Pearson coefficient = 0.879, p-value = 0.0495). Cronbach's alpha (a measure of interitem consistency) was calculated for the 5 residents with 5 simulation procedures (1 each year from 2015 to 2019) and was 0.94.
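As a rough arithmetic check on the reported simulation correlation, the Python sketch below correlates PGY level with the rounded yearly means quoted above. It rests on an assumption not stated in the text, namely that the coefficient was computed across the 5 yearly means (n = 5). Under that assumption, the rounded means give a coefficient of roughly 0.85, slightly below the reported 0.879, presumably because the published means are rounded to 1 decimal place. The reported coefficient with n = 5 corresponds to a t statistic of about 3.19 on 3 degrees of freedom, just above the two-sided 5% critical value of about 3.18, which is consistent with the reported p value of 0.0495.

```python
from math import sqrt

pgy_levels = [1, 2, 3, 4, 5]
mean_scores = [3.4, 4.4, 4.7, 4.9, 4.8]  # rounded means quoted in the text


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r = 0, with n - 2 degrees of freedom."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)


# Assumption: the correlation is taken across the 5 yearly means.
r_rounded = pearson_r(pgy_levels, mean_scores)
t_reported = t_statistic(0.879, len(pgy_levels))
print(round(r_rounded, 3), round(t_reported, 2))
```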
Fig. 2
Mean scores of evaluations by the level of training (postgraduate year) of simulation surgical procedures. Possible scores range from 1 to 5. Error bars represent SD.
When applied to “real-world” procedures, the mean evaluation scores also varied significantly by PGY level (Fig. 3, p-value = 0.0017), with a strong correlation between the mean evaluation score and PGY level (Pearson coefficient = 0.951, p-value = 0.0487). Cronbach's alpha for the 10 residents with 3 or more real surgical procedures was 0.7. For the 6 residents with 4 or more real surgical procedures, Cronbach's alpha was 0.76.
Fig. 3
Mean scores of evaluations by the level of training (postgraduate year) of real surgical procedures. Possible scores range from 1 to 5. Error bars represent SD.
In the simulation setting, of the 7 individual categories evaluated by the GRITS tool, “flow of operation” (mean score = 2.9) and “time and motion” (mean score = 2.8) were the lowest scoring categories during the PGY-1 year. Although scores in these categories improved over PGY years 2 to 5, they remained the lowest average scores in the residents' final year (flow of operation = 4.4, time and motion = 4.6).
After completing the simulation, residents gave overwhelmingly positive open-ended responses evaluating the module. Themes identified included increased opportunities for practice and the ability to practice in a low-stress environment. Under the “increased opportunity for practice” theme, the PGY-1s in particular specified that the simulations provided a novel opportunity for them to be “lead surgeons” despite being early in their residency training, whereas senior residents reportedly valued the “repetition.” When asked how to improve the fresh-tissue simulation module, residents at all levels identified the theme to “increase frequency” of these sessions. PGY-1s also wanted more “feedback checkpoints,” whereas more senior residents wanted the modules to “expand into fractures and fixation.”
Discussion
The need for knowledgeable, skilled, well-trained surgeons is apparent to all[12]. Many educational recommendations are made to help residents achieve these competencies, but debates about how to assess residents reliably and validly during this training period remain, with a particular gap in the identification of an effective, objective tool for evaluating surgical residents' technical skills[2,4]. Various assessment tools have been created in hopes of filling this gap[2,8,9]. One such tool showing promise among general surgery residents is the GRITS assessment created by Doyle et al.[2]. The purpose of this study was to determine whether the GRITS assessment tool is effective in evaluating orthopaedic residents in a surgical simulation setting and, second, whether the scores translated to the genuine operating room. Our study supports the GRITS as a valid and reliable tool for objectively evaluating an orthopaedic surgery resident's technical skills in both the operating room and a simulation setting. The combination of cross-sectional and longitudinal data allowed for the evaluation of both the simulation module and the impact of these educational activities on residents' skill progression with experience. Equally important, our results illustrate that the assessment tool is effective regardless of the evaluating surgeon because the alignment of scores between the 2 settings was quite high.
The data also revealed that the 2 GRITS categories with the lowest scores were “flow of operation” and “time and motion.” This is consistent with the data shown by D'Angelo et al.[4], in which general surgery chief residents were evaluated while performing 3 simulated surgeries. Of the 6 categories evaluated, “time and motion” once again proved to be among the lowest scoring categories. This is particularly important given that most would agree that these skills improve with time and experience. Our data support this notion and suggest that one important factor in improving GRITS scores is increased exposure to contexts for resident practice, such as the soft-tissue handling session used here, because the 2 previously mentioned categories saw the greatest improvement in the residents evaluated over 5 years. Furthermore, the strong correlation between resident experience, inferred from the PGY level, and improved GRITS scores in both the simulation and real-world settings supports the GRITS as a reliable and valid tool for assessing orthopaedic residents' skills and progression. Given this, perhaps simulation, in association with the GRITS scale, could contribute to the overall evaluation of an orthopaedic resident's readiness to enter independent practice.
An important component of any educational activity is learner perceptions because these affect uptake and learner engagement[13]. Not only was this simulation experience at least as effective for learners as the authentic surgical environment, but residents of all year levels also valued the increased opportunity for practice in a calm environment. Evidence suggests that these types of “repeated” practice opportunities in supportive environments decrease cognitive load, supporting enhanced learning opportunities[14]. The open-ended feedback also suggests that the module should be scaffolded to the learners' experience level: for PGY-1s, providing early opportunities to “lead” a surgical approach seems important, as does increasing feedback as they progress through the simulation. In later years, as learners progress through competencies, residency programs are encouraged to build upon the module (in this case, adding fractures and fixation to the fresh-tissue module) to further support residents' development outside of the operating room. A validated assessment tool such as the GRITS could certainly play a role in tracking progression and guiding both junior and senior residents in improving their technical skills in these repeated settings.
Although this study did show improvement in the overall surgical skills of participating residents, as evidenced by the improvement in the mean GRITS score with increased training level, there was no measure of the quality of the clinical outcome. Although we certainly hope that improved surgical skills translate directly to improved clinical outcomes, this remains unknown and is a potential weakness of this study. In fact, Anderson et al. found that improvement in objective structured assessments of residents' surgical skills did not correlate with improvement in surgical results[7]. Certainly, further research directed at correlating a resident's technical skills with the quality of surgical outcomes would be welcome, although this would be difficult to study outside of a simulation setting.
Another potential weakness of this study is that a single attending surgeon evaluated residents in the simulation setting, raising questions about the applicability and the interrater reliability of the GRITS. However, when the GRITS was applied to real surgical procedures with numerous evaluating surgeons, the correlation between the mean GRITS score and PGY level remained, suggesting that the GRITS is both applicable and reliable regardless of the evaluating surgeon. Still, in the real-world setting, we did not control for the evaluating surgeon, so it remains possible that mean scores were inflated or deflated if a surgeon was prone to give higher or lower scores. Furthermore, this study was completed at a single institution; therefore, it should be recognized that generalization to other training sites has not yet been established.
Finally, one further weakness of this study is that in the simulation setting, there was very little difference in the mean scores for PGY-3 through PGY-5 residents, suggesting that residents may achieve competency by year 3 and raising questions about the validity of the GRITS assessment. When viewing these results, one must remember that these scores reflected a volar forearm approach on a cadaveric specimen, a procedure that most PGY-3 and PGY-4 residents should feel comfortable completing. This was reflected by the GRITS score. Furthermore, when the GRITS was applied to the real-world setting with a myriad of differing procedures, the scores more closely reflected a linear progression from postgraduate years 2 to 5. We recommend that future studies focus on applying the GRITS to both simple and complex procedures to establish baseline scores for the procedures. Residents' scores could then be compared with these established norms to determine competency with respect to their level of training.
With increasing responsibilities and limited work hours, ensuring that residents are adequately trained and ready to proceed into independent practice is a recognized challenge. The ABOS recently mandated surgical simulation to improve surgical skills and meet this challenge; however, without an established tool to monitor progress, evaluating surgical skills objectively remains difficult. The results of this study suggest that the GRITS tool shows promise as an effective and reliable method for assessing orthopaedic residents' technical skills based on an ABOS module system.