Suzanne L Strom1, Craig L Anderson2, Luanna Yang3, Cecilia Canales3, Alpesh Amin4, Shahram Lotfipour2, C Eric McCoy2, Megan Boysen Osborn, Mark I Langdorf2. 1. University of California Irvine School of Medicine, Department of Anesthesia and Perioperative Care, Irvine, California. 2. University of California Irvine School of Medicine, Department of Emergency Medicine, Irvine, California. 3. University of California Irvine, Irvine, California. 4. University of California Irvine School of Medicine, Department of Medicine, Irvine, California.
Abstract
INTRODUCTION: Traditional Advanced Cardiac Life Support (ACLS) courses are evaluated using written multiple-choice tests. High-fidelity simulation is a widely used adjunct to didactic content, and has been used in many specialties as a training resource as well as an evaluative tool. There are no data to our knowledge that compare simulation examination scores with written test scores for ACLS courses. OBJECTIVE: To compare and correlate a novel high-fidelity simulation-based evaluation with traditional written testing for senior medical students in an ACLS course. METHODS: We performed a prospective cohort study to determine the correlation between simulation-based evaluation and traditional written testing in a medical school simulation center. Students were tested on a standard acute coronary syndrome/ventricular fibrillation cardiac arrest scenario. Our primary outcome measure was correlation of exam results for 19 volunteer fourth-year medical students after a 32-hour ACLS-based Resuscitation Boot Camp course. Our secondary outcome was comparison of simulation-based vs. written outcome scores. RESULTS: The composite average score on the written evaluation was substantially higher (93.6%) than the simulation performance score (81.3%; absolute difference 12.3%, 95% CI [10.6–14.0%], p<0.00005). We found a statistically significant moderate correlation between simulation scenario test performance and traditional written testing (Pearson r=0.48, p=0.04), validating the new evaluation method. CONCLUSION: Simulation-based ACLS evaluation methods correlate with traditional written testing and demonstrate resuscitation knowledge and skills. Simulation may be a more discriminating and challenging testing method, as students scored higher on written evaluation methods compared to simulation.
INTRODUCTION
There is early and promising evidence that high-fidelity simulation may be more effective in training healthcare providers in the management of critically ill patients.1–4 Previous work has reported its use to assess the psychomotor performance of senior medical students on the American Heart Association's (AHA) standardized Advanced Cardiac Life Support (ACLS) clinical resuscitation scenarios.5 This research showed that a simulation-based course in ACLS resulted in enhanced student performance, with improved critical action completion, clinical knowledge and psychomotor skill application, and decreased time to cardiopulmonary resuscitation (CPR) and defibrillation.

Student assessment of knowledge acquisition after an ACLS course is traditionally performed using multiple-choice testing alone, with practical skills demonstration of basic airway management, CPR and defibrillation. Although there is little evidence to support its use, written evaluation for the assessment of critical management skills has been the historical standard. The advent of evidence-based medicine and medical simulation has created debate on the optimal evaluation method to assess medical students' ability to manage critically ill patients.

We are not aware of any literature that evaluates the relationship between integrated high-fidelity simulation-based methods and traditional written cognitive testing with non-integrated psychomotor performance.6 This evaluation was recommended as one of the critical steps of core competency assessment by a professional academic society working group on assessment of observable learner performance.

The objective of our study was to correlate results of a novel high-fidelity simulation-based evaluation method with traditional written evaluation for senior medical students enrolled in an ACLS course.
RESULTS
The composite average score on the three written evaluations was substantially higher (93.6%) than the simulation performance score (81.3%; absolute difference 12.3%, 95% CI [10.6–14.0%], p<0.00005). The component mean and SD scores are listed in the Table.
Table. Individual and grouped percent correct performance scores for traditional written evaluation vs. simulation evaluation.

Component                            Mean±SD
Multiple choice test                 89.4±5.7%
Cardiac rhythm test                  97.8±10.7%
Clinical management test             93.8±6.3%
Mean of three written tests          93.6±5.0%
Simulation test                      81.3±3.2%
Difference (written − simulation)    12.3±3.5%

SD, standard deviation
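The reported confidence interval and p-value can be checked directly from the summary statistics in the Table. A minimal sketch, assuming the standard two-sided paired t procedure with n=19 and the mean and SD of the paired differences reported above; this is an illustrative recomputation, not the authors' actual analysis code:

```python
# Check of the paired written-vs-simulation comparison using only the
# summary statistics reported in the Table: n = 19 students, mean
# paired difference 12.3%, SD of the paired differences 3.5%.
from scipy import stats

n = 19
mean_diff = 12.3   # mean of (written - simulation) scores, percent
sd_diff = 3.5      # SD of the paired differences, percent

se = sd_diff / n ** 0.5                # standard error of the mean difference
t_stat = mean_diff / se                # paired t statistic, df = n - 1
t_crit = stats.t.ppf(0.975, df=n - 1)  # two-sided 95% critical value

print(f"95% CI: [{mean_diff - t_crit * se:.1f}%, {mean_diff + t_crit * se:.1f}%]")
print(f"t = {t_stat:.1f}, p = {2 * stats.t.sf(t_stat, df=n - 1):.1g}")
# -> 95% CI: [10.6%, 14.0%], matching the interval reported above;
#    p is far below 0.00005, consistent with the reported p-value.
```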
We found a statistically significant moderate correlation between simulation scenario test performance and traditional written test performance (Figure; Pearson r=0.48, p=0.04).
Figure. Correlation between mean percent correct score on the traditional three-component written evaluation vs. percent correct score on the simulation evaluation. The open-circle student was excluded due to an outlying low score on the cardiac rhythm test.
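The reported p-value can likewise be checked from the published summary alone. A minimal sketch, assuming the standard t transform for a Pearson correlation with n=18 (one outlier excluded, as in the Figure):

```python
# Check of the correlation's p-value from the published summary:
# r = 0.48 with n = 18 students. Under the null hypothesis of zero
# correlation, t = r * sqrt(n - 2) / sqrt(1 - r**2) follows a t
# distribution with n - 2 degrees of freedom.
from math import sqrt

from scipy import stats

r, n = 0.48, 18
t_stat = r * sqrt(n - 2) / sqrt(1 - r ** 2)
p_value = 2 * stats.t.sf(t_stat, df=n - 2)
print(f"t = {t_stat:.2f}, two-sided p = {p_value:.3f}")  # ~0.04, as reported
```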
Inter-rater reliability for scoring the participants in pre- and post-training scenarios was good. The median kappa for the 75 test items was 0.68 (interquartile range 0.36–0.94). Forty-six items (61%) had kappa >0.60.
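For readers unfamiliar with the computation, a minimal sketch of per-item Cohen's kappa between two raters, summarized as a median and IQR across items. The ratings below are synthetic placeholders (the real item-level ratings are not published here), so the printed values will differ from the reported median 0.68 (IQR 0.36–0.94):

```python
# Illustration of the inter-rater reliability computation: Cohen's
# kappa per checklist item across two raters, then median and IQR
# over the 75 items. Ratings are synthetic stand-ins for two raters'
# pass/fail scores on each item for 19 students.
import numpy as np
from sklearn.metrics import cohen_kappa_score

rng = np.random.default_rng(0)
n_items, n_students = 75, 19

kappas = []
for _ in range(n_items):
    rater_a = rng.integers(0, 2, n_students)   # 0/1 = item failed/passed
    agree = rng.random(n_students) < 0.85      # assume raters mostly agree
    rater_b = np.where(agree, rater_a, 1 - rater_a)
    kappas.append(cohen_kappa_score(rater_a, rater_b))

q1, med, q3 = np.percentile(kappas, [25, 50, 75])
print(f"median kappa = {med:.2f}, IQR = {q1:.2f}-{q3:.2f}")
```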
DISCUSSION
We found that high-fidelity simulation-based evaluation and traditional written testing for senior medical students in an ACLS course correlate well with each other. Simulation is being incorporated into the education, training, and evaluation of healthcare providers at a rapid pace. As educational technology advances, the research to support its use has lagged behind. Traditional written evaluations are widely used and have been accepted as the standard for assessing healthcare providers' ability to manage critical patients. However, as simulation is realistic, actively engaging, and clinically based, healthcare teachers have begun to question written testing.7

We found a positive moderate correlation between simulation-based evaluation and traditional written evaluation. Other studies have compared the two but did not specifically assess correlation, nor report results in medical trainees. Rodgers' study of nursing students in an ACLS course completing both a written and a practical evaluation concluded that written evaluation is, not surprisingly, a poor predictor of skill performance.8 Issenberg similarly found no association between CPR psychomotor skills and total knowledge in nursing students.9 As physician trainees are destined to be team leaders in resuscitation, our work is the first to study medical students, and therefore adds to this literature.10,11

The issue of correlation should not be misconstrued as equivalence. We contend that simulation evaluation is superior for evaluating psychomotor skills, yet accept the place of written evaluation to demonstrate cognitive mastery across broad medical content. The correlation demonstrates that students who have traditionally done well in written testing are likely to also do well in a simulation evaluation. An educator should acknowledge that the two evaluation methods are complementary, rather than substitutable, and consider adding simulation evaluation for tasks that require manual dexterity and critical thinking. Furthermore, simulation requires substantial human and capital resources to demonstrate competence, which limits its widespread application. Simulation, by necessity, focuses on narrow clinical scenarios, which, though chosen to represent critical management, cannot cover the entire breadth of cardiac resuscitation. Hence both simulation and written evaluations are likely necessary.

Our secondary outcome compared the two scores. The composite average score on the written evaluation was substantially higher (93.6%) than the simulation performance score (81.3%). It is important to note that all 19 students had the same training and were evaluated by both written and simulation methods. A higher written test score does not mean better performance, as the two modalities measure different outcomes.

Participants find high-fidelity simulation for critical event management to be a valuable educational experience.12 Emotional arousal is effective in memory acquisition,13 and simulation-based experiential learning has been shown to be effective in retention of skills,14 improving clinical outcomes,15 and reducing error-related healthcare costs.16 Furthermore, repetition of simulation experience reinforces knowledge acquisition and increases confidence.17,18

Written testing has historically been the most common mode of evaluation. However, the construct validity of the AHA's ACLS test has been challenged, as nurses' scores were not shown to correlate with performance on resuscitation after an ACLS course.
These same authors opined that the written testing at least had content validity, as the test questions were drawn directly from the student manual. Finally, their analysis supported our contention that the two modalities complement each other in providing a broad assessment of the learner's performance.19

Despite these questions, newer examination techniques, such as simulation-based evaluations, need to be validated before widespread use. Our study provides preliminary evidence that will shape this discussion. There has been a move toward simulation for assessment, as exemplified by a report of five years of certification via Fundamentals of Laparoscopic Surgery.20 In addition, simulation is used in both initial and maintenance of certification in anesthesiology.21–23 Hence, it is critical to scrutinize new testing methods to validate that they at least approximate traditional techniques.

Students performed better on the written form of testing than on the simulation. We believe this indicates that the simulation evaluation method is a more demanding measure, one that emphasizes application of knowledge over rote memorization. Furthermore, we found a narrower range of student performance with the simulation method (range of scores 74.8–87.2%, Δ12.4%) than with the written assessment (80.9–98.2%, Δ17.3%), which indicates more uniform performance in concert with course goals. Since the purpose of the boot camp is to prepare students for clinical practice, a method that more reliably generates consistent execution of skills is valued over abstract knowledge applied in isolation. In the end, the educator should consider using both methods of evaluation when teaching psychomotor skills. In addition, a quantitative simulation evaluation with an established "pass" threshold should be incorporated, in order to move toward competency-based evaluation.
LIMITATIONS
Our study has limitations, including enrolling a small sample of self-selected, highly motivated students entering the fields of emergency medicine, anesthesiology, or surgery. We did not have any baseline data on the subjects' prior ACLS training or experience. However, this did not affect our study's ability to evaluate the relationship between a simulation-based and a written evaluation tool, as students served as their own controls.

We used a previously non-validated simulation evaluation scale with arbitrary weighting of points for critical actions (derived from two expert ACLS instructors), though the action items had been used for grading in the course for 15 years. Furthermore, our assessment tool is based on AHA guidelines and is clinically focused on critical action items that approximate real clinical care, compared to a multiple-choice or even fill-in-the-blank format.

Our course was non-traditional and expanded from ACLS, and included advanced airway management and additional didactics. However, both assessment methods tested knowledge and skills from this non-traditional course format, which would not confound the assessment methods themselves.

Our criterion reference was the ACLS written exam. To our knowledge, these test questions are not analyzed for reliability or validity. There are no previous studies that demonstrate construct validity of the AHA written examination or correlate clinical performance with the written examination. The correlation between written and simulation examination performance in this study does demonstrate some degree of construct validity. The written examination is based entirely on the ACLS manual and should therefore have content validity.

To provide maximal experience with simulation and to reinforce specific and detailed proper ACS/cardiac arrest management, we used the same teaching and testing scenario and informed the students that the pre- and post-tests would be identical. This may have artificially improved post-test performance through studying specifically for the known test, as well as through additional familiarity with the simulation technology. We did not control for progressive experience with, and therefore comfort with, the mannequin or simulation environment, nor was there a traditional ACLS course student control group. Future studies should use students destined for all specialty residencies, and assess long-term retention of psychomotor skills.

We excluded one outlier who scored far below the other students, at 55% correct on the rhythm matching test (11/20 correct). This student scored average on the simulation evaluation, which only required identification of three (not 20) obvious rhythms. Including this outlier would have made our correlation fall short of statistical significance. However, the scatter plot visually demonstrates the conclusion that higher written scores are associated with higher simulation scores. With our small sample size, one outlier has a higher possibility of skewing results away from a statistically significant correlation. Further research will be needed to determine whether exclusion of this outlier was appropriate.

We did not study, nor do we advocate, any particular "pass" threshold for simulation evaluation.
As in any other course, the instructor would need to establish this threshold given the difficulty of the content, the ability of students to master the material with the course format, and the degree of "high stakes" activity.

The three components of the written testing have not been correlated with each other, as they are designed to test different cognitive skills. Therefore, correlation of their aggregate with the simulation evaluation may lack a basic level of validation. Nevertheless, the simulation is new, labor intensive, and expensive, and therefore more in need of scrutiny and validation. Our results from testing relatively novice learners may not be generalizable to more experienced providers. Lastly, the simulation evaluation raters had, at best, vague familiarity with the students. That the students were identifiable on the recordings may have introduced an unknown bias into the evaluation.
CONCLUSION
This study is the first to compare written and simulation-based evaluation in medical students. Simulation-based ACLS evaluation methods correlate with traditional written evaluation methods, and provide additional opportunity to demonstrate competency in resuscitation knowledge and skills. Simulation may be a more discriminating and challenging testing method, as students scored higher on written evaluation methods than on simulation. The meaning of this difference needs clarification through further research.