Te-Chuan Chen1,2,3, Meng-Chih Lin2,4, Yuan-Cheng Chiang2,5, Lynn Monrouxe3, Shao-Ju Chien6,7. 1. Division of Nephrology, Department of Internal Medicine, Kaohsiung Chang Gung Memorial Hospital, Kaohsiung, Taiwan. 2. School of Medicine, Chang Gung University College of Medicine, Tao-Yuan, Taiwan. 3. Chang Gung Medical Education Research Centre, Chang Gung Memorial Hospital Linkou Branch, Tao-Yuan, Taiwan. 4. Division of Pulmonary Medicine, Department of Internal Medicine, Kaohsiung Chang Gung Memorial Hospital, Kaohsiung, Taiwan. 5. Department of Plastic and Reconstructive Surgery, Kaohsiung Chang Gung Memorial Hospital, Kaohsiung, Taiwan. 6. Division of Pediatric Cardiology, Department of Pediatrics, Kaohsiung Chang Gung Memorial Hospital, Kaohsiung, Taiwan. 7. School of Traditional Chinese Medicine, Chang Gung University College of Medicine, Tao-Yuan, Taiwan.
Abstract
Introduction: Onsite scoring is common in traditional OSCEs, although there is the potential for an audience effect that facilitates or inhibits performance. We aimed to (1) analyze the reliability between onsite scoring (OS) and remote scoring (RS); and (2) explore the factors that affect scoring in different locations. Methods: A total of 154 students and 84 raters were enrolled at a single site during 2013-2015. We randomly selected six stations from a 12-station national high-stakes OSCE. We applied generalisability theory for the analysis and investigated the perceptions that affected RS scoring. Results: The internal consistency reliability (Cronbach's α) of the checklists was 0.92. The kappa agreement was 0.623 and the G value was 0.93. The major source of variance came from the students themselves, with smaller contributions from locations and raters. A three-component analysis comprising Technical Feasibility, Facilitates Wellbeing, and Observational and Attention Deficits explained 73.886% of the total variance in RS scoring. Conclusions: Our study demonstrated moderate agreement and good reliability between OS and RS ratings. We validated the factors of facility operation and quality for RS raters. Remote scoring can provide an alternative forum for raters, overcoming the barriers of distance and space and avoiding the audience effect.