Amir H Sam (1,2), Samantha M Field (1), Carlos F Collares (3), Cees P M van der Vleuten (3), Val J Wass (4), Colin Melville (5), Joanne Harris (1), Karim Meeran (1,2).
1. Medical Education Research Unit, Imperial College London, London, UK.
2. Division of Diabetes, Endocrinology and Metabolism, Imperial College London, London, UK.
3. Department of Educational Research and Development, Maastricht University, Maastricht, the Netherlands.
4. Faculty of Medicine and Health, Keele University, Keele, UK.
5. General Medical Council, London, UK.
Abstract
CONTEXT: Single-best-answer questions (SBAQs) have been widely used to test knowledge because they are easy to mark and demonstrate high reliability. However, SBAQs have been criticised for being subject to cueing.
OBJECTIVES: We used a novel assessment tool that facilitates efficient marking of open-ended very-short-answer questions (VSAQs). We compared VSAQs with SBAQs with regard to reliability, discrimination and student performance, and evaluated the acceptability of VSAQs.
METHODS: Medical students were randomised to sit a 60-question assessment administered in either VSAQ and then SBAQ format (Group 1, n = 155) or the reverse (Group 2, n = 144). The VSAQs were delivered on a tablet; responses were computer-marked and subsequently reviewed by two examiners. The standard error of measurement (SEM) across the ability spectrum was estimated using item response theory.
RESULTS: The review of machine-marked questions took an average of 1 minute, 36 seconds per question for all students. The VSAQs had high reliability (alpha: 0.91), a significantly lower SEM than the SBAQs (p < 0.001) and higher mean item-total point biserial correlations (p < 0.001). The VSAQ scores were significantly lower than the SBAQ scores (p < 0.001). The difference in scores between VSAQs and SBAQs was attenuated in Group 2. Although 80.4% of students found the VSAQs more difficult, 69.2% found them more authentic.
CONCLUSIONS: The VSAQ format demonstrated high reliability and discrimination and items were perceived as more authentic. The SBAQ format was associated with significant cueing. The present results suggest the VSAQ format has a higher degree of validity.
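The reliability (alpha) and item-total point biserial statistics reported above can be illustrated with a minimal sketch. It assumes dichotomous (0/1) item scoring and uses a toy response matrix, not study data; the function names are hypothetical, and the study's own analysis, including the item response theory estimate of the SEM, is not reproduced here.

```python
import math
from statistics import mean, variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a students x items matrix of item scores."""
    k = len(scores[0])                                   # number of items
    items = list(zip(*scores))                           # one tuple per item
    item_var_sum = sum(variance(item) for item in items)
    total_var = variance([sum(row) for row in scores])   # variance of total scores
    return k / (k - 1) * (1 - item_var_sum / total_var)

def item_total_point_biserial(scores, item_index):
    """Pearson correlation between one 0/1 item and the (uncorrected) total score.

    Assumes both the item and the totals have non-zero variance.
    """
    x = [row[item_index] for row in scores]
    y = [sum(row) for row in scores]
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(ssx * ssy)

# Toy data (illustrative only): 5 students x 4 items, 1 = correct, 0 = incorrect.
toy_scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

alpha = cronbach_alpha(toy_scores)
r_item1 = item_total_point_biserial(toy_scores, 0)
print(f"alpha = {alpha:.2f}, item 1 point biserial = {r_item1:.2f}")
```

A higher alpha indicates more internally consistent scores, and a higher item-total correlation indicates that an item separates stronger from weaker candidates more sharply. The sketch uses the uncorrected total (the item itself is included in the total score), which is one common convention and may differ from the one used in the study.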