James M Kilgour, Saadia Tayyaba.
Abstract
In UK medical schools, five-option single-best answer (SBA) questions are the most widely accepted format of summative knowledge assessment. However, writing SBA questions with four effective incorrect options is difficult and time-consuming, and consequently many SBAs contain a high frequency of implausible distractors. Previous research has suggested that fewer than five options could therefore be used for assessment without deterioration in quality. Despite an existing body of empirical research in this area, however, evidence from undergraduate medical education is sparse. This study investigated the frequency of non-functioning distractors in a sample of 480 summative SBA questions at Cardiff University. Distractor functionality was analysed, and various question models were then tested to investigate the impact of reducing the number of distractors per question on examination difficulty, reliability, discrimination and pass rates. A survey questionnaire was additionally administered to 108 students (33 % response rate) to gain insight into their perceptions of these models. The simulation of various exam models revealed that, for the four- and three-option SBA models, pass rates, reliability and mean item discrimination remained relatively constant. The average percentage mark, however, consistently increased by 1 % and 3 % with the four- and three-option models, respectively. The questionnaire survey revealed that the student body had mixed views towards the proposed format change. This study is one of the first to comprehensively investigate distractor performance in SBA examinations in undergraduate medical education. It provides evidence to suggest that using three-option SBA questions would maximise efficiency whilst maintaining, or possibly improving, psychometric quality by allowing a greater number of questions per exam paper.
Keywords: Assessment; Examination quality; Examination reliability; Single-best answer exams; Testing; Undergraduate medical education; Written examination
Year: 2015 PMID: 26597452 PMCID: PMC4923093 DOI: 10.1007/s10459-015-9652-7
Source DB: PubMed Journal: Adv Health Sci Educ Theory Pract ISSN: 1382-4996 Impact factor: 3.853
Analysis of overall examination performance for all 3 years
| Statistic | Y03 | Y04 | Y05 |
|---|---|---|---|
| Alpha reliability^a | 0.81 | 0.82 | 0.79 |
| Average score | 95.14 | 144.98 | 87.94 |
| Maximum possible score | 140 | 200 | 140 |
| Standard deviation (SD) | 11.02 | 12.88 | 11.01 |
| Range of scores | 61–126 | 108–172 | 63–125 |
| Number of items | 140 | 200 | 140 |
| Average percentage correct | 68 % | 72 % | 63 % |
| Average item discrimination (Rpbis)^b | 0.16 | 0.14 | 0.14 |
| Number of candidates | 262 | 278 | 273 |
^a Alpha reliability ranges between 0 and 1 (i.e. from no internal consistency to perfect internal consistency). The desirable range for high-stakes assessments is 0.8–0.89. The higher the stakes of the examination, the higher the alpha value required to ensure a high degree of confidence in pass/fail decisions
^b Rpbis ranges between −1 and 1 (i.e. from negatively discriminatory to perfectly discriminatory). In high-stakes examinations, it is desirable to have an Rpbis approaching 0.20, as this indicates a high level of discrimination between competent and non-competent candidates
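As a rough illustration of the two statistics described in the footnotes above, the following Python sketch computes Cronbach's alpha and a point-biserial correlation from a matrix of 0/1 item scores. The function names and the toy data are hypothetical, and the record does not say whether the reported Rpbis values were corrected by excluding the item from the total score; this sketch uses the uncorrected total.

```python
from statistics import mean, pstdev, pvariance


def cronbach_alpha(scores):
    """Alpha for a list of candidate rows, each a list of 0/1 item scores."""
    k = len(scores[0])  # number of items
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def point_biserial(item, totals):
    """Pearson correlation between a 0/1 item score and the total score.

    Assumption: the total includes the item itself (uncorrected Rpbis).
    """
    n = len(item)
    m_item, m_total = mean(item), mean(totals)
    cov = sum((x - m_item) * (y - m_total) for x, y in zip(item, totals)) / n
    return cov / (pstdev(item) * pstdev(totals))


# Hypothetical toy data: 4 candidates, 3 items.
toy_scores = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
toy_totals = [sum(row) for row in toy_scores]
toy_alpha = cronbach_alpha(toy_scores)
toy_rpbis = point_biserial([row[1] for row in toy_scores], toy_totals)
```

On real examination data, `scores` would have one row per candidate and one column per item, for example 262 rows and 140 columns for Y03.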
Number of functional distractors per item
| Number of functional distractors per question | Y03 | Y04 | Y05 | Overall |
|---|---|---|---|---|
| Zero | 18 (12.9 %) | 39 (19.5 %) | 11 (7.9 %) | 68 (14.2 %) |
| One | 35 (25.0 %) | 59 (29.5 %) | 33 (23.6 %) | 127 (26.5 %) |
| Two | 43 (30.7 %) | 61 (30.5 %) | 55 (39.3 %) | 159 (33.1 %) |
| Three | 32 (22.9 %) | 30 (15.0 %) | 30 (21.4 %) | 92 (19.2 %) |
| Four | 12 (8.6 %) | 11 (5.5 %) | 11 (7.9 %) | 34 (7.1 %) |
| Average | 1.89 | 1.58 | 1.98 | 1.82 |
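The averages in the last row can be checked from the item counts in the table. The sketch below (variable and function names are illustrative) recomputes the mean number of functional distractors per item for each year. It also suggests that the Overall figure of 1.82 matches the unweighted mean of the three yearly averages, whereas pooling all 480 items gives a value closer to 1.79.

```python
# Item counts with 0, 1, 2, 3 and 4 functional distractors, copied from the table.
counts = {
    "Y03": [18, 35, 43, 32, 12],
    "Y04": [39, 59, 61, 30, 11],
    "Y05": [11, 33, 55, 30, 11],
}


def mean_functional(freq):
    """Mean number of functional distractors per item for one frequency row."""
    n_items = sum(freq)
    weighted = sum(n_func * n for n_func, n in enumerate(freq))
    return weighted / n_items


per_year = {year: mean_functional(freq) for year, freq in counts.items()}

# Pooled mean over all 480 items versus the unweighted mean of yearly averages.
pooled = mean_functional([sum(col) for col in zip(*counts.values())])
unweighted = sum(per_year.values()) / len(per_year)

for year, value in per_year.items():
    print(f"{year}: {value:.3f}")
print(f"pooled: {pooled:.3f}, unweighted mean of years: {unweighted:.3f}")
```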
Breakdown of individual distractor performance grouped into categories by frequency of selection (all years combined)
| Distractor functionality by frequency of selection (%) | Number (%) |
|---|---|
| 0 | 341 (17.8) |
| <5 | 721 (37.6) |
| 5–10 | 374 (19.5) |
| 11–20 | 278 (14.5) |
| >20 | 206 (10.7) |
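The categories in this table can be reproduced for a single item from its option selection counts. The following is a minimal sketch, assuming the commonly used rule that a distractor chosen by fewer than 5 % of candidates is non-functioning; this record does not state the exact criterion used in the study, and the bin boundaries, function names and example counts are assumptions for illustration.

```python
def bin_distractor(pct):
    """Map a selection percentage to one of the table's frequency categories."""
    if pct == 0:
        return "0"
    if pct < 5:
        return "<5"
    if pct <= 10:
        return "5-10"
    if pct <= 20:
        return "11-20"
    return ">20"


def distractor_report(option_counts, key, threshold=5.0):
    """Return {option: (percent, category, is_functional)} for each distractor.

    Assumption: a distractor is functional if chosen by >= `threshold` percent.
    """
    total = sum(option_counts.values())
    report = {}
    for option, n in option_counts.items():
        if option == key:
            continue  # skip the correct answer
        pct = 100 * n / total
        report[option] = (pct, bin_distractor(pct), pct >= threshold)
    return report


# Hypothetical five-option item answered by 250 candidates; "A" is correct.
example = distractor_report({"A": 150, "B": 60, "C": 30, "D": 10, "E": 0}, key="A")
n_functional = sum(1 for _, _, ok in example.values() if ok)
```

In this made-up item, two of the four distractors reach the assumed 5 % threshold, so it would count as an item with two functional distractors.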
Effect of reducing the number of options per item on important exam attributes (five, four and three option models)
| Year | Psychometric attribute | Five-option model | Four-option model | Three-option model |
|---|---|---|---|---|
| Y03 | Mean % correct | 68 % | 69 % | 71 % |
| | Mean Rpbis | 0.16 | 0.16 | 0.15 |
| | Alpha reliability | 0.81 | 0.81 | 0.82 |
| | Number of fails | 2 | 2 | 2 |
| Y04 | Mean % correct | 72 % | 73 % | 75 % |
| | Mean Rpbis | 0.14 | 0.17 | 0.16 |
| | Alpha reliability | 0.82 | 0.82 | 0.82 |
| | Number of fails | 1 | 1 | 1 |
| Y05 | Mean % correct | 63 % | 64 % | 66 % |
| | Mean Rpbis | 0.14 | 0.14 | 0.13 |
| | Alpha reliability | 0.79 | 0.80 | 0.80 |
| | Number of fails | 0 | 0 | 0 |
| Average | Mean % correct | 68 % | 69 % | 71 % |
| | Mean Rpbis | 0.15 | 0.16 | 0.15 |
| | Alpha reliability | 0.81 | 0.81 | 0.81 |
Fig. 1 Student self-perceived ability to eliminate distractors (N = 104)