Hatim S AlKhatib, Gayle Brazeau, Amal Akour, Suha A Almuhaissen.
Abstract
BACKGROUND: Examinations are the traditional assessment tools. In addition to measuring learning, exams are used to guide the improvement of academic programs. The current study sought to evaluate the quality of assessment items in sixth-year clinical clerkship examinations as a function of item format and type/structure, and to assess the effect of the number of response choices on the characteristics of MCQs as assessment items.
Keywords: Assessment items; Clinical clerkships; Difficulty index; Discriminating index; Point Biserial
Year: 2020 PMID: 32532278 PMCID: PMC7291500 DOI: 10.1186/s12909-020-02107-3
Source DB: PubMed Journal: BMC Med Educ ISSN: 1472-6920 Impact factor: 2.463
Descriptive Statistics for Evaluated Assessment Items (N = 173)
| Variable | No. (%) |
|---|---|
| Format | |
| Case based item | 134 (77.5) |
| Non-case based item | 39 (22.5) |
| Type/Structure | |
| Open-ended/essay item | 98 (56.6) |
| Multiple choice item | 75 (43.4) |
| Number of Choices* | |
| 2 (True/False) | 11 (14.7) |
| 3 | 3 (4) |
| 4 | 33 (44) |
| 5 | 28 (37.3) |
| Bloom’s level | |
| Remembering and Understanding Skills | 60 (34.7) |
| Analysis skills | 57 (32.9) |
| Application Skills | 28 (16.2) |
| Evaluation and Creation Skills | 28 (16.2) |
| Difficulty Index (Difficulty) Levels | |
| Difficult (Difficulty < 20%) | 6 (3.5) |
| Acceptable/Good (20% ≤ Difficulty < 50%) | 29 (16.7) |
| Excellent (50% ≤ Difficulty < 80%) | 93 (53.8) |
| Easy/Poor (Difficulty ≥ 80%) | 45 (26) |
| Discriminating Index (Discrimination) Levels | |
| Poor/Flawed (Discrimination < 0) | 5 (2.9) |
| Poor (0 ≤ Discrimination < .2) | 36 (20.8) |
| Acceptable (.2 ≤ Discrimination < .3) | 40 (23.1) |
| Good (.3 ≤ Discrimination < .4) | 39 (22.5) |
| Excellent (Discrimination ≥ .4) | 53 (30.6) |
| Point-Biserial Levels | |
| Poor/Flawed (Point biserial < 0) | 14 (8.1) |
| Poor (0 ≤ Point biserial < .15) | 28 (16.2) |
| Recommended (.15 ≤ Point biserial < .25) | 28 (16.2) |
| Good (Point biserial ≥ .25) | 103 (59.5) |
* Percent relative to MCQs count
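
The level bands in the table above can be read as a simple lookup. The following is a minimal sketch, assuming the lower bound of each band is inclusive as the table's inequalities indicate; the function name `classify` and the constant names are hypothetical and are not taken from the study.

```python
from bisect import bisect_right

# Cut-offs and labels copied from the descriptive-statistics table.
# Each tuple is (ascending cut-offs, labels); there is one more label than cut-off.
DIFFICULTY_BANDS = (
    [20, 50, 80],  # difficulty index in percent
    ["Difficult", "Acceptable/Good", "Excellent", "Easy/Poor"],
)
DISCRIMINATION_BANDS = (
    [0, 0.2, 0.3, 0.4],
    ["Poor/Flawed", "Poor", "Acceptable", "Good", "Excellent"],
)
POINT_BISERIAL_BANDS = (
    [0, 0.15, 0.25],
    ["Poor/Flawed", "Poor", "Recommended", "Good"],
)


def classify(value, bands):
    """Return the label of the band containing `value`.

    bisect_right counts the cut-offs that are <= value, so a value equal to
    a cut-off falls into the higher band (lower bounds are inclusive).
    """
    cutoffs, labels = bands
    return labels[bisect_right(cutoffs, value)]


# Hypothetical single item: 68.2% difficulty, discrimination .22, point biserial .21
item_levels = (
    classify(68.2, DIFFICULTY_BANDS),
    classify(0.22, DISCRIMINATION_BANDS),
    classify(0.21, POINT_BISERIAL_BANDS),
)
```

Keeping the cut-offs in sorted lists makes the inclusive-lower-bound convention explicit in one place rather than scattering it across chained comparisons.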
Item Performance Characteristics Based on Item measured ILO Level
| Performance Characteristics | Remembering and Understanding Skills Mean (SD) | Analysis Skills Mean (SD) | Application Skills Mean (SD) | Evaluation and Creation Skills Mean (SD) |
|---|---|---|---|---|
| Level of Difficulty<sup>a</sup> | 68.2 (13.3)<sup>b</sup> | 56.4 (21.8) | 63.2 (20.4) | 63.2 (22.4) |
| Discriminating Index<sup>c</sup> | .41 (.19)<sup>d</sup> | .40 (.18)<sup>d</sup> | .22 (.15) | .25 (.10) |
| Point biserial<sup>e</sup> | .39 (.16)<sup>d</sup> | .38 (.16)<sup>d</sup> | .21 (.14) | .23 (.14) |
ILO: Intended Learning Outcome
N = 173
<sup>a</sup> Model is significant, F(3,169) = 3.7, p = 0.012
<sup>b</sup> Significant difference in pairwise comparison with analysis skills
<sup>c</sup> Model is significant, F(3,169) = 12.7, p < 0.001
<sup>d</sup> Significant difference in pairwise comparisons with application skills and with evaluation and creation skills
<sup>e</sup> Model is significant, F(3,169) = 14.4, p < 0.001
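
The F statistics in these footnotes have (3, 169) degrees of freedom, which is what a one-way ANOVA across the four Bloom's-level groups (N = 173) would give. The sketch below assumes that is the model used, which the source does not state outright. The function `one_way_anova` is a hypothetical, textbook implementation, and the small score lists are made-up toy data, not values from the study.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Between-group sum of squares: group size times squared distance
    # of the group mean from the grand mean.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    # Within-group sum of squares: squared distance of each value
    # from its own group mean.
    ss_within = sum(
        (x - sum(g) / len(g)) ** 2 for g in groups for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy data (hypothetical), three groups of three values each.
f_toy, df1_toy, df2_toy = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])

# Degrees of freedom implied by the item counts per Bloom's level
# reported in the descriptive-statistics table (60, 57, 28, 28).
group_sizes = [60, 57, 28, 28]
df_reported = (len(group_sizes) - 1, sum(group_sizes) - len(group_sizes))  # (3, 169)
```

The degrees of freedom depend only on the number of groups and the total item count, so they can be checked against the reported F(3,169) without access to the item-level data.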
Item Performance Characteristics Based on Different Item Properties and Layers (N = 173)
| | Level of Difficulty Mean (SD) | Discriminating Index Mean (SD) | Point-biserial Mean (SD) |
|---|---|---|---|
| Item Format | | | |
| Case Based | 63.7 (19.7) | .34 (.18) | .34 (.17) |
| Noncase Based | 59.2 (19) NS | .38 (.22) NS | .30 (.16) NS |
| Item Type/Structure | | | |
| Open-ended/Essay | 63.4 (18) | .39 (.18) | .39 (.15) |
| Multiple Choice | 61.7 (21.5) NS | .31 (.19) * | .26 (.16) |
| Number of Choices/options | | | |
| 4-option Items | 51.3 (19.1) | .30 (.18) | .24 (.14) |
| 5-option Items | 67.2 (21.5) * | .31 (.19) NS | .29 (.17) NS |
| Item Format: Item Type/Structure | | | |
| Case Based: Open-ended | 64.5 (18.3) | .37 (.17) | .38 (.16) |
| Case Based: MCQs | 62.2 (22.2) NS | .29 (.17) * | .26 (.15) * |
| Noncase Based: Open-ended | 54.2 (11.7) | .54 (.17) | .45 (.07) |
| Noncase Based: MCQs | 61 (20.8) NS | .33 (.21) * | .25 (.18) * |
| Item Format: Number of Choices/options | | | |
| Case Based: 4-option | 50.3 (20.4) | .3 (.17) | .25 (.13) |
| Case Based: 5-option | 73.7 (18.8) * | .27 (.16) NS | .29 (.17) NS |
| Noncase Based: 4-option | 53.5 (16.6) | .33 (.22) | .21 (.17) |
| Noncase Based: 5-option | 59 (22.6) * | .36 (.23) NS | .28 (.19) NS |
| Item Type/Structure: Item Format | | | |
| Open-ended: Case Based | 64.5 (18.3) | .37 (.17) | .38 (.16) |
| Open-ended: Noncase Based | 54.2 (11.8) NS | .54 (.17) * | .45 (.07) * |
| MCQ: Case Based | 62.2 (22.2) | .29 (.1) | .20 (.1) |
| MCQ: Noncase Based | 61 (20.8) NS | .33 (.1) * | .26 (.1) * |
| Number of Choices/options: Item Format | | | |
| 4-option: Case Based | 50.3 (20) | .3 (.17) | .25 (.13) |
| 4-option: Noncase Based | 53.6 (17) * | .33 (.22) NS | .21 (.17) NS |
| 5-option: Case Based | 73.7 (18.8) | .27 (.16) | .29 (.17) |
| 5-option: Noncase Based | 60 (22.5) * | .36 (.23) NS | .28 (.16) NS |
NS Not Significant
* Significant at measurement level
Linear Regression Analysis of Item Performance as a function of Item Characteristics
| Predictor* | Difficulty Index: Coefficient | p | Discrimination Index: Coefficient | p | Point biserial: Coefficient | p |
|---|---|---|---|---|---|---|
| Model 1$# | F(2,170) = .78 | .46 | F(2,170) = 6.84 | .001 | F(2,170) = 15.192 | <.001 |
| Case/Noncase based items | −.04 | .26 | .08 | .019 | .01 | .67 |
| Open-ended (essay)/MCQs items | −.01 | .89 | −.10 | .001 | −.14 | <.001 |
| Model 2@# | F(2,72) = .76 | .47 | F(2,72) = .40 | .67 | F(2,72) = .18 | .83 |
| Case/Noncase based items | −.02 | .77 | .04 | .39 | −.01 | .79 |
| Number of Choices | −.03 | .23 | .01 | .78 | .01 | .60 |
* Because the number-of-choices variable cannot be included in the same model as the item type variable, two separate models were analyzed: one for item format and item type, and one for item format and number of choices among the MCQs.
$ N = 173
@ N = 75
# The interaction of the two factors in each model was tested; no significant interaction was detected, and including it did not change the significance of the other factors.
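
As a consistency check on the model rows, the degrees of freedom of the overall F test in a linear regression with an intercept are (p, N − p − 1) for p predictors and N observations. The sketch below assumes both models include an intercept, which the source does not state; the helper name `regression_f_df` is hypothetical.

```python
def regression_f_df(n_items, n_predictors):
    """Degrees of freedom of the overall F test for a linear model
    with an intercept: (numerator, denominator)."""
    return n_predictors, n_items - n_predictors - 1


# Model 1: all 173 items, two predictors (item format, item type).
model_1_df = regression_f_df(173, 2)  # (2, 170)

# Model 2: the 75 MCQs only, two predictors (item format, number of choices).
model_2_df = regression_f_df(75, 2)   # (2, 72)
```

Both results agree with the F(2,170) and F(2,72) values reported in the table.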