Wenli Ouyang, Polina Harik, Brian E Clauser, Miguel A Paniagua.
Abstract
BACKGROUND: Examinees often believe that changing answers will lower their scores; however, empirical studies suggest that allowing examinees to change responses may improve their performance in classroom assessments. To date, no studies have been able to examine answer changes during large-scale professional credentialing or licensing examinations.
Keywords: Answer change on MCQ; Credentialing examination; Step 2 CK; Test taking; USMLE
Year: 2019 PMID: 31647012 PMCID: PMC6806526 DOI: 10.1186/s12909-019-1816-3
Source DB: PubMed Journal: BMC Med Educ ISSN: 1472-6920 Impact factor: 2.463
Examinee sample description (N = 27,830)
| | Number of examinees | % | Step 2 CK score, mean (SD) |
|---|---|---|---|
| Gender | | | |
| Male | 15,151 | 54.4% | 232 (22.8) |
| Female | 12,679 | 45.6% | 233 (22.6) |
| English proficiency | | | |
| ESL | 9,474 | 34.0% | 226 (23.9) |
| ENL | 18,356 | 66.0% | 237 (21.0) |
| Number of takes | | | |
| First time takers | 24,939 | 89.6% | 236 (21.5) |
| Repeaters | 2,891 | 10.4% | 210 (18.6) |
| Pass/Fail | | | |
| Pass | 24,119 | 86.7% | 239 (16.1) |
| Fail | 3,711 | 13.3% | 192 (16.4) |
| Total | 27,830 | | 233 (22.7) |
Response change patterns and outcomes for block 4
| | Number of examinees | % (Based on all examinees) | % (Based on examinees with response change) |
|---|---|---|---|
| Examinees who revisited at least one item | 27,560 | 99.0% | |
| Examinees who changed at least one response | 18,936 | 68.0% | 100% |
| | 8,461 | 30.4% | 44.7% |
| | 5,263 | 18.9% | 27.8% |
| | 5,212 | 18.7% | 27.5% |
| | Mean number of items | % (Based on all items) | % (Based on items with revisits) |
| Items with revisits | 16.0 | 36.4% | 100% |
| Items with revisits and no response change | 14.6 | 33.2% | 91.2% |
| Items with revisits and response change | 1.4 | 3.2% | 8.8% |
| W-R/R-W ratio | 0.59/0.40 = 1.48 | | |
| Overall mean score change | 0.004 | | |
W-R wrong to right response changes, R-W right to wrong response changes, W-W wrong to wrong response changes, R-R right to right response changes
Response change patterns and outcomes by score change for block 4
| | Score Gain | | Score Loss | | No Score Change | |
|---|---|---|---|---|---|---|
| | Mean number of items | % | Mean number of items | % | Mean number of items | % |
| Items with revisits | 18.6 | 42.3% | 17.8 | 40.4% | 13.8 | 31.3% |
| Items with revisits and no response change | 16.4 | 88.2%ᵃ | 15.8 | 88.8%ᵃ | 13.1 | 94.8%ᵃ |
| Items with revisits and response change | 2.2 | 11.8%ᵃ | 2.0 | 11.2%ᵃ | 0.7 | 5.1%ᵃ |
| W-R | 1.55 | | 0.14 | | | |
| R-W | 0.16 | | 1.36 | | | |
| R-R | 0.04 | | 0.04 | | | |
| W-W | 0.45 | | 0.44 | | | |
| W-R/R-W ratio | 9.69 | | 0.10 | | | |
| Mean score change | 0.032 | | −0.028 | | | |
W-R wrong to right response changes, R-W right to wrong response changes, W-W wrong to wrong response changes, R-R right to right response changes
ᵃ Percentage is calculated based on items with revisits
Response change patterns and outcomes by examinee ability for block 4
| | High Ability | | Medium Ability | | Low Ability | |
|---|---|---|---|---|---|---|
| | Mean number of items | % | Mean number of items | % | Mean number of items | % |
| Items with revisits | 17.9 | 40.7% | 14.6 | 33.2% | 12.1 | 27.5% |
| Items with revisits and no response change | 16.5 | 92.2%ᵃ | 13.2 | 90.4%ᵃ | 10.7 | 88.4%ᵃ |
| Items with revisits and response change | 1.4 | 7.8%ᵃ | 1.4 | 9.6%ᵃ | 1.4 | 11.6%ᵃ |
| W-R | 0.63 | | 0.56 | | 0.51 | |
| R-W | 0.40 | | 0.39 | | 0.40 | |
| R-R | 0.04 | | 0.03 | | 0.03 | |
| W-W | 0.33 | | 0.41 | | 0.50 | |
| W-R/R-W ratio | 1.58 | | 1.44 | | 1.28 | |
| Mean score change | 0.005 | | 0.004 | | 0.003 | |
W-R wrong to right response changes, R-W right to wrong response changes, W-W wrong to wrong response changes, R-R right to right response changes
ᵃ Percentage is calculated based on items with revisits
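The W-R/R-W ratios in the ability table can be approximately reproduced from the per-group means. The sketch below is illustrative only: it reads the 0.63/0.56/0.51 row as the mean number of W-R changes and the 0.40/0.39/0.40 row as the mean number of R-W changes (an inference from the ratio row, not a label given in the table). The function name `wr_rw_ratio` is hypothetical.

```python
# Mean number of W-R and R-W changes per examinee for block 4, by ability group.
# Assumption: the row-to-label mapping is inferred from the reported ratio row.
changes = {
    "High": {"W-R": 0.63, "R-W": 0.40},
    "Medium": {"W-R": 0.56, "R-W": 0.39},
    "Low": {"W-R": 0.51, "R-W": 0.40},
}


def wr_rw_ratio(wr: float, rw: float) -> float:
    """Ratio of wrong-to-right to right-to-wrong changes.

    A value above 1 means changes helped more often than they hurt.
    """
    return wr / rw


ratios = {group: wr_rw_ratio(v["W-R"], v["R-W"]) for group, v in changes.items()}

for group, ratio in ratios.items():
    print(f"{group} ability: W-R/R-W ratio = {ratio:.2f}")
```

Because the inputs here are already rounded to two decimals, the recomputed ratios can differ from the reported 1.58, 1.44, and 1.28 in the last digit.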
Results from the Multinomial Logistic Regression analysis for block 4
| Outcome (log odds) | Variable | Coefficient Estimate | Std. Error | OR | 95% Confidence Interval | p-value |
|---|---|---|---|---|---|---|
| W-R vs. R-W | Intercept | 0.299 | 0.044 | | | < 0.0001 |
| | Ability: High vs. Low | 0.226 | 0.046 | 1.25 | 1.15–1.37 | < 0.0001 |
| | Ability: Medium vs. Low | 0.103 | 0.046 | 1.11 | 1.01–1.21 | 0.026 |
| R-R vs. R-W | Intercept | −2.830 | 0.125 | | | < 0.0001 |
| | Ability: High vs. Low | 0.116 | 0.132 | 1.12 | 0.87–1.45 | 0.380 |
| | Ability: Medium vs. Low | 0.032 | 0.134 | 1.03 | 0.79–1.34 | 0.809 |
| W-W vs. R-W | Intercept | 0.136 | 0.044 | | | 0.002 |
| | Ability: High vs. Low | −0.491 | 0.048 | 0.61 | 0.56–0.67 | < 0.0001 |
| | Ability: Medium vs. Low | −0.208 | 0.047 | 0.81 | 0.74–0.89 | < 0.0001 |
W-R wrong to right response changes, R-W right to wrong response changes, W-W wrong to wrong response changes, R-R right to right response changes
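The odds ratios and confidence intervals in the regression table are consistent with exponentiating each coefficient and its Wald bounds. The sketch below assumes a standard Wald interval with z = 1.96; the source does not state how the intervals were computed, so this is a consistency check rather than the authors' procedure. The helper name `odds_ratio_with_ci` is hypothetical.

```python
import math

# (coefficient estimate, standard error) pairs copied from the table above.
coefficients = {
    "W-R vs. R-W, High vs. Low": (0.226, 0.046),
    "W-R vs. R-W, Medium vs. Low": (0.103, 0.046),
    "R-R vs. R-W, High vs. Low": (0.116, 0.132),
    "R-R vs. R-W, Medium vs. Low": (0.032, 0.134),
    "W-W vs. R-W, High vs. Low": (-0.491, 0.048),
    "W-W vs. R-W, Medium vs. Low": (-0.208, 0.047),
}


def odds_ratio_with_ci(coef: float, se: float, z: float = 1.96):
    """Return (OR, lower, upper) from a log-odds coefficient and its SE.

    Assumes a Wald interval: exp(coef - z * se) to exp(coef + z * se).
    """
    return (
        math.exp(coef),
        math.exp(coef - z * se),
        math.exp(coef + z * se),
    )


for label, (coef, se) in coefficients.items():
    odds_ratio, lower, upper = odds_ratio_with_ci(coef, se)
    print(f"{label}: OR = {odds_ratio:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

Under this assumption, an interval that excludes 1 (as for both W-R and both W-W contrasts) corresponds to a coefficient that differs from zero at the 5% level, which matches the pattern of p-values in the table.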
Item revisiting duration (in sec) across four types of response change patterns for block 4
| Response change pattern | Marginal mean duration (sec)ᵃ | Mean difference from R-W (S.E.) | p-value |
|---|---|---|---|
| W-R | 44.0 | −2.4 (0.5) | < 0.0001 |
| R-R | 58.2 | 11.8 (1.4) | < 0.0001 |
| W-W | 50.4 | 4.0 (0.6) | < 0.0001 |
| R-W | 46.4 | n/a | n/a |
W-R wrong to right response changes, R-W right to wrong response changes, W-W wrong to wrong response changes, R-R right to right response changes
ᵃ Covariates appearing in the model are evaluated at the following values: Step 2 CK total test score = 232.51; item difficulty = 0.7607
Response change patterns and outcomes for block 8
| | Number of examinees | % (Based on all examinees) | % (Based on examinees with response change) |
|---|---|---|---|
| Examinees who revisited at least one item | 27,521 | 99.0% | |
| Examinees who changed at least one response | 19,178 | 68.9% | 100.0% |
| | 8,742 | 31.4% | 45.6% |
| | 5,142 | 18.5% | 26.8% |
| | 5,294 | 19.0% | 27.6% |
| | Mean number of items | % (Based on all items) | % (Based on items with revisits) |
| Items with revisits | 15.8 | 35.9% | 100.0% |
| Items with revisits and no response change | 14.4 | 32.7% | 91.1% |
| Items with revisits and response change | 1.4 | 3.2% | 8.9% |
| W-R/R-W ratio | 0.60/0.40 = 1.50 | | |
| Overall mean score change | 0.005 | | |
W-R wrong to right response changes, R-W right to wrong response changes, W-W wrong to wrong response changes, R-R right to right response changes