Stefan Zimmermann, Dietrich Klusmann, Wolfgang Hampe.
Abstract
Cheating is a common phenomenon in high-stakes admission, licensing, and university exams and threatens their validity. To detect whether exam questions had been affected by cheating, we simulated what the data would look like if some test takers possessed item preknowledge: responses to a small number of items were set to correct for 1-10% of test takers. Item difficulty, item discrimination, item fit, and local dependence were computed using an IRT 2PL model. Changes in these item properties from the non-compromised to the compromised dataset were then scrutinized for their sensitivity to item preknowledge. A decline in the discrimination parameter compared with previous test versions and an increase in local item dependence turned out to be the most sensitive indicators of item preknowledge. A multiplicative combination of shifts in item discrimination, item difficulty, and local item dependence detected item preknowledge with a sensitivity of 1.0 and a specificity of .95 if 11 of 80 items were preknown to 10% of the test takers. Cheating groups smaller than 5% of the test takers were not detected reliably. In the discussion, we outline an effective search for items affected by cheating, which would enable faculty staff without IRT knowledge to detect compromised items and exclude them from scoring.
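The simulation step described in the abstract (setting responses on a few items to correct for a fraction of test takers) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `add_item_preknowledge`, the random response matrix, the 60% base rate of correct answers, and the choice of which items are compromised are all assumptions made for the example.

```python
import numpy as np

def add_item_preknowledge(responses, cheater_fraction, compromised_items, seed=0):
    """Return a copy of `responses` (test takers x items, 1 = correct) in which
    a random subset of test takers answers every compromised item correctly."""
    rng = np.random.default_rng(seed)
    compromised = responses.copy()
    n_takers = responses.shape[0]
    n_cheaters = int(round(cheater_fraction * n_takers))
    # Draw the simulated cheaters without replacement.
    cheaters = rng.choice(n_takers, size=n_cheaters, replace=False)
    # np.ix_ selects the (cheaters x compromised items) sub-grid.
    compromised[np.ix_(cheaters, compromised_items)] = 1
    return compromised, np.sort(cheaters)

# Hypothetical data: 1000 test takers, 80 items, 60% chance of a correct answer.
rng = np.random.default_rng(42)
original = (rng.random((1000, 80)) < 0.6).astype(int)

# Assumption: the first 11 items are the compromised ones, known to 10% of takers.
items = np.arange(11)
simulated, cheaters = add_item_preknowledge(original, 0.10, items)
```

Item parameters would then be estimated separately on `original` and `simulated`, and the shifts between the two compared; the IRT estimation itself is outside the scope of this sketch.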
Year: 2016 PMID: 27907190 PMCID: PMC5131967 DOI: 10.1371/journal.pone.0167545
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Shift in item parameters, item fit and local dependence after items have been compromised by preknowledge (HN2013;10%;11 versus HN2013;0%;11).
Fig 2. Receiver Operating Characteristic curve for selected indicators for item preknowledge in the 2013/10%;11 simulation.
Sensitivity of indicators of item preknowledge if rate of false positives is 5%.
| Year | Fraction of cheaters | Δ difficulty | Δ discrimination | Δ LD χ² count | SIP |
|---|---|---|---|---|---|
| | 10.0% | 1.00 | 0.91 | 1.00 | 1.00 |
| | 5.0% | 0.36 | 0.64 | 0.73 | 0.91 |
| | 2.5% | 0.36 | 0.64 | 0.09 | 0.73 |
| | 1.0% | 0.09 | 0.64 | 0.00 | 0.82 |
| | 10.0% | 1.00 | 1.00 | 0.71 | 1.00 |
| | 5.0% | 0.86 | 1.00 | 0.43 | 1.00 |
| | 2.5% | 0.71 | 1.00 | 0.29 | 0.71 |
| | 1.0% | 0.14 | 1.00 | 0.29 | 0.71 |
Note: The sensitivity of a test is the proportion of subjects correctly identified as exhibiting the property in question (in this case, item preknowledge, IP).
* SIP is not zero even when Δ LD χ² count is zero, because all three variables entering SIP are transformed to a distribution with a mean of 10 and an SD of 1.
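The two notes above can be made concrete with a small sketch: each shift variable is rescaled to mean 10 and SD 1, the three rescaled variables are multiplied, and sensitivity is read off at the threshold that flags 5% of non-compromised items. This is an illustration under stated assumptions, not the authors' procedure: the function names, the orientation of the shifts (larger value = more suspicious), the use of the population SD, and the toy data are all hypothetical.

```python
import numpy as np

def rescale(x, mean=10.0, sd=1.0):
    """Linearly transform x to the given mean and SD (population SD assumed)."""
    x = np.asarray(x, dtype=float)
    return mean + sd * (x - x.mean()) / x.std()

def sip(shift_difficulty, shift_discrimination, shift_ld_count):
    """Multiplicative combination of the three rescaled shifts.
    Assumption: each input is oriented so that larger means more suspicious."""
    return (rescale(shift_difficulty)
            * rescale(shift_discrimination)
            * rescale(shift_ld_count))

def sensitivity_at_fpr(score, is_compromised, fpr=0.05):
    """Share of compromised items above the score threshold that flags
    a fraction `fpr` of the non-compromised items."""
    score = np.asarray(score, dtype=float)
    is_compromised = np.asarray(is_compromised, dtype=bool)
    threshold = np.quantile(score[~is_compromised], 1.0 - fpr)
    return float(np.mean(score[is_compromised] > threshold))

# Toy data: 80 items, the first 11 compromised with a clearly larger shift.
is_compromised = np.arange(80) < 11
shift = np.concatenate([np.full(11, 5.0), np.linspace(-1.0, 1.0, 69)])

score = sip(shift, shift, shift)
sens = sensitivity_at_fpr(score, is_compromised)
```

Because every rescaled variable is centred on 10 rather than 0, a raw value of zero in one variable does not force the product to zero.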
Fig 3. Heat map of LD values in the 80x80 matrix of the HN2013;10%;11 simulation.
Note: Each item pair occurs twice. Because of the bond energy algorithm, items on the x-axis are not in the same order as on the y-axis; the plot is therefore not symmetric. ‡ 11 items compromised by item preknowledge. * 2 items locally dependent due to item content (items 14 and 56). Both items are about DNA replication and require the same answer.
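The asymmetry described in the note can be reproduced in a toy example. The sketch below does not implement the bond energy algorithm; it only applies two different, arbitrarily chosen orderings to the rows and columns of a small symmetric matrix. The matrix values and both orderings are made up for illustration.

```python
import numpy as np

# Hypothetical symmetric 3x3 "LD" matrix: entry [i, j] equals entry [j, i].
ld = np.array([[0, 1, 2],
               [1, 0, 3],
               [2, 3, 0]])

row_order = [0, 1, 2]  # arbitrary item order for the y-axis
col_order = [1, 2, 0]  # a different arbitrary item order for the x-axis

# Reorder rows and columns independently, as a plot with two axis orders would.
displayed = ld[np.ix_(row_order, col_order)]

# The displayed matrix is no longer equal to its transpose ...
is_symmetric = np.array_equal(displayed, displayed.T)
# ... but it still contains exactly the same values, each pair appearing twice.
same_values = np.array_equal(np.sort(displayed, axis=None),
                             np.sort(ld, axis=None))
```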