Jesper Tijmstra, Maria Bolsinova.
Abstract
In many applications of high- and low-stakes ability tests, a non-negligible number of respondents may fail to reach the end of the test within the specified time limit. Since some item responses will be missing for respondents who ran out of time, this raises the question of how best to deal with these missing responses in order to obtain an optimal assessment of ability. Commonly, researchers consider three general solutions: ignore the missing responses, treat them as incorrect, or treat the responses as missing but model the missingness mechanism. This paper approaches the issue of dealing with not reached items from a measurement perspective, and considers the question of what the operationalization of ability should be in maximum performance tests that work with effective time limits. We argue that the target ability that the test attempts to measure is maximum performance when operating at the test-indicated speed, and that the test instructions should be taken to imply that respondents should operate at this target speed. The phenomenon of the speed-ability trade-off informs us that the ability that is measured by the test will depend on this target speed, as different speed levels will result in different levels of performance on the same set of items. Crucially, since respondents with not reached items worked at a speed level lower than this target speed, the level of ability that they were able to display on the items that they did reach is higher than the level of ability that they would have displayed if they had worked at the target speed (i.e., higher than their level on the target ability). Thus, statistical methods that attempt to obtain unbiased estimates of the ability as displayed on the items that were reached will result in biased estimates of the target ability.
The practical implications are examined in a simulation study that contrasts different methods of dealing with not reached items. The study shows that current methods result in biased estimates of target ability when a speed-ability trade-off is present. The paper concludes with a discussion of ways in which the issue can be resolved.
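The core argument of the abstract can be illustrated with a minimal sketch. It assumes a linear speed-ability trade-off, which is an illustrative simplification and not the model used in the paper; the function names, the slope, and the speed values are all hypothetical.

```python
def effective_ability(theta_at_zero, slope, speed):
    """Hypothetical linear trade-off: effective ability drops as speed rises.

    theta_at_zero is the ability displayed at speed 0, and slope is the
    strength of the trade-off. Both are assumptions made for illustration.
    """
    return theta_at_zero - slope * speed


def target_ability_bias(theta_at_zero, slope, target_speed, actual_speed):
    """Ability displayed at the actual speed minus ability at the target speed."""
    displayed = effective_ability(theta_at_zero, slope, actual_speed)
    target = effective_ability(theta_at_zero, slope, target_speed)
    return displayed - target


# A complier works at the target speed, so displayed and target ability coincide.
complier_bias = target_ability_bias(1.0, 0.5, target_speed=1.0, actual_speed=1.0)

# A non-complier works more slowly, so the displayed ability exceeds the target ability.
non_complier_bias = target_ability_bias(1.0, 0.5, target_speed=1.0, actual_speed=0.4)

print(complier_bias, non_complier_bias)
```

Under these assumptions, even a perfectly accurate estimate of the ability displayed on the reached items overstates the target ability of the slower respondent, and the gap vanishes when the slope is zero (no trade-off).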
Keywords: Item response theory (IRT); item non-response; maximum performance test; missing data; not reached items; speed-ability trade-off; speed-accuracy trade-off; speeded power test
Year: 2018 PMID: 29951023 PMCID: PMC6008412 DOI: 10.3389/fpsyg.2018.00964
Source DB: PubMed Journal: Front Psychol ISSN: 1664-1078
Figure 1. The hypothetical speed-ability trade-off functions of three respondents. The three vertical lines represent different testing conditions, where different minimal levels of effective speed are imposed.
Figure 2. The speed-ability trade-off function that is shared by the three hypothetical respondents. The vertical lines indicate the different levels of effective speed discussed in the example, and the horizontal lines capture the level of effective ability of these persons at those levels of effective speed (i.e., their level on θ′, θ*, and θ″).
Overview of the properties of the 8 conditions used in the simulation study.
| 10 | No | Strong | −0.47 | 0 | 0.92 |
| 10 | No | Weak | −0.27 | 0 | 0.46 |
| 10 | Yes | Strong | −0.03 | −0.41 | 0.52 |
| 10 | Yes | Weak | 0.28 | −0.41 | 0.06 |
| 5 | No | Strong | −0.27 | 0 | 0.52 |
| 5 | No | Weak | −0.14 | 0 | 0.26 |
| 5 | Yes | Strong | 0.32 | −0.41 | −0.01 |
| 5 | Yes | Weak | 0.43 | −0.41 | −0.20 |
The first three columns refer to the design factors. PM refers to the overall percentage of missing responses, SAbT refers to the strength of the speed-ability trade-off, θ.
Results of the simulation study, based on 1000 replications.
| Ignorable | 10 | No | Strong | 0.36 | −0.36 | 0.44 | 0.37 | 0.83 |
| Ignorable | 10 | No | Weak | 0.19 | −0.19 | 0.25 | 0.21 | 0.91 |
| Ignorable | 10 | Yes | Strong | 0.47 | −0.47 | 0.51 | 0.47 | 0.76 |
| Ignorable | 10 | Yes | Weak | 0.24 | −0.24 | 0.27 | 0.24 | 0.89 |
| Ignorable | 5 | No | Strong | 0.22 | −0.22 | 0.26 | 0.23 | 0.90 |
| Ignorable | 5 | No | Weak | 0.11 | −0.11 | 0.16 | 0.14 | 0.93 |
| Ignorable | 5 | Yes | Strong | 0.27 | −0.27 | 0.27 | 0.27 | 0.89 |
| Ignorable | 5 | Yes | Weak | 0.15 | −0.15 | 0.16 | 0.15 | 0.92 |
| Latent regression | 10 | No | Strong | 0.45 | −0.45 | 0.52 | 0.45 | 0.80 |
| Latent regression | 10 | No | Weak | 0.23 | −0.23 | 0.28 | 0.24 | 0.90 |
| Latent regression | 10 | Yes | Strong | 0.47 | −0.47 | 0.51 | 0.47 | 0.76 |
| Latent regression | 10 | Yes | Weak | 0.20 | −0.20 | 0.22 | 0.20 | 0.90 |
| Latent regression | 5 | No | Strong | 0.25 | −0.25 | 0.28 | 0.25 | 0.90 |
| Latent regression | 5 | No | Weak | 0.12 | −0.12 | 0.16 | 0.14 | 0.93 |
| Latent regression | 5 | Yes | Strong | 0.24 | −0.24 | 0.24 | 0.24 | 0.90 |
| Latent regression | 5 | Yes | Weak | 0.10 | −0.10 | 0.10 | 0.10 | 0.93 |
| Imputation | 10 | No | Strong | 0.07 | −0.07 | 0.38 | 0.15 | 0.89 |
| Imputation | 10 | No | Weak | −0.06 | 0.06 | 0.28 | 0.11 | 0.91 |
| Imputation | 10 | Yes | Strong | 0.18 | −0.18 | 0.24 | 0.19 | 0.90 |
| Imputation | 10 | Yes | Weak | 0.02 | −0.02 | 0.13 | 0.03 | 0.93 |
| Imputation | 5 | No | Strong | 0.09 | −0.09 | 0.24 | 0.13 | 0.92 |
| Imputation | 5 | No | Weak | −0.01 | 0.01 | 0.18 | 0.08 | 0.93 |
| Imputation | 5 | Yes | Strong | 0.16 | −0.16 | 0.18 | 0.16 | 0.92 |
| Imputation | 5 | Yes | Weak | 0.06 | −0.06 | 0.10 | 0.06 | 0.94 |
Bias and absolute bias refer to the average (absolute) bias of the estimate of the target ability in the groups of compliers (Z = 1) and non-compliers (Z = 0). PM stands for the percentage of missing responses in the full sample, SAbT refers to the strength of the speed-ability trade-off, and θ* refers to the target ability.
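The group-wise summaries described in the note above could be computed along the following lines. This is a hedged sketch, not the authors' code: the function name `bias_by_group` and the four example respondents are hypothetical and are not values from the study.

```python
from statistics import mean


def bias_by_group(estimates, targets, z):
    """Return {group: (mean bias, mean absolute bias)} for Z = 1 and Z = 0.

    estimates: estimated target abilities
    targets:   true target abilities (theta*)
    z:         1 for compliers, 0 for non-compliers
    Assumes both groups are non-empty; mean() raises an error otherwise.
    """
    summary = {}
    for group in (1, 0):
        errors = [e - t for e, t, g in zip(estimates, targets, z) if g == group]
        summary[group] = (mean(errors), mean([abs(err) for err in errors]))
    return summary


# Hypothetical toy data: two compliers followed by two non-compliers.
estimates = [0.5, -0.5, 1.5, 0.0]
targets = [0.0, 0.0, 1.0, -0.5]
z = [1, 1, 0, 0]

summary = bias_by_group(estimates, targets, z)
print(summary)
```

In this toy example the errors of the two compliers cancel out, so their bias is zero while their absolute bias is not. This is why reporting both quantities per group is informative.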
Figure 3. Relationship between the effective speed (on the x-axis) and the bias of the target ability (on the y-axis) for the condition with 10% missing responses in the full sample, a strong speed-ability trade-off and the target speed and group membership (Z) being independent. The results for non-compliers are shown in a scatterplot with density contours, and the results for compliers are shown in a boxplot.