Andreas Frey, Nicki-Nils Seitz, Steffen Brandt.
Abstract
Multidimensional adaptive testing (MAT) is a highly efficient method for the simultaneous measurement of several latent traits. Currently, no psychometrically sound approach is available for the use of MAT in testlet-based tests. Testlets are sets of items sharing a common stimulus such as a graph or a text. They are frequently used in large operational testing programs like TOEFL, PISA, PIRLS, or NAEP. To make MAT accessible for such testing programs, we present a novel combination of MAT with a multidimensional generalization of the random effects testlet model (MAT-MTIRT). In a simulation study with three ability dimensions and a simple loading structure, MAT-MTIRT is compared with non-adaptive testing for several combinations of testlet effect variances (0.0, 0.5, 1.0, and 1.5) and testlet sizes (3, 6, and 9 items). MAT-MTIRT outperformed non-adaptive testing regarding the measurement precision of the ability estimates. Further, the measurement precision decreased when testlet effect variances and testlet sizes increased. The suggested combination of MAT with the MTIRT model therefore provides a solution to the substantial problems of testlet-based tests while keeping the length of the test within an acceptable range.
Keywords: computerized adaptive testing; item response theory; large-scale assessment; multidimensional IRT models; testlets
Year: 2016; PMID: 27917132; PMCID: PMC5114539; DOI: 10.3389/fpsyg.2016.01758
Source DB: PubMed; Journal: Front Psychol; ISSN: 1664-1078
Figure 1. Path diagram of the structure of a testlet model for 12 items (Y). Every item belongs to exactly one of four testlets, γ1 to γ4. For identifiability, the means of all latent variables are set to 0 and the variances of θ1 and θ2 are set to 1.
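The structure described for Figure 1 can be made concrete with a small simulation. The following is a minimal sketch, not the authors' code. It assumes a Rasch-type testlet model in which the logit of a correct response is θ − b + γ, standard normal item difficulties, uncorrelated abilities, and an arbitrary assignment of testlets to dimensions (testlets alternate between the dimensions). The function name `simulate_testlet_responses` and all of its parameters are hypothetical.

```python
import math
import random


def simulate_testlet_responses(n_persons, n_testlets=4, testlet_size=3,
                               testlet_var=1.0, n_dims=2, seed=0):
    """Simulate 0/1 responses under an assumed Rasch-type testlet model."""
    rng = random.Random(seed)
    n_items = n_testlets * testlet_size
    # Every item belongs to exactly one testlet.
    testlet_of_item = [i // testlet_size for i in range(n_items)]
    # Assumption: simple structure, testlets alternate between dimensions.
    dim_of_testlet = [t % n_dims for t in range(n_testlets)]
    # Assumption: item difficulties drawn from a standard normal distribution.
    b = [rng.gauss(0.0, 1.0) for _ in range(n_items)]
    sd_gamma = math.sqrt(testlet_var)

    responses = []
    for _ in range(n_persons):
        # Abilities with mean 0 and variance 1 (assumed uncorrelated here).
        theta = [rng.gauss(0.0, 1.0) for _ in range(n_dims)]
        # Person-specific testlet effects with mean 0 and variance testlet_var.
        gamma = [sd_gamma * rng.gauss(0.0, 1.0) for _ in range(n_testlets)]
        row = []
        for i in range(n_items):
            t = testlet_of_item[i]
            logit = theta[dim_of_testlet[t]] - b[i] + gamma[t]
            p = 1.0 / (1.0 + math.exp(-logit))
            row.append(1 if rng.random() < p else 0)
        responses.append(row)
    return responses
```

With `testlet_var=0.0` the testlet effects vanish and the sketch reduces to a multidimensional Rasch model with simple structure.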
Scaling factors to transform MTIRT item difficulties to MIRT item difficulties by testlet effect variance (columns) and testlet size (rows).
| Testlet size | 0.0 | 0.5 | 1.0 | 1.5 |
|---|---|---|---|---|
| 3 | 1.000 | 0.925 | 0.863 | 0.811 |
| 6 | 1.000 | 0.927 | 0.867 | 0.815 |
| 9 | 1.000 | 0.930 | 0.871 | 0.821 |
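As a plausibility check on the scaling factors above, the sketch below applies a tabulated factor to a difficulty and compares the factors with the standard logistic-normal attenuation approximation, 1/√(1 + c²σ²) with c = 16√3/(15π). This approximation is not taken from the source and is not necessarily how the authors derived their factors. It ignores testlet size, whereas the tabulated values vary slightly with it. Treating the factor as a simple multiplier is also an assumption, and the function names are hypothetical.

```python
import math

# Scaling factors as tabulated above: testlet size -> factors for
# testlet effect variances 0.0, 0.5, 1.0, and 1.5.
VARIANCES = (0.0, 0.5, 1.0, 1.5)
TABLE_FACTORS = {
    3: (1.000, 0.925, 0.863, 0.811),
    6: (1.000, 0.927, 0.867, 0.815),
    9: (1.000, 0.930, 0.871, 0.821),
}

# Constant of the logistic-normal approximation (assumption, not from the source).
C = 16.0 * math.sqrt(3.0) / (15.0 * math.pi)


def approx_scaling_factor(testlet_var):
    """Approximate attenuation caused by integrating out a normal random effect."""
    return 1.0 / math.sqrt(1.0 + C * C * testlet_var)


def to_mirt_difficulty(b_mtirt, testlet_size, testlet_var):
    """Rescale an MTIRT difficulty with the tabulated factor (assumes multiplication)."""
    factor = TABLE_FACTORS[testlet_size][VARIANCES.index(testlet_var)]
    return factor * b_mtirt


if __name__ == "__main__":
    for size, factors in TABLE_FACTORS.items():
        for var, factor in zip(VARIANCES, factors):
            print(size, var, factor, round(approx_scaling_factor(var), 3))
```

The approximation stays within about 0.01 of every tabulated factor, which suggests the factors mainly reflect the attenuation from ignoring the testlet effect.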
Estimated testlet effect variance by testing algorithm, testlet size, and true testlet effect variance (each true variance spans two columns).
| Algorithm | Testlet size | 0.0 | | 0.5 | | 1.0 | | 1.5 | |
|---|---|---|---|---|---|---|---|---|---|
| MAT | 3 | 0.022 | 0.012 | 0.506 | 0.012 | 1.012 | 0.017 | 1.509 | 0.024 |
| MAT | 6 | 0.007 | 0.004 | 0.505 | 0.014 | 0.993 | 0.027 | 1.514 | 0.027 |
| MAT | 9 | | 0.005 | 0.510 | 0.006 | 1.015 | 0.033 | 1.553 | 0.045 |
| RAN | 3 | 0.025 | 0.015 | 0.495 | 0.026 | 1.001 | 0.020 | 1.506 | 0.026 |
| RAN | 6 | | 0.008 | 0.496 | 0.019 | 1.012 | 0.027 | 1.510 | 0.032 |
| RAN | 9 | 0.019 | 0.012 | 0.511 | 0.018 | 1.062 | 0.043 | | 0.049 |
Testlet effect variances whose 95%-credibility interval does not cover the testlet effect variance used for data generation are written in bold face. MAT, multidimensional adaptive testing; RAN, random testlet selection.
Average mean square error.
| Testlet size | Testlet effect variance | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| 3 | 0.0 | 0.150 | 0.002 | 0.150 | 0.002 | 0.233 | 0.002 | 0.234 | 0.002 |
| 3 | 0.5 | 0.194 | 0.001 | 0.196 | 0.002 | 0.267 | 0.005 | 0.271 | 0.005 |
| 3 | 1.0 | 0.230 | 0.002 | 0.241 | 0.002 | 0.296 | 0.004 | 0.309 | 0.003 |
| 3 | 1.5 | 0.261 | 0.002 | 0.281 | 0.003 | 0.321 | 0.004 | 0.345 | 0.005 |
| 6 | 0.0 | 0.155 | 0.002 | 0.155 | 0.002 | 0.238 | 0.004 | 0.237 | 0.004 |
| 6 | 0.5 | 0.229 | 0.003 | 0.243 | 0.003 | 0.297 | 0.005 | 0.317 | 0.006 |
| 6 | 1.0 | 0.286 | 0.005 | 0.329 | 0.006 | 0.343 | 0.004 | 0.400 | 0.008 |
| 6 | 1.5 | 0.336 | 0.005 | 0.405 | 0.005 | 0.386 | 0.007 | 0.482 | 0.007 |
| 9 | 0.0 | 0.161 | 0.002 | 0.161 | 0.002 | 0.242 | 0.003 | 0.242 | 0.003 |
| 9 | 0.5 | 0.257 | 0.003 | 0.300 | 0.004 | 0.326 | 0.005 | 0.374 | 0.007 |
| 9 | 1.0 | 0.329 | 0.004 | 0.446 | 0.005 | 0.395 | 0.011 | 0.516 | 0.009 |
| 9 | 1.5 | 0.393 | 0.008 | 0.581 | 0.009 | 0.461 | 0.016 | 0.658 | 0.010 |
MAT, multidimensional adaptive testing; RAN, random testlet selection; MTIRT, multidimensional item response theory random effect testlet model; MIRT, multidimensional item response theory model.
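The precision criterion reported above can be illustrated with a short computation. This is a minimal sketch under the assumption that the average mean square error is the squared difference between estimated and true ability, averaged first over persons within each dimension and then over dimensions. The function name `average_mse` and the input layout (one list of abilities per person) are hypothetical.

```python
def average_mse(theta_true, theta_hat):
    """Mean square error per ability dimension, averaged over dimensions.

    theta_true and theta_hat are lists with one entry per person; each entry
    is a list holding one value per ability dimension.
    """
    n_persons = len(theta_true)
    n_dims = len(theta_true[0])
    mse_per_dim = []
    for d in range(n_dims):
        squared_errors = [
            (theta_hat[p][d] - theta_true[p][d]) ** 2 for p in range(n_persons)
        ]
        mse_per_dim.append(sum(squared_errors) / n_persons)
    return sum(mse_per_dim) / n_dims
```

Lower values indicate more precise ability estimates, so a smaller value for one testing algorithm than for another under the same condition means higher measurement precision.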
Figure 2. Average mean square error. Testlet size = 6 items.
Figure 3. Average mean square error. Testlet effect variance = 1.000.
Figure 4. Average mean square error.