Carl F Falk, Leah M Feuerstahler.
Abstract
Large-scale assessments often use a computer adaptive test (CAT) for selection of items and for scoring respondents. Such tests often assume a parametric form for the relationship between item responses and the underlying construct. Although semi- and nonparametric response functions could be used, there is scant research on their performance in a CAT. In this work, we compare parametric response functions versus those estimated using kernel smoothing and a logistic function of a monotonic polynomial. Monotonic polynomial items can be used with traditional CAT item selection algorithms that use analytical derivatives. We compared these approaches in CAT simulations with a variety of item selection algorithms. Our simulations also varied the features of the calibration and item pool: sample size, the presence of missing data, and the percentage of nonstandard items. In general, the results support the use of semi- and nonparametric item response functions in a CAT.
Keywords: computer adaptive test; large-scale testing; monotonic polynomial; nonparametric IRT
Year: 2021 PMID: 34987268 PMCID: PMC8721622 DOI: 10.1177/00131644211014261
Source DB: PubMed Journal: Educ Psychol Meas ISSN: 0013-1644 Impact factor: 2.821
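The abstract notes that monotonic polynomial items remain compatible with item selection rules that rely on analytical derivatives. The sketch below illustrates why, under stated assumptions: each item's response function is taken to be a logistic function of a polynomial m(θ) with positive slope, so Fisher information has the closed form m'(θ)² P(θ)(1 − P(θ)), and the 2PL is the special case of a linear polynomial. The coefficient layout, the toy item pool, and the function names are hypothetical and are not taken from the article.

```python
import math


def logistic(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def poly_and_slope(theta, coefs):
    """Return m(theta) = sum_k coefs[k] * theta**k and its derivative m'(theta)."""
    m = sum(c * theta ** k for k, c in enumerate(coefs))
    dm = sum(k * c * theta ** (k - 1) for k, c in enumerate(coefs) if k > 0)
    return m, dm


def item_information(theta, coefs):
    """Fisher information of a dichotomous item with P = logistic(m(theta))."""
    m, dm = poly_and_slope(theta, coefs)
    p = logistic(m)
    return dm ** 2 * p * (1.0 - p)


def select_item(theta_hat, pool, administered):
    """Maximum Fisher information: pick the unused item most informative at theta_hat."""
    candidates = [i for i in range(len(pool)) if i not in administered]
    return max(candidates, key=lambda i: item_information(theta_hat, pool[i]))


# Hypothetical pool; coefficients are (intercept, linear, quadratic, cubic, ...).
pool = [
    (0.0, 1.0),            # 2PL-like item: m = theta
    (-1.0, 2.0),           # 2PL-like item with a steeper slope
    (0.0, 0.5, 0.0, 0.3),  # cubic item: m' = 0.5 + 0.9 * theta**2 > 0 everywhere
]

first = select_item(0.0, pool, administered=set())
second = select_item(2.0, pool, administered={first})
print(first, second)
```

In this toy pool the cubic item is selected once the provisional trait estimate moves away from zero, because its slope grows with |θ|. A real CAT would re-estimate θ after each response, which is omitted here.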
Mean RIMSE for Item Banks Under Complete Data Collection Design.
| Sample size | Proportion nonstandard (%) | 2PL | MP | KS |
|---|---|---|---|---|
| 1,000 | 30 | .07 | .07 | .10 |
| 1,000 | 70 | .11 | .10 | .11 |
| 3,000 | 30 | .04 | .03 | .04 |
| 3,000 | 70 | .09 | .04 | .05 |
Note. RIMSE = root integrated mean square error; 2PL = two-parameter logistic; MP = monotonic polynomial; KS = kernel smoothing.
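As a rough illustration of how a root integrated mean square error between a true and an estimated response function might be computed, the sketch below integrates the squared difference over the latent trait, weighted by a standard normal density and approximated on an evenly spaced grid. The weighting density, the grid limits, and the example curves are assumptions made for illustration; the record above does not give the exact formula used in the article.

```python
import math


def logistic(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def rimse(true_irf, est_irf, lo=-4.0, hi=4.0, n=801):
    """Approximate sqrt( integral (est - true)^2 * phi(theta) d theta ) on a grid.

    Weights are proportional to the standard normal density and are
    normalized to sum to one over the grid (an assumption of this sketch).
    """
    step = (hi - lo) / (n - 1)
    thetas = [lo + i * step for i in range(n)]
    weights = [math.exp(-0.5 * t * t) for t in thetas]
    total = sum(weights)
    mse = sum(
        w * (est_irf(t) - true_irf(t)) ** 2 for t, w in zip(thetas, weights)
    ) / total
    return math.sqrt(mse)


# Hypothetical "nonstandard" true item (logistic of a cubic) and a 2PL fit to it.
def true_item(theta):
    return logistic(0.5 * theta + 0.3 * theta ** 3)


def fitted_2pl(theta):
    return logistic(1.2 * theta)


print(round(rimse(true_item, fitted_2pl), 3))
```

A value of zero means the estimated curve matches the true curve everywhere on the grid. Larger values indicate misfit concentrated where the latent trait density is high.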
Mean RIMSE for Item Banks Under Missing Data Collection Design.
| Sample size | Proportion nonstandard (%) | 2PL | MP |
|---|---|---|---|
| 5,000 | 30 | .07 | .09 |
| 5,000 | 70 | .12 | .09 |
| 10,000 | 30 | .05 | .05 |
| 10,000 | 70 | .09 | .07 |
Note. RIMSE = root integrated mean square error; PL = parameter logistic; 2PL = two-parameter logistic; MP = monotonic polynomial.
Figure 1. Recovery of response functions for standard (std) and nonstandard (nonstd) items for each calibration.
Average RMSE of Latent Trait Scores for CAT Simulations Under Complete Data Calibration, Standard Normal Latent Traits.
| Item selection | Sample size | Proportion nonstandard (%) | True | 2PL | MP | KS |
|---|---|---|---|---|---|---|
| KL | 1,000 | 30 | .363 | .381 | […] | .386 |
| KL | 1,000 | 70 | .341 | .385 | .384 | […] |
| KL | 3,000 | 30 | .375 | .396 | .396 | […] |
| KL | 3,000 | 70 | .360 | .412 | […] | .380 |
| MFI | 1,000 | 30 | .359 | .383 | […] | |
| MFI | 1,000 | 70 | .343 | .384 | […] | |
| MFI | 3,000 | 30 | .373 | .396 | […] | |
| MFI | 3,000 | 70 | .361 | .412 | […] | |
| MPWI | 1,000 | 30 | .361 | .383 | […] | |
| MPWI | 1,000 | 70 | .349 | […] | .390 | |
| MPWI | 3,000 | 30 | .375 | .398 | […] | |
| MPWI | 3,000 | 70 | .364 | .407 | […] | |
Note. Excluding the true model condition, the best performing method in each row appears in bold. Sample size refers to that used in calibration. RMSE = root mean square error; KL = Kullback–Leibler information; MFI = maximum Fisher information; MPWI = maximum posterior weighted information; True = true model; 2PL = two-parameter logistic; MP = monotonic polynomial; KS = kernel smoothing.
Average RMSE of Latent Trait Scores for CAT Simulations Under Missing Data Calibration and Standard Normal Latent Traits.
| Item selection | Sample size | Proportion nonstandard (%) | True | 2PL | MP |
|---|---|---|---|---|---|
| KL | 5,000 | 30 | .331 | […] | .374 |
| KL | 5,000 | 70 | .286 | .361 | […] |
| KL | 10,000 | 30 | .327 | .352 | […] |
| KL | 10,000 | 70 | .311 | […] | .350 |
| MFI | 5,000 | 30 | .318 | .373 | […] |
| MFI | 5,000 | 70 | .284 | .364 | […] |
| MFI | 10,000 | 30 | .327 | .351 | […] |
| MFI | 10,000 | 70 | .310 | […] | […] |
| MPWI | 5,000 | 30 | .331 | .369 | […] |
| MPWI | 5,000 | 70 | .288 | .360 | […] |
| MPWI | 10,000 | 30 | .327 | .350 | […] |
| MPWI | 10,000 | 70 | .312 | […] | .350 |
Note. Excluding the true model condition, the best performing method in each row appears in bold. Sample size refers to that used in calibration. RMSE = root mean square error; KL = Kullback–Leibler information; MFI = maximum Fisher information; MPWI = maximum posterior weighted information; True = true model; 2PL = two-parameter logistic; MP = monotonic polynomial; KS = kernel smoothing.
Figure 2. RMSE of latent trait scores at discrete points along with KL item selection, complete data calibration.
Note. KL = Kullback–Leibler information; RMSE = root mean square error; non-std = nonstandard.
Figure 3. RMSE of latent trait scores at discrete points along with KL item selection, missing data calibration.
Note. KL = Kullback–Leibler information; RMSE = root mean square error; non-std = nonstandard.