Tianci Liu, Chun Wang, Gongjun Xu.
Abstract
Multidimensional Item Response Theory (MIRT) is widely used in educational and psychological assessment and evaluation. As modern assessment data sets grow, many existing estimation methods become computationally demanding and do not scale to big data, especially for the multidimensional three-parameter and four-parameter logistic models (M3PL and M4PL). To address this issue, we propose an importance-weighted sampling enhanced Variational Autoencoder (VAE) approach for estimating M3PL and M4PL. The key idea is to adopt a variational inference procedure from the machine learning literature to approximate the intractable marginal likelihood, and then to use importance-weighted samples to boost the trained VAE with a better log-likelihood approximation. Simulation studies demonstrate the computational efficiency and scalability of the new algorithm in comparison with popular alternatives, namely the Monte Carlo EM and Metropolis-Hastings Robbins-Monro methods. The good performance of the proposed method is also illustrated on a NAEP multistage testing data set.
Keywords: Monte Carlo (MC) algorithm; Multidimensional Item Response Theory (MIRT); estimation; four parameter item response theory; variational auto encoder (VAE)
Year: 2022 PMID: 36046415 PMCID: PMC9421264 DOI: 10.3389/fpsyg.2022.935419
Source DB: PubMed Journal: Front Psychol ISSN: 1664-1078
Stochastic gradient ascent of IWVAE.
| Initialize |
| randomly draw |
| draw |
| compute |
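The abstract's key idea, replacing the intractable marginal likelihood with an importance-weighted bound, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The item response function `c + (d - c) * sigmoid(a'theta - b)`, the standard normal prior on the latent traits, and all function and argument names are assumptions made for illustration; M3PL would correspond to fixing `d = 1`.

```python
import numpy as np

def m4pl_prob(theta, a, b, c, d):
    """Assumed M4PL response probability: c + (d - c) * sigmoid(theta @ a.T - b).

    theta: (..., K) latent traits; a: (J, K) slopes; b, c, d: (J,) item parameters.
    """
    logits = theta @ a.T - b
    return c + (d - c) / (1.0 + np.exp(-logits))

def log_std_normal(theta):
    """Log density of a standard normal prior over the last axis of theta."""
    k = theta.shape[-1]
    return -0.5 * np.sum(theta ** 2, axis=-1) - 0.5 * k * np.log(2.0 * np.pi)

def iw_bound(y, theta, log_q, a, b, c, d):
    """Importance-weighted lower bound on the average marginal log-likelihood.

    y: (N, J) binary responses; theta: (R, N, K) samples drawn from the
    variational posterior; log_q: (R, N) log densities of those samples.
    """
    p = m4pl_prob(theta, a, b, c, d)                                   # (R, N, J)
    log_lik = np.sum(y * np.log(p) + (1 - y) * np.log1p(-p), axis=-1)  # (R, N)
    log_w = log_lik + log_std_normal(theta) - log_q                    # log importance weights
    m = log_w.max(axis=0)                                              # stabilise the log-sum-exp
    lse = m + np.log(np.exp(log_w - m).sum(axis=0))
    return float(np.mean(lse - np.log(log_w.shape[0])))
```

With a single sample per respondent (`R = 1`) the quantity reduces to a one-sample estimate of the usual variational lower bound; averaging more importance weights inside the logarithm tightens the bound in expectation.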
Mean and SE of RMSE of M estimate on M4PL models under single regime setting, best results are in bold.
| | | | | | | | |
|---|---|---|---|---|---|---|---|
| 500 | Between | MCEM | 9.400 ± 0.181 | 11.477 ± 0.424 | 0.175 ± 0.010 | 0.183 ± 0.010 | 1.00 |
| | | MHRM | / | / | / | / | / |
| | | IWVAE | | | | | 1.00 |
| | Within | MCEM | 10.406 ± 0.240 | 11.500 ± 0.481 | 0.163 ± 0.010 | 0.146 ± 0.008 | 1.00 |
| | | MHRM | / | / | / | / | / |
| | | IWVAE | | | | | 1.00 |
| 1,000 | Between | MCEM | 8.230 ± 0.170 | 11.785 ± 0.450 | 0.189 ± 0.012 | 0.178 ± 0.011 | 1.00 |
| | | MHRM | / | / | / | / | / |
| | | IWVAE | | | | | 1.00 |
| | Within | MCEM | 7.799 ± 0.230 | 8.999 ± 0.464 | 0.132 ± 0.010 | 0.161 ± 0.011 | 1.00 |
| | | MHRM | / | / | / | / | / |
| | | IWVAE | | | | | 1.00 |
| 5,000 | Between | MCEM | 3.240 ± 0.156 | 4.351 ± 0.276 | 0.189 ± 0.011 | 0.155 ± 0.011 | 1.00 |
| | | MHRM | / | / | / | / | / |
| | | IWVAE | | | | | 1.00 |
| | Within | MCEM | 3.235 ± 0.221 | 2.939 ± 0.254 | 0.139 ± 0.012 | 0.133 ± 0.010 | 1.00 |
| | | MHRM | / | / | / | / | / |
| | | IWVAE | | | | | 1.00 |
| 10,000 | Between | MCEM | 1.988 ± 0.099 | 2.690 ± 0.184 | 0.174 ± 0.010 | 0.186 ± 0.011 | 1.00 |
| | | MHRM | / | / | / | / | / |
| | | IWVAE | | | | | 1.00 |
| | Within | MCEM | 1.823 ± 0.145 | 1.674 ± 0.151 | 0.136 ± 0.009 | 0.125 ± 0.007 | 1.00 |
| | | MHRM | / | / | / | / | / |
| | | IWVAE | | | | | 1.00 |
Factors are diagonal. Between item structure: each item depends on 1 factor. Within item structure: each item depends on 2 factors.
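Entries of the form "mean ± SE" in these tables summarise RMSE over repeated simulation runs. A minimal sketch of that summary follows, assuming that SE denotes the standard error of the mean across replications (the sample standard deviation divided by the square root of the number of replications). The function names are hypothetical.

```python
import numpy as np

def rmse(estimate, truth):
    """Root mean squared error between estimated and true parameter arrays."""
    estimate = np.asarray(estimate, dtype=float)
    truth = np.asarray(truth, dtype=float)
    return float(np.sqrt(np.mean((estimate - truth) ** 2)))

def mean_and_se(values):
    """Mean and standard error of the mean over replications (assumed meaning of SE)."""
    v = np.asarray(values, dtype=float)
    return float(v.mean()), float(v.std(ddof=1) / np.sqrt(v.size))

# Usage: one RMSE per replication, then summarise across replications.
per_replication = [rmse([1.1, 0.9], [1.0, 1.0]), rmse([1.3, 0.8], [1.0, 1.0])]
summary_mean, summary_se = mean_and_se(per_replication)
```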
Mean and SE of RMSE of M estimate on M3PL models under double regime setting, best results are in bold.
| | | | | | | |
|---|---|---|---|---|---|---|
| 500 | Between | MCEM | 9.137 ± 0.396 | 12.034 ± 0.799 | 0.192 ± 0.012 | 1.00 |
| | | MHRM | | 0.441 ± 0.062 | | 0.05 |
| | | IWVAE | 0.659 ± 0.020 | | 0.081 ± 0.008 | 1.00 |
| | Within | MCEM | 7.644 ± 0.465 | 7.291 ± 0.612 | 0.153 ± 0.011 | 1.00 |
| | | MHRM | | | | 0.45 |
| | | IWVAE | 0.733 ± 0.020 | 0.440 ± 0.038 | 0.073 ± 0.008 | 1.00 |
| 1,000 | Between | MCEM | 4.662 ± 0.200 | 7.364 ± 0.410 | 0.220 ± 0.010 | 1.00 |
| | | MHRM | / | / | / | 0.00 |
| | | IWVAE | | | | 1.00 |
| | Within | MCEM | 2.233 ± 0.171 | 2.294 ± 0.266 | 0.104 ± 0.006 | 1.00 |
| | | MHRM | / | / | / | 0.00 |
| | | IWVAE | | | | 1.00 |
| 5,000 | Between | MCEM | 0.576 ± 0.017 | 0.450 ± 0.048 | 0.101 ± 0.006 | 1.00 |
| | | MHRM | / | / | / | 0.00 |
| | | IWVAE | | | | 1.00 |
| | Within | MCEM | 0.728 ± 0.013 | | | 1.00 |
| | | MHRM | / | / | / | 0.00 |
| | | IWVAE | | 0.290 ± 0.018 | 0.082 ± 0.005 | 1.00 |
| 10,000 | Between | MCEM | 0.571 ± 0.012 | 0.720 ± 0.049 | 0.089 ± 0.004 | 1.00 |
| | | MHRM | / | / | / | 0.00 |
| | | IWVAE | | | | 1.00 |
| | Within | MCEM | 0.858 ± 0.018 | 1.411 ± 0.087 | | 1.00 |
| | | MHRM | / | / | / | 0.00 |
| | | IWVAE | | | 0.085 ± 0.004 | 1.00 |
Factors are correlated. Between item structure: each item depends on 1 factor. Within item structure: each item depends on 2 factors.
Mean and SE of RMSE of M estimate on M3PL models under double regime setting, best results are in bold.
| | | | | | | |
|---|---|---|---|---|---|---|
| 500 | Between | MCEM | 10.020 ± 0.318 | 13.638 ± 0.752 | 0.213 ± 0.013 | 1.00 |
| | | MHRM | | 0.567 ± 0.080 | 0.099 ± 0.007 | 0.35 |
| | | IWVAE | 0.641 ± 0.022 | | | 1.00 |
| | Within | MCEM | 8.133 ± 0.459 | 8.687 ± 0.727 | 0.194 ± 0.014 | 1.00 |
| | | MHRM | | | 0.078 ± 0.005 | 0.40 |
| | | IWVAE | 0.708 ± 0.021 | 0.461 ± 0.039 | | 1.00 |
| 1,000 | Between | MCEM | 5.224 ± 0.192 | 8.544 ± 0.407 | 0.243 ± 0.011 | 1.00 |
| | | MHRM | / | / | / | 0.00 |
| | | IWVAE | | | | 1.00 |
| | Within | MCEM | 2.976 ± 0.193 | 3.441 ± 0.304 | 0.157 ± 0.008 | 1.00 |
| | | MHRM | / | / | / | 0.00 |
| | | IWVAE | | | | 1.00 |
| 5,000 | Between | MCEM | 0.612 ± 0.019 | 0.508 ± 0.048 | 0.114 ± 0.007 | 1.00 |
| | | MHRM | / | / | / | 0.00 |
| | | IWVAE | | | | 1.00 |
| | Within | MCEM | 0.693 ± 0.015 | 0.306 ± 0.020 | | 1.00 |
| | | MHRM | / | / | / | 0.00 |
| | | IWVAE | | | 0.082 ± 0.005 | 1.00 |
| 10,000 | Between | MCEM | 0.561 ± 0.010 | 0.572 ± 0.032 | 0.097 ± 0.004 | 1.00 |
| | | MHRM | / | / | / | 0.00 |
| | | IWVAE | | | | 1.00 |
| | Within | MCEM | 0.678 ± 0.010 | 0.581 ± 0.047 | | 1.00 |
| | | MHRM | / | / | / | 0.00 |
| | | IWVAE | | | 0.085 ± 0.004 | 1.00 |
Factors are diagonal. Between item structure: each item depends on 1 factor. Within item structure: each item depends on 2 factors.
Mean and SE of RMSE of M estimate on M4PL models under single regime setting, best results are in bold.
| | | | | | | | |
|---|---|---|---|---|---|---|---|
| 500 | Between | MCEM | 11.248 ± 0.217 | 13.315 ± 0.491 | 0.172 ± 0.011 | 0.178 ± 0.010 | 1.00 |
| | | MHRM | / | / | / | / | / |
| | | IWVAE | | | | | 1.00 |
| | Within | MCEM | 12.038 ± 0.286 | 12.611 ± 0.665 | 0.148 ± 0.010 | 0.138 ± 0.008 | 1.00 |
| | | MHRM | / | / | / | / | / |
| | | IWVAE | | | | | 1.00 |
| 1,000 | Between | MCEM | 8.231 ± 0.159 | 11.774 ± 0.417 | 0.180 ± 0.011 | 0.189 ± 0.012 | 1.00 |
| | | MHRM | / | / | / | / | / |
| | | IWVAE | | | | | 1.00 |
| | Within | MCEM | 7.181 ± 0.302 | 7.160 ± 0.527 | 0.108 ± 0.009 | 0.134 ± 0.010 | 1.00 |
| | | MHRM | / | / | / | / | / |
| | | IWVAE | | | | | 1.00 |
| 5,000 | Between | MCEM | 3.315 ± 0.167 | 4.790 ± 0.354 | 0.182 ± 0.014 | 0.159 ± 0.012 | 1.00 |
| | | MHRM | / | / | / | / | / |
| | | IWVAE | | | | | 1.00 |
| | Within | MCEM | 1.886 ± 0.152 | 1.468 ± 0.186 | 0.094 ± 0.009 | 0.087 ± 0.007 | 1.00 |
| | | MHRM | / | / | / | / | / |
| | | IWVAE | | | | | 1.00 |
| 10,000 | Between | MCEM | 1.918 ± 0.114 | 2.555 ± 0.213 | 0.157 ± 0.010 | 0.176 ± 0.011 | 1.00 |
| | | MHRM | / | / | / | / | / |
| | | IWVAE | | | | | 1.00 |
| | Within | MCEM | 1.127 ± 0.075 | 0.943 ± 0.079 | 0.095 ± 0.007 | 0.087 ± 0.006 | 1.00 |
| | | MHRM | / | / | / | / | / |
| | | IWVAE | | | | | 1.00 |
Factors are correlated. Between item structure: each item depends on 1 factor. Within item structure: each item depends on 2 factors.
Figure 1. Fitting times for IWVAE and MCEM with M3PL model under the single (different sample sizes N and fixed item dimension J) and double [different (N, J)] asymptotic regimes. Vertical bar areas mark empirical 95% intervals.
Figure 2. Fitting times for IWVAE and MCEM with M4PL model under the single (different sample sizes N and fixed item dimension J) and double [different (N, J)] asymptotic regimes. Vertical bar areas mark empirical 95% intervals.
Comparison of estimated R from different models on MST dataset.
| | | | |
|---|---|---|---|
| M4PL | | | / |
| M3PL | | | |
| M2PL | | | |
Mean and SE of train and held-out accuracy/log-likelihood on MST dataset (over 5 replications).
| Method | Model | Train accuracy | Held-out accuracy | Train log-likelihood | Held-out log-likelihood |
|---|---|---|---|---|---|
| IWVAE | M4PL | 0.707 ± 0.001 | 0.704 ± 0.002 | −0.531 ± 0.001 | −0.539 ± 0.001 |
| | M3PL | 0.707 ± 0.000 | 0.706 ± 0.002 | −0.530 ± 0.000 | −0.537 ± 0.000 |
| | M2PL | 0.706 ± 0.001 | 0.703 ± 0.001 | −0.531 ± 0.001 | −0.539 ± 0.001 |
| MCEM | M4PL | 0.764 ± 0.001 | 0.693 ± 0.002 | −0.481 ± 0.001 | −0.603 ± 0.001 |
| | M3PL | 0.761 ± 0.000 | 0.697 ± 0.001 | −0.482 ± 0.000 | −0.599 ± 0.000 |
| | M2PL | 0.759 ± 0.001 | 0.697 ± 0.001 | −0.485 ± 0.001 | −0.589 ± 0.001 |
| MHRM | M4PL | / | / | / | / |
| | M3PL | 0.612 ± 0.003 | 0.613 ± 0.002 | −0.682 ± 0.003 | −0.683 ± 0.003 |
| | M2PL | 0.622 ± 0.003 | 0.623 ± 0.002 | −0.662 ± 0.003 | −0.664 ± 0.003 |
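The accuracy and log-likelihood columns can be read as summaries of how well predicted response probabilities match observed binary responses. The sketch below shows one plausible way to compute them. The 0.5 classification threshold, the per-response averaging of the Bernoulli log-likelihood, and the function names are assumptions; the exact definitions used for the table are not stated here.

```python
import numpy as np

def accuracy(y, p, threshold=0.5):
    """Share of responses whose thresholded prediction matches the observed response."""
    y = np.asarray(y)
    p = np.asarray(p, dtype=float)
    return float(np.mean((p >= threshold) == (y == 1)))

def mean_log_lik(y, p, eps=1e-12):
    """Average Bernoulli log-likelihood per response, clipped for numerical safety."""
    y = np.asarray(y, dtype=float)
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0 - eps)
    return float(np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))
```

Under this per-response averaging, a predictor that always outputs 0.5 would score log 0.5 ≈ −0.693, which gives a chance-level reference point for such values.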
Figure 3. Predicted log-likelihood on held-out items using different methods (IWVAE, MHRM, MCEM) to fit different MIRT models on MST data from a randomly selected trial. Outlier predictions are removed.
Figure 4. Predicted log-likelihood on held-out items using different methods (IWVAE, MHRM, MCEM) to fit different MIRT models on MST data from a randomly selected trial. Outlier predictions are kept.
Mean and SE of RMSE of M estimate on M4PL models under double regime setting, best results are in bold.
| | | | | | | | |
|---|---|---|---|---|---|---|---|
| 500 | Between | MCEM | 9.400 ± 0.181 | 11.477 ± 0.424 | 0.175 ± 0.010 | 0.183 ± 0.010 | 1.00 |
| | | MHRM | / | / | / | / | / |
| | | IWVAE | | | | | 1.00 |
| | Within | MCEM | 10.406 ± 0.240 | 11.500 ± 0.481 | 0.163 ± 0.010 | 0.146 ± 0.008 | 1.00 |
| | | MHRM | / | / | / | / | / |
| | | IWVAE | | | | | 1.00 |
| 1,000 | Between | MCEM | 5.397 ± 0.069 | 8.312 ± 0.188 | 0.180 ± 0.007 | 0.178 ± 0.007 | 1.00 |
| | | MHRM | / | / | / | / | / |
| | | IWVAE | | | | | 1.00 |
| | Within | MCEM | 8.242 ± 0.174 | 9.254 ± 0.354 | 0.151 ± 0.007 | 0.158 ± 0.007 | 1.00 |
| | | MHRM | / | / | / | / | / |
| | | IWVAE | | | | | 1.00 |
| 5,000 | Between | MCEM | 1.624 ± 0.058 | 1.519 ± 0.068 | 0.163 ± 0.006 | 0.156 ± 0.005 | 1.00 |
| | | MHRM | / | / | / | / | / |
| | | IWVAE | | | | | 1.00 |
| | Within | MCEM | 1.388 ± 0.079 | 0.876 ± 0.076 | 0.092 ± 0.005 | 0.086 ± 0.004 | 1.00 |
| | | MHRM | / | / | / | / | / |
| | | IWVAE | | | | | 1.00 |
| 10,000 | Between | MCEM | 1.022 ± 0.023 | 1.152 ± 0.036 | 0.150 ± 0.004 | 0.153 ± 0.004 | 1.00 |
| | | MHRM | / | / | / | / | / |
| | | IWVAE | | | | | 1.00 |
| | Within | MCEM | 0.930 ± 0.018 | 0.993 ± 0.031 | 0.119 ± 0.004 | 0.119 ± 0.003 | 1.00 |
| | | MHRM | / | / | / | / | / |
| | | IWVAE | | | | | 1.00 |
Factors are diagonal. Between item structure: each item depends on 1 factor. Within item structure: each item depends on 2 factors.
Mean and SE of RMSE of M estimate on M3PL models under single regime setting, best results are in bold.
| | | | | | | |
|---|---|---|---|---|---|---|
| 500 | Between | MCEM | 10.020 ± 0.318 | 13.638 ± 0.752 | 0.213 ± 0.013 | 1.00 |
| | | MHRM | | 0.567 ± 0.080 | 0.099 ± 0.007 | 0.35 |
| | | IWVAE | 0.641 ± 0.022 | | | 1.00 |
| | Within | MCEM | 8.133 ± 0.459 | 8.687 ± 0.727 | 0.194 ± 0.014 | 1.00 |
| | | MHRM | | | 0.078 ± 0.005 | 0.40 |
| | | IWVAE | 0.708 ± 0.021 | 0.461 ± 0.039 | | 1.00 |
| 1,000 | Between | MCEM | 5.338 ± 0.287 | 7.799 ± 0.544 | 0.237 ± 0.016 | 1.00 |
| | | MHRM | | | 0.089 ± 0.007 | 0.30 |
| | | IWVAE | 0.492 ± 0.019 | 0.320 ± 0.026 | | 1.00 |
| | Within | MCEM | 2.564 ± 0.235 | 3.023 ± 0.423 | 0.120 ± 0.011 | 1.00 |
| | | MHRM | | | 0.075 ± 0.005 | 0.65 |
| | | IWVAE | 0.590 ± 0.020 | 0.325 ± 0.024 | | 1.00 |
| 5,000 | Between | MCEM | 1.031 ± 0.110 | 1.190 ± 0.204 | 0.144 ± 0.013 | 1.00 |
| | | MHRM | | 0.264 ± 0.028 | | 0.30 |
| | | IWVAE | 0.403 ± 0.024 | | 0.091 ± 0.011 | 1.00 |
| | Within | MCEM | 0.881 ± 0.063 | 0.575 ± 0.077 | 0.097 ± 0.009 | 1.00 |
| | | MHRM | | | | 0.90 |
| | | IWVAE | 0.562 ± 0.026 | 0.279 ± 0.032 | 0.075 ± 0.008 | 1.00 |
| 10,000 | Between | MCEM | 0.810 ± 0.078 | 1.008 ± 0.169 | 0.112 ± 0.011 | 1.00 |
| | | MHRM | | 0.381 ± 0.131 | | 0.70 |
| | | IWVAE | 0.393 ± 0.027 | | 0.084 ± 0.008 | 1.00 |
| | Within | MCEM | 0.754 ± 0.045 | 0.662 ± 0.129 | 0.076 ± 0.007 | 1.00 |
| | | MHRM | | | | 0.75 |
| | | IWVAE | 0.535 ± 0.027 | 0.283 ± 0.039 | 0.084 ± 0.010 | 1.00 |
Factors are diagonal. Between item structure: each item depends on 1 factor. Within item structure: each item depends on 2 factors.
Mean and SE of RMSE of M estimate on M4PL models under double regime setting, best results are in bold.
| | | | | | | | |
|---|---|---|---|---|---|---|---|
| 500 | Between | MCEM | 11.248 ± 0.217 | 13.315 ± 0.491 | 0.172 ± 0.011 | 0.178 ± 0.010 | 1.00 |
| | | MHRM | / | / | / | / | / |
| | | IWVAE | | | | | 1.00 |
| | Within | MCEM | 12.038 ± 0.286 | 12.611 ± 0.665 | 0.148 ± 0.010 | 0.138 ± 0.008 | 1.00 |
| | | MHRM | / | / | / | / | / |
| | | IWVAE | | | | | 1.00 |
| 1,000 | Between | MCEM | 8.314 ± 0.097 | 11.742 ± 0.298 | 0.178 ± 0.008 | 0.197 ± 0.008 | 1.00 |
| | | MHRM | / | / | / | / | / |
| | | IWVAE | | | | | 1.00 |
| | Within | MCEM | 7.323 ± 0.213 | 7.628 ± 0.362 | 0.125 ± 0.006 | 0.131 ± 0.007 | 1.00 |
| | | MHRM | / | / | / | / | / |
| | | IWVAE | | | | | 1.00 |
| 5,000 | Between | MCEM | 2.041 ± 0.090 | 2.292 ± 0.154 | 0.143 ± 0.006 | 0.139 ± 0.006 | 1.00 |
| | | MHRM | / | / | / | / | / |
| | | IWVAE | | | | | 1.00 |
| | Within | MCEM | 1.049 ± 0.053 | 0.525 ± 0.065 | | | 1.00 |
| | | MHRM | / | / | / | / | / |
| | | IWVAE | | | 0.083 ± 0.005 | 0.080 ± 0.005 | 1.00 |
| 10,000 | Between | MCEM | 1.062 ± 0.036 | 1.163 ± 0.060 | 0.129 ± 0.004 | 0.131 ± 0.004 | 1.00 |
| | | MHRM | / | / | / | / | / |
| | | IWVAE | | | | | 1.00 |
| | Within | MCEM | 0.884 ± 0.015 | 0.944 ± 0.031 | 0.098 ± 0.003 | 0.099 ± 0.003 | 1.00 |
| | | MHRM | / | / | / | / | / |
| | | IWVAE | | | | | 1.00 |
Factors are correlated. Between item structure: each item depends on 1 factor. Within item structure: each item depends on 2 factors.
Mean and SE of RMSE of M estimate on M3PL models under single regime setting, best results are in bold.
| | | | | | | |
|---|---|---|---|---|---|---|
| 500 | Between | MCEM | 9.137 ± 0.396 | 12.034 ± 0.799 | 0.192 ± 0.012 | 1.00 |
| | | MHRM | | 0.441 ± 0.062 | | 0.05 |
| | | IWVAE | 0.659 ± 0.020 | | 0.081 ± 0.008 | 1.00 |
| | Within | MCEM | 7.644 ± 0.465 | 7.291 ± 0.612 | 0.153 ± 0.011 | 1.00 |
| | | MHRM | | | | 0.45 |
| | | IWVAE | 0.733 ± 0.020 | 0.440 ± 0.038 | 0.073 ± 0.008 | 1.00 |
| 1,000 | Between | MCEM | 5.231 ± 0.279 | 8.080 ± 0.573 | 0.219 ± 0.014 | 1.00 |
| | | MHRM | | 0.350 ± 0.041 | 0.090 ± 0.007 | 0.25 |
| | | IWVAE | 0.483 ± 0.019 | | | 1.00 |
| | Within | MCEM | 2.855 ± 0.299 | 2.912 ± 0.457 | 0.118 ± 0.010 | 1.00 |
| | | MHRM | | 0.336 ± 0.020 | | 0.80 |
| | | IWVAE | 0.601 ± 0.024 | | 0.069 ± 0.007 | 1.00 |
| 5,000 | Between | MCEM | 0.762 ± 0.063 | 0.638 ± 0.117 | 0.130 ± 0.015 | 1.00 |
| | | MHRM | | | | 0.75 |
| | | IWVAE | 0.415 ± 0.023 | 0.256 ± 0.031 | 0.091 ± 0.011 | 1.00 |
| | Within | MCEM | 0.986 ± 0.075 | 0.421 ± 0.054 | 0.066 ± 0.006 | 1.00 |
| | | MHRM | | 0.460 ± 0.013 | | 0.80 |
| | | IWVAE | 0.578 ± 0.029 | | 0.075 ± 0.008 | 1.00 |
| 10,000 | Between | MCEM | 0.917 ± 0.107 | 1.073 ± 0.170 | 0.110 ± 0.012 | 1.00 |
| | | MHRM | | | | 0.65 |
| | | IWVAE | 0.397 ± 0.027 | 0.297 ± 0.035 | 0.084 ± 0.008 | 1.00 |
| | Within | MCEM | 0.898 ± 0.056 | 0.751 ± 0.165 | 0.057 ± 0.006 | 1.00 |
| | | MHRM | | 0.461 ± 0.027 | | 0.40 |
| | | IWVAE | 0.547 ± 0.032 | | 0.084 ± 0.010 | 1.00 |
Factors are correlated. Between item structure: each item depends on 1 factor. Within item structure: each item depends on 2 factors.