Prathiba Natesan, Ratna Nandakumar, Tom Minka, Jonathan D. Rubright.
Abstract
This study investigated the impact of three prior distributions (matched, standard vague, and hierarchical) on parameter recovery in Bayesian estimation of one- and two-parameter models. Two Bayesian estimation methods were used: Markov chain Monte Carlo (MCMC) and the relatively new variational Bayesian (VB) method. Conditional maximum likelihood (CML) and marginal maximum likelihood (MML) estimates served as baseline methods for comparison. Vague priors produced large errors or convergence issues and are not recommended. For both MCMC and VB, the hierarchical and matched priors showed the lowest root mean squared errors (RMSEs) for ability estimates; RMSEs of difficulty estimates were similar across estimation methods. For the standard errors (SEs), MCMC-hierarchical displayed the largest values across most conditions. SEs from the VB estimation were among the lowest in all but one case. Overall, VB-hierarchical, VB-matched, and MCMC-matched performed best. VB with hierarchical priors is recommended for its accuracy and for its cost and (subsequently) time effectiveness.
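The recovery criterion named in the abstract, RMSE, can be illustrated with a minimal sketch. The function name `rmse` and the example values below are hypothetical and are not taken from the study; the sketch only shows the standard definition, the square root of the mean squared difference between estimated and generating parameter values.

```python
import math


def rmse(estimates, true_values):
    """Root mean squared error between estimates and generating values."""
    if len(estimates) != len(true_values):
        raise ValueError("estimates and true_values must have the same length")
    squared_errors = [(e - t) ** 2 for e, t in zip(estimates, true_values)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical example: three items with known generating difficulties.
true_b = [-1.0, 0.0, 1.0]
est_b = [-0.9, 0.1, 1.2]
print(round(rmse(est_b, true_b), 3))
```

In a simulation study such a value would typically be averaged over replications within each condition before being tabulated.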
Keywords: Bayesian; Markov chain Monte Carlo; item response theory; marginal maximum likelihood; variational Bayesian
Year: 2016 PMID: 27729878 PMCID: PMC5037236 DOI: 10.3389/fpsyg.2016.01422
Source DB: PubMed Journal: Front Psychol ISSN: 1664-1078
Estimation methods and prior distributions for normal data.
| Estimation method | Prior type | Prior distributions | Label |
| MCMC | Matched | θ, | MCMC-matched |
| | | a ~ lognormal(0, 0.25) | |
| | Standard vague | θ ~ normal(0, 1) | MCMC-stdvague |
| | Hierarchical | θ ~ normal( | MCMC-hierarchical |
| Variational Bayes | Matched | Same as MCMC | VB-matched |
| | Standard vague | Same as MCMC | VB-stdvague |
| | Hierarchical | Same as MCMC | VB-hierarchical |
| CML | NA | | CML |
| MML | NA | | MML |
CML and MML estimation methods were used for 1-PL data and only MML was used for 2-PL data.
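To make the priors in the table concrete, the following is a minimal sketch of generating 2-PL response data under θ ~ normal(0, 1) and a ~ lognormal(0, 0.25). Several points are assumptions and are not confirmed by the table: the second lognormal argument is treated as the standard deviation on the log scale (it may instead be a variance), the difficulty distribution b ~ normal(0, 1) is a placeholder because the table does not show it, and the function name `simulate_2pl` is hypothetical.

```python
import math
import random


def simulate_2pl(n_persons, n_items, seed=0):
    """Simulate dichotomous responses from a 2-PL model (illustrative only)."""
    rng = random.Random(seed)
    # Ability: theta ~ normal(0, 1), as listed in the table.
    theta = [rng.gauss(0.0, 1.0) for _ in range(n_persons)]
    # Discrimination: a ~ lognormal(0, 0.25); 0.25 is ASSUMED to be the log-scale SD.
    a = [rng.lognormvariate(0.0, 0.25) for _ in range(n_items)]
    # Difficulty: ASSUMED normal(0, 1); not recoverable from the table.
    b = [rng.gauss(0.0, 1.0) for _ in range(n_items)]
    responses = []
    for t in theta:
        row = []
        for a_j, b_j in zip(a, b):
            # 2-PL item response function: P(correct) = logistic(a * (theta - b)).
            p = 1.0 / (1.0 + math.exp(-a_j * (t - b_j)))
            row.append(1 if rng.random() < p else 0)
        responses.append(row)
    return theta, a, b, responses


theta, a, b, responses = simulate_2pl(n_persons=250, n_items=10, seed=1)
print(len(responses), len(responses[0]))
```

Fixing every discrimination at 1 in this sketch gives 1-PL data, the case for which both CML and MML were used.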
Effect sizes (η.
| EM | 84.413 | 99.327 | 95.163 | 18.014 | 60.936 |
| TL | 11.484 | 72.505 | |||
| SS | 21.736 | ||||
| EM × SS | 9.436 | ||||
Marginal means by estimation method for 1-PL data.
| Estimation method | ab-RMSE | ab-SE | diff-RMSE | diff-SE | p-RMSE |
| CML | 0.60 | 0.58 | 0.09 | 0.09 | 0.11 |
| MCMC-hierarchical | 0.47 | 0.60 | 0.09 | 0.35 | 0.09 |
| MCMC-matched | 0.47 | 0.47 | 0.09 | 0.10 | 0.09 |
| MCMC-stdvague | 1.33 | 0.48 | 1.40 | 0.11 | 0.36 |
| MML | 0.60 | 0.59 | 0.10 | 0.10 | 0.16 |
| VB-hierarchical | 0.47 | 0.46 | 0.09 | 0.09 | 0.09 |
| VB-matched | 0.47 | 0.46 | 0.09 | 0.09 | 0.09 |
| VB-stdvague | 0.47 | 0.46 | 0.10 | 0.10 | 0.09 |
Marginal means aggregated over test length and sample size for the 1-PL model.
| diff-RMSE | 250 | 0.15 | 0.16 | 0.15 | 0.15 | 1.39 | 0.15 | 0.15 | 0.15 |
| | 500 | 0.10 | 0.11 | 0.10 | 0.10 | 1.39 | 0.10 | 0.10 | 0.10 |
| | 1000 | 0.07 | 0.08 | 0.07 | 0.07 | 1.40 | 0.07 | 0.07 | 0.07 |
| | 2000 | 0.05 | 0.06 | 0.05 | 0.05 | 1.40 | 0.05 | 0.05 | 0.05 |
| diff-SE | 250 | 0.15 | 0.16 | 0.55 | 0.16 | 0.16 | 0.15 | 0.15 | 0.15 |
| | 500 | 0.10 | 0.11 | 0.38 | 0.11 | 0.12 | 0.10 | 0.10 | 0.10 |
| | 1000 | 0.07 | 0.08 | 0.27 | 0.08 | 0.08 | 0.07 | 0.07 | 0.07 |
| | 2000 | 0.05 | 0.06 | 0.19 | 0.06 | 0.06 | 0.05 | 0.05 | 0.05 |
| ab-RMSE | 10 | 0.82 | 0.82 | 0.60 | 0.60 | 1.28 | 0.60 | 0.60 | 0.60 |
| | 20 | 0.58 | 0.58 | 0.47 | 0.47 | 1.33 | 0.47 | 0.47 | 0.47 |
| | 40 | 0.40 | 0.40 | 0.35 | 0.35 | 1.37 | 0.35 | 0.35 | 0.35 |
| ab-SE | 10 | 0.79 | 0.83 | 0.77 | 0.60 | 0.60 | 0.58 | 0.59 | 0.59 |
| | 20 | 0.56 | 0.57 | 0.59 | 0.47 | 0.47 | 0.46 | 0.46 | 0.46 |
| | 40 | 0.39 | 0.39 | 0.44 | 0.35 | 0.35 | 0.35 | 0.35 | 0.35 |
| p-RMSE | 10 | 0.14 | 0.19 | 0.11 | 0.11 | 0.36 | 0.11 | 0.11 | 0.11 |
| | 20 | 0.10 | 0.15 | 0.10 | 0.09 | 0.36 | 0.09 | 0.09 | 0.09 |
| | 40 | 0.07 | 0.13 | 0.07 | 0.07 | 0.37 | 0.07 | 0.07 | 0.07 |
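The by-condition means in this table relate to the overall marginal means by estimation method through simple averaging, which the short check below illustrates. It rests on an assumption inferred from the numbers rather than stated in the tables: that the first value column holds the CML estimates, and that the ab-RMSE and ab-SE rows correspond to the first two entries of the CML row (0.60 and 0.58) in the marginal means by estimation method. The variable names are hypothetical.

```python
from statistics import mean

# First value column of the ab-RMSE and ab-SE rows (test lengths 10, 20, 40).
# ASSUMPTION: this column holds the CML estimates.
first_column_by_test_length = {
    "ab-RMSE": [0.82, 0.58, 0.40],
    "ab-SE": [0.79, 0.56, 0.39],
}

# Average over test length and round to the two decimals used in the tables.
marginal = {
    measure: round(mean(values), 2)
    for measure, values in first_column_by_test_length.items()
}
print(marginal)
```

Under that assumption the averages reproduce the reported CML values of 0.60 and 0.58. Small discrepancies elsewhere can arise because the tabulated cells are already rounded.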
Effect sizes (η.
| EM | 56.32 | 16.58 | 50.18 | 48.92 | 42.51 | ||
| TL | 93.19 | 88.36 | 26.15 | ||||
| SS | 21.61 | 47.19 | 22.05 | 19.73 | |||
| EM × SS | 21.22 | 25.69 | 22.14 | 16.95 | |||
Marginal means aggregated over sample size for the 2-PL model.
| Ability RMSE | 10 | 0.61 | 0.59 | 0.60 | 0.61 | 0.59 | 0.59 |
| | 20 | 0.51 | 0.46 | 0.47 | 0.48 | 0.46 | 0.46 |
| | 40 | 0.46 | 0.35 | 0.35 | 0.36 | 0.35 | 0.35 |
| Ability SE | 10 | 0.82 | 0.59 | 0.60 | 1.37 | 0.98 | 0.58 |
| | 20 | 0.56 | 0.47 | 0.48 | 1.00 | 0.80 | 0.45 |
| | 40 | 0.38 | 0.35 | 0.37 | 0.69 | 0.63 | 0.34 |
| Probability RMSE | 10 | 0.14 | 0.12 | 0.12 | 0.12 | 0.12 | 0.11 |
| | 20 | 0.10 | 0.09 | 0.09 | 0.09 | 0.09 | 0.09 |
| | 40 | 0.08 | 0.07 | 0.07 | 0.07 | 0.07 | 0.07 |
For the 2-PL data, VB-stdvague estimates did not converge for several conditions and are therefore not reported.
Marginal means aggregated over test length for the 2-PL model.
| Difficulty RMSE | 250 | 0.41 | 0.18 | 0.6 | 0.26 | 0.19 | 0.18 |
| | 500 | 0.4 | 0.14 | 0.27 | 0.19 | 0.14 | 0.14 |
| | 1000 | 0.4 | 0.1 | 0.13 | 0.13 | 0.11 | 0.1 |
| | 2000 | 0.4 | 0.08 | 0.09 | 0.09 | 0.08 | 0.08 |
| Difficulty SE | 250 | 0.31 | 0.2 | 1.18 | 1.27 | 0.26 | 0.15 |
| | 500 | 0.21 | 0.15 | 0.35 | 0.92 | 0.19 | 0.11 |
| | 1000 | 0.13 | 0.11 | 0.15 | 0.58 | 0.14 | 0.08 |
| | 2000 | 0.09 | 0.08 | 0.1 | 0.38 | 0.1 | 0.05 |
| Discrimination RMSE | 250 | 0.23 | 0.16 | 0.46 | 0.19 | 0.17 | 0.16 |
| | 500 | 0.16 | 0.13 | 0.19 | 0.14 | 0.13 | 0.13 |
| | 1000 | 0.11 | 0.1 | 0.12 | 0.11 | 0.1 | 0.1 |
| | 2000 | 0.08 | 0.07 | 0.08 | 0.08 | 0.07 | 0.08 |
| Discrimination SE | 250 | 0.15 | 0.17 | 0.5 | 0.35 | 0.07 | 0.12 |
| | 500 | 0.1 | 0.14 | 0.19 | 0.29 | 0.05 | 0.09 |
| | 1000 | 0.07 | 0.11 | 0.12 | 0.25 | 0.04 | 0.06 |
| | 2000 | 0.05 | 0.08 | 0.08 | 0.2 | 0.03 | 0.05 |
Marginal means aggregated over sample size (SS) and test length (TL) for the 2-PL model.
| Ability | RMSE | 0.52 | 0.47 | 0.47 | 0.48 | 0.47 | 0.47 |
| | SE | 0.59 | 0.47 | 0.48 | 1.02 | 0.46 | 0.8 |
| Difficulty | RMSE | 0.4 | 0.13 | 0.27 | 0.17 | 0.13 | 0.13 |
| | SE | 0.18 | 0.14 | 0.44 | 0.79 | 0.1 | 0.17 |
| Discrimination | RMSE | 0.15 | 0.12 | 0.21 | 0.13 | 0.12 | 0.12 |
| | SE | 0.09 | 0.12 | 0.22 | 0.27 | 0.08 | 0.05 |
| Probability | RMSE | 0.11 | 0.09 | 0.09 | 0.09 | 0.09 | 0.09 |