An Iterative Parametric Bootstrap Approach to Evaluating Rater Fit.

Wenjing Guo, Stefanie A Wind.

Abstract

When analysts evaluate performance assessments, they often use modern measurement theory models to identify raters who frequently give ratings that differ from what would be expected given the quality of the performance. To detect problematic scoring patterns, two rater fit statistics, the infit and outfit mean square error (MSE) statistics, are routinely used. However, interpreting these statistics is not straightforward. A common practice is for researchers to apply established rule-of-thumb critical values to interpret infit and outfit MSE statistics. Unfortunately, prior studies have shown that these rule-of-thumb values may not be appropriate in many empirical situations. Parametric bootstrapped critical values for infit and outfit MSE statistics provide a promising alternative approach to identifying item and person misfit in item response theory (IRT) analyses. However, researchers have not examined the performance of this approach for detecting rater misfit. In this study, we illustrate a bootstrap procedure that researchers can use to identify critical values for infit and outfit MSE statistics, and we use a simulation study to assess the false-positive and true-positive rates of these two statistics. We observed that the false-positive rates were highly inflated and the true-positive rates were relatively low. We therefore propose an iterative parametric bootstrap procedure to overcome these limitations. The results indicate that using the iterative procedure to establish 95% critical values for the infit and outfit MSE statistics yields better-controlled false-positive rates and higher true-positive rates than the traditional parametric bootstrap procedure and rule-of-thumb critical values.
© The Author(s) 2021.
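The bootstrap and iterative procedures described in the abstract can be sketched roughly as follows. This is a minimal illustration only: it assumes a simplified dichotomous facets-style model with crude moment estimators rather than the authors' actual many-facet Rasch calibration, and all function names (`simulate`, `fit_stats`, `bootstrap_criticals`, `iterative_bootstrap_flags`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(7)

def simulate(theta, severity, rng=rng):
    # Dichotomous facets-style model: P(x = 1) = sigmoid(theta_n - lambda_r)
    p = 1.0 / (1.0 + np.exp(-(theta[:, None] - severity[None, :])))
    return (rng.random(p.shape) < p).astype(float), p

def fit_stats(x, p):
    # Infit (information-weighted) and outfit (unweighted) MSE per rater
    w = p * (1.0 - p)                       # binomial variance of each rating
    resid2 = (x - p) ** 2
    outfit = (resid2 / w).mean(axis=0)
    infit = resid2.sum(axis=0) / w.sum(axis=0)
    return infit, outfit

def bootstrap_criticals(theta, severity, B=200, q=0.95, rng=rng):
    # Parametric bootstrap: simulate B data sets from the (estimated) model and
    # take the q-quantile of each rater's fit statistic as its critical value.
    inf_b, out_b = [], []
    for _ in range(B):
        x, p = simulate(theta, severity, rng)
        i, o = fit_stats(x, p)
        inf_b.append(i)
        out_b.append(o)
    return np.quantile(inf_b, q, axis=0), np.quantile(out_b, q, axis=0)

def estimate(x, keep):
    # Crude moment estimators standing in for a proper facets calibration:
    # ability from each person's mean over retained raters, severity from rater means.
    eps = 1e-3
    pm = np.clip(x[:, keep].mean(axis=1), eps, 1 - eps)
    theta = np.log(pm / (1 - pm))
    rm = np.clip(x.mean(axis=0), eps, 1 - eps)
    severity = theta.mean() - np.log(rm / (1 - rm))
    return theta, severity

def iterative_bootstrap_flags(x, max_iter=5, rng=rng):
    # Iterative procedure: flag raters whose observed infit/outfit exceed the
    # bootstrapped criticals, re-estimate without the flagged raters, repeat
    # until the set of flagged raters stabilizes.
    keep = np.ones(x.shape[1], dtype=bool)
    for _ in range(max_iter):
        theta, severity = estimate(x, keep)
        p = 1.0 / (1.0 + np.exp(-(theta[:, None] - severity[None, :])))
        infit, outfit = fit_stats(x, p)
        crit_in, crit_out = bootstrap_criticals(theta, severity, rng=rng)
        new_keep = ~((infit > crit_in) | (outfit > crit_out))
        converged = (new_keep == keep).all()
        keep = keep & new_keep
        if converged or keep.sum() < 2:
            break
    return ~keep  # True = flagged as misfitting

# Demo: 200 examinees, 8 consistent raters, then replace 2 raters with noise.
theta_true = rng.normal(size=200)
severity_true = rng.normal(scale=0.5, size=8)
x, _ = simulate(theta_true, severity_true)
x[:, :2] = (rng.random((200, 2)) < 0.5)   # raters 0 and 1 score at random
flags = iterative_bootstrap_flags(x)
```

The key design point mirrors the abstract: rather than comparing observed infit/outfit MSE values to fixed rule-of-thumb cutoffs, critical values are derived from the sampling distribution of the statistics under the fitted model, and the iterative variant re-derives them after excluding flagged raters so that misfitting raters do not contaminate the calibration.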

Keywords:  false-positive rates; parametric bootstrap method; rater-mediated assessment; true-positive rate

Year:  2021        PMID: 34565938      PMCID: PMC8361373          DOI: 10.1177/01466216211013105

Source DB:  PubMed          Journal:  Appl Psychol Meas        ISSN: 0146-6216


  9 in total

1.  Computing confidence intervals of item fit statistics in the family of Rasch models using the bootstrap method.

Authors:  Ya-Hui Su; Ching-Fan Sheu; Wen-Chung Wang
Journal:  J Appl Meas       Date:  2007

2.  Weighted kappa: nominal scale agreement with provision for scaled disagreement or partial credit.

Authors:  J Cohen
Journal:  Psychol Bull       Date:  1968-10       Impact factor: 17.737

3.  Using item mean squares to evaluate fit to the Rasch model.

Authors:  R M Smith; R E Schumacker; M J Bush
Journal:  J Outcome Meas       Date:  1998

4.  A Study of Rasch, partial credit, and rating scale model parameter recovery in WINSTEPS and jMetrik.

Authors:  J Patrick Meyer; Emily Hailey
Journal:  J Appl Meas       Date:  2012

5.  Examining rating quality in writing assessment: rater agreement, error, and accuracy.

Authors:  Stefanie A Wind; George Engelhard
Journal:  J Appl Meas       Date:  2012

6.  Exploring the Combined Effects of Rater Misfit and Differential Rater Functioning in Performance Assessments.

Authors:  Stefanie A Wind; Wenjing Guo
Journal:  Educ Psychol Meas       Date:  2019-04-02       Impact factor: 2.821

7.  A critique of Rasch residual fit statistics.

Authors:  G Karabatsos
Journal:  J Appl Meas       Date:  2000

8.  A bootstrap approach to evaluating person and item fit to the Rasch model.

Authors:  Edward W Wolfe
Journal:  J Appl Meas       Date:  2013

9.  Using the Bootstrap Method to Evaluate the Critical Range of Misfit for Polytomous Rasch Fit Statistics.

Authors:  Hyunsoo Seol
Journal:  Psychol Rep       Date:  2016-05-19
