Patrícia Martinková, Dan Goldhaber, Elena Erosheva.
Abstract
Ratings are present in many areas of assessment including peer review of research proposals and journal articles, teacher observations, university admissions and selection of new hires. One feature present in any rating process with multiple raters is that different raters often assign different scores to the same assessee, with the potential for bias and inconsistencies related to rater or assessee covariates. This paper analyzes disparities in ratings of internal and external applicants to teaching positions using applicant data from Spokane Public Schools. We first test for biases in rating while accounting for measures of teacher applicant qualifications and quality. Then, we develop model-based inter-rater reliability (IRR) estimates that allow us to account for various sources of measurement error, the hierarchical structure of the data, and to test whether covariates, such as applicant status, moderate IRR. We find that applicants external to the district receive lower ratings for job applications compared to internal applicants. This gap in ratings remains significant even after including measures of qualifications and quality such as experience, state licensure scores, or estimated teacher value added. With model-based IRR, we further show that consistency between raters is significantly lower when rating external applicants. We conclude the paper by discussing policy implications and possible applications of our model-based IRR estimate for hiring and selection practices in and out of the teacher labor market.
Year: 2018 PMID: 30289923 PMCID: PMC6173388 DOI: 10.1371/journal.pone.0203002
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
54-point screening rubric.
| Criterion | Look for … |
|---|---|
| Certificate and Education | Note completion of course of study, certificate held (current or pending), and education. |
| Training | Look for quality, depth, and level of candidate’s additional training related to position. |
| Experience | Note the degree to which experience supports the prediction of success, not just the number of years. A beginning candidate could be rated highly. |
| Management | Look for specific references to successful strategies. This may mean |
| Flexibility | Note multiple endorsements, activity, coaching interests, student, building or district, or community support. Willing to learn new concepts and procedures; successfully teaches a variety of assignments; effectively uses various teaching styles. |
| Instructional Skills | Look for specific references in support of skill in this area: plans; implements; evaluates; relates to students; creative; employs multiple approaches; monitors and adjusts; uses culturally responsive strategies appropriate to age, background, and intended learning of students. |
| Interpersonal Skills | Develops and maintains effective working relationships with diverse staff, students, parents/guardians, and community. |
| Cultural Competency | Look for specific references to successful strategies for building and maintaining a relationship with each student and their family. This may not be explicitly mentioned, but the following strategies offer some evidence of cultural competency: specific instructional strategies providing each student access to a rigorous curriculum, inclusive/respectful language about students and families, a belief that all children can achieve at high levels, mention of conflict resolution/restorative practices, specific instructional strategies for integrating culturally responsive materials that are also rigorous, and appropriate statements about their work with diverse populations. Note relevant training, coursework, and authors/book titles listed. |
| Preferred qualifications | Look for possession of qualifications as indicated on the job posting. |
Applicant characteristics for internal and external applicant ratings.
| Characteristics | Internal Obs. | Internal N | Internal Mean | Internal SD | External Obs. | External N | External Mean | External SD | Effect size |
|---|---|---|---|---|---|---|---|---|---|
| Gender (Female ratio) | 2257 | 644 | 0.77 | 0.42 | 1024 | 392 | 0.67 | 0.47 | 0.23 |
| Teaching experience | 2322 | 678 | 3.35 | 4.87 | 1149 | 461 | 4.62 | 5.34 | 0.25 |
| WEST-B | | | | | | | | | |
| Average | 1056 | 251 | -0.04 | 0.71 | 355 | 148 | -0.11 | 0.75 | 0.10 |
| Math | 1060 | 252 | -0.04 | 1.09 | 355 | 148 | -0.04 | 1.01 | 0.00 |
| Reading | 1057 | 252 | -0.08 | 0.89 | 355 | 148 | -0.21 | 0.96 | 0.14 |
| Writing | 1056 | 251 | 0.01 | 0.78 | 355 | 148 | -0.09 | 0.89 | 0.12 |
| 54-Pt Rubric | | | | | | | | | |
| Total | 2322 | 678 | 39.13 | 6.63 | 1152 | 463 | 35.22 | 6.80 | 0.58 |
| Certificate and Education | 2226 | 668 | 5.13 | 0.80 | 1100 | 446 | 4.91 | 1.04 | 0.24 |
| Training | 2314 | 677 | 4.11 | 1.12 | 1137 | 460 | 3.56 | 1.18 | 0.48 |
| Experience | 2322 | 678 | 4.21 | 1.00 | 1151 | 463 | 3.77 | 1.09 | 0.42 |
| Management | 2301 | 676 | 4.22 | 0.94 | 1145 | 462 | 3.75 | 1.02 | 0.48 |
| Flexibility | 2313 | 678 | 4.37 | 0.92 | 1146 | 461 | 3.99 | 1.00 | 0.39 |
| Instructional Skills | 2316 | 678 | 4.34 | 0.98 | 1147 | 463 | 3.82 | 1.03 | 0.52 |
| Interpersonal Skills | 2310 | 678 | 4.52 | 0.86 | 1143 | 461 | 4.14 | 1.00 | 0.41 |
| Cultural Competency | 2302 | 677 | 4.12 | 0.93 | 1141 | 461 | 3.70 | 1.09 | 0.41 |
| Preferred qualifications | 1720 | 614 | 4.09 | 1.28 | 840 | 391 | 3.58 | 1.27 | 0.40 |
| Later VA | | | | | | | | | |
| Math | 271 | 83 | -0.04 | 0.23 | 32 | 17 | -0.05 | 0.14 | 0.05 |
| Read | 279 | 83 | -0.09 | 0.19 | 57 | 24 | -0.06 | 0.15 | 0.15 |
Notes: WEST-B: scores on the state licensure test, standardized statewide. VA: teacher value-added estimates based on changes in student performance on achievement tests. Obs.: number of observations. N: number of applicants. SD: standard deviation. Significance levels for p values corrected for multiple comparisons: * p < 0.05, ** p < 0.01, *** p < 0.001.
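The excerpt does not say how the effect-size column was computed. As a minimal sketch, assuming it is the absolute standardized mean difference (Cohen's d) with a pooled standard deviation weighted by the number of observations, the "Total" row of the 54-point rubric can be checked as follows. The function name `cohens_d` and the choice of Obs. (rather than N) as the sample size are assumptions made for illustration.

```python
import math


def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Standardized mean difference using the pooled standard deviation.

    ASSUMPTION: the source does not state its effect-size formula;
    this is the textbook Cohen's d, shown only as a plausibility check.
    """
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var)


# "Total" row of the 54-Pt Rubric: internal vs. external applicants,
# using the Obs. columns as sample sizes (an assumption).
d_total = cohens_d(39.13, 6.63, 2322, 35.22, 6.80, 1152)
print(round(abs(d_total), 2))  # prints 0.58, matching the table
```

Using the number of applicants (N) instead of the number of observations gives the same value after rounding for this row, so the check does not distinguish between the two weightings.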
Fig 1. Distribution of total ratings for internal and external applicants.
Fig 2. Distribution of subcomponent ratings for internal and external applicants.
Mixed effect models for summative total score.
| | Model A | Model B | Model C | Model D1 | Model D2 | Model D |
|---|---|---|---|---|---|---|
| Covariates | Internal Only | Experience Only | WEST-B | VA Math Only | VA Read Only | Both VA |
| N | 3474 | 3473 | 1411 | 303 | 336 | 267 |
| Intercept | 36.03 | 35.57 | 36.23 | 37.34 | 36.96 | 36.74 |
| | (0.48) | (0.50) | (0.60) | (1.32) | (1.11) | (1.37) |
| Internal | 3.09 | 3.16 | 2.84 | 3.97 | 4.15 | 4.80 |
| | (0.31) | (0.31) | (0.50) | (1.29) | (1.11) | (1.35) |
| Experience | - | 0.11 | - | - | - | - |
| | | (0.03) | | | | |
| WEST-B | | | | | | |
| Writing | - | - | 0.11 | - | - | - |
| | | | (0.35) | | | |
| Reading | - | - | 0.40 | - | - | - |
| | | | (0.33) | | | |
| Math | - | - | 0.09 | - | - | - |
| | | | (0.27) | | | |
| Later VA | | | | | | |
| Math | - | - | - | 3.9 | - | 5.62 |
| | | | | (2.00) | | (2.46) |
| Reading | - | - | - | - | 3.29 | -3.10 |
| | | | | | (2.27) | (3.04) |
| Appl:Sch | 15.52 | 15.58 | 16.43 | 13.33 | 12.50 | 10.64 |
| | (3.94) | (3.95) | (4.05) | (3.65) | (3.54) | (3.26) |
| Appl | 10.22 | 10.26 | 5.16 | 4.97 | 5.37 | 3.75 |
| | (3.20) | (3.20) | (2.27) | (2.23) | (2.32) | (1.94) |
| Rtr | 12.07 | 11.96 | 11.25 | 10.42 | 11.14 | 12.21 |
| | (3.50) | (3.46) | (3.35) | (2.23) | (3.34) | (3.49) |
| Sch | 2.24 | 2.15 | 2.26 | 1.20 | 0.00 | 0.00 |
| | (1.50) | (1.47) | (1.50) | (1.10) | (0.00) | (0.00) |
| Residual | 21.15 | 20.95 | 21.28 | 14.07 | 15.85 | 15.71 |
| | (4.60) | (4.58) | (4.61) | (3.75) | (3.98) | (3.96) |
Notes: WEST-B: scores on the state licensure test, standardized statewide. VA: teacher value-added estimates. Significance levels for p values: * p < 0.05, ** p < 0.01, *** p < 0.001.
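To make the variance rows of the mixed effect models easier to read, the sketch below turns the Model A components into shares of total variance and computes a ratio of applicant-related variance to within-school variance. The ratio used in `within_school_irr` is an assumed definition (applicant plus applicant-within-school variance over that sum plus rater and residual variance); this excerpt does not give the paper's formula, and Model A pools internal and external applicants, so the result is not expected to match the separately reported IRR values.

```python
# Variance components for Model A, copied from the table above.
model_a = {
    "Appl:Sch": 15.52,
    "Appl": 10.22,
    "Rtr": 12.07,
    "Sch": 2.24,
    "Residual": 21.15,
}


def variance_shares(components):
    """Return each variance component as a fraction of the total variance."""
    total = sum(components.values())
    return {name: value / total for name, value in components.items()}


def within_school_irr(components):
    """ASSUMED single-rater IRR definition (not confirmed by the source):
    applicant-related variance over applicant-related plus rater and
    residual variance, with school-level variance left out.
    """
    signal = components["Appl"] + components["Appl:Sch"]
    noise = components["Rtr"] + components["Residual"]
    return signal / (signal + noise)


shares = variance_shares(model_a)
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
print(f"assumed within-school IRR: {within_school_irr(model_a):.2f}")
```

Under this assumed definition, the rater and residual components together account for more than half of the within-school variance in Model A, which is why adding raters has a visible effect on reliability.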
Fig 3. Mean and range of summative ratings of applicants rated multiple times between 2009 and 2013.
Each vertical line connects the summative ratings given to a single applicant during this period. Applicants are ordered by average summative rating (solid circles).
Fig 4. Variance decomposition for internal and external applicants calculated using Model (3) jointly on all data.
Fig 5. Within-school IRR estimates for internal applicants, external applicants, and their difference, including bootstrap confidence intervals, calculated using Model (3) jointly on all data.
Effect of number of raters on reliability, standard error of measurement (SEM), and predictive validity of scoring.
| Rubric component | Applicants | IRR, 1 rater | IRR, 2 raters | IRR, 3 raters | SEM, 1 rater | SEM, 2 raters | SEM, 3 raters | Corr. with VA, 1 rater | Corr. with VA, 2 raters | Corr. with VA, 3 raters | Corr. with VA, SEM = 0 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Total | Internal | 0.51 | 0.67 | 0.76 | 5.46 | 4.44 | 3.84 | 0.17** | 0.19*** | 0.20*** | 0.23*** |
| Total | External | 0.42 | 0.59 | 0.68 | 6.05 | 5.08 | 4.47 | 0.17** | 0.20*** | 0.21*** | 0.26*** |
| Certificate and Education | Internal | 0.36 | 0.53 | 0.63 | 0.87 | 0.75 | 0.66 | 0.01 | 0.02 | 0.02 | 0.02 |
| Certificate and Education | External | 0.42 | 0.59 | 0.68 | 0.91 | 0.77 | 0.68 | 0.01 | 0.02 | 0.02 | 0.02 |
| Training | Internal | 0.44 | 0.61 | 0.70 | 0.96 | 0.80 | 0.70 | 0.09 | 0.11 | 0.12* | 0.14* |
| Training | External | 0.39 | 0.56 | 0.65 | 1.06 | 0.90 | 0.79 | 0.09 | 0.11 | 0.12* | 0.15* |
| Experience | Internal | 0.47 | 0.64 | 0.73 | 0.86 | 0.71 | 0.62 | 0.11 | 0.13* | 0.14* | 0.16* |
| Experience | External | 0.44 | 0.61 | 0.70 | 0.93 | 0.77 | 0.68 | 0.11 | 0.13* | 0.14* | 0.17* |
| Management | Internal | 0.42 | 0.59 | 0.68 | 0.87 | 0.73 | 0.64 | 0.19*** | 0.23*** | 0.24*** | 0.30*** |
| Management | External | 0.39 | 0.56 | 0.66 | 0.91 | 0.77 | 0.68 | 0.19*** | 0.23*** | 0.25*** | 0.31*** |
| Flexibility | Internal | 0.40 | 0.57 | 0.67 | 0.86 | 0.73 | 0.64 | 0.13* | 0.16** | 0.17** | 0.21*** |
| Flexibility | External | 0.37 | 0.54 | 0.64 | 0.90 | 0.77 | 0.68 | 0.13* | 0.16** | 0.17** | 0.22*** |
| Instructional Skills | Internal | 0.51 | 0.68 | 0.76 | 0.80 | 0.65 | 0.56 | 0.22*** | 0.25*** | 0.27*** | 0.31*** |
| Instructional Skills | External | 0.46 | 0.63 | 0.72 | 0.86 | 0.72 | 0.62 | 0.22*** | 0.26*** | 0.28*** | 0.33*** |
| Interpersonal Skills | Internal | 0.38 | 0.55 | 0.65 | 0.84 | 0.72 | 0.64 | 0.14* | 0.17** | 0.19*** | 0.23*** |
| Interpersonal Skills | External | 0.36 | 0.53 | 0.63 | 0.91 | 0.78 | 0.69 | 0.14* | 0.17** | 0.19*** | 0.24*** |
| Cultural Competency | Internal | 0.35 | 0.51 | 0.61 | 0.95 | 0.82 | 0.73 | 0.11 | 0.14* | 0.15* | 0.19*** |
| Cultural Competency | External | 0.33 | 0.49 | 0.59 | 1.01 | 0.87 | 0.78 | 0.11 | 0.14* | 0.15* | 0.19*** |
| Preferred qualifications | Internal | 0.43 | 0.61 | 0.70 | 1.16 | 0.97 | 0.85 | 0.08 | 0.10 | 0.10 | 0.12* |
| Preferred qualifications | External | 0.38 | 0.55 | 0.65 | 1.21 | 1.03 | 0.91 | 0.08 | 0.10 | 0.11 | 0.13* |
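The pattern in the table above, where reliability and the estimated correlation with VA both rise with the number of raters, can be illustrated with two standard psychometric formulas. This is a sketch under assumptions: the excerpt does not state how the multi-rater values were derived, and the function names `spearman_brown` and `attenuated_correlation` are introduced here for illustration. The Spearman-Brown formula projects single-rater reliability to the average of k raters, and the attenuation formula scales an error-free correlation by the square root of the reliability.

```python
import math


def spearman_brown(irr_single, k):
    """Projected reliability of the mean of k raters (Spearman-Brown)."""
    return k * irr_single / (1 + (k - 1) * irr_single)


def attenuated_correlation(r_error_free, reliability):
    """Correlation expected when only the rating is measured with error.

    ASSUMPTION: classical attenuation with the VA measure treated as fixed.
    """
    return r_error_free * math.sqrt(reliability)


# First table row (internal applicants): single-rater IRR 0.51,
# error-free (SEM = 0) correlation with VA 0.23.
for k in (1, 2, 3):
    irr_k = spearman_brown(0.51, k)
    r_k = attenuated_correlation(0.23, irr_k)
    print(f"{k} rater(s): IRR = {irr_k:.3f}, correlation with VA = {r_k:.3f}")
```

Starting from the rounded single-rater values, these formulas land within about 0.01 of the tabulated two- and three-rater entries; small differences are expected because the inputs are already rounded to two decimals.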