Abstract
The reliability of a test score is discussed from the viewpoint of underestimation of, and specifically deflation in, estimates of reliability. Many widely used estimators are known to underestimate reliability. Empirical cases have shown that, with certain types of datasets, estimates by widely used estimators such as alpha, theta, omega, and rho may be deflated by 0.60 units of reliability or even more. The reason for this radical deflation lies in the item-score correlation (Rit) embedded in the estimators: estimates by Rit are deflated when the numbers of categories in the two scales are far from each other, as is always the case with an item and a score, so the estimates of reliability are deflated as well. The article studies a short-cut method to reach estimates closer to the true magnitude: new types of estimators, deflation-corrected estimators of reliability (DCERs). The empirical section studies the characteristics of DCERs formed by combining different bases for estimators (alpha, theta, omega, and rho), different alternative estimators of correlation as the linking factor between the item and the score variable, and different conditions. Based on the simulation, an initial typology of the families of DCERs is presented: some estimators are better with binary items and some with polytomous items; some are better with small sample sizes and some with larger ones.
Keywords: coefficient alpha; coefficient omega; coefficient theta; deflation in reliability; deflation-corrected reliability; maximal reliability; reliability
Year: 2022 PMID: 35923730 PMCID: PMC9341485 DOI: 10.3389/fpsyg.2022.891959
Source DB: PubMed Journal: Front Psychol ISSN: 1664-1078
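The deflation mechanism described in the abstract can be illustrated with a small simulation. This is a minimal sketch, not the article's simulation design: the loading of 0.8, the threshold of −1.75 (roughly 96% of responses scored 1), the sample size, and all variable names are hypothetical choices, and a continuous latent variable stands in for a score with many categories.

```python
import math
import random


def pearson(x, y):
    """Plain product-moment correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(0)   # fixed seed so the run is repeatable
n = 20000                # hypothetical sample size
loading = 0.8            # hypothetical strength of the item-latent link
threshold = -1.75        # hypothetical cut point: a very easy binary item

# Continuous latent variable and a continuous item response driven by it.
latent = [rng.gauss(0, 1) for _ in range(n)]
response = [loading * t + math.sqrt(1 - loading**2) * rng.gauss(0, 1) for t in latent]

# The observed item keeps only two categories of the continuous response.
binary_item = [1 if r > threshold else 0 for r in response]

r_continuous = pearson(response, latent)
r_binary = pearson(binary_item, latent)

print(f"correlation with continuous response: {r_continuous:.3f}")
print(f"correlation with binary item:         {r_binary:.3f}")
```

Under these assumptions the product-moment correlation for the binary item comes out clearly lower than for the continuous response, even though both are generated from the same underlying relationship.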
Figure 1. The magnitude of a mechanical error on the estimates of correlation (MEC) by selected estimators of correlation. Tau-b = Kendall tau-b; Rir = Henrysson item–rest correlation (= PMC); Rit = item–total correlation (= PMC); eta = coefficient eta (X dependent); RS = Spearman rank-order correlation (= PMC); D = Somers delta (X dependent); D2 = dimension-corrected D; RPC = polychoric correlation; RREG = r-polyreg correlation; G = Goodman-Kruskal gamma; G2 = dimension-corrected G; RAC = attenuation-corrected Rit; EAC = attenuation-corrected eta.
Figure 2. Measurement models without and with elements of MEC. (A) Traditional measurement model. (B) Measurement model including elements of MEC.
Figure 3. MEC-corrected one-latent variable measurement model.
Average estimates of reliability and deviance from the population value in simulation.
| | | | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Estimate | 0.850 | 0.858 | 0.854 | 0.875 | 0.891 | 0.896 | 0.925 | 0.935 | 0.885 | 0.890 | 0.920 | 0.928 |
| Deviation | −0.016 | −0.001 | −0.012 | 0.012 | −0.009 | −0.002 | −0.005 | 0.008 | −0.005 | 0.001 | −0.001 | 0.007 |
| | 1,440 | 1,440 | 1,394 | 1,384 | 1,440 | 1,440 | 1,440 | 1,418 | 1,440 | 1,440 | 1,440 | 1,421 |
| | | | | | | | | | | | | |
| Estimate | 0.831 | 0.834 | 0.893 | 0.904 | 0.789 | 0.796 | 0.873 | 0.883 | 0.905 | 0.910 | 0.933 | 0.942 |
| Deviation | −0.009 | −0.005 | −0.005 | 0.009 | −0.010 | −0.002 | −0.005 | 0.009 | −0.009 | −0.001 | −0.005 | 0.009 |
| | 1,440 | 1,440 | 1,440 | 1,418 | 1,440 | 1,440 | 1,440 | 1,426 | 1,440 | 1,440 | 1,440 | 1,418 |
| | | | | | | | | | | | | |
| Estimate | 0.884 | 0.890 | 0.920 | 0.930 | 0.891 | 0.897 | 0.924 | 0.934 | 0.901 | 0.906 | 0.930 | 0.939 |
| Deviation | −0.010 | −0.002 | −0.005 | 0.009 | −0.007 | 0.001 | −0.003 | 0.010 | −0.006 | 0.001 | −0.002 | 0.010 |
| | 1,440 | 1,440 | 1,440 | 1,426 | 1,440 | 1,440 | 1,440 | 1,418 | 1,440 | 1,440 | 1,440 | 1,418 |
Average estimate.
Average deviation between the sample and population estimates.
Figure 4. Average estimates by DCERs based on the form of omega.
Figure 5. Deviance between sample and population estimates by DCERs based on the form of omega.
Figure 6. Deviance between sample and population estimates by a DCER based on omega.
Figure 7. Deviance between sample and population estimates in DCERs based on omega.
Figure 8. Average estimates of reliability and deviance from the population by sample size.
Figure 9. The behavior of DCERs based on omega by sample size. (A) Base omega by N; binary items. (B) Base omega by N; polytomous items. (C) Deviance by N; binary items. (D) Deviation by N; polytomous items.
Figure 10. The behavior of traditional estimators of reliability by the width of the score [df(X)]. The behavior of DCERs by the width of the score [df(X)]. (a) Base omega; binary items. (b) Base omega; polytomous items.
Figure 11. The behavior of DCERs by test difficulty (the highest and traditional estimates are highlighted). (a) Base omega; binary items. (b) Base omega; polytomous items.
Typology of selected deflation-corrected estimators of reliability and their characteristics.
| | | | | | | |
|---|---|---|---|---|---|---|
| General characteristics | | | • Reflects latent reliability, not strictly related to the observed score nor observed items | • Reflects reliability of the observed score but uses non-observed items | • Reflects reliability of observed score | • Reflects reliability of the observed score but uses non-observed items |
| Base | Alpha | • Always underestimates population reliability | | | | |
| | Theta | • Maximizes alpha | | | | |
| | Omega | • Always higher than alpha | | | | |
| | Rho (maximal reliability) | • Maximizes omega | | | | |
MEC-corrected estimators.
Typology of selected deflation-corrected estimators of reliability and their characteristics; attenuation-corrected estimators.
| | | | | |
|---|---|---|---|---|
| General characteristics | | | • Reflects reliability of the observed score but uses non-observed items | • Reflects reliability of the observed score but uses non-observed items |
| Base | Alpha | • Always underestimates population reliability | | |
| | Theta | • Maximizes alpha | | |
| | Omega | • Always higher than alpha | | |
| | Rho (maximal reliability) | • Maximizes omega | | |
Descriptive statistics of the test items from Metsämuuronen and Ukkola (2019) (N = 7,770).
| Item | Scale | Mean | Proportion of maximum | SD | Variance |
|---|---|---|---|---|---|
| g1 | 0–1 | 0.96 | 0.96 | 0.186 | 0.0348 |
| g2 | 0–1 | 0.98 | 0.98 | 0.126 | 0.0160 |
| g3 | 0–1 | 0.99 | 0.99 | 0.088 | 0.0078 |
| g4 | 0–1 | 0.91 | 0.91 | 0.287 | 0.0824 |
| g5 | 0–2 | 1.78 | 0.89 | 0.610 | 0.3715 |
| g6 | 0–1 | 0.98 | 0.98 | 0.122 | 0.0150 |
| g7 | 0–2 | 1.97 | 0.985 | 0.211 | 0.0446 |
| g8 | 0–2 | 1.98 | 0.99 | 0.169 | 0.0285 |
| SUM | | | | | 0.6004 |
| Score | 3–11 | 10.57 | 0.961 | 0.875 | 0.7650 |
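As a worked check on the table above, traditional coefficient alpha can be recomputed from it. This is a minimal sketch that assumes the last column holds the item and score variances (the squares of the standard deviations beside them) and uses the classical formula alpha = k/(k − 1) · (1 − Σσ²ᵢ/σ²ₓ); the function and variable names are illustrative only.

```python
# Item variances and score variance copied from the descriptive-statistics table
# (assumed to be the last column of each row).
item_variances = [0.0348, 0.0160, 0.0078, 0.0824, 0.3715, 0.0150, 0.0446, 0.0285]
score_variance = 0.7650


def coefficient_alpha(item_vars, total_var):
    """Classical coefficient alpha: k/(k-1) * (1 - sum of item variances / score variance)."""
    k = len(item_vars)
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


alpha = coefficient_alpha(item_variances, score_variance)
print(f"sum of item variances = {sum(item_variances):.4f}")
print(f"alpha = {alpha:.4f}")
```

The result lands close to the traditional alpha of 0.2450 reported in the table of reliability estimates; the small difference comes from the rounding of the tabulated variances.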
Principal component and factor loadings.
| Item | Principal component loading | Squared principal component loading | Factor loading (λ) | λ² | 1 − λ² | λ² / (1 − λ²) |
|---|---|---|---|---|---|---|
| g1 | 0.447 | 0.200 | 0.276 | 0.076 | 0.924 | 0.082 |
| g2 | 0.430 | 0.185 | 0.260 | 0.068 | 0.932 | 0.073 |
| g3 | 0.605 | 0.366 | 0.471 | 0.222 | 0.778 | 0.285 |
| g4 | 0.468 | 0.219 | 0.291 | 0.085 | 0.915 | 0.093 |
| g5 | 0.204 | 0.042 | 0.111 | 0.012 | 0.988 | 0.012 |
| g6 | 0.375 | 0.141 | 0.213 | 0.045 | 0.955 | 0.048 |
| g7 | 0.288 | 0.083 | 0.160 | 0.026 | 0.974 | 0.026 |
| g8 | 0.633 | 0.401 | 0.512 | 0.262 | 0.738 | 0.355 |
| SUM | | 1.636 | 2.294 | | 7.204 | 0.974 |
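The traditional theta, omega, and rho estimates follow from the four totals in the SUM row. The sketch below assumes those totals are, in order, the sum of squared principal component loadings (1.636), the sum of factor loadings (2.294), the sum of 1 − λ² (7.204), and the sum of λ²/(1 − λ²) (0.974), and it uses the standard textbook forms of Armor's theta, McDonald's omega, and maximal reliability. The variable names are illustrative.

```python
k = 8  # number of items g1..g8

# Totals taken from the SUM row of the loadings table (reading assumed as described above).
sum_sq_pc_loadings = 1.636   # sum of squared principal component loadings
sum_factor_loadings = 2.294  # sum of factor loadings
sum_uniquenesses = 7.204     # sum of (1 - loading^2)
sum_ratio = 0.974            # sum of loading^2 / (1 - loading^2)

# Armor's theta from the sum of squared principal component loadings.
theta = k / (k - 1) * (1 - 1 / sum_sq_pc_loadings)

# McDonald's omega from the factor loadings and uniquenesses.
omega = sum_factor_loadings**2 / (sum_factor_loadings**2 + sum_uniquenesses)

# Maximal reliability (rho) from the loading ratios.
rho_max = 1 / (1 + 1 / sum_ratio)

print(f"theta = {theta:.4f}, omega = {omega:.4f}, rho = {rho_max:.4f}")
```

Under this reading the three values agree, up to rounding, with the traditional estimates 0.4444, 0.4221, and 0.4934 in the table of reliability estimates.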
Estimators of correlation between the item and raw score.
| | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| g1 | 0.351 | 0.677 | 0.436 | 0.791 | 0.857 | 0.791 | 0.857 | 0.551 | 0.551 |
| g2 | 0.268 | 0.618 | 0.375 | 0.779 | 0.846 | 0.779 | 0.846 | 0.489 | 0.489 |
| g3 | 0.283 | 0.696 | 0.408 | 0.858 | 0.911 | 0.858 | 0.911 | 0.603 | 0.603 |
| g4 | 0.458 | 0.736 | 0.529 | 0.789 | 0.834 | 0.789 | 0.834 | 0.603 | 0.603 |
| g5 | 0.746 | 0.931 | 0.732 | 0.952 | 0.979 | 0.958 | 0.982 | 0.921 | 0.923 |
| g6 | 0.260 | 0.602 | 0.364 | 0.766 | 0.831 | 0.766 | 0.831 | 0.477 | 0.477 |
| g7 | 0.327 | 0.702 | 0.425 | 0.832 | 0.897 | 0.943 | 0.976 | 0.568 | 0.567 |
| g8 | 0.373 | 0.760 | 0.457 | 0.877 | 0.924 | 0.961 | 0.983 | 0.680 | 0.693 |
Derivatives of the estimators of correlation between an item and a raw score.
| | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| g1 | 0.035 | 0.065 | 0.126 | 0.147 | 0.160 | 0.147 | 0.160 | 0.103 | 0.103 |
| g2 | 0.016 | 0.034 | 0.078 | 0.098 | 0.107 | 0.098 | 0.107 | 0.062 | 0.062 |
| g3 | 0.008 | 0.025 | 0.061 | 0.076 | 0.080 | 0.076 | 0.080 | 0.053 | 0.053 |
| g4 | 0.082 | 0.131 | 0.211 | 0.226 | 0.239 | 0.226 | 0.239 | 0.173 | 0.173 |
| g5 | 0.372 | 0.455 | 0.568 | 0.580 | 0.597 | 0.584 | 0.598 | 0.561 | 0.562 |
| g6 | 0.015 | 0.032 | 0.074 | 0.094 | 0.102 | 0.094 | 0.102 | 0.058 | 0.058 |
| g7 | 0.045 | 0.069 | 0.148 | 0.176 | 0.189 | 0.199 | 0.206 | 0.120 | 0.120 |
| g8 | 0.028 | 0.063 | 0.128 | 0.148 | 0.156 | 0.162 | 0.166 | 0.115 | 0.117 |
| SUM | 0.600 | 0.874 | 1.395 | 1.546 | 1.630 | 1.587 | 1.658 | 1.245 | 1.248 |
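The SUM row can be turned into alpha-type reliability estimates. This is a sketch under two stated assumptions, neither of which is spelled out in the text here: that the first total (0.600) is the sum of item variances and the remaining totals are sums of item standard deviation times an item-score correlation estimate, and that the deflation-corrected alpha has the form k/(k − 1) · (1 − Σσ²ᵢ/(Σσᵢρᵢ)²). The function name is hypothetical.

```python
K_ITEMS = 8
SUM_ITEM_VARIANCES = 0.600  # first total in the SUM row (assumed: sum of item variances)

# Remaining totals in the SUM row (assumed: sum of SD_i * correlation_i, one per estimator).
weighted_sums = [0.874, 1.395, 1.546, 1.630, 1.587, 1.658, 1.245, 1.248]


def alpha_from_weighted_sum(sum_item_var, weighted_sum, k):
    """Alpha-type estimate with (sum of SD_i * rho_i)^2 in place of the score variance."""
    return k / (k - 1) * (1 - sum_item_var / weighted_sum**2)


alphas = [alpha_from_weighted_sum(SUM_ITEM_VARIANCES, w, K_ITEMS) for w in weighted_sums]
for w, a in zip(weighted_sums, alphas):
    print(f"weighted sum {w:.3f} -> alpha {a:.4f}")
```

With these assumptions the first total reproduces the traditional alpha (about 0.245), and the others reproduce the corrected alpha values in the table of reliability estimates to roughly three decimals (for example, 1.395 gives about 0.790 and 1.658 gives about 0.893). The third estimate in that table (0.4196) has no matching total in this SUM row.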
Estimates of reliability.
| | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| Alpha | 0.2450 (θX) | 0.7901 | 0.4196 | 0.8556 | 0.8846 | 0.8703 | 0.8934 | 0.7004 | 0.7025 |
| Theta | 0.4444 (θPC) | 0.8686 | 0.5200 | 0.9368 | 0.9610 | 0.9494 | 0.9684 | 0.7779 | 0.7802 |
| Omega | 0.4221 (θMLE) | 0.8952 | 0.6925 | 0.9473 | 0.9669 | 0.9572 | 0.9729 | 0.8310 | 0.8323 |
| Rho | 0.4934 (θMLE) | 0.9287 | 0.7353 | 0.9605 | 0.9795 | 0.9757 | 0.9891 | 0.9012 | 0.9031 |