Laura Kolbe, Terrence D. Jorgensen.
Abstract
Restricted factor analysis (RFA) is a powerful method to test for uniform differential item functioning (DIF), but it may require empirically selecting anchor items to prevent inflated Type I error rates. We conducted a simulation study to compare two empirical anchor-selection strategies: a one-step rank-based strategy and an iterative selection procedure. Unlike the iterative procedure, the rank-based strategy had a low risk and degree of contamination within the empirically selected anchor set, even with small samples. To detect nonuniform DIF, RFA requires an interaction effect with the latent factor. The latent moderated structural equations (LMS) method has been applied to RFA and has revealed inflated Type I error rates. We propose using product indicators (PI) as a more widely available alternative to measure the latent interaction. A simulation study, involving several sample-size conditions and magnitudes of uniform and nonuniform DIF, revealed that PI obtained similar power but lower Type I error rates, as compared to LMS.
Keywords: Differential item functioning; Factor analysis; Latent moderated structural equations; Measurement invariance; Product indicators
Year: 2019 PMID: 30402814 PMCID: PMC6420445 DOI: 10.3758/s13428-018-1151-3
Source DB: PubMed Journal: Behav Res Methods ISSN: 1554-351X
Fig. 1. An RFA model with LMS for DIF detection. The items X9 and X10 are the anchor items. The dashed and dotted arrows represent effects that may be estimated in order to test for uniform and nonuniform DIF, respectively.
Fig. 2. An RFA model with product indicators for DIF detection. The items X9 and X10 are the anchor items. The dashed and dotted arrows represent effects that may be estimated in order to test for uniform and nonuniform DIF, respectively.
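As a rough illustration of what a product indicator is, the sketch below multiplies mean-centered item scores by a centered group code, which is one common way to build observed indicators of a group-by-factor interaction. The function name `product_indicators`, the centering choice, the choice of which items to multiply, and the data are all assumptions made for illustration; none of them is specified in the text above.

```python
import numpy as np

def product_indicators(items, group, center=True):
    """Return item-by-group products, one column per item.

    items : (n_respondents, n_items) array of item scores
    group : (n_respondents,) array with the group code (e.g., 0/1)
    center: mean-center both factors of the product first (an assumption)
    """
    items = np.asarray(items, dtype=float)
    group = np.asarray(group, dtype=float)
    if center:
        items = items - items.mean(axis=0)
        group = group - group.mean()
    # Broadcast the group code across the item columns.
    return items * group[:, None]

# Hypothetical scores of six respondents on two items.
items = np.array([[3., 4.], [2., 2.], [5., 4.], [1., 2.], [4., 5.], [3., 1.]])
group = np.array([0, 0, 0, 1, 1, 1])
pi = product_indicators(items, group)
```

The resulting columns would then be entered into the factor model as indicators of the interaction; how they are specified there is outside the scope of this sketch.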
Percentages of replications using latent moderated structural equations (LMS) with invalid results in Study 1
| Sample size | Invalid results, small DIF (%) | Invalid results, large DIF (%) |
|---|---|---|
| 50 | 23.10 | 22.50 |
| 100 | 18.00 | 15.50 |
| 150 | 19.50 | 17.40 |
| 200 | 24.40 | 26.30 |
DIF = differential item functioning. The total number of replications in each condition was 1,000. Only the percentages of invalid results when using LMS are reported in this table, because none of the replications with product indicators obtained invalid results.
Results of the anchor-selection strategies for each of the conditions in Study 1
| Method | Size of DIF | Sample size | Risk (%): RB (20%) | Risk (%): RB (70%) | Risk (%): IP | Degree (%) (count): RB (20%) | Degree (%) (count): RB (70%) | Degree (%) (count): IP |
|---|---|---|---|---|---|---|---|---|
| LMS | Small | 50 | 6.11 | 62.68 | 88.30 | 3.06 (0.061) | 9.23 (0.646) | 14.14 (1.203) |
| LMS | Small | 100 | 1.46 | 36.83 | 73.05 | 0.73 (0.015) | 5.26 (0.368) | 11.39 (0.973) |
| LMS | Small | 150 | 0.62 | 23.35 | 55.78 | 0.31 (0.006) | 3.34 (0.234) | 9.43 (0.826) |
| LMS | Small | 200 | 0.13 | 14.68 | 43.52 | 0.07 (0.001) | 2.10 (0.147) | 8.14 (0.735) |
| LMS | Large | 50 | 0.52 | 24.52 | 52.77 | 0.26 (0.005) | 3.52 (0.247) | 8.77 (0.764) |
| LMS | Large | 100 | 0.12 | 4.26 | 26.04 | 0.06 (0.001) | 0.61 (0.043) | 5.91 (0.556) |
| LMS | Large | 150 | 0.00 | 0.85 | 17.68 | 0.00 (0.000) | 0.12 (0.008) | 4.64 (0.452) |
| LMS | Large | 200 | 0.95 | 0.95 | 20.90 | 0.47 (0.009) | 0.14 (0.009) | 5.93 (0.585) |
| PI | Small | 50 | 5.80 | 65.70 | 89.50 | 2.90 (0.058) | 9.66 (0.676) | 12.33 (0.995) |
| PI | Small | 100 | 2.00 | 43.40 | 75.30 | 1.00 (0.020) | 6.20 (0.434) | 9.53 (0.753) |
| PI | Small | 150 | 0.40 | 31.80 | 57.30 | 0.20 (0.004) | 4.54 (0.318) | 7.29 (0.573) |
| PI | Small | 200 | 0.30 | 21.30 | 43.70 | 0.15 (0.003) | 3.04 (0.213) | 5.54 (0.437) |
| PI | Large | 50 | 2.90 | 31.80 | 49.60 | 1.45 (0.029) | 4.63 (0.324) | 6.38 (0.506) |
| PI | Large | 100 | 0.20 | 7.30 | 11.00 | 0.10 (0.002) | 1.04 (0.073) | 1.42 (0.111) |
| PI | Large | 150 | 0.10 | 2.20 | 0.50 | 0.05 (0.001) | 0.31 (0.022) | 0.07 (0.005) |
| PI | Large | 200 | 0.00 | 0.40 | 0.20 | 0.00 (0.000) | 0.06 (0.004) | 0.02 (0.002) |
The average count (i.e., the average number of DIF items in the anchor set) is reported in parentheses alongside the average degree (as a percentage) of contamination. DIF = differential item functioning; LMS = latent moderated structural equations; PI = product indicators; RB (20%) = rank-based strategy selecting 20% of all items as anchors; RB (70%) = rank-based strategy selecting 70% of all items as anchors; IP = iterative procedure. Risk of contamination = percentage of replications in which the anchor set contained at least one item exhibiting DIF. Degree of contamination = percentage of items exhibiting DIF in the anchor set averaged over all replications.
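The definitions in this note can be made concrete with a small sketch. It assumes that the rank-based strategy keeps the items with the smallest DIF test statistics (the ranking criterion is not stated here), and it uses hypothetical statistics for ten items of which X1 to X3 are treated as the DIF items. The names `rank_based_anchors` and `contamination` are illustrative, not taken from the source.

```python
def rank_based_anchors(dif_stats, proportion):
    """Pick the given proportion of items with the smallest DIF statistics.

    Ranking by test statistic is an assumption made for this sketch.
    """
    n_anchors = max(1, round(proportion * len(dif_stats)))
    ranked = sorted(dif_stats, key=dif_stats.get)  # ascending statistic
    return set(ranked[:n_anchors])

def contamination(anchor_sets, dif_items):
    """Return (risk %, degree %, average count) over replications."""
    counts = [len(anchors & dif_items) for anchors in anchor_sets]
    n_reps = len(anchor_sets)
    # Risk: share of replications with at least one DIF item among the anchors.
    risk = 100 * sum(c > 0 for c in counts) / n_reps
    # Degree: share of DIF items within each anchor set, averaged over replications.
    degree = 100 * sum(c / len(a) for c, a in zip(counts, anchor_sets)) / n_reps
    return risk, degree, sum(counts) / n_reps

# Two hypothetical replications with made-up statistics.
dif_items = {"X1", "X2", "X3"}
rep1 = {"X1": 9.1, "X2": 7.4, "X3": 6.0, "X4": 0.3, "X5": 1.2,
        "X6": 0.8, "X7": 2.5, "X8": 0.1, "X9": 0.5, "X10": 1.9}
rep2 = {"X1": 0.2, "X2": 8.0, "X3": 5.5, "X4": 1.1, "X5": 0.9,
        "X6": 2.2, "X7": 0.4, "X8": 3.0, "X9": 1.5, "X10": 0.7}
anchor_sets = [rank_based_anchors(rep, 0.20) for rep in (rep1, rep2)]
risk, degree, avg_count = contamination(anchor_sets, dif_items)
```

With a fixed anchor-set size, the degree is simply the average count divided by that size. This agrees with the RB columns of the table if the test has ten items, so that 20% corresponds to two anchors: for example, 0.646 / 7 is about 9.23%.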
Average numbers of items selected as anchors in the iterative procedure
| Size of DIF | Sample size | Number of items in the anchor set, LMS | Number of items in the anchor set, PI |
|---|---|---|---|
| Small | 50 | 8.112 | 7.897 |
| Small | 100 | 7.878 | 7.671 |
| Small | 150 | 7.739 | 7.476 |
| Small | 200 | 7.631 | 7.350 |
| Large | 50 | 7.679 | 7.415 |
| Large | 100 | 7.460 | 7.034 |
| Large | 150 | 7.366 | 6.929 |
| Large | 200 | 7.479 | 6.920 |
DIF = differential item functioning; LMS = latent moderated structural equations; PI = product indicators. Ideally, only seven items would be included in the anchor set (i.e., the seven items without DIF in the population).
Percentages of replications with invalid results in Study 2 for the best-case and empirical scenarios
| Method | Size of DIF | Sample size | Invalid results, best-case (%) | Invalid results, empirical (%) |
|---|---|---|---|---|
| LMS | Small | 50 | 21.50 | 24.40 |
| LMS | Small | 100 | 18.10 | 19.70 |
| LMS | Small | 150 | 26.50 | 22.50 |
| LMS | Small | 200 | 31.40 | 26.40 |
| LMS | Large | 50 | 21.50 | 24.10 |
| LMS | Large | 100 | 18.00 | 18.30 |
| LMS | Large | 150 | 25.60 | 22.60 |
| LMS | Large | 200 | 32.50 | 28.90 |
DIF = differential item functioning. The total number of replications in each condition was 1,000. Only the percentages of invalid results when using latent moderated structural equations (LMS) are reported in this table, because none of the replications with product indicators obtained invalid results.
Power of the latent moderated structural equations (LMS) and product indicators (PI) methods under each condition of the best-case scenario in Study 2
| Type of DIF | Sample size | Small DIF, LMS | Small DIF, PI | Large DIF, LMS | Large DIF, PI |
|---|---|---|---|---|---|
| Uniform | 50 | .828 | .737 | .932 | .981 |
| Uniform | 100 | .960 | .995 | .977 | 1.000 |
| Uniform | 150 | .973 | 1.000 | .991 | 1.000 |
| Uniform | 200 | .994 | 1.000 | .991 | 1.000 |
| Nonuniform | 50 | .162 | .108 | .535 | .464 |
| Nonuniform | 100 | .341 | .218 | .834 | .839 |
| Nonuniform | 150 | .544 | .358 | .882 | .973 |
| Nonuniform | 200 | .672 | .493 | .825 | .996 |
| Combination | 50 | .660 | .710 | .925 | .977 |
| Combination | 100 | .947 | .994 | .966 | 1.000 |
| Combination | 150 | .980 | 1.000 | .974 | 1.000 |
| Combination | 200 | .993 | 1.000 | .982 | 1.000 |
DIF = differential item functioning; small uniform DIF = a difference of 0.5 in intercepts across groups; large uniform DIF = a difference of 0.8 in intercepts across groups; small nonuniform DIF = a difference of 0.25 in factor loadings across groups; large nonuniform DIF = a difference of 0.5 in factor loadings across groups.
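To make these definitions concrete, the sketch below generates scores on a single item for two groups, where uniform DIF shifts the intercept and nonuniform DIF shifts the factor loading in the second group. Only the DIF magnitudes (0.5 and 0.25 here) come from the note above; the baseline intercept, loading, residual standard deviation, standard-normal factor, and the function name `simulate_item` are hypothetical choices for illustration.

```python
import numpy as np

def simulate_item(n_per_group, intercept_diff=0.0, loading_diff=0.0,
                  intercept=0.0, loading=0.7, resid_sd=0.7, seed=0):
    """Simulate one item for two equally sized groups (coded 0 and 1).

    intercept_diff: uniform DIF (intercept difference across groups)
    loading_diff  : nonuniform DIF (loading difference across groups)
    The remaining defaults are hypothetical baseline values.
    """
    rng = np.random.default_rng(seed)
    group = np.repeat([0, 1], n_per_group)
    factor = rng.standard_normal(2 * n_per_group)
    noise = rng.normal(0.0, resid_sd, 2 * n_per_group)
    item = (intercept + intercept_diff * group
            + (loading + loading_diff * group) * factor + noise)
    return group, factor, item

# Small uniform DIF: the group means on the item differ by about 0.5.
g, f, x = simulate_item(20000, intercept_diff=0.5)
mean_gap = x[g == 1].mean() - x[g == 0].mean()

# Small nonuniform DIF: the item-on-factor slopes differ by about 0.25.
g2, f2, x2 = simulate_item(20000, loading_diff=0.25)
slope0 = np.polyfit(f2[g2 == 0], x2[g2 == 0], 1)[0]
slope1 = np.polyfit(f2[g2 == 1], x2[g2 == 1], 1)[0]
slope_gap = slope1 - slope0
```

In a real analysis the factor is latent, so the slopes could not be computed this way; regressing on the true factor scores is used here only to show what the two kinds of DIF do to the data.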
Type I error rates of latent moderated structural equations (LMS) and product indicators (PI) under each condition of the best-case scenario in Study 2
| Size of DIF | Sample size | Type I error [95% CI], LMS | Type I error [95% CI], PI |
|---|---|---|---|
| Small | 50 | | .058 [.045, .074] |
| Small | 100 | | |
| Small | 150 | | .051 [.039, .067] |
| Small | 200 | | .047 [.035, .062] |
| Large | 50 | | .059 [.046, .075] |
| Large | 100 | | |
| Large | 150 | | .051 [.039, .067] |
| Large | 200 | | .049 [.037, .064] |
DIF = differential item functioning. Bold font indicates that the lower 95% confidence limit exceeds the nominal 5% alpha level, implying the Type I error rate is statistically significantly inflated. The square brackets contain Agresti–Coull confidence intervals around the error rates.
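The Agresti–Coull interval mentioned in this note can be computed directly from an error rate and the number of replications. The sketch below uses the standard formula, which adds z²/2 pseudo-successes and z² pseudo-trials before forming a Wald-type interval. The function name `agresti_coull` is illustrative, and the denominator of 1,000 assumes that every replication produced a valid result.

```python
import math

def agresti_coull(rate, n, z=1.959964):
    """Agresti-Coull confidence interval for a proportion.

    rate: observed proportion (e.g., a Type I error rate)
    n   : number of trials (replications; assumed to be 1,000 below)
    z   : standard-normal quantile (default gives a 95% interval)
    """
    successes = rate * n
    n_adj = n + z ** 2                        # adjusted number of trials
    p_adj = (successes + z ** 2 / 2) / n_adj  # adjusted proportion
    half_width = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return p_adj - half_width, p_adj + half_width

lower, upper = agresti_coull(0.058, 1000)
print(f"[{lower:.3f}, {upper:.3f}]")  # [0.045, 0.074]
```

Under that assumption the result reproduces the interval reported for the rate of .058. A rate counts as significantly inflated in the sense of this note when `lower` is above .05.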
Power of the latent moderated structural equations (LMS) and product indicators (PI) methods under each condition of the empirical scenario in Study 2
| Type of DIF | Sample size | Small DIF, LMS | Small DIF, PI | Large DIF, LMS | Large DIF, PI |
|---|---|---|---|---|---|
| Uniform | 50 | .718 | .560 | .906 | .949 |
| Uniform | 100 | .963 | .979 | .969 | .998 |
| Uniform | 150 | .983 | 1.000 | .994 | .999 |
| Uniform | 200 | .997 | 1.000 | .992 | 1.000 |
| Nonuniform | 50 | .168 | .057 | .573 | .422 |
| Nonuniform | 100 | .367 | .156 | .859 | .825 |
| Nonuniform | 150 | .563 | .288 | .894 | .980 |
| Nonuniform | 200 | .696 | .414 | .834 | .997 |
| Combination | 50 | .577 | .537 | .920 | .948 |
| Combination | 100 | .949 | .977 | .963 | .999 |
| Combination | 150 | .974 | .999 | .984 | 1.000 |
| Combination | 200 | .997 | 1.000 | .997 | 1.000 |
DIF = differential item functioning; small uniform DIF = a difference of 0.5 in intercepts across groups; large uniform DIF = a difference of 0.8 in intercepts across groups; small nonuniform DIF = a difference of 0.25 in factor loadings across groups; large nonuniform DIF = a difference of 0.5 in factor loadings across groups.
Type I error rates of latent moderated structural equations (LMS) and product indicators (PI) under each condition of the empirical scenario in Study 2
| Size of DIF | Sample size | Type I error [95% CI], LMS | Type I error [95% CI], PI |
|---|---|---|---|
| Small | 50 | | .022 [.014, .033] |
| Small | 100 | | .026 [.018, .038] |
| Small | 150 | | .023 [.015, .034] |
| Small | 200 | | .035 [.025, .048] |
| Large | 50 | | .048 [.036, .063] |
| Large | 100 | | .041 [.030, .055] |
| Large | 150 | | .032 [.023, .045] |
| Large | 200 | | .032 [.023, .045] |
DIF = differential item functioning. Bold font indicates that the lower 95% confidence limit exceeds the nominal 5% alpha level, implying the Type I error rate is statistically significantly inflated. The square brackets contain Agresti–Coull confidence intervals around the error rates.