OBJECTIVE: Differential item functioning (DIF) analyses are increasingly used to evaluate health-related quality of life (HRQoL) instruments, which often include relatively short subscales. Computer simulations were used to explore how various factors including scale length affect analysis of DIF by ordinal logistic regression.

STUDY DESIGN AND SETTING: Simulated data, representative of HRQoL scales with four-category items, were generated. The power and type I error rates of the DIF method were then investigated when, respectively, DIF was deliberately introduced and when no DIF was added. The sample size, scale length, floor effects (FEs) and significance level were varied.

RESULTS: When there was no DIF, type I error rates were close to 5%. Detecting moderate uniform DIF in a two-item scale required a sample size of 300 per group for adequate (>80%) power. For longer scales, a sample size of 200 was adequate. Considerably larger sample sizes were required to detect nonuniform DIF, when there were extreme FEs or when a reduced type I error rate was required.

CONCLUSION: The impact of the number of items in the scale was relatively small. Ordinal logistic regression successfully detects DIF for HRQoL instruments with short scales. Sample size guidelines are provided.
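As a rough illustration of the kind of data such a simulation might generate, the sketch below draws four-category item responses from a proportional-odds model and adds a constant (uniform) shift for one group on a single studied item. It is not the authors' simulation code. The function names (`simulate_item`, `simulate_scale`, `item_mean`), the thresholds, the slope, and the DIF magnitude of 1.0 logit are all assumptions chosen for illustration, and the abstract does not state the values actually used.

```python
import math
import random


def logistic(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_item(theta, group, thresholds, slope=1.0, uniform_dif=0.0, rng=random):
    """Draw one ordinal response (0..len(thresholds)) from a proportional-odds model.

    P(Y >= k) = logistic(slope * theta - tau_k + uniform_dif * group).
    A nonzero uniform_dif shifts every cumulative logit by the same amount
    for group 1, which is one common way to represent uniform DIF.
    """
    u = rng.random()
    shift = uniform_dif * group
    # Cumulative probabilities decrease with k, so counting the thresholds
    # that u falls below yields the sampled category.
    return sum(u < logistic(slope * theta - tau + shift) for tau in thresholds)


def simulate_scale(n_per_group, n_items, uniform_dif, seed=0):
    """Simulate two groups answering a short scale; only item 0 carries DIF."""
    rng = random.Random(seed)
    thresholds = (-1.0, 0.0, 1.0)  # assumed values: three cut points, four categories
    rows = []
    for group in (0, 1):  # 0 = reference group, 1 = focal group
        for _ in range(n_per_group):
            theta = rng.gauss(0.0, 1.0)  # latent HRQoL level, same distribution in both groups
            items = [
                simulate_item(theta, group, thresholds,
                              uniform_dif=uniform_dif if i == 0 else 0.0, rng=rng)
                for i in range(n_items)
            ]
            rows.append((group, items))
    return rows


def item_mean(rows, group, item_index):
    """Mean observed score on one item within one group."""
    scores = [items[item_index] for g, items in rows if g == group]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    data = simulate_scale(n_per_group=300, n_items=2, uniform_dif=1.0, seed=42)
    for g in (0, 1):
        print("group", g, "item 0 mean:", round(item_mean(data, g, 0), 3))
```

The sketch covers data generation only. Testing for DIF would then involve fitting an ordinal logistic regression of the studied item on a matching score, group membership and, for nonuniform DIF, their interaction, which requires a statistics library (for example, `OrderedModel` in statsmodels). Repeating the generate-and-test cycle many times gives empirical type I error (with `uniform_dif=0.0`) and power (with a nonzero value).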