PURPOSE: Detection of differential item functioning (DIF) within an item response theory (IRT) framework is statistically powerful and often flags significant DIF that is of little clinical importance. This paper introduces two metrics for IRT DIF evaluation that help identify potentially problematic DIF among items flagged as showing statistically significant DIF. METHODS: We describe the computation of two DIF metrics: (1) a weighted area between the expected score curves (wABC) and (2) a difference in expected a posteriori scores across item response categories (dEAP). Their use is demonstrated with data from a 27-item cancer stigma index fielded to four adult samples: (1) Arabic speakers (N = 633) and (2) English speakers (N = 324) residing in Jordan and Egypt, and (3) English speakers (N = 500) and (4) Mandarin speakers (N = 500) residing in China. We used IRTPRO's DIF module to calculate IRT-based Wald chi-square DIF statistics by language within each region. After standard p value adjustments for multiple comparisons, we further evaluated DIF impact with wABC and dEAP. RESULTS: Twenty DIF comparisons remained statistically significant after p value adjustment. The wABCs for these items ranged from 0.13 to 0.90. Upon inspection of the curves, DIF comparisons with wABCs > 0.3 were deemed potentially problematic and were considered further for removal. The dEAP metric was also informative about the impact of DIF on expected scores, but was less consistently useful for narrowing down potentially problematic items. CONCLUSIONS: The wABC and dEAP function as DIF effect size indicators. These metrics can substantially augment IRT DIF evaluation by distinguishing truly problematic DIF items from those that are merely statistically significant.
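As a rough illustration of the wABC idea, the sketch below computes a weighted area between two expected score curves. It is not the authors' implementation, and the abstract does not give the exact formula. The sketch assumes a graded response model, absolute differences between the curves, and a normal density over the latent trait as the weight. The function names and all item parameters are hypothetical; only the 0.3 cutoff comes from the results reported above, where it was chosen after inspecting curves for this particular index.

```python
import math


def expected_score(theta, a, bs):
    """Expected item score under a graded response model.

    Categories are scored 0..len(bs), so the expected score equals the
    sum of the cumulative (boundary) response probabilities.
    """
    return sum(1.0 / (1.0 + math.exp(-a * (theta - b))) for b in bs)


def wabc(ref, foc, mu=0.0, sd=1.0, lo=-6.0, hi=6.0, n=601):
    """Weighted area between two expected score curves.

    ref and foc are (a, bs) tuples for the reference and focal groups.
    Assumption: the weight is a normal(mu, sd) density on an evenly spaced
    grid, normalized so the weights sum to 1.
    """
    step = (hi - lo) / (n - 1)
    grid = [lo + i * step for i in range(n)]
    weights = [math.exp(-0.5 * ((t - mu) / sd) ** 2) for t in grid]
    total = sum(weights)
    area = sum(
        w * abs(expected_score(t, *ref) - expected_score(t, *foc))
        for t, w in zip(grid, weights)
    )
    return area / total


# Hypothetical parameters for one 4-category item in two language groups.
reference = (1.5, [-1.0, 0.0, 1.0])
focal = (1.5, [-0.5, 0.5, 1.5])  # same slope, thresholds shifted by 0.5

value = wabc(reference, focal)
flagged = value > 0.3  # cutoff used in the cancer stigma example above
print(round(value, 3), flagged)
```

Because the weights are normalized, the result is on the item's score metric: it is 0 when the two curves coincide and cannot exceed the maximum item score.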