Olawale F Ayilara (1), Tolulope T Sajobi (2), Ruth Barclay (3), Eric Bohm (4), Mohammad Jafari Jozani (5), Lisa M Lix (6).
1. Department of Community Health Sciences, University of Manitoba, S113-750 Bannatyne Avenue, Winnipeg, MB, R3E 0W3, Canada.
2. Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada.
3. Department of Physical Therapy, University of Manitoba, Winnipeg, MB, Canada.
4. Department of Surgery, University of Manitoba, Winnipeg, MB, Canada.
5. Department of Statistics, University of Manitoba, Winnipeg, MB, Canada.
6. Department of Community Health Sciences, University of Manitoba, S113-750 Bannatyne Avenue, Winnipeg, MB, R3E 0W3, Canada. lisa.lix@umanitoba.ca.
Abstract
PURPOSE: Item non-response (i.e., missing data) may mask the detection of differential item functioning (DIF) in patient-reported outcome measures or result in biased DIF estimates. Non-response can be challenging to address in ordinal data. We investigated an unsupervised machine-learning method for ordinal item-level imputation and compared it with commonly used item non-response methods when testing for DIF.
METHODS: Computer simulation and real-world data were used to assess several item non-response methods using the item response theory likelihood ratio test for DIF. The methods included: (a) list-wise deletion (LD), (b) half-mean imputation (HMI), (c) full information maximum likelihood (FIML), and (d) non-negative matrix factorization (NNMF), which adopts a machine-learning approach to impute missing values. Control of Type I error rates was evaluated using a liberal robustness criterion for α = 0.05 (i.e., 0.025-0.075). Statistical power was assessed with and without adoption of an item non-response method; differences > 10% were considered substantial.
RESULTS: Type I error rates for detecting DIF using the LD, FIML and NNMF methods were controlled within the bounds of the robustness criterion for > 95% of simulation conditions, although NNMF occasionally resulted in inflated rates. The HMI method always resulted in inflated error rates with 50% missing data. Differences in power to detect moderate DIF effects for the LD, FIML and NNMF methods were substantial with 50% missing data and otherwise insubstantial.
CONCLUSION: The NNMF method demonstrated comparable performance to commonly used non-response methods. This computationally efficient method represents a promising approach to address item-level non-response when testing for DIF.
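To make two of the item non-response methods and the robustness criterion named above more concrete, the following is a minimal Python sketch. It is not the authors' implementation. The function names, the factorization rank, the iteration count, the rounding of imputed values to the nearest response category, and the use of masked multiplicative updates for NNMF are all assumptions made for illustration; the abstract does not specify these details. Item responses are assumed to be coded as non-negative integers with missing values stored as NaN.

```python
import numpy as np


def half_mean_impute(X):
    """Half-mean imputation (assumed rule): a respondent who answered at
    least half of the items gets missing items replaced by the mean of
    their answered items, rounded to the nearest category. Respondents
    with fewer answered items are left with missing values."""
    X = np.asarray(X, dtype=float).copy()
    n_items = X.shape[1]
    for row in X:  # each row is a view, so assignment edits X in place
        observed = ~np.isnan(row)
        if observed.sum() * 2 >= n_items and not observed.all():
            row[~observed] = np.rint(row[observed].mean())
    return X


def nnmf_impute(X, rank=2, n_iter=500, seed=0, eps=1e-9):
    """Impute missing entries from a low-rank non-negative factorization
    X ~ W @ H fitted to the observed entries only (masked multiplicative
    updates). rank, n_iter and seed are illustrative choices."""
    X = np.asarray(X, dtype=float)
    mask = ~np.isnan(X)
    M = mask.astype(float)            # 1 where observed, 0 where missing
    X0 = np.where(mask, X, 0.0)       # missing cells zeroed; M ignores them
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], rank)) + 0.1
    H = rng.random((rank, X.shape[1])) + 0.1
    for _ in range(n_iter):
        W *= (X0 @ H.T) / ((M * (W @ H)) @ H.T + eps)
        H *= (W.T @ X0) / (W.T @ (M * (W @ H)) + eps)
    # Round fitted values to the observed ordinal range (an assumption).
    lo, hi = np.nanmin(X), np.nanmax(X)
    fitted = np.clip(np.rint(W @ H), lo, hi)
    return np.where(mask, X, fitted)  # observed responses are never changed


def within_liberal_criterion(rate, alpha=0.05):
    """Liberal robustness criterion: 0.5*alpha <= rate <= 1.5*alpha,
    i.e. 0.025 to 0.075 for alpha = 0.05."""
    return 0.5 * alpha <= rate <= 1.5 * alpha


if __name__ == "__main__":
    responses = np.array([
        [1, 2, np.nan, 2],
        [np.nan, np.nan, np.nan, 4],
        [3, 3, 3, 3],
        [5, 4, 5, np.nan],
    ])
    print(half_mean_impute(responses))
    print(nnmf_impute(responses))
    print(within_liberal_criterion(0.06))
```

In this sketch, half-mean imputation leaves the second respondent untouched because fewer than half of the items were answered, whereas the NNMF sketch fills every missing cell. A DIF analysis would then be run on the completed data; that step is outside the scope of this example.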