Comparison of methods for imputing ordinal data using multivariate normal imputation: a case study of non-linear effects in a large cohort study.

Katherine J Lee, John C Galati, Julie A Simpson, John B Carlin.

Abstract

BACKGROUND: Multiple imputation is becoming increasingly popular for handling missing data, with Markov chain Monte Carlo assuming multivariate normality (MVN) a commonly used approach. Imputing categorical variables (which are clearly non-normal) using MVN imputation is challenging, and several approaches have been suggested. However, it remains unclear which approach should be preferred.
METHODS: We explore methods for imputing ordinal variables using MVN imputation, including imputing as a continuous variable and as a set of indicators, and various methods for assigning imputed values to the possible categories (rounding), for estimating a non-linear association between an ordinal exposure and binary outcome. We introduce a new approach where we impute as continuous and assign imputed values into categories based on the mean indicators imputed in a separate round of imputation. We compare these approaches in a simple setting where we make 50% of data in an ordinal exposure missing completely at random, within an otherwise complete real dataset.
RESULTS: Methods that impute the ordinal exposure as continuous distorted the non-linear exposure-outcome association by biasing the relationship towards linearity irrespective of the rounding method. In contrast, imputing using indicators preserved the non-linear association but not the marginal distribution of the ordinal variable.
CONCLUSIONS: Imputing ordinal variables as continuous can bias the estimation of the exposure-outcome association in the presence of non-linear relationships. Further work is needed to develop optimal methods for handling ordinal (and nominal) variables when using MVN imputation.
Copyright © 2012 John Wiley & Sons, Ltd.
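
The distortion described in RESULTS can be illustrated with a small simulation. The following is a minimal sketch, not the authors' procedure or data. It assumes a simulated four-level exposure with a U-shaped risk, replaces full MVN imputation with a single stochastic draw from a normal linear model of the exposure on the outcome, and uses simple rounding to the nearest category. All function names and parameter values are hypothetical.

```python
import numpy as np

def make_mcar(x, frac, rng):
    """Return a float copy of x with a random fraction set to NaN (MCAR)."""
    x = x.astype(float).copy()
    n_miss = int(round(frac * x.size))
    idx = rng.choice(x.size, size=n_miss, replace=False)
    x[idx] = np.nan
    return x

def impute_continuous_and_round(x_miss, y, levels, rng):
    """Impute missing x as continuous from a normal linear model on y,
    then round to the nearest category (simple rounding).
    Simplification: one draw with fixed parameters, not proper multiple imputation."""
    obs = ~np.isnan(x_miss)
    design = np.column_stack([np.ones(y.size), y])  # intercept + binary outcome
    beta, *_ = np.linalg.lstsq(design[obs], x_miss[obs], rcond=None)
    resid = x_miss[obs] - design[obs] @ beta
    sigma = resid.std(ddof=2)  # residual SD, two fitted coefficients
    draws = design[~obs] @ beta + rng.normal(0.0, sigma, size=(~obs).sum())
    out = x_miss.copy()
    out[~obs] = np.clip(np.rint(draws), levels.min(), levels.max())
    return out.astype(int)

def outcome_rate_by_level(x, y, levels):
    """Proportion with y == 1 within each exposure category."""
    return np.array([y[x == k].mean() for k in levels])

rng = np.random.default_rng(2012)
levels = np.arange(4)
n = 20000

# Assumed data-generating model: U-shaped risk across four ordinal levels.
x = rng.integers(0, 4, size=n)
risk = np.array([0.40, 0.15, 0.15, 0.40])
y = rng.binomial(1, risk[x])

x_miss = make_mcar(x, 0.5, rng)  # 50% missing completely at random
x_imp = impute_continuous_and_round(x_miss, y, levels, rng)

rate_true = outcome_rate_by_level(x, y, levels)
rate_imp = outcome_rate_by_level(x_imp, y, levels)
print("complete data :", rate_true.round(3))
print("after imputing:", rate_imp.round(3))
```

Because the linear imputation model cannot represent the U shape, the imputed values carry almost no information about the outcome, and the category-specific outcome rates are pulled towards each other. A fuller comparison would draw the model parameters from their posterior, repeat the imputation several times, and combine the estimates.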

Year: 2012   PMID: 22826110   DOI: 10.1002/sim.5445

Source DB: PubMed   Journal: Stat Med   ISSN: 0277-6715   Impact factor: 2.373


  3 in total

1.  Comparison of methods for imputing limited-range variables: a simulation study.

Authors:  Laura Rodwell; Katherine J Lee; Helena Romaniuk; John B Carlin
Journal:  BMC Med Res Methodol       Date:  2014-04-26       Impact factor: 4.615

2.  Multiple imputation methods for handling missing values in a longitudinal categorical variable with restrictions on transitions over time: a simulation study.

Authors:  Anurika Priyanjali De Silva; Margarita Moreno-Betancur; Alysha Madhu De Livera; Katherine Jane Lee; Julie Anne Simpson
Journal:  BMC Med Res Methodol       Date:  2019-01-10       Impact factor: 4.615

3.  Model checking in multiple imputation: an overview and case study.

Authors:  Cattram D Nguyen; John B Carlin; Katherine J Lee
Journal:  Emerg Themes Epidemiol       Date:  2017-08-23