Variable selection in the presence of missing data: resampling and imputation.

Qi Long, Brent A Johnson.

Abstract

In the presence of missing data, variable selection methods need to be tailored to missing data mechanisms and statistical approaches used for handling missing data. We focus on the mechanism of missing at random and variable selection methods that can be combined with imputation. We investigate a general resampling approach (BI-SS) that combines bootstrap imputation and stability selection, the latter of which was developed for fully observed data. The proposed approach is general and can be applied to a wide range of settings. Our extensive simulation studies demonstrate that the performance of BI-SS is the best or close to the best and is relatively insensitive to tuning parameter values in terms of variable selection, compared with several existing methods for both low-dimensional and high-dimensional problems. The proposed approach is further illustrated using two applications, one for a low-dimensional problem and the other for a high-dimensional problem.
© The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

Keywords:  Bootstrap imputation; Missing data; Resampling; Stability selection; Variable selection
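
The abstract names the two ingredients of BI-SS (bootstrap imputation and stability selection) without listing the steps. The sketch below is one plausible reading, not the authors' algorithm: draw bootstrap samples from the incomplete data, impute each sample, run a sparse selector on it, and keep the variables whose selection frequency across samples reaches a cutoff. Everything named here is an assumption made for illustration: the function names (impute_mean, lasso_support, bi_ss), the mean imputation (a crude stand-in for a real imputation model), the plain lasso selector, and the values of lam, n_boot and threshold.

```python
import random

def impute_mean(rows):
    """Fill missing entries (None) with the observed column mean.

    A deliberately crude stand-in for a proper imputation model.
    """
    p = len(rows[0])
    means = []
    for j in range(p):
        obs = [r[j] for r in rows if r[j] is not None]
        means.append(sum(obs) / len(obs) if obs else 0.0)
    return [[means[j] if r[j] is None else r[j] for j in range(p)] for r in rows]

def standardized_columns(X):
    """Return the columns of X, centered and scaled to unit variance."""
    n = len(X)
    cols = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        m = sum(col) / n
        sd = (sum((v - m) ** 2 for v in col) / n) ** 0.5 or 1.0  # guard constant columns
        cols.append([(v - m) / sd for v in col])
    return cols

def soft_threshold(z, t):
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_support(X, y, lam, n_sweeps=50):
    """Indices with nonzero lasso coefficients, found by cyclic coordinate descent
    on (1/2n)||y - Xb||^2 + lam * ||b||_1 with standardized columns."""
    cols = standardized_columns(X)
    n = len(y)
    y_mean = sum(y) / n
    resid = [v - y_mean for v in y]
    beta = [0.0] * len(cols)
    for _ in range(n_sweeps):
        for j, xj in enumerate(cols):
            # Correlation of column j with the partial residual (column j added back).
            rho = sum(a * r for a, r in zip(xj, resid)) / n + beta[j]
            new = soft_threshold(rho, lam)
            delta = new - beta[j]
            if delta:
                resid = [r - delta * a for r, a in zip(resid, xj)]
                beta[j] = new
    return {j for j, b in enumerate(beta) if b != 0.0}

def bi_ss(X, y, lam, n_boot=30, threshold=0.6, seed=0):
    """Hypothetical BI-SS-style loop: bootstrap, impute, select, tally frequencies."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]   # resample rows with replacement
        Xb = impute_mean([X[i] for i in idx])        # impute within the bootstrap sample
        yb = [y[i] for i in idx]
        for j in lasso_support(Xb, yb, lam):
            counts[j] += 1
    freq = [c / n_boot for c in counts]
    selected = [j for j in range(p) if freq[j] >= threshold]
    return freq, selected

# Toy data: only x0 and x1 affect y; about 20% of x1 is deleted completely at
# random, which is a special case of missing at random.
rng = random.Random(1)
X, y = [], []
for _ in range(120):
    row = [rng.gauss(0, 1) for _ in range(6)]
    y.append(2.0 * row[0] - 2.0 * row[1] + rng.gauss(0, 1))
    if rng.random() < 0.2:
        row[1] = None
    X.append(row)

freq, selected = bi_ss(X, y, lam=0.5)
print("selection frequencies:", freq)
print("selected variables:", selected)
```

Because selection is decided by frequency across resamples rather than by a single fit, a variable only needs to be picked consistently, which is one intuition for why such a procedure may be less sensitive to the exact penalty value. In practice the mean imputation would be replaced by an imputation model suited to the missing-data mechanism.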

Year: 2015; PMID: 25694614; PMCID: PMC5156376; DOI: 10.1093/biostatistics/kxv003

Source DB: PubMed; Journal: Biostatistics; ISSN: 1465-4644; Impact factor: 5.899


