VARIABLE SELECTION AND PREDICTION WITH INCOMPLETE HIGH-DIMENSIONAL DATA.

Ying Liu, Yuanjia Wang, Yang Feng, Melanie M Wall.

Abstract

We propose a Multiple Imputation Random Lasso (MIRL) method to select important variables and to predict the outcome for an epidemiological study of Eating and Activity in Teens. In this study, 80% of individuals have at least one variable missing. Therefore, using variable selection methods developed for complete data after listwise deletion substantially reduces prediction power. Recent work on prediction models in the presence of incomplete data cannot adequately account for large numbers of variables with arbitrary missing patterns. MIRL combines penalized regression techniques with multiple imputation and stability selection. Extensive simulation studies are conducted to compare MIRL with several alternatives. MIRL outperforms other methods in high-dimensional scenarios in terms of both reduced prediction error and improved variable selection performance, and its advantage is greater when the correlation among variables is high and the proportion of missing data is high. MIRL also shows improved performance compared with other applicable methods when applied to the study of Eating and Activity in Teens for boys and girls separately, and to a subgroup of low socioeconomic status (SES) Asian boys who are at high risk of developing obesity.
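The general idea of combining multiple imputation, a penalized regression, and stability selection can be illustrated with a minimal sketch. This is not the authors' MIRL algorithm: the imputation step here is a simple random draw from observed values (a stand-in for a proper multiple imputation procedure), the penalized fit is a plain lasso rather than a random lasso, and all function names, the penalty `lam`, the 0.8 threshold, and the simulated data are assumptions made for illustration only.

```python
import random

def soft_threshold(z, g):
    """Soft-thresholding operator used by the lasso coordinate update."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0

def lasso_cd(X, y, lam, n_iter=50):
    """Lasso via cyclic coordinate descent on centered data.
    Minimizes (1/(2n)) * ||y - X b||^2 + lam * ||b||_1."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    cols = [[row[j] - x_mean[j] for row in X] for j in range(p)]
    resid = [yi - y_mean for yi in y]  # residual when all coefficients are zero
    scale = [sum(v * v for v in col) / n for col in cols]
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            if scale[j] == 0.0:
                continue  # constant column in this sample, nothing to fit
            rho = sum(c * r for c, r in zip(cols[j], resid)) / n + scale[j] * beta[j]
            new = soft_threshold(rho, lam) / scale[j]
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - delta * c for r, c in zip(resid, cols[j])]
                beta[j] = new
    return beta

def impute_once(X, rng):
    """Assumed stand-in for one imputation: replace each None with a
    random draw from the observed values of the same column."""
    p = len(X[0])
    observed = [[row[j] for row in X if row[j] is not None] for j in range(p)]
    return [[row[j] if row[j] is not None else rng.choice(observed[j])
             for j in range(p)] for row in X]

def selection_frequencies(X, y, lam, n_imputations=5, n_bootstrap=10, seed=0):
    """Fraction of (imputation, bootstrap) lasso fits in which each
    variable receives a non-zero coefficient."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_imputations):
        X_imp = impute_once(X, rng)
        for _ in range(n_bootstrap):
            idx = [rng.randrange(n) for _ in range(n)]  # bootstrap rows
            beta = lasso_cd([X_imp[i] for i in idx], [y[i] for i in idx], lam)
            for j in range(p):
                if beta[j] != 0.0:
                    counts[j] += 1
    total = n_imputations * n_bootstrap
    return [c / total for c in counts]

# Simulated example (hypothetical data): only variables 0 and 1 affect y,
# and about 20% of predictor entries are set to missing at random.
rng = random.Random(1)
n, p = 80, 6
X_full = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [2.0 * r[0] - 1.5 * r[1] + rng.gauss(0, 0.5) for r in X_full]
X_miss = [[None if rng.random() < 0.2 else v for v in r] for r in X_full]

freqs = selection_frequencies(X_miss, y, lam=0.5)
selected = [j for j, f in enumerate(freqs) if f >= 0.8]  # assumed threshold
print(freqs, selected)
```

Pooling selection frequencies over imputations and bootstrap samples, rather than selecting on a single completed dataset, is what lets every individual contribute to the ranking instead of only those with complete records.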

Keywords:  Missing data; multiple imputation; random lasso; stability selection; variable ranking; variable selection

Year:  2016        PMID: 27213023      PMCID: PMC4872715          DOI: 10.1214/15-AOAS899

Source DB:  PubMed          Journal:  Ann Appl Stat        ISSN: 1932-6157            Impact factor:   2.083


