Selection of variables for multivariable models: Opportunities and limitations in quantifying model stability by resampling.

Christine Wallisch, Daniela Dunkler, Geraldine Rauch, Riccardo de Bin, Georg Heinze.

Abstract

Statistical models are often fitted to obtain a concise description of the association of an outcome variable with some covariates. Even if background knowledge is available to guide preselection of covariates, stepwise variable selection is commonly applied to remove irrelevant ones. This practice may introduce additional variability, and the resulting selection is rarely certain. However, these issues are often ignored and model stability is not questioned. Several resampling-based measures have been proposed to describe model stability, including variable inclusion frequencies (VIFs), model selection frequencies, relative conditional bias (RCB), and root mean squared difference ratio (RMSDR). The latter two were recently proposed to assess bias and variance inflation induced by variable selection. Here, we study the consistency and accuracy of resampling estimates of these measures and the optimal choice of the resampling technique. In particular, we compare subsampling and bootstrapping for assessing the stability of linear, logistic, and Cox models obtained by backward elimination in a simulation study. Moreover, we exemplify the estimation and interpretation of all suggested measures in a study on cardiovascular risk. The VIF and the model selection frequency are consistently estimated only in the subsampling approach. By contrast, the bootstrap is advantageous in terms of bias and precision for estimating the RCB as well as the RMSDR. However, unbiased estimation of the RMSDR requires independence of covariates, which is rarely encountered in practice. Our study stresses the importance of addressing model stability after variable selection and shows how to cope with it.
© 2020 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
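
To make the resampling idea in the abstract concrete, the following is a minimal sketch of how variable inclusion frequencies and model selection frequencies could be estimated for a linear model under subsampling and under the bootstrap. It is not the authors' implementation. The AIC-based stopping rule for backward elimination, the subsample fraction of 0.5, the simulated data, and all function names (`aic_linear`, `backward_elimination`, `resample_stability`) are assumptions made for illustration only.

```python
import numpy as np
from collections import Counter


def aic_linear(X, y, cols):
    """AIC (up to an additive constant) of an OLS fit with intercept on `cols`."""
    n = len(y)
    design = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ beta) ** 2))
    return n * np.log(rss / n) + 2 * design.shape[1]


def backward_elimination(X, y):
    """Drop one covariate at a time while doing so lowers the AIC (assumed criterion)."""
    selected = list(range(X.shape[1]))
    best = aic_linear(X, y, selected)
    while selected:
        # AIC of every model obtained by removing exactly one remaining covariate
        candidates = [
            (aic_linear(X, y, [c for c in selected if c != j]), j) for j in selected
        ]
        cand_aic, drop = min(candidates)
        if cand_aic >= best:
            break  # no single removal improves the fit criterion
        best = cand_aic
        selected.remove(drop)
    return tuple(selected)


def resample_stability(X, y, n_rep=200, scheme="subsample", frac=0.5, seed=0):
    """Return (inclusion frequency per covariate, selection frequency per model)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    inclusion = np.zeros(p)
    models = Counter()
    for _ in range(n_rep):
        if scheme == "bootstrap":
            idx = rng.integers(0, n, size=n)  # n draws with replacement
        else:
            idx = rng.choice(n, size=int(frac * n), replace=False)  # without replacement
        model = backward_elimination(X[idx], y[idx])
        for j in model:
            inclusion[j] += 1
        models[model] += 1
    return inclusion / n_rep, {m: c / n_rep for m, c in models.items()}


# Simulated example: only the first two of five covariates affect the outcome.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = 1.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(size=200)

vif_sub, msf_sub = resample_stability(X, y, scheme="subsample")
vif_boot, msf_boot = resample_stability(X, y, scheme="bootstrap")
print("Inclusion frequencies (subsampling):", np.round(vif_sub, 2))
print("Inclusion frequencies (bootstrap):  ", np.round(vif_boot, 2))
print("Most frequent model (subsampling):  ", max(msf_sub, key=msf_sub.get))
```

Running both schemes on the same data allows a side-by-side look at the two sets of frequencies, which need not agree. Extending the sketch to logistic or Cox models, or to the RCB and RMSDR, would require the definitions given in the full paper.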

Keywords:  backward elimination; bootstrap; stability measures; subsampling; variable selection

Year: 2020; PMID: 33089538; PMCID: PMC7820988; DOI: 10.1002/sim.8779

Source DB: PubMed; Journal: Stat Med; ISSN: 0277-6715; Impact factor: 2.373

