Subsampling versus bootstrapping in resampling-based model selection for multivariable regression.

Riccardo De Bin, Silke Janitza, Willi Sauerbrei, Anne-Laure Boulesteix.

Abstract

In recent years, increasing attention has been devoted to the problem of the stability of multivariable regression models, understood as the resistance of the model to small changes in the data on which it has been fitted. Resampling techniques, mainly based on the bootstrap, have been developed to address this issue. In particular, the approaches based on the idea of "inclusion frequency" consider the repeated implementation of a variable selection procedure, for example backward elimination, on several bootstrap samples. The analysis of the variables selected in each iteration provides useful information on the model stability and on the variables' importance. Recent findings, nevertheless, show possible pitfalls in the use of the bootstrap, and alternatives such as subsampling have begun to be taken into consideration in the literature. Using model selection frequencies and variable inclusion frequencies, we empirically compare these two different resampling techniques, investigating the effect of their use in selected classical model selection procedures for multivariable regression. We conduct our investigations by analyzing two real data examples and by performing a simulation study. Our results reveal some advantages in using a subsampling technique rather than the bootstrap in this context.
© 2015, The International Biometric Society.
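
The inclusion-frequency idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The AIC-based stopping rule for backward elimination, the subsample fraction of 0.632, the number of resamples, and the simulated data are all assumptions made for illustration, and the names aic_ols, backward_elimination, and inclusion_frequencies are hypothetical.

```python
import numpy as np


def aic_ols(X, y, cols):
    """AIC (up to a constant) of a Gaussian OLS fit with intercept on the given columns."""
    n = len(y)
    design = np.column_stack([np.ones(n), X[:, np.array(cols, dtype=int)]])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ beta) ** 2))
    return n * np.log(rss / n) + 2 * design.shape[1]


def backward_elimination(X, y):
    """Start from the full model; drop one variable at a time while the AIC decreases.

    Assumption: AIC is used as the elimination criterion here; the abstract only
    names backward elimination as an example and does not fix the criterion.
    """
    selected = list(range(X.shape[1]))
    best = aic_ols(X, y, selected)
    while selected:
        # AIC of every model obtained by removing exactly one remaining variable.
        candidates = [
            (aic_ols(X, y, [j for j in selected if j != k]), k) for k in selected
        ]
        cand_aic, drop = min(candidates)
        if cand_aic >= best:
            break
        best = cand_aic
        selected.remove(drop)
    return selected


def inclusion_frequencies(X, y, scheme, n_resamples=100, fraction=0.632, seed=0):
    """Share of resamples in which each variable survives the selection procedure.

    scheme="bootstrap": n observations drawn with replacement.
    scheme="subsampling": round(fraction * n) observations drawn without replacement
    (the fraction is an assumed value, not taken from the abstract).
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_resamples):
        if scheme == "bootstrap":
            idx = rng.choice(n, size=n, replace=True)
        elif scheme == "subsampling":
            idx = rng.choice(n, size=int(round(fraction * n)), replace=False)
        else:
            raise ValueError("scheme must be 'bootstrap' or 'subsampling'")
        for j in backward_elimination(X[idx], y[idx]):
            counts[j] += 1
    return counts / n_resamples


# Simulated data (an assumption): only the first two of six variables affect y.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 6))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(size=200)

freq_boot = inclusion_frequencies(X, y, scheme="bootstrap")
freq_sub = inclusion_frequencies(X, y, scheme="subsampling")
print("bootstrap  :", np.round(freq_boot, 2))
print("subsampling:", np.round(freq_sub, 2))
```

Comparing the two printed vectors shows how often each variable is selected under each resampling scheme. Counting how often each complete set of selected variables appears, instead of each single variable, would give the model selection frequencies that the abstract also mentions.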

Keywords:  Bootstrap; Model selection; Model stability; Subsampling

Year:  2015        PMID: 26288150     DOI: 10.1111/biom.12381

Source DB:  PubMed          Journal:  Biometrics        ISSN: 0006-341X            Impact factor:   2.571


  12 in total

1.  Symptom-Disease Pair Analysis of Diagnostic Error (SPADE): a conceptual framework and methodological approach for unearthing misdiagnosis-related harms using big data.

Authors:  Ava L Liberman; David E Newman-Toker
Journal:  BMJ Qual Saf       Date:  2018-01-22       Impact factor: 7.035

2.  Analyzing Hierarchical Multi-View MRI Data With StaPLR: An Application to Alzheimer's Disease Classification.

Authors:  Wouter van Loon; Frank de Vos; Marjolein Fokkema; Botond Szabo; Marisa Koini; Reinhold Schmidt; Mark de Rooij
Journal:  Front Neurosci       Date:  2022-04-25       Impact factor: 5.152

3.  The case-crossover design via penalized regression.

Authors:  Sam Doerken; Maja Mockenhaupt; Luigi Naldi; Martin Schumacher; Peggy Sekula
Journal:  BMC Med Res Methodol       Date:  2016-08-22       Impact factor: 4.615

4.  Variable selection - A review and recommendations for the practicing statistician. (Review)

Authors:  Georg Heinze; Christine Wallisch; Daniela Dunkler
Journal:  Biom J       Date:  2018-01-02       Impact factor: 2.207

5.  Introduction to statistical simulations in health research.

Authors:  Anne-Laure Boulesteix; Rolf Hh Groenwold; Michal Abrahamowicz; Harald Binder; Matthias Briel; Roman Hornung; Tim P Morris; Jörg Rahnenführer; Willi Sauerbrei
Journal:  BMJ Open       Date:  2020-12-13       Impact factor: 2.692

6.  Application of FLIC model to predict adverse events onset in neuroendocrine tumors treated with PRRT.

Authors:  F Scalorbi; G Argiroffi; M Baccini; L Gherardini; V Fuoco; N Prinzi; S Pusceddu; E M Garanzini; G Centonze; M Kirienko; E Seregni; M Milione; M Maccauro
Journal:  Sci Rep       Date:  2021-09-30       Impact factor: 4.379

7.  Random forest versus logistic regression: a large-scale benchmark experiment.

Authors:  Raphael Couronné; Philipp Probst; Anne-Laure Boulesteix
Journal:  BMC Bioinformatics       Date:  2018-07-17       Impact factor: 3.169

8.  State of the art in selection of variables and functional forms in multivariable analysis-outstanding issues.

Authors:  Willi Sauerbrei; Aris Perperoglou; Matthias Schmid; Michal Abrahamowicz; Heiko Becher; Harald Binder; Daniela Dunkler; Frank E Harrell; Patrick Royston; Georg Heinze
Journal:  Diagn Progn Res       Date:  2020-04-02

9.  Selection of variables for multivariable models: Opportunities and limitations in quantifying model stability by resampling.

Authors:  Christine Wallisch; Daniela Dunkler; Geraldine Rauch; Riccardo de Bin; Georg Heinze
Journal:  Stat Med       Date:  2020-10-21       Impact factor: 2.373

10.  Distribution and Associated Factors of Hepatic Iron-A Population-Based Imaging Study.

Authors:  Lisa Maier; Ricarda von Krüchten; Roberto Lorbeer; Jule Filler; Johanna Nattenmüller; Barbara Thorand; Wolfgang Koenig; Wolfgang Rathmann; Fabian Bamberg; Christopher L Schlett; Annette Peters; Susanne Rospleszcz
Journal:  Metabolites       Date:  2021-12-15