On stability issues in deriving multivariable regression models.

Willi Sauerbrei, Anika Buchholz, Anne-Laure Boulesteix, Harald Binder.

Abstract

In many areas of science where empirical data are analyzed, a task is often to identify important variables with influence on an outcome. Most often this is done by using a variable selection strategy in the context of a multivariable regression model. Using a study on ozone effects in children (n = 496, 24 covariates), we will discuss aspects relevant for deriving a suitable model. With an emphasis on model stability, we will explore and illustrate differences between predictive models and explanatory models, the key role of stopping criteria, and the value of bootstrap resampling (with and without replacement). Bootstrap resampling will be used to assess variable selection stability, to derive a predictor that incorporates model uncertainty, to check for influential points, and to visualize the variable selection process. For the latter two tasks we adapt and extend recent approaches, such as stability paths, to serve our purposes. Based on earlier experiences and on results from the example, we will argue for simpler models and that predictions are usually very similar, irrespective of the selection method used. Important differences exist for the corresponding variances, and the model uncertainty concept helps to protect against serious underestimation of the variance of a predictor derived data-dependently. Results of stability investigations illustrate severe difficulties in the task of deriving a suitable explanatory model. It seems possible to identify a small number of variables with an important and probably true influence on the outcome, but too often several variables are included whose selection may be a result of chance or may depend on a small number of observations.
© 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
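
The idea of assessing variable selection stability by resampling can be illustrated with a minimal sketch. It is not the authors' implementation: the selection rule (backward elimination on |t| in ordinary least squares), the threshold `t_crit`, the subsample fraction `frac`, the simulated data, and all function names are assumptions chosen for illustration only. The sketch refits the selection procedure on each resample, drawn with or without replacement, and reports how often each covariate is selected.

```python
import numpy as np


def backward_eliminate(X, y, t_crit=1.96):
    """Backward elimination for OLS (illustrative stopping rule, an assumption).

    Repeatedly drops the covariate with the smallest |t| statistic until every
    remaining covariate has |t| >= t_crit. Returns the selected column indices.
    """
    n = X.shape[0]
    active = list(range(X.shape[1]))
    while active:
        A = np.column_stack([np.ones(n), X[:, active]])  # intercept + covariates
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        sigma2 = resid @ resid / (n - A.shape[1])         # residual variance
        se = np.sqrt(sigma2 * np.diag(np.linalg.pinv(A.T @ A)))
        t_abs = np.abs(beta[1:]) / se[1:]                 # skip the intercept
        weakest = int(np.argmin(t_abs))
        if t_abs[weakest] >= t_crit:
            break
        active.pop(weakest)
    return np.array(active, dtype=int)


def inclusion_frequencies(X, y, n_boot=200, t_crit=1.96,
                          replace=True, frac=0.632, seed=0):
    """Fraction of resamples in which each covariate is selected.

    replace=True  -> bootstrap samples of size n, drawn with replacement.
    replace=False -> subsamples of size frac * n, drawn without replacement
                     (the default fraction is an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    m = n if replace else int(round(frac * n))
    counts = np.zeros(p)
    for _ in range(n_boot):
        idx = rng.choice(n, size=m, replace=replace)
        counts[backward_eliminate(X[idx], y[idx], t_crit)] += 1
    return counts / n_boot


def simulate(n=300, p=8, seed=1):
    """Hypothetical data: two covariates with a true effect, the rest pure noise."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, p))
    y = 1.0 * X[:, 0] + 0.8 * X[:, 1] + rng.normal(size=n)
    return X, y


if __name__ == "__main__":
    X, y = simulate()
    for replace in (True, False):
        freq = inclusion_frequencies(X, y, replace=replace)
        label = "with replacement" if replace else "without replacement"
        print(label, np.round(freq, 2))
```

In such a sketch, covariates with inclusion frequencies close to 1 are selected stably, whereas covariates with intermediate frequencies are the ones whose selection may be a result of chance or may depend on a small number of observations.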

Entities:  

Keywords:  Bootstrap resampling; Influential point; Model stability; Regression model; Variable selection

Year:  2014        PMID: 25501529     DOI: 10.1002/bimj.201300222

Source DB:  PubMed          Journal:  Biom J        ISSN: 0323-3847            Impact factor:   2.207


  14 in total

1.  Identifying Prognostic SNPs in Clinical Cohorts: Complementing Univariate Analyses by Resampling and Multivariable Modeling.

Authors:  Stefanie Hieke; Axel Benner; Richard F Schlenk; Martin Schumacher; Lars Bullinger; Harald Binder
Journal:  PLoS One       Date:  2016-05-09       Impact factor: 3.240

2.  Dealing with prognostic signature instability: a strategy illustrated for cardiovascular events in patients with end-stage renal disease.

Authors:  Harald Binder; Thorsten Kurz; Sven Teschner; Clemens Kreutz; Marcel Geyer; Johannes Donauer; Annette Kraemer-Guth; Jens Timmer; Martin Schumacher; Gerd Walz
Journal:  BMC Med Genomics       Date:  2016-07-20       Impact factor: 3.063

3.  IPF-LASSO: Integrative L1-Penalized Regression with Penalty Factors for Prediction Based on Multi-Omics Data.

Authors:  Anne-Laure Boulesteix; Riccardo De Bin; Xiaoyu Jiang; Mathias Fuchs
Journal:  Comput Math Methods Med       Date:  2017-05-04       Impact factor: 2.238

Review 4.  Variable selection - A review and recommendations for the practicing statistician.

Authors:  Georg Heinze; Christine Wallisch; Daniela Dunkler
Journal:  Biom J       Date:  2018-01-02       Impact factor: 2.207

Review 5.  A review of spline function procedures in R.

Authors:  Aris Perperoglou; Willi Sauerbrei; Michal Abrahamowicz; Matthias Schmid
Journal:  BMC Med Res Methodol       Date:  2019-03-06       Impact factor: 4.615

6.  How can the occurrence of delayed elevation of thyroid stimulating hormone in preterm infants born between 35 and 36 weeks gestation be predicted?

Authors:  You Jung Heo; Young Ah Lee; Bora Lee; Yun Jeong Lee; Youn Hee Lim; Hye Rim Chung; Seung Han Shin; Choong Ho Shin; Sei Won Yang
Journal:  PLoS One       Date:  2019-08-23       Impact factor: 3.240

7.  A plea for taking all available clinical information into account when assessing the predictive value of omics data.

Authors:  Alexander Volkmann; Riccardo De Bin; Willi Sauerbrei; Anne-Laure Boulesteix
Journal:  BMC Med Res Methodol       Date:  2019-07-24       Impact factor: 4.615

8.  Integrating multiple molecular sources into a clinical risk prediction signature by extracting complementary information.

Authors:  Stefanie Hieke; Axel Benner; Richard F Schlenk; Martin Schumacher; Lars Bullinger; Harald Binder
Journal:  BMC Bioinformatics       Date:  2016-08-30       Impact factor: 3.169

9.  Spatial heterogeneity and socioeconomic determinants of opioid prescribing in England between 2015 and 2018.

Authors:  Rossano Schifanella; Dario Delle Vedove; Alberto Salomone; Paolo Bajardi; Daniela Paolotti
Journal:  BMC Med       Date:  2020-05-15       Impact factor: 8.775

10.  State of the art in selection of variables and functional forms in multivariable analysis-outstanding issues.

Authors:  Willi Sauerbrei; Aris Perperoglou; Matthias Schmid; Michal Abrahamowicz; Heiko Becher; Harald Binder; Daniela Dunkler; Frank E Harrell; Patrick Royston; Georg Heinze
Journal:  Diagn Progn Res       Date:  2020-04-02