Adaptive penalization in high-dimensional regression and classification with external covariates using variational Bayes.

Britta Velten, Wolfgang Huber.

Abstract

Penalization schemes such as Lasso or ridge regression are routinely used to regress a response of interest on a high-dimensional set of potential predictors. Despite being decisive, the question of the relative strength of penalization is often glossed over and only implicitly determined by the scale of individual predictors. At the same time, additional information on the predictors is available in many applications but left unused. Here, we propose to make use of such external covariates to adapt the penalization in a data-driven manner. We present a method that differentially penalizes feature groups defined by the covariates and adapts the relative strength of penalization to the information content of each group. Using techniques from the Bayesian toolset, our procedure combines shrinkage with feature selection and provides a scalable optimization scheme. We demonstrate in simulations that the method accurately recovers the true effect sizes and sparsity patterns per feature group. Furthermore, it leads to improved prediction performance in situations where the groups have strong differences in dynamic range. In applications to data from high-throughput biology, the method enables re-weighting the importance of feature groups from different assays. Overall, using available covariates extends the range of applications of penalized regression, improves model interpretability and can improve prediction performance.
© The Author 2019. Published by Oxford University Press.
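
The idea of differentially penalizing feature groups can be illustrated with a minimal sketch. It is not the method of the paper: the group penalties here are fixed by hand, whereas the abstract describes learning them from the data with variational Bayes. The function name `group_ridge`, the two-group layout and the simulated data are assumptions made purely for illustration.

```python
import numpy as np


def group_ridge(X, y, groups, penalties):
    """Ridge regression with one penalty per feature group (closed form).

    Minimizes ||y - X b||^2 + sum_j penalties[groups[j]] * b_j^2.
    Illustrative only: penalties are supplied by the caller, not learned.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Expand the per-group penalties to one penalty per feature.
    lam = np.asarray(penalties, dtype=float)[np.asarray(groups)]
    # Normal equations with a diagonal, group-specific penalty matrix.
    return np.linalg.solve(X.T @ X + np.diag(lam), X.T @ y)


# Simulated example (hypothetical): two feature groups, e.g. two assays.
rng = np.random.default_rng(0)
n, p = 50, 20
groups = np.repeat([0, 1], p // 2)
X = rng.normal(size=(n, p))
true_beta = np.where(groups == 0, 1.0, 0.0)  # only group 0 is informative
y = X @ true_beta + rng.normal(scale=0.5, size=n)

# Same penalty for both groups versus a much stronger penalty on group 1.
uniform = group_ridge(X, y, groups, penalties=[1.0, 1.0])
adapted = group_ridge(X, y, groups, penalties=[1.0, 1e6])
```

With a very large penalty on the uninformative group, its coefficients are shrunk essentially to zero while the informative group is penalized only mildly. A data-driven scheme would choose these relative strengths automatically instead of taking them as input.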

Keywords:  Classification; External covariates; Feature selection; Penalized regression; Variational Bayes

Year: 2021
PMID: 31596468
PMCID: PMC8036004
DOI: 10.1093/biostatistics/kxz034

Source DB: PubMed
Journal: Biostatistics
ISSN: 1465-4644
Impact factor: 5.899


