Bayesian Approximate Kernel Regression with Variable Selection.

Lorin Crawford, Kris C Wood, Xiang Zhou, Sayan Mukherjee.

Abstract

Nonlinear kernel regression models are often used in statistics and machine learning because they are more accurate than linear models. Variable selection for kernel regression models is a challenge partly because, unlike the linear regression setting, there is no clear concept of an effect size for regression coefficients. In this paper, we propose a novel framework that provides an effect size analog for each explanatory variable in Bayesian kernel regression models when the kernel is shift-invariant - for example, the Gaussian kernel. We use function analytic properties of shift-invariant reproducing kernel Hilbert spaces (RKHS) to define a linear vector space that: (i) captures nonlinear structure, and (ii) can be projected onto the original explanatory variables. This projection onto the original explanatory variables serves as an analog of effect sizes. The specific function analytic property we use is that shift-invariant kernel functions can be approximated via random Fourier bases. Based on the random Fourier expansion, we propose a computationally efficient class of Bayesian approximate kernel regression (BAKR) models for both nonlinear regression and binary classification for which one can compute an analog of effect sizes. We illustrate the utility of BAKR by examining two important problems in statistical genetics: genomic selection (i.e. phenotypic prediction) and association mapping (i.e. inference of significant variants or loci). State-of-the-art methods for genomic selection and association mapping are based on kernel regression and linear models, respectively. BAKR is the first method that is competitive in both settings.
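
The two ideas the abstract relies on, approximating a shift-invariant kernel with random Fourier bases and projecting a fitted nonlinear function back onto the original explanatory variables, can be illustrated with a minimal NumPy sketch. This is not the authors' BAKR implementation: the simulated data, the bandwidth, the number of random features, and the ridge penalty `lam` are all assumptions chosen for illustration, and a plain kernel ridge fit stands in for the Bayesian model. The function names are hypothetical.

```python
import numpy as np

def random_fourier_features(X, num_features, bandwidth, rng):
    """Map X (n x p) to an (n x num_features) matrix Psi such that
    Psi @ Psi.T approximates the Gaussian kernel matrix of X."""
    p = X.shape[1]
    # The Gaussian kernel exp(-||x - y||^2 / (2 h^2)) has spectral
    # density N(0, I / h^2), so frequencies are drawn with scale 1 / h.
    W = rng.normal(scale=1.0 / bandwidth, size=(p, num_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=num_features)
    return np.sqrt(2.0 / num_features) * np.cos(X @ W + b)

def gaussian_kernel(X, bandwidth):
    """Exact Gaussian kernel matrix, used here only as a reference."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-d2 / (2.0 * bandwidth ** 2))

rng = np.random.default_rng(0)
n, p, bandwidth = 50, 5, 2.0           # assumed toy dimensions
X = rng.normal(size=(n, p))

# Step 1: approximate the kernel with random Fourier bases.
Psi = random_fourier_features(X, 2000, bandwidth, rng)
K_approx = Psi @ Psi.T
K_exact = gaussian_kernel(X, bandwidth)
max_err = np.max(np.abs(K_exact - K_approx))

# Step 2: fit a nonlinear function. Ridge regression in the random
# feature space (written in its equivalent n x n dual form) is a
# stand-in for the Bayesian model described in the abstract.
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=n)  # only variable 0 matters
lam = 1.0                                        # assumed ridge penalty
alpha = np.linalg.solve(K_approx + lam * np.eye(n), y)
f_hat = K_approx @ alpha                         # fitted function values

# Step 3: project the fitted values onto the original variables.
# The resulting length-p vector plays the role of an effect size analog.
beta_tilde = np.linalg.pinv(X) @ f_hat

print("max kernel approximation error:", max_err)
print("effect size analog:", beta_tilde)
```

In this toy setup the response depends only on the first variable, so its entry in `beta_tilde` should dominate the others. Increasing the number of random features tightens the kernel approximation at the cost of a wider feature matrix.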

Year:  2018        PMID: 30799887      PMCID: PMC6383716          DOI: 10.1080/01621459.2017.1361830

Source DB:  PubMed          Journal:  J Am Stat Assoc        ISSN: 0162-1459            Impact factor:   5.033


  6 in total

1.  Multi-scale inference of genetic trait architecture using biologically annotated neural networks.

Authors:  Pinar Demetci; Wei Cheng; Gregory Darnell; Xiang Zhou; Sohini Ramachandran; Lorin Crawford
Journal:  PLoS Genet       Date:  2021-08-19       Impact factor: 5.917

2.  A topological data analytic approach for discovering biophysical signatures in protein dynamics.

Authors:  Wai Shing Tang; Gabriel Monteiro da Silva; Henry Kirveslahti; Erin Skeens; Bibo Feng; Timothy Sudijono; Kevin K Yang; Sayan Mukherjee; Brenda Rubenstein; Lorin Crawford
Journal:  PLoS Comput Biol       Date:  2022-05-02       Impact factor: 4.779

3.  VARIABLE PRIORITIZATION IN NONLINEAR BLACK BOX METHODS: A GENETIC ASSOCIATION CASE STUDY.

Authors:  Lorin Crawford; Seth R Flaxman; Daniel E Runcie; Mike West
Journal:  Ann Appl Stat       Date:  2019-06-17       Impact factor: 2.083

4.  Bayesian multitrait kernel methods improve multienvironment genome-based prediction.

Authors:  Osval Antonio Montesinos-López; José Cricelio Montesinos-López; Abelardo Montesinos-López; Juan Manuel Ramírez-Alcaraz; Jesse Poland; Ravi Singh; Susanne Dreisigacker; Leonardo Crespo; Sushismita Mondal; Velu Govidan; Philomin Juliana; Julio Huerta Espino; Sandesh Shrestha; Rajeev K Varshney; José Crossa
Journal:  G3 (Bethesda)       Date:  2022-02-04       Impact factor: 3.542

5.  Systematic mapping of BCL-2 gene dependencies in cancer reveals molecular determinants of BH3 mimetic sensitivity.

Authors:  Ryan S Soderquist; Lorin Crawford; Esther Liu; Min Lu; Anika Agarwal; Gray R Anderson; Kevin H Lin; Peter S Winter; Merve Cakir; Kris C Wood
Journal:  Nat Commun       Date:  2018-08-29       Impact factor: 14.919

6.  Label propagation defines signaling networks associated with recurrently mutated cancer genes.

Authors:  Merve Cakir; Sayan Mukherjee; Kris C Wood
Journal:  Sci Rep       Date:  2019-06-28       Impact factor: 4.379
