Efficient Gaussian process regression for large datasets.

Anjishnu Banerjee, David B Dunson, Surya T Tokdar.

Abstract

Gaussian processes are widely used in nonparametric regression, classification and spatiotemporal modelling, facilitated in part by a rich literature on their theoretical properties. However, one of their practical limitations is expensive computation, typically on the order of n^3 where n is the number of data points, in performing the necessary matrix inversions. For large datasets, storage and processing also lead to computational bottlenecks, and numerical stability of the estimates and predicted values degrades with increasing n. Various methods have been proposed to address these problems, including predictive processes in spatial data analysis and the subset-of-regressors technique in machine learning. The idea underlying these approaches is to use a subset of the data, but this raises questions concerning sensitivity to the choice of subset and limitations in estimating fine-scale structure in regions that are not well covered by the subset. Motivated by the literature on compressive sensing, we propose an alternative approach that involves linear projection of all the data points onto a lower-dimensional subspace. We demonstrate the superiority of this approach from a theoretical perspective and through simulated and real data examples.
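
The projection idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no algorithmic details, so the code assumes one common low-rank construction in which the n x n covariance matrix K is replaced by K Phi^T (Phi K Phi^T)^{-1} Phi K, with Phi an m x n Gaussian random matrix. The function names, the squared-exponential kernel, the length scale, the noise variance and the jitter term are all illustrative assumptions. If Phi instead selected m rows of the identity, the same formula would reduce to a subset-of-regressors style predictor.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between 1-D input arrays a and b (assumed kernel)."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)


def projected_gp_mean(x, y, x_new, m, noise_var=0.01, seed=0):
    """Approximate GP posterior mean at x_new using an m-dimensional random projection.

    Only m x m linear systems are solved, so the cost is O(n^2 m) for the
    matrix products instead of O(n^3) for a dense n x n solve.
    """
    rng = np.random.default_rng(seed)
    n = x.shape[0]

    # Random projection matrix Phi (m x n); every data point contributes.
    phi = rng.standard_normal((m, n)) / np.sqrt(m)

    K = rbf_kernel(x, x)          # n x n prior covariance
    B = K @ phi.T                 # n x m, plays the role of K_nm
    C = phi @ B                   # m x m, equals Phi K Phi^T

    # Woodbury form of the low-rank posterior mean:
    #   mean = K_new Phi^T (noise_var * C + B^T B)^{-1} B^T y
    A = noise_var * C + B.T @ B
    # Small diagonal jitter (an assumption of this sketch) keeps the solve stable.
    A += 1e-8 * (np.trace(A) / m) * np.eye(m)

    w = np.linalg.solve(A, B.T @ y)
    return rbf_kernel(x_new, x) @ phi.T @ w


if __name__ == "__main__":
    # Synthetic data, purely for illustration.
    x = np.linspace(0.0, 1.0, 200)
    y = np.sin(2 * np.pi * x) + 0.1 * np.random.default_rng(1).standard_normal(200)
    x_new = np.linspace(0.1, 0.9, 5)
    print(projected_gp_mean(x, y, x_new, m=30))
```

In this sketch the full kernel matrix is still formed once, so memory remains O(n^2); only the cubic-cost solve is avoided. The choice of m trades accuracy against cost and would need tuning for real data.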

Keywords:  Bayesian regression; Compressive sensing; Dimensionality reduction; Gaussian process; Random projection

Year:  2013        PMID: 23869109      PMCID: PMC3712798          DOI: 10.1093/biomet/ass068

Source DB:  PubMed          Journal:  Biometrika        ISSN: 0006-3444            Impact factor:   2.445


  4 in total

1.  Sparse on-line gaussian processes.

Authors:  Lehel Csató; Manfred Opper
Journal:  Neural Comput       Date:  2002-03       Impact factor: 2.026

2.  Adaptive Gaussian Predictive Process Models for Large Spatial Datasets.

Authors:  Rajarshi Guhaniyogi; Andrew O Finley; Sudipto Banerjee; Alan E Gelfand
Journal:  Environmetrics       Date:  2011-12       Impact factor: 1.900

3.  Improving the performance of predictive process modeling for large datasets.

Authors:  Andrew O Finley; Huiyan Sang; Sudipto Banerjee; Alan E Gelfand
Journal:  Comput Stat Data Anal       Date:  2009-06-15       Impact factor: 1.681

4.  Gaussian predictive process models for large spatial data sets.

Authors:  Sudipto Banerjee; Alan E Gelfand; Andrew O Finley; Huiyan Sang
Journal:  J R Stat Soc Series B Stat Methodol       Date:  2008-09-01       Impact factor: 4.488

  8 in total

1.  Joint hierarchical Gaussian process model with application to personalized prediction in medical monitoring.

Authors:  Leo L Duan; Xia Wang; John P Clancy; Rhonda D Szczesniak
Journal:  Stat (Int Stat Inst)       Date:  2018-03-04

2.  Efficient Bayesian hierarchical functional data analysis with basis function approximations using Gaussian-Wishart processes.

Authors:  Jingjing Yang; Dennis D Cox; Jong Soo Lee; Peng Ren; Taeryon Choi
Journal:  Biometrics       Date:  2017-04-10       Impact factor: 2.571

3.  Statistical properties of sketching algorithms.

Authors:  D C Ahfock; W J Astle; S Richardson
Journal:  Biometrika       Date:  2020-07-30       Impact factor: 2.445

4.  A Semiparametric Bayesian Approach to Dropout in Longitudinal Studies with Auxiliary Covariates.

Authors:  Tianjian Zhou; Michael J Daniels; Peter Müller
Journal:  J Comput Graph Stat       Date:  2019-07-02       Impact factor: 2.302

5.  Machine Learning Models for Predicting the Occurrence of Respiratory Diseases Using Climatic and Air-Pollution Factors.

Authors:  Yunseo Ku; Soon Bin Kwon; Jeong-Hwa Yoon; Seog-Kyun Mun; Munyoung Chang
Journal:  Clin Exp Otorhinolaryngol       Date:  2022-01-07       Impact factor: 3.340

6.  Framework for enhancing the estimation of model parameters for data with a high level of uncertainty.

Authors:  Gustavo B Libotte; Lucas Dos Anjos; Regina C C Almeida; Sandra M C Malta; Renato S Silva
Journal:  Nonlinear Dyn       Date:  2022-01-07       Impact factor: 5.741

7.  Fast methods for training Gaussian processes on large datasets.

Authors:  C J Moore; A J K Chua; C P L Berry; J R Gair
Journal:  R Soc Open Sci       Date:  2016-05-11       Impact factor: 2.963

8.  Prediction of Left Ventricular Mechanics Using Machine Learning.

Authors:  Yaghoub Dabiri; Alex Van der Velden; Kevin L Sack; Jenny S Choy; Ghassan S Kassab; Julius M Guccione
Journal:  Front Phys       Date:  2019-09-06
