
A Bootstrap Metropolis-Hastings Algorithm for Bayesian Analysis of Big Data.

Faming Liang, Jinsu Kim, Qifan Song.

Abstract

Markov chain Monte Carlo (MCMC) methods have proven to be a very powerful tool for analyzing data with complex structures. However, their computationally intensive nature, which typically requires a large number of iterations and a complete scan of the full dataset at each iteration, precludes their use for big data analysis. In this paper, we propose the bootstrap Metropolis-Hastings (BMH) algorithm, which provides a general framework for adapting powerful MCMC methods to big data analysis: the full-data log-likelihood is replaced by a Monte Carlo average of log-likelihoods that are calculated in parallel from multiple bootstrap samples. The BMH algorithm has an embarrassingly parallel structure and avoids repeated scans of the full dataset across iterations, and is thus feasible for big data problems. Compared to the popular divide-and-combine method, BMH can generally be more efficient because it can asymptotically integrate the information in the whole dataset into a single simulation run. The BMH algorithm is also very flexible. Like the Metropolis-Hastings algorithm, it can serve as a basic building block for developing advanced MCMC algorithms that are feasible for big data problems. This is illustrated in the paper by the tempering BMH algorithm, which can be viewed as a combination of parallel tempering and the BMH algorithm. BMH can also be used for model selection and optimization by combining it with reversible jump MCMC and simulated annealing, respectively.
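
The central idea of the abstract, replacing the full-data log-likelihood with an average of log-likelihoods computed on bootstrap samples, can be illustrated with a small Python sketch. This is not the authors' implementation. The model (a normal mean with unit variance), the flat prior, the random-walk proposal, the choice to draw the bootstrap samples once up front, and all function and parameter names (`bmh_sketch`, `k`, `m`, `step`) are assumptions made for illustration. The sketch also omits any rescaling for the bootstrap sample size, which the abstract does not describe.

```python
import math
import random


def gaussian_loglik(theta, sample):
    """Log-likelihood of N(theta, 1) for one sample (up to a constant)."""
    return -0.5 * sum((x - theta) ** 2 for x in sample)


def draw_bootstrap_samples(data, k, m, rng):
    """Draw k bootstrap samples of size m, with replacement, from data."""
    return [[rng.choice(data) for _ in range(m)] for _ in range(k)]


def averaged_loglik(theta, samples):
    """Monte Carlo average of the per-sample log-likelihoods.

    The terms are independent of one another, so they could be computed
    in parallel, one worker per bootstrap sample.
    """
    return sum(gaussian_loglik(theta, s) for s in samples) / len(samples)


def bmh_sketch(data, k=20, m=50, n_iter=2000, step=0.3, seed=0):
    """Random-walk Metropolis-Hastings on the averaged log-likelihood.

    Assumptions of this sketch: flat prior, bootstrap samples drawn once
    before the chain starts, no rescaling for the subsample size m.
    """
    rng = random.Random(seed)
    samples = draw_bootstrap_samples(data, k, m, rng)
    # Start from the mean of the first bootstrap sample (no full-data scan).
    theta = sum(samples[0]) / len(samples[0])
    current = averaged_loglik(theta, samples)
    chain = []
    for _ in range(n_iter):
        proposal = theta + rng.gauss(0.0, step)
        candidate = averaged_loglik(proposal, samples)
        # Symmetric proposal: accept with probability min(1, exp(diff)).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, current = proposal, candidate
        chain.append(theta)
    return chain
```

After the bootstrap samples are drawn, each iteration touches only `k * m` observations rather than the full dataset, which is the property the abstract highlights. Calling `bmh_sketch(data)` on a list of floats returns the sampled values of `theta`, one per iteration.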


Keywords:  Big Data; Bootstrap; Markov Chain Monte Carlo; Metropolis-Hastings; Parallel Computing

Year:  2016        PMID: 29033469      PMCID: PMC5637557          DOI: 10.1080/00401706.2016.1142905

Source DB:  PubMed          Journal:  Technometrics        ISSN: 0040-1706



