Abstract
This paper proposes a simple, practical, and efficient MCMC algorithm for Bayesian analysis of big data. The algorithm divides the big dataset into smaller subsets and provides a simple method for aggregating the subset posteriors to approximate the full-data posterior. To further speed up computation, it employs the population stochastic approximation Monte Carlo (Pop-SAMC) algorithm, a parallel MCMC algorithm, to simulate from each subset posterior. Because the algorithm involves two levels of parallelism, data parallelism and simulation parallelism, it is named "Double Parallel Monte Carlo". The validity of the proposed algorithm is justified mathematically and numerically.
Keywords: Divide-and-Combine; Embarrassingly Parallel; MCMC; Pop-SAMC; Subset Posterior Aggregation
Year: 2017 PMID: 31011242 PMCID: PMC6474686 DOI: 10.1007/s11222-017-9791-1
Source DB: PubMed Journal: Stat Comput ISSN: 0960-3174 Impact factor: 2.559
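
The divide-and-combine idea in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not taken from the abstract: the toy model (normal data with known standard deviation and a flat prior on the mean), the subset posterior defined with the likelihood raised to the power k, exact sampling standing in for Pop-SAMC, the recentering rule used to aggregate, and the helper names `split_data`, `sample_subset_posterior`, and `aggregate_by_recentering`. The abstract does not spell out the aggregation rule, so this should not be read as the authors' implementation.

```python
import numpy as np


def split_data(data, k, rng):
    """Shuffle the data and split it into k (near-)equal subsets."""
    return np.array_split(rng.permutation(data), k)


def sample_subset_posterior(subset, k, sigma, n_draws, rng):
    """Draw from a subset posterior for a normal mean (toy assumption).

    With a flat prior and the subset likelihood raised to the power k,
    the subset posterior is N(mean(subset), sigma^2 / (k * len(subset))).
    Exact sampling is used here in place of an MCMC sampler such as Pop-SAMC.
    """
    sd = sigma / np.sqrt(k * len(subset))
    return rng.normal(subset.mean(), sd, size=n_draws)


def aggregate_by_recentering(subset_draws):
    """Shift each subset's draws to a common center, then pool them.

    This recentering rule is an assumed aggregation scheme for illustration.
    """
    draws = np.asarray(subset_draws)                  # shape (k, n_draws)
    subset_means = draws.mean(axis=1, keepdims=True)  # one center per subset
    overall_mean = subset_means.mean()                # average of the centers
    return (draws - subset_means + overall_mean).ravel()


rng = np.random.default_rng(0)
sigma, n, k = 1.0, 10_000, 10
data = rng.normal(2.0, sigma, size=n)

subsets = split_data(data, k, rng)
# Each subset could be handled on a separate worker; a plain loop is used here.
subset_draws = [sample_subset_posterior(s, k, sigma, 2_000, rng) for s in subsets]
combined = aggregate_by_recentering(subset_draws)

# Compare with the exact full-data posterior N(mean(data), sigma^2 / n).
print("combined mean:", combined.mean(), "full-data mean:", data.mean())
print("combined sd:  ", combined.std(), "full-data sd:  ", sigma / np.sqrt(n))
```

In this toy setting the pooled draws should closely match the exact full-data posterior in both center and spread. The loop over subsets is where the data-level parallelism would go; the second level described in the abstract would sit inside the per-subset sampler.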