Consensus Monte Carlo for Random Subsets using Shared Anchors.

Yang Ni, Yuan Ji, Peter Müller.

Abstract

We present a consensus Monte Carlo algorithm that scales existing Bayesian nonparametric models for clustering and feature allocation to big data. The algorithm is valid for any prior on random subsets such as partitions and latent feature allocation, under essentially any sampling model. Motivated by three case studies, we focus on clustering induced by a Dirichlet process mixture sampling model, inference under an Indian buffet process prior with a binomial sampling model, and with a categorical sampling model. We assess the proposed algorithm with simulation studies and show results for inference with three datasets: an MNIST image dataset, a dataset of pancreatic cancer mutations, and a large set of electronic health records (EHR). Supplementary materials for this article are available online.

Keywords:  Big data; electronic health records; image cluster; parallel computing; tumor heterogeneity

Year:  2020        PMID: 33456293      PMCID: PMC7810350          DOI: 10.1080/10618600.2020.1737085

Source DB:  PubMed          Journal:  J Comput Graph Stat        ISSN: 1061-8600            Impact factor:   2.302


  6 in total

1.  Are Gibbs-Type Priors the Most Natural Generalization of the Dirichlet Process?

Authors:  Pierpaolo De Blasi; Stefano Favaro; Antonio Lijoi; Ramsés H Mena; Igor Prünster; Matteo Ruggiero
Journal:  IEEE Trans Pattern Anal Mach Intell       Date:  2015-02       Impact factor: 6.226

2.  MAD Bayes for Tumor Heterogeneity - Feature Allocation with Exponential Family Sampling.

Authors:  Yanxun Xu; Peter Müller; Yuan Yuan; Kamalakar Gulukota; Yuan Ji
Journal:  J Am Stat Assoc       Date:  2015-03-01       Impact factor: 5.033

3.  Sparse covariance estimation in heterogeneous samples.

Authors:  Abel Rodríguez; Alex Lenkoski; Adrian Dobra
Journal:  Electron J Stat       Date:  2011-09-15       Impact factor: 1.125

4.  Scalable Bayesian Nonparametric Clustering and Classification.

Authors:  Yang Ni; Peter Müller; Maurice Diesendruck; Sinead Williamson; Yitan Zhu; Yuan Ji
Journal:  J Comput Graph Stat       Date:  2019-07-19       Impact factor: 2.302

5.  Fast Bayesian Inference in Dirichlet Process Mixture Models.

Authors:  Lianming Wang; David B Dunson
Journal:  J Comput Graph Stat       Date:  2011-01-01       Impact factor: 2.302

6.  Piecewise Approximate Bayesian Computation: fast inference for discretely observed Markov models using a factorised posterior distribution.

Authors:  S R White; T Kypraios; S P Preston
Journal:  Stat Comput       Date:  2013-11-29       Impact factor: 2.559

  2 in total

1.  Bayesian biclustering for microbial metagenomic sequencing data via multinomial matrix factorization.

Authors:  Fangting Zhou; Kejun He; Qiwei Li; Robert S Chapkin; Yang Ni
Journal:  Biostatistics       Date:  2022-07-18       Impact factor: 5.279

2.  Consensus clustering for Bayesian mixture models.

Authors:  Stephen Coleman; Paul D W Kirk; Chris Wallace
Journal:  BMC Bioinformatics       Date:  2022-07-21       Impact factor: 3.307
