Yanxun Xu, Peter Müller, Yuan Yuan, Kamalakar Gulukota, Yuan Ji.
Abstract
We propose small-variance asymptotic approximations for inference on tumor heterogeneity (TH) using next-generation sequencing data. Understanding TH is an important and open research problem in biology. The lack of appropriate statistical inference is a critical gap in existing methods that the proposed approach aims to fill. We build on a hierarchical model with an exponential family likelihood and a feature allocation prior. The proposed implementation of posterior inference generalizes similar small-variance approximations proposed by Kulis and Jordan (2012) and Broderick et al. (2012b) for inference with Dirichlet process mixture and Indian buffet process prior models under normal sampling. We show that the new algorithm can successfully recover latent structures of different haplotypes and subclones and is orders of magnitude faster than available Markov chain Monte Carlo samplers. The latter are practically infeasible for high-dimensional genomics data. The proposed approach is scalable, easy to implement, and benefits from the flexibility of Bayesian nonparametric models. More importantly, it provides a useful tool for applied scientists to estimate cell subtypes in tumor samples. R code is available at http://www.ma.utexas.edu/users/yxu/.
Keywords: Bayesian nonparametric; Bregman divergence; Feature allocation; Indian buffet process; Next-generation sequencing; Tumor heterogeneity
Year: 2015; PMID: 26170513; PMCID: PMC4498588; DOI: 10.1080/01621459.2014.995794
Source DB: PubMed; Journal: J Am Stat Assoc; ISSN: 0162-1459; Impact factor: 5.033