Tianzhou Ma, Faming Liang, George Tseng.
Abstract
Meta-analysis combining multiple transcriptomic studies increases statistical power and accuracy in detecting differentially expressed genes. As next-generation sequencing experiments become mature and affordable, an increasing number of RNA-seq datasets are available in the public domain. The count-data-based technology provides better experimental accuracy, reproducibility and ability to detect low-expressed genes. A naive approach to combining multiple RNA-seq studies is to apply differential analysis tools such as edgeR and DESeq to each study and then combine the summary statistics of p-values or effect sizes by conventional meta-analysis methods. Such a two-stage approach loses statistical power, especially for genes with short length or low expression abundance. In this paper, we propose a full Bayesian hierarchical model (namely, BayesMetaSeq) for RNA-seq meta-analysis by modelling count data, integrating information across genes and across studies, and modelling potentially heterogeneous differential signals across studies via latent variables. A Dirichlet process mixture (DPM) prior is further applied on the latent variables to provide categorization of detected biomarkers according to their differential expression patterns across studies, facilitating improved interpretation and biological hypothesis generation. Simulations and a real application on multi-brain-region HIV-1 transgenic rats demonstrate improved sensitivity, accuracy and biological findings of the proposed method.
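To illustrate the naive two-stage approach that the abstract contrasts with BayesMetaSeq, the sketch below combines per-study p-values for a single gene using Fisher's method, one conventional meta-analysis technique. This is an assumption for illustration only: the abstract does not say which combination rule is meant, the function name `fisher_combine` is hypothetical, and the p-values are made-up numbers rather than results from the paper. It is not the BayesMetaSeq model.

```python
import math


def fisher_combine(p_values):
    """Combine independent p-values for one gene with Fisher's method.

    Illustrative only: stands in for the second stage of a two-stage
    meta-analysis. The first stage (per-study differential analysis,
    e.g. with edgeR or DESeq) is assumed to have produced the p-values.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(p_values)
    # Fisher statistic X = -2 * sum(log p) follows a chi-square
    # distribution with 2k degrees of freedom under the null.
    half_stat = -sum(math.log(p) for p in p_values)  # X / 2

    # For even degrees of freedom the chi-square survival function has a
    # closed form: exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


# Hypothetical per-study p-values for one gene across three studies.
study_p_values = [0.04, 0.20, 0.03]
print(fisher_combine(study_p_values))
```

Because this second stage sees only one summary number per study, it cannot recover information lost when counts are low, which is consistent with the power loss the abstract attributes to two-stage methods.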
Keywords: Bayesian hierarchical model; RNA sequencing (RNA-seq); differential expression (DE); meta-analysis; model-based clustering
Year: 2016 PMID: 28785119 PMCID: PMC5543999 DOI: 10.1111/rssc.12199
Source DB: PubMed Journal: J R Stat Soc Ser C Appl Stat ISSN: 0035-9254 Impact factor: 1.864