Bayesian shrinkage estimation of the relative abundance of mRNA transcripts using SAGE.

Jeffrey S Morris, Keith A Baggerly, Kevin R Coombes.

Abstract

Serial analysis of gene expression (SAGE) is a technology for quantifying gene expression in biological tissue that yields count data that can be modeled by a multinomial distribution with two characteristics: skewness in the relative frequencies and small sample size relative to the dimension. As a result of these characteristics, a given SAGE sample may fail to capture a large number of expressed mRNA species present in the tissue. Empirical estimators of mRNA species' relative abundance effectively ignore these missing species, and as a result tend to overestimate the abundance of the scarce observed species comprising a vast majority of the total. We have developed a new Bayesian estimation procedure that quantifies our prior information about these characteristics, yielding a nonlinear shrinkage estimator with efficiency advantages over the MLE. Our prior is a mixture of Dirichlets, whereby species are stochastically partitioned into abundant and scarce classes, each with its own multivariate prior. Simulation studies reveal our estimator has lower integrated mean squared error (IMSE) than the MLE for the SAGE scenarios simulated, and yields relative abundance profiles closer in Euclidean distance to the truth for all samples simulated. We apply our method to a SAGE library of normal colon tissue, and discuss its implications for assessing differential expression.
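
The sampling problem described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' estimator: the toy transcriptome (5 abundant and 5000 scarce species), the library size of 2000 tags, the function names, and the value alpha = 0.1 are all assumptions chosen for illustration. It draws one multinomial sample, computes the empirical estimate (the MLE), and, for contrast, the posterior mean under a single symmetric Dirichlet prior, which gives only linear shrinkage, whereas the paper uses a mixture of Dirichlets.

```python
import random


def simulate_counts(probs, n_tags, rng):
    """Draw one multinomial sample of n_tags tags; return per-species counts."""
    counts = [0] * len(probs)
    for idx in rng.choices(range(len(probs)), weights=probs, k=n_tags):
        counts[idx] += 1
    return counts


def empirical_estimate(counts):
    """MLE of relative abundance: observed count divided by library size."""
    total = sum(counts)
    return [c / total for c in counts]


def symmetric_dirichlet_posterior_mean(counts, alpha):
    """Posterior mean under one symmetric Dirichlet(alpha) prior.

    This is plain linear shrinkage, shown only for contrast; it is NOT the
    mixture-of-Dirichlets prior described in the abstract.
    """
    total = sum(counts)
    k = len(counts)
    return [(c + alpha) / (total + k * alpha) for c in counts]


# Hypothetical toy transcriptome: a few abundant species and many scarce ones.
N_ABUNDANT, N_SCARCE = 5, 5000
true_probs = [0.1] * N_ABUNDANT + [0.0001] * N_SCARCE
N_TAGS = 2000  # library size, small relative to the number of species

rng = random.Random(0)
counts = simulate_counts(true_probs, N_TAGS, rng)
mle = empirical_estimate(counts)
shrunk = symmetric_dirichlet_posterior_mean(counts, alpha=0.1)

scarce = range(N_ABUNDANT, N_ABUNDANT + N_SCARCE)
observed_scarce = [i for i in scarce if counts[i] > 0]
missed_scarce = [i for i in scarce if counts[i] == 0]

print("scarce species observed:", len(observed_scarce), "of", N_SCARCE)
print("smallest nonzero empirical estimate:", min(mle[i] for i in observed_scarce))
print("true scarce abundance:", true_probs[-1])
print("shrunk estimate for an unobserved species:", shrunk[missed_scarce[0]])
```

In this setup any scarce species that is observed at all receives an empirical estimate of at least 1/2000, five times its true abundance of 0.0001, while every unobserved species is estimated at exactly zero. The single symmetric Dirichlet gives unobserved species positive mass but pulls all species toward the same value, including the abundant ones, which is one motivation for a prior that treats abundant and scarce classes separately.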

Year:  2003        PMID: 14601748     DOI: 10.1111/1541-0420.00057

Source DB:  PubMed          Journal:  Biometrics        ISSN: 0006-341X            Impact factor:   2.571


  9 in total

1.  Estimating species richness by a Poisson-compound gamma model.

Authors:  Ji-Ping Wang
Journal:  Biometrika       Date:  2010-06-22       Impact factor: 2.445

2.  Bayesian hierarchical modeling and selection of differentially expressed genes for the EST data.

Authors:  Fang Yu; Ming-Hui Chen; Lynn Kuo; Peng Huang; Wanling Yang
Journal:  Biometrics       Date:  2011-03       Impact factor: 2.571

3.  Bias correction and Bayesian analysis of aggregate counts in SAGE libraries.

Authors:  Russell L Zaretzki; Michael A Gilchrist; William M Briggs; Artin Armagan
Journal:  BMC Bioinformatics       Date:  2010-02-03       Impact factor: 3.169

4.  Bayesian Nonparametric Inference - Why and How.

Authors:  Peter Müller; Riten Mitra
Journal:  Bayesian Anal       Date:  2013       Impact factor: 3.728

5.  A Bayesian Semi-parametric Approach for the Differential Analysis of Sequence Counts Data.

Authors:  Michele Guindani; Nuno Sepúlveda; Carlos Daniel Paulino; Peter Müller
Journal:  J R Stat Soc Ser C Appl Stat       Date:  2014-04       Impact factor: 1.864

6.  Modeling SAGE tag formation and its effects on data interpretation within a Bayesian framework.

Authors:  Michael A Gilchrist; Hong Qin; Russell Zaretzki
Journal:  BMC Bioinformatics       Date:  2007-10-18       Impact factor: 3.169

7.  Modeling Sage data with a truncated gamma-Poisson model.

Authors:  Helene H Thygesen; Aeilko H Zwinderman
Journal:  BMC Bioinformatics       Date:  2006-03-20       Impact factor: 3.169

8.  Modeling transcriptome based on transcript-sampling data.

Authors:  Jiang Zhu; Fuhong He; Jing Wang; Jun Yu
Journal:  PLoS One       Date:  2008-02-20       Impact factor: 3.240

9.  Bayesian model accounting for within-class biological variability in Serial Analysis of Gene Expression (SAGE).

Authors:  Ricardo Z N Vêncio; Helena Brentani; Diogo F C Patrão; Carlos A B Pereira
Journal:  BMC Bioinformatics       Date:  2004-08-31       Impact factor: 3.169
