Can transcriptome size be estimated from SAGE catalogs?

Michael D Stern, Sergey V Anisimov, Kenneth R Boheler.

Abstract

MOTIVATION: SAGE (Serial Analysis of Gene Expression) can be used to estimate the number of unique transcripts in a transcriptome. A simple estimator that corrects for sequencing and sampling errors was applied to a SAGE library (137 832 tags) obtained from mouse embryonic stem cells, and also to Monte Carlo simulated libraries generated using assumed distributions of 'true' expression levels consistent with the data.
RESULTS: When the corrected data themselves were taken as the underlying model of 'ground truth', the estimator converged to the 'true' value (53 535) only after counting 300 000 simulated tags, more than twice the number in the experiment. The SAGE data could also be well fit by a Monte Carlo model based on a truncated inverse-square distribution of expression levels, with 130 000 'true' transcripts and 10^6 samples needed for convergence. We conclude that the size of a transcriptome is ill-determined from SAGE libraries of even moderately large size. In order to obtain a valid estimate, one must sample a number of tags inversely proportional to the lowest abundance level, which is not known a priori. This constrains the design of SAGE experiments intended to determine biological complexity.
AVAILABILITY: The 'homemade' software used for this analysis was not designed for general or 'production' use, but the authors will be happy to share Fortran source code with interested parties.
CONTACT: sternm@grc.nia.nih.gov
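
The convergence behaviour described in RESULTS can be illustrated with a minimal Monte Carlo sketch. This is not the authors' Fortran code or their estimator: the function name `simulate_unique_counts`, its parameters, and the toy sizes are assumptions made for illustration, and the sketch ignores sequencing errors entirely. It draws per-transcript abundances from a truncated inverse-square distribution, samples tags in proportion to abundance, and reports how many distinct transcripts have been observed at each sampling depth.

```python
import random


def simulate_unique_counts(n_transcripts, max_copies, depths, seed=0):
    """Count distinct transcripts observed at each sampling depth.

    Hypothetical helper for illustration only. Abundance levels k in
    1..max_copies are drawn with probability proportional to 1/k^2
    (a truncated inverse-square distribution); tags are then sampled
    with probability proportional to each transcript's abundance.
    """
    rng = random.Random(seed)

    # Truncated inverse-square distribution over abundance levels.
    levels = list(range(1, max_copies + 1))
    level_weights = [1.0 / (k * k) for k in levels]
    abundances = rng.choices(levels, weights=level_weights, k=n_transcripts)

    # Draw the deepest library once; shallower depths are its prefixes.
    tags = rng.choices(range(n_transcripts), weights=abundances, k=max(depths))

    wanted = set(depths)
    seen = set()
    unique_at_depth = {}
    for n_sampled, transcript in enumerate(tags, start=1):
        seen.add(transcript)
        if n_sampled in wanted:
            unique_at_depth[n_sampled] = len(seen)
    return unique_at_depth


if __name__ == "__main__":
    # Toy sizes (assumed), far smaller than the libraries in the abstract.
    for depth, n_unique in sorted(
        simulate_unique_counts(2000, 1000, [1000, 10000, 100000]).items()
    ):
        print(f"{depth:>7} tags sampled -> {n_unique} distinct transcripts")
```

In a sketch like this, the observed count keeps rising with depth because the many low-abundance transcripts are the last to be sampled, which is the effect the abstract describes when it ties the required number of tags to the lowest abundance level.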

Year:  2003        PMID: 12611798     DOI: 10.1093/bioinformatics/btg018

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  13 in total

1.  Methods for transcriptional profiling in plants. Be fruitful and replicate. (Review)

Authors:  Blake C Meyers; David W Galbraith; Timothy Nelson; Vikas Agrawal
Journal:  Plant Physiol       Date:  2004-06-01       Impact factor: 8.340

2.  Increasing the efficiency of SAGE adaptor ligation by directed ligation chemistry.

Authors:  Austin P So; Robin F B Turner; Charles A Haynes
Journal:  Nucleic Acids Res       Date:  2004-07-06       Impact factor: 16.971

3.  Detecting novel low-abundant transcripts in Drosophila.

Authors:  Sanggyu Lee; Jingyue Bao; Guolin Zhou; Joshua Shapiro; Jinhua Xu; Run Zhang Shi; Xuemei Lu; Terry Clark; Deborah Johnson; Yeong C Kim; Claudia Wing; Charles Tseng; Min Sun; Wei Lin; Jun Wang; Huanming Yang; Jian Wang; Wei Du; Chung-I Wu; Xiuqing Zhang; San Ming Wang
Journal:  RNA       Date:  2005-06       Impact factor: 4.942

4.  The maize root transcriptome by serial analysis of gene expression.

Authors:  V Poroyko; L G Hejlek; W G Spollen; G K Springer; H T Nguyen; R E Sharp; H J Bohnert
Journal:  Plant Physiol       Date:  2005-06-17       Impact factor: 8.340

5.  Cryptococcus neoformans gene expression during experimental cryptococcal meningitis.

Authors:  B R Steen; S Zuyderduyn; D L Toffaletti; M Marra; S J M Jones; J R Perfect; J Kronstad
Journal:  Eukaryot Cell       Date:  2003-12

6.  Estimating the proportion of microarray probes expressed in an RNA sample.

Authors:  Wei Shi; Carolyn A de Graaf; Sarah A Kinkel; Ariel H Achtman; Tracey Baldwin; Louis Schofield; Hamish S Scott; Douglas J Hilton; Gordon K Smyth
Journal:  Nucleic Acids Res       Date:  2010-01-07       Impact factor: 16.971

7.  Modeling SAGE tag formation and its effects on data interpretation within a Bayesian framework.

Authors:  Michael A Gilchrist; Hong Qin; Russell Zaretzki
Journal:  BMC Bioinformatics       Date:  2007-10-18       Impact factor: 3.169

8.  A combination of LongSAGE with Solexa sequencing is well suited to explore the depth and the complexity of transcriptome.

Authors:  Lucie Hanriot; Céline Keime; Nadine Gay; Claudine Faure; Carole Dossat; Patrick Wincker; Céline Scoté-Blachon; Christelle Peyron; Olivier Gandrillon
Journal:  BMC Genomics       Date:  2008-09-16       Impact factor: 3.969

9.  Modeling Sage data with a truncated gamma-Poisson model.

Authors:  Helene H Thygesen; Aeilko H Zwinderman
Journal:  BMC Bioinformatics       Date:  2006-03-20       Impact factor: 3.169

10.  Modeling transcriptome based on transcript-sampling data.

Authors:  Jiang Zhu; Fuhong He; Jing Wang; Jun Yu
Journal:  PLoS One       Date:  2008-02-20       Impact factor: 3.240
