Computational Analysis of Gene Identification with SAGE.

Terry Clark, Sanggyu Lee, L Ridgway Scott, San Ming Wang.

Abstract

SAGE is one of the few techniques capable of uniformly probing gene expression at a genome level irrespective of mRNA abundance and without a priori knowledge of the transcripts present. However, individual SAGE tags can match many sequences in the reference database, complicating gene identification. We perform a baseline evaluation of gene identification with SAGE using UniGene Human as the reference database by analyzing (1) the distributions of tags for tag sets of various lengths formed for UniGene Human and (2) the tag-to-sequence mapping using a SAGE tag set consisting of 37,522 tags derived from human myeloid cells. The extensive multiplicity of the dbEST component of UniGene significantly detracts from gains that might be expected by extending tags within the scope of the SAGE protocol. To achieve reasonable sequence specificity for gene identification with the content of the commonly used UniGene sequence collection, tags on the order of hundreds of bases in length are required. One way to produce tags of such lengths is with GLGI, which extends SAGE tags to the 3' end of cDNA. We show that the longer sequences produced by GLGI significantly relieve the multiple-match condition. In the myeloid sample, we also found a correlation between multiple-match severity and high copy number. We extrapolate these findings, providing insights into the use of UniGene Human as a reference for gene identification.
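The multiple-match problem described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes the conventional SAGE setup, in which a tag is the run of bases immediately 3' of the last CATG (NlaIII) site in a transcript, and it assumes each reference sequence is given 5' to 3' on the sense strand. The function names and the three toy sequences are hypothetical and stand in for a real reference collection such as UniGene.

```python
from collections import defaultdict

# Assumption: NlaIII (CATG) is the anchoring enzyme, as in the conventional
# SAGE protocol. The abstract itself does not name the enzyme.
ANCHOR = "CATG"


def extract_tag(seq, tag_len):
    """Return the tag_len bases after the 3'-most anchor site, or None."""
    pos = seq.rfind(ANCHOR)
    if pos == -1:
        return None  # no anchor site, so no tag can be formed
    start = pos + len(ANCHOR)
    tag = seq[start:start + tag_len]
    # Discard tags truncated by the end of the sequence.
    return tag if len(tag) == tag_len else None


def tag_multiplicity(sequences, tag_len):
    """Map each tag to the set of sequence IDs that produce it."""
    hits = defaultdict(set)
    for seq_id, seq in sequences.items():
        tag = extract_tag(seq.upper(), tag_len)
        if tag is not None:
            hits[tag].add(seq_id)
    return hits


def fraction_ambiguous(sequences, tag_len):
    """Fraction of distinct tags that match more than one sequence."""
    hits = tag_multiplicity(sequences, tag_len)
    if not hits:
        return 0.0
    ambiguous = sum(1 for ids in hits.values() if len(ids) > 1)
    return ambiguous / len(hits)


# Hypothetical toy reference: seqA and seqB share their first 10 bases
# after the anchor site and diverge at the 11th.
toy_reference = {
    "seqA": "GGCATGAAACCCGGGTTTACGT",
    "seqB": "TTCATGAAACCCGGGTAAACGT",
    "seqC": "AACATGTTTGGGCCCAAATTTT",
}

if __name__ == "__main__":
    for length in (10, 14):
        print(length, fraction_ambiguous(toy_reference, length))
```

In this toy set, a 10-base tag cannot distinguish seqA from seqB, while a 14-base tag can. The abstract's point is that on a real collection with heavy EST redundancy, extensions of this modest size help far less, and much longer tags are needed.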

Year:  2002        PMID: 12162890     DOI: 10.1089/106652702760138600

Source DB:  PubMed          Journal:  J Comput Biol        ISSN: 1066-5277            Impact factor:   1.479


  3 in total

1.  Transcript identification by analysis of short sequence tags--influence of tag length, restriction site and transcript database.

Authors:  Per Unneberg; Anders Wennborg; Magnus Larsson
Journal:  Nucleic Acids Res       Date:  2003-04-15       Impact factor: 16.971

2.  The impact of SNPs on the interpretation of SAGE and MPSS experimental data.

Authors:  Ana Paula M Silva; Jorge E S De Souza; Pedro A F Galante; Gregory J Riggins; Sandro J De Souza; Anamaria A Camargo
Journal:  Nucleic Acids Res       Date:  2004-11-23       Impact factor: 16.971

3.  SAGE is far more sensitive than EST for detecting low-abundance transcripts.

Authors:  Miao Sun; Guolin Zhou; Sanggyu Lee; Jianjun Chen; Run Zhang Shi; San Ming Wang
Journal:  BMC Genomics       Date:  2004-01-05       Impact factor: 3.969
