Transcript identification by analysis of short sequence tags--influence of tag length, restriction site and transcript database.

Per Unneberg, Anders Wennborg, Magnus Larsson.

Abstract

There exist a number of gene expression profiling techniques that utilize restriction enzymes for generation of short expressed sequence tags. We have studied how the choice of restriction enzyme influences various characteristics of tags generated in an experiment. We have also investigated various aspects of in silico transcript identification that these profiling methods rely on. First, analysis of 14 248 mRNA sequences derived from the RefSeq transcript database showed that 1-30% of the sequences lack a given restriction enzyme recognition site. Moreover, 1-5% of the transcripts have recognition sites located less than 10 bases from the poly(A) tail. The uniqueness of 10 bp tags lies in the range 90-95%, which increases only slightly with longer tags, due to the existence of closely related transcripts. Furthermore, 3-30% of upstream 10 bp tags are identical to 3' tags, introducing a risk of misclassification if upstream tags are present in a sample. Second, we found that a sequence length of 16-17 bp, including the recognition site, is sufficient for unique transcript identification by BLAST based sequence alignment to the UniGene Human non-redundant database. Third, we constructed a tag-to-gene mapping for UniGene and compared it to an existing mapping database. The mappings agreed to 79-83%, where the selection of representative sequences in the UniGene clusters is the main cause of the disagreement. The results of this study may serve to improve the interpretation of sequence-based expression studies and the design of hybridization arrays, by identifying short tags that have a high reliability and separating them from tags that carry an inherent ambiguity in their capacity to discriminate between genes. To this end, supplementary information in the form of a web companion to this paper is located at http://biobase.biotech.kth.se/tagseq.
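The tag statistics in the abstract (transcripts lacking a site, sites too close to the 3' end, tag uniqueness) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the recognition site `CATG` (the NlaIII site commonly used in SAGE), the rule of taking the bases immediately downstream of the 3'-most site, the function names, and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter

def three_prime_tag(transcript, site="CATG", tag_len=10):
    """Return the tag_len bases just downstream of the 3'-most site, or None.

    Assumption: the tag starts right after the recognition site, as in
    SAGE-style protocols; the source does not specify this detail.
    """
    pos = transcript.rfind(site)
    if pos == -1:
        return None  # transcript lacks the recognition site
    start = pos + len(site)
    tag = transcript[start:start + tag_len]
    # A site closer than tag_len bases to the 3' end cannot yield a full tag.
    return tag if len(tag) == tag_len else None

def tag_uniqueness(transcripts, site="CATG", tag_len=10):
    """Map each transcript to its 3' tag and report the fraction of
    tagged transcripts whose tag is shared with no other transcript."""
    tags = {name: three_prime_tag(seq, site, tag_len)
            for name, seq in transcripts.items()}
    counts = Counter(t for t in tags.values() if t is not None)
    tagged = [n for n, t in tags.items() if t is not None]
    unique = [n for n in tagged if counts[tags[n]] == 1]
    fraction = len(unique) / len(tagged) if tagged else 0.0
    return tags, fraction

# Hypothetical toy transcripts (poly(A) tails already trimmed).
toy = {
    "tx_a": "GGGCATGAAACCCGGGTTTACGT",      # shares its tag with tx_b
    "tx_b": "TTTCATGAAACCCGGGTCCCC",        # closely related transcript
    "tx_c": "ACGTACGTACGTACGT",             # no recognition site
    "tx_d": "AAAACATGCCCCATGTTGGAACCTTAG",  # two sites; the 3'-most is used
    "tx_e": "CCCCCCCATGACG",                # site too close to the 3' end
}

tags, fraction_unique = tag_uniqueness(toy)
```

In this toy set, `tx_c` and `tx_e` yield no tag, `tx_a` and `tx_b` collide on the same tag, and only `tx_d` is uniquely identified, so the unique fraction is one third of the tagged transcripts. Running the same functions with a larger `tag_len` would show how much, or how little, longer tags help when transcripts are closely related.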

Year:  2003        PMID: 12682372      PMCID: PMC153741          DOI: 10.1093/nar/gkg313

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


