Gene structure prediction from consensus spliced alignment of multiple ESTs matching the same genomic locus.

Volker Brendel, Liqun Xing, Wei Zhu.

Abstract

MOTIVATION: Accurate gene structure annotation is a challenging computational problem in genomics. The best results are achieved with spliced alignment of full-length cDNAs or multiple expressed sequence tags (ESTs) with sufficient overlap to cover the entire gene. For most species, cDNA and EST collections are far from comprehensive. We sought to overcome this bottleneck by exploring the possibility of using combined EST resources from fairly diverged species that still share a common gene space. Previous spliced alignment tools were found inadequate for this task because they rely on very high sequence similarity between the ESTs and the genomic DNA.
RESULTS: We have developed a computer program, GeneSeqer, which is capable of aligning thousands of ESTs with a long genomic sequence in a reasonable amount of time. The algorithm is uniquely designed to tolerate a high percentage of mismatches and insertions or deletions in the EST relative to the genomic template. This feature allows use of non-cognate ESTs for gene structure prediction, including ESTs derived from duplicated genes and homologous genes from related species. The increased gene prediction sensitivity results in part from novel splice site prediction models that are also available as a stand-alone splice site prediction tool. We assessed GeneSeqer performance relative to a standard Arabidopsis thaliana gene set and demonstrate its utility for plant genome annotation. In particular, we propose that this method provides a timely tool for the annotation of the rice genome, using abundant ESTs from other cereals and plants.
AVAILABILITY: The source code is available for download at http://bioinformatics.iastate.edu/bioinformatics2go/gs/download.html. Web servers for Arabidopsis and other plant species are accessible at http://www.plantgdb.org/cgi-bin/AtGeneSeqer.cgi and http://www.plantgdb.org/cgi-bin/GeneSeqer.cgi, respectively. For non-plant species, use http://bioinformatics.iastate.edu/cgi-bin/gs.cgi. The splice site prediction tool (SplicePredictor) is distributed with the GeneSeqer code. A SplicePredictor web server is available at http://bioinformatics.iastate.edu/cgi-bin/sp.cgi

Year:  2004        PMID: 14764557     DOI: 10.1093/bioinformatics/bth058

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  42 in total

1.  Genomewide comparative analysis of alternative splicing in plants.

Authors:  Bing-Bing Wang; Volker Brendel
Journal:  Proc Natl Acad Sci U S A       Date:  2006-04-21       Impact factor: 11.205

2.  Characterization of novel SLC6A8 variants with the use of splice-site analysis tools and implementation of a newly developed LOVD database.

Authors:  Ofir T Betsalel; Efraim H Rosenberg; Ligia S Almeida; Tjitske Kleefstra; Charles E Schwartz; Vassili Valayannopoulos; Omar Abdul-Rahman; Nicola Poplawski; Laura Vilarinho; Philipp Wolf; Johan T den Dunnen; Cornelis Jakobs; Gajja S Salomons
Journal:  Eur J Hum Genet       Date:  2010-08-18       Impact factor: 4.246

3.  Evaluation of five ab initio gene prediction programs for the discovery of maize genes.

Authors:  Hong Yao; Ling Guo; Yan Fu; Lisa A Borsuk; Tsui-Jung Wen; David S Skibbe; Xiangqin Cui; Brian E Scheffler; Jun Cao; Scott J Emrich; Daniel A Ashlock; Patrick S Schnable
Journal:  Plant Mol Biol       Date:  2005-02       Impact factor: 4.076

4.  Genetic dissection of intermated recombinant inbred lines using a new genetic map of maize.

Authors:  Yan Fu; Tsui-Jung Wen; Yefim I Ronin; Hsin D Chen; Ling Guo; David I Mester; Yongjie Yang; Michael Lee; Abraham B Korol; Daniel A Ashlock; Patrick S Schnable
Journal:  Genetics       Date:  2006-09-01       Impact factor: 4.562

5.  Comparative plant genomics resources at PlantGDB.

Authors:  Qunfeng Dong; Carolyn J Lawrence; Shannon D Schlueter; Matthew D Wilkerson; Stefan Kurtz; Carol Lushbough; Volker Brendel
Journal:  Plant Physiol       Date:  2005-10       Impact factor: 8.340

6.  Characterization of Na, K-ATPase genes in Atlantic salmon (Salmo salar) and comparative genomic organization with rainbow trout (Oncorhynchus mykiss).

Authors:  Karim Gharbi; Moira M Ferguson; Roy G Danzmann
Journal:  Mol Genet Genomics       Date:  2005-05-10       Impact factor: 3.291

7.  The maize genetics and genomics database. The community resource for access to diverse maize data.

Authors:  Carolyn J Lawrence; Trent E Seigfried; Volker Brendel
Journal:  Plant Physiol       Date:  2005-05       Impact factor: 8.340

8.  Efficient plant gene identification based on interspecies mapping of full-length cDNAs.

Authors:  Naoki Amano; Tsuyoshi Tanaka; Hisataka Numa; Hiroaki Sakai; Takeshi Itoh
Journal:  DNA Res       Date:  2010-07-28       Impact factor: 4.458

9.  Improvement of whole-genome annotation of cereals through comparative analyses.

Authors:  Wei Zhu; C Robin Buell
Journal:  Genome Res       Date:  2007-02-06       Impact factor: 9.043

10.  ESTPiper--a web-based analysis pipeline for expressed sequence tags.

Authors:  Zuojian Tang; Jeong-Hyeon Choi; Chris Hemmerich; Ankita Sarangi; John K Colbourne; Qunfeng Dong
Journal:  BMC Genomics       Date:  2009-04-21       Impact factor: 3.969
