Incorporation of splice site probability models for non-canonical introns improves gene structure prediction in plants.

Michael E Sparks, Volker Brendel.

Abstract

MOTIVATION: The vast majority of introns in protein-coding genes of higher eukaryotes have a GT dinucleotide at their 5'-terminus and an AG dinucleotide at their 3' end. About 1-2% of introns are non-canonical, with the most abundant subtype of non-canonical introns being characterized by GC and AG dinucleotides at their 5'- and 3'-termini, respectively. Most current gene prediction software, whether based on ab initio or spliced alignment approaches, does not include explicit models for non-canonical introns or may exclude their prediction altogether. With present amounts of genome and transcript data, it is now possible to apply statistical methodology to non-canonical splice site prediction. We pursued one such approach and describe the training and implementation of GC-donor splice site models for Arabidopsis and rice, with the goal of exploring whether specific modeling of non-canonical introns can enhance gene structure prediction accuracy.
RESULTS: Our results indicate that the incorporation of non-canonical splice site models yields dramatic improvements in annotating genes containing GC-AG and AT-AC non-canonical introns. Comparison of models shows differences between monocot and dicot species, but also suggests GC intron-specific biases independent of taxonomic clade. We also present evidence that GC-AG introns occur preferentially in genes with atypically high exon counts.
AVAILABILITY: Source code for the updated versions of GeneSeqer and SplicePredictor (distributed with the GeneSeqer code) is available at http://bioinformatics.iastate.edu/bioinformatics2go/gs/download.html. Web servers for Arabidopsis, rice and other plant species are accessible at http://www.plantgdb.org/PlantGDB-cgi/GeneSeqer/AtGDBgs.cgi, http://www.plantgdb.org/PlantGDB-cgi/GeneSeqer/OsGDBgs.cgi and http://www.plantgdb.org/PlantGDB-cgi/GeneSeqer/PlantGDBgs.cgi, respectively. A SplicePredictor web server is available at http://bioinformatics.iastate.edu/cgi-bin/sp.cgi. Software to generate training data and parameterizations for Bayesian splice site models is available at http://gremlin1.gdcb.iastate.edu/~volker/SB05B/BSSM4GSQ/
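
The intron classes named in the abstract (GT-AG, GC-AG and AT-AC) are defined purely by the dinucleotides at the two intron termini, which can be illustrated with a minimal Python sketch. This is not the Bayesian splice site model implemented in GeneSeqer or SplicePredictor; it only labels an already delimited intron sequence by its ends. The function names `classify_intron` and `tally_introns` and the toy sequences are hypothetical and chosen for illustration.

```python
from collections import Counter

# Terminal dinucleotide pairs named in the abstract: (5' end, 3' end).
INTRON_CLASSES = {
    ("GT", "AG"): "canonical GT-AG",
    ("GC", "AG"): "non-canonical GC-AG",
    ("AT", "AC"): "non-canonical AT-AC",
}


def classify_intron(intron_seq):
    """Label an intron by its first and last two nucleotides.

    Assumes intron_seq is the intron only (no flanking exon bases),
    written 5' to 3' on the coding strand.
    """
    seq = intron_seq.upper()
    if len(seq) < 4:
        raise ValueError("intron too short to have distinct terminal dinucleotides")
    return INTRON_CLASSES.get((seq[:2], seq[-2:]), "other non-canonical")


def tally_introns(introns):
    """Count how many introns fall into each class."""
    return Counter(classify_intron(s) for s in introns)


# Hypothetical toy sequences, not real Arabidopsis or rice introns.
examples = ["GTAAGTTTTCAG", "GCAAGTTTTTAG", "ATATCCTTTTAC", "GGAAGTTTTTAG"]
counts = tally_introns(examples)
for label, n in sorted(counts.items()):
    print(label, n)
```

A tally of this kind over a set of transcript-confirmed introns is one way to estimate the non-canonical fraction that the abstract puts at about 1-2%. Scoring candidate GC donor sites in unannotated sequence requires a trained probability model such as those the authors describe.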

Year:  2005        PMID: 16306388     DOI: 10.1093/bioinformatics/bti1205

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  15 in total

1.  Genomewide comparative analysis of alternative splicing in plants.

Authors:  Bing-Bing Wang; Volker Brendel
Journal:  Proc Natl Acad Sci U S A       Date:  2006-04-21       Impact factor: 11.205

2.  MaizeGDB becomes 'sequence-centric'.

Authors:  Taner Z Sen; Carson M Andorf; Mary L Schaeffer; Lisa C Harper; Michael E Sparks; Jon Duvick; Volker P Brendel; Ethalinda Cannon; Darwin A Campbell; Carolyn J Lawrence
Journal:  Database (Oxford)       Date:  2009-12-07       Impact factor: 3.451

3.  Association of the proprotein convertase subtilisin/kexin-type 2 (PCSK2) gene with type 2 diabetes in an African American population.

Authors:  Tennille S Leak; Keith L Keene; Carl D Langefeld; Carla J Gallagher; Josyf C Mychaleckyj; Barry I Freedman; Donald W Bowden; Stephen S Rich; Michèle M Sale
Journal:  Mol Genet Metab       Date:  2007-07-06       Impact factor: 4.797

4.  Evidence-based gene predictions in plant genomes.

Authors:  Chengzhi Liang; Long Mao; Doreen Ware; Lincoln Stein
Journal:  Genome Res       Date:  2009-06-18       Impact factor: 9.043

5.  RNA-seq analysis of the C. briggsae transcriptome.

Authors:  Bora Uyar; Jeffrey S C Chu; Ismael A Vergara; Shu Yi Chua; Martin R Jones; Tammy Wong; David L Baillie; Nansheng Chen
Journal:  Genome Res       Date:  2012-07-06       Impact factor: 9.043

6.  MetWAMer: eukaryotic translation initiation site prediction.

Authors:  Michael E Sparks; Volker Brendel
Journal:  BMC Bioinformatics       Date:  2008-09-18       Impact factor: 3.169

7.  Comprehensive splice-site analysis using comparative genomics.

Authors:  Nihar Sheth; Xavier Roca; Michelle L Hastings; Ted Roeder; Adrian R Krainer; Ravi Sachidanandam
Journal:  Nucleic Acids Res       Date:  2006-08-12       Impact factor: 16.971

8.  Genome-wide analysis of alternative splicing in Volvox carteri.

Authors:  Arash Kianianmomeni; Cheng Soon Ong; Gunnar Rätsch; Armin Hallmann
Journal:  BMC Genomics       Date:  2014-12-16       Impact factor: 3.969

9.  Polyphenol oxidase (PPO) in wheat and wild relatives: molecular evidence for a multigene family.

Authors:  Alicia N Massa; Brian Beecher; Craig F Morris
Journal:  Theor Appl Genet       Date:  2007-02-14       Impact factor: 5.574

10.  A method of predicting changes in human gene splicing induced by genetic variants in context of cis-acting elements.

Authors:  Alexander Churbanov; Igor Vorechovský; Chindo Hicks
Journal:  BMC Bioinformatics       Date:  2010-01-12       Impact factor: 3.169
