Computational gene prediction using multiple sources of evidence.

Jonathan E Allen, Mihaela Pertea, Steven L Salzberg.

Abstract

This article describes a computational method to construct gene models by using evidence generated from a diverse set of sources, including those typical of a genome annotation pipeline. The program, called Combiner, takes as input a genomic sequence and the locations of gene predictions from ab initio gene finders, protein sequence alignments, expressed sequence tag and cDNA alignments, splice site predictions, and other evidence. Three different algorithms for combining evidence in the Combiner were implemented and tested on 1783 confirmed genes in Arabidopsis thaliana. Our results show that combining gene prediction evidence consistently outperforms even the best individual gene finder and, in some cases, can produce dramatic improvements in sensitivity and specificity.
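The abstract does not spell out the three combination algorithms, so the following is only a minimal sketch of the general idea, not the Combiner's actual method. It assumes a simple per-base weighted vote across evidence sources and scores the result with nucleotide-level sensitivity, TP/(TP+FN), and specificity, taken here as TP/(TP+FP) as is common in the gene-finding literature. The source names, weights, coordinates, and threshold are all hypothetical.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end), 0-based coordinates


def combine_by_weighted_vote(
    length: int,
    evidence: Dict[str, List[Interval]],
    weights: Dict[str, float],
    threshold: float = 0.5,
) -> List[Interval]:
    """Call a base coding when the weighted share of sources that
    mark it coding exceeds `threshold` (an illustrative rule only)."""
    total = sum(weights[name] for name in evidence)
    score = [0.0] * length
    for name, intervals in evidence.items():
        covered = set()
        for start, end in intervals:
            covered.update(range(max(start, 0), min(end, length)))
        for pos in covered:  # each source votes at most once per base
            score[pos] += weights[name]

    # Collapse consecutive coding bases into intervals.
    combined: List[Interval] = []
    run_start = None
    for pos in range(length):
        coding = score[pos] / total > threshold
        if coding and run_start is None:
            run_start = pos
        elif not coding and run_start is not None:
            combined.append((run_start, pos))
            run_start = None
    if run_start is not None:
        combined.append((run_start, length))
    return combined


def nucleotide_accuracy(
    predicted: List[Interval], truth: List[Interval]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) at the nucleotide level."""
    pred = {p for s, e in predicted for p in range(s, e)}
    true = {p for s, e in truth for p in range(s, e)}
    tp = len(pred & true)
    sensitivity = tp / len(true) if true else 0.0
    specificity = tp / len(pred) if pred else 0.0
    return sensitivity, specificity


# Hypothetical evidence for a 30-base region.
evidence = {
    "gene_finder_a": [(5, 15)],
    "gene_finder_b": [(8, 20)],
    "est_alignment": [(8, 18)],
}
weights = {"gene_finder_a": 1.0, "gene_finder_b": 1.0, "est_alignment": 1.0}
truth = [(8, 18)]

combined = combine_by_weighted_vote(30, evidence, weights)
print(combined, nucleotide_accuracy(combined, truth))
print(nucleotide_accuracy(evidence["gene_finder_a"], truth))
```

In this toy case the vote recovers the true interval exactly, while `gene_finder_a` alone reaches 0.7 on both measures. A real combiner would work on exons, splice sites, and reading frames rather than isolated bases.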

Year:  2004        PMID: 14707176      PMCID: PMC314291          DOI: 10.1101/gr.1562804

Source DB:  PubMed          Journal:  Genome Res        ISSN: 1088-9051            Impact factor:   9.043


