
Evaluation of gene prediction software using a genomic data set: application to Arabidopsis thaliana sequences.

N Pavy, S Rombauts, P Déhais, C Mathé, D V Ramana, P Leroy, P Rouzé.

Abstract

MOTIVATION: The annotation of the Arabidopsis thaliana genome remains a problem in terms of time and quality. To improve the annotation process, we want to choose the most appropriate tools to use inside a computer-assisted annotation platform. We therefore need evaluation of prediction programs with Arabidopsis sequences containing multiple genes.
RESULTS: We have developed AraSet, a data set of contigs of validated genes, enabling the evaluation of multi-gene models for the Arabidopsis genome. Besides conventional metrics to evaluate gene prediction at the site and the exon levels, new measures were introduced for the prediction at the protein sequence level as well as for the evaluation of gene models. This evaluation method is of general interest and could apply to any new gene prediction software and to any eukaryotic genome. The GeneMark.hmm program appears to be the most accurate software at all three levels for the Arabidopsis genomic sequences. Gene modeling could be further improved by combination of prediction software.
AVAILABILITY: The AraSet sequence set, the Perl programs and complementary results and notes are available at http://sphinx.rug.ac.be:8080/biocomp/napav/.
CONTACT: Pierre.Rouze@gengenp.rug.ac.be.
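
The abstract refers to conventional exon-level metrics without defining them. As an illustration only, the sketch below assumes the definitions commonly used in the gene-prediction literature: exon-level sensitivity is the fraction of annotated exons predicted with both boundaries exact, and exon-level specificity is the fraction of predicted exons that exactly match an annotated exon. The function name, the coordinate convention and the sample coordinates are hypothetical; this is not the authors' Perl code, and it does not cover the new protein-level or gene-model measures the abstract mentions.

```python
from typing import Iterable, Tuple

# An exon is a (start, end) pair; 1-based inclusive coordinates are an
# assumed convention for this sketch, not something stated in the abstract.
Exon = Tuple[int, int]


def exon_level_accuracy(
    annotated: Iterable[Exon], predicted: Iterable[Exon]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) at the exon level.

    An exon counts as correct only if both boundaries match exactly.
    """
    annotated_set = set(annotated)
    predicted_set = set(predicted)
    exact_matches = len(annotated_set & predicted_set)

    # Sensitivity: share of annotated exons that were predicted exactly.
    sensitivity = exact_matches / len(annotated_set) if annotated_set else 0.0
    # Specificity (in the gene-prediction sense): share of predicted exons
    # that are exactly right.
    specificity = exact_matches / len(predicted_set) if predicted_set else 0.0
    return sensitivity, specificity


# Made-up coordinates, purely for illustration.
annotated_exons = [(100, 200), (300, 400), (500, 650)]
predicted_exons = [(100, 200), (300, 410), (500, 650), (700, 750)]

sn, sp = exon_level_accuracy(annotated_exons, predicted_exons)
print(f"exon-level sensitivity = {sn:.2f}, specificity = {sp:.2f}")
```

In this made-up case two of the three annotated exons are recovered exactly, while two of the four predicted exons are wrong (one has a shifted boundary, one is spurious), so the two measures diverge. That divergence is the reason both are normally reported together.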

Year:  1999        PMID: 10743555     DOI: 10.1093/bioinformatics/15.11.887

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  37 in total

1.  Identification and analysis of Arabidopsis expressed sequence tags characteristic of non-coding RNAs.

Authors:  G C MacIntosh; C Wilkerson; P J Green
Journal:  Plant Physiol       Date:  2001-11       Impact factor: 8.340

2.  Massive sequence comparisons as a help in annotating genomic sequences.

Authors:  A Louis; E Ollivier; J C Aude; J L Risler
Journal:  Genome Res       Date:  2001-07       Impact factor: 9.043

3.  Genome-wide analysis of core cell cycle genes in Arabidopsis.

Authors:  Klaas Vandepoele; Jeroen Raes; Lieven De Veylder; Pierre Rouzé; Stephane Rombauts; Dirk Inzé
Journal:  Plant Cell       Date:  2002-04       Impact factor: 11.277

Review 4.  Computational gene finding in plants.

Authors:  Mihaela Pertea; Steven L Salzberg
Journal:  Plant Mol Biol       Date:  2002-01       Impact factor: 4.076

Review 5.  Plant genome evolution: lessons from comparative genomics at the DNA level.

Authors:  Renate Schmidt
Journal:  Plant Mol Biol       Date:  2002-01       Impact factor: 4.076

6.  The automatic detection of homologous regions (ADHoRe) and its application to microcolinearity between Arabidopsis and rice.

Authors:  Klaas Vandepoele; Yvan Saeys; Cedric Simillion; Jeroen Raes; Yves Van De Peer
Journal:  Genome Res       Date:  2002-11       Impact factor: 9.043

Review 7.  Genomics and plant cells: application of genomics strategies to Arabidopsis cell biology.

Authors:  Michael Bevan
Journal:  Philos Trans R Soc Lond B Biol Sci       Date:  2002-06-29       Impact factor: 6.237

8.  EUGENE'HOM: A generic similarity-based gene finder using multiple homologous sequences.

Authors:  Sylvain Foissac; Philippe Bardou; Annick Moisan; Marie-Josée Cros; Thomas Schiex
Journal:  Nucleic Acids Res       Date:  2003-07-01       Impact factor: 16.971

9.  GeneSeqer@PlantGDB: Gene structure prediction in plant genomes.

Authors:  Shannon D Schlueter; Qunfeng Dong; Volker Brendel
Journal:  Nucleic Acids Res       Date:  2003-07-01       Impact factor: 16.971

10.  Refined annotation of the Arabidopsis genome by complete expressed sequence tag mapping.

Authors:  Wei Zhu; Shannon D Schlueter; Volker Brendel
Journal:  Plant Physiol       Date:  2003-06       Impact factor: 8.340
