Evaluation of gene-finding programs on mammalian sequences.

S Rogic, A K Mackworth, F B Ouellette.

Abstract

We present an independent comparative analysis of seven recently developed gene-finding programs: FGENES, GeneMark.hmm, Genie, Genscan, HMMgene, Morgan, and MZEF. For evaluation purposes we developed a new, thoroughly filtered, and biologically validated dataset of mammalian genomic sequences that does not overlap with the training sets of the programs analyzed. Our analysis shows that the new generation of programs performs substantially better than the programs analyzed in previous studies. The accuracy of the programs was also examined as a function of various sequence and prediction features, such as G + C content of the sequence, length and type of exons, signal type, and score of the exon prediction. This approach pinpoints the strengths and weaknesses of each individual program as well as those of computational gene-finding in general. The dataset used in this analysis (HMR195) and the tables with the complete results are available at http://www.cs.ubc.ca/~rogic/evaluation/.
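
The abstract does not define its accuracy measures. As an illustrative sketch only, the Python below assumes the exon-level sensitivity and specificity conventionally used when evaluating gene finders, where a predicted exon counts as correct only if both boundaries match the annotation exactly, and "specificity" means the fraction of predicted exons that are correct. It also shows how the G + C content of a sequence could be computed for grouping results. The function names and the coordinate representation are hypothetical and are not taken from the paper.

```python
def gc_content(seq):
    """Return the fraction of G and C bases in a DNA sequence (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def exon_accuracy(annotated, predicted):
    """Return (sensitivity, specificity) at the exon level.

    Exons are (start, end) tuples. A predicted exon is counted as correct
    only when both coordinates match an annotated exon exactly
    (an assumption of this sketch).
    """
    annotated, predicted = set(annotated), set(predicted)
    correct = len(annotated & predicted)
    # Sensitivity: share of annotated exons that were predicted correctly.
    sensitivity = correct / len(annotated) if annotated else 0.0
    # Specificity (gene-finding usage): share of predicted exons that are correct.
    specificity = correct / len(predicted) if predicted else 0.0
    return sensitivity, specificity
```

For example, with four annotated exons and three predictions of which two match exactly, `exon_accuracy` gives a sensitivity of 2/4 and a specificity of 2/3. Sequences could then be grouped by `gc_content` to compare these values across composition ranges.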

Year:  2001        PMID: 11337477      PMCID: PMC311133          DOI: 10.1101/gr.147901

Source DB:  PubMed          Journal:  Genome Res        ISSN: 1088-9051            Impact factor:   9.043

