Genie--gene finding in Drosophila melanogaster.

M G Reese, D Kulp, H Tammana, D Haussler.

Abstract

A hidden Markov model-based gene-finding system called Genie was applied to the genomic Adh region in Drosophila melanogaster as part of the Genome Annotation Assessment Project (GASP). Predictions from three versions of the Genie gene-finding system were submitted: one based on statistical properties of coding genes, a second that included EST alignment information, and a third that integrated protein sequence homology information. All three programs were trained on the provided Drosophila training data. In addition, promoter assignments from an integrated neural network were submitted. The gene assignments overlapped >90% of the 222 annotated genes, and 26 possibly novel genes were predicted, some of which might be overpredictions. The system correctly identified the exon boundaries of 70% of the exons in cDNA-confirmed genes and 77% of the exons with the addition of EST sequence alignments. The best of the three Genie submissions predicted 19 of the 43 annotated gene structures entirely correctly (44%). In the promoter category, only 30% of the transcription start sites could be detected, but by integrating this program as a sensor into Genie the false-positive rate could be reduced to 1/16,786 (0.006%). The results of the experiment on the long contiguous genomic sequence revealed some problems concerning gene assembly in Genie. The results were used to improve the system. We show that Genie is a robust hidden Markov model system that allows for a generalized integration of information from different sources such as signal sensors (splice sites, start codon, etc.), content sensors (exons, introns, intergenic regions), and alignments of mRNA, EST, and peptide sequences. The assessment showed that Genie could effectively be used for the annotation of complete genomes from higher organisms.

Year:  2000        PMID: 10779493      PMCID: PMC310881          DOI: 10.1101/gr.10.4.529

Source DB:  PubMed          Journal:  Genome Res        ISSN: 1088-9051            Impact factor:   9.043


References (9 in total)

1.  Assessment of protein coding measures. (Review)

Authors:  J W Fickett; C S Tung
Journal:  Nucleic Acids Res       Date:  1992-12-25       Impact factor: 16.971

2.  Local alignment statistics.

Authors:  S F Altschul; W Gish
Journal:  Methods Enzymol       Date:  1996       Impact factor: 1.600

3.  Integrating database homology in a probabilistic gene structure model.

Authors:  D Kulp; D Haussler; M G Reese; F H Eeckman
Journal:  Pac Symp Biocomput       Date:  1997

4.  A generalized hidden Markov model for the recognition of human genes in DNA.

Authors:  D Kulp; D Haussler; M G Reese; F H Eeckman
Journal:  Proc Int Conf Intell Syst Mol Biol       Date:  1996

5.  Improved splice site detection in Genie.

Authors:  M G Reese; F H Eeckman; D Kulp; D Haussler
Journal:  J Comput Biol       Date:  1997       Impact factor: 1.479

6.  Genome annotation assessment in Drosophila melanogaster.

Authors:  M G Reese; G Hartzell; N L Harris; U Ohler; J F Abril; S E Lewis
Journal:  Genome Res       Date:  2000-04       Impact factor: 9.043

7.  Prediction of complete gene structures in human genomic DNA.

Authors:  C Burge; S Karlin
Journal:  J Mol Biol       Date:  1997-04-25       Impact factor: 5.469

8.  Optimally parsing a sequence into different classes based on multiple types of evidence.

Authors:  G D Stormo; D Haussler
Journal:  Proc Int Conf Intell Syst Mol Biol       Date:  1994

9.  An exploration of the sequence of a 2.9-Mb region of the genome of Drosophila melanogaster: the Adh region.

Authors:  M Ashburner; S Misra; J Roote; S E Lewis; R Blazej; T Davis; C Doyle; R Galle; R George; N Harris; G Hartzell; D Harvey; L Hong; K Houston; R Hoskins; G Johnson; C Martin; A Moshrefi; M Palazzolo; M G Reese; A Spradling; G Tsang; K Wan; K Whitelaw; S Celniker
Journal:  Genetics       Date:  1999-09       Impact factor: 4.562

Cited by (54 in total)

1.  Identification of eukaryotic peptide deformylases reveals universality of N-terminal protein processing mechanisms.

Authors:  C Giglione; A Serero; M Pierre; B Boisson; T Meinnel
Journal:  EMBO J       Date:  2000-11-01       Impact factor: 11.598

2.  Computational inference of homologous gene structures in the human genome.

Authors:  R F Yeh; L P Lim; C B Burge
Journal:  Genome Res       Date:  2001-05       Impact factor: 9.043

3.  SLAM: cross-species gene finding and alignment with a generalized pair hidden Markov model.

Authors:  Marina Alexandersson; Simon Cawley; Lior Pachter
Journal:  Genome Res       Date:  2003-03       Impact factor: 9.043

4.  A complexity reduction algorithm for analysis and annotation of large genomic sequences.

Authors:  Trees-Juen Chuang; Wen-Chang Lin; Hurng-Chun Lee; Chi-Wei Wang; Keh-Lin Hsiao; Zi-Hao Wang; Danny Shieh; Simon C Lin; Lan-Yang Ch'ang
Journal:  Genome Res       Date:  2003-02       Impact factor: 9.043

5.  GlimmerM, Exonomy and Unveil: three ab initio eukaryotic genefinders.

Authors:  William H Majoros; Mihaela Pertea; Corina Antonescu; Steven L Salzberg
Journal:  Nucleic Acids Res       Date:  2003-07-01       Impact factor: 16.971

6.  GAZE: a generic framework for the integration of gene-prediction data by dynamic programming.

Authors:  Kevin L Howe; Tom Chothia; Richard Durbin
Journal:  Genome Res       Date:  2002-09       Impact factor: 9.043

7.  AUGUSTUS: a web server for gene finding in eukaryotes.

Authors:  Mario Stanke; Rasmus Steinkamp; Stephan Waack; Burkhard Morgenstern
Journal:  Nucleic Acids Res       Date:  2004-07-01       Impact factor: 16.971

8.  Accurate identification of novel human genes through simultaneous gene prediction in human, mouse, and rat.

Authors:  Colin Dewey; Jia Qian Wu; Simon Cawley; Marina Alexandersson; Richard Gibbs; Lior Pachter
Journal:  Genome Res       Date:  2004-04       Impact factor: 9.043

9.  Evaluating bacterial gene-finding HMM structures as probabilistic logic programs.

Authors:  Søren Mørk; Ian Holmes
Journal:  Bioinformatics       Date:  2012-01-03       Impact factor: 6.937

10.  A beginner's guide to eukaryotic genome annotation. (Review)

Authors:  Mark Yandell; Daniel Ence
Journal:  Nat Rev Genet       Date:  2012-04-18       Impact factor: 53.242
