Identification of protein coding regions in genomic DNA.

E E Snyder, G D Stormo.

Abstract

We have developed a computer program, GeneParser, which identifies and determines the fine structure of protein genes in genomic DNA sequences. The program scores all subintervals in a sequence for content statistics indicative of introns and exons, and for sites that identify their boundaries. This information is weighted by a neural network to approximate the log-likelihood that each subinterval exactly represents an intron or exon (first, internal or last). A dynamic programming algorithm is then applied to this data to find the combination of introns and exons that maximizes the likelihood function. Using this method, we can rapidly generate ranked suboptimal solutions, each of which is the optimum solution containing a given intron-exon junction. We have tested the system on a large collection of human genes. On sequences not used in training, we achieved a correlation coefficient for exon nucleotide prediction of 0.89. For a subset of G + C-rich genes, a correlation coefficient of 0.94 was achieved. We have also quantified the robustness of the method to substitution and frame-shift errors and show how the system can be optimized for performance on sequences with known levels of sequencing errors.
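The dynamic programming step described in the abstract can be illustrated with a small sketch. This is not the GeneParser implementation: the function `best_parse`, the segment labels, and the toy scoring function `toy_score` are hypothetical names introduced here, and the sketch assumes that the whole sequence is parsed as a first exon, alternating introns and internal exons, and a last exon. In the real system the per-interval scores would come from the neural-network weighting of content and site statistics; here they are a stand-in that counts agreement with a made-up labelling.

```python
from math import inf

FIRST, INTERNAL, LAST, INTRON = "first", "internal", "last", "intron"

# Which segment types may immediately precede each type (assumed grammar:
# first exon, then intron/internal-exon alternation, ending in a last exon).
PREDECESSORS = {
    INTRON: (FIRST, INTERNAL),
    INTERNAL: (INTRON,),
    LAST: (INTRON,),
}


def best_parse(n, score):
    """Return (total_score, segments) for the highest-scoring parse of [0, n).

    score(kind, i, j) is any function giving a log-likelihood-like score
    for the half-open interval [i, j) being a segment of type `kind`.
    """
    kinds = (FIRST, INTERNAL, LAST, INTRON)
    best = {k: [-inf] * (n + 1) for k in kinds}   # best[k][j]: best prefix ending at j with a k segment
    back = {k: [None] * (n + 1) for k in kinds}   # back-pointers: (previous kind, previous end)

    for j in range(1, n + 1):
        best[FIRST][j] = score(FIRST, 0, j)       # the first exon always starts at 0
        for kind, preds in PREDECESSORS.items():
            for i in range(1, j):
                for prev in preds:
                    if best[prev][i] == -inf:
                        continue
                    cand = best[prev][i] + score(kind, i, j)
                    if cand > best[kind][j]:
                        best[kind][j] = cand
                        back[kind][j] = (prev, i)

    if best[LAST][n] == -inf:
        raise ValueError("no valid parse (sequence too short for this grammar)")

    # Follow back-pointers from the last exon to recover the segments.
    segments, kind, j = [], LAST, n
    while True:
        pointer = back[kind][j]
        start = pointer[1] if pointer else 0
        segments.append((kind, start, j))
        if pointer is None:
            break
        kind, j = pointer
    segments.reverse()
    return best[LAST][n], segments


# Toy stand-in for the learned scores: +1 per position that agrees with a
# made-up labelling (E = exon, I = intron), -1 per position that disagrees.
TOY_LABELS = "EEEIIIIEEE"


def toy_score(kind, i, j):
    want = "I" if kind == INTRON else "E"
    return sum(1 if c == want else -1 for c in TOY_LABELS[i:j])


total, segments = best_parse(len(TOY_LABELS), toy_score)
print(total, segments)
# prints: 10 [('first', 0, 3), ('intron', 3, 7), ('last', 7, 10)]
```

The sketch runs in O(n^2) time per segment type because every subinterval is scored once, which mirrors the abstract's description of scoring all subintervals before combining them. It does not attempt the ranked suboptimal solutions or the error-tolerance tuning that the abstract also mentions.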

Year:  1995        PMID: 7731036     DOI: 10.1006/jmbi.1995.0198

Source DB:  PubMed          Journal:  J Mol Biol        ISSN: 0022-2836            Impact factor:   5.469


  22 in total

1.  Positional characterisation of false positives from computational prediction of human splice sites.

Authors:  T A Thanaraj
Journal:  Nucleic Acids Res       Date:  2000-02-01       Impact factor: 16.971

2.  Multiple splicing defects in an intronic false exon.

Authors:  H Sun; L A Chasin
Journal:  Mol Cell Biol       Date:  2000-09       Impact factor: 4.272

Review 3.  The BioTools Suite. A comprehensive suite of platform-independent bioinformatics tools.

Authors:  D S Wishart; S Fortin
Journal:  Mol Biotechnol       Date:  2001-09       Impact factor: 2.695

4.  Evaluation of gene-finding programs on mammalian sequences.

Authors:  S Rogic; A K Mackworth; F B Ouellette
Journal:  Genome Res       Date:  2001-05       Impact factor: 9.043

5.  Gene structure prediction in syntenic DNA segments.

Authors:  Jonathan E Moore; James A Lake
Journal:  Nucleic Acids Res       Date:  2003-12-15       Impact factor: 16.971

Review 6.  Current methods of gene prediction, their strengths and weaknesses.

Authors:  Catherine Mathé; Marie-France Sagot; Thomas Schiex; Pierre Rouzé
Journal:  Nucleic Acids Res       Date:  2002-10-01       Impact factor: 16.971

7.  Gene prediction by spectral rotation measure: a new method for identifying protein-coding regions.

Authors:  Daniel Kotlar; Yizhar Lavner
Journal:  Genome Res       Date:  2003-07-17       Impact factor: 9.043

8.  Identification of programmed translational -1 frameshifting sites in the genome of Saccharomyces cerevisiae.

Authors:  Michaël Bekaert; Hugues Richard; Bernard Prum; Jean-Pierre Rousset
Journal:  Genome Res       Date:  2005-10       Impact factor: 9.043

Review 9.  Computational methods for exon detection.

Authors:  J M Claverie
Journal:  Mol Biotechnol       Date:  1998-08       Impact factor: 2.695

10.  Genome annotation in the presence of insertional RNA editing.

Authors:  Christina Beargie; Tsunglin Liu; Mark Corriveau; Ha Youn Lee; Jonatha Gott; Ralf Bundschuh
Journal:  Bioinformatics       Date:  2008-09-25       Impact factor: 6.937
