
The prediction of exons through an analysis of spliceable open reading frames.

G B Hutchinson, M R Hayden.

Abstract

We have developed a computer program which predicts internal exons from naive genomic sequence data and which will run on any IBM-compatible 80286 (or higher) computer. The algorithm searches a sequence for 'spliceable open reading frames' (SORFs), which are open reading frames bracketed by suitable splice-recognition sequences, and then analyzes the region for codon usage. Potential exons are stratified according to the reliability of their prediction, from confidence levels 1 to 5. The program is designed to predict internal exons of length greater than 60 nucleotides. In an analysis of 116 genes of a training set, 384 out of 441 such exons (87.1%) are identified, with 280 (63.5%) of predictions matching the true exon exactly (at both 5' and 3' splice junctions and in the correct reading frame), and with 104 (23.6%) exons matching partially. In a similar analysis of 14 genes in a test set unrelated to the genes used to generate the parameters of the program, 70 out of 80 internal exons greater than 60 bp in length are identified (87.5%), with 47 completely and 23 partially matched. SORFs that partially match true internal exons share at least one splice junction with the exon, or share both splice junctions but are interpreted in an incorrect reading frame. Specificity (the percentage of SORFs that correspond to true exons) varies from 91% at confidence level 1 to 16% at confidence level 5, with an overall specificity of 35-40%. The output displays nucleotide position, confidence level, reading frame phase at the 5' and 3' ends, acceptor and donor sequences and scoring statistics and also gives an amino acid translation of the potential exon. SORFIND compares favourably with other programs currently used to predict protein-coding regions.
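The SORF search described in the abstract can be illustrated with a minimal sketch. This is not the SORFIND implementation: it assumes the bare AG acceptor and GT donor dinucleotides stand in for the "suitable splice-recognition sequences", it skips the codon-usage analysis and the confidence levels entirely, and the names `find_sorfs`, `open_frames` and `MIN_EXON_LEN` are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
MIN_EXON_LEN = 60  # the abstract targets internal exons longer than 60 nucleotides


def open_frames(exon):
    """Return the frame offsets (0, 1, 2) in which `exon` has no in-frame stop codon."""
    frames = []
    for offset in range(3):
        codons = (exon[i:i + 3] for i in range(offset, len(exon) - 2, 3))
        if not any(codon in STOP_CODONS for codon in codons):
            frames.append(offset)
    return frames


def find_sorfs(seq, min_len=MIN_EXON_LEN):
    """Yield (start, end, frames) for candidate exons seq[start:end].

    Assumption: a candidate is any stretch that directly follows an AG
    dinucleotide, directly precedes a GT dinucleotide, is longer than
    `min_len`, and is free of stop codons in at least one reading frame.
    """
    seq = seq.upper()
    # Candidate exon starts: the position just after each AG (acceptor side).
    acceptors = [i + 2 for i in range(len(seq) - 1) if seq[i:i + 2] == "AG"]
    # Candidate exon ends: the position of each GT (donor side).
    donors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "GT"]
    for start in acceptors:
        for end in donors:
            if end - start <= min_len:
                continue
            frames = open_frames(seq[start:end])
            if frames:
                yield start, end, frames


if __name__ == "__main__":
    # Toy sequence: a 63-nucleotide stop-free stretch between AG and GT.
    toy = "CCAG" + "GCC" * 21 + "GTCC"
    for start, end, frames in find_sorfs(toy):
        print(start, end, frames)
```

The nested loop over every acceptor and donor pair is quadratic and is kept only for readability. A practical tool would first score each splice site and discard weak ones, which is roughly where the splice-recognition and codon-usage statistics mentioned in the abstract would come in.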

Year:  1992        PMID: 1321415      PMCID: PMC312502          DOI: 10.1093/nar/20.13.3453

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


  16 in total

1.  Computer prediction of the exon-intron structure of mammalian pre-mRNAs.

Authors:  M S Gelfand
Journal:  Nucleic Acids Res       Date:  1990-10-11       Impact factor: 16.971

2.  K-tuple frequency analysis: from intron/exon discrimination to T-cell epitope mapping.

Authors:  J M Claverie; I Sauvaget; L Bougueleret
Journal:  Methods Enzymol       Date:  1990       Impact factor: 1.600

3.  Consensus patterns in DNA.

Authors:  G D Stormo
Journal:  Methods Enzymol       Date:  1990       Impact factor: 1.600

4.  Finding protein coding regions in genomic sequences.

Authors:  R Staden
Journal:  Methods Enzymol       Date:  1990       Impact factor: 1.600

5.  Complexity charts can be used to map functional domains in DNA.

Authors:  A K Konopka; J Owens
Journal:  Genet Anal Tech Appl       Date:  1990-04

6.  A catalogue of splice junction sequences.

Authors:  S M Mount
Journal:  Nucleic Acids Res       Date:  1982-01-22       Impact factor: 16.971

7.  Recognition of protein coding regions in DNA sequences.

Authors:  J W Fickett
Journal:  Nucleic Acids Res       Date:  1982-09-11       Impact factor: 16.971

8.  Prediction of human mRNA donor and acceptor sites from the DNA sequence.

Authors:  S Brunak; J Engelbrecht; S Knudsen
Journal:  J Mol Biol       Date:  1991-07-05       Impact factor: 5.469

9.  Selection of DNA binding sites by regulatory proteins. II. The binding specificity of cyclic AMP receptor protein to recognition sites.

Authors:  O G Berg; P H von Hippel
Journal:  J Mol Biol       Date:  1988-04-20       Impact factor: 5.469

10.  Human pre-mRNA splicing signals.

Authors:  F E Penotti
Journal:  J Theor Biol       Date:  1991-06-07       Impact factor: 2.691

  8 in total

Review 1.  Current methods of gene prediction, their strengths and weaknesses.

Authors:  Catherine Mathé; Marie-France Sagot; Thomas Schiex; Pierre Rouzé
Journal:  Nucleic Acids Res       Date:  2002-10-01       Impact factor: 16.971

2.  Construction and analysis of an hn-cDNA library derived from the p-arm of pig chromosome 12.

Authors:  D V Anderson Dear; J R Miller
Journal:  Mamm Genome       Date:  1996-09       Impact factor: 2.957

3.  Fission yeast gene structure and recognition.

Authors:  M Q Zhang; T G Marr
Journal:  Nucleic Acids Res       Date:  1994-05-11       Impact factor: 16.971

4.  Predicting internal exons by oligonucleotide composition and discriminant analysis of spliceable open reading frames.

Authors:  V V Solovyev; A A Salamov; C B Lawrence
Journal:  Nucleic Acids Res       Date:  1994-12-11       Impact factor: 16.971

5.  Intrinsic and extrinsic approaches for detecting genes in a bacterial genome.

Authors:  M Borodovsky; K E Rudd; E V Koonin
Journal:  Nucleic Acids Res       Date:  1994-11-11       Impact factor: 16.971

6.  Positional cloning of ZNF217 and NABC1: genes amplified at 20q13.2 and overexpressed in breast carcinoma.

Authors:  C Collins; J M Rommens; D Kowbel; T Godfrey; M Tanner; S I Hwang; D Polikoff; G Nonet; J Cochran; K Myambo; K E Jay; J Froula; T Cloutier; W L Kuo; P Yaswen; S Dairkee; J Giovanola; G B Hutchinson; J Isola; O P Kallioniemi; M Palazzolo; C Martin; C Ericsson; D Pinkel; D Albertson; W B Li; J W Gray
Journal:  Proc Natl Acad Sci U S A       Date:  1998-07-21       Impact factor: 11.205

Review 7.  Computational Identification of Novel Genes: Current and Future Perspectives.

Authors:  Steffen Klasberg; Tristan Bitard-Feildel; Ludovic Mallet
Journal:  Bioinform Biol Insights       Date:  2016-08-01

Review 8.  A brief review of computational gene prediction methods.

Authors:  Zhuo Wang; Yazhu Chen; Yixue Li
Journal:  Genomics Proteomics Bioinformatics       Date:  2004-11       Impact factor: 7.691
