Gene prediction by spectral rotation measure: a new method for identifying protein-coding regions.

Daniel Kotlar, Yizhar Lavner.

Abstract

A new measure for gene prediction in eukaryotes is presented. The measure is based on the Discrete Fourier Transform (DFT) phase at a frequency of 1/3, computed for the four binary sequences for A, T, C, and G. Analysis of all the experimental genes of S. cerevisiae revealed distribution of the phase in a bell-like curve around a central value, in all four nucleotides, whereas the distribution of the phase in the noncoding regions was found to be close to uniform. Similar findings were obtained for other organisms. Several measures based on the phase property are proposed. The measures are computed by clockwise rotation of the vectors, obtained by DFT for each analysis frame, by an angle equal to the corresponding central value. In protein coding regions, this rotation is assumed to closely align all vectors in the complex plane, thereby amplifying the magnitude of the vector sum. In noncoding regions, this operation does not significantly change this magnitude. Computing the measures with one chromosome and applying them on sequences of others reveals improved performance compared with other algorithms that use the 1/3 frequency feature, especially in short exons. The phase property is also used to find the reading frame of the sequence.
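The rotation idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`dft_third`, `phases`, `rotation_measure`) and the toy sequence are hypothetical, the central phase values are taken from the toy sequence itself rather than estimated from real genes, and any weighting of the four vectors is omitted.

```python
import cmath
import math

BASES = "ATCG"

def dft_third(seq, base):
    """DFT coefficient at frequency 1/3 of the binary indicator sequence of `base`."""
    w = -2j * math.pi / 3
    return sum(cmath.exp(w * n) for n, ch in enumerate(seq) if ch == base)

def phases(seq):
    """Phase (argument) of the 1/3-frequency coefficient for each nucleotide."""
    return {b: cmath.phase(dft_third(seq, b)) for b in BASES}

def rotation_measure(seq, central):
    """Rotate each vector clockwise by its central angle, sum, and take |.|^2.

    `central` maps each base to an assumed central phase value; in practice
    these would be estimated from known coding regions (assumption here).
    """
    total = sum(dft_third(seq, b) * cmath.exp(-1j * central[b]) for b in BASES)
    return abs(total) ** 2

# Toy frame with perfect period-3 structure (illustrative only).
frame = "ATG" * 10
central = phases(frame)                      # stand-in for trained central values
aligned = rotation_measure(frame, central)   # vectors align after rotation
unrotated = rotation_measure(frame, {b: 0.0 for b in BASES})
```

In this toy case the three nonzero vectors cancel when summed without rotation and add constructively once each is rotated by its own central angle, which is the amplification effect the abstract attributes to coding regions. Shifting the frame by one base multiplies every coefficient by exp(-2πi/3), so the phases also carry information about the reading frame.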

Year: 2003    PMID: 12869578    PMCID: PMC403785    DOI: 10.1101/gr.1261703

Source DB: PubMed    Journal: Genome Res    ISSN: 1088-9051    Impact factor: 9.043


