
Protein-coding regions prediction combining similarity searches and conservative evolutionary properties of protein-coding sequences.

I B Rogozin, D D'Angelo, L Milanesi.

Abstract

Identifying genes in a completely new sequence that has no good homology with known protein sequences can be a very complex task. To identify protein-coding regions in such cases, a new method, 'SYNCOD', based on the analysis of conservative evolutionary properties of coding regions, has been implemented. The program identifies and uses coding-region homologies with non-annotated (unknown) protein-coding sequences already present in the nucleotide sequence databases, working from the alignments produced by BLASTN. For each open reading frame, the ratio of the number of mismatches resulting in synonymous codons to the number of mismatches resulting in non-synonymous codons is estimated. Monte Carlo simulations are then used to estimate the significance of the deviation of this ratio from random behavior. The SYNCOD program has been tested on generated random sequences and on different control sets. The approach shows high accuracy in predicting protein-coding regions (the correlation coefficient, CC, varies from 0.67 to 0.79) and high specificity (the proportion of wrong exons, WE, varies from 0.06 to 0.07). The SYNCOD program is resident on the ITBA-CNR Web Server and can be used via the Internet (URL: www.itba.mi.cnr.it/webgene).
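
The synonymous/non-synonymous comparison and its Monte Carlo significance test can be illustrated with a minimal Python sketch. This is not the SYNCOD implementation. It assumes an ungapped pairwise alignment (such as a single BLASTN segment with gaps removed), counts differing codons rather than individual mismatches, uses the synonymous fraction in place of the ratio to avoid division by zero, and builds the random model by shuffling alignment columns. The function names `syn_nonsyn`, `syn_fraction` and `monte_carlo_p` are hypothetical, and the abstract does not specify the simulation scheme actually used.

```python
import random
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def syn_nonsyn(query, subject, frame=0):
    """Count differing codon pairs that are synonymous / non-synonymous.

    Assumes `query` and `subject` are aligned, ungapped and of equal length.
    """
    assert len(query) == len(subject)
    syn = nonsyn = 0
    for i in range(frame, len(query) - 2, 3):
        q, s = query[i:i + 3], subject[i:i + 3]
        # Skip identical codons and codons with ambiguous bases.
        if q == s or q not in CODON_TABLE or s not in CODON_TABLE:
            continue
        if CODON_TABLE[q] == CODON_TABLE[s]:
            syn += 1
        else:
            nonsyn += 1
    return syn, nonsyn


def syn_fraction(syn, nonsyn):
    """Synonymous share of differing codons (0.0 when there are none)."""
    total = syn + nonsyn
    return syn / total if total else 0.0


def monte_carlo_p(query, subject, frame=0, trials=1000, seed=0):
    """Estimate how often shuffled alignments look at least as 'coding'.

    Shuffling columns keeps base composition and the number of mismatches
    but removes any codon-position structure (an assumed null model).
    """
    rng = random.Random(seed)
    observed = syn_fraction(*syn_nonsyn(query, subject, frame))
    columns = list(zip(query, subject))
    hits = 0
    for _ in range(trials):
        rng.shuffle(columns)
        q = "".join(c[0] for c in columns)
        s = "".join(c[1] for c in columns)
        if syn_fraction(*syn_nonsyn(q, s, frame)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (trials + 1)
```

Running `monte_carlo_p` once per reading frame and keeping frames with a small estimated probability would mimic the per-ORF evaluation described above; any threshold is a choice left to the user.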

Year:  1999        PMID: 9889348     DOI: 10.1016/s0378-1119(98)00509-5

Source DB:  PubMed          Journal:  Gene        ISSN: 0378-1119            Impact factor:   3.688


  8 in total

1.  Gene structure prediction in syntenic DNA segments.

Authors:  Jonathan E Moore; James A Lake
Journal:  Nucleic Acids Res       Date:  2003-12-15       Impact factor: 16.971

2.  Current methods of gene prediction, their strengths and weaknesses. (Review)

Authors:  Catherine Mathé; Marie-France Sagot; Thomas Schiex; Pierre Rouzé
Journal:  Nucleic Acids Res       Date:  2002-10-01       Impact factor: 16.971

3.  The complete genome of hyperthermophile Methanopyrus kandleri AV19 and monophyly of archaeal methanogens.

Authors:  Alexei I Slesarev; Katja V Mezhevaya; Kira S Makarova; Nikolai N Polushin; Olga V Shcherbinina; Vera V Shakhova; Galina I Belova; L Aravind; Darren A Natale; Igor B Rogozin; Roman L Tatusov; Yuri I Wolf; Karl O Stetter; Andrei G Malykh; Eugene V Koonin; Sergei A Kozyavkin
Journal:  Proc Natl Acad Sci U S A       Date:  2002-04-02       Impact factor: 11.205

4.  Testing the coding potential of conserved short genomic sequences.

Authors:  Jing Wu
Journal:  Adv Bioinformatics       Date:  2010-03-08

5.  Negative correlation between expression level and evolutionary rate of long intergenic noncoding RNAs.

Authors:  David Managadze; Igor B Rogozin; Diana Chernikova; Svetlana A Shabalina; Eugene V Koonin
Journal:  Genome Biol Evol       Date:  2011-11-09       Impact factor: 3.416

6.  Evolutionary conservation suggests a regulatory function of AUG triplets in 5'-UTRs of eukaryotic genes.

Authors:  Alexander Churbanov; Igor B Rogozin; Vladimir N Babenko; Hesham Ali; Eugene V Koonin
Journal:  Nucleic Acids Res       Date:  2005-09-26       Impact factor: 16.971

7.  Volatile Evolution of Long Non-Coding RNA Repertoire in Retinal Pigment Epithelium: Insights from Comparison of Bovine and Human RNA Expression Profiles.

Authors:  Olga A Postnikova; Igor B Rogozin; William Samuel; German Nudelman; Vladimir N Babenko; Eugenia Poliakov; T Michael Redmond
Journal:  Genes (Basel)       Date:  2019-03-08       Impact factor: 4.096

8.  Accurate discrimination of conserved coding and non-coding regions through multiple indicators of evolutionary dynamics.

Authors:  Matteo Rè; Graziano Pesole; David S Horner
Journal:  BMC Bioinformatics       Date:  2009-09-08       Impact factor: 3.169
