
A comparative genomic method for computational identification of prokaryotic translation initiation sites.

Megon Walker, Vladimir Pavlovic, Simon Kasif.

Abstract

The ever-growing number of completely sequenced prokaryotic genomes facilitates cross-species comparisons by genomic annotation algorithms. This paper introduces a new probabilistic framework for comparative genomic analysis and demonstrates its utility in improving the accuracy of prokaryotic gene start site detection. Our framework employs a product hidden Markov model (PROD-HMM) with a state architecture designed to model the species-specific trinucleotide frequency patterns in sequences immediately upstream and downstream of a translation start site, and to detect the contrasting non-synonymous (amino acid changing) and synonymous (silent) substitution rates that differentiate prokaryotic coding from intergenic regions. Depending on the intricacy of the features modeled by the hidden state architecture, this method can delimit intergenic, regulatory, promoter and coding regions. The new system is evaluated on a preliminary set of orthologous Pyrococcus gene pairs, for which it demonstrates improved detection accuracy. Its robustness is confirmed by cross-validated analysis of an experimentally verified set of Escherichia coli K-12 and Salmonella typhimurium LT2 orthologs. The novel architecture has a number of attractive features that distinguish it from previous comparative models such as pair-HMMs.
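The substitution signal described in the abstract can be illustrated with a minimal sketch. This is not the PROD-HMM itself; it only counts synonymous versus non-synonymous codon differences between two orthologous sequences, assuming they are already aligned without gaps and that the standard codon-to-amino-acid mapping applies. The function name `count_substitutions` and its `frame` parameter are hypothetical and chosen for illustration.

```python
from itertools import product

# Standard codon table, built in TCAG order (assumption: the standard
# codon-to-amino-acid mapping is adequate for the genomes compared).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def count_substitutions(seq_a, seq_b, frame=0):
    """Count (synonymous, non-synonymous) codon differences.

    Assumes seq_a and seq_b are ungapped, position-aligned DNA strings.
    Codons containing characters outside TCAG are skipped.
    """
    syn = nonsyn = 0
    n = min(len(seq_a), len(seq_b))
    for i in range(frame, n - 2, 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue  # identical codons carry no substitution
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue  # ambiguous bases: ignore this codon pair
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            syn += 1      # silent change
        else:
            nonsyn += 1   # amino acid changing
    return syn, nonsyn


if __name__ == "__main__":
    a = "ATGGCTAAAGGT"
    b = "ATGGCCAAGGGA"
    for frame in range(3):
        print(frame, count_substitutions(a, b, frame))
```

Read in the correct frame, a coding region under purifying selection tends to show mostly silent differences, while the same differences read in a shifted frame, or differences in an intergenic region, show no such bias. A comparative model can exploit this contrast to help place the boundary between intergenic and coding sequence.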

Year: 2002    PMID: 12136100    PMCID: PMC135744    DOI: 10.1093/nar/gkf423

Source DB: PubMed    Journal: Nucleic Acids Res    ISSN: 0305-1048    Impact factor: 16.971


