Hon-yaku: a biology-driven Bayesian methodology for identifying translation initiation sites in prokaryotes.

Yuko Makita, Michiel J L de Hoon, Antoine Danchin.

Abstract

BACKGROUND: Computational prediction methods are currently used to identify genes in prokaryotic genomes. However, identifying the correct translation initiation sites (TISs) remains a difficult task. Accurate TISs are important not only for annotating unknown proteins but also for predicting operons, promoters, and small non-coding RNA genes, since these predictions typically rely on intergenic distance. A further problem is that most existing methods are optimized for Escherichia coli data sets; applied to newly sequenced bacterial genomes, they may not reach an equivalent level of accuracy.
RESULTS: Based on a biological representation of the translation process, we applied Bayesian statistics to create a score function for predicting translation initiation sites. In contrast to existing programs, our combination of methods uses supervised learning to make optimal use of the set of known translation initiation sites. The Bayesian scoring function combines the ribosome binding site (RBS) sequence, the distance between the translation initiation site and the RBS sequence, the base composition of the start codon, the nucleotide composition (A-rich sequences) following the start codon, and the expected distribution of protein length. To further increase prediction accuracy, we also took the operon orientation into account. The procedure achieved a prediction accuracy of 93.2% on 858 E. coli genes from the EcoGene data set and 92.7% on a data set of 1243 Bacillus subtilis 'non-y' genes. We confirmed the performance in the GC-rich Gamma-Proteobacteria Herminiimonas arsenicoxydans, Pseudomonas aeruginosa, and Burkholderia pseudomallei K96243.
CONCLUSION: Hon-yaku, which is based on a careful choice of elements important in translation, improved prediction accuracy for the B. subtilis data sets and for other bacteria, but not for E. coli. We believe that most remaining mispredictions are due either to atypical ribosome binding sequences used in specific translation control processes or to likely errors in the training data sets.
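
ILLUSTRATION: The abstract does not give the form of the scoring function, so the following is only a minimal sketch of how evidence of the kind listed under RESULTS might be combined as a sum of log-odds terms, under a naive independence assumption. It is not the Hon-yaku implementation. Every probability, the consensus motif AGGAGG, the preferred spacer length, and all function names are hypothetical values chosen for illustration; the protein-length and operon-orientation terms are omitted for brevity.

```python
import math

# All numbers below are illustrative assumptions, not values from the paper.
START_CODON_P = {"ATG": 0.83, "GTG": 0.14, "TTG": 0.03}  # hypothetical usage
BACKGROUND_CODON_P = 1.0 / 64   # uniform background over all codons
BACKGROUND_BASE_P = 0.25        # uniform background over bases


def start_codon_score(codon):
    """Log-odds of seeing this codon at a true start versus background."""
    p = START_CODON_P.get(codon)
    if p is None:
        return float("-inf")    # not an allowed start codon
    return math.log(p / BACKGROUND_CODON_P)


def rbs_score(upstream, motif="AGGAGG", optimal_spacer=8, spacer_sd=2.0):
    """Best RBS evidence in the region immediately 5' of the start codon.

    Each window is scored by per-base log-odds against the assumed motif,
    plus a Gaussian-shaped penalty on the spacer between the window and
    the start codon.
    """
    match = math.log(0.7 / BACKGROUND_BASE_P)     # assumed P(consensus base)
    mismatch = math.log(0.1 / BACKGROUND_BASE_P)  # assumed P(other base)
    best = None
    for i in range(len(upstream) - len(motif) + 1):
        window = upstream[i:i + len(motif)]
        n_match = sum(a == b for a, b in zip(window, motif))
        spacer = len(upstream) - (i + len(motif))
        score = n_match * match + (len(motif) - n_match) * mismatch
        score -= (spacer - optimal_spacer) ** 2 / (2 * spacer_sd ** 2)
        if best is None or score > best:
            best = score
    # Region too short to hold the motif: treat as no evidence either way.
    return 0.0 if best is None else best


def a_rich_score(downstream, p_a=0.35):
    """Log-odds for A-richness in the bases following the start codon."""
    n_a = downstream.count("A")
    n_other = len(downstream) - n_a
    return (n_a * math.log(p_a / BACKGROUND_BASE_P)
            + n_other * math.log(((1 - p_a) / 3) / BACKGROUND_BASE_P))


def score_candidate(seq, pos, upstream_len=20, downstream_len=15):
    """Sum the independent log-odds terms for a candidate start at pos."""
    codon = seq[pos:pos + 3]
    upstream = seq[max(0, pos - upstream_len):pos]
    downstream = seq[pos + 3:pos + 3 + downstream_len]
    return (start_codon_score(codon)
            + rbs_score(upstream)
            + a_rich_score(downstream))


def best_start(seq, candidates):
    """Return the candidate position with the highest combined score."""
    return max(candidates, key=lambda pos: score_candidate(seq, pos))


if __name__ == "__main__":
    # Toy sequence: a consensus motif 8 nt upstream of an ATG at index 19,
    # and an in-frame TTG at index 34 with no motif in front of it.
    seq = ("CCCCC" + "AGGAGG" + "CCTCCTCC" + "ATG"
           + "AAAGCAAAAGCA" + "TTG" + "CCGCCGCCGCCGCCG")
    print(best_start(seq, [19, 34]))
```

In a supervised setting such as the one the abstract describes, the hard-coded probabilities above would instead be estimated from a set of known translation initiation sites.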

Year:  2007        PMID: 17286872      PMCID: PMC1805508          DOI: 10.1186/1471-2105-8-47

Source DB:  PubMed          Journal:  BMC Bioinformatics        ISSN: 1471-2105            Impact factor:   3.169


