On the convergence of a clustering algorithm for protein-coding regions in microbial genomes.

Abstract

MOTIVATION: As the number of fully sequenced prokaryotic genomes continues to grow rapidly, computational methods for reliably detecting protein-coding regions become even more important. Audic and Claverie (1998, Proc. Natl Acad. Sci. USA, 95, 10026-10031) proposed a clustering algorithm for protein-coding regions in microbial genomes. The algorithm is based on three Markov models of order k associated with subsequences extracted from a given genome. The algorithm recursively updates the parameters of the three Markov models and, in simulations, always appears to converge to a unique stable partition of the genome. The partition corresponds to three kinds of regions: (1) coding on the direct strand, (2) coding on the complementary strand, and (3) non-coding.
RESULTS: Here we provide an explanation for the convergence of the algorithm by observing that it is essentially a form of the expectation maximization (EM) algorithm applied to the corresponding mixture model. We also provide a partial justification for the uniqueness of the partition based on identifiability. Other possible variations and improvements are briefly discussed.
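
The iteration described in the abstract can be illustrated with a minimal sketch of a hard-assignment (classification-style) EM loop over a mixture of order-k Markov chains. This is not the implementation of Audic and Claverie or of the paper: the function names (`fit_markov`, `log_likelihood`, `cluster_windows`), the use of fixed sequence windows, the random initial labels, and the add-one pseudocount are all assumptions made for illustration.

```python
import math
import random
from collections import defaultdict

ALPHABET = "ACGT"


def fit_markov(seqs, k, pseudo=1.0):
    """Estimate an order-k Markov model from sequences (hypothetical helper).

    Returns a function logp(context, symbol) giving the smoothed
    log-probability of `symbol` following the k-letter `context`.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for s in seqs:
        for i in range(k, len(s)):
            counts[s[i - k:i]][s[i]] += 1.0

    def logp(ctx, ch):
        row = counts.get(ctx, {})  # unseen contexts fall back to uniform
        total = sum(row.values()) + pseudo * len(ALPHABET)
        return math.log((row.get(ch, 0.0) + pseudo) / total)

    return logp


def log_likelihood(seq, logp, k):
    """Log-likelihood of seq under the model, conditioning on its first k letters."""
    return sum(logp(seq[i - k:i], seq[i]) for i in range(k, len(seq)))


def cluster_windows(windows, k=1, n_models=3, n_iter=20, seed=0):
    """Alternate between refitting the models and reassigning the windows.

    Assumption: labels start random and each window is assigned to exactly
    one model (hard assignment) rather than weighted by posterior probability.
    """
    rng = random.Random(seed)
    labels = [rng.randrange(n_models) for _ in windows]
    for _ in range(n_iter):
        # M-like step: refit each model on the windows currently assigned to it.
        models = [
            fit_markov([w for w, z in zip(windows, labels) if z == m], k)
            for m in range(n_models)
        ]
        # E-like step (hard): move each window to its most likely model.
        new_labels = [
            max(range(n_models), key=lambda m: log_likelihood(w, models[m], k))
            for w in windows
        ]
        if new_labels == labels:  # stable partition reached
            break
        labels = new_labels
    return labels
```

Calling `cluster_windows(windows, k=1)` on a list of equal-length DNA strings returns one label in `{0, 1, 2}` per window. Which label ends up corresponding to direct-strand coding, complementary-strand coding, or non-coding sequence is not determined by this sketch and would have to be worked out afterwards.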

Substances: Proteins

Year: 2000 PMID: 10869034 DOI: 10.1093/bioinformatics/16.4.367

Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937

Cited

5 in total

1. GeneMarkS: a self-training method for prediction of gene starts in microbial genomes. Implications for finding sequence motifs in regulatory regions.

Authors: J Besemer; A Lomsadze; M Borodovsky
Journal: Nucleic Acids Res Date: 2001-06-15 Impact factor: 16.971

2. Prokaryotic gene finding based on physicochemical characteristics of codons calculated from molecular dynamics simulations.

Authors: Poonam Singhal; B Jayaram; Surjit B Dixit; David L Beveridge
Journal: Biophys J Date: 2008-03-07 Impact factor: 4.033

3. MetaGene: prokaryotic gene finding from environmental genome shotgun sequences.

Authors: Hideki Noguchi; Jungho Park; Toshihisa Takagi
Journal: Nucleic Acids Res Date: 2006-10-05 Impact factor: 16.971

4. Gene identification in novel eukaryotic genomes by self-training algorithm.

Authors: Alexandre Lomsadze; Vardges Ter-Hovhannisyan; Yury O Chernoff; Mark Borodovsky
Journal: Nucleic Acids Res Date: 2005-11-28 Impact factor: 16.971

5. MetaGeneAnnotator: detecting species-specific patterns of ribosomal binding site for precise gene prediction in anonymous prokaryotic and phage genomes.

Authors: Hideki Noguchi; Takeaki Taniguchi; Takehiko Itoh
Journal: DNA Res Date: 2008-10-21 Impact factor: 4.458
