Pierre Nicolas, Anne-Sophie Tocquet, Vincent Miele, Florence Muri.
Abstract
Effective probabilistic modeling approaches have been developed to find motifs of biological function in DNA sequences. However, the problem of automated model choice remains largely open and becomes more pressing as the number of sequences to be analyzed keeps increasing. Here we propose a reversible jump Markov chain Monte Carlo algorithm for estimating both the parameters and the model dimension of a Bayesian hidden semi-Markov model dedicated to bacterial promoter motif discovery. Bacterial promoters are complex motifs composed of two boxes separated by a spacer of variable but constrained length, and they occur close to the protein translation start site. The algorithm allows simultaneous estimation of the width of the boxes, of the support size of the spacer length distribution, and of the order of the Markovian model used for the "background" nucleotide composition. Applying this method to three sequence sets shows the good behavior of the algorithm and the biological relevance of the estimated promoter motifs.
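The two-box structure described in the abstract can be illustrated with a small scoring sketch. This is not the authors' reversible jump sampler or their hidden semi-Markov model. It only shows, under simplifying assumptions, how a candidate site could be scored when the box widths, the spacer support, and the background order are already fixed. The assumptions are: an order-0 background, spacer nucleotides emitted by the background (so they cancel in the log-odds), and position-specific box models built by a hypothetical helper `consensus_pwm`. All function names and numeric values are illustrative.

```python
import math

NUCLEOTIDES = "ACGT"


def consensus_pwm(word, p=0.85):
    """Hypothetical helper: one column per letter, probability p on the
    consensus nucleotide and (1 - p) / 3 on each of the others."""
    return [{nt: (p if nt == c else (1.0 - p) / 3.0) for nt in NUCLEOTIDES}
            for c in word]


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a non-empty list."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def box_log_odds(segment, pwm, background):
    """Log-odds of a box model versus an order-0 background."""
    return sum(math.log(col[nt] / background[nt])
               for col, nt in zip(pwm, segment))


def promoter_log_odds(seq, start, box1, box2, spacer_probs, background):
    """Log-odds of a two-box motif whose first box begins at `start`.

    spacer_probs maps each allowed spacer length to its probability
    (a distribution with finite support). The spacer length is summed out.
    Returns -inf if no allowed placement fits inside the sequence.
    """
    w1, w2 = len(box1), len(box2)
    if start < 0 or start + w1 > len(seq):
        return -math.inf
    first = box_log_odds(seq[start:start + w1], box1, background)
    terms = []
    for d, p in spacer_probs.items():
        s2 = start + w1 + d  # start of the second box for spacer length d
        if p > 0 and s2 + w2 <= len(seq):
            second = box_log_odds(seq[s2:s2 + w2], box2, background)
            terms.append(math.log(p) + second)
    if not terms:
        return -math.inf
    return first + log_sum_exp(terms)


if __name__ == "__main__":
    background = {nt: 0.25 for nt in NUCLEOTIDES}
    box1 = consensus_pwm("TTGACA")   # illustrative consensus words
    box2 = consensus_pwm("TATAAT")
    spacer_probs = {16: 0.25, 17: 0.5, 18: 0.25}
    seq = "TTGACA" + "A" * 17 + "TATAAT"
    print(promoter_log_odds(seq, 0, box1, box2, spacer_probs, background))
```

In the method the abstract describes, the box widths, the spacer support size, and the background order are themselves estimated rather than fixed as they are here.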
Year: 2006 PMID: 16706717 DOI: 10.1089/cmb.2006.13.651
Source DB: PubMed Journal: J Comput Biol ISSN: 1066-5277 Impact factor: 1.479