Modeling promoter grammars with evolving hidden Markov models.

Kyoung-Jae Won, Albin Sandelin, Troels Torben Marstrand, Anders Krogh.

Abstract

MOTIVATION: Describing and modeling biological features of eukaryotic promoters remains an important and challenging problem within computational biology. The promoters of higher eukaryotes in particular display a wide variation in regulatory features, which are difficult to model. Often several factors are involved in the regulation of a set of co-regulated genes. If so, promoters can be modeled with connected regulatory features, where the network of connections is characteristic for a particular mode of regulation.
RESULTS: With the goal of automatically deciphering such regulatory structures, we present a method that iteratively evolves an ensemble of regulatory grammars using a hidden Markov model (HMM) architecture composed of interconnected blocks representing transcription factor binding sites (TFBSs) and background regions of promoter sequences. The ensemble approach reduces the risk of overfitting and generally improves performance. We apply this method to identify TFBSs and to classify promoters preferentially expressed in macrophages, where it outperforms other methods due to the increased predictive power given by the grammar.
AVAILABILITY: The software and the datasets are available from http://modem.ucsd.edu/won/eHMM.tar.gz
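
ILLUSTRATION (not part of the original abstract): The block architecture described under RESULTS can be pictured with a deliberately small sketch: one background state connected to one TFBS block, where the block is a chain of position states with their own emission probabilities. This is not the authors' eHMM software and omits the evolutionary search and the ensemble entirely. The function names (`consensus_pwm`, `build_hmm`, `viterbi`, `find_sites`), the toy TATA motif, and all probabilities below are assumptions chosen for illustration.

```python
import math

BASES = "ACGT"

def consensus_pwm(consensus, p_match=0.85):
    """Toy position weight matrix: p_match on the consensus base, rest shared."""
    p_other = (1.0 - p_match) / 3.0
    return [{b: (p_match if b == c else p_other) for b in BASES}
            for c in consensus]

def build_hmm(pwm, p_enter=0.05):
    """State 0 is background; states 1..L form one TFBS block (hypothetical layout)."""
    L = len(pwm)
    n = L + 1
    emit = [{b: 0.25 for b in BASES}] + [dict(col) for col in pwm]
    trans = [[0.0] * n for _ in range(n)]
    trans[0][0] = 1.0 - p_enter   # stay in background
    trans[0][1] = p_enter         # enter the TFBS block
    for i in range(1, L):
        trans[i][i + 1] = 1.0     # walk through the block position by position
    trans[L][0] = 1.0             # leave the block, back to background
    start = [1.0 - p_enter, p_enter] + [0.0] * (L - 1)
    return start, trans, emit

def _log(x):
    return math.log(x) if x > 0 else float("-inf")

def viterbi(seq, start, trans, emit):
    """Most probable state path for seq, computed in log space."""
    n = len(start)
    v = [_log(start[s]) + _log(emit[s][seq[0]]) for s in range(n)]
    back = []
    for ch in seq[1:]:
        ptr, nv = [], []
        for s in range(n):
            best = max(range(n), key=lambda r: v[r] + _log(trans[r][s]))
            ptr.append(best)
            nv.append(v[best] + _log(trans[best][s]) + _log(emit[s][ch]))
        back.append(ptr)
        v = nv
    state = max(range(n), key=lambda s: v[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]

def find_sites(seq, pwm, p_enter=0.05):
    """0-based start positions where the decoded path enters the TFBS block."""
    path = viterbi(seq, *build_hmm(pwm, p_enter))
    return [i for i, s in enumerate(path) if s == 1]

if __name__ == "__main__":
    print(find_sites("GCGCTATAGCGC", consensus_pwm("TATA")))
```

A grammar in the sense of the abstract would connect several such blocks through background states; the sketch keeps a single block so that the decoding step stays easy to follow.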

Year:  2008        PMID: 18535083     DOI: 10.1093/bioinformatics/btn254

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  7 in total

1.  Efficient algorithms for training the parameters of hidden Markov models using stochastic expectation maximization (EM) training and Viterbi training.

Authors:  Tin Y Lam; Irmtraud M Meyer
Journal:  Algorithms Mol Biol       Date:  2010-12-09       Impact factor: 1.405

2.  Multivariate Hawkes process models of the occurrence of regulatory elements.

Authors:  Lisbeth Carstensen; Albin Sandelin; Ole Winther; Niels R Hansen
Journal:  BMC Bioinformatics       Date:  2010-09-09       Impact factor: 3.169

3.  The construction and use of log-odds substitution scores for multiple sequence alignment.

Authors:  Stephen F Altschul; John C Wootton; Elena Zaslavsky; Yi-Kuo Yu
Journal:  PLoS Comput Biol       Date:  2010-07-15       Impact factor: 4.475

4.  An intuitionistic approach to scoring DNA sequences against transcription factor binding site motifs.

Authors:  Fernando Garcia-Alcalde; Armando Blanco; Adrian J Shepherd
Journal:  BMC Bioinformatics       Date:  2010-11-08       Impact factor: 3.169

5.  Computational models in plant-pathogen interactions: the case of Phytophthora infestans. (Review)

Authors:  Andrés Pinzón; Emiliano Barreto; Adriana Bernal; Luke Achenie; Andres F González Barrios; Raúl Isea; Silvia Restrepo
Journal:  Theor Biol Med Model       Date:  2009-11-12       Impact factor: 2.432

6.  Evolutionary mirages: selection on binding site composition creates the illusion of conserved grammars in Drosophila enhancers.

Authors:  Richard W Lusk; Michael B Eisen
Journal:  PLoS Genet       Date:  2010-01-22       Impact factor: 5.917

7.  Modeling tissue-specific structural patterns in human and mouse promoters.

Authors:  Alexis Vandenbon; Kenta Nakai
Journal:  Nucleic Acids Res       Date:  2009-10-22       Impact factor: 16.971
