Enhanced position weight matrices using mixture models.

Sridhar Hannenhalli, Li-San Wang.

Abstract

MOTIVATION: A positional weight matrix (PWM) is derived from a set of experimentally determined binding sites. Here we explore whether there exist subclasses of binding sites and whether a mixture of these subclass-PWMs can improve binding site prediction. Intuitively, the subclasses correspond either to distinct binding preferences of the same transcription factor in different contexts or to distinct subtypes of the transcription factor. RESULTS: We report an Expectation Maximization algorithm adapting the mixture model of Bailey and Elkan. We assessed the relative merit of using two subclass-PWMs. The resulting PWMs were evaluated with respect to preferential conservation (relative to mouse) of potential sites in human promoters and expression coherence of the potential target genes. Based on 64 JASPAR vertebrate PWMs, 61-81% of the cases resulted in higher conservation using the mixture model. Also, in 98% of the cases the expression coherence was higher for the target genes of one of the subclass-PWMs. Our analysis of Reb1 sites is consistent with subtypes previously discovered using independent methods. Additionally, application of our method to mutated sites for the transcription factor LEU3 reveals subclasses that segregate into strongly binding and weakly binding sites, with a P-value of 0.008. This is the first study that attempts to quantify the subtly different binding specificities of a transcription factor on a large scale, and it suggests using a mixture of PWMs for a transcription factor instead of the current practice of using a single PWM.
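
The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea: Expectation Maximization over a mixture of subclass-PWMs. It is not the authors' implementation. It assumes the binding sites are already aligned and of equal length, uses a fixed pseudocount, and starts from a random soft assignment; the function name `fit_pwm_mixture` and all of its parameters are hypothetical choices made for illustration.

```python
import math
import random

ALPHABET = "ACGT"


def fit_pwm_mixture(sites, k=2, n_iter=50, pseudocount=0.5, seed=0):
    """EM for a mixture of k PWMs over aligned, equal-length DNA sites.

    Returns (weights, pwms, resp): weights[c] is the mixing proportion of
    subclass c, pwms[c][j][base] is a base probability at position j, and
    resp[i][c] is the posterior probability that site i belongs to subclass c.
    """
    rng = random.Random(seed)
    n, width = len(sites), len(sites[0])

    # Random soft assignment to break the symmetry between subclasses.
    resp = []
    for _ in sites:
        row = [rng.random() + 1e-3 for _ in range(k)]
        total = sum(row)
        resp.append([r / total for r in row])

    for _ in range(n_iter):
        # M-step: mixing weights and responsibility-weighted base frequencies.
        weights = [sum(resp[i][c] for i in range(n)) / n for c in range(k)]
        pwms = []
        for c in range(k):
            pwm = []
            for j in range(width):
                counts = {b: pseudocount for b in ALPHABET}
                for i, site in enumerate(sites):
                    counts[site[j]] += resp[i][c]
                total = sum(counts.values())
                pwm.append({b: counts[b] / total for b in ALPHABET})
            pwms.append(pwm)

        # E-step: posterior over subclasses for each site, computed in log space.
        for i, site in enumerate(sites):
            logp = [
                math.log(max(weights[c], 1e-12))
                + sum(math.log(pwms[c][j][site[j]]) for j in range(width))
                for c in range(k)
            ]
            top = max(logp)
            unnorm = [math.exp(lp - top) for lp in logp]
            z = sum(unnorm)
            resp[i] = [u / z for u in unnorm]

    return weights, pwms, resp
```

Under these assumptions, a call such as `fit_pwm_mixture(sites, k=2)` on a list of equal-length strings over `ACGT` returns two subclass-PWMs, and each site can be assigned to the subclass with the larger entry in its row of `resp`. The returned weights and PWMs come from the last M-step, so they precede the final update of `resp` by one half-iteration.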

Year:  2005        PMID: 15961459     DOI: 10.1093/bioinformatics/bti1001

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  12 in total

1.  The multiple-specificity landscape of modular peptide recognition domains.

Authors:  David Gfeller; Frank Butty; Marta Wierzbicka; Erik Verschueren; Peter Vanhee; Haiming Huang; Andreas Ernst; Nisa Dar; Igor Stagljar; Luis Serrano; Sachdev S Sidhu; Gary D Bader; Philip M Kim
Journal:  Mol Syst Biol       Date:  2011-04-26       Impact factor: 11.429

2.  Bayesian Markov models consistently outperform PWMs at predicting motifs in nucleotide sequences.

Authors:  Matthias Siebert; Johannes Söding
Journal:  Nucleic Acids Res       Date:  2016-06-09       Impact factor: 16.971

3.  An extended set of PRDM1/BLIMP1 target genes links binding motif type to dynamic repression.

Authors:  Gina M Doody; Matthew A Care; Nicholas J Burgoyne; James R Bradford; Maria Bota; Constanze Bonifer; David R Westhead; Reuben M Tooze
Journal:  Nucleic Acids Res       Date:  2010-04-26       Impact factor: 16.971

4.  Finding subtypes of transcription factor motif pairs with distinct regulatory roles.

Authors:  Abha Singh Bais; Naftali Kaminski; Panayiotis V Benos
Journal:  Nucleic Acids Res       Date:  2011-04-12       Impact factor: 16.971

5.  Tree-based position weight matrix approach to model transcription factor binding site profiles.

Authors:  Yingtao Bi; Hyunsoo Kim; Ravi Gupta; Ramana V Davuluri
Journal:  PLoS One       Date:  2011-09-02       Impact factor: 3.240

6.  Construction of microRNA functional families by a mixture model of position weight matrices.

Authors:  Je-Keun Rhee; Soo-Yong Shin; Byoung-Tak Zhang
Journal:  PeerJ       Date:  2013-10-31       Impact factor: 2.984

7.  Searching for transcription factor binding sites in vector spaces.

Authors:  Chih Lee; Chun-Hsi Huang
Journal:  BMC Bioinformatics       Date:  2012-08-27       Impact factor: 3.169

8.  CTCF binding site classes exhibit distinct evolutionary, genomic, epigenomic and transcriptomic features.

Authors:  Kobby Essien; Sebastien Vigneau; Sofia Apreleva; Larry N Singh; Marisa S Bartolomei; Sridhar Hannenhalli
Journal:  Genome Biol       Date:  2009-11-18       Impact factor: 13.583

9.  TREMOR--a tool for retrieving transcriptional modules by incorporating motif covariance.

Authors:  Larry N Singh; Li-San Wang; Sridhar Hannenhalli
Journal:  Nucleic Acids Res       Date:  2007-10-25       Impact factor: 16.971

10.  CSI-Tree: a regression tree approach for modeling binding properties of DNA-binding molecules based on cognate site identification (CSI) data.

Authors:  Sündüz Keleş; Christopher L Warren; Clayton D Carlson; Aseem Z Ansari
Journal:  Nucleic Acids Res       Date:  2008-04-13       Impact factor: 16.971
