MOTIVATION: DNA-binding proteins play crucial roles in the regulation of gene expression. Transcription factors (TFs) activate or repress genes directly, while other proteins influence chromatin structure for transcription. Binding sites of a TF exhibit a similar sequence pattern called a motif. However, there is no one-to-one mapping between TFs and motifs. Many TFs in a protein family may recognize the same motif, with subtle nucleotide differences leading to different binding affinities. Additionally, a particular TF may bind different motifs under certain conditions, for example in the presence of different co-regulators. The availability of genome-wide binding data for multiple collaborative TFs makes it possible to detect such context-dependent motifs.
RESULTS: We developed a contrast motif finder (CMF) for the de novo identification of motifs that are differentially enriched in two sets of sequences. Applying this method to a number of TF binding datasets from mouse embryonic stem cells, we demonstrate that CMF achieves substantially higher accuracy than several well-known motif finding methods. By contrasting sequences bound by distinct sets of TFs, CMF identified two different motifs that may be recognized by Oct4 depending on the presence of another co-regulator, and detected subtle motif signals that may be associated with potential competitive binding between Sox2 and Tcf3.
AVAILABILITY: The software CMF is freely available for academic use at www.stat.ucla.edu/~zhou/CMF.
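To make "differentially enriched in two sets of sequences" concrete, here is a minimal sketch of a contrast score. It is not the CMF algorithm, whose details the abstract does not give; it is a toy stand-in that ranks fixed-length k-mers by the log-odds of occurring in one set versus the other. The function names, the pseudocount and the example sequences are all illustrative assumptions.

```python
import math
from collections import Counter


def kmer_presence(seqs, k):
    """Count, for each k-mer, how many sequences contain it at least once."""
    counts = Counter()
    for s in seqs:
        s = s.upper()
        # Use a set so each sequence contributes at most once per k-mer.
        counts.update({s[i:i + k] for i in range(len(s) - k + 1)})
    return counts


def contrast_kmers(set_a, set_b, k=6, pseudo=0.5):
    """Rank k-mers by log-odds of occurring in set_a versus set_b.

    Toy illustration only (hypothetical helper, not part of CMF).
    The pseudocount keeps the odds finite when a k-mer is absent
    from, or present in, every sequence of a set.
    """
    ca, cb = kmer_presence(set_a, k), kmer_presence(set_b, k)
    na, nb = len(set_a), len(set_b)
    scores = {}
    for kmer in set(ca) | set(cb):
        pa = (ca[kmer] + pseudo) / (na + 2 * pseudo)
        pb = (cb[kmer] + pseudo) / (nb + 2 * pseudo)
        scores[kmer] = math.log(pa / (1 - pa)) - math.log(pb / (1 - pb))
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Made-up sequences: "TTGCAT" is planted in every sequence of set_a only.
set_a = ["ACGTTGCATG", "TTGCATGACC", "GGTTGCATAA"]
set_b = ["ACGTACGTAC", "CCGGAATTCC", "GATTACAGAT"]
ranked = contrast_kmers(set_a, set_b, k=6)
top_kmer, top_score = ranked[0]
print(top_kmer, round(top_score, 2))
```

A real contrast motif finder would model degenerate positions, for instance with a position weight matrix, rather than exact k-mers. Exact matching is used here only to keep the idea of contrasting two sequence sets visible.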
Authors: Attila Reményi; Katharina Lins; L Johan Nissen; Rolland Reinbold; Hans R Schöler; Matthias Wilmanns Journal: Genes Dev Date: 2003-08-15 Impact factor: 11.361
Authors: Dean Tantin; Matthew Gemberling; Catherine Callister; William G Fairbrother Journal: Genome Res Date: 2008-01-22 Impact factor: 9.043
Authors: Rupa Sridharan; Jason Tchieu; Mike J Mason; Robin Yachechko; Edward Kuoy; Steve Horvath; Qing Zhou; Kathrin Plath Journal: Cell Date: 2009-01-23 Impact factor: 41.582
Authors: Martin Vingron; Alvis Brazma; Richard Coulson; Jacques van Helden; Thomas Manke; Kimmo Palin; Olivier Sand; Esko Ukkonen Journal: Genome Biol Date: 2009-01-30 Impact factor: 13.583
Authors: William T Chiu; Rebekah Charney Le; Ira L Blitz; Margaret B Fish; Yi Li; Jacob Biesinger; Xiaohui Xie; Ken W Y Cho Journal: Development Date: 2014-10-30 Impact factor: 6.868