Abstract
MOTIVATION: Transcription factor (TF) ChIP-seq datasets have particular characteristics that provide unique challenges and opportunities for motif discovery. Most existing motif discovery algorithms do not scale well to such large datasets, or fail to report many motifs associated with cofactors of the ChIP-ed TF.
Year: 2011 PMID: 21543442 PMCID: PMC3106199 DOI: 10.1093/bioinformatics/btr261
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Primary and cofactor motifs found by DREME in 13 mES cell ChIP-seq datasets
| TF | Peaks | m | r | Cofactor motifs |
|---|---|---|---|---|
| CTCF | 39609 | 29 | 1 | |
| cMyc | 3422 | 12 | 1 | |
| E2f1 | 20699 | 25 | 2 | CREB/ATF |
| Esrrb | 21647 | 29 | 1 | Rxra, Zic3, Ewsr1 |
| Klf4 | 10875 | 26 | 1 | Gata3, |
| Nanog | 10343 | 24 | 4 | |
| nMyc | 7182 | 21 | 1 | Sfpib |
| Oct4 | 3761 | 17 | 1 | |
| STAT3 | 2546 | 13 | 1 | Sp1, Irf4 |
| Smad1 | 1126 | 10 | No | Zfp740 |
| Sox2 | 4526 | 19 | 1 | |
| Tcfcp2l1 | 26910 | 33 | 1 | CREB/ATF |
| Zfx | 10338 | 20 | 1 | |
Columns show the name of the ChIP-ed TF; the number of ChIP-seq peaks; the number of significant motifs (m) found by DREME (E < 0.05); the rank (r) of the ChIP-ed TF's motif; and cofactor motifs found. Cofactor TFs are listed in the order of DREME significance and in bold font if they are 1 of the 12 pluripotency TFs. Only the cofactor TF family name is given when several family members match the DREME motif (e.g. ‘Myc’).
See text for discussion of E2f1 and Nanog motifs.
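The m and r columns can be summarized with a few lines of Python. This is a minimal sketch, not part of the original study: the `ROWS` list is transcribed by hand from the table above, `summarize` is a hypothetical helper name, and a rank of "No" is represented as `None` on the assumption that it means the ChIP-ed TF's motif was not reported.

```python
# Rows transcribed from the table above: (TF, peaks, m, r).
# Assumption: r is None where the table says "No" (motif not reported).
ROWS = [
    ("CTCF", 39609, 29, 1),
    ("cMyc", 3422, 12, 1),
    ("E2f1", 20699, 25, 2),
    ("Esrrb", 21647, 29, 1),
    ("Klf4", 10875, 26, 1),
    ("Nanog", 10343, 24, 4),
    ("nMyc", 7182, 21, 1),
    ("Oct4", 3761, 17, 1),
    ("STAT3", 2546, 13, 1),
    ("Smad1", 1126, 10, None),
    ("Sox2", 4526, 19, 1),
    ("Tcfcp2l1", 26910, 33, 1),
    ("Zfx", 10338, 20, 1),
]


def summarize(rows):
    """Count datasets, datasets with a ranked ChIP-ed motif, and rank-1 hits."""
    found = [row for row in rows if row[3] is not None]
    rank1 = [row for row in found if row[3] == 1]
    mean_m = sum(row[2] for row in rows) / len(rows)
    return {
        "datasets": len(rows),
        "found": len(found),
        "rank1": len(rank1),
        "mean_m": mean_m,
    }


summary = summarize(ROWS)
print(summary)
```

On the transcribed rows, the ChIP-ed TF's motif is ranked first in 10 of the 13 datasets and is ranked at all in 12 of them.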
Fig. 1. Comparison of DREME mESC TF ChIP-seq motifs with in vitro motifs. Each panel shows the logo of the in vivo binding motif discovered by DREME in the designated TF ChIP-seq dataset (lower logo) aligned with the logo of the best available in vitro motif (upper logo). Since no in vitro motifs are available for Sox2, Oct4 and E2f1, UniProbe motifs for closely related TF family members Sox11, Pou2f3 and E2f3 are used. The in vitro motif for Nanog is taken from Jauch .
Fig. 2. Discriminative motif discovery in mESC ChIP-seq datasets. Panels (a) and (b) show the logo of the binding motif discovered by DREME in the two designated TF ChIP-seq datasets (lower logo) aligned with the logo of a known motif for the ChIP-ed TF (upper logo). (a) Upper logo is known Oct4 motif (Pou-family member Pou3f3, UniProbe Pou3f3_3235.2). (b) Upper logo is known Sox2 motif (TRANSFAC M01272). (c) shows the most significant motif found by DREME in the Nanog dataset using (top to bottom) the shuffled Nanog dataset, the Oct4 dataset or the Sox2 dataset as the negative set.
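The idea of scoring a candidate word against a negative set, as in panel (c), can be illustrated with a small enrichment test. This is a generic sketch and not DREME's actual code: it assumes a one-sided Fisher's exact test on the number of sequences containing an exact word, scans one strand only, and uses made-up toy sequences. The names `count_hits` and `fisher_enrichment_p` are hypothetical.

```python
from math import comb


def count_hits(seqs, word):
    """Number of sequences containing the exact word (single strand only)."""
    return sum(1 for s in seqs if word in s)


def fisher_enrichment_p(pos_hits, pos_total, neg_hits, neg_total):
    """One-sided Fisher's exact p-value for enrichment in the positive set.

    Hypergeometric probability of seeing at least pos_hits of the
    (pos_hits + neg_hits) matching sequences fall in the positive set.
    """
    total = pos_total + neg_total
    hits = pos_hits + neg_hits
    denom = comb(total, pos_total)
    upper = min(hits, pos_total)
    tail = sum(
        comb(hits, k) * comb(total - hits, pos_total - k)
        for k in range(pos_hits, upper + 1)
    )
    return tail / denom


# Toy data (invented for illustration): the word is present in every
# positive sequence and absent from every negative sequence.
positive = ["TTCACGTGAA", "GCACGTGTTA", "AACACGTGCC", "CACGTGGGTA"]
negative = ["TTGACCTGAA", "GCATGTGTTA", "AACAGGTGCC", "CTCGAGGGTA"]
word = "CACGTG"

p = fisher_enrichment_p(
    count_hits(positive, word), len(positive),
    count_hits(negative, word), len(negative),
)
print(p)
```

Swapping the negative set (for example, shuffled sequences versus peaks from another TF's dataset) changes `neg_hits` and therefore which words look enriched, which is the effect panel (c) compares.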
Fig. 3. Comparison of motif discovery algorithms. (a) The table shows the average number of motifs discovered (N), number of datasets in which the ChIP-ed motif was found (S), the average number of identifiable co-factor motifs found (C), and the average running time in seconds of the algorithm on the mESC ChIP-seq datasets. Bold font indicates best performance. Note: Times for nestedMICA and MEME are for the reduced size datasets (0.5 megabase-pairs). (b) The plot shows the running times for DREME, Amadeus, Trawler and WEEDER on the full-size mESC ChIP-seq datasets. Inset plot is the same data plotted with log scales on both axes.