MOTIVATION: ChIP-seq data are enriched in binding sites for the immunoprecipitated protein. Some sequences may also contain binding sites for a coregulator. Biologists are interested in knowing which coregulatory factor motifs are present in the sequences bound by the ChIP'ed protein. RESULTS: We present a finite mixture framework with an expectation-maximization algorithm that considers two motifs jointly and simultaneously determines which sequences contain both motifs, only one of them, or neither. Tested on 10 simulated ChIP-seq datasets, our method performed better than repeated application of MEME in predicting sequences containing both motifs. When applied to a mouse liver Foxa2 ChIP-seq dataset of ~12 000 400-bp sequences, coMOTIF identified co-occurrence of the Foxa2 motif with Hnf4a, Cebpa, E-box, Ap1/Maf or Sp1 motifs in ~6-33% of these sequences. These motifs belong to transcription factors that are either known to be liver-specific or have an important role in liver function. AVAILABILITY: Freely available at http://www.niehs.nih.gov/research/resources/software/comotif/. CONTACT: li3@niehs.nih.gov SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
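The idea of a finite mixture over "neither, first motif only, second motif only, both" can be illustrated with a minimal sketch. This is not the coMOTIF implementation: the abstract says the method considers the two motifs jointly, whereas the sketch below assumes both position weight matrices are already known and fixed, assumes the two sites do not overlap, and estimates only the four mixing proportions by expectation-maximization. The function names `site_ratio` and `em_mixing` and all parameters are hypothetical.

```python
import numpy as np

def site_ratio(seq_idx, pwm, bg):
    """Likelihood ratio for one sequence containing one site of a motif.

    seq_idx: integer-coded sequence (A, C, G, T -> 0..3), 1-D array.
    pwm:     (width, 4) matrix of per-position base probabilities.
    bg:      length-4 background base probabilities.
    Assumption: the site start is uniform over all valid positions,
    so the ratio is the mean over start positions.
    """
    w = pwm.shape[0]
    n_starts = len(seq_idx) - w + 1
    ratios = np.empty(n_starts)
    for s in range(n_starts):
        window = seq_idx[s:s + w]
        ratios[s] = np.prod(pwm[np.arange(w), window] / bg[window])
    return ratios.mean()

def em_mixing(r_a, r_b, n_iter=200, tol=1e-8):
    """EM for the mixing proportions of a four-class mixture.

    r_a, r_b: per-sequence likelihood ratios for motif A and motif B.
    Classes (columns): 0 = neither, 1 = A only, 2 = B only, 3 = both.
    Assumption: the "both" class likelihood is r_a * r_b (no site overlap).
    Returns the estimated proportions and per-sequence class posteriors.
    """
    lik = np.column_stack([np.ones_like(r_a), r_a, r_b, r_a * r_b])
    pi = np.full(4, 0.25)
    for _ in range(n_iter):
        # E-step: posterior class membership of every sequence
        post = lik * pi
        post /= post.sum(axis=1, keepdims=True)
        # M-step: proportions are the average posterior memberships
        new_pi = post.mean(axis=0)
        converged = np.abs(new_pi - pi).max() < tol
        pi = new_pi
        if converged:
            break
    post = lik * pi
    post /= post.sum(axis=1, keepdims=True)
    return pi, post
```

In use, `site_ratio` would be evaluated once per sequence and per motif, and the resulting two arrays passed to `em_mixing`; the row-wise maximum of the returned posteriors then assigns each sequence to one of the four classes.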