Finding regulatory elements and regulatory motifs: a general probabilistic framework.

Erik van Nimwegen

Abstract

Over the last two decades, a large number of algorithms have been developed for regulatory motif finding. Here we show how many of these algorithms, especially those that model the binding specificities of regulatory factors with position-specific weight matrices (WMs), arise naturally within a general Bayesian probabilistic framework. We discuss how WMs are constructed from sets of regulatory sites, how sites for a given WM can be discovered by scanning large sequences, how to cluster WMs, and, more generally, how to partition large sets of sites from different WMs into clusters. We discuss how 'regulatory modules', clusters of sites for subsets of WMs, can be found in large intergenic sequences, and we discuss different methods for ab initio motif finding, including expectation maximization (EM) algorithms and motif sampling algorithms. Finally, we discuss extensively how module finding and ab initio motif finding methods can be extended to take the phylogenetic relations between the input sequences into account; that is, we show how motif finding and phylogenetic footprinting can be integrated in a rigorous probabilistic framework. The article is intended for readers with a solid background in applied mathematics, preferably with some knowledge of general Bayesian probabilistic methods. Its main purpose is to show that these methods are not a disconnected set of individual algorithmic recipes but different facets of a single integrated probabilistic theory.
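As a rough illustration of the first two steps named in the abstract (constructing a WM from a set of sites, then scanning a sequence for sites), the following minimal Python sketch may help. It is not taken from the article: the function names `build_wm` and `scan`, the pseudocount of 1 per base (standing in for a simple Dirichlet prior), the uniform background, and the example sites are all assumptions made for illustration.

```python
import math

BASES = "ACGT"

def build_wm(sites, pseudocount=1.0):
    """Estimate one base distribution per column from aligned sites.

    The pseudocount is an assumed stand-in for a Dirichlet prior.
    """
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    wm = []
    for i in range(length):
        counts = {b: pseudocount for b in BASES}
        for s in sites:
            counts[s[i]] += 1
        total = sum(counts.values())
        wm.append({b: counts[b] / total for b in BASES})
    return wm

def scan(sequence, wm, background=None):
    """Return (start, log-odds score) for every window of the sequence.

    The score is log P(window | WM) - log P(window | background);
    a uniform background is assumed when none is given.
    """
    if background is None:
        background = {b: 0.25 for b in BASES}
    width = len(wm)
    scores = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(math.log(wm[i][c] / background[c])
                    for i, c in enumerate(window))
        scores.append((start, score))
    return scores

# Hypothetical aligned sites, invented for this example.
sites = ["TGACTC", "TGACTA", "TGAGTC", "TGACTC"]
wm = build_wm(sites)
hits = scan("AATGACTCGG", wm)
best = max(hits, key=lambda h: h[1])
```

Here `best[0]` is the start of the highest-scoring window. A real application would also scan the reverse strand and would typically use a background model estimated from the genome rather than a uniform one.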

Year: 2007; PMID: 17903285; PMCID: PMC1995539; DOI: 10.1186/1471-2105-8-S6-S4

Source DB: PubMed; Journal: BMC Bioinformatics; ISSN: 1471-2105; Impact factor: 3.169


