
PhyloGibbs: a Gibbs sampling motif finder that incorporates phylogeny.

Rahul Siddharthan, Eric D Siggia, Erik van Nimwegen.

Abstract

A central problem in the bioinformatics of gene regulation is to find the binding sites for regulatory proteins. One of the most promising approaches toward identifying these short and fuzzy sequence patterns is the comparative analysis of orthologous intergenic regions of related species. This analysis is complicated by various factors. First, one needs to take the phylogenetic relationship between the species into account in order to distinguish conservation that is due to the occurrence of functional sites from spurious conservation that is due to evolutionary proximity. Second, one has to deal with the complexities of multiple alignments of orthologous intergenic regions, and one has to consider the possibility that functional sites may occur outside of conserved segments. Here we present a new motif sampling algorithm, PhyloGibbs, that runs on arbitrary collections of multiple local sequence alignments of orthologous sequences. The algorithm searches over all ways in which an arbitrary number of binding sites for an arbitrary number of transcription factors (TFs) can be assigned to the multiple sequence alignments. These binding site configurations are scored by a Bayesian probabilistic model that treats aligned sequences by a model for the evolution of binding sites and "background" intergenic DNA. This model takes the phylogenetic relationship between the species in the alignment explicitly into account. The algorithm uses simulated annealing and Monte Carlo Markov-chain sampling to rigorously assign posterior probabilities to all the binding sites that it reports. In tests on synthetic data and real data from five Saccharomyces species our algorithm performs significantly better than four other motif-finding algorithms, including algorithms that also take phylogeny into account. Our results also show that, in contrast to the other algorithms, PhyloGibbs can make realistic estimates of the reliability of its predictions.
Our tests suggest that, running on the five-species multiple alignment of a single gene's upstream region, PhyloGibbs on average recovers over 50% of all binding sites in S. cerevisiae at a specificity of about 50%, and 33% of all binding sites at a specificity of about 85%. We also tested PhyloGibbs on collections of multiple alignments of intergenic regions that were recently annotated, based on ChIP-on-chip data, to contain binding sites for the same TF. We compared PhyloGibbs's results with the previous analysis of these data using six other motif-finding algorithms. For 16 of 21 TFs for which all other motif-finding methods failed to find a significant motif, PhyloGibbs did recover a motif that matches the literature consensus. In 11 cases where there was disagreement in the results we compiled lists of known target genes from the literature, and found that running PhyloGibbs on their regulatory regions yielded a binding motif matching the literature consensus in all but one of the cases. Interestingly, these literature gene lists had little overlap with the targets annotated based on the ChIP-on-chip data. The PhyloGibbs code can be downloaded from http://www.biozentrum.unibas.ch/~nimwegen/cgi-bin/phylogibbs.cgi or http://www.imsc.res.in/~rsidd/phylogibbs. The full set of predicted sites from our tests on yeast is available at http://www.swissregulon.unibas.ch.
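
The trade-off quoted above (more sites recovered at lower specificity, fewer at higher specificity) can be illustrated with a minimal Python sketch. It is not taken from the PhyloGibbs code. It assumes that "sensitivity" means the fraction of known sites recovered and that "specificity" means the fraction of predicted sites that are known sites, a convention common in motif-finding benchmarks but not defined in the abstract. The function names, the site keys, and all numbers are made up for illustration.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical site key: (sequence identifier, start position).
Site = Tuple[str, int]


def sensitivity_specificity(
    predicted: Iterable[Site], known: Iterable[Site]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for a set of predicted sites.

    Assumed definitions (not stated in the abstract):
      sensitivity = known sites recovered / all known sites
      specificity = predicted sites that are known / all predicted sites
    """
    predicted_set, known_set = set(predicted), set(known)
    true_positives = len(predicted_set & known_set)
    sensitivity = true_positives / len(known_set) if known_set else 0.0
    specificity = true_positives / len(predicted_set) if predicted_set else 0.0
    return sensitivity, specificity


def sites_above(posteriors: Dict[Site, float], threshold: float) -> Set[Site]:
    """Keep only sites whose posterior probability reaches the threshold."""
    return {site for site, p in posteriors.items() if p >= threshold}


# Made-up posterior probabilities for six candidate sites.
posteriors: Dict[Site, float] = {
    ("geneA", 12): 0.95,
    ("geneA", 40): 0.60,
    ("geneB", 7): 0.90,
    ("geneB", 55): 0.30,
    ("geneC", 21): 0.55,
    ("geneC", 70): 0.20,
}

# Made-up list of experimentally known sites.
known: Set[Site] = {("geneA", 12), ("geneB", 7), ("geneC", 21), ("geneC", 88)}

for threshold in (0.5, 0.85):
    sens, spec = sensitivity_specificity(sites_above(posteriors, threshold), known)
    print(f"threshold={threshold}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this toy data set, raising the posterior cutoff from 0.5 to 0.85 discards the lower-confidence predictions, so fewer known sites are recovered while a larger share of the remaining predictions is correct.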

Year:  2005        PMID: 16477324      PMCID: PMC1309704          DOI: 10.1371/journal.pcbi.0010067

Source DB:  PubMed          Journal:  PLoS Comput Biol        ISSN: 1553-734X            Impact factor:   4.475


