Building a dictionary for genomes: identification of presumptive regulatory sites by statistical analysis.

H J Bussemaker, H Li, E D Siggia.

Abstract

The availability of complete genome sequences and mRNA expression data for all genes creates new opportunities and challenges for identifying DNA sequence motifs that control gene expression. An algorithm, "MobyDick," is presented that decomposes a set of DNA sequences into the most probable dictionary of motifs or words. This method is applicable to any set of DNA sequences: for example, all upstream regions in a genome or all genes expressed under certain conditions. Identification of words is based on a probabilistic segmentation model in which the significance of longer words is deduced from the frequency of shorter ones of various lengths, eliminating the need for a separate set of reference data to define probabilities. We have built a dictionary with 1,200 words for the 6,000 upstream regulatory regions in the yeast genome; the 500 most significant words (some with as few as 10 copies in all of the upstream regions) match 114 of 443 experimentally determined sites (a significance level of 18 standard deviations). When analyzing all of the genes up-regulated during sporulation as a group, we find many motifs in addition to the few previously identified by analyzing the expression subclusters individually. Applying MobyDick to the genes derepressed when the general repressor Tup1 is deleted, we find known as well as putative binding sites for its regulatory partners.
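
The probabilistic segmentation idea in the abstract can be illustrated with a small sketch. It assumes a model in which a sequence is a concatenation of words drawn independently from a dictionary, so that the likelihood of the sequence is the sum, over every way of cutting it into dictionary words, of the product of the word probabilities. The function name `segmentation_likelihood` and the toy dictionary are hypothetical and are not taken from the paper; the sketch only evaluates the likelihood for a fixed dictionary and does not reproduce how MobyDick fits word probabilities or decides which words to add.

```python
from typing import Dict, List


def segmentation_likelihood(seq: str, word_probs: Dict[str, float]) -> float:
    """Sum over all segmentations of seq into dictionary words of the
    product of word probabilities (assumed model, see the text above)."""
    n = len(seq)
    # z[i] holds the summed probability of all segmentations of seq[:i].
    z: List[float] = [0.0] * (n + 1)
    z[0] = 1.0  # the empty prefix has exactly one (empty) segmentation
    for i in range(1, n + 1):
        for word, p in word_probs.items():
            k = len(word)
            # A word contributes if it ends exactly at position i.
            if k <= i and seq[i - k:i] == word:
                z[i] += p * z[i - k]
    return z[n]


# Hypothetical toy dictionary: four single letters plus one longer word.
toy = {"A": 0.2, "C": 0.2, "G": 0.2, "T": 0.2, "TATA": 0.2}

# "TATA" can be read as T|A|T|A or as the single word TATA,
# so both segmentations contribute to the total.
print(segmentation_likelihood("TATA", toy))
```

In this toy case the single-word reading dominates the total, which is the sense in which a longer word is judged against what the shorter words alone would predict.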

Year:  2000        PMID: 10944202      PMCID: PMC27717          DOI: 10.1073/pnas.180265397

Source DB:  PubMed          Journal:  Proc Natl Acad Sci U S A        ISSN: 0027-8424            Impact factor:   11.205


