
Supervised detection of conserved motifs in DNA sequences with cosmo.

Oliver Bembom, Sunduz Keles, Mark J van der Laan.

Abstract

A number of computational methods have been proposed for identifying transcription factor binding sites in a set of unaligned sequences that are thought to share the motif in question. Here we introduce an algorithm, called cosmo, that allows this search to be supervised by specifying a set of constraints that the position weight matrix of the unknown motif must satisfy. Such constraints may be formulated, for example, on the basis of prior knowledge about the structure of the transcription factor in question. The algorithm is based on the same two-component multinomial mixture model used by MEME, but it relies more strongly on the likelihood principle than on more ad hoc criteria such as the E-value. For instance, the intensity parameter in the ZOOPS and TCM models is estimated with a profile-likelihood approach, and the width of the unknown motif is selected by BIC. These changes allow cosmo to outperform MEME even in the absence of any constraints, as evidenced by 2- to 3-fold greater sensitivity in some simulation studies. Additional improvements in performance can be achieved by selecting the model type (OOPS, ZOOPS, or TCM) data-adaptively or by supplying correctly specified constraints, especially if the motif appears only as a weak signal in the data. The algorithm can also choose data-adaptively between a given constrained model and the completely unconstrained model, which guards against the risk of supplying mis-specified constraints. Simulation studies suggest that this approach can offer 3 to 3.5 times greater sensitivity than MEME. The algorithm has been implemented as a stand-alone C program and as a web application that can be accessed at http://cosmoweb.berkeley.edu. An R package is available through Bioconductor (http://bioconductor.org).
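
To make the model terms in the abstract concrete, the following is a minimal Python sketch of an OOPS-type likelihood under a two-component multinomial mixture (motif positions drawn from a position weight matrix, all other positions from a background distribution), together with a generic BIC score that could be compared across candidate motif widths. It is not the cosmo implementation: the function names, the uniform prior over motif start positions, the parameter count of 3 per matrix column plus 3 for the background, and the use of the number of sequences as the BIC sample size are all assumptions made for illustration.

```python
import math

ALPHABET = "ACGT"


def oops_log_likelihood(seqs, pwm, background):
    """Log-likelihood of DNA sequences under an OOPS-type model.

    Assumes exactly one motif occurrence per sequence, with a start
    position that is uniform over all valid starts (an illustrative
    assumption). `pwm` is a list of columns, each a list of four
    probabilities in ACGT order; `background` is one such list.
    """
    width = len(pwm)
    total = 0.0
    for seq in seqs:
        idx = [ALPHABET.index(ch) for ch in seq]
        n_starts = len(idx) - width + 1
        if n_starts < 1:
            raise ValueError("sequence shorter than motif width")
        # Log-probability of the whole sequence under the background alone.
        log_bg = sum(math.log(background[i]) for i in idx)
        # Log-odds (motif versus background) for each candidate start.
        log_odds = [
            sum(
                math.log(pwm[k][idx[j + k]] / background[idx[j + k]])
                for k in range(width)
            )
            for j in range(n_starts)
        ]
        # Numerically stable log of the average odds over start positions.
        peak = max(log_odds)
        log_mix = peak + math.log(sum(math.exp(v - peak) for v in log_odds))
        total += log_bg + log_mix - math.log(n_starts)
    return total


def n_free_params(width):
    """Free parameters: 3 per PWM column plus 3 for the background (assumed)."""
    return 3 * width + 3


def bic(log_lik, n_params, n_obs):
    """Generic BIC; lower is better. The choice of n_obs is an assumption."""
    return -2.0 * log_lik + n_params * math.log(n_obs)


# Toy data: each sequence contains the hypothetical motif GACG once.
toy_seqs = ["TTGACGTT", "GACGTTTT", "TTTTGACG"]
uniform = [0.25, 0.25, 0.25, 0.25]
# Hypothetical PWM favouring G, A, C, G at the four motif positions.
toy_pwm = [
    [0.05, 0.05, 0.85, 0.05],
    [0.85, 0.05, 0.05, 0.05],
    [0.05, 0.85, 0.05, 0.05],
    [0.05, 0.05, 0.85, 0.05],
]
toy_loglik = oops_log_likelihood(toy_seqs, toy_pwm, uniform)
toy_bic = bic(toy_loglik, n_free_params(len(toy_pwm)), len(toy_seqs))
```

In a width search along these lines, one would fit a matrix for each candidate width (for example by EM, which is omitted here) and keep the width with the lowest BIC. The ZOOPS and TCM variants additionally involve an intensity parameter, which this sketch does not model.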


Year:  2007        PMID: 17402923     DOI: 10.2202/1544-6115.1260

Source DB:  PubMed          Journal:  Stat Appl Genet Mol Biol        ISSN: 1544-6115



