Background rareness-based iterative multiple sequence alignment algorithm for regulatory element detection.

Chandrasegaran Narasimhan, Philip LoCascio, Edward Uberbacher.

Abstract

MOTIVATION: Experimental methods capable of generating sets of co-regulated genes have become commonplace; however, recognizing the regulatory motifs responsible for this regulation remains difficult. As a result, computational detection of transcription factor binding sites in such data sets has been an active area of research. Most approaches have utilized either Gibbs sampling or greedy strategies to identify such elements in sets of sequences. These existing methods have varying degrees of success depending on the strength and length of the signals and the number of available sequences. We present a new deterministic iterative algorithm for regulatory element detection based on a Markov chain background. As in other methods, sequences in the entire genome and the training set are taken into account in order to discriminate against commonly occurring signals and produce patterns that are significant in the training set.
RESULTS: The results of the algorithm compare favorably with existing tools on previously known and newly compiled data sets. The iteration-based search appears rather rigorous, not only finding the binding sites, but also showing how the binding site stands out from genomic background. The approach used to score the results is critical and a discussion of various scoring schemes and options is also presented. Benchmarking of several methods shows that while most tools are good at detecting strong signals, Gibbs sampling algorithms give inconsistent results when the regulatory element signal becomes weak. A Markov chain-based background model alleviates the drawbacks of MAP (maximum a posteriori log likelihood) scores.
AVAILABILITY: Available on request from the authors.
SUPPLEMENTARY INFORMATION: Data and the results presented in this paper are available on the web at http://compbio.ornl.gov/mira/index.html
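
ILLUSTRATION: The abstract does not spell out the scoring details, so the following is only a minimal sketch of the general idea of a Markov chain background, not the authors' algorithm. It assumes a first-order chain estimated from background sequences with add-one pseudocounts, and it scores a candidate word by its negative log-probability under that background. The names `train_markov_background`, `background_log_prob` and `rareness` are hypothetical and chosen for illustration.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"


def train_markov_background(sequences, order=1, pseudocount=1.0):
    """Estimate P(base | previous `order` bases) from background sequences.

    Returns a function prob(context, base). Pseudocounts are an assumption
    made here so that unseen transitions keep a non-zero probability.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for seq in sequences:
        seq = seq.upper()
        for i in range(order, len(seq)):
            context, base = seq[i - order:i], seq[i]
            # Skip windows containing ambiguous symbols such as N.
            if base in ALPHABET and all(c in ALPHABET for c in context):
                counts[context][base] += 1.0

    def prob(context, base):
        row = counts.get(context, {})
        total = sum(row.values()) + pseudocount * len(ALPHABET)
        return (row.get(base, 0.0) + pseudocount) / total

    return prob


def background_log_prob(word, prob, order=1):
    """Log-probability of `word` under the background chain.

    The first `order` bases are scored uniformly, a simplifying assumption.
    """
    word = word.upper()
    logp = min(order, len(word)) * math.log(1.0 / len(ALPHABET))
    for i in range(order, len(word)):
        logp += math.log(prob(word[i - order:i], word[i]))
    return logp


def rareness(word, prob, order=1):
    """Higher values mean the word is less expected under the background."""
    return -background_log_prob(word, prob, order)


if __name__ == "__main__":
    # Toy background dominated by alternating A and C.
    background = ["ACACACACACACACAC", "CACACACACACA"]
    prob = train_markov_background(background, order=1)
    for word in ("ACAC", "GGTT"):
        print(word, round(rareness(word, prob), 3))
```

In this toy setting a word that matches the background composition (`ACAC`) scores as less rare than one that does not (`GGTT`), which is the kind of discrimination against commonly occurring signals that the abstract describes. A realistic background would be trained on genomic sequence and would likely use a higher chain order.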

Year:  2003        PMID: 14555629     DOI: 10.1093/bioinformatics/btg266

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  8 in total

1.  Weeder Web: discovery of transcription factor binding sites in a set of sequences from co-regulated genes.

Authors:  Giulio Pavesi; Paolo Mereghetti; Giancarlo Mauri; Graziano Pesole
Journal:  Nucleic Acids Res       Date:  2004-07-01       Impact factor: 16.971

2.  Motif discovery and transcription factor binding sites before and after the next-generation sequencing era.

Authors:  Federico Zambelli; Graziano Pesole; Giulio Pavesi
Journal:  Brief Bioinform       Date:  2012-04-19       Impact factor: 11.622

3.  Binding Motifs in Bacterial Gene Promoters Modulate Transcriptional Effects of Global Regulators CRP and ArcA.

Authors:  Michael R Leuze; Tatiana V Karpinets; Mustafa H Syed; Alexander S Beliaev; Edward C Uberbacher
Journal:  Gene Regul Syst Bio       Date:  2012-05-30

4.  GibbsST: a Gibbs sampling method for motif discovery with enhanced resistance to local optima.

Authors:  Kazuhito Shida
Journal:  BMC Bioinformatics       Date:  2006-11-04       Impact factor: 3.169

5.  Scoring functions for transcription factor binding site prediction.

Authors:  Markus Friberg; Peter von Rohr; Gaston Gonnet
Journal:  BMC Bioinformatics       Date:  2005-04-04       Impact factor: 3.169

6.  Genomic DNA k-mer spectra: models and modalities.

Authors:  Benny Chor; David Horn; Nick Goldman; Yaron Levy; Tim Massingham
Journal:  Genome Biol       Date:  2009-10-08       Impact factor: 13.583

7.  WeederH: an algorithm for finding conserved regulatory motifs and regions in homologous sequences.

Authors:  Giulio Pavesi; Federico Zambelli; Graziano Pesole
Journal:  BMC Bioinformatics       Date:  2007-02-07       Impact factor: 3.169

8.  Comparative analysis of regulatory motif discovery tools for transcription factor binding sites.

Authors:  Wei Wei; Xiao-Dan Yu
Journal:  Genomics Proteomics Bioinformatics       Date:  2007-05       Impact factor: 7.691
