Identification of transcription factor binding sites with variable-order Bayesian networks.

I Ben-Gal, A Shani, A Gohr, J Grau, S Arviv, A Shmilovici, S Posch, I Grosse.

Abstract

MOTIVATION: We propose a new class of variable-order Bayesian network (VOBN) models for the identification of transcription factor binding sites (TFBSs). The proposed models generalize the widely used position weight matrix (PWM) models, Markov models and Bayesian network models. In those models, each position depends on a fixed subset of the remaining positions; in VOBN models, this subset may vary with the specific nucleotides observed, which are called the context. This flexibility proves advantageous for the classification and analysis of TFBSs, because statistical dependencies between nucleotides at different (not necessarily adjacent) TFBS positions can be taken into account efficiently, in a position-specific and context-specific manner.
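
To make the contrast between a PWM and a context-dependent model concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy site of length 3 with invented probabilities; the names `PWM`, `VOBN`, `pwm_log_likelihood` and `vobn_log_likelihood` are hypothetical. In the toy VOBN, position 2 depends on the non-adjacent position 0 only when an A is observed there, and otherwise falls back to its context-free distribution.

```python
import math

# Hypothetical toy parameters for a site of length 3 (not taken from the paper).
UNIFORM = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

# PWM: every position has one distribution, independent of all other positions.
PWM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
    UNIFORM,
]

# Toy variable-order model: each position names an optional parent position and
# a table of context-specific distributions. A parent nucleotide that is absent
# from the table falls back to the context-free "default" distribution, so the
# dependency that is actually used varies with the observed context.
VOBN = [
    {"parent": None, "contexts": {}, "default": PWM[0]},
    {"parent": None, "contexts": {}, "default": PWM[1]},
    {
        "parent": 0,  # non-adjacent dependency: position 2 looks at position 0
        "contexts": {"A": {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05}},
        "default": UNIFORM,
    },
]


def pwm_log_likelihood(seq):
    """Log-likelihood of seq under the position-independent PWM."""
    return sum(math.log(PWM[i][base]) for i, base in enumerate(seq))


def vobn_log_likelihood(seq):
    """Log-likelihood of seq under the toy context-dependent model."""
    total = 0.0
    for i, base in enumerate(seq):
        node = VOBN[i]
        dist = node["default"]
        if node["parent"] is not None:
            # Use the context-specific distribution only if one is defined
            # for the nucleotide observed at the parent position.
            dist = node["contexts"].get(seq[node["parent"]], dist)
        total += math.log(dist[base])
    return total
```

In this sketch, a sequence starting with C, G or T is scored identically by both models, whereas a sequence starting with A is scored with the extra dependency. A PWM is therefore the special case in which every context table is empty.
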
RESULTS: We apply the VOBN model to a set of 238 experimentally verified sigma-70 binding sites in Escherichia coli. We find that the VOBN model can distinguish these 238 sites from a set of 472 intergenic 'non-promoter' sequences with a higher accuracy than fixed-order Markov models or Bayesian trees. We use a replicated stratified-holdout experiment with a fixed true-negative rate of 99.9%. For a foreground inhomogeneous VOBN model of order 1 and a background homogeneous variable-order Markov (VOM) model of order 5, the mean true-positive (TP) rate is 47.56%. In comparison, the best TP rate for the conventional models is 44.39%, obtained from a foreground PWM model and a background 2nd-order Markov model. As the standard deviation of the estimated TP rate is approximately 0.01%, this improvement is highly significant.
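
Reporting a TP rate at a fixed true-negative rate amounts to choosing a score threshold on the negative set and then counting the positives above it. The following is a minimal sketch under stated assumptions, not the evaluation code of the paper: each sequence is assumed to have a real-valued score (for example a log-likelihood ratio between the foreground and background models), a sequence is called positive when its score is strictly greater than the threshold, and the function names and toy scores are hypothetical.

```python
import math


def threshold_for_tn_rate(negative_scores, tn_rate):
    """Smallest observed negative score t such that at least a fraction
    tn_rate of the negatives satisfy score <= t (and are thus rejected)."""
    ordered = sorted(negative_scores)
    # Number of negatives that must be rejected; the small epsilon guards
    # against floating-point round-up in tn_rate * len(ordered).
    k = max(1, math.ceil(tn_rate * len(ordered) - 1e-9))
    return ordered[k - 1]


def true_positive_rate(positive_scores, threshold):
    """Fraction of positives with a score strictly above the threshold."""
    hits = sum(1 for score in positive_scores if score > threshold)
    return hits / len(positive_scores)


# Toy scores, invented for illustration only.
negatives = [0.1, -1.0, 0.5, 2.0]
positives = [0.4, 0.6, 1.5, 3.0]

t = threshold_for_tn_rate(negatives, tn_rate=0.75)
tp = true_positive_rate(positives, t)
```

In a holdout setting, the threshold and the TP rate would be computed on held-out sequences and averaged over replicates. The sketch shows only the thresholding step for a single split.
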

Year:  2005        PMID: 15797905     DOI: 10.1093/bioinformatics/bti410

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  52 in total

1.  Mapping specificity landscapes of RNA-protein interactions by high throughput sequencing.

Authors:  Eckhard Jankowsky; Michael E Harris
Journal:  Methods       Date:  2017-03-02       Impact factor: 3.608

2.  Integrative content-driven concepts for bioinformatics "beyond the cell".

Authors:  Edgar Wingender; Torsten Crass; Jennifer D Hogan; Alexander E Kel; Olga V Kel-Margoulis; Anatolij P Potapov
Journal:  J Biosci       Date:  2007-01       Impact factor: 1.826

3.  Sequence-based identification of 3D structural modules in RNA with RMDetect.

Authors:  José Almeida Cruz; Eric Westhof
Journal:  Nat Methods       Date:  2011-05-08       Impact factor: 28.547

4.  Specificity and nonspecificity in RNA-protein interactions. (Review)

Authors:  Eckhard Jankowsky; Michael E Harris
Journal:  Nat Rev Mol Cell Biol       Date:  2015-08-19       Impact factor: 94.444

5.  Apples and oranges: avoiding different priors in Bayesian DNA sequence analysis.

Authors:  Jens Keilwagen; Jan Grau; Stefan Posch; Ivo Grosse
Journal:  BMC Bioinformatics       Date:  2010-03-22       Impact factor: 3.169

6.  Unifying generative and discriminative learning principles.

Authors:  Jens Keilwagen; Jan Grau; Stefan Posch; Marc Strickert; Ivo Grosse
Journal:  BMC Bioinformatics       Date:  2010-02-22       Impact factor: 3.169

7.  Inclusion of neighboring base interdependencies substantially improves genome-wide prokaryotic transcription factor binding site prediction.

Authors:  Rafik A Salama; Dov J Stekel
Journal:  Nucleic Acids Res       Date:  2010-05-03       Impact factor: 16.971

8.  The word landscape of the non-coding segments of the Arabidopsis thaliana genome.

Authors:  Jens Lichtenberg; Alper Yilmaz; Joshua D Welch; Kyle Kurz; Xiaoyu Liang; Frank Drews; Klaus Ecker; Stephen S Lee; Matt Geisler; Erich Grotewold; Lonnie R Welch
Journal:  BMC Genomics       Date:  2009-10-08       Impact factor: 3.969

9.  MotifAdjuster: a tool for computational reassessment of transcription factor binding site annotations.

Authors:  Jens Keilwagen; Jan Baumbach; Thomas A Kohl; Ivo Grosse
Journal:  Genome Biol       Date:  2009-05-01       Impact factor: 13.583

10.  POIMs: positional oligomer importance matrices--understanding support vector machine-based signal detectors.

Authors:  Sören Sonnenburg; Alexander Zien; Petra Philips; Gunnar Rätsch
Journal:  Bioinformatics       Date:  2008-07-01       Impact factor: 6.937
