Bayesian Markov models consistently outperform PWMs at predicting motifs in nucleotide sequences.

Matthias Siebert, Johannes Söding.

Abstract

Position weight matrices (PWMs) are the standard model for DNA and RNA regulatory motifs. In PWMs, nucleotide probabilities are independent of nucleotides at other positions. Models that account for dependencies need many parameters and are prone to overfitting. We have developed a Bayesian approach for motif discovery using Markov models in which conditional probabilities of order k - 1 act as priors for those of order k. This Bayesian Markov model (BaMM) training automatically adapts model complexity to the amount of available data. We also derive an EM algorithm for de-novo discovery of enriched motifs. For transcription factor binding, BaMMs achieve significantly (P = 1/16) higher cross-validated partial AUC than PWMs in 97% of 446 ChIP-seq ENCODE datasets and improve performance by 36% on average. BaMMs also learn complex multipartite motifs, improving predictions of transcription start sites, polyadenylation sites, bacterial pause sites, and RNA binding sites by 26-101%. BaMMs never performed worse than PWMs. These robust improvements argue in favour of generally replacing PWMs by BaMMs.
© The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
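
The idea stated in the abstract, that order k - 1 conditional probabilities act as priors for order k, can be illustrated with a small interpolation sketch. This is not the authors' implementation: it assumes a position-independent model, a uniform prior at order 0, and a single pseudocount strength `alpha` shared by all orders, and the function names `count_kmers` and `cond_prob` are hypothetical.

```python
from collections import Counter

ALPHABET = "ACGT"


def count_kmers(seqs, max_order):
    """Count every substring of length 1 .. max_order + 1 in the sequences."""
    counts = Counter()
    for s in seqs:
        for k in range(1, max_order + 2):
            for i in range(len(s) - k + 1):
                counts[s[i:i + k]] += 1
    return counts


def cond_prob(counts, context, x, alpha=2.0):
    """Estimate P(x | context), shrunk toward the next-lower-order estimate.

    Assumption: alpha pseudocounts are distributed according to the
    estimate obtained with the context shortened by one nucleotide;
    the order-0 estimate is shrunk toward a uniform distribution.
    """
    if context == "":
        prior = 1.0 / len(ALPHABET)
    else:
        prior = cond_prob(counts, context[1:], x, alpha)
    # Number of times the context was followed by any nucleotide.
    n_ctx = sum(counts[context + b] for b in ALPHABET)
    return (counts[context + x] + alpha * prior) / (n_ctx + alpha)


if __name__ == "__main__":
    seqs = ["ACGTACGT", "ACGAACGT"]  # toy data, for illustration only
    counts = count_kmers(seqs, max_order=2)
    for x in ALPHABET:
        print(x, cond_prob(counts, "CG", x))
```

With many observations of a context, the counts dominate and the estimate approaches the observed frequency; with few or none, the estimate falls back to the lower-order probability. This is the sense in which model complexity adapts to the amount of available data.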

Year: 2016; PMID: 27288444; PMCID: PMC5291271; DOI: 10.1093/nar/gkw521

Source DB: PubMed; Journal: Nucleic Acids Res; ISSN: 0305-1048; Impact factor: 16.971

