
Context-specific independence mixture modeling for positional weight matrices.

Benjamin Georgi, Alexander Schliep.

Abstract

MOTIVATION: A positional weight matrix (PWM) is a statistical representation of the binding pattern of a transcription factor, estimated from known binding site sequences. Previous studies showed that for factors that bind to divergent binding sites, mixtures of multiple PWMs increase performance. However, estimating a conventional mixture distribution for each position will in many cases cause overfitting.
RESULTS: We propose a context-specific independence (CSI) mixture model and a learning algorithm based on a Bayesian approach. The CSI model adjusts complexity to fit the amount of variation observed on the sequence level in each position of a site. This not only yields a more parsimonious description of binding patterns, which improves parameter estimates, but also increases robustness, as the model automatically adapts the number of components to fit the data. Evaluation of the CSI model on simulated data showed favorable results compared to conventional mixtures. We demonstrate its adaptive properties in a classical model selection setup. The increased parsimony of the CSI model was shown for the transcription factor Leu3, where two binding-energy subgroups were distinguished as well as with a conventional mixture while requiring 30% fewer parameters. Analysis of the human-mouse conservation of predicted binding sites of 64 JASPAR TFs showed that CSI was as good as or better than a conventional mixture for 89% of the TFs, and as good as or better than a single PWM model for 70% of the TFs.
AVAILABILITY: http://algorithmics.molgen.mpg.de/mixture.
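
The parsimony argument in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and does not perform their Bayesian structure learning; it only shows how a PWM is estimated from aligned sites and how letting mixture components share a column distribution at some positions reduces the number of free parameters. The function names (`estimate_pwm`, `log_likelihood`, `free_parameters`), the pseudocount, and the toy sites and structures are all assumptions made for illustration.

```python
import math
from collections import Counter

ALPHABET = "ACGT"


def estimate_pwm(sites, pseudocount=1.0):
    """Estimate per-position nucleotide probabilities from aligned sites.

    `sites` is a list of equal-length strings over ACGT. The pseudocount
    (an illustrative choice) keeps unseen bases from getting probability 0.
    """
    length = len(sites[0])
    total = len(sites) + pseudocount * len(ALPHABET)
    pwm = []
    for pos in range(length):
        counts = Counter(site[pos] for site in sites)
        pwm.append({b: (counts[b] + pseudocount) / total for b in ALPHABET})
    return pwm


def log_likelihood(pwm, seq):
    """Log-probability of `seq` under a PWM (positions are independent)."""
    return sum(math.log(column[base]) for column, base in zip(pwm, seq))


def free_parameters(structure):
    """Count free parameters of a mixture of PWMs.

    `structure[pos]` is a partition of the component indices into groups;
    components in the same group share one column distribution at `pos`.
    Each group costs len(ALPHABET) - 1 parameters, and the mixture weights
    cost (number of components - 1).
    """
    n_components = sum(len(group) for group in structure[0])
    per_group = len(ALPHABET) - 1
    n_groups = sum(len(groups) for groups in structure)
    return n_groups * per_group + (n_components - 1)


# Hypothetical toy data: three aligned binding sites of width 4.
toy_sites = ["ACGT", "ACGA", "TCGT"]
toy_pwm = estimate_pwm(toy_sites)

# Two components, four positions.
# Conventional mixture: every component has its own column everywhere.
conventional = [[[0], [1]], [[0], [1]], [[0], [1]], [[0], [1]]]
# CSI-style structure: components differ only at positions 1 and 2.
csi_like = [[[0, 1]], [[0], [1]], [[0], [1]], [[0, 1]]]

print(free_parameters(conventional), free_parameters(csi_like))
```

In this toy structure the two components are distinguished only at the two middle positions, so the shared flanking columns are estimated once from all sites rather than separately per component.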

Year:  2006        PMID: 16873468     DOI: 10.1093/bioinformatics/btl249

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  9 in total

1.  Bayesian Markov models consistently outperform PWMs at predicting motifs in nucleotide sequences.

Authors:  Matthias Siebert; Johannes Söding
Journal:  Nucleic Acids Res       Date:  2016-06-09       Impact factor: 16.971

2.  An extended set of PRDM1/BLIMP1 target genes links binding motif type to dynamic repression.

Authors:  Gina M Doody; Matthew A Care; Nicholas J Burgoyne; James R Bradford; Maria Bota; Constanze Bonifer; David R Westhead; Reuben M Tooze
Journal:  Nucleic Acids Res       Date:  2010-04-26       Impact factor: 16.971

3.  Finding subtypes of transcription factor motif pairs with distinct regulatory roles.

Authors:  Abha Singh Bais; Naftali Kaminski; Panayiotis V Benos
Journal:  Nucleic Acids Res       Date:  2011-04-12       Impact factor: 16.971

4.  Tree-based position weight matrix approach to model transcription factor binding site profiles.

Authors:  Yingtao Bi; Hyunsoo Kim; Ravi Gupta; Ramana V Davuluri
Journal:  PLoS One       Date:  2011-09-02       Impact factor: 3.240

5.  MODER2: first-order Markov modeling and discovery of monomeric and dimeric binding motifs.

Authors:  Jarkko Toivonen; Pratyush K Das; Jussi Taipale; Esko Ukkonen
Journal:  Bioinformatics       Date:  2020-05-01       Impact factor: 6.937

6.  DLm6Am: A Deep-Learning-Based Tool for Identifying N6,2'-O-Dimethyladenosine Sites in RNA Sequences.

Authors:  Zhengtao Luo; Wei Su; Liliang Lou; Wangren Qiu; Xuan Xiao; Zhaochun Xu
Journal:  Int J Mol Sci       Date:  2022-09-20       Impact factor: 6.208

7.  Partially-supervised protein subclass discovery with simultaneous annotation of functional residues.

Authors:  Benjamin Georgi; Jörg Schultz; Alexander Schliep
Journal:  BMC Struct Biol       Date:  2009-10-26

8.  PyMix--the python mixture package--a tool for clustering of heterogeneous biological data.

Authors:  Benjamin Georgi; Ivan Gesteira Costa; Alexander Schliep
Journal:  BMC Bioinformatics       Date:  2010-01-06       Impact factor: 3.169

9.  Searching for transcription factor binding sites in vector spaces.

Authors:  Chih Lee; Chun-Hsi Huang
Journal:  BMC Bioinformatics       Date:  2012-08-27       Impact factor: 3.169
