Natural similarity measures between position frequency matrices with an application to clustering.

Utz J Pape, Sven Rahmann, Martin Vingron.

Abstract

MOTIVATION: Transcription factors (TFs) play a key role in gene regulation by binding to target sequences. In silico prediction of potential binding of a TF to a binding site is a well-studied problem in computational biology. The binding sites for one TF are represented by a position frequency matrix (PFM). The discovery of new PFMs requires the comparison to known PFMs to avoid redundancies. In general, two PFMs are similar if they occur at overlapping positions under a null model. Still, most existing methods compute similarity according to probabilistic distances of the PFMs. Here we propose a natural similarity measure based on the asymptotic covariance between the number of PFM hits incorporating both strands. Furthermore, we introduce a second measure based on the same idea to cluster a set of the Jaspar PFMs.
RESULTS: We show that the asymptotic covariance can be efficiently computed by a two-dimensional convolution of the score distributions. The asymptotic covariance approach shows strong correlation with simulated data. It outperforms three alternative methods. The Jaspar clustering yields distinct groups of TFs of the same class. Furthermore, a representative PFM is given for each class. In contrast to most other clustering methods, PFMs with low similarity automatically remain singletons.
AVAILABILITY: A website to compute the similarity and to perform clustering, the source code and Supplementary Material are available at http://mosta.molgen.mpg.de.
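
The idea of a two-dimensional convolution of score distributions can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes integer (discretized) scores, two matrices of equal length aligned at offset zero, a single strand, and an i.i.d. background. The names joint_score_distribution and hit_covariance are hypothetical. The sketch builds the joint distribution of both scores on one random word, column by column, and then reads off the covariance of the two hit indicators for given score thresholds.

```python
from collections import defaultdict


def joint_score_distribution(scores_a, scores_b, background):
    """Joint distribution of two integer score matrices on one random word.

    scores_a, scores_b: lists with one dict per position, mapping base -> integer score.
    background: dict mapping base -> probability (i.i.d. null model, an assumption).
    Returns a dict mapping (score_a, score_b) -> probability.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("this sketch only handles matrices of equal length")
    dist = {(0, 0): 1.0}  # before any column is read, both scores are 0
    for col_a, col_b in zip(scores_a, scores_b):
        nxt = defaultdict(float)
        # Convolve the current 2D distribution with this column's contribution.
        for (sa, sb), p in dist.items():
            for base, q in background.items():
                nxt[(sa + col_a[base], sb + col_b[base])] += p * q
        dist = dict(nxt)
    return dist


def hit_covariance(dist, threshold_a, threshold_b):
    """Covariance of the two hit indicators (score >= threshold) at one position."""
    p_a = sum(p for (sa, _), p in dist.items() if sa >= threshold_a)
    p_b = sum(p for (_, sb), p in dist.items() if sb >= threshold_b)
    p_ab = sum(
        p for (sa, sb), p in dist.items()
        if sa >= threshold_a and sb >= threshold_b
    )
    return p_ab - p_a * p_b
```

A full treatment along the lines of the abstract would also have to account for hits at shifted, overlapping positions and on the reverse strand; this sketch covers only the aligned case.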

Year:  2008        PMID: 18174183     DOI: 10.1093/bioinformatics/btm610

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937

