Improved similarity scores for comparing motifs.

Emi Tanaka, Timothy Bailey, Charles E Grant, William Stafford Noble, Uri Keich.

Abstract

MOTIVATION: A question that often comes up after applying a motif finder to a set of co-regulated DNA sequences is whether the reported putative motif is similar to any known motif. While several tools have been designed for this task, Habib et al. pointed out that the scores that are commonly used for measuring similarity between motifs do not distinguish between a good alignment of two informative columns (say, all-A) and one of two uninformative columns. This observation explains why tools such as Tomtom occasionally return an alignment of uninformative columns which is clearly spurious. To address this problem, Habib et al. suggested a new score [Bayesian Likelihood 2-Component (BLiC)] which uses a Bayesian information criterion to penalize matches that are also similar to the background distribution.
RESULTS: We show that the BLiC score exhibits other, highly undesirable properties, and we offer instead a general approach to adjust any motif similarity score so as to reduce the number of reported spurious alignments of uninformative columns. We implement our method in Tomtom and show that, without significantly compromising Tomtom's retrieval accuracy or its runtime, we can drastically reduce the number of uninformative alignments.
AVAILABILITY AND IMPLEMENTATION: The modified Tomtom is available as part of the MEME Suite at http://meme.nbcr.net.
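
The problem described under MOTIVATION can be seen in a small sketch. It is only an illustration, not the authors' method or Tomtom's code: Euclidean distance stands in for a generic column similarity score, the helper names are hypothetical, and a uniform background over A, C, G, T is assumed. Two identical all-A columns and two identical uniform columns receive the same perfect score, although only the first pair carries any information.

```python
import math

# Assumed uniform background over A, C, G, T (illustrative choice).
BACKGROUND = (0.25, 0.25, 0.25, 0.25)


def euclidean_distance(p, q):
    """Distance between two motif columns; 0.0 means the columns are identical."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def information_content(p, background=BACKGROUND):
    """Relative entropy (in bits) of a column with respect to the background."""
    return sum(a * math.log2(a / b) for a, b in zip(p, background) if a > 0)


all_a = (1.0, 0.0, 0.0, 0.0)        # informative column: always A
uniform = (0.25, 0.25, 0.25, 0.25)  # uninformative column: same as background

# Both alignments look equally "perfect" to the column score...
score_informative = euclidean_distance(all_a, all_a)
score_uninformative = euclidean_distance(uniform, uniform)

# ...but the columns differ sharply in how informative they are.
ic_all_a = information_content(all_a)
ic_uniform = information_content(uniform)

print(score_informative, score_uninformative)
print(ic_all_a, ic_uniform)
```

A score built only from the column-to-column comparison cannot separate the two cases. Telling them apart requires some additional term that reflects how far each column lies from the background, which is the kind of correction the abstract refers to.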

Year:  2011        PMID: 21543443      PMCID: PMC3106196          DOI: 10.1093/bioinformatics/btr257

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  15 in total

1.  TRANSFAC: an integrated system for gene expression regulation.

Authors:  E Wingender; X Chen; R Hehl; H Karas; I Liebich; V Matys; T Meinhardt; M Prüss; I Reuter; F Schacherer
Journal:  Nucleic Acids Res       Date:  2000-01-01       Impact factor: 16.971

2.  WebLogo: a sequence logo generator.

Authors:  Gavin E Crooks; Gary Hon; John-Marc Chandonia; Steven E Brenner
Journal:  Genome Res       Date:  2004-06       Impact factor: 9.043

3.  MotifPrototyper: a Bayesian profile model for motif families.

Authors:  Eric P Xing; Richard M Karp
Journal:  Proc Natl Acad Sci U S A       Date:  2004-07-13       Impact factor: 11.205

4.  Constrained binding site diversity within families of transcription factors enhances pattern discovery bioinformatics.

Authors:  Albin Sandelin; Wyeth W Wasserman
Journal:  J Mol Biol       Date:  2004-04-23       Impact factor: 5.469

5.  Sequence logos: a new way to display consensus sequences.

Authors:  T D Schneider; R M Stephens
Journal:  Nucleic Acids Res       Date:  1990-10-25       Impact factor: 16.971

6.  Metamotifs--a generative model for building families of nucleotide position weight matrices.

Authors:  Matias Piipari; Thomas A Down; Tim Jp Hubbard
Journal:  BMC Bioinformatics       Date:  2010-06-25       Impact factor: 3.169

7.  Assessing computational tools for the discovery of transcription factor binding sites.

Authors:  Martin Tompa; Nan Li; Timothy L Bailey; George M Church; Bart De Moor; Eleazar Eskin; Alexander V Favorov; Martin C Frith; Yutao Fu; W James Kent; Vsevolod J Makeev; Andrei A Mironov; William Stafford Noble; Giulio Pavesi; Graziano Pesole; Mireille Régnier; Nicolas Simonis; Saurabh Sinha; Gert Thijs; Jacques van Helden; Mathias Vandenbogaert; Zhiping Weng; Christopher Workman; Chun Ye; Zhou Zhu
Journal:  Nat Biotechnol       Date:  2005-01       Impact factor: 54.908

8.  Quantifying similarity between motifs.

Authors:  Shobhit Gupta; John A Stamatoyannopoulos; Timothy L Bailey; William Stafford Noble
Journal:  Genome Biol       Date:  2007       Impact factor: 13.583

9.  An improved map of conserved regulatory sites for Saccharomyces cerevisiae.

Authors:  Kenzie D MacIsaac; Ting Wang; D Benjamin Gordon; David K Gifford; Gary D Stormo; Ernest Fraenkel
Journal:  BMC Bioinformatics       Date:  2006-03-07       Impact factor: 3.169

10.  STAMP: a web tool for exploring DNA-binding motif similarities.

Authors:  Shaun Mahony; Panayiotis V Benos
Journal:  Nucleic Acids Res       Date:  2007-05-03       Impact factor: 16.971

  23 in total

1.  Discriminative motif analysis of high-throughput dataset.

Authors:  Zizhen Yao; Kyle L Macquarrie; Abraham P Fong; Stephen J Tapscott; Walter L Ruzzo; Robert C Gentleman
Journal:  Bioinformatics       Date:  2013-10-25       Impact factor: 6.937

2.  Improving MEME via a two-tiered significance analysis.

Authors:  Emi Tanaka; Timothy L Bailey; Uri Keich
Journal:  Bioinformatics       Date:  2014-03-24       Impact factor: 6.937

3.  RSAT matrix-clustering: dynamic exploration and redundancy reduction of transcription factor binding motif collections.

Authors:  Jaime Abraham Castro-Mondragon; Sébastien Jaeger; Denis Thieffry; Morgane Thomas-Chollier; Jacques van Helden
Journal:  Nucleic Acids Res       Date:  2017-07-27       Impact factor: 16.971

4.  Regulation of fatty acid biosynthesis by the global regulator CcpA and the local regulator FabT in Streptococcus mutans.

Authors:  R C Faustoferri; C J Hubbard; B Santiago; A A Buckley; T B Seifert; R G Quivey
Journal:  Mol Oral Microbiol       Date:  2014-10-27       Impact factor: 3.563

5.  De novo prediction of DNA-binding specificities for Cys2His2 zinc finger proteins.

Authors:  Anton V Persikov; Mona Singh
Journal:  Nucleic Acids Res       Date:  2013-10-03       Impact factor: 16.971

6.  Phosphorylated Lamin A/C in the Nuclear Interior Binds Active Enhancers Associated with Abnormal Transcription in Progeria.

Authors:  Kohta Ikegami; Stefano Secchia; Omar Almakki; Jason D Lieb; Ivan P Moskowitz
Journal:  Dev Cell       Date:  2020-03-23       Impact factor: 12.270

7.  Assessment of algorithms for inferring positional weight matrix motifs of transcription factor binding sites using protein binding microarray data.

Authors:  Yaron Orenstein; Chaim Linhart; Ron Shamir
Journal:  PLoS One       Date:  2012-09-28       Impact factor: 3.240

8.  DBD2BS: connecting a DNA-binding protein with its binding sites.

Authors:  Ting-Ying Chien; Chih-Kang Lin; Chih-Wei Lin; Yi-Zhong Weng; Chien-Yu Chen; Darby Tien-Hao Chang
Journal:  Nucleic Acids Res       Date:  2012-06-11       Impact factor: 16.971

9.  Deep RNA sequencing reveals novel cardiac transcriptomic signatures for physiological and pathological hypertrophy.

Authors:  Hong Ki Song; Seong-Eui Hong; Taeyong Kim; Do Han Kim
Journal:  PLoS One       Date:  2012-04-16       Impact factor: 3.240

10.  Determination and inference of eukaryotic transcription factor sequence specificity.

Authors:  Matthew T Weirauch; Ally Yang; Mihai Albu; Atina G Cote; Alejandro Montenegro-Montero; Philipp Drewe; Hamed S Najafabadi; Samuel A Lambert; Ishminder Mann; Kate Cook; Hong Zheng; Alejandra Goity; Harm van Bakel; Jean-Claude Lozano; Mary Galli; Mathew G Lewsey; Eryong Huang; Tuhin Mukherjee; Xiaoting Chen; John S Reece-Hoyes; Sridhar Govindarajan; Gad Shaulsky; Albertha J M Walhout; François-Yves Bouget; Gunnar Ratsch; Luis F Larrondo; Joseph R Ecker; Timothy R Hughes
Journal:  Cell       Date:  2014-09-11       Impact factor: 41.582
