Emi Tanaka, Timothy L Bailey, Uri Keich. School of Mathematics and Statistics, University of Sydney, Sydney 2006; School of Mathematics and Applied Statistics, University of Wollongong, Wollongong 2522, New South Wales; and Institute for Molecular Bioscience, University of Queensland, Brisbane, Queensland 4072, Australia.
Abstract
MOTIVATION: With over 9000 unique users recorded in the first half of 2013, MEME is one of the most popular motif-finding tools available. Reliable estimates of the statistical significance of motifs can greatly increase the usefulness of any motif finder. By analogy, it is difficult to imagine evaluating a BLAST result without its accompanying E-value. Currently MEME evaluates its EM-generated candidate motifs using an extension of BLAST's E-value to the motif-finding context. Although we previously indicated the drawbacks of MEME's current significance evaluation, we did not offer a practical substitute suited for its needs, especially because MEME also relies on the E-value internally to rank competing candidate motifs.
RESULTS: Here we offer a two-tiered significance analysis that can replace the E-value in selecting the best candidate motif and in evaluating its overall statistical significance. We show that our new approach could substantially improve MEME's motif-finding performance and would also provide the user with a reliable significance analysis. In addition, for large input sets, our new approach is in fact faster than the currently implemented E-value analysis.