Alice Cheng (1), Charles E Grant (1), William S Noble (1,2), Timothy L Bailey (3)
1. Department of Genome Sciences, University of Washington, Seattle, WA, USA.
2. Department of Computer Science and Engineering, University of Washington, Seattle, WA, USA.
3. Department of Pharmacology, University of Nevada, Reno, NV, USA.
Abstract
MOTIVATION: Post-translational modifications (PTMs) of proteins are associated with many significant biological functions and can be identified in high throughput using tandem mass spectrometry. Many PTMs are associated with short sequence patterns called 'motifs' that help localize the modifying enzyme. Accordingly, many algorithms have been designed to identify these motifs from mass spectrometry data. Accurate statistical confidence estimates for discovered motifs are critically important for proper interpretation and in the design of downstream experimental validation.
RESULTS: We describe a method for assigning statistical confidence estimates to PTM motifs, and we demonstrate that this method provides accurate P-values on both simulated and real data. Our methods are implemented in MoMo, a software tool for discovering motifs among sets of PTMs that we make available as a web server and as downloadable source code. MoMo re-implements the two most widely used PTM motif discovery algorithms, motif-x and MoDL, while offering many enhancements. Relative to motif-x, MoMo offers improved statistical confidence estimates and more accurate calculation of motif scores. The MoMo web server offers more proteome databases, more input formats, larger inputs and longer running times than the motif-x web server. Finally, our study demonstrates that the confidence estimates produced by motif-x are inaccurate. This inaccuracy stems in part from the common practice of drawing 'background' peptides from an unshuffled proteome database. Our results thus suggest that many of the papers that use motif-x to find motifs may be reporting results that lack statistical support.
AVAILABILITY AND IMPLEMENTATION: The MoMo web server and source code are provided at http://meme-suite.org.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.