Non-parametric Bayesian approach to post-translational modification refinement of predictions from tandem mass spectrometry.

Clement Chung, Andrew Emili, Brendan J Frey.

Abstract

MOTIVATION: Tandem mass spectrometry (MS/MS) is a dominant approach for large-scale high-throughput post-translational modification (PTM) profiling. Although current state-of-the-art blind PTM spectral analysis algorithms can predict thousands of modified peptides (PTM predictions) in an MS/MS experiment, a significant percentage of these predictions have inaccurate modification mass estimates and false modification site assignments. This problem can be addressed by post-processing the PTM predictions with a PTM refinement algorithm. We developed a novel PTM refinement algorithm, iPTMClust, which extends a recently introduced PTM refinement algorithm PTMClust and uses a non-parametric Bayesian model to better account for uncertainties in the quantity and identity of PTMs in the input data. The use of this new modeling approach enables iPTMClust to provide a confidence score per modification site that allows fine-tuning and interpreting resulting PTM predictions.
RESULTS: The primary goal behind iPTMClust is to improve the quality of the PTM predictions. First, to demonstrate that iPTMClust produces sensible and accurate cluster assignments, we compare it with k-means clustering, mixtures of Gaussians (MOG) and PTMClust on a synthetically generated PTM dataset. Second, in two separate benchmark experiments using PTM data taken from a phosphopeptide and a yeast proteome study, we show that iPTMClust outperforms state-of-the-art PTM prediction and refinement algorithms, including PTMClust. Finally, we illustrate the general applicability of our new approach on a set of human chromatin protein complex data, where we are able to identify putative novel modified peptides and modification sites that may be involved in the formation and regulation of protein complexes. Our method facilitates accurate PTM profiling, which is an important step in understanding the mechanisms behind many biological processes and should be an integral part of any proteomic study.
AVAILABILITY: Our algorithm is implemented in Java and is freely available for academic use from http://genes.toronto.edu.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
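
ILLUSTRATION: The abstract does not specify the iPTMClust model, so the following is not the authors' algorithm. It is a minimal sketch of the general non-parametric idea (the number of clusters is not fixed in advance) using DP-means, a hard-assignment limit of a Dirichlet process mixture, applied to one-dimensional modification mass shifts. The function name `dp_means`, the penalty `lam`, and the example mass values are all assumptions made for illustration.

```python
def dp_means(masses, lam, max_iter=100):
    """Cluster 1-D mass shifts without fixing the number of clusters.

    A new cluster is opened whenever a point's squared distance to every
    existing centroid exceeds the penalty `lam` (hypothetical parameter).
    """
    centroids = [sum(masses) / len(masses)]  # start with one global cluster
    assignments = [0] * len(masses)
    for _ in range(max_iter):
        changed = False
        for i, x in enumerate(masses):
            dists = [(x - c) ** 2 for c in centroids]
            k = min(range(len(centroids)), key=dists.__getitem__)
            if dists[k] > lam:
                # No centroid is close enough: open a new cluster at x.
                centroids.append(x)
                k = len(centroids) - 1
            if assignments[i] != k:
                assignments[i] = k
                changed = True
        # Recompute centroids as cluster means and drop empty clusters.
        members = {}
        for i, k in enumerate(assignments):
            members.setdefault(k, []).append(masses[i])
        kept = sorted(members)
        remap = {old: new for new, old in enumerate(kept)}
        centroids = [sum(members[k]) / len(members[k]) for k in kept]
        assignments = [remap[k] for k in assignments]
        if not changed:
            break
    return centroids, assignments


# Made-up noisy mass shifts (Da) near oxidation (+15.995),
# acetylation (+42.011) and phosphorylation (+79.966).
shifts = [15.99, 16.01, 42.00, 42.02, 79.95, 79.97, 79.98]
centroids, labels = dp_means(shifts, lam=1.0)
print(len(centroids), labels)
```

With k-means the number of clusters would have to be supplied up front; here the penalty `lam` controls how readily new clusters appear, which is the property that makes this family of models attractive when the number of distinct PTMs in the data is unknown.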

Year:  2013        PMID: 23419374     DOI: 10.1093/bioinformatics/btt056

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  3 in total

Review 1.  Proteomic approaches and identification of novel therapeutic targets for alcoholism.

Authors:  Giorgio Gorini; R Adron Harris; R Dayne Mayfield
Journal:  Neuropsychopharmacology       Date:  2013-07-31       Impact factor: 7.853

Review 2.  Challenges and Opportunities for Bayesian Statistics in Proteomics.

Authors:  Oliver M Crook; Chun-Wa Chung; Charlotte M Deane
Journal:  J Proteome Res       Date:  2022-03-08       Impact factor: 4.466

3.  Characterization of Proteoforms with Unknown Post-translational Modifications Using the MIScore.

Authors:  Qiang Kou; Binhai Zhu; Si Wu; Charles Ansong; Nikola Tolić; Ljiljana Paša-Tolić; Xiaowen Liu
Journal:  J Proteome Res       Date:  2016-07-01       Impact factor: 4.466
