William E Fondrie, William S Noble.
Abstract
Proteomics studies rely on the accurate assignment of peptides to the acquired tandem mass spectra, a task where machine learning algorithms have proven invaluable. We describe mokapot, which provides a flexible semisupervised learning algorithm that allows for highly customized analyses. We demonstrate some of the unique features of mokapot by improving the detection of RNA-cross-linked peptides from an analysis of RNA-binding proteins and increasing the consistency of peptide detection in a single-cell proteomics study.
Keywords: SVM; bioinformatics; confidence estimation; machine learning; peptide identification; percolator; proteomics; single-cell mass spectrometry; support vector machine; tandem mass spectrometry
Year: 2021 PMID: 33596079 PMCID: PMC8022319 DOI: 10.1021/acs.jproteome.0c01010
Source DB: PubMed Journal: J Proteome Res ISSN: 1535-3893 Impact factor: 5.370
Figure 1. Mokapot improves the detection of RNA-cross-linked peptides from open modification search results. The nonlinear XGBoost classifier resulted in the detection of more modified (a) PSMs, (b) peptides, and (c) proteins over a linear SVM (the default model in mokapot) or the MSFragger E-value. (d) The XGBoost classifier gained PSMs over the linear SVM with mass shifts that correspond to known modifications at 1% FDR (q-value ≤ 0.01).
Figure 2. Joint models improve the power and consistency of peptide detection from single-cell proteomics experiments. (a) Joint models detect more PSMs, peptides, and proteins at 1% FDR than when experiments are analyzed individually. The detected (b) peptides and (c) proteins are more consistent across experiments using joint models in comparison to analyzing each experiment individually. In both cases, the joint models are comparable to using a static model but without the requirement of a training data set.
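
Both figure captions report detections at 1% FDR (q-value ≤ 0.01). As a rough illustration of what that threshold means, the sketch below estimates q-values by target-decoy competition, the general approach used by Percolator-style tools. It is not mokapot's implementation: the function name `estimate_qvalues`, the (decoys + 1) / targets estimator, the assumption that higher scores are better, and the toy scores are all assumptions made for illustration, and ties between scores are not handled specially.

```python
def estimate_qvalues(scores, is_target):
    """Estimate q-values from target-decoy competition (illustrative sketch).

    scores    -- one score per PSM; higher is assumed to be better
    is_target -- True for a target PSM, False for a decoy PSM
    """
    # Rank PSMs from best to worst score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    # Estimated FDR at each rank: (decoys + 1) / targets, capped at 1.
    targets = decoys = 0
    fdr = []
    for i in order:
        if is_target[i]:
            targets += 1
        else:
            decoys += 1
        fdr.append(min(1.0, (decoys + 1) / max(targets, 1)))

    # The q-value of a PSM is the smallest FDR at its own threshold
    # or at any more permissive threshold.
    qvalues = [0.0] * len(scores)
    running_min = 1.0
    for rank in range(len(order) - 1, -1, -1):
        running_min = min(running_min, fdr[rank])
        qvalues[order[rank]] = running_min
    return qvalues


# Hypothetical toy data: 200 high-scoring targets and 5 low-scoring decoys.
toy_scores = [1000.0 - k for k in range(200)] + [5.0, 4.0, 3.0, 2.0, 1.0]
toy_labels = [True] * 200 + [False] * 5
toy_qvalues = estimate_qvalues(toy_scores, toy_labels)

# Count the target PSMs accepted at 1% FDR.
accepted = sum(1 for q, t in zip(toy_qvalues, toy_labels) if t and q <= 0.01)
```

A rescoring model such as the linear SVM or XGBoost classifier mentioned in Figure 1 would supply the `scores` here; a better-separating score leaves more targets above the decoys and so yields more detections at the same q-value threshold.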