
Bayesian nonparametric model for the validation of peptide identification in shotgun proteomics.

Jiyang Zhang, Jie Ma, Lei Dou, Songfeng Wu, Xiaohong Qian, Hongwei Xie, Yunping Zhu, Fuchu He.

Abstract

Tandem mass spectrometry combined with database searching allows high-throughput identification of peptides in shotgun proteomics. However, the validation of database search results, a problem for which many solutions have been proposed, can still be improved in aspects such as the sensitivity, specificity, and generalizability of the validation algorithms. Here we developed a Bayesian nonparametric (BNP) model for the validation of database search results that incorporates several popular techniques in statistical learning: compression of the feature space with a linear discriminant function, flexible nonparametric estimation of probability density functions to handle the variable probability structure of complex problems, and the Bayesian method for calculating posterior probabilities. Importantly, the BNP model is naturally compatible with the popular target-decoy database search strategy. We tested the BNP model on data sets from standard proteins and from real, complex samples acquired on multiple MS platforms and compared it with PeptideProphet, the cutoff-based method, and a simple nonparametric method that we proposed previously. The BNP model showed superior sensitivity and generalizability on all data sets searched. Some high-quality matches that had been filtered out by other methods were detected and assigned high probabilities by the BNP model. Thus, the BNP model has the potential to validate database search results effectively and extract more information from MS/MS data.
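
The three ingredients named in the abstract (a linear discriminant function, nonparametric density estimation, and a Bayesian posterior) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a generic target-decoy setup in which decoy matches stand in for incorrect identifications, uses a plain Gaussian kernel density estimate, and takes the prior fraction of incorrect target matches to be the decoy-to-target count ratio. All function names, the bandwidth, and the toy scores are hypothetical.

```python
import math


def discriminant_score(features, weights, bias=0.0):
    """Compress a feature vector into one score with a linear function.

    The weights are placeholders; in practice they would be fitted to data.
    """
    return sum(w * x for w, x in zip(weights, features)) + bias


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density of `samples`."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def posterior_correct(score, target_scores, decoy_scores, bandwidth=0.5):
    """Estimate P(correct | score) under the assumptions stated above.

    Target scores are treated as a mixture of correct and incorrect
    matches; decoy scores model the incorrect component alone.
    """
    f_mix = gaussian_kde(target_scores, bandwidth)
    f_null = gaussian_kde(decoy_scores, bandwidth)
    # Assumed prior: each decoy match mirrors one incorrect target match.
    pi0 = min(1.0, len(decoy_scores) / len(target_scores))
    mix = f_mix(score)
    if mix == 0.0:
        return 0.0
    p = 1.0 - pi0 * f_null(score) / mix
    return max(0.0, min(1.0, p))  # clip estimation noise into [0, 1]


# Toy discriminant scores, invented for illustration only.
decoy_scores = [-1.2, -0.8, -0.5, -0.1, 0.2, 0.4]
target_scores = [-1.0, -0.6, -0.3, 0.0, 0.3, 0.5,
                 3.0, 3.4, 3.8, 4.1, 4.5, 5.0]

p_high = posterior_correct(4.0, target_scores, decoy_scores)
p_low = posterior_correct(-0.5, target_scores, decoy_scores)
```

With these toy scores, a match scoring near the upper cluster of target scores receives a posterior close to 1, while a match scoring inside the decoy range receives a posterior close to 0. A fixed kernel bandwidth is the simplest choice here; a real analysis would select it from the data.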


Year:  2008        PMID: 19005226      PMCID: PMC2649816          DOI: 10.1074/mcp.M700558-MCP200

Source DB:  PubMed          Journal:  Mol Cell Proteomics        ISSN: 1535-9476            Impact factor:   5.911



