
Unifying protein inference and peptide identification with feedback to update consistency between peptides.

Jinhong Shi, Bolin Chen, Fang-Xiang Wu.

Abstract

We first propose a new method for processing peptide identification reports from database search engines. Building on it, we develop a method that unifies protein inference and peptide identification by adding feedback from protein inference to peptide identification. The feedback is a list of high-confidence proteins, which is used to update an adjacency matrix between peptides. The adjacency matrix is used to regularize peptide scores. Logistic regression (LR) is then applied to the regularized scores to compute the probability that each peptide identification is correct, and protein scores are calculated from these LR peptide probabilities. Instead of selecting only the best peptide match for each MS/MS spectrum, we select multiple peptides. Tests on two datasets show that the proposed method robustly assigns accurate probabilities to peptides and has higher discrimination power than PeptideProphet in distinguishing correctly from incorrectly identified peptides. In addition, at the same false positive rate, our method infers more true positive proteins and fewer false positive proteins than ProteinProphet. The coverage of inferred proteins also increases significantly, owing to the selection of multiple peptides for each MS/MS spectrum and the improvement of their scores by the feedback from the inferred proteins.
© 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
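
The feedback loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the adjacency rule, the regularization formula, the LR coefficients, or the protein scoring function, so everything below is an assumption made for illustration. Here two peptides are treated as adjacent when they share a high-confidence protein, a regularized score is a blend of a peptide's own score and the mean score of its neighbours, the logistic coefficients are fixed hypothetical values rather than fitted ones, and a protein score is the probability that at least one of its peptides is correct. All function and parameter names are hypothetical.

```python
import math


def build_adjacency(peptide_proteins, confident_proteins):
    """Assumed rule: link two peptides that share a high-confidence protein."""
    peptides = list(peptide_proteins)
    adjacency = {p: set() for p in peptides}
    for i, a in enumerate(peptides):
        for b in peptides[i + 1:]:
            if peptide_proteins[a] & peptide_proteins[b] & confident_proteins:
                adjacency[a].add(b)
                adjacency[b].add(a)
    return adjacency


def regularize(scores, adjacency, alpha=0.5):
    """Assumed rule: blend each raw score with the mean raw score of its neighbours."""
    regularized = {}
    for pep, score in scores.items():
        neighbours = adjacency[pep]
        if neighbours:
            mean_nb = sum(scores[n] for n in neighbours) / len(neighbours)
            regularized[pep] = (1 - alpha) * score + alpha * mean_nb
        else:
            regularized[pep] = score
    return regularized


def peptide_probability(score, weight=2.0, bias=-3.0):
    """Logistic function with hypothetical, hand-picked coefficients (not fitted)."""
    return 1.0 / (1.0 + math.exp(-(weight * score + bias)))


def protein_scores(peptide_probs, peptide_proteins):
    """Assumed rule: P(protein) = 1 - product over its peptides of (1 - p)."""
    miss = {}
    for pep, proteins in peptide_proteins.items():
        for prot in proteins:
            miss[prot] = miss.get(prot, 1.0) * (1.0 - peptide_probs[pep])
    return {prot: 1.0 - m for prot, m in miss.items()}


def run_feedback(scores, peptide_proteins, threshold=0.9, max_rounds=5):
    """Iterate: score peptides, infer proteins, feed confident proteins back."""
    confident = set()
    for _ in range(max_rounds):
        adjacency = build_adjacency(peptide_proteins, confident)
        regularized = regularize(scores, adjacency)
        probs = {p: peptide_probability(s) for p, s in regularized.items()}
        prots = protein_scores(probs, peptide_proteins)
        updated = {p for p, v in prots.items() if v >= threshold}
        if updated == confident:  # the feedback list has stabilized
            break
        confident = updated
    return probs, prots


if __name__ == "__main__":
    # Toy data: made-up peptide names, scores, and protein assignments.
    toy_scores = {"PEPA": 3.0, "PEPB": 2.5, "PEPC": 1.0, "PEPD": 0.5}
    toy_map = {"PEPA": {"P1"}, "PEPB": {"P1"}, "PEPC": {"P1"}, "PEPD": {"P2"}}
    peptide_probs, protein_probs = run_feedback(toy_scores, toy_map)
    print(peptide_probs)
    print(protein_probs)
```

In this toy setting, a low-scoring peptide that maps to a confidently inferred protein is pulled up by its neighbours, while a peptide whose only protein is not confident keeps its original probability. That is the qualitative effect the abstract attributes to the feedback, shown here only under the assumed rules.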

Year:  2012        PMID: 23111981     DOI: 10.1002/pmic.201200338

Source DB:  PubMed          Journal:  Proteomics        ISSN: 1615-9853            Impact factor:   3.984


  2 in total

1.  Improving protein identification from tandem mass spectrometry data by one-step methods and integrating data from other platforms. (Review)

Authors:  Sinjini Sikdar; Ryan Gill; Susmita Datta
Journal:  Brief Bioinform       Date:  2015-07-03       Impact factor: 11.622

2.  A fast and high performance multiple data integration algorithm for identifying human disease genes.

Authors:  Bolin Chen; Min Li; Jianxin Wang; Xuequn Shang; Fang-Xiang Wu
Journal:  BMC Med Genomics       Date:  2015-09-23       Impact factor: 3.063
