
A Scalable Approach for Protein False Discovery Rate Estimation in Large Proteomic Data Sets.

Mikhail M Savitski, Mathias Wilhelm, Hannes Hahne, Bernhard Kuster, Marcus Bantscheff.

Abstract

Calculating the number of confidently identified proteins and estimating false discovery rate (FDR) is a challenge when analyzing very large proteomic data sets such as entire human proteomes. Biological and technical heterogeneity in proteomic experiments further adds to the challenge, and there are strong differences in opinion regarding the conceptual validity of a protein FDR and no consensus regarding the methodology for protein FDR determination. There are also limitations inherent to the widely used classic target-decoy strategy that show particularly when analyzing very large data sets and that lead to a strong over-representation of decoy identifications. In this study, we investigated the merits of the classic as well as a novel target-decoy-based protein FDR estimation approach, taking advantage of a heterogeneous data collection comprising ∼19,000 LC-MS/MS runs deposited in ProteomicsDB (https://www.proteomicsdb.org). The "picked" protein FDR approach treats target and decoy sequences of the same protein as a pair rather than as individual entities and chooses either the target or the decoy sequence depending on which receives the higher score. We investigated the performance of this approach in combination with q-value-based peptide scoring to normalize sample-, instrument-, and search engine-specific differences. The "picked" target-decoy strategy performed best when protein scoring was based on the best peptide q-value for each protein, yielding a stable number of true-positive protein identifications over a wide range of q-value thresholds. We show that this simple and unbiased strategy eliminates a conceptual issue in the commonly used "classic" protein FDR approach that causes overprediction of false-positive protein identifications in large data sets. The approach scales from small to very large data sets without losing performance, consistently increases the number of true-positive protein identifications, and is readily implemented in proteomics analysis software.
© 2015 by The American Society for Biochemistry and Molecular Biology, Inc.
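
The contrast between the "classic" and "picked" strategies described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The names `ProteinHit`, `pick_pairs`, `fdr_curve`, and `accepted_targets` are hypothetical, the toy scores are invented for illustration, and two details are assumptions not stated in the abstract: the FDR estimate at each rank is taken as decoys divided by targets, and a tie between a target and its decoy is resolved in favor of the target.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ProteinHit:
    protein_id: str        # shared by a target and its decoy counterpart
    is_decoy: bool
    best_peptide_q: float  # best (lowest) peptide q-value; lower is better


def pick_pairs(hits: List[ProteinHit]) -> List[ProteinHit]:
    """'Picked' step: keep only the better-scoring member of each pair.

    Assumption: on a tie the target is kept (False sorts before True).
    """
    best: Dict[str, ProteinHit] = {}
    for h in hits:
        cur = best.get(h.protein_id)
        if cur is None or (h.best_peptide_q, h.is_decoy) < (
            cur.best_peptide_q, cur.is_decoy
        ):
            best[h.protein_id] = h
    return list(best.values())


def fdr_curve(hits: List[ProteinHit]) -> List[Tuple[ProteinHit, float]]:
    """Rank hits from best to worst and estimate FDR at every rank.

    Assumption: FDR is estimated as (#decoys) / (#targets) so far.
    """
    ranked = sorted(hits, key=lambda h: (h.best_peptide_q, h.is_decoy))
    curve = []
    targets = decoys = 0
    for h in ranked:
        if h.is_decoy:
            decoys += 1
        else:
            targets += 1
        curve.append((h, decoys / targets if targets else 1.0))
    return curve


def accepted_targets(hits: List[ProteinHit], max_fdr: float) -> List[str]:
    """Return target IDs down to the deepest rank with FDR <= max_fdr."""
    curve = fdr_curve(hits)
    cutoff = -1
    for i, (_, fdr) in enumerate(curve):
        if fdr <= max_fdr:
            cutoff = i
    return [h.protein_id for h, _ in curve[: cutoff + 1] if not h.is_decoy]


# Toy data (invented): three well-identified proteins whose decoy
# counterparts also pick up weaker matches, plus one weaker target.
hits = [
    ProteinHit("P1", False, 0.001), ProteinHit("P1", True, 0.010),
    ProteinHit("P2", False, 0.002), ProteinHit("P2", True, 0.011),
    ProteinHit("P3", False, 0.003), ProteinHit("P3", True, 0.012),
    ProteinHit("P4", False, 0.020),
]

classic = accepted_targets(hits, max_fdr=0.1)              # all hits compete
picked = accepted_targets(pick_pairs(hits), max_fdr=0.1)   # one hit per pair

print(classic)  # ['P1', 'P2', 'P3']
print(picked)   # ['P1', 'P2', 'P3', 'P4']
```

In this toy case the decoys of proteins that are already identified as targets inflate the classic estimate and push P4 below the threshold. Once each pair is reduced to its better-scoring member, those decoys no longer count against the list.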

Year: 2015    PMID: 25987413    PMCID: PMC4563723    DOI: 10.1074/mcp.M114.046995

Source DB: PubMed    Journal: Mol Cell Proteomics    ISSN: 1535-9476    Impact factor: 5.911


