
Statistical calibration of the SEQUEST XCorr function.

Aaron A Klammer, Christopher Y Park, William Stafford Noble.

Abstract

Obtaining accurate peptide identifications from shotgun proteomics liquid chromatography tandem mass spectrometry (LC-MS/MS) experiments requires a score function that consistently ranks correct peptide-spectrum matches (PSMs) above incorrect matches. We have observed that, for the Sequest score function Xcorr, the inability to discriminate between correct and incorrect PSMs is due in part to spectrum-specific properties of the score distribution. In other words, some spectra score well regardless of which peptides they are scored against, and other spectra score well because they are scored against a large number of peptides. We describe a protocol for calibrating PSM score functions, and we demonstrate its application to Xcorr and the preliminary Sequest score function Sp. The protocol accounts for spectrum- and peptide-specific effects by calculating p values for each spectrum individually, using only that spectrum's score distribution. We demonstrate that these calculated p values are uniform under a null distribution and therefore accurately measure significance. These p values can be used to estimate the false discovery rate, thereby eliminating the need for an extra search against a decoy database. In addition, we show that the p values are better calibrated than their underlying scores; consequently, when ranking top-scoring PSMs from multiple spectra, p values are better at discriminating between correct and incorrect PSMs. The calibration protocol is generally applicable to any PSM score function for which an appropriate parametric family can be identified.
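The calibration idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes, purely for illustration, that the scores one spectrum receives against many candidate peptides can be modeled by a Gumbel distribution fitted by the method of moments; the abstract does not say which parametric family the authors use. The Benjamini-Hochberg step is likewise one standard way to turn p values into false discovery rate estimates and is an assumption here. All function names are hypothetical.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel(scores):
    """Method-of-moments fit of a Gumbel distribution (assumed family)."""
    mean = statistics.fmean(scores)
    sd = statistics.stdev(scores)
    beta = sd * math.sqrt(6) / math.pi   # scale parameter
    mu = mean - EULER_GAMMA * beta       # location parameter
    return mu, beta


def gumbel_p_value(score, mu, beta):
    """Upper-tail probability P(X >= score) under the fitted Gumbel."""
    z = (score - mu) / beta
    if z < -700:                         # exp(-z) would overflow; p is 1
        return 1.0
    # 1 - exp(-exp(-z)), computed stably for small tail probabilities
    return -math.expm1(-math.exp(-z))


def spectrum_p_value(top_score, candidate_scores):
    """Calibrate one spectrum's top score using only its own score distribution."""
    mu, beta = fit_gumbel(candidate_scores)
    return gumbel_p_value(top_score, mu, beta)


def benjamini_hochberg(p_values):
    """Convert p values from many spectra into step-up adjusted q values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotone q values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values
```

Because each spectrum is fitted separately, a spectrum that scores well against every peptide gets a large location parameter and therefore an unremarkable p value for its top match, which is the spectrum-specific effect the abstract describes.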

Year:  2009        PMID: 19275164      PMCID: PMC2807930          DOI: 10.1021/pr8011107

Source DB:  PubMed          Journal:  J Proteome Res        ISSN: 1535-3893            Impact factor:   4.466


