A hypergeometric probability model for protein identification and validation using tandem mass spectral data and protein sequence databases.

Rovshan G Sadygov, John R Yates.

Abstract

We present a new probability-based method for protein identification using tandem mass spectra and protein databases. The method employs a hypergeometric distribution to model frequencies of matches between fragment ions predicted for peptide sequences with a specific (M + H)+ value (at some mass tolerance) in a protein sequence database and an experimental tandem mass spectrum. The hypergeometric distribution constitutes the null hypothesis: all peptide matches to a tandem mass spectrum are random. It is used to generate a score characterizing the randomness of a database sequence match to an experimental tandem mass spectrum and to determine the level of significance of the null hypothesis. For each tandem mass spectrum and database search, the peptide with the lowest probability of being a random match to the spectrum is identified, and the corresponding level of significance of the null hypothesis is determined. To check the validity of the hypergeometric model in describing fragment ion matches, we used a χ² test. The distribution of frequencies and the corresponding hypergeometric probabilities are generated for each tandem mass spectrum. No proteolytic cleavage specificity is used to create the peptide sequences from the database. We do not use any empirical probabilities in this method. The scores generated by the hypergeometric model do not have a significant molecular weight bias and are reasonably independent of database size. The approach has been implemented in a database search algorithm, PEP_PROBE. Using a large set of tandem mass spectra derived from peptides created by digesting a collection of known proteins with four different proteases, a false positive rate of 5% is demonstrated.
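The scoring idea in the abstract can be illustrated with a minimal Python sketch of a hypergeometric tail probability. It is not the PEP_PROBE implementation. The parametrization and the function names are assumptions made for illustration: N is the total number of predicted fragment ions across all candidate peptides, K is how many of those match peaks in the spectrum, n is the number of fragment ions predicted for one candidate peptide, and k is how many of that peptide's fragments match.

```python
from math import comb


def hypergeom_pmf(k, N, K, n):
    """P(X = k) under a hypergeometric null (illustrative parametrization).

    N: total predicted fragment ions across all candidates (assumed meaning)
    K: fragment ions among the N that match a spectrum peak
    n: fragment ions predicted for the candidate peptide
    k: fragment ions of the candidate that match a spectrum peak
    """
    # Outside the support of the distribution the probability is zero.
    if k < max(0, n - (N - K)) or k > min(K, n):
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def hypergeom_tail(k, N, K, n):
    """P(X >= k): chance of at least k matches if all matches are random."""
    return sum(hypergeom_pmf(i, N, K, n) for i in range(k, min(K, n) + 1))


if __name__ == "__main__":
    # Hypothetical numbers, not taken from the paper.
    N, K, n = 2000, 300, 20
    for k in (3, 8, 12):
        print(k, hypergeom_tail(k, N, K, n))
```

Under this reading, the candidate with the smallest tail probability is the one least likely to be a random match. The exact quantities that the published method counts may differ from the assumed ones above.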

Year:  2003        PMID: 14572045     DOI: 10.1021/ac034157w

Source DB:  PubMed          Journal:  Anal Chem        ISSN: 0003-2700            Impact factor:   6.986


  58 in total

1.  Identification of best indicators of peptide-spectrum match using a permutation resampling approach.

Authors:  Malik N Akhtar; Bruce R Southey; Per E Andrén; Jonathan V Sweedler; Sandra L Rodriguez-Zas
Journal:  J Bioinform Comput Biol       Date:  2014-10       Impact factor: 1.122

2.  Host cell interactome of HIV-1 Rev includes RNA helicases involved in multiple facets of virus production.

Authors:  Souad Naji; Géza Ambrus; Peter Cimermančič; Jason R Reyes; Jeffrey R Johnson; Rebecca Filbrandt; Michael D Huber; Paul Vesely; Nevan J Krogan; John R Yates; Andrew C Saphire; Larry Gerace
Journal:  Mol Cell Proteomics       Date:  2011-12-15       Impact factor: 5.911

3.  Toward objective evaluation of proteomic algorithms.

Authors:  John R Yates; Sung Kyu Robin Park; Claire M Delahunty; Tao Xu; Jeffrey N Savas; Daniel Cociorva; Paulo Costa Carvalho
Journal:  Nat Methods       Date:  2012-04-27       Impact factor: 28.547

4.  Software Analysis of Uncorrelated MS1 Peaks for Discovery of Post-Translational Modifications.

Authors:  Bruce D Pascal; Graham M West; Catherina Scharager-Tapia; Ricardo Flefil; Tina Moroni; Pablo Martinez-Acedo; Patrick R Griffin; Anthony C Carvalloza
Journal:  J Am Soc Mass Spectrom       Date:  2015-08-12       Impact factor: 3.109

5.  MS2Grouper: group assessment and synthetic replacement of duplicate proteomic tandem mass spectra.

Authors:  David L Tabb; Melissa R Thompson; Gurusahai Khalsa-Moyers; Nathan C VerBerkmoes; W Hayes McDonald
Journal:  J Am Soc Mass Spectrom       Date:  2005-08       Impact factor: 3.109

6.  Improving gene annotation using peptide mass spectrometry.

Authors:  Stephen Tanner; Zhouxin Shen; Julio Ng; Liliana Florea; Roderic Guigó; Steven P Briggs; Vineet Bafna
Journal:  Genome Res       Date:  2006-12-22       Impact factor: 9.043

7.  MyriMatch: highly accurate tandem mass spectral peptide identification by multivariate hypergeometric analysis.

Authors:  David L Tabb; Christopher G Fernando; Matthew C Chambers
Journal:  J Proteome Res       Date:  2007-02       Impact factor: 4.466

8.  De novo peptide identification via tandem mass spectrometry and integer linear optimization.

Authors:  Peter A DiMaggio; Christodoulos A Floudas
Journal:  Anal Chem       Date:  2007-02-15       Impact factor: 6.986

9.  Differential protein expression by Porphyromonas gingivalis in response to secreted epithelial cell components.

Authors:  Yi Zhang; Tiansong Wang; Weibin Chen; Ozlem Yilmaz; Yoonsuk Park; Il-Young Jung; Murray Hackett; Richard J Lamont
Journal:  Proteomics       Date:  2005-01       Impact factor: 3.984

10.  MassMatrix: a database search program for rapid characterization of proteins and peptides from tandem mass spectrometry data.

Authors:  Hua Xu; Michael A Freitas
Journal:  Proteomics       Date:  2009-03       Impact factor: 3.984
