The generating function approach for peptide identification in spectral networks.

Adrian Guthals, Christina Boucher, Nuno Bandeira.

Abstract

Tandem mass spectrometry (MS/MS) has become the method of choice for protein identification and has launched a quest to identify every translated protein and peptide. However, computational developments have lagged behind the pace of modern data acquisition protocols and have become a major bottleneck in the proteomics analysis of complex samples. Today, attempts to identify MS/MS spectra against large databases (e.g., the human microbiome or a 6-frame translation of the human genome) face a search space that is 10-100 times larger than the human proteome, where it becomes increasingly challenging to separate true from false peptide matches. As a result, the sensitivity of current state-of-the-art database search methods drops by nearly 38%, to identification rates so low that almost 90% of all MS/MS spectra are left unidentified. We address this problem by extending the generating function approach to rigorously compute the joint spectral probability of multiple spectra being matched to peptides with overlapping sequences, thus enabling the confident assignment of higher significance to overlapping peptide-spectrum matches (PSMs). We find that these joint spectral probabilities can be several orders of magnitude more significant than those of individual PSMs, even in the ideal case where signal and noise peaks could be perfectly separated in each individual MS/MS spectrum. After benchmarking this approach on a typical lysate MS/MS dataset, we show that the proposed intersecting spectral probabilities for spectra from overlapping peptides improve peptide identification by 30-62%.
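
The abstract does not spell out the computation, so the following is only a minimal sketch of the single-spectrum spectral probability that the generating function approach builds on, not the joint probability over overlapping peptides proposed in the paper. It assumes a toy three-residue alphabet with nominal integer masses, a uniform residue probability, and a score equal to the number of peptide prefix masses that coincide with a spectrum peak. The names `TOY_MASSES`, `score_distribution`, and `spectral_probability` are hypothetical and chosen for illustration.

```python
from collections import defaultdict

# Assumption: toy alphabet with nominal integer residue masses (Da).
# A real search would use all 20 amino acids and a richer scoring model.
TOY_MASSES = {"G": 57, "A": 71, "S": 87}


def score_distribution(peaks, parent_mass, masses=TOY_MASSES):
    """Return {score: probability} over all peptides of total mass parent_mass.

    The score of a peptide is the number of its prefix masses (including the
    full peptide mass) found in `peaks`. The probability of a peptide is the
    product of its residue probabilities, assumed uniform here.
    """
    p = 1.0 / len(masses)  # assumed uniform residue probability
    # dist[m][s] = total probability of peptides with mass m and score s
    dist = [defaultdict(float) for _ in range(parent_mass + 1)]
    dist[0][0] = 1.0  # the empty peptide
    for m in range(1, parent_mass + 1):
        gain = 1 if m in peaks else 0  # prefix mass m matches a peak
        for a in masses.values():
            if a > m:
                continue
            # Extend every peptide of mass m - a by one residue of mass a.
            for s, prob in dist[m - a].items():
                dist[m][s + gain] += p * prob
    return dict(dist[parent_mass])


def spectral_probability(peaks, parent_mass, threshold, masses=TOY_MASSES):
    """Probability that a random peptide of this mass scores >= threshold."""
    dist = score_distribution(peaks, parent_mass, masses)
    return sum(prob for s, prob in dist.items() if s >= threshold)
```

For example, with peaks at masses 57 and 128 and a parent mass of 128, only the peptides GA and AG have the right mass; GA matches both peaks and AG matches one, so `spectral_probability({57, 128}, 128, 2)` evaluates to 1/9 under the uniform assumption. The dynamic program visits each (mass, score) pair once, which is what allows the score distribution to be computed without enumerating peptides.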

Keywords:  algorithms; computational molecular biology; databases; probability; statistical models

Year: 2014; PMID: 25423621; PMCID: PMC4425220; DOI: 10.1089/cmb.2014.0165

Source DB: PubMed; Journal: J Comput Biol; ISSN: 1066-5277; Impact factor: 1.479


