
A better scoring model for de novo peptide sequencing: the symmetric difference between explained and measured masses.

Thomas Tschager, Simon Rösch, Ludovic Gillet, Peter Widmayer.

Abstract

BACKGROUND: Given a peptide as a string of amino acids, the masses of all its prefixes and suffixes can be found by a trivial linear scan through the amino acid masses. The inverse problem is the ideal de novo peptide sequencing problem: Given all prefix and suffix masses, determine the string of amino acids. In biological reality, the given masses are measured in a lab experiment, and measurements by necessity are noisy. The (real, noisy) de novo peptide sequencing problem therefore has a noisy input: a few of the prefix and suffix masses of the peptide are missing and a few other masses are given in addition. For this setting, we ask for an amino acid string that explains the given masses as accurately as possible.
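
The linear scan mentioned in the background can be sketched as follows. This is only an illustration under simplifying assumptions: it uses nominal (integer) residue masses for three amino acids, ignores ion-type offsets such as added protons or water, and reports only proper, non-empty prefixes and suffixes. The names `NOMINAL_MASS` and `prefix_suffix_masses` are hypothetical and do not come from the paper.

```python
from itertools import accumulate

# Nominal (integer) residue masses in Da for a few amino acids.
# Assumption for illustration: real spectra need accurate masses and ion offsets.
NOMINAL_MASS = {"G": 57, "A": 71, "S": 87}


def prefix_suffix_masses(peptide, mass=NOMINAL_MASS):
    """Return the masses of all proper, non-empty prefixes and suffixes."""
    residues = [mass[aa] for aa in peptide]
    total = sum(residues)
    # Running sums give the prefix masses; drop the last one (the whole peptide).
    prefixes = list(accumulate(residues))[:-1]
    # Each suffix is the complement of a prefix with respect to the total mass.
    suffixes = [total - p for p in prefixes]
    return prefixes, suffixes


prefixes, suffixes = prefix_suffix_masses("GAS")
print(prefixes)  # [57, 128]
print(suffixes)  # [158, 87]
```

For the peptide GAS, the prefixes G and GA have masses 57 and 128, and the complementary suffixes AS and S have masses 158 and 87.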
RESULTS: Past approaches interpreted accuracy by searching for a string that explains as many masses as possible. We feel, however, that it is not only bad to not explain a mass that appears, but also to explain a mass that does not appear. We propose to minimize the symmetric difference between the set of given masses and the set of masses that the string explains. For this new optimization problem, we propose an efficient algorithm that computes both the best and the k best solutions. Proof-of-concept experiments on measurements of synthesized peptides show that our approach leads to better results compared to finding a string that explains as many given masses as possible.
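
The difference between the two scoring criteria can be illustrated with a small sketch. It is not the authors' algorithm: it only evaluates the two scores for given candidate strings, assumes exact matching of nominal (integer) masses with no tolerance, and uses hypothetical names (`explained_masses`, `explained_count`, `symmetric_difference_score`). The example relies on the fact that two glycine residues have the same nominal mass as one asparagine residue (2 x 57 = 114).

```python
from itertools import accumulate

# Nominal (integer) residue masses in Da; illustration only.
NOMINAL_MASS = {"G": 57, "A": 71, "S": 87, "N": 114}


def explained_masses(peptide, mass=NOMINAL_MASS):
    """Set of masses of all proper, non-empty prefixes and suffixes."""
    residues = [mass[aa] for aa in peptide]
    total = sum(residues)
    prefixes = list(accumulate(residues))[:-1]
    return set(prefixes) | {total - p for p in prefixes}


def explained_count(peptide, measured):
    """Earlier criterion: number of measured masses the string explains (higher is better)."""
    return len(explained_masses(peptide) & set(measured))


def symmetric_difference_score(peptide, measured):
    """Proposed criterion: unexplained measured masses plus explained
    masses that were not measured (lower is better)."""
    return len(explained_masses(peptide) ^ set(measured))


measured = {87, 114}
for candidate in ("NS", "GGS"):
    print(candidate,
          explained_count(candidate, measured),
          symmetric_difference_score(candidate, measured))
# NS 2 0
# GGS 2 2
```

Both candidates explain the two measured masses, so counting explained masses cannot separate them. The symmetric difference penalizes GGS for the two additional masses it predicts (57 and 144) that were not measured, and therefore prefers NS.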
CONCLUSIONS: We conclude that considering the symmetric difference as optimization goal can improve the identification rates for de novo peptide sequencing. A preliminary version of this work has been presented at WABI 2016.

Keywords:  Computational proteomics; De novo peptide sequencing; Mass spectrometry; Peptide identification

Year:  2017        PMID: 28603547      PMCID: PMC5464308          DOI: 10.1186/s13015-017-0104-1

Source DB:  PubMed          Journal:  Algorithms Mol Biol        ISSN: 1748-7188            Impact factor:   1.405


