
A ranking-based scoring function for peptide-spectrum matches.

Ari M Frank

Abstract

The analysis of the large volumes of tandem mass spectrometry (MS/MS) proteomics data now being generated relies on automated algorithms that identify peptides from their mass spectra. An essential component of these algorithms is the scoring function used to evaluate the quality of peptide-spectrum matches (PSMs). In this paper, we present a new approach to scoring PSMs. We argue that since this problem is at its core a ranking task (especially in the case of de novo sequencing), it can be solved effectively using machine learning ranking algorithms. We developed a new discriminative boosting-based approach to scoring. Our scoring models draw upon a large set of diverse feature functions that measure different qualities of PSMs. Our method improves the performance of our de novo sequencing algorithm beyond the current state of the art, and also greatly enhances the performance of database search programs. Furthermore, by increasing the efficiency of tag filtration and improving the sensitivity of PSM scoring, we make it practical to perform large-scale MS/MS analysis, such as a proteogenomic search of a six-frame translation of the human genome (in which we achieve a 15-fold reduction in running time and a 60% increase in the number of identified peptides, compared to the InsPecT database search tool). Our scoring function is incorporated into PepNovo+, which is available for download or can be run online at http://bix.ucsd.edu.
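To make the idea of treating PSM scoring as a ranking task concrete, here is a minimal sketch of pairwise ranking with boosted threshold stumps. The abstract says only that the approach is discriminative and boosting-based, so the RankBoost-style update below is one illustrative choice, not a description of the PepNovo+ implementation. The three features (fraction of matched fragment ions, fraction of explained intensity, absolute mass error), the toy values, and all function names are hypothetical.

```python
import math

# Hypothetical toy data: each pair is (features of the correct peptide,
# features of an incorrect candidate) for the same spectrum.
# Assumed features: [matched-ion fraction, explained intensity, |mass error|].
TOY_PAIRS = [
    ([0.9, 0.8, 0.1], [0.4, 0.3, 0.5]),
    ([0.7, 0.9, 0.2], [0.5, 0.4, 0.1]),
    ([0.8, 0.6, 0.3], [0.6, 0.5, 0.4]),
]


def stump(x, feature, threshold):
    """Weak ranker: 1.0 if the chosen feature exceeds the threshold, else 0.0."""
    return 1.0 if x[feature] > threshold else 0.0


def train_rank_stumps(pairs, n_rounds=5):
    """Fit a weighted sum of stumps so that 'better' outscores 'worse' in each pair.

    Returns a list of (alpha, feature, threshold) triples.
    """
    n_features = len(pairs[0][0])
    weights = [1.0 / len(pairs)] * len(pairs)  # distribution over pairs
    # Candidate stumps: every (feature, observed value) combination.
    candidates = sorted(
        {(f, x[f]) for pair in pairs for x in pair for f in range(n_features)}
    )
    model = []
    for _ in range(n_rounds):
        # Pick the stump whose weighted agreement with the pair order is largest.
        best_r, best_f, best_t = 0.0, None, None
        for f, t in candidates:
            r = sum(
                w * (stump(better, f, t) - stump(worse, f, t))
                for w, (better, worse) in zip(weights, pairs)
            )
            if abs(r) > abs(best_r):
                best_r, best_f, best_t = r, f, t
        if best_f is None:
            break  # no stump separates any pair
        # Clip r so the logarithm stays finite when a stump is perfect.
        r = max(min(best_r, 1.0 - 1e-9), -1.0 + 1e-9)
        alpha = 0.5 * math.log((1.0 + r) / (1.0 - r))
        model.append((alpha, best_f, best_t))
        # Down-weight pairs this stump orders correctly, then renormalize.
        weights = [
            w * math.exp(-alpha * (stump(b, best_f, best_t) - stump(wo, best_f, best_t)))
            for w, (b, wo) in zip(weights, pairs)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model


def score(model, x):
    """Final ranking score: weighted sum of the selected stumps."""
    return sum(alpha * stump(x, f, t) for alpha, f, t in model)


if __name__ == "__main__":
    model = train_rank_stumps(TOY_PAIRS)
    for better, worse in TOY_PAIRS:
        print(score(model, better), score(model, worse))
```

The point of the pairwise formulation is that the model is trained only to order candidates for the same spectrum, which matches how de novo sequencing and database search actually use a score: to pick the top-ranked peptide among alternatives.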


Year:  2009        PMID: 19231891      PMCID: PMC2692183          DOI: 10.1021/pr800678b

Source DB:  PubMed          Journal:  J Proteome Res        ISSN: 1535-3893            Impact factor:   4.466


