Systematic Evaluation of Protein Sequence Filtering Algorithms for Proteoform Identification Using Top-Down Mass Spectrometry.

Qiang Kou, Si Wu, Xiaowen Liu.

Abstract

Complex proteoforms contain various primary structural alterations resulting from variations in genes, RNA, and proteins. Top-down mass spectrometry is commonly used for analyzing complex proteoforms because it provides whole sequence information of the proteoforms. Proteoform identification by top-down mass spectral database search is a challenging computational problem because the types and/or locations of some alterations in target proteoforms are in general unknown. Although spectral alignment and mass graph alignment algorithms have been proposed for identifying proteoforms with unknown alterations, they are extremely slow to align millions of spectra against tens of thousands of protein sequences in high throughput proteome level analyses. Many software tools in this area combine efficient protein sequence filtering algorithms and spectral alignment algorithms to speed up database search. As a result, the performance of these tools heavily relies on the sensitivity and efficiency of their filtering algorithms. Here, we propose two efficient approximate spectrum-based filtering algorithms for proteoform identification. We evaluated the performances of the proposed algorithms and four existing ones on simulated and real top-down mass spectrometry data sets. Experiments showed that the proposed algorithms outperformed the existing ones for complex proteoform identification. In addition, combining the proposed filtering algorithms and mass graph alignment algorithms identified many proteoforms missed by ProSightPC in proteome-level proteoform analyses.
© 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

Keywords:  filtering algorithms; spectral identification; top-down mass spectrometry

Year: 2018    PMID: 29327814    PMCID: PMC5825287    DOI: 10.1002/pmic.201700306

Source DB: PubMed    Journal: Proteomics    ISSN: 1615-9853    Impact factor: 3.984


  5 in total

1.  A Markov Chain Monte Carlo Method for Estimating the Statistical Significance of Proteoform Identifications by Top-Down Mass Spectrometry.

Authors:  Qiang Kou; Zhe Wang; Rachele A Lubeckyj; Si Wu; Liangliang Sun; Xiaowen Liu
Journal:  J Proteome Res       Date:  2019-01-28       Impact factor: 4.466

2.  TopPIC Gateway: A Web Gateway for Top-Down Mass Spectrometry Data Interpretation.

Authors:  In Kwon Choi; Eroma Abeysinghe; Eric Coulter; Suresh Marru; Marlon Pierce; Xiaowen Liu
Journal:  PEARC20 (2020)       Date:  2020-07

3.  Proteoform Identification by Combining RNA-Seq and Top-Down Mass Spectrometry.

Authors:  Wenrong Chen; Xiaowen Liu
Journal:  J Proteome Res       Date:  2020-11-12       Impact factor: 4.466

4.  Integrating Top-Down and Bottom-Up Mass Spectrometric Strategies for Proteomic Profiling of Iranian Saw-Scaled Viper, Echis carinatus sochureki, Venom.

Authors:  Parviz Ghezellou; Wendell Albuquerque; Vannuruswamy Garikapati; Nicholas R Casewell; Seyed Mahdi Kazemi; Alireza Ghassempour; Bernhard Spengler
Journal:  J Proteome Res       Date:  2020-11-22       Impact factor: 5.370

5.  Proteome Discoverer: A Community Enhanced Data Processing Suite for Protein Informatics. (Review)

Authors:  Benjamin C Orsburn
Journal:  Proteomes       Date:  2021-03-23
