
Uncovering Thousands of New Peptides with Sequence-Mask-Search Hybrid De Novo Peptide Sequencing Framework.

Korrawe Karunratanakul, Hsin-Yao Tang, David W Speicher, Ekapol Chuangsuwanich, Sira Sriswasdi.

Abstract

Typical analyses of mass spectrometry data identify only amino acid sequences that exist in reference databases. This restricts the possibility of discovering new peptides, such as those that contain uncharacterized mutations or originate from unexpected processing of RNAs and proteins. De novo peptide sequencing approaches address this limitation but often suffer from low accuracy and require extensive validation by experts. Here, we develop SMSNet, a deep learning-based de novo peptide sequencing framework that achieves >95% amino acid accuracy while retaining good identification coverage. Applying SMSNet to landmark proteomics and peptidomics studies reveals over 10,000 previously uncharacterized HLA antigens and phosphopeptides and, in conjunction with database-search methods, expands the coverage of peptide identification by almost 30%. SMSNet's ability to accurately identify new peptides would make it an invaluable tool for future proteomics and peptidomics studies, including tumor neoantigen discovery, antibody sequencing, and proteome characterization of non-model organisms.
© 2019 Karunratanakul et al.

Keywords:  De novo sequencing; bioinformatics searching; deep learning; mass spectrometry; peptides; phosphoproteome; software

Year:  2019        PMID: 31591261      PMCID: PMC6885704          DOI: 10.1074/mcp.TIR119.001656

Source DB:  PubMed          Journal:  Mol Cell Proteomics        ISSN: 1535-9476            Impact factor:   5.911


