
De novo peptide sequencing by deep learning.

Ngoc Hieu Tran, Xianglilan Zhang, Lei Xin, Baozhen Shan, Ming Li.

Abstract

De novo peptide sequencing from tandem MS data is the key technology in proteomics for characterizing proteins, especially novel sequences such as monoclonal antibodies (mAbs). In this study, we propose a deep neural network model, DeepNovo, for de novo peptide sequencing. The DeepNovo architecture combines recent advances in convolutional neural networks and recurrent neural networks to learn features of tandem mass spectra, fragment ions, and sequence patterns of peptides. The networks are further integrated with local dynamic programming to solve the complex optimization task of de novo sequencing. We evaluated the method on a wide variety of species and found that DeepNovo considerably outperformed state-of-the-art methods, achieving 7.7-22.9% higher accuracy at the amino acid level and 38.1-64.0% higher accuracy at the peptide level. We further used DeepNovo to automatically reconstruct the complete sequences of mouse antibody light and heavy chains, achieving 97.5-100% coverage and 97.2-99.5% accuracy without the assistance of databases. Moreover, DeepNovo can be retrained to adapt to any source of data and provides a complete end-to-end training and prediction solution to the de novo sequencing problem. Our study not only extends the deep learning revolution to a new field but also demonstrates an innovative approach to solving optimization problems by combining deep learning with dynamic programming.
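The abstract does not spell out how dynamic programming enters the prediction step. As a rough illustration only, and not the authors' implementation, the sketch below shows one way a knapsack-style table over discretized masses could prune amino acid candidates whose mass cannot be completed to the remaining peptide mass. The reduced residue table, the `RESOLUTION` value, the tolerance, and all function names are assumptions made for this example.

```python
# Monoisotopic residue masses (daltons) for a small subset of amino acids.
# A real model would cover all residues and modifications; this is a toy table.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "K": 128.09496,
}
RESOLUTION = 100  # integer bins per dalton (hypothetical choice)


def to_bins(mass):
    """Discretize a mass in daltons into integer bins."""
    return int(round(mass * RESOLUTION))


def reachable_masses(max_mass):
    """Knapsack table: reachable[m] is True if some multiset of residues
    sums to exactly m bins (m = 0 is the empty sequence)."""
    limit = to_bins(max_mass)
    steps = [to_bins(m) for m in RESIDUE_MASS.values()]
    reachable = [False] * (limit + 1)
    reachable[0] = True
    for m in range(1, limit + 1):
        reachable[m] = any(s <= m and reachable[m - s] for s in steps)
    return reachable


def feasible_next_residues(remaining_mass, reachable, tolerance_bins=2):
    """Return residues that, once placed, leave a suffix mass that the
    remaining residues can fill within the given tolerance."""
    target = to_bins(remaining_mass)
    feasible = []
    for aa, mass in RESIDUE_MASS.items():
        rest = target - to_bins(mass)
        if rest < -tolerance_bins:
            continue  # residue is heavier than what is left
        lo = max(rest - tolerance_bins, 0)
        hi = min(rest + tolerance_bins, len(reachable) - 1)
        if any(reachable[lo:hi + 1]):
            feasible.append(aa)
    return feasible


if __name__ == "__main__":
    table = reachable_masses(200.0)
    # Remaining mass equal to G + A: only G and A can start a valid suffix.
    print(feasible_next_residues(128.05857, table))
```

In a scheme like this, a neural network would score every amino acid at each step, and the table would only mask out candidates that can never add up to the measured precursor mass, so the search stays within mass-consistent sequences.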


Keywords:  MS; de novo sequencing; deep learning

Year:  2017        PMID: 28720701      PMCID: PMC5547637          DOI: 10.1073/pnas.1705691114

Source DB:  PubMed          Journal:  Proc Natl Acad Sci U S A        ISSN: 0027-8424            Impact factor:   11.205


