DeNovoGUI: an open source graphical user interface for de novo sequencing of tandem mass spectra.

Thilo Muth, Lisa Weilnböck, Erdmann Rapp, Christian G Huber, Lennart Martens, Marc Vaudel, Harald Barsnes.

Abstract

De novo sequencing is a popular technique in proteomics for identifying peptides from tandem mass spectra without having to rely on a protein sequence database. Despite the strong potential of de novo sequencing algorithms, their adoption threshold remains quite high. Here we present a user-friendly and lightweight graphical user interface called DeNovoGUI for running parallelized versions of the freely available de novo sequencing software PepNovo+, greatly simplifying the use of de novo sequencing in proteomics. Our platform-independent software is freely available under the permissive Apache2 open source license. Source code, binaries, and additional documentation are available at http://denovogui.googlecode.com.

Year:  2014        PMID: 24295440      PMCID: PMC3923451          DOI: 10.1021/pr4008078

Source DB:  PubMed          Journal:  J Proteome Res        ISSN: 1535-3893            Impact factor:   4.466


Mass spectrometry (MS)-based proteomics is an efficient high-throughput method for the analysis of peptides and proteins.[1,2] However, in a typical tandem mass spectrometry (MS/MS) experiment, a high proportion of the mass spectra remain unidentified when matched against in silico-generated spectra derived from peptides obtained through in silico proteolytic digestion of known protein sequences.[3] Some of these unidentified spectra are of low quality or derive from contaminants, but the rest are likely to contain unexpected peptides.[4] One obstacle to the successful identification of such peptides is that protein sequence databases are incomplete, as many organisms have not yet been sequenced, an issue that is felt particularly strongly in challenging fields such as metaproteomics[5] or plant proteomics.[6] Another common issue is the presence of unknown or unexpected modifications on the peptide precursors.[4]

De novo sequencing constitutes a powerful technique for overcoming such issues and successfully assigning high-quality unidentified spectra to peptides. Moreover, de novo-derived peptide sequences can be used for the validation of insignificant database search results, for instance, proteins backed merely by a single peptide identification, so-called “one-hit wonders”.[7] Several de novo algorithms have been described and evaluated in the literature,[8,9] including the commercial PEAKS[10] software suite. PepNovo+,[11] on the other hand, is a powerful, freely available software tool. However, as with most open source de novo algorithms, it comes with several shortcomings. (i) It is distributed only with a command line interface, thus requiring advanced computational skills to operate. (ii) Modifications need to be configured manually for every search and are not based on the standardized PSI-MOD[12] controlled vocabulary. (iii) The search is not parallelized when multiple cores are available. (iv) The output of the algorithm is a text file containing only the derived sequences and their scores, thus omitting additional useful information such as fragment ions and spectrum annotation. Because of these issues, user validation of the results is cumbersome, and standardized dissemination of results is quite difficult.

Here, we describe an intuitive, end-user-oriented front end to the PepNovo+ algorithm called DeNovoGUI, which aims to solve the aforementioned problems. Similar to the SearchGUI[13] software for the OMSSA[14] and X!Tandem[15] database search algorithms, DeNovoGUI provides a self-contained and easily adopted solution for convenient and efficient de novo sequencing using the PepNovo+ algorithm. The processing of large numbers of spectra is accelerated by automated and completely transparent parallelization across multiple cores, a crucial feature for modern computers that typically come equipped with two to eight (hyperthreaded) cores.

DeNovoGUI can be installed with minimal effort by downloading the latest release from the tool Web site (http://denovogui.googlecode.com), subsequently unzipping the downloaded file, and then double clicking the DeNovoGUI jar file. To start the de novo procedure, the user has to provide only the spectrum files to analyze (in the standard mgf format), the settings to use, and the desired output folder in which to store the de novo results (Figure 1A). The settings include the fragment and precursor ion mass tolerances, as well as the fixed and variable post-translational modifications to consider. Furthermore, additional settings for fine-tuning the PepNovo+ algorithm can be specified, including the number of de novo solutions to provide for each spectrum. Figure 1B provides a complete overview of the available settings. Importantly, the handling of modifications is greatly simplified by the graphical user interface: user-defined modifications can easily be created from the Edit menu in the main frame. Note that DeNovoGUI allows all settings to be saved for later reuse or batch entry.
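To make the idea of spreading spectra from an mgf file across several workers more concrete, the following is a minimal Python sketch. It is not DeNovoGUI's implementation (which is written in Java and whose internals are not described here). It assumes only that each spectrum in the mgf file is delimited by `BEGIN IONS` and `END IONS` lines; the function names are illustrative, and `sequence_chunk` is a hypothetical placeholder that stands in for a call to a de novo engine.

```python
from concurrent.futures import ThreadPoolExecutor


def read_mgf_spectra(text):
    """Split MGF text into spectrum blocks (BEGIN IONS ... END IONS)."""
    spectra, current = [], None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped == "BEGIN IONS":
            current = [stripped]
        elif stripped == "END IONS" and current is not None:
            current.append(stripped)
            spectra.append("\n".join(current))
            current = None
        elif current is not None and stripped:
            current.append(stripped)
    return spectra


def split_into_chunks(items, n_chunks):
    """Split items into at most n_chunks contiguous chunks, preserving order."""
    if not items:
        return []
    size = -(-len(items) // n_chunks)  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def sequence_chunk(chunk):
    """Hypothetical placeholder for running a de novo engine on one chunk.

    Here it only returns the TITLE of each spectrum in the chunk.
    """
    titles = []
    for spectrum in chunk:
        for line in spectrum.splitlines():
            if line.startswith("TITLE="):
                titles.append(line[len("TITLE="):])
    return titles


def run_parallel(mgf_text, n_threads):
    """Process all spectra in mgf_text using n_threads workers (n_threads >= 1)."""
    spectra = read_mgf_spectra(mgf_text)
    chunks = split_into_chunks(spectra, n_threads)
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        results = list(pool.map(sequence_chunk, chunks))
    # pool.map returns results in chunk order, so the input order is kept.
    return [title for chunk_result in results for title in chunk_result]
```

Contiguous chunks are used so that the merged results keep the order of the input spectra; a real front end would additionally have to write each chunk to disk and collect the per-chunk output files.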
Figure 1

(A) Main DeNovoGUI interface that allows the user to input the spectrum files, the settings, and the output folder for the results. (B) De novo sequencing settings dialogue that allows the user to specify the fragment ion and precursor mass tolerances, and the fixed and variable post-translational modifications. Additional settings for fine-tuning the PepNovo+ algorithm can also be configured.

As soon as the settings and input files have been provided, the de novo sequencing can be initiated by clicking the “Start Sequencing!” button in the main DeNovoGUI interface. While the PepNovo+ algorithm is running, the user is continuously informed about the status of the sequencing and a progress bar is displayed to indicate overall progress. When the process is complete, the de novo sequencing results are stored in the provided output folder in a simple text-based format, and the detailed results can be visualized in the DeNovoGUI interface (see Figure 2). At the top, the user can browse through all the input spectra in the ‘Query Spectra’ table, and through the de novo peptide matches for the selected spectrum in the ‘De Novo Peptides’ table. The ‘Query Spectra’ table provides information collected from the original spectra, such as title, precursor m/z, charge, and identification state, while the ‘De Novo Peptides’ table lists details obtained from the de novo sequencing results: peptide sequence, scores, terminal gaps, precursor m/z, and charge. At the bottom, a spectrum viewer[16] shows the currently selected spectrum with the fragment ion annotation corresponding to the selected de novo peptide solution. A sequence overlay is also presented on the spectrum by default, aiding the efficient validation of the proposed peptide solution. The de novo results can be validated using BLAST, either by clicking the BLAST option at the end of a given line in the ‘De Novo Peptides’ table or by exporting a list of matches in a BLAST compatible format via the Export menu. Peptide matches can also be exported in a simple text-based format from the same menu.
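The exact layout of the BLAST compatible export is not specified in the text. As an illustration only, the sketch below assumes a FASTA file, which BLAST accepts as query input, and writes one record per de novo peptide match. The function name, the header layout (`title|rank=N`), and the minimum-length filter are all hypothetical choices made for this example.

```python
def matches_to_fasta(matches, min_length=5):
    """Convert (spectrum_title, rank, sequence) tuples to FASTA text.

    Sequences shorter than min_length are skipped, since very short
    peptides rarely give informative BLAST hits (illustrative threshold).
    """
    records = []
    for title, rank, sequence in matches:
        if len(sequence) < min_length:
            continue
        # FASTA identifiers end at the first whitespace, so replace spaces.
        identifier = title.replace(" ", "_")
        records.append(f">{identifier}|rank={rank}\n{sequence}")
    return "\n".join(records) + ("\n" if records else "")
```

A call such as `matches_to_fasta([("scan 12", 1, "PEPTIDEK")])` yields a single FASTA record whose header encodes the spectrum title and the rank of the solution, so that BLAST hits can be traced back to the originating spectrum.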
Figure 2

DeNovoGUI de novo results viewer that shows the currently selected de novo peptide solution and its corresponding fragment ion annotations on the selected spectrum. The ‘Query Spectra’ section at the top allows the user to browse through the input spectra, while the ‘De Novo Peptides’ section below provides sequence and scoring information for all peptide solutions for the currently selected input spectrum.
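The fragment ion annotation shown in the viewer can be illustrated with a small calculation. The sketch below is not the code used by DeNovoGUI or its spectrum viewer; it assumes an unmodified peptide, singly charged b- and y-ions only, standard monoisotopic residue masses, and an absolute fragment tolerance in m/z units. All function names are illustrative.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276   # mass of a proton (Da)
WATER = 18.010565   # monoisotopic mass of H2O (Da)


def fragment_ions(sequence):
    """Return singly charged b- and y-ion m/z values for an unmodified peptide."""
    masses = [RESIDUE_MASS[aa] for aa in sequence]
    b_ions, y_ions = [], []
    running = 0.0
    for mass in masses[:-1]:            # b1 .. b(n-1), from the N-terminus
        running += mass
        b_ions.append(running + PROTON)
    running = 0.0
    for mass in reversed(masses[1:]):   # y1 .. y(n-1), from the C-terminus
        running += mass
        y_ions.append(running + WATER + PROTON)
    return {"b": b_ions, "y": y_ions}


def annotate(peaks_mz, sequence, tolerance=0.5):
    """Label every observed peak that lies within tolerance of a b- or y-ion."""
    annotations = []
    for series, values in fragment_ions(sequence).items():
        for index, theoretical in enumerate(values, start=1):
            for mz in peaks_mz:
                if abs(mz - theoretical) <= tolerance:
                    annotations.append((f"{series}{index}", mz))
    return annotations
```

The `tolerance` argument plays the role of the fragment ion mass tolerance from the settings dialogue; a tighter value yields fewer, more reliable annotations.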

A reference data set is provided as an example in DeNovoGUI and can be opened easily from the main menu. It consists of 30289 MS/MS spectra from an Arabidopsis thaliana whole leaf proteome. The obtained tryptic peptides were separated via ion-pair reversed-phase high-performance liquid chromatography on a poly(styrene/divinylbenzene) monolithic column[17] using a 5 h gradient and were measured on an LTQ Orbitrap XL mass spectrometer using high-resolution precursor ion selection followed by CID fragmentation. This reference data set of a well-established plant model system represents a realistic study case for plant proteomics and is thus ideally suited for the benchmarking of de novo sequencing algorithms. For further details about the data set, see the Supporting Information.

Because of its ability to spread the de novo task across multiple compute cores and/or hyperthreads, DeNovoGUI substantially reduces the time required to analyze large numbers of spectra using PepNovo+. Indeed, while the analysis of our 30289 MS/MS spectra took ∼7 h using only a single thread, the running time was reduced to approximately 3 h using four threads and to approximately 1.5 h using eight threads. To obtain comparable and consistent results, running times were measured on identical virtual machines with the desired number of cores set (Intel Xeon CPU X5660 at 2.80 GHz). These tests clearly show that the multithreading capability of DeNovoGUI results in substantial reductions in processing time on today’s multithreading, multicore laptop and desktop computers.
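The reported running times can be turned into speedup and parallel efficiency figures with a short calculation. The sketch below uses the usual definitions (speedup = single-thread time / parallel time; efficiency = speedup / number of threads) and takes the approximate times quoted above as input, so the results are correspondingly approximate.

```python
def speedup(t_single, t_parallel):
    """Ratio of single-thread running time to parallel running time."""
    return t_single / t_parallel


def efficiency(t_single, t_parallel, n_threads):
    """Speedup divided by the number of threads (1.0 is ideal scaling)."""
    return speedup(t_single, t_parallel) / n_threads


# Approximate running times in hours, as reported in the text.
REPORTED_HOURS = {1: 7.0, 4: 3.0, 8: 1.5}

for n_threads, hours in REPORTED_HOURS.items():
    s = speedup(REPORTED_HOURS[1], hours)
    e = efficiency(REPORTED_HOURS[1], hours, n_threads)
    print(f"{n_threads} thread(s): speedup {s:.2f}x, efficiency {e:.0%}")
# prints:
# 1 thread(s): speedup 1.00x, efficiency 100%
# 4 thread(s): speedup 2.33x, efficiency 58%
# 8 thread(s): speedup 4.67x, efficiency 58%
```

With these rounded timings, both four and eight threads reach roughly 58% of ideal linear scaling.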
Upon being downloaded, DeNovoGUI comes with the latest version of PepNovo+ included, and apart from the unzipping of the downloaded DeNovoGUI zip file, no further installation is required to run the software. DeNovoGUI is written in the Java programming language and is freely available as open source under the permissive Apache2 license. Documentation, source files, and binaries can be downloaded from http://denovogui.googlecode.com.
References (17 in total; 10 listed)

1.  Monolithic capillary columns for liquid chromatography-electrospray ionization mass spectrometry in proteomic and genomic research.

Authors:  Wolfgang Walcher; Herbert Oberacher; Sonia Troiani; Georg Hölzl; Peter Oefner; Lello Zolla; Christian G Huber
Journal:  J Chromatogr B Analyt Technol Biomed Life Sci       Date:  2002-12-25       Impact factor: 3.205

2.  Mass spectrometry-based proteomics. (Review)

Authors:  Ruedi Aebersold; Matthias Mann
Journal:  Nature       Date:  2003-03-13       Impact factor: 49.962

3.  What to do with "one-hit wonders"?

Authors:  Timothy D Veenstra; Thomas P Conrads; Haleem J Issaq
Journal:  Electrophoresis       Date:  2004-05       Impact factor: 3.535

4.  TANDEM: matching proteins with tandem mass spectra.

Authors:  Robertson Craig; Ronald C Beavis
Journal:  Bioinformatics       Date:  2004-02-19       Impact factor: 6.937

5.  Open mass spectrometry search algorithm.

Authors:  Lewis Y Geer; Sanford P Markey; Jeffrey A Kowalak; Lukas Wagner; Ming Xu; Dawn M Maynard; Xiaoyu Yang; Wenyao Shi; Stephen H Bryant
Journal:  J Proteome Res       Date:  2004 Sep-Oct       Impact factor: 4.466

6.  PepNovo: de novo peptide sequencing via probabilistic network modeling.

Authors:  Ari Frank; Pavel Pevzner
Journal:  Anal Chem       Date:  2005-02-15       Impact factor: 6.986

7.  Dynamic spectrum quality assessment and iterative computational analysis of shotgun proteomic data: toward more efficient identification of post-translational modifications, sequence polymorphisms, and novel peptides.

Authors:  Alexey I Nesvizhskii; Franz F Roos; Jonas Grossmann; Mathijs Vogelzang; James S Eddes; Wilhelm Gruissem; Sacha Baginsky; Ruedi Aebersold
Journal:  Mol Cell Proteomics       Date:  2005-12-12       Impact factor: 5.911

8.  Performance evaluation of existing de novo sequencing algorithms.

Authors:  Sergey Pevtsov; Irina Fedulova; Hamid Mirzaei; Charles Buck; Xiang Zhang
Journal:  J Proteome Res       Date:  2006-11       Impact factor: 4.466

9.  Improving the reliability and throughput of mass spectrometry-based proteomics by spectrum quality filtering.

Authors:  Kristian Flikka; Lennart Martens; Joël Vandekerckhove; Kris Gevaert; Ingvar Eidhammer
Journal:  Proteomics       Date:  2006-04       Impact factor: 3.984

10.  A la carte proteomics with an emphasis on gel-free techniques. (Review)

Authors:  Kris Gevaert; Petra Van Damme; Bart Ghesquière; Francis Impens; Lennart Martens; Kenny Helsens; Joël Vandekerckhove
Journal:  Proteomics       Date:  2007-08       Impact factor: 3.984

