Literature DB >> 24336413

HAMMER: automated operation of mass frontier to construct in silico mass spectral fragmentation libraries.

Jiarui Zhou1, Ralf J M Weber, J William Allwood, Robert Mistrik, Zexuan Zhu, Zhen Ji, Siping Chen, Warwick B Dunn, Shan He, Mark R Viant.   

Abstract

SUMMARY: Experimental MS(n) mass spectral libraries currently do not adequately cover chemical space. This limits the robust annotation of metabolites in metabolomics studies of complex biological samples. In silico fragmentation libraries would improve the identification of compounds from experimental multistage fragmentation data when experimental reference data are unavailable. Here, we present a freely available software package to automatically control Mass Frontier software to construct in silico mass spectral libraries and to perform spectral matching. Based on two case studies, we have demonstrated that high-throughput automation of Mass Frontier allows researchers to generate in silico mass spectral libraries in an automated and high-throughput fashion with little or no human intervention required.
AVAILABILITY AND IMPLEMENTATION: Documentation, examples, results and source code are available at http://www.biosciences-labs.bham.ac.uk/viant/hammer/.

Entities:  

Mesh:

Substances:

Year:  2013        PMID: 24336413      PMCID: PMC3928522          DOI: 10.1093/bioinformatics/btt711

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 INTRODUCTION

Mass spectrometry (MS)-based metabolomics is a rapidly developing field that aims to detect and measure a variety of small biological molecules (metabolites) over a wide dynamic range (Dettmer ). Although thousands of metabolites are typically detected in an untargeted metabolomics study of a biological sample, their subsequent identification represents the most significant bottleneck in the discovery of new biochemical knowledge (Dunn ; Kind and Fiehn, 2010; Wishart, 2011). In many cases, multiple empirical formulae and/or putative chemical structures are reported for each observed mass feature (or more strictly mass-to-charge ratio, m/z). Multistage (MSn) mass spectrometry, which is an experimental technique to collect in-depth fragmentation data related to the chemical structure of metabolites, is often applied to increase accuracy and specificity in metabolite annotation and identification. However, experimental MSn mass spectral libraries currently do not adequately cover the search space for all metabolites present in complex biological samples, as authentic chemical standards are not available for all metabolites. The generation of in silico MSn libraries is anticipated to greatly improve the success rate of annotation of metabolites detected in metabolomics studies, when experimentally acquired MSn data of authentic chemical standards are unavailable. Commercial software packages such as Mass Frontier and ACD/MS Fragmenter predict in silico fragmentation patterns to assist the interpretation of experimental MSn data (Krauss ; Oppermann ). Additionally, a freely available tool named MetFrag has been developed previously (Wolf ). This tool selects candidate structures from a compound library based on the molecular ion m/z and generates in silico substructures to subsequently annotate fragment peaks. MetFrag does not provide the ability to create in silico patterns or libraries independent of the use of experimental data; however, it has been proven to be an efficient approach to annotate fragmentation mass spectra derived from MS experiments. Mass Frontier is a well-known software package that has been used by more than a thousand academic institutions and companies throughout the world for the management, evaluation and interpretation of mass spectra. It includes three resources that assist the user to robustly predict in silico fragmentation patterns: a set of general fragmentation rules, a fragmentation library comprising ∼150 000 fragmentation mechanisms collected from the scientific literature and user-specified mechanisms. However, Mass Frontier cannot readily perform batch processing on the scale required to generate comprehensive in silico libraries for thousands of metabolites. Here, we present a freely available software package, named HAMMER (High-throughput AutoMation of Mass frontiER), to automatically control Mass Frontier software to perform in silico fragmentation in an automated and high-throughput matter. This package allows the user to readily generate large-scale in silico fragmentation libraries and perform mass spectral matching to mass spectrometry-derived data.

2 METHODS AND IMPLEMENTATION

HAMMER is developed under the Python language and supports Windows 7 and XP (both 32 bit and 64 bit). Its software requirements include Java 6/7, Open Babel (O'Boyle ), Sikuli (Yeh ) and Mass Frontier 7. A desktop PC with 2.66 GHz CPU and 4 GB memory was used as the operating platform. As shown in Figure 1, HAMMER consists of four modules:
Fig. 1.

Workflow for HAMMER

retrieves chemical structures from a compound database of interest such as ChemSpider, Kyoto Encyclopedia of Genes and Genomes (KEGG) and PubChem. The retrieval is based on a submitted plain text file that includes a list of compounds or chemicals (with specific names, database identifier or chemical formula). Additionally, metabolites within specific KEGG pathways can be downloaded based on a KEGG pathway identifier. Candidate entries found in the databases of interest are automatically downloaded, verified and converted into .mol format (required by Mass Frontier) or other formats such as SMILES, InChI or empirical formula. Open Babel is used to perform structure verification and conversion. A report is generated to summarize the information retrieved for each compound (e.g. number of candidates, database IDs and URL to the specific database entry). automatically controls the operation of Mass Frontier by using the open-source visual scripting software Sikuli. Sikuli allows control of software when no application programming interface is available, which is the case for many commercial software packages. It readily uses image recognition of graphical user interface (GUI) elements to operate software. HAMMER contains a Sikuli standalone runtime. As a result, the InSilicoFragmentation module works without any setup, otherwise simple configuration is required. Two versions of InSilicoFragmentation are provided: Windows 7 and Windows XP. We have separated the GUI images (or patterns) and search regions from the source code itself. This allows the user to modify the script more easily if required (e.g. applying a different version of Mass Frontier, see online manual for details). The structure file (.mol) of each compound is separately imported into Mass Frontier, and in silico fragmentation is performed applying user-defined fragmentation settings (see online manual for details). Structural and m/z information for each fragment is collected for each in silico reaction and if required (i.e. for MSn) is used for the next fragmentation step. This use allows closed-loop operation of Mass Frontier to perform ‘multistage’ in silico fragmentation. In silico fragments, their corresponding unique m/z values, chemical structures (in each stage) and fragmentation mechanisms are exported for further analysis and visualization. uses the in silico fragmentation results as its input. The fragments are organized systematically in separate folders according to the compound of interest, and a 2D chemical structure image file in.png format is generated for each fragment. The in silico fragmentation results are exportable in several formats, such as in extensible mark-up language and plain text formats, as well as chemical mark-up language and National Institute of Standards and Technology spectra library files (NIST-MSP) for further analysis. These formats are compatible with common mass spectrometry databases and software packages. The hierarchical relationships of the fragments are parsed to an extensible mark-up language file that can easily be accessed using scripting languages. Additionally, a hierarchical tree visualization (as a .pdf) is generated to allow visual comparison or interpretation. computes a score, using the pMatch algorithm, to evaluate the probability of a match between an experimental fragmentation spectrum and an in silico fragmentation pattern in a library (Ye et al., 2010). The pMatch algorithm is a spectral matching algorithm originally developed for mass spectrometry-based protein identification (see Supplementary Information for details of the algorithm). The SpectralMatching module uses MSP files (see previous module), experimental and in silico, as its input. For each experimental fragmentation spectrum, a report is generated, which includes detailed information on the matching and annotation. Workflow for HAMMER

3 CASE STUDIES

HAMMER was used to generate in silico fragmentation data for two distinct groups of compounds: (i) all 72 metabolites within the ‘phenylalanine metabolism’ KEGG pathway (i.e. map00360) and (ii) the top 200 most prescribed drugs in the USA in 2011 (RxList, 2011). These were selected to demonstrate the applicability and capability of HAMMER across diverse compound sets. Some drugs are composed of multiple compounds, and each compound was separately parsed and imported into Mass Frontier. Additionally, lower mass neutral and charged molecules/atoms (e.g. water and sodium) within these separated drug compounds were excluded from this demonstration dataset (see Supplementary Information for details). This resulted in 151 unique structures for the drugs dataset. See Supplementary Table S1 for specific settings used for both case studies. The run times for the phenylalanine and drug datasets were ∼8.5 and 14.5 h, respectively. On average, ∼3200 fragments were predicted for each compound (see Supplementary Tables S2–S4). Applying the defined Mass Frontier parameter settings, it is expected that MS data are reported, where n > 2. This relatively high number of fragments illustrates the complexity and diversity of the fragmentation mechanisms. Although, these in silico fragments are predicted based on a set of general fragmentation rules and a fragmentation library comprising ∼150 000 fragmentation mechanisms collected from the scientific literature, the numbers of fragments produced are highly dependent on the complexity of the chemical structures and the parameter settings defined by the user (see Supplementary Information and online manual for details). In silico fragments may be reported, which are false positives; these fragments are not actually created in the MS fragmentation process, or are created but their low stabilities ensure that further and complete fragmentation or decomposition is observed before the ions are detected. To assess the applicability of each in silico fragmentation library, five experimental fragmentation mass spectra were retrieved from MassBank to perform spectral matching [Supplementary Table S5, Case Study I: acetyl-CoA (KNA00207), capsaicin (WA001605), isobutyryl-CoA (PR100154), N-acetyl-L-phenylalanine (KO002200) and succinic acid (KZ000074). Case study II: amoxicillin (WA001751), digoxin (WA000563), meloxicam (WA002576), naproxen (WA000359) and prednisone (CO000368)] (Horai ). All of the 10 real fragmentation spectra were correctly identified based on the data present in these small in silico mass spectral libraries (Supplementary Table S6). It is important to highlight that the parameter settings used in Mass Frontier to perform in silico fragmentation play an important role in the correct prediction of in silico fragments and mechanisms. Additionally, understanding the experimental parameters used to collect experimental data (e.g. difference in collision energy or type of fragmentation) is also important to maximize metabolite annotation. However, these two challenges are outside the scope of the work presented here. Both case studies show that HAMMER allows researchers to readily generate in silico MSn libraries and perform batch spectral matching of in silico mass spectral data to MS-derived data in an automated high-throughput fashion with minimal or no human intervention. In addition, we have shown with both case studies the value and applicability of visual scripting in the field of computational biology. We anticipate this software package will be used across a wide range of disciplines including metabolomics, organic synthetic chemistry and pharmaceutical research. Funding: Royal Society International Exchanges 2011 NSFC funding (61211130120), National Natural Science Foundation of China (NSFC) Joint Fund with Guangdong (U1201256), NSFC funding (61171125 and 61001185), the UK Natural Environment Research Council (NE/I008314/1) and the Systems Science for Health initiative at The University of Birmingham. Conflict of Interest: R.M. derives income from Mass Frontier sales.
  9 in total

1.  High precision measurement and fragmentation analysis for metabolite identification.

Authors:  Madalina Oppermann; Nicolaie Eugen Damoc; Catharina Crone; Thomas Moehring; Helmut Muenster; Martin Hornshaw
Journal:  Methods Mol Biol       Date:  2012

2.  LC-high resolution MS in environmental analysis: from target screening to the identification of unknowns.

Authors:  Martin Krauss; Heinz Singer; Juliane Hollender
Journal:  Anal Bioanal Chem       Date:  2010-03-16       Impact factor: 4.142

3.  MassBank: a public repository for sharing mass spectral data for life sciences.

Authors:  Hisayuki Horai; Masanori Arita; Shigehiko Kanaya; Yoshito Nihei; Tasuku Ikeda; Kazuhiro Suwa; Yuya Ojima; Kenichi Tanaka; Satoshi Tanaka; Ken Aoshima; Yoshiya Oda; Yuji Kakazu; Miyako Kusano; Takayuki Tohge; Fumio Matsuda; Yuji Sawada; Masami Yokota Hirai; Hiroki Nakanishi; Kazutaka Ikeda; Naoshige Akimoto; Takashi Maoka; Hiroki Takahashi; Takeshi Ara; Nozomu Sakurai; Hideyuki Suzuki; Daisuke Shibata; Steffen Neumann; Takashi Iida; Ken Tanaka; Kimito Funatsu; Fumito Matsuura; Tomoyoshi Soga; Ryo Taguchi; Kazuki Saito; Takaaki Nishioka
Journal:  J Mass Spectrom       Date:  2010-07       Impact factor: 1.982

Review 4.  Mass spectrometry-based metabolomics.

Authors:  Katja Dettmer; Pavel A Aronov; Bruce D Hammock
Journal:  Mass Spectrom Rev       Date:  2007 Jan-Feb       Impact factor: 10.946

Review 5.  Advances in metabolite identification.

Authors:  David S Wishart
Journal:  Bioanalysis       Date:  2011-08       Impact factor: 2.681

6.  Open MS/MS spectral library search to identify unanticipated post-translational modifications and increase spectral identification rate.

Authors:  Ding Ye; Yan Fu; Rui-Xiang Sun; Hai-Peng Wang; Zuo-Fei Yuan; Hao Chi; Si-Min He
Journal:  Bioinformatics       Date:  2010-06-15       Impact factor: 6.937

7.  In silico fragmentation for computer assisted identification of metabolite mass spectra.

Authors:  Sebastian Wolf; Stephan Schmidt; Matthias Müller-Hannemann; Steffen Neumann
Journal:  BMC Bioinformatics       Date:  2010-03-22       Impact factor: 3.169

8.  Advances in structure elucidation of small molecules using mass spectrometry.

Authors:  Tobias Kind; Oliver Fiehn
Journal:  Bioanal Rev       Date:  2010-08-21

9.  Open Babel: An open chemical toolbox.

Authors:  Noel M O'Boyle; Michael Banck; Craig A James; Chris Morley; Tim Vandermeersch; Geoffrey R Hutchison
Journal:  J Cheminform       Date:  2011-10-07       Impact factor: 5.514

  9 in total
  11 in total

1.  Using fragmentation trees and mass spectral trees for identifying unknown compounds in metabolomics.

Authors:  Arpana Vaniya; Oliver Fiehn
Journal:  Trends Analyt Chem       Date:  2015-06-01       Impact factor: 12.296

2.  An LC-MS-based lipidomics pre-processing framework underpins rapid hypothesis generation towards CHO systems biotechnology.

Authors:  Hock Chuan Yeo; Shuwen Chen; Ying Swan Ho; Dong-Yup Lee
Journal:  Metabolomics       Date:  2018-07-09       Impact factor: 4.290

3.  Integrated Framework for Identifying Toxic Transformation Products in Complex Environmental Mixtures.

Authors:  Leah Chibwe; Ivan A Titaley; Eunha Hoh; Staci L Massey Simonich
Journal:  Environ Sci Technol Lett       Date:  2017-01-04

Review 4.  After the feature presentation: technologies bridging untargeted metabolomics and biology.

Authors:  Kevin Cho; Nathaniel G Mahieu; Stephen L Johnson; Gary J Patti
Journal:  Curr Opin Biotechnol       Date:  2014-05-06       Impact factor: 9.740

Review 5.  Trends in the application of high-resolution mass spectrometry for human biomonitoring: An analytical primer to studying the environmental chemical space of the human exposome.

Authors:  Syam S Andra; Christine Austin; Dhavalkumar Patel; Georgia Dolios; Mahmoud Awawda; Manish Arora
Journal:  Environ Int       Date:  2017-01-04       Impact factor: 9.621

Review 6.  Applications of Fourier Transform Ion Cyclotron Resonance (FT-ICR) and Orbitrap Based High Resolution Mass Spectrometry in Metabolomics and Lipidomics.

Authors:  Manoj Ghaste; Robert Mistrik; Vladimir Shulaev
Journal:  Int J Mol Sci       Date:  2016-05-25       Impact factor: 5.923

7.  Biological Filtering and Substrate Promiscuity Prediction for Annotating Untargeted Metabolomics.

Authors:  Neda Hassanpour; Nicholas Alden; Rani Menon; Arul Jayaraman; Kyongbum Lee; Soha Hassoun
Journal:  Metabolites       Date:  2020-04-21

8.  Comparative Research of Chemical Profiling in Different Parts of Fissistigma oldhamii by Ultra-High-Performance Liquid Chromatography Coupled with Hybrid Quadrupole-Orbitrap Mass Spectrometry.

Authors:  Haibo Hu; Yau Lee-Fong; Jinnian Peng; Bin Hu; Jialin Li; Yaoli Li; Hao Huang
Journal:  Molecules       Date:  2021-02-11       Impact factor: 4.411

9.  Enhanced Isotopic Ratio Outlier Analysis (IROA) Peak Detection and Identification with Ultra-High Resolution GC-Orbitrap/MS: Potential Application for Investigation of Model Organism Metabolomes.

Authors:  Yunping Qiu; Robyn D Moir; Ian M Willis; Suresh Seethapathy; Robert C Biniakewitz; Irwin J Kurland
Journal:  Metabolites       Date:  2018-01-18

10.  Evaluating the Accuracy of the QCEIMS Approach for Computational Prediction of Electron Ionization Mass Spectra of Purines and Pyrimidines.

Authors:  Jesi Lee; Tobias Kind; Dean Joseph Tantillo; Lee-Ping Wang; Oliver Fiehn
Journal:  Metabolites       Date:  2022-01-12
View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.