Xusheng Wang1, Ji-Hoon Cho1, Suresh Poudel2,3, Yuxin Li1,2,3, Drew R Jones2,3, Timothy I Shaw1,4, Haiyan Tan1, Boer Xie1,2,3, Junmin Peng1,2,3.
Abstract
Metabolomics is increasingly important for biomedical research, but large-scale metabolite identification in untargeted metabolomics is still challenging. Here, we present Jumbo Mass spectrometry-based Program of Metabolomics (JUMPm) software, a streamlined software tool for identifying potential metabolite formulas and structures in mass spectrometry. During database search, the false discovery rate is evaluated by a target-decoy strategy, where the decoys are produced by breaking the octet rule of chemistry. We illustrated the utility of JUMPm by detecting metabolite formulas and structures from liquid chromatography coupled tandem mass spectrometry (LC-MS/MS) analyses of unlabeled and stable-isotope labeled yeast samples. We also benchmarked the performance of JUMPm by analyzing a mixed sample from a commercially available metabolite library in both hydrophilic and hydrophobic LC-MS/MS. These analyses confirm that metabolite identification can be significantly improved by estimating the element composition in formulas using stable isotope labeling, or by introducing LC retention time during a spectral library search, which are incorporated into JUMPm functions. Finally, we compared the performance of JUMPm and two commonly used programs, Compound Discoverer 3.1 and MZmine 2, with respect to putative metabolite identifications. Our results indicate that JUMPm is an effective tool for metabolite identification of both unlabeled and labeled data in untargeted metabolomics.
Keywords: algorithm; database search; mass spectrometry; metabolite formula; metabolite identification; metabolite structure; metabolome; metabolomics; software; yeast
Year: 2020 PMID: 32408578 PMCID: PMC7281133 DOI: 10.3390/metabo10050190
Source DB: PubMed Journal: Metabolites ISSN: 2218-1989
Figure 1. General workflow of Jumbo Mass spectrometry-based Program of Metabolomics (JUMPm) and the target-decoy strategy for FDR estimation. (A) JUMPm can identify metabolites in a metabolomic experiment using either unlabeled or labeled data. Inferred stoichiometry is used to limit the search space of formulas. Candidate formulas are used to propose structure identifications which are then scored by MS2; (B) The scheme to compute the range of carbon numbers for unlabeled data using natural isotopic distribution. RSD, relative standard deviation; (C) The scheme to calculate the nitrogen number using a pair of unlabeled and labeled ions; (D) Utilization of the target-decoy strategy. Decoys (invalid structures) are made by violating the octet rule with the addition of three hydrogens without changing the charge state [32]. The impossible decoy structure for methane is shown. The decoy formulas could be incorrectly identified due to chance matches against searched m/z values; (E) Generation of a decoy MS/MS pattern by mass addition of three hydrogens on a specific atom; (F) Examples of JUMPm output for unlabeled and labeled datasets.
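Panels (C) and (D) describe two simple mass calculations, which can be illustrated with a minimal sketch. The function names, the tolerance value, and the assumption that the labeled ion is fully 15N-labeled are hypothetical choices made for illustration; this is not JUMPm's actual code, and its real scoring is likely more involved.

```python
# Illustrative sketch only; names and tolerance are assumptions, not JUMPm's API.
H_MASS = 1.00782503      # monoisotopic mass of 1H in Da
N15_SHIFT = 0.99703489   # mass(15N) - mass(14N) in Da


def nitrogen_count(mz_unlabeled, mz_labeled, charge=1, tol=0.01):
    """Estimate the number of nitrogen atoms from an unlabeled/15N-labeled ion pair.

    Assumes complete 15N labeling, so each nitrogen adds N15_SHIFT to the mass.
    Returns None when the shift is not close to a whole number of nitrogens.
    """
    shift = (mz_labeled - mz_unlabeled) * charge
    n = round(shift / N15_SHIFT)
    if n < 0 or abs(shift - n * N15_SHIFT) > tol:
        return None
    return n


def decoy_neutral_mass(target_mass):
    """Mass of a decoy formula made by adding three hydrogens to a target formula."""
    return target_mass + 3 * H_MASS


# Phenylalanine, C9H11NO2: [M+H]+ at m/z 166.086255, one nitrogen.
print(nitrogen_count(166.086255, 167.083290))
print(decoy_neutral_mass(165.078979))
```

Because the decoy mass differs from every valid formula built from the same atoms, any match to it can only arise by chance, which is what makes it usable for error estimation.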
LC-MS/MS runs of metabolites used in this study.
| Sample Name | Sample Introduction | LC | MS Ionization Mode |
|---|---|---|---|
| Unlabeled yeast lysate | Unlabeled yeast sample | RP | Positive |
| Labeled yeast lysate | 4-plex mixture of one unlabeled sample and three stable-isotope-labeled yeast samples (13C, 15N, and double labeling) | RP | Positive |
| Synthetic standards (HILIC) | A mixture of purchased synthetic metabolites | HILIC | Negative |
Yeast extracts and a synthetic standard mix were analyzed by LC-MS/MS with a reverse phase (RP) or HILIC column in positive or negative ion mode.
Figure 2. JUMPm analysis of labeled samples and an example identification of phenylalanine. (A) Conceptual workflow for a stable isotope-labeled experiment (the double labeled sample is not shown); (B) The quality of each isotope-labeled pair is scored with three parameters (mass defect, relative ion intensity, and co-eluted retention time). The Pscore is used to discriminate authentic pairs from random matches; (C) For each isotope-labeled pair, the relevant MS2 spectra are scored (Mscore) and annotated with the top match. Three matched theoretical fragment ions are highlighted in red; (D) Hierarchical clustering of all structure candidates by predicted product ion intensities for the example metabolite spectrum (HMDB hits, large red dots and PubChem hits, small dots). Representative structures from each colored group are shown. All candidates share the neutral formula C9H11NO2.
Figure 3. Target-decoy strategy for the estimation of FDR using JUMPm. The x-axis shows the score (Pscore or Mscore) and the y-axis shows the number of structures identified using JUMPm. (A) Distributions of target and decoy Mscores from the unlabeled yeast lysate. All matches with an Mscore of zero are filtered out of the graph; (B) Distributions of target and decoy Mscores from the labeled yeast lysate; (C) Distributions of target and decoy Pscores from the labeled yeast lysate; (D) Distributions of target and decoy Mscores obtained from a custom library search consisting of a mixture of 120 purchased synthetic metabolites. A HILIC column was used to run the mixture of purchased synthetic metabolites in negative ionization mode.
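The target and decoy score distributions in these panels can be turned into an FDR estimate in the usual target-decoy way: at a given score cutoff, the number of accepted decoys approximates the number of false target matches. The sketch below assumes the simple ratio of decoys to targets above the cutoff; the exact formula used by JUMPm is not given here, and the function names and example scores are hypothetical.

```python
# Illustrative sketch only; the FDR formula and names are assumptions.

def estimate_fdr(target_scores, decoy_scores, threshold):
    """Estimate FDR as (#decoys >= threshold) / (#targets >= threshold)."""
    n_target = sum(s >= threshold for s in target_scores)
    n_decoy = sum(s >= threshold for s in decoy_scores)
    if n_target == 0:
        return 0.0  # nothing accepted, so no false discoveries to count
    return min(1.0, n_decoy / n_target)


def threshold_for_fdr(target_scores, decoy_scores, max_fdr=0.05):
    """Return the lowest target score cutoff whose estimated FDR is <= max_fdr."""
    for t in sorted(set(target_scores)):
        if estimate_fdr(target_scores, decoy_scores, t) <= max_fdr:
            return t
    return None


# Made-up scores for demonstration, not data from the study.
targets = [9.0, 8.0, 7.5, 6.0, 3.0, 2.0]
decoys = [3.5, 2.5, 1.0]
print(estimate_fdr(targets, decoys, 3.0))
print(threshold_for_fdr(targets, decoys, max_fdr=0.05))
```

Scanning cutoffs from low to high returns the most permissive threshold that still meets the FDR limit, which keeps as many identifications as possible.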
Figure 4. Performance comparison among JUMPm, Compound Discoverer (CD), and MZmine 2 on two LC-MS/MS runs, i.e., the unlabeled yeast lysate and a mixture of synthetic standards. (A) The workflow of CD and MZmine 2 (Figure 2) uses the HMDB database; (B) Number of formulas and structures detected from the sample of unlabeled yeast lysate by the three software tools. RP, reversed phase; (C) Number of formulas and structures detected from the sample of synthetic standards. HILIC, hydrophilic interaction liquid chromatography.