Joanna C Wolthuis, Stefania Magnusdottir, Mia Pras-Raves, Maryam Moshiri, Judith J M Jans, Boudewijn Burgering, Saskia van Mil, Jeroen de Ridder.
Abstract
Direct infusion untargeted mass spectrometry-based metabolomics allows for rapid insight into a sample's metabolic activity. However, analysis is often complicated by the large number of detected m/z values and by the difficulty of prioritizing important m/z values while simultaneously annotating their putative identities. To address this challenge, we developed MetaboShiny, a novel R/RShiny-based metabolomics package featuring data analysis, database- and formula-prediction-based annotation, and visualization. To demonstrate its use, we reproduce and further explore a MetaboLights metabolomics bioinformatics study on lung cancer patient urine samples. MetaboShiny enables rapid and rigorous analysis and interpretation of direct infusion untargeted mass spectrometry-based metabolomics data.
Keywords: Annotation; Direct infusion; Machine learning; Mass spectrometry; Metabolomics; R; Statistics
Year: 2020 PMID: 32915321 PMCID: PMC7497297 DOI: 10.1007/s11306-020-01717-8
Source DB: PubMed Journal: Metabolomics ISSN: 1573-3882 Impact factor: 4.290
Fig. 1 Overview of the MetaboShiny application. The displayed results are obtained from the MetaboLights dataset MTBLS28, which is discussed more extensively in section S1. a Box plot of the abundance of a single metabolite. Aside from beeswarm and box plots, scatter and violin plots are available. b Manhattan-like plot of all t-test hits. c Extra options for the search algorithm. d Subset of the databases available for searching. Users can select the ones they prefer. e–g The search results section of the sidebar. Users can scroll through the search results and display the database description and structure of the metabolite selected in that table. Results were filtered first by the main peak isotopes, and subsequently sorted by ppm error. Compounds with identical molecular formulas but different structures/SMILES are listed as separate rows. Compounds with identical SMILES but different names (IUPAC, commercial, etc.) are collapsed into one row. Users can then view the synonyms and descriptions in the detail view of the selected search hit. h PLS-DA plot and loading results. i Examples of machine learning results. Users see ROC curves for one or multiple models (and their average performance) and an overview of variable importance (including metadata if the user wishes).
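The ordering and collapsing of search hits described in panels e–g can be illustrated with a minimal Python sketch. This is not MetaboShiny's code (the package is written in R). The `Candidate` record, the `search` function, the 5 ppm tolerance, and all values are hypothetical, and the isotope filtering step is omitted. Only the ppm error formula, (observed - theoretical) / theoretical × 10^6, is standard.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical database entry; not MetaboShiny's actual schema."""
    name: str
    smiles: str
    mz: float  # theoretical m/z of the candidate ion


def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def search(observed_mz, candidates, tol_ppm=5.0):
    """Keep candidates within tol_ppm, collapse rows that share a SMILES
    string (collecting their names as synonyms), and sort by |ppm error|."""
    merged = {}
    for cand in candidates:
        err = ppm_error(observed_mz, cand.mz)
        if abs(err) > tol_ppm:
            continue  # outside the assumed tolerance window
        row = merged.setdefault(
            cand.smiles, {"smiles": cand.smiles, "names": [], "ppm": err}
        )
        row["names"].append(cand.name)
        if abs(err) < abs(row["ppm"]):
            row["ppm"] = err  # keep the best error seen for this structure
    return sorted(merged.values(), key=lambda row: abs(row["ppm"]))


# Illustrative placeholder values only; not real database content.
candidates = [
    Candidate("name-1", "SMILES-A", 181.0707),
    Candidate("name-2", "SMILES-A", 181.0707),  # same structure, other name
    Candidate("name-3", "SMILES-B", 181.0712),  # different structure
    Candidate("name-4", "SMILES-C", 182.0000),  # far outside tolerance
]
hits = search(181.0710, candidates)
for row in hits:
    print(row["smiles"], row["names"], round(row["ppm"], 2))
```

In this sketch the two names that share `SMILES-A` end up in one row with both synonyms, while `SMILES-B` stays a separate row, mirroring the behaviour the caption describes.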
Fig. 2 Analysis of MetaboShiny's speed performance. a Time to annotate an m/z value. Searching one m/z value takes longer if more matches are found. On average, even with large databases including 3e8 m/z values, a search on a single m/z value takes under one second. Labels show the size of various included databases, with Wikidata being the most extensive. Color of points represents the number of matches found for each m/z value. b Time to process data from import to end of normalization. Size and color of markers represent minutes.