Kristian Peters1, James Bradbury2, Sven Bergmann3,4, Marco Capuccini5,6, Marta Cascante7, Pedro de Atauri7, Timothy M D Ebbels8, Carles Foguet7, Robert Glen8,9, Alejandra Gonzalez-Beltran10, Ulrich L Günther11, Evangelos Handakas8, Thomas Hankemeier12, Kenneth Haug13, Stephanie Herman6,14, Petr Holub15, Massimiliano Izzo10, Daniel Jacob16, David Johnson10,17, Fabien Jourdan18, Namrata Kale13, Ibrahim Karaman19, Bita Khalili3,4, Payam Emami Khonsari14, Kim Kultima14, Samuel Lampa6, Anders Larsson6,20, Christian Ludwig21, Pablo Moreno13, Steffen Neumann1,22, Jon Ander Novella6,20, Claire O'Donovan13, Jake T M Pearce8, Alina Peluso8, Marco Enrico Piras23, Luca Pireddu23, Michelle A C Reed11, Philippe Rocca-Serra10, Pierrick Roger24, Antonio Rosato25, Rico Rueedi3,4, Christoph Ruttkies1, Noureddin Sadawi26,8, Reza M Salek13, Susanna-Assunta Sansone10, Vitaly Selivanov7, Ola Spjuth6, Daniel Schober1, Etienne A Thévenot24, Mattia Tomasoni3,4, Merlijn van Rijswijk27,28, Michael van Vliet12, Mark R Viant2,29, Ralf J M Weber2,29, Gianluigi Zanetti23, Christoph Steinbeck30.
Abstract
BACKGROUND: Metabolomics is the comprehensive study of a multitude of small molecules to gain insight into an organism's metabolism. The research field is dynamic and expanding with applications across biomedical, biotechnological, and many other applied biological domains. Its computationally intensive nature has driven requirements for open data formats, data repositories, and data analysis tools. However, the rapid progress has resulted in a mosaic of independent, and sometimes incompatible, analysis methods that are difficult to connect into a useful and complete data analysis solution.
Keywords: NMR; cloud computing; computational workflows; data analysis; e-infrastructures; galaxy; mass spectrometry; metabolomics; standardization; statistics
Year: 2019 PMID: 30535405 PMCID: PMC6377398 DOI: 10.1093/gigascience/giy149
Source DB: PubMed Journal: Gigascience ISSN: 2047-217X Impact factor: 6.524
Figure 1: Conceptual design of the PhenoMeNal cloud e-infrastructure, which brings compute to the data for any large number of data scientists.
Figure 2: Screenshots of creating and using the PhenoMeNal cloud e-infrastructure. First, log in with ELIXIR to the cloud research environment (CRE) portal. Second, select a public or private cloud provider. After entering cloud credentials and setting up parameters in the dedicated portal, the deployment of the PhenoMeNal e-infrastructure into the cloud environment can be made. Third, in the PhenoMeNal Portal app library there are several services ready to be deployed and used in the set-up infrastructure. Fourth, dedicated web services such as Galaxy are readily available in the cloud e-infrastructure. All steps can be operated from an easy-to-use web interface that is accessible from any standard web browser.
List of workflows that are representative of their respective metabolomics domains (identification in NMR, fluxomics, annotation and identification in MS, and eco-metabolomics)
| Workflow name | Description | Reference |
|---|---|---|
| 1D NMR | Processes 1D NMR experiments from raw data to a data matrix required for visualization and statistical analysis, building on nmrML and NMRProcFlow. The automatic workflow is based on the MTBLS1 dataset, describing urinary changes in type 2 diabetes in humans. | [ |
| Fluxomics | Quantifies steady-state fluxes following 13C metabolic flux analysis. The workflow was first based on the analysis of the MTBLS412 dataset with 13C tracer data of human umbilical vein endothelial cells under hypoxia. | [ |
| LC-MS/MS | Processes, quantifies, and annotates/identifies features in mass spectra using MetFrag, a tool that annotates tandem mass spectrometry (MS/MS) spectra with candidate molecules from compound databases. The workflow is based on MTBLS558. | [ |
| Univariate and Multivariate Statistics | Applies univariate and multivariate statistical analysis and illustrates how datasets may be explored, enabling the identification of variables of interest and the construction of predictive models. The workflow is based on MTBLS404. | [ |
| Eco-Metabolomics | Implementation of a resource-demanding metabolomics use case in ecology, used in large field experiments to describe interactions between different species of organisms in remarkable detail. The workflow is based on MTBLS520. | [ |
| ISA-Create-Validate-Upload | A workflow to create metadata files compliant with the Investigation, Study, and Assay (ISA) data model framework, based on study design information augmented with semantic markup and implementing UK Phenome center naming conventions. Following validation, the workflow also allows visualization of the overall study design and deposition to EMBL-EBI. | |
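
The pairing of workflows and reference datasets in the table can be captured as a small lookup, which may be handy when scripting around the workflows. The following is only a minimal sketch: the names `WORKFLOW_DATASETS`, `reference_dataset`, and `workflows_with_datasets` are hypothetical, and the accession pattern (the prefix `MTBLS` followed by digits) is an assumption inferred from the identifiers listed in the table, not a documented rule.

```python
import re

# Reference dataset accession for each workflow, copied from the table.
# ISA-Create-Validate-Upload lists no dataset, so it maps to None.
WORKFLOW_DATASETS = {
    "1D NMR": "MTBLS1",
    "Fluxomics": "MTBLS412",
    "LC-MS/MS": "MTBLS558",
    "Univariate and Multivariate Statistics": "MTBLS404",
    "Eco-Metabolomics": "MTBLS520",
    "ISA-Create-Validate-Upload": None,
}

# Assumed accession shape: "MTBLS" followed by one or more digits.
ACCESSION_PATTERN = re.compile(r"MTBLS\d+")


def reference_dataset(workflow_name):
    """Return the accession listed for a workflow, or None if none is listed."""
    accession = WORKFLOW_DATASETS[workflow_name]
    if accession is not None and not ACCESSION_PATTERN.fullmatch(accession):
        raise ValueError(f"unexpected accession format: {accession!r}")
    return accession


def workflows_with_datasets():
    """Return the names of workflows that list a reference dataset, sorted."""
    return sorted(
        name for name, accession in WORKFLOW_DATASETS.items()
        if accession is not None
    )


if __name__ == "__main__":
    for name in workflows_with_datasets():
        print(f"{name}: {reference_dataset(name)}")
```

Keeping the mapping in one place means a workflow without a listed dataset is represented explicitly as `None` rather than silently omitted.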
Overview of the most important FAIR criteria and implementations suggested for PhenoMeNal data, tools and workflows
| FAIR principle | Data | Tools | Workflows |
|---|---|---|---|
| Findable | Indexing in domain-relevant databases (e.g., MetaboLights) | Indexing in domain-relevant software repositories (e.g., the PhenoMeNal App Library, GitHub) | Indexing in workflow management systems such as Galaxy (e.g., PhenoMeNal, W4M), or libraries such as [ |
| | Rich descriptions of metadata (e.g., ISA-Tab) | Tool descriptions follow the EDAM ontology | Persistent identifier (e.g., W4M ID, DOI) and intuitive naming patterns |
| Accessible | Data access and rights management based on, e.g., the Data Use Ontology (DUO) | Accessible open-source licenses | Access to workflow systems can be configured to be shared or restricted |
| Interoperable | Standard formats for experimental metadata (ISA-Tab/ISA-JSON) | Standardized tool descriptions | Standardized workflow format (e.g., Galaxy GA format, Common Workflow Language CWL) |
| | Domain-specific standards for raw data (e.g., mzML, nmrML) | Containerization of software tools | Execution in various software environments (e.g., through the use of containers) |
| | OBO Foundry vocabularies and established domain ontologies to annotate data | EDAM ontology to annotate tools | Workflow annotation ontologies (e.g., Ontology of workflow motifs for annotating workflow specifications [ |
| Reusable | Deposition in data repositories (e.g., MetaboLights) and data indexing sites (e.g., OmicsDI) | Rich documentation and usage guides | Rich documentation and tutorials (e.g., Galaxy tours) |