Pietro Franceschi, Roman Mylonas, Nir Shahaf, Matthias Scholz, Panagiotis Arapitsas, Domenico Masuero, Georg Weingart, Silvia Carlin, Urska Vrhovsek, Fulvio Mattivi, Ron Wehrens.
Abstract
Due to their sensitivity and speed, mass spectrometry-based analytical technologies are widely used in metabolomics to characterize biological phenomena. To address issues like metadata organization, quality assessment, data processing, data storage, and, finally, submission to public repositories, bioinformatic pipelines of a non-interactive nature are often employed, complementing the interactive software used for initial inspection and visualization of the data. These pipelines are often created as open-source software, allowing the complete and exhaustive documentation of each step and ensuring the reproducibility of the analysis of extensive and often expensive experiments. In this paper, we review the major steps that constitute such a data processing pipeline, discussing them in the context of an open-source software for untargeted MS-based metabolomics experiments recently developed at our institute. The software has been developed by integrating our metaMS R package with a user-friendly web-based application written in Grails. MetaMS takes care of data pre-processing and annotation, while the interface deals with the creation of the sample lists, the organization of the data storage, and the generation of survey plots for quality assessment. Experimental and biological metadata are stored in the ISA-Tab format, making the proposed pipeline fully integrated with the MetaboLights framework.
Keywords: GC-MS; ISA-Tab; LC-MS; data analysis; metabolomics; pipeline
Year: 2014 PMID: 25566535 PMCID: PMC4267269 DOI: 10.3389/fbioe.2014.00072
Source DB: PubMed Journal: Front Bioeng Biotechnol ISSN: 2296-4185
Figure 1. Block diagram of the typical metabolomic experiment.
Figure 2. The six steps in the MetaDB workflow. (1) Metadata are uploaded to MetaDB using ISA-Tab formatted files. (2) An MS acquisition sequence is generated with randomized samples. (3) After MS data acquisition, raw and derived spectral data files are uploaded to MetaDB. (4) Data are processed using MetaMS. (5) Quality control of acquired data. (6) Data are prepared for storage and possibly for upload to public repositories such as MetaboLights.
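Step (2) of the workflow, the generation of a randomized acquisition sequence, can be illustrated with a minimal Python sketch. It is not the MetaDB implementation: the function name `build_acquisition_sequence`, the `"QC"` label, and the choice to inject a QC sample at a fixed interval (`qc_every`) are assumptions made here for illustration only.

```python
import random


def build_acquisition_sequence(samples, qc_every=5, seed=42):
    """Return a randomized injection order with periodic QC injections.

    Hypothetical helper: the QC spacing and the "QC" label are assumptions,
    not settings taken from MetaDB.
    """
    rng = random.Random(seed)      # fixed seed keeps the order reproducible
    shuffled = list(samples)       # copy, so the caller's list is untouched
    rng.shuffle(shuffled)

    sequence = ["QC"]              # open the run with a QC injection
    for position, name in enumerate(shuffled, start=1):
        sequence.append(name)
        # Add a QC after every `qc_every` samples and after the last sample.
        if position % qc_every == 0 or position == len(shuffled):
            sequence.append("QC")
    return sequence


if __name__ == "__main__":
    names = [f"sample_{i:02d}" for i in range(1, 13)]
    for order, name in enumerate(build_acquisition_sequence(names), start=1):
        print(order, name)
```

Fixing the seed makes the sequence reproducible, so the same injection order can be regenerated later when the run is documented.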
Figure 3. Principal component analysis score plot produced by MetaDB on the example dataset. Different colors identify the two sample classes and the QCs. Individual samples can also be identified by their sample names, which is useful for quickly spotting critical samples.
Figure 4. Variation of the integral of the TIC over the analytical run.
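The quantity plotted in Figure 4 can be approximated with a short Python sketch: the total ion current (TIC) of each injection is integrated over retention time with the trapezoidal rule, and injections whose integral deviates strongly from the run median are flagged. The function names, the input layout (one list of retention times and one of intensities per injection), and the 30% tolerance are assumptions made for illustration; they are not taken from MetaDB or metaMS.

```python
import statistics


def tic_integral(retention_times, intensities):
    """Integrate a TIC trace over retention time with the trapezoidal rule."""
    if len(retention_times) != len(intensities):
        raise ValueError("retention_times and intensities differ in length")
    total = 0.0
    for i in range(1, len(retention_times)):
        width = retention_times[i] - retention_times[i - 1]
        total += 0.5 * width * (intensities[i] + intensities[i - 1])
    return total


def flag_deviating_injections(integrals, tolerance=0.3):
    """Return indices of injections far from the median TIC integral.

    `tolerance` is a hypothetical relative threshold (0.3 means 30%).
    """
    median = statistics.median(integrals)
    return [
        index
        for index, value in enumerate(integrals)
        if abs(value - median) > tolerance * median
    ]


if __name__ == "__main__":
    # Toy values in injection order; not measured data.
    integrals = [100.0, 102.0, 98.0, 40.0, 101.0]
    print(flag_deviating_injections(integrals))
```

Plotting the integrals in injection order, as in the figure, makes a gradual loss of signal or an isolated failed injection visible at a glance.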