
An R package to analyse LC/MS metabolomic data: MAIT (Metabolite Automatic Identification Toolkit).

Francesc Fernández-Albert, Rafael Llorach, Cristina Andrés-Lacueva, Alexandre Perera.

Abstract

Current tools for analysing liquid chromatography and mass spectrometry (LC/MS) metabolomic data cover a limited number of processing steps, whereas online tools are hard to use in a programmable fashion. This article introduces the Metabolite Automatic Identification Toolkit (MAIT) package, which lets users perform end-to-end analysis of LC/MS metabolomic data. MAIT focuses on improving the peak annotation stage and provides essential tools to validate statistical analysis results. MAIT generates output files with the statistical results, peak annotation and metabolite identification.
AVAILABILITY AND IMPLEMENTATION: http://b2slab.upc.edu/software-and-downloads/metabolite-automatic-identification-toolkit/.
© The Author 2014. Published by Oxford University Press.

Year:  2014        PMID: 24642061      PMCID: PMC4071204          DOI: 10.1093/bioinformatics/btu136

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 INTRODUCTION

Liquid chromatography and mass spectrometry (LC/MS) is an analytical technique used widely in metabolomics to detect molecules in biological samples (Theodoridis et al.). A wide array of software tools is available for LC/MS profiling data analysis, including commercial, programmatic and online tools. A commercial example is Analyst®, whereas some open-source packages permit programmatic processing, such as the R package XCMS (Smith et al.) to detect peaks, or CAMERA (Kuhl et al.) and AStream (Alonso et al.) to annotate peaks. There have also been efforts focused solely on peak annotation using Java (Brown et al.). MZmine and mzMatch are modular tools coded in Java that focus on LC/MS data preprocessing and visualization (Katajamaa et al.; Pluskal et al.; Scheltema et al.). Online tools permit sample processing through a web graphical user interface, such as XCMSOnline (http://xcmsonline.scripps.edu) or MetaboAnalyst (Xia et al.). Refer to Supplementary Table S1 for a comparison of the capabilities of some of the main available tools.

In this context, we introduce a new R package called Metabolite Automatic Identification Toolkit (MAIT) for automatic LC/MS analysis. The goal of the MAIT package is to provide an array of tools that makes programmable end-to-end statistical analysis of metabolomic data possible (see Section 3 of the Supplementary Material for details about the MAIT modularity). MAIT includes functions to improve peak annotation by searching for biotransformations and to assess the predictive power of the statistically significant metabolites by quantifying class separability.

2 METHODS

MAIT comprises four stages: peak detection, peak annotation, statistical analysis, and table and plot creation (Fig. 1). The peak detection stage detects the peaks in the LC/MS sample files. The peak annotation stage improves the identification of the metabolites in the metabolomic samples by increasing the chemical and biological information in the dataset. The statistical analysis reveals the significant sample features and measures their predictive power.

Fig. 1. Correspondence between MAIT functions (centre column), generated output files (left column) and their functionality (right column)

MAIT uses the R package XCMS to detect and align peaks. Peak annotation then proceeds in three steps.

First, MAIT uses the CAMERA package to perform the first annotation step (Kuhl et al.). In this step, MAIT uses a peak correlation distance and a retention time window to find which peaks come from the same source metabolite. The peaks within each peak group are annotated following a reference adduct/fragment table and a mass allowance window.

Second, because biotransformations could be related to specific in-source mass losses, they are detected using a mass allowance window inside the peak groups (Breitling et al.). For this search, MAIT already includes a biotransformations table (here, human biotransformations). User-defined biotransformation tables can be set as input, following the procedure defined in Supplementary Text (Section 6.6).

Finally, a predefined metabolite database is mined for the significant masses in order to identify metabolites. MAIT uses the Human Metabolome Database (Wishart et al.), 2009/07 version.

The objective of analysing the metabolomic profiling data is to obtain the statistically significant features (SSF) that contain the highest amount of class-related information. To gather these features, MAIT can apply statistical tests such as ANOVA or Student’s t-test to every feature, selecting the significant set of features given a threshold P-value. A validation test is included to quantify SSF class separability by a repeated random subsampling cross-validation using three methods: partial least squares and discriminant analysis, support vector machines and K-nearest neighbours (Hastie et al.). MAIT computes overall and class-related classification ratios to evaluate the SSF class-related information.
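The feature selection and validation idea described above can be illustrated with a minimal base-R sketch. It is not MAIT's implementation: the data are simulated, the class labels, the function names `predict_1nn` and `validate`, and the iteration and training-size values are all assumptions made for illustration, and a simple 1-nearest-neighbour classifier stands in for the three classifiers MAIT uses.

```r
set.seed(1)
n_per_class <- 6
n_features  <- 200
classes <- factor(rep(c("KO", "WT"), each = n_per_class))

# Simulated intensity matrix (samples x features). Only the first 10
# features carry a class difference. Toy data, not a real LC/MS data set.
X <- matrix(rnorm(2 * n_per_class * n_features), ncol = n_features)
X[classes == "KO", 1:10] <- X[classes == "KO", 1:10] + 5

# Welch's t-test on every feature (t.test uses var.equal = FALSE by default),
# followed by selection at a threshold P-value.
pvals <- apply(X, 2, function(x) t.test(x ~ classes)$p.value)
ssf <- which(pvals <= 0.05)

# 1-nearest-neighbour prediction with squared Euclidean distance.
predict_1nn <- function(train, train_y, test) {
  apply(test, 1, function(s) {
    d <- colSums((t(train) - s)^2)   # distance from s to each training sample
    as.character(train_y[which.min(d)])
  })
}

# Repeated random subsampling: draw `train_samples` samples per class for
# training, classify the remaining samples, and average the hit ratio.
validate <- function(X, y, iterations = 20, train_samples = 3) {
  ratios <- numeric(iterations)
  for (it in seq_len(iterations)) {
    train_idx <- unlist(lapply(levels(y), function(cl) {
      sample(which(y == cl), train_samples)
    }))
    test_idx <- setdiff(seq_along(y), train_idx)
    pred <- predict_1nn(X[train_idx, , drop = FALSE], y[train_idx],
                        X[test_idx, , drop = FALSE])
    ratios[it] <- mean(pred == as.character(y[test_idx]))
  }
  mean(ratios)
}

ratio <- validate(X[, ssf, drop = FALSE], classes)
```

Here `ratio` plays the role of an overall classification ratio; a per-class ratio could be obtained in the same loop by averaging the hits separately for each level of `y`.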

3 RESULTS

The example data files are a subset of the data used in Saghatelian et al., which are distributed freely through the faahKO package (Smith, 2012). MAIT was used to read and analyse these samples using the functions depicted in Figure 1 (see the tutorial in the Supplementary Information). The significant features for each class are found using statistical tests and analysed through the different plots that MAIT produces.

Using the following function call, 2640 peaks were detected:

R> MAIT <- sampleProcessing(dataDir = "Dataxcms", project = "MAIT_Demo", snThres = 2, rtStep = 0.03)

At this point, the first annotation stage is launched:

R> MAIT <- peakAnnotation(MAIT.object = MAIT)

Next, we gather the significant features from the detected peaks. After Welch’s tests, 106 of these features were found to be significant through the spectralSigFeatures function. Statistical plots such as heat maps, boxplots and principal component analysis score plots can be generated (Supplementary Figs S3 and S4). Significant features are then annotated after checking for certain neutral losses (biotransformations).

R> MAIT <- spectralSigFeatures(MAIT, pvalue = 0.05)
R> MAIT <- Biotransformations(MAIT, peakPrecision = 0.005)

Using only the SSF, a validation stage is launched, obtaining a classification ratio of 100% with three training samples for all classifiers. These results suggest that the significant variables separate both classes completely.

R> MAIT <- Validation(MAIT, Iterations = 20, trainSamples = 3)

Finally, the database is mined to identify the significant features.

R> MAIT <- identifyMetabolites(MAIT, peakTolerance = 0.005)
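The biotransformation search in this workflow relies on a mass allowance window. The following base-R sketch shows that idea on a toy peak group: every pairwise mass difference is compared with a table of known mass shifts, and matches within a tolerance are reported. The function name `find_mass_matches`, the three-row table and the m/z values are hypothetical, and the tolerance is assumed to be in Da; MAIT's own table and matching rules may differ.

```r
# Illustrative table of mass shifts (monoisotopic, in Da). This is not the
# biotransformations table shipped with MAIT.
biotransf <- data.frame(
  name  = c("hydroxylation", "methylation", "glucuronidation"),
  delta = c(15.9949, 14.0157, 176.0321),
  stringsAsFactors = FALSE
)

# Toy peak group: m/z values assumed to come from related compounds.
peak_group <- c(181.0707, 197.0656, 357.1028)

find_mass_matches <- function(mz, table, tol = 0.005) {
  empty <- data.frame(mz_low = numeric(0), mz_high = numeric(0),
                      biotransformation = character(0), error = numeric(0),
                      stringsAsFactors = FALSE)
  if (length(mz) < 2) return(empty)

  hits <- list()
  pairs <- combn(seq_along(mz), 2)            # all unordered peak pairs
  for (k in seq_len(ncol(pairs))) {
    a <- mz[pairs[1, k]]
    b <- mz[pairs[2, k]]
    delta_obs <- abs(b - a)
    # Rows of the table whose mass shift lies within the allowance window.
    idx <- which(abs(table$delta - delta_obs) <= tol)
    for (m in idx) {
      hits[[length(hits) + 1]] <- data.frame(
        mz_low = min(a, b), mz_high = max(a, b),
        biotransformation = table$name[m],
        error = delta_obs - table$delta[m],
        stringsAsFactors = FALSE
      )
    }
  }
  if (length(hits) == 0) return(empty)
  do.call(rbind, hits)
}

matches <- find_mass_matches(peak_group, biotransf, tol = 0.005)
```

Widening `tol` makes the search more permissive at the cost of more spurious matches, which is why a tight window such as 0.005 is only sensible for high-resolution data.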

4 CONCLUSIONS

MAIT provides a set of tools and functions to perform an automatic end-to-end analysis of LC/MS metabolomic data, putting special emphasis on peak annotation and metabolite identification. In addition, MAIT validation functions make it possible to estimate predictive power for significant variables.

Funding: Spanish national grants (AGL2009-13906-C02-01/ALI and AGL2010-10084-E), the CONSOLIDER INGENIO 2010 Programme, FUN-C-FOOD (CSD2007-063) from the MICINN and Merck Serono 2010 Research Grants (Fundación Salud 2000). Spanish Ministerio de Ciencia y Tecnología through TEC2010-20886-C02-02 and TEC2010-20886-C02-01 (in part). A.P. is part of the 2009SGR-1395 consolidated research group of the Generalitat de Catalunya, Spain. CIBER-BBN is an initiative of the Spanish ISCIII. R.L. thanks the MICINN and the European Social Funds for their financial contribution to the R.L. Ramón y Cajal contract (Ramón y Cajal Programme, MICINN-RYC). F.F.-A. thanks EVALXARTA-UB and Agència de Gestió d’Ajuts Universitaris i de Recerca, AGAUR (Generalitat de Catalunya), for their financial support.

Conflict of Interest: none declared.
References (12 in total)

1.  Liquid chromatography-mass spectrometry based global metabolite profiling: a review.

Authors:  Georgios A Theodoridis; Helen G Gika; Elizabeth J Want; Ian D Wilson
Journal:  Anal Chim Acta       Date:  2011-11-04       Impact factor: 6.558

2.  XCMS: processing mass spectrometry data for metabolite profiling using nonlinear peak alignment, matching, and identification.

Authors:  Colin A Smith; Elizabeth J Want; Grace O'Maille; Ruben Abagyan; Gary Siuzdak
Journal:  Anal Chem       Date:  2006-02-01       Impact factor: 6.986

3.  MZmine: toolbox for processing and visualization of mass spectrometry based molecular profile data.

Authors:  Mikko Katajamaa; Jarkko Miettinen; Matej Oresic
Journal:  Bioinformatics       Date:  2006-01-10       Impact factor: 6.937

4.  PeakML/mzMatch: a file format, Java library, R library, and tool-chain for mass spectrometry data analysis.

Authors:  Richard A Scheltema; Andris Jankevics; Ritsert C Jansen; Morris A Swertz; Rainer Breitling
Journal:  Anal Chem       Date:  2011-03-14       Impact factor: 6.986

5.  AStream: an R package for annotating LC/MS metabolomic data.

Authors:  Arnald Alonso; Antonio Julià; Antoni Beltran; Maria Vinaixa; Marta Díaz; Lourdes Ibañez; Xavier Correig; Sara Marsal
Journal:  Bioinformatics       Date:  2011-03-16       Impact factor: 6.937

6.  CAMERA: an integrated strategy for compound spectra extraction and annotation of liquid chromatography/mass spectrometry data sets.

Authors:  Carsten Kuhl; Ralf Tautenhahn; Christoph Böttcher; Tony R Larson; Steffen Neumann
Journal:  Anal Chem       Date:  2011-12-12       Impact factor: 6.986

7.  Assignment of endogenous substrates to enzymes by global metabolite profiling.

Authors:  Alan Saghatelian; Sunia A Trauger; Elizabeth J Want; Edward G Hawkins; Gary Siuzdak; Benjamin F Cravatt
Journal:  Biochemistry       Date:  2004-11-16       Impact factor: 3.162

8.  MZmine 2: modular framework for processing, visualizing, and analyzing mass spectrometry-based molecular profile data.

Authors:  Tomás Pluskal; Sandra Castillo; Alejandro Villar-Briones; Matej Oresic
Journal:  BMC Bioinformatics       Date:  2010-07-23       Impact factor: 3.169

9.  Ab initio prediction of metabolic networks using Fourier transform mass spectrometry data.

Authors:  Rainer Breitling; Shawn Ritchie; Dayan Goodenowe; Mhairi L Stewart; Michael P Barrett
Journal:  Metabolomics       Date:  2006-07-25       Impact factor: 4.290

10.  Automated workflows for accurate mass-based putative metabolite identification in LC/MS-derived metabolomic datasets.

Authors:  Marie Brown; David C Wedge; Royston Goodacre; Douglas B Kell; Philip N Baker; Louise C Kenny; Mamas A Mamas; Ludwig Neyses; Warwick B Dunn
Journal:  Bioinformatics       Date:  2011-02-16       Impact factor: 6.937
