MWASTools: an R/bioconductor package for metabolome-wide association studies.

Andrea Rodriguez-Martinez1, Joram M Posma1, Rafael Ayala1, Ana L Neves1, Maryam Anwar2, Enrico Petretto3, Costanza Emanueli2,4, Dominique Gauguier1, Jeremy K Nicholson1, Marc-Emmanuel Dumas1.   

Abstract

Summary: MWASTools is an R package designed to provide an integrated pipeline to analyse metabonomic data in large-scale epidemiological studies. Key functionalities of our package include: quality control analysis; metabolome-wide association analysis using various models (partial correlations, generalized linear models); visualization of statistical outcomes; metabolite assignment using statistical total correlation spectroscopy (STOCSY); and biological interpretation of metabolome-wide association studies results. Availability and implementation: The MWASTools R package is implemented in R (version ≥ 3.4) and is available from Bioconductor: https://bioconductor.org/packages/MWASTools/. Contact: m.dumas@imperial.ac.uk. Supplementary information: Supplementary data are available at Bioinformatics online.
© The Author(s) 2017. Published by Oxford University Press.

Year:  2018        PMID: 28961702      PMCID: PMC6049002          DOI: 10.1093/bioinformatics/btx477

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 Introduction

Owing to sustained developments in high-throughput platforms [i.e. nuclear magnetic resonance (NMR) spectroscopy and mass spectrometry (MS)], metabolic phenotyping (metabotyping) is now used for large-scale epidemiological applications such as metabolome-wide association studies (MWAS) (Elliott; Holmes; Nicholson). Customized statistical modeling approaches and data visualization tools are essential for biomarker discovery in large-scale metabotyping studies. Several software packages have been developed to detect and visualize metabolic changes between conditions of interest using multivariate statistical methods (Gaude; Thevenot). However, from an epidemiological perspective, a major limitation of these multivariate models is that they do not properly account for confounding factors, which might distort the observed associations between the metabolites and the condition under study (Elliott). Here, we present an R package to perform MWAS using univariate hypothesis testing with efficient handling of epidemiological confounders (Elliott). Our package provides a versatile and user-friendly MWAS pipeline with a number of functionalities, ranging from quality control (QC) analysis of metabonomic data to visualization and biological interpretation of MWAS results.

2 Methods and features

The MWASTools package is organized in four functional modules: (i) QC analysis; (ii) MWAS analysis; (iii) visualization of MWAS results; and (iv) metabolite assignment using correlation analysis. For demonstration purposes, the MWASTools package was used to analyse plasma 1H NMR metabolic profiles of 506 patients from the FGENTCARD cohort (Rodriguez-Martinez).

2.1 QC analysis

MWASTools performs essential QC analyses via Principal Component Analysis (PCA) and by computing the coefficient of variation (CV; the ratio of the standard deviation to the mean) of each metabolic feature across the QC samples (Dumas). The results from PCA are visualized using score plots, where tight clustering of the QC samples indicates good overall reproducibility (Supplementary Fig. S1A). The results from CV analysis can be visualized with different plots: (i) a histogram showing the distribution of CVs across the metabolic features; (ii) an NMR spectrum colored by the CV of each spectral signal (Supplementary Fig. S1B); or (iii) an MS-based scatter plot colored by the CV of each MS feature (Supplementary Fig. S2). MWASTools also allows the metabolic variables to be filtered based on a given CV threshold.
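The CV-based filtering described above can be illustrated with a minimal, language-neutral sketch in Python. This is not the MWASTools implementation: the function names, the 0.30 threshold and the use of the sample standard deviation are assumptions made for illustration only.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the CV (standard deviation / mean) of one feature across QC samples."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    # Assumption: sample standard deviation (n - 1 denominator).
    return stdev(values) / abs(m)


def filter_by_cv(qc_intensities, threshold=0.30):
    """Compute per-feature CVs and keep the features whose CV is below the threshold.

    qc_intensities maps a (hypothetical) feature name to its intensities
    measured in the QC samples.
    """
    cvs = {name: coefficient_of_variation(vals) for name, vals in qc_intensities.items()}
    kept = [name for name, cv in cvs.items() if cv < threshold]
    return cvs, kept


# Toy data: feature_a is reproducible across QC samples, feature_b is not.
qc = {
    "feature_a": [10.0, 10.5, 9.5, 10.2],
    "feature_b": [1.0, 3.0, 0.5, 2.5],
}
cvs, kept = filter_by_cv(qc, threshold=0.30)
print(kept)
```

In this toy example only feature_a passes the filter, because its intensities vary by a few percent across the QC samples whereas those of feature_b vary by more than half of their mean.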

2.2 MWAS analysis

Following QC analysis, MWASTools tests for association between the phenotype under investigation [e.g. type II diabetes (T2D)] and each metabolic feature (or metabolite). Depending on the nature of the data to be modeled, the user can choose among the following association methods: linear/logistic regression or Pearson/Spearman/Kendall correlation. The models can be adjusted for confounding factors, including age, gender or other clinical covariates (e.g. medication). The P-values are corrected for multiple testing with one of several possible methods, such as the Benjamini–Hochberg (BH) procedure (Benjamini and Hochberg, 1995). MWASTools also supports model validation through non-parametric bootstrapping. Finally, MWAS analysis results can be filtered according to a given significance threshold.
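As an illustration of the multiple-testing step, the following minimal Python sketch applies the BH step-up adjustment to a vector of P-values. It is a generic textbook formulation, not code taken from MWASTools, and the function name and example values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted P-values in the same order as the input."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value to the smallest so that the adjusted
    # values are monotone: each is the minimum of p * m / rank over all
    # ranks at or above its own.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical P-values for four metabolic features.
p = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(p)
significant = [i for i, q in enumerate(adj) if q < 0.05]
print(adj, significant)
```

Filtering on the adjusted values, as in the last lines, corresponds to the significance-threshold filtering mentioned above.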

2.3 Visualization of MWAS results

MWASTools provides a series of customizable tools to visualize the results from MWAS analysis. For NMR data, MWASTools generates a skyline plot, where the chemical shifts are displayed along the x-axis and the −log10 of the P-values (multiplied by the sign of the direction of the association) are displayed on the y-axis (Fig. 1). For other types of metabonomic data, the results are represented using: an analogous bar plot (Supplementary Fig. S4A); an MS-based scatter plot (Supplementary Fig. S3); a correlation-based metabolic network (Supplementary Fig. S4B); or a heatmap (Supplementary Fig. S5). Finally, the metabolites identified by MWAS can be mapped onto biological pathways (Kanehisa and Goto, 2000), and visualized using pathway-based or shortest path-based networks (Posma; Rodriguez-Martinez a; Shannon) (Supplementary Figs S6–S7).
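The y-axis quantity of the skyline plot, given in the caption of Fig. 1 as −log10(pFDR) × sign of the beta coefficient, can be sketched as follows. This is an illustrative Python snippet under stated assumptions: the function name and the example signals are hypothetical and do not come from the package.

```python
import math


def skyline_score(p_value, beta):
    """Signed significance: -log10(p) carrying the sign of the model coefficient."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p_value must lie in (0, 1]")
    sign = (beta > 0) - (beta < 0)  # 1, 0 or -1
    return -math.log10(p_value) * sign


# Hypothetical (chemical shift, adjusted P-value, beta) triples.
signals = [(0.92, 0.001, 0.8), (1.33, 0.01, -1.2), (1.48, 0.5, 0.1)]
scores = [(shift, skyline_score(p, beta)) for shift, p, beta in signals]
print(scores)
```

Positive scores then point upwards for features positively associated with the phenotype and negative scores point downwards, so direction and significance are read from a single axis.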
Fig. 1

Visualization of the associations of T2D with plasma 1H NMR metabolites in the FGENTCARD cohort (n = 506). The associations were computed using logistic regression adjusted for age, gender and body mass index. (A) Partial skyline plot (δ 0.80–1.60) showing the −log10 (pFDR) × sign of beta coefficient of each NMR signal. Statistically significant signals positively associated with T2D were colored in red. (B) NMR spectrum (δ 0.80–1.60) of a QC sample colored based on association results. (Color version of this figure is available at Bioinformatics online.)

2.4 Structural assignment of NMR features

MWASTools performs statistical total correlation spectroscopy (STOCSY) analysis (Cloarec) to facilitate the assignment of NMR variables significantly associated with the phenotype under study. The results are represented in a pseudo-NMR spectrum displaying the covariance (height) and the Pearson/Spearman correlation coefficient (color) of all NMR variables with the variable of interest (the driver) (Supplementary Fig. S8). To highlight intramolecular correlation patterns, only NMR variables significantly correlated with the driver signal are shown.
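The two quantities shown in the pseudo-spectrum can be sketched in a few lines of Python. This is a simplified illustration using Pearson correlation only, not the MWASTools code; the function name and toy spectra are hypothetical, and the significance filtering described above is omitted.

```python
from statistics import mean


def stocsy(spectra, driver_index):
    """Return (covariance, Pearson correlation) of every variable with the driver.

    spectra is a list of samples; each sample is a list of intensities
    measured at the same chemical shifts.
    """
    n = len(spectra)
    n_vars = len(spectra[0])
    # Re-arrange into one list per spectral variable and mean-center it.
    centered = []
    for j in range(n_vars):
        column = [row[j] for row in spectra]
        col_mean = mean(column)
        centered.append([v - col_mean for v in column])
    driver = centered[driver_index]
    driver_ss = sum(d * d for d in driver)
    results = []
    for col in centered:
        cross = sum(d * c for d, c in zip(driver, col))
        col_ss = sum(c * c for c in col)
        covariance = cross / (n - 1)
        denom = (driver_ss * col_ss) ** 0.5
        # A constant variable has no defined correlation; report 0.0 here.
        correlation = cross / denom if denom > 0 else 0.0
        results.append((covariance, correlation))
    return results


# Toy data: 4 samples x 3 variables. Variable 1 rises with the driver
# (variable 0), while variable 2 falls as the driver rises.
spectra = [
    [1.0, 2.0, 4.0],
    [2.0, 4.0, 3.0],
    [3.0, 6.0, 2.0],
    [4.0, 8.0, 1.0],
]
result = stocsy(spectra, driver_index=0)
print(result)
```

In a plot, the covariance would set the height of each point and the correlation its color, so signals from the same molecule as the driver stand out as tall, strongly colored peaks.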

3 Discussion

Altogether, the MWASTools R package provides an integrated pipeline with efficient analysis and visualization tools for: (i) performing QC analysis; (ii) conducting robust MWAS analysis with efficient handling of epidemiological confounders; (iii) assigning the structure of metabolic features of interest; and (iv) interpreting MWAS results biologically. The MWASTools package can be applied to both targeted and untargeted metabonomic datasets acquired with different analytical platforms. The open nature of R allows MWASTools to be integrated with other packages for the analysis of metabonomic data.

Funding

This work was supported by Medical Research Council Doctoral Training Centre scholarship (MR/K501281/1), Imperial College scholarship (EP/M506345/1), La Caixa studentship to A.R.M.; FCT (BD/52036/2012) to A.L.N.; British Heart Foundation program grant (RG/15/5/31446) to C.E. and E.P.; BHF Chair (CH/15/31199) to C.E.; European Commission (FGENTCARD, LSHG-CT-2006-037683) to D.G., J.K.N. and M.E.D.

Conflict of Interest: none declared.

References (10 of 11 listed)

1. M Kanehisa; S Goto. KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res, 2000-01-01.

2. Jeremy K Nicholson; John Connelly; John C Lindon; Elaine Holmes. Metabonomics: a platform for studying drug toxicity and gene function. Nat Rev Drug Discov, 2002-02.

3. Paul Shannon; Andrew Markiel; Owen Ozier; Nitin S Baliga; Jonathan T Wang; Daniel Ramage; Nada Amin; Benno Schwikowski; Trey Ideker. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res, 2003-11.

4. Olivier Cloarec; Marc-Emmanuel Dumas; Andrew Craig; Richard H Barton; Johan Trygg; Jane Hudson; Christine Blancher; Dominique Gauguier; John C Lindon; Elaine Holmes; Jeremy Nicholson. Statistical total correlation spectroscopy: an exploratory approach for latent biomarker identification from metabolic 1H NMR data sets. Anal Chem, 2005-03-01.

5. Marc-Emmanuel Dumas; Elaine C Maibaum; Claire Teague; Hirotsugu Ueshima; Beifan Zhou; John C Lindon; Jeremy K Nicholson; Jeremiah Stamler; Paul Elliott; Queenie Chan; Elaine Holmes. Assessment of analytical reproducibility of 1H NMR spectroscopy based metabonomics for large-scale epidemiological research: the INTERMAP Study. Anal Chem, 2006-04-01.

6. Etienne A Thévenot; Aurélie Roux; Ying Xu; Eric Ezan; Christophe Junot. Analysis of the Human Adult Urinary Metabolome Variations with Age, Body Mass Index, and Gender by Implementing a Comprehensive Workflow for Univariate and OPLS Statistical Analyses. J Proteome Res, 2015-07-02.

7. Elaine Holmes; Ruey Leng Loo; Jeremiah Stamler; Magda Bictash; Ivan K S Yap; Queenie Chan; Tim Ebbels; Maria De Iorio; Ian J Brown; Kirill A Veselkov; Martha L Daviglus; Hugo Kesteloot; Hirotsugu Ueshima; Liancheng Zhao; Jeremy K Nicholson; Paul Elliott. Human metabolic phenotype diversity and its association with diet and blood pressure. Nature, 2008-04-20.

8. Paul Elliott; Joram M Posma; Queenie Chan; Isabel Garcia-Perez; Anisha Wijeyesekera; Magda Bictash; Timothy M D Ebbels; Hirotsugu Ueshima; Liancheng Zhao; Linda van Horn; Martha Daviglus; Jeremiah Stamler; Elaine Holmes; Jeremy K Nicholson. Urinary metabolic signatures of human adiposity. Sci Transl Med, 2015-04-29.

9. Joram M Posma; Steven L Robinette; Elaine Holmes; Jeremy K Nicholson. MetaboNetworks, an interactive Matlab-based toolbox for creating, customizing and exploring sub-networks from KEGG. Bioinformatics, 2013-10-30.

10. Andrea Rodriguez-Martinez; Rafael Ayala; Joram M Posma; Ana L Neves; Dominique Gauguier; Jeremy K Nicholson; Marc-Emmanuel Dumas. MetaboSignal: a network-based approach for topological analysis of metabotype regulation via metabolic and signaling pathways. Bioinformatics, 2017-03-01.