
MetaboAnalystR: an R package for flexible and reproducible analysis of metabolomics data.

Jasmine Chong, Jianguo Xia.   

Abstract

Summary: The MetaboAnalyst web application has been widely used for metabolomics data analysis and interpretation. Despite its user-friendliness, the web interface has inherent limitations (especially for advanced users) with regard to flexibility in creating customized workflows, support for reproducible analysis, and capacity for handling large datasets. To address these limitations, we have developed a companion R package (MetaboAnalystR) based on the R code base of the web server. The package has been thoroughly tested to ensure that the same R commands produce identical results from both interfaces. MetaboAnalystR complements the MetaboAnalyst web server to facilitate transparent, flexible and reproducible analysis of metabolomics data.

Availability and implementation: MetaboAnalystR is freely available from https://github.com/xia-lab/MetaboAnalystR.

Year:  2018        PMID: 29955821      PMCID: PMC6289126          DOI: 10.1093/bioinformatics/bty528

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 Introduction

Metabolomics aims to study all small compounds within a biological system. It complements other omics technologies in multi-omics characterization of biological systems, and is poised to play a significant role in precision medicine (Wishart, 2016). With the growing applications of metabolomics comes an urgent need for easy-to-use, open-source software tools that can analyze increasingly large and complex datasets and keep pace with rapidly evolving technological innovations. The open-source nature of such software promotes transparency and reproducibility in data analysis, and encourages academic collaboration by allowing different research groups to extend the existing tools or incorporate them into new software pipelines.

A wide variety of Web or Galaxy-based tools exist for metabolomics data analysis (Gardinassi et al.), such as MetaboAnalyst (Chong et al.), XCMSOnline (Huan et al.), Workflow4Metabolomics (Giacomoni et al.), and Galaxy-M (Davidson et al.). Among them, MetaboAnalyst is one of the most widely used tools for statistical and functional analysis of metabolomics data. Despite its user-friendliness, the web-based application comes with inherent limitations. For instance, the comprehensive analysis options provided through the web interface make it difficult for users to reproduce their results when they re-analyze their data after a long time. To address this issue, MetaboAnalyst generates a comprehensive analysis report upon completion of each analysis session. However, not all commands and parameters are recorded. The web interface also imposes significant constraints on developing customized workflows and handling large data. As metabolomics is increasingly used across different fields and biological systems, data analysis is not ‘one size fits all’. To address these concerns about reproducibility, flexibility, scalability and transparency, we have developed a companion R package, MetaboAnalystR.

2 Implementation and features

MetaboAnalystR is written in the R language (R Core Team, 2016). The development version is hosted on GitHub and the stable release will soon be available as an R package on CRAN. It builds upon the R code base from the web server, with extensive modifications to ensure functional compatibility across both the web and the R command line. To ease the learning process, we have completely revamped the MetaboAnalyst web interface to expose the underlying R commands during analysis. The R package and the web server have been extensively tested to ensure that they generate identical results. MetaboAnalystR conforms to the R package quality standards (Wickham, 2015), including comprehensive vignettes for each module with detailed case studies. The analysis workflow is summarized in Figure 1.
Fig. 1. Main features of the MetaboAnalystR package. The R command history generated on the web server can be directly used by MetaboAnalystR (A). Batch processing of metabolomics data can be accomplished using the R package (B). MetaboAnalystR can be integrated with other R packages (C). Its open-source nature will also facilitate further metabolomics software development (D).

2.1 Functionality

MetaboAnalystR consists of over 500 functions organized into 11 modules (statistical analysis, biomarker analysis, time-series analysis, power analysis, biomarker meta-analysis, enrichment analysis, pathway analysis, joint pathway analysis, network explorer, MS peaks to pathways and other utilities). MetaboAnalystR builds upon several R packages such as caret (Kuhn, 2008) for classification and performance evaluation, and ROCR (Sing et al.) for visualizing biomarker performance. It also contains a high-performance implementation of the mummichog algorithm to infer pathway activities from m/z peak lists (Li et al.). MetaboAnalystR uses the MetaboAnalyst knowledgebase, including compound libraries, pathway libraries, and metabolite set libraries. These libraries are downloaded from the central repository upon first request.
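The download-on-first-request behaviour can be pictured with a small caching helper. The following is only a sketch in base R under stated assumptions: `get_library`, the file name `compound_db.rds` and the URL `https://example.org/libs` are hypothetical placeholders, not the package's actual functions or repository layout. A stub replaces the network call so the example runs offline.

```r
# Return the local path of a library file, fetching it only when it is absent.
# 'base_url' and the file name are placeholders, not the real repository layout.
get_library <- function(name, base_url,
                        cache_dir = file.path(tempdir(), "libs"),
                        fetch = utils::download.file) {
  dir.create(cache_dir, showWarnings = FALSE, recursive = TRUE)
  dest <- file.path(cache_dir, name)
  if (!file.exists(dest)) {
    # First request: retrieve the file and store it in the cache directory
    fetch(paste0(base_url, "/", name), dest, mode = "wb")
  }
  dest
}

# Offline demonstration: a stub stands in for the network call and counts its uses
calls <- 0
stub_fetch <- function(url, destfile, ...) {
  calls <<- calls + 1
  writeLines("placeholder library contents", destfile)
  invisible(0)
}

demo_cache <- tempfile("cache_")
p1 <- get_library("compound_db.rds", "https://example.org/libs", demo_cache, stub_fetch)
p2 <- get_library("compound_db.rds", "https://example.org/libs", demo_cache, stub_fetch)
calls  # the stub ran once; the second request was served from the cache
```

Passing the fetch function as an argument keeps the caching logic separate from the transport, which also makes the helper easy to test without network access.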

2.2 Reproducibility and transparency

The MetaboAnalyst web interface now features an R command history panel that is updated in real time during data analysis. Users can export this history as an R script containing the R functions, the parameters used, and the order in which they were executed. These commands can be copied and pasted into R or RStudio (RStudio Team, 2015) to reproduce identical results. The web interface coupled with R commands maximizes the transparency of each analysis step, and will greatly help non-programmers learn to use MetaboAnalystR. Both the R package and the web server generate analysis reports for each module using Sweave (Leisch, 2002). We have updated the report templates for all modules, which now contain detailed information on each analysis step, followed by the corresponding results and the R command history.
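One way to re-run an exported command history and keep a record of the software environment is sketched below in base R. The two-line script written to `history_file` is a stand-in for a real exported history, and `run_history` is a hypothetical helper, not part of MetaboAnalystR.

```r
# Placeholder for a script exported from the web interface; a trivial
# two-line script is written here so the sketch is self-contained.
history_file <- tempfile(fileext = ".R")
writeLines(c("x <- c(1, 2, 3)", "result <- mean(x)"), history_file)

run_history <- function(path) {
  env <- new.env()
  # echo = TRUE prints each command as it is evaluated
  source(path, local = env, echo = TRUE)
  # Store R and package versions next to the results for later comparison
  list(env = env, session = utils::sessionInfo())
}

run1 <- run_history(history_file)
run2 <- run_history(history_file)

# Two runs of the same script should agree exactly
identical(run1$env$result, run2$env$result)
```

Evaluating the script in a fresh environment avoids interference from objects left over in the workspace, and the stored session information helps explain any differences that appear after package updates.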

2.3 Flexibility and scalability

Another key feature of MetaboAnalystR is the flexibility it gives users in performing their metabolomics data analysis. The R code from the command history, together with the R package itself, allows users to easily adjust parameters or modify existing workflows. The modularity of MetaboAnalystR permits it to be easily integrated with other existing tools to develop custom metabolomics pipelines. For instance, MetaboAnalystR is currently interoperable with XCMS (xcms:::.write.metaboanalyst), and supports the NetCDF, mzDATA and mzXML file formats. Support for mzTab (Griss et al.) will be added in a future release. Because the MetaboAnalyst public web server imposes a size restriction (50 MB), the R package will be of great use to users who need to process larger datasets, either directly or in batches.
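Batch processing of this kind usually amounts to wrapping a per-file workflow in a function and applying it to a list of files. The sketch below uses base R only: `analyze_one` and `batch_analyze` are hypothetical names, the body of `analyze_one` merely counts samples and features, and in practice it would hold the MetaboAnalystR commands taken from an exported command history. The input layout (sample names in the first column, group label in the second) is also an assumption.

```r
# Hypothetical per-file workflow; replace the body with real analysis commands.
analyze_one <- function(path) {
  dat <- read.csv(path, row.names = 1, check.names = FALSE)
  data.frame(file = basename(path),
             n_samples = nrow(dat),
             n_features = ncol(dat) - 1)  # first data column assumed to hold the group label
}

# Apply the workflow to every file, skipping files that fail instead of aborting
batch_analyze <- function(paths) {
  results <- lapply(paths, function(p) {
    tryCatch(analyze_one(p),
             error = function(e) {
               warning("skipping ", basename(p), ": ", conditionMessage(e))
               NULL
             })
  })
  do.call(rbind, results)
}

# Two small placeholder tables so the sketch runs on its own
demo_dir <- tempfile("batch_")
dir.create(demo_dir)
for (i in 1:2) {
  demo <- data.frame(Sample = paste0("S", 1:4),
                     Label = rep(c("ctrl", "case"), 2),
                     m1 = runif(4),
                     m2 = runif(4))
  write.csv(demo, file.path(demo_dir, paste0("study", i, ".csv")), row.names = FALSE)
}

summary_table <- batch_analyze(
  list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)
)
summary_table
```

Catching errors per file means one malformed dataset does not stop the remaining analyses, which matters when a batch runs unattended.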

3 Case studies

To demonstrate the functionality, flexibility and scalability of the MetaboAnalystR package, we performed analyses on two sets of metabolomics data. Detailed discussions and comparisons are available in the GitHub repository under ‘Case Studies’.

4 Conclusion

Data analysis has become a major bottleneck in current metabolomics workflows. In-depth analysis of metabolomics data can be daunting to most researchers, and requires powerful and flexible software solutions. MetaboAnalystR complements the popular MetaboAnalyst web server by providing a comprehensive R package to facilitate flexible and reproducible metabolomics data analysis.

Funding

This work was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grant and the Canada Research Chairs (CRC) program (to J. Xia). J. Chong is supported in part by the McGill Graduate Dean’s Award.

Conflict of Interest: none declared.

References

1.  ROCR: visualizing classifier performance in R.

Authors:  Tobias Sing; Oliver Sander; Niko Beerenwinkel; Thomas Lengauer
Journal:  Bioinformatics       Date:  2005-08-11       Impact factor: 6.937

2.  Systems biology guided by XCMS Online metabolomics.

Authors:  Tao Huan; Erica M Forsberg; Duane Rinehart; Caroline H Johnson; Julijana Ivanisevic; H Paul Benton; Mingliang Fang; Aries Aisporna; Brian Hilmers; Farris L Poole; Michael P Thorgersen; Michael W W Adams; Gregory Krantz; Matthew W Fields; Paul D Robbins; Laura J Niedernhofer; Trey Ideker; Erica L Majumder; Judy D Wall; Nicholas J W Rattray; Royston Goodacre; Luke L Lairson; Gary Siuzdak
Journal:  Nat Methods       Date:  2017-04-27       Impact factor: 28.547

3.  Emerging applications of metabolomics in drug discovery and precision medicine.

Authors:  David S Wishart
Journal:  Nat Rev Drug Discov       Date:  2016-03-11       Impact factor: 84.694

4.  Workflow4Metabolomics: a collaborative research infrastructure for computational metabolomics.

Authors:  Franck Giacomoni; Gildas Le Corguillé; Misharl Monsoor; Marion Landi; Pierre Pericard; Mélanie Pétéra; Christophe Duperier; Marie Tremblay-Franco; Jean-François Martin; Daniel Jacob; Sophie Goulitquer; Etienne A Thévenot; Christophe Caron
Journal:  Bioinformatics       Date:  2014-12-19       Impact factor: 6.937

5.  The mzTab data exchange format: communicating mass-spectrometry-based proteomics and metabolomics experimental results to a wider audience.

Authors:  Johannes Griss; Andrew R Jones; Timo Sachsenberg; Mathias Walzer; Laurent Gatto; Jürgen Hartler; Gerhard G Thallinger; Reza M Salek; Christoph Steinbeck; Nadin Neuhauser; Jürgen Cox; Steffen Neumann; Jun Fan; Florian Reisinger; Qing-Wei Xu; Noemi Del Toro; Yasset Pérez-Riverol; Fawaz Ghali; Nuno Bandeira; Ioannis Xenarios; Oliver Kohlbacher; Juan Antonio Vizcaíno; Henning Hermjakob
Journal:  Mol Cell Proteomics       Date:  2014-06-30       Impact factor: 5.911

6.  Galaxy-M: a Galaxy workflow for processing and analyzing direct infusion and liquid chromatography mass spectrometry-based metabolomics data.

Authors:  Robert L Davidson; Ralf J M Weber; Haoyu Liu; Archana Sharma-Oates; Mark R Viant
Journal:  Gigascience       Date:  2016-02-23       Impact factor: 6.524

7.  Predicting network activity from high throughput metabolomics.

Authors:  Shuzhao Li; Youngja Park; Sai Duraisingham; Frederick H Strobel; Nooruddin Khan; Quinlyn A Soltow; Dean P Jones; Bali Pulendran
Journal:  PLoS Comput Biol       Date:  2013-07-04       Impact factor: 4.475

8.  MetaboAnalyst 4.0: towards more transparent and integrative metabolomics analysis.

Authors:  Jasmine Chong; Othman Soufan; Carin Li; Iurie Caraus; Shuzhao Li; Guillaume Bourque; David S Wishart; Jianguo Xia
Journal:  Nucleic Acids Res       Date:  2018-07-02       Impact factor: 16.971
