Shiny-phyloseq: Web application for interactive microbiome analysis with provenance tracking.

Abstract

We have created a Shiny-based Web application, called Shiny-phyloseq, for dynamic interaction with microbiome data. It runs on any modern Web browser and requires no programming, which increases accessibility and lowers the entry barrier to using phyloseq and related R tools. Along with a data- and context-aware dynamic interface for exploring the effects of parameter and method choices, Shiny-phyloseq also records the complete user input and subsequent graphical results of a user's session, allowing the user to archive, share and reproduce the sequence of steps that created their result without writing any new code themselves.
AVAILABILITY AND IMPLEMENTATION: Shiny-phyloseq is implemented entirely in the R language. It can be hosted or launched by any system with R installed, including Windows, Mac OS and most Linux distributions. Information technology administrators can also host Shiny-phyloseq from a remote server, in which case users need only have a Web browser installed. Shiny-phyloseq is provided free of charge under a GPL-3 open-source license through GitHub at http://joey711.github.io/shiny-phyloseq/.

Year: 2014 PMID: 25262154 PMCID: PMC4287943 DOI: 10.1093/bioinformatics/btu616

Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937

1 INTRODUCTION

Analysis of microbial communities requires the interpretation of one or more high-dimensional abundance matrices and their relationship with other datasets, using a complex and emerging suite of methods from ecology, genetics, phylogenetics, multivariate statistics, visualization and testing. Filtering, custom curation and transformation of the abundance data are usually also required, but the precise workflow from raw data to final analyses is often difficult or impossible to reproduce exactly. Ideally, published scientific analyses are completely reproducible in as easy a fashion as possible; anything less represents an impediment to both progress and peer review (Peng, 2011). Fortunately, R is well suited for the analytical aspects of microbiome research, with an interaction-oriented functional programming design (R Development Core Team, 2014) that includes support for reproducible research (Allaire et al.; Xie, 2014). We recently described a software package for the R language, phyloseq, dedicated to the object-oriented representation and analysis of microbiome census data (McMurdie and Holmes, 2013). One of the originally stated goals of phyloseq was to leverage R-based resources for reproducible research, and thereby improve the reproducibility and portability of published microbiome analyses. Unfortunately, for many microbiome researchers with classical training in biology, learning a programming language, even a functional interactive language like R, has proven to be a prohibitive investment of time and effort. However, most of the necessary computations are not only tractable in R, but also fast enough for dynamic interaction via a graphical user interface (GUI). Here we describe our release of ‘Shiny-phyloseq’, a Web browser GUI that leverages phyloseq and other R resources for the analysis of microbiome census data while also allowing the user to archive the complete code and data necessary to exactly reproduce their session results.
Although it is difficult for any single GUI application to support the full range of analyses required in microbiome research, Shiny-phyloseq provides a compelling framework and introduction to microbiome analysis in R. Its modular open-source design encourages modification, customization and code reuse. There are other resources with GUI elements available for the analysis of microbiome census data, including MG-RAST (Meyer et al.), QIIME (Caporaso et al.) and CloVR (Angiuoli et al.). To our knowledge, however, no other GUI for the analysis of microbiome census data leverages the R programming language, ggplot2 (Wickham, 2009) and phyloseq while also providing a ‘provenance record’ of a user’s session.

2 METHODS

Shiny-phyloseq is almost entirely R code but, like any Shiny app, can be further customized or extended using HTML, CSS and JavaScript. Shiny-phyloseq is fully cross-platform and will launch locally from any R environment (Console R, Rgui, RStudio, etc.). It can also be hosted by a remote Web server, in which case the user only needs a modern Web browser. Shiny-phyloseq leverages Shiny’s reactive programming framework to compartmentalize and cache expensive computational steps so that they are not recomputed unnecessarily during an interactive session. The current implementation of Shiny-phyloseq depends on many important updates to the phyloseq package, including (i) an interface to DESeq2 (Anders and Huber, 2010) for a negative binomial method recommended by McMurdie and Holmes (2014); (ii) a ggplot2-friendly data organizing function, psmelt; (iii) the inclusion of low-level C code from APE (Paradis et al.) for ∼100× faster UniFrac (Hamady et al.) distance calculations and tree plotting; (iv) a faster network plot function with better default settings, plot_net; and (v) additional options for ordering heatmaps by covariates.
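The caching behaviour described above can be illustrated with a minimal, language-neutral sketch. It is written in Python rather than R and uses plain memoization (`functools.lru_cache`) as a stand-in for Shiny's reactive dependency tracking, so it is an analogy and not the Shiny-phyloseq implementation. The dataset, the function names `pairwise_distance` and `render_plot`, and the two distance methods are hypothetical.

```python
from functools import lru_cache

# Hypothetical toy data: each sample is (name, tuple of OTU counts).
# Tuples are used so that everything stays hashable and immutable.
DATASETS = {
    "demo": (("s1", (10, 0, 5)), ("s2", (8, 2, 5)), ("s3", (0, 9, 1))),
}

CALLS = {"distance": 0}  # counts how often the expensive step really runs


@lru_cache(maxsize=None)
def pairwise_distance(dataset_name, method):
    """Stand-in for an expensive step; rerun only when its inputs change."""
    CALLS["distance"] += 1
    samples = DATASETS[dataset_name]
    out = {}
    for i, (name_a, a) in enumerate(samples):
        for name_b, b in samples[i + 1:]:
            if method == "manhattan":
                d = sum(abs(x - y) for x, y in zip(a, b))
            elif method == "euclidean":
                d = sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
            else:
                raise ValueError(f"unknown method: {method}")
            out[(name_a, name_b)] = d
    return out


def render_plot(dataset_name, method, color):
    """Cheap step: changing only the color reuses the cached distances."""
    distances = pairwise_distance(dataset_name, method)
    return {"color": color, "n_pairs": len(distances)}


render_plot("demo", "manhattan", color="blue")
render_plot("demo", "manhattan", color="red")   # cache hit, nothing recomputed
render_plot("demo", "euclidean", color="red")   # method changed, recomputed
```

In this sketch the distance step runs twice for three renders, because only a change to the dataset or method invalidates it. A reactive framework goes further by tracking such dependencies automatically instead of relying on explicit cache keys.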

3 BIOLOGICAL APPLICATIONS

A typical workflow in Shiny-phyloseq begins with data upload and selection, followed by optional filtering of Operational Taxonomic Units (OTUs) or samples. A number of exploratory data analysis methods are available in separate panels, including alpha diversity estimates, multivariate ordination methods, as well as network, heatmap, scatter and bar graphics. Some of these methods depend on data transformations (e.g. regularized log) or ecological distances (e.g. weighted UniFrac, Bray–Curtis), which can be selected from a sidebar panel of parameter-input widgets placed next to each graphic on each panel. Graphics can be downloaded in a user-specified format by clicking the download button at the bottom of each sidebar panel. Finally, a user can download a compressed file containing the complete code and data needed to reproduce the steps that led to their graphical result. Shiny-phyloseq provides new features, including (i) a context- and data-aware, browser-based interactive GUI application, (ii) interactive 3D network graphics based on d3.js, for exploring OTU or sample distance structure and (iii) provenance tracking for reproducible sessions. There are two network-graphic panels, both of which use new functionality. The Network panel can animate vertex connectedness as the distance threshold changes, and it scales edge thickness according to inverse distance value. The animation can help a user to quickly scan the dependency of the network on the choice of global threshold. Alternatively, using the d3Network panel, networks can be interactively explored by dragging and stretching nodes in a live 3D network animation, with taxonomy or covariates mapped to node color and/or mouse-over labels (Fig. 1). This D3 (Bostock et al.) interactive graphic is a special phyloseq implementation of the d3Network package (Gandrud, 2014).
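To make the dependence of a sample network on the distance threshold concrete, the following is a minimal sketch in Python (not the phyloseq or d3Network code). It computes the Bray–Curtis dissimilarity between samples and keeps an edge when the dissimilarity is at or below a threshold. The toy counts, the function names, and the width rule `1 - d` are assumptions made for illustration; the text above says only that thickness scales with inverse distance.

```python
from itertools import combinations


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two vectors of non-negative counts."""
    total = sum(a) + sum(b)
    if total == 0:
        return 0.0  # two empty samples are treated as identical
    return sum(abs(x - y) for x, y in zip(a, b)) / total


def threshold_network(counts, max_distance):
    """Return edges (u, v, distance, width) for pairs within max_distance."""
    edges = []
    for u, v in combinations(sorted(counts), 2):
        d = bray_curtis(counts[u], counts[v])
        if d <= max_distance:
            # Assumed width rule: closer samples get thicker edges.
            edges.append((u, v, d, 1.0 - d))
    return edges


# Hypothetical OTU counts for three samples.
counts = {"s1": [10, 0, 5], "s2": [8, 2, 5], "s3": [0, 9, 1]}

# Scanning the threshold mimics the animation over distance thresholds.
edge_counts = [len(threshold_network(counts, t)) for t in (0.5, 0.8, 1.0)]
```

With these toy counts the network gains one edge at each successive threshold, which is the kind of sensitivity that scanning the threshold is meant to expose.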

Fig. 1.

Shiny-phyloseq Network panel. The Shiny-phyloseq interface is organized into panels from left to right, beginning with data selection, filtering/curation, transformation and then graphic-specific panels. The user input widgets are consolidated in a left-hand sidebar of each panel. The Select Dataset panel begins each session. Pre-loaded datasets are available by default, and users can optionally specify public datasets hosted on QIIME-DB (Caporaso et al.), or upload private datasets in biom (McDonald et al.) or binary ‘.RData’ (phyloseq) formats. The Filter panel supports user-defined data filtering. Shiny-phyloseq provides a separate panel for each major graphic function in phyloseq, including alpha diversity, sample- or OTU-networks, barplot, heatmap, phylogenetic tree, ordination and scatterplot. All relevant panels support customization of figure dimensions, color palette and download file format. The final panel, Provenance, includes a button that (re)initiates a special processing of the Shiny event log into reusable R code. This record of parameter changes and analysis steps, when coupled with the complete workspace environment data, constitutes a so-called provenance record of the analysis session. Shiny-phyloseq initiates a browser download of the compressed file containing the code and data sufficient to exactly reproduce the series of graphics created in the session up to that point. This can be archived or shared for batch re-execution in R.
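The idea of turning an event log into reusable code can be sketched as follows. This is a simplified Python illustration, not the processing that Shiny-phyloseq applies to the Shiny event log. The class `ProvenanceLog`, the panel functions `filter_panel` and `ordination_panel`, and their parameters are all hypothetical.

```python
import json


class ProvenanceLog:
    """Records each analysis step with the parameters the user chose."""

    def __init__(self):
        self.events = []

    def record(self, panel, **params):
        self.events.append({"panel": panel, "params": params})

    def to_script(self):
        """Render the log as readable code, one call per recorded step."""
        lines = []
        for ev in self.events:
            args = ", ".join(
                f"{key}={value!r}" for key, value in sorted(ev["params"].items())
            )
            lines.append(f"{ev['panel']}({args})")
        return "\n".join(lines)

    def to_json(self):
        """Serialize the log so that it can be archived or shared."""
        return json.dumps(self.events, sort_keys=True)


def replay(events, panels):
    """Re-run archived events in order against a registry of panel functions."""
    return [panels[ev["panel"]](**ev["params"]) for ev in events]


# Hypothetical stand-ins for two analysis panels.
def filter_panel(min_count):
    return f"kept taxa with at least {min_count} reads"


def ordination_panel(method, distance):
    return f"{method} ordination on {distance} distances"


PANELS = {"filter_panel": filter_panel, "ordination_panel": ordination_panel}

log = ProvenanceLog()
log.record("filter_panel", min_count=5)
log.record("ordination_panel", method="PCoA", distance="bray")

archived = log.to_json()                         # what would be downloaded
replayed = replay(json.loads(archived), PANELS)  # batch re-execution
```

The record becomes reproducible only when the archived steps travel together with the data they were applied to, which is why the download described above bundles code and workspace data in one compressed file.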

REFERENCES

1. D³: Data-Driven Documents.

Authors: Michael Bostock; Vadim Ogievetsky; Jeffrey Heer
Journal: IEEE Trans Vis Comput Graph Date: 2011-12 Impact factor: 4.579

2. Reproducible research in computational science.

Authors: Roger D Peng
Journal: Science Date: 2011-12-02 Impact factor: 47.728

3. QIIME allows analysis of high-throughput community sequencing data.

Authors: J Gregory Caporaso; Justin Kuczynski; Jesse Stombaugh; Kyle Bittinger; Frederic D Bushman; Elizabeth K Costello; Noah Fierer; Antonio Gonzalez Peña; Julia K Goodrich; Jeffrey I Gordon; Gavin A Huttley; Scott T Kelley; Dan Knights; Jeremy E Koenig; Ruth E Ley; Catherine A Lozupone; Daniel McDonald; Brian D Muegge; Meg Pirrung; Jens Reeder; Joel R Sevinsky; Peter J Turnbaugh; William A Walters; Jeremy Widmann; Tanya Yatsunenko; Jesse Zaneveld; Rob Knight
Journal: Nat Methods Date: 2010-04-11 Impact factor: 28.547

4. CloVR: a virtual machine for automated and portable sequence analysis from the desktop using cloud computing.

Authors: Samuel V Angiuoli; Malcolm Matalka; Aaron Gussman; Kevin Galens; Mahesh Vangala; David R Riley; Cesar Arze; James R White; Owen White; W Florian Fricke
Journal: BMC Bioinformatics Date: 2011-08-30 Impact factor: 3.307

5. Differential expression analysis for sequence count data.

Authors: Simon Anders; Wolfgang Huber
Journal: Genome Biol Date: 2010-10-27 Impact factor: 13.583

6. The metagenomics RAST server - a public resource for the automatic phylogenetic and functional analysis of metagenomes.

Authors: F Meyer; D Paarmann; M D'Souza; R Olson; E M Glass; M Kubal; T Paczian; A Rodriguez; R Stevens; A Wilke; J Wilkening; R A Edwards
Journal: BMC Bioinformatics Date: 2008-09-19 Impact factor: 3.169

7. Fast UniFrac: facilitating high-throughput phylogenetic analyses of microbial communities including analysis of pyrosequencing and PhyloChip data.

Authors: Micah Hamady; Catherine Lozupone; Rob Knight
Journal: ISME J Date: 2009-08-27 Impact factor: 10.302

8. phyloseq: an R package for reproducible interactive analysis and graphics of microbiome census data.

Authors: Paul J McMurdie; Susan Holmes
Journal: PLoS One Date: 2013-04-22 Impact factor: 3.240

9. The Biological Observation Matrix (BIOM) format or: how I learned to stop worrying and love the ome-ome.

Authors: Daniel McDonald; Jose C Clemente; Justin Kuczynski; Jai Ram Rideout; Jesse Stombaugh; Doug Wendel; Andreas Wilke; Susan Huse; John Hufnagle; Folker Meyer; Rob Knight; J Gregory Caporaso
Journal: Gigascience Date: 2012-07-12 Impact factor: 6.524

10. Waste not, want not: why rarefying microbiome data is inadmissible.

Authors: Paul J McMurdie; Susan Holmes
Journal: PLoS Comput Biol Date: 2014-04-03 Impact factor: 4.475
