GenePiper, a Graphical User Interface Tool for Microbiome Sequence Data Mining.

Abstract

Amplicon sequencing of the 16S rRNA gene is commonly performed for the assessment and comparison of microbiomes. Here, we introduce GenePiper, an open-source R Shiny application that provides an easy-to-use interface, a wide range of analytical methods, and optimized graphical outputs for offline microbiome data analyses.

Year: 2020 PMID: 31896633 PMCID: PMC6940285 DOI: 10.1128/MRA.01195-19

Source DB: PubMed Journal: Microbiol Resour Announc ISSN: 2576-098X

ANNOUNCEMENT

The profiling of microbiomes by high-throughput amplicon sequencing has become a standard approach in many research disciplines. Many bioinformatics tools have been developed to analyze such data, as standalone or online applications (1–5). However, most of these tools are command line based and have a steep learning curve for general users. Many of the packages depend on numerous other packages with differing compatibility requirements, which complicates their installation and maintenance, whereas Web applications have data transfer and computation time constraints, especially for large data sets. Uploading data to online servers also raises data security concerns. To address these issues, we developed GenePiper, an open-source R Shiny application, based on a virtual environment, that provides a unified offline system for microbiome data analyses.

GenePiper is built in a virtual Linux environment. It depends on VirtualBox (Oracle) and Vagrant (HashiCorp, USA), which are available on Windows, Mac OS X, and Linux platforms. Users download the GenePiper Vagrant configuration file, which Vagrant uses to set up the R Shiny (6) server, with all of the applications, within a virtual environment on the local computer. The main interface of GenePiper is accessed locally through a Web browser (such as Chrome, Firefox, or Safari).

GenePiper requires three input files, namely, an operational taxonomic unit (OTU) table, a taxonomy table, and a sample data table (Fig. 1). A phylogenetic tree may optionally be provided for UniFrac distance calculations (7). These files are loaded into GenePiper via the data import module. GenePiper constructs a “phyloseq-class” data structure from the loaded data and stores it in RDS format in the virtual environment. The stored RDS data can be recalled by a unique data label in subsequent analytical modules. Alternatively, a phyloseq-class data object stored in RDS format can be imported into GenePiper for analysis.
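As an illustration of the import step described above, the following minimal sketch checks that three input tables agree on OTU and sample identifiers, bundles them, stores the bundle under a unique label, and recalls it by that label. It is written in Python with only the standard library and is not GenePiper's code: GenePiper itself is an R application that builds a phyloseq-class object and saves it in RDS format. The function names (build_dataset, save_dataset, load_dataset), the dictionary layout, and the use of JSON files as a stand-in for RDS are all assumptions made for this example.

```python
import json
import tempfile
from pathlib import Path


def build_dataset(otu_table, taxonomy, sample_data):
    """Check that the three tables are consistent, then bundle them.

    otu_table:   OTU id -> {sample id -> read count}
    taxonomy:    OTU id -> {rank -> taxon name}
    sample_data: sample id -> {variable -> value}
    (Hypothetical layout; a plain dict stands in for a phyloseq-class object.)
    """
    missing_tax = set(otu_table) - set(taxonomy)
    if missing_tax:
        raise ValueError(f"OTUs without taxonomy: {sorted(missing_tax)}")
    samples_in_counts = {s for counts in otu_table.values() for s in counts}
    missing_meta = samples_in_counts - set(sample_data)
    if missing_meta:
        raise ValueError(f"Samples without metadata: {sorted(missing_meta)}")
    return {"otu_table": otu_table, "taxonomy": taxonomy, "sample_data": sample_data}


def save_dataset(data, label, store):
    """Write the bundle to disk under a unique label (JSON stands in for RDS)."""
    path = Path(store) / f"{label}.json"
    if path.exists():
        raise FileExistsError(f"label already in use: {label}")
    with path.open("w", encoding="utf-8") as fh:
        json.dump(data, fh)
    return path


def load_dataset(label, store):
    """Recall a stored bundle by its label, as a downstream module would."""
    with (Path(store) / f"{label}.json").open(encoding="utf-8") as fh:
        return json.load(fh)


# Toy input tables (made-up values, for illustration only).
otu = {"OTU1": {"S1": 10, "S2": 0}, "OTU2": {"S1": 5, "S2": 20}}
tax = {"OTU1": {"Phylum": "Firmicutes"}, "OTU2": {"Phylum": "Bacteroidetes"}}
meta = {"S1": {"group": "control"}, "S2": {"group": "treated"}}

store = tempfile.mkdtemp()  # temporary directory acting as the data store
save_dataset(build_dataset(otu, tax, meta), "demo", store)
restored = load_dataset("demo", store)
```

Validating identifiers once at import time means that every downstream module can assume a consistent data object, which is one plausible reason for storing a single bundled structure rather than three separate tables.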

FIG 1

(Top) GenePiper general workflow. Data loaded into GenePiper are formatted into a phyloseq-class structure and stored in the virtual environment as an RDS file. Downstream analytical modules recall this RDS file for analysis. (Middle) Analysis module panel layout. All of the analytical modules in GenePiper share the same layout, with a top panel that shows the title, description, and references, a bottom-left panel for loading and filtering data, and a bottom-right panel for setting up the analysis parameters and displaying the results. (Bottom) Screenshot of the GenePiper interface.

GenePiper complements existing packages, with easy access to many popular and well-documented analytical methods, including phyloseq (3), vegan (8), phangorn (9), ape (10), VennDiagram (11), Hmisc (12), SpiecEasi (13), SparCC (14), and many others. The analytical modules are categorized into six broad groups, i.e., diversity analysis, descriptive analysis, ordination, clustering, correlation analysis, and nonparametric statistical tests. GenePiper generates figures mainly using the ggplot2 R package (15) and provides full control of the graphical parameters. Users may explore their microbiome data with visualization options that include a diversity index curve, taxonomic bar chart and heatmap, phylogenetic tree, Venn diagram, scatterplot ordinations (correspondence analysis, detrended correspondence analysis, principal component analysis, principal coordinate analysis, nonmetric multidimensional scaling, canonical correspondence analysis, redundancy analysis, canonical correlation analysis, coinertia analysis, Procrustes analysis, and principal response curve), correlation plot, correlation network plot, clustering dendrogram, and boxplot with nonparametric test.

In summary, GenePiper is an integrated data-mining application in which the routine analytical pipeline can be operated easily using a graphical user interface. GenePiper allows researchers to efficiently test-run different parameter combinations for optimization and for generation of results for publication.
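To make the diversity and ordination groups more concrete, the sketch below computes two textbook quantities from a toy OTU count table: the Shannon index of each sample (an alpha diversity measure) and the Bray-Curtis dissimilarity between samples (a common input to principal coordinate analysis). The definitions are standard, but the code is only an illustration in Python; GenePiper performs such calculations in R through packages such as phyloseq and vegan, and the function names and counts here are assumptions for the example.

```python
import math


def shannon_index(counts):
    """Shannon index H = -sum(p_i * ln p_i) over OTUs with nonzero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / sum(a_i + b_i)."""
    denom = sum(a) + sum(b)
    if denom == 0:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / denom


# Toy counts per sample, one entry per OTU (made-up values).
samples = {"S1": [10, 5, 0], "S2": [0, 20, 5], "S3": [10, 5, 0]}

# Alpha diversity: one value per sample.
alpha = {name: shannon_index(counts) for name, counts in samples.items()}

# Beta diversity: one value per unordered pair of samples.
beta = {
    (i, j): bray_curtis(samples[i], samples[j])
    for i in samples
    for j in samples
    if i < j
}
```

Here S1 and S3 have identical counts, so their dissimilarity is 0, whereas S1 and S2 differ in 30 of 40 total reads, which gives 0.75.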

Data availability.

GenePiper is available at https://github.com/raytonghk/GenePiper. A step-by-step overview tutorial is available at https://github.com/raytonghk/genepiper/wiki/01.-Introduction.

10 in total

1. ape 5.0: an environment for modern phylogenetics and evolutionary analyses in R.

Authors: Emmanuel Paradis; Klaus Schliep
Journal: Bioinformatics Date: 2019-02-01 Impact factor: 6.937

2. Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2.

Authors: Evan Bolyen; Jai Ram Rideout; Matthew R Dillon; Nicholas A Bokulich; Christian C Abnet; Gabriel A Al-Ghalith; Harriet Alexander; Eric J Alm; Manimozhiyan Arumugam; Francesco Asnicar; Yang Bai; Jordan E Bisanz; Kyle Bittinger; Asker Brejnrod; Colin J Brislawn; C Titus Brown; Benjamin J Callahan; Andrés Mauricio Caraballo-Rodríguez; John Chase; Emily K Cope; Ricardo Da Silva; Christian Diener; Pieter C Dorrestein; Gavin M Douglas; Daniel M Durall; Claire Duvallet; Christian F Edwardson; Madeleine Ernst; Mehrbod Estaki; Jennifer Fouquier; Julia M Gauglitz; Sean M Gibbons; Deanna L Gibson; Antonio Gonzalez; Kestrel Gorlick; Jiarong Guo; Benjamin Hillmann; Susan Holmes; Hannes Holste; Curtis Huttenhower; Gavin A Huttley; Stefan Janssen; Alan K Jarmusch; Lingjing Jiang; Benjamin D Kaehler; Kyo Bin Kang; Christopher R Keefe; Paul Keim; Scott T Kelley; Dan Knights; Irina Koester; Tomasz Kosciolek; Jorden Kreps; Morgan G I Langille; Joslynn Lee; Ruth Ley; Yong-Xin Liu; Erikka Loftfield; Catherine Lozupone; Massoud Maher; Clarisse Marotz; Bryan D Martin; Daniel McDonald; Lauren J McIver; Alexey V Melnik; Jessica L Metcalf; Sydney C Morgan; Jamie T Morton; Ahmad Turan Naimey; Jose A Navas-Molina; Louis Felix Nothias; Stephanie B Orchanian; Talima Pearson; Samuel L Peoples; Daniel Petras; Mary Lai Preuss; Elmar Pruesse; Lasse Buur Rasmussen; Adam Rivers; Michael S Robeson; Patrick Rosenthal; Nicola Segata; Michael Shaffer; Arron Shiffer; Rashmi Sinha; Se Jin Song; John R Spear; Austin D Swafford; Luke R Thompson; Pedro J Torres; Pauline Trinh; Anupriya Tripathi; Peter J Turnbaugh; Sabah Ul-Hasan; Justin J J van der Hooft; Fernando Vargas; Yoshiki Vázquez-Baeza; Emily Vogtmann; Max von Hippel; William Walters; Yunhu Wan; Mingxun Wang; Jonathan Warren; Kyle C Weber; Charles H D Williamson; Amy D Willis; Zhenjiang Zech Xu; Jesse R Zaneveld; Yilong Zhang; Qiyun Zhu; Rob Knight; J Gregory Caporaso
Journal: Nat Biotechnol Date: 2019-08 Impact factor: 54.908

3. Sparse and compositionally robust inference of microbial ecological networks.

Authors: Zachary D Kurtz; Christian L Müller; Emily R Miraldi; Dan R Littman; Martin J Blaser; Richard A Bonneau
Journal: PLoS Comput Biol Date: 2015-05-07 Impact factor: 4.475

4. UniFrac: a new phylogenetic method for comparing microbial communities.

Authors: Catherine Lozupone; Rob Knight
Journal: Appl Environ Microbiol Date: 2005-12 Impact factor: 4.792

5. Inferring correlation networks from genomic survey data.

Authors: Jonathan Friedman; Eric J Alm
Journal: PLoS Comput Biol Date: 2012-09-20 Impact factor: 4.475

6. phangorn: phylogenetic analysis in R.

Authors: Klaus Peter Schliep
Journal: Bioinformatics Date: 2010-12-17 Impact factor: 6.937

7. VennDiagram: a package for the generation of highly-customizable Venn and Euler diagrams in R.

Authors: Hanbo Chen; Paul C Boutros
Journal: BMC Bioinformatics Date: 2011-01-26 Impact factor: 3.307

8. Shiny-phyloseq: Web application for interactive microbiome analysis with provenance tracking.

Authors: Paul J McMurdie; Susan Holmes
Journal: Bioinformatics Date: 2014-09-26 Impact factor: 6.937

9. ranacapa: An R package and Shiny web app to explore environmental DNA data with exploratory statistics and interactive visualizations.

Authors: Gaurav S Kandlikar; Zachary J Gold; Madeline C Cowen; Rachel S Meyer; Amanda C Freise; Nathan J B Kraft; Jordan Moberg-Parker; Joshua Sprague; David J Kushner; Emily E Curd
Journal: F1000Res Date: 2018-11-01

10. phyloseq: an R package for reproducible interactive analysis and graphics of microbiome census data.

Authors: Paul J McMurdie; Susan Holmes
Journal: PLoS One Date: 2013-04-22 Impact factor: 3.240
