bioBakery: a meta'omic analysis environment.

Lauren J McIver, Galeb Abu-Ali, Eric A Franzosa, Randall Schwager, Xochitl C Morgan, Levi Waldron, Nicola Segata, Curtis Huttenhower.

Abstract

Summary: bioBakery is a meta'omic analysis environment and collection of individual software tools with the capacity to process raw shotgun sequencing data into actionable microbial community feature profiles, summary reports, and publication-ready figures. It includes a collection of pre-configured analysis modules also joined into workflows for reproducibility.

Availability and implementation: bioBakery (http://huttenhower.sph.harvard.edu/biobakery) is publicly available for local installation as individual modules and as a virtual machine image. Each individual module has been developed to perform a particular task (e.g. quantitative taxonomic profiling or statistical analysis), and they are provided with source code, tutorials, demonstration data, and validation results; the bioBakery virtual image includes the entire suite of modules and their dependencies pre-installed. Images are available for both Amazon EC2 and Google Compute Engine. All software is open source under the MIT license. bioBakery is actively maintained with a support group at biobakery-users@googlegroups.com and new tools being added upon their release.

Contact: chuttenh@hsph.harvard.edu.

Supplementary information: Supplementary data are available at Bioinformatics online.

Year:  2018        PMID: 29194469      PMCID: PMC6030947          DOI: 10.1093/bioinformatics/btx754

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 Introduction

The bioBakery suite is a collection of computational tools for quantitative microbial community analysis based on meta’omic (shotgun metagenome or metatranscriptome) sequencing data. It includes individual tools, workflows for executing them reproducibly, and a pre-built virtual environment that removes the burden of identifying and installing those tools and their dependencies. A full list of tools in the suite is included in the Supplementary Material. bioBakery implements complete ‘fire-and-forget’ analysis workflows for sample quality control, profiling and visualization, reducing the time users spend actively directing computations while ensuring workflow accuracy and completeness. The workflows perform dependency-driven, scalable analysis and produce a set of validated data products and standardized summary reports. Each can be executed with a single command and can distribute its tasks locally or across a grid computing environment.

2 The bioBakery suite

The tool suite is composed of software in three categories: (i) composition analysis, (ii) statistical analysis, and (iii) infrastructure and utilities (Supplementary Fig. S1). Composition analysis tools take shotgun-sequencing data as input to quantify the presence and abundance of microbial features, e.g. species, strains, gene families, and metabolic pathways. The products from these tools are data tables, which can then be processed by statistical analysis or visualization tools to identify significant associations (among microbial features or with sample metadata). Infrastructure and utilities tools support common meta’omic analyses in part through dependency-driven, reproducible workflows (http://huttenhower.sph.harvard.edu/biobakery_workflows) built with AnADAMA2 (http://huttenhower.sph.harvard.edu/anadama2).
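As a rough illustration of what "dependency-driven" means here, the sketch below plans which tasks need to run from their declared inputs and outputs. It is not AnADAMA2's implementation or API: the function `plan_tasks`, the task names and the file names are all hypothetical, and the check only looks at whether target files exist, whereas a real workflow engine would typically also compare timestamps or checksums.

```python
from graphlib import TopologicalSorter


def plan_tasks(tasks, existing):
    """Return the names of the tasks to run, in dependency order.

    tasks:    dict mapping task name -> (depends, targets), two lists of file names.
    existing: set of file names that are already present.
    A task is skipped when all of its targets exist and none of its inputs
    is regenerated earlier in the same run.
    """
    # Map each target file to the task that produces it.
    producer = {t: name for name, (_, targets) in tasks.items() for t in targets}
    # A task depends on whichever tasks produce its input files.
    graph = {
        name: {producer[d] for d in depends if d in producer}
        for name, (depends, _) in tasks.items()
    }
    to_run, regenerated = [], set()
    for name in TopologicalSorter(graph).static_order():
        depends, targets = tasks[name]
        missing = any(t not in existing for t in targets)
        stale = any(d in regenerated for d in depends)
        if missing or stale:
            to_run.append(name)
            regenerated.update(targets)
    return to_run


# Hypothetical three-step pipeline; names are illustrative only.
tasks = {
    "quality_control": (["sample.fastq.gz"], ["sample.clean.fastq"]),
    "taxonomic_profile": (["sample.clean.fastq"], ["taxa.tsv"]),
    "functional_profile": (["sample.clean.fastq", "taxa.tsv"], ["pathways.tsv"]),
}
```

Calling `plan_tasks(tasks, {"sample.fastq.gz"})` schedules all three steps, while a rerun in which only `pathways.tsv` is missing schedules just the last one. This is the property that lets an interrupted workflow resume without repeating finished work.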

3 bioBakery workflows

bioBakery workflows provide simple and reproducible execution of the many complex steps in processing meta’omic sequencing data. bioBakery includes a collection of data processing and visualization workflows that, starting from raw shotgun sequencing reads (metagenomic or metatranscriptomic), efficiently process data through read-level quality control, taxonomic profiling, and functional profiling steps. Workflows also exist to process raw 16S rRNA gene sequencing data. The outputs of the data processing workflow are then forwarded to a visualization workflow for generating data reports and publication-ready figures (Supplementary Fig. S2). For detailed workflow diagrams, default workflow settings, and instructions on how to customize workflow parameters, refer to the bioBakery workflows user manual.

3.1 Metagenomic application example

The following two commands execute integrated workflows that process raw shotgun metagenomic sequencing reads (Fig. 1):

$ biobakery_workflows wmgx --input <input directory> --output <output directory>
$ biobakery_workflows wmgx_vis --input <input directory> --output <output directory> --project-name <project name>

Fig. 1. The default metagenome workflow incorporates several individual tools that together process raw sequences into a set of data products, reports and visualizations.

In the first command, the input is a directory containing shotgun sequencing data (e.g. gzip-compressed fastq files) and the output is a directory where the data products are written (e.g. feature abundance tables). In the second command, the input is the directory of data products created by the first command and the output is a directory where the visualizations are written. See the bioBakery workflows tutorial for demo data sets including raw input files, data products and visualizations.
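The hand-off between the two commands (the output directory of the first becomes the input directory of the second) can be sketched in Python as follows. Only the `wmgx` and `wmgx_vis` subcommands and the `--input`, `--output` and `--project-name` options come from the text above; the helper names and the `data_products`/`visualizations` directory layout are assumptions made for illustration.

```python
import subprocess
from pathlib import Path


def wmgx_command(input_dir, output_dir):
    """Argument list for the data processing workflow."""
    return ["biobakery_workflows", "wmgx",
            "--input", str(input_dir), "--output", str(output_dir)]


def wmgx_vis_command(data_dir, vis_dir, project_name):
    """Argument list for the visualization workflow."""
    return ["biobakery_workflows", "wmgx_vis",
            "--input", str(data_dir), "--output", str(vis_dir),
            "--project-name", project_name]


def run_metagenome_analysis(raw_dir, work_dir, project_name, dry_run=True):
    """Build, and optionally run, the two workflows back to back."""
    # Hypothetical layout: both result folders live under one working directory.
    data_dir = Path(work_dir) / "data_products"
    vis_dir = Path(work_dir) / "visualizations"
    commands = [
        wmgx_command(raw_dir, data_dir),
        # The visualization step reads what the processing step wrote.
        wmgx_vis_command(data_dir, vis_dir, project_name),
    ]
    if not dry_run:
        for command in commands:
            # check=True stops the chain if a workflow exits with an error.
            subprocess.run(command, check=True)
    return commands
```

With the default `dry_run=True` nothing is executed, so the returned argument lists can be inspected before committing to a long run; passing `dry_run=False` requires `biobakery_workflows` to be installed and on the `PATH`.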

4 bioBakery homebrew packages

The bioBakery tool suite can be installed locally on macOS with the Homebrew package manager (http://brew.sh/), and on Linux with the Linuxbrew package manager (http://linuxbrew.sh/). Linuxbrew does not require root permissions, which makes it well suited to a single user installing tools in a grid computing environment that does not have a container platform available. The bioBakery Homebrew formulas are available at https://github.com/biobakery/homebrew-biobakery. For detailed instructions on how to install the bioBakery Homebrew formulas, see the section on “installing bioBakery” in the Supplementary Material.

5 bioBakery virtual machine

The bioBakery virtual machine (VM) is a Vagrant (https://www.vagrantup.com/) box with VirtualBox (https://www.virtualbox.org/) as the provider, currently running Ubuntu 16.04. All releases are hosted by Vagrant at https://app.vagrantup.com/biobakery/boxes/biobakery. See the Supplementary Material for detailed instructions on installing and running the bioBakery VM. The bioBakery VM is ideally suited for analyzing small data sets or for learning how to run tools in the suite with their corresponding tutorials. The recommended resources for running the VM are 12 GB of RAM [of which 8 GB (tunable default) is allocated to the VM] and 16 GB of available disk space. Users lacking these resources can forgo the VM installation and fetch individual bioBakery tools using the Homebrew formulas or Docker images (detailed in Supplementary Material).
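Before downloading the VM, a host can be checked against the recommended figures with a few lines of Python. This is a convenience sketch rather than part of bioBakery: the function names are hypothetical, the thresholds are the 12 GB and 16 GB quoted above, and the RAM probe relies on `os.sysconf`, which is only available on Unix-like systems.

```python
import os
import shutil

RECOMMENDED_RAM_GB = 12   # host RAM recommended in the text
RECOMMENDED_DISK_GB = 16  # free disk space recommended in the text


def meets_vm_recommendation(ram_gb, free_disk_gb):
    """True when both figures reach the recommended minimums."""
    return ram_gb >= RECOMMENDED_RAM_GB and free_disk_gb >= RECOMMENDED_DISK_GB


def host_resources(path="."):
    """Best-effort (RAM, free disk) in GB; RAM is None if it cannot be read."""
    free_disk_gb = shutil.disk_usage(path).free / 1e9
    try:
        # Total physical memory = page size * number of physical pages.
        ram_gb = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1e9
    except (AttributeError, ValueError, OSError):
        ram_gb = None  # e.g. on Windows, where os.sysconf is unavailable
    return ram_gb, free_disk_gb
```

If `host_resources()` reports less than the recommendation, the Homebrew formulas or Docker images are the lighter-weight route to the same tools.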

6 Scaling up bioBakery

For large data sets, a grid or cloud computing environment is recommended with the tool suite installed using Homebrew or Docker images. If using Google Compute Engine (GCE), a public Google Cloud image is available for bioBakery through the bioBakery Google Cloud Bucket. There are two steps to start running with the bioBakery Google Cloud image: (i) create your own image from the public image and (ii) create your own VM instance from your new image. If using Amazon EC2, use the public bioBakery Amazon Machine Image (AMI) when creating your instance. The minimal recommended configuration (sufficient for running all bioBakery demos) is the machine type ‘n1-standard-2’ for GCE and ‘t2.large’ for Amazon EC2 (each providing ∼8 GB of RAM and 2 CPU cores).

7 Conclusion

bioBakery provides a complete meta’omic analysis environment with simple-to-use, reproducible workflows that efficiently process raw data into analysis reports containing publication-ready figures.