
The Bio-Community Perl toolkit for microbial ecology.

Florent E Angly, Christopher J Fields, Gene W Tyson.

Abstract

The development of bioinformatic solutions for microbial ecology in Perl is limited by the lack of modules to represent and manipulate microbial community profiles from amplicon and meta-omics studies. Here we introduce Bio-Community, an open-source, collaborative toolkit that extends BioPerl. Bio-Community interfaces with commonly used programs using various file formats, including BIOM, and provides operations such as rarefaction and taxonomic summaries. Bio-Community will help bioinformaticians to quickly piece together custom analysis pipelines and develop novel software. Availability and implementation: Bio-Community is cross-platform Perl code available from http://search.cpan.org/dist/Bio-Community under the Perl license. A readme file describes software installation and how to contribute.
© The Author 2014. Published by Oxford University Press.

Year:  2014        PMID: 24618462      PMCID: PMC4071200          DOI: 10.1093/bioinformatics/btu130

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 INTRODUCTION

Sequencing is common in most fields of biological research, and the throughput of modern platforms is orders of magnitude higher than traditional Sanger sequencing (Metzker, 2010). The BioPerl bioinformatic toolkit (Stajich ) has attracted a large community of users and developers and has become critical in many sequencing projects by allowing quick code development and interaction between programs using incompatible file formats. In microbial ecology, sequencing is used routinely for 16S rRNA gene amplicon surveys (Tringe and Hugenholtz, 2008), metagenomics (Handelsman, 2004) and metatranscriptomics (Frias-Lopez ). Because most microorganisms remain uncultivated (Rappé and Giovannoni, 2003), culture-independent molecular surveys are essential for the characterization of environmental microbial communities. However, they require large computational resources, novel bioinformatic tools and elaborate pipelines. Many tools have been developed to analyze the resulting sequence data. For example, libraries written in Python (Knight ) and R (Dixon, 2003; Kembel ) provide blocks for building bioinformatic software. QIIME (Caporaso ) and mothur (Schloss ) are dedicated packages with scripts to build complete analysis pipelines, but they use incompatible file formats. Here, we introduce Bio-Community, a set of format-agnostic modules and scripts to parse and manipulate taxonomic or functional microbial community profiles.

2 FEATURES

2.1 Object model

Bio-Community is a Perl object-oriented toolkit that extends BioPerl. It is centered around the Community object, which contains a group of entities from the same geographic area (Fig. 1). These entities are Member objects, representing individual genomes, genes, taxa or operational taxonomic units from amplicon and meta-omic surveys. Member objects store attributes such as an identifier, a taxon or a sequence and can be given weights to account for the fact that there is no one-to-one relationship between a sequencing read and a microbial cell. The relative abundance or abundance rank of a Member can be calculated based on this Member’s count, weight and the total count in the Community (Fig. 2). Similarly, absolute abundance is based on total microbial abundance in the community, quantifiable by epifluorescence microscopy, qPCR or flow cytometry (Rinsoz ).

Fig. 1. Main objects, their attributes and operation modules
Fig. 2. Relation between abundance types. Relative abundance depends on member counts and weights, whereas absolute abundance is further derived from a total abundance measure
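
The relation between abundance types described in Section 2.1 can be illustrated with a short Python sketch. This is not Bio-Community code: the function names are hypothetical, and the way weights are applied (each count is divided by the member's weight, for example a gene copy number) is an assumption made for illustration, since the text only states that relative abundance depends on counts and weights.

```python
from typing import Dict


def relative_abundance(counts: Dict[str, float],
                       weights: Dict[str, float]) -> Dict[str, float]:
    """Return each member's fraction of the weight-corrected total.

    Assumption: a weight divides the raw count (e.g. 4 gene copies per
    cell turns 80 reads into 20 cell equivalents). Members without an
    explicit weight get a weight of 1.
    """
    weighted = {}
    for member, count in counts.items():
        weight = weights.get(member, 1.0)
        if weight <= 0:
            raise ValueError(f"weight for {member!r} must be positive")
        weighted[member] = count / weight
    total = sum(weighted.values())
    if total == 0:
        return {member: 0.0 for member in counts}
    return {member: value / total for member, value in weighted.items()}


def absolute_abundance(rel: Dict[str, float],
                       total_abundance: float) -> Dict[str, float]:
    """Scale relative abundances by a measured total (e.g. cells per ml)."""
    return {member: fraction * total_abundance
            for member, fraction in rel.items()}


def abundance_ranks(rel: Dict[str, float]) -> Dict[str, int]:
    """Rank members from most (1) to least abundant; ties broken by name."""
    ordered = sorted(rel, key=lambda member: (-rel[member], member))
    return {member: rank for rank, member in enumerate(ordered, start=1)}
```

For example, with counts of 80, 40 and 20 reads for three members and a weight of 4 for the first one, the weight-corrected values are 20, 40 and 20, so the second member becomes the most abundant even though it has fewer reads than the first.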

2.2 Diversity metrics

Bio-Community quantifies community α, β and γ diversity (Whittaker, 1972) using a range of metrics [reviewed by Magurran (2004)]. The diversity of a single Community object, α diversity, is represented by metrics of richness, evenness, dominance and indices (Supplementary Table S1). Several Community objects can be grouped into a Meta object, representing a metacommunity (Leibold ). This object provides methods to measure γ diversity, i.e. the collective diversity of its communities, and β diversity, i.e. their dissimilarity. The γ metrics are the same as those available for α diversity, whereas those for β diversity include qualitative and quantitative forms (Supplementary Table S1).
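
To make the three diversity levels concrete, the following Python sketch computes a few standard metrics on communities represented as plain dictionaries of member counts. The metrics shown (richness, Shannon index, Jaccard and Bray-Curtis dissimilarity) are common textbook choices and are not confirmed here as the contents of Supplementary Table S1. Pooling member counts by summation to obtain γ diversity is likewise an assumption, and all function names are hypothetical.

```python
import math
from typing import Dict, Iterable

Counts = Dict[str, float]


def richness(counts: Counts) -> int:
    """Alpha diversity: number of members with a non-zero count."""
    return sum(1 for c in counts.values() if c > 0)


def shannon(counts: Counts) -> float:
    """Alpha diversity: Shannon index (natural logarithm)."""
    total = sum(counts.values())
    if total <= 0:
        return 0.0
    h = 0.0
    for c in counts.values():
        if c > 0:
            p = c / total
            h -= p * math.log(p)
    return h


def pool(communities: Iterable[Counts]) -> Counts:
    """Metacommunity as summed counts (assumed pooling rule).

    Gamma diversity is then any alpha metric applied to the pooled counts.
    """
    pooled: Counts = {}
    for community in communities:
        for member, c in community.items():
            pooled[member] = pooled.get(member, 0) + c
    return pooled


def jaccard_dissimilarity(a: Counts, b: Counts) -> float:
    """Qualitative beta diversity: based on presence/absence only."""
    present_a = {m for m, c in a.items() if c > 0}
    present_b = {m for m, c in b.items() if c > 0}
    union = present_a | present_b
    if not union:
        return 0.0
    return 1.0 - len(present_a & present_b) / len(union)


def bray_curtis(a: Counts, b: Counts) -> float:
    """Quantitative beta diversity: based on abundance differences."""
    members = set(a) | set(b)
    numerator = sum(abs(a.get(m, 0) - b.get(m, 0)) for m in members)
    denominator = sum(a.values()) + sum(b.values())
    return numerator / denominator if denominator else 0.0
```

With two communities `{"x": 5, "y": 5}` and `{"y": 5, "z": 5}`, each has a richness of 2 (α), the pooled metacommunity has a richness of 3 (γ), and the two β metrics differ because only Bray-Curtis takes the counts into account.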

2.3 Data input and output

Community profiles (e.g. a site-by-species table) describe the distribution of members in biological samples. Operations to read and write these files are handled by the IO module and are important for exchanging data between programs using different formats. We have implemented parsers for five common file types (Supplementary Table S2), including the BIOM standard (McDonald ). Examples of these file types are given in the t/data folder of the Bio-Community package. The parsers automatically detect file format based on its content using the FormatGuesser module, and iteratively record member identifier, taxonomy and abundance.
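
The idea of content-based format detection followed by iterative parsing can be sketched in Python as below. This is not the FormatGuesser implementation. It only distinguishes two cases, under the assumptions that a BIOM 1.0 file is a JSON object whose `format` field begins with "Biological Observation Matrix" and that a generic site-by-species table is tab-delimited with sample names in the header row and member identifiers in the first column. The function names and return labels are hypothetical.

```python
import json
from typing import Iterator, Tuple


def guess_format(text: str) -> str:
    """Guess a community file format from its content, not its name."""
    stripped = text.lstrip()
    if stripped.startswith("{"):
        # Candidate JSON document: check for the BIOM 1.0 format field.
        try:
            doc = json.loads(stripped)
        except ValueError:
            return "unknown"
        fmt = str(doc.get("format", "")) if isinstance(doc, dict) else ""
        if fmt.startswith("Biological Observation Matrix"):
            return "biom"
        return "unknown"
    first_line = stripped.splitlines()[0] if stripped else ""
    if "\t" in first_line:
        return "table"
    return "unknown"


def iter_table(text: str) -> Iterator[Tuple[str, str, float]]:
    """Yield (member id, sample name, abundance) from a tab-delimited table."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return
    samples = lines[0].split("\t")[1:]
    for line in lines[1:]:
        fields = line.split("\t")
        member = fields[0]
        for sample, value in zip(samples, fields[1:]):
            yield member, sample, float(value)
```

A caller would first run `guess_format` on the file content and then dispatch to the matching parser, so that downstream code does not need to know which program produced the file.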

2.4 Tools

Tool modules can perform operations such as community transformation, rarefaction and taxonomic summaries (Fig. 1). Utility scripts using these modules are available in Bio-Community (Supplementary Table S3). They allow biologists to perform specific operations on community profiles, but they do not form an entire microbial analysis pipeline. These scripts can also be regarded as examples of integration of Bio-Community into bioinformatic scripts (Fig. 3). This integration can also leverage external modules to rapidly develop powerful custom scripts, e.g. Getopt::Euclid for handling command-line arguments, BioPerl modules for reading sequences or running external programs (e.g. BLAST) (Camacho ) and Statistics::R for using R libraries or visualization capabilities.
Fig. 3. Vignette illustrating the use of Bio-Community to read a BIOM community profile and report member information
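
As an illustration of one of the operations named in Section 2.4, rarefaction can be sketched in Python as random subsampling of a community to a fixed depth. The sketch assumes integer counts and sampling without replacement; the text does not state which sampling scheme the Bio-Community tool uses, and the function name and `seed` parameter are hypothetical.

```python
import random
from typing import Dict, Optional


def rarefy(counts: Dict[str, int], depth: int,
           seed: Optional[int] = None) -> Dict[str, int]:
    """Randomly subsample a community to `depth` individuals.

    Sampling is without replacement, so no member can end up with a
    larger count than it had originally.
    """
    total = sum(counts.values())
    if depth < 0 or depth > total:
        raise ValueError(f"depth must be between 0 and {total}")
    rng = random.Random(seed)
    # Expand the counts into one entry per individual, then draw from them.
    individuals = [member for member, c in counts.items() for _ in range(c)]
    rarefied: Dict[str, int] = {}
    for member in rng.sample(individuals, depth):
        rarefied[member] = rarefied.get(member, 0) + 1
    return rarefied
```

Rarefying all communities to the same depth before computing diversity metrics removes differences that are due only to unequal sequencing effort. Expanding counts into a list is simple but memory-hungry for deep samples, where drawing from a multivariate hypergeometric distribution would be a more economical design.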

3 CONCLUSIONS

Bio-Community supports several file formats to interface with popular programs and will help bioinformaticians quickly construct custom analysis pipelines or novel software for microbial ecology. The integration of relative and absolute abundance with diversity metrics permits holistic microbial studies (Dinsdale ; Dove ; Nathani ), while weights can be added to account for gene copy number (Kembel ) or genome length (Angly ; Beszteri ) bias. We encourage programmers to join the development of Bio-Community at https://github.com/bioperl/Bio-Community and to add support for new file formats, diversity metrics or tools.

Funding: Australian Research Council DE120101213 to FEA and DP1093175 to GWT.

Conflict of interest: none declared.

REFERENCES

1.  Metagenomics: application of genomics to uncultured microorganisms.

Authors:  Jo Handelsman
Journal:  Microbiol Mol Biol Rev       Date:  2004-12       Impact factor: 11.056

2.  Microbial community gene expression in ocean surface waters.

Authors:  Jorge Frias-Lopez; Yanmei Shi; Gene W Tyson; Maureen L Coleman; Stephan C Schuster; Sallie W Chisholm; Edward F Delong
Journal:  Proc Natl Acad Sci U S A       Date:  2008-03-03       Impact factor: 11.205

3.  A renaissance for the pioneering 16S rRNA gene.

Authors:  Susannah G Tringe; Philip Hugenholtz
Journal:  Curr Opin Microbiol       Date:  2008-10-08       Impact factor: 7.934

4.  Sequencing technologies - the next generation.

Authors:  Michael L Metzker
Journal:  Nat Rev Genet       Date:  2009-12-08       Impact factor: 53.242

5.  BLAST+: architecture and applications.

Authors:  Christiam Camacho; George Coulouris; Vahram Avagyan; Ning Ma; Jason Papadopoulos; Kevin Bealer; Thomas L Madden
Journal:  BMC Bioinformatics       Date:  2009-12-15       Impact factor: 3.169

6.  QIIME allows analysis of high-throughput community sequencing data.

Authors:  J Gregory Caporaso; Justin Kuczynski; Jesse Stombaugh; Kyle Bittinger; Frederic D Bushman; Elizabeth K Costello; Noah Fierer; Antonio Gonzalez Peña; Julia K Goodrich; Jeffrey I Gordon; Gavin A Huttley; Scott T Kelley; Dan Knights; Jeremy E Koenig; Ruth E Ley; Catherine A Lozupone; Daniel McDonald; Brian D Muegge; Meg Pirrung; Jens Reeder; Joel R Sevinsky; Peter J Turnbaugh; William A Walters; Jeremy Widmann; Tanya Yatsunenko; Jesse Zaneveld; Rob Knight
Journal:  Nat Methods       Date:  2010-04-11       Impact factor: 28.547

7.  Comparative evaluation of rumen metagenome community using qPCR and MG-RAST.

Authors:  Neelam M Nathani; Amrutlal K Patel; Prakash S Dhamannapatil; Ramesh K Kothari; Krishna M Singh; Chaitanya G Joshi
Journal:  AMB Express       Date:  2013-09-11       Impact factor: 3.298

8.  Incorporating 16S gene copy number information improves estimates of microbial diversity and abundance.

Authors:  Steven W Kembel; Martin Wu; Jonathan A Eisen; Jessica L Green
Journal:  PLoS Comput Biol       Date:  2012-10-25       Impact factor: 4.475

9.  Microbial ecology of four coral atolls in the Northern Line Islands.

Authors:  Elizabeth A Dinsdale; Olga Pantos; Steven Smriga; Robert A Edwards; Florent Angly; Linda Wegley; Mark Hatay; Dana Hall; Elysa Brown; Matthew Haynes; Lutz Krause; Enric Sala; Stuart A Sandin; Rebecca Vega Thurber; Bette L Willis; Farooq Azam; Nancy Knowlton; Forest Rohwer
Journal:  PLoS One       Date:  2008-02-27       Impact factor: 3.240

10.  PyCogent: a toolkit for making sense from sequence.

Authors:  Rob Knight; Peter Maxwell; Amanda Birmingham; Jason Carnes; J Gregory Caporaso; Brett C Easton; Michael Eaton; Micah Hamady; Helen Lindsay; Zongzhi Liu; Catherine Lozupone; Daniel McDonald; Michael Robeson; Raymond Sammut; Sandra Smit; Matthew J Wakefield; Jeremy Widmann; Shandy Wikman; Stephanie Wilson; Hua Ying; Gavin A Huttley
Journal:  Genome Biol       Date:  2007       Impact factor: 13.583


