TopiaryExplorer: visualizing large phylogenetic trees with environmental metadata.

Meg Pirrung, Ryan Kennedy, J Gregory Caporaso, Jesse Stombaugh, Doug Wendel, Rob Knight.

Abstract

MOTIVATION: Microbial community profiling is a highly active area of research, but tools that facilitate visualization of phylogenetic trees and associated environmental data have not kept up with the increasing quantity of data generated in these studies.
RESULTS: TopiaryExplorer supports the visualization of very large phylogenetic trees, including features such as the automated coloring of branches by environmental data, manipulation of trees and incorporation of per-tip metadata (e.g. taxonomic labels).
AVAILABILITY: http://topiaryexplorer.sourceforge.net.
CONTACT: rob.knight@colorado.edu.

Year: 2011; PMID: 21903626; PMCID: PMC3198578; DOI: 10.1093/bioinformatics/btr517

Source DB: PubMed; Journal: Bioinformatics; ISSN: 1367-4803; Impact factor: 6.937


1 INTRODUCTION

Microbial community profiling using marker genes such as the 16S rRNA gene has greatly expanded our knowledge of the diversity of the microbial world (Tringe and Hugenholtz, 2008). Phylogenetic trees are key to our understanding of the microbial world (Pace, 1997), and provide an important view into microbial data (Ludwig et al., 2004). Software such as QIIME (Caporaso et al., 2010) has kept pace with the increasing rate of sequence acquisition to allow statistical analysis of these data, but tools to visualize and manipulate phylogenetic trees and corresponding metadata (Huson et al., 2007; Letunic and Bork, 2011; www.phylosoft.org/archaeopteryx; http://tree.bio.ed.ac.uk/software/figtree), which worked well on the datasets that were typical a few years ago, are less suitable for datasets containing millions of sequences. A key question in microbial ecology is which portions of a phylogenetic reference tree are differentially represented in specific groups of samples. To address this question, users should be able to load trees with thousands of tips, assign taxonomic labels to the tips, and color the branches based on data about each sample.
Here we present TopiaryExplorer, a software package that facilitates visual exploration of large phylogenetic trees, including information about each sample and each tip. This integration of what is often called ‘sequence metadata’ is crucial to understanding how sequences (and their source organisms) are distributed across environments, and the processes underlying the observed patterns. TopiaryExplorer additionally allows display and revision of the taxonomy (including multiple taxonomies for the same tree, facilitating taxonomic comparisons), and integration with databases that contain sample/tree information (an example database is provided to assist users in creating their own: see below).
It also provides key user interface improvements, including: the ability to dynamically collapse or expand the whole tree using several different tree layout algorithms, allowing rapid visual exploration of which lineages are shared among or unique to specific subsets of environments; the ability to spawn new windows for investigation of specific subtrees and to view multiple trees at the same time; control over labels and layout features critical for producing publication-quality graphics; the ability to export results in any of several graphical and machine-readable formats for further analysis; and the ability to handle datasets of hundreds of thousands of tips, which can easily be created from larger datasets by OTU picking with UCLUST (Edgar, 2010) or related tools.
TopiaryExplorer metadata is provided as tab-separated text, and trees are provided as Newick-formatted strings. These standard file formats allow data generated with different tools to be easily imported into TopiaryExplorer.
TopiaryExplorer is written in Java using Processing, which allows for rapid tree visualization and PDF export using OpenGL. The tree layout rendering algorithms in TopiaryExplorer were adapted from PyCogent (Knight et al., 2007). Several strategies were applied to efficiently visualize large trees and associated metadata, including caching node lookups rather than running multiple lookups of the same node, and using sparse table representations for storing and accessing metadata.
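The two efficiency strategies mentioned above can be illustrated with a minimal sketch. It is written in Python for brevity and is not taken from the TopiaryExplorer source, which is in Java; the class names `Node` and `Tree`, the function `read_sparse_metadata`, and the example rows are all hypothetical. The sketch assumes that "caching node lookups" means building a name-to-node index once and reusing it, and that a "sparse table" stores only the non-empty cells of the tab-separated metadata.

```python
import csv
import io


class Node:
    """A tree node; `name` is the tip or internal-node label (hypothetical class)."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


class Tree:
    """Wraps a root node and caches name -> node lookups (hypothetical class)."""

    def __init__(self, root):
        self.root = root
        self._index = None  # filled on the first call to find()

    def find(self, name):
        """Return the node called `name`, or None if no such node exists."""
        if self._index is None:
            # A single traversal fills the cache; every later lookup
            # is a dictionary access instead of a new tree search.
            self._index = {}
            stack = [self.root]
            while stack:
                node = stack.pop()
                self._index.setdefault(node.name, node)
                stack.extend(node.children)
        return self._index.get(name)


def read_sparse_metadata(handle):
    """Parse tab-separated metadata, keeping only non-empty cells.

    Returns {row_id: {column: value}}; the first column holds the row id.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    table = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        table[row[0]] = {
            col: val for col, val in zip(header[1:], row[1:]) if val != ""
        }
    return table


# Hypothetical example data, for illustration only.
tree = Tree(Node("root", [Node("A", [Node("otu1"), Node("otu2")]), Node("otu3")]))
metadata = read_sparse_metadata(io.StringIO(
    "#OTU\tPhylum\tNote\n"
    "otu1\tProteobacteria\t\n"
    "otu2\t\t\n"
    "otu3\tFirmicutes\tkeyboard\n"
))
```

With this layout, a tip that has no annotations costs only an empty dictionary, and repeated calls such as `tree.find("otu2")` do not traverse the tree again. Whether TopiaryExplorer uses the same data structures internally is not stated in the text.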

2 RESULTS AND CONCLUSIONS

To illustrate the utility of TopiaryExplorer, we applied it to a microbial survey of the hands and keyboards of three individuals (Fierer et al., 2010). This study illustrated how individuals can be matched to objects they have touched through the use of ‘microbial fingerprints’. It has previously been difficult to identify the specific taxonomic differences between the hand or keyboard microbial communities from different individuals. We applied TopiaryExplorer to address this question and found that between-community taxonomic differences are present and easily discernible (Fig. 1A–D). The trees visually represent the result of the original study, namely that M3's keyboard is more similar to M3's fingertips than to the fingertips of either M2 or M9, and additionally allow us to immediately determine which taxonomic groups are differentially represented in the different sample types. For example, M3's fingertips and keyboard contain Proteobacteria in higher abundance than M2 or M9. Figure 1E shows a screenshot of the TopiaryExplorer interface.
Fig. 1.

Coverage of samples over the tree of 16S rRNA OTUs observed in this study. Wedges summarize groups of tips and are colored on a gray to blue scale based on the abundance of tips in that wedge that are represented in the given sample type [(A) M3 (subject 3) keyboard, (B) M2 (subject 2) fingertips, (C) M3 fingertips, (D) M9 (subject 9) fingertips]. (E) TopiaryExplorer interface: branches are colored by individual (M2, M3, M9), tip labels are colored by source (specific keyboard key or fingertips), and the pie chart summarizes the representation of individuals in the clade. Pink boxes highlight tips matching the search term.
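The wedge coloring described in the caption can be sketched as follows. This is an illustrative Python example, not the TopiaryExplorer implementation: it assumes that "abundance" is the fraction of a wedge's tips that occur in the chosen sample type, and that the color is a linear blend between gray and blue. The RGB endpoints and the function names `wedge_fraction` and `gray_to_blue` are assumptions made for this sketch.

```python
GRAY = (128, 128, 128)  # assumed color for a wedge with no tips in the sample
BLUE = (0, 0, 255)      # assumed color for a wedge fully covered by the sample


def wedge_fraction(wedge_tips, tips_in_sample):
    """Fraction of the wedge's tips that are observed in the sample type."""
    wedge_tips = set(wedge_tips)
    if not wedge_tips:
        return 0.0
    return len(wedge_tips & set(tips_in_sample)) / len(wedge_tips)


def gray_to_blue(fraction):
    """Linearly interpolate an RGB color between GRAY (0.0) and BLUE (1.0)."""
    t = min(max(fraction, 0.0), 1.0)  # clamp to the valid range
    return tuple(round(g + t * (b - g)) for g, b in zip(GRAY, BLUE))


# Hypothetical wedge of four OTUs, two of which occur in the sample type.
fraction = wedge_fraction(
    ["otu1", "otu2", "otu3", "otu4"], {"otu1", "otu4", "otu9"}
)
color = gray_to_blue(fraction)
```

Under these assumptions, a wedge whose tips are all absent from the sample stays gray, and the color moves toward blue as more of its tips are represented.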

These results show how visual inspection of phylogenetic trees with environmental data can facilitate interpretation of the relative differences between microbial communities. TopiaryExplorer fills a gap in the tools available for the comparison of microbial communities. The tree of life is being rapidly filled in by large-scale projects such as the Genomic Encyclopedia of Bacteria and Archaea (GEBA), the Human Microbiome Project and the Earth Microbiome Project, and annotated with emerging standards such as Minimum Information about any Sequence (MIxS). The ability to determine which lineages are novel in a new dataset, and which lineages distinguish among samples associated with clinical or environmental parameters, will be crucial for understanding the ecology and evolution of the microbes that comprise the vast majority of life on earth.

REFERENCES

1.  ARB: a software environment for sequence data.

Authors:  Wolfgang Ludwig; Oliver Strunk; Ralf Westram; Lothar Richter; Harald Meier; Arno Buchner; Tina Lai; Susanne Steppi; Gangolf Jobb; Wolfram Förster; Igor Brettske; Stefan Gerber; Anton W Ginhart; Oliver Gross; Silke Grumann; Stefan Hermann; Ralf Jost; Andreas König; Thomas Liss; Ralph Lüssmann; Michael May; Björn Nonhoff; Boris Reichel; Robert Strehlow; Alexandros Stamatakis; Norbert Stuckmann; Alexander Vilbig; Michael Lenke; Thomas Ludwig; Arndt Bode; Karl-Heinz Schleifer
Journal:  Nucleic Acids Res       Date:  2004-02-25       Impact factor: 16.971

2.  Search and clustering orders of magnitude faster than BLAST.

Authors:  Robert C Edgar
Journal:  Bioinformatics       Date:  2010-08-12       Impact factor: 6.937

3.  Forensic identification using skin bacterial communities.

Authors:  Noah Fierer; Christian L Lauber; Nick Zhou; Daniel McDonald; Elizabeth K Costello; Rob Knight
Journal:  Proc Natl Acad Sci U S A       Date:  2010-03-15       Impact factor: 11.205

4.  A renaissance for the pioneering 16S rRNA gene.

Authors:  Susannah G Tringe; Philip Hugenholtz
Journal:  Curr Opin Microbiol       Date:  2008-10-08       Impact factor: 7.934

5.  A molecular view of microbial diversity and the biosphere.

Authors:  N R Pace
Journal:  Science       Date:  1997-05-02       Impact factor: 47.728

6.  QIIME allows analysis of high-throughput community sequencing data.

Authors:  J Gregory Caporaso; Justin Kuczynski; Jesse Stombaugh; Kyle Bittinger; Frederic D Bushman; Elizabeth K Costello; Noah Fierer; Antonio Gonzalez Peña; Julia K Goodrich; Jeffrey I Gordon; Gavin A Huttley; Scott T Kelley; Dan Knights; Jeremy E Koenig; Ruth E Ley; Catherine A Lozupone; Daniel McDonald; Brian D Muegge; Meg Pirrung; Jens Reeder; Joel R Sevinsky; Peter J Turnbaugh; William A Walters; Jeremy Widmann; Tanya Yatsunenko; Jesse Zaneveld; Rob Knight
Journal:  Nat Methods       Date:  2010-04-11       Impact factor: 28.547

7.  Interactive Tree Of Life v2: online annotation and display of phylogenetic trees made easy.

Authors:  Ivica Letunic; Peer Bork
Journal:  Nucleic Acids Res       Date:  2011-04-05       Impact factor: 16.971

8.  Dendroscope: An interactive viewer for large phylogenetic trees.

Authors:  Daniel H Huson; Daniel C Richter; Christian Rausch; Tobias Dezulian; Markus Franz; Regula Rupp
Journal:  BMC Bioinformatics       Date:  2007-11-22       Impact factor: 3.169

9.  PyCogent: a toolkit for making sense from sequence.

Authors:  Rob Knight; Peter Maxwell; Amanda Birmingham; Jason Carnes; J Gregory Caporaso; Brett C Easton; Michael Eaton; Micah Hamady; Helen Lindsay; Zongzhi Liu; Catherine Lozupone; Daniel McDonald; Michael Robeson; Raymond Sammut; Sandra Smit; Matthew J Wakefield; Jeremy Widmann; Shandy Wikman; Stephanie Wilson; Hua Ying; Gavin A Huttley
Journal:  Genome Biol       Date:  2007       Impact factor: 13.583
