
Dendroscope: An interactive viewer for large phylogenetic trees.

Daniel H Huson, Daniel C Richter, Christian Rausch, Tobias Dezulian, Markus Franz, Regula Rupp.

Abstract

BACKGROUND: Research in evolution requires software for visualizing and editing phylogenetic trees for increasingly large datasets, such as those arising in expression analysis or metagenomics. It would be desirable to have a program that provides these services in an efficient and user-friendly way, and that can be easily installed and run on all major operating systems. Although a large number of tree visualization tools are freely available, some as part of more comprehensive analysis packages, all have drawbacks in one or more domains. They either lack some of the standard tree visualization techniques or basic graphics and editing features, or they are restricted to small trees containing only tens of thousands of taxa. Moreover, many programs are difficult to install or are not available for all common operating systems.
RESULTS: We have developed a new program, Dendroscope, for the interactive visualization and navigation of phylogenetic trees. The program provides all standard tree visualizations and is optimized to run interactively on trees containing hundreds of thousands of taxa. The program provides tree editing and graphics export capabilities. To support the inspection of large trees, Dendroscope offers a magnification tool. The software is written in Java 1.4 and installers are provided for Linux/Unix, MacOS X and Windows XP.
CONCLUSION: Dendroscope is a user-friendly program for visualizing and navigating phylogenetic trees, for both small and large datasets.

Year:  2007        PMID: 18034891      PMCID: PMC2216043          DOI: 10.1186/1471-2105-8-460

Source DB:  PubMed          Journal:  BMC Bioinformatics        ISSN: 1471-2105            Impact factor:   3.169


Background

Phylogenetic trees are used to represent evolutionary relationships between biological taxa, while taxonomical hierarchies such as the NCBI taxonomy are used to structure the wealth of molecular sequence data. The size of trees under consideration is growing larger and larger. The Tree of Life project [1], which aims at reconstructing the evolutionary relationship of all living species on earth, now considers more than 11,000 species. The Ribosomal Database Project II provides a hierarchical browser for a collection of approximately 340,000 ribosomal RNA sequences. Recent metagenomic analysis software [2] makes use of the full NCBI taxonomy, which now contains more than 390,000 taxa, to estimate the taxonomical content of a dataset. Most currently available tree viewers are designed to handle trees containing up to a few thousand nodes. A notable exception is TreeJuxtaposer [3], which was explicitly designed to visualize large trees. While TreeJuxtaposer is the tool of choice for very large datasets (containing hundreds of thousands of taxa), it has limited value as an all-round tree visualization tool, as it only implements one particular tree view (namely the rectangular phylogram, perhaps because this is the only view that is useful for large trees), it lacks basic graphics export capabilities and it does not allow one to save and reopen a modified tree.

Results and Discussion

Dendroscope is designed as an all-round tree visualization tool that can handle trees with hundreds of thousands of taxa (see Figure 1). Trees can be read and written in Newick or Nexus format [4], as produced by standard tree reconstruction programs. Additionally, Dendroscope uses its own file format to save and reopen (lists of) trees that have been edited graphically using different colors, line widths and fonts.
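To make the Newick format mentioned above concrete, the following is a minimal sketch of a reader for simple Newick strings. It is not Dendroscope's parser (the program is written in Java); the names `Node`, `parse_newick` and `leaf_labels` are hypothetical, and the sketch assumes unquoted labels, no comments and a single tree per string. Because it is recursive, it is only suitable for small inputs.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One tree node: optional label, optional branch length, child list."""
    label: str = ""
    length: Optional[float] = None
    children: List["Node"] = field(default_factory=list)


def parse_newick(text: str) -> Node:
    """Parse a simple Newick string such as '((A:0.1,B:0.2)AB:0.3,C:0.4);'.

    Assumptions of this sketch: no quoted labels, no bracketed comments.
    """
    text = "".join(text.split())  # drop all whitespace
    pos = 0

    def peek() -> str:
        return text[pos] if pos < len(text) else ""

    def read_until(stops: str) -> str:
        nonlocal pos
        start = pos
        while pos < len(text) and text[pos] not in stops:
            pos += 1
        return text[start:pos]

    def parse_subtree() -> Node:
        nonlocal pos
        node = Node()
        if peek() == "(":
            pos += 1  # consume "("
            while True:
                node.children.append(parse_subtree())
                if peek() == ",":
                    pos += 1  # another sibling follows
                elif peek() == ")":
                    pos += 1  # end of this child list
                    break
                else:
                    raise ValueError(f"unexpected input at position {pos}")
        node.label = read_until(",():;")  # label may be empty
        if peek() == ":":
            pos += 1
            node.length = float(read_until(",();"))
        return node

    root = parse_subtree()
    if peek() != ";":
        raise ValueError("a Newick tree must end with ';'")
    return root


def leaf_labels(node: Node) -> List[str]:
    """Collect the taxon labels at the leaves, left to right."""
    if not node.children:
        return [node.label]
    labels: List[str] = []
    for child in node.children:
        labels.extend(leaf_labels(child))
    return labels
```

For example, `leaf_labels(parse_newick("((A:0.1,B:0.2)AB:0.3,C:0.4);"))` yields the three taxa A, B and C, and the inner node keeps its label AB and branch length 0.3.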
Figure 1

The placement of Homo sapiens and the Hominidae in the NCBI taxonomy, as displayed in Dendroscope using the program's magnifier feature.

A tree can be displayed in a number of views, namely as a circular, radial or rectangular phylogram, as (an internal or external) circular, rectangular or slanted cladogram, or as an unrooted diagram (see Figure 2). The nodes, edges and labels of a tree can be interactively formatted and edited (see Figure 3). Trees can be rerooted and subtrees can be rotated, collapsed, extracted and removed. In the rectangular and slanted views, a horizontal magnifier band can be used to enlarge a part of the tree. In the circular and radial views, a circular magnifier is available, which can also be switched to "magnify all" mode, if desired (in which the complete tree is visible under the magnifier). A search tool can be used to find and locate taxa in the tree. All views are exportable as EPS, SVG, PNG, JPEG, GIF and BMP graphic files. Installers are available for Linux/Unix, MacOS X and Windows XP.
Figure 2

Four different views of the same dataset. Four different views for the same dataset of 28 sequences of genera of the daisy family: (a) circular cladogram, (b) radial phylogram, (c) rectangular phylogram, and (d) slanted cladogram.

Figure 3

Formatting nodes and edges of a tree. Dendroscope provides a dialog box for formatting the nodes and edges of a tree; the example shows a tree drawn as an internal circular cladogram.

Comparison with other tree viewers

In a survey of existing tree visualization and manipulation programs we screened over 40 programs (for extensive lists of such programs, see, e.g., [5,6]). In Table 1, we compare Dendroscope to a selection of tree viewing programs which are either widely used or have exceptional features: ATV [7], HyperTree [8], MEGA [9], PHYLIP's [10] drawtree/drawgram, SplitsTree4 [11], TreeView [12], TreeJuxtaposer [3] and TreeDyn [13]. Of the existing programs, only TreeJuxtaposer and PHYLIP's drawtree and drawgram can handle very large trees. PHYLIP's drawtree and drawgram are non-interactive and so are of limited use. TreeJuxtaposer is currently the viewer of choice for large trees. SplitsTree4 and TreeJuxtaposer provide different mechanisms for comparing two or more trees. TreeDyn provides useful features such as scriptability, interoperability with tree databases and especially the possibility to display and manipulate many trees in parallel. Its drawbacks are its limitation to trees of only moderate size and its complex user interface.
Table 1

Comparison of popular tree viewers. Description of column headers: A: displayable taxa (see Methods section for details), B: search function, C: tree comparison, D: coloring of subtrees, E: editing of labels, F: collapsing of subtrees, G: rerooting, H: rectangular view, I: slanted view, J: radial view, K: circular view, L: graphic export formats

Program         A            B-K (footnote markers only)   L
ATV             2 k                                         pdf
Dendroscope     350 k                                       eps, svg, png, jpg, gif, bmp
HyperTree       20 k         (1)                            -
MEGA            20 k                                        emf
PHYLIP          1336 k (2)                                  ps, bmp, pict, pov, fig
SplitsTree4     1 k          (3)                            eps, svg, png, jpg, gif, bmp
TreeDyn         5 k                                         ps, svg, png, jpg, gif, etc.
TreeJuxtaposer  1002 k                                      -
TreeView        2 k (4)      (5), (6)                       wmf, emf

(1) only single edges

(2) only if "Iterate to improve tree" is set to "no", though trees become illegible as there is no possibility of hiding or magnifying subtrees

(3) using consensus networks

(4) TreeViewX (equivalent to version 0.95 of TreeView): 50 k

(5) only labels

(6) only internal nodes

The system requirements of existing viewers vary: some work only with particular versions of Unix/Linux or MacOS, or they need additional software to be installed. However, all viewers listed in Table 1 run on Linux/Unix, MacOS and Windows, except MEGA, which runs only on Windows.

Dendroscope at work

Our objective was to build a tree viewer that is able to handle a tree as large as the current version of the NCBI taxonomy. On a standard laptop, Dendroscope performs well on this tree in all rectangular and slanted views. The circular and radial views are less suitable for very large datasets. Figure 1 shows a screenshot of the NCBI taxonomic tree loaded in Dendroscope showing Homo sapiens and the Hominidae. Figure 2 demonstrates some of the views provided by the program.

Conclusion

With Dendroscope, we have developed a new all-round tree viewer that combines all major features found in popular viewers into a single program that can handle large datasets.

Availability and Requirements

Dendroscope is freely available and can be downloaded from . The software is written in Java 1.4 and installers are provided for Linux/Unix, MacOS X and Windows.

Methods

Processing of trees

Since we want to represent very large trees, we need to be able to focus on the crucial parts of the representation to speed up calculations. To this end, we use bounding boxes: to each subtree, we assign a box containing the subtree. The tree is drawn from the root down, and each subtree is drawn only if its bounding box lies in the visible region or at least intersects it. In addition, we compare the height of the bounding box to the number of edges it contains; if we find too many edges in too small a box, we draw the box as a single opaque element instead of drawing each edge separately. When we want to identify the element (edge or node) at a selected position, we also make use of the bounding boxes: the tree is searched from the root down, leaving out all subtrees whose bounding boxes do not contain the selected position. This reduces the search time from O(n) to O(log(n)). We supply two different magnifiers to let the user easily access inner nodes and taxa: a horizontal magnifier band for rectangular and slanted views, and a circular one for radial tree views. In both cases, a point with distance d to the center of the magnifier is mapped to a point with distance from the center, where D denotes the diameter or height of the magnifier, as appropriate.
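The bounding-box scheme described above can be illustrated with a short sketch. This is not Dendroscope's implementation (which is in Java); the names `Box`, `TreeNode`, `compute_boxes`, `draw` and `find_node_at`, the density threshold `max_edges_per_unit` and the pick tolerance `tol` are all assumptions made for illustration. Each subtree's box is taken to include the edge to its parent, so an edge is never culled while part of it could be visible.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Box:
    x0: float
    y0: float
    x1: float
    y1: float

    def intersects(self, other: "Box") -> bool:
        return (self.x0 <= other.x1 and other.x0 <= self.x1
                and self.y0 <= other.y1 and other.y0 <= self.y1)

    def contains(self, x: float, y: float, margin: float = 0.0) -> bool:
        return (self.x0 - margin <= x <= self.x1 + margin
                and self.y0 - margin <= y <= self.y1 + margin)


@dataclass
class TreeNode:
    name: str
    x: float
    y: float
    children: List["TreeNode"] = field(default_factory=list)
    box: Optional[Box] = None
    n_edges: int = 0  # number of edges strictly below this node


def compute_boxes(node: TreeNode, parent: Optional[TreeNode] = None) -> None:
    """Bottom-up pass: box = node, its incoming edge, and all child boxes."""
    xs, ys = [node.x], [node.y]
    if parent is not None:
        xs.append(parent.x)
        ys.append(parent.y)
    node.n_edges = 0
    for child in node.children:
        compute_boxes(child, node)
        xs += [child.box.x0, child.box.x1]
        ys += [child.box.y0, child.box.y1]
        node.n_edges += 1 + child.n_edges
    node.box = Box(min(xs), min(ys), max(xs), max(ys))


def draw(node: TreeNode, viewport: Box, out: List[Tuple],
         max_edges_per_unit: float = 4.0,
         parent: Optional[TreeNode] = None) -> None:
    """Top-down pass that records draw commands in `out`."""
    if not node.box.intersects(viewport):
        return  # whole subtree is outside the visible region
    if parent is not None:
        out.append(("edge", parent.name, node.name))
    height = node.box.y1 - node.box.y0
    if node.n_edges > max_edges_per_unit * height:
        out.append(("block", node.name))  # too dense: one opaque element
        return
    for child in node.children:
        draw(child, viewport, out, max_edges_per_unit, node)


def find_node_at(node: TreeNode, x: float, y: float,
                 tol: float = 0.1) -> Optional[TreeNode]:
    """Point location: skip subtrees whose box does not contain (x, y)."""
    if not node.box.contains(x, y, tol):
        return None
    if abs(node.x - x) <= tol and abs(node.y - y) <= tol:
        return node
    for child in node.children:
        hit = find_node_at(child, x, y, tol)
        if hit is not None:
            return hit
    return None
```

In this sketch, `compute_boxes` is run once after layout; `draw` and `find_node_at` then visit only subtrees whose boxes overlap the viewport or contain the selected position. The density test uses edges per unit of box height, which is one possible reading of the comparison described in the text.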

Test data and system

To estimate the number of displayable taxa for each viewer (see Table 1), we applied the viewer to a list of trees containing increasingly large numbers of taxa: 1 k, 2 k, 5 k, 10 k, 20 k, 50 k, 100 k, 200 k, 334 k, 668 k, 1002 k, 1336 k and 2004 k. In Table 1, we report the maximal size of dataset that could be opened by the viewer, and then loaded and browsed in a reasonable amount of time (less than 90 seconds to open and an interaction response time of less than 15 seconds) on a standard workstation.
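The acceptance rule above can be written down as a small sketch. The function name `max_displayable_taxa` and the shape of the `timings` input are hypothetical, and stopping at the first size that fails is an assumption about how the series was applied; the size list and the two time limits are taken from the text.

```python
from typing import Dict, Optional, Tuple

# Tree sizes from the text, in thousands of taxa.
SIZES_K = [1, 2, 5, 10, 20, 50, 100, 200, 334, 668, 1002, 1336, 2004]
MAX_OPEN_SECONDS = 90.0      # a tree must open in less than this
MAX_RESPONSE_SECONDS = 15.0  # interaction must respond in less than this


def max_displayable_taxa(
    timings: Dict[int, Optional[Tuple[float, float]]]
) -> Optional[int]:
    """Return the largest size (in k taxa) that passes both limits.

    `timings` maps a size to (open_seconds, response_seconds), or to None
    if the viewer could not open the tree. Sizes are tried in increasing
    order and the scan stops at the first failure (an assumption).
    """
    best = None
    for size in SIZES_K:
        result = timings.get(size)
        if result is None:
            break
        open_s, response_s = result
        if open_s >= MAX_OPEN_SECONDS or response_s >= MAX_RESPONSE_SECONDS:
            break
        best = size
    return best
```

With made-up timings such as `{1: (1.0, 0.1), 2: (2.0, 0.2), 5: (10.0, 1.0), 10: (95.0, 2.0)}`, the 10 k tree exceeds the 90-second limit, so the function reports 5.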

Authors' contributions

All authors participated in the specification and testing of the program. The overall software design is credited to DHH. The program was mainly written by DHH with contributions from TD, MF, CR, DCR and RR. RR worked on the mathematical aspects of the magnification algorithm and contributed to the manuscript. CR and DCR evaluated existing tree viewers, generated test datasets and wrote the main draft of the paper. All authors read and agreed with the final manuscript.
References

[2] Daniel H Huson; Alexander F Auch; Ji Qi; Stephan C Schuster. MEGAN analysis of metagenomic data. Genome Res, 2007-01-25.

[4] D R Maddison; D L Swofford; W P Maddison. NEXUS: an extensible file format for systematic information. Syst Biol, 1997-12.

[7] C M Zmasek; S R Eddy. ATV: display and manipulation of annotated phylogenetic trees. Bioinformatics, 2001-04.

[8] J Bingham; S Sudarsanam. Visualizing large hierarchical clusters in hyperbolic space. Bioinformatics, 2000-07.

[9] Sudhir Kumar; Koichiro Tamura; Masatoshi Nei. MEGA3: Integrated software for Molecular Evolutionary Genetics Analysis and sequence alignment. Brief Bioinform, 2004-06.

[11] Daniel H Huson; David Bryant. Application of phylogenetic networks in evolutionary studies. Mol Biol Evol, 2005-10-12.

[12] R D Page. TreeView: an application to display phylogenetic trees on personal computers. Comput Appl Biosci, 1996-08.

[13] François Chevenet; Christine Brun; Anne-Laure Bañuls; Bernard Jacq; Richard Christen. TreeDyn: towards dynamic graphics and annotations for analyses of trees. BMC Bioinformatics, 2006-10-10.