Lucas Czech, Jaime Huerta-Cepas, Alexandros Stamatakis.
Abstract
Phylogenetic trees are routinely visualized to present and interpret the evolutionary relationships of species. Most empirical evolutionary data studies contain a visualization of the inferred tree with branch support values. Ambiguous semantics in tree file formats can lead to erroneous tree visualizations and therefore to incorrect interpretations of phylogenetic analyses. Here, we discuss problems that arise when displaying branch values on trees after rerooting. Branch values are typically stored as node labels in the widely used Newick tree format. However, such values are attributes of branches. Storing them as node labels can therefore yield errors when rerooting trees. This depends on the mostly implicit semantics that tools deploy to interpret node labels. We reviewed ten tree viewers and ten bioinformatics toolkits that can display and reroot trees. We found that 14 out of 20 of these tools do not permit users to select the semantics of node labels. Thus, unaware users might obtain incorrect results when rooting trees. We illustrate such incorrect mappings for several test cases and real examples taken from the literature. This review has already led to improvements in eight tools. We suggest tools should provide options that explicitly force users to define the semantics of node labels.
Keywords: Newick format; bioinformatics toolkits; branch labels; branch support values; bugs; phylogenetic trees; software; tree viewers; tree visualization
Year: 2017 PMID: 28369572 PMCID: PMC5435079 DOI: 10.1093/molbev/msx055
Source DB: PubMed Journal: Mol Biol Evol ISSN: 0737-4038 Impact factor: 16.240
Our exemplary tree, before and after rooting on the branch leading to the tip node X. The rooted trees contain an additional root node R’. (a) Original rooting (via top-level trifurcation) and visual representation of our Newick test tree T. Inner nodes and branches are colored according to the correct node label to branch mapping of T. (b) Tree rooted on node R’. Node labels are mapped incorrectly to branches, resulting in a tree with an erroneous node label to branch value mapping. (c) Tree rooted on node R’. Node labels are correctly mapped to the branches of the tree.
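The correct mapping in panel (c) can be illustrated with a minimal Python sketch. It is not taken from any of the reviewed tools. It assumes that inner node labels are branch values, that the input has no branch lengths or quoted names, and that the top-level node is multifurcating. The function names (`parse_newick`, `to_graph`, `reroot_on_tip`) and the example tree string are hypothetical and chosen only for illustration. The idea is to move each inner node label onto the branch above that node before rerooting, and to write it back at the child end of that branch afterwards.

```python
import re

PUNCT = {"(", ")", ",", ";"}

def parse_newick(text):
    """Parse a simple Newick string (no branch lengths, no quoting)."""
    tokens = re.findall(r"[(),;]|[^(),;]+", text.strip())
    pos = 0

    def node():
        nonlocal pos
        children = []
        if tokens[pos] == "(":
            pos += 1                      # skip "("
            children.append(node())
            while tokens[pos] == ",":
                pos += 1                  # skip ","
                children.append(node())
            pos += 1                      # skip ")"
        label = ""
        if tokens[pos] not in PUNCT:      # optional label after the node
            label = tokens[pos]
            pos += 1
        return {"label": label, "children": children}

    return node()

def to_graph(root):
    """Turn the rooted tree into an undirected graph.

    Inner node labels are stored as values of the branch to the parent.
    The label of the original root has no branch and is dropped here.
    """
    names, adj, edge_value = {}, {}, {}

    def visit(node, parent):
        nid = len(names)
        names[nid] = "" if node["children"] else node["label"]
        adj[nid] = []
        if parent is not None:
            adj[nid].append(parent)
            adj[parent].append(nid)
            if node["children"]:
                edge_value[frozenset((nid, parent))] = node["label"]
        for child in node["children"]:
            visit(child, nid)

    visit(root, None)
    return names, adj, edge_value

def reroot_on_tip(names, adj, edge_value, tip_name):
    """Write the tree rooted on the branch leading to the given tip."""
    tip = next(n for n, name in names.items() if name == tip_name)

    def write(node, parent):
        kids = [k for k in adj[node] if k != parent]
        if not kids:
            return names[node]
        # The branch value goes to the node at the child end of the branch.
        label = edge_value.get(frozenset((node, parent)), "")
        return "(" + ",".join(write(k, node) for k in kids) + ")" + label

    return "(" + names[tip] + "," + write(adj[tip][0], tip) + ");"

tree = parse_newick("((C,D)1,(A,(B,X)3)2,E)R;")
rerooted = reroot_on_tip(*to_graph(tree), "X")
print(rerooted)  # (X,((((C,D)1,E)2,A)3,B));
```

In the output, value 3 still separates {B, X} from the remaining tips and value 2 still separates {A, B, X} from {C, D, E}, as in the input. If the labels were instead kept on their original nodes, value 3 would stay on the node joining B and X and would be read as the value of the root branch leading to X.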
Evaluated Tree Viewers (first half) and Bioinformatics Toolkits (second half) with Accumulated Number of Citations (https://scholar.google.com, accessed on 2016-11-11).
| Tool | Version | Citations |
|---|---|---|
| Archaeopteryx | 0.9911 | 268 |
| ATV | 4.00 alpha 13 | 288 |
| Dendroscope | 3.4.0 and 3.5.3 | 1,348 |
| ETE (GUI) | 2.3.10 | 238 |
| EvolView | Accessed 2016-08-15 | 105 |
| FigTree | 1.4.2 | >2,362 |
| iTOL | Accessed 2016-08-15 | 1,879 |
| PhyloWidget | Accessed 2016-08-15 | 113 |
| TreeView | 1.6.6 (Windows) | 10,570 |
| T-REX | Accessed 2016-08-15 | 285 |
| APE | 3.4 | 3,915 |
| BioPerl | 1.006925 | 1,410 |
| BioPython | 1.63b | 797 |
| Dendropy | 4.1.0 | 525 |
| ETE (API) | 3.0.0b35 | 238 |
| Geneious | 10.0.5 | 1,689 |
| MEGA | 7.0.14 build 7160126 | 69,134 |
| Mesquite | 3.10 (build 765) | 5,616 |
| Newick Utilities | 1.6 | 31 |
| Pycogent/scikit-bio | 1.5.3 | 148 |
| Total | | 100,721 |
FigTree does not have an official publication, so we estimated the number of citations by accumulating the counts for the most recent versions.
Evaluation of tree viewers and bioinformatics toolkits. The columns “Nodes” and “Branches” indicate which of the two interpretations of Newick node labels the tool supports. The last column shows whether the rerooting behavior is correct according to the interpretation offered or implied by the tool.
| Tool | Nodes | Branches | Default behavior | Correct rerooting |
|---|---|---|---|---|
| Archaeopteryx | | | Nodes | |
| ATV | | | Branches | |
| Dendroscope | | | Dialog | |
| ETE (GUI) | | | Branches | |
| EvolView | | | Branches | |
| FigTree | | | Both | |
| iTOL | | | Input dependent | |
| PhyloWidget | | | Nodes | |
| TreeView | | | Branches | |
| T-REX | | | Branches | |
| APE | | | Nodes | |
| BioPerl | | | Nodes | |
| BioPython | | | Nodes | |
| Dendropy | | | Nodes | |
| ETE (API) | | | Branches | |
| Geneious | | | Nodes | |
| MEGA | | | Branches | |
| Mesquite | | | Nodes | |
| Newick Utilities | | | Nodes | |
| Pycogent/scikit-bio | | | Branches | |
Option added or improved after this review.
Example of a published phylogeny showing that the issue occurred in real-life data. We used the original data from Lundin et al. (2010) to recreate Figure 2(a) of Lundin et al. (2010). (a) The original tree with the branch used for rerooting marked by a red cross. (b) The rerooted tree with incorrectly placed branch support values (e.g., the one underlined in green). This tree was created using Dendroscope 3.4.0. (c) The same rerooted tree, this time using the updated Dendroscope 3.5.3. The error does not occur, because the correct interpretation of the values was selected. Note that the value underlined in red is now correctly duplicated at both ends of the root branch. We colored the subtrees to highlight their positions after rerooting.