Thibaut Jombart, Michelle Kendall, Jacob Almagro-Garcia, Caroline Colijn.
Abstract
The increasing availability of large genomic data sets as well as the advent of Bayesian phylogenetics facilitates the investigation of phylogenetic incongruence, which can result in the impossibility of representing phylogenetic relationships using a single tree. While sometimes considered as a nuisance, phylogenetic incongruence can also reflect meaningful biological processes as well as relevant statistical uncertainty, both of which can yield valuable insights in evolutionary studies. We introduce a new tool for investigating phylogenetic incongruence through the exploration of phylogenetic tree landscapes. Our approach, implemented in the R package treespace, combines tree metrics and multivariate analysis to provide low-dimensional representations of the topological variability in a set of trees, which can be used for identifying clusters of similar trees and group-specific consensus phylogenies. treespace also provides a user-friendly web interface for interactive data analysis and is integrated alongside existing standards for phylogenetics. It fills a gap in the current phylogenetics toolbox in R and will facilitate the investigation of phylogenetic results.
Keywords: incongruence; multivariate analysis; package; software; tree distances; tree metric
Year: 2017 PMID: 28374552 PMCID: PMC5724650 DOI: 10.1111/1755-0998.12676
Source DB: PubMed Journal: Mol Ecol Resour ISSN: 1755-098X Impact factor: 7.090
Methods available in treespace for defining distances between trees
| Metric/tree summary | References | R function |
|---|---|---|
| Robinson–Foulds metric | Robinson & Foulds | |
| Branch score distance | Kuhner & Felsenstein | |
| Billera–Holmes–Vogtmann metric (BHV) | Billera et al. | |
| Path difference metric (a.k.a. patristic distance/node distance/tip distance/dissimilarity measure) | Steel & Penny | |
| Kendall–Colijn metric | Kendall & Colijn | |
| Abouheif's dissimilarity | Pavoine et al. | |
| Sum of direct descendents | Pavoine et al. | |
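To make the first row of the table concrete, the following is a minimal sketch of a Robinson–Foulds-style distance for rooted trees, counting the clades found in one tree but not the other. It is an illustration only, not the implementation used by treespace or the R packages it relies on. The nested-tuple tree encoding and the names `clades` and `rf_distance` are assumptions made for this example.

```python
def clades(tree):
    """Return the non-trivial clades of a rooted tree as frozensets of tip labels.

    Assumed encoding (not from the source): a tip is a string,
    an internal node is a tuple of its children.
    """
    found = set()

    def tips(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(tips(child) for child in node))
        found.add(below)
        return below

    all_tips = tips(tree)
    found.discard(all_tips)  # the root clade is shared by every tree on these tips
    return found


def rf_distance(tree_a, tree_b):
    """Number of clades present in exactly one of the two trees."""
    return len(clades(tree_a) ^ clades(tree_b))


# Three hypothetical four-taxon trees
t1 = ((("A", "B"), "C"), "D")
t2 = ((("A", "C"), "B"), "D")
t3 = (("A", "B"), ("C", "D"))

print(rf_distance(t1, t2))  # t1 and t2 disagree on {A,B} versus {A,C}
print(rf_distance(t2, t3))  # t2 and t3 share no non-trivial clade
```

The classical Robinson–Foulds metric is defined on the bipartitions of unrooted trees; the clade-based variant above is the natural analogue for the rooted trees that treespace takes as input.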
Figure 1. Rationale of the approach used in treespace. This diagram illustrates the four‐step approach for exploring phylogenetic tree spaces in treespace. (a) The input is a set of rooted, labelled trees describing the same taxa. Colours are used here to represent variability amongst trees. (b) Pairwise Euclidean distances between trees are computed, using various tree “summaries” or metrics. (c) These distances are represented in a space of lower dimension using multidimensional scaling (MDS), and potential groups of similar trees can be identified using various clustering methods. (d) Representative trees are derived from each group.
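Steps (a) and (b) of this approach can be sketched in a few lines. The example below summarises each rooted tree as a vector in the spirit of the topological Kendall–Colijn summary (for every pair of tips, the number of edges between the root and the pair's most recent common ancestor) and then takes Euclidean distances between vectors. It is a simplified illustration, not the treespace code: branch lengths are ignored, the constant pendant-edge entries are omitted because they cancel in the distance, and the tree encoding and function names are assumptions.

```python
from itertools import combinations
from math import sqrt


def mrca_depths(tree):
    """Map each unordered tip pair to the depth (edges from the root) of its MRCA.

    Assumed encoding (not from the source): a tip is a string,
    an internal node is a tuple of its children.
    """
    depths = {}

    def walk(node, depth):
        if isinstance(node, str):
            return [node]
        child_tips = [walk(child, depth + 1) for child in node]
        # Tips that sit under different children first meet at this node
        for left, right in combinations(child_tips, 2):
            for a in left:
                for b in right:
                    depths[frozenset((a, b))] = depth
        return [tip for tips in child_tips for tip in tips]

    walk(tree, 0)
    return depths


def tree_vector(tree, tip_order):
    """Vector of MRCA depths, one entry per tip pair, in a fixed pair order."""
    depths = mrca_depths(tree)
    return [depths[frozenset(pair)] for pair in combinations(tip_order, 2)]


def euclidean(u, v):
    return sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


# (a) a set of rooted, labelled trees on the same taxa (hypothetical data)
tip_order = ["A", "B", "C", "D"]
trees = [
    ((("A", "B"), "C"), "D"),
    ((("A", "C"), "B"), "D"),
    (("A", "B"), ("C", "D")),
]

# (b) pairwise Euclidean distances between the tree vectors
vectors = [tree_vector(t, tip_order) for t in trees]
distances = [[euclidean(u, v) for v in vectors] for u in vectors]
for row in distances:
    print([round(d, 3) for d in row])
```

The resulting distance matrix is what step (c) would pass to multidimensional scaling and clustering; those steps are not shown here.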
Figure 2. Dengue virus phylogenies obtained by various inference methods, demonstrating the variety of results. (a) neighbour‐joining (NJ), (b) maximum‐likelihood (ML), (c,d) BEAST, where (c) is a DensiTree plot of 200 trees randomly sampled from the converged BEAST posterior, and (d) is the MCC tree from this sample. Bootstrap support values for NJ and ML trees and posterior support values for the BEAST MCC tree were calculated; values below 100% are shown. The dashed lines delineate the Philippines clade, referred to in the text.
Figure 3. An analysis of the dengue virus phylogenies from Figure 2 using treespace. (a) Three‐dimensional MDS plot demonstrating the variety between phylogenies inferred by different methods. The NJ and ML trees are indicated by larger spheres, with their corresponding bootstrap trees marked as smaller spheres of the same colour. (b) Two‐dimensional MDS plot of the BEAST trees alone, coloured by cluster obtained using the function findGroves. Scree plots are given as insets. (c–f) From each cluster in (b), a median tree was selected using medTree. These are highlighted in (b) by crosses. The MCC tree (Figure 2d) is indicated by a star in (b), and sits close to the green median tree (d). Indeed, these two trees differ only in their topologies amongst the tips “D4Brazil82,” “D4NewCal81,” “D4Mexico84” and “D4ElSal83”.
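The idea of picking one median tree per cluster can be illustrated with plain vectors. The sketch below assumes that each tree has already been summarised as a numeric vector and assigned a cluster label, and that "median" means the member whose vector lies closest to its cluster's centroid. This is an assumption made for illustration; it is not a description of how medTree or findGroves are implemented, and the function names and data are hypothetical.

```python
from math import sqrt


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def median_index(vectors):
    """Index of the vector closest (Euclidean distance) to the centroid."""
    centre = centroid(vectors)

    def dist_to_centre(v):
        return sqrt(sum((x - c) ** 2 for x, c in zip(v, centre)))

    return min(range(len(vectors)), key=lambda i: dist_to_centre(vectors[i]))


def median_per_group(vectors, groups):
    """For each group label, return the index of its median member."""
    members = {}
    for index, label in enumerate(groups):
        members.setdefault(label, []).append(index)
    return {
        label: indices[median_index([vectors[i] for i in indices])]
        for label, indices in members.items()
    }


# Hypothetical tree vectors and cluster labels
tree_vectors = [[0, 0], [1, 0], [2, 0], [10, 10], [10, 12], [10, 11]]
cluster_labels = [1, 1, 1, 2, 2, 2]

print(median_per_group(tree_vectors, cluster_labels))
```

Because the result is an index into the original set, the selected tree is always one of the observed trees rather than a synthetic average, which matches the caption's description of median trees being highlighted among the plotted points.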