Susanne Reimering, Sebastian Muñoz, Alice C McHardy.
Abstract
Phylogeographic methods reconstruct the origin and spread of taxa by inferring locations for internal nodes of the phylogenetic tree from sampling locations of genetic sequences. This is commonly applied to study pathogen outbreaks and spread. To evaluate such reconstructions, the inferred spread paths from root to leaf nodes should be compared to other methods or references. Usually, ancestral state reconstructions are evaluated by node-wise comparisons, therefore requiring the same tree topology, which is usually unknown. Here, we present a method for comparing phylogeographies across different trees inferred from the same taxa. We compare paths of locations by calculating discrete Fréchet distances. By correcting the distances by the number of paths going through a node, we define the Fréchet tree distance as a distance measure between phylogeographies. As an application, we compare phylogeographic spread patterns on trees inferred with different methods from hemagglutinin sequences of H5N1 influenza viruses, finding that both tree inference and ancestral reconstruction cause variation in phylogeographic spread that is not directly reflected by topological differences. The method is suitable for comparing phylogeographies inferred with different tree or phylogeographic inference methods to each other or to a known ground truth, thus enabling a quality assessment of such techniques.
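As an illustration of the path comparison mentioned in the abstract, the following is a minimal sketch of the discrete Fréchet distance between two location paths, using the standard dynamic-programming recurrence. The function name `discrete_frechet`, the representation of locations as 2D coordinate tuples, and the use of Euclidean distance are assumptions made for this example; the authors' implementation and distance function may differ.

```python
from math import dist


def discrete_frechet(p, q):
    """Discrete Fréchet distance between two sequences of 2D points.

    Assumption: locations are (x, y) tuples compared by Euclidean distance.
    """
    n, m = len(p), len(q)
    # ca[i][j] holds the coupling distance between prefixes p[:i+1] and q[:j+1].
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = dist(p[i], q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                best_prev = min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1])
                ca[i][j] = max(best_prev, d)
    return ca[-1][-1]


# Hypothetical toy paths of different lengths.
P = [(0, 0), (1, 0), (2, 0)]
Q = [(0, 1), (2, 1)]
print(discrete_frechet(P, Q))
```

Because the recurrence only ever advances along one or both paths, the two paths may have different numbers of nodes, which is what allows comparison across trees with different topologies.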
Year: 2018 PMID: 30451977 PMCID: PMC6242967 DOI: 10.1038/s41598-018-35421-4
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Calculation of the discrete Fréchet tree distances on two paths and on two phylogenetic trees with different topologies. (A) Coupling that minimizes the discrete Fréchet distance between two paths P (red) and Q (blue) of different lengths. The coupling between points is indicated by the black dashed lines. (B) Example of two phylogenetic trees (reference on the left, reconstruction on the right) with different topologies inferred on the same taxa. Labels at the nodes indicate the locations, which are shown on a two-dimensional map in C, D and E. For each leaf node, the paths along the trees are compared. (C) Comparison of paths to location A. (D) Comparison of paths to location B. (E) Comparison of paths to location C. The coupling of nodes minimizing the distance between the paths is indicated by black dashed lines. For each node, these distances are summarized across all leaves. In the case of the reference tree, this leads to the following calculations: , , , , . These costs are then divided by the number of descendant leaves and summarized to calculate a final cost for the reference tree.
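The per-node bookkeeping described in this caption can be sketched as follows. This is an illustrative reading of the caption, not the authors' implementation: the tree encoding (child-to-parent dictionaries), the names `root_path`, `coupling` and `tree_cost`, Euclidean distances on 2D coordinates, the tie-breaking among equally good couplings, and the choice to add every coupled distance when a node is coupled to several nodes are all assumptions.

```python
from collections import defaultdict
from math import dist


def root_path(parent, leaf):
    """Nodes from the root down to `leaf`, given a child -> parent mapping."""
    path = [leaf]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path[::-1]


def coupling(p, q):
    """Index pairs (i, j) of one coupling that attains the discrete Fréchet distance."""
    n, m = len(p), len(q)
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = dist(p[i], q[j])
            prev = [ca[a][b] for a, b in ((i - 1, j), (i - 1, j - 1), (i, j - 1))
                    if a >= 0 and b >= 0]
            ca[i][j] = max(min(prev), d) if prev else d
    # Backtrack from the end, always stepping to the cheapest predecessor.
    i, j = n - 1, m - 1
    pairs = [(i, j)]
    while (i, j) != (0, 0):
        cands = [(a, b) for a, b in ((i - 1, j - 1), (i - 1, j), (i, j - 1))
                 if a >= 0 and b >= 0]
        i, j = min(cands, key=lambda ab: ca[ab[0]][ab[1]])
        pairs.append((i, j))
    return pairs[::-1]


def tree_cost(parent_a, loc_a, parent_b, loc_b):
    """Cost of tree A relative to tree B (both trees must share leaf names)."""
    internal = set(parent_a.values())
    leaves = [v for v in parent_a if v not in internal]
    cost = defaultdict(float)   # summed coupled distances per node of A
    n_paths = defaultdict(int)  # paths through a node = its descendant leaves
    for leaf in leaves:
        pa, pb = root_path(parent_a, leaf), root_path(parent_b, leaf)
        for node in pa:
            n_paths[node] += 1
        pts_a = [loc_a[v] for v in pa]
        pts_b = [loc_b[v] for v in pb]
        for i, j in coupling(pts_a, pts_b):
            cost[pa[i]] += dist(pts_a[i], pts_b[j])
    return sum(cost[v] / n_paths[v] for v in n_paths)


# Hypothetical toy data: same topology, only the root location differs.
parent = {"r": None, "a": "r", "b": "r"}
loc_ref = {"r": (0, 0), "a": (1, 0), "b": (0, 1)}
loc_rec = {"r": (0, 3), "a": (1, 0), "b": (0, 1)}
print(tree_cost(parent, loc_ref, parent, loc_rec))  # 3.0
```

In the toy data the root is displaced by 3 units and lies on both root-to-leaf paths, so its summed cost of 6 is divided by its two descendant leaves. The sketch returns the cost for one tree only; how the costs of the two trees are combined into the final distance is not specified in this caption.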
Figure 2. Phylogenetic trees with locations mapped to internal nodes and branches as indicated by the colors. The trees were generated by maximum parsimony (A), neighbor joining (B), UPGMA (C), maximum likelihood using the Jukes-Cantor model (D), and maximum likelihood using the GTR model (E). Ancestral states were inferred by parsimony. The visualization was performed using GraPhlAn [34].
Pairwise Robinson-Foulds distances for all five inferred trees (below the main diagonal) and the corresponding z-scores (above the main diagonal).
| | UPGMA | NJ | Parsimony | MLJC | MLGTR |
|---|---|---|---|---|---|
| UPGMA | 0 | 1.38 | 0.60 | −0.14 | −0.14 |
| NJ | 242 | 0 | 0.22 | 0.57 | 0.57 |
| Parsimony | 198 | 176 | 0 | −0.31 | −0.35 |
| MLJC | 156 | 196 | 146 | 0 | −2.40 |
| MLGTR | 156 | 196 | 144 | 28 | 0 |
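The record does not state how the z-scores were derived. As a hedged check, the values above the diagonal of the Robinson-Foulds table are reproduced when the ten pairwise distances are standardized by their mean and sample standard deviation; the sketch below shows that calculation. The function name `z_scores` and the dictionary layout are assumptions for illustration.

```python
from statistics import mean, stdev

# Lower triangle of the Robinson-Foulds table, keyed by (row, column).
rf = {
    ("NJ", "UPGMA"): 242,
    ("Parsimony", "UPGMA"): 198, ("Parsimony", "NJ"): 176,
    ("MLJC", "UPGMA"): 156, ("MLJC", "NJ"): 196, ("MLJC", "Parsimony"): 146,
    ("MLGTR", "UPGMA"): 156, ("MLGTR", "NJ"): 196,
    ("MLGTR", "Parsimony"): 144, ("MLGTR", "MLJC"): 28,
}


def z_scores(distances):
    """Standardize pairwise distances by their mean and sample standard deviation."""
    values = list(distances.values())
    mu, sigma = mean(values), stdev(values)
    return {pair: (d - mu) / sigma for pair, d in distances.items()}


z = z_scores(rf)
for pair, score in z.items():
    print(pair, f"{score:.2f}")
```

A strongly negative z-score, as for the MLJC and MLGTR pair, marks two trees that are much closer to each other than the average pair in the comparison.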
Pairwise Fréchet tree distances for all five inferred trees using parsimony for the ancestral character state reconstruction (below the main diagonal) and the corresponding z-scores (above the main diagonal).
| | UPGMA | NJ | Parsimony | MLJC | MLGTR |
|---|---|---|---|---|---|
| UPGMA | 0 | 1.24 | −0.52 | −0.51 | −0.63 |
| NJ | 22905.86 | 0 | 0.94 | 1.27 | 1.04 |
| Parsimony | 8043.998 | 20399.78 | 0 | −0.68 | −0.75 |
| MLJC | 8132.359 | 23165.58 | 6678.068 | 0 | −1.34 |
| MLGTR | 7052.893 | 21257.84 | 6050.793 | 573.3009 | 0 |
Figure 3. Visualization of pairwise Fréchet tree distances. Nonmetric multidimensional scaling was performed on the pairwise Fréchet tree distances to plot the differences in a two-dimensional space. (A) Fréchet tree distances using parsimony for ancestral reconstruction. (B) Fréchet tree distances using maximum likelihood for ancestral reconstruction.
Pairwise Fréchet tree distances for all five inferred trees using maximum likelihood for the ancestral character state reconstruction (below the main diagonal) and the corresponding z-scores (above the main diagonal).
| | UPGMA | NJ | Parsimony | MLJC | MLGTR |
|---|---|---|---|---|---|
| UPGMA | 0 | 1.99 | −0.23 | 0.64 | 0.61 |
| NJ | 23451.97 | 0 | −0.46 | 0.21 | 0.47 |
| Parsimony | 8486.976 | 6949.106 | 0 | −0.87 | −0.87 |
| MLJC | 14317.09 | 11452.11 | 4145.878 | 0 | −1.49 |
| MLGTR | 14113.19 | 13171.35 | 4145.878 | 0 | 0 |