Joseph Heled, Remco R Bouckaert.
Abstract
BACKGROUND: Bayesian phylogenetic analysis generates a set of trees which are often condensed into a single tree representing the whole set. Many methods exist for selecting a representative topology for a set of unrooted trees, few exist for assigning branch lengths to a fixed topology, and even fewer for simultaneously setting the topology and branch lengths. However, there is very little research into locating a good representative for a set of rooted time trees like the ones obtained from a BEAST analysis.
Year: 2013 PMID: 24093883 PMCID: PMC3853548 DOI: 10.1186/1471-2148-13-221
Source DB: PubMed Journal: BMC Evol Biol ISSN: 1471-2148 Impact factor: 3.260
Table 1. Rankings of methods for building a summary tree
| Method | RH | CME | CCE | CAE | DVE | MF | TLL | CLL |
| TP(med) | 1/3 | 0/0 | 12/9 | 8/8 | 7/5 | 3/3 | 0/0 | 3/3 |
| TP(avg) | 0/4 | 0/0 | 13/9 | 6/7 | 0/3 | 11/10 | 1/6 | 14/15 |
| MED,TCB | 1/0 | 3/3 | 10/7 | 6/4 | 6/6 | 9/7 | 8/11 | 9/9 |
| MED,MCC | 1/0 | 6/6 | 12/10 | 7/4 | 7/7 | 7/6 | 7/10 | 7/7 |
| RBS,TCB | 6/8 | 10/10 | 4/3 | 12/12 | 11/10 | 2/2 | 4/3 | 1/1 |
| RBS,MCC | 7/9 | 12/12 | 5/4 | 12/12 | 11/11 | 1/1 | 3/2 | 0/0 |
| HSO,TCB | 1/1 | 2/2 | 11/8 | 6/4 | 6/6 | 10/7 | 8/11 | 10/10 |
| HSO,MCC | 1/2 | 5/5 | 13/11 | 6/4 | 7/7 | 8/6 | 7/10 | 8/8 |
| SRBS,TCB | 3/5 | 8/7 | 8/5 | 6/5 | 1/2 | 12/9 | 6/9 | 13/13 |
| SRBS,MCC | 4/6 | 9/9 | 9/6 | 6/5 | 3/4 | 11/9 | 5/9 | 11/12 |
| RAS,MCC | 5/6 | 14/14 | 7/4 | 9/9 | 9/8 | 4/4 | 3/4 | 6/6 |
| RAS,TCB | 5/6 | 13/13 | 6/3 | 10/10 | 10/9 | 5/4 | 5/5 | 5/5 |
| mSRBS | 3/5 | 9/8 | 9/6 | 7/6 | 2/3 | 11/8 | 4/8 | 12/11 |
| mRAS | 5/7 | 15/15 | 7/4 | 11/11 | 10/9 | 6/5 | 6/7 | 4/4 |
| mRBS | 6/8 | 11/11 | 3/3 | 13/13 | 11/10 | 0/0 | 2/1 | 2/2 |
| mHS | 1/0 | 18/19 | 1/1 | 2/0 | 1/0 | 16/12 | 12/14 | 18/18 |
| AVG,MCC | 0/4 | 7/7 | 11/9 | 5/5 | 5/8 | 13/11 | 9/12 | 15/14 |
| CAT,TCB | 0/5 | 1/1 | 14/10 | 0/0 | 1/2 | 18/16 | 14/15 | 20/21 |
| CAT,MCC | 0/4 | 4/4 | 15/12 | 1/1 | 1/2 | 17/15 | 13/14 | 19/20 |
| HS,TCB | 2/2 | 17/17 | 2/2 | 4/3 | 4/1 | 15/14 | 11/17 | 16/16 |
| HS,MCC | 1/1 | 19/18 | 2/2 | 3/2 | 3/0 | 14/13 | 10/16 | 17/17 |
| CONS(med) | 1/0 | 16/16 | 0/0 | 3/2 | 8/8 | 17/14 | 15/13 | 19/19 |
Rankings of methods for building a summary tree from posterior samples. Both the comparison ranking and the error magnitude ranking are given for each method and each of the 8 error measures (as a comparison/magnitude pair). The error measures are root height error (RH), clades missed (CME), clades called (CCE), clade age errors (CAE), divergence time errors (DVE), model fit (MF), tree likelihood (TLL) and coalescent likelihood (CLL). Method names are as defined in the methods section, except for CONS, MED, AVG and HSO. CONS is the strict consensus tree with ages set by median estimates, as implemented by DendroPy. MED and AVG use, respectively, the median and the average of clade ages from all matching trees in the posterior. HSO starts from the same clade ages, but uses the search algorithm of the tree distance methods to find heights that minimize the total squared error.
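The MED and AVG age assignments described in the caption can be illustrated with a minimal sketch. It assumes a hypothetical representation in which each posterior tree is a dictionary mapping a clade (a `frozenset` of taxon names) to that clade's age; the function name `summarize_clade_ages` and the toy data are illustrative only and are not taken from the paper or from any library.

```python
from statistics import mean, median


def summarize_clade_ages(summary_clades, posterior_trees, how="median"):
    """Assign an age to each clade of a fixed summary topology.

    posterior_trees is a list of dicts mapping a clade (frozenset of
    taxon names) to its age in that sampled tree. This representation
    is an assumption made for the sketch. Only trees that contain the
    clade ("matching trees") contribute to its age.
    """
    reducer = median if how == "median" else mean
    ages = {}
    for clade in summary_clades:
        matching = [tree[clade] for tree in posterior_trees if clade in tree]
        # A clade that never appears in the posterior gets no estimate.
        ages[clade] = reducer(matching) if matching else None
    return ages


# Toy posterior of three trees over taxa A, B, C (made-up ages).
ab = frozenset({"A", "B"})
abc = frozenset({"A", "B", "C"})
posterior = [
    {ab: 1.0, abc: 2.0},
    {ab: 1.2, abc: 2.6},
    {frozenset({"B", "C"}): 0.9, abc: 2.1},  # this tree lacks clade AB
]

med_ages = summarize_clade_ages([ab, abc], posterior, how="median")
avg_ages = summarize_clade_ages([ab, abc], posterior, how="mean")
```

In this toy case clade AB is summarized from two trees only, while the root clade ABC uses all three, so the median and the mean of the root age differ.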
Table 2. Condensed rankings for methods in Table 1 with additional performance numbers
| Method | TIMES | CME | CCE | MODEL | POLY | MF% | CAE% | CME% |
| CAT,TCB | 1 | 1 | 14 | 19 | 0.0% | 45.2% | 3.79% | 36.33% |
| CAT,MCC | 2 | 4 | 15 | 18 | 0.0% | 45.3% | 3.79% | 36.46% |
| TP (avg) | 5 | 0 | 13 | 11 | 0.0% | 93.2% | 4.37% | 36.22% |
| TP (med) | 12 | 0 | 12 | 1 | 0.0% | 98.6% | 4.50% | 36.21% |
| | ||||||||
| SRBS,TCB | 6 | 8 | 7 | 12 | 4.1% | 91.8% | 4.38% | 36.64% |
| MED,TCB | 7 | 3 | 9 | 8 | 1.1% | 94.0% | 4.36% | 36.36% |
| HSO,TCB | 8 | 2 | 10 | 10 | 1.1% | 94.0% | 4.36% | 36.35% |
| MED,MCC | 9 | 6 | 13 | 6 | 0.9% | 94.6% | 4.37% | 36.48% |
| mSRBS | 9 | 9 | 8 | 9 | 3.8% | 92.1% | 4.39% | 36.80% |
| HSO,MCC | 10 | 5 | 14 | 7 | 0.9% | 94.6% | 4.37% | 36.48% |
| AVG,MCC | 10 | 7 | 11 | 13 | 1.1% | 84.2% | 4.30% | 36.49% |
| SRBS,MCC | 11 | 10 | 8 | 11 | 4.2% | 91.9% | 4.39% | 36.80% |
| | ||||||||
| mHS | 0 | 19 | 1 | 16 | 29.8% | 50.0% | 3.97% | 44.48% |
| HS,MCC | 3 | 19 | 2 | 14 | 34.3% | 54.9% | 4.16% | 44.27% |
| HS,TCB | 4 | 18 | 2 | 15 | 34.5% | 54.6% | 4.18% | 44.09% |
| CONS (med) | 6 | 17 | 0 | 17 | 27.5% | 46.3% | 3.98% | 43.00% |
| RAS,MCC | 13 | 15 | 6 | 3 | 24.4% | 93.1% | 4.66% | 42.45% |
| RAS,TCB | 14 | 14 | 5 | 4 | 24.2% | 92.7% | 4.67% | 42.26% |
| mRAS | 15 | 16 | 6 | 5 | 24.5% | 88.2% | 4.74% | 42.73% |
| RBS,TCB | 16 | 11 | 4 | 2 | 23.4% | 99.0% | 4.82% | 40.60% |
| mRBS | 17 | 12 | 3 | 0 | 23.8% | 99.1% | 4.84% | 40.80% |
| RBS,MCC | 18 | 13 | 5 | 0 | 23.7% | 99.0% | 4.81% | 40.94% |
The ranks for RH, CAE and DVE were added to make the TIMES rank, which indicates fit of clade heights, and the MF, TLL and CLL ranks were added to make the MODEL rank, which indicates fit of topology. The POLY column shows the mean percentage of branches with length zero, which effectively create polytomies in the tree. The number of zero-length branches in each tree was divided by the total number of branches to turn it into a percentage, so that it can be averaged over all 2000 test cases. The MF% column shows the mean percentile of the summary tree log-likelihood (tree+coalescent) in the posterior samples. For example, a value of 94% means that the summary tree log-likelihood was higher than that of 94% of the posterior trees. The CAE% column shows the mean clade age error per clade, as a percentage of tree height. The CME% column shows the mean number of missed clades, as a percentage of the number of non-trivial clades in the tree. The means are obtained by averaging the statistic over the 2000 summary trees produced by each method.
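As a rough illustration of how the POLY and MF% statistics in the caption could be computed for a single test case and then averaged, consider the sketch below. The function names, the plain-list inputs and the toy numbers are assumptions made for the example; the paper does not specify an implementation.

```python
from statistics import mean


def zero_branch_percent(branch_lengths):
    """Percentage of branches in one summary tree with length zero."""
    zero = sum(1 for length in branch_lengths if length == 0.0)
    return 100.0 * zero / len(branch_lengths)


def percentile_in_posterior(summary_loglik, posterior_logliks):
    """Percentage of posterior trees whose log-likelihood is strictly
    lower than the summary tree's log-likelihood."""
    lower = sum(1 for ll in posterior_logliks if ll < summary_loglik)
    return 100.0 * lower / len(posterior_logliks)


# Two made-up test cases; the paper averages over 2000.
poly_per_case = [
    zero_branch_percent([0.0, 0.5, 0.0, 1.2]),  # 2 of 4 branches are zero
    zero_branch_percent([0.3, 0.5, 0.1, 1.2]),  # no zero-length branches
]
mf_per_case = [
    percentile_in_posterior(-100.0, [-120.0, -110.0, -105.0, -95.0]),
    percentile_in_posterior(-90.0, [-120.0, -110.0, -105.0, -95.0]),
]

poly_column = mean(poly_per_case)  # mean POLY over test cases
mf_column = mean(mf_per_case)      # mean MF% over test cases
```

Converting each per-tree count to a percentage before averaging, as the caption describes, keeps test cases with different numbers of branches comparable.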
Figure 1. Four Summary Trees. Four summary trees generated from the same data set, drawn over a DensiTree. A DensiTree draws all trees in a set using transparency, so that places where the trees agree are darkly coloured, while places with a lot of variation are lightly coloured. The bars in the tree from RBS show the 95% credible intervals for all clades.