Kord Eickmeyer, Peter Huggins, Lior Pachter, Ruriko Yoshida.
Abstract
The popular neighbor-joining (NJ) algorithm used in phylogenetics is a greedy algorithm for finding the balanced minimum evolution (BME) tree associated with a dissimilarity map. From this point of view, NJ is "optimal" when the algorithm outputs the tree that minimizes the balanced minimum evolution criterion. We use the fact that the NJ tree topology and the BME tree topology are determined by polyhedral subdivisions of the spaces of dissimilarity maps [equation; see text] to study the optimality of the neighbor-joining algorithm. In particular, we investigate and compare the polyhedral subdivisions for n ≤ 8. This requires the measurement of volumes of spherical polytopes in high dimension, which we obtain using a combination of Monte Carlo methods and polyhedral algorithms. Our results include a demonstration that highly unrelated trees can be co-optimal in BME reconstruction, and that NJ regions are not convex. We obtain the ℓ2 radius for neighbor-joining for n = 5, and we conjecture that the ability of the neighbor-joining algorithm to recover the BME tree depends on the diameter of the BME tree.
Year: 2008 PMID: 18447942 PMCID: PMC2430562 DOI: 10.1186/1748-7188-3-5
Source DB: PubMed Journal: Algorithms Mol Biol ISSN: 1748-7188 Impact factor: 1.405
Figure 1. The four types of facets of P.
Figure 2. A tree with five leaves.
The f-vector for small BME polytopes.
| #leaves | dim(BME polytope) | f-vector |
| 4 | 2 | (3,3) |
| 5 | 5 | (15, 105, 250, 210, 52) |
| 6 | 9 | (105, 5460, ?, ?, ?, 90262) |
| 7 | 14 | (945, 445410, ?, ?, ?, ?, ?) |
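The first two columns of this table can be cross-checked with a short calculation. The sketch below assumes two standard facts that the excerpt itself does not state: that the number of unrooted binary trees on n labeled leaves is (2n-5)!!, and that the BME polytope has one vertex per tree and dimension C(n,2) - n. The function names are illustrative only.

```python
from math import comb


def num_unrooted_binary_trees(n: int) -> int:
    """Return (2n-5)!! = 1 * 3 * 5 * ... * (2n-5), valid for n >= 3."""
    count = 1
    for k in range(3, 2 * n - 4, 2):
        count *= k
    return count


def bme_polytope_dim(n: int) -> int:
    """Assumed dimension formula: C(n, 2) - n."""
    return comb(n, 2) - n


if __name__ == "__main__":
    for n in range(4, 8):
        # Columns: #leaves, assumed dimension, assumed vertex count (f_0)
        print(n, bme_polytope_dim(n), num_unrooted_binary_trees(n))
```

Under these assumptions the printed dimensions and vertex counts should match the dimension column and the first f-vector entry of each row.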
Figure 3. The non-edges on the BME polytope for [equation; see text]. Two trees form a non-edge if and only if both have three cherries and they differ by the pair of leaf exchanges shown in the figure. There are two ways to perform each leaf exchange, so each binary tree with three cherries is non-adjacent to 4 other trees.
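The counting argument in this caption can be compared with the f-vector table. The number of leaves is missing from the caption in this copy; the sketch below assumes it refers to seven leaves and takes the count of 315 three-cherry trees from the 7-taxon row of the NJ/BME comparison table. If each such tree is non-adjacent to 4 others, there should be 315 × 4 / 2 non-edges, and the number of edges should equal the number of vertex pairs minus that value.

```python
from math import comb

# (f_0, f_1) as reported in the f-vector table, keyed by number of leaves.
REPORTED = {5: (15, 105), 6: (105, 5460), 7: (945, 445410)}

# Assumption: the caption's rule applies to seven leaves; 315 is the number
# of three-cherry trees listed for 7 taxa in the comparison table.
THREE_CHERRY_TREES_7 = 315


def non_edges(n: int) -> int:
    """Vertex pairs of the BME polytope that are not joined by an edge."""
    vertices, edges = REPORTED[n]
    return comb(vertices, 2) - edges


def predicted_non_edges_7() -> int:
    """Each three-cherry tree misses 4 neighbors; each non-edge is counted twice."""
    return THREE_CHERRY_TREES_7 * 4 // 2


if __name__ == "__main__":
    for n in sorted(REPORTED):
        print(n, non_edges(n))
    print("predicted for 7 leaves:", predicted_non_edges_7())
```

On the reported numbers, every pair of trees is an edge for five and six leaves, and the shortfall for seven leaves equals the predicted count, which is consistent with the seven-leaf reading of the caption.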
The 14 rays of the cone C_{23,45}. Each ray is determined by a vector shown in the second column. The third column shows, for each ray, which cones it belongs to. If a cone is starred then the ray is on the boundary of that cone, but not a ray of it.
| Type | rays | Cones |
| I | (-3, 5, -3, -1, 5, -3, -1, 1, 1, -1) | |
| II | (-1, 1, -1, 1, 1, -1, -1, 1, 1, -1) | |
| III | (1, -1, -1, 1, 1, -1, -1, -1, 3, -1) | |
Comparison of NJ and BME cones. The volume estimates for n = 8 do not all add up to exactly 100% due to round-off errors.
| #taxa | tree shape | #trees | NJ vol | BME vol | NJ accuracy |
| 4 | unique | 3 | 100% | 100% | 100% |
| 5 | unique | 15 | 100% | 100% | 98.06% |
| 6 | 3-cherry | 15 | 18.49% | 18.57% | 90.39% |
| 6 | caterpillar | 90 | 81.51% | 81.43% | 91.33% |
| 7 | 3-cherry | 315 | 45.32% | 44.58% | 82.42% |
| 7 | caterpillar | 630 | 54.68% | 55.42% | 78.85% |
| 8 | 4-cherry | 315 | 6.48% | 6.36% | 70.12% |
| 8 | 3-cherry (two are neighbors) | 2520 | 27.12% | 25.84% | 69.93% |
| 8 | 3-cherry (none are neighbors) | 2520 | 35.67% | 34.55% | 71.63% |
| 8 | caterpillar | 5040 | 30.73% | 33.24% | 61.75% |
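The 100% accuracy in the four-taxon row can be illustrated directly. The sketch below assumes the standard NJ selection criterion Q(i,j) = (n-2)·d(i,j) - r(i) - r(j) and Pauplin's balanced length formula, under which a quartet tree weights the two within-cherry distances by 1/2 and the four cross distances by 1/4. Neither formula appears in this excerpt, and the helper names are hypothetical. Integer dissimilarities keep the arithmetic exact, so ties are handled by comparing lengths rather than tree labels.

```python
import random
from itertools import combinations

# The three quartet topologies on leaves 0..3, written as pairs of cherries.
SPLITS = [((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2))]


def bme_length(d, split):
    """Balanced length of a quartet tree (assumed Pauplin weights 1/2 and 1/4)."""
    (a, b), (c, e) = split
    within = d[a][b] + d[c][e]
    total = sum(d[i][j] for i, j in combinations(range(4), 2))
    return within / 2 + (total - within) / 4


def nj_split(d):
    """Quartet topology chosen by the NJ Q-criterion with n = 4."""
    r = [sum(row) for row in d]
    i, j = min(combinations(range(4), 2),
               key=lambda p: 2 * d[p[0]][p[1]] - r[p[0]] - r[p[1]])
    return next(s for s in SPLITS if (i, j) in s)


def random_map(rng):
    """Symmetric 4x4 dissimilarity map with random integer entries."""
    d = [[0] * 4 for _ in range(4)]
    for i, j in combinations(range(4), 2):
        d[i][j] = d[j][i] = rng.randint(1, 1000)
    return d


def agreement(trials=1000, seed=0):
    """Fraction of random maps on which the NJ tree attains the minimum BME length."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        d = random_map(rng)
        best = min(bme_length(d, s) for s in SPLITS)
        hits += bme_length(d, nj_split(d)) == best
    return hits / trials


if __name__ == "__main__":
    print(agreement())
```

For four taxa both criteria reduce to minimizing the sum of the two within-cherry distances, which is why no disagreement is expected in this case.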
Figure 4. Frequencies of the three possible types of NJ trees that may be picked instead of the BME tree for [equation; see text]. Neighbor-joining agrees with the BME tree 98.06% of the time.
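An agreement rate of this kind can be estimated by Monte Carlo sampling. The comparison table reports 98.06% for five taxa, so the sketch below works with five leaves. It rests on several assumptions not confirmed by this excerpt: dissimilarity maps are drawn with independent standard Gaussian entries, NJ uses the usual Q-criterion and distance reduction d(u,k) = (d(i,k) + d(j,k) - d(i,j)) / 2, and the balanced length uses Pauplin weights 1/2, 1/4 and 1/8 for a five-leaf tree. The authors' actual sampling scheme and polyhedral computations may differ, and all function names are hypothetical.

```python
import random
from itertools import combinations


def all_trees():
    """The 15 binary trees on leaves 0..4, each encoded by its two cherries."""
    trees = []
    for mid in range(5):
        rest = [x for x in range(5) if x != mid]
        a = rest[0]
        for b in rest[1:]:
            other = frozenset(x for x in rest if x not in (a, b))
            trees.append(frozenset([frozenset([a, b]), other]))
    return trees


def bme_length(d, tree):
    """Balanced length: 1/2 within a cherry, 1/8 across cherries, 1/4 otherwise."""
    c1, c2 = tree
    total = 0.0
    for i, j in combinations(range(5), 2):
        pair = frozenset([i, j])
        if pair == c1 or pair == c2:
            w = 0.5
        elif pair <= (c1 | c2):
            w = 0.125
        else:
            w = 0.25  # the pair involves the middle leaf
        total += w * d[i][j]
    return total


def q_min_pair(dist, nodes):
    """Pair of nodes minimizing the NJ Q-criterion."""
    n = len(nodes)
    r = {x: sum(dist[frozenset((x, y))] for y in nodes if y != x) for x in nodes}
    return min(combinations(nodes, 2),
               key=lambda p: (n - 2) * dist[frozenset(p)] - r[p[0]] - r[p[1]])


def nj_tree(d):
    """Topology returned by neighbor-joining on a 5x5 dissimilarity map."""
    dist = {frozenset((i, j)): d[i][j] for i, j in combinations(range(5), 2)}
    nodes = list(range(5))
    i, j = q_min_pair(dist, nodes)
    u = 5  # label of the node created by merging i and j
    rest = [k for k in nodes if k not in (i, j)]
    for k in rest:
        dist[frozenset((u, k))] = (dist[frozenset((i, k))]
                                   + dist[frozenset((j, k))]
                                   - dist[frozenset((i, j))]) / 2
    nodes = rest + [u]
    pair = set(q_min_pair(dist, nodes))
    # The side of the quartet split that avoids u is the second cherry.
    second = pair if u not in pair else set(nodes) - pair
    return frozenset([frozenset((i, j)), frozenset(second)])


def estimate_agreement(trials=20000, seed=0):
    """Fraction of sampled maps on which NJ returns the BME-optimal tree."""
    rng = random.Random(seed)
    trees = all_trees()
    hits = 0
    for _ in range(trials):
        d = [[0.0] * 5 for _ in range(5)]
        for i, j in combinations(range(5), 2):
            d[i][j] = d[j][i] = rng.gauss(0.0, 1.0)
        bme = min(trees, key=lambda t: bme_length(d, t))
        hits += nj_tree(d) == bme
    return hits / trials


if __name__ == "__main__":
    print(estimate_agreement())
```

Gaussian entries are used because their direction is uniformly distributed on the sphere, which is one plausible reading of the volume measure in the abstract; the estimate should be treated as illustrative rather than as a reproduction of the reported figure.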