Wellington S Martins, Welton C Carmo, Humberto J Longo, Thierson C Rosa, Thiago F Rangel.
Abstract
BACKGROUND: Phylogenetic comparative analyses usually rely on a single consensus phylogenetic tree in order to study evolutionary processes. However, most phylogenetic trees are incomplete with regard to species sampling, which may critically compromise analyses. Some approaches have been proposed to integrate non-molecular phylogenetic information into incomplete molecular phylogenies. An expanded tree approach consists of adding missing species to random locations within their clade. The information contained in the topology of the resulting expanded trees can be captured by the pairwise phylogenetic distance between species and stored in a matrix for further statistical analysis. Thus, the random expansion and processing of multiple phylogenetic trees can be used to estimate the phylogenetic uncertainty through a simulation procedure. Because of the computational burden involved, the analyses are of limited applicability unless this procedure is implemented efficiently.
Year: 2013 PMID: 24229408 PMCID: PMC4225676 DOI: 10.1186/1471-2105-14-324
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Input data representation. (Left) Phylogenetic tree and (top right) the species to be inserted.
Figure 2. A possible final expanded tree, where branches connecting inserted species are shown by dashed lines.
Figure 3. Heavy chain decomposition. The previously expanded tree after the heavy chain decomposition. The chains produced are: [1-5-7-8-17-9], [2-23-3], [11-19-20], [21-4], [15-10], [24], [22], [6], [18], [16], [12], [13].
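The random expansion step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Node` class, the `insert_species` helper, the uniform choice of edge, and the assumption of an ultrametric tree (all tips at the same depth) are hypothetical choices made only for illustration. The clade is assumed to contain at least two species.

```python
import random

class Node:
    """Hypothetical tree node; `length` is the branch length to the parent."""
    def __init__(self, name=None, length=0.0):
        self.name = name
        self.length = length
        self.parent = None
        self.children = []

    def add(self, child):
        child.parent = self
        self.children.append(child)
        return child

def subtree_nodes(root):
    """All nodes strictly below `root`; each one stands for the edge above it."""
    out, stack = [], list(root.children)
    while stack:
        node = stack.pop()
        out.append(node)
        stack.extend(node.children)
    return out

def tip_depth(node):
    """Distance from `node` down to a tip (assumes an ultrametric tree)."""
    depth = 0.0
    while node.children:
        node = node.children[0]
        depth += node.length
    return depth

def insert_species(clade_root, name, rng=random):
    """Attach leaf `name` at a random point on a random edge inside the clade."""
    target = rng.choice(subtree_nodes(clade_root))  # edge chosen uniformly (assumption)
    cut = rng.uniform(0.0, target.length)           # distance from target up to the new node
    parent = target.parent
    joint = Node(length=target.length - cut)        # new internal node splitting the edge
    parent.children[parent.children.index(target)] = joint
    joint.parent = parent
    target.length = cut
    joint.add(target)
    # Give the new leaf the same depth as the tips below `target`.
    joint.add(Node(name, length=cut + tip_depth(target)))
    return joint
```

Calling `insert_species` once per missing species, and repeating the whole process m times on fresh copies of the input tree, would yield the m expanded trees used in the simulation.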
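A heavy chain decomposition of this kind can be sketched as follows, assuming the standard rule in which each node's chain continues into the child with the largest subtree. The function name `heavy_chains` and the `children` dictionary representation are hypothetical, and the small example tree in the usage note is not the tree of Figure 3.

```python
def heavy_chains(children, root):
    """Split a rooted tree into chains, following the largest-subtree child."""
    # Pre-order traversal; reversing it visits every child before its parent.
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node)
        stack.extend(children.get(node, []))

    # Subtree sizes, computed bottom-up.
    size = {}
    for node in reversed(order):
        size[node] = 1 + sum(size[c] for c in children.get(node, []))

    chains, heads = [], [root]
    while heads:
        node = heads.pop()
        chain = [node]
        while children.get(node):
            kids = children[node]
            heavy = max(kids, key=lambda c: size[c])
            # Every non-heavy child starts a chain of its own.
            heads.extend(c for c in kids if c != heavy)
            chain.append(heavy)
            node = heavy
        chains.append(chain)
    return chains
```

For a tree given as `{1: [2, 3], 2: [4, 5], 4: [6]}` with root `1`, the chains are `[1, 2, 4, 6]`, `[5]` and `[3]`. Any root-to-leaf path crosses only a logarithmic number of chains, which is what makes the decomposition useful for fast ancestor queries.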
Tree generation time (seconds) for |S| = 128, |T| = (126, …, 510) and m = 1,000
| 126 | 0.32 | 4.25 | 806.80 |
| 209 | 0.36 | 4.69 | 1,021.49 |
| 304 | 0.42 | 5.93 | 1,691.40 |
| 419 | 0.47 | 7.35 | 2,253.90 |
| 510 | 0.52 | 7.86 | 2,730.90 |
Time in seconds for generating m = 1000 expanded trees, each one receiving |S| = 128 new species, for different input trees with |T| = (126, …, 510) species (Phyllostomi, Carnivora, Hummingbirds and Amphibia phylogenies).
Tree generation time (seconds) for |S| = 128, |T| = 304 and m = (1, …, 10,000)
| 1 | 0.00 | 0.01 | 1.75 |
| 10 | 0.02 | 0.06 | 17.44 |
| 100 | 0.05 | 0.59 | 174.20 |
| 1,000 | 0.42 | 5.93 | 1,739.80 |
| 10,000 | 4.21 | 59.86 | 17,412.06 |
Time in seconds for generating a varying number m = (1, …, 10,000) of expanded trees, given a single input tree containing |T| = 304 species (Hummingbirds phylogeny) and |S| = 128 species to be inserted.
Tree generation time (seconds) for |S| = (32, …, 512), |T| = 304 and m = 1,000
| 32 | 0.23 | 5.46 | 457.21 |
| 64 | 0.31 | 5.60 | 856.80 |
| 128 | 0.42 | 5.93 | 1,739.80 |
| 256 | 0.70 | 6.48 | 3,839.10 |
| 512 | 1.23 | 7.56 | 9,487.30 |
Time in seconds for generating m = 1000 expanded trees, given a single input tree containing |T| = 304 species (Hummingbirds phylogeny) and a varying number |S| = (32, …, 512) of species to be inserted.
Distance computation time (seconds) for |T| = (126, …, 510) and m = 1,000
| 126 | 0.55 | 2.05 | 39.92 | 1.33 |
| 209 | 1.58 | 3.41 | 90.40 | 4.06 |
| 304 | 3.85 | 6.97 | 150.90 | 15.89 |
| 419 | 7.89 | 15.58 | 211.32 | 36.49 |
| 510 | 12.11 | 25.31 | 289.15 | 51.69 |
Time in seconds for the computation of m = 1000 distance matrices, for different input trees with |T| = (126, …, 510) species (Phyllostomi, Carnivora, Hummingbirds and Amphibia phylogenies).
Distance computation time (seconds) for |T| = 304 and m = (1, …, 10,000)
| 1 | 0.00 | 0.01 | 0.15 | 0.02 |
| 10 | 0.04 | 0.08 | 1.50 | 0.18 |
| 100 | 0.39 | 0.70 | 15.05 | 1.64 |
| 1,000 | 3.85 | 6.97 | 150.90 | 16.47 |
| 10,000 | 38.51 | 69.69 | 1,519.00 | 164.69 |
Time in seconds for the computation of a varying number m = (1, …, 10,000) of distance matrices, each one associated with a |T| = 304 species (Hummingbirds phylogeny) expanded tree.
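To make the timed operation concrete, the sketch below computes a pairwise distance matrix for one tree, taking the distance between two species to be the sum of branch lengths on the path joining them, i.e. depth(a) + depth(b) - 2 · depth(lca(a, b)). This is a deliberately naive illustration and not the method benchmarked above: the function name `distance_matrix` and the `parent` and `length` dictionaries are hypothetical, and the lowest common ancestor is found by walking to the root rather than through any chain-based speed-up.

```python
def distance_matrix(parent, length, tips):
    """Pairwise path-length distances between the given tips.

    parent: maps each non-root node to its parent.
    length: maps each non-root node to the length of the branch above it.
    """
    def path_to_root(node):
        path = [node]
        while node in parent:
            node = parent[node]
            path.append(node)
        return path

    def depth(node):
        # Sum of branch lengths from the root down to `node`.
        return sum(length[n] for n in path_to_root(node) if n in parent)

    matrix = {}
    for a in tips:
        ancestors_a = set(path_to_root(a))
        for b in tips:
            # First node on b's path to the root that is also above (or equal to) a.
            lca = next(n for n in path_to_root(b) if n in ancestors_a)
            matrix[a, b] = depth(a) + depth(b) - 2 * depth(lca)
    return matrix
```

For the tree ((A:1,B:1)X:1,C:2)R, the call `distance_matrix({"A": "X", "B": "X", "X": "R", "C": "R"}, {"A": 1.0, "B": 1.0, "X": 1.0, "C": 2.0}, ["A", "B", "C"])` gives a distance of 2.0 between A and B and 4.0 between A and C. Repeating this for each of the m expanded trees produces the m matrices counted in the tables.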
Tree generation and distance matrix computation time (seconds) for SUNPLIN-C++ using different randomly generated phylogenies
| Species in input tree | Tree generation (s) | Distance matrix computation (s) |
|---|---|---|
| 100 | 0.25 | 1.98 |
| 250 | 0.32 | 6.00 |
| 500 | 0.44 | 14.45 |
| 750 | 0.56 | 28.12 |
| 1,000 | 0.67 | 44.80 |
Time in seconds for generating m = 1,000 expanded trees and calculating their distance matrices. The input trees contain |T| = (100, …, 1,000) species (randomly generated phylogenies) and each one is expanded with the insertion of |S| = 100 species.
Tree generation and distance matrix computation time (seconds) for SUNPLIN-C++ using a randomly generated phylogeny
| Expanded trees (m) | Tree generation (s) | Distance matrix computation (s) |
|---|---|---|
| 1 | 0.01 | 0.02 |
| 10 | 0.02 | 0.15 |
| 100 | 0.06 | 1.43 |
| 1,000 | 0.44 | 14.45 |
| 10,000 | 4.27 | 144.27 |
Time in seconds for generating a varying number m = (1, …, 10,000) of expanded trees and calculating their distance matrices. The input tree contains |T| = 500 species (randomly generated phylogeny) and is expanded with the insertion of |S| = 100 species.