Diep Thi Hoang, Le Sy Vinh, Tomáš Flouri, Alexandros Stamatakis, Arndt von Haeseler, Bui Quang Minh.
Abstract
BACKGROUND: The nonparametric bootstrap is widely used to measure the branch support of phylogenetic trees. However, bootstrapping is computationally expensive and remains a bottleneck in phylogenetic analyses. Recently, an ultrafast bootstrap approximation (UFBoot) approach was proposed for maximum likelihood analyses. However, such an approach is still missing for maximum parsimony.
Keywords: Maximum parsimony; Nonparametric bootstrap; Phylogenetic inference
Year: 2018 PMID: 29390973 PMCID: PMC5796505 DOI: 10.1186/s12862-018-1131-3
Source DB: PubMed Journal: BMC Evol Biol ISSN: 1471-2148 Impact factor: 3.260
Cumulative runtimes (hours) for the five tested methods for 114 TreeBASE MSAs
| Method | Uniform cost | Non-uniform cost |
|---|---|---|
| fast-TNT | | 1784 |
| MPBoot SPR3 | 36.2 | |
| MPBoot SPR6 | 70.1 | 491 |
| intensive-TNT | 72.5 | 2470 |
| PAUP* | 206.1 | NA |
Run times in bold-face highlight the respective fastest method under the given cost models
Fig. 1 Performance comparison in terms of runtimes and MP scores between MPBoot SPR3 and fast-TNT under uniform (a, b) and non-uniform cost matrices (c, d), for real DNA and amino-acid MSAs. Each dot in the main diagrams represents a single MSA. The y-axis displays the difference between the CPU times of the two programs. The x-axis displays the difference between the parsimony scores of the MP trees inferred by the two programs on the original MSA. The histograms at the top and the side present the marginal frequencies. Dots to the left of the vertical dashed line represent alignments where MPBoot found a better parsimony score. If a dot is below the horizontal dashed line, the bootstrap analysis by MPBoot was faster. Percentages in the quadrants of the histograms denote the fraction of alignments in that region. Percentages on the dashed line give the fraction of alignments where the two programs obtained equal MP scores
Fig. 2 Performance comparison in terms of runtimes and MP scores between MPBoot SPR6 and intensive-TNT under uniform (a, b) and non-uniform cost matrices (c, d), for real DNA and amino-acid MSAs. Each dot in the main diagrams represents a single MSA. The y-axis displays the difference between the CPU times of the two programs. The x-axis displays the difference between the parsimony scores of the MP trees inferred by the two programs on the original MSA. The histograms at the top and the side present the marginal frequencies. Dots to the left of the vertical dashed line represent alignments where MPBoot found a better parsimony score. If a dot is below the horizontal dashed line, the bootstrap analysis by MPBoot was faster. Percentages in the quadrants of the histograms denote the fraction of alignments in that region. Percentages on the dashed line give the fraction of alignments where the two programs obtained equal MP scores
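The quadrant percentages described in the captions of Fig. 1 and Fig. 2 can be illustrated with a minimal sketch. It assumes that both differences are computed as MPBoot minus the other program, so that a negative score difference means a better (lower) parsimony score for MPBoot and a negative time difference means MPBoot was faster. The function name `quadrant_fractions` and the toy values are hypothetical and are not taken from the paper.

```python
from collections import Counter


def quadrant_fractions(pairs):
    """Return the fraction of MSAs falling in each region of the scatter plot.

    pairs: list of (score_diff, time_diff) tuples, one per MSA.
    Assumption: each difference is MPBoot minus the other program.
    """
    counts = Counter()
    for score_diff, time_diff in pairs:
        if score_diff < 0:
            score_label = "better score"   # left of the vertical dashed line
        elif score_diff > 0:
            score_label = "worse score"    # right of the vertical dashed line
        else:
            score_label = "equal score"    # on the vertical dashed line
        # Below the horizontal dashed line means MPBoot was faster.
        time_label = "faster" if time_diff < 0 else "not faster"
        counts[(score_label, time_label)] += 1
    total = len(pairs)
    return {region: n / total for region, n in counts.items()}


# Toy data only: four hypothetical MSAs.
toy_pairs = [(-2, -1.0), (0, -0.5), (3, 2.0), (-1, 4.0)]
fractions = quadrant_fractions(toy_pairs)
for region, fraction in sorted(fractions.items()):
    print(region, fraction)
```

Keeping the ties in their own "equal score" category mirrors the caption, which reports them separately on the dashed line rather than assigning them to a quadrant.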
Fig. 3 Performance of tested methods in the inference of MP trees for the original MSAs. The bar plots show the frequencies with which each of the five tested methods produced the best MP score for the original MSAs in the (a) simulated PANDIT and (b) TreeBASE data sets. Note that the best MP score for a given MSA can be found by more than one method; therefore, the sum of frequencies for a data set may be greater than one. Data for PAUP* under the non-uniform cost matrix are not available due to excessive execution times
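The note in the Fig. 3 caption, that the frequencies can sum to more than one, follows from how ties are counted. The sketch below is a toy illustration only: the function name `best_score_frequencies`, the method labels, and the scores are hypothetical, and it assumes that a lower MP score is better and that every method tying for the best score on an MSA is credited.

```python
def best_score_frequencies(scores_per_msa):
    """Return, per method, the fraction of MSAs on which it reached the best score.

    scores_per_msa: list of dicts mapping method name -> MP score.
    Assumption: lower scores are better, and ties all count as "best".
    """
    methods = sorted({m for scores in scores_per_msa for m in scores})
    wins = {m: 0 for m in methods}
    for scores in scores_per_msa:
        best = min(scores.values())
        for method, score in scores.items():
            if score == best:
                wins[method] += 1
    total = len(scores_per_msa)
    return {m: wins[m] / total for m in methods}


# Toy data only: two hypothetical methods on four hypothetical MSAs.
toy_scores = [
    {"method_a": 10, "method_b": 10},  # tie: both credited
    {"method_a": 12, "method_b": 11},  # method_b alone
    {"method_a": 7, "method_b": 7},    # tie: both credited
    {"method_a": 5, "method_b": 6},    # method_a alone
]
freqs = best_score_frequencies(toy_scores)
print(freqs, sum(freqs.values()))
```

In this toy case each method is best on three of four MSAs, so the two frequencies add up to 1.5 because the two tied MSAs are counted once for each method.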
Fig. 4 Accuracy of bootstrap supports on simulated PANDIT DNA and protein MSAs for MPBoot SPR3 (green curves), MPBoot SPR6 (blue curves), fast-TNT (red curves), intensive-TNT (yellow curves), and PAUP* (black curves) under uniform cost matrices (a, b) and non-uniform cost matrices (c, d). The bin size on the x-axis is 1%