Fernando Antoneli, Fernando M Passos, Luciano R Lopes, Marcelo R S Briones.
Abstract
Divergence date estimates are central to understanding evolutionary processes and, in the case of molecular phylogenies, depend on tests of molecular clocks. Here we propose two non-parametric tests of strict and relaxed molecular clocks, built upon a framework that uses the empirical cumulative distribution (ECD) of branch lengths obtained from an ensemble of Bayesian trees together with the well-known non-parametric one-sample and two-sample Kolmogorov-Smirnov (KS) goodness-of-fit tests. In the strict clock case, the method uses the one-sample KS test to test directly whether the phylogeny is clock-like, in other words, whether it follows a Poisson law. The ECD is computed from the discretized branch lengths, and the parameter λ of the expected Poisson distribution is calculated as the average branch length over the ensemble of trees. To compensate for auto-correlation in the ensemble of trees and for pseudo-replication, we take advantage of thinning and effective sample size, two features provided by Bayesian inference MCMC samplers. Finally, we observe that tree topologies with very long or very short branches lead to Poisson mixtures; in this case we propose the two-sample KS test with samples from two continuous branch length distributions, one obtained from an ensemble of clock-constrained trees and the other from an ensemble of unconstrained trees. In this second form the test can also be applied to relaxed clock models. Using a statistically equivalent ensemble of phylogenies to obtain the branch length ECD, instead of one consensus tree, considerably reduces the effects of small sample size and provides a gain of power.
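The two tests described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `poisson_ks_statistic`, the synthetic samples, and the choice to treat discretized branch lengths as plain integer counts are all assumptions made for illustration, and the sketch ignores the thinning and effective-sample-size corrections. It computes the one-sample KS distance between the ECD of integer branch lengths and a Poisson law whose λ is the sample mean, then runs SciPy's two-sample KS test on two continuous samples.

```python
import numpy as np
from scipy import stats


def poisson_ks_statistic(counts):
    """Return (lambda_hat, D) for integer-valued branch lengths.

    lambda_hat is the mean branch length; D is the largest gap between the
    empirical CDF and the Poisson(lambda_hat) CDF on the integers 0..max.
    """
    counts = np.sort(np.asarray(counts, dtype=int))
    lam = counts.mean()
    support = np.arange(0, counts[-1] + 1)
    # Empirical CDF evaluated at each integer of the support.
    ecdf = np.searchsorted(counts, support, side="right") / counts.size
    d = np.max(np.abs(ecdf - stats.poisson.cdf(support, lam)))
    return float(lam), float(d)


rng = np.random.default_rng(0)

# Synthetic stand-ins for discretized branch lengths (NOT data from the paper).
clock_like = rng.poisson(6.0, size=500)
mixture = np.concatenate([rng.poisson(3.0, 250), rng.poisson(25.0, 250)])

lam_clock, d_clock = poisson_ks_statistic(clock_like)
lam_mix, d_mix = poisson_ks_statistic(mixture)

# Two-sample form: compare two continuous branch-length samples directly.
constrained = rng.gamma(shape=2.0, scale=0.05, size=400)    # hypothetical
unconstrained = rng.gamma(shape=2.0, scale=0.08, size=600)  # hypothetical
two_sample = stats.ks_2samp(constrained, unconstrained)

print(f"clock-like: lambda={lam_clock:.2f}, D={d_clock:.3f}")
print(f"mixture:    lambda={lam_mix:.2f}, D={d_mix:.3f}")
print(f"two-sample: D={two_sample.statistic:.3f}, p={two_sample.pvalue:.3g}")
```

In this toy setting the Poisson mixture yields a much larger distance than the single-Poisson sample, which mirrors the abstract's remark that trees with very long or very short branches depart from a single Poisson law. Because λ is estimated from the same data and the distribution is discrete, standard KS critical values do not apply directly to the one-sample distance.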
Year: 2018 PMID: 29300759 PMCID: PMC5754089 DOI: 10.1371/journal.pone.0190826
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
PKS test for strict clock on the simulated data (9 taxa) (1,000 trees, k = 0.50).
| | Full Tree (K2P) | Ingroup (K2P) | Outgroup (K2P) |
|---|---|---|---|
| Mean Branch Length (λ) | 7.01 | 5.75 | 25.18 |
| KS statistic | 0.15 | 0.01 | 0.06 |
| Trees (branches per tree) | 33 (15) | 49 (14) | 182 (1) |
| Sample size | 247 | 343 | 91 |
| Critical Value (1%) | 0.05 | 0.04 | 0.10 |
| p-value | < 0.00001 | 0.98 | 0.34 |
| Power estimate | > 0.99 | > 0.99 | > 0.99 |
PKS test for strict clock on Sanson et al. [36] data (16 taxa) (1,000 trees, k = 0.50).
| | Ensemble of Trees (GTR) | Real Tree (Sanson et al. 2002) |
|---|---|---|
| Mean Branch Length (λ) | 5.76 | 5.36 |
| KS statistic | 0.07 | 0.07 |
| Trees (branches per tree) | 31 (30) | 1 (30) |
| Sample size | 465 | 30 |
| Critical Value (1%) | 0.04 | 0.16 |
| p-value | < 0.00001 | 0.65 |
| Power estimate | > 0.83 | > 0.83 |
PKS test for strict clock on three data-sets of real sequences ENV, COX1 and 18S rDNA (1,000 trees, k = 0.50).
| | Lentiviruses | Primates | Yeasts 18S rDNA |
|---|---|---|---|
| Mean Branch Length (λ) | 752.3 | 186.2 | 26.5 |
| KS statistic | 0.53 | 0.67 | 0.63 |
| Trees (branches per tree) | 224 (13) | 17 (15) | 89 (31) |
| Sample size | 1,458 | 127 | 1,382 |
| Critical Value (1%) | 0.02 | 0.09 | 0.02 |
| p-value | < 0.00001 | < 0.00001 | < 0.00001 |
Two-sample KS test for strict clock on three data-sets (ENV and COX1 with 9 taxa and 18S rDNA with 17 taxa) of real sequences (1,000 trees, k = 0.192, k = 0.116).
| | Lentiviruses | Primates | Yeasts 18S rDNA |
|---|---|---|---|
| Trees (branches per tree) | 349 (15) | 940 (15) | 211 (31) |
| Sample size, first ensemble | 670 | 2,707 | 1,255 |
| Sample size, second ensemble | 404 | 1,635 | 758 |
| KS statistic | 0.21 | 0.11 | 0.11 |
| Critical Value (1%) | 0.10 | 0.05 | 0.07 |
| p-value | < 0.00001 | < 0.00001 | < 0.00001 |
Two-sample KS test for relaxed clock on three data-sets (ENV and COX1 with 9 taxa and 18S rDNA with 17 taxa) of real sequences (1,000 trees, k = 0.192, k = 0.099).
| | Lentiviruses | Primates | Yeasts 18S rDNA |
|---|---|---|---|
| Trees (branches per tree) | 349 (15) | 940 (15) | 211 (31) |
| Sample size, first ensemble | 670 | 2,707 | 1,255 |
| Sample size, second ensemble | 354 | 1,395 | 647 |
| KS statistic | 0.043 | 0.005 | 0.007 |
| Critical Value (1%) | 0.107 | 0.050 | 0.070 |
| p-value | 0.773 | 0.999 | 0.999 |
PKS test for strict clock on Padovan et al. [37] data (134 taxa) (15,000 trees, k = 0.04).
| | Non-clock ensemble |
|---|---|
| Mean Branch Length (λ) | 15.0 |
| KS statistic | 0.47 |
| Trees (branches per tree) | 377 (264) |
| Sample size | 4,366 |
| Critical Value (1%) | 0.01 |
| p-value | < 0.00001 |
Two-sample KS tests for strict and relaxed clock on Padovan et al. [37] data (134 taxa) (15,000 trees, k = 0.04, k = 0.19 (strict clock), k = 0.05 (relaxed clock)).
| | Non-clock x Strict | Non-clock x Relaxed | Strict x Relaxed |
|---|---|---|---|
| Trees (branches per tree) | 274 (267) | 274 (267) | 274 (267) |
| Sample size, first ensemble | 2,926 | 2,926 | 13,900 |
| Sample size, second ensemble | 13,900 | 3,657 | 3,657 |
| KS statistic | 0.078 | 0.059 | 0.044 |
| Critical Value (1%) | 0.033 | 0.040 | 0.030 |
| p-value | < 0.000001 | 0.000019 | 0.000020 |
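As a worked check, the three 1% critical values in the table above can be reproduced from the two sample sizes in each column. This assumes the standard large-sample approximation for the two-sample KS test, c(α)·sqrt((n + m)/(n·m)) with c(α) = sqrt(−ln(α/2)/2); the record does not state how the critical values were obtained, so this is an assumption, and the function name `ks_two_sample_critical` is hypothetical.

```python
import math


def ks_two_sample_critical(n, m, alpha=0.01):
    """Asymptotic two-sample KS critical value for sample sizes n and m."""
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2.0))  # about 1.628 at alpha = 0.01
    return c_alpha * math.sqrt((n + m) / (n * m))


# Sample-size pairs read from the three columns of the table above.
pairs = {
    "Non-clock x Strict": (2926, 13900),
    "Non-clock x Relaxed": (2926, 3657),
    "Strict x Relaxed": (13900, 3657),
}

critical = {name: ks_two_sample_critical(n, m) for name, (n, m) in pairs.items()}

for name, value in critical.items():
    print(f"{name}: {value:.3f}")
```

Under this assumption the computed values round to 0.033, 0.040 and 0.030, matching the Critical Value (1%) row.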
Likelihood ratio tests (1% significance level).
| | Clock (ln L) | Non-clock (ln L) | 2Δ ln L | χ² critical value (df) |
|---|---|---|---|---|
| Simulated | −1,476.80 | −1,477.02 | 0.44 | 18.47 (8) |
| Sanson | −4,312.59 | −4,321.71 | 18.24 | 29.14 (14) |
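The Sanson row can be checked with a short sketch. It assumes that the test statistic is twice the absolute difference of the two log-likelihoods and that the number in parentheses is the degrees of freedom of the reference chi-square distribution; both readings are inferred from the table values rather than stated in this record.

```python
from scipy import stats

# Log-likelihoods from the Sanson row of the table above.
lnl_clock = -4312.59
lnl_nonclock = -4321.71
df = 14        # assumed: the value shown in parentheses
alpha = 0.01   # 1% significance level, as in the table caption

# Likelihood ratio statistic: twice the absolute log-likelihood difference.
lr_stat = 2.0 * abs(lnl_clock - lnl_nonclock)

# Upper 1% point of the chi-square distribution with df degrees of freedom.
critical_value = stats.chi2.ppf(1.0 - alpha, df)

reject = lr_stat > critical_value
print(f"2*Delta = {lr_stat:.2f}, critical = {critical_value:.2f}, reject = {reject}")
```

With these inputs the statistic is 18.24 and the critical value 29.14, so the statistic stays below the 1% threshold for this data set.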