K Jun Tong, David A Duchêne, Sebastián Duchêne, Jemma L Geoghegan, Simon Y W Ho.
Abstract
BACKGROUND: Phylogenetic analysis of DNA from modern and ancient samples allows the reconstruction of important demographic and evolutionary processes. A critical component of these analyses is the estimation of evolutionary rates, which can be calibrated using information about the ages of the samples. However, the reliability of these rate estimates can be negatively affected by among-lineage rate variation and non-random sampling. Using a simulation study, we compared the performance of three phylogenetic methods for inferring evolutionary rates from time-structured data sets: regression of root-to-tip distances, least-squares dating, and Bayesian inference. We also applied these three methods to time-structured mitogenomic data sets from six vertebrate species.
Keywords: Ancient DNA; Bayesian phylogenetics; Least-squares dating; Mitogenomes; Substitution rate; Tip dating
Year: 2018 PMID: 29769015 PMCID: PMC5956955 DOI: 10.1186/s12862-018-1192-3
Source DB: PubMed Journal: BMC Evol Biol ISSN: 1471-2148 Impact factor: 3.260
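The first method named in the abstract, regression of root-to-tip distances, can be illustrated with a minimal sketch. It assumes that root-to-tip distances have already been measured on a rooted tree and that sampling times increase toward the present; the function name and the toy numbers are hypothetical and are not taken from the study or from TempEst.

```python
def root_to_tip_regression(sample_times, distances):
    """Least-squares fit of root-to-tip distance against sampling time.

    sample_times: sampling times increasing toward the present
                  (e.g. the negative of each age in years before present).
    distances:    root-to-tip distances in substitutions per site.
    Returns (slope, intercept); the slope is the rate estimate.
    """
    n = len(sample_times)
    if n != len(distances) or n < 2:
        raise ValueError("need two or more paired observations")
    mean_t = sum(sample_times) / n
    mean_d = sum(distances) / n
    sxx = sum((t - mean_t) ** 2 for t in sample_times)
    if sxx == 0:
        raise ValueError("all samples have the same age; no temporal signal")
    sxy = sum((t - mean_t) * (d - mean_d)
              for t, d in zip(sample_times, distances))
    slope = sxy / sxx
    return slope, mean_d - slope * mean_t


# Hypothetical toy data: ages in years before present and invented distances.
ages = [0, 10_000, 20_000, 40_000]
times = [-a for a in ages]          # older samples get smaller times
dists = [0.0500, 0.0490, 0.0480, 0.0460]
rate, intercept = root_to_tip_regression(times, dists)
```

A slope of zero or below means the regression gives no positive rate estimate, so a caller would normally check the sign of `rate` before interpreting it.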
Table 1. Six time-structured mitogenomic data sets analysed in this study

| Species | Scientific name | Tips (modern + ancient) | Length (nt) | Age range (years before present) | Outgroup^a | Main sources^b |
|---|---|---|---|---|---|---|
| Adélie penguin | | 13 + 7 | 14,198 | 0–44,000 | | [ |
| Brown/polar bear | | 31 + 1 | 14,609 | 0–122,500 | | [ |
| Dog | | 120 + 18 | 14,596 | 0–36,000 | | [ |
| Horse | | 147 + 20 | 14,910 | 0–42,577 | | [ |
| Modern human | | 200 + 37 | 14,893 | 0–7,134 | | [ |
| Woolly mammoth | | 0 + 65 | 14,951 | 12,210–46,455 | | [ |

^a GenBank accession numbers for outgroup sequences are given in the sequence data files
^b Main publications from which the sequence data were obtained
Fig. 1. Standardized error in estimates of substitution rates from sequence data produced under 12 different simulation conditions, representing different combinations of substitution rate, rate variation among lineages, and phylo-temporal clustering. Data were analysed using (a) regression of root-to-tip distances in TempEst, (b) least-squares dating in LSD, and (c) Bayesian inference in BEAST. Numbers along the top of each panel indicate the proportion of estimates that are above the rate used for simulation (the 'true' rate). Asterisks indicate cases in which one-sample Wilcoxon tests of errors in rate estimates show a significant difference from zero (see Additional file 2: Table S1).
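The two summaries reported in the Fig. 1 caption can be sketched as follows. The caption does not define the standardization, so this sketch assumes that standardized error means (estimate − true rate) / true rate; that definition, the function names, and the toy estimates are assumptions for illustration only. The Wilcoxon test is not reproduced here.

```python
def standardized_errors(estimates, true_rate):
    """Assumed definition: signed error as a fraction of the simulated rate."""
    return [(est - true_rate) / true_rate for est in estimates]


def proportion_above(estimates, true_rate):
    """Fraction of estimates that exceed the rate used for simulation."""
    return sum(est > true_rate for est in estimates) / len(estimates)


# Hypothetical rate estimates from four simulated replicates.
true_rate = 1e-7
estimates = [0.5e-7, 1.5e-7, 2.0e-7, 1.0e-7]
errors = standardized_errors(estimates, true_rate)
share_above = proportion_above(estimates, true_rate)
```

Under this definition an unbiased method gives errors centred on zero and a proportion near 0.5, which is the pattern the asterisks in the figure test against.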
Fig. 2. (a) Uncertainty in Bayesian estimates of substitution rates across 12 simulation conditions, as measured by the width of the 95% credibility interval of the estimate divided by the rate used for simulation. One hundred data sets were produced by simulation under distinct evolutionary conditions and analysed using BEAST. (b) Relationship between phylogenetic stemminess (proportion of overall tree length represented by internal branches) and the error in the Bayesian median rate estimates.
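Both quantities defined in the Fig. 2 caption are simple ratios, and a minimal sketch may make them concrete. The tree is represented here as a flat list of (branch length, is-internal) pairs, which is an assumption made to keep the example self-contained; the function names and numbers are hypothetical.

```python
def relative_interval_width(lower, upper, true_rate):
    """Width of the 95% credibility interval divided by the simulated rate."""
    return (upper - lower) / true_rate


def stemminess(branches):
    """Proportion of total tree length carried by internal branches.

    branches: iterable of (length, is_internal) pairs, one per branch.
    """
    branches = list(branches)
    total = sum(length for length, _ in branches)
    if total == 0:
        raise ValueError("tree has zero total length")
    internal = sum(length for length, is_internal in branches if is_internal)
    return internal / total


# Hypothetical four-branch tree: one internal branch, three terminal ones.
toy_tree = [(1.0, True), (1.0, False), (1.0, False), (1.0, False)]
toy_stemminess = stemminess(toy_tree)
toy_width = relative_interval_width(0.5, 1.5, 2.0)
```

A star-like tree, with almost all of its length in terminal branches, has stemminess close to zero.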
Results from analyses of six time-structured mitogenomic data sets

| Species | Clock model^a | Tree prior^a | Phylo-temporal clustering^b (P-value) | Date-randomization test^c: CR1 | Date-randomization test^c: CR2 |
|---|---|---|---|---|---|
| Adélie penguin | Strict | Constant size | 0.079 | Fail | Fail |
| Brown/polar bear | Strict | Constant size | 0.168 | Pass | Pass |
| Dog | Relaxed | Constant size | 0.006 | Pass | Pass |
| Horse | Strict | Constant size | < 0.001 | Pass | Pass |
| Modern human | Strict | Skyride | 0.166 | Pass | Pass |
| Woolly mammoth | Relaxed | Constant size | 0.075 | Pass | Pass |

^a Clock models and tree priors were compared using marginal likelihoods estimated by stepping-stone sampling. Marginal likelihoods are given in Additional file 2: Table S5
^b P-values below 0.05 indicate that sequences with similar ages tend to be clustered together in the phylogenetic tree
^c We considered two criteria that have been proposed for the date-randomization test, CR1 and CR2 [27]. These criteria are described in the Methods. Our results are based on 20 date-randomized replicates
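The Methods that define CR1 and CR2 are not part of this record, so the following sketch rests on an assumed reading of the two criteria: CR1 passes when the rate estimate from the real sampling dates falls outside the 95% interval of every date-randomized replicate, and CR2 passes when the 95% interval from the real dates overlaps none of the replicate intervals. The function names and numbers are hypothetical.

```python
def passes_cr1(real_estimate, randomized_intervals):
    """Assumed CR1: the real point estimate lies outside every replicate interval."""
    return all(not (low <= real_estimate <= high)
               for low, high in randomized_intervals)


def passes_cr2(real_interval, randomized_intervals):
    """Assumed CR2: the real interval overlaps no replicate interval."""
    real_low, real_high = real_interval
    return all(high < real_low or low > real_high
               for low, high in randomized_intervals)


# Hypothetical 95% intervals from two date-randomized replicates
# (the study itself used 20 replicates per data set).
randomized = [(1.0, 2.0), (1.5, 2.5)]
cr1 = passes_cr1(3.0, randomized)                   # 3.0 is outside both
cr2_overlap = passes_cr2((2.4, 3.5), randomized)    # overlaps (1.5, 2.5)
cr2_clear = passes_cr2((2.6, 3.5), randomized)      # clear of both
```

Under this reading CR2 is the stricter criterion, since an interval can overlap a replicate interval even when the point estimate lies outside it.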
Fig. 3. Estimates of substitution rates from time-structured mitogenomic data sets from six vertebrate species. Data were analysed using Bayesian inference in BEAST, least-squares dating in LSD, and regression of root-to-tip distances in TempEst. Bayesian estimates are indicated by their median and 95% credibility intervals. Regression of root-to-tip distances failed to yield a positive rate estimate from the woolly mammoth mitogenomes, so no rate estimate is shown for TempEst. Details of the six data sets are given in Table 1.