
RelTime Relaxes the Strict Molecular Clock throughout the Phylogeny.

Fabia U Battistuzzi, Qiqing Tao, Lance Jones, Koichiro Tamura, Sudhir Kumar

Abstract

The RelTime method estimates divergence times when evolutionary rates vary among lineages. Theoretical analyses show that RelTime relaxes the strict molecular clock throughout a molecular phylogeny, and it performs well in the analyses of empirical and computer simulated data sets in which evolutionary rates are variable. Lozano-Fernandez et al. (2017) found that the application of RelTime to one metazoan data set (Erwin et al. 2011) produced equal rates for several ancient lineages, which led them to speculate that RelTime imposes a strict molecular clock for deep animal divergences. RelTime does not impose a strict molecular clock. The pattern observed by Lozano-Fernandez et al. (2017) was a result of the use of an option to assign the same rate to lineages in RelTime when the rates are not statistically significantly different. The median rate difference was 5% for many deep metazoan lineages for the Erwin et al. (2011) data set, so the rate equality was not rejected. In fact, RelTime analyses with and without the option to test rate differences produced very similar time estimates. We also found that the Bayesian time estimates vary widely depending on the root priors assigned, and that the use of less restrictive priors produces Bayesian divergence times that are concordant with those from RelTime for the Erwin et al. (2011) data set. Therefore, it is prudent to discuss Bayesian estimates obtained under a range of priors in any discourse about molecular dating, including method comparisons.

Year:  2018        PMID: 29878203      PMCID: PMC6022624          DOI: 10.1093/gbe/evy118

Source DB:  PubMed          Journal:  Genome Biol Evol        ISSN: 1759-6653            Impact factor:   3.416


Introduction

RelTime was developed to estimate timetrees from molecular sequence data when evolutionary rates vary among lineages (Tamura et al. 2012, 2018). It has been shown to be accurate in the analysis of computer simulated data generated with extensive rate heterogeneity throughout the tree (Tamura et al. 2012, 2018; Filipski et al. 2014). In analyses of many large empirical data sets, RelTime produced divergence times similar to those reported from Bayesian methods, when equivalent priors and calibrations were used (Mello et al. 2017). In addition, theoretical analyses clearly established that a relative rate framework, which does not assume a strict molecular clock, forms the mathematical foundation of the RelTime method (Tamura et al. 2018). These theoretical and empirical findings are inconsistent with the Lozano-Fernandez et al. (2017) report, which concluded that RelTime functionally maintained a strict molecular clock for animal divergences in an analysis of one data set containing 117 species and 2,049 amino acids (Erwin et al. 2011). They surmised that this pattern is the cause of the curvilinear relationship between Bayesian and RelTime node age estimates observed by Battistuzzi et al. (2015). Lozano-Fernandez et al. (2017) also reported that the standard errors (SE) of RelTime estimates increase linearly with deeper node ages, unlike those from Bayesian approaches, and speculated that this pattern might explain the difference in animal divergence times between Erwin et al. (2011) and Battistuzzi et al. (2015). Here, we present results from a reanalysis of the Erwin et al. (2011) data to evaluate the concerns of Lozano-Fernandez et al. (2017) (fig. 1). We estimated node ages and evolutionary rates using RelTime in MEGA (Tamura et al. 2013; Kumar et al. 2016) and Bayesian methods in Phylobayes (Pb) (Lartillot et al. 2009). These programs were selected because they were used in the prior studies discussed here (Erwin et al. 2011; Battistuzzi et al. 2015; Lozano-Fernandez et al. 2017).
Fig. 1. Comparisons of rates, dates, and standard errors from Bayesian and RelTime analyses. (a) RelTime estimates of node ages calculated with (“many clocks”) and without (“all clocks”) the rate merging option. The linear slope and R² value are shown. Dotted gray line represents 1:1 relationship. (b) Normalized RelTime relative rates for nodes at different time depths, with rates greater than the average, >1.0, showing acceleration and those <1.0 showing a slow-down (blue and yellow backgrounds, respectively). Relative node ages were normalized to the age of ingroup root. (c) Relationship of Phylobayes node estimates without root calibration and with root age constraint at 1. Node ages were normalized to the age of the root. Solid line shows the polynomial fit and dotted gray line represents 1:1 relationship. (d) Relationship of RelTime and Phylobayes node ages obtained without root calibration. All node ages were normalized to the sum of ingroup node ages. The linear slope and R² value are shown. (e) Relationship of RelTime estimates with Phylobayes with and without specified root calibration, and normalized to either Monosiga (Choanoflagellate) or Metazoa. Solid lines show polynomial fit for each comparison and dotted gray line represents 1:1 relationship. The R² values for the polynomial fit are all >0.94. (f) Standard errors (SEs) of node ages produced by RelTime and Phylobayes under different calibration constraints. Black circles: RelTime estimates of SEs of node ages when the ingroup root node is constrained at 1. Red circles: Phylobayes estimates of SEs of node ages without the root calibration; Phylobayes estimates were divided by 1,000 for direct comparisons because root calibration is automatically set to 1,000 when no root calibration is specified.

Materials and Methods

The Metazoan data set consists of 117 species and 2,049 aligned amino acids (Erwin et al. 2011). All analyses were conducted with RelTime (Tamura et al. 2012, 2018) and Phylobayes v. 4.1f (Lartillot et al. 2009). Phylobayes was selected because it was the Bayesian dating software used in all the previous studies discussed here (Erwin et al. 2011; Battistuzzi et al. 2015; Lozano-Fernandez et al. 2017). We also conducted RelTime analysis on another data set consisting of 274 species and 7,370 sites (dos Reis et al. 2012). For the RelTime analyses, no calibration times were used. All RelTime analyses were conducted with MEGA 6 (Tamura et al. 2013), the software used by Battistuzzi et al. (2015), with the exception of the SE calculation. The SE calculation implemented in MEGA 6 does not consider calibrations. Thus, to make the direct comparison with SE estimates in Pb analysis with a root constraint at 1 (fig. 1), we conducted the RelTime analysis in MEGA 7 (Kumar et al. 2016), which can accommodate user-specified calibration constraints. RelTime relative rates were normalized to their mean rate estimate over the whole tree for comparative purposes (fig. 1). Note that RelTime produces relative lineage rates and these should not be directly compared with branch rates produced by Phylobayes. In the comparison of relative node ages obtained from RelTime with “many clocks” and “all clocks” options, we also conducted a t-test to examine whether the node ages from these two options are significantly different from each other; that is, the null hypothesis of regression slope equal to 1 (P > 0.2). Two sets of parameters were used in Phylobayes analyses: one without any root calibration and another one calibrating the root node to 1 (as done in Lozano-Fernandez et al. 2017). Phylobayes automatically scales node ages to 1,000 when no root calibration is specified. We selected a birth–death speciation model with default parameters in these analyses. 
The final options used in this study were -cat -gtr -cir -bd with the default hyperprior (10⁻³). All analyses in Phylobayes were run for at least 20,000 generations. Although full convergence is expected to take many months, time estimates from these truncated analyses appear reliable, because relative time estimates remained stable at 2,500, 6,000, 8,500, 12,500, and 20,000 generations. This convergence approach produced times identical to those obtained by Lozano-Fernandez et al. (2017) under the same analysis conditions and was validated using Tracecomp (Lozano-Fernandez et al. 2017).
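
The slope test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the regression is forced through the origin (as in fig. 2; the text does not say whether the same was done for every comparison), the node ages below are hypothetical values rather than results from either data set, and the function names are ours, not part of MEGA or Phylobayes.

```python
import math

def slope_through_origin(x, y):
    """Least-squares slope b for y = b * x (no intercept) and its standard error."""
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    b = sxy / sxx
    n = len(x)
    # Residual sum of squares; one parameter is estimated, so df = n - 1.
    rss = sum((yi - b * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 1) / sxx)
    return b, se

def t_stat_slope_equals_one(x, y):
    """t statistic for the null hypothesis that the slope equals 1."""
    b, se = slope_through_origin(x, y)
    return (b - 1.0) / se

# Hypothetical relative node ages (NOT taken from the paper).
ages_all_clocks = [0.10, 0.25, 0.40, 0.60, 0.80, 1.00]
ages_many_clocks = [0.11, 0.24, 0.41, 0.58, 0.81, 0.99]

b, se = slope_through_origin(ages_all_clocks, ages_many_clocks)
t = t_stat_slope_equals_one(ages_all_clocks, ages_many_clocks)
print(f"slope = {b:.3f}, SE = {se:.4f}, t = {t:.2f} on {len(ages_all_clocks) - 1} df")
```

The t statistic would then be compared with a t distribution on n - 1 degrees of freedom; a small absolute value means that a slope of 1 is not rejected.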

Results

RelTime Relaxes the Strict Molecular Clock in Shallow as Well as Deep Nodes

Lozano-Fernandez et al. (2017) stated that RelTime does not relax the strict molecular clock in the deep branches of a metazoan phylogeny, because the (relative) rates reported by RelTime were equal to 1 for many deep lineages; RelTime rates are all relative to the rate of the ingroup root node, which is assigned a value of 1 for ease of reference. In RelTime, any deep or shallow lineage may receive the same rate as its ancestral lineage when one chooses the “many clocks” option in MEGA6. Under this option, evolutionary rates of ancestral and descendant lineage pairs are compared and merged if their equality cannot be rejected statistically (Tamura et al. 2012). This is indeed the case for the Erwin et al. data (see fig. 1c in Lozano-Fernandez et al. 2017), where lineages near the root showed very similar evolutionary rates. However, this pattern is not unique to deep nodes. A few other lineages in shallower parts of the phylogeny also showed identical rates (e.g., six other lineages have the same relative rate of 1.86). A RelTime analysis of another large data set (274 species, 7,370 sites; dos Reis et al. 2012), also using the “many clocks” option, confirms that rate identity is not determined by the depth of the node, but is rather the outcome of statistical tests carried out on all the nodes (fig. 2). In this data set, lineages emanating from the ingroup root showed different rates, and many intermediate and shallow lineages showed similar rates (fig. 2). These rates were not statistically different, so they were assigned the same value. Importantly, in both of these data sets, the estimates of node ages showed an excellent linear relationship with and without the “many clocks” option (slope = 0.99 and 0.99 in figs. 1 and 2, respectively). A t-test did not reject the null hypothesis of equality of node ages (i.e., linear regression slope = 1) from RelTime analysis with and without the “many clocks” option in MEGA6 (P value > 0.2).
These results contradict Lozano-Fernandez et al.’s conclusion that RelTime imposed a strict molecular clock on deep divergences of the metazoan data set, because the equality of rates they observed is the consequence of the lack of significant rate differences. Thus, RelTime does not impose a strict clock in deep divergences.
Fig. 2. RelTime estimates of rates and relative node ages for a data set of 274 species (dos Reis et al. 2012). (a) Rate estimates in relation to node ages obtained using “many clocks” option in MEGA6. (b) Comparison of node ages obtained with and without “many clocks” option. Regression slope through the origin and R² values are shown.

Lozano-Fernandez et al. (2017) also commented on the rates produced by Bayesian and RelTime methods (see fig. 1c and 1d in Lozano-Fernandez et al. 2017). These rates should not be compared, because Bayesian analyses produce evolutionary rates for individual branches in the phylogeny, whereas RelTime produces evolutionary rates for lineages (Tamura et al. 2018). A lineage rate is a function of all the branch rates in the subtree originating at the node of interest. Therefore, a lineage rate at any node in the tree is not expected to be equal to the evolutionary rate of a specific branch directly connected to that node. In the case of Erwin et al.’s data, rates for many deep lineages were very similar, which may happen when branch rates increase and decrease over time within lineages. Their averages turn out to be very similar for some data sets (Erwin et al. 2011), but not others (dos Reis et al. 2012). We found that for nodes that were assigned the same rate of 1 in deep divergences, the median difference between ancestral and descendant rates was 5%. In retrospect, a detailed discussion of the similarity of time estimates obtained with and without the “many clocks” option in Battistuzzi et al. (2015) might have prevented the perception that RelTime is unable to relax the molecular clock for deep evolutionary timescales in the metazoan data set (Lozano-Fernandez et al. 2017). The distribution of lineage rates obtained without using the “many clocks” option includes both higher and lower lineage rates throughout the tree (fig. 1). Therefore, RelTime relaxes the strict molecular clock in deep as well as shallow lineages.
Because RelTime node ages estimated with and without “many clocks” were very similar (fig. 1), the relationship observed by Battistuzzi et al. (2015) between RelTime and Bayesian node ages cannot be caused by a lack of molecular clock relaxation. We, therefore, explored the possibility that the contrasting relationships between Bayesian and RelTime results observed by Battistuzzi et al. (2015) and Lozano-Fernandez et al. (2017) may be caused by the selection of priors in Bayesian analyses.

Bayesian Estimates with Minimal Sets of Priors are Not Consistent with Each Other

Lozano-Fernandez et al. (2017) reported that RelTime estimates are “not proportional” to the Bayesian estimates obtained when they assigned an arbitrary root age calibration of 1 in their Phylobayes analysis with no clock calibrations. This analysis was meant to compare Bayesian and RelTime results using the minimum set of priors, because RelTime does not require the specification of any priors or calibrations and produces relative times. Since the root age calibration of 1 is an arbitrary choice, we compared Bayesian estimates under an alternate minimum set of priors where no root age calibration was used (allowing it to default to 1,000 in Pb; see Materials and Methods for details). The resulting relative node ages showed a curvilinear relationship between the two sets of Bayesian estimates, with a 30% overall difference (fig. 1). However, a Bayesian analysis using no root calibration and a default birth–death prior produced time estimates that had a nearly linear relationship (slope = 0.97) with those from RelTime (Battistuzzi et al. 2015; fig. 1). The similarity of results obtained from RelTime and Bayesian methods under a specific set of priors (i.e., birth–death and no root calibration) highlights the importance of discussing Bayesian estimates obtained under a range of priors in any discourse about molecular dating, including method comparisons, because Bayesian estimates can vary substantially depending on the prior choices (Hug and Roger 2007; Inoue et al. 2010; Parham et al. 2012; Warnock et al. 2012, 2015, 2017; dos Reis et al. 2016).

Deep Nodes Are Strongly Affected by Prior Choices

As RelTime produces only relative time estimates, Bayesian time estimates must be scaled to a reference node for comparison with RelTime estimates. During our investigation above, we found that selection of the reference node used to scale times also influences the relationship observed between RelTime and Bayesian estimates. For example, the curvilinear trend obtained by using the age of the choanoflagellate Monosiga to normalize all ages for comparison (fig. 1, red line) is very similar to that reported by Lozano-Fernandez et al. (2017) using the root node, but this trend becomes much less pronounced when Bayesian ages are normalized to the age of Metazoa, while maintaining a root calibration of 1.0 (fig. 1, blue line). The relationship becomes even more linear when the root calibration is not specified (fig. 1, pink and green lines). These results suggest that nodes are affected differently by prior selections and trends observed after normalization will be affected, depending on the node chosen for the normalization itself. In cases such as this one, when one or a few nodes have a strong impact on the identified trends, it may be advisable to normalize to the sum of time estimates of all nodes to allow every node to contribute to the normalization (Tamura et al. 2012).
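
The two normalization choices discussed above (scaling to a single reference node versus scaling to the sum of all node ages) can be illustrated with a short Python sketch. The node names and ages are hypothetical placeholders on the 1,000-unit scale that Phylobayes uses by default; they are not estimates from this study, and the helper functions are our own.

```python
def normalize_to_node(ages, reference):
    """Scale every node age by the age of one chosen reference node."""
    ref_age = ages[reference]
    return {node: age / ref_age for node, age in ages.items()}

def normalize_to_sum(ages, exclude=()):
    """Scale every node age by the sum of all (non-excluded) node ages,
    so that every node contributes to the normalization."""
    kept = {node: age for node, age in ages.items() if node not in exclude}
    total = sum(kept.values())
    return {node: age / total for node, age in kept.items()}

# Hypothetical node ages (NOT results from the paper).
example_ages = {
    "root": 1000.0,
    "Metazoa": 800.0,
    "Bilateria": 650.0,
    "Protostomia": 600.0,
}

by_metazoa = normalize_to_node(example_ages, "Metazoa")
by_sum = normalize_to_sum(example_ages)
print(by_metazoa)
print(by_sum)
```

With a single reference node, any prior-induced shift in that one node's age rescales every other node; dividing by the sum spreads that influence across all nodes.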

Bayesian and RelTime Estimates of Standard Errors Show Similar Trends

We also investigated whether the priors selected by Lozano-Fernandez et al. (2017) explain the reported differences between RelTime and Phylobayes standard error (SE) estimates for node ages in the metazoan data set. While credibility/confidence intervals would usually be compared, here we present SEs to enable a direct comparison with values presented by Lozano-Fernandez et al. (2017). To generate results under comparable conditions, we fixed the ingroup root calibration to be 1 in RelTime. As expected, the relationship between node ages and SEs for RelTime showed a trend in which SEs first rise and then decrease with the increase of node ages (fig. 1, black circles). This pattern is similar to that produced in the Bayesian analysis (see fig. 1f in Lozano-Fernandez et al. 2017). This shows that the imposition of a root node calibration of 1.0 strongly affected SE estimates, because confidence intervals are clipped to avoid predating the calibration constraint. Upon omitting the root node calibration, we found that the Bayesian SE estimates increased with time (fig. 1, red circles), similar to the pattern observed for RelTime without any constraints. Therefore, the decrease in SEs with node ages for the Bayesian results observed by Lozano-Fernandez et al. (2017) was strongly influenced by the root node constraint, and was not directly comparable with RelTime results.
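
To see why an upper bound can shrink SEs for nodes close to the constrained root, consider a deliberately simplified sketch. It assumes a node age estimate is normally distributed and is truncated from above at the root constraint; neither RelTime nor Phylobayes is claimed to compute SEs this way, and the means and standard deviations below are hypothetical.

```python
import math

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def sd_with_upper_bound(mean, sd, upper):
    """Standard deviation of a normal(mean, sd) truncated from above at `upper`.
    Uses the standard one-sided truncated-normal variance formula."""
    beta = (upper - mean) / sd
    lam = normal_pdf(beta) / normal_cdf(beta)
    return sd * math.sqrt(1.0 - beta * lam - lam * lam)

# Hypothetical relative node ages, each with an untruncated SD of 20% of its age.
for age in (0.2, 0.5, 0.8, 0.95, 1.0):
    raw_sd = 0.2 * age
    clipped_sd = sd_with_upper_bound(age, raw_sd, upper=1.0)
    print(f"age {age:.2f}: raw SD {raw_sd:.3f}, SD with root bound at 1: {clipped_sd:.3f}")
```

In this toy setting the bound has almost no effect on shallow nodes and removes an increasing share of the spread as the node age approaches the constraint, which is the qualitative effect described above.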

Priors and Calibrations That Impact Absolute Molecular Dates

Battistuzzi et al. (2015) used data from Erwin et al. (2011) as an example, because its analysis clearly showed that 1) the two maxima and the root prior have a very large impact on molecular time estimates produced by Bayesian methods and 2) different combinations of (maximum) calibrations and priors produce very different time estimates. Results similar to those of Battistuzzi et al. (2015) were also obtained by Lozano-Fernandez et al. (2017) (see their fig. 3), but the induced effective prior was judged to be overly diffuse. These results highlight a well-known attribute of Bayesian analyses: the user-specified parameters and priors can be very different from the induced prior distributions due to complicated parameter interdependencies and the approaches used to truncate the prior distributions (Eme et al. 2014; Warnock et al. 2015; Barba-Montoya et al. 2017). In the data set analyzed here, it is clear that the Bayesian relative and absolute time estimates are strongly affected by the selection of priors. In the absolute dating analysis, the root prior used by both Erwin et al. (2011) and Lozano-Fernandez et al. (2017) is stricter than that explored in Battistuzzi et al. (2015). While the best prior cannot be unequivocally identified, a recent study on molecular timing of eukaryotes adds some new information. This study obtained a Bayesian divergence time estimate of ∼1,375 Ma for Opisthokonta (Animals + Fungi) in an analysis of a data set with 116 taxa and 2,166 amino acids (Gold et al. 2017). Using a data set of comparable size, this study produced a divergence time estimate at the upper end of the marginal prior distribution used by Erwin et al. (2011) and Lozano-Fernandez et al. (2017), but well within the distribution of Battistuzzi et al. (2015) (see table 2 in Lozano-Fernandez et al. 2017). Given the discordance among Bayesian analyses with different priors and RelTime, it is clear that the reliable establishment of the age of animal origin remains challenging.

Conclusions

Lozano-Fernandez et al. (2017) made a specific conclusion about the ability of RelTime to estimate the timeline of animal diversification, which was based on an analysis of only one data set. We have demonstrated that RelTime does not collapse rates, unless one chooses to assign equal rates to lineages that do not show statistically significant rate differences. Thus, RelTime is sensitive to rate changes in both deep and shallow portions of a phylogeny, which is consistent with the theoretical underpinnings of the RelTime method (Tamura et al. 2018). In fact, RelTime may be used as a reference framework to evaluate the effect of prior choices in Bayesian analyses. A RelTime analysis may also provide useful information about selection of priors and distributions, because, under some conditions, RelTime and Bayesian methods produced similar results for the data analyzed in Lozano-Fernandez et al. (2017). This complementary analysis may prove particularly useful in informing selection of priors for Bayesian analyses. Because Bayesian methods can become very computationally demanding and RelTime speed is orders of magnitude faster (Tamura et al. 2012, 2018), RelTime may serve as a practical and theoretically sound alternative to Bayesian methods for many data sets (Tamura et al. 2018).
References (21 in total)

1.  The impact of the representation of fossil calibrations on Bayesian estimation of species divergence times.

Authors:  Jun Inoue; Philip C J Donoghue; Ziheng Yang
Journal:  Syst Biol       Date:  2009-11-25       Impact factor: 15.683

2.  The impact of fossils and taxon sampling on ancient molecular dating analyses.

Authors:  Laura A Hug; Andrew J Roger
Journal:  Mol Biol Evol       Date:  2007-06-07       Impact factor: 16.240

3.  MEGA6: Molecular Evolutionary Genetics Analysis version 6.0.

Authors:  Koichiro Tamura; Glen Stecher; Daniel Peterson; Alan Filipski; Sudhir Kumar
Journal:  Mol Biol Evol       Date:  2013-10-16       Impact factor: 16.240

4.  Exploring uncertainty in the calibration of the molecular clock.

Authors:  Rachel C M Warnock; Ziheng Yang; Philip C J Donoghue
Journal:  Biol Lett       Date:  2011-08-24       Impact factor: 3.703

5.  Estimating divergence times in large molecular phylogenies.

Authors:  Koichiro Tamura; Fabia Ursula Battistuzzi; Paul Billing-Ross; Oscar Murillo; Alan Filipski; Sudhir Kumar
Journal:  Proc Natl Acad Sci U S A       Date:  2012-11-05       Impact factor: 11.205

6.  Phylogenomic datasets provide both precision and accuracy in estimating the timescale of placental mammal phylogeny.

Authors:  Mario dos Reis; Jun Inoue; Masami Hasegawa; Robert J Asher; Philip C J Donoghue; Ziheng Yang
Journal:  Proc Biol Sci       Date:  2012-05-23       Impact factor: 5.349

7.  Bayesian molecular clock dating of species divergences in the genomics era.

Authors:  Mario dos Reis; Philip C J Donoghue; Ziheng Yang
Journal:  Nat Rev Genet       Date:  2015-12-21       Impact factor: 53.242

8.  Comparison of different strategies for using fossil calibrations to generate the time prior in Bayesian molecular clock dating.

Authors:  Jose Barba-Montoya; Mario dos Reis; Ziheng Yang
Journal:  Mol Phylogenet Evol       Date:  2017-07-11       Impact factor: 4.286

9.  RelTime Rates Collapse to a Strict Clock When Estimating the Timeline of Animal Diversification.

Authors:  Jesus Lozano-Fernandez; Mario dos Reis; Philip C J Donoghue; Davide Pisani
Journal:  Genome Biol Evol       Date:  2017-05-01       Impact factor: 3.416

10.  Theoretical Foundation of the RelTime Method for Estimating Divergence Times from Variable Evolutionary Rates.

Authors:  Koichiro Tamura; Qiqing Tao; Sudhir Kumar
Journal:  Mol Biol Evol       Date:  2018-07-01       Impact factor: 16.240

