Andrew M Ritchie1,2, Xia Hua3,4, Lindell Bromham3.
Abstract
Understanding the factors that drive diversification of taxa across the tree of life is a key focus of macroevolutionary research. While the effects of life history, ecology, climate and geography on diversity have been studied for many taxa, the relationship between molecular evolution and diversification has received less attention. However, correlations between rates of molecular evolution and diversification rate have been detected in a range of taxa, including reptiles, plants and birds. A correlation between rates of molecular evolution and diversification rate is a prediction of several evolutionary theories, including the evolutionary speed hypothesis, which links variation in mutation rates to differences in speciation rates. If they are widespread, such correlations could also have significant practical impacts if they are not adequately accounted for in phylogenetic inference of evolutionary rates and timescales. Ray-finned fish (Actinopterygii) offer a prime target to test for this relationship because their extreme variation in clade size suggests a wide range of diversification rates. We employ both a sister-pairs approach and a whole-tree approach to test for correlations between substitution rate and net diversification. We also collect life history and ecological trait data and account for potential confounding factors including body size, latitude, maximum depth and reef association. We find evidence to support a relationship between diversification and synonymous rates of nuclear evolution across two published backbone phylogenies, as well as weak evidence for a relationship between mitochondrial nonsynonymous rates and diversification at the genus level.
Keywords: Comparative methods; Diversification; Life history; Molecular rates; Phylogeny; Sister pairs; Synonymous substitution rates
Year: 2022 PMID: 35262772 PMCID: PMC8975766 DOI: 10.1007/s00239-022-10052-6
Source DB: PubMed Journal: J Mol Evol ISSN: 0022-2844 Impact factor: 2.395
Fig. 1 Terminology used to describe sister pairs. A sister pair is composed of two sister clades. The sister clades are the two daughter lineages of an ancestral lineage and all their known descendants. They do not share their MRCA with any other clades on the phylogeny. For inferring molecular rates, the larger sister clade would be pruned at random so that the same number of sequences would be analysed on each side
Fig. 2 Schematic of the process used to select sister pairs and analyse the relationship between molecular evolution and diversification. (1) Tips are grouped into the selected taxonomic level, such as families. (2) A family-level tree is produced by pruning each family to a single tip. (3) Families with no sequence data available in the downloaded data set, if any, are removed. (4) From the remaining tips, we select all immediately adjacent pairs of tips (once tips without sequence data are removed) as our sister pairs. (5) To form our Filtered dataset, we check for mutual monophyly of the clades within each sister pair. Pairs where one or more genera are present in both clades in > 80% of all-taxon-assembled trees in Rabosky et al. (2018) are removed. This was only done for the Rabosky et al. phylogeny, as most families in the Euteleost Tree of Life tree are monophyletic and no all-taxon-assembled trees are available. (6) To reduce the node-density effect, we randomly delete tips from the larger clade of each sister pair until both clades have the same number of tips. (7) To create outgroups for analysis, each sister pair is attached to its nearest sister pair to form a quartet. (8) Molecular sequence alignments are produced for each quartet. (9) Our final dataset consists of a set of contrasts, each consisting of a pair of sister clades, for which we have data on (a) clade sizes for each pair, and (b) molecular substitution rates (branch lengths) inferred from the sequence alignment. (10) A regression is conducted for contrasts in log clade sizes against contrasts in log substitution rates to test for a relationship between substitution and diversification rates
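Steps (6) and (9) of the schematic can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the species labels and the clade sizes are hypothetical, and the contrast is assumed to be a simple difference of natural logs.

```python
import math
import random


def equalise_sister_clades(clade_a, clade_b, rng):
    """Randomly delete tips from the larger clade until both clades match in size."""
    n = min(len(clade_a), len(clade_b))
    # The smaller clade is kept whole; only the larger one is subsampled.
    pruned_a = clade_a if len(clade_a) == n else rng.sample(clade_a, n)
    pruned_b = clade_b if len(clade_b) == n else rng.sample(clade_b, n)
    return list(pruned_a), list(pruned_b)


def log_contrast(value_a, value_b):
    """Contrast between two sister clades on the log scale (assumed form)."""
    return math.log(value_a) - math.log(value_b)


# Hypothetical sister pair: five sequenced tips on one side, two on the other.
rng = random.Random(1)
clade_a = ["sp1", "sp2", "sp3", "sp4", "sp5"]
clade_b = ["sp6", "sp7"]
pruned_a, pruned_b = equalise_sister_clades(clade_a, clade_b, rng)

# Hypothetical clade sizes; the size contrast uses the sizes before pruning.
size_contrast = log_contrast(5, 2)
```

Passing an explicit `random.Random` instance keeps the random pruning reproducible, which matters when the same pairs are re-analysed for different substitution types.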
Summary of sister-pair datasets for examining the relationship between substitution rates and diversification rates
| Rank | Alignment | Genes | Phylogeny | Substitutions | Pairs: All | Pairs: Mono only (not tested) | Pairs: Filtered (Mono + Welch) | Pairs: Full (Welch) |
|---|---|---|---|---|---|---|---|---|
| Families | Mito.All | | Rabosky | Total | 54 | 41 | 41 | 50 |
| Families | Mito.Coding | | Rabosky | dS | 88 | 59 | 58 | 77 |
| | | | | dN | 88 | 65 | 63 | 78 |
| Families | Nuc.RAG1 | | Rabosky | Total | 119 | 82 | 82 | 113 |
| | | | | dS | 119 | 44 | 44 | 65 |
| | | | | dN | 119 | 45 | 43 | 62 |
| Genera | Mito.All | | Rabosky | Total | 183 | 71 | 48 | 64 |
| Genera | Mito.Coding | | Rabosky | dS | 475 | 193 | 144 | 235 |
| | | | | dN | 475 | 204 | 83 | 138 |
| Genera | Nuc.RAG1 | | Rabosky | Total | 547 | 249 | 121 | 256 |
| | | | | dS | 547 | 250 | 199 | 355 |
| | | | | dN | 547 | 239 | 131 | 247 |
| Families | Nuc.ETOL | 19 nuclear protein-coding exons | Betancur-R | Total | 118 | 98 | 78 | – |
| | | | | dS | 118 | 98 | 75 | – |
| | | | | dN | 118 | 98 | 78 | – |
Rank is the taxonomic rank at which pairs are selected. Three marker sets are used for the main analysis, with the Mito.All set including non-coding sequences for additional power in estimating total substitution rates. An additional analysis uses the euteleost phylogeny of Betancur-R et al. (2013) and an associated dataset of 19 nuclear exonic sequences (Table S2 available as Supplementary Information). Datasets are further divided by the type of substitution rate estimated. Different substitution types may have different numbers of usable pairs because some PAML analyses fail due to short branch lengths or because pairs are deleted by the Welch filter, which removes shallow pairs with high variance in branch length estimates. The number of pairs remaining is shown after filtering for mutual monophyly (Mono) and after applying the Welch filter (Welch). The monophyly filter removes pairs in which members of one or more genera are found on both sides of the pair in 80% or more of the all-taxon-assembled trees incorporating taxa without molecular data. The set of pairs filtered by both the Monophyly and Welch filters formed our Filtered dataset, whereas the set of pairs filtered only by the Welch filter constituted our Full dataset
Regressions through the origin relating contrasts in log substitution rates to contrasts in transformed clade sizes across sister pairs
| Phylogeny | Mono. Filter | Rank | Sequence set | Subst. Rate | No. pairs | Coeff | Std. Err | t | P value (>\|t\|) |
|---|---|---|---|---|---|---|---|---|---|
| Rabosky et al. (2018) | Full | Families | Mito.All | Total | 50 | 0.82 | 0.68 | 1.20 | 0.24 |
| | | | Mito.Coding | dS | 77 | −0.68 | 1.11 | −0.61 | 0.54 |
| | | | | dN | 78 | −0.07 | 0.35 | −0.21 | 0.83 |
| | | | Nuc.RAG1 | | | | | | |
| | | | | dN | 62 | 0.47 | 0.33 | 1.43 | 0.16 |
| | | Genera | Mito.All | Total | 64 | 0.07 | 0.55 | 0.12 | 0.90 |
| | | | Mito.Coding | dS | 235 | 0.08 | 0.14 | 0.62 | 0.54 |
| | | | | dN | 138 | −0.05 | 0.10 | −0.53 | 0.60 |
| | | | Nuc.RAG1 | | | | | | |
| | | | | dS | 355 | 0.02 | 0.08 | 0.29 | 0.77 |
| | | | | dN | 247 | 0.18 | 0.10 | 1.90 | 0.06 |
| | Filtered | Families | Mito.All | Total | 41 | 0.97 | 0.79 | 1.24 | 0.22 |
| | | | Mito.Coding | dS | 58 | 0.24 | 0.57 | 0.42 | 0.68 |
| | | | | dN | 63 | 0.03 | 0.37 | 0.07 | 0.94 |
| | | | Nuc.RAG1 | | | | | | |
| | | | | dS | 44 | 0.43 | 0.37 | 1.17 | 0.25 |
| | | | | dN | 43 | 0.50 | 0.42 | 1.20 | 0.24 |
| | | Genera | Mito.All | Total | 48 | −0.19 | 0.55 | −0.35 | 0.73 |
| | | | Mito.Coding | dS | 144 | −0.05 | 0.13 | −0.40 | 0.69 |
| | | | Nuc.RAG1 | Total | 121 | 0.32 | 0.19 | 1.71 | 0.09 |
| | | | | dS | 199 | −0.02 | 0.11 | −0.19 | 0.85 |
| | | | | dN | 131 | 0.07 | 0.14 | 0.49 | 0.63 |
| Betancur-R et al. (2013) | Yes | Families | Nuc.ETOL | | | | | | |
| | | | | dN | 78 | 0.08 | 0.21 | 0.36 | 0.72 |
Regressions were conducted on the phylogenies of Rabosky et al. (2018) and Betancur-R et al. (2013), with or without filtering pairs for mutual monophyly of formal taxa (Mono. Filter). Data are shown for mitochondrial sequences with rRNA (Mito.All) and without (Mito.Coding). Nuclear data consist of RAG1 only for the Rabosky tree and of 19 nuclear markers from the Euteleost Tree of Life (ETOL) for the Betancur-R tree. Total, synonymous (dS) and nonsynonymous (dN) substitution rates are tested as predictors of clade size. t statistics and P values are for Wald tests. Results with P < 0.05 are in italics
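To make the table's columns concrete, the following Python sketch fits a least-squares regression through the origin and computes the Wald t statistic as the coefficient divided by its standard error. It is an illustration under assumptions, not the authors' code: the function name and the toy contrast values are hypothetical, and the residual variance is assumed to use n − 1 degrees of freedom, as is standard for a one-parameter model without an intercept.

```python
import math


def regress_through_origin(x, y):
    """Fit y = b * x with no intercept; return (coefficient, std. error, Wald t)."""
    n = len(x)
    sum_xx = sum(xi * xi for xi in x)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    coeff = sum_xy / sum_xx
    # Residual variance with n - 1 degrees of freedom (one fitted parameter).
    rss = sum((yi - coeff * xi) ** 2 for xi, yi in zip(x, y))
    std_err = math.sqrt(rss / (n - 1) / sum_xx)
    return coeff, std_err, coeff / std_err


# Hypothetical contrasts: x = log substitution rate contrasts (predictor),
# y = log clade size contrasts (response).
rate_contrasts = [1.0, 2.0, 3.0]
size_contrasts = [1.0, 2.0, 4.0]
coeff, std_err, t_stat = regress_through_origin(rate_contrasts, size_contrasts)
```

Dropping the intercept reflects that the direction of each contrast is arbitrary: swapping the two clades in a pair flips the sign of both contrasts, so the fitted line has to pass through the origin.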
Results of the tree-based analyses performed on the full molecular backbone of Rabosky et al. (2018) and two sampled trees with inferred mitochondrial and nuclear substitutions
| Tree | Slope | Likelihood ratio | P value | |
|---|---|---|---|---|
| Backbone | 27.05 | 1.03 | ||
| Mitochondrial | 25.80 | 1.98 | ||
| Nuclear | 0.54 | 0.46 | 1.00 |
The results are shown with root-to-tip path length as the response variable. A significant likelihood ratio test (bold type) indicates a likely correlation between substitutions and node numbers. The combination of a significant likelihood ratio test and a curvilinearity value greater than one indicates the likely presence of the node-density artefact in the Actinopterygian tree. Trends are plotted in Figs. S13, S14 available as Supplementary Information