Jeffrey P Rose, Ricardo Kriebel, Larissa Kahan, Alexa DiNicola, Jesús G González-Gallegos, Ferhat Celep, Emily M Lemmon, Alan R Lemmon, Kenneth J Sytsma, Bryan T Drew.
Abstract
Next-generation sequencing technologies have facilitated new phylogenomic approaches to help clarify previously intractable relationships while simultaneously highlighting the pervasive nature of incongruence within and among genomes that can complicate definitive taxonomic conclusions. Salvia L., with ∼1,000 species, makes up nearly 15% of the species diversity in the mint family and has attracted great interest from biologists across subdisciplines. Despite the great progress that has been achieved in discerning the placement of Salvia within Lamiaceae and in clarifying its infrageneric relationships through plastid, nuclear ribosomal, and nuclear single-copy genes, the incomplete resolution has left open major questions regarding the phylogenetic relationships among and within the subgenera, as well as to what extent the infrageneric relationships differ across genomes. We expanded a previously published anchored hybrid enrichment dataset of 35 exemplars of Salvia to 179 terminals. We also reconstructed nearly complete plastomes for these samples from off-target reads. We used these data to examine the concordance and discordance among the nuclear loci and between the nuclear and plastid genomes in detail, elucidating both broad-scale and species-level relationships within Salvia. We found that despite the widespread gene tree discordance, nuclear phylogenies reconstructed using concatenated, coalescent, and network-based approaches recover a common backbone topology. Moreover, all subgenera, except for Audibertia, are strongly supported as monophyletic in all analyses. The plastome genealogy is largely resolved and is congruent with the nuclear backbone. However, multiple analyses suggest that incomplete lineage sorting does not fully explain the gene tree discordance. Instead, horizontal gene flow has been important in both the deep and more recent history of Salvia. Our results provide a robust species tree of Salvia across phylogenetic scales and genomes. 
Future comparative analyses in the genus will need to account for the impacts of hybridization/introgression and incomplete lineage sorting in topology and divergence time estimation.
Keywords: Lamiaceae; Robinson–Foulds distance; Salvia; anchored hybrid enrichment; cyto-nuclear discordance; distance metrics; incongruence
Year: 2021; PMID: 34899789; PMCID: PMC8652245; DOI: 10.3389/fpls.2021.767478
Source DB: PubMed; Journal: Front Plant Sci; ISSN: 1664-462X; Impact factor: 5.753
FIGURE 1. The ASTRAL species tree of Salvia and outgroups. Ingroup branches are colored by subgenus, and the subgenera are labeled to the right. The major clades discussed in the text are also indicated for subg. Calosphace. Thickened branches denote those with >0.95 ASTRAL local posterior probability. Pie charts at major nodes summarize the proportions of phylogenetic signal across the 101 gene trees that can be rooted. The numbers to the left of each pie give the number of gene trees in which the clade is found, followed by the number of gene trees that conflict with that clade. The remaining gene trees, if any, are uninformative about that relationship. The numbers at selected, incompletely supported nodes give the ASTRAL local posterior probability, followed by the ASTRAL bootstrap support and the bootstrap support from the concatenated maximum likelihood analysis, all summarized on the ASTRAL species tree. For support across all branches, see Supplementary Figures S1, S2, S4. For support on the best-scoring maximum likelihood tree, see Supplementary Figure S3.
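The pie-chart tallies described in the caption can be illustrated with a minimal sketch. It is not the authors' pipeline: the nested-tuple tree encoding, the function names `clades` and `classify`, and the toy gene trees are all assumptions made for illustration. Each rooted gene tree is labeled as concordant with, conflicting with, or uninformative about a focal species-tree clade.

```python
from collections import Counter


def clades(tree):
    """Return (all tips, set of non-root clades) for a rooted tree of nested tuples."""
    found = set()

    def walk(node):
        if isinstance(node, str):          # a tip label
            return frozenset([node])
        tips = frozenset().union(*(walk(child) for child in node))
        found.add(tips)
        return tips

    all_tips = walk(tree)
    found.discard(all_tips)                # the root clade carries no signal
    return all_tips, found


def classify(gene_tree, focal_clade):
    """Label one gene tree relative to a focal clade of the species tree."""
    tips, gene_clades = clades(gene_tree)
    target = frozenset(focal_clade) & tips  # ignore tips missing from this gene tree
    if len(target) < 2 or not (tips - target):
        return "uninformative"
    if target in gene_clades:
        return "concordant"
    for g in gene_clades:
        # Two clades conflict when they overlap but neither contains the other.
        if g & target and g - target and target - g:
            return "conflicting"
    return "uninformative"                  # e.g., an unresolved polytomy


# Hypothetical toy gene trees, not data from the study.
gene_trees = [
    (("A", "B"), ("C", "D")),   # contains clade {A, B}
    (("A", "C"), ("B", "D")),   # contradicts clade {A, B}
    ("A", "B", "C", "D"),       # polytomy: no information
    (("A", "C"), "D"),          # B missing: no information
]
counts = Counter(classify(t, {"A", "B"}) for t in gene_trees)
print(counts)
```

The three counts per node correspond to the concordant and conflicting totals printed beside each pie and the remainder of gene trees that are silent on the relationship.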
FIGURE 2. Tanglegram illustrating cytonuclear discordance in Salvia, based on the nuclear ASTRAL species tree (A) and the plastome tree (B). Links connect identical tips, with nodes rotated to minimize link overlap. Ingroup links are colored by subgenus. Clades that differ between the two trees are indicated by filled circles on the plastome tree. To minimize discordance caused by gene tree estimation error, clades in the plastome tree with <75% bootstrap support have been collapsed into polytomies.
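Collapsing weakly supported clades into polytomies, as the caption describes for the plastome tree, can be sketched as follows. The tree encoding (an internal node is a `(support, children)` pair, a tip is a string) and the function name `collapse` are hypothetical; only the 75% threshold comes from the caption.

```python
def collapse(node, threshold=75):
    """Dissolve internal nodes whose bootstrap support is below the threshold.

    A tip is a string; an internal node is a (support, [children]) pair.
    The children of a dissolved node are attached to its parent, which
    turns the parent into a polytomy.
    """
    if isinstance(node, str):
        return node
    support, children = node
    kept = []
    for child in children:
        child = collapse(child, threshold)
        if not isinstance(child, str) and child[0] < threshold:
            kept.extend(child[1])    # promote grandchildren to this node
        else:
            kept.append(child)
    return (support, kept)


# Hypothetical toy tree; the root is given support 100 and is never dissolved.
tree = (100, [(60, ["A", "B"]), (90, ["C", (40, ["D", "E"])])])
collapsed = collapse(tree)
print(collapsed)
```

Comparing the nuclear tree only against clades that survive this filter keeps poorly estimated plastome relationships from being counted as discordance.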
Summary of the mean gene tree distances in Salvia and the selected subgenera within and across genomes.
| Clade | Tips | Nuclear loci RF | Nuclear loci Nye | Nuclear loci CI | Plastome RF | Plastome Nye | Plastome CI |
| Salvia | 179 | 0.61b | 0.41b | 0.36b | 0.63 | 0.37 | 0.29 |
| subg. | 93 | 0.69c | 0.52d | 0.57d | 0.72 | 0.41 | 0.45 |
| subg. | 10 | 0.29a | 0.20a | 0.24a | 0.38 | 0.23 | 0.25 |
| subg. | 10 | 0.28a | 0.18a | 0.22a | 0.08 | 0.08 | 0.07 |
| subg. | 37 | 0.67bc | 0.44bc | 0.50c | 0.93 | 0.64 | 0.74 |
| subg. | 20 | 0.70c | 0.48cd | 0.49c | 0.78 | 0.46 | 0.51 |
The three distance metrics, Robinson–Foulds (RF), Nye, and clustering information (CI), compare the topology (branching order only) of each gene tree with that of the ASTRAL species tree. Metrics were calculated only for gene trees that could be rooted and had clade occupancy > 75% of sampled tips. Distances were normalized to range from 0 to 1, with 0 indicating complete agreement between a gene tree and the species tree. Different letters denote significant among-group differences in mean nuclear gene tree discordance from the species tree for each metric, based on an ANOVA with post hoc testing at α = 0.05. Differences among the plastomes were not tested because each plastome value represents a single gene tree.
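As an illustration of how a normalized topological distance of this kind behaves, the sketch below computes a Robinson–Foulds distance on rooted trees by comparing their sets of clades. It is a simplified stand-in, not the software used in the study: the nested-tuple encoding, the function names, and the toy trees are assumptions, all trees are assumed to share the same tips, and the Nye and CI metrics are not covered.

```python
from statistics import mean


def clades(tree):
    """Return the set of non-root clades of a rooted tree of nested tuples."""
    found = set()

    def walk(node):
        if isinstance(node, str):          # a tip label
            return frozenset([node])
        tips = frozenset().union(*(walk(child) for child in node))
        found.add(tips)
        return tips

    found.discard(walk(tree))              # drop the root clade (all tips)
    return found


def normalized_rf(tree_a, tree_b):
    """Clades found in only one tree, divided by the clades in both trees.

    Returns 0.0 for identical topologies and 1.0 when no clade is shared.
    """
    a, b = clades(tree_a), clades(tree_b)
    total = len(a) + len(b)
    return len(a ^ b) / total if total else 0.0


# Hypothetical toy trees, not data from the study.
species_tree = ((("A", "B"), "C"), "D")
gene_trees = [
    ((("A", "B"), "C"), "D"),
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
]
distances = [normalized_rf(g, species_tree) for g in gene_trees]
print(distances, mean(distances))
```

Averaging such per-gene-tree distances within a clade gives a summary comparable in spirit to the mean RF values reported in the table.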
Nuclear gene tree distances in Salvia and selected subgenera.
| Clade | Gene trees | RF mean (SD) | Nye mean (SD) | CI mean (SD) | RF t | RF p | Nye t | Nye p | CI t | CI p |
| Salvia | Observed | 0.61 (0.07) | 0.41 (0.06) | 0.36 (0.06) | | | | | | |
| | Expected | 0.65 (0.03) | 0.35 (0.02) | 0.32 (0.02) | | | | | | |
| subg. | Observed | 0.69 (0.08) | 0.52 (0.10) | 0.57 (0.11) | | | | | | |
| | Expected | 0.72 (0.05) | 0.40 (0.03) | 0.46 (0.04) | | | | | | |
| subg. | Observed | 0.29 (0.21) | 0.20 (0.17) | 0.24 (0.18) | −1.04 | 0.30 | 0.49 | 0.63 | −0.029 | 0.98 |
| | Expected | 0.32 (0.17) | 0.19 (0.09) | 0.24 (0.11) | | | | | | |
| subg. | Observed | 0.28 (0.17) | 0.18 (0.11) | 0.22 (0.13) | −1.06 | 0.29 | 0.98 | 0.33 | −0.11 | 0.91 |
| | Expected | 0.30 (0.17) | 0.17 (0.09) | 0.22 (0.11) | | | | | | |
| subg. | Observed | 0.67 (0.17) | 0.44 (0.16) | 0.50 (0.16) | −0.91 | 0.36 | | | | |
| | Expected | 0.69 (0.12) | 0.36 (0.07) | 0.46 (0.09) | | | | | | |
| subg. | Observed | 0.70 (0.10) | 0.48 (0.10) | 0.49 (0.11) | | | | | | |
| | Expected | 0.74 (0.07) | 0.38 (0.04) | 0.43 (0.05) | | | | | | |
The three distance metrics, RF, Nye, and CI, compare the topology (branching order only) of each gene tree with that of the ASTRAL species tree. The gene trees are either empirical trees that can be rooted and have clade occupancy > 75% of sampled tips (observed) or 1,000 gene trees simulated under the multispecies coalescent using the ASTRAL species tree (expected). Distances were normalized to range from 0 to 1, with 0 indicating complete agreement between a gene tree and the species tree. The t-values and associated p-values for each distance metric/clade combination are from Welch’s t-test of the null hypothesis that the mean tree distances of the observed and expected gene trees are equal; in other words, that the tree distances based on empirical data are what would be expected under incomplete lineage sorting alone. Significant t/p-values at α = 0.05 are indicated in bold.
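The observed-versus-expected comparison rests on Welch's t-test, which does not assume equal variances or equal sample sizes in the two groups. The sketch below computes the test statistic and the Welch–Satterthwaite degrees of freedom with the standard library only. The function name `welch_t` and the sample distances are hypothetical and are not values from the study.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(observed, expected):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    n1, n2 = len(observed), len(expected)
    v1 = variance(observed) / n1     # squared standard error of each mean
    v2 = variance(expected) / n2
    t = (mean(observed) - mean(expected)) / sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical normalized tree distances, for illustration only.
observed = [0.1, 0.2, 0.3, 0.4, 0.5]   # empirical gene trees vs. species tree
expected = [0.2, 0.4, 0.6, 0.8, 1.0]   # simulated gene trees vs. species tree
t, df = welch_t(observed, expected)
print(round(t, 3), round(df, 3))
```

A two-sided p-value follows from Student's t distribution with `df` degrees of freedom. If SciPy is available, `scipy.stats.ttest_ind(observed, expected, equal_var=False)` performs the same test and returns the p-value directly.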
FIGURE 3. Phylogenetic network based on 57 nuclear genes depicting backbone relationships in Salvia, with one exemplar per subgenus and internal branch lengths in coalescent units. The best-fitting network has four reticulation events. Inheritance probabilities (γ) are indicated next to the lineage inferred to have received genetic material. The numbers above the branches of the bifurcating major topology are quartet concordance factors, that is, the proportion of the genome supporting each quartet. The numbers below the major topology branches and to the right of hybrid branches are bootstrap support values. Ingroup tips are colored by subgenus.