Sarah J Jacobs1,2, Michael C Grundler1, Claudia L Henriquez1, Felipe Zapata3.
Abstract
What we mean by species, and whether species have any biological reality, has been debated since the early days of evolutionary biology. Some biologists even suggest that plant species are created by taxonomists as a subjective, artificial division of nature. However, the nature of plant species has rarely been tested critically with data while ignoring taxonomy. We integrate phenomic and genomic data collected across hundreds of individuals at a continental scale to investigate this question in Escallonia (Escalloniaceae), a group of plants that includes 40 taxonomic species (the species proposed by taxonomists). We first show that taxonomic species may be questionable, as they match poorly to patterns of phenotypic and genetic variation displayed by individuals collected in nature. We then use explicit statistical methods for species delimitation designed for phenotypic and genomic data, and show that plant species do exist in Escallonia as an objective, discrete property of nature independent of taxonomy. We show that such species correspond poorly to current taxonomic species ([Formula: see text]) and that phenomic and genomic data seldom delimit congruent entities ([Formula: see text]). These discrepancies suggest that evolutionary forces additional to gene flow can maintain the cohesion of species. We propose that phenomic and genomic data analyzed on an equal footing build a broader perspective on the nature of plant species by helping delineate different 'types of species'. Our results caution against studies that take the accuracy of taxonomic species for granted and that challenge the notion of plant species without empirical evidence. Note: A version of the complete manuscript in Spanish is available in the Supplemental Materials.
Year: 2021 PMID: 34907249 PMCID: PMC8671583 DOI: 10.1038/s41598-021-03419-0
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1 (presented as three panels). Phylogenetic history, taxon sampling, and evolutionary model-based species delimitation. Maximum Likelihood (ML) tree of Escallonia based on genome-wide data (bottom-left) with tips indicating the six focal clades (Clade I–VI) of our study. For each clade, the first row shows the taxon sampling, with filled symbols indicating specimens used in phenotypic analyses and empty symbols indicating specimens used in genomic analyses; the insets show the distribution of specimens along elevation. The second row shows results of the best fit model for species delimitation with phenotypic data (i.e., phenogroups); phenogroups are shown with different shapes in geographic space. The third row shows results of the best fit model for species delimitation with genomic data (i.e., genogroups); genogroups are indicated with different colors as tips of unrooted ML trees based on matrices of concatenated loci and mapped in geographic space. The fourth row shows the integration of phenogroups and genogroups with evolutionary history and geographic distribution to elucidate the nature of plant species; specimens without overlapping phenotypic and genomic data are designated as unknown specimens. The phylogenetic trees were inferred in IQ-TREE v2.0.3 (http://www.iqtree.org). The maps were generated in R v4.1.1 using the libraries ggplot2 v3.3.5 (https://ggplot2.tidyverse.org/index.html) and maps v3.4.0 (https://cran.r-project.org/web/packages/maps/).
Current state of taxonomic species.
| Clade | Taxonomic species | Specimens | Minimum proportion overlap among 10-cubes | Maximum proportion overlap among 10-cubes | Percent specimens matching any 10-cube | Percent specimens matching correct 10-cube |
|---|---|---|---|---|---|---|
| I | 2 | 33 | 0 | 0.00 | 0.0 | 0.0 |
| II | 2 | 33 | 0 | 0.00 | 0.0 | 0.0 |
| III | 6 | 130 | 0 | 0.02 | 1.6 | 0.8 |
| IV | 2 | 74 | 0 | 0.00 | 0.0 | 0.0 |
| V | 7 | 214 | 0 | 0.13 | 0.0 | 0.0 |
| VI | 10 | 195 | 0 | 0.00 | 0.0 | 0.0 |
Gaussian finite mixture modeling (GFMM) for phenogroup delimitation and model selection using the Bayesian information criterion (BIC).
| Clade | Model | Phenogroups | BIC | Rank | ΔBIC |
|---|---|---|---|---|---|
| I | Naive | 2 | 54.03099 | 1 | 0.00000 |
| I | Taxonomy | 2 | 45.80586 | 2 | 8.22513 |
| I | Taxonomy unaware | 1 | 33.45654 | 3 | 20.57445 |
| II | Naive | 3 | 71.72976 | 1 | 0.00000 |
| II | Taxonomy unaware | 1 | 47.52785 | 2 | 24.20191 |
| II | Taxonomy | 2 | 17.77346 | 3 | 53.95630 |
| III | Naive | 5 | 387.15280 | 1 | 0.00000 |
| III | Taxonomy unaware | 4 | 170.83930 | 2 | 216.31350 |
| III | Taxonomy | 6 | 53.38527 | 3 | 333.76753 |
| IV | Taxonomy | 2 | | 1 | 0.00000 |
| IV | Taxonomy unaware | 2 | | 1 | 0.00000 |
| IV | Naive | 3 | | 2 | 0.89520 |
| V | Naive | 8 | | 1 | 0.00000 |
| V | Taxonomy unaware | 4 | | 2 | 131.31560 |
| V | Taxonomy | 7 | | 3 | 274.73010 |
| VI | Naive | 8 | 231.24780 | 1 | 0.00000 |
| VI | Taxonomy unaware | 10 | 200.30380 | 2 | 30.94400 |
| VI | Taxonomy | 10 | | 3 | 749.01130 |
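The last column of the table above is consistent with the difference between each model's BIC and the BIC of the top-ranked model in the same clade (for Clade I, 54.03099 − 45.80586 = 8.22513). A minimal Python sketch of that ranking follows, using the Clade I values from the table. The function name `rank_by_bic` is hypothetical, and the sketch assumes the convention in which a larger BIC indicates a better fit, as the ranks in the table imply; it does not reproduce the mixture-model fitting itself.

```python
def rank_by_bic(bic_by_model):
    """Rank models by BIC (largest first) and report each gap to the best model.

    Assumption: larger BIC = better fit, matching the ranks in the table.
    Ties are not given a shared rank in this simplified sketch.
    """
    best = max(bic_by_model.values())
    ordered = sorted(bic_by_model.items(), key=lambda kv: kv[1], reverse=True)
    return [
        (model, rank, round(best - bic, 5))
        for rank, (model, bic) in enumerate(ordered, start=1)
    ]


# BIC values for Clade I, copied from the table.
clade_i = {
    "Naive": 54.03099,
    "Taxonomy": 45.80586,
    "Taxonomy unaware": 33.45654,
}

for model, rank, delta in rank_by_bic(clade_i):
    print(f"{model}: rank {rank}, delta BIC {delta:.5f}")
```

The same function can be applied to any clade for which the BIC values are reported.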
Genomic modeling for genogroup delimitation and model selection using Bayes factors (BF).
| Clade | Model | Genogroups | Marginal likelihood | Rank | BF (2 × difference from rank 1) |
|---|---|---|---|---|---|
| I | GC | 3 | −6580.495 | 1 | |
| I | AC | 2 | −6754.495 | 2 | 348.000 |
| I | RI | 2 | −6754.495 | 2 | 348.000 |
| II | AC | 4 | −13460.917 | 1 | |
| II | GC | 3 | −15036.438 | 2 | 3151.042 |
| II | RI | 3 | −15036.438 | 2 | 3151.042 |
| II | RI | 2 | −18963.342 | 3 | 11004.850 |
| III | AC | 7 | −8985.782 | 1 | |
| III | RI | 5 | −10014.260 | 2 | 2056.955 |
| III | RI | 3 | −12233.131 | 3 | 6494.698 |
| III | GC | 3 | −12233.131 | 3 | 6494.698 |
| IV | AC | 6 | −9601.514 | 1 | |
| IV | GC | 3 | −11546.649 | 2 | 3890.271 |
| IV | RI | 2 | −12017.878 | 3 | 4832.728 |
| IV | RI | 2 | −12017.878 | 3 | 4832.728 |
| V | AC | 10 | −4588.693 | 1 | |
| V | GC | 6 | −5381.361 | 2 | 1585.336 |
| V | RI | 3 | −5601.058 | 3 | 2024.730 |
| V | RI | 2 | −6085.998 | 4 | 2994.610 |
| VI | AC | 11 | −2921.024 | 1 | |
| VI | GC | 7 | −3627.806 | 2 | 1413.564 |
| VI | RI | 4 | −4661.351 | 3 | 3480.654 |
| VI | RI | 4 | −4661.351 | 3 | 3480.654 |
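The BF column above is consistent with twice the difference between the top-ranked marginal likelihood and that of each competing model within a clade (for Clade I, 2 × (−6580.495 − (−6754.495)) = 348.000). A minimal sketch of that arithmetic follows, using the Clade I values from the table. The function name `bayes_factor` is hypothetical, and treating the reported marginal likelihoods as log-scale values is an assumption based on their magnitude.

```python
def bayes_factor(best_mll, other_mll):
    """Return 2 x (best - other) for two marginal likelihood estimates.

    Assumption: both inputs are on the log scale, so the difference
    corresponds to a log ratio of marginal likelihoods.
    """
    return 2.0 * (best_mll - other_mll)


# (model, genogroups, marginal likelihood) for Clade I, copied from the table.
clade_i = [
    ("GC", 3, -6580.495),
    ("AC", 2, -6754.495),
    ("RI", 2, -6754.495),
]

# The best model is the one with the largest marginal likelihood.
best = max(mll for _, _, mll in clade_i)

for model, genogroups, mll in clade_i:
    bf = bayes_factor(best, mll)
    print(f"{model} ({genogroups} genogroups): BF = {bf:.3f}")
```

Under this reading, larger values indicate stronger support for the top-ranked model over the competing one.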
Specimens assigned to demes using MAVERICK.
Specimens assigned to demes using STRUCTURE.
Figure 2. Integration of phenotypic and genome-wide variation to delimit species. For each clade (see panels of Fig. 1), we assigned specimens to their corresponding phenogroup and genogroup based on the best fit models for each type of data. Shaded cells show specimens assigned to a particular combination of best fit phenogroup and genogroup (i.e., each shaded cell is a species). Three types of species are recognized. First, specimens assigned uniquely to a single phenogroup and a single genogroup are recognized as ‘good species’ (e.g., phenogroup 4, genogroup 3 in Clade III). Second, specimens assigned to a single phenogroup across multiple genogroups are recognized as ‘phenotypic cryptic species’ (e.g., phenogroup 2, genogroups 1, 2 in Clade III). Third, specimens assigned to a single genogroup across multiple phenogroups are recognized as ‘genetic cryptic species’ (e.g., phenogroups 1, 3, genogroup 5 in Clade III). Empty rows or columns correspond to specimens which did not have overlapping phenotypic and genomic data and thus were assigned only to their corresponding phenogroup or genogroup, accordingly (e.g., genogroup 2 in Clade I).
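The three-way classification described in the Figure 2 caption can be sketched as a small lookup over (phenogroup, genogroup) assignments. This is an illustrative sketch only: the function name `classify_cells` is hypothetical, the input is a toy list built from the Clade III examples named in the caption rather than the actual specimen data, and the `"unresolved"` label for cells whose phenogroup and genogroup both span multiple partners is an assumption, since the caption does not address that case.

```python
from collections import defaultdict


def classify_cells(pairs):
    """Label each occupied (phenogroup, genogroup) cell by species type."""
    genos_of = defaultdict(set)   # phenogroup -> genogroups it co-occurs with
    phenos_of = defaultdict(set)  # genogroup -> phenogroups it co-occurs with
    for pheno, geno in pairs:
        genos_of[pheno].add(geno)
        phenos_of[geno].add(pheno)

    labels = {}
    for pheno, geno in set(pairs):
        multi_geno = len(genos_of[pheno]) > 1    # one phenogroup, many genogroups
        multi_pheno = len(phenos_of[geno]) > 1   # one genogroup, many phenogroups
        if not multi_geno and not multi_pheno:
            labels[(pheno, geno)] = "good species"
        elif multi_geno and not multi_pheno:
            labels[(pheno, geno)] = "phenotypic cryptic species"
        elif multi_pheno and not multi_geno:
            labels[(pheno, geno)] = "genetic cryptic species"
        else:
            # Assumption: this case is not covered by the caption.
            labels[(pheno, geno)] = "unresolved"
    return labels


# Toy specimens as (phenogroup, genogroup), following the caption's examples.
toy_pairs = [(4, 3), (4, 3), (2, 1), (2, 2), (1, 5), (3, 5)]

for cell, label in sorted(classify_cells(toy_pairs).items()):
    print(cell, label)
```

Specimens lacking either a phenogroup or a genogroup assignment would be left out of the input, mirroring the empty rows and columns described in the caption.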
Correspondence between taxonomic species and best-fit phenogroups and genogroups.
| Clade | Taxonomic species | Phenogroups | Perfect match taxonomic species to phenogroups | Genogroups | Perfect match taxonomic species to genogroups | Perfect match taxonomic species to phenogroup and genogroup |
|---|---|---|---|---|---|---|
| I | 2 | 2 | 2 | 3 | 1 | 1 |
| II | 2 | 3 | 0 | 4 | 1 | 0 |
| III | 6 | 5 | 1 | 7 | 3 | 1 |
| IV | 2 | 2 | 2 | 6 | 1 | 1 |
| V | 7 | 8 | 0 | 10 | 0 | 0 |
| VI | 10 | 8 | 2 | 11 | 5 | 2 |