Kyung Seok Kim, Kevin J Roe.
Abstract
Detailed information on species delineation and population genetic structure is a prerequisite for designing effective restoration and conservation strategies for imperiled organisms. Phylogenomic and population genomic analyses based on genome-wide double digest restriction-site associated DNA sequencing (ddRAD-Seq) data have identified three allopatric lineages in the North American freshwater mussel genus Cyprogenia. Cyprogenia stegaria is restricted to the Eastern Highlands and displays little genetic structuring within this region. However, two allopatric lineages of C. aberti in the Ozark and Ouachita highlands exhibit substantial levels (mean uncorrected FST = 0.368) of genetic differentiation, and each warrants recognition as a distinct evolutionary lineage. Lineages of Cyprogenia in the Ouachita and Ozark highlands are further subdivided, reflecting structuring at the level of river systems. Species tree inference and species delimitation in a Bayesian framework using single nucleotide polymorphism (SNP) data supported the results of the phylogenetic analyses and supported three species of Cyprogenia over the currently recognized two. A comparison of SNPs generated from destructively and non-destructively collected samples revealed no significant difference in the SNP error rate or in the quality and amount of ddRAD sequence reads, indicating that non-destructive or trace samples can be used effectively to generate SNP data for organisms for which destructive sampling is not permitted.
Year: 2021 PMID: 34031525 PMCID: PMC8144384 DOI: 10.1038/s41598-021-90325-0
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Measures of genetic diversity and inbreeding coefficients for populations grouped by river, highland, and species.
| Species/highland | River | Size | Na | Neff | Ho | He | FIS | FST |
|---|---|---|---|---|---|---|---|---|
| Eastern | Clinch | 8 | 1.334 | 1.193 | 0.107 | 0.131 | 0.178 | |
| | Green | 10 | 1.409 | 1.221 | 0.121 | 0.144 | 0.162 | |
| | Licking | 24 | 1.462 | 1.225 | 0.120 | 0.141 | 0.153 | |
| | Salt | 9 | 1.393 | 1.215 | 0.122 | 0.142 | 0.142 | |
| | All (Mean) | 51 | 1.400 | 1.214 | 0.118 | 0.140 | 0.159 | 0.013 |
| Ozark | Black | 28 | 1.547 | 1.252 | 0.126 | 0.161 | 0.218 | |
| | Spring | 27 | 1.529 | 1.240 | 0.123 | 0.152 | 0.186 | |
| | St. Francis | 16 | 1.447 | 1.219 | 0.117 | 0.140 | 0.169 | |
| | All (Mean) | 71 | 1.508 | 1.237 | 0.122 | 0.151 | 0.191 | 0.073 |
| Ouachita | Caddo | 5 | 1.245 | 1.155 | 0.097 | 0.104 | 0.064 | |
| | Ouachita | 6 | 1.273 | 1.164 | 0.096 | 0.112 | 0.145 | |
| | Saline | 13 | 1.288 | 1.155 | 0.076 | 0.101 | 0.250 | |
| | All (Mean) | 24 | 1.269 | 1.158 | 0.090 | 0.106 | 0.153 | 0.143 |
| Dromus | Clinch | 8 | 1.131 | 1.076 | 0.031 | 0.051 | 0.393 | |
A total of 154 individuals were included in this analysis; two geographically isolated individuals (Ozark_105 and Ozark_107) were excluded.
Na, mean number of alleles; Neff, effective number of alleles (the number of alleles in a population, weighted by their frequencies); Ho, observed heterozygosity; He, expected heterozygosity assuming Hardy–Weinberg equilibrium; FIS, inbreeding coefficient; FST, uncorrected for missing data.
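The quantities defined above can be illustrated for a single biallelic SNP. The following is a minimal sketch, not the authors' pipeline: the function name `snp_diversity` and the genotype encoding (0, 1 or 2 copies of the alternate allele) are assumptions made for illustration, and the textbook formulas He = 2pq, Neff = 1/(p² + q²) and FIS = 1 − Ho/He are used without any sample-size correction. The tabulated values are summaries over thousands of loci, so this sketch is not expected to reproduce them.

```python
from typing import Dict, Sequence


def snp_diversity(genotypes: Sequence[int]) -> Dict[str, float]:
    """Per-locus diversity statistics for one biallelic SNP.

    genotypes: alternate-allele count per individual (0, 1 or 2).
    This encoding is an assumption made for the example.
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("need at least one genotype")
    p = sum(genotypes) / (2 * n)          # alternate-allele frequency
    q = 1.0 - p                           # reference-allele frequency
    na = float((p > 0) + (q > 0))         # number of alleles observed
    ho = sum(1 for g in genotypes if g == 1) / n   # observed heterozygosity
    he = 2 * p * q                        # expected under Hardy-Weinberg
    neff = 1.0 / (p * p + q * q)          # effective number of alleles
    fis = 1.0 - ho / he if he > 0 else float("nan")
    return {"Na": na, "Neff": neff, "Ho": ho, "He": he, "FIS": fis}


# Hypothetical genotypes for four individuals at one SNP.
stats = snp_diversity([0, 1, 1, 2])
```

A positive FIS from this formula indicates fewer heterozygotes than Hardy–Weinberg proportions predict, which is the direction seen in every row of the table.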
Pairwise FST values among 10 rivers throughout the distribution range of the genus Cyprogenia, based on 7243 SNP loci.
| Highland | River | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| Clinch | Green | Licking | Salt | Black | Spring | St.Francis | Caddo | Ouachita | Saline | ||
| Eastern | Clinch | 0.026 | 0.019 | ||||||||
| Green | 0.051 | 0.003 | 0.006 | ||||||||
| Licking | 0.007 | 0.005 | |||||||||
| Salt | 0.059 | 0.012 | 0.012 | ||||||||
| Ozark | Black | ||||||||||
| Spring | |||||||||||
| St.Francis | |||||||||||
| Ouachita | Caddo | 0.007 | 0.036 | ||||||||
| Ouachita | 0.046 | 0.030 | |||||||||
| Saline | |||||||||||
FST values from raw data are below the diagonal and FST values from corrected data are above the diagonal.
A total of 146 individuals were included in this analysis; two geographically isolated individuals (Ozark_105 and Ozark_107) were excluded. Significance was tested by genic differentiation for each population pair using an exact G test with default Markov chain parameters (dememorisation: 10,000; batches: 100; iterations per batch: 5000). Bold italic type indicates a highly significant FST value. Corrected data are data in which missing values were replaced with randomly drawn alleles based on the overall allele frequencies. The significance level was adjusted for multiple testing using the Bonferroni correction.
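The replacement of missing data described above can be sketched for a single locus. This is an illustration under stated assumptions, not the software the authors used: the function name `impute_missing`, the encoding of alleles as strings with `None` for a missing call, and the fixed random seed are all hypothetical.

```python
import random
from collections import Counter
from typing import List, Optional


def impute_missing(alleles: List[Optional[str]], seed: int = 0) -> List[str]:
    """Fill missing allele calls at one locus.

    Each missing call (None) is replaced by an allele drawn at random,
    with probability proportional to its overall observed frequency.
    """
    observed = [a for a in alleles if a is not None]
    if not observed:
        raise ValueError("locus has no observed alleles to draw from")
    counts = Counter(observed)
    names = list(counts)
    weights = [counts[a] for a in names]   # overall allele counts as weights
    rng = random.Random(seed)              # seeded only for reproducibility
    return [
        a if a is not None else rng.choices(names, weights=weights)[0]
        for a in alleles
    ]


# Hypothetical locus: ten allele copies, two of them missing.
locus = ["A", "A", "G", None, "A", "G", "A", None, "A", "A"]
filled = impute_missing(locus)
```

Because the draws follow the pooled allele frequencies rather than population-specific ones, filling gaps this way tends to pull populations toward each other, which is one reason a table might report both raw and corrected estimates.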
Summary of hierarchical AMOVA analysis for Cyprogenia from the highland regions based on 7243 SNP loci.
| Source of variation | %var | F-value | c.i.2.5% | c.i.97.5% | P-value | |
|---|---|---|---|---|---|---|
| (Ozark, Ouachita, and Eastern) | ||||||
| Within individual | 0.668 | 0.332 | 0.323 | 0.341 | – | |
| Among individual | 0.100 | 0.131 | 0.125 | 0.136 | 0.001 | |
| Among population | 0.016 | 0.020 | 0.019 | 0.022 | 0.001 | |
| Among highlands | 0.216 | 0.216 | 0.209 | 0.223 | 0.001 | |
| Among population without grouping | – | 0.179 | 0.174 | 0.185 | 0.001 | |
| (Ozark and Ouachita) | ||||||
| Within individual | 0.790 | 0.210 | 0.198 | 0.222 | – | |
| Among individual | 0.100 | 0.113 | 0.104 | 0.121 | 0.001 | |
| Among population | 0.027 | 0.030 | 0.027 | 0.033 | 0.001 | |
| Among highlands | 0.082 | 0.082 | 0.074 | 0.091 | 0.107 | |
| Among population without grouping | – | 0.069 | 0.064 | 0.075 | 0.001 | |
A total of 146 individuals were included in this analysis; two geographically isolated individuals (Ozark_105 and Ozark_107) were excluded. AMOVA was conducted separately for each hierarchical grouping. The three-highland grouping corresponds to the Ozark, Ouachita, and Eastern highland regions, which together contain two Cyprogenia species. The two-highland grouping corresponds to the Ozark and Ouachita highlands, which contain the same species, C. aberti. Missing data were replaced with randomly drawn alleles based on the overall allele frequencies. Significance was tested using 999 permutations.
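The AMOVA rows can be checked for internal consistency with the standard identity for nested F-statistics, (1 − F_total) = (1 − F_among individuals)(1 − F_among populations)(1 − F_among highlands). The sketch below assumes that the value listed immediately after %var in each row is the F-statistic for that level; this mapping is an inference from the confidence intervals, not something stated in the table, and the function name is hypothetical.

```python
def within_individual_fraction(f_individuals: float,
                               f_populations: float,
                               f_highlands: float) -> float:
    """Fraction of variance left within individuals.

    Applies the nested F-statistic identity
    (1 - F_total) = (1 - F_ind) * (1 - F_pop) * (1 - F_highland).
    """
    return (1 - f_individuals) * (1 - f_populations) * (1 - f_highlands)


# Values read from the table (assumed to be the F-statistics per level).
three_highlands = within_individual_fraction(0.131, 0.020, 0.216)
two_highlands = within_individual_fraction(0.113, 0.030, 0.082)

# Reported within-individual variance fractions for comparison.
reported = {"three_highlands": 0.668, "two_highlands": 0.790}
```

Under that assumption the products agree with the reported within-individual fractions to three decimal places, and the complementary totals (1 − 0.668 and 1 − 0.790) match the values 0.332 and 0.210 in the Within individual rows.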
Figure 1. Results of the Principal Component Analyses (PCA) and Discriminant Analysis of Principal Components (DAPC) in R. In this figure, genetic diversity is represented in two ways: by distance (further apart = more genetically divergent) and by color (more divergent colors = more genetically divergent). (A) PCA for all 156 samples representing three freshwater mussel species (Cyprogenia + Dromus), (B) PCA for 148 samples representing the genus Cyprogenia, (C) PCA for 97 samples representing Cyprogenia aberti, (D) DAPC for 97 samples representing Cyprogenia aberti (Number of PCA axes retained = 50, and number of PCA axes retained = 5).
Figure 2. Geographic locations and pie charts of membership of each sampled population inferred by STRUCTURE analysis of Cyprogenia. The STRUCTURE plots show the posterior probability for individual assignments of samples to different genetic clusters based on the results of re-analysis of the original K = 3 clusters (Fig. S1). The optimal number of genetic clusters within Highlands was K = 2 for Ouachita, K = 3 for Ozark, and K = 2 for Eastern. Pie charts indicate proportions of membership of each sampled population to clusters inferred by STRUCTURE analysis for each Highland (see text for details). Circle size of pie charts is proportional to sample size. Sample sites for Cyprogenia aberti are marked as triangles and Cyprogenia stegaria as circles. Base map reproduced from Chong et al.[23].
Figure 3. Phylogenetic relationships and population genetic composition of freshwater mussel species. The maximum likelihood (ML) tree (left) and Bayesian tree (right) were constructed from 4395 informative sites in concatenated sequences of 9673 SNPs (see text for details). Bootstrap support values (ML) and posterior probabilities (Bayesian) are given at the nodes of major clades. The STRUCTURE plots show the posterior probability for individual assignments of samples to different genetic clusters. The main plot (left) shows the results for the optimal number of genetic clusters (K = 3) for 148 Cyprogenia specimens. The three smaller plots to the right show the results of re-analyses of samples from each Highland. The optimal number of genetic clusters was K = 2 for Ouachita, K = 3 for Ozark, and K = 2 for Eastern. Colors of clusters in the phylogenetic trees correspond to colors in the STRUCTURE plots and Fig. 2. Red triangles: Clinch river, TN (Eastern15 and Eastern18 in Table S1). Black triangles: Spring and Fall rivers, KS (Ozark105 and Ozark107 in Table S1).
Figure 4. Species trees generated with the Maximum clade credibility (MCC) and Median heights options in TreeAnnotator and the Transform branches (proportional) option in FigTree. Posterior probabilities are shown on the branch of each node. A total of 40 Cyprogenia samples from 10 sampling sites (four samples per site) were aligned. Individuals were chosen either by selecting the samples with the highest coverage (smallest number of missing SNPs) (A) or by random sampling (B), using a custom Python script.
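The two subsampling strategies named in the Figure 4 legend can be sketched as follows. The authors' custom script is not reproduced in this record, so everything here is hypothetical: the function name `select_samples`, the input layout of (sample ID, site, number of missing SNPs) tuples, the tie-breaking by sample ID, and the fixed seed.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple

Sample = Tuple[str, str, int]  # (sample_id, site, n_missing_snps)


def select_samples(samples: List[Sample], per_site: int = 4,
                   strategy: str = "coverage",
                   seed: int = 0) -> Dict[str, List[str]]:
    """Pick `per_site` individuals from every sampling site.

    strategy="coverage": keep the individuals with the fewest missing SNPs.
    strategy="random":   keep a random subset of individuals.
    """
    by_site: Dict[str, List[Tuple[str, int]]] = defaultdict(list)
    for sample_id, site, n_missing in samples:
        by_site[site].append((sample_id, n_missing))

    rng = random.Random(seed)
    chosen: Dict[str, List[str]] = {}
    for site, members in by_site.items():
        if strategy == "coverage":
            # Fewest missing SNPs first; sample ID breaks ties deterministically.
            picked = sorted(members, key=lambda m: (m[1], m[0]))[:per_site]
        elif strategy == "random":
            picked = rng.sample(members, min(per_site, len(members)))
        else:
            raise ValueError(f"unknown strategy: {strategy!r}")
        chosen[site] = [sample_id for sample_id, _ in picked]
    return chosen
```

Running both strategies on the same input and comparing the resulting species trees mirrors the A versus B comparison in the figure, which tests whether the topology depends on which individuals represent each site.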
Path sampling results for the six species delimitation models for Cyprogenia species shown in Fig. 2.
| Model | Classification | No. species | MLE | Rank | BF | Detailed classification |
|---|---|---|---|---|---|---|
| 1 | Current taxonomy | 2 | − 55,853 | 4 | – | Two |
| 2 | Split | 3 | − 53,201 | 1 | − 5304 | |
| 3 | Mix | 3 | − 56,827 | 6 | 1949 | Half of Ozark and Ouachita Highlands were intermixed |
| 4 | Reassign | 3 | − 56,323 | 5 | 940 | Black from Ozark was moved to Ouachita, and Saline from Ouachita was moved to Ozark |
| 5 | Reassign | 3 | − 54,742 | 2 | − 2222 | Saline from Ouachita was moved to Ozark |
| 6 | Reassign | 3 | − 55,285 | 3 | − 1136 | Black from Ozark was moved to Ouachita |
All Bayes factor (BF) calculations are made against the current taxonomy model (Model 1). Therefore, positive BF values indicate support for the current taxonomy model, and negative BF values indicate support for the alternative model. The BF scale is as follows: 0 < BF < 2 is not worth more than a bare mention, 2 < BF < 6 is positive evidence, 6 < BF < 10 is strong support, and BF > 10 is decisive. Current and alternative species delimitation models were analyzed with 36 path sampling steps, an MCMC sample length of 100,000 for each path sampling step, alpha of 0.3, burnInPercentage = 10, and preBurnin of 10,000.
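The BF column appears consistent with BF = 2 × (MLE of Model 1 − MLE of the alternative model). That formula is not stated in this record, so the sketch below treats it as an assumption and simply recomputes the column and the ranks from the tabulated marginal likelihood estimates; the function and variable names are hypothetical.

```python
from typing import Dict

# Marginal likelihood estimates (MLE) copied from the table, keyed by model.
mle: Dict[int, float] = {
    1: -55853, 2: -53201, 3: -56827,
    4: -56323, 5: -54742, 6: -55285,
}


def bayes_factors(mle: Dict[int, float], reference: int = 1) -> Dict[int, float]:
    """BF of each alternative model against the reference model.

    Assumed formula: BF = 2 * (MLE_reference - MLE_alternative), so a
    negative BF favors the alternative over the reference.
    """
    ref = mle[reference]
    return {m: 2 * (ref - value) for m, value in mle.items() if m != reference}


bf = bayes_factors(mle)

# Rank models from highest to lowest marginal likelihood (1 = best).
ordered = sorted(mle, key=lambda m: mle[m], reverse=True)
rank = {model: position for position, model in enumerate(ordered, start=1)}
```

With the rounded MLE values, Model 3 comes out at 1948 rather than the tabulated 1949, a difference that rounding of the published estimates would explain. The other four BF values and all six ranks match the table.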
Best-fit demographic model and posterior distributions of the current effective population size (Ne) and generation time (T) (median and 95% credible interval) for each Cyprogenia sample site, estimated using ABC simulation implemented in DIYABC.
| Highlands | Location/sample | Best scenario direct (logistic) | Parameter | Median | Quantile 5% | Quantile 95% |
|---|---|---|---|---|---|---|
| Ozark | Black | DEC (DEC) | | 4.70 × 10² | 1.62 × 10² | 9.73 × 10² |
| | | | | 1.97 × 10² | 1.13 × 10² | 4.19 × 10² |
| | Spring | DEC (DEC) | | 7.78 × 10² | 2.62 × 10² | 1.60 × 10³ |
| | | | | 2.35 × 10² | 1.19 × 10² | 5.63 × 10² |
| | St. Francis | DEC (DEC) | | 2.00 × 10³ | 6.45 × 10² | 3.93 × 10³ |
| | | | | 1.32 × 10³ | 3.16 × 10² | 3.24 × 10³ |
| Ouachita | Ouachita | DEC (DEC) | | 4.26 × 10³ | 9.88 × 10² | 1.09 × 10⁴ |
| | | | | 2.33 × 10³ | 4.46 × 10² | 4.26 × 10³ |
| | Caddo | DEC (DEC) | | 3.79 × 10³ | 8.56 × 10² | 1.07 × 10⁴ |
| | | | | 2.30 × 10³ | 4.45 × 10² | 4.24 × 10³ |
| | Saline | DEC (DEC) | | 3.29 × 10³ | 9.04 × 10² | 7.31 × 10³ |
| | | | | 1.29 × 10³ | 2.39 × 10² | 3.85 × 10³ |
| Eastern | Salt | INCDEC (INCDEC) | | 1.55 × 10³ | 3.97 × 10² | 4.26 × 10³ |
| | | | | 3.82 × 10² | 1.30 × 10² | 1.52 × 10³ |
| | | | | 3.62 × 10³ | 2.42 × 10² | 4.42 × 10³ |
| | Green | INCDEC (INCDEC) | | 3.78 × 10² | 1.10 × 10² | 1.29 × 10³ |
| | | | | 1.60 × 10² | 1.06 × 10² | 6.00 × 10² |
| | | | | 3.90 × 10³ | 2.54 × 10³ | 4.46 × 10³ |
| | Clinch | DEC (INCDEC) | | 2.35 × 10³ | 6.93 × 10² | 5.16 × 10³ |
| | | | | 6.50 × 10² | 1.63 × 10² | 2.89 × 10³ |
| | Licking | INCDEC (INCDEC) | | 2.23 × 10² | 8.51 × 10 | 5.49 × 10² |
| | | | | 1.22 × 10² | 1.03 × 10² | 2.06 × 10² |
| | | | | 4.03 × 10³ | 2.62 × 10³ | 4.47 × 10³ |
Five demographic models proposed by Cabrera and Palsbøll[81] were evaluated. Model 1: CON (constant population size), Model 2: DEC (a single instantaneous decrease in population size), Model 3: INC (a single instantaneous increase in population size), Model 4: INCDEC (a single instantaneous increase followed by a single instantaneous decrease in population size), Model 5: DECINC (a single instantaneous decrease followed by a single instantaneous increase in population size). Time is in number of generations, assuming a generation time of 5 years for Cyprogenia; tt denotes a unique event after the LGM, and t1 and t2 denote events after and before the LGM, respectively. Detailed prior parameterization for the five demographic models is provided in the “Supplementary Information”. No mutation model parameterization was required for SNPs.