Yi-Chiang Hsieh, Chung-Te Chang, Jeng-Der Chung, Shih-Ying Hwang.
Abstract
Demographic events are important in shaping population genetic structure, and exon variation can play a role in adaptive divergence. Twelve nuclear genes were used to investigate the species-level phylogeography of Rhododendron oldhamii, to test whether the average GC content of coding sites and of third codon positions differs from that of the surrounding non-coding regions, and to test for exon variants associated with environmental variables. Spatial expansion was suggested by the R2 index of the aligned intron sequences of all genes for the regional samples, and by the sum of squared deviations statistic of the aligned intron sequences of each gene individually and of all genes for the regional and pooled samples. The level of genetic differentiation differed significantly between regional samples. Across 94 sequences of the 12 genes, the average GC content at third codon positions of coding sequences was significantly lower than that of the surrounding non-coding regions in some genes and significantly higher in others. We found seven exon variants strongly associated with environmental variables. Our results demonstrate spatial expansion of R. oldhamii in the late Pleistocene and suggest that the optimal third codon position could end in A or T rather than G or C as frequent alleles, which could have been important for adaptive divergence in R. oldhamii.
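The GC comparison described in the abstract rests on three per-sequence quantities: GC content of coding sites, GC content at third codon positions, and GC content of the surrounding non-coding region. A minimal Python sketch of how such fractions could be computed is given below. The function names and the toy sequences are hypothetical and are not taken from the study. The sketch counts every third codon position and skips ambiguous bases, which may differ from the exact definition used by the authors.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G or C among unambiguous bases (A, C, G, T)."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence has no unambiguous bases")
    return sum(b in "GC" for b in bases) / len(bases)


def gc3_fraction(cds: str) -> float:
    """GC fraction at third codon positions of an in-frame coding sequence."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    # Slice out positions 3, 6, 9, ... (0-based indices 2, 5, 8, ...).
    return gc_fraction(cds[2::3])


# Hypothetical toy sequences, not data from the study.
exon = "GCTGAAACTGGA"    # codons: GCT GAA ACT GGA
intron = "GTAAGTTTTCAG"

gce = gc_fraction(exon)     # GC content of coding sites
gc3 = gc3_fraction(exon)    # GC content at third codon positions
gci = gc_fraction(intron)   # GC content of the non-coding region
print(f"GCE={gce:.3f} GC3={gc3:.3f} GCI={gci:.3f}")
```

In this toy case every codon ends in A or T, so the third-position GC fraction falls below the intron value, which is the direction of difference the abstract highlights for some genes.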
Year: 2020 PMID: 33028947 PMCID: PMC7542430 DOI: 10.1038/s41598-020-73748-z
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Sample locations of the 18 populations of Rhododendron oldhamii distributed in Taiwan. The 18 populations were assigned to four geographic regions according to EST-SSR clustering (Table 1)[4]. The four geographic groups were the north group (populations BL, EGS, HYS, STS, TGK, TKL, and WLJ), the central group (population WL), the south group (populations CH, CJ, CY, HS, LLK, LS, RL, and WS), and the southeast group (populations WR and YP). See Table 1 for population code abbreviations. We generated the map using ArcGIS v.10.6. The ASTER GDEM (Global Digital Elevation Model; https://asterweb.jpl.nasa.gov/gdem.asp) was used as the elevational background.
Table 1. Population information and number of haplotypes, nucleotide diversity, and inbreeding coefficients (FIS) based on the total aligned intron sequences of 12 nuclear genes for the 18 Rhododendron oldhamii populations.
| Population (code) | Region | N | Locality (°E/°N) | Nh | π | θ | FIS |
|---|---|---|---|---|---|---|---|
| Baling (BL) | North | 4 | 121.38/24.68 | 8 | 0.00596 (0.00067) | 0.00554 (0.00243) | |
| Ergirshan (EGS) | North | 1 | 121.62/24.97 | 2 | 0.00785 (0.00392) | 0.00785 (0.0056) | – |
| Huoyanshan (HYS) | North | 5 | 120.73/24.37 | 10 | 0.00549 (0.00056) | 0.00530 (0.00218) | |
| Shihtoushan (STS) | North | 1 | 121.48/24.89 | 2 | 0.00563 (0.00281) | 0.00563 (0.00403) | – |
| Tsaigongkeng (TGK) | North | 1 | 121.52/25.19 | 2 | 0.00370 (0.00185) | 0.00370 (0.00267) | – |
| Tsanguanliao (TKL) | North | 1 | 121.86/25.06 | 2 | 0.00399 (0.002) | 0.00399 (0.00288) | – |
| Wuliaojian (WLJ) | North | 1 | 121.37/24.88 | 2 | 0.00429 (0.00215) | 0.00429 (0.00309) | – |
| Wuling (WL) | Central | 8 | 121.31/24.35 | 16 | 0.00668 (0.001) | 0.00699 (0.00253) | |
| Chungheng (CH) | South | 1 | 121.15/24.03 | 2 | 0.00341 (0.00171) | 0.00341 (0.00246) | – |
| Chingjing (CJ) | South | 1 | 121.16/24.06 | 2 | 0.00163 (0.00081) | 0.00163 (0.0012) | – |
| Chiayang (CY) | South | 3 | 121.21/24.26 | 6 | 0.00681 (0.00096) | 0.00650 (0.0031) | |
| Hueisun (HS) | South | 2 | 121.00/24.08 | 4 | 0.00655 (0.00174) | 0.00671 (0.00366) | |
| Leleku (LLK) | South | 4 | 120.93/23.56 | 8 | 0.00613 (0.00069) | 0.00555 (0.00243) | −0.026 (−0.095, 0.044) |
| Lushan (LS) | South | 2 | 121.19/24.02 | 4 | 0.00733 (0.00185) | 0.00729 (0.00397) | |
| Renluen (RL) | South | 1 | 120.90/23.73 | 2 | 0.00237 (0.00119) | 0.00237 (0.00173) | – |
| Wushe (WS) | South | 1 | 121.12/24.03 | 2 | 0.00756 (0.00378) | 0.00756 (0.0054) | – |
| Wuru (WR) | Southeast | 5 | 121.04/23.17 | 10 | 0.00506 (0.00071) | 0.00507 (0.00209) | |
| Yeinping (YP) | Southeast | 5 | 121.03/22.93 | 10 | 0.00578 (0.00074) | 0.00561 (0.00231) | |
| North | | 14 | | 28 | 0.00594 (0.00032) | 0.00626 (0.002) | |
| Central | | 8 | | 16 | 0.00659 (0.001) | 0.00695 (0.00252) | |
| South | | 15 | | 30 | 0.00605 (0.00034) | 0.00552 (0.00174) | |
| Southeast | | 10 | | 20 | 0.00586 (0.00043) | 0.00606 (0.00209) | |
| Total | | 47 | | 94 | 0.00659 (0.00027) | 0.00945 (0.00235) | |
N, sample size; Nh, number of haplotypes; π, the average number of pairwise nucleotide differences per site; θ, the average nucleotide diversity based on segregating sites; FIS, inbreeding coefficient.
FIS values whose intervals do not bracket zero are in bold.
Classification of populations into different geographic regions was based on the results of a previous study[6].
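The two diversity measures defined in the notes above can be illustrated with a short Python sketch. It computes per-site π as the mean number of pairwise differences divided by alignment length, and Watterson's θ as the number of segregating sites divided by the harmonic number a1 and by alignment length. The function names and the toy alignment are hypothetical, and the sketch assumes equal-length sequences with no gaps or missing data, which real intron alignments may not satisfy.

```python
from itertools import combinations


def segregating_sites(seqs):
    """Number of alignment columns with more than one state."""
    return sum(len(set(column)) > 1 for column in zip(*seqs))


def nucleotide_diversity(seqs):
    """Per-site pi: mean pairwise differences divided by alignment length."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / len(pairs) / length


def watterson_theta(seqs):
    """Per-site Watterson's theta: S / a1 / L, with a1 = sum of 1/i for i < n."""
    n = len(seqs)
    a1 = sum(1.0 / i for i in range(1, n))
    return segregating_sites(seqs) / a1 / len(seqs[0])


# Hypothetical toy alignment (4 sequences, 8 sites), not data from the study.
alignment = ["ACGTACGT", "ACGTACGA", "ACGAACGT", "ACGTACGT"]
pi = nucleotide_diversity(alignment)
theta_w = watterson_theta(alignment)
print(f"pi={pi:.5f} theta_w={theta_w:.5f}")
```

When π is well below θ, as for the pooled sample in the table, the excess of low-frequency variants is the pattern usually associated with expansion.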
Table 2. Summary of nucleotide polymorphism and neutrality tests based on the aligned intron sequences of individual genes and the total aligned intron sequences for Rhododendron oldhamii.
| Locus | Intron aligned length (bp) | Rm | S | π | θw | Hd | Nh | D | D* | F* | R2 | SSD |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 486 | 1 | 51 | 0.01059 | 0.03925 | 0.669 | 14 | − | 1.348 | −0.191 | | 0.00209 |
| | 695 | 2 | 24 | 0.00137 | 0.00689 | 0.427 | 16 | − | − | − | | 0.00143 |
| | 727 | 0 | 13 | 0.00117 | 0.00351 | 0.641 | 14 | − | −1.430 | −1.863 | | 0.00207 |
| | 356 | 10 | 26 | 0.01656 | 0.01456 | 0.732 | 22 | 0.413 | 0.777 | 0.763 | 0.110 | 0.01913 |
| | 859 | 1 | 26 | 0.00123 | 0.00593 | 0.578 | 21 | − | −1.648 | − | | 0.00141 |
| | 633 | 4 | 41 | 0.01673 | 0.01270 | 0.830 | 28 | 0.999 | −1.256 | −0.432 | 0.126 | 0.02930 |
| | 332 | 1 | 11 | 0.00507 | 0.00672 | 0.713 | 14 | −0.645 | −1.197 | −1.192 | 0.072 | 0.02205 |
| | 280 | 2 | 12 | 0.00372 | 0.00850 | 0.452 | 10 | −1.506 | 0.230 | −0.457 | 0.043 | 0.00329 |
| | 830 | 5 | 32 | 0.00378 | 0.00763 | 0.875 | 27 | −1.558 | −1.454 | −1.790 | | 0.00111 |
| | 538 | 8 | 27 | 0.01046 | 0.01021 | 0.905 | 29 | −0.140 | −0.090 | −0.131 | 0.098 | 0.00823 |
| | 503 | 1 | 20 | 0.00527 | 0.00779 | 0.782 | 15 | −0.943 | 0.418 | −0.107 | 0.065 | 0.01036 |
| | 550 | 7 | 30 | 0.01287 | 0.01076 | 0.712 | 26 | 0.172 | −0.084 | 0.020 | 0.115 | 0.01506 |
| North | | 24 | 164 | | | 1.000 | 28 | −0.265 | −0.280 | −0.325 | 0.112 | 0.00330 |
| Central | | 11 | 150 | | | 1.000 | 16 | −0.359 | 0.203 | 0.049 | 0.127 | 0.01023 |
| South | | 34 | 147 | | | 1.000 | 30 | 0.343 | 0.183 | 0.280 | 0.131 | 0.00381 |
| Southeast | | 14 | 145 | | | 1.000 | 20 | −0.134 | −0.041 | −0.081 | 0.122 | 0.00616 |
| Total | 6798 | 51 | 313 | | | 1.000 | 94 | −1.094 | −0.750 | −1.079 | | 0.00123 |
Rm, minimum number of recombination events; S, number of segregating sites; π, the average number of pairwise nucleotide differences per site; θw, the average nucleotide diversity based on segregating sites; Hd, haplotype diversity; Nh, number of haplotypes; D, Tajima's D; D*, Fu & Li's D*; F*, Fu & Li's F*; SSD, sum of squared deviations.
P values of neutrality tests < 0.05 are in bold.
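Haplotype diversity (Hd) in the table can be read through the standard estimator Hd = n/(n − 1) × (1 − Σ p_i²), where p_i is the frequency of haplotype i among n sequences. The Python sketch below is an illustration under that assumption. The function name and the inputs are hypothetical. It shows why Hd equals 1.000 whenever the number of haplotypes equals the number of sequences, as in the regional rows.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Hd = n / (n - 1) * (1 - sum of squared haplotype frequencies)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical inputs: 28 sequences that are all distinct haplotypes,
# matching the counts (not the sequences) of the North row.
all_unique = [f"hap{i}" for i in range(28)]
hd_unique = haplotype_diversity(all_unique)

# Hypothetical inputs: two haplotypes at equal frequency among four sequences.
hd_two = haplotype_diversity(["A", "A", "B", "B"])
print(f"{hd_unique:.3f} {hd_two:.3f}")
```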
Table 3. Mean GC contents for GCI, GCE, and GC3S across 94 sequences of the 12 nuclear genes.
| Locus | GCI | GCE | GC3S |
|---|---|---|---|
| | 0.374 | 0.434* | 0.315+ |
| | 0.308 | 0.519* | 0.385* |
| | 0.353 | 0.429* | 0.354* |
| | 0.351 | 0.479* | 0.632* |
| | 0.384 | 0.361+ | 0.364+ |
| | 0.357 | 0.404* | 0.358* |
| | 0.351 | 0.444* | 0.319+ |
| | 0.357 | 0.554* | 0.648* |
| | 0.349 | 0.500* | 0.308+ |
| | 0.340 | 0.457* | 0.429* |
| | 0.350 | 0.396* | 0.337+ |
| | 0.401 | 0.457* | 0.574* |
| Average | 0.356 | 0.453 | 0.419 |
GCI, the average GC content of non-coding region.
GCE, the average GC content of coding sites.
GC3S, the average GC content at third positions of codons.
* and + represent, respectively, significantly higher and lower GC content based on Wilcoxon paired test (P < 0.001). Comparisons between GCE and GCI and between GC3S and GCI were performed.
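To make the paired test concrete, the following Python sketch implements an exact two-sided Wilcoxon signed-rank test using only the standard library. It is an illustration, not a reproduction of the reported tests: those compared GC contents across the 94 sequences of each gene, whereas this example pairs the 12 per-locus means of GCI and GCE listed in the table. The function name is hypothetical, and the sketch assumes no zero differences and no tied absolute differences.

```python
def wilcoxon_signed_rank_exact(x, y, decimals=3):
    """Exact two-sided Wilcoxon signed-rank test for paired samples.

    Returns (W_plus, W_minus, p_value). Assumes no zero differences and
    no tied absolute differences after rounding to `decimals` places.
    """
    diffs = [round(b - a, decimals) for a, b in zip(x, y)]
    if any(d == 0 for d in diffs):
        raise ValueError("zero differences are not handled in this sketch")
    abs_sorted = sorted(abs(d) for d in diffs)
    if len(set(abs_sorted)) != len(abs_sorted):
        raise ValueError("tied absolute differences are not handled")
    rank = {value: i + 1 for i, value in enumerate(abs_sorted)}
    w_plus = sum(rank[abs(d)] for d in diffs if d > 0)
    w_minus = sum(rank[abs(d)] for d in diffs if d < 0)

    # Null distribution: count subsets of ranks {1..n} by their sum.
    n = len(diffs)
    max_sum = n * (n + 1) // 2
    counts = [0] * (max_sum + 1)
    counts[0] = 1
    for r in range(1, n + 1):
        for s in range(max_sum, r - 1, -1):
            counts[s] += counts[s - r]
    w = min(w_plus, w_minus)
    p_value = min(1.0, 2 * sum(counts[: w + 1]) / 2 ** n)
    return w_plus, w_minus, p_value


# Per-locus means copied from the GCI and GCE columns of the table.
gci_means = [0.374, 0.308, 0.353, 0.351, 0.384, 0.357,
             0.351, 0.357, 0.349, 0.340, 0.350, 0.401]
gce_means = [0.434, 0.519, 0.429, 0.479, 0.361, 0.404,
             0.444, 0.554, 0.500, 0.457, 0.396, 0.457]
w_plus, w_minus, p_value = wilcoxon_signed_rank_exact(gci_means, gce_means)
print(w_plus, w_minus, p_value)
```

Only one locus has a mean GCE below its mean GCI, and that difference is also the smallest in absolute value, so nearly all of the rank sum falls on the positive side.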
Table 4. The percentage of exon variation explained by non-geographically-structured environmental variables [a], shared (geographically-structured) environmental variables [b], pure geographic factors [c], and undetermined component [d] analyzed based on the eight retained environmental variables.
| | Variation (adjusted R²) | F | P |
|---|---|---|---|
| Environment [a] | 0.05648 | 1.3536 | 0.030 |
| Environment + Geography [b] | 0.07694 | – | – |
| Geography [c] | − 0.01211 | 0.7381 | 0.804 |
| [a + b + c] | 0.12132 | 1.6351 | 0.003 |
| Residuals [d] | 0.87868 | – | – |
Proportions of explained variation were obtained from variation partitioning by redundancy analysis. F and P values are specified wherever applicable.
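The fractions [a] to [d] follow from simple arithmetic on three adjusted R² values: environment alone ([a + b]), geography alone ([b + c]), and both together ([a + b + c]). The Python sketch below shows that arithmetic. The function name is hypothetical, and the two single-set inputs are back-calculated here from the table's fractions as an assumption, because the table does not report them directly.

```python
def partition_variation(r2_env, r2_geo, r2_env_geo):
    """Split explained variation into fractions a, b, c and residual d.

    r2_env     = [a + b], environment model
    r2_geo     = [b + c], geography model
    r2_env_geo = [a + b + c], combined model
    """
    a = r2_env_geo - r2_geo           # pure environment
    c = r2_env_geo - r2_env           # pure geography
    b = r2_env + r2_geo - r2_env_geo  # shared fraction
    d = 1.0 - r2_env_geo              # unexplained
    return {"a": a, "b": b, "c": c, "d": d}


# Assumed inputs, back-calculated from the table: [a + b] = 0.05648 + 0.07694
# and [b + c] = 0.07694 - 0.01211; [a + b + c] is the reported 0.12132.
fractions = partition_variation(r2_env=0.13342, r2_geo=0.06483,
                                r2_env_geo=0.12132)
print({k: round(v, 5) for k, v in fractions.items()})
```

The recovered fractions agree with the table to within 0.00001, the rounding gap between the reported [a + b + c] and the sum of the three reported fractions. A negative [c] is possible because adjusted R² values are not bounded below by zero.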
Table 5. Exon variant alleles strongly correlated with environmental variables based on a generalized linear model (GLM) and a generalized linear mixed-effects model (GLMM).
| Exon variation | Frequent to rare allele change | Associated environmental variables | GLM | | GLMM | |
|---|---|---|---|---|---|---|
| | | | | Estimate | | Estimate |
| | GC (Ala → Ala) (S) | BIO1 | −2.661 | −0.037*,**,*** | −2.662 | −0.037*,** |
| | | BIO7 | −2.449 | −0.071*,** | | |
| | | Slope | 2.37 | 0.089* | 2.284 | 0.092* |
| | | WSmean | −2.305 | −1.427*,** | −2.31 | −1.513* |
| | (Ala → Ser) (Ns) | BIO1 | −2.063 | −0.026* | | |
| | | BIO7 | −2.826 | −0.087*,**,*** | | |
| | | WSmean | −2.085 | −1.184* | −2.047 | −1.314* |
| | C (Arg → Gln) (Ns) | EVI | −2.692 | −20.846*,**,*** | | |
| | | NDVI | −1.631 | −111.91*,**,*** | | |
| | | RH | −41.34 | −0.171*,**,*** | | |
| | C (His → Arg) (Ns) | Aspect | −3.116 | −0.009*,**,*** | | |
| | | BIO7 | −2.541 | −0.208*,**,*** | | |
| | | RH | −2.903 | −2.095*,**,*** | −2.847 | −2.095*,**,*** |
| | | Slope | 2.395 | 0.257*,**,*** | 2.172 | 0.269* |
| | | WSmean | −2.131 | −3.308*,** | | |
| | CT | WSmean | −1.971 | −1.488* | −1.97 | −1.488* |
| | GT (Val → Val) (S) | Aspect | −3.116 | −0.009*,**,*** | | |
| | | BIO7 | −2.541 | −0.208*,**,*** | | |
| | | RH | −2.903 | −2.095*,**,*** | −2.847 | −2.095*,**,*** |
| | | Slope | 2.395 | 0.257*,**,*** | 2.172 | 0.269* |
| | | WSmean | −2.131 | −3.308*,** | | |
| | A (Asn → Ser) (Ns) | Aspect | −3.116 | −0.009*,**,*** | | |
| | | BIO7 | −2.541 | −0.208*,**,*** | | |
| | | RH | −2.903 | −2.095*,**,*** | −2.847 | −2.095*,**,*** |
| | | Slope | 2.395 | 0.257*,**,*** | 2.172 | 0.269* |
| | | WSmean | −2.131 | −3.308*,** | | |
BIO1 annual mean temperature, BIO7 temperature annual range, EVI enhanced vegetation index, NDVI normalized difference vegetation index, RH relative humidity, WSmean mean wind speed.
The superscript numbers in the second column represent the amino acid positions of the respective proteins in Rhododendron catawbiense.
S, synonymous substitution; Ns, nonsynonymous substitution.
*The 95% confidence interval of the value does not bracket zero.
**The 99% confidence interval of the value does not bracket zero.
***The 99.5% confidence interval of the value does not bracket zero.
Exon variable sites were coded as allelic presence ("1") and absence ("0") of the rare alleles and implemented in a generalized linear model (GLM) and a generalized linear mixed effect model (GLMM) as response variables to assess the correlations of exon variant alleles with environmental variables, with binomially distributed residuals.
The superscript numbers represent aligned exon sites for the nucleotide substitutions.
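The binomial GLM described in the notes above, with rare-allele presence coded as 1 and absence as 0, corresponds to a logistic regression of allele presence on an environmental variable. The Python sketch below fits a single-predictor logistic model by Newton's method using only the standard library. The function names and the toy data are hypothetical. It covers only the fixed-effects GLM: it omits the random effects of the GLMM and does not compute confidence intervals, so it is a sketch of the model form rather than the authors' analysis.

```python
import math


def fit_logistic(x, y, max_iter=50, tol=1e-10):
    """Fit P(y = 1) = 1 / (1 + exp(-(b0 + b1 * x))) by Newton's method."""
    b0, b1 = 0.0, 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p            # score for the intercept
            g1 += (yi - p) * xi     # score for the slope
            h00 += w                # information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < tol:
            break
    return b0, b1


def predict(b0, b1, xi):
    """Predicted probability of the rare allele at environmental value xi."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))


# Hypothetical toy data: an environmental variable and rare-allele presence.
x_env = [10, 12, 14, 16, 18, 20, 22, 24]
y_rare = [1, 1, 0, 1, 0, 1, 0, 0]
intercept, slope = fit_logistic(x_env, y_rare)
print(f"intercept={intercept:.3f} slope={slope:.3f}")
```

A negative slope, as in most rows of the table, means the predicted probability of carrying the rare allele falls as the environmental variable increases.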
Figure 2. Distributions of frequent allele frequencies of the seven exon variants strongly associated with environmental variables across the 18 Rhododendron oldhamii populations.
Figure 3. Logistic regression plots of the exon variants strongly correlated with environmental variables identified by both generalized linear and generalized linear mixed effect models presented in Table 5. Values of the y-axis represent the predicted probabilities of rare alleles of exon variants in LACS8 and SPA1 genes and numbers of the x-axis represent the values of environmental variables.