Larry J Leamy, Cheng-Ruei Lee, Qijian Song, Ibro Mujacic, Yan Luo, Charles Y Chen, Changbao Li, Susanne Kjemtrup, Bao-Hua Song.
Abstract
A fundamental goal in evolutionary biology is to understand how various evolutionary factors interact to affect the population structure of diverse species, especially those of ecological and/or agricultural importance such as wild soybean (Glycine soja). G. soja, from which domesticated soybeans (Glycine max) were derived, is widely distributed throughout diverse habitats in East Asia (Russia, Japan, Korea, and China). Here, we utilize over 39,000 single nucleotide polymorphisms genotyped in 99 ecotypes of wild soybean sampled across their native geographic range in northeast Asia, to understand population structure and the relative contribution of environment versus geography to population differentiation in this species. A STRUCTURE analysis identified four genetic groups that largely corresponded to the geographic regions of central China, northern China, Korea, and Japan, with high levels of admixture between genetic groups. A canonical correlation and redundancy analysis showed that environmental factors contributed 23.6% to population differentiation, much more than that for geographic factors (6.6%). Precipitation variables largely explained divergence of the groups along longitudinal axes, whereas temperature variables contributed more to latitudinal divergence. This study provides a foundation for further understanding of the genetic basis of climatic adaptation in this ecologically and agriculturally important species.
Keywords: Admixture; canonical correlation analysis; environment change; gene flow; natural selection; population genomics
Year: 2016 PMID: 27648247 PMCID: PMC5016653 DOI: 10.1002/ece3.2351
Source DB: PubMed Journal: Ecol Evol ISSN: 2045-7758 Impact factor: 2.912
Figure 1. Collection sites of the 99 wild soybean ecotypes sampled across northeastern Asia. The ecotype locations are color-coded by the genetic cluster to which they were assigned by the STRUCTURE procedure. The four genetic clusters are as follows: Red: GROUP 1; Blue: GROUP 2; Green: GROUP 3; Purple: GROUP 4. Individuals assigned at a probability of <70% are shown on the map with a colored pie chart indicating the composition of the genome. The plant pictured in the top-left corner is Glycine soja.
Figure 2. Population structure inferred by the STRUCTURE procedure from the 99 wild soybean ecotypes. The colors for each ecotype in panel A, for K = 2, 3, and 4, represent the fraction of its genome that is inferred to be from each of the genetic groups. Red: GROUP 1; Blue: GROUP 2; Green: GROUP 3; Purple: GROUP 4.
FST values between each pair of the genetic groups
| | Group1 | Group2 | Group3 |
|---|---|---|---|
| Group2 | 0.655 | | |
| Group3 | 0.321 | 0.381 | |
| Group4 | 0.256 | 0.287 | 0.098 |
Principal coordinate analysis results
| PCOA | Eigenvalue | Proportion | Cumulative |
|---|---|---|---|
| PCOA1 | 0.506 | 0.135 | 0.135 |
| PCOA2 | 0.258 | 0.069 | 0.204 |
| PCOA3 | 0.229 | 0.061 | 0.265 |
| PCOA4 | 0.115 | 0.031 | 0.296 |
| PCOA5 | 0.094 | 0.025 | 0.321 |
| PCOA6 | 0.080 | 0.021 | 0.342 |
| PCOA7 | 0.075 | 0.020 | 0.362 |
| PCOA8 | 0.065 | 0.017 | 0.379 |
| PCOA9 | 0.062 | 0.017 | 0.396 |
| PCOA10 | 0.062 | 0.017 | 0.412 |
| PCOA11 | 0.059 | 0.016 | 0.428 |
| PCOA12 | 0.057 | 0.015 | 0.443 |
| PCOA13 | 0.053 | 0.014 | 0.458 |
| PCOA14 | 0.053 | 0.014 | 0.472 |
| PCOA15 | 0.051 | 0.014 | 0.485 |
Shown are the eigenvalues and the proportion of the variation they explain for the first 15 axes derived from principal coordinate analysis of the genetic data from wild soybeans.
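The eigenvalue and proportion columns above are the standard outputs of a principal coordinate analysis. As a rough illustration only, the sketch below implements classical PCoA (double-centering a squared distance matrix, then eigendecomposition) with NumPy. The function name `pcoa`, the `n_axes` parameter, and the choice to compute proportions over positive eigenvalues only are assumptions made for this example; the record does not state which software, genetic distance measure, or eigenvalue convention the authors used.

```python
import numpy as np


def pcoa(D, n_axes=2):
    """Classical principal coordinate analysis (hypothetical helper).

    D is a symmetric (n x n) matrix of pairwise distances.
    Returns (scores, eigenvalues, proportions) for the first n_axes axes.
    """
    D = np.asarray(D, dtype=float)
    n = D.shape[0]

    # Gower double-centering of the squared distances.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J

    # eigh returns eigenvalues in ascending order; sort them descending.
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Assumption: proportions are taken over the positive eigenvalues only.
    positive = eigvals > 1e-12
    proportion = eigvals[positive] / eigvals[positive].sum()

    # Keep at most n_axes axes, and only axes with positive eigenvalues.
    k = min(n_axes, int(positive.sum()))
    scores = eigvecs[:, :k] * np.sqrt(eigvals[:k])
    return scores, eigvals[:k], proportion[:k]
```

Calling `pcoa(D, n_axes=15)` on a 99 x 99 genetic distance matrix would return per-ecotype scores analogous to PCOA1 through PCOA15, and a running sum of the returned proportions would correspond to the Cumulative column.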
Figure 3. Plots of the scores from the first (PCOA1) and second (PCOA2) axes (A) and the first and third (PCOA3) axes (B) derived from a principal coordinates analysis of the genetic (SNP) data from 83 wild soybean ecotypes.
CCA of the genetic and environmental/geographic variables in the wild soybean sample
| Canonical axes | Canonical correlation | % Explained | F | Prob > F |
|---|---|---|---|---|
| I | 0.983 | 47.7 | 5.27 | <0.0001 |
| II | 0.961 | 19.6 | 3.97 | <0.0001 |
| III | 0.958 | 18.2 | 3.12 | <0.0001 |
| IV | 0.877 | 5.4 | 2.23 | <0.0001 |
| V | 0.811 | 3.2 | 1.79 | <0.0001 |
| VI | 0.711 | 1.7 | 1.48 | 0.0018 |
| VII | 0.687 | 1.5 | 1.31 | 0.0330 |
CCA, canonical correlation analysis.
Shown are the canonical correlations, percentage explained, the F value, and the probability of the F values for all statistically significant canonical variables. These values were derived from CCA of the first 15 axes from PCOA of the wild soybean genetic data with 18 environmental and geographical variables.
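To make the canonical correlation column concrete, here is a minimal sketch of how canonical correlations between two variable sets can be computed with NumPy, using the QR-then-SVD formulation. The function name `canonical_correlations` is hypothetical, and the sketch assumes both matrices have full column rank and more rows than columns. It is not the authors' procedure and does not reproduce the F tests or the "% Explained" column.

```python
import numpy as np


def canonical_correlations(X, Y):
    """Canonical correlations between two variable sets (hypothetical helper).

    X is (n x p), e.g. PCoA scores; Y is (n x q), e.g. environmental and
    geographic variables. Rows are the same n samples in both matrices.
    Assumes both matrices have full column rank and n > max(p, q).
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)

    # Center each variable so that inner products act as covariances.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)

    # Orthonormal bases for the column spaces of the two centered blocks.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)

    # Singular values of Qx^T Qy are the canonical correlations,
    # returned in descending order.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)
```

With 15 PCoA axes in `X` and 18 environmental and geographical variables in `Y`, the function would return 15 correlations, of which the table reports the seven that were statistically significant.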
Correlations of genetic (PCOA scores) and environmental/geographical variables with their canonical variables
| | I | II | III |
|---|---|---|---|
| Genetic | | | |
| PCOA1 | 0.948 | −0.080 | 0.167 |
| PCOA2 | 0.240 | 0.223 | −0.876 |
| PCOA3 | 0.095 | 0.814 | 0.289 |
| PCOA4 | −0.086 | 0.477 | −0.075 |
| PCOA5 | 0.111 | −0.106 | 0.123 |
| PCOA6 | −0.035 | 0.022 | 0.088 |
| PCOA7 | −0.036 | −0.064 | 0.084 |
| PCOA8 | 0.066 | 0.016 | 0.134 |
| PCOA9 | −0.028 | 0.121 | 0.137 |
| PCOA10 | 0.013 | 0.066 | 0.181 |
| PCOA11 | −0.031 | 0.123 | −0.066 |
| PCOA12 | 0.028 | −0.013 | −0.078 |
| PCOA13 | −0.009 | 0.056 | 0.007 |
| PCOA14 | 0.070 | 0.024 | 0.064 |
| PCOA15 | 0.031 | 0.003 | −0.029 |
| Environmental/Geographical | | | |
| LONG | | −0.406 | 0.102 |
| LAT | 0.378 | | 0.412 |
| SLOPE | −0.107 | 0.207 | −0.007 |
| ALT | 0.214 | −0.130 | −0.356 |
| MDR | | 0.154 | 0.165 |
| MTW | −0.020 | | −0.115 |
| AP | | 0.334 | −0.040 |
| PWM | −0.208 | | 0.319 |
| MDR² | | 0.116 | 0.141 |
| MTW² | −0.002 | 0.489 | −0.212 |
| AP² | | 0.223 | −0.112 |
| PWM² | −0.196 | | 0.319 |
| MDR*MTW | 0.211 | | −0.064 |
| MDR*AP | | | −0.128 |
| MDR*PWM | 0.593 | 0.491 | 0.298 |
| MTW*AP | −0.474 | | −0.015 |
| MTW*PWM | −0.036 | | −0.051 |
| AP*PWM | | 0.441 | 0.009 |
MDR, mean diurnal temperature range; MTW, mean temperature of the wettest quarter; AP, annual precipitation; PWM, precipitation of wettest month; LONG, longitude; LAT, latitude; SLOPE, the maximum change in the elevations between each cell and its eight neighbors; ALT, altitude in meters.
Shown are correlations of the 15 genetic variables (PCOA1‐PCOA15) with the scores from their first three canonical variables, and correlations of 18 environmental/geographical variables with the scores from their first three canonical variables. Several of the highest correlations of the environmental/geographical variables with their canonical variables are shown in bold for emphasis.