Jaakko S Tyrmi, Jaana Vuosku, Juan J Acosta, Zhen Li, Lieven Sterck, Maria T Cervera, Outi Savolainen, Tanja Pyhäjärvi.
Abstract
Understanding the consequences of local adaptation for genomic diversity is a central goal in the evolutionary genetics of natural populations. In species with large continuous geographical distributions, the phenotypic signal of local adaptation is frequently clear, but the genetic basis often remains elusive. We examined the patterns of genetic diversity in Pinus sylvestris, a keystone species in many Eurasian ecosystems with a huge distribution range and decades of forestry research showing that it is locally adapted to a vast range of environmental conditions. Population structure has been shown to be weak, both previously and in this study, which makes P. sylvestris an even more attractive subject for studying local adaptation. However, little is known about the molecular genetic basis of adaptation, as the massive size of gymnosperm genomes has prevented large-scale genomic surveys. We generated a dataset that is both geographically and genomically extensive using a targeted sequencing approach. By applying divergence-based and landscape genomics methods, we identified several loci contributing to local adaptation, but only a few with large allele frequency changes across latitude. We also discovered a very large (ca. 300 Mbp) putative inversion potentially under selection, which to our knowledge is the first such discovery in conifers. Our results call for more detailed analysis of structural variation in relation to the genomic basis of local adaptation, emphasize the lack of large-effect loci contributing to local adaptation in the coding regions, and thus point to the need for more attention to multi-locus analysis of polygenic adaptation.
Keywords: Local adaptation; Pinus sylvestris; Structural Variation; Targeted DNA Sequencing; adaptation; gymnosperms; landscape genetics; population genetics – empirical
Year: 2020 PMID: 32546502 PMCID: PMC7407466 DOI: 10.1534/g3.120.401285
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Figure 1. Map of sampling locations. The P. sylvestris distribution is marked in green.
Study population location and summary statistic information
| Population | Latitude | Longitude | Tajima’s D | Ρ ( | ||
|---|---|---|---|---|---|---|
| Inari | 68° 54’ N | 27° 1’ E | 3.90 | 0.404 | −0.250 | 1.63 |
| Kolari | 67° 10’ N | 24° 3’ E | 3.71 | 0.393 | −0.290 | 1.81 |
| Kälviä | 63° 51’ N | 23° 27’ E | 3.94 | 0.395 | −0.259 | 1.49 |
| Punkaharju | 61° 45’ N | 29° 23’ E | 3.87 | 0.395 | −0.283 | 2.27 |
| Kalsnava | 56° 43’ N | 26° 1’ E | 3.79 | 0.384 | −0.299 | 1.21 |
| Radom | 50° 24’ N | 20° 3’ E | 3.85 | 0.400 | −0.290 | 1.47 |
| Ust-Chilma | 65° 22’ N | 52° 21’ E | 3.94 | 0.396 | −0.223 | 1.60 |
| Megdurechensk | 63° 4’ N | 50° 49’ E | 4.04 | 0.397 | −0.283 | 1.81 |
| Ust-Kulom | 61° 30’ N | 54° 0’ E | 3.76 | 0.380 | −0.301 | 1.19 |
| Penzenskaja | 53° 27’ N | 46° 6’ E | 4.12 | 0.398 | −0.272 | 2.16 |
| Volgogradskaja | 47° 45’ N | 44° 30’ E | 4.19 | 0.400 | −0.278 | 1.89 |
| Baza | 37° 46’ N | 2° 49’ W | 3.82 | 0.389 | −0.156 | 1.11 |
Figure 2. Minor allele frequency spectrum calculated over all samples. The spectrum is projected down to 87 samples to account for missing data. The blue line denotes the expected spectrum shape, calculated according to the equation presented in figure 7.
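The caption does not say how the down-projection was carried out. A common approach is hypergeometric sampling, and the following is only a minimal sketch under that assumption. The function names and the `(allele_count, n_called)` input layout are hypothetical, chosen for illustration.

```python
from math import comb


def project_site(count, n, m):
    """Probability of seeing j = 0..m copies of an allele in a random
    subsample of m chromosomes, given `count` copies among n observed.
    Assumes hypergeometric sampling without replacement."""
    if m > n:
        raise ValueError("cannot project to a larger sample size")
    total = comb(n, m)
    return [comb(count, j) * comb(n - count, m - j) / total
            for j in range(m + 1)]


def projected_minor_sfs(sites, m):
    """Sum per-site projections and fold them into a minor allele spectrum.

    `sites` is an iterable of (allele_count, n_called) pairs, where
    n_called varies between sites because of missing data."""
    sfs = [0.0] * (m + 1)
    for count, n in sites:
        if n < m:
            continue  # too much missing data to project this site
        for j, p in enumerate(project_site(count, n, m)):
            sfs[j] += p
    # Fold: the minor allele count is min(j, m - j).
    folded = [0.0] * (m // 2 + 1)
    for j, value in enumerate(sfs):
        folded[min(j, m - j)] += value
    return folded
```

Each site contributes a total weight of one to the projected spectrum, so sites with different amounts of missing data become comparable. Sites with fewer than `m` called chromosomes are dropped in this sketch.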
Figure 7. Linkage disequilibrium coefficients (r²) based on all pairwise SNP comparisons for all samples. The black line shows the squared allele frequency correlation r² against the physical distance between the SNPs (Hill and Weir 1988).
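The decay curve cited from Hill and Weir (1988) is usually written as the expected r² for a given population recombination parameter and sample size. The sketch below evaluates that expectation. The function name and parameter names are assumptions made for illustration, and nothing here is taken from the authors' own scripts.

```python
def expected_r2(distance_bp, rho_per_bp, n):
    """Expected r^2 between two SNPs under the Hill and Weir (1988)
    approximation.

    distance_bp : physical distance between the SNPs in base pairs
    rho_per_bp  : population recombination rate per bp (4 * Ne * r)
    n           : number of sampled chromosomes
    """
    c = rho_per_bp * distance_bp
    base = (10 + c) / ((2 + c) * (11 + c))
    correction = 1 + ((3 + c) * (12 + 12 * c + c ** 2)) / (n * (2 + c) * (11 + c))
    return base * correction
```

In practice `rho_per_bp` is typically estimated by nonlinear least squares against the observed pairwise r² values, after which the function is evaluated over a grid of distances to draw the line.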
Weighted genome-wide averages of pairwise FST estimates for all populations
| Population | Inari | Kolari | Kälviä | Punkaharju | Kalsnava | Radom | Ust-Chilma | Megdurechensk | Ust-Kulom | Penzenskaja | Volgogradskaja |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Kolari | 0.013 | | | | | | | | | | |
| Kälviä | −0.001 | 0.016 | | | | | | | | | |
| Punkaharju | 0.000 | 0.017 | 0.005 | | | | | | | | |
| Kalsnava | 0.019 | 0.019 | 0.023 | 0.020 | | | | | | | |
| Radom | 0.002 | 0.021 | 0.005 | 0.007 | 0.015 | | | | | | |
| Ust-Chilma | 0.010 | 0.022 | 0.013 | 0.010 | 0.034 | 0.018 | | | | | |
| Megdurechensk | 0.008 | 0.021 | 0.011 | 0.009 | 0.029 | 0.016 | 0.001 | | | | |
| Ust-Kulom | 0.064 | 0.045 | 0.063 | 0.064 | 0.044 | 0.052 | 0.054 | 0.052 | | | |
| Penzenskaja | 0.009 | 0.023 | 0.011 | 0.013 | 0.033 | 0.015 | 0.008 | 0.009 | 0.060 | | |
| Volgogradskaja | 0.000 | 0.017 | 0.005 | 0.005 | 0.023 | 0.007 | 0.010 | 0.006 | 0.063 | 0.000 | |
| Baza | 0.068 | 0.082 | 0.073 | 0.071 | 0.077 | 0.065 | 0.083 | 0.077 | 0.117 | 0.081 | 0.072 |
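A "weighted" genome-wide average is usually a ratio of sums across loci, not a mean of per-locus ratios. This section does not name the estimator behind the table, so the sketch below swaps in Hudson's estimator (as formulated by Bhatia et al. 2013) purely for illustration. The function names and the input tuple layout are hypothetical.

```python
def hudson_components(p1, n1, p2, n2):
    """Numerator and denominator of Hudson's FST for one biallelic SNP.

    p1, p2 : sample allele frequencies in the two populations
    n1, n2 : number of sampled chromosomes in each population
    """
    num = ((p1 - p2) ** 2
           - p1 * (1 - p1) / (n1 - 1)
           - p2 * (1 - p2) / (n2 - 1))
    den = p1 * (1 - p2) + p2 * (1 - p1)
    return num, den


def weighted_fst(snps):
    """Ratio of summed numerators to summed denominators across SNPs.

    `snps` is an iterable of (p1, n1, p2, n2) tuples."""
    num_sum = 0.0
    den_sum = 0.0
    for p1, n1, p2, n2 in snps:
        num, den = hudson_components(p1, n1, p2, n2)
        num_sum += num
        den_sum += den
    if den_sum == 0:
        raise ValueError("no informative SNPs")
    return num_sum / den_sum
```

With an estimator of this kind, the sample-size correction in the numerator can push the estimate slightly below zero when two populations are barely differentiated, which is one way small negative entries such as −0.001 can arise.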
Figure 3. Visualization of STRUCTURE results using K values of 2 (A) and 3 (B).
Figure 4. PCA projections of the first two principal components for all samples (A), excluding the Baza population samples (B), excluding Baza and the samples containing the putative inversion (C), and a projection created using variants from the putatively inverted area (D). In panels B and D, the samples enclosed by the black circle contain the putatively inverted haplotype. Circles represent samples of the western cline, squares samples of the eastern cline, and triangles samples of the isolated Baza population. The total variance explained by each principal component is given in parentheses next to the respective axis label.
Figure 5. Admixture proportions for two layers estimated for different populations using conStruct spatial (A) and non-spatial (B) models.
Figure 6. A) BayeScan outlier locus allele frequencies at the sampling sites (Y-axis) against the latitude of the sites (X-axis). Populations are marked with red (eastern) and blue (western) squares, each with its respective least squares trend line. B) Allele frequency of the second highest scoring BayeScan result.
Figure 8. A) Heatmap of allelic correlation coefficient values, shown below the diagonal, calculated between all SNPs identified as being part of an inversion and all variants within their surrounding 1 kbp areas. Alternating thick and thin X and Y axis borders denote variants belonging to the same scaffolds. B) Heatmap similar to A, but with random variants of similar allele frequency to the inversion selected along with their surrounding 1 kbp areas to visualize typical linkage disequilibrium patterns. Some scaffolds show LD within them, but between scaffolds mostly only low values can be seen. C) Means of values for 10,000 random 1 kbp areas (one of which is visualized in the panel B heatmap) marked in black, and the mean value of blocks containing the putative inversion haplotype marked with a red asterisk.