Literature DB >> 24597663

Genome-wide admixture and ecological niche modelling reveal the maintenance of species boundaries despite long history of interspecific gene flow.

Amanda R De La Torre1, David R Roberts, Sally N Aitken.   

Abstract

The maintenance of species boundaries despite interspecific gene flow has been a continuous source of interest in evolutionary biology. Many hybridizing species have porous genomes with regions impermeable to introgression, conferring reproductive barriers between species. We used ecological niche modelling to study the glacial and postglacial recolonization patterns between the widely hybridizing spruce species Picea glauca and P. engelmannii in western North America. Genome-wide estimates of admixture based on a panel of 311 candidate gene single nucleotide polymorphisms (SNP) from 290 genes were used to assess levels of admixture and introgression and to identify loci putatively involved in adaptive differences or reproductive barriers between species. Our palaeoclimatic modelling suggests that these two closely related species have a long history of hybridization and introgression, dating to at least 21,000 years ago, yet species integrity is maintained by a combination of strong environmental selection and reduced current interspecific gene flow. Twenty loci showed evidence of divergent selection, including six loci that were both Fst outliers and associated with climatic gradients, and fourteen loci that were either outliers or showed associations with climate. These included genes responsible for carbohydrate metabolism, signal transduction and transcription factors.
© 2014 The Authors. Molecular Ecology Published by John Wiley & Sons Ltd.

Entities:  

Keywords:  admixture; ecological niche modelling; outlier loci; spruce

Mesh:

Substances:

Year:  2014        PMID: 24597663      PMCID: PMC4228761          DOI: 10.1111/mec.12710

Source DB:  PubMed          Journal:  Mol Ecol        ISSN: 0962-1083            Impact factor:   6.185


Introduction

The nature of genetic barriers that isolate species from interspecific gene flow is of great biological interest (Abbott ; Feder ). Understanding the maintenance of species boundaries requires addressing a fundamental question: are the genomes or genes the units of specific differentiation? Under the most widely recognized species concept, the biological species concept, the genomes of species are coadapted units that are separated from other units by reproductive barriers. This concept implies that species divergence only occurs through whole-genome isolation and therefore hybridizing species are not ‘true’ species. By contrast, the genic view of speciation proposes that the gene is the unit of species differentiation (Wu 2001), and reproductive isolation is a consequence of natural selection acting on individual genes. Species boundaries are ‘semi-permeable’; some genomic regions share introgressed genes between species, whereas other regions accumulate divergence between species in response to natural selection (Strasburg ). Despite recent advances in population genetics and genomics, little is known about how species boundaries are maintained between closely related species that hybridize. In hybridizing species showing adaptations to different environments, divergent selection acts on a subset of genes, counteracting the homogenizing effect of gene flow and preventing introgression in surrounding genomic regions (Chapman ). As a result, species boundaries are maintained despite hybridization and introgression (Andrew & Rieseberg 2013). Species that hybridize can also coexist without merging or becoming swamped in environments where hybrids are favoured by selection over pure species (fitness higher than pure species), that is, in intermediate environments (Arnold 1997). White spruce [Picea glauca (Moench) Voss] and Engelmann spruce (P. engelmannii Parry) are wind-dispersed, long-lived, closely related conifers that hybridize extensively in British Columbia and the western part of Alberta, Canada, as well as in some parts of the western United States. Their extensive hybrid zone is mainly composed of hybrids with a clinal intergradation of morphological and genetic characteristics along elevational gradients between parental species' habitats (Ledig ). Recent studies using neutral microsatellite markers have found that introgression is extensive in the hybrid zone and asymmetric towards Engelmann spruce (Haselhorst & Buerkle 2013; De La Torre & Aitken, unpublished). Despite extensive interspecific gene flow, natural selection acting along environmental gradients (exogenous selection) is responsible for maintenance of this hybrid zone occupying mountainous areas, with white spruce adapted to low-elevation boreal and subboreal environments, Engelmann spruce adapted to high-elevation environments and hybrid populations inhabiting intermediate elevations. This hybrid zone follows a bounded hybrid superiority model (Moore 1977) in which hybrids are fitter than pure species in intermediate environments (De La Torre ). Over the last century, studies of the evolutionary and genetic relationships between white and Engelmann spruce have focused on whether these two closely related species represent extreme phenotypes along a genetic continuum (Rajora & Dancik 2000) or whether they deserve species-level recognition. Although previous studies in the white x Engelmann spruce contact zone have provided a broad idea of the evolutionary relationships between these species, they have failed to identify the factors contributing to isolating barriers between species. Some limitations of previous studies included small numbers of loci that lacked species-specific diagnostic markers and a limited geographical scope of sampling within the hybrid zone. This study combines population genomic approaches with ecological niche modelling to assess historical and contemporary evolutionary relationships between white and Engelmann spruce. Our primary objective is to understand how white and Engelmann spruce maintain their species integrity despite interspecific gene flow. To answer this question, we inferred the recent histories of white and Engelmann spruce by using palaeoclimatic analysis to study potential climatic niche-based range expansions and contractions and to infer past opportunities for secondary contact between species. We determined the extent and direction of introgression using a genome scan approach with markers from 290 candidate genes and identified a subset of genes that may be involved in genetic barriers between these species.

Materials and methods

Ecological niche modelling

We used ecological niche modelling to study past demographic processes during glacial and postglacial range contractions and expansions. While these models are based on assumptions such as environmental stability and niche conservation (Araujo & Guisan 2006), they have been used effectively, particularly at continental scales, for both past species range reconstructions and future range projections (Elith & Leathwick 2009). The ecological niche model was built using mapped ecoregion delineations for North America from various public sources. Model training and model projection were carried out using these ecoregion classes as a response variable, as this approach was shown to be effective at limiting range overprediction in model hindcasts (Roberts & Hamann 2012). Following model projection/hindcast of these ecoregion classes, species frequencies were attached to classes based on a summary of sample plot records, from the Forest Inventory and Analysis (FIA) programme data (http://www.fia.fs.fed.us) and from proprietary Canadian sample plot data, falling geographically within the boundaries of each original mapped ecoregion (Roberts & Hamann 2012). Environmental predictors comprised ten climate variables, interpolated at 1 km resolution: mean annual temperature, mean coldest and mean warmest month temperatures, continentality, mean annual and mean growing season precipitations, number of frost-free days, number of degree-days above 5 °C and estimates of annual and summer heat–moisture indices (Wang ). Climate for the modern period was based on the 1961–1990 climate record. Palaeoclimate data were generated by overlaying the modern climate data with climatic anomaly data generated by two general circulation models (GCMs): the Community Climate Model (CCM1) (Kutzbach ) and the Geophysical Fluid Dynamics Laboratory model (GFDL) (Bush & Philander 1999). Models were trained with data from the present day and projected with the palaeoclimate data for past periods. Ecoregion model projections were made by a majority vote of three modelling strategies: a discriminant analysis, a randomized bootstrapped classification tree and a minimum multivariate distance (Roberts & Hamann 2012). Modelled reconstructions of modern ranges were validated against the 55 744 individual sample plots in the FIA data. Model projections for the past periods were validated against 931 fossil pollen and plant macrofossil records, largely from the Neotoma Paleoecology Database (http://www.neotomadb.org), using the area under the receiver-operating characteristic (AUC). Values for this statistic range from 0 to 1 where 1 represents a perfect model and 0.5 represents random chance.

Sample collection for genomic analyses

Newly flushed needle tissue of 745 samples of white spruce, Engelmann spruce and their hybrids were collected from common garden experiments, previously established by the British Columbia Ministry of Forests, Lands and Natural Resources Operations' spruce breeding programme (Table 1, Fig. 1). Seed planning zones (SPZs) are geographical units for genetic management based on ecosystem classification and adaptive traits of populations. In this study, we collected samples from 200 open-pollinated families (progeny of individual seed parents sampled from natural populations) in the West Kootenay, East Kootenay Mount Robson and Quesnel Lakes SPZs. Thirty-three putatively pure white spruce (22 from Fort Nelson and 11 from Prince George), and 40 putatively pure Engelmann spruce from southwestern United States were obtained from grafts of trees sampled from natural populations. After the population structure and admixture analyses, the Prince George population was reassigned as a hybrid population (Table 1). The same genetic materials were analysed in De La Torre .
Table 1

Geographical coordinates and climatic variables of parent trees for Picea glauca, P. engelmannii and their hybrids analysed with 311 single nucleotide polymorphism loci. Two-letter codes are used to identify populations in subsequent tables and graphs

 PopulationProvinceElevation range (m)Latitude (degrees)Longitude (degrees)MAT (°C)MAP (mm)Sample size
Picea glauca
 Fort Nelson (FN)B.C350–60058.4–59.4120.5–126.30.350922
P. glauca x P. engelmannii
 Prince George (PG)B.C610–79353.5–54121.6–1222.876911
 Quesnel Lakes (QL)B.C680–155551.8–53.2119.4–122.12.3914220
 Mount Robson (MR)B.C701–152552.2–53.8118.4–121.51.71167197
 East Kootenay (EK)B.C1006–167749.4–50.8115.1–116.61.9944204
 West Kootenay (WK)B.C690–196649–50.5114.9–118.42.51168124
Picea engelmannii
 Salmon River (E1)Idaho1859–253043.8–46.2113.7–115.92.810289
 Teton-Wasatch (E2)Wyo.2347–304840.4–43.8109.5–111.62.688513
 Fishlake-Lasal (E3)Col.2606–338337.5–39.8109.2–112.83.476818

MAT, Mean annual temperature; MAP, mean annual precipitation; B.C, British Columbia; Wyo, Wyoming; Col, Colorado.

Fig 1

(a) Geographical locations of populations of pure Picea glauca (FN), pure P. engelmannii (E1, E2 and E3), and their hybrids (all other populations). Population names corresponding to two-letter codes in Table 1. (b) Map showing the location of the hybrid zone in North America; and (c) posterior estimates of cluster membership for the Picea glauca x P. engelmannii hybrid zone with TESS for K = 2. Populations are ordered by increasing latitude from left to right, finishing with the P. glauca reference population (FN).

Geographical coordinates and climatic variables of parent trees for Picea glauca, P. engelmannii and their hybrids analysed with 311 single nucleotide polymorphism loci. Two-letter codes are used to identify populations in subsequent tables and graphs MAT, Mean annual temperature; MAP, mean annual precipitation; B.C, British Columbia; Wyo, Wyoming; Col, Colorado. (a) Geographical locations of populations of pure Picea glauca (FN), pure P. engelmannii (E1, E2 and E3), and their hybrids (all other populations). Population names corresponding to two-letter codes in Table 1. (b) Map showing the location of the hybrid zone in North America; and (c) posterior estimates of cluster membership for the Picea glauca x P. engelmannii hybrid zone with TESS for K = 2. Populations are ordered by increasing latitude from left to right, finishing with the P. glauca reference population (FN).

DNA extraction and genotyping

Needles were stored at −80 °C prior to DNA isolation. Each sample was isolated using a modified CTAB protocol (Doyle & Doyle 1987). After the extractions, DNA quality and concentration of each sample was assessed using 0.8% agarose gels and quantified based on Nanodrop 2000C spectrophotometer readings (Thermo Fisher Inc., Waltham, MA, USA). DNA samples from all individuals were SNP-genotyped at the Genome Quebec/McGill Innovation Centre using an Illumina bead array chip (Illumina Inc., San Diego, CA, USA) with the GoldenGate allele-specific assay. Samples from allopatric pure species populations and from hybrid populations were assayed in two different SNP arrays. White spruce, Engelmann spruce and Sitka spruce samples were tested with the first array comprising 1536 SNPs from a large panel of genes putatively involved in cold hardiness and insect herbivory resistance (Holliday ). A total of 230 SNPs were selected based on their genotyping quality (GenTrain score >0.40). For the second SNP array, 154 additional SNPs previously tested in other studies (Namroud ; Porth ) were added to the analysis. In the second SNP array, SNPs selected from the first array and other studies (384) were used to genotype 745 samples from the hybrid zone. Due to the selection process, SNPs may be subject to ascertainment bias in terms of underrepresentation of rare alleles (Namroud ). Data files with florescence intensity for each SNP were loaded directly into GenomeStudio Genotyping Module v1.0 (Illumina 2010). The call rate cut-off for SNP selection was 90%; however, most of the SNPs (91%) had call rates >98%. Of 384 SNPs included in the second GoldenGate array, 311 SNPs corresponding to 290 widely distributed genes were successfully genotyped and met both genotyping quality and data normalization criteria. Annotation and position of these genes in the white spruce genome (Pavy ; B. Pelgas, N. Isabel & J. Bousquet, unpublished data) can be found in Supporting Information Table S1 (Supporting information).

Genomic admixture

Samples of putatively pure species from allopatric populations were genotyped for a subset of the SNPs used to genotype individuals from the hybrid zone. These 86 SNPs were used to assess population structure and admixture levels using the program STRUCTURE 2.3.3 (Pritchard ). Admixture models with a putative number of clusters (K) from one to 16 were tested using 50 000 iterations for the pre- and postburn periods. Each run was replicated twenty times to estimate K using the method developed by Evanno with the program Structure Harvester version 0.6.7 (Earl & VonHoldt 2011). Based on STRUCTURE results for K = 2, individuals with Bayesian admixture proportions (Q) >0.9 from Fort Nelson (white spruce) and E1, E2, E3 (Engelmann spruce) were used as reference genotypes for ‘pure species’ to calculate hybrid index within the zone with 86 SNPs using the INTROGRESS 1.1 (Gompert & Buerkle 2010) package in R 2.13.1 (R Core Team 2013). Population structure and admixture levels were also estimated using TESS version 2.3 (Chen ). Unlike STRUCTURE, TESS uses a hierarchical Bayesian algorithm to include spatial prior distributions on the individual admixture proportions (Durand ), giving a reliable estimation of admixture when admixture proportions are variable across space. The admixture model was performed for values of Kmax ranging from 2 to 14. Markov chain Monte Carlo (MCMC) algorithms were run for a length of 50 000 sweeps with burn-in periods of 30 000 sweeps. Each run was replicated twenty times. Principal component analysis (PCA) was conducted using a correlation matrix of allele frequencies with SAS Enterprise Guide 4.2. The first twenty principal components were tested for correlation with elevation using PROC CORR. Fst values were calculated using GenAlEx version 6.4 (Peakall & Smouse 2006).

Linkage disequilibrium

Pairwise linkage disequilibrium (LD; r2) among all informative sites for 23 of the genes was calculated from inferred haplotypes using PHASE algorithms with the program DnaSP v.5.10.01 (Librado & Rozas 2009). Statistical significance of LD tests was determined by Fisher's exact tests with Bonferroni correction. Because LD can be affected by differences in sample sizes and allele frequencies, r2 was calculated from 22 randomly selected individuals in each of the populations. LD was compared between parental and hybrid populations to assess the likelihood of recent admixture in the hybrid zone. If admixture were recent in the hybrid zone, we would expect to find higher LD, on average, in hybridizing populations due to newly recombinant alleles (Barton & Hewitt 1985).

Detection of loci potentially affected by selection

To test for signatures of selection, we followed two different approaches. First, we used BayeScan v2.0 (Foll & Gaggiotti 2008) to identify loci that deviated significantly from neutrality. This Bayesian program decomposes Fst coefficients into a population-specific component (β) and a locus-specific component (α) using logistic regression. When the pattern of diversity cannot be explained by β alone (α significantly different from 0), the locus is considered to be under selection. Positive values of α suggest diversifying selection and negative values suggest balancing selection (Table 2). Several runs were performed to ensure consistency, with 5000 iterations and burn-in period of 50 000 iterations. False discovery rate (q), defined as the expected proportion of false positives among outlier markers, was set at 0.03.
Table 2

Significant single nucleotide polymorphism (SNP) outliers detected using (A) 311 SNP loci within the contact zone (Fig. 4a); and (B) 86 SNP loci across the hybrid zone (Fig. 4b) with BayeScan. A positive value of α suggests diversifying selection, and a negative value, balancing selection. Cut-off for Bayesian posterior probability [prob(α≠0)] was 0.7. This probability cannot be compared with traditional P-values. SNP short ID identifies SNPs in Fig. 4

 SNP IDSNP short IDProb (α≠0)log (PO)AlphaFst
(A)
 13_496170.967791.47781.290.10417
 208pg12875c56111.99870.1767
 295_78620.898980.949341.0450.084892
 C2270-contig1.NC1-3841290.790560.576870.942540.080339
 C6522-contig1.NC1-2691460.969991.50961.3170.10647
 WS-2.0-GQ0024.B3.r-D12.1-2392000.9891.95371.26920.10083
 WS-2.0-GQ0064.B3.r-I13.1-12362400.98061.70361.26610.10062
(B)
 124_49570.705940.38033−1.0760.053519
 13_49617111.67770.29291
 234_171450.703140.37449−0.985150.055809
 295_78620.784960.562320.872260.19316
 45_1067680.9761.60911.07790.21397
 50_135700.865170.80733−1.26890.043362
 68_28676112.77060.4658
Significant single nucleotide polymorphism (SNP) outliers detected using (A) 311 SNP loci within the contact zone (Fig. 4a); and (B) 86 SNP loci across the hybrid zone (Fig. 4b) with BayeScan. A positive value of α suggests diversifying selection, and a negative value, balancing selection. Cut-off for Bayesian posterior probability [prob(α≠0)] was 0.7. This probability cannot be compared with traditional P-values. SNP short ID identifies SNPs in Fig. 4
Fig 4

Results of Bayesian outlier detection analysis (a) within contact zone using 311 single nucleotide polymorphism (SNP) loci and (b) across hybrid zone, using 86 SNP loci. Estimate of Fst plotted against transformed P-values, where PO = p/(1-p). Loci (full circles) at the right of the vertical line showed significant deviations from neutrality.

Our second approach for identifying targets of local adaptation was to use Bayenv 2.0 (Coop ) to test for associations between SNP allele frequencies and environmental variables. This approach complements the Fst outlier analysis, and the two analyses may produce different results for methodological or biological reasons (Keller ). The Fst outlier analysis assumes an island model for the null distribution of neutral values, and the actual distribution of neutral Fst values will deviate from this expectation under other demographic scenarios such as the secondary contact in a hybrid zone (Schoville ; Lotterhos & Whitlock 2014). Allele–environment associations are not sensitive to this problem. However, Fst outliers may reflect divergent selection and local adaptation due to unknown environmental drivers that are not tested in an allele–environment association approach. To overcome the lack of a set of control (noncandidate) SNPs, we built the control matrix using all 311 SNPs (G.Coop, personal communication). We also performed nonparametric Spearman rank correlation coefficient tests, as a measure of support for Bayes factors. SNPs highly ranked in Bayes factor lists (BF > 3; Eckert ) with high correlation coefficients (ρ > 0.2) were strong candidates for divergent selection. Bayenv runs were carried out using 100 000 iterations (k) and a number of different seeds to ensure model convergence. Both the BayeScan and Bayenv analyses were conducted on two sets of data. First, all candidate gene SNPs (311) were used to analyse populations from within the hybrid zone classified into elevational bands (350–600, 600–1600, 1600–1800 , 1800–2250 m, >2250 m). Secondly, a subset of SNPs (86) was used to study populations across the hybrid zone spanning a wide range from the pure Engelmann spruce populations in the south to the pure white spruce populations in the north. In the Bayenv analyses, SNP allele frequencies were tested for associations with 21 climatic variables from ClimateWNA (Wang ) (Table 3).
Table 3

Candidate gene single nucleotide polymorphisms that exhibit Fst outlier behaviour or that are associated with one or several environmental variables

SNP IDFst outlier*Environmental associationsAnnotationReferenceLinkage groupPosition (cM)
208pg12875cYesMWMT, SHM, EXT, ErefGlycoside hydrolase family 28 protein/polygalacturonase (pectinase)P. glauca874.547
295_78YesMWMTNo apical meristemP. sitchensis
WS-2.0-GQ03105.B7-O12.3-654NoMWMT,MSP, SHM, Eref, CMDFructose-1,6-bisphosphataseP. glauca1010.469
WS-2.0-GQ0064.B3.r-I13.1-1236YesMWMTAcid phosphataseP. glauca4156.456
14_248NoMCMT,TDABC transporterP. glauca527.6
WS-2.0-GQ0041.BR-J07.2-36NoMAP, AHM, PASUnknownP. glauca7130.421
0_13680-contig2.C1-149NoMAP, AHM, PASHypothetical proteinP. glauca2102.3
WS-2.0-GQ0021.BR.1-G04.1-641NoMAP, AHM, PASUnknownP. glauca613
WS-2.0-GQ0168.B3-N16.1-556NoSHM, CMDFlavin reductaseP. glauca572.283
144_441NoCMDPhytochrome 4P. patens626.2
C2270-contig1.NC1-384YesCCAAT-binding transcription factorP. glauca249.628
C6522-contig1.NC1-269YesUnknown
WS-2.0-GQ0024.B3.r-D12.1-239YesPeroxisomal membrane proteinP. glauca820.264
69_753NoMAT,DD_0, DD_18CBL-interacting protein kinaseP. sitchensis
68_286YesMWMT, MSP, AHM,SHM, DD5, DD18, EXT,ErefGlycosyl hydrolaseP. sitchensis
206_435NoMCMT,TD,DD_0Isoflavone reductaseP. sitchensis
13_496YesMCMT,TD,DD5,bFFP,PASFK506-binding proteinP. sitchensis
45_1067YesMCMT,ErefAlpha-amylaseP. sitchensis
288_628NoMAP, PASLate elongated hypocotylP. sitchensis
288_302NoMAP, PASLate elongated hypocotylP. sitchensis

Only outlier loci suggesting diversifying selection in the BayeScan analyses were considered.

Environmental associations based on Bayenv are as follows: Mean Annual Temperature (MAT), Precipitation as snow (PAS), Mean Warmest Month Temperature (MWMT), Summer Heat—Moisture Index (SHM), Continentality (TD), Annual Heat—Moisture Index (AHM), Mean Annual Precipitation (MAP), Mean Summer Precipitation (MSP), Degree-days below 0 °C (DD_0), Mean Coldest Month Temperature (MCMT), Eref (Hargreaves reference evaporation), CMD (Hargreaves climatic moisture deficit), Degree-days above 5 °C (DD5), bFFP (Julian date on which frost-free period starts), Degree-days below 18 °C (DD_18), Degree-days above 18 °C (DD18), Extreme maximum temperature over a 30-year period (EXT).

Taken from Pavy .

Candidate gene single nucleotide polymorphisms that exhibit Fst outlier behaviour or that are associated with one or several environmental variables Only outlier loci suggesting diversifying selection in the BayeScan analyses were considered. Environmental associations based on Bayenv are as follows: Mean Annual Temperature (MAT), Precipitation as snow (PAS), Mean Warmest Month Temperature (MWMT), Summer Heat—Moisture Index (SHM), Continentality (TD), Annual Heat—Moisture Index (AHM), Mean Annual Precipitation (MAP), Mean Summer Precipitation (MSP), Degree-days below 0 °C (DD_0), Mean Coldest Month Temperature (MCMT), Eref (Hargreaves reference evaporation), CMD (Hargreaves climatic moisture deficit), Degree-days above 5 °C (DD5), bFFP (Julian date on which frost-free period starts), Degree-days below 18 °C (DD_18), Degree-days above 18 °C (DD18), Extreme maximum temperature over a 30-year period (EXT). Taken from Pavy .

Results

Both general circulation models (CCM1 and GFDL) suggest that white and Engelmann spruce may have had the opportunity for contact as early as the Last Glacial Maximum at 21 000 YBP in the southern part of the Rocky Mountains (Figs 2 and 3 and S1, Supporting information). Models were evaluated against modern sample plot species occurrence, fossil pollen and macrofossil data records since the Last Glacial Maximum using the area under the receiver-operating characteristic (AUC). Evaluations of Engelmann and white spruce with 55 744 modern sample plots showed good model fit with AUC values of 0.80 and 0.88, respectively. Validations with 2651 fossil pollen and macrofossil data since the Last Glacial Maximum for both species resulted in average AUCs of 0.62 and 0.61 for the CCM1 and GFDL model, respectively (Table S2, Supporting information). Model sensitivity for all periods and both species was generally lower and specificity generally very high, indicating that models were better predictors of species absences than species presences. This is to be expected, given the limited number of validation points available for each species, particularly in the periods immediately following the Last Glacial Maximum.
Fig 2

Glacial re-colonization patterns of Picea glauca, P. engelmannii and their hybrids from 21 000 to 14 000 years before present based on climate niche modelling (CCM1 model) and palaeoclimate data. Years before present are in bold at the top of the graph. P. engelmannii is in green, P. glauca is in blue and the hybrid zone is in brown. Species frequency levels were divided equally by thirds: 0.33 (Low), 0.66 (Med), 1.00 (High). Laurentian and Cordilleran ice sheets are shown as inserts in each of the maps (Dyke ).

Fig 3

Postglacial re-colonization patterns of Picea glauca, P. engelmannii and their hybrids based on climate niche modelling (CCM1 model) and palaeoclimate data. Years before present are in bold at the top of the graph. P. engelmannii is in green, P. glauca is in blue and the hybrid zone is in brown. Species frequency levels were divided equally by thirds: 0.33 (Low), 0.66 (Med), 1.00 (High). Laurentian and Cordilleran ice sheets are shown as inserts in each of the maps (Dyke ).

Glacial re-colonization patterns of Picea glauca, P. engelmannii and their hybrids from 21 000 to 14 000 years before present based on climate niche modelling (CCM1 model) and palaeoclimate data. Years before present are in bold at the top of the graph. P. engelmannii is in green, P. glauca is in blue and the hybrid zone is in brown. Species frequency levels were divided equally by thirds: 0.33 (Low), 0.66 (Med), 1.00 (High). Laurentian and Cordilleran ice sheets are shown as inserts in each of the maps (Dyke ). Postglacial re-colonization patterns of Picea glauca, P. engelmannii and their hybrids based on climate niche modelling (CCM1 model) and palaeoclimate data. Years before present are in bold at the top of the graph. P. engelmannii is in green, P. glauca is in blue and the hybrid zone is in brown. Species frequency levels were divided equally by thirds: 0.33 (Low), 0.66 (Med), 1.00 (High). Laurentian and Cordilleran ice sheets are shown as inserts in each of the maps (Dyke ).

Genomic admixture and hybrid index

High levels of admixture and introgression were found within the contact zone between white and Engelmann spruce, where most individuals had hybrid ancestry. Both the admixture and the hybrid index analysis showed asymmetry in introgression towards Engelmann spruce, meaning that in general in the hybrid zone, there were more hybrid individuals with a higher genetic contribution from Engelmann spruce than from white spruce (Fig. 1). The histogram of hybrid classes showed that the majority of the hybrids (60%) had a hybrid index higher than 0.6, weighted towards Engelmann spruce ancestry, while the classes between 0.8 and 0.9 (putative Engelmann spruce advanced generation backcrosses) accounted for 14% of the trees. Hybrid classes between 0.1 and 0.2 (putative white spruce advanced generation backcrosses) were relatively rare (0.8%) in the zone (Fig. S2, Supporting information). The frequency distribution of hybrid classes suggests this is an old hybrid zone. The low levels of linkage disequilibrium in hybridizing populations relative to parental populations also suggest that this is an ancient hybrid zone, in which advanced generation hybrid genotypes predominate and recombination is widespread. Based on 351 pairwise comparisons between informative sites across 23 genes, we found an overall mean LD (r2) of 0.41. There was no evidence for excess LD in individuals with intermediate genetic ancestry (mean r2 = 0.32) in comparison with LD estimates for parentals (mean r2 = 0.43), suggesting little or no recent admixture in the hybrid zone (Table S3, Supporting information). In fact, very few F1 hybrids were previously identified in this hybrid zone (De La Torre ). Despite high levels of introgression, both Engelmann spruce and white spruce remain genetically well differentiated, as evidenced by the results from STRUCTURE, TESS, PCA and Fst analyses. Delta K showed a peak for K = 2, each cluster representing one of the pure species (Fig. S3, Supporting information). The PCA analysis showed similar results, but the hybrids formed a separate cluster from pure species (Fig. S4, Supporting information). The model formed by the first three principal components was significantly correlated with elevation (R2 = 0.52, F = 282.38, P < 0.0001). The Fst analysis showed a greater genetic distance between pure species (0.208) than among white spruce and hybrids (0.09± 0.04) and among Engelmann spruce and hybrids (0.08 ± 0.02).

Detection of adaptive loci

Twenty loci showed some evidence of divergent selection. Of these, six loci were identified by both the Fst outlier analysis and the Bayenv gene–environment association analysis, three loci were identified by the Fst outlier analysis only, and 11 were identified by Bayenv only (Tables 2 and 3). Bayes factors (BF) for nine Fst outlier loci showing diversifying selection ranged from 3 to 47.51, with most of the values ranging from 3 to 10 (Table S4, Supporting information). The majority of Fst outlier loci indicated diversifying selection; only three outlier loci suggested balancing selection with low Fst estimates and negative α values (Table 2, Fig. 4). The three Fst outlier loci without significant associations with climate (C2270-contig1.NC1-384, C6522-contig1.NC1-269 and WS-2.0-GQ0024.B3.r-D12.1-239) showed relatively strong correlations with one or several environmental variables (ρ > 0.6); however, their BF values did not exceed our cut-off of 3. Results of Bayesian outlier detection analysis (a) within contact zone using 311 single nucleotide polymorphism (SNP) loci and (b) across hybrid zone, using 86 SNP loci. Estimate of Fst plotted against transformed P-values, where PO = p/(1-p). Loci (full circles) at the right of the vertical line showed significant deviations from neutrality. The association analysis detected 11 potentially locally adaptive loci that were not detected by the Fst outlier test. These loci showed strong correlations with both mean annual temperature and precipitation as snow, variables strongly associated with elevational gradients that were previously identified as important in shaping adaptive variation, fitness and genetic structure within the hybrid zone (De La Torre ). Most of the SNPs identified as outliers or associated with environmental variables differed between analyses within the hybrid zone only and analyses that spanned both allopatric parental species and hybrid populations, probably reflecting different selection pressures acting along elevational and latitudinal gradients. Fst outliers may reflect divergent selection due to unknown environmental drivers that have not been identified and therefore were not detected by our allele–environment association approach. Adaptive loci were widely distributed in the genome in seven of the 12 linkage groups and pointed towards important functions in cold adaptation such as signal transduction, transcription regulation and carbohydrate metabolism (Table 3).

Discussion

Glacial and postglacial re-colonization patterns

Demographic processes during past glacial cycles have greatly influenced current species distributions and have played a role in the local adaptation and speciation of several North American species (Shafer ). White spruce and Engelmann spruce are closely related species likely diverged in allopatry during the Pliocene (∼5 MYA), according to a recent Picea phylogeny based on nuclear, plastid and mitochondrial sequences (Lockwood ). During the Pleistocene, the Laurentide and the Cordilleran ice sheets, which covered much of Canada and northwestern United States, led to the displacement of both white and Engelmann spruce south of their present distribution (Shafer ). According to our palaeoclimatic modelling, these species most recently had the potential to come into secondary contact by 21 000 YBP, in the southern Rocky Mountains in Colorado and Wyoming, considerably south and east of the current hybrid zone. At 21 000 YBP, the ice sheets reached their largest extension and the white spruce and Engelmann spruce ranges were at their most restricted, suggesting the likelihood of one or several other episodes of secondary contact before 21 000 YBP (Fig. 2). The difference in timing of formation between the Laurentide Ice Sheet (27 000–30 000 YBP) and the Cordilleran Ice Sheet (19 000–22 000 YBP) also suggest the chance for secondary contact before 21 000 YBP in British Columbia (Dyke ). The climatic changes associated with the Last Glacial Maximum led to the displacement of plant and animal species into several ice age refugia (Shafer ). White spruce appears to have survived the Last Glacial Maximum in two refugia, the unglaciated Yukon Valley, north from the ice front and the Appalachian Mountains Wisconsin refugium (Anderson ). After the last glaciation, by 11 000 YBP, white spruce was distributed as far north as Alaska and had re-established its transcontinental range. While the white spruce range expanded, the Engelmann spruce range likely contracted. Engelmann spruce populations, which were more abundant and occupied lower elevations in the Rocky Mountains and the Great Basin during the Pleistocene, became fragmented when the temperature rose during the Xerothermic Period by 11 000 YBP (Ledig ). As a result of white spruce and Engelmann spruce range expansions and contractions, it appears likely that the hybrid zone moved west and then north, reaching British Columbia by 14 000 YBP following the recession of the Cordilleran and Laurentide Ice Sheets (Figs 2 and 3). This concurs with previous biogeographical reports (Pellatt ; Ledig ). After 14 000 YBP, the hybrid zone had available climatic habitat to expand its range in latitude and longitude until reaching its current northern and eastern expanse (Fig. 3). The current phylogeographical structure of several other species in British Columbia and the Northwestern United States also appear to result from secondary contact between populations expanding from glacial refugia south or north of the Cordilleran ice sheet. For this reason, eastern British Columbia is considered a geographical ‘hotspot’ of hybrid zones, due to the number of plant and animal contact zones (Maroja ; Irwin ). The climate niche modelling also had some limitations. Both the CCM1 and the GFDL model indicated climatic habitat for Engelmann spruce rather than white spruce in Alaska between 21 000 and 14 000 YBP. This inconsistency with previous genetic reports of a white spruce refugium (Anderson ) may be due to fewer data points and lower resolution in the Alaska ecoregion in comparison with other ecoregions or due to white spruce ecotypes in Alaska that had a different climatic niche than those elsewhere in the species range (Aitken ). Extensive admixture and introgression was observed in the contact zone, with most alleles being shared by white spruce, Engelmann spruce and their hybrids. Despite this extensive introgression, pure species remained well differentiated from each other, supporting the view that white spruce and Engelmann spruce are recognizably distinct species. Our results point towards a genetic architecture shaped primarily by past introgression events, because little evidence for recent admixture has been found in the hybrid zone. The frequency distribution of hybrid classes supports the results of the palaeoclimatic modelling, indicating this is an ancient hybrid zone, in which advanced generation hybrid genotypes dominate and recombination is widespread. The lower levels of LD in hybrid populations than in parental species also suggest that hybridization has been ongoing for many generations, but we recognize this analysis, limited to 23 genes, lacks power. Nonetheless, recombination takes many generations to break up chromosomal blocks derived from each parental species; therefore, LD in hybrids is expected to decay faster with increasing number of backcrosses generations (Lexer ). Reduced rates of current admixture between species may be a consequence of strong environmental selection causing locally adapted hybrids and parentals (De La Torre ), or other early-evolving endogenous or exogenous isolating mechanisms (Andrew & Rieseberg 2013). Results of the genomic analysis indicated a strong asymmetry in introgression towards Engelmann spruce, which may be explained by demographic processes during range expansion and contraction (Excoffier ). If a species expands its range into that of a congener, and the two species hybridize, this would also produce a moving hybrid zone during the expansion (Buggs 2007). This suggests that introgression may have first occurred between Engelmann spruce as the local species and white spruce as the invading species, contributing to the asymmetrical patterns of introgression observed.

Functional roles of genes exhibiting outlier behaviour

Results of the Fst outlier and environmental association analyses identified a small percentage of candidate SNP loci that deviate from selective neutrality (6.4% of loci were significant in each analysis alone, with 1.9% common to both analyses). The small number of loci identified as putatively adaptive is typical of recent genome scans in plant and animal species (Strasburg ). As we used all of our candidate loci to determine the population covariance matrix for Bayenv and the underlying neutral distribution of Fst, our results are likely somewhat more conservative than using independent, non-candidate or non-coding loci. However, the Fst outlier analysis may still result in false positives due to secondary contact and introgression between these species resulting in neutral population structure (Lotterhos & Whitlock 2014). Loci suggesting diversifying selection that were both Fst outliers and associated with environmental variables were linked to signal transduction, transcription regulation and carbohydrate metabolism, all of which can be important gene functions in local adaptation to climate. Immunophilins (SNP 13_496) are intracellular receptors for the immunosupressants FK506 and rapamycin, which inhibit different signalling pathways required for T-cell activation (Luan ). Studies in Arabidopsis and rice have suggested that FK506 homologs are encoded by a small gene family in higher plants. Leaf-specific immunophilins are regulated by light in fava beans, suggesting they may play a role in plants light signal transduction (Luan ). No apical meristem (Nam) genes (SNP 295_78) are part of the NAC domain proteins, which are plant-specific transcriptional factors known to play important role in plant developmental processes (Hu ). It has been suggested that NAC genes may also play an important role in wood formation and secondary cell wall biosynthesis in Populus trichocarpa (Hu ). Carbohydrates play an important role in protecting cellular membranes from freezing injury by reducing osmotic potential across the membranes and by stabilizing them (Holliday ). Triggered by the cessation of growth and the onset of dormancy, carbohydrate metabolism is modified towards the accumulation of storage compounds, cryoprotective or dehydration-protective solutes and reductive co-factors, increasing cold hardiness in Sitka spruce (Picea sitchensis) (Dauwe ). In this study, we have found two loci with functions related to carbohydrate metabolism, SNP 45_1067 (alpha-amylase) and SNP 68_286 (glycosyl hydrolase). Sugar content and cold hardiness are positively correlated in several tree species as Scots pine (Pinus silvestris), lodgepole pine (Pinus contorta), Norway spruce (Picea abies) and red spruce (Picea rubens) (Ogren ).

Environmental gradients and local adaptation

The Engelmann-white spruce hybrid zone is structured along elevational gradients, and many climatic and other environmental factors vary with elevation and with latitude in this topographically heterogeneous region. For example, trees at lower elevations experience hotter, drier summers and colder winters, while those at high elevations experience deeper winter snowpacks, shorter growing seasons and colder summers, with frost events not infrequent during the growing season (De La Torre ). SNPs showing significant associations with environmental factors reflect the complexity of local adaptation to various environmental factors. Some loci are associated primarily with climatic variables reflecting summer temperatures and moisture deficits. For example, SNPs 208pg12875c, 295_78 and WS-2.0-GQ03105.B7-012.3_654 are associated with mean warmest month temperature, summer heat–moisture index or climate-moisture deficit or mean summer precipitation. Others SNPs (e.g. WS-2.0-GQ0041.BR_J07.2-36) are predominantly associated with precipitation-related variables such as mean annual precipitation and precipitation as snow. Still others (e.g. SNPs 13_496, 45_1067 and 206_435) are significantly associated with estimates of winter temperatures, with the mean coldest month temperature, continentality and degree-days below freezing as associated environmental variables. These associated SNPs are likely within or linked to genes that affect different traits involved in local adaptation to different environmental factors across these steep environmental gradients in southern and central British Columbia.

Maintenance of species boundaries in spruce

This study contributes to the long-standing debate on the maintenance of white spruce and Engelmann spruce species identities despite hybridization, by viewing the evolutionary relationships between these two species based on the genic view of speciation (Wu 2001). White and Engelmann spruce appear to have highly porous genomes, yet a small number of widely distributed genes under selection likely counteract the homogenizing effects of gene flow and prevent introgression in surrounding regions, maintaining species differences despite interspecific gene flow. The maintenance of species integrity between hybridizing species has been reported for several plant and animal species, challenging the view that species differentiation is a genome-wide phenomenon. Sambatti found that a large number of small divergent genomic regions between hybridizing species Helianthus annuus and H. petiolaris maintained species differences despite extensive interspecific gene flow. DeFaveri found that selection acting on multiple loci widely and heterogeneously distributed across the genome has contributed to the divergence between stickleback populations despite high levels of gene flow. Even in more advanced stages of speciation, sympatric species may share parts of their genomes as a consequence of ongoing gene flow, as has been seen in Heliconius butterfly species (Martin ). When analysing allele frequency differentials between white spruce, Engelmann spruce and hybrids, we found that most alleles freely cross species barriers, whereas a small number of them are restricted to one parental species and their hybrids, with greater divergence between species. A subset of these loci apparently impermeable to introgression between species was identified in the outlier and the environmental association analyses, suggesting they are under selection or linked to loci under selection, and in relation to adaptive differences between white, Engelmann spruce and their hybrids. There are likely many more such loci across the genome as we studied a relatively small number of candidate loci. These putatively adaptive loci point to some important gene functions in adaptation to winter regimes. Within this spruce hybrid complex, both parental species and their hybrids are locally adapted to different environments found along elevational gradients, in which key factors for survival are adaptation to the length of the growing seasons and the depth of the snowpack (De La Torre ). The presence of locally adapted populations and apparently reduced rates of current interspecific gene flow suggest that hybrid populations in this complex may be in the early stages of ecological speciation, with hybrid populations undergoing the transition from local adaptation to incipient homoploid species (Andrew & Rieseberg 2013). Further studies of fine-scale linkage and the size and distribution of haplotype blocks from parental species using newly available genomic tools will help to elucidate this process.
  33 in total

1.  Inference of population structure using multilocus genotype data.

Authors:  J K Pritchard; M Stephens; P Donnelly
Journal:  Genetics       Date:  2000-06       Impact factor: 4.562

Review 2.  Of glaciers and refugia: a decade of study sheds new light on the phylogeography of northwestern North America.

Authors:  Aaron B A Shafer; Catherine I Cullingham; Steeve D Côté; David W Coltman
Journal:  Mol Ecol       Date:  2010-09-17       Impact factor: 6.185

3.  Using environmental correlations to identify loci underlying local adaptation.

Authors:  Graham Coop; David Witonsky; Anna Di Rienzo; Jonathan K Pritchard
Journal:  Genetics       Date:  2010-06-01       Impact factor: 4.562

4.  Back to nature: ecological genomics of loblolly pine (Pinus taeda, Pinaceae).

Authors:  Andrew J Eckert; Andrew D Bower; Santiago C González-Martínez; Jill L Wegrzyn; Graham Coop; David B Neale
Journal:  Mol Ecol       Date:  2010-08-13       Impact factor: 6.185

5.  Ice-age endurance: DNA evidence of a white spruce refugium in Alaska.

Authors:  Lynn L Anderson; Feng Sheng Hu; David M Nelson; Rémy J Petit; Ken N Paige
Journal:  Proc Natl Acad Sci U S A       Date:  2006-08-07       Impact factor: 11.205

6.  Metabolic dynamics during autumn cold acclimation within and among populations of Sitka spruce (Picea sitchensis).

Authors:  Rebecca Dauwe; Jason A Holliday; Sally N Aitken; Shawn D Mansfield
Journal:  New Phytol       Date:  2012-01-16       Impact factor: 10.151

7.  Phylogeography of spruce beetles (Dendroctonus rufipennis Kirby) (Curculionidae: Scolytinae) in North America.

Authors:  Luana S Maroja; Steven M Bogdanowicz; Kimberly F Wallin; Kenneth F Raffa; Richard G Harrison
Journal:  Mol Ecol       Date:  2007-06       Impact factor: 6.185

8.  Defense mechanisms against herbivory in Picea: sequence evolution and expression regulation of gene family members in the phenylpropanoid pathway.

Authors:  Ilga Porth; Björn Hamberger; Richard White; Kermit Ritland
Journal:  BMC Genomics       Date:  2011-12-16       Impact factor: 3.969

9.  Genome-wide evidence for speciation with gene flow in Heliconius butterflies.

Authors:  Simon H Martin; Kanchon K Dasmahapatra; Nicola J Nadeau; Camilo Salazar; James R Walters; Fraser Simpson; Mark Blaxter; Andrea Manica; James Mallet; Chris D Jiggins
Journal:  Genome Res       Date:  2013-09-17       Impact factor: 9.043

10.  Evaluation of demographic history and neutral parameterization on the performance of FST outlier tests.

Authors:  Katie E Lotterhos; Michael C Whitlock
Journal:  Mol Ecol       Date:  2014-04-11       Impact factor: 6.185

View more
  11 in total

Review 1.  Insights into conifer giga-genomes.

Authors:  Amanda R De La Torre; Inanc Birol; Jean Bousquet; Pär K Ingvarsson; Stefan Jansson; Steven J M Jones; Christopher I Keeling; John MacKay; Ove Nilsson; Kermit Ritland; Nathaniel Street; Alvin Yanchuk; Philipp Zerbe; Jörg Bohlmann
Journal:  Plant Physiol       Date:  2014-10-27       Impact factor: 8.340

2.  Molecular proxies for climate maladaptation in a long-lived tree (Pinus pinaster Aiton, Pinaceae).

Authors:  Juan-Pablo Jaramillo-Correa; Isabel Rodríguez-Quilón; Delphine Grivet; Camille Lepoittevin; Federico Sebastiani; Myriam Heuertz; Pauline H Garnier-Géré; Ricardo Alía; Christophe Plomion; Giovanni G Vendramin; Santiago C González-Martínez
Journal:  Genetics       Date:  2014-12-29       Impact factor: 4.562

Review 3.  Hybrid zones: windows on climate change.

Authors:  Scott A Taylor; Erica L Larson; Richard G Harrison
Journal:  Trends Ecol Evol       Date:  2015-05-13       Impact factor: 17.712

Review 4.  Biological invasions, climate change and genomics.

Authors:  Steven L Chown; Kathryn A Hodgins; Philippa C Griffin; John G Oakeshott; Margaret Byrne; Ary A Hoffmann
Journal:  Evol Appl       Date:  2014-12-09       Impact factor: 5.183

5.  Genetic architecture and genomic patterns of gene flow between hybridizing species of Picea.

Authors:  A De La Torre; P K Ingvarsson; S N Aitken
Journal:  Heredity (Edinb)       Date:  2015-03-25       Impact factor: 3.821

6.  Contrasting Rates of Molecular Evolution and Patterns of Selection among Gymnosperms and Flowering Plants.

Authors:  Amanda R De La Torre; Zhen Li; Yves Van de Peer; Pär K Ingvarsson
Journal:  Mol Biol Evol       Date:  2017-06-01       Impact factor: 16.240

7.  Environmental Genome-Wide Association Reveals Climate Adaptation Is Shaped by Subtle to Moderate Allele Frequency Shifts in Loblolly Pine.

Authors:  Amanda R De La Torre; Benjamin Wilhite; David B Neale
Journal:  Genome Biol Evol       Date:  2019-10-01       Impact factor: 3.416

8.  Genotype-environment associations support a mosaic hybrid zone between two tidal marsh birds.

Authors:  Jennifer Walsh; Rebecca J Rowe; Brian J Olsen; W Gregory Shriver; Adrienne I Kovach
Journal:  Ecol Evol       Date:  2015-12-29       Impact factor: 2.912

9.  Growth gains from selective breeding in a spruce hybrid zone do not compromise local adaptation to climate.

Authors:  Ian R MacLachlan; Sam Yeaman; Sally N Aitken
Journal:  Evol Appl       Date:  2017-09-03       Impact factor: 5.183

10.  Demographic Inference of Divergence and Gene Exchange Between Castanopsis fabri and Castanopsis lamontii.

Authors:  Ye Sun; Xiangying Wen
Journal:  Front Plant Sci       Date:  2020-03-05       Impact factor: 5.753

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.