Literature DB >> 24820116

Single nucleotide polymorphisms reveal genetic structuring of the carpathian newt and provide evidence of interspecific gene flow in the nuclear genome.

Piotr Zieliński1, Katarzyna Dudek1, Michał Tadeusz Stuglik1, Marcin Liana2, Wiesław Babik1.   

Abstract

Genetic variation within species is commonly structured in a hierarchical manner which may result from superimposition of processes acting at different spatial and temporal scales. In organisms of limited dispersal ability, signatures of past subdivision are detectable for a long time. Studies of contemporary genetic structure in such taxa inform about the history of isolation, range changes and local admixture resulting from geographically restricted hybridization with related species. Here we use a set of 139 transcriptome-derived, unlinked nuclear single nucleotide polymorphisms (SNP) to assess the genetic structure of the Carpathian newt (Lissotriton montandoni, Lm) and introgression from its congener, the smooth newt (L. vulgaris, Lv). Two substantially differentiated groups of Lm populations likely originated from separate refugia, both located in the Eastern Carpathians. The colonization of the present range in north-western and south-western directions was accompanied by a modest loss of variation; admixture between the two groups has occurred in the middle of the Eastern Carpathians. Local, apparently recent introgression of Lv alleles into several Lm populations was detected, demonstrating increased power for admixture detection in comparison to a previous study based on a limited number of microsatellite markers. The level of introgression was higher in Lm populations classified as admixed than in syntopic populations. We discuss the possible causes and propose further tests to distinguish between alternatives. Several outlier loci were identified in tests of interspecific differentiation, suggesting genomic heterogeneity of gene flow between species.

Entities:  

Mesh:

Substances:

Year:  2014        PMID: 24820116      PMCID: PMC4018350          DOI: 10.1371/journal.pone.0097431

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Most species are genetically structured, and genetic structure is often observed at multiple spatial scales [1], [2]. Genetic structure is the result of a complex interplay of drift, gene flow, and natural selection acting on standing variation and new mutations [3]–[5]. The relative importance of these evolutionary forces is contingent on biological features of the organisms [6], and has also been affected by large-scale historical events, such as the Pleistocene climatic oscillations [2], [7]. Identification of factors responsible for the observed spatial structuring of genetic diversity is a major goal of population genetics [8]. The quantification and understanding of genetic structure within species are of fundamental importance for inferential studies of population history, population ecology and biodiversity conservation [3], [9], [10]. Analyses of genetic structure are also essential for several aspects of the study of adaptation [5], [11]–[18]. Genetic variation within species is commonly structured in a hierarchical manner which may result from superimposition of processes acting at different spatial and temporal scales. For example the impact of major climatic oscillations is clearly visible in the patterns of genetic differentiation observed currently in temperate and boreal species [2],[19]. This is believed to reflect mainly secondary contact and partial admixture of populations derived from separate refugia with a contribution of processes related to the expansion itself, such as allele surfing [4], [7]. Within these major geographic groups, populations are differentiated due to limited dispersal producing isolation by distance [20], [21]. In species with limited dispersal capabilities, signatures of past subdivision are detectable for a long time [2], [22]. This may be due to a combination of limited dispersal per se and differential adaptation in refugia superimposed on contemporary ecological gradients [23]. Admixture may also be delayed or prevented by the accumulation of intrinsic incompatibilities between populations [24], [25] and poor dispersers appear to speciate on smaller geographic scales [26]. Thus studying contemporary genetic structure in taxa characterized by limited dispersal is likely to provide ample information about historical and demographic events. Amphibians and in particular salamanders are ideal for such inferences [27], [28]. Another advantage of such taxa is that they retain historical information about spatial variation of genetic exchange with related, incompletely reproductively isolated species ([29] and references therein). Detecting, quantifying and interpreting genetic structure requires appropriate tools. Single nucleotide polymorphisms (SNPs) are powerful markers well suited for assessing genetic structure [30], [31]. They are amendable to high throughput, cost-effective and reliable genotyping through array-based [32] and genotyping by sequencing [33] approaches. If SNP discovery and genotyping are performed separately, the researcher has control over the location and other characteristics of the SNPs selected for genotyping. For instance, if polymorphism data from transcriptome sequencing are available, SNPs located in known protein coding genes can be selected for genotyping, providing information about variation in functionally important regions. On the other hand, a discover-then-genotype approach introduces ascertainment bias, which distorts the picture of variation obtained from a larger sample [34]. However, initial discovery of SNPs in a smaller sample may be desirable for some applications, such as detection and quantification of population structure [35], [36]. This is because the discovery process is biased towards more variable SNPs, thus increasing the per-marker information content, especially if SNP discovery is performed in a random or geographically diverse sample [37]. A distinct advantage of SNPs over microsatellites is that orders of magnitude more locations in the genome, can be easily interrogated. Thus SNPs offer a truly genome-wide perspective, essential if the biological processes of interest affect portions of the genome differentially [38]–[40]. Here we investigate the genetic structure of populations of the Carpathian newt (Lissotriton montandoni, Lm), a species which has apparently survived the glacial period in the Carpathians [41], an important refugial area [42], [43]. Two processes appear to have profoundly affected this species and shaped the genetic structure currently observed. The first includes climatic oscillations during the Pleistocene, likely responsible for the observed regional-scale genetic structuring. The second involves hybridization with and introgression from its widely distributed congener, the smooth newt (L. vulgaris, Lv). Previous studies [29], [41], [44], [45] demonstrated that Lm is genetically differentiated across its range in both mitochondrial and nuclear (microsatellites) genome. Patterns of genetic differentiation and species distribution modeling performed by Zieliński et al. [29] suggest several glacial refugia in the Carpathians. While multiple, spatially and temporally distinct introgression events from Lv resulted in complete mtDNA replacement in Lm, very little recent interspecific nuclear gene flow was suggested by microsatellite markers [29]. However, interspecific gene flow in some parts of the nuclear genome has been extensive, as evidenced by data from the Major Histocompatibility Complex genes [45]. A set of transcriptome-derived SNPs and extensive sampling are used herein to address the following issues. First, we compare genetic structure inferred from the genome-wide sample of SNP markers with that estimated previously [29] from a much smaller number of microsatellites. Specifically we wanted to determine the number of genetic clusters (which may correspond to glacial refugia) supported by SNP markers, delineate their distribution and estimate the extent of admixture between them. To this end we use a comprehensive, uniform sampling, including previously undersampled Ukrainian Carpathians where admixture between genetic clusters was expected. Second, we test whether introgression from Lv is detectable in the nuclear genome of Lm with an increased number of markers, and if so, whether introgression varies geographically. Populations in which both species co-occur were also sampled across the range to estimate the admixture in syntopy. Third, we apply outlier analysis to test heterogeneity of gene flow within and between species and identify genes departing from the genomic average; such genes may be involved in population or species-specific adaptations.

Material and Methods

Sample collection

Altogether we analyzed 473 individuals from 40 populations: 25 populations of Lm (298 individuals), 7 syntopic populations in which both species co-occur (83 individuals) and 8 populations of Lv (92 individuals) (Fig. 1; Table 1). Sampling sites were selected to cover the Lm range uniformly and to reflect Lv diversity in the surrounding areas. The average per population sample size, 12, might be considered low for some of the population genetic analyses, however we decided to rather maximize the number of markers as it was shown that this might be beneficial for robust landscape genetic inferences [46]. Throughout the text, we use terms population or locality interchangeably to refer to a particular breeding site consisting of one or more closely located water bodies. Adult newts were sampled by dip-netting during breeding season. Animals were released after tailtips were collected. Tissue samples were stored in 95% ethanol until DNA extraction. DNA was extracted using the Wizard Genomic DNA Purification Kit (Promega).
Figure 1

The distribution of sampling localities (details in Table 1).

Red triangles — Lissotriton montandoni (Lm); green circles— L. vulgaris (Lv); two symbols superimposed — syntopic locality where both species co-occur; T – localities from which six Lm individuals were sampled for liver transcriptomes. The distribution of Lm (Zavadil et al. 2003 and own unpublished data) is hatched. Areas above 500 m a.s.l. are shaded.

Table 1

Sampling.

NumberLocalityCountryN individualsSpeciesLongitude ELatitude N
1JesenikiCZ12 Lm 17.3150.07
2SihelneSK12 Lm 19.3949.51
3ŁopusznaPL14 Lm 20.1449.51
4KrempnaPL12 Lm 21.4849.47
5SmerekPL12 Lm 22.4449.16
6RakovetsUA12 Lm 24.0449.63
7PereprostynyaUA12 Lm 23.3649.21
8ZbynyUA12 Lm 22.9548.80
9Vyshkivsky PassUA12 Lm 23.6348.70
10MykulychynUA12 Lm 24.5948.38
11DzembroniaUA9 Lm 24.6248.07
12SuceviţaRO12 Lm 25.6847.75
13Pasul GutâiRO12 Lm 23.7847.70
14Pasul PascanuRO12 Lm 25.5247.57
15RomuliRO12 Lm 24.3947.51
16Lunca LeşuluiRO12 Lm 24.8647.30
17HoldaRO12 Lm 25.6747.27
18CuejdiuRO12 Lm 26.2747.00
19Lacu RoşuRO12 Lm 25.7746.78
20BrădeţeluRO12 Lm 25.1346.76
21BolătăuRO12 Lm 26.3846.64
22Pasul Musat 1RO12 Lm 26.4045.96
23SăceleRO12 Lm 25.8245.55
24PredealRO11 Lm 25.5645.49
25VoinaRO12 Lm 25.0545.44
26KuźminaPL11 Lm&Lv 22.4349.62
27Bronytsya 1UA12 Lm&Lv 23.4149.43
28LyuchaUA12 Lm&Lv 24.9148.38
29LypchaUA12 Lm&Lv 23.3848.28
30Valea UzuluiRO12 Lm&Lv 26.3046.34
31PenteleuRO12 Lm&Lv 26.3545.61
32LereştiRO12 Lm&Lv 25.0645.37
33PokrzywnaPL11 Lvv 17.4550.29
34JasłoPL12 Lvv 21.4949.74
35LipníkySK13 Lvv 21.4249.07
36RosilnaUA8 Lvv 24.3448.80
37DertsenUA12 Lva 22.6548.35
38StrâmbaRO12 Lvv 24.6547.25
39TazlăuRO12 Lvv 26.4746.73
40Pasul Musat 2RO12 Lvv 26.4045.97

Country abbreviations: CZ – Czech Republic, PL – Poland, RO – Romania, SK – Slovakia, UA – Ukraine. Species and subspecies abbreviations: Lm – Lissotriton montandoni, Lva – L. vulgaris ampelensis, Lvv – L. v. vulgaris, Lm&Lv – syntopic populations.

The distribution of sampling localities (details in Table 1).

Red triangles — Lissotriton montandoni (Lm); green circles— L. vulgaris (Lv); two symbols superimposed — syntopic locality where both species co-occur; T – localities from which six Lm individuals were sampled for liver transcriptomes. The distribution of Lm (Zavadil et al. 2003 and own unpublished data) is hatched. Areas above 500 m a.s.l. are shaded. Country abbreviations: CZ – Czech Republic, PL – Poland, RO – Romania, SK – Slovakia, UA – Ukraine. Species and subspecies abbreviations: LmLissotriton montandoni, Lva – L. vulgaris ampelensis, Lvv – L. v. vulgaris, Lm&Lv – syntopic populations.

Ethics statement

Animal samples were collected under permits: DOPozgiz-4200/II-78/3702/10/JRO provided by the Polish General Director for Environmental Protection, 03.04.12 No. 67 provided by the National Academy of Sciences of Ukraine and 3256/9.07.2010 provided by the Romanian Commission for Protection of Natural Monuments. Samples were collected with institutional animal ethics approval (number 101/2009), issued by the First Local Ethical Committee on Animal Testing at the Jagiellonian University in Krakow. Tissue samples were collected according to the requirements of these institutions: adult newts were captured by dip netting and tissue samples from tail tips were taken under anesthesia. Immediately after recovery from anesthesia, newts were released at the collection site. The sampling locations were not privately owned or protected in any way.

SNP discovery, assay development, and genotyping

SNPs were identified in liver transcriptomes of six Lm individuals sampled to encompass the genetic diversity of the species (Fig. 1). Transcriptomes were sequenced using Illumina technology and de novo assembled with Trinity [47]. Details of transcriptome sequencing and assembly will be provided elsewhere (Stuglik et al. in prep). We used a custom bioinformatic pipeline [48] to construct transcriptome-based gene models (TGM) from the Trinity output. Reads were mapped to this reference transcriptome with Bowtie2 [49] and SNPs were called with SAMTools 0.1.18. [50]. Next, we used blast searches against Xenopus tropicalis transcripts to identify TGM representing protein coding genes. To be included in the design of the genotyping assay, the SNP had to fulfill the following criteria: i) occur in a TGM which produced an unambiguous hit to a single Xenopus gene and to not exhibit high similarity to other TGMs in the newt reference transcriptome; this criterion was applied to minimize the incidence of false “SNPs” derived from paralogous regions; ii) have a minimum sequencing depth of 15 x and minimum genotype quality of 30 phred; iii) be located at least 60 but not more than 1000 bp from the exon boundary; the latter filter was used because last exons of many genes are long and consist mostly of 3′ untranslated regions (UTR) which are poorly conserved between species; thus the length of such exons in the newt could not be reliably determined and particularly long last “exons” may be artifacts of misassembly [51]. Filtering was performed with a custom Python script. A total of 251 SNPs and their flanking sequences were scored with Illumina Assay Design Tool (ADT) and the Illumina VeraCode GoldenGate Assay was designed for 192 best scoring SNPs. GoldenGate provides codominant genotype data for polymorphic positions with two segregating variants [32]. Genotyping, primary visualization, quality assessment and filtering were performed with Illumina GenomeStudio Data Analysis Software. All loci with cluster separation score and gen train score lower than 0.2 and 0.7, respectively, were excluded from further analysis. We also excluded loci with minor allele frequency (MAF)<1% and less than 90% genotyped individuals.

Population genetics analyses

Allele frequencies for each locus, tests of Hardy–Weinberg equilibrium and tests of linkage disequilibrium (LD) were calculated in GENEPOP 4.1.2 [52]; the type I error was controlled using the false discovery rate (FDR) approach implemented in QVALUE 1.0 [53], [54]. Expected heterozygosity (H E), was calculated in the R package adegenet [55]. Allelic richness (AR) was calculated in FSTAT 2.9.3.2 [56]. We interpolated geographic gradients in H E and AR using inverse distance weighting (IDW) in ArcGIS v 10.0 (ESRI, Redlands, CA, USA). Pairwise F ST values between populations and their significance were computed in Arlequin 3.5 [57]. Multidimensional scaling (MDS) was used for visualization of the F ST matrix. Principal component analysis (PCA) was performed in R using adegenet. The significance of correlations between genetic and geographical distances was calculated using the Mantel test implemented in IBDWS [58]. A population tree was constructed in POPTREE2 [59] from the pairwise F ST [60] matrix using the neighbor-joining method. The number of genetic clusters was determined and assignment of individuals to clusters was performed using the Bayesian approach implemented in STRUCTURE 2.3.3 [61]–[63]. We ran Structure on two separate datasets: Lm, which included only morphologically pure Lm populations and Lm&Lv comprising Lm, Lv and syntopic populations. We ran 10 analyses for each K 1-15 for Lm and 10 replicate runs for each K 1–20 for Lm&Lv dataset. In each case, the admixture model was applied and the runs consisted of 250 000 MCMC burnin steps followed by 1 000 000 post-burnin iterations. We performed inferences under the model of correlated allele frequencies for Lm, whereas the uncorrelated model was used for Lm&Lv dataset, because Lm and Lv populations were expected to be more divergent on average. To determine the most likely number of genetic clusters supported by the data, we calculated ΔK, a measure of second order rate of change in the likelihood of data [64], using the online software Structure Harvester [65]. Analysis of molecular variance (AMOVA) in Arlequin was used to partition SNP variation into hierarchical levels. Two groupings of populations were used: i) suggested by Structure; ii) supported by the methods based on genetic distances between populations. Significance levels for variance components were estimated using 10 000 permutations. To identify markers departing significantly from the genome-wide average of differentiation among populations a scan for F ST outliers was performed. In order to minimize the number of false positives the outlier detection was performed under a hierarchical island model [66] in Arlequin. We performed separate scans for differentiation within Lm and for interspecific differentiation. In each case 50 000 coalescent simulations with 2 groups of 100 demes were performed to obtain the null distribution of F-statistics. We selected candidate loci based on F ST or F CT values falling into the 1% upper and lower quantile as suggested by Excoffier et al. [67]. F CT allows for identification of outlier loci between the groups of populations whereas F ST identifies outlier loci among populations after accounting for higher-level structure [67]. Genes containing outliers were annotated by similarity blastx search against the nr protein database.

Results

Variation

Genotyping was successful for 139 out of 192 markers (72%) and these were used for population genetic analyses (Table S1) (DRYAD entry: doi:10.5061/dryad.211ck). The proportion of missing data among 473 genotyped individuals was very low (<0.2% single-locus genotypes). All 139 markers were polymorphic in Lm and 112 (81.6%) in Lv (Table S1). No significant deviations from Hardy–Weinberg expectations were detected at the FDR 0.05 which indicates that null alleles are very rare in our markers (Table S2). Tests for linkage disequilibrium across populations detected significant LD at the FDR 0.05 for 12 pairs of loci, however significant results were found only in three syntopic populations. Thus significant LD resulted not from physical linkage but from local admixture, and we consider all markers as segregating independently. The expected heterozygosity (H E) ranged from 0.05 (locality 33, Lv) to 0.29 (locality 29, syntopic) with a mean of 0.19 (SD = 0.07). H E was significantly higher in Lm than Lv (U-test, P = 2.9×10−5) most likely due to ascertainment bias, as only markers known to be polymorphic in Lm were assayed. Within Lm H E was lowest in population 6, isolated in the Ukrainian Podolian Upland and highest in population 20 located in the Romanian Transylvanian Plateau, and ranged from 0.19 to 0.27 respectively, with a mean of 0.23 (SD = 0.01) (Fig. S1). Within Lv H E ranged from 0.05 in the northernmost locality 33 to 0.10 in locality 36 with a mean of 0.08 (SD = 0.01). H E in syntopic populations spanned a broad range from very low (0.06, locality 31) to the highest overall (0.29, locality 29), most likely depending on the frequency of both species in the population (Fig. S1b).

Genetic structure and diversity in Lissotriton montandoni

Genetic differentiation between Lm populations varied from negligible (F ST<0.001 P = 0.39 between 14 and 17 located in the northern part of the Romanian Carpathians), to strong (F ST = 0.408 P<10−3 between populations 1 and 24 at the opposite limits of the species distribution) (Table S3). The MDS plot of pairwise F ST revealed two major, genetically differentiated groups of populations with distinct geographic distributions in the northern and southern part of the species range (Fig. 2a). Pairwise F ST values within groups were similar (averages of 0.100 and 0.098 in the northern and southern group, respectively), and overlapped only slightly with the distribution of pairwise F ST between populations from different groups (mean 0.271; randomization test P<0.001; Fig. S2). Within the northern group two populations appeared distinct from the rest. Notably both are isolated from the continuous part of the species range (Fig. 1). The westernmost locality 1 in the Sudetes Mountains is separated from the main range by the Moravian Gate and locality 6 in the Ukrainian Podolian Upland by the Dniester river. The two groups of Lm populations are strongly supported also by the population tree (Fig. 3).
Figure 2

Genetic differentiation within L. montandoni.

(a) Non-metric two-dimensional scaling of the pairwise F ST matrix; orange – populations from the northern group; blue – populations from the southern group; (b) Principal Component Analysis (PCA) performed on individual genotypes; in parentheses percentage of variance explained by principal components; orange – individuals from the northern group; blue – individuals from the southern group.

Figure 3

Relationships among populations.

A neighbor-joining tree was constructed from the matrix of pairwise F ST; syntopic populations were excluded. Robustness of relationships was tested with 1000 bootstrap replicates. Red – L. montandoni: orange – northern, blue – southern group; green – L. vulgaris: yellow – populations in the Carpathian Basin, violet – populations outside the Carpathian Basin.

Genetic differentiation within L. montandoni.

(a) Non-metric two-dimensional scaling of the pairwise F ST matrix; orange – populations from the northern group; blue – populations from the southern group; (b) Principal Component Analysis (PCA) performed on individual genotypes; in parentheses percentage of variance explained by principal components; orange – individuals from the northern group; blue – individuals from the southern group.

Relationships among populations.

A neighbor-joining tree was constructed from the matrix of pairwise F ST; syntopic populations were excluded. Robustness of relationships was tested with 1000 bootstrap replicates. Red – L. montandoni: orange – northern, blue – southern group; green – L. vulgaris: yellow – populations in the Carpathian Basin, violet – populations outside the Carpathian Basin. The presence of two genetic clusters is also evident from individual-based analyses. In principal component analysis (PCA) PC1 (15.8% of variance explained) separated newts from northern and southern populations (Fig. 2b). The Evanno [64] method also supported K = 2 as the most likely number of clusters in the Structure analysis (Fig. S3). Structure detected some admixture between the two clusters in the Ukrainian Carpathians and the northern part of the Romanian Carpathians. Admixture was strongest in population 10 which was therefore excluded from the AMOVA analysis (Fig. 4). AMOVA attributed 19.5% of total variation to differentiation between clusters and 8.2% to differentiation between populations within clusters (Table 2). Whereas no alleles were private to any population, 8 and 20 alleles were private for the northern and southern group, respectively. No significant differences between groups were detected in H E (U-test, P = 0.37), but allelic richness was higher in the southern group (1.62 vs, 1.57; U-test, P = 0.0042) (Fig. S4). As could have been expected from the significant among-population differentiation, there was a strong, highly significant isolation by distance observed both at the level of the entire species (Mantel test, r = 0.89, P<0.001) and within genetic clusters (North: r = 0.78, P<0.001; South: r = 0.78, P<0.001) (Fig. S5).
Figure 4

Genetic structure of L. montandoni inferred by Structure for K = 2 groups.

For each population pie charts show the fraction of the genes from the northern (orange) and southern (blue) groups.

Table 2

Results of the Analysis of Molecular variance (AMOVA).

Source of variationd.f.Sum of squaresPercentage of variation explainedp
Two groups within L. montandoni
Among groups11260.05419.52<0.0001
Among populations within groups221330.688.16<0.0001
Within populations5488988.3672.32<0.0001
Two species
Among groups14019.13643.47<0.0001
Among populations within groups313545.32313.31<0.0001
Within populations74710316.70843.22<0.0001

AMOVAs were performed for: i) two groups within L. montandoni; ii) two species (excluding syntopic populations).

Genetic structure of L. montandoni inferred by Structure for K = 2 groups.

For each population pie charts show the fraction of the genes from the northern (orange) and southern (blue) groups. AMOVAs were performed for: i) two groups within L. montandoni; ii) two species (excluding syntopic populations).

Differentiation within Lissotriton vulgaris

Genetic structure among Lv populations around the Carpathians is stronger and presumably deeper than that within Lm. A deep split between populations outside the Carpathian belt and those in the Carpathian Basin is visible in the population tree (Fig. 3), MDS (Fig. 5a) and PCA plot (PC3, Fig. 5c). Structuring within the Carpathian Basin is also pronounced, as substantial genetic distance separates the single analysed population of L. vulgaris ampelensis (locality 38) from two populations of the nominal subspecies L. vulgaris vulgaris (Fig. 3, 5a, 5c).
Figure 5

Genetic differentiation between L. montandoni (Lm) and L. vulgaris (Lv).

(a) Non-metric two-dimensional scaling of the matrix of pairwise F ST between populations; red triangles – Lm; green circles – Lv; grey diamonds – syntopic populations; (b) and (c) Principal Component Analysis (PCA) performed on individual genotypes; in parentheses percentage of variance explained by principal components; red triangles – Lm; green circles – Lv; grey diamonds – individuals from syntopic populations.

Genetic differentiation between L. montandoni (Lm) and L. vulgaris (Lv).

(a) Non-metric two-dimensional scaling of the matrix of pairwise F ST between populations; red triangles – Lm; green circles – Lv; grey diamonds – syntopic populations; (b) and (c) Principal Component Analysis (PCA) performed on individual genotypes; in parentheses percentage of variance explained by principal components; red triangles – Lm; green circles – Lv; grey diamonds – individuals from syntopic populations.

Genetic differentiation and gene flow between species

Strong differentiation between Lm and Lv was detected by the population-based analyses (Table S3, Fig. 3, 5a). AMOVA revealed that 43.5% variation was distributed between species and 13.3% between populations within species (Table 2). Three of seven syntopic populations occupied intermediate positions in the MDS plot (Fig. 5a). However, an overwhelming majority of newts in syntopic populations fell within the range of variation of either one or the other species; only a handful of substantially admixed individuals and possibly a single F1 hybrid were detected (Fig. 5b, 6). K = 2 was strongly supported by Structure when the two species were analyzed together (Fig. 6, S6). Structure confirmed that in syntopic populations admixture is limited, genotypes of two parental species dominate and significantly admixed individuals are rare. Structure also provided an important insight which was not visible in PCA results: a clearly detectable (>3%) admixture of Lv genes in four Lm populations. Three of these were in the southern part of the Romanian Eastern Carpathians (localities 19, 20, 25) and one was the isolated locality 1 in the Sudetes Mountains; the average admixture in these populations was 8.5%. No admixture of Lm genes was detected in Lv populations. The comparison of the average proportion of admixture on both genetic backgrounds in syntopic populations demonstrated that the mean admixture was very low, ca. 2% and that the proportion of admixture did not differ (U-test, P = 0.24) between Lv and Lm backgrounds.
Figure 6

Genetic differentiation and admixture between L. montandoni and L. vulgaris inferred by Structure for K = 2 groups.

For each population pie charts show the fraction of L. montandoni (red) and L. vulgaris (green) genes.

Genetic differentiation and admixture between L. montandoni and L. vulgaris inferred by Structure for K = 2 groups.

For each population pie charts show the fraction of L. montandoni (red) and L. vulgaris (green) genes.

Detection of outliers

The scan for outliers performed in Lm (locality 10 excluded, see above) revealed 14 F ST outliers (10.0%) at the significance level of 0.01: 9 loci (9, 17, 25, 28, 34, 45, 57, 73, 137) showed an excess of differentiation among populations (candidates for local adaptations) and 5 (4, 6, 22, 84, 93) loci were less differentiated than expected under neutrality (candidates for balancing selection) (Fig. 7a). Three F CT outliers (45,73,75) were identified as candidates for diversifying selection between the northern and southern Lm groups and two (72, 89) as candidates for balancing selection (Fig. 7b).
Figure 7

Detection of outlier loci from genome scans.

(a) F ST outliers in L. montandoni, (b) F CT outliers in L. montandoni, (c) F CT outliers in interspecific analysis – markers at the extremes of interspecific differentiation.

Detection of outlier loci from genome scans.

(a) F ST outliers in L. montandoni, (b) F CT outliers in L. montandoni, (c) F CT outliers in interspecific analysis – markers at the extremes of interspecific differentiation. Screening for outlier loci between Lm and Lv was performed excluding syntopic populations (26–32). A total of nine (6.5%) F CT outlier loci were identified: eight (15, 72, 79, 112, 116, 117, 128, 129) were more differentiated than expected under the neutral model and one (126) was less differentiated (Fig. 7c). Only one of the interspecific outliers was a nonsynonymous polymorphism. Locus 72 located within gene SRSF1, involved in splicing regulation, was classified as a candidate for balancing selection between the northern and southern Lm groups and at the same time as a candidate for divergent selection between species.

Discussion

Isolation in glacial refugia and limited dispersal determine the genetic structure of Lm

Two clearly differentiated genetic units were identified in Lm by SNP data: the northern group in the Western Carpathians and the western part of the Eastern Carpathians, and the southern group across the rest of the species range. Admixture between them occurs around the Romanian-Ukrainian border. Zieliński et al. [29] identified three units in microsatellite data: our southern group combines their eastern and southern units. Can the discrepancy between the two studies be reconciled? Below we argue that SNPs reflect the history and differentiation better than microsatellites and offer several explanations of the apparent discrepancy between the datasets. We consider the alternative explanation that SNP data fail to detect true differentiation unlikely on several grounds. First, an overwhelming majority of pairwise F ST values calculated from SNPs were significant, demonstrating substantial power to detect differentiation. Second, remarkably strong isolation by distance is observed in Lm and the apparent break between the eastern and southern groups of Zieliński et al. [29] coincides with a gap in their sampling. This gap has been filled by the present study. Thus, isolation by distance and non-uniform sampling may have resulted in delineation of the apparently distinct unit [68], [69]. Third, the population tree based on SNPs shows a remarkable pattern consistent with colonization from two refugia, but it is difficult to explain under the assumption of expansion from three refugia. While in both groups relationships between some populations are poorly resolved, in each group several populations are related in a nested fashion, with progressively longer branches at the deeper nesting levels. Nesting involves populations distributed from the center to the periphery of the species range – to the west in the Western Carpathians and to the south-west in the Romanian Carpathians. We hypothesize that the populations with poorly resolved relationships are those inhabiting the refugial areas and sharing most variation retained there. Populations related to each other in a nested fashion would be those which colonized the present range through serial events [4], [70]. Taking the evidence together, we propose that the location of the refugium for the northern group was in the Eastern Carpathians close to the Polish-Ukrainian border, and that the refugium for the southern group was in the central part of the Eastern Carpathians in Romania. Species distribution models for the LGM reported by Zieliński et al. [29] are broadly consistent with the proposed location of refugia. The role of the Carpathians as a major refugium for European biota has recently been well documented [42], [43], [71], [72]. Multiple species show genetic differentiation between the Western, Eastern and Southern Carpathians, pointing to the presence of several refugia (reviewed in [29], [72]). So far, a refugium in the western part of the Eastern Carpathians has to our knowledge not been proposed. Expansion from refugia is commonly accompanied by loss of variation [4], [73]. Reduction of genetic variation along the postulated expansion routes is visible in our data, but the signal is not strong, and may be distorted locally by introgression from Lv (see below). Hence expansion was apparently not accompanied by severe bottlenecks and thus only a minor fraction of variation has been lost. The strongest reduction in genetic variation occurred in locality 6 in the Podolian Upland isolated from the main portion of the range. This population is a remnant of a geographically remote group of Lm populations [74] which has been hypothesized to be isolated from the main part of the range for several thousand years [75]. Salamanders often exhibit low individual mobility and strong philopatry [27], [76]. Genetic differentiation between salamander populations appears to reflect these features although the geographic scale of subdivision differs among species [77], [78], which may be related to habitat characteristics [79], [80] and to life-history traits [81]. In continuous habitats limited dispersal abilities are likely to generate isolation by distance patterns with a gradient of genetic differentiation among sites, on which larger-scale, hierarchical differentiation reflecting geographic or environmental barriers may be superimposed [80], [82]. A comparison of our results with those of a study [83] on fine-scale genetic differentiation in L. vulgaris graecus suggests that a combination of isolation by distance, probably due to limited dispersal, and spatial clustering due to historical fragmentation and/or landscape barriers occurs in Lissotriton newts at both micro- and macroscales.

Local introgression of Lv alleles into the Lm nuclear genome is detectable with SNP markers

A major finding of the present study is substantial introgression of Lv nuclear alleles into some Lm populations. This is contrary to the findings of Zieliński et al. [29] who detected very little recent nuclear introgression in either direction. One likely explanation for the difference between the studies is the number of markers employed [84]. While we analyzed 139 unlinked SNPs, inference about introgression in the previous study was based on only 10 microsatellites. The observed discrepancy does not result from differences in sampling because three of four admixed populations were analysed in both studies. As SNP markers were discovered in a sample of Lm individuals, our study did not use diagnostic markers. This could be considered a weakness if viewed from the perspective of classical studies of hybrid zones which usually employed a limited number of diagnostic markers. However, because of the widespread genomic heterogeneity of interspecific gene flow [38], [40], [85], [86], such diagnostic markers may constitute a highly nonrandom sample of the genome, enriched in genomic regions strongly differentiated between species. In our opinion randomly selected polymorphisms are better suited for an unbiased assessment of introgression. We acknowledge that ideally both species should be included in the discovery panel; this would however limit the number of polymorphic loci useful for the assessment of genetic structure within Lm. The current study demonstrates two peculiar features of Lm x Lv hybridization. First, appreciable (>3%) introgression was detectable only locally, in four of 25 sampled Lm localities. In these populations most individuals were introgressed and the average admixture of Lv genes was 8.5%. Second, in the introgressed Lm populations, admixture was stronger than in seven syntopic localities, where it was barely detectable. Thus current syntopy, even if it leads to occasional hybridization, as shown by a single putative F1 hybrid, does not necessarily cause introgression. This is somewhat surprising because a study of a Lm/Lv hybrid zone at microscale detected strong assortative mating but also found that syntopy was almost universally accompanied by some admixture [44]. As the four admixed Lm populations testify, nuclear introgression of Lv alleles into Lm populations extends beyond syntopy, but does not permeate into the core of the Lm range. Local differences in the extent of introgression may be explained by several mechanisms. The introgressed populations may be simply located at the tails of local hybrid zones, and would thus be sampled entirely by chance. However other potential explanations deserve consideration. Local ecological conditions may either favor introgression or delay removal of introgressed alleles by selection [87]. Differences in abundance of species in a breeding locality may force the rarer species to hybridize due to scarcity of conspecific mates, but we have not observed this effect in syntopic populations. If species are genetically structured, as in our case, introgression may be easier between some genetic groups if their genomes harbor fewer incompatible alleles and thus intrinsic selection against hybrids is weaker or ecological/sexual adaptations are similar [24], [25]. Lv is strongly differentiated genetically [41], [88], [89] and various Lv groups come into contact with Lm populations in the Carpathian Basin, and outside the Carpathian belt. If introgression is neutral, the observed pattern may result from expansion-related phenomena [4]. Under the scenario modeled by Currat et al. [90], when one species invades the range of another, neutral introgression occurs almost exclusively from the resident to the invading species. Thus, local expansion of Lm would bring Lv genes onto its genetic background. A comparison of the two isolated Lm populations may be instructive in this respect. Population 1 at the western margin of the species range, probably the result of postglacial or more recent expansion, has recently introgressed Lv mtDNA and shows clear evidence of nuclear introgression. Another isolated population (6), close to the postulated refugial area of the northern Lm group and possibly surviving in situ for a long time, shows no trace of nuclear introgression. Scenarios related to the Currat et al. [90] model were favored as the explanation of mtDNA introgression and replacement in Lm by Zieliński et al. [29]. In addition to laboratory experiments which are difficult to perform in this system due to logistic reasons, two other kinds of analyses would be informative with respect to the causes of the apparent differentiation in the extent of introgression. Examination of several transects through hybrid zones in the context of local environmental conditions and relative species abundance could be informative as demonstrated in multiple systems [87], [91]–[93]. Another important way forward would be to use multilocus sequence data [94] to construct and test multipopulation models of gene flow between Lm and Lv. Models distinguishing two groups within Lv, inside and outside of the Carpathian basin, as well as two groups within Lm can be evaluated and hypotheses regarding the timing and extent of gene flow may be tested within an Approximate Bayesian Computations framework [95], [96]. This approach would provide a longer-scale perspective on gene flow between species and its spatial and temporal variation.

Genomic heterogeneity of gene flow within and between species

Outlier loci were detected both within Lm and between Lm and Lv. Such candidate loci may signal various forms of selection acting on the markers themselves or at linked sites [24], [38]. Alternatively their apparent outlier status may result from violation of the model assumptions, to which the available methods are very sensitive [13]. We do not attempt a formal functional analysis of the identified outliers but rather emphasize that the outliers detected in the Lm-Lv comparison indicate heterogeneity of interspecific gene flow in nuclear protein coding genes. Dramatic discordance in the propensity for interspecific gene flow occurs between the mitochondrial and nuclear genome ([29]; this study). Within the nuclear genome the genes of MHC class II introgress easily between the two species [45]. The present study suggests that heterogeneity of gene flow is widespread in the nuclear genome. Some genomic regions, typically linked to genes involved in intergenomic incompatibilities or underlying species-specific adaptations, i.e. genes which may cause reduced hybrid fitness, acquire reproductive isolation earlier than other regions [38], [97], [98]. The size of such regions and mechanisms responsible for maintenance of genomic differentiation have been a subject of ongoing controversy and intense recent research [85], [98]–[101]. It is expected that the shape of the heterogeneity in gene flow will evolve over time and a comparison of the extent of heterogeneity at various stages of divergence is of great interest for the understanding of the buildup of genomic divergence as differentiation progresses [40], [86]. Transcriptome data, such as those used here for the development of SNP markers, are being applied to study genomic heterogeneity of gene flow in the Lm/Lv system (Stuglik et al. in prep.).

Conclusions

Using a panel of transcriptome-derived SNP markers, our study has demonstrated that isolation in glacial refugia and limited dispersal have been the main factors determining the genetic structure of Lm. Two substantially differentiated groups of Lm populations likely originated from separate refugia, both located in the Eastern Carpathians. The colonization of the present range in north-western and south-western directions was accompanied by a modest loss of variation. Local introgression of Lv alleles into several Lm populations was detected. Introgression was higher in Lm populations classified as admixed than in syntopic populations. We discuss the possible causes of this discrepancy and propose further tests to distinguish between alternatives. Several outliers were identified in tests of interspecific differentiation, suggesting genomic heterogeneity of gene flow between species. The shape of genomic heterogeneity at various stages of species divergence is of major interest for the understanding of the buildup of differentiation across the genome and Lm/Lv is a promising study system in this respect. Expected heterozygosity. (a) interpolated geographic gradients in L. montandoni (Lm), (b) means for all populations: triangles – Lm, diamonds – syntopic, circles – Lv. (TIF) Click here for additional data file. Histograms showing the distribution of pairwise groups and between groups. (TIF) Click here for additional data file. Identification of the number of groups ( ) in Structure analysis for . (a) Evanno et al. (2005) method; (b) means and standard deviations (SD) of the ln-likelihood of the probability of data for various values of K. (TIF) Click here for additional data file. Interpolated geographic gradients of allelic richness in . (TIF) Click here for additional data file. Isolation by distance in . Relationships between pairwise F ST and log-geographic distances are presented for all populations and populations within northern and southern groups separately. (TIF) Click here for additional data file. Identification of the number of groups ( ) in Structure analysis for and . (a) Evanno et al. (2005) method; (b) means and standard deviations (SD) of the ln-likelihood of the probability of data for various values of K. (TIF) Click here for additional data file. Characteristics of the single nucleotide polymorphisms (SNP) used in the present study. (XLSX) Click here for additional data file. Results of the tests of the Hardy-Weinberg proportions for all loci in all populations. Uncorrected P values are given; “-“ indicates that test was not performed due to insufficient polymorphism. (XLSX) Click here for additional data file. Pairwise <0.05) values marked in red. (XLSX) Click here for additional data file.
  80 in total

1.  Inference of population structure using multilocus genotype data.

Authors:  J K Pritchard; M Stephens; P Donnelly
Journal:  Genetics       Date:  2000-06       Impact factor: 4.562

Review 2.  Dispersal, gene flow, and population structure.

Authors:  A J Bohonak
Journal:  Q Rev Biol       Date:  1999-03       Impact factor: 4.875

3.  Genomic divergence during speciation: causes and consequences.

Authors:  Patrik Nosil; Jeffrey L Feder
Journal:  Philos Trans R Soc Lond B Biol Sci       Date:  2012-02-05       Impact factor: 6.237

4.  Landscape genetics of alpine Sierra Nevada salamanders reveal extreme population subdivision in space and time.

Authors:  Wesley K Savage; Alexander K Fremier; H Bradley Shaffer
Journal:  Mol Ecol       Date:  2010-08       Impact factor: 6.185

5.  Phylogeography of two European newt species--discordance between mtDNA and morphology.

Authors:  W Babik; W Branicki; J Crnobrnja-Isailović; D Cogălniceanu; I Sas; K Olgun; N A Poyarkov; M Garcia-París; J W Arntzen
Journal:  Mol Ecol       Date:  2005-07       Impact factor: 6.185

6.  adegenet: a R package for the multivariate analysis of genetic markers.

Authors:  Thibaut Jombart
Journal:  Bioinformatics       Date:  2008-04-08       Impact factor: 6.937

7.  Sex-specific estimates of dispersal show female philopatry and male dispersal in a promiscuous amphibian, the alpine salamander (Salamandra atra).

Authors:  V Helfer; T Broquet; L Fumagalli
Journal:  Mol Ecol       Date:  2012-08-30       Impact factor: 6.185

8.  Crossing the species barrier: genomic hotspots of introgression between two highly divergent Ciona intestinalis species.

Authors:  Camille Roux; Georgia Tsagkogeorga; Nicolas Bierne; Nicolas Galtier
Journal:  Mol Biol Evol       Date:  2013-04-05       Impact factor: 16.240

9.  Fast gapped-read alignment with Bowtie 2.

Authors:  Ben Langmead; Steven L Salzberg
Journal:  Nat Methods       Date:  2012-03-04       Impact factor: 28.547

10.  Gene flow and the geographic structure of natural populations.

Authors:  M Slatkin
Journal:  Science       Date:  1987-05-15       Impact factor: 47.728

View more
  7 in total

1.  Balancing selection and introgression of newt immune-response genes.

Authors:  Anna Fijarczyk; Katarzyna Dudek; Marta Niedzicka; Wiesław Babik
Journal:  Proc Biol Sci       Date:  2018-08-15       Impact factor: 5.349

2.  North African hybrid sparrows (Passer domesticus, P. hispaniolensis) back from oblivion - ecological segregation and asymmetric mitochondrial introgression between parental species.

Authors:  Abdelkrim Ait Belkacem; Oliver Gast; Heiko Stuckas; David Canal; Mario LoValvo; Gabriele Giacalone; Martin Päckert
Journal:  Ecol Evol       Date:  2016-06-28       Impact factor: 2.912

3.  Linkage Map of Lissotriton Newts Provides Insight into the Genetic Basis of Reproductive Isolation.

Authors:  Marta Niedzicka; Katarzyna Dudek; Anna Fijarczyk; Piotr Zieliński; Wiesław Babik
Journal:  G3 (Bethesda)       Date:  2017-07-05       Impact factor: 3.154

4.  Selective Landscapes in newt Immune Genes Inferred from Patterns of Nucleotide Variation.

Authors:  Anna Fijarczyk; Katarzyna Dudek; Wieslaw Babik
Journal:  Genome Biol Evol       Date:  2016-12-31       Impact factor: 3.416

5.  Molecular Inversion Probes for targeted resequencing in non-model organisms.

Authors:  M Niedzicka; A Fijarczyk; K Dudek; M Stuglik; W Babik
Journal:  Sci Rep       Date:  2016-04-05       Impact factor: 4.379

6.  Red Queen Processes Drive Positive Selection on Major Histocompatibility Complex (MHC) Genes.

Authors:  Maciej Jan Ejsmond; Jacek Radwan
Journal:  PLoS Comput Biol       Date:  2015-11-24       Impact factor: 4.475

7.  Genomic heterogeneity of historical gene flow between two species of newts inferred from transcriptome data.

Authors:  Michał T Stuglik; Wiesław Babik
Journal:  Ecol Evol       Date:  2016-06-09       Impact factor: 2.912

  7 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.