Literature DB >> 30184067

Analysis of Genome-Wide Differentiation between Native and Introduced Populations of the Cupped Oysters Crassostrea gigas and Crassostrea angulata.

Pierre-Alexandre Gagnaire¹, Jean-Baptiste Lamy², Florence Cornette², Serge Heurtebise², Lionel Dégremont², Emilie Flahauw², Pierre Boudry³, Nicolas Bierne¹, Sylvie Lapègue².

Abstract

The Pacific cupped oyster is genetically subdivided into two sister taxa, Crassostrea gigas and Crassostrea angulata, which are in contact in the north-western Pacific. The nature and origin of their genetic and taxonomic differentiation remains controversial due the lack of known reproductive barriers and the high degree of morphologic similarity. In particular, whether the presence of ecological and/or intrinsic isolating mechanisms contributes to species divergence is unknown. The recent co-introduction of both taxa into Europe offers a unique opportunity to test how genetic differentiation is maintained under new environmental and demographic conditions. We generated a pseudochromosome assembly of the Pacific oyster genome using a combination of BAC-end sequencing and scaffold anchoring to a new high-density linkage map. We characterized genome-wide differentiation between C. angulata and C. gigas in both their native and introduced ranges, and showed that gene flow between species has been facilitated by their recent co-introductions in Europe. Nevertheless, patterns of genomic divergence between species remain highly similar in Asia and Europe, suggesting that the environmental transition caused by the co-introduction of the two species did not affect the genomic architecture of their partial reproductive isolation. Increased genetic differentiation was preferentially found in regions of low recombination. Using historical demographic inference, we show that the heterogeneity of differentiation across the genome is well explained by a scenario whereby recent gene flow has eroded past differentiation at different rates across the genome after a period of geographical isolation. Our results thus support the view that low-recombining regions help in maintaining intrinsic genetic differences between the two species.

Entities: CellLine Chemical Disease Gene Species

Mesh：

Year: 2018 PMID： 30184067 PMCID： PMC6161763 DOI： 10.1093/gbe/evy194

Source DB: PubMed Journal: Genome Biol Evol ISSN： 1759-6653 Impact factor: 3.416

Introduction

Broadcast-spawning marine invertebrates usually display high genetic similarity between adjacent populations, a consequence of combining high dispersal ability and large population sizes. Many such species, however, appear to be genetically subdivided into sibling species, cryptic species-pairs, ecotypes, or partially reproductively isolated populations (Knowlton 1993; Palumbi 1994; Hellberg 2009; Bierne et al. 2011; Gagnaire et al. 2015; Kelley et al. 2016). The ability to score divergence at the genomic level in nonmodel organisms is revealing an increasing number of cases of cryptic species subdivision in broadcast-spawning marine invertebrates (e.g., Westram et al. 2014; Ravinet et al. 2015; Fraïsse et al. 2016; Rose et al. 2018). These recent findings confirm the view that semipermeable barriers to gene flow between closely related taxa (Harrison and Larson 2014, 2016) are relatively frequent in the marine realm, and may explain short-distance genetic differentiation patterns that are seemingly in contradiction with species’ dispersal potential. The existence of marine semi-isolated species pairs evolving in the “grey zone” of the speciation continuum (Roux et al. 2016), that is, before complete reproductive isolation, provides interesting opportunities to contribute to some highly debated questions in the field of speciation genomics. Speciation is a progressive process by which reproductive isolation barriers of various types progressively appear and combine their effects to reduce gene flow (Abbott et al. 2013). As long as speciation is not complete, diverging populations continue to evolve nonindependently because some regions of their genome can still be exchanged. This results in a dynamic architecture of divergence characterized by temporal changes in the number, genomic distribution and magnitude of the genetic differences between incipient species. One of the speciation genomics approaches investigates this architecture by characterizing heterogeneous genome divergence patterns, ultimately aiming at a detection of the loci directly involved in reproductive isolation (Feder et al. 2012). This strategy, however, has come up against a large diversity of evolutionary processes influencing the genomic landscape of species divergence (Yeaman et al. 2016; Ravinet et al. 2017; Wolf and Ellegren 2017), some of which are not directly linked to speciation (Noor and Bennett 2009; Nachman and Payseur 2012; Cruickshank and Hahn 2014; Burri 2017a). The field is now progressively moving toward a more mechanistic understanding of the evolutionary processes underlying heterogeneous genome divergence. However, the issue of distinguishing the impact of genetic barriers from the effect of confounding processes such as linked selection remains challenging (Burri 2017a). Because genetic divergence does not easily persist in the face of gene flow in the absence of genetic barriers (Bierne et al. 2013), high gene flow species such as broadcast-spawning marine invertebrates offer valuable study systems for disentangling the mechanisms at play during speciation. Here, we investigated the existence and the type of genetic barriers between divergent lineages of the Pacific cupped oyster, which has been taxonomically subdivided into two sister species, Crassostrea gigas and Crassostreaangulata. The two species are presumed to be parapatrically distributed in their native range in the north-western Pacific, but the location of putative contact zones remains largely unknown. Whether C. gigas and C. angulata truly represent biological species, semi-isolated species of populations of the same species also remains unclear. The two taxa can be cross-fertilized in the laboratory to form viable and fertile offspring (Takeo and Sakai 1961; Huvet et al. 2001, 2002), and some authors have proposed that they should be considered as a single species (López-Flores et al. 2004; Reece et al. 2008). On the other hand, the finding of genetic differences between C. angulata and C. gigas lead other authors to conclude that they form different but genetically closely related species (Boudry et al. 1998; Lapègue et al. 2004; Ren et al. 2010; Wang et al. 2014). One originality of the Pacific cupped oyster system is the co-introduction of both taxa into Europe. Crassostreaangulata, also called the Portuguese oyster, is presumed to have been nonvoluntarily introduced by merchant ships during the 16th century, probably from Taiwan (Boudry et al. 1998; Huvet et al. 2000) although the exact origins of introduced stocks remains unknown (Grade et al. 2016). Recent studies have shown that this species is widely distributed in Asian seas where it shows a high genetic diversity (Wang et al. 2010; Sekino and Yamashita 2013; Zhong, Li, Yu, et al. 2014; Hsiao et al. 2016), and also suggested a more complex history of transfers between Asia and Europe (Grade et al. 2016). The Pacific oyster, C. gigas, has been voluntarily introduced from Japan and British Columbia into Europe in the early 1970s, mainly to replace the Portuguese oyster in the French shellfish industry following a severe disease outbreak (Grizel and Héral 1991). Since then, the two species are in contact in southern Europe and therefore have the potential to exchange genes in a new environment (Huvet et al. 2004; Batista et al. 2017). Gene flow in the context of marine invasions has mainly been studied between native and nonindigenous lineages (Saarman and Pogson 2015; Viard et al. 2016) but rarely between two co-introduced genetic backgrounds in a new place. A notorious exception is the European green crab (Carcinus maenas) in the Northwest Atlantic (Darling et al. 2008). However, although the genome-wide genetic differentiation has been studied in the introduced range (Jeffery et al. 2017, 2018), it has not been compared with the differentiation observed in the native range to date. In the present study, we first generated new genomic resources in the Pacific oyster to improve the species genome assembly and characterize chromosomal variation in recombination rate. Then, we tested the existence of genetic barriers between C. angulata and C. gigas by searching for genomic regions that remain differentiated in the presence of gene flow, accounting for the demographic divergence history of the species. We also evaluated whether ecological divergence driven by local adaptations is the main factor maintaining species divergence in the native range. We hypothesized that if this is the case, the different ecological conditions encountered by the two species in Europe would have reshaped the original genomic landscape of species divergence existing in the native Asian range. Finally, we attempted to relate genome-wide divergence patterns to underlying evolutionary processes including demography, selection and recombinational constraints.

Materials and Methods

Biological Material for the Mapping Populations

Two F2 families (second generation of biparental crosses) were used for genetic map reconstruction in C. gigas. These families were obtained by experimental breeding as part of the MOREST project (Boudry et al. 2008; Dégremont et al. 2010) by crossing families selected for resistance (R) or susceptibility (S) to summer mortality, which were subsequently found to have respectively higher and lower resistance to the herpesvirus OsHV-1 (Dégremont 2011). The F2-19 family was generated through biparental crossing between an S female and an R male to produce a F1 family from which a single sib-pair was randomly sampled to produce the F2 progeny. The F2-21 family was obtained under the same mating scheme but starting with an R female and a S male. A total of 293 and 282 F2 progenies were used for map reconstruction in family F2-19 and F2-21, respectively. For each individual, whole genomic DNA was isolated from gill tissue using the QIAamp DNA minikit (Qiagen). DNA was checked for quality by electrophoresis on agarose gel and then quantified using the Quant-iT PicoGreen dsDNA assay kit (Life Technologies).

Genotyping Panel for Low-Density Map Construction

We genotyped F1 parents and their F2 progenies in both families (F2-19: 2 F1 and 293 F2; F2-21: 2 F1 and 282 F2) using a panel of 384 SNPs previously developed in C. gigas (Lapègue et al. 2014). SNP genotyping was performed using the Golden Gate assay and analyzed with the Genome Studio software (Illumina Inc.). In addition, 42 microsatellite markers were also genotyped according to published protocols (Li et al. 2003; Taris et al. 2005; Yamtich et al. 2005; Li et al. 2010; Sauvage et al. 2010) in order to include markers from previous generation linkage maps in C. gigas (Hubert and Hedgecock 2004; Sauvage et al. 2010).

RAD Genotyping for High-Density Map Construction

We selected 106 progenies from each family as well as their four F1 parents for RAD library construction following the original protocol (Baird et al. 2008). Briefly, 1 μg of genomic DNA from each individual was digested with the restriction enzyme SbfI-HF (New England Biolabs), and then ligated to a P1 adapter labeled with a unique barcode. We used 16 barcodes of 5-bp and 16 barcodes of 6-bp long in our P1 adapters to build 32-plex libraries. Seven pools of 32 individuals were made by mixing individual DNA in equimolar proportions. Each pool was then sheared to a 350 pb average size using a Covaris S220 sonicator (KBiosciences), and size-selected on agarose gel to keep DNA fragments within the size range 300–700 pb. Each library was then submitted to end-repair, A-tailing and ligation to P2 adapter before PCR amplification for 18 cycles. Amplification products from six PCR replicates were pooled for each library, gel-purified after size selection and quantified on a 2100 Bioanalyzer using the High Sensitivity DNA kit (Agilent). Each library was sequenced on a separate lane of an Illumina HiSeq 2000 instrument by IntegraGen Inc. (France, Evry), using 100-bp single reads. We used the program Stacks (Catchen et al. 2011, 2013) to build loci from short-read sequences and determine individual genotypes. Raw sequence reads were quality filtered and demultiplexed using process_radtags.pl before being trimmed to 95 bp. We explored different combinations of parameter values for the minimum stack depth (-m) and the maximum mismatch distance (-M) allowed between two stacks to be merged into a locus. We found that the combination -m 3 -M 7 represented the best compromise to avoid overmerging loci, while providing an average number of 2.3 SNPs per polymorphic RAD locus (supplementary fig. S1, Supplementary Material online) which is consistent with the high polymorphism rate in C. gigas (Sauvage et al. 2007; Zhang et al. 2012). Individual de novo stack formation was done with ustacks (-m 3 -M 7 -r -d). We then built a catalog of loci using all individuals from both families with cstacks (-n 7) and matched back all the samples against this catalog using sstacks. After this step, the two families were treated as two separate populations to produce a table of observed haplotypes at each locus in each family using populations. We developed a Bayesian approach that uses information from progenies’ genotypes to correct for missing and miscalled genotypes in parental samples. The probability of a given combination of parental genotypes (G) conditional on the genotypes observed in their descendants (G) is given by where the probability of the observed counts of progenies’ genotypes (NAA, NAB, NAC, NAD, NBB, NBC, NBD) conditional on the genotypes of their parents is drawn from a multinomial distribution that takes different parameter values for each of the six alternative models of parental genotypes (i.e., crosses AA × AA, AA × AB, AB × AB, AA × BB, AB × AC, and AB × CD). Each RAD locus was treated independently using observed haplotypes to determine the best model of parental haplotype combination. When the actual haplotypes called in parents did not match the best model, a correction was applied to restore the most likely combination of parental haplotypes, taking read depth information into account. Data were finally exported in JoinMap 4 format with population type CP (van Ooijen 2006). The R script for correcting missing and miscalled haplotypes in parental samples is available from the authors upon request.

Genetic Map Construction

A new Pacific oyster linkage map was constructed in four successive steps: 1) First, a low-density map was built for each family using the SNP and microsatellite markers data set. These two maps were used to provide accurate ordering of markers, since both mapping populations comprised ∼300 individuals with very few missing genotypes (0.5%). 2) In a second step, we tried to reach a consensus order for the markers that were included in the two previous maps, in order to determine a set of anchor loci for each linkage group. 3) Third, we included RAD markers and determined the order of all loci in each mapping family after setting a fixed order for the anchor loci. 4) Finally, we integrated the two high-density linkage maps to produce a consensus genetic map for the Pacific oyster. All linkage mapping analysis were performed using JoinMap 4 (van Ooijen 2006). Markers showing significant segregation ratio distortion after Bonferroni correction (P < 0.05) were removed from the analysis. Markers were grouped using an independence LOD threshold of 16 for the SNP and microsatellite marker data sets and a LOD threshold of 10 for the RAD marker data sets. Additional ungrouped markers were assigned at a LOD threshold of 5 using their strongest cross-link information. We used the regression mapping algorithm to build the maps using a recombination frequency threshold of 0.4, a minimal LOD score of 1 and a goodness-of-fit jump value of 5. The ordering of markers was optimized using the ripple function after each added locus. Map distances in centiMorgans (cM) were calculated from recombination frequencies using Kosambi’s mapping function.

Identification of Chimeric Scaffolds and Reassembly Using BAC-End Scaffolding

In order to detect chimeric scaffolds in the oyster_v9 assembly (Zhang et al. 2012), the consensus sequences of the markers included in the new linkage map were blasted against the reference genome. An E-value threshold of 10−30 and a minimum identity threshold of 90% were set to retain only highly significant matches. Assembly errors were identified by scaffolds anchored to more than a single linkage group. Chimeric scaffolds were subsequently splitted at all stretches of Ns connecting adjacent contigs. In order to improve the scaffolding of the Pacific oyster genome, we sequenced 73,728 BAC clones of 150 kb average insert size (Gaffney 2008) from both ends using Sanger sequencing at the Genoscope facility (Evry, France). This resulted in 60,912 cleaned full-length (i.e., >1,300 bp) BAC-End sequences (BES) pairs. Each sequence from each pair was trimmed to only conserve 1,000 bp between positions 10 and 1010, and clipped into 19 evenly-spaced 100-mers overlapping by 50 bp. This clipping procedure aimed at constructing a high quality short-read paired-end data set from our BES data set. Clipped BES were used for an additional round of contig extension and scaffolding with SSPACE (Boetzer et al. 2011). For each LG, we used unambiguously anchored scaffolds and contigs originating from splitted scaffolds matching to this LG. Clipped BES were aligned to scaffolds using Bowtie2 (Langmead and Salzberg 2012), allowing at most 3 mismatches per 100-bp read. Scaffolding parameters to SSPACE were set to a minimum of 5 links (-k) to validate a new scaffold pair and a maximum link ratio of 0.7 (-a). Scaffolding was only permitted between scaffolds of at least 1,000 bp, using an allowed insert size range of 75–225 kb.

Scaffold Anchoring to the Linkage Map

We searched for nonambiguous associations between the new set of extended scaffolds and markers included in the consensus linkage map using the same blasting procedure as prior to genome reassembly. We constructed pseudochromosomes by positioning scaffolds along each linkage group using Harry Plotter (Moretto et al. 2010). Because the genetic resolution of our consensus map was still limited by the size of our mapping populations (∼100 individuals each), many scaffolds were anchored by a single marker or had too few markers to determine their orientation. Therefore, we did not determine scaffold orientation. Scaffolds that were anchored to multiple markers were positioned using the average cM value of their anchor loci. Unmapped scaffolds were placed in an artificial chromosome named UN.

Local Recombination Rate Estimation

We used the R package MareyMap v1.2 (Rezvoy et al. 2007) to estimate local recombination rates along the genome, by comparing the consensus linkage map and the physical map for each pseudochromosome. The relationship between genetic and physical distances was first visualized to remove outlier markers resulting from scaffold misplacement within pseudochromosomes. We then used the Loess interpolation method which estimates recombination rates by locally adjusting a 2nd degree polynomial curve, setting the span parameter value to 0.25.

RAD Genotyping of Natural Populations

We sampled four wild populations of C. gigas and C. angulata from both their native Asian and introduced Atlantic areas. We used 24 individuals of C. gigas from both Japan (JAP, native) and French Marennes-Oléron Bay (LAF, introduced), and 24 individuals of C. angulata from both Taiwan (KEE, native) and Portugal (SAD, introduced). We prepared and sequenced four 24-plex RAD libraries using the same protocol as described above. Cleaned demultiplexed reads were mapped to each newly assembled pseudochromosome using Bowtie2 (Langmead and Salzberg 2012) with the very-sensitive option, allowing a maximum of seven mismatches per alignment. SNPs were called from aligned reads with Stacks using a minimum read depth of 5× per individual per allele (Catchen et al. 2011, 2013). The correction module rxstacks was used to re-evaluate individual genotypes and exclude low-quality variants with a cutoff log-likelihood value of –500. Only RAD loci that were successfully genotyped in at least 80% of the samples in each population were retained for subsequent population genomic analyses.

Population Genomic Analyses

We used VCFtools v0.1.11 (Danecek et al. 2011) to apply within-population filters in order to improve the quality of our SNP data set. We excluded loci showing >4 missing genotypes over 24 individuals, as well as markers showing departure from Hardy–Weinberg equilibrium within at least one population using a P-value cutoff of 0.01. Nucleotide diversity (π), computed as the average number of pairwise differences, was estimated from retained loci for each population within 150 kb windows. Genetic differentiation between all possible pairs of populations was estimated SNP by SNP as well as in 150 kb windows using FST (Weir and Cockerham 1984). Within- and between-population components of genetic diversity were decomposed using a discriminant analysis of principal components (dAPC), in order to maximize genetic variation between populations while minimizing within-population variance. The dAPC analysis was performed with the R package Adegenet (Jombart 2008), using the global SNP data set containing the two populations from both species with only one randomly selected SNP per RAD locus. In order to evaluate the influence of recombination rate variation on genetic differentiation and absolute genetic diversity between species, we tested for correlations across the entire genome using Spearman correlation test. Recombination rate values were averaged in 500 kb windows to increase the number of informative sites per window. We excluded windows with recombination rate values exceeding 10 cM/Mb (i.e., corresponding to the 99th percentile of the distribution of estimated recombination rate). This cutoff aimed at removing outlying recombination rate values estimated along chromosome IX (fig. 1). In addition, we calculated the average FST in 500 kb windows, as well as the between-species genetic diversity (πB) following (Aeschbacher et al. 2017), which is the equivalent of dXY in the absence of phased sequence data. Both statistics were averaged between the Asian and European species pairs. Finally, we used a nonparametric quantile regression approach to test whether the 97.5th quantile of the distribution of FST and πB were related to the local average recombination rate. The regression fits were computed at each of 10 equally spaced recombination intervals distributed over the range of recombination values. We used Treemix v1.12 (Pickrell and Pritchard 2012) to test for the presence of gene flow between diverged populations of oysters. This program represents the genetic relationships among populations by building a population tree, allowing for gene flow in addition to population splits by introducing migration edges between populations of the tree. The optimal number of migration edges was determined by comparing likelihood values of fitted models with increasing numbers of migration edges. These models were fitted to the global SNP data set with only one randomly selected SNP per RAD locus to avoid linkage disequilibrium. We then used the f3 statistics (Reich et al. 2010) to further test whether the recipient populations detected by Treemix displayed evidence of admixture. The value f3(X, Y, Z) is negative when the focal population (X) is not related to populations Y and Z by a simple tree, but rather appears to be a mixture of Y and Z. Therefore, more negative f3 values indicate stronger evidence for admixture. We applied this test on two separate data sets consisting of unlinked SNPs from either low- or high-recombination regions. A threshold value of 2.21 cM/Mb corresponding to the genome-wide median recombination rate was used to assign SNPs to low- and high-recombination categories. This partitioning of the genome aimed at further testing the influence of the local recombinational environment on gene flow.

. 1.

—Anchoring of newly edited scaffolds (blue) into pseudochromosomes using the markers included in the 10 linkage groups (red) of the new consensus linkage map. Comparison of the physical (x axis, Mb) and the genetic (y axis, cM) maps are provided for each linkage group (green points, outlier values removed), as well as the local recombination rate (orange line) estimated using the Loess interpolation method in MareyMap.

Inference of the Demographic Divergence History

We inferred the demographic divergence history between C. angulata and C. gigas in both their native and introduced ranges using a version of the program δaδi (Gutenkunst et al. 2009) modified by Tine et al. (2014). The observed joint allele frequency spectrum (JAFS) of each population pair was obtained by randomly selecting one SNP for each pair of RAD loci associated to the same restriction site in order to avoid the effect of linkage between SNPs, and then by projecting the data to 20 chromosomes per population to reduce the JAFS size. We first considered four classical models of divergence including models of strict isolation (SI), isolation-with-migration (IM), ancient migration (AM), and secondary contact (SC). We also considered extensions of the three divergence-with-gene-flow models assuming two categories of loci occurring in proportions P and 1 – P and having different effective migration rates: IM2m, AM2m, and SC2m (Tine et al. 2014). These models offer simplified but more realistic representations of the speciation process, by enabling loci to be either neutrally exchanged between populations, or to have a reduced effective migration rate to account for the direct and indirect effects of selection. Each model was fitted to the observed JAFS (singletons were masked) using three successive optimization steps: Hot simulated annealing, cold simulated annealing and BFGS. Comparison among models was made using the Akaike Information Criterion (AIC).

Results

Construction of an Integrated High-Density Linkage Map

The microsatellite and SNP data sets used for constructing the low-density linkage map contained 133 informative markers in family F2-19 (293 progeny) and 137 informative markers in family F2-21 (282 progeny). Ten linkage groups were found in each family in agreement with the haploid number of chromosomes of the species, and the two maps generated by the first round of regression mapping respectively contained 98 and 105 mapped markers in family F19 and F21. The integrated map obtained after identifying homologous LG pairs between families contained 136 markers successfully ordered after the first round of regression mapping. This sex-averaged consensus low-density linkage map displayed a strong collinearity with the two maps from which it was derived (supplementary fig. S2, Supplementary Material online, left panel), indicating that the ordering of markers was highly consistent between the two families. We thus used it as an anchoring map to support the construction of the high-density RAD-linkage map. The average number of filtered RAD-Sequencing reads obtained per individual was similar between family F2-19 (1.8 × 106) and F2-21 (1.9 × 106). After correcting for missing and miscalled haplotypes in parental samples, the number of informative markers was 1,278 in family F2-19 and 996 in family F2-21. A total of 1,855 markers were assigned to linkage groups (1,231 in family F19 and 935 in family F21, 311 in common), and the number of markers per linkage groups was highly positively correlated between families (R2 = 0.71, supplementary fig. S3, Supplementary Material online). Using the anchoring map to set fixed orders, the two RAD maps generated after three rounds of regression mapping respectively contained 1,023 and 705 mapped markers in family F19 and F21, respectively. The final combination of these two maps resulted in a sex-averaged consensus map containing 1,516 markers (table 1, and supplementary fig. S2, Supplementary Material online, right panel). The total map length was 965 cM and the average spacing between two neighboring markers was 0.64 cM. We compared this new linkage map to the Pacific oyster’s second generation linkage map (table S7 from Hedgecock et al. 2015) using scaffolds from the oyster reference genome (oyster_v9, Zhang et al. 2012) as intermediate. Using 278 pairs of markers colocalized to the same scaffolds, we found a good collinearity between the two maps, as illustrated by linkage group-wise correlations between recombination distances (0.35 < R2 < 0.90, supplementary fig. S4, Supplementary Material online).

Table 1

Summary Statistics for the 10 Linkage Groups of the New C. gigas High-density Linkage Map and the Reassembled Physical Genome

LG	Correspondence with Hedgecock et al. (2015)	Length (cM)	Nb of Markers F2-19	Nb of Markers F2-21	Total nb of Markers	Marker Spacing	Physical Length of Anchored Scaffolds	Recombination Rate (cM/Mb)	Recombination Rate Marey Map (cM/Mb)
I	1	90.92	86	88	158	0.58	31113268	2.92	2.61
II	4	94.23	89	56	117	0.81	21949604	4.29	3.65
III	5	99.17	166	69	229	0.43	27979720	3.54	3.40
IV	7	116.98	149	103	226	0.52	41226161	2.84	2.29
V	10	72.01	95	74	163	0.44	35636424	2.02	2.18
VI	3	98.87	143	60	132	0.75	28856184	3.43	2.71
VII	8 & 2	88.78	113	71	157	0.57	33248857	2.67	2.36
VIII	9	108.20	64	68	131	0.83	22447078	4.82	4.13
IX	6	136.51	94	58	124	1.10	21192597	6.44	5.47
X	6	58.89	24	58	79	0.75	16594032	3.55	3.65
ALL	–	964.55	1023	705	1516	0.64	280243925	3.44	3.05

Summary Statistics for the 10 Linkage Groups of the New C. gigas High-density Linkage Map and the Reassembled Physical Genome

Pseudochromosome Assembly

A majority (78.3%) of the markers included in the new consensus RAD linkage map matched to the oyster reference genome (oyster_v9, Zhang et al. 2012). Among the 592 scaffolds that were matched with mapped markers, 327 (55.2%) contained a single mapped marker, 127 (21.5%) contained two markers, and 138 (23.3%) contained three to eleven markers (supplementary table S1, Supplementary Material online). We found 117 (44.2%) chimeric scaffolds mapping to different linkage groups among the 265 scaffolds containing at least two markers. Splitting chimeric scaffolds into smaller contigs at poly-N stretches decreased the assembly N50 size from 401 kb to 218 kb. Contig extension and scaffolding using clipped BES allowed the merging of 4,162 contigs. The mean insert size of mapped BES was 116 kb, which was consistent with the mean insert size of the BAC end library. The resulting N50 of new scaffolds increased by 47% to 320 kb (supplementary table S2, Supplementary Material online). A total of 773 scaffolds from this new set of extended scaffolds were anchored to 1,161 loci of the consensus linkage map. Among them, 531 (68.7%) contained a single mapped marker, 153 (19.8%) contained two markers, and 89 (11.5%) contained three to seven markers (supplementary table S3, Supplementary Material online). The pseudochromosomes assembly obtained had a length of 280.2 Mb (fig. 1), representing 50.1% of the total length of the oyster_v9 assembly (Zhang et al. 2012).

Local Recombination Rate

The comparison between the pseudochromosome assembly and the consensus linkage map revealed variation in local recombination rate along chromosomes, with generally increased values toward linkage group extremities compared with central chromosomal regions (fig. 1). The average genome-wide recombination rate estimated with MareyMap was 3.05 cM/Mb (median = 2.21 cM/Mb), a value close to the total map length divided by the size of the physical assembly (3.44 cM/Mb). The real value, however, may be twice lower considering that the pseudochromosome assembly only represents 50% of the total genome size. The average chromosome-wide recombination rate (table 1) was negatively correlated with chromosome length (R2 = 0.61, fig. 2), as expected under strong chiasma interference.

. 2.

—Negative correlation between the length of newly assembled pseudochromosomes and the average chromosome-wide recombination rate assessed with MareyMap. Chromosome lengths and raw estimates of recombination rate are based on the physical chromosome lengths effectively covered by scaffolds. Rescaled values taking into account the presence of unmapped scaffolds can be obtained by multiplying (dividing) chromosome length (recombination rate) values by two (see text).

Genetic Diversity and Differentiation

The average number of sequence reads retained per individual for calling genotypes was 3.1 × 106. After filtering for missing genotypes, HWE and a minor allele frequency of 1%, we kept a total of 10,144 SNPs from 1,325 RAD loci for downstream genetic diversity analyses. Within-population nucleotide diversity was elevated in both species and showed very similar levels between native and introduced populations (C. angulata: πKEE = 0.0095, πSAD = 0.0096; C. gigas: πJAP = 0.0099, πLAF = 0.0101). The genome-wide averaged differentiation assessed by FST between native and introduced populations from the same species was higher in C. angulata (FST KEE-SAD = 0.0179) compared with C. gigas (FST JAP/LAF = 0.0130). The mean genetic differentiation between species was very similar between Asia (FST KEE/JAP = 0.0440) and Europe (FST SAD/LAF = 0.0459). The between-population genetic structure revealed by the dAPC (fig. 3) separated the two species along the first PC axis (explaining 15.78% of total variance) and the Asian and European populations of each species along the second axis (2.54% of total variance). The European populations of C. angulata (SAD) and C. gigas (LAF), were slightly shifted toward the center of the first PC axis, indicating an increased genetic similarity of the two species in Europe.

. 3.

—(A) dAPC plot of the four populations of Pacific cupped oysters assessed with 1,325 SNPs (one SNP per RAD), with Crassostrea angulata from Asia (KEE, light blue) and Europe (SAD, dark blue), and Crassostrea gigas from Asia (JAP, light red) and Europe (LAF, dark red). (B) Genetic parallelism in the level of differentiation between species in Asia (x axis) and Europe (y axis) measured at 10,144 SNPs. The black line shows the y = x equation line and the red dashed line corresponds to the linear regression (FST Europe = 0.83 × FST Asia + 0.01, R2 = 0.71, P < 10−10). Genetic differentiation between species was highly heterogeneously distributed across the genome, with regions of elevated differentiation alternating with genetically undifferentiated regions in most chromosomes (fig. 4). The maximum FST value between C. gigas and C. angulata was 1 in Asia and 0.96 in Europe. Between-species genetic differentiation at individual SNPs was highly positively correlated between Asia and Europe, although it was on average lower in Europe (fig. 3).

. 4.

—Genomic landscape of between-species genetic differentiation (FST) measured at 10,144 SNPs (alternatively represented by black and grey points for odd and even chromosome numbers for convenience) along the ten Pacific oyster linkage groups. The red dotted line shows the genome-wide average FST and the grey line shows the local average FST calculated in 150 kb windows. Genetic differentiation between Crassostrea angulata and Crassostrea gigas is showed (A) in Asia and (B) in Europe. Chromosomal variations in average genetic differentiation (FST) and between-species diversity (πB) were both related to the local recombination rate. More precisely, we found that FST and πB were respectively negatively (fig. 5, Spearman’s rho = -0.076*) and positively (fig. 5, Spearman’s rho = 0.088*) related to the local recombination rate. Moreover, the within-species nucleotide diversity was significantly positively correlated with recombination (supplementary fig. S5, Supplementary Material online). Despite showing inverted relationships at the genome-wide scale, both FST and πB showed decreasing trends in the maximal amount of divergence (measured by the 97.5th quantile of each distribution) with increasing local recombination rate (fig. 5). This result supports that the highest values of both relative and absolute divergence preferentially map to low-recombining regions of the oyster genome.

. 5.

—Distribution of between-species genetic differentiation (FST) and diversity (πB) as a function of the local recombination rate assessed with MareyMap. Each point represents an estimate of FST (A) or πB (B) and recombination rate averaged in a 500 kb window. Outlier recombination rate values exceeding 10 cM/Mb were excluded. The red dashed line represents the nonparametric 97.5th quantile regression fit of FST or πB as a function of the local recombination rate.

Inference of Gene Flow

The population tree estimated with Treemix closely mirrored the genetic relationships evidenced with the dAPC (fig. 6). Two migration edges had to be introduced for improving the model fit, afterwards the likelihood did not improve by adding further migration events. Evidence was found for asymmetrical gene flow from C. angulata to C. gigas in both Asia and Europe. Nevertheless, a stronger intensity of migration was inferred in Europe (i.e., migration weight was 0.146 in Europe against 0.122 in Asia).

. 6.

—Analysis of gene flow between oyster populations. (A) Treemix analysis showing the population tree with two migration edges from Crassostrea angulata to Crassostrea gigas populations in Asia and Europe. The upper left panel shows model likelihood as a function of the number of migration edges considered in the model. (B) Results of f3 calculations obtained using LAF and JAP as focal populations, on two similar size data sets containing independent SNPs from high- (i.e., R > 2.21 cM/Mb) or low- (i.e., R < 2.21 cM/Mb) recombining regions. The only significantly negative f3 scores calculated over all possible three-population tests were obtained using the European C. gigas population (LAF) as the focal population. Estimates for the LAF population were more negative when considering the reference C. angulata population from Europe (SAD) instead of Asia (KEE), thus supporting increased gene flow in Europe (fig. 6). In contrast, no negative value was obtained for the Asian C. gigas population (JAP), which confirms the Treemix result that the JAP population is less introgressed by C. angulata than the European population of LAF. We found significant contrasts between f3 estimates obtained from high- versus low-recombining regions of the genome. Evidence for genetic admixture in the European C. gigas population (LAF) was only detected using SNPs located in the 50% of the genome associated with the highest recombination rates (fig. 6). These results thus indicate that between-species gene flow is reduced in low- compared with high-recombining regions. Moreover, the contrast between f3(LAF; gigas, angulata) and f3(JAP; gigas, angulata) was always amplified in high-recombination rate regions. This result suggests that the different introgression rates experienced by LAF and JAP populations (fig. 6) mainly concern the regions with the highest recombination rates.

Divergence History

The demographic history of divergence between C. angulata and C. gigas inferred with δaδi showed evidence for a SC in both Asia and Europe (table 2). The JAFS presented in figure 7 show a high proportion of shared polymorphisms at high frequency along the diagonal, which is a characteristic footprint of secondary introgression that is not expected under a SI model (Alcala et al. 2016; Roux et al. 2016) and explain the good support for the SC model. Moreover, varying rates of introgression among loci (i.e., the SC2m model) was strongly supported for both native and introduced species pairs, since reduced introgression explains the presence of SNPs in the upper right and lower left corners of the spectrum. The SC2m model received a good statistical support compared with the six other alternative models (Akaike weights > 0.95), and its goodness-of-fit showed almost no trend in the distribution of residuals for both species pairs (fig. 7).

Table 2

Results of Model Fitting for Seven Alternative Models of Divergence between C. angulata and C. gigas in Asia and Europe

Species Pair	Model	Anneal. Hot	Anneal. Cold	BFGS	AIC	Δ_i	w_i	θ	N₁	N₂	m₁₂	m₂₁	m'₁₂	m'₂₁	T_S	T_AM	T_SC	P
ASIA	SI	–247.3844	–245.0567	–244.2167	494.4335	61.3173	0.0000	94.0960	0.9363	19.8905	–	–	–	–	0.2422	–	–	–
	IM	–232.0492	–231.6256	–229.6122	469.2244	36.1082	0.0000	50.7159	1.1160	2.1658	1.4404	0.6423	–	–	1.7087	–	–	–
	AM	–246.4434	–239.7242	–229.6158	471.2316	38.1154	0.0000	53.1106	1.0452	2.1069	1.5443	0.6500	–	–	1.5652	0.0000	–	–
	SC	–227.7188	–227.7188	–223.1180	458.2361	25.1198	0.0000	61.5886	1.2126	1.6412	2.1765	1.5932	–	–	0.8178	–	0.1899	–
	IM2m	–217.3343	–217.3343	–214.4709	444.9419	11.8257	0.0027	35.8268	2.2285	2.4318	3.7818	4.0944	0.2835	0.2487	3.0381	–	–	0.5496
	AM2m	–233.2967	–219.8171	–214.4997	446.9993	13.8831	0.0010	30.2063	2.7356	2.7870	2.4416	3.3503	0.2287	0.2133	3.8450	0.0001	–	0.5603
	SC2m	–218.2059	–218.2059	–207.5581	433.1162	0	0.9963	56.5407	1.5600	1.5353	10.6952	11.2826	1.4966	1.5031	1.1291	–	0.0536	0.6336
EUROPE	SI	–266.1601	–266.1601	–266.1370	538.2740	66.4534	0.0000	119.5436	0.7832	19.9447	–	–	–	–	0.2127	–	–	–
	IM	–251.9601	–251.9601	–250.9902	511.9804	40.1598	0.0000	78.0597	1.4969	1.0110	0.6372	1.8961	–	–	1.1803	–	–	–
	AM	–254.7446	–254.3147	–251.0059	514.0118	42.1913	0.0000	72.6025	1.6028	1.0537	0.5769	1.8227	–	–	1.4241	0.0000	–	–
	SC	–258.8205	–249.8626	–245.4758	502.9515	31.1310	0.0000	87.7852	1.6884	0.8123	0.7821	3.8227	–	–	0.5498	–	0.1525	–
	IM2m	–239.2627	–239.2627	–231.3043	478.6086	6.7880	0.0321	61.9879	1.3816	1.8837	5.3928	3.1806	0.2426	0.4237	1.8432	–	–	0.6025
	AM2m	–244.92139	–244.92139	–231.3516	480.7032	8.8826	0.0113	65.9174	1.2964	1.7676	5.9929	3.3784	0.2603	0.4515	1.6631	0.0004	–	0.6007
	SC2m	–233.321	–233.321	–226.910	471.821	0	0.9566	85.6599	1.1751	1.3148	10.3971	12.8181	0.9340	2.0132	0.6982	–	0.0644	0.5527

Note.—In the order of appearance in the table: the model fitted, the maximum likelihood estimate over 20 independent runs after three successive optimization steps: simulated annealing hot, cold and BFGS, the Akaike Information Criterion, the difference in AIC with the best model (SC2m), Akaike weight, and theta parameter for the ancestral population before split. Following are the inferred values for the model parameters (scaled by theta): the effective size of C. angulata (N1) and C. gigas (N2) populations, the migration rates for the two categories of loci in the genome (category 1: m12 and m21, category 2: m′12 and m′21), the duration of the split (TS), of ancestral migration (TAM) and secondary contact (TSC) episodes, and the proportion (P) of the genome falling within category 1 (experiencing migration rates m12 and m21).

. 7.

—Demographic divergence history of Crassostrea angulata and Crassostrea gigas in Asia (A) and Europe (B). The best selected model in both species pairs was the secondary contact model with two categories of loci experiencing different rates of introgression (SC2m) and occurring in proportions P (for freely introgressing loci) and 1 – P (for loci with reduced effective migration rates) across the genome. The model parameters are described in the legend to table 2. Each plot shows the observed JAFS (upper left), the JAFS obtained under the maximum-likelihood SC2m model (upper right), the spectrum of residuals with over (red) and under (blue) predicted SNPs (lower left), and the distribution of residuals (lower right).

Results of Model Fitting for Seven Alternative Models of Divergence between C. angulata and C. gigas in Asia and Europe Note.—In the order of appearance in the table: the model fitted, the maximum likelihood estimate over 20 independent runs after three successive optimization steps: simulated annealing hot, cold and BFGS, the Akaike Information Criterion, the difference in AIC with the best model (SC2m), Akaike weight, and theta parameter for the ancestral population before split. Following are the inferred values for the model parameters (scaled by theta): the effective size of C. angulata (N1) and C. gigas (N2) populations, the migration rates for the two categories of loci in the genome (category 1: m12 and m21, category 2: m′12 and m′21), the duration of the split (TS), of ancestral migration (TAM) and secondary contact (TSC) episodes, and the proportion (P) of the genome falling within category 1 (experiencing migration rates m12 and m21). —Demographic divergence history of Crassostrea angulata and Crassostrea gigas in Asia (A) and Europe (B). The best selected model in both species pairs was the secondary contact model with two categories of loci experiencing different rates of introgression (SC2m) and occurring in proportions P (for freely introgressing loci) and 1 – P (for loci with reduced effective migration rates) across the genome. The model parameters are described in the legend to table 2. Each plot shows the observed JAFS (upper left), the JAFS obtained under the maximum-likelihood SC2m model (upper right), the spectrum of residuals with over (red) and under (blue) predicted SNPs (lower left), and the distribution of residuals (lower right). Therefore, the semipermeable species-barrier model, in which introgression occurs at variable rates across the genome since SC, explains well the observed data. In the neutrally introgressing fraction of the genome (i.e., 55–63% of the genome), gene flow was found to occur in both directions but at slightly higher rates from C. angulata to C. gigas (i.e., the Nem product indicates 12.2–16.7 migrants per generation from C. gigas to C. angulata and 16.9–17.3 migrants per generation from C. angulata to C. gigas, table 2). In contrast, there was a 6–11-fold decrease in gene flow (m/m′, table 2) in the remaining fraction of the genome, due to reduced effective migration rates in both directions. Time parameters indicated that the duration of isolation without gene flow was 11–21 times longer than the duration of secondary gene flow (TS/TSC, table 2). These results support that the allopatric divergence period was long enough to enable differentiation prior to SC.

Discussion

An Improved Scaffolding of the Pacific Oyster Genome

Our new high-density linkage map in C. gigas adds to a series of first-generation linkage maps (Hubert and Hedgecock 2004; Li and Guo 2004; Hubert et al. 2009; Sauvage et al. 2010; Plough and Hedgecock 2011; Guo et al. 2012; Zhong, Li, Guo, et al. 2014) and more recent second-generation maps (Hedgecock et al. 2015; Wang et al. 2016) that were produced in the Pacific oyster. Compared with the two most recent linkage maps, this map offers a slightly improved resolution, with an average marker-spacing of 0.64 cM compared with 1 cM in Hedgecock et al. (2015) and 0.8 cM in Wang et al. (2016). We paid attention to minimize the influence of the potentially widespread transmission ratio distortions observed across most of the 10 linkage groups (Launey and Hedgecock 2001; Plough and Hedgecock 2011; Plough 2012; Hedgecock et al. 2015), by discarding distorted markers prior to map reconstruction. Our map, however, shows a good collinearity with the map produced by Hedgecock et al. (2015) that did not exclude distorted markers. This suggests that distortions of segregation ratios do not substantially affect the ordering of markers in C. gigas genetic maps, and further support that Hedgecock’s consensus linkage map and ours provide a reasonable assessment of marker order and genetic distances across the Pacific oyster genome. Consistent with what has been previously reported by Hedgecock et al. (2015), we found a high incidence of chimeric scaffolds (38.5% and 44.2%, respectively) in the oyster_v9 assembly (Zhang et al. 2012; http://www.oysterdb.com). Splitting scaffolds to resolve mapping conflicts to different linkage groups decreased the scaffold N50 by 45.6% (401–218 kb), but the rescaffolding step using BES increased the scaffold N50 again by 47% (320 kb). Because of the limited density of markers in our linkage map, this new assembly version (available at ftp://ftp.ifremer.fr/ifremer/dataref/bioinfo/) is not complete and tends to exclude especially the shortest scaffolds present in the oyster_v9 assembly (Zhang et al. 2012). Therefore, only 50.1% (280.2 Mb) of the oyster_v9 assembly could be anchored into 10 pseudochromosomes based on the information contained in our new linkage map. This underlines the need for an improved genome assembly in C. gigas using new information from higher density linkage maps and long read sequencing data. Such improvements are mandatory for all functional analyses relying on a complete genome description and a thorough annotation of gene sequences, such as genome-wide association studies (GWAS). However, for the purpose of this study, a partial chromosome-level assembly is sufficient to explore broad-scale genomic variation in genetic differentiation between the two studied sister taxa and its possible relationship with recombination rate variation.

Recombination Rate Variation across the Genome

The total length of our sex-averaged consensus map (965 cM) is intermediate to previous estimates that were reported from the two existing high-density linkage maps in the Pacific oyster (890–1,084 cM, Hedgecock et al. 2015, respectively, Wang et al. 2016). Considering the genome-size estimate of 559 Mb (Zhang et al. 2012), this corresponds to an average recombination rate of 1.73 cM per Mb, a value within the range of average recombination rate values obtained across a wide diversity of animal species (Corbett-Detig et al. 2015; Stapley et al. 2017). The chromosome-averaged recombination rate shows a 2.5-fold variation among linkage groups and is negatively correlated to chromosome length. Such a pattern has already been observed in several species including yeast (Kaback et al. 1992), human (Lander et al. 2001), stickleback (Roesti et al. 2013), and daphnia (Dukić et al. 2016), and has been commonly attributed to positive crossover interference. This phenomenon happens during meiosis, when the formation of an initial recombination event reduces the probability that an additional recombination event occurs nearby on the chromosome. If the mechanism of interference is still elusive, it is often assumed that a fixed number of crossovers happens per chromosome (between 1 and 2), making large chromosomes experiencing less recombination events per nucleotide position than smaller ones. Our results thus provide new empirical support for the existence of crossover interference in C. gigas. We also revealed large-scale (i.e., megabase-scale) variation in local recombination rate within chromosomes. The general pattern we observed was a reduction of crossover rate in chromosome centers relative to peripheries. Similar patterns are widespread in animals, plants and fungi (reviewed in Berner and Roesti 2017; Haenel et al. 2018), and can be attributed to different but not mutually exclusive mechanisms. Firstly, a reduction in recombination near the centromere of metacentric chromosomes (Nachman 2002) can be hypothesized, possibly due to the presence of highly condensed heterochromatin. Secondly, crossover interference, which tends to produce evenly and widely spaced crossovers when multiple crossovers occur on the same chromosome, may increase the chance that multiple recombination events affect chromosomal extremities. Thirdly, heterochiasmy (i.e., different recombination rates and landscapes between sexes) could also explain these patterns (Stapley et al. 2017), especially if one sex tends to recombine preferentially in chromosomal extremities. Finally, a telomere-guided initiation of homology search before recombination could also explain an increased frequency of crossovers near chromosome peripheries. In the Pacific oyster, the comparison of male and female maps did not support the hypothesis of heterochiasmy (Hedgecock et al. 2015), so the observed recombination rate variation within chromosomes could be rather explained by centromere location, crossover interference and/or telomere-guided initiation of recombination. The recombinational environment is known to affect the impact of natural selection on linked neutral diversity across a wide range of species, in which low-recombining regions generally display reduced levels of nucleotide diversity (Corbett-Detig et al. 2015). As expected under the joint effect of background selection (Charlesworth et al. 1993) and hitchhiking (Maynard Smith and Haigh 1974) on linked neutral diversity, we found a positive relationship between the local recombination rate and nucleotide diversity across the Pacific oyster genome. However, the magnitude of this effect was rather small compared with what could be expected from comparative estimates in other invertebrate species supposed to have large population sizes. Corbett-Detig et al. (2015) argued that the effect of natural selection on the reduction of linked neutral variation was stronger in species with large census sizes, but a reanalysis by Coop (2016) showed that this conclusion was not supported by the data. Oysters are usually perceived as species with large effective population sizes and this is corroborated by a medium to high nucleotide diversity (Sauvage et al. 2007; Zhang et al. 2012). However, several empirical evidences from molecular evolution and population genetics studies have also shown that oysters have among the highest segregating loads of deleterious mutations observed in marine invertebrates (Sauvage et al. 2007; Plough 2016). This may be due to a high variance in reproductive success (sweepstake effect) and population size fluctuations (Hedgecock 1994; Boudry et al. 2002; Hedgecock and Pudovkin 2011; Harrang et al. 2013; Plough 2016). We are lacking theoretical predictions on the effect of linked selection in a species with skewed offspring distribution, but the effect is likely more genome-wide during favorable sweepstake events than it is in the standard Wright–Fisher model.

Genome-Wide Diversity Patterns within and between Pacific Oyster Species

The core objectives of this study were to 1) determine the extent and variation in the level of differentiation within and between C. angulata and C. gigas across the genome, 2) test the existence of differences in the genomic landscape of differentiation between native and introduced species pairs, 3) infer the history of gene flow during divergence, and finally 4) relate interspecies divergence to underlying evolutionary processes including demography, selection and genomic constraints. Our results reveal that the European introduction of C. angulata in the 16th century, and more recently of C. gigas in the 1970s, did not lead to a reduction in genetic diversity for both species compared with source populations in their native range. This observation is rather common in the literature of marine invasions, and has been attributed to the combined effects of multiple introductions and high propagule pressure (Viard et al. 2016). Another possibility is that the initiation of the demographic wave of invasion can simply not occur without a sufficiently large number of individuals in species with a strong Allee effect like sessile broadcast spawners. In other words, a successful marine introduction cannot happen without a high number of founders. We found a relatively low background level of differentiation between C. gigas and C. angulata (i.e., genome-wide averaged FST = 0.045), which is consistent with previous estimates based on microsatellite loci (Huvet et al. 2000, 2004). However, differentiation was highly heterogeneous across the genome in both Asian and European populations, with peaks of elevated FST values being found in most chromosomes, sometimes even reaching differential allelic fixation in Asia. These “genomic islands of differentiation” are commonly observed in genome scans for divergence between closely related species pairs (Wolf and Ellegren 2017), and multiple mechanisms have been proposed to explain their formation (Cruickshank and Hahn 2014; Burri et al. 2015; Yeaman et al. 2016; Ravinet et al. 2017). They broadly fall in two categories: 1) mechanisms of differential gene flow involving the existence of reproductive isolation and/or local adaptation loci acting as genetic barriers to interspecific gene flow (Barton and Bengtsson 1986; Feder and Nosil 2010), and 2) heterogeneous divergence mechanisms due to variation in the rate of lineage sorting across the genome through linked selection (Cruickshank and Hahn 2014; Burri et al. 2015; Burri 2017a). In accordance with the hypothesis of linked selection, genetic differentiation (FST) and absolute divergence (πB) were respectively negatively and positively correlated with the local recombination rate. Concomitantly, however, the highest values of both relative and absolute divergence were preferentially observed in low-recombining regions, which is not predicted under linked selection alone (Burri 2017a,b). Using simulations, Duranton et al. (2018) recently showed that these observations are consistent with the existence of genetic barriers to interspecific gene flow, in addition to the action of linked selection. Therefore, genomic islands between Pacific oyster species may also involve both mechanisms. In order to test whether specific adaptations to different habitats participate in genetic barriers, we hypothesized that the different ecological conditions encountered by oysters in Europe would either result in relaxed divergent selection pressures on genes involved in local adaptation in Asia, or in new selective constraints targeting different subsets of genes. In both cases, we expect that the new environmental conditions in Europe would promote a remodeling of the genomic landscape of species divergence and hence reduce the extent of parallelism in genetic differentiation between species-pairs from native and introduced populations. Contrary to that prediction, we found a remarkably high degree of divergence parallelism indicating that the genetic architecture of reproductive isolation between C. angulata and C. gigas has not been reinforced, nor significantly weakened following the co-introduction of the two species in Europe. Nevertheless, a slightly lower interspecific differentiation was found in Europe. This reduction was not driven by the most differentiated loci (figs. 3 and 6), and therefore rather indicated increased gene flow in the low-differentiated fraction of the genome than relaxed divergent selection on barrier loci in the novel habitat. Moreover, the Treemix and three-population f3 tests suggest that more pronounced gene flow is ongoing from C. angulata to C. gigas than in the opposite direction, especially in Europe (fig. 6). This asymmetrical introgression pattern seems consistent with the more recent introduction of C. gigas, which progressive spread over Europe has possibly favored the incorporation of genetic material from C. angulata through repeated backcrossing. Overall, these results suggest that the particular demographic conditions imposed by the co-introduction of the two species in southern Europe have facilitated opportunities for genetic interactions, without modifying the nature of the species boundary. The finding of unaltered genomic differentiation patterns in the introduced species-pair despite increased gene flow suggests either that the contact is too recent to observe large effects, or that historical and architectural genomic features have played a major role in shaping the genomic landscape of species divergence between Pacific cupped oysters. The observed pattern of heterogeneous genome divergence is predicted by the semipermeable species boundary model, in which neutral introgression is only permitted in genomic regions unlinked to reproductive isolation barriers (Harrison and Larson 2014, 2016). Under this model, elevated FST values above the background level indicate the location of genomic regions that are resistant to introgression due to the presence of reproductive isolation loci (i.e., speciation genes). The existence and the possible origin of a semipermeable barrier to gene flow between C. gigas and C. angulata was further addressed through the comparison of various theoretical models offering simplified representations of contrasted evolutionary scenarios. Our results indicated that contemporary differentiation patterns likely result from a long period of allopatric divergence followed by a recent SC with different rates of introgression among loci. This model consistently outperformed alternative scenarios in both native and introduced ranges, predicting observed differentiation patterns with very small residual errors (fig. 7). Inferred model parameters capturing semipermeability indicated that the fraction of the genome experiencing reduced introgression rates amounts to 37–45%, consistent with the view that the species barrier is still largely permeable to gene flow. Similar estimates have been obtained in other marine species-pairs analyzed with the same approach (Tine et al. 2014; Le Moan et al. 2016; Rougemont et al. 2017). In all these cases, postglacial SCs between allopatrically diverged lineages have been inferred. The northwestern Pacific region is known to harbor multiple cases of marine species subdivided into divergent lineages that are supposed to have originated in separate glacial refugia, most likely corresponding to the current seas of Japan and China (reviewed in Ni et al. 2014). Our study provides the first genome-wide view of the evolutionary consequences of past sea-level regression in this region, and adds to previous studies suggesting their important role in initiating speciation in northwestern Pacific taxa (e.g., Shen et al. 2011). In addition to providing evidence for partial reproductive isolation between C. gigas and C. angulata, our results also bring new empirical support for the role of recombination rate variation in shaping the genomic landscape of species divergence. A recent study in the European sea bass has shown that low-recombining regions not only experience accelerated rates of lineage sorting during allopatric phases, but are also preferentially involved in the barrier to gene flow during SCs (Duranton et al. 2018). Here, the three-population f3 tests revealed that the increase in gene flow favored in the European context has mostly affected highly recombining regions, which likely explains the negative relationship observed between differentiation and recombination. This finding is consistent with the idea that foreign alleles are more efficiently removed by selection following introgression when recombination is locally reduced (Schumer et al. 2018). This interpretation also suggests that the sites under selection are widespread across the genome, although not individually under strong selection (Aeschbacher et al. 2017). This type of distribution of individual locus effect sizes has recently been evidenced from hybridization studies in a number of animal and plant species (Simon et al. 2018).

Conclusion

Our results shed new light on the existence of reproductive isolation barriers between the two cupped oysters C. angulata and C. gigas. By providing empirical evidence for heterogeneous divergence patterns attributable to reduced introgression in low-recombining regions since SC, we show that these semi-isolated species are still evolving in the so-called “speciation grey zone” (Roux et al. 2016). Moreover, the finding of strong divergence parallelism between species-pairs from native and introduced areas suggests that the genomic architecture of reproductive isolation is not primarily determined by ecological divergence driven by local adaptations in the native range. Our study thus implies the existence of intrinsic genetic differences between the two species, which will be the focus of future investigations based on experimental crosses. Click here for additional data file.

78 in total

Review 1. Recombination rate variation and speciation: theoretical predictions and empirical results from rabbits and mice.

Authors: Michael W Nachman; Bret A Payseur
Journal: Philos Trans R Soc Lond B Biol Sci Date: 2012-02-05 Impact factor: 6.237

2. adegenet: a R package for the multivariate analysis of genetic markers.

Authors: Thibaut Jombart
Journal: Bioinformatics Date: 2008-04-08 Impact factor: 6.937

3. The evolution of genomic islands by increased establishment probability of linked alleles.

Authors: Sam Yeaman; Simon Aeschbacher; Reinhard Bürger
Journal: Mol Ecol Date: 2016-05-21 Impact factor: 6.185

4. Development of SNP-genotyping arrays in two shellfish species.

Authors: S Lapègue; E Harrang; S Heurtebise; E Flahauw; C Donnadieu; P Gayral; M Ballenghien; L Genestout; L Barbotte; R Mahla; P Haffray; C Klopp
Journal: Mol Ecol Resour Date: 2014-03-05 Impact factor: 7.090

5. The hitch-hiking effect of a favourable gene.

Authors: J M Smith; J Haigh
Journal: Genet Res Date: 1974-02 Impact factor: 1.588

6. European sea bass genome and its variation provide insights into adaptation to euryhalinity and speciation.

Authors: Mbaye Tine; Heiner Kuhl; Pierre-Alexandre Gagnaire; Bruno Louro; Erick Desmarais; Rute S T Martins; Jochen Hecht; Florian Knaust; Khalid Belkhir; Sven Klages; Roland Dieterich; Kurt Stueber; Francesc Piferrer; Bruno Guinand; Nicolas Bierne; Filip A M Volckaert; Luca Bargelloni; Deborah M Power; François Bonhomme; Adelino V M Canario; Richard Reinhardt
Journal: Nat Commun Date: 2014-12-23 Impact factor: 14.919

7. Linked selection and recombination rate variation drive the evolution of the genomic landscape of differentiation across the speciation continuum of Ficedula flycatchers.

Authors: Reto Burri; Alexander Nater; Takeshi Kawakami; Carina F Mugal; Pall I Olason; Linnea Smeds; Alexander Suh; Ludovic Dutoit; Stanislav Bureš; Laszlo Z Garamszegi; Silje Hogner; Juan Moreno; Anna Qvarnström; Milan Ružić; Stein-Are Sæther; Glenn-Peter Sætre; Janos Török; Hans Ellegren
Journal: Genome Res Date: 2015-09-09 Impact factor: 9.043

8. Marine invasions enter the genomic era: three lessons from the past, and the way forward.

Authors: Frédérique Viard; Patrice David; John A Darling
Journal: Curr Zool Date: 2016-04-26 Impact factor: 2.624

Review 9. Using neutral, selected, and hitchhiker loci to assess connectivity of marine populations in the genomic era.

Authors: Pierre-Alexandre Gagnaire; Thomas Broquet; Didier Aurelle; Frédérique Viard; Ahmed Souissi; François Bonhomme; Sophie Arnaud-Haond; Nicolas Bierne
Journal: Evol Appl Date: 2015-07-28 Impact factor: 5.183

10. The Genomic Signature of Population Reconnection Following Isolation: From Theory to HIV.

Authors: Nicolas Alcala; Jeffrey D Jensen; Amalio Telenti; Séverine Vuilleumier
Journal: G3 (Bethesda) Date: 2015-11-06 Impact factor: 3.154

18 in total

Review 1. Anthropogenic hybridization at sea: three evolutionary questions relevant to invasive species management.

Authors: Frédérique Viard; Cynthia Riginos; Nicolas Bierne
Journal: Philos Trans R Soc Lond B Biol Sci Date: 2020-07-13 Impact factor: 6.237

2. Evolutionary trade-offs between baseline and plastic gene expression in two congeneric oyster species.

Authors: Ao Li; Li Li; Wei Wang; Guofan Zhang
Journal: Biol Lett Date: 2019-06-05 Impact factor: 3.703

3. Acetylome Analysis Reveals Population Differentiation of the Pacific Oyster Crassostrea gigas in Response to Heat Stress.

Authors: Ao Li; Li Li; Wei Wang; Guofan Zhang
Journal: Mar Biotechnol (NY) Date: 2020-01-29 Impact factor: 3.619

4. Replicated anthropogenic hybridisations reveal parallel patterns of admixture in marine mussels.

Authors: Alexis Simon; Christine Arbiol; Einar Eg Nielsen; Jérôme Couteau; Rossana Sussarellu; Thierry Burgeot; Ismaël Bernard; Joop W P Coolen; Jean-Baptiste Lamy; Stéphane Robert; Maria Skazina; Petr Strelkov; Henrique Queiroga; Ibon Cancio; John J Welch; Frédérique Viard; Nicolas Bierne
Journal: Evol Appl Date: 2019-11-22 Impact factor: 5.183

5. Sequence Composition Underlying Centromeric and Heterochromatic Genome Compartments of the Pacific Oyster Crassostrea gigas.

Authors: Monika Tunjić Cvitanić; Tanja Vojvoda Zeljko; Juan J Pasantes; Daniel García-Souto; Tena Gržan; Evelin Despot-Slade; Miroslav Plohl; Eva Šatović
Journal: Genes (Basel) Date: 2020-06-24 Impact factor: 4.096

6. Detailed insights into pan-European population structure and inbreeding in wild and hatchery Pacific oysters (Crassostrea gigas) revealed by genome-wide SNP data.

Authors: David L J Vendrami; Ross D Houston; Karim Gharbi; Luca Telesca; Alejandro P Gutierrez; Helen Gurney-Smith; Natsuki Hasegawa; Pierre Boudry; Joseph I Hoffman
Journal: Evol Appl Date: 2018-12-31 Impact factor: 5.183

7. Genetic Characterization of Cupped Oyster Resources in Europe Using Informative Single Nucleotide Polymorphism (SNP) Panels.

Authors: Sylvie Lapègue; Serge Heurtebise; Florence Cornette; Erwan Guichoux; Pierre-Alexandre Gagnaire
Journal: Genes (Basel) Date: 2020-04-21 Impact factor: 4.096

8. Genomic Prediction for Whole Weight, Body Shape, Meat Yield, and Color Traits in the Portuguese Oyster Crassostrea angulata.

Authors: Sang V Vu; Wayne Knibb; Cedric Gondro; Sankar Subramanian; Ngoc T H Nguyen; Mobashwer Alam; Michael Dove; Arthur R Gilmour; In Van Vu; Salma Bhyan; Rick Tearle; Le Duy Khuong; Tuan Son Le; Wayne O'Connor
Journal: Front Genet Date: 2021-07-08 Impact factor: 4.599

9. Biases in Demographic Modeling Affect Our Understanding of Recent Divergence.

Authors: Paolo Momigliano; Ann-Britt Florin; Juha Merilä
Journal: Mol Biol Evol Date: 2021-06-25 Impact factor: 16.240

10. Population genomics of the introduced and cultivated Pacific kelp Undaria pinnatifida: Marinas-not farms-drive regional connectivity and establishment in natural rocky reefs.

Authors: Jaromir Guzinski; Marion Ballenghien; Claire Daguin-Thiébaut; Laurent Lévêque; Frédérique Viard
Journal: Evol Appl Date: 2018-06-14 Impact factor: 5.183