Pervasive hybridization with local wild relatives in Western European grapevine varieties.

Sara Freitas1,2,3, Małgorzata A Gazda1,2, Miguel  Rebelo1, Antonio J Muñoz-Pajares1,3,4, Carlos Vila-Viçosa1,2,3,5, Antonio Muñoz-Mérida1, Luís M Gonçalves1, David Azevedo-Silva1,2,3, Sandra Afonso1,3, Isaura Castro6, Pedro H Castro1,3, Mariana Sottomayor1,2,3, Albano Beja-Pereira1,3,7,8, João Tereso1,3,5,9, Nuno Ferrand1,2,3,10, Elsa Gonçalves11,12, Antero Martins11,12, Miguel Carneiro1,2,3, Herlander Azevedo1,2,3.   

Abstract

The rich diversity of grapevine (Vitis vinifera L.) results from a complex domestication history spanning multiple historical periods. Here, we used whole-genome resequencing to elucidate different aspects of its recent evolutionary history. Our results support a model in which a central domestication event in grapevine was followed by postdomestication hybridization with local wild genotypes, leading to the presence of an introgression signature in modern wine varieties across Western Europe. The strongest signal was associated with a subset of Iberian grapevine varieties showing large introgression tracts. We targeted this study group for further analysis, demonstrating how regions under selection in wild populations from the Iberian Peninsula were preferentially passed on to the cultivated varieties by gene flow. Examination of underlying genes suggests that environmental adaptation played a fundamental role in both the evolution of wild genotypes and the outcome of hybridization with cultivated varieties, supporting a case of adaptive introgression in grapevine.

Year:  2021        PMID: 34797710      PMCID: PMC8604406          DOI: 10.1126/sciadv.abi8584

Source DB:  PubMed          Journal:  Sci Adv        ISSN: 2375-2548            Impact factor:   14.136


INTRODUCTION

The European grapevine (Vitis vinifera L.) is one of the most charismatic plants of the Mediterranean agricultural landscape and one of the most widely grown fruit crops in the world. V. vinifera has diverged into two subspecies that distinguish cultivated varieties (V. vinifera ssp. vinifera) from their wild relatives (V. vinifera ssp. sylvestris), hereafter vinifera and sylvestris, respectively (). V. vinifera is the only species of the Vitis genus’ European clade, holding wild populations that occupy river banks and damp woods from the Transcaucasian region of West Asia to the Iberian Peninsula (, ). The origin and domestication centers for V. vinifera are both widely considered to match the Transcaucasian region, as suggested by morphological data and the presence of higher nucleotide diversity in both wild and cultivated germplasm belonging to this region (–). The earliest archaeological evidence for domestication of Eurasian grapevine was observed in the South Caucasus, timing it to the Late Neolithic Period, 8 thousand years (ka) ago (). Divergence between the two subspecies, estimated using whole-genome resequencing (WGS), places this event much earlier (22 to 400 ka ago) with an important bottleneck occurring ca. 8 ka ago (, ). Between 6 and 3 ka ago, viticulture spread south and west across the Mediterranean space (). Similar to most crops, grapevine still retains numerous unanswered questions regarding its origin and domestication path (). A restricted origin hypothesis first predicted that diversity of cultivated grapes was limited to a few founder genotypes in a single location (). Since then, use of molecular markers spanning a restricted number of loci has suggested the existence of gene flow between cultivated varieties and local wild genotypes, particularly associated with Western European wine varieties. 
Thus, there has been conflicting molecular evidence to support either multiple independent domestications (, ) or an initial domestication in the Transcaucasus followed by postdomestication hybridization with local wild relatives (). Characterization of cultivated and wild genetic diversity across the Mediterranean geographic range has placed the Iberian Peninsula region as a strong candidate for local contribution of wild populations to the overall modern-day structure of the cultivated grape (). The Iberian Peninsula is a hotspot for grapevine genetic diversity supporting hundreds of autochthonous varieties. Its diversity has been structured into three genetic groups that reflect either a core Iberian nature, a common genetic signature with Western and Central Europe varieties, or a Western Mediterranean provenance that incorporates the Maghreb (). This complexity reflects an ancient and rich domestication history, which translates into the diversity and identity of Iberian wines such as the distinctive Vinhos Verdes. Since the earliest domestication period, human activity led to the creation of thousands of grapevine varieties (). Whole-genome resequencing currently offers new opportunities to address this domestication path. Here, we analyzed 100 whole-genome sequences from the Vitis genus to elucidate important aspects of grapevine domestication, including the singularity of Western European genotypes in their relationship with sympatric sylvestris genotypes. Our data support the hypothesis of a single major domestication event in V. vinifera, followed by postdomestication hybridization with wild relatives in Western Europe. A subset of Iberian varieties exhibited particularly extensive signs of unidirectional introgression with local wild genotypes. Analysis of introgression tracts and selective sweep mapping suggest that regions under selection in donor wild genotypes were favorably incorporated into introgressed regions of the Iberian varieties. 
Ensuing analysis of the underlying genes suggests a strong association with environmental adaptation and a scenario of adaptive introgression as the result of hybridization with local wild relatives.

RESULTS

Whole-genome resequencing dataset

In V. vinifera, several reports have emphasized the importance of Western Europe as a source of either independent domestication or postdomestication hybridization between cultivated forms and local wild plants (, , ). To provide a detailed understanding of this past genetic history, we collected 51 cultivated (vinifera) samples, representing a large geographic range reinforced with Western European and particularly Iberian varieties, nine wild (sylvestris) samples from the Iberian Peninsula, and three Vitis sp. genotypes (table S1). Genomes were resequenced using Illumina technology. This dataset was complemented with publicly available sequencing data from 37 other genotypes (table S1). The total dataset comprised 100 genotypes and was composed of several Vitis sp. members, sylvestris samples from eastern and Iberian origins, and an array of table and wine vinifera varieties. Reads were mapped to the V. vinifera reference genome (Pinot Noir PN40024). Samples mapped, on average, to 86% of the genome, regardless of origin, showing a mapping depth of 5.2× (fig. S1).

Patterns of population structure

We began by investigating population structure between species within the Vitis genus and cultivated and wild V. vinifera samples, using clustering methods based on genotype likelihoods (–). Among other advantages, these methods use a probabilistic framework to tackle multiple aspects of population genomics, offering an extensive suite of solutions tailored specifically to medium- and low-coverage data. First, principal components analysis (PCA) was used to help visualize the relationships between the different Vitis species members (Fig. 1A and fig. S2, A and B). Vitis rotundifolia (also designated Muscadinia rotundifolia) was clearly divergent from the remaining genotypes, in line with previous WGS assessments (, ). Cultivated and wild V. vinifera formed a consistent cluster, with wild genotypes appearing closer to remaining Vitis species. We then resolved the relationship between cultivated and wild genotypes by excluding non–V. vinifera samples (Fig. 1B and fig. S2, C and D). This resulted in a clear separation between wild and cultivated varieties (PC2, 2.88% of the variance). Most notably, an overall east-to-west geographical gradient was evident both in wild and cultivated samples (PC1, 3.71% of the variance), consistent with a parallel westward expansion in these subspecies. Within cultivated varieties, there was a separation between table and wine varieties (Fig. 1B). Broadly, the PCA revealed a subclustering of wine varieties that reflected their estimated provenance. Varieties with estimated Iberian origin were split into two cohesive genetic groups, distinctly separated by varieties of Western and Central Europe estimated provenance. Overall results were strongly supported by ancestry and phylogenetic tree analysis (Fig. 1, C and D). Although the PCA seemed to explain a small percentage of the observed variance, values were similar to those reported using identical sequencing and PCA estimation methodologies (, ). 
Also, they followed the differentiation into wild-table-wine grapes of previous genome resequencing efforts (, ) and reflected the same clustering patterns in wine grapes obtained with 18K single nucleotide polymorphism (SNP) chip data (). Subsequently, we used population structure data and the perceived geographical origin of the varieties to cluster and name genotypes within six study groups: Wild samples were divided between Eastern and Iberian groups (WEAST, n = 8; WIBERIA, n = 9), while cultivated varieties were divided between one table (CTABLE, n = 9) and three wine groups, representing Western and Central Europe (CwWCE, n = 9) and the Iberian Peninsula (CwIB1, n = 13; CwIB2, n = 12) (Fig. 1, B to D). We estimated identity-by-descent (IBD) scores to single out the presence of clones, in which case, a single clone was retained for subsequent population studies (fig. S3 and table S2).
Fig. 1.

Population structure analysis of Vitis sp. and Vitis vinifera genotypes.

(A) PCA plot of Vitis sp. and V. vinifera wild and cultivated genotypes. (B) PCA plot of V. vinifera samples. (C) Ancestry proportions of all V. vinifera genotypes following admixture analysis for K = 4; bars represent individual genotypes, organized into six study groups plus remaining admixed individuals. (D) Phylogenetic tree of V. vinifera samples.

When interrogating these study groups for patterns of ancestry using admixture analysis, CwIB2 displayed compelling evidence of shared ancestry with Iberian sylvestris genotypes (Fig. 1C and fig. S4A). These results contrasted sharply with those for its CwIB1 counterpart at K = 4 and in other meaningful simulations (K = 2 to K = 6). Furthermore, CwIB2 was the wine group closest to WIBERIA genotypes in the PCA (Fig. 1B), and phylogenetic tree analysis placed most of the CwIB2 genotypes within the same subclade as WIBERIA samples (Fig. 1D and fig. S4B). This evidence points to a high likelihood of genetic relatedness between CwIB2 and WIBERIA groups. In addition, wine varieties from Western and Central Europe were also polarized by PC1 of the multivariate analysis to levels comparable to those of WIBERIA and CwIB2 (Fig. 1B). They also clustered next to these two groups in the phylogenetic tree, suggesting some extent of admixture between CwWCE and WIBERIA, although smaller than that observed for CwIB2. Table varieties clustered closer to wild genotypes from the center of origin (WEAST) in all analyses (Fig. 1), supporting previous evidence of high phenotypic and genetic differentiation between table and wine grapes (, ). This also indicates that a potential hybridization event in Western European grapevine varieties took place after differentiation between wine and table grapes.

Introgression testing between cultivated varieties and local wild genotypes

Given the potential for hybridization with local wild relatives displayed by CwIB2, we next tested for admixture using Patterson’s D statistics, an explicit test of gene flow (Fig. 2A) (). We configured the statistic to test potential gene flow from sylvestris donor groups (either WEAST or WIBERIA) to recipient cultivated groups. When placing CTABLE (which previously showed no evidence of admixture with WIBERIA) as P1, wine genotypes as P2, and WIBERIA as the donor group, we observed positive and statistically significant Patterson’s D levels across all tests (Fig. 2B). Results were strongest for CwIB2 (D = 0.1772; Z = 114.2490; P = 0.0000), followed by CwWCE (D = 0.1448; Z = 76.7658; P = 0.0000) and CwIB1 (D = 0.0927; Z = 59.4755; P = 0.0000). Conversely, D scores were negative and much closer to zero (neutrality) when replacing WIBERIA with WEAST (Fig. 2B and table S3). Statistical support for the strongest gene flow occurring between WIBERIA and CwIB2 was still evident when strictly wine groups were placed as recipients (Fig. 2C). When looking at each chromosome separately, there was a highly heterogeneous distribution of D (Fig. 2D), indicating that exchange of genetic information was not random across the genome. We found that chromosomes 6 (D = 0.20) and 14 (D = −0.03) provided extreme cases of high and low admixture, respectively. When extending this analysis to mitochondrial and chloroplast genomes, we observed an increase in Patterson’s D, which suggests a sylvestris maternal progenitor in the cross(es) that resulted in admixture. Collectively, these results support the first genome-level claim of admixture between Western European grapevine varieties and local wild relatives.
Fig. 2.

Patterson’s D statistics test for admixture.

(A) Patterson’s D statistics (or ABBA-BABA test) assumes that in four groups phylogenetically related as such—(((P1,P2),P3),O)—the proportion of ABBA and BABA sites will be equal under a scenario of incomplete lineage sorting without gene flow. Genome-wide levels of introgression from the donor P3 group can be detected by the presence of statistically significant levels of excess ABBA (P3➔P2) or BABA (P3➔P1) patterns. (B) Genome-wide Patterson’s D scores when confronting table (P1) against wine groups as P2 and wild groups as P3. (C) Genome-wide Patterson’s D scores assuming wine groups as either P1 or P2 and wild groups as P3. (D) Chromosome-level estimates of Patterson’s D statistics (±SE) estimated in the CTABLE or CwIB1 (P1), CwIB2 (P2), and WIBERIA (P3) configurations.
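As an illustration of the test summarized in (A), the sketch below computes Patterson's D from per-site derived-allele frequencies and attaches a delete-one block jackknife standard error and Z score. It is a minimal example under stated assumptions, not the pipeline used in this study: the function names, the frequency-based ABBA/BABA weights, and the equal-weight jackknife over blocks are all choices made here for clarity.

```python
from math import erfc, sqrt


def abba_baba(p1, p2, p3, o):
    """Per-site ABBA and BABA weights from derived-allele frequencies
    in P1, P2, P3 and the outgroup O (assumed frequency-based form)."""
    abba = (1 - p1) * p2 * p3 * (1 - o)
    baba = p1 * (1 - p2) * p3 * (1 - o)
    return abba, baba


def patterson_d(sites):
    """D = (sum ABBA - sum BABA) / (sum ABBA + sum BABA).
    `sites` is an iterable of (p1, p2, p3, o) tuples."""
    abba = baba = 0.0
    for p1, p2, p3, o in sites:
        a, b = abba_baba(p1, p2, p3, o)
        abba += a
        baba += b
    total = abba + baba
    return (abba - baba) / total if total > 0 else 0.0


def jackknife_z(blocks):
    """Delete-one block jackknife. `blocks` is a list of site lists,
    e.g. one list per genomic block. Returns (D, SE, Z)."""
    n = len(blocks)
    if n < 2:
        raise ValueError("need at least two blocks")
    d_full = patterson_d([s for b in blocks for s in b])
    # D recomputed with each block left out in turn
    loo = [
        patterson_d([s for j, b in enumerate(blocks) if j != i for s in b])
        for i in range(n)
    ]
    mean = sum(loo) / n
    se = sqrt((n - 1) / n * sum((d - mean) ** 2 for d in loo))
    z = d_full / se if se > 0 else float("inf")
    return d_full, se, z


def two_sided_p(z):
    """Two-sided normal tail probability for a Z score."""
    return erfc(abs(z) / sqrt(2))
```

A positive D with a large Z indicates excess ABBA sites, that is, P3 sharing more derived alleles with P2 than with P1. For Z scores in the range reported in this section (above 50), the normal tail probability falls below what double precision can represent, which is one plausible reason such P values print as 0.0000.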

Analysis of introgression impact

We next looked at the impact of introgression by performing a genome-wide characterization of three separate properties of the data: (i) nucleotide diversity, (ii) IBD scores, and (iii) genetic differentiation. Diversity indexes and measures of population differentiation were estimated across the genome while taking into account nucleotide uncertainty () (Fig. 3 and fig. S5). Results for nucleotide diversity (π) (Fig. 3A) and Watterson’s theta (θW) (fig. S5A) showed a decrease in diversity when comparing wild grapes from the East (π = 0.0140; θW = 0.0140) and the wild Iberian group (π = 0.0112; θW = 0.0108). There was also a decrease in nucleotide diversity from wild to table (π = 0.0120) and wine (π = 0.0110 to 0.0119) groups, which is consistent with the presence of a weak domestication bottleneck (). Comparisons were all statistically supported (P < 0.001). There was an increase in nucleotide diversity in CwIB2 when compared to the remaining wine groups, which is compatible with higher proportion of admixture following introgression of local sylvestris into the vinifera genetic pool. Tajima’s D statistics were close to zero for WEAST (TD = −0.0346), but they increased for the remaining study groups (Fig. 3B), consistent with population contraction (a likely scenario in the wild Iberian population). An overall similar profile was observed when we summarized IBD scores between all individuals of a study group against all individuals of a second group (Fig. 3C). Groups displayed comparable levels of heterozygosity, albeit slightly higher in cultivated varieties (0.341 ± 0.024; n = 43) when compared to wild genotypes (0.335 ± 0.014; n = 17) (fig. S5C).
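To make the diversity statistics in this paragraph concrete, the following sketch computes nucleotide diversity (π), Watterson's theta (θW), and Tajima's D from a small set of aligned haplotypes. It is a simplified illustration that assumes fully called sequences of equal length; the study itself estimated these quantities while accounting for nucleotide uncertainty, so the function name and input format here are hypothetical.

```python
from itertools import combinations
from math import sqrt


def diversity_stats(haplotypes):
    """Return per-site pi, per-site Watterson's theta, and Tajima's D
    for a list of equal-length haplotype strings (n >= 2)."""
    n = len(haplotypes)
    length = len(haplotypes[0])

    # S: number of segregating sites
    seg = sum(1 for i in range(length) if len({h[i] for h in haplotypes}) > 1)

    # k: mean number of pairwise differences
    pairs = list(combinations(haplotypes, 2))
    k = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    theta_w = seg / a1

    if seg == 0:
        tajima_d = 0.0  # undefined without variation; reported as 0 here
    else:
        # Standard constants for the variance of (k - S/a1)
        b1 = (n + 1) / (3 * (n - 1))
        b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
        c1 = b1 - 1 / a1
        c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
        e1 = c1 / a1
        e2 = c2 / (a1 ** 2 + a2)
        tajima_d = (k - theta_w) / sqrt(e1 * seg + e2 * seg * (seg - 1))

    return {"pi": k / length, "theta_w": theta_w / length, "tajima_d": tajima_d}
```

When π exceeds θW, Tajima's D is positive, the direction this paragraph associates with population contraction; when the two estimates agree, D is close to zero.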
Fig. 3.

Nucleotide diversity and genetic differentiation of the six study groups.

(A and B) Violin plot distribution of nucleotide diversity (A) and Tajima’s D (B). (C) Violin plot of pairwise IBD scores reflecting comparisons between two genotypes of interest. (D) Heatmap of the group differentiation matrix of averaged FST values (inset: multidimensional scaling analysis of the FST matrix for study group differentiation). (E) Biogeographical model depicting Vitis vinifera speciation (triangle) and important events during domestication history (circles). Statistics in (A), (B), and (D) were estimated as 100-Kb nonoverlapping windows across the genome.

To capitalize on the opportunities provided by our dataset for understanding grapevine domestication, we determined the impact of introgression on differentiation. We calculated the fixation index (FST) () across all pairwise group comparisons and used multidimensional scaling to reduce FST distances to a two-dimensional space (Fig. 3D). Results completely mirrored our previous PCA (Fig. 1A). Highest FST was observed for table grapes against wild Iberian genotypes (FST = 0.179). This result, in conjunction with the fact that table grapes were the group closest to wild Eastern grapes (second lowest FST = 0.063), suggests an altogether independent path between table vinifera and Iberian sylvestris genotypes (Fig. 3E). Most notably, we made three important and correlated observations: (i) All three wine groups were more differentiated from WIBERIA (FST = 0.100 to 0.154) than from WEAST (FST = 0.082 to 0.104), suggesting a common origin in WEAST; (ii) there was a sharp (~33%) decrease in FST in WIBERIA versus CwIB2 when compared with other wine groups, supporting the existence of an introgression event; yet, (iii) the overall lowest FST, with values that represent fairly low divergence, was observed between Iberian wine groups (FST = 0.059), as might be expected with a recent differentiation (Fig. 3D). 
Collectively, these observations strongly undermine the possibility of an independent grapevine domestication event based on Western European wild grape ancestors. Rather, they favor the following model: (i) a single major domestication in the Transcaucasus, (ii) differentiation between wine and table grapes, and (iii) a postdomestication hybridization event with local wild grapes in Western Europe (most likely taking place in the Iberian Peninsula) (Fig. 3E).
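The pairwise comparisons above rest on an FST value computed for each pair of study groups. As a hedged illustration, the sketch below uses Hudson's estimator in its ratio-of-averages form, computed from allele frequencies at biallelic sites. This estimator is swapped in here for simplicity and is an assumption; the study estimated FST with methods that account for genotype uncertainty, and the function name and arguments are hypothetical.

```python
def hudson_fst(freqs1, freqs2, n1, n2):
    """Hudson's FST as a ratio of averages over sites.

    freqs1, freqs2: allele frequencies of the same allele at each site
                    in population 1 and population 2.
    n1, n2:         number of sampled chromosomes in each population.
    """
    num = den = 0.0
    for p1, p2 in zip(freqs1, freqs2):
        # Squared frequency difference, corrected for sampling noise
        num += (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
        # Probability that two alleles drawn from different populations differ
        den += p1 * (1 - p2) + p2 * (1 - p1)
    return num / den if den > 0 else 0.0
```

Values near 0 indicate populations with nearly identical allele frequencies and values near 1 indicate fixed differences, so a lower FST between CwIB2 and WIBERIA than between other wine groups and WIBERIA is read in the text as greater shared ancestry.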

Identification of introgression tracts in Iberian grape varieties

Next, we used f^ statistics to assess the fraction of the genome shared through introgression (). The analysis was restricted to the same biogeographical space (Iberia), where we had the strongest and weakest signs of introgression in wine study groups, sympatric with the potential donor (wild) genotypes, i.e., configuration f^ (P1 = CwIB1, P2 = CwIB2, P3 = WIBERIA, and O = V. rotundifolia) (Fig. 4A). Averaged genome values from the sliding window analysis were fairly high (f^ = 0.216). Together with the previous ancestry analysis (see Fig. 1C), our results suggest that the proportion of introgressed tracts may range between 25 and 50% of the grapevine genome in CwIB2 varieties. Nonetheless, we decided to implement a conservative approach in defining introgression tracts (see Materials and Methods), which resulted in 219 tracts averaging 162 Kb in size and representing 8.11% of all genomic windows (Fig. 4A). Chromosome 6 displayed the largest number of tracts (33), consistent with its highest averaged Patterson’s D score (Fig. 2D). In total, introgression tracts contained 2214 genes (data file S1). To extract biological meaning, genes were subjected to Gene Ontology (GO) term functional enrichment analysis (table S4), which highlighted an overrepresentation (P < 0.05) in abscisic acid (ABA) signaling genes. ABA is the canonical hormone in plant adaptation to abiotic stress stimuli (). Of importance also is the presence of a homolog for the ABA receptor PYR1 (VIT_02s0012g01270), which is the sensor protein for ABA (). Enrichment analysis also signaled grapevine PATHOGENESIS RELATED-10 homologs, strongly associated not only with biotic but also with abiotic stress responses (). Results support a scenario where introgression in CwIB2 grapevine varieties implicated local environmental adaptation.
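The step from per-window f^ scores to introgression tracts can be sketched as follows. The 20-Kb window size and the exclusion of singleton windows come from the Fig. 4 legend; everything else, including the function name and the use of a single fixed score threshold, is a hypothetical simplification, since the actual criteria are given in Materials and Methods.

```python
def call_tracts(scores, threshold, window_size=20_000):
    """Merge runs of consecutive windows scoring >= threshold into tracts.

    scores:      per-window scores along one chromosome, in genomic order.
    threshold:   minimum score for a window to count as elevated (assumed).
    window_size: window length in bp (nonoverlapping windows).

    Returns a list of (start_bp, end_bp) tuples. Runs made of a single
    window are discarded.
    """
    tracts = []
    run_start = None
    # A sentinel below any threshold closes a run that reaches the end
    for i, score in enumerate(list(scores) + [float("-inf")]):
        if score >= threshold:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start >= 2:  # keep only runs of two or more windows
                tracts.append((run_start * window_size, i * window_size))
            run_start = None
    return tracts
```

For example, with a threshold of 0.6 the scores `[0.1, 0.9, 0.8, 0.2, 0.95, 0.1, 0.7, 0.9, 0.85]` yield two tracts, one of 40 Kb and one of 60 Kb, while the isolated window scoring 0.95 is dropped.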
Fig. 4.

Introgression regions and signatures of positive selection in Iberian genotypes.

(A) Manhattan plot of f^ scores for detection of introgressed tracts in CwIB2, using 20-Kb nonoverlapping windows across the genome, assuming the configuration f^ (P1 = CwIB1, P2 = CwIB2, P3 = WIBERIA, O = V. rotundifolia). Singleton windows with elevated f^ scores were not considered as tracts (see Materials and Methods for details). The x axis shows chromosome positions. (B) Manhattan plots of DCMS scores for CTABLE versus WEAST, CwWCE versus WEAST, CwIB2 versus WEAST, CwIB1 versus WEAST, and WIBERIA versus WEAST estimated across the genome in 100-Kbp windows with 50-Kbp steps. Dashed line represents 95th percentile cutoff. The x axis shows chromosome positions. (C) Venn summarization of shared genes between introgressed tracts in CwIB2, signatures of positive selection in wild groups (WEAST versus CwIB2), and signatures of positive selection in cultivated grapevine groups against WEAST.

Detection of signatures of positive selection across the genome and overlap with introgression tracts

We next reasoned that some regions under selection might be favored in introgressed tracts of CwIB2. To address this, we focused on the detection of selection signatures across the genome, by comparing cultivated varieties and Iberian wild genotypes against the wild ancestral group (WEAST) (Fig. 3E). A catalog of potential positive selection signals was identified through four complementary statistics that use different properties of the data: genetic differentiation (FST), genetic diversity [reduction of diversity (ROD)], and the allele frequency spectrum of mutations (∆TD and Fay and Wu’s H). We then implemented a summarization strategy using decorrelated composite of multiple signals (DCMS) (table S5) (). Multiple comparisons (Fig. 4B and data file S2) revealed high and differentiated signals of positive selection across the genome. We observed the differentiation that underpinned domestication of wine and table grapes, with shared selection targets (e.g., chromosome 17) contrasting with a series of genomic regions specific for each cultivated study group (Fig. 4B; WEAST versus CTABLE/CwWCE/CwIB1/CwIB2). The data also provided an opportunity to look at the genomic fingerprint of adaptation of sylvestris plants as they expanded westward from their speciation center (Fig. 4B; WEAST versus WIBERIA). Strongly selected regions (95th percentile) included an extensive set of genes associated with biotic and abiotic stress responses, including homologs of the ABA sensing and signaling pathway, and multiple pathogen resistance genes with importance for grapevine biology (data file S3), indicating that biogeographical expansion was most likely driven by the capacity to adapt to newly encountered external challenges. We then investigated whether introgressed tracts in CwIB2 might match regions under positive selection. A partial overlap could be observed between regions under selection in CwIB2 and WIBERIA (Fig. 4B). 
We cross-referenced genes in introgressed tracts and genes under positive selection in Iberian wild genotypes against multiple signals of positive selection identified for the various wine and table study groups. In this comparison, CwIB2 consistently displayed two to three times as many shared genes when compared to the remaining cultivated groups (Fig. 4C). This result suggests that introgression and selection were not independent events. Furthermore, the introgressed regions contained 76 genes with signatures of positive selection in both WIBERIA and CwIB2 (table S6). The gene set comprised highly relevant homologs of genes associated with flowering and light perception (CONSTANS-Like, VIT_00s0194g00070; FAR1-Related, VIT_00s0194g00200), pathogen perception and hormonal signaling (RPP2A, VIT_18s0072g01230; ICS2, VIT_17s0000g05750), abiotic stress responses to cold (ADA2B, VIT_00s0194g00130) and drought (ERD4/OSCA1.8, VIT_02s0109g00230), and sugar content regulation (PKR, VIT_02s0109g00080). The latter two genes, ERD4 and PKR, integrate one of the largest and most robust signals of introgression observed in our analysis, positioned in chr2, and consisting of five consecutive introgression tracts (Fig. 5). Their contiguity suggests that they may be part of a large introgression tract, containing additional positively important homologs of known abiotic and biotic stress determinants (e.g., MED25, VIT_02s0012g02620; ABC-transporter Homolog; VIT_02s0012g02770; Disease resistance protein RPM1, VIT_02s0012g02720; Putative disease resistance protein RGA1, VIT_02s0109g00420; and multiple Geraniol 8-hydroxylase–coding genes) (Fig. 5E). This genomic section exemplifies our hypothesis in which robust f^ signals of introgression often equaled a peak in selection signatures in the CwIB2, including elevated levels of Tajima’s D and ROD. They also matched a peak in selection signatures in WIBERIA, but not in CwIB1 (Fig. 5, C and D). 
PKR is particularly meaningful since this sugar-associated regulator is positioned within the largest introgression tract and within both the CwIB2 and WIBERIA sweeps (Fig. 5E), targeting it as a key functional candidate for future characterization studies. Collectively, these findings highlight the functional pathways that most likely drove introgression from local wild populations into a range of Western European grapevine varieties.
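The DCMS summarization used in this section combines several per-window statistics into one score while down-weighting statistics that are correlated with each other. The sketch below follows the general form of the method: each statistic is converted to a p-value, transformed to log((1 − p)/p), and divided by the sum of its absolute correlations with all statistics. The rank-based empirical p-values, the assumption that larger values always mean stronger signal, and all function names are simplifications introduced here, not the study's implementation.

```python
from math import log, sqrt


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def empirical_p(values):
    """Right-tailed rank-based p-values, strictly inside (0, 1).
    Ties are broken by input order (a simplification)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i], reverse=True)
    p = [0.0] * n
    for rank, i in enumerate(order, start=1):
        p[i] = rank / (n + 1)
    return p


def dcms(stats):
    """stats maps a statistic name to its per-window values, oriented so
    that larger values indicate stronger evidence of selection.
    Returns one DCMS score per window."""
    names = list(stats)
    n_windows = len(stats[names[0]])
    pvals = {t: empirical_p(stats[t]) for t in names}
    # Weight = 1 / (self-correlation of 1 + |r| with every other statistic)
    weights = {
        t: 1.0 / (1.0 + sum(abs(pearson(stats[i], stats[t])) for i in names if i != t))
        for t in names
    }
    return [
        sum(weights[t] * log((1 - pvals[t][w]) / pvals[t][w]) for t in names)
        for w in range(n_windows)
    ]
```

Windows whose combined score exceeds a chosen percentile of the genome-wide distribution (the 95th percentile in Fig. 4B) would then be treated as candidate regions under selection. Statistics for which more negative values indicate selection would need their sign flipped before being passed in.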
Fig. 5.

Signals of introgression and positive selection in one of the strongest introgression tracts for the CwIB2 study group, positioned in chromosome 2.

(A) Zoom-in on chromosome 2 (9.5- to 13.5-Mb coordinates) details five neighboring introgression tracts determined by top f^ scores in a 20-Kb sliding window analysis of configuration f^ (P1 = CwIB1, P2 = CwIB2, P3 = WIBERIA, O = V. rotundifolia). (B) DCMS scores for the WEAST versus CwIB2 comparison (top) and scores for the selection signature statistics composited in the DCMS analysis (∆Tajima’s D and Fay and Wu’s H in the middle; FST and ROD in the bottom). (C) DCMS scores for the WEAST versus WIBERIA comparison. (D) DCMS scores for the WEAST versus CwIB1 comparison. (E) Gene space and annotation of highlighted genes in this genomic interval.

DISCUSSION

Whole-genome resequencing is currently driving a surge of population genomic approaches that target the genetic basis of domestication, selection, and adaptation events. Here, we resequenced dozens of grapevine genotypes and provided new evidence that Western European cultivated grapes underwent a major postdomestication hybridization event with local wild genotypes, rather than independent domestication. In grapevine, such an event likely took place in the Iberian Peninsula. However, a molecular signature of introgression was also found across multiple grape varieties of Western Europe outside Iberia. Whether a single or multiple postdomestication hybridization events occurred across the Mediterranean basin is yet to be determined. Our data support a growing body of literature showing that hybridization events leading to postdomestication gene flow are mainstream occurrences during crop geographical expansion (). In a few crops, this phenomenon has been associated with adaptive introgression (, ). Similarly, our genome-level approach allowed us to (i) recognize introgression tracts, (ii) detect selective sweeps underpinning adaptation of wild populations, and (iii) identify genes of interest mutually involved in adaptation and introgression.

Western European varieties reflect postdomestication hybridization rather than independent domestication

In the present work, we tested whether postdomestication hybridization or an independent domestication event underpinned the development of Western European wine varieties [reviewed in ()]. The likelihood of an independent domestication event should not be minimized, since recent WGS hints at the possibility of an independent history for a specific set of grapevine cultivars from the Levant (). However, our results provide genome-level evidence that Western European wine varieties did not originate from independent domestication but rather from a postdomestication hybridization event. Compared to insights generated from markers with ascertainment bias, our data contradict previous claims for an independent domestication (, , –) and help clarify ambivalent hypotheses (, –), in favor of a postdomestication hybridization model (, ). Furthermore, our results suggest a common domestication framework in which both wine and table grapes derive from the historically and genetically accepted domestication center in the Transcaucasus, followed by expansion of grapevine across the Mediterranean basin, where it hybridized at least once with local sylvestris populations (Fig. 3E). Even though feralized domesticate individuals can be found in the Iberian Peninsula, their relative abundance seems to be fairly small (, ). Here, introgression directionality testing using f^ statistics indicated unidirectional gene flow from sylvestris into vinifera populations, supporting previous 3-population tests on SNP chip data (). Another key finding was that multiple lines of evidence show all three wine study groups displaying, to varying degrees, a sylvestris introgression signature (summarized as CwIB2 > CwWCE > CwIB1) (Fig. 2). This suggests that introgression historically permeated most of the modern grapevine wine varieties found in Western Europe. 
Future studies should now address the origin of these introgression signatures and whether they derive from a single or multiple hybridization events (Fig. 3E). Our varietal dataset was structured into groups that broadly reflect previous population structuring using SNP chip data (, ), in which CwIB1 can be seen as a core group of typical Iberian varieties, while CwIB2 are Iberian varieties with an affinity to Western and Central European genotypes (). In this context, the possibility that the hybridization responsible for CwIB2 took place outside the Iberian Peninsula needs to be considered. However, the likeliest scenario is that this hybridization occurred in the Iberian Peninsula. CwIB2 showed the highest and CwIB1 the lowest introgression signatures, yet both groups presented the closest genetic proximity, indicating recent differentiation. Numerous studies using low-resolution molecular markers support the genetic proximity between local sylvestris populations and modern cultivated varieties in Portugal (, , , ) and Spain (, , ). Last, the chlorotype that characterizes wild populations in Western Europe is more prevalent in cultivated varieties from the Iberian Peninsula than in those from the remaining geographies (). In a single-introgression model, CwIB2-related genotypes may have acted as donors of sylvestris genetic material, further diluted in Western European cultivated varieties by backcrossing with vinifera as a result of purposeful breeding. The long track record of historical flow of varieties across Europe, particularly since Roman times (, ), offers a framework for interchange between Iberian and Western and Central European varieties. Recent ancient DNA analysis suggests a transition in French grapevine diversity from Roman to Medieval times, in which early Roman period seeds clustered closer to Iberian and Eastern European grape varieties, whereas Late Roman and Early Medieval seeds were more similar to modern Western European varieties ().
In agreement, pip morphometric studies documented a shift from abundant morphologically wild pips in earlier chronologies to domestic types in the Late Roman and Medieval times, suggesting greater selection efforts in these later periods (). Meanwhile, one should also consider a multiple-introgression model, in which hybridization events occurred independently in multiple geographic locations (Fig. 3E). Non–genome-level studies have suggested a genetic relatedness with local wild genotypes in cultivated varieties from Italy, France, and the Balkans (, , , , , ), but whether these signatures reflect a single or multiple hybridization events remains to be established. Future whole-genome resequencing efforts across the Mediterranean distribution range will be vital to expand our knowledge on postdomestication hybridization, particularly in the wine-producing varieties of Western Europe.

Selection signatures corroborate a case of adaptive introgression in grapevine

Screening for wild introgression signatures that may be present in the cultivated gene pool can be an effective strategy to uncover wild diversity relevant for crop adaptation to current environmental changes (). Our evidence suggests that adaptive introgression is common in cultivated grapes, since many wild genes under natural selection seem to have been favored and retained in introgressed cultivated varieties (Figs. 4 and 5). The size of estimated introgression in the strongest admixed study group, CwIB2, suggests the replacement of a massive number of alleles for new functional variants. Such an event is likely to lead to important phenotypic differences (). We used a WGS genomics strategy that is powerful for population-scale studies, as it facilitates sample comparison against a common reference genome (), but recognize as a caveat of the approach the failure to account for sylvestris genomic regions that are absent from the vinifera reference genome. Genome alignment between our reference genome and the recent high-quality assembly of the V. vinifera ssp. sylvestris genome () highlights the absence of major structural variations between both genomes (fig. S6), which should minimize this bias. Moreover, we combined detection of introgression with detection of selection, to refine our capacity to identify impactful genes. We subsequently singled out a set of 76 genes that belong to selective sweeps in CwIB2 and WIBERIA and are part of introgression tracts between both groups (table S6). They offer high confidence candidates for trait architecture determination in CwIB2. Using small SNP panels, Cunha and co-workers () recently genotyped the Portuguese national variety catalog plus local sylvestris genotypes, showing that a subset of varieties with strong overlap with CwIB2 clustered with local wild relatives. Several varieties belonged to the Vinhos Verdes demarcated region typical of Northwestern Iberia. 
In our study, the provenance of CwIB2 members is associated with the north of Portugal, and 7 of 11 members are canonical Vinhos Verdes varieties, firmly establishing how a major introgression event characterizes this Iberian wine type. Remarkably, Vinhos Verdes are naturally sparkling wines with lower sugar and higher acidity, suggesting that there may be a genetic component in addition to the viticultural and environmental factors that help shape their typicity. Among the 76 genes of interest (table S6), we singled out PRK (VIT_02s0109g00080) because of its overlap with an interval showing strong signatures of both positive selection and introgression (Fig. 5). PRK encodes a phosphoribulokinase that may contribute to the regulation of sugar content, and its best Arabidopsis ortholog (AtPRK, AT1G32060) is involved in redox regulation of the Calvin-Benson cycle (). In this report, we show an east-to-west genetic gradient in both wild and cultivated genotypes (PC1; Fig. 1B), which supports earlier studies (, , , ) and corroborates the presence of introgression. The latter is likely to have favored a reduction in the genetic load (i.e., the “cost of domestication”) previously reported in grapevine (), evidenced here by the increase in nucleotide diversity in CwIB2. In the WIBERIA population, selective sweeps were likely associated with resistance and adaptation mechanisms, as indicated by the presence of a large number of genes involved in transcriptional control, pathogen resistance, and hormonal modulation (data file S3). In other crops, adaptive introgression has been associated with adaptation to altitude and geographical expansion (, ). Similarly, we highlight how Vinhos Verdes varieties are typical of the Northwestern Iberian Peninsula, characterized by a wet temperate Atlantic climate that contrasts severely with the dry Mediterranean climate toward the Iberian Southeast ().
Considering that sylvestris plants are lianas that favor high-humidity conditions (), introgression may have enabled cultivated grapes to quickly rewire water usage signaling and response pathways. In support, many of our 76 genes of interest are involved in ABA/drought responses. The trehalose 6-phosphate phosphatase (TPP) family member VIT_15s0046g01000 links sugar and abiotic stress adaptation by being involved in the control of sugar utilization and in the tolerance response to drought, as seen in other plants (). VIT_18s0072g01220 is a putative grapevine ABA transporter () that may interfere with ABA distribution and ultimately control ABA-regulated stress responses (). VIT_00s0194g00210 is an ortholog of the Arabidopsis TPK1 vacuolar K+ channel involved in ABA- and CO2-mediated stomatal closure (). Other genes with orthologs implicated in drought/ABA responses include the transcriptional adapter ADA2B (VIT_00s0194g00130) () and the Early-Responsive to Dehydration stress protein ERD4 (VIT_02s0109g00230), which belongs to the OSCA family of mechanically activated ion channels involved in osmosensing (). OSCA proteins were recently associated with regulation of plant stomatal immunity (). Last, our 76 genes of interest also include homologs of ISOCHORISMATE SYNTHASE 2 (VIT_17s0000g05750), which is involved in the biosynthesis of salicylic acid (SA), the central hormone in local and systemic acquired resistance against pathogens (), as well as DISEASE RESISTANCE PROTEIN RPP2A (VIT_18s0072g01230), an ortholog of genes linked to downy mildew resistance (). Thus, it seems that introgression affected the upstream hormonal control of environmental responses, especially those involving ABA and SA. These results emphasize the potential of these wild populations as sources of previously unknown allelic diversity for breeding, as suggested for grapevine and other major crops (, ).

The timing of postdomestication hybridization

An important question now remains as to the historical timing of postdomestication hybridization. Genome resequencing data suggest a protracted domestication history in which sylvestris and vinifera diverged at some point between 22 and 400 ka ago. Models show an important genetic bottleneck ca. 8 ka ago that matches the earliest archaeological evidence of Eurasian grapevine use in the Transcaucasus and marks the beginning of purposeful cultivation in grapevine (–, ). It is generally accepted that the Transcaucasus region approximates the primary domestication center, after which grapevine use (and possibly a wine culture) spread to Anatolia and across the Mediterranean following the main civilizations (, ). This assumption provides an extensive time frame for a subsequent hybridization event. There are multiple lines of support for the use of wild grape across Europe and the Mediterranean civilizations before the widespread use of domesticated grape [reviewed by ()]. In the Iberian Peninsula, there is evidence for gathering of wild grapes by hunter-gatherers in the Early Holocene () and by the first prehistoric farmers 8 to 4 ka ago (), and its multipurpose use seems to have extended until the late 20th century (). This means that local people were familiar with native sylvestris populations and were therefore well placed to take advantage of crosses with domesticated vinifera. In our report, D statistics (Fig. 2) corroborate previous chlorotype data () in supporting an original cross with maternal sylvestris provenance. Given the differentiated flower morphology of subspecies vinifera (hermaphrodite) and sylvestris (unisexual), and in the absence of reproductive barriers, it is plausible that wild plants were the recipients of domesticated pollen, as suggested by molecular data. Such crosses might have appeared as “wild” plants (possibly adjacent to domesticated vineyards) displaying superior berry/bunch characteristics.
As to the timing of this event, the earliest evidence of grapevine cultivation in the Iberian Peninsula dates back to 2900 years ago and is based on Phoenician influence (, ), making this the earliest plausible moment for an introgression event in this region. In stark contrast, we show how varieties from the CwIB2 population can display up to 25 to 50% of introgression tracts (Figs. 1 and 2), placing them as potential F1s or second-generation backcrosses. In grapevine, this does not necessarily imply a recent event, because of the extensive use of clonal propagation since Roman times (). In support, ancient DNA analysis of French medieval grape pips recently provided evidence for 900 years of uninterrupted vegetative propagation (). The study also suggests the presence of gene flow between local wild grapevines and cultivated varieties, timing it to the early stages of viniculture in France (ca. 2500 years ago). Within the context of the Iberian Peninsula, we find additional support that hybridization was not a recent event. A morphometrics study of grape pips from Northwestern Iberia archaeological sites grouped medieval and Roman pips close to the modern variety Alvarinho (PT)/Albariño (SP) (). This CwIB2 member is a hallmark variety for Vinhos Verdes wines and was suggested to be a first-generation migrant from sylvestris based on simple sequence repeat (SSR) data (). Another CwIB2 variety, Amaral, is mentioned in 1532 writings addressing the North of Portugal (). Collectively, these results frame a postdomestication hybridization event in the Iberian Peninsula between 2900 and 500 years ago. Definitive clues are likely to be hidden in the DNA of archaeobotanical samples. Future work should use genomic approaches to compare ancient DNA samples with modern genomic sequences as a means to understand the timing, strength, and interdependence of the hybridization events that seem to permeate a subset of Iberian varieties and Western European varieties in general.

MATERIALS AND METHODS

Extended Materials and Methods are available in the Supplementary Materials and Methods.

Sampling, sequencing, and mapping

Vinifera varieties were sampled from two separate Portuguese germplasm collections (PORVID and UTAD), and sylvestris samples were collected in the southwestern region of the Iberian Peninsula. High-quality genomic DNA was used to produce polymerase chain reaction–free sequencing libraries followed by Illumina sequencing. Sequencing data from 37 additional genotypes from previous studies were also used. Information on genotypes and sequencing effort is summarized in table S1. Read quality filtering and trimming were performed with FastQC (www.bioinformatics.babraham.ac.uk/projects/fastqc/) and Trimmomatic (). Reads were mapped to the V. vinifera PN40024 reference genome using BWA-MEM ().

Population structure analysis

For PCA, we estimated genotype posterior probabilities using the ANGSD software package (). The ngsCovar feature (ngsPopGen package) was used to compute the expected correlation matrix between individuals from genotype posterior probabilities. For the phylogenetic tree, the same genotype posterior probabilities were used to calculate pairwise genetic distances in ngsDist from ngsTools. We computed a distance-based minimum evolution tree by inputting the genetic distance matrix into FastME (www.atgc-montpellier.fr/fastme/) with 100 bootstraps for branch support. For ancestry analysis, we used NGSadmix (www.popgen.dk/software/index.php). NGSadmix was run assuming two to eight ancestral populations with the default minor allele frequency of 0.05.

Nucleotide diversity and genetic differentiation

Population genetics summary statistics (FST; Watterson’s Theta, θw; π; Tajima’s D; Fay and Wu’s H) were also inferred under a probabilistic framework using ANGSD. V. rotundifolia () was used as the outgroup to polarize the ancestral state of alleles at each polymorphic site. All statistics were summarized across the genome using a sliding-window approach.
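As a rough illustration of two of the statistics listed above, the following minimal Python sketch computes Watterson's θw and π for a single window. It works on called haplotypes, which is a simplifying assumption: the analysis described here inferred these quantities under a probabilistic framework in ANGSD, not from hard calls. The function names and the toy window are hypothetical.

```python
from itertools import combinations


def segregating_sites(haplotypes):
    """Count alignment columns where more than one allele is observed."""
    return sum(1 for column in zip(*haplotypes) if len(set(column)) > 1)


def watterson_theta(haplotypes):
    """Watterson's estimator: S divided by the harmonic number a_n."""
    n = len(haplotypes)
    a_n = sum(1.0 / i for i in range(1, n))
    return segregating_sites(haplotypes) / a_n


def nucleotide_diversity(haplotypes):
    """pi: mean number of pairwise differences between haplotypes."""
    pairs = list(combinations(haplotypes, 2))
    total = sum(sum(a != b for a, b in zip(h1, h2)) for h1, h2 in pairs)
    return total / len(pairs)


# Hypothetical toy window: four haplotypes, six sites (not real data).
toy = ["AAGTCA", "AAGTCA", "ATGTCA", "ATGACA"]
theta_w = watterson_theta(toy)
pi = nucleotide_diversity(toy)
```

Both values are per window, not per site. Tajima's D is built on the difference π − θw, so its sign in a window follows from comparing these two estimates.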

IBD estimation and SNP calling

To examine the relationships among grape cultivars, we performed IBD analysis using the probabilistic methods implemented in ANGSD (). We applied stringent criteria for SNP calling, with a posterior probability cutoff of 0.95 and an SNP P value of 1 × 10⁻⁹, to include only highly supported SNPs. Subsequently, the SNPs were used to calculate IBD for all pairwise comparisons among the 100 samples using PLINK (), applying the following filters: maf 0.05 and geno 0.05.
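The two PLINK filters mentioned above can be illustrated with a short Python sketch. It assumes the usual PLINK semantics, in which maf 0.05 drops variants whose minor allele frequency is below 0.05 and geno 0.05 drops variants with more than 5% missing genotype calls. The function name and the toy genotype vectors are hypothetical, and this is not the PLINK implementation.

```python
def passes_filters(genotypes, min_maf=0.05, max_missing=0.05):
    """Apply PLINK-style maf / geno thresholds to one biallelic SNP.

    genotypes: per-sample alternate-allele counts (0, 1, 2), None if missing.
    """
    called = [g for g in genotypes if g is not None]
    missing_rate = (len(genotypes) - len(called)) / len(genotypes)
    if not called or missing_rate > max_missing:
        return False
    alt_freq = sum(called) / (2.0 * len(called))
    maf = min(alt_freq, 1.0 - alt_freq)
    return maf >= min_maf


# Hypothetical SNPs typed in 20 samples (not real data).
common_snp = [0] * 10 + [1] * 6 + [2] * 4   # MAF 0.35, no missing calls
rare_snp = [0] * 19 + [1]                   # MAF 0.025
gappy_snp = [None] * 2 + [1] * 18           # 10% missing calls

kept = [passes_filters(s) for s in (common_snp, rare_snp, gappy_snp)]
```

Only the first SNP survives: the second fails the frequency threshold and the third fails the missingness threshold.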

Admixture test using Patterson’s D statistic

Patterson’s D statistic (or ABBA-BABA test) assumes three populations (P1, P2, and P3) and one outgroup (O), which are phylogenetically related as (((P1,P2),P3),O). Here, we estimated all permutations of the six groups of interest (WEAST, WIBERIA, CTABLE, CwWCE, CwIB1, and CwIB2) as P1, P2, and P3. V. rotundifolia served as the outgroup (O). We computed Patterson’s D statistic using allele frequencies instead of binary counts of fixed ABBA-BABA sites, as implemented in the ABBABABA2 (Multipopulation) function in ANGSD (), using nonoverlapping 20-Kbp windows. We calculated f̂ to assess the fraction of the genome shared through introgression () in the comparison P1 = CwIB1, P2 = CwIB2, P3 = WIBERIA, and O = V. rotundifolia. We estimated D(P1,P2,P2,O) and D(P1,P3,P3,O) in ANGSD as previously described, which allowed us to obtain, for each genomic window, a donor population PD (the population with the higher frequency of the derived allele).
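The frequency-based form of Patterson's D can be sketched in a few lines of Python. This is an illustration of the general statistic for one window, assuming derived-allele frequencies are already available for each population. It is not the ANGSD ABBABABA2 implementation, and the function name and toy frequencies are hypothetical.

```python
def patterson_d(p1, p2, p3, p_out=None):
    """Frequency-based Patterson's D for one genomic window.

    p1, p2, p3: derived-allele frequencies per site in P1, P2, and P3.
    p_out: derived-allele frequencies in the outgroup; defaults to 0.0,
           i.e. the outgroup is assumed fixed for the ancestral allele.
    """
    if p_out is None:
        p_out = [0.0] * len(p1)
    abba = baba = 0.0
    for f1, f2, f3, fo in zip(p1, p2, p3, p_out):
        abba += (1.0 - f1) * f2 * f3 * (1.0 - fo)   # P2 and P3 share derived
        baba += f1 * (1.0 - f2) * f3 * (1.0 - fo)   # P1 and P3 share derived
    denom = abba + baba
    return (abba - baba) / denom if denom > 0 else float("nan")


# Hypothetical derived-allele frequencies at three sites (not real data).
freq_p1 = [0.0, 0.1, 0.5]
freq_p2 = [0.6, 0.4, 0.5]
freq_p3 = [1.0, 0.8, 0.2]

d_value = patterson_d(freq_p1, freq_p2, freq_p3)
d_swapped = patterson_d(freq_p2, freq_p1, freq_p3)
```

A positive value indicates an excess of derived alleles shared between P2 and P3, and swapping P1 and P2 flips the sign. In practice, significance is usually assessed across windows, for example with a block jackknife, which this sketch omits.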

DCMS analysis of positive selection signatures

We calculated four separate statistics that differ in their approach to detecting selection events based on the type of selection signal that is targeted: genetic differentiation (FST), shifts in the allele frequency spectrum of mutations [Delta Tajima’s D (∆TD) and Fay and Wu’s H], and reduction in genetic diversity from pairwise nucleotide diversity measures (ROD). Statistics were based on FST, Tajima’s D, Fay and Wu’s H, and nucleotide diversity (π), calculated across the genome (100-Kbp windows, 50-Kbp steps) as previously reported. The de-correlated composite of multiple signals (DCMS) was then used to summarize the four different statistics, taking the covariance of the statistics into account (). Comparisons of interest were defined as those that confronted the WEAST group with the remaining five groups. Genes of interest were considered for genomic windows above the 95th percentile of the distribution.
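The windowing and outlier steps described above can be sketched as follows. This Python example assumes a composite score per window is already available. It does not compute DCMS itself, and the function names, the linear-interpolation percentile, and the toy values are illustrative assumptions.

```python
def sliding_windows(chrom_length, size=100_000, step=50_000):
    """Yield (start, end) half-open windows covering a chromosome."""
    start = 0
    while start < chrom_length:
        yield start, min(start + size, chrom_length)
        start += step


def top_windows(scores, percentile=95.0):
    """Return indices of windows whose score exceeds the given percentile.

    The cutoff uses linear interpolation between order statistics.
    """
    ranked = sorted(scores)
    pos = (len(ranked) - 1) * percentile / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ranked) - 1)
    cutoff = ranked[lo] + (ranked[hi] - ranked[lo]) * (pos - lo)
    return [i for i, s in enumerate(scores) if s > cutoff]


# Hypothetical 250-Kbp chromosome and 100 made-up window scores.
windows = list(sliding_windows(250_000))
scores = [float(i) for i in range(100)]
outliers = top_windows(scores)
```

With 100-Kbp windows and 50-Kbp steps, consecutive windows overlap by half, so a single strong signal typically appears in two adjacent windows. Windows returned by `top_windows` would then be intersected with gene coordinates to list candidate genes.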

Analysis of genes of interest

Gene annotation was retrieved from PANTHER (www.pantherdb.org) and UniProtKB (www.uniprot.org/uniprot/). For the 76 cross-referenced genes of interest, protein FASTA sequences were retrieved from UniProtKB and used to perform a BlastP search in NCBI (https://blast.ncbi.nlm.nih.gov/Blast.cgi) against the nonredundant protein sequences (nr) database and the Arabidopsis thaliana RefSeq database. GO terms (GO biological process complete) were subjected to statistical overrepresentation testing in PANTHER.