
Construction of nested genetic core collections to optimize the exploitation of natural diversity in Vitis vinifera L. subsp. sativa.

Loïc Le Cunff, Alexandre Fournier-Level, Valérie Laucou, Silvia Vezzulli, Thierry Lacombe, Anne-Françoise Adam-Blondon, Jean-Michel Boursiquot, Patrice This.

Abstract

BACKGROUND: The first high quality draft of the grape genome sequence has just been published. This is a critical step in accessing all the genes of this species and increases the chances of exploiting the natural genetic diversity through association genetics. However, our basic knowledge of the extent of allelic variation within the species is still not sufficient. Towards this goal, we constructed nested genetic core collections (G-cores) to capture the simple sequence repeat (SSR) diversity of the grape cultivated compartment (Vitis vinifera L. subsp. sativa) from the world's largest germplasm collection (Domaine de Vassal, INRA Hérault, France), containing 2262 unique genotypes.
RESULTS: Sub-samples of 12, 24, 48 and 92 varieties of V. vinifera L. were selected based on their genotypes for 20 SSR markers using the M-strategy. They represent respectively 58%, 73%, 83% and 100% of total SSR diversity. The capture of allelic diversity was analyzed by sequencing three genes scattered throughout the genome on 233 individuals: 41 single nucleotide polymorphisms (SNPs) were identified using the G-92 core (one SNP for every 49 nucleotides) while only 25 were observed using a larger sample of 141 individuals selected on the basis of 50 morphological traits, thus demonstrating the reliability of the approach.
CONCLUSION: The G-12 and G-24 core collections displayed 78% and 88% of the SNPs respectively, and are therefore of great interest for SNP discovery studies. Furthermore, the nested genetic core collections satisfactorily reflected the geographic and genetic diversity of grape, and are therefore also of great interest for the study of gene evolution in this species.

Year:  2008        PMID: 18384667      PMCID: PMC2375891          DOI: 10.1186/1471-2229-8-31

Source DB:  PubMed          Journal:  BMC Plant Biol        ISSN: 1471-2229            Impact factor:   4.215


Background

The study of natural allelic diversity has proved fruitful in understanding the genetic basis of complex traits [1-6]. However, exploiting it successfully through association genetics requires basic knowledge of the extent of allelic variation within a species. One of the most interesting ways to achieve this goal consists of developing high-density diversity maps like those developed in human and chicken, which allow the identification of causal polymorphisms for important traits [7-10]. The recent publication of the first high quality draft of the grapevine genome sequence opens the way to building such a diversity map [11]. As in animals or in other perennial plant species where genetic approaches based on the study of segregating populations are hampered by a long biological cycle, association genetics is of particular interest in grapevine. The development of a diversity map relies on the discovery of sequence polymorphisms in the genome in a small set of genotypes that are as representative as possible of the available genetic diversity. Such a concept was first proposed by Frankel and Brown under the name of core collection [12]. Core collections can be built using different types of markers. For example, molecular markers were used for rice, wheat and potato, while for yam a core collection was built using the origin of cultivars, eating quality, tuber shape, tuber flesh colour, and morphotype [13-16]. Different strategies have been proposed to assist the construction of core collections, including the M-method developed by Schoen and Brown and implemented in the software MSTRAT [17-20]. This strategy has been successfully used for the construction of core collections in Arabidopsis thaliana and Medicago truncatula and was also proposed as a preliminary step in association genetics [21-24]. Large collections of genetic resources are available for grapevine, especially in Europe [25].
The largest one is held by INRA in France at the Domaine de Vassal: this collection contains 7000 accessions of the genus Vitis of worldwide origin [26]. The genotyping of the whole collection using 20 well-scattered SSR markers is complete (Laucou et al., in prep). The cultivated compartment (V. vinifera L. subsp. sativa) is represented by 3900 accessions corresponding to 2262 unique genotypes (Laucou et al., in prep), from 38 different countries. It represents about half of the known grapevine cultivars [27]. The Vassal collection was highly diverse for V. vinifera L. subsp. sativa, exhibiting a total of 326 alleles for the 20 SSR markers with an average of 16.3 SSR alleles per locus (Laucou et al., in prep). Moreover, a large proportion of these alleles (17%) were present at very low frequency (freq < 0.05%). A first core collection (M-core) in grape was recently developed based on 50 morphological traits on 1759 accessions from the Vassal collection [28]. It was used for a preliminary study of the extent of linkage disequilibrium (LD) in V. vinifera L. as well as for association studies [28,29]. However, the size of the M-core (141 individuals) limits its use for the analysis of wide sequence diversity. Here we present the use of the data set obtained by Laucou et al. (in prep) to develop four nested genetic core collections (G-cores) suitable for the search for allelic diversity. The ability to retain the SSR genetic diversity using different sample sizes was studied and compared to the SSR diversity present in the M-core and in the Vassal collection. Finally, the allelic diversity captured at the sequence level in the different sub-cores was analysed by sequencing three gene fragments. This work provides the foundation required for the development of a detailed map of haplotypic diversity in grapevine.

Results

Construction of nested core collections representing the available germplasm diversity of cultivated V. vinifera L

We first determined the optimal size of a core collection by retaining the 271 alleles showing a frequency above 0.05%. Forty-eight cultivars were necessary to capture 100% of the 271 alleles (Figure 1A). Within this core collection of 48 cultivars (G-48), we then determined the two most diverse samples of 12 (G-12) and 24 (G-24) cultivars (Table 1). To assess the robustness of these nested core collections, we repeated the same process in a second run and calculated the percentage of varieties shared between the two runs: 83.3% (10) of the varieties in the two G-12 cores, 83.3% (20) in the two G-24 cores and 60.4% (29) in the two G-48 cores. Between these two sets of samples, the G-48 core collection presenting the highest Nei's index was chosen as the reference core collection (Table 2).
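The robustness check described above amounts to counting the varieties shared by two independently built cores of the same size. A minimal sketch follows; the function name and the variety labels are hypothetical and serve only to reproduce the 10-out-of-12 case.

```python
def shared_fraction(run_a, run_b):
    """Return (count, percentage) of varieties shared by two equal-sized cores."""
    a, b = set(run_a), set(run_b)
    if len(a) != len(b) or not a:
        raise ValueError("both runs must be non-empty and of equal size")
    shared = len(a & b)
    return shared, 100.0 * shared / len(a)


# Hypothetical variety labels, chosen only so that 10 of 12 coincide.
run_1 = [f"var{i}" for i in range(12)]
run_2 = [f"var{i}" for i in range(2, 14)]
count, pct = shared_fraction(run_1, run_2)
print(count, round(pct, 1))
```

The same function applied to two cores of 48 varieties sharing 29 members gives about 60.4%.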
Figure 1

Redundancy curves obtained using MSTRAT software. Redundancy curves with standard deviation obtained using MSTRAT software (five independent samplings). Determination of the optimal size allowed the capture of all alleles of the original sample. A. For the 271 alleles of the restricted Vassal collection using the M-method (blue dot) and random sampling method (pink dot). B. For the 326 alleles of the Vassal collection using the G-48 core as core using the M-method (blue dot) and random sampling method (pink dot).

Table 1

SSR diversity within each sample of the G-core compared to the Vassal collection with and without the rare allele (Restricted Vassal collection).

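| Sample name | Size | Number of alleles | Nei's index | Observed heterozygosity | Percentage of total SSR diversity | Percentage of restricted SSR diversity | Correlation of SSR frequency with Vassal collection (R2) |
|---|---|---|---|---|---|---|---|
| Vassal collection | 2262 | 326 | 0.76 | 0.75 | 100% | 100% | |
| Restricted Vassal collection | 2262 | 271 | 0.76 | 0.75 | 83% | 100% | 1 |
| G-12 core | 12 | 191 | 0.83 | 0.80 | 58% | 70% | 0.77 |
| G-24 core | 24 | 239 | 0.83 | 0.81 | 73% | 88% | 0.85 |
| G-48 core | 48 | 271 | 0.82 | 0.80 | 83% | 100% | 0.92 |
| G-92 core | 92 | 326 | 0.81 | 0.78 | 100% | 100% | 0.94 |
| M-core | 141 | 227 | 0.76 | 0.75 | 70% | 81% | 0.98 |

Table 2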

Nested genetic core collection of 12 to 92 varieties.* Varieties bred from cultivars of different geographical origin: the countries listed are breeding locations.

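| Size | Variety name | Variety number | Country | Nbr of alleles |
|---|---|---|---|---|
| 12 | Tsolikouri | 2668 | Georgia | |
| 12 | Voskeat | 2511 | Armenia | |
| 12 | Kapistoni tétri hermaphrodite (Coll. Kichinev) | 3242 | Georgia | |
| 12 | Lameiro | 3380 | Portugal | |
| 12 | Médouar | 3381 | Israel | |
| 12 | Chirai obak | 1186 | Tajikistan | |
| 12 | Espadeiro tinto | 1498 | Portugal | |
| 12 | Araklinos | 1805 | Greece | |
| 12 | Plant du Maroc E (Coll. Meknès) | 2158 | Morocco | |
| 12 | César | 225 | France | |
| 12 | Orlovi nokti | 2461 | Russia | |
| 12 | Tsitsa Kaprei | 2471 | Moldavia | 191 |
| 24 | Variété d'oasis Bou Chemma 46 | 3281 | Tunisia | |
| 24 | Uburebekur | 3270 | Romania | |
| 24 | Chouchillon | 192 | France | |
| 24 | Mehdik | 2082 | Iran | |
| 24 | Assyl kara | 2505 | Russia | |
| 24 | Pervenetz praskoveïsky | 2651 | Russia | |
| 24 | Pletchistik | 2652 | Russia | |
| 24 | Ak ouzioum tagapskii | 2897 | Kyrgyzstan | |
| 24 | Orbois | 294 | France | |
| 24 | Cabernet franc | 324 | France | |
| 24 | Katta-kourgan | 556 | Uzbekistan | |
| 24 | Kichmich tcherni | 3264 | Turkey | 239 |
| 48 | Tandanya faux | 3279 | Australia* | |
| 48 | Veltliner rot | 284 | Austria | |
| 48 | Yapincack faux | 3292 | Turkey | |
| 48 | Frühe Meraner | 3183 | Italy | |
| 48 | Kisilovy | 3349 | Russia | |
| 48 | Lumassina | 3312 | Italy | |
| 48 | Mourisco (Coll. EVV Amandio Galhano) | 3379 | Portugal | |
| 48 | Raisin banane noir | 3384 | Algeria | |
| 48 | Riesling bleu | 3073 | France | |
| 48 | Frappato di Vittoria | 1318 | Italy | |
| 48 | Tinto Cao | 1488 | Portugal | |
| 48 | Ag isioum | 1563 | Dagestan | |
| 48 | Orangetraube | 1569 | Germany | |
| 48 | Onusta | 1980 | Italy* | |
| 48 | Malvasia di Sardegna | 2166 | Italy | |
| 48 | Armenia | 2267 | Armenia* | |
| 48 | Jo Rizling | 2563 | Hungary* | |
| 48 | Krakhouna | 2638 | Georgia | |
| 48 | Portan | 2796 | France* | |
| 48 | Misguli kara | 2917 | Ukraine | |
| 48 | Bayadi du Liban | 2998 | Lebanon | |
| 48 | Bakarka | 3008 | Hungary | |
| 48 | Catanese nero | 2398 | Italy | |
| 48 | Retagliado bianco | 67 | Italy | 271 |
| 92 | Istchak rouge | 3272 | Uzbekistan | |
| 92 | Verdelho tinto | 3205 | Portugal | |
| 92 | Fantasy seedless | 3051 | USA* | |
| 92 | Kaisi baladi | 3219 | Syria | |
| 92 | Malahy | 3238 | Iran | |
| 92 | Koutlaksky belyi | 3160 | Ukraine | |
| 92 | Variété d'oasis Tozeur 17 | 3228 | Tunisia | |
| 92 | Long Yan | 3142 | China | |
| 92 | Plant de Querol 98-N-2 (Coll. Torres SA) | 3304 | Spain | |
| 92 | Albarola rossa faux (Coll. Pisa) | 3329 | Italy | |
| 92 | Barbera selvatico del Grosseto | 3320 | Italy | |
| 92 | Doppel Augen | 3151 | Azerbaijan | |
| 92 | Duc de Magenta | 819 | France* | |
| 92 | Graeco | 3224 | Tunisia | |
| 92 | Lambrusco del Caset | 3181 | Italy | |
| 92 | Badagui | 3156 | Georgia | |
| 92 | Moscatel de Oeiras faux (Coll. Bordeaux) | 3266 | unknown | |
| 92 | Nero grosso | 3176 | Italy | |
| 92 | Agoumastos | 3386 | Greece | |
| 92 | Rich baba rose faux | 3154 | Russia | |
| 92 | Colorino | 1353 | Italy | |
| 92 | Uva de Rey | 1395 | Spain | |
| 92 | Tinta castellõa | 1540 | Portugal | |
| 92 | Alburla | 1606 | Ukraine | |
| 92 | Korithi aspro | 1766 | Greece | |
| 92 | Canner seedless | 1833 | USA* | |
| 92 | Agourane | 1898 | Algeria | |
| 92 | Morlin gris | 2067 | France | |
| 92 | Askari | 2081 | Iran | |
| 92 | Bogazkere | 2104 | Turkey | |
| 92 | Jeludovii | 2253 | Romania | |
| 92 | Tchilar | 2274 | Armenia | |
| 92 | Peygamber üzümü | 2340 | Turkey | |
| 92 | Lambrusco viadanese | 2351 | Italy | |
| 92 | Vernaccia di San Gimignano | 2360 | Italy | |
| 92 | Alexandroouli | 2500 | Georgia | |
| 92 | Malaga II (Dumas) | 2570 | France* | |
| 92 | Sapéré otskhanouri | 2655 | Georgia | |
| 92 | Khindogny | 2664 | Iran | |
| 92 | Yapincak | 2768 | Turkey | |
| 92 | Arna-guirna | 2899 | Azerbaijan | |
| 92 | Romorantin | 304 | France | |
| 92 | Mandilaria | 341 | Greece | |
| 92 | Mauzac faux de Cahuzac | 357 | France | 326 |
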
The G-48 core was then used as a core to build the final core collection retaining the 326 alleles found in the cultivated compartment of Vitis vinifera L. represented in the Vassal collection (Laucou et al., in prep). The optimal size of this final core collection was 92 individuals (Fig. 1B). The cultivars added at this step contributed only rare alleles (freq < 0.05%, present in fewer than 3 copies), which left less choice for the selection of varieties. Indeed, only two alternative samples were proposed by MSTRAT, with only one individual differing between the two samples: Rich baba rose faux versus Kizil. Again, we selected the G-92 presenting the highest value for Nei's index as the reference core collection for the cultivated compartment of V. vinifera L.; the resulting final core collection is listed in Table 2. To estimate the gain in SSR allelic diversity, we compared the number of alleles captured in samples obtained by the M-method and by random sampling. In each case, the M-method yielded a gain (Table 3), the greatest being obtained for the selection of the G-48.
Table 3

Gain obtained using the M-method at each step of the construction of the nested core collection versus random sampling.

| Original collection | Sample size | M-method (mean number of alleles for 5 runs) | Random sampling (mean number of alleles for 5 runs) | Gain using M-method |
|---|---|---|---|---|
| Vassal with G-48 used as core | 92 individuals | 326 | 278.2 (+/- 1.3) | 15% |
| Restricted Vassal collection (without rare alleles freq < 0.05%) | 48 individuals | 269.8 (+/- 1.6) | 185.2 (+/- 5.7) | 31% |
| G-48 (without rare alleles freq < 0.05%) | 24 individuals | 238.2 (+/- 0.4) | 218.8 (+/- 6.5) | 8% |
| G-24 (without rare alleles freq < 0.05%) | 12 individuals | 190.8 (+/- 0.4) | 177.8 (+/- 1.6) | 6% |
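The gain column of Table 3 can be approximated from the two mean allele counts. The sketch below assumes the gain is the difference between the two methods expressed as a percentage of the M-method count; this definition is an assumption, not stated in the text. Under it, the first three rows round to the reported 15%, 31% and 8%, while the last row comes out near 6.8% against the reported 6%, so the authors may have used a slightly different formula or rounding.

```python
# Mean allele counts transcribed from Table 3: (M-method, random sampling).
TABLE_3 = {
    "G-92": (326.0, 278.2),
    "G-48": (269.8, 185.2),
    "G-24": (238.2, 218.8),
    "G-12": (190.8, 177.8),
}


def relative_gain(m_method, random_sampling):
    """Gain as a percentage of the M-method allele count (assumed definition)."""
    return 100.0 * (m_method - random_sampling) / m_method


gains = {name: relative_gain(m, r) for name, (m, r) in TABLE_3.items()}
for name, gain in gains.items():
    print(f"{name}: {gain:.1f}%")
```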

Analysis of the diversity retained in the nested core collections using different descriptors

The reference nested core collections for the cultivated compartment of Vitis vinifera were described for several characteristics and compared to the Vassal collection and to the M-core collection (141 individuals) defined by Barnaud et al. [28].

SSR diversity

The nested core collections represented 58% to 100% of the total SSR diversity of the Vassal collection and 70% to 100% of the restricted SSR diversity of the Vassal collection (only considering alleles with frequencies higher than 0.05%) (Table 1). All the SSR alleles with a frequency of more than 5% within the Vassal collection are present in the G-12 core and all those with a frequency of more than 3.5% within the Vassal collection are present in the G-24. The values of the unbiased Nei's diversity index and the level of unbiased observed heterozygosity for the G-12 core, G-24 core and G-48 core collection were quite similar and slightly higher than those calculated for the G-92 core collection. These values were slightly higher than those of the Vassal collection and of the M-core (Table 1). We also compared allele frequencies of the SSR markers in the three G-cores and in the M-core with the frequencies observed in the Vassal collection: the best correlation was obtained between the Vassal collection and the M-core (r2 = 0.98) and the G-92 (r2 = 0.92) core collections (Table 1).
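For readers unfamiliar with the two statistics compared in Table 1, the following sketch computes them for a single locus. It assumes the common small-sample correction 2n/(2n-1) for Nei's gene diversity and a plain (uncorrected) observed heterozygosity; the exact estimators used in the study may differ, and the genotypes are hypothetical. Multi-locus values would be obtained by averaging over loci.

```python
from collections import Counter


def nei_unbiased_diversity(genotypes):
    """Unbiased gene diversity at one locus from diploid genotypes (allele pairs)."""
    alleles = [a for pair in genotypes for a in pair]
    n_copies = len(alleles)  # 2n gene copies for n diploid individuals
    if n_copies < 2:
        raise ValueError("need at least one diploid genotype")
    sum_p2 = sum((c / n_copies) ** 2 for c in Counter(alleles).values())
    return n_copies / (n_copies - 1) * (1.0 - sum_p2)


def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles at the locus."""
    return sum(a != b for a, b in genotypes) / len(genotypes)


# Hypothetical SSR genotypes (allele sizes in bp) for four individuals.
locus = [(180, 184), (180, 180), (184, 190), (180, 190)]
he = nei_unbiased_diversity(locus)
ho = observed_heterozygosity(locus)
print(round(he, 3), ho)
```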

Geographic origin and final uses

The definition of the true geographical origin of grapevine cultivars is sometimes difficult due to the many migration events that accompanied human movements [30]. Based on current knowledge, the cultivars held in the Vassal collection originated from 38 countries, with about half of them from Western Europe (France, Iberian Peninsula and Italy). The cultivars selected in the nested core collection originated from 27 different countries (Table 2). However, 10 varieties of the G-92 sample could not be assigned to a precise geographical origin. Among them, 9 varieties were recent crosses between varieties from different countries (indicated by * in Table 2) and one has an unknown origin (Moscatel de Oeiras faux). Microsatellite data for Moscatel de Oeiras faux seemed to indicate a Western European origin when compared to the whole collection. The origins of the 82 remaining varieties were well distributed (Figure 2): 21 (25%) came from the Caspian region (Dagestan, Georgia, Armenia and Azerbaijan) and the Middle East (Iran), which corresponds to the center of domestication, and 35 (42%) came from Western Europe and North Africa (Iberian Peninsula, Morocco, Algeria, Tunisia, Italy and France) (Table 4). Interestingly, five varieties (6% of the G-92) originated from Central Asia and Asia despite their very limited representation in the whole collection (less than 2%).
Figure 2

Probable geographic origin of the varieties contained in the nested genetic core collections. Each triangle corresponds to one variety, red triangles correspond to the first sub-sample of the nested genetic core collection (G-12), yellow triangles to the second sub-sample (G-24), black triangles to the third sub-sample (G-48) and green triangles to the fourth sub-sample (G-92). Ten varieties belonging to the Core G-92 did not have a precise geographical origin and are not shown on this map.

Table 4

Distribution of the geographical origin and the final use of the cultivars in the different samples

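| Region or final uses | Western Europe and North Africa | Center of domestication | Asia and Central Asia | Other area | Wine cultivars | Table cultivars | Wine and table cultivars |
|---|---|---|---|---|---|---|---|
| Vassal collection | 56% | 3% | 1.6% | 39.4% | 55% | 36% | 9% |
| M-core | 58% | 7.2% | 0.9% | 33.9% | 63% | 30% | 7% |
| G-12 core | 33% | 33% | 8% | 26% | 67% | 33% | 0% |
| G-24 core | 33% | 33% | 12.5% | 21.5% | 58.5% | 37.5% | 4% |
| G-48 core | 37.5% | 23% | 6.25% | 33.25% | 56% | 31% | 12.5% |
| G-92 core | 42% | 25% | 6% | 27% | 56% | 32% | 12% |
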
No differences were observed between the M-core (22 countries) and the Vassal collection (Table 4), whereas all the G-cores differed from the Vassal collection. Indeed, the numbers of cultivars from Western Europe and from the center of domestication were more balanced in the G-92 core, with a very good representation of the whole set of geographic origins. The same trend was observed in the different sub-cores (Table 4). We also compared the G-92 core collection, the M-core and the Vassal collection with respect to the final use of the cultivars: wine making (wine cultivars), fruit consumption (table cultivars) or both (wine/table cultivars). The M-core and the different G-cores all resembled the Vassal collection (Table 4).

Evaluation of the capture of unlinked diversity in the nested core collections

Next, we assessed the ability of the nested G-core samples to capture diversity unlinked to the SSR markers used to build the nested core collection. Using 38 SSR markers mapped on five different linkage groups (LG) with a maximum distance of 30 cM, Barnaud et al. estimated that LD in grape extends only within LG and over about 16.8 cM at most [28]. We analysed the polymorphism of three gene fragments mapped further than 16.8 cM from the SSR markers in the same linkage group: DFR mapped in LG 18, 25.3 cM from the SSR marker VVIn16; L-DOX mapped in LG 8, 26 cM from the SSR marker VMC1b11; and BURP mapped in LG 3, 26 cM from the SSR marker VVMD28. Forty-one nucleotide polymorphisms (40 substitutions and 1 in/del) were observed in the G-92, ranging from 12 to 15 depending on the gene fragment (Table 5). The total polymorphism is thus one SNP for every 49 nucleotides. The number of SNPs per base also varied between the three gene fragments: one SNP for every 58 nucleotides for DFR, one SNP for every 42 nucleotides for L-DOX and one SNP for every 50 nucleotides for BURP. The difference in genetic diversity between coding and non-coding regions was estimated only for the DFR sequence, in which the two types of regions are of similar length. For this gene the polymorphism differed between coding and non-coding regions by a ratio of 3.2 (one SNP for every 127 nucleotides in the coding region versus one SNP for every 39 nucleotides in the non-coding region). Considering all genes together, the number of SNPs detected increased from 32 to 36 between the G-12 and the G-24 cores and from 36 to 40 between the G-24 and the G-48 cores. Only one more SNP was discovered in the G-92 core than in the G-48 core, in the L-DOX gene fragment (this SNP is present in two varieties: Œil de Dragon and Badagui).
The higher number of SNPs in the G-48 than in the G-24 core was due to two genotypes: Yapincack with three additional SNPs in the DFR gene fragment and Kisilowy with one additional SNP in the L-DOX gene fragment.
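The SNP densities quoted above follow from dividing fragment length by the number of polymorphic sites. The short sketch below redoes that arithmetic with the sizes and counts listed in Table 5 for the G-92 core; the variable names are illustrative. With these figures the overall, DFR and L-DOX values match the text, whereas BURP comes out at about 47 rather than the 50 quoted, which suggests the tabulated sizes are rounded.

```python
# Fragment lengths (nt) and polymorphic sites in the G-92 core, from Table 5.
FRAGMENTS = {
    "DFR": {"length": 810, "snps": 14},
    "L-DOX": {"length": 500, "snps": 12},
    "BURP": {"length": 700, "snps": 15},
}


def nt_per_snp(length, snps):
    """Average number of nucleotides per polymorphic site."""
    return length / snps


total_length = sum(f["length"] for f in FRAGMENTS.values())
total_snps = sum(f["snps"] for f in FRAGMENTS.values())
overall = nt_per_snp(total_length, total_snps)

# DFR split into coding (380 nt, 3 sites) and non-coding (430 nt, 11 sites) parts.
dfr_coding = nt_per_snp(380, 3)
dfr_noncoding = nt_per_snp(430, 11)
ratio = dfr_coding / dfr_noncoding
print(round(overall), round(dfr_coding), round(dfr_noncoding), round(ratio, 1))
```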
Table 5

Number of polymorphic bases (SNP or insertion deletions found in the DNA fragments)

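Core collection studied: G-12, G-24, G-48, G-92 and M core

| Gene | Total size (exon size/intron size) | G-12 | G-24 | G-48 | G-92 | M core | Total number in exon | Total number in intron | Total number |
|---|---|---|---|---|---|---|---|---|---|
| DFR (gi 499017) | 810 nt (380 nt/430 nt) | 10 | 11 | 14 | 14 | 7 | 3 | 11 | 14 |
| L-DOX (gi 22010674) | 500 nt (459 nt/41 nt) | 9 | 10 | 11 | 12 | 8 | 12 | 0 | 12 |
| BURP (gi 22014825) | 700 nt (700 nt/0 nt) | 13 | 15 | 15 | 15 | 10 | 15 | 0 | 15 |
| Total | 2010 nt (1539 nt/471 nt) | 32 | 36 | 40 | 41 | 25 | 30 | 11 | 41 |
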
The ability of the G-24 and G-12 cores to capture unlinked diversity was estimated by comparing their SNP diversity with that of five random samples of 24 individuals drawn from the G-48 core and five random samples of 12 individuals drawn from the G-24 core. The number of SNPs varied from 35 to 37 in the five random samples of 24 individuals and from 30 to 34 in the five random samples of 12 individuals. To compare SNP distribution, we also calculated the unbiased Nei's index, which varied from 0.24 to 0.25 for the five random samples of 24 individuals and from 0.30 to 0.32 for the five random samples of 12 individuals. The unbiased Nei's index of the G-24 and G-12 cores was 0.28 and 0.33 respectively. Estimating the unlinked diversity within the whole Vassal collection (2262 cultivars) would have been very tedious. Consequently, we compared the capture of unlinked diversity in the nested core collections and in the M-core, which was developed from morphological traits only. The total number of SNPs in the M-core (25 SNPs; Table 5) was smaller than in any of the nested G-core samples, even the G-12 core (32 SNPs; Table 5). Moreover, none of the SNPs observed in the M-core was new compared to those found in the nested core collections.

Discussion

In the present work, we developed a set of nested core collections from the cultivated compartment of the Vassal collection, using the M-method and SSR diversity data obtained on 2262 unique genotypes. However, in this way we did not take into account the somatic variants present within V. vinifera L. cultivated germplasm. The usefulness of core collections is due to their ability to capture the diversity of the whole species. Even the smallest nested core collections were more efficient in capturing allelic diversity than the M-core with its 141 accessions. The Vassal collection, which formed the basis of this work, includes around 3900 accessions which correspond to 2262 unique SSR genotypes from 38 countries, including the main domestication area. This represents more than half the varieties found worldwide [27]. A core collection developed from the Vassal collection is thus of major interest for the scientific community and, thanks to the vegetative propagation ability of grape, could be easily multiplied and distributed.

Construction of the core collections

The first result of our work is the fact that only a small number of cultivars (92 individuals, 4% of the Vassal collection) are needed to represent the whole diversity and an even smaller number of cultivars are needed to capture all the most frequent alleles (48 individuals, 2.1%). The comparison with other models is not easy, as they have different biological characteristics, the original collection did not reach the same global diversity of the species, and the analyses are seldom performed in the same way. Nevertheless, the core collections developed for A. thaliana (18%) or M. truncatula (31%) using the same method required higher percentages of individuals selected to represent all the genetic diversity [21,22]. In our study we only considered the cultivated compartment which tends to be less diverse than wild compartments [31]. But the high level of heterozygosity of the grapevine is probably also one of the factors that allow a lower number of individuals than homozygous species like the two plant species mentioned above. Finally, the small number of individuals needed to represent the genetic diversity of the cultivated grapevine also pinpointed the high redundancy of the Vassal collection where many kingroups were highlighted and the interest in using such core collections to optimize the study of the phenotypic and genetic diversity in grapevine [32,33].

Nested core collections are of great interest for identifying the sequence diversity that exists in the cultivated compartment of the V. vinifera species

The total genetic diversity revealed in the sequences of three gene fragments (2010 bp) in the G-92 core was quite high with 41 SNPs, i.e. one SNP for every 49 nucleotides. This is substantially higher than the level of genetic diversity observed in the M-core on the same gene set. Moreover, it is higher than the level of genetic diversity observed by Salmaso et al. on another set of 25 gene fragments totalling 12 kilobases sequenced on seven cultivated individuals (one SNP for every 118 nucleotides), and by Lijavetzky et al. on a set of 230 gene fragments, representing the analysis of over 1 Mb of grape DNA sequence in 11 grape genotypes (one SNP for every 64 nucleotides) [34,35]. This comparison thus emphasises the interest of such a core collection for the discovery of genetic diversity. Among cultivated species, polymorphism in grape is relatively high compared to Zea mays (one SNP for every 100 nucleotides), Pinus pinaster (one SNP for every 102 nucleotides) and Hordeum vulgare (one SNP for every 78 nucleotides), while it is relatively low compared to wild species such as A. thaliana (one SNP for every 32 nucleotides) [21,36-38].

G-48 core is highly diverse and non-redundant

The G-92 core was built taking into account extremely rare alleles. Considering the rapid evolution of SSR markers, we assumed that the alleles present in two cultivars or fewer in the collection did not adequately represent gene diversity, and they were thus removed when we built the G-48 core [39-41]. Indeed, only one additional SNP was revealed in the G-92 sample (present in two cultivars and not in the M-core) compared to the G-48, thus validating our assumption. On the one hand, the gain in unlinked diversity was high in the G-48, probably due to the decrease in redundancy compared to the Vassal collection (revealed by the number of kingroups). On the other hand, when compared to random sampling, the gain was much higher using the M-method. The final G-48 core is highly non-redundant and highly diverse. Moreover, the G-48 core optimized the unlinked diversity in the three sequenced regions compared to the M-core. The individuals of the M-core were not selected from the Vassal collection on the basis of their genotypes and can consequently be considered a less optimized sample of the Vassal collection. The G-12 and G-24 cores already include 78% and 88% respectively of the SNP markers present in the G-92 core (80% and 90% of those in the G-48 core). They also include 58% and 73% of all the SSR alleles identified within the Vassal collection, representing a gain of 6% to 8% compared to random sampling from the G-48 or G-24 core. From a technical point of view, the size of the G-12 and G-24 cores is better suited to high-throughput genomic studies and consequently highly suitable for ambitious SNP discovery projects.

Geographic origin and final uses of the varieties within the G-core

Interestingly, the nested core collections constructed in the present work reflect the distribution of grapes in Europe and around the Mediterranean Sea, but with over-representation of the cultivars originating from the Caspian region and Middle East, and under-representation of the cultivars from Western Europe (Iberian Peninsula, France and Italy) compared to the Vassal collection. We compared the SSR allele frequencies of the nested core collections and of the Vassal collection and found low correlations. This result further emphasizes the decrease in redundancy in the core collections compared with the Vassal collection, but also reflects the relatively high number of cultivars originating from Western Europe in the Vassal collection, whereas the main domestication center is the Middle East [30]. These two regions may thus represent important sources of genetic diversity for the V. vinifera L. species. They represent the cradle of viticulture and the first migration of cultivars by Greeks and Etruscans, and a second domestication center in the Western Mediterranean region [42]. Finally, despite their low representation in the Vassal collection, the presence of cultivars from Asia and Central Asia in the nested core collections could also indicate an underexploited center of diversification worthy of prospection and analysis. The proportion of table varieties, wine varieties and table/wine varieties was very well conserved in the nested core collections compared to the M-core and to the Vassal collection. The distinction between these three categories of cultivars is based on morphological traits such as berry size, bunch size and compactness, but also on other traits such as the sugar/acid balance at maturity [43].
Previous studies have shown that there is strong genetic differentiation between these three groups of varieties that may be due either to divergent selection based on the same gene pool or to the use of specific gene pools for the development of the three types of varieties [44,27]. As a consequence, if the samples are well suited for analysis of allelic diversity, other uses can also be proposed for the cores, for example, the nested core collection could help understand the evolution of grape. Both G-12 and G-24 cores contained more frequent alleles representing ancient alleles while G-48 and G-92 may constitute subsequent diversification of cultivars in recent periods.

Conclusion

In the present work, we developed a set of robust nested core collections of V. vinifera L. (cultivated compartment) that will facilitate the discovery of allelic diversity by the scientific community. Moreover, this is an important basic tool for the development of association mapping projects in grapevine. In conclusion, even if these nested core collections are statistically too small to study correlations between phenotype and nucleotide diversity, their use for preliminary hypothesis testing will speed up both the selection of suitable candidates (for example by discarding unsuitable candidates) and SNP discovery. Due to the perennial nature of grape and the ease of vegetative propagation, these nested core collections could easily be disseminated worldwide for analyses (by simple request at request-vassal@supagro.inra.fr).

Methods

Plant material and DNA extraction

For each genotype of the four nested core collections, an accession of the Vassal collection (Domaine de Vassal, Hérault, France) was selected (Table 1) and a batch of young leaves was collected and lyophilized for long-term conservation. Lyophilized leaves were ground twice for 1 min at 20 Hz using a Qiagen-Retsch MM300 crusher. DNA was extracted using the Qiagen DNeasy Plant mini kit (Qiagen) following the manufacturer's instructions with minor modifications: addition of 1% w/v of PVP-40 to the AP1 solution, addition of 180 μl AP2 instead of 130 μl, and an additional step of 10 minutes centrifugation at 6000 rpm after incubation on ice, which pelleted most of the cellular debris and aggregates formed after the addition of AP2.

Methods for the construction of the core collection

The dataset obtained by Laucou et al. (in prep) on the 2262 unique genotypes from the Vassal collection was used. The M-method proposed by Schoen and Brown and implemented in the MSTRAT software by Gouesnard et al. was used to generate the nested genetic core collections that maximize the number of observed alleles in the SSR data set [19,20]. The efficiency of the sampling strategy was assessed by comparing the total number of alleles captured using MSTRAT in samples of increasing size with the number of alleles captured in randomly chosen collections of the same size (five independent samplings). Once the optimal size of the nested core collections had been determined, 200 core collections were generated independently for each sample size. Putative core collections exhibiting the same allelic richness (determined by the total number of alleles represented) were ranked using Nei's index as the second criterion [45].
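The comparison between allele-maximizing and random sampling can be illustrated with a small sketch. It is not the MSTRAT algorithm: it uses a simple greedy heuristic (add, at each step, the accession that contributes the most alleles not yet captured), and the function names and the toy genotypes are hypothetical. It does show why cores built this way are nested, since each larger core extends the smaller one.

```python
import random

def allele_set(genotype):
    """Return the set of (locus, allele) pairs carried by one accession."""
    return {(locus, allele) for locus, alleles in genotype.items() for allele in alleles}

def greedy_core(genotypes, size):
    """Greedy stand-in for allele maximization (NOT the MSTRAT procedure):
    repeatedly add the accession bringing the most alleles not yet captured."""
    remaining = dict(genotypes)
    core, captured = [], set()
    for _ in range(min(size, len(remaining))):
        # sorted() makes ties deterministic: the first name in sort order wins
        best = max(sorted(remaining),
                   key=lambda name: len(allele_set(remaining[name]) - captured))
        core.append(best)
        captured |= allele_set(remaining.pop(best))
    return core, captured

def random_core(genotypes, size, rng):
    """Random sample of the same size, used as the baseline."""
    names = rng.sample(sorted(genotypes), size)
    captured = set().union(*(allele_set(genotypes[n]) for n in names))
    return names, captured

# Hypothetical toy data: 6 accessions genotyped at 2 SSR loci (allele sizes in bp)
toy = {
    "acc1": {"ssr1": (100, 102), "ssr2": (200, 200)},
    "acc2": {"ssr1": (100, 102), "ssr2": (200, 204)},
    "acc3": {"ssr1": (104, 106), "ssr2": (202, 206)},
    "acc4": {"ssr1": (100, 100), "ssr2": (200, 200)},
    "acc5": {"ssr1": (108, 102), "ssr2": (208, 204)},
    "acc6": {"ssr1": (100, 104), "ssr2": (200, 202)},
}

core, captured = greedy_core(toy, 3)
rng = random.Random(0)
# Five independent random samplings of the same size, as in the text
random_counts = [len(random_core(toy, 3, rng)[1]) for _ in range(5)]
print("greedy core:", core, "alleles captured:", len(captured))
print("alleles captured by random cores:", random_counts)
```

In this toy set the three-accession greedy core captures all 10 alleles, and the two-accession core is its first two members, which is the nesting property exploited for G-12 to G-92.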

PCR primer design

The gene sequences analysed were derived from three genes located on three separate chromosomes (Table 6). Two are involved in the anthocyanin metabolic pathway: the dihydroflavonol 4-reductase (DFR, gi 499017), present in one copy in the genome of V. vinifera L., and the leucoanthocyanidin dioxygenase (L-DOX, gi 22010674), present in at least three copies in the genome of V. vinifera L. based on the NCBI database. The third gene codes for a BURP domain protein that is differentially expressed in a natural mutant of berry development compared to the wild type (VvBURP1; gi 22014825) [46-48]. Specific PCR primers (Table 2) were designed for the amplification of fragments of these three genes using Primer3 software and tested for amplification on the genomic DNA of the 12 individuals of the core G-12 [49].
Table 6

Localisation of the genes chosen for partial re-sequencing, specific PCR primers used and size of the gene fragment re-sequenced

| DNA fragment (GenBank) | LG located | Size | Primers, forward then reverse (5'→3' sequences) |
|---|---|---|---|
| DFR (X75964) | 18 | 810 nt | CAAGCTGCATGGAAGTATGCTTGGGCCATTCCGTTTTATTA |
| L-DOX (BQ795708) | 8 | 500 nt | TTGAGCCCAATCATATTAGTTCCGTGGCATGACCATTCTCCTC |
| BURP (BQ799859) | 3 | 700 nt | CGAAAAGGGACACACAGAGGTTCAGAGTAGGCCTCGGAA |
| Total | | 2010 nt | |

PCR amplification, sequencing, sequence analysis and SNP detection

The 25 μl PCR reaction mixtures contained 20 ng of genomic DNA, 50 mM KCl, 10 mM TRIS-HCl (pH 8.3), 0.4 mM of each primer, 125 μM of each dNTP, 1.5 mM MgCl2 and 2.5 U of Taq polymerase (Qiagen). PCR amplifications were performed in an MJ Research PTC 100 Thermal Cycler programmed as follows: 5 min denaturation at 94°C, 35 cycles of 94°C for 30 s, 52°C for 45 s, and 72°C for 1 min, followed by an extension step at 72°C for 8 min. The PCR products were purified using the Agencourt AMPure method (Beckman Coulter) and directly sequenced in both directions using the Big Dye Sequencing kit according to the manufacturer's specifications (Applied Biosystems Inc.). The sequence products were purified using the Agencourt CleanSEQ method (Beckman Coulter) and loaded onto an ABI PRISM® 3130 XL (Applera) capillary sequencer. The DNA sequences were analysed using the Staden Package [50]. Heterozygous SNPs were identified as double peaks on the chromatograms and coded according to the nucleotide codes of the International Union of Biochemistry. Insertions/deletions were easily identified by overlapping sequences, and sequencing both strands made it possible to handle such events. Only SNPs present on both forward and reverse sequences were validated.
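The coding and validation rules described above can be sketched in a few lines. This is an illustration only, not the Staden Package workflow: it assumes the two bases seen at a position have already been read off each chromatogram, and the function names are hypothetical. The ambiguity letters are the standard IUB/IUPAC two-base codes.

```python
# Standard IUB/IUPAC ambiguity codes for two-base (heterozygous) positions
IUPAC_HET = {
    frozenset("AG"): "R",
    frozenset("CT"): "Y",
    frozenset("CG"): "S",
    frozenset("AT"): "W",
    frozenset("GT"): "K",
    frozenset("AC"): "M",
}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def iupac_code(base1, base2):
    """Code a diploid call: the base itself if homozygous, else the ambiguity letter."""
    if base1 == base2:
        return base1
    return IUPAC_HET[frozenset((base1, base2))]

def validated_call(forward_bases, reverse_bases):
    """Keep a call only if both strands agree (hypothetical helper).

    forward_bases: the two bases read on the forward strand.
    reverse_bases: the two bases read on the reverse strand, as sequenced,
    so they are complemented before comparison.
    Returns the IUPAC code, or None when the strands disagree.
    """
    fwd = iupac_code(*forward_bases)
    rev = iupac_code(*(COMPLEMENT[b] for b in reverse_bases))
    return fwd if fwd == rev else None

# A/G double peak on the forward read, T/C double peak on the reverse read
confirmed = validated_call(("A", "G"), ("T", "C"))
# A/G double peak on the forward read, single T peak on the reverse read
rejected = validated_call(("A", "G"), ("T", "T"))
print(confirmed, rejected)
```

Complementing the reverse read before comparison is the design choice that lets both strands be expressed in the same orientation, so a heterozygous site is accepted only when the same pair of bases is seen twice.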

Statistical analysis

Different indices were used in this study. The selection of reference core collections among those constructed using MSTRAT and exhibiting the same allelic richness (determined by the total number of alleles represented) was performed using Nei's index (Nei, 1987) as the second criterion. Nei's index is given for one locus $l$ by $H_l = 1 - \sum_i p_{il}^2$, where $p_{il}$ represents the frequency of allele $i$ of locus $l$. The Nei diversity index for all the loci is the sum of the indices for each locus, given by $H = \sum_l H_l$. The more the allelic frequencies are equilibrated within a sample, the higher the value of Nei's index.

As the samples compared were of different size (M-core, nested core collections and the whole collection), the comparison was performed using the unbiased observed heterozygosity and the unbiased Nei's index [45]. The unbiased Nei's index for locus $l$ is given by $H_l^{nb} = \frac{2n}{2n-1}\left(1 - \sum_i p_{il}^2\right)$, where $n$ represents the number of individuals and $p_{il}$ represents the frequency of allele $i$ of locus $l$. The unbiased Nei diversity index for all the loci studied is given by $H^{nb} = \frac{1}{L}\sum_l H_l^{nb}$, where $L$ is the number of loci studied. The more the allelic frequencies are equilibrated within a sample, the higher the value of the unbiased Nei's index.

The observed heterozygosity for locus $l$ is given by $H_{o,l} = 1 - \sum_i P_{iil}$, where $P_{iil}$ represents the homozygote frequency for allele $i$ of locus $l$. The unbiased observed heterozygosity for locus $l$, $H_{o,l}^{nb}$, is obtained from $H_{o,l}$ with a correction that depends on $n$, the number of individuals, and the unbiased observed heterozygosity for all loci studied is $H_o^{nb} = \frac{1}{L}\sum_l H_{o,l}^{nb}$, where $L$ is the number of loci studied.

We compared the SSR frequencies found in the M-core and the nested G core-collections with those of the Vassal collection using the $R^2$ correlation coefficient. $R^2$ is given by $R^2 = \frac{\mathrm{cov}(x,y)^2}{\mathrm{var}(x)\,\mathrm{var}(y)}$, where $\mathrm{cov}(x,y)$ is the covariance between the two samples compared and $\mathrm{var}(x)$ and $\mathrm{var}(y)$ are the variances of samples $x$ and $y$ respectively.
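As a worked illustration of these indices, the sketch below computes them for one locus. It assumes the standard textbook forms (gene diversity as one minus the sum of squared allele frequencies, the 2n/(2n-1) small-sample correction for the unbiased Nei index, and R2 as squared covariance over the product of variances); the function names and the toy genotypes are hypothetical and are not taken from the study's data.

```python
from collections import Counter

def allele_frequencies(genotypes):
    """Allele frequencies at one locus from a list of diploid genotypes (a1, a2)."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = 2 * len(genotypes)
    return {allele: count / total for allele, count in counts.items()}

def nei_index(genotypes):
    """Nei's index for one locus: 1 - sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in allele_frequencies(genotypes).values())

def unbiased_nei_index(genotypes):
    """Assumed standard small-sample correction: multiply by 2n / (2n - 1)."""
    n = len(genotypes)
    return (2 * n / (2 * n - 1)) * nei_index(genotypes)

def observed_heterozygosity(genotypes):
    """1 - total frequency of homozygous genotypes at the locus."""
    homozygotes = sum(1 for a, b in genotypes if a == b)
    return 1.0 - homozygotes / len(genotypes)

def r_squared(x, y):
    """Squared correlation: cov(x, y)^2 / (var(x) * var(y))."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    return cov * cov / (var_x * var_y)

# Hypothetical SSR genotypes (allele sizes in bp) for 4 individuals at one locus
locus = [(100, 102), (100, 100), (102, 104), (100, 102)]
# Allele frequencies are 0.5, 0.375 and 0.125, so Nei's index is 1 - 0.40625
print(nei_index(locus))                # 0.59375
print(unbiased_nei_index(locus))       # 0.59375 * 8 / 7
print(observed_heterozygosity(locus))  # 0.75 (one homozygote out of four)
```

The same `r_squared` helper can be applied to two vectors of allele frequencies, for instance one from a core collection and one from the whole collection, listed in the same allele order.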

Authors' contributions

LLC carried out the sequencing, participated in the sequence alignment, performed the statistical analysis and drafted the manuscript. AF-L carried out the sequencing and participated in the sequence alignment. VL carried out the SSR analysis. SV carried out the sequencing. TL participated in the design of the study and performed the ampelographic analysis. A-FA participated in the design and coordination of the study and helped to draft the manuscript. J-MB participated in the design of the study and performed the ampelographic analysis. PT conceived of the study, participated in the design and coordination of the study and helped to draft the manuscript. All authors read and approved the final manuscript.
  38 in total

1.  Integration of HapMap-based SNP pattern analysis and gene expression profiling reveals common SNP profiles for cancer therapy outcome predictor genes.

Authors:  Gennadi V Glinsky
Journal:  Cell Cycle       Date:  2006-11-15       Impact factor: 4.534

Review 2.  A model for susceptibility polymorphisms for complex diseases: apolipoprotein E and Alzheimer disease.

Authors:  A D Roses
Journal:  Neurogenetics       Date:  1997-05       Impact factor: 2.660

3.  Impact of gene flow from cultivated beet on genetic diversity of wild sea beet populations

Authors: 
Journal:  Mol Ecol       Date:  1999-10       Impact factor: 6.185

4.  Natural variation in light sensitivity of Arabidopsis.

Authors:  J N Maloof; J O Borevitz; T Dabi; J Lutes; R B Nehring; J L Redfern; G T Trainer; J M Wilson; T Asami; C C Berry; D Weigel; J Chory
Journal:  Nat Genet       Date:  2001-12       Impact factor: 38.330

5.  Allele frequencies at microsatellite loci: the stepwise mutation model revisited.

Authors:  A M Valdes; M Slatkin; N B Freimer
Journal:  Genetics       Date:  1993-03       Impact factor: 4.562

6.  Association analysis of candidate genes for maysin and chlorogenic acid accumulation in maize silks.

Authors:  S J Szalma; E S Buckler; M E Snook; M D McMullen
Journal:  Theor Appl Genet       Date:  2005-04-02       Impact factor: 5.699

7.  Expression of the grape dihydroflavonol reductase gene and analysis of its promoter region.

Authors:  Rachel Gollop; Sylvie Even; Violeta Colova-Tsolova; Avihai Perl
Journal:  J Exp Bot       Date:  2002-06       Impact factor: 6.992

8.  Genetic structure and differentiation in cultivated grape, Vitis vinifera L.

Authors:  Mallikarjuna K Aradhya; Gerald S Dangl; Bernard H Prins; Jean-Michel Boursiquot; M Andrew Walker; Carole P Meredith; Charles J Simon
Journal:  Genet Res       Date:  2003-06       Impact factor: 1.588

9.  A comparison of sequence-based polymorphism and haplotype content in transcribed and anonymous regions of the barley genome.

Authors:  Joanne Russell; Allan Booth; John Fuller; Brian Harrower; Peter Hedley; Gordon Machray; Wayne Powell
Journal:  Genome       Date:  2004-04       Impact factor: 2.166

10.  Genetic structure and phylogeography of rice landraces in Yunnan, China, revealed by SSR.

Authors:  Hongliang Zhang; Junli Sun; Meixing Wang; Dengqun Liao; Yawen Zeng; Shiquan Shen; Ping Yu; Ping Mu; Xiangkun Wang; Zichao Li
Journal:  Genome       Date:  2007-01       Impact factor: 2.166
