Identification of new polymorphic regions and differentiation of cultivated olives (Olea europaea L.) through plastome sequence comparison.

Roberto Mariotti, Nicolò G M Cultrera, Concepcion Muñoz Díez, Luciana Baldoni, Andrea Rubini.

Abstract

BACKGROUND: The cultivated olive (Olea europaea L.) is the most agriculturally important species of the Oleaceae family. Although many studies have been performed on plastid polymorphisms to evaluate taxonomy, phylogeny and phylogeography of Olea subspecies, only a few polymorphic regions discriminating among the agronomically and economically important olive cultivars have been identified. The objective of this study was to sequence the entire plastome of olive and analyze many potential polymorphic regions to develop new inter-cultivar genetic markers.
RESULTS: The complete plastid genome of the olive cultivar Frantoio was determined by direct sequence analysis using universal and novel PCR primers designed to amplify all overlapping regions. The chloroplast genome of the olive has an organisation and gene order that is conserved among numerous Angiosperm species and does not contain any of the inversions, gene duplications, insertions, inverted repeat expansions and gene/intron losses that have been found in the chloroplast genomes of the genera Jasminum and Menodora, from the same family as Olea. The annotated sequence was used to evaluate the content of coding genes, the extent and distribution of repeated and long dispersed sequences, and the nucleotide composition pattern. These analyses provided essential information for structural, functional and comparative genomic studies in olive plastids. Furthermore, the alignment of the olive plastome sequence to those of other varieties and species identified 30 new organellar polymorphisms within the cultivated olive.
CONCLUSIONS: In addition to identifying mutations that may play a functional role in modifying the metabolism and adaptation of olive cultivars, the new chloroplast markers represent a valuable tool to assess the level of olive intercultivar plastome variation for use in population genetic analysis, phylogenesis, cultivar characterisation and DNA food tracking.

Year:  2010        PMID: 20868482      PMCID: PMC2956560          DOI: 10.1186/1471-2229-10-211

Source DB:  PubMed          Journal:  BMC Plant Biol        ISSN: 1471-2229            Impact factor:   4.215


Background

Olive is the main cultivated species belonging to the monophyletic Oleaceae family, within the clade of Asterids, in which the majority of nuclear and organellar genomic sequences are unknown. The Olea genus includes two sections, Olea and Ligustroides. The former comprises the six recognised subspecies of the olive complex, which can be found throughout the Mediterranean area as well as the temperate and subtropical regions of Africa and Asia. The Mediterranean form (Olea europaea, subspecies europaea) includes the wild (var. sylvestris) and cultivated (var. europaea) olives [1].

Recently, chloroplast genome sequencing of species belonging to this family from the tribe of Jasmineae revealed that two genera, Jasminum and Menodora, carry several distinctive rearrangements, including inversions, gene duplications, insertions, inverted repeat expansions and gene/intron losses [2]. One of these genomic features involves the duplication of the rpl23 protein-coding gene in Jasminum. A similar duplication has also been detected in the Poaceae, and in both Oleaceae and Poaceae, the duplicated copy has been inserted into the intergenic region between rbcL and psaI [3]. By comparative gene mapping and sequencing, Lee and co-workers also demonstrated that all other Oleaceae genera, including Olea, have the same gene content and order as Nicotiana tabacum. A phylogenetic reconstruction of the entire family, based upon the sequences of the ndhF and rbcL genes, partially confirmed previous results obtained by the analysis of the trnL-F and rps16 chloroplast regions [4]. Intraspecies variation within other Oleaceae genera, such as Syringa [5], Forsythia [6], Ligustrum [7] and Fraxinus [8,9] has also been examined.

Different chlorotypes have been identified among the six subspecies of O. europaea. Lumaret et al. [10] identified 12 distinct chlorotypes by RFLP analysis of DNA isolated from the purified chloroplasts of a wide set of O. europaea taxa. In other O. europaea subspecies, Baldoni et al. [11] identified nine nucleotide substitutions, one insertion-deletion (indel) and a polymorphic poly-T SSR in the trnT-L region. In the O. europaea complex, Besnard et al. [12] identified fourteen polymorphisms in three chloroplast regions (trnT-L, trnQ-R and matK), including five microsatellite motifs, two indels and eight nucleotide substitution sites. Recently, the analysis of four regions (trnL-F, trnT-L, trnS-G and matK) was used to demonstrate the polyphyletic origin of the Olea genus and estimate the divergence times for the major groups of Olea species and subspecies during the Tertiary period [13].

In cultivated olives, chloroplasts are maternally inherited [14] and, in contrast to what is seen at the subspecies level, low plastid variability has been detected. A strong linkage disequilibrium between the chloroplast and mitochondrial genomes has been demonstrated, particularly for the Mediterranean cultivated and wild olives (subspecies europaea), suggesting that a low level of recurrent mutations occurs in both organellar genomes of the olive [15]. In particular, RFLP analysis of chloroplast DNA isolated from 72 cultivars revealed that most cultivars have a common chlorotype [16]. Besnard et al. [17], using two microsatellites and 13 RFLPs on more than 140 olive cultivars, were able to distinguish only four chlorotypes. The majority of cultivars was characterised by the chlorotype CE1, which likely originated from the wild olive populations of the Eastern Mediterranean and was spread to the Western part through cultivar dispersal by humans. Polymorphisms at the varietal level have been detected in the trnD-T locus [18], but only one polymorphism in this locus was found within a set of 12 cultivars [19].

Chloroplast DNA represents an ideal system for plant species DNA barcoding, and some chloroplast regions have been indicated as ideal for use in tests that discriminate between different land plants. Based on assessments of recoverability, sequence quality and discriminatory abilities at the species level, the two-locus combination of rbcL-matK has been recommended as a universal framework for plant barcoding [20]. The combination of trnH-psbA coupled with rbcL has been recommended for DNA barcoding to discriminate between lower taxonomic ranks such as genera or related species [21]. In highly valuable crop species such as the olive, however, which have a variety of cultivars available in the market, typing at the species level is not sufficient. Thus, the development of reliable methods to rapidly and efficiently discriminate between cultivars has become a pressing need. In addition, DNA barcoding may have useful applications in tracking food products [22] and in the analysis of archaeological remains [23]. In this respect, the availability of complete chloroplast genome sequences from a growing number of species offers the opportunity to evaluate many potentially polymorphic sites and identify new regions that could be used to define cultivar DNA barcodes.

There are numerous approaches to sequencing chloroplast genomes: traditional sequence analysis of highly purified chloroplast DNA, as applied for Solanum lycopersicum [24], Lolium perenne [25], Trachelium caeruleum [26], Jasminum nudiflorum [2] and Parthenium argentatum [27]; Rolling Circle Amplification (RCA) of high-purity chloroplast DNA, as demonstrated in Cicer arietinum [28], Platanus occidentalis [29] and Welwitschia mirabilis [30]; shotgun sequence analysis of BAC clones containing chloroplast genomic inserts, as demonstrated in Vitis vinifera [31], Hordeum vulgare [32] and Brachypodium distachyon [33]; and the use of universal primers based on chloroplast sequences highly conserved among most Angiosperm species to amplify overlapping fragments [34-36], as demonstrated in Cycas taitungensis [37] and two Bambusa species [38]. For this study, the last approach was used to sequence the entire chloroplast genome of the O. europaea subsp. europaea cv. Frantoio. The resulting availability of the entire plastome made it possible to evaluate the sequence arrangement of the plastid genome in O. europaea and to identify new organellar polymorphisms that could discriminate between cultivated olive varieties.

Results and Discussion

Size, gene content and gene order of the olive chloroplast genome

The complete plastome of olive, cv. Frantoio has a total length of 155,889 bp (GenBank Accession Number GU931818), with the typical structure found in the unrearranged chloroplast genomes of Angiosperms. It includes an 86,614-bp Large Single Copy (LSC) and a 17,791-bp Small Single Copy (SSC) region separated by a pair of Inverted Repeats (IR), each 25,742 bp long (Figure 1). Coding DNA (92,095 bp) accounts for 59.08% of the genome and includes protein coding genes (80,252 bp), tRNAs (2,793) and rRNAs (9,050), while noncoding DNA (63,794 bp) accounts for the remaining 40.92% and includes introns (20,130 bp) and intergenic spacers (43,664 bp). The olive plastome contains 114 unique genes (80 CDS, 30 tRNA and 4 rRNA), with 19 of these genes (8 CDS, 7 tRNA and all 4 rRNA) duplicated in the IR for a total of 133 genes. In addition, the duplicated region includes a partial CDS for ycf1, as in other species like Typha [39]. There are 18 intron-containing genes, 15 of which contain one intron and 3 (ycf3, clpP and rps12) with two introns. The rps12 gene is trans-spliced, with the 5' end located in the LSC and the 3' end duplicated in the IR regions. The nucleotide composition of the olive chloroplast genome comprises 37.81% GC and 62.19% AT.
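The size and composition figures quoted above can be cross-checked with simple arithmetic. The short sketch below is only an illustration built from the numbers in the paragraph; the variable and function names are ours, not the authors'.

```python
# Figures quoted in the text for the cv. Frantoio plastome (all in bp).
LSC, SSC, IR = 86_614, 17_791, 25_742
CDS, TRNA, RRNA = 80_252, 2_793, 9_050
INTRONS, SPACERS = 20_130, 43_664
UNIQUE_GENES, DUPLICATED_IN_IR = 114, 19


def genome_summary():
    """Recompute the totals and percentages reported in the text."""
    total = LSC + SSC + 2 * IR          # the IR is present in two copies
    coding = CDS + TRNA + RRNA
    noncoding = INTRONS + SPACERS
    return {
        "total_bp": total,
        "coding_bp": coding,
        "noncoding_bp": noncoding,
        "coding_pct": round(100 * coding / total, 2),
        "noncoding_pct": round(100 * noncoding / total, 2),
        "total_genes": UNIQUE_GENES + DUPLICATED_IN_IR,
    }


summary = genome_summary()
print(summary)
```

The recomputed values agree with the reported total length (155,889 bp), the coding and noncoding fractions (59.08% and 40.92%) and the gene count (133).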
Figure 1

Genes drawn inside the circle are transcribed clockwise, those outside are counterclockwise.

The in silico search for repetitive elements identified 633 mono-nucleotide SSRs with 6 or more repeat units (Table 1): 276 poly-A, 303 poly-T, 31 poly-C and 23 poly-G repeats. In addition, six di-nucleotide SSRs with five or six repeat units, no tri-nucleotide SSRs, three tetra- and two penta-nucleotide SSRs were identified, for a total of 644 repetitive sequences. The distribution of SSRs across the chloroplast genome was as follows: 400 in the LSC (density = 0.0046), 126 in the SSC (density = 0.0071) and 59 (x2) in the IR region (density = 0.0022).
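The paper does not name the tool used for the in silico SSR search. As a minimal sketch of the idea, assuming a plain regular-expression scan, mono-nucleotide runs can be located as shown below. The function names and the toy sequence are hypothetical; the default threshold of six repeat units follows the footnote to Table 1.

```python
import re


def find_mono_ssrs(seq, min_repeats=6):
    """Return (start, base, length) for every homopolymer run of at least
    min_repeats identical bases; start is 1-based."""
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_repeats - 1))
    return [(m.start() + 1, m.group(1), len(m.group(0)))
            for m in pattern.finditer(seq.upper())]


def ssr_density(n_ssrs, region_length_bp):
    """Number of SSRs per bp of the region."""
    return n_ssrs / region_length_bp


# Toy sequence (not olive data): a T run of 7, an A run of 5, a C run of 6.
toy = "GGA" + "T" * 7 + "C" + "A" * 5 + "C" * 6 + "GT"
hits = find_mono_ssrs(toy)
print(hits)

# Densities recomputed from the counts and region lengths given in the text.
lsc_density = ssr_density(400, 86_614)
ssc_density = ssr_density(126, 17_791)
ir_density = ssr_density(59, 25_742)
print(round(lsc_density, 4), round(ssc_density, 4), round(ir_density, 5))
```

With the default threshold the A run of five is skipped, and passing `min_repeats=5` reports it as well. The recomputed LSC and SSC densities round to the reported values; the IR value comes out at about 0.00229, which the text gives as 0.0022.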
Table 1

Abundance and length of SSR motifs identified on the olive chloroplast genome.

No. of      No. of SSRs per motif
repeats*    A     C     G     T     AT    GA    TA    AAAT  ATAA  GAAA  CCAAT  TAAAC  Total
3           -     -     -     -     0     0     0     0     1     1     1      1      4
4           -     -     -     -     0     0     0     1     0     0     0      0      1
5           -     -     -     -     1     1     3     0     0     0     0      0      5
6           141   20    17    161   0     1     0     0     0     0     0      0      340
7           69    8     6     57    0     0     0     0     0     0     0      0      140
8           23    2     0     33    0     0     0     0     0     0     0      0      58
9           20    0     0     19    0     0     0     0     0     0     0      0      39
10          11    1     0     14    0     0     0     0     0     0     0      0      26
11          4     0     0     9     0     0     0     0     0     0     0      0      13
12          5     0     0     7     0     0     0     0     0     0     0      0      12
13          1     0     0     2     0     0     0     0     0     0     0      0      3
14          1     0     0     0     0     0     0     0     0     0     0      0      1
15          1     0     0     0     0     0     0     0     0     0     0      0      1
16          0     0     0     1     0     0     0     0     0     0     0      0      1
Total       276   31    23    303   1     2     3     1     1     1     1      1      644

SSRs analyzed for polymorphism are given in bold. * Mononucleotide SSRs with less than 6 repeats were not determined.

Interspersed repeats and palindromic sequences.

The repeat analysis also identified 14 interspersed repetitive sequences longer than 30 bp, each having 2-6 repetitions and a sequence identity higher than 85% (Table 2, Figure 2). These long interspersed repetitive sequences included two tandem repeats in the ycf2 gene and five palindromic sequences (two in the LSC, one in the SSC and two in the IR regions). Three of the four repeats found within the ycf2 exon were tandem repeats, as previously observed in V. vinifera [31]. There were only two inverted repeats; all the others were direct repeats. Five repeats were located within CDS, two repeats were found in the introns of the ycf3 and ndhA genes and all others were in the intergenic spacers (Table 2). Interspersed repeats did not cause any uncertainty during the sequencing process because they were quite short (< 61 bp) and had a low number of repetitions, and because primers were never designed on the repeats.
Table 2

Repeat  No. of   Size  Start(1)  Type  % Identity  Region  Gene position          Sequence
        repeats  (bp)
1       3        30    9,345     D     86.67       LSC     trnS-GCU-exon          [CA][AC]GGA[GA]AGAGAGGGATTCGAACCCTCG[AG]TA
                       37,281    D                 LSC     trnS-UGA-exon
                       47,117    I                 LSC     trnS-GGA-exon
2       2        31    10,849    D     90.32       LSC     trnG-UCC-exon 2        [AT][AG]A[CA]GATGCGGGTTCGATTCCCGCTA[CT]CCGC
                       38,241    D                 LSC     trnG-GCC-exon
3       1        30    14,401    P     93.33       LSC     atpF - atpH            AAATATGAAAAATA[TC][GA]TATTTTTCATATTT
4       2        45    40,451    D     88.89       LSC     psaB-exon              [AT]TGCAATAGCTA[AG]ATGATG[AG]TG[TA]GCAATATCGGTCAGCCATA[AG]AC
                       42,675    D                 LSC     psaA-exon
5       3        41    45,474    D     92.68       LSC     ycf3-intron            T[CA]CAGAACCGTAC[GA]TGAGATTTTCA[TC]CTCATACGGCTCCTC
                       100,797   D                 IR      rps12 - trnV-GAC
                       122,052   D                 SSC     ndhA-intron
6       2        31    56,736    D     90.32       LSC     atpB - rbcL            T[AT]CTTATTCATCCACTTGAAATTTTCAA[AG][AT]T
                       56,777    I                 LSC     atpB - rbcL
7       1        44    76,926    P     95.45       LSC     psbT - psbN            TTGAAGTAATGAGCCTCCC[CA]ATAT[TG]GGGAGGCTCATTACTTCAA
8       2        30    83,181    D     90          LSC     rps8 - rpl14           AATCTA[CG]T[AT][AC]TTAATCTAGTTCTTAATCTA
                       83,193    D                 LSC     rps8 - rpl14
9       2        30    91,385    TR    90          IR      ycf2-exon              TTTCTTTTTGTC[CT]AA[GC]TCACTTC[TC]TTTTTT
                       91,427                      IR      ycf2-exon
10      2        36    93,791    TR    94.83       IR      ycf2-exon              [AG]ATATTGATG[AC]TAGTGAC[AG]ATATTGATG[AC]TAGTGAC
                       93,827                      IR      ycf2-exon
11      1        48    96,252    P     91.67       IR      ycf15 - trnL-CAA       AGAGCTCGGATCGAATCGGTAT[TA][TG][AC][TA]ATACCGATTCGATCCGAGCTCT
12      2        30    109,623   D     93.33       IR      rrn 4.5 - rrn 5        CATTGTTCAA[AC]TCTTTGACAACA[CT]GAAAAA
                       109,654   D                 IR      rrn 4.5 - rrn 5
13      1        61    110,599   P     95.08       IR      trnR-ACG - trnN-GCU    AGAATTCTCAGATGTACTAGCACTGCATC[AT][AT][AT]GATGCAGTGCTAGTACATCTGAGAATTCT
14      1        36    118,944   P     100         SSC     ndhD - psaC            AAAACCCGTGCTCCAAATATTTGGAGCACGGGTTTT

D: direct, I: inverted, P: palindrome, TR: tandem repeat (imperfect).

(1) The start base position of the interspersed repeats and palindromic sequences refers to the cv. Frantoio sequence.

Bold nucleotides refer to the indel P32.
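A palindromic repeat (type P in Table 2) reads the same as its own reverse complement. The sketch below, with helper names of our own choosing, applies that check to the sequence listed for repeat 14, the only palindrome reported with 100% identity. It illustrates the definition and is not the authors' repeat-finding procedure, which the text does not describe.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_perfect_palindrome(seq):
    """True when the sequence equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


# Repeat 14 (ndhD - psaC spacer), copied from Table 2.
REPEAT_14 = "AAAACCCGTGCTCCAAATATTTGGAGCACGGGTTTT"
print(len(REPEAT_14), is_perfect_palindrome(REPEAT_14))
```

The imperfect palindromes in the table contain bracketed variant positions, so they would need an identity threshold rather than this exact comparison.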

Figure 2

Polymorphic regions identified in the olive chloroplast genome. Different colours indicate the four mono-nucleotide microsatellites (poly-T and poly-G are reported in the external circle, poly-A and poly-C in the internal circle), bar lengths correspond to the number of repetitions. Arrows indicate polymorphisms (base mutations, microsatellites and indels). The circle reports the interspersed repeats: to the same number corresponds the same repetition. External or internal number position corresponds to the sense or anti-sense sequence direction.

The actual size of the olive plastome is larger than the size estimated on the basis of RFLP analysis, which predicted a range from 132 to 134 kb [16].

Olive chloroplast genome organisation

The sequence of the olive chloroplast genome represents one of the first contributions to deciphering the genetic background of this important tree crop species and was used to verify that rearrangements observed in the plastomes of other genera of Oleaceae, such as Jasminum and Menodora, were not represented in O. europaea. In fact, in contrast to what was observed in the Jasminum and Menodora plastomes [2], the olive chloroplast maintains a size range, organisation and gene order typical of most land plants, such as members of the Vitis, Populus, Citrus, Eucalyptus, Coffea and Arabidopsis genera. Based on the phylogeny of Oleaceae inferred from the ndhF and rbcL genes [2], Jasminum and Menodora were already known to be unusual genera within the family, and all other tribes, including Oleae, to which the Olea genus belongs, do not share their combination of multiple mutational events. The highly conserved plastome organisation of the olive allowed universal primers and genome walking with consensus primers to be used to amplify most of the LSC region.

Identification of new plastid markers to discriminate between olive cultivars

To detect intervarietal polymorphisms, a preliminary screening of the intergenic spacer trnS-GCU - trnG-UCC, previously demonstrated to be polymorphic among olive varieties [40], was performed on a set of 30 cultivars having different geographical distributions and representing a wide range of morphological and agronomical phenotypes (data not shown). A sub-set of eight highly variable cultivars (Table 4) was further examined for 100 potentially polymorphic regions.
Table 4

Chlorotypes detected on eight cultivars.

Chlorotype  Cultivar(s)              Repository of samples/Collection number
1           Frantoio                 CRA-OLI/92
2           Canino                   CRA-OLI1/32
3           Farga, Kalogerida        WOGB2/12, 691
4           Galega                   WOGB/128
5           Lechin Sevilla, Sorani   WOGB/5, 787
6           Oueslati                 WOGB/114

Polymorphic  Chlorotype
site         1             2             3             4             5             6
P1           G             A             A             G             A             A
P2           T             A             T             T             T             A
P3           T             C             T             T             C             C
P4           T12           T11           T10           T12           T11           T11
P5           T11           T12           T12           T11           T12           T12
P6           T12           T11           T12           T12           T11           T11
P7           A             A             C             A             A             A
P8           T11           T10           T11           T11           T10           T10
P9           T             C             C             T             C             C
P10          A             /             A             A             T             /
P11          TTAGATA       -             TTAGATA       TTAGATA       -             -
P12          -             A4(G)A5       A4(G)A5       -             A4(G)A5       A4(G)A5
P13          A12           A14           A11           A12           A14           A14
P14          T             G             T             T             G             G
P15          A15           A16           A16           A15           A16           A16
P16          C10           C11           C11           C11           C10           C10
P17          T11           T10           T10           T11           T10           T10
P18          A12           A12           A13           A12           A12           A12
P19          C             T             C             C             T             T
P20          A13           A13           A12           A13           A13           A13
P21          C             C             T             C             T             C
P22          A10           A10           A11           A10           A10           A10
P23          C             C             T             C             C             C
P24          A             A             G             A             A             A
P25          A12           A12           A11           A12           A12           A12
P26          C             C             C             C             T             C
P27          A             G             G             A             G             G
P28          A6            A7            A6            A6            A7            A7
P29          C             T             C             C             C             T
P30          T             G             G             G             G             G
P31          T18           T11           T18           T17           T11           T11
P32          TTAATCTAGTTC  TTAATCTAGTTC  -             TTAATCTAGTTC  TTAATCTAGTTC  TTAATCTAGTTC
P33          T             G             G             G             T             T
P34          T             A             T             T             A             A
P35          A             G             G             A             G             G
P36          T10           T9            T9            T10           T9            T9
P37          A14           A7(T)A5       A7(T)A5       A14           A7(T)A5       A7(T)A5
P38          C             A             A             C             A             A
P39          A             A             C             A             A             A
P40          G             G             T             G             G             G

The position of the polymorphic regions refers to the cv. Frantoio sequence.

The tested potential variant domains showed different levels of variability. Fifteen of the analyzed intergenic spacers contained mutations among the eight cultivars, ranging in number from one to six per region. These mutations were microsatellites, indels or single nucleotide polymorphisms (Table 3). One SNP was located within the intron of the rpoC1 gene, and four others were located in the coding regions (CDS) of the rpl14, ndhF and ycf1 genes. The CDS-SNPs resulted in substitutions at amino acid position 109 in rpl14 (leucine to phenylalanine), at position 32 in ndhF (valine to alanine), and at positions 995 and 1,161 in ycf1 (leucine to isoleucine and isoleucine to arginine, respectively). BLAST analyses revealed that the ndhF alanine and the ycf1 leucine, widely represented in other species, are present in the Farga and Frantoio cultivars, respectively. The rpl14 polymorphism can also be found in other species: the phenylalanine residue is present in the mitochondrial copy of this gene in V. vinifera cv. Pinot Noir, owing to the incorporation of more than 42% of the Vitis chloroplast genome into its mitochondrial genome [41]. In this respect, the risk that our olive chloroplast markers might reside on mitochondrial or nuclear genes was avoided by amplifying coding regions anchored on the intergenic spacers, and the absence of sequence ambiguities confirmed this.
Table 3

Chloroplast polymorphisms within olive (Olea europaea subsp. europaea var. europaea) cultivars.

Site  Marker               Type       Motif            Position   Region                 Polymorphisms already known
                                                       (bp)(1)                           (Authors)
P1    Oe-rpl2-trnH         SNP        A/G              2          rpl2-trnH
P2    Oe-trnH-psbA-1       SNP        A/T              221
P3    Oe-trnH-psbA-2       SNP        C/T              470        trnH-GUC - psbA2
P4    Oe-trnH-psbA-3       SSR        T10-12           505
P5    Oe-trnK-rps16-1      SSR        T11-12           4,690      trnK-UUU - rps16
P6    Oe-trnK-rps16-2      SSR        T11-12           4,883
P7    Oe-trnK-rps16-3      SNP        A/C              5,011
P8    Oe-psbK-psbI         SSR        T10-11           9,072      psbK - psbI            Besnard et al., 2003
P9    Oe-trnS-trnG-1       SNP        C/T              9,463      trnS-GCU - trnG-UCC
P10   Oe-trnS-trnG-2       SNP/indel  A/T/-            9,535
P11   Oe-trnS-trnG-3       Indel      TTAGATA/-        9,536                             Besnard et al., 2003
P12   Oe-trnS-trnG-4       Indel      A4(G)A5/-        9,574                             Besnard et al., 2003
P13   Oe-trnS-trnG-5       SSR        A11-14           9,579                             Besnard et al., 2003
P14   Oe-trnS-trnG-6       SNP        G/T              9,960
P15   Oe-atpA-atpF         SSR        A15-16           12,790     atpA - atpF
P16   Oe-rps2 - rpoC2-1    SSR        C10-11           17,433     rps2 - rpoC2           Besnard et al., 2007 (ccmp5)
P17   Oe-rps2 - rpoC2-2    SSR        T10-11           17,443                            Besnard et al., 2007 (ccmp5)
P18   Oe-rps2 - rpoC2-3    SSR        A12-13           17,455                            Besnard et al., 2007 (ccmp5)
P19   Oe-rpoC1             SNP        C/T              23,981     rpoC1 intron
P20   Oe-trnE-trnT-1       SSR        A12-13           32,682     trnE-UUC - trnT-GGU    Intrieri et al., 2007; Besnard, 2008
P21   Oe-trnE-trnT-2       SNP        C/T              32,813                            Intrieri et al., 2007
P22   Oe-psbZ-trnG-1       SSR        A10-11           38,011     psbZ - trnG-GCC
P23   Oe-psbZ-trnG-2       SNP        C/T              38,129
P24   Oe-psaA-ycf3-1       SNP        A/G              43,868     psaA - ycf3
P25   Oe-psaA-ycf3-2       SSR        A11-12           44,077
P26   Oe-psaA-ycf3-3       SNP        C/T              44,302
P27   Oe-atpB-rbcL-1       SNP        A/G              56,929     atpB - rbcL
P28   Oe-atpB-rbcL-2       SSR        A6-7             57,116                            Besnard et al., 2007 (ccmp7)
P29   Oe-petA-psbJ-1       SNP        C/T              65,656     petA - psbJ
P30   Oe-petA-psbJ-2       SNP        G/T              66,340
P31   Oe-rps8-rpl14-1      SSR        T11-18           83,112     rps8 - rpl14
P32   Oe-rps8-rpl14-2      Indel      TTAATCTAGTTC/-   83,195
P33   Oe-rpl14             SNP        G/T              83,307     rpl14 exon
P34   Oe-rps12-trnV-1      SNP        A/T              101,265    rps12 - trnV-GAC
P35   Oe-ndhF              SNP        A/G              114,454    ndhF exon
P36   Oe-ndhF-rpl32        SSR        T9-10            114,885    ndhF - rpl32
P37   Oe-rpl32-trnL-1      SSR        A14/A7(T)A5      115,359    rpl32 - trnL-UAG
P38   Oe-rpl32-trnL-2      SNP        A/C              115,598
P39   Oe-ycf1-1            SNP        A/C              127,793    ycf1 exon
P40   Oe-ycf1-2            SNP        G/T              128,292

(1) The position of the polymorphic regions refers to the cv. Frantoio sequence.

The new markers identified on olive cultivars are given in bold.

The comparison of the Frantoio chloroplast sequence with ESTs deriving from fruits of cvs. Coratina and Tendellone showed some sequence mismatches, but they were not confirmed by resequencing the corresponding genomic regions in the Coratina and Tendellone cultivars. Overall, the analysis of cpDNA sequences from the eight cultivars resulted in the identification of 40 polymorphic sites, 30 of which represent new and never-described plastid variants (Table 3, Table 4, Figure 2). Sixteen polymorphic sites were mono-nucleotide SSRs: eight poly-A, including one with an irregular motif; seven poly-T and one poly-C. The remaining polymorphisms included 20 SNPs and 4 indels. Thirty-three polymorphic sites (P1-P33) were located within the LSC region, one (P34) within the IR and six (P35-P40) within the SSC (Figure 2). The indel P32 was identified within the repeat of the rps8 - rpl14 spacer, but none of the other repetitive regions was polymorphic between cultivars.

The chloroplast sequence of cv. Frantoio was also compared with all previously sequenced regions of the olive chloroplast, particularly with the plastome sequence of cv. Bianchera, which has been recently deposited in the GenBank database (NC_013707.1). More than 200 mismatches were detected between the Bianchera and Frantoio sequences. Surprisingly, not one of these polymorphisms fell within the previously identified cultivar-specific polymorphic regions. To verify whether these mismatches might represent real sequence differences between the two varieties, most of the ambiguous regions were reamplified and resequenced in both cultivars (the Bianchera sample was provided by the CRA-OLI of Spoleto, Perugia, Italy). These analyses confirmed the sequence of cv. Frantoio and showed an absolute sequence identity with that obtained from the cv. Bianchera in all of these regions, including the exons of the rpoC1 and ndhF genes, carrying 27- and 9-bp indels, respectively. The differences detected between the two olive plastome sequences cannot derive from an incorrect identification of the Bianchera genotype because, in that case, mutations should have been found in the polymorphic sites and not randomly along the chloroplast genome. More likely, the divergences may be attributed to sequence uncertainties in the Bianchera plastome sequence deposited in GenBank.

The new markers identified in this study can distinguish six haplotypes among eight cultivars. Therefore, these new markers hold great promise for the identification of new cultivar haplotypes and for use in DNA barcoding systems to distinguish between different cultivars.
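The counts of polymorphism types and their distribution across the LSC, IR and SSC regions can be re-derived from the positions in Table 3. The sketch below rests on two assumptions that are ours rather than the paper's: that coordinate 1 is the first base of the LSC, followed by one IR copy, the SSC and the second IR copy, and that the SNP/indel site P10 is counted as an indel. All names in the code are illustrative.

```python
from collections import Counter

# Region lengths (bp) from the text; assumed order: LSC, IR, SSC, IR.
LSC, SSC, IR = 86_614, 17_791, 25_742

# (site, type, position in bp) transcribed from Table 3.
# P10 is listed as "SNP/indel" and is counted as an indel here (assumption).
SITES = [
    ("P1", "SNP", 2), ("P2", "SNP", 221), ("P3", "SNP", 470),
    ("P4", "SSR", 505), ("P5", "SSR", 4690), ("P6", "SSR", 4883),
    ("P7", "SNP", 5011), ("P8", "SSR", 9072), ("P9", "SNP", 9463),
    ("P10", "indel", 9535), ("P11", "indel", 9536), ("P12", "indel", 9574),
    ("P13", "SSR", 9579), ("P14", "SNP", 9960), ("P15", "SSR", 12790),
    ("P16", "SSR", 17433), ("P17", "SSR", 17443), ("P18", "SSR", 17455),
    ("P19", "SNP", 23981), ("P20", "SSR", 32682), ("P21", "SNP", 32813),
    ("P22", "SSR", 38011), ("P23", "SNP", 38129), ("P24", "SNP", 43868),
    ("P25", "SSR", 44077), ("P26", "SNP", 44302), ("P27", "SNP", 56929),
    ("P28", "SSR", 57116), ("P29", "SNP", 65656), ("P30", "SNP", 66340),
    ("P31", "SSR", 83112), ("P32", "indel", 83195), ("P33", "SNP", 83307),
    ("P34", "SNP", 101265), ("P35", "SNP", 114454), ("P36", "SSR", 114885),
    ("P37", "SSR", 115359), ("P38", "SNP", 115598), ("P39", "SNP", 127793),
    ("P40", "SNP", 128292),
]


def region_of(position):
    """Map a 1-based coordinate to LSC, IR or SSC under the assumed layout."""
    if position <= LSC:
        return "LSC"
    if position <= LSC + IR:
        return "IR"
    if position <= LSC + IR + SSC:
        return "SSC"
    return "IR"


type_counts = Counter(kind for _, kind, _ in SITES)
region_counts = Counter(region_of(pos) for _, _, pos in SITES)
print(dict(type_counts))
print(dict(region_counts))
```

Under these assumptions the tallies match the text: 16 SSRs, 20 SNPs and 4 indels, with 33 sites in the LSC, one in the IR and six in the SSC.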

Comparison of plastome variation between cultivars and with other Olea taxa

Based on previous chloroplast sequence analyses, olive cultivars belong to the cp-II lineage and have been classified into three sublineages (E1, E2 and E3) and four chlorotypes (1, 2, 9 and 13) [19,40]. These chlorotypes were defined by evaluating length variations in the psbK-psbI, trnS-trnG, rps2-rpoC2, trnE-trnT and atpB-rbcL regions among more than 140 cultivars [17,19,40]. Several polymorphisms had been previously identified in the partial sequence of the trnK intron (AF359497-AF359504) by analysing the subspecies cuspidata, laperrinei, maroccana, cerasiformis, guanchica, europaea var. sylvestris (wild olive) and the Cornicabra cultivar, but none of these polymorphisms were found among the cultivars we have analysed.

The psbK-psbI and trnS-GCU-trnG-UCC regions, spanning the polymorphic sites P8, P9, P10, P11, P12, P13 and P14, were analyzed by Besnard et al. [12] as fragment length variation on a set of different O. europaea taxa including cultivars. That analysis revealed intercultivar variability only at P11, P12 and P13 but was unable to detect the C/T and G/T SNPs in the P9 and P14 sites, respectively. We treated the A/T/- polymorphism, closely linked to P11, as a different polymorphism (P10) because the A/- indel is present in most varieties while the T is a rare mutation carried by few cultivars. The spacer rps2-rpoC2, spanning the polymorphic sites P16, P17 and P18, generated five different chlorotypes among the eight varieties analysed, demonstrating a high level of rearrangement within cultivars. This region corresponds to the ccmp5 microsatellite [42,43], but previous studies that analysed only length polymorphisms were unable to capture the complexity of this region. P28 includes ccmp7 [40,42], and an additional SNP (P27) was captured in the flanking region.

Intrieri et al. [18] reported the identification of 5 SNPs and 4 indels in the trnD-trnT region of 13 cultivars. Analyzing a different set of cultivars, Besnard [19] did not detect these polymorphisms. Similarly, only two polymorphisms were confirmed in our cultivar set: the poly-A SSR (P20) and the C/T SNP (P21). Other regions previously analysed in different Olea taxa, such as trnL-F and rps16 [4], trnL-trnF [13], and trnT-trnL [11], were not polymorphic among our cultivars. No differences between the eight cultivars were found within the matK and psbA exons or the rps16 intron, regions used for species barcoding. In contrast, the psbK-psbI and trnH-psbA barcoding regions, both representing markers for plant species identification [44,45], correspond to our P8, P2, P3 and P4 polymorphisms. This observation indicates that these markers may not accurately discriminate between some species, given their potential intra-specific genetic variations [46].
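The point made above about the rps2-rpoC2 spacer, that scoring sequence resolves more variants than scoring fragment length alone, can be illustrated with the P16-P18 alleles from Table 4. In this sketch the summed run length of the three microsatellites is used as a stand-in for fragment length, which is our simplification, and the dictionary and function names are hypothetical.

```python
from collections import defaultdict

# Alleles at P16, P17 and P18 for each cultivar, read from Table 4
# (cultivars sharing a chlorotype column share the same alleles).
RPS2_RPOC2 = {
    "Frantoio": ("C10", "T11", "A12"),
    "Canino": ("C11", "T10", "A12"),
    "Farga": ("C11", "T10", "A13"),
    "Kalogerida": ("C11", "T10", "A13"),
    "Galega": ("C11", "T11", "A12"),
    "Lechin Sevilla": ("C10", "T10", "A12"),
    "Sorani": ("C10", "T10", "A12"),
    "Oueslati": ("C10", "T10", "A12"),
}


def group_by(key_func, alleles_by_cultivar):
    """Group cultivar names by the value key_func returns for their alleles."""
    groups = defaultdict(list)
    for cultivar, alleles in alleles_by_cultivar.items():
        groups[key_func(alleles)].append(cultivar)
    return dict(groups)


def summed_length(alleles):
    """Total run length, e.g. ('C10', 'T11', 'A12') gives 33 (length proxy)."""
    return sum(int(allele[1:]) for allele in alleles)


by_sequence = group_by(lambda alleles: alleles, RPS2_RPOC2)
by_length = group_by(summed_length, RPS2_RPOC2)
print(len(by_sequence), len(by_length))
```

Grouping on the three alleles separates the eight cultivars into five variants, as stated in the text, whereas the length proxy distinguishes only three classes.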

Conclusions

The low level of cpDNA variation detected up to now within olive cultivars represented a serious obstacle to the widespread use of cpDNA markers for cultivar characterization, parentage analysis and population genetics. The most probable causes of the high level of sequence conservation may be related to the domestication process, by which most cultivars were likely derived from only a few different wild plants, and the low generation turnover resulting from the long life span of the trees, which reduces the rate of emergence of new mutations.

In this study, using eight cultivars, 30 new cpDNA markers were identified from the olive plastome sequence and 10 markers previously reported were confirmed. In fact, the availability of the entire chloroplast genome and systematic sequencing of candidate regions from selected cultivars resulted in the identification of many new polymorphisms, mostly represented by nucleotide substitutions and by rearrangements of different microsatellites. They were not discovered in previous analyses, likely because these focused mostly on fragment length variations. The 40 markers applied to eight cultivars were able to split them into six different chlorotypes. The ten known markers are able to establish to which lineages the olive varieties may correspond and to reconstruct their phylogeny with potential ancestors, while the new markers should make it possible to divide cultivated olives into new chlorotypes and to finely assign them to different lineages within the Mediterranean O. europaea complex.

These markers could provide a valuable contribution to understanding the evolutionary and ecological processes involved in olive domestication as well as to increasing knowledge about the function of plastid genes in plant metabolism. They could be used to screen olive genotypes, to assess the chlorotype distribution among cultivars and to better determine their phylogenetic relationships with the wild populations as well as with other O. europaea subspecies. This could help reconstruct the origin of the cultivated olive and determine the timeline involved in the distribution of chlorotypes from traditional varieties throughout the Mediterranean region.

Most of these polymorphisms showed a high level of reorganization among cultivars, particularly in the intergenic regions such as psaA-ycf3, rps2-rpoC2 and trnS-GCU-trnG-UCC. This observation demonstrates that after rearrangements occurred within the plastid genome, these changes were fixed and maintained within cultivars by vegetative propagation. The putative functional role that these mutations may play in modifying the metabolism of olive cultivars and in developing adaptations to the environment will also represent a further contribution to understanding the genetic background of the olive, providing insights into the evolution of plant phenotypes. The application of these polymorphisms as functional markers will also be considered.

Finally, these polymorphisms represent a new source of markers for olive DNA barcoding to distinguish between cultivars, for practical applications related to DNA-based tracking of olive oil and the identification of archaeological remains. One particular focus involves their potential use in DNA tracking of food products derived from the olive (e.g., olive oil and table olives), based on the assumptions that: i) the high number of chloroplasts per cell increases the probability that trace amounts of DNA can be amplified from these food products; ii) their maternal origin excludes the risk that DNA from pollinators would be amplified instead; iii) the haploid chloroplast genome can produce cultivar-specific single signals. The identification of 30 new polymorphic sites, most of which are located in chloroplast regions previously unexplored in cultivated O. europaea, demonstrates that chloroplast variation in olive cultivars is higher than expected and that new chlorotypes could be discovered through the analysis of a larger number of cultivars.

Methods

Plant material and DNA extraction

For the plastome sequence analysis, leaves of cv. Frantoio were collected from the accession present at the CRA-OLI olive cultivars collection (Collececco, Spoleto). For the detection of intervarietal polymorphisms, a subset of eight cultivars was used, chosen among 30 cultivars pre-selected on the basis of their haplotypes for the intergenic trnS-GCU - trnG-UCC spacer (Table 4). Total DNA was extracted by the DNeasy Plant Mini Kit (Qiagen, Hilden, Germany), following the manufacturer's instructions.

Sequencing strategy: primer design and PCR amplification

Sequencing of the olive plastome was performed by designing a series of PCR primer pairs that produced partially overlapping amplicons and spanned the entire chloroplast genome. For the Large Single Copy (LSC) region, 38 primer pairs located within conserved regions and designed by Grivet et al. [34] were used, avoiding gaps between successive fragments along the cpDNA molecule. Five primer pairs (5, 14, 22, 27 and 38) produced double bands, and two (16 and 28) did not produce any amplification. Thus, new primers for those regions were constructed, following the strategy used for the amplification of the IR and SSC regions. For primer sequences see Additional File 1, Table S2. For the SSC and the IR regions, primers were constructed from conserved sequences derived by the alignment of the plant chloroplast genomes of Jasminum nudiflorum (DQ673255), Populus trichocarpa (EF489041), Vitis vinifera (DQ424856), Eucalyptus globulus (AY780259), Arabidopsis thaliana (AP000423), Gossypium hirsutum (DQ345959), Citrus sinensis (DQ864733), Cucumis sativus (AJ970307), Morus indica (DQ226511), Panax ginseng (AY582139), Solanum lycopersicum (AM087200) and Nicotiana tabacum (Z00044). These sequences were retrieved from GenBank and aligned using Muscle V. 3.7 [47], and the primers were designed using PerlPrimer v1.1.6 [48]. Because the average size of the amplified fragments was approximately 2,500 bp, internal primers to sequence the entire amplicons were also designed. The primer sequences and positions, along with their respective amplicon lengths, are given in Additional File 1, Table S1. PCR amplifications were performed in a final volume of 50 μL containing 1-20 ng of template DNA, 10× PCR buffer, 200 μM of each dNTP, 10 pmol of each primer and 2 U of EuroTaq polymerase (EuroClone). For those fragments that were longer than 5,000 bp, 1 U of LA Taq polymerase (TaKaRa) was used instead. The amplifications were performed with the PCR System 9600 (Applied Biosystems, Foster City, CA) using the following cycling conditions: an initial denaturation step of 95°C for 5 min, followed by 35 cycles of 95°C for 30 sec, 60°C for 30 sec and 72°C for 25 sec and a final elongation step of 72°C for 30 min. For those amplifications including LA Taq polymerase in the PCR mix, the following cycling conditions were used instead: an initial denaturation step of 94°C for 1 min, followed by 30 cycles of 98°C for 60 sec and 68°C for 10 min and a final extension step of 72°C for 10 min. Negative controls (no template DNA) were included in all experiments. The PCR products were checked by electrophoresis on 2% agarose gels, then purified with the JetQuick PCR purification kit (Genomed) and directly sequenced in both directions using the ABI Prism BigDye Terminator V.3.1 Ready Reaction Cycle Sequencing Kit (Applied Biosystems) on an ABI 3130 Genetic Analyzer (Applied Biosystems-Hitachi). The sequences were assembled using BioEdit v7.0.9 software (Ibis Biosciences, Carlsbad, CA). The DOGMA program [49] was used for the initial genome annotation, which was then manually refined using Artemis version 11 [50] and NCBI Blast searches. The annotation of tRNA genes was checked using tRNAscan version 1.21 [51]. The genome map was generated using OGDRAW software V. 1.0 [52].
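The requirement that overlapping amplicons span the whole circular molecule without gaps can be checked computationally once amplicon coordinates are known. The following is a minimal sketch of such a check, not the procedure used in this study; the function name `uncovered_gaps`, the coordinate convention (1-based, inclusive, with start > end meaning the amplicon crosses the origin of the circular map) and the toy coordinates in the usage note are all assumptions made for illustration.

```python
def uncovered_gaps(amplicons, genome_length):
    """Return stretches of a circular genome not covered by any amplicon.

    amplicons: list of (start, end) pairs, 1-based and inclusive (assumed
    convention). A pair with start > end is treated as wrapping around the
    origin of the circular map.
    """
    # Split origin-spanning amplicons into two linear intervals.
    intervals = []
    for start, end in amplicons:
        if start <= end:
            intervals.append((start, end))
        else:
            intervals.append((start, genome_length))
            intervals.append((1, end))
    intervals.sort()

    gaps = []
    covered_to = 0  # last position known to be covered so far
    for start, end in intervals:
        if start > covered_to + 1:
            gaps.append((covered_to + 1, start - 1))
        covered_to = max(covered_to, end)
    if covered_to < genome_length:
        gaps.append((covered_to + 1, genome_length))
    return gaps
```

For a hypothetical 10,000 bp circular genome tiled by four amplicons, `uncovered_gaps([(1, 2500), (2300, 5000), (5200, 7600), (7400, 300)], 10000)` reports the single uncovered stretch between the second and third amplicons, which is where a new primer pair would be needed.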

Evaluation of repeat structures

Msatfinder v. 2.0.9 [53] was used to identify simple sequence repeats (SSRs), with the following settings: a six-repeat threshold for mono-nucleotide SSRs, a five-repeat threshold for di- and tri-nucleotide SSRs and a three-repeat threshold for tetra-, penta- and hexa-nucleotide SSRs. The SSR density in the different regions of the chloroplast genome was calculated by dividing the number of SSRs by the length of the given region. Interspersed repeats were identified with REPuter [54] by setting the minimum repeat size to 30 bp and the Hamming distance to 3. The presence and distribution of the repetitive elements were verified manually using Artemis and computationally by performing an intragenomic Blast search. For this purpose, the sequence was interrogated using a local installation of NCBI Blast and a Blast database created with formatDB software (http://www.ncbi.nlm.nih.gov/staff/tao/URLAPI/wwwblast/).
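The repeat thresholds and the density calculation described above can be illustrated with a short script. This is only a regular-expression sketch under the stated thresholds, not Msatfinder's own algorithm, and its output may differ from that program's (for example in how compound or imperfect repeats are handled). The names `MIN_REPEATS`, `find_ssrs` and `ssr_density`, and the test sequence in the usage note, are hypothetical.

```python
import re

# Minimum number of repeat units per motif length, as stated in the text:
# mono 6, di and tri 5, tetra/penta/hexa 3.
MIN_REPEATS = {1: 6, 2: 5, 3: 5, 4: 3, 5: 3, 6: 3}


def find_ssrs(seq):
    """Return (1-based start, motif, repeat count) for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in MIN_REPEATS.items():
        # A motif of motif_len bases followed by at least min_rep - 1 copies.
        pattern = re.compile(
            r"(([ACGT]{%d})\2{%d,})" % (motif_len, min_rep - 1)
        )
        for m in pattern.finditer(seq):
            motif = m.group(2)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA",
            # "ATAT"); those tracts belong to the shorter motif class.
            if any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            ):
                continue
            hits.append((m.start() + 1, motif, len(m.group(1)) // motif_len))
    return sorted(hits)


def ssr_density(n_ssrs, region_length_bp):
    """SSR density: number of SSRs divided by the region length in bp."""
    return n_ssrs / region_length_bp
```

On the made-up sequence `"GC" + "A" * 10 + "CGC" + "AT" * 5 + "GGC" + "AGGT" * 3 + "C"`, `find_ssrs` reports one mono-, one di- and one tetra-nucleotide SSR, and `ssr_density(3, 41)` gives the density of that 41 bp toy region in SSRs per bp.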

Identification of polymorphic regions among olive cultivars

To identify sequence polymorphisms, the following potentially variant domains were tested: i) regions containing mono-, di-, tetra- and penta-nucleotide microsatellites; ii) regions previously reported as polymorphic among Olea subspecies; iii) regions containing high sequence variation among the 12 species aligned for primer design (see the sequencing strategy above); iv) barcoding regions previously identified for species discrimination that had never been tested in olive cultivars; and v) plastid ESTs derived from massive sequence analyses of fruit cDNAs [55]. Candidate SSRs were selected among those having the highest number of repeats (Table 1 and Figure 2). Although no mono-nucleotide SSRs with repeats shorter than 10 bp were considered, some were indirectly included in the analyses of other regions. PCR amplifications were performed in a final volume of 25 μl containing 25 ng of template DNA, 2.5 μl of 10× PCR buffer, 0.5 mM of each dNTP, 1 μM of each primer and 1.5 U/μl of PerfectTaq DNA Polymerase (5-PRIME). The amplifications were run on a Mastercycler Gradient thermal cycler (Eppendorf) using the same conditions as previously indicated for plastid sequencing. After an initial evaluation by electrophoresis on a 2% agarose gel, amplicons were sequenced in both directions using the ABI Prism BigDye Terminator V.3.1 Ready Reaction Cycle Sequencing Kit and run on an ABI 3130 Genetic Analyzer (Applied Biosystems-Hitachi). The sequences of each region were aligned to evaluate the presence of SNPs, indels or polymorphic microsatellites among the six cultivars. To use these polymorphisms as chloroplast markers able to distinguish olive cultivars from each other, specific primers located within conserved flanking regions were constructed (Additional File 1, Table S1). The resulting fragments ranged in size from 145 to 688 bp and could be amplified at an annealing temperature of 60°C. Some amplicons included from two to five polymorphisms. All 40 polymorphisms can be amplified by a set of 21 primer pairs.
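The step of scanning aligned sequences for SNPs and indels among cultivars can be sketched as a column-by-column comparison. This is an illustrative outline only, assuming the sequences are already aligned to equal length with "-" marking gaps; the function name `variable_sites` and the cultivar labels and sequences in the usage note are hypothetical and do not come from the study's data.

```python
def variable_sites(alignment):
    """List the variable columns of a multiple alignment.

    alignment: dict mapping a cultivar label to its aligned sequence
    (equal lengths, '-' for gaps). Returns a list of
    (1-based column, "SNP" or "indel", states in input order).
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("sequences must be aligned to the same length")

    sites = []
    for i in range(length):
        states = "".join(alignment[name][i].upper() for name in names)
        if len(set(states)) > 1:
            # Any gap in the column is reported as an indel, otherwise SNP.
            kind = "indel" if "-" in states else "SNP"
            sites.append((i + 1, kind, states))
    return sites
```

For three made-up sequences, `variable_sites({"cv_A": "ACGTTTTTAC", "cv_B": "ACGTTTT-AC", "cv_C": "ACATTTTTAC"})` flags column 3 as a SNP and column 8 as an indel. A length difference in a microsatellite shows up here as one or more indel columns, so polymorphic SSRs would still need to be recognised from their sequence context.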

Authors' contributions

CGNM1 and MDC2: contributed to the DNA sequencing of the entire plastome. MR1: conducted all the experiments to establish chloroplast variation at varietal level. AR1: conducted bioinformatic analyses, contributed to the DNA sequencing of the IR and SSC of plastome and revised the manuscript. LB1: conceived the study and wrote the manuscript. All authors read and approved the final manuscript.

Author details

1 CNR - Institute of Plant Genetics, 06128 Perugia, Italy
2 University of Cordoba - Dep. of Agronomy, 14071 Cordoba, Spain

Additional file 1

Table S1 and Table S2. Supplemental tables in a Word DOC.
  41 in total

1.  Automatic annotation of organellar genomes with DOGMA.

Authors:  Stacia K Wyman; Robert K Jansen; Jeffrey L Boore
Journal:  Bioinformatics       Date:  2004-06-04       Impact factor: 6.937

2.  Chloroplast genome (cpDNA) of Cycas taitungensis and 56 cp protein-coding genes of Gnetum parvifolium: insights into cpDNA evolution and phylogeny of extant seed plants.

Authors:  Chung-Shien Wu; Ya-Nan Wang; Shu-Mei Liu; Shu-Miaw Chaw
Journal:  Mol Biol Evol       Date:  2007-03-22       Impact factor: 16.240

Review 3.  Gene duplication, transfer, and evolution in the chloroplast genome.

Authors:  Ai-Sheng Xiong; Ri-He Peng; Jing Zhuang; Feng Gao; Bo Zhu; Xiao-Yan Fu; Yong Xue; Xiao-Feng Jin; Yong-Sheng Tian; Wei Zhao; Quan-Hong Yao
Journal:  Biotechnol Adv       Date:  2009 Jul-Aug       Impact factor: 14.227

4.  A DNA barcode for land plants.

Authors:  CBOL Plant Working Group
Journal:  Proc Natl Acad Sci U S A       Date:  2009-07-30       Impact factor: 11.205

5.  Characterization and primer development for amplification of chloroplast microsatellite regions of Fraxinus excelsior.

Authors:  M E Harbourne; G C Douglas; S Waldren; T R Hodkinson
Journal:  J Plant Res       Date:  2005-08-05       Impact factor: 2.629

6.  A set of conserved PCR primers for the analysis of simple sequence repeat polymorphisms in chloroplast genomes of dicotyledonous angiosperms.

Authors:  K Weising; R C Gardner
Journal:  Genome       Date:  1999-02       Impact factor: 2.166

7.  On chloroplast DNA variations in the olive ( Olea europaea L.) complex: comparison of RFLP and PCR polymorphisms.

Authors:  G. Besnard; A. Bervillé
Journal:  Theor Appl Genet       Date:  2002-02-08       Impact factor: 5.699

8.  Geographic origin and taxonomic status of the invasive Privet, Ligustrum robustum (Oleaceae), in the Mascarene Islands, determined by chloroplast DNA and RAPDs.

Authors:  R I Milne; R J Abbott
Journal:  Heredity (Edinb)       Date:  2004-02       Impact factor: 3.821

9.  Complete plastid genome sequence of the chickpea (Cicer arietinum) and the phylogenetic distribution of rps12 and clpP intron losses among legumes (Leguminosae).

Authors:  Robert K Jansen; Martin F Wojciechowski; Elumalai Sanniyasi; Seung-Bum Lee; Henry Daniell
Journal:  Mol Phylogenet Evol       Date:  2008-06-27       Impact factor: 4.286

10.  Comparative 454 pyrosequencing of transcripts from two olive genotypes during fruit development.

Authors:  Fiammetta Alagna; Nunzio D'Agostino; Laura Torchia; Maurizio Servili; Rosa Rao; Marco Pietrella; Giovanni Giuliano; Maria Luisa Chiusano; Luciana Baldoni; Gaetano Perrotta
Journal:  BMC Genomics       Date:  2009-08-26       Impact factor: 3.969
