
Development of genic and genomic SSR markers of robusta coffee (Coffea canephora Pierre Ex A. Froehner).

Prasad S Hendre, Ramesh K Aggarwal.

Abstract

Coffee breeding and improvement efforts can be greatly facilitated by the availability of a large repository of simple sequence repeat (SSR) based microsatellite markers, which provide efficiency and high resolution in genetic analyses. This study aimed to improve SSR availability in coffee by developing new genic and genomic SSR markers using in-silico bioinformatics and a streptavidin-biotin based enrichment approach, respectively. The expressed sequence tag (EST) based genic microsatellite markers (EST-SSRs) were developed using the publicly available dataset of 13,175 unigene ESTs, which showed a distribution of 1 SSR/3.4 kb of coffee transcriptome. Genomic SSRs, on the other hand, were developed from an SSR-enriched small-insert partial genomic library of robusta coffee. In total, 69 new SSRs (44 EST-SSRs and 25 genomic SSRs) were developed and validated as suitable genetic markers. Diversity analysis of selected coffee genotypes revealed these to be highly informative in terms of allelic diversity and PIC values, and eighteen of these markers (∼ 27%) could be mapped on a robusta linkage map. Notably, the markers described here also showed very high cross-species transferability. In addition to the validated markers, we have also designed primer pairs for 270 putative EST-SSRs, which are expected to provide another ca. 200 useful genetic markers considering the high success rate (88%) of marker conversion of similar pairs tested/validated in this study.

Year:  2014        PMID: 25461752      PMCID: PMC4252042          DOI: 10.1371/journal.pone.0113661

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

The coffee tree belongs to the genus Coffea, which comprises two main cultivated species, C. arabica L. (tetraploid, 2n = 4x = 44) and C. canephora Pierre ex A. Froehner (diploid, 2n = 2x = 22), yielding arabica and robusta coffee, respectively. Arabica coffee is known for excellent cup quality but suffers from a narrow genetic base due to its domestication history, and from susceptibility to diseases and pests. In contrast, robusta coffee, though poorer in quality, adapts better to various stresses. To keep pace with the environment and with market preferences, there is a continuous need for genetic improvement of coffee, which unfortunately is severely constrained by the inherently slow pace of tree breeding using conventional methods, and by a variety of other reasons [1], [2]. The situation demands development of new, easy, practical technologies that can provide acceleration, reliability and directionality to the breeding efforts, as well as characterization of the cultivated/secondary gene pool for proper utilization of the available germplasm in coffee genetic improvement programs. In this context, DNA polymorphism based genetic markers become important, as they have proven to be of immense value in characterization and genetic improvement of plant germplasm resources. Among the different types of DNA markers, microsatellite or SSR markers are considered ideal for studying genetic diversity, population structure, phylogenetic relationships, construction of framework linkage maps, QTL interval mapping, marker-assisted selection (MAS), etc., thereby aiding in genetic improvement of crop plants [1]. In the last few years a number of efforts have led to the development of a few hundred SSR markers in coffee [2]–[12], but these are insufficient to realize the full potential of markers for mapping/linkage studies in coffee, more so in arabicas, which have an extremely narrow genetic base.
Moreover, most of the described markers are poorly validated, especially for their utility in the cultivated gene pool comprising arabica and robusta coffee. The situation thus calls for new efforts to generate additional validated markers that are of practical utility in marker-based genetic studies and coffee breeding. With advancements in genomic studies, there has been a huge increase in the EST sequences in the public domain, which provide an easy and cost-effective opportunity to identify and develop EST-based SSR markers. These have the additional advantage of assessing functionally relevant genetic diversity [13], [14], and also show very high cross-species transferability [8], [11]. In this study, we used the coffee EST database containing 13,175 unigenes [15] to identify SSRs in the expressed part of the coffee genome, and used these to develop novel coffee-specific EST-SSRs for use as efficient genetic markers. We thus describe here 44 new validated genic SSRs, and another set of 270 putative similar markers that need further validation. In addition, we describe 25 new genomic SSR markers that were developed from an SSR-enriched, partial, small-fragment genomic library constructed using an affinity capture approach.

Materials and Methods

Plant material and DNA extraction

The plant material used for the validation of SSR markers comprised a set of 16 elite coffee genotypes belonging to C. arabica (tetraploid arabicas) and C. canephora (diploid robustas) and 14 related wild species belonging to Coffea and Psilanthus [2] that were available in the Coffee Germplasm Bank maintained at the Central Coffee Research Institute, Balehonnur, Karnataka, India. Fresh leaf samples collected from each genotype were used for DNA isolation as described by Aggarwal et al. [16]. The DNA isolated from robusta variety CxR was used for constructing the SSR-enriched small-insert genomic library.

Microsatellite screening of coffee transcriptome, identification of SSRs and marker development

An EST database of robusta coffee comprising 13,175 unigene ESTs [15] was downloaded from the ftp site (ftp://ftp.sgn.cornell.edu/coffee/) maintained by the Sol Genomics Network (SGN, http://www.sgn.cornell.edu/coffee.pl). The database was used for: (i) identification and localization of SSRs using the microsatellite search module MISA (MIcroSAtellite, http://www.pgrc.ipk-gatersleben.de/misa), with the criteria being a minimum repeat core of 12 bp, considering base complementarity, and a minimum distance of 50 bp between two SSRs; (ii) selecting the ‘usable/candidate SSRs’ for marker development, being those that carried a repeat core at least 18 bp long (nine repeat units of dinucleotide repeats (DNRs), six of trinucleotide repeats (TNRs), five of tetranucleotide repeats (TtNRs), four of pentanucleotide repeats (PNRs), or three of hexanucleotide repeats (HNRs)); (iii) designing primer pairs for the selected usable SSR sequences using the PRIMER 3 tool embedded in MISA and/or GENETOOL Lite version 1.0 (http://www.biotools.com/downloads/brochures/GeneTool2.pdf); and (iv) standardizing PCR conditions followed by validation of working primer pairs for genetic studies as described earlier [2].
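The two length thresholds in steps (i) and (ii) can be illustrated with a small script. This is only a sketch and not the MISA implementation: the function names are hypothetical, mononucleotide runs are ignored by assumption, and neither the base-complementarity grouping of motifs nor the 50 bp rule for neighbouring SSRs is modelled. The minimum repeat count for each motif length is taken as the smallest count that reaches the required core length, which for 18 bp gives the 9/6/5/4/3 units quoted above.

```python
import re

SCREEN_MIN_BP = 12   # minimum repeat core used for the initial screen (from the text)
USABLE_MIN_BP = 18   # minimum repeat core for 'usable/candidate' SSRs (from the text)


def min_repeats(unit_len, min_bp):
    """Smallest repeat count whose core is at least min_bp long (ceiling division)."""
    return -(-min_bp // unit_len)


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. ATAT, AAA)."""
    n = len(motif)
    return not any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_ssrs(seq, min_bp=SCREEN_MIN_BP):
    """Return (start, motif, repeat_count, core_length) for perfect di- to hexanucleotide SSRs.

    Assumption: only perfect, uninterrupted repeats are reported.
    """
    seq = seq.upper()
    hits = []
    for unit_len in range(2, 7):
        n = min_repeats(unit_len, min_bp)
        # One captured motif followed by at least n - 1 further copies of it.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if not is_primitive(motif):
                continue  # already reported under its shorter unit
            core = m.group(0)
            hits.append((m.start(), motif, len(core) // unit_len, len(core)))
    hits.sort()
    return hits


def usable_ssrs(seq):
    """SSRs meeting the stricter 18 bp criterion used for marker development."""
    return find_ssrs(seq, USABLE_MIN_BP)


if __name__ == "__main__":
    # Toy sequence (not from the coffee dataset): an (AG)9 and an (AGC)4 repeat.
    example = "TTGACC" + "AG" * 9 + "CATGCATT" + "AGC" * 4 + "GGTACC"
    print(find_ssrs(example))
    print(usable_ssrs(example))
```

In this toy sequence both repeats pass the 12 bp screen, but only the (AG)9 core (18 bp) qualifies as a candidate for primer design.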

Construction of an SSR-enriched small-insert genomic library/development of genomic SSRs

A partial genomic DNA library enriched for microsatellite repeats was constructed using the methods described earlier [17]. Briefly, the method involved: one-step restriction digestion of genomic DNA with Hae III enzyme (NEB) and ligation of the resulting fragments with the ds Mlu-I adaptor (Mlu-F: CTC TTG CTT ACG CGT GGA CTA and Mlu-R: pTAG TCC ACG CGT AAG CAA GAG CAC) [18]; amplification of the restricted-ligated DNA pool using the Mlu-F primer; and SSR enrichment of the amplified DNA pool using liquid phase hybridisation (in 6X SSC) with streptavidin coated paramagnetic beads (Dynal) carrying a biotinylated equimolar pool of four oligos, (CA)15, (GA)15, (GAA)15 and (CAA)15. This was followed by amplification of the hybridized/trapped genomic DNA fragments by PCR and construction of a partial genomic library in TA vector (Invitrogen) as per the manufacturer’s instructions. A number of positive (white) recombinant clones were randomly picked from the library, amplified and sequenced on both strands using M13 universal primers on an ABI 3730 DNA Analyzer (Applied Biosystems, USA). The sequences were aligned and edited using Autoassembler (Applied Biosystems, USA). The SSR-positive sequences were identified and used for development of new genomic SSRs as described earlier [2]. The PCR products amplified using all the new SSRs were resolved on a capillary-based ABI 3730 DNA Analyzer and were precisely sized for major, comparable and conspicuous peaks using GeneMapper 3.7 (Applied Biosystems) with default parameters.
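The relationship between the two adaptor oligos can be checked directly from the sequences given above. The following is a minimal sketch with hypothetical helper names; the 5' phosphate on Mlu-R is dropped because it is not part of the base sequence. It tests whether Mlu-R begins with the reverse complement of Mlu-F, reports any unpaired bases left over, and looks for the MluI recognition sequence (ACGCGT) in the top strand.

```python
MLU_F = "CTCTTGCTTACGCGTGGACTA"     # Mlu-F as given in the text, spaces removed
MLU_R = "TAGTCCACGCGTAAGCAAGAGCAC"  # Mlu-R as given in the text, 5' phosphate omitted
MLUI_SITE = "ACGCGT"                # MluI recognition sequence


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def adaptor_summary(top, bottom, site):
    """Describe how the bottom oligo pairs with the top oligo (hypothetical helper)."""
    rc_top = reverse_complement(top)
    duplex = bottom.startswith(rc_top)
    # Bases of the bottom strand that extend beyond the paired region, if any.
    overhang = bottom[len(rc_top):] if duplex else None
    return {"duplex": duplex, "overhang": overhang, "site_in_top": site in top}


if __name__ == "__main__":
    print(adaptor_summary(MLU_F, MLU_R, MLUI_SITE))
```

On these sequences the check indicates that the first 21 bases of Mlu-R pair fully with Mlu-F, leaving three unpaired bases (CAC) at the 3' end of Mlu-R, so the phosphorylated end of the duplex is blunt and the other end is not. A blunt end is what would be needed for ligation to Hae III fragments, since Hae III leaves blunt ends; this is an interpretation of the sequences rather than a statement from the cited protocol.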

Statistical, genetic and diversity analysis

The data for EST-SSRs and genomic SSRs were analyzed separately for various genetic parameters, viz., mean, standard deviation, expected heterozygosity (H), Hardy-Weinberg equilibrium (HWE) and linkage disequilibrium (LD) using Arlequin ver 3.1 [19], and polymorphism information content (PIC) and private alleles (PAs) using Convert ver. 1.3.1 [20]. Cross-taxa transferability (T) was calculated over 15 species (except C. canephora) as the proportion of primers showing successful amplification vis-à-vis all the tested primers, whereas primer conservance (C) was calculated as the proportion of the species displaying successful amplification vis-à-vis all the tested markers. Genetic diversity analysis to infer generic relatedness/affinities was performed over informative polymorphic markers (PMs) for cultivated genotypes/related species using MicroSatellite Analyzer [21] with Nei’s genetic distance [22]. The genetic distance matrices were used to construct a Neighbour Joining (NJ) consensus tree using Phylip ver 3.6 [23], which was viewed using Treeview ver 1.6.6 [24]. We also attempted mapping of the new markers on a robusta linkage map using JoinMap ver 4.0 [25] as described earlier [2], [26], using a group LOD score of 5.0.
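For readers unfamiliar with these summary statistics, the sketch below shows how they are commonly computed from allele calls and an amplification table. It does not reproduce Arlequin or Convert: H is taken as 1 minus the sum of squared allele frequencies without any sample-size correction, PIC uses the widely used Botstein formula, and T and C follow one reading of the definitions above (T per species across primers, C per marker across species). All function names and the toy data are hypothetical.

```python
from collections import Counter


def allele_frequencies(alleles):
    """Relative frequency of each allele in a list of allele calls at one locus."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return [c / total for c in counts.values()]


def expected_heterozygosity(freqs):
    """H = 1 - sum(p_i^2); assumption: no small-sample correction is applied."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2 (Botstein formula)."""
    hom = sum(p * p for p in freqs)
    cross = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return 1.0 - hom - cross


def transferability(amplified):
    """Per species: fraction of tested primers that amplified (amplified[species][marker] -> bool)."""
    return {sp: sum(row.values()) / len(row) for sp, row in amplified.items()}


def conservance(amplified):
    """Per marker: fraction of tested species in which it amplified."""
    markers = list(next(iter(amplified.values())))
    return {m: sum(amplified[sp][m] for sp in amplified) / len(amplified) for m in markers}


if __name__ == "__main__":
    # Toy allele sizes (bp) at one locus; not data from this study.
    freqs = allele_frequencies([150, 150, 152, 152])
    print(expected_heterozygosity(freqs), pic(freqs))
    # Toy amplification table with hypothetical species and marker labels.
    table = {"sp1": {"m1": True, "m2": False}, "sp2": {"m1": True, "m2": True}}
    print(transferability(table), conservance(table))
```

With two equally frequent alleles the sketch gives H = 0.5 and PIC = 0.375, which shows why PIC is always somewhat lower than H for the same locus.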

Results

In this study, we undertook in-silico analysis of a robusta coffee transcriptome to identify and develop coffee-specific EST-SSR markers. Simultaneously, we also attempted development of genomic SSRs by constructing a small-insert SSR-enriched partial genomic DNA library. The new markers (44 EST-SSRs and 25 genomic SSRs) were validated for their utility in genetic studies using panels of elite coffee genotypes, and related taxa of coffee for cross-species transferability. The details of these new markers, viz., locus designation, primer sequences, repeat motifs, amplification temperature, amplicon size, and SGN ID or GenBank accession numbers, are given in Tables 1 and 2. Details of additional primer pairs for 270 putative EST-SSRs, which were designed but still need to be validated, are provided in Table 3.
Table 1

Details of the new EST-SSR markers developed using EST database and validated using diverse coffee genotypes in the present study.

Sl. No. | Marker Id | Forward primer (F) | Reverse primer (R) | Repeat motif | Allele size (bp)* | Source EST ID** | Linkage group
1 | CCESSR01 | TGGTAGCACTGTCGGAAGCATAT | GACCCATCTAACTTGCTGCATTTT | (AGC)7 | 239 | 119516 | –
2 | CCESSR02 | AAGATATGTTTTAGCCCAAGTAGTGAC | ATTGGTTGGTACTGTTTAGCTGTTCAT | (AT)11 | 168 | 119534 | –
3 | CCESSR03 | CAGCCGTATCAGCACCAGCAT | TTCCCAACCCGTCAAAGTCCT | (TC)10 | 217 | 119559 | CLG02
4 | CCESSR04 | GTCGACAACAGCGCTCAAGATC | CAAAAAAGACTGGAAAGAGGGTATTAG | (TCT)6 | 259 | 119613 | –
5 | CCESSR05 | AGGGGCTGGTTATTTTTTGGG | GGGGGTAAATACGGGAAAGCAGA | (GCA)6 | 141 | 119723 | –
6 | CCESSR06 | AACCCTCCCTCCTCCACCTTTTC | GGAGGTGGTGGTGGTGTAAAAAAAG | (AC)23 | 224 | 119781 | CLG09
7 | CCESSR07 | CCCCCTTCCCATTTTTTCCCCT | CCGGGTGGTAGAGAGATTGCTGCA | (AGC)6 | 194 | 119864 | –
8 | CCESSR08 | AAACCACAACATCCCCCAAGAGT | GGCAGTAGAATTGGAGCGGTGAAA | (GCA)8 | 149 | 120008 | –
9 | CCESSR09 | CCCCCACCCACTTCTCTTTG | ACAACAAACGAACGCTCTCTGATAA | (TTC)8 | 184 | 120113 | –
10 | CCESSR10 | GCAGAAGAAGCACCAGTAGCAGAAGAAG | TGCCTTCTACTCTTCACTCTTCTCCACT | (GCT)7 | 207 | 120120 | –
11 | CCESSR11 | AGGGAAAAAGAAGAGTGAGAAGAATATT | GGGACACTCACATATACTGCAAACTTAG | (GAA)6 | 140 | 120217 | –
12 | CCESSR12 | CCGCCATCCCTTTTGCCTTTC | ACGGCAGCAGAAGTGGAGGTGTT | (CCA)6 | 200 | 120227 | CLG02
13 | CCESSR13 | GCGGGGTAGTTTTGGGAATATGG | TTTGGGGTCCTTTTTCTTTCACACAT | (TTA)6-tt-(TTC)6 | 124 | 120252 | CLG02
14 | CCESSR14 | CTTGCCCCCTTCCCTCCCACTC | TTCGGCTCCTTGTGTTTGGGTA | (CT)10 | 232 | 120329 | –
15 | CCESSR16 | AGAGCAATGAGAAACAAACGAAACT | AGGTGCCCAACTATCCCAGAAT | (CAT)6 | 173 | 120439 | –
16 | CCESSR17 | CTCCACACCAACAAAATCCCACTT | CCCACATCCTGAGTCTGCTGCTAA | (CAG)6 | 169 | 120475 | –
17 | CCESSR18 | GGGGAGGATGCTTATGATATGAGG | TCCGGTTCACCTGCTTTTCCTT | (CAC)6 | 152 | 120498 | –
18 | CCESSR19 | CATCGTATCTCGCCCTCTCTCTTTC | CCACAACAAGTACAACCAACCGAAAC | (CA)13 | 219 | 120514 | –
19 | CCESSR20 | TTCTGGCCGATTGATTGTGAT | GCGACAAGGCTGACAAACTACTAC | (AAG)7 | 143 | 120538 | –
20 | CCESSR21 | CGAGCTAGTGCAGACAGATTGAGAT | GTCCTTGGCGAAATCCCTCAG | (AG)17 | 164 | 120568 | CLG13
21 | CCESSR22 | CCCTCAATCTCGTCCCCCTCT | CCCTCCATAAATCTTCTTCACGTACTC | (TC)9 | 153 | 120823 | –
22 | CCESSR23 | GGCCTCTCTTTAATTTTCTTGTCTTTTTTC | ATGGAGGGTAGGGTTTCGAGAGTGA | (TTC)8 | 160 | 120860 | –
23 | CCESSR26 | AACCGGCCTTCTTGTATGATTCTCTA | TTGGCTAACCCTCACTCTCTCCCTACTA | (CAT)8 | 206 | 121392 | CLG07
24 | CCESSR27 | GCCCACTCCATTCGTACTTGTTTCC | GCGGTGCTGCTCAATGCTCAT | (CT)12 | 120 | 121464 | –
25 | CCESSR28 | AAAATGAGTGACGATGGGAAAGACA | GAGGGAAGCCGATCACTGGTTG | (CCA)7 | 193 | 121482 | –
26 | CCESSR29 | GGCGCTAGAGTTGGTTGTTTGC | CAGGCATTGGAACCAGCGAAC | (CTTCT)5 | 95 | 121548 | –
27 | CCESSR31 | AGAAGAGTACTGAAGGCCTGGAAGA | AGCATCTGCAGCCTCCATAGC | (GTG)6 | 220 | 121671 | –
28 | CCESSR32 | CTTGGCGTTTAGCGTTCTCACATT | GCTCAACCAAACCAATACATACCTCTT | (TA)9 | 163 | 121811 | –
29 | CCESSR33 | GCCCGCATGGACGACTTGGA | CGCTTGACGTATCCTTTGGCCTCT | (AGG)7 | 227 | 121876 | CLG13
30 | CCESSR34 | GCATTGCTCCCCCCACTTCA | GAGCATGGGGACGAGGAGGA | (CTC)6 | 168 | 121905 | –
31 | CCESSR35 | CTGCTAATGCTGCTGAAAAAGAGATACC | GGCTGTGAATTCTTGTGACTTGTGACT | (AG)10 | 107 | 121994 | CLG13
32 | CCESSR36 | AGCCTTCTGCAATTCCCTCGTACA | GGCGTCGTAGAGGGCATTCAGA | (AG)9 | 100 | 122089 | –
33 | CCESSR38 | GCCCGAGGGTTAGATTGATCA | CTTGTCTTCTGTTTGATTTTGTGTTCTA | (AG)12 | 162 | 122114 | –
34 | CCESSR39 | GCGACCGGACGACCAAAAATAAT | CGCCGTCGTCAGAGTCATAATAATCA | (GAA)7 | 131 | 122147 | –
35 | CCESSR40 | CGTGGGGGTTTGTTTTTCTCG | GTCCCCCCTCAGCCGTTTTTG | (AGG)7 | 205 | 122194 | –
36 | CCESSR41 | GGGCTGCAGGCTTGTCACCAC | AATCGGTTTAGTTTTTTGTTTCCTCAC | (GA)10 | 205 | 122322 | –
37 | CCESSR42 | CGGGCGGAAACGGTCAGATC | TGCCGTTGTTGTTGTCCAGGTG | (GCA)7 | 117 | 122295 | –
38 | CCESSR43 | CCCAGCAAGAACTCAACCCCATCA | TGGCCTAATGAAGATGACGTTGCTGATG | (TTC)8 | 172 | 122653 | CLG02
39 | CCESSR44 | AGGAATAATGGAGGAGACGTTGTTG | GCACAAATCCCAGTACTTCCTCATAGA | (CTT)7 | 236 | 122680 | CLG02
40 | CCESSR45 | AAATGGCCGAGATAGAGAAGGAGAAG | CCCACTCCTCCGCGGTACTGATC | (AGG)7 | 135 | 122764 | –
41 | CCESSR47 | GCAGCAACAATCACTTCCACAGC | TGCTGTTGTAACTGCGGGATTTG | (GCA)6 | 198 | 122922 | –
42 | CCESSR48 | GCAACCTTATCTAGATTCAACTTCAACTT | CGGGAAGAAATGGCAGCCTATAC | (AAATCA)5 | 188 | 122975 | –
43 | CCESSR49 | GCGGCCATCCTTGTCTTCG | TAGCCGCTGACGTAATCTTCCTT | (ATC)8 | 186 | 122978 | CLG02
44 | CCESSR50 | GGGATGATGTGGATTCTATGGTCTACTA | ATGCCATTTTAACACTTCCTCCTCA | (CAG)7 | 108 | 123181 | CLG10

CCESSR: CCMB Coffee EST SSR marker;

F: forward primer;

R: reverse primer;

–: Unmapped;

CLG: Combined Linkage Group [1];

*: Predicted amplicon size based on source EST sequence;

**: Source EST ID as per the downloaded SGN database (ftp://ftp.sgn.cornell.edu/coffee/).

Table 2

Details of the new genomic SSR markers developed using the streptavidin-biotin affinity capture SSR-enriched library in the present study.

Sl. No. | Primer Id | Forward primer (F) | Reverse primer (R) | Repeat motif | Ta (°C) | Allele size (bp)* | GenBank acc. no. | Linkage group
1 | CCRM02 | AATGGTGGCAGTCCTGAAAGATC | AACATCAACTTTCCTGGTCTTC | (GA)12 | 57 | 268 | KM874369 | –
2 | CCRM06 | TTCTTATCACCTTGGGCTACCTTTCTTC | AAGCGGTTTAGTTTTTTGTTTCCTCAC | (AG)8 | 57 | 146 | KM874370 | –
3 | CCRM07 | TAAAGGATGGTATATGTGGCTGGAGTA | CCACAGCCTCGGCATTTACTATATAT | (AT)8 | 57 | 126 | KM874371 | CLG01
4 | CCRM10 | AAAAAGACAAGATTCAACCTGCAGTAGT | TTCCCACCCCCCAAAAAAAA | (GT)9 | 57 | 104 | KM874372 | –
5 | CCRM14 | ATTTGATTTCTTCTTTCTCTGTTGTC | ACAAAAGCCCTGAAAATAATAGATCTA | (CT)22 | 55 | 130 | KM874373 | –
6 | CCRM15 | CGAAATTGACGAAGCTCTTGTT | TTGCTAGTTTCGAAATCGTGTAAGGAC | (CTT)6 | 57 | 243 | KM874374 | –
7 | CCRM16 | TCCTATAGCAGAAACACAAAATGACACAG | GGTTTTTGGGTTCTTTTTAGCATATACA | (TC)26 | 55 | 223 | KM874375 | –
8 | CCRM17 | TAAGCGTTGGAATTCCTCACTCTATCT | ACAGCTAAAGAAACAATGAACCAGT | (CT)17 | 55 | 228 | KM874376 | CLG01
9 | CCRM19 | GTTTTTTTTTTCTTTTTTCTTTTTGAGCT | AAGGCAATGTTGGTCAGCAGTGG | (GA)26 | 57 | 252 | KM874377 | CLG06
10 | CCRM21 | CACCCCTCCCATCCGTTGAAACAT | AATGATGCTCCCAGTGTTTGATGA | (GA)16 | 57 | 258 | KM874378 | CLG06
11 | CCRM22 | CTTGCAGTTTACTTCCCTTTGGTTG | TTTTCTTCTGTATATTGTTGGAGTTCTTC | (GA)29 | 57 | 241 | KM874379 | –
12 | CCRM23 | CGGCAGTGTGGTCCCCTTTGAAT | AAAAAAAAACTCACACTCTATCAAAACTAAGG | (GTT)6 | 57 | 141 | KM874380 | –
13 | CCRM24 | GAGTGTGAGTTTTTTTTTGTGACCTTAA | ACCCCACATTCCTCTCATCCATTC | (GA)9.(GA)9 | 57 | 213 | KM874381 | –
14 | CCRM28 | GGGGCAACAAGTGGTAGGATATGAAGAC | CGCCTTCACTATGGTTTTGCCTTCTAA | (CA)10 | 57 | 209 | KM874382 | –
15 | CCRM31 | CTTTTATGTCTATCTGTCTCTGCTTTTC | CCTGCAGTAGTTTCACCCTTTATCC | (CA)10 | 57 | 114 | KM874383 | –
16 | CCRM33 | ACAGCCGTTGAACTTATGGGATTACA | ACAAAGGGATGGAGAGGATGGAATATAC | (CT)12 | 57 | 118 | KM874384 | –
17 | CCRM34 | CCCCAGAACGAAAGGCAATCAT | TTGGGACTATTTATACTGGGGAAGAA | (GA)10 | 57 | 165 | KM874385 | CLG05
18 | CCRM35 | GGGGTTAAATCAGGGGAAAAGTGG | AAGCGAGGGAGAGAGCAGCAGATC | (TG)12 | 57 | 144 | KM874386 | –
19 | CCRM36 | CCATGGGGCAAAAGGCAAATTCTAT | TCCAGACCGCCGTTCACGAAGTATA | (CT)18 | 57 | 171 | KM874387 | –
20 | CCRM37 | TGCTTCCCTTCTCATTCTGGTACTTT | AATCCATCAACAACTTCAGCATACCA | (GT)10 | 57 | 146 | KM874388 | –
21 | CCRM38 | TGAGAATTAAAGCAGCAGGGGTATG | GCAAAAAAAGGCAAAAGCATTACATC | (CA)10 | 57 | 204 | KM874389 | CLG05
22 | CCRM40 | ATTCACGCTTTCATTACTTTTCTC | TTTGTATTTCCTTTCCATTTCTTTTGTA | (CT)29 | 57 | 176 | KM874390 | CLG05
23 | CCRM41 | AGCAGAAACACAAAATGACACAGAGCA | AATGGTCCAAGGAAAATGAAAAATGTT | (CT)24 | 57 | 161 | KM874391 | –
24 | CCRM42 | CGGAGAAGAGCAATATACAAGCAAGG | GCCACCCCAGAACTTTTGCAA | (GA)13 | 55 | 143 | KM874392 | CLG11
25 | CCRM45 | CTTCAAGCAAAATTTTCAACAGCACAG | GGCCCTTTTTTAGTCTCACCACATT | (GA)10 | 57 | 187 | KM874393 | –

CCRM: CCMB CXR Microsatellite marker;

Ta: annealing temperature;

F: forward primer;

R: reverse primer;

–: Unmapped;

CLG: Combined Linkage Group [1];

*: Expected amplicon size in the robusta variety CxR.

Table 3

Details of primer pairs for additional new EST-SSR markers designed in the present study.

Sl No. | Marker ID | Unigene ID* | SSR repeat unit | Forward primer | Predicted Tm (°C), forward primer | Reverse primer | Predicted Tm (°C), reverse primer | Predicted amplicon size (bp)
1 | CCESSR51 | 119463 | (AGAGCA)5 | AGCAGAGATCGAGACAGAGAGA | 59 | ACATATCTGAAACCCTCGGC | 59 | 477
2 | CCESSR52 | 119606 | (CT)9 | AATCGGAGGATTTGTGCTTC | 59 | GCGCCTAAATCACCCATATT | 59 | 390
3 | CCESSR53 | 119736 | (GA)11 | CCGCGGTCAGTCTTACTACA | 59 | ACACAAATCAACACCCATCC | 58 | 266
4 | CCESSR54 | 119787 | (GGA)6 | CGCAATCTTGAATGAGGAGA | 59 | TGGGAGGTTGATCATCTGAA | 59 | 341
5 | CCESSR55 | 119820 | (CT)11 | AGGACAGGAGTGTGATCCCT | 59 | GCCATGTCCTCCTTTCGTAT | 59 | 376
6 | CCESSR56 | 119897 | (TC)11 | AGGTTTTTGTGTCCCTTTGG | 59 | CATCGATGAAAAGCAGCAGT | 59 | 230
7 | CCESSR57 | 119919 | (AT)15 | GAATTTTGTCAGCCAGAGCA | 59 | GACGGAAAGATTCTGGCTTT | 58 | 184
8 | CCESSR58 | 119987 | (TCC)7 | AGCTACGCTAGGCAATTGGT | 59 | GACAACAACAACAGCCAACA | 58 | 296
9 | CCESSR59 | 120008 | (TC)12 | GGAACAAGACTCTCTTGCCA | 58 | TCATCACACAAGGAGGCAAT | 59 | 418
10 | CCESSR60 | 120008 | (TG)9 | ATTGCCTCCTTGTGTGATGA | 59 | CCGGTCGATCAACAATCTTA | 59 | 198
11 | CCESSR61 | 120045 | (TCT)6 | GGTGAAGGGTCCTTACCTGA | 59 | GAGATGTGCTACTGGCTACTGC | 59 | 232
12 | CCESSR62 | 120064 | (GATT)5 | CTCTTGTTTCCAACCCAACC | 59 | CGGACACTGTGAGGAGAGAA | 59 | 320
13 | CCESSR63 | 120107 | (TCT)6 | AGTCCAGTCCAGTCCTGTCC | 59 | CCGATATGATTTTGGTGCTG | 59 | 432
14 | CCESSR64 | 120121 | (GCT)7 | CGAAGTTGTGCAGGATGAAC | 59 | GGAGCTGCTTGCTCTTCTTT | 59 | 241
15 | CCESSR65 | 120179 | (AT)10 | GTGATGCTCGGTGTATCTGG | 59 | ACTAGAGGCCGAGAATTGGA | 59 | 203
16 | CCESSR66 | 120206 | (AGA)10 | GATGAGCTCCAAAACAAGCA | 59 | AAAACTTCCCAGGCTTCAGA | 59 | 447
17 | CCESSR67 | 120260 | (CTCC)5 | GTCGTCTGTTCCTCCTCGAC | 60 | CCACTAATCCCGAGCAAAAT | 59 | 273
18 | CCESSR68 | 120291 | (AAG)8 | CCACGCGAATAATCATCAAC | 59 | AAGCACCTTATCCCCAACAG | 59 | 469
19 | CCESSR69 | 120316 | (TCA)6 | TTGGAAAACCATAGAAGGGC | 59 | ACCCTCATCAATCTCTTGCC | 59 | 408
20 | CCESSR70 | 120320 | (GCCACC)4 | TTAAATCAGCCCTCAAGCCT | 59 | CTTGAAATCTTGCGCACTGT | 59 | 343
21 | CCESSR71 | 120322 | (TTC)6 | TTAGAAAAGCTGCGAGACGA | 59 | TTGACCATTTCCCCTTCTTC | 59 | 342
22 | CCESSR72 | 120543 | (AT)9 | GATTGCTCTCTTTCTTCGGG | 59 | TCCGCCCCAGTTAGATTTAT | 59 | 382
23 | CCESSR73 | 120545 | (AT)9 | GTTTTCCGGCTAGCTTGTTC | 59 | GTGCATGAGGTGAAAATGGA | 60 | 214
24 | CCESSR74 | 120579 | (CCT)7 | TGTAATATTGCTTCGCTCGG | 59 | AAGGGATGATGCCTAGTTGG | 59 | 485
25 | CCESSR75 | 120596 | (TCC)7 | GGCTCAATCGATAAGCAACA | 59 | CATGAACATATCCTGGAGCG | 59 | 429
26 | CCESSR76 | 120656 | (AAC)6 | AAGCGAAGATGGAGAGCAAT | 59 | CCGCCTTTTGTAGAAGAACC | 59 | 316
27 | CCESSR77 | 120656 | (TCA)6 | TTTTAGGCCAATCCTTCACC | 59 | GTATGTGAGGGCGTAACGTG | 59 | 454
28 | CCESSR78 | 120720 | (GAA)6 | GAGCCCTCTTTTTCCTCCTT | 59 | CCGTAATGATACAAACCCCC | 59 | 326
29 | CCESSR79 | 120764 | (CATATA)4–13 bp-(AT)9 | AGCAGCACCAGACTAGCTCA | 59 | GTCCCAGAAACGAAGATGGT | 59 | 399
30 | CCESSR80 | 120821 | (ACAGG)4 | CAGAATCCCATATTCCCCAC | 59 | TTGCTGTCTGATTTCCAAGC | 59 | 324
31 | CCESSR81 | 120883 | (TCAAT)4 | CAGATTCCGGTGTCACAAAG | 59 | TTTGCTGCAGTGTCACTTGT | 58 | 107
32 | CCESSR82 | 120965 | (TCGAAA)4 | GACTATGGATGGCTTTGCCT | 59 | GGCGGATTGGTTGATAGAAT | 59 | 397
33 | CCESSR83 | 121010 | (CCTGCA)4 | TGAATCAACTCCTGCTCCTG | 59 | GGTAGTTCCAGCCACCAGAT | 59 | 115
34 | CCESSR84 | 121086 | (TC)11 | GGATCAAACGTGGCTAAGGT | 59 | AAGAAAAGGGCTACAACGGA | 59 | 203
35 | CCESSR85 | 121131 | (AAG)6 | CCCCGGGCTGCAGAAACAAGT | 60 | AGCTCCGGTAAGCCTCAATA | 59 | 389
36 | CCESSR86 | 121240 | (GCA)6 | AACCCTCCACTCCATTTCAG | 59 | ACTATTGTTGCTGCTGCTGG | 59 | 347
37 | CCESSR87 | 121289 | (TTC)6 | GAAGCAACCCCTCAACTGAT | 59 | AGCACCCTCTGCTTCAATCT | 59 | 345
38 | CCESSR88 | 121377 | (GT)9 | CACGCGGATAATACACCG | 59 | AGTTGCTCCTGCTTCTCCAT | 59 | 191
39 | CCESSR89 | 121392 | (GCT)6 | GCACTGTTTTGAATGGTTGG | 59 | TCCGCACTACAAGTACCGTC | 59 | 159
40 | CCESSR90 | 121439 | (GATTA)4 | TACTTGAGCGATCAGAACCG | 59 | TAATCCTGCGTGCTATTTGG | 59 | 455
41 | CCESSR91 | 121548 | (CTTCT)5 | AGGCGCTAGAGTTGGTTGTT | 59 | CAAAGTAAGCAGCAGGCATT | 58 | 108
42 | CCESSR92 | 121580 | (TC)11 | CCTACATCCCACGACCTTCT | 59 | GTGGTAGTGCTTTGGTGGTG | 59 | 361
43 | CCESSR93 | 121610 | (AT)9 | CTGCCCTATGATGAATGGTG | 59 | TCATCTCAAGCATCGTCTCC | 59 | 157
44 | CCESSR94 | 121752 | (TGCTCC)4 | CCTCTCATGCCAGCAACTAA | 59 | AAAGCACTGGGAACTGAAGC | 59 | 258
45 | CCESSR95 | 121841 | (CT)10 | TGGAGCAGAGATTGTCAAGG | 59 | CCAGCTGGAACTTCCCTG | 59 | 349
46 | CCESSR96 | 122004 | (AT)19 | ACATTCGAAAACTTGGGGAG | 59 | CAAGGTCTTGGGTTTGTCCT | 59 | 232
47 | CCESSR97 | 122006 | (AG)14 | CTCGTGCTGTAACCCTCTCA | 59 | AGTGTGATGGAACGCGAATA | 59 | 439
48 | CCESSR98 | 122012 | (AGC)6 | GACGCGCAGTCTTTCAAGTA | 59 | GCTTCACAGTCGTCTTCCAA | 59 | 373
49 | CCESSR99 | 122077 | (CGC)6 | TCGGACTGCTATAGTGGACG | 59 | TCCTTCGAGACGTTAGCCTT | 59 | 215
50 | CCESSR100 | 122089 | (AG)9 | TTCTGCAATTCCCTCGTACA | 59 | AGGGGTCAGAAGAAGGGATT | 59 | 214
51 | CCESSR101 | 122184 | (AT)10 | AGGGGTGGATTCAGTAGACG | 59 | ATGCAATTGAACCGACTCAG | 59 | 466
52 | CCESSR102 | 122195 | (AGGCTC)6 | AGTCAGCTTCTTGCTGCTCA | 59 | CCCTCCAGAGTACCCATTGT | 59 | 336
53 | CCESSR103 | 122254 | (TCGT)5 | GGAATGAGCTAAAAGCCCAG | 59 | TTACCGTGTAGGTTGGTGGA | 59 | 318
54 | CCESSR104 | 122257 | (TTCA)5 | GCTCAGTGAGGTTCCAAACA | 59 | ACCCTGTAGGTTGGTGGAGA | 59 | 310
55 | CCESSR105 | 122324 | (AGA)6 | AGGCCATAGTGGTGAACACA | 59 | ACAAAGCACAGAACCAACCA | 59 | 319
56 | CCESSR106 | 122340 | (TC)10 | AGCTGATTGAGGATGGCTCT | 59 | CATTGGCTTTCCCCAATACT | 59 | 484
57 | CCESSR107 | 122383 | (CTT)8 | CCTTCTCCCTCTTCCCTTCT | 59 | TGAGTGACAGCGTCAACAAA | 59 | 459
58 | CCESSR108 | 122387 | (CTT)7 | GTGGGAAATCTCAAATGGCT | 59 | CTCTTTCTGCTGGGTTGTGA | 59 | 143
59 | CCESSR109 | 122514 | (AGG)6 | GAAAGACCCCAAACCAAGAA | 59 | GGAGTCATCAACAATGGCAC | 59 | 340
60 | CCESSR110 | 122533 | (TA)10 | GTGCCGTTGTTTGTTTCATC | 59 | GTGCATGAGTGGATTCAAGG | 59 | 324
61 | CCESSR111 | 122550 | (TCA)6 | CTTCATGGACCATTCTCACG | 59 | TAGCCGAAACAGAGCATTTG | 59 | 290
62 | CCESSR112 | 122619 | (GAA)6 | TCCCCTGAATTGGACTCTTC | 59 | ACAAAGGCCAACGTTTTACC | 59 | 452
63 | CCESSR113 | 122646 | (AGC)6 | GCATCCTTGTATTCGGAGGT | 59 | ACCACTCGCTTCAACCTCTT | 59 | 115
64 | CCESSR114 | 122681 | (AAAG)5 | CCAGGCTAAGTGCTCATCAA | 59 | GGCAGCGCAGTAAAGTCATA | 59 | 209
65 | CCESSR115 | 122758 | (TC)13 | CTCCCATTTCCTCTTTCTCT | 55 | TAGGGTTTTGAAGGGCAATC | 59 | 274
66 | CCESSR116 | 122793 | (TC)18 | GAGAAGTTTGCCAGAACCCT | 58 | ATCAATCTTCAAAGGGCCAC | 59 | 453
67 | CCESSR117 | 122797 | (CT)12 | CTCGTCCCTCTCCCTGTACT | 58 | ACCACAACAGCGAACATCAT | 59 | 160
68 | CCESSR118 | 122811 | (AAG)7 | CTTTCCTCAGCTTCACCACA | 59 | CCCCTGTTTCTGGTCTTGTT | 59 | 371
69 | CCESSR119 | 122842 | (ATA)6 | TGCAAGTGTGATGAATGTGG | 59 | CTAAGGCGAGAAAACAAGGG | 59 | 431
70 | CCESSR120 | 122880 | (GAAGCA)4 | TAGGGGGACCAATAGCAAAG | 59 | ACACGTCTTGCGACAAAGTC | 59 | 212
71 | CCESSR121 | 122996 | (TTA)6 | CCGGACAGGAACAAGAAAAT | 59 | CATTCAATTGCCTGACCATC | 59 | 443
72 | CCESSR122 | 123010 | (TCC)6–30 bp-(TCA)7 | GTCTGCATCTGCTTGCTCAT | 59 | AAACCATCTATCCCAGCCAG | 59 | 438
73 | CCESSR123 | 123023 | (CTGCCT)4 | GACAGAGGAATCCTTGCCAT | 59 | TCCCAGAAAAATCACCTTCC | 59 | 494
74 | CCESSR124 | 123110 | (AACTA)4 | GAGAAGCGAACCACACTTGA | 59 | ACTTGAGCAGCAACCCTTTT | 59 | 183
75 | CCESSR125 | 123132 | (CTATGA)4 | GAGCGCTATGGATGAGCATA | 59 | GCTGTACAGGATTGGGAGGT | 59 | 478
76 | CCESSR126 | 123134 | (CTG)6 | ACCACCACAAGCACAACAAT | 59 | TGACCCCCAACAAAACAAAG | 61 | 457
77 | CCESSR127 | 123160 | (TA)11 | CTGATGGGCGTCATCAATAC | 59 | GGGAAAGGGAGTCTCAACAA | 59 | 394
78 | CCESSR128 | 123185 | (ACG)6 | GAGGAAATTCGGGAAGTTGA | 59 | CATCGTCATCATCATCGTCA | 59 | 280
79 | CCESSR129 | 123240 | (TTC)6 | CCAGAACCAACTAGGGTTTT | 56 | TCACTTGAATTGCAGAAGGC | 59 | 498
80 | CCESSR130 | 123248 | (GCA)6 | TCTTCTACAGCTGCCACCAC | 59 | ATTGCGAAGAATGAAGGGTC | 59 | 137
81 | CCESSR131 | 123291 | (CT)10 | CTTCTCCAAACGGACCAAAC | 60 | CTCAGCCTCCATTGCACTAA | 59 | 172
82 | CCESSR132 | 123309 | (AT)20 | CGTAAATTAGGGGCGTTGTT | 59 | TTGGCACAAACCTGATGGTA | 60 | 227
83 | CCESSR133 | 123323 | (ACAT)5 | TCTCCCCATGGTACTTCACA | 59 | CCCATGCATTTCCAGACATA | 59 | 463
84 | CCESSR134 | 123329 | (TC)13 | TCACTGCCTGAGGCTTTATG | 59 | GAGCATGGTCCCGATAAGAT | 59 | 380
85 | CCESSR135 | 123332 | (AG)10 | TTGTTCTCTCCAGGGGAGTC | 59 | CGTAGAACCAATCCAGCAGA | 59 | 210
86 | CCESSR136 | 123342 | (AC)9 | AATATGAATGGAGGCGAAGG | 59 | ACAATGTCCCGAACTGTTGA | 59 | 347
87 | CCESSR137 | 123362 | (AAT)6 | GGTCTGAAAGTGGCCAAAAT | 59 | GGGAGATACGCTCAAGGTGT | 59 | 496
88 | CCESSR138 | 123381 | (CGG)6 | AAACCTAGTGGTGGAGGTGG | 59 | CAGCTTGGTTCCAGTCTCAA | 59 | 240
89 | CCESSR139 | 123445 | (TC)9 | AATCTTGTGGAGGAAATGCC | 59 | GCTTTTTGCTTCGGTACCTC | 59 | 349
90 | CCESSR140 | 123562 | (CT)9 | CGAGGCTCTATCCTCCTCTC | 58 | CTGACGCATCTGACCTCACT | 59 | 182
91 | CCESSR141 | 123682 | (CTC)7–10 bp-(TTC)7 | AGTACCTCGACAACCCCAAC | 59 | AGGCTCTGCCAAATCAAAGT | 59 | 479
92 | CCESSR142 | 123741 | (CAGCAA)4 | GAAGTTTGAAGCCTGAAGCC | 59 | TTCCGCGTATCTGAATCTTG | 59 | 478
93 | CCESSR143 | 123742 | (TTGTA)4 | AACCGATTTGATTCAGAGCC | 59 | CTCCACTCCACTGGGTTTTT | 59 | 294
94 | CCESSR144 | 123743 | (TTGTA)4 | AACCGATTTGATTCAGAGCC | 59 | AAAAAGGTCAAACGTCAGCC | 59 | 269
95 | CCESSR145 | 123795 | (GCG)8 | AGGCTTGAAGACGAAATGCT | 59 | GCTCTTTCCAGTCGATCTCC | 59 | 309
96 | CCESSR146 | 123812 | (ATCTTT)4 | ACACTCGACCCTCAAGTTCC | 59 | AGTGATCATCCATCCATCCA | 59 | 482
97 | CCESSR147 | 123819 | (CTTC)5 | ACGAGGGCAAAACAAAGAAG | 59 | TCAAGACGAAGAAATGCAGG | 59 | 112
98 | CCESSR148 | 123822 | (GCT)6 | ATCCGCGAAGCTACTTGTTT | 59 | ACGAATTTCTGGTCCACTCC | 59 | 133
99 | CCESSR149 | 123903 | (CCG)6 | GGCTCTGGAGGCTTTGAATA | 59 | GGCGAGCATGATTAGACAGA | 59 | 256
100 | CCESSR150 | 123909 | (CT)14 | CTTCCAGGTTCCTTGTCCTC | 59 | CCATCAGCACTTCCATCATC | 59 | 389
101 | CCESSR151 | 123967 | (TTTC)5 | CTCTTCCCTACTTTGCTGCC | 59 | AGGGTTTCTGTCGCTGTTCT | 59 | 374
102 | CCESSR152 | 123993 | (TG)10 | GGCCTAGTACTCCCAATCCA | 59 | ATGATGCAGTTGATCCTCCA | 59 | 251
103 | CCESSR153 | 124161 | (CT)12 | CTCGGCTCCCAAATATTCAT | 59 | ACATGCGGTTATGCACTCAT | 59 | 453
104 | CCESSR154 | 124171 | (CCA)6 | ACCACCACCTCCAGAGAATC | 59 | CGTAGAAATTGCTATCCGCA | 59 | 421
105 | CCESSR155 | 124195 | (AGC)6 | ATCCCCATCAGAAGACCTCA | 59 | GAAGCTCGACAAACAGGACA | 59 | 296
106 | CCESSR156 | 124198 | (GAACT)4 | CCCAGAAGAAATCCCAGTGT | 59 | GACCAATCGAGGCCAATAAC | 59 | 240
107 | CCESSR157 | 124268 | (TGC)6 | TTTCTCTCACACGCCAAAAG | 59 | GCTGATGATGCGATCAAAGT | 59 | 406
108 | CCESSR158 | 124269 | (CCT)7 | AGTCGTCTTCTGGTGGCTTT | 59 | GAGGAGGAGGTGGTGGTG | 59 | 195
109 | CCESSR159 | 124309 | (AAAGG)4 | TTCGAGTTCTATGGCTCGTG | 59 | TGGTGGTATGGAAAACAGGA | 59 | 345
110 | CCESSR160 | 124355 | (CAAAA)4 | TCGCCTTTTCTCTCACTTCA | 59 | GCTCGTTTCCTCTTGACTCC | 59 | 359
111 | CCESSR161 | 124358 | (ACA)6 | ATGAGGATGAGGAGGGATTG | 59 | TCCCTCAAGCTGTCTTTCCT | 59 | 407
112 | CCESSR162 | 124358 | (AAG)6 | AGCTTCCTCCAGAACCTCAA | 59 | ATCATCATCAGGGCCTCCT | 59 | 147
113 | CCESSR163 | 124365 | (AAC)6 | TTGCTCTTCCACTTTCATCG | 59 | CTTCACGAAATGCCTCAGAA | 59 | 171
114 | CCESSR164 | 124426 | (TTTTTA)5 | GACGTGGAGGGATCAGTTTT | 59 | CAATTCCGACGTTTCATCAG | 59 | 140
115 | CCESSR165 | 124428 | (TTTTTA)4 | GACGTGGAGGGATCAGTTTT | 59 | TGAATATGACCCGCATGAGT | 59 | 239
116 | CCESSR166 | 124436 | (GAG)6 | CGAGTACCATTGGGATGATG | 59 | ACTGTCATTTGAGTGCTCCG | 59 | 304
117 | CCESSR167 | 124457 | (AC)9 | CAAAGAAAAACCGCTCTCG | 59 | AGTGGAGGATTTGGAAAACG | 59 | 367
118 | CCESSR168 | 124461 | (AATA)6 | CTCTGCCCACTTTCTCCTTC | 59 | CACTTGGGCATCTGAGAAGA | 59 | 354
119 | CCESSR169 | 124465 | (TTCA)6 | TCCCCCATTCATTTGGTAGT | 59 | GCAAGGAGACCTTCAAGAGG | 59 | 296
120 | CCESSR170 | 124468 | (TTCCTC)4–24 bp-(ATC)6 | TCTCCCATTCATTCCAGGAT | 59 | GTAAGTGCGGATGATGATGG | 59 | 342
121 | CCESSR171 | 124565 | (ATT)6 | GCTCTGTTCAGGAAGGGAAG | 59 | GTATCAGAATCCCCACCCAA | 60 | 406
122 | CCESSR172 | 124577 | (AAG)6 | TGGCTTTTCTCCGTTATCCT | 59 | TTCTTCTTGGTGCTTGATCG | 59 | 293
123 | CCESSR173 | 124612 | (CGG)6 | TCCTTTCCTAGCCTTCTGGA | 59 | AGAGAAAAAGAGCGCGAGAG | 59 | 331
124 | CCESSR174 | 124697 | (GAC)6 | GAAAATCGAGAAAGGGTGGA | 59 | GAGTGGTCCTCAAGTCCCAT | 59 | 242
125 | CCESSR175 | 124700 | (GCT)6 | ACACCAACGACGGGTAAGAT | 59 | CAGCCAGATTCGAATACCCT | 59 | 346
126 | CCESSR176 | 124759 | (AG)11 | GGGATTGCTGCATAGGAAAT | 59 | CATCTCAATCAAATGCCCAC | 59 | 290
127 | CCESSR177 | 124765 | (CCA)7a(AAC)6 | CTGCGACTCAGACATCCACT | 59 | ACGCTGTCTCCTCCGTAACT | 59 | 444
128 | CCESSR178 | 124767 | (TCAT)5 | GCCGGCTAGTCTATACGGAG | 59 | ACCTTGAAGTTTGGTGGAGC | 59 | 369
129 | CCESSR179 | 124817 | (TC)9 | TCATTGCTGCTTCAGTCTGTC | 59 | TCTCTTCAAAGCGGATTCCT | 59 | 354
130 | CCESSR180 | 124851 | (TTC)11 | GACTTCTCTCTCCCCCTTCC | 59 | TTGGAGCGTAATGTCGTAGG | 59 | 422
131 | CCESSR181 | 124871 | (AGC)6 | ACCAATTTCAAACCTCCAGC | 59 | CGAACACCACATGCATTACA | 59 | 475
132 | CCESSR182 | 124949 | (CCT)7 | CCCCTTAAAATCGATTCCCT | 59 | CGGAGAGGAATTTTCGACAT | 59 | 328
133 | CCESSR183 | 124969 | (AATTG)4 | TCTAGCTTGAGGGGATCGTT | 59 | TAAGCCTAAGCAAAGCAGCA | 59 | 243
134 | CCESSR184 | 124995 | (GAGAAG)4 | CACCAAAGCCATGAAAGCTA | 59 | TCCCAAATCTCCTCTCCATC | 59 | 224
135 | CCESSR185 | 125025 | (TC)11 | CCTCCTTAAAACCCTAAACCG | 59 | TCTAGTATTTTGGTGGGGGC | 59 | 302
136 | CCESSR186 | 125054 | (TA)9 | TCTTGTCTTGCACATTTCCC | 59 | GGTCTCCGTTGTTGAGGATT | 59 | 283
137 | CCESSR187 | 125099 | (ACC)6 | TTCCCGCTTTACCAGATACC | 59 | TTCCTCCAACAGACAACAGC | 59 | 388
138 | CCESSR188 | 125102 | (ACC)6 | TTCCCGCTTTACCAGATACC | 59 | AAACATAAGGTTGGGGTGGA | 59 | 281
139 | CCESSR189 | 125104 | (AT)11 | CTGGCTCTGACATGCAATTT | 59 | TTCAGAAGCGCGTAGTCTGT | 59 | 387
140 | CCESSR190 | 125128 | (TC)12 | TGCCACTTTCTTCTCCTCCT | 59 | CTGGAAAAGGTGAAAAGGGA | 59 | 418
141 | CCESSR191 | 125139 | (CAGCGG)4 | AGCACATCCAACACCACAGT | 59 | AACCCAACGTTAAGGGACAC | 59 | 204
142 | CCESSR192 | 125151 | (ACA)8 | ATTCCGAATCGGGTACAGAG | 59 | AATGTACATCCCTTCCCCAC | 59 | 400
143 | CCESSR193 | 125186 | (AAG)6 | GCCTCAACCACTTGCCTATT | 59 | CGTGATCATGATGCCCTAAC | 59 | 372
144 | CCESSR194 | 125232 | (AT)13 | ATTCAGTTGCAGCTGTGGAG | 59 | ATGATCTGGGAAAGGACAGG | 59 | 314
145 | CCESSR195 | 125265 | (GT)14 | TGATGATAGCTCCGCTTTTG | 59 | AGCCAATCACCAGCAAGATT | 60 | 254
146 | CCESSR196 | 125286 | (CTC)8 | GGGTCCATGTTCGTCTTTTT | 59 | GAGACCAAGCATTGAAGCAA | 59 | 304
147 | CCESSR197 | 125289 | (ATTTT)5 | GAGGATCACTTTTGCCCCT | 59 | GAAAAAGGTCAAGACATCCC | 56 | 470
148 | CCESSR198 | 125382 | (AC)16 | CTCTCCCCATCCCTCAACT | 59 | GTGAAGGAGGTGGTGGTCTT | 59 | 365
149 | CCESSR199 | 125416 | (AT)16 | ATCATCCCCTCGATAGCAAC | 59 | TCCCGAAAGGAAAGAATGAG | 59 | 183
150 | CCESSR200 | 125444 | (TG)11 | TGGTTGTAGGTGGTCGTCAT | 59 | GCTCGGGGATATCGATCTTA | 59 | 495
151 | CCESSR201 | 125484 | (CTCCGA)4 | AGGGTTTTTCGGAGATGACA | 60 | CCGGGAAAACTAAGAGCAAG | 59 | 424
152 | CCESSR202 | 125529 | (TC)9 | GTGCTCGTGCAATTGAAACT | 59 | ATGCCGAGTGGATGTCTATG | 59 | 490
153 | CCESSR203 | 125676 | (AAACAA)4 | TGATCTTGATCCCCTCATCA | 59 | GATGCATTCACAAAACCGTC | 59 | 393
154 | CCESSR204 | 125734 | (GAG)6 | TCCTTGGCTCGAGATCTTTT | 59 | TTTGTGTCCTCTTCTCGCAC | 59 | 434
155 | CCESSR205 | 125776 | (CT)14 | AGGATTGCCTTCCGTTTGTA | 60 | GATTCAACTCGTGGGACCTT | 59 | 225
156 | CCESSR206 | 125893 | (CAT)7 | CCCCTTGTTTCTCTTTGTGG | 60 | TGCATCATCCCTTGTATCGT | 59 | 271
157 | CCESSR207 | 126139 | (GGC)6 | TCCGTTCCTCTTTCTCGTTT | 59 | CACAGCACCCATGGTAACAA | 60 | 411
158 | CCESSR208 | 126219 | (GAG)6 | TGGTCTTCAATCACCAAGGA | 59 | GACACATTGCACGTAGTCCC | 59 | 134
159 | CCESSR209 | 126243 | (CAC)8 | GGTGGACGAGGCTTTTATGT | 59 | CTGTGAGCAAACTTGACGGT | 59 | 180
160 | CCESSR210 | 126250 | (CAC)9 | GCCAAAATCCCTGTTCTCAT | 59 | TGTGGATGCACCAGATTCTT | 59 | 245
161 | CCESSR211 | 126255 | (GAA)6 | ACGAAAGGGTTGAAGGTGAC | 59 | GACAGAGACGACGAAGACCA | 59 | 412
162 | CCESSR212 | 126427 | (ATC)7 | TCCAATGGTTTACAGGAGCA | 59 | TCTCCGGTGTAAAACTGCTG | 59 | 341
163 | CCESSR213 | 126427 | (CAGGAG)6 | GGCGGCTTCGGTATTATTTA | 59 | GCACTAGTTTCTTGGACTGGG | 59 | 184
164 | CCESSR214 | 126506 | (TTTC)6 | TTCATTGTCTTTGAGCCAGG | 59 | CCCCATCACTCATTCCTTCT | 59 | 339
165 | CCESSR215 | 126540 | (AAG)9 | AAATCGAATGGCTTGGAGAC | 59 | GTGCAAGTAAATCCGAGCAA | 59 | 309
166 | CCESSR216 | 126661 | (AGT)6 | TTTGTCATTTCTCTCGCTGC | 59 | AGTTGGGGAGAATTGACCTG | 59 | 440
167 | CCESSR217 | 126701 | (TTCA)5 | AAAGGATACGGCTCACAAGG | 59 | GAATGAATGAACGAGGGGTT | 59 | 124
168 | CCESSR218 | 126701 | (TTCA)6 | AAGGAGCCAATCGTTCATTC | 59 | TTACCCTGTAGGTTGGTGGAG | 59 | 175
169 | CCESSR219 | 126730 | (CT)10 | GAGTTGGCTTGAAAGCATGA | 59 | CCATCCCCAAGGAAAGTGTA | 60 | 371
170 | CCESSR220 | 126767 | (GA)9 | GCACAAGACCTTCCCTTTGT | 59 | CTCTTGCACGTCGTTTTGTT | 59 | 428
171 | CCESSR221 | 126881 | (GCA)6 | CGGTGCTGCAGTTTCACTAC | 60 | CAAGGCAGCAAGTATACCGA | 59 | 351
172 | CCESSR222 | 126910 | (AAT)8 | CCAACAGAAAAGTTGGGGAC | 59 | TTAGCCGAATCGTTATTCCC | 59 | 163
173 | CCESSR223 | 127055 | (TTC)6 | GAAATCCCACTCGGTGAAGT | 59 | CCCACCCCGAAGATATAAAA | 59 | 401
174 | CCESSR224 | 127171 | (AAGG)5 | ATTTCAGTCTGGGTCTTCCC | 58 | TGAGGCAGACCTTGTTCTTG | 59 | 470
175 | CCESSR225 | 127376 | (GT)10 | TCCCAGAACAAAACGAGACA | 59 | CTTGTCGTAAACGGCTTCAA | 59 | 270
176 | CCESSR226 | 127478 | (CT)23 | CAGCCTCCGGTTAGCAAG | 60 | TGTCCAATACCCACTCTTCG | 59 | 179
177 | CCESSR227 | 127479 | (ACC)10 | GTACCCCCACAACACTCACA | 59 | GATGCTTCTCATCGTCTCCA | 59 | 404
178 | CCESSR228 | 127712 | (TTTA)5 | GATTTTGTGCCAAAGGGAAG | 60 | GCTTTGACCGGAGAAGTCAT | 59 | 115
179 | CCESSR229 | 127774 | (CCT)6 | CCGCCGTTAAATCAAACTCT | 59 | CAAAGAAAGCTGCTCACTGC | 59 | 120
180 | CCESSR230 | 127785 | (AT)13 | GCTGCTGCTTTGCTTGTTCT | 61 | AGCTTCAGAAATGGTGGCAT | 60 | 333
181 | CCESSR231 | 127800 | (GAA)9 | GCTACAGCGTGCAGAAAAAG | 59 | TACTCCTCAGGCCTCCATTT | 59 | 318
182 | CCESSR232 | 127828 | (TCA)6 | ACCTGGGAGGACTGATAACG | 59 | CCAGATTCCATACTTGGGCT | 59 | 177
183 | CCESSR233 | 127910 | (AAAAGG)4 | GAAGACTCTGTAAAGGAACGACC | 58 | GTGTTATGGCTCAACCATGC | 59 | 347
184 | CCESSR234 | 127938 | (CTG)8 | GCGGCTTAGTTCTATCCTCG | 59 | ATGGCTGACAGCTTCCTCTT | 59 | 442
185 | CCESSR235 | 127992 | (TAA)6 | AGTGGGAAAAGTGAAGGTGG | 59 | ATTAATCCCCTCCTCCGAGT | 59 | 225
186 | CCESSR236 | 128005 | (CTC)8 | AGTCAGGTATGCTGCCATTG | 59 | GGAAATCTTTGGGGTCTGAA | 59 | 322
187 | CCESSR237 | 128130 | (CCT)6 | CGTCAATAAATTGGTGTGGC | 59 | GGGAGAATGGGAGATGAAGA | 59 | 142
188 | CCESSR238 | 128268 | (TA)11 | CCGTCGTTACTGCAAAAGAA | 59 | ACAGAGGGGTTTCAAACAGG | 59 | 254
189 | CCESSR239 | 128301 | (AAGAAC)4 | TCCCAAGTCCCTATCTCTGC | 59 | CAGCTGATGGTTTAGGAGCA | 59 | 123
190 | CCESSR240 | 128306 | (TCT)7 | TCTGTGTCCCTTTTCAGCAG | 59 | GGATGGTCAATGCAGATGAG | 59 | 221
191 | CCESSR241 | 128309 | (CGG)8 | ATATCATGGATGGTGCTGCC | 61 | TCCAGATTGTGCTGCTTTTC | 59 | 355
192 | CCESSR242 | 128315 | (TTC)6 | GCACGAGGCTTTCTCTCTTT | 59 | CAGCAGGAAACTGGAGAAGTC | 59 | 260
193 | CCESSR243 | 128377 | (AG)12 | GGTTCTTCTGTGTGGGTGTG | 59 | GTAGAAGCCAGCTAATGCCC | 59 | 107
194 | CCESSR244 | 128432 | (TG)15 | CACACACCCAACACACAGAA | 59 | GCGAGAGGAATTTTGACCAG | 60 | 482
195 | CCESSR245 | 128504 | (GAG)6 | TCCATCCCATATTCCCATCT | 59 | GTCGGGTCTCTTCTTCTTGC | 59 | 454
196 | CCESSR246 | 128527 | (GA)10 | CACCAACAATTCCCTATCCC | 59 | TCCATCTCCAACACTACCCA | 59 | 427
197 | CCESSR247 | 128527 | (CTG)6 | CACCAACAATTCCCTATCCC | 59 | TCCATCTCCAACACTACCCA | 59 | 427
198 | CCESSR248 | 128527 | (CTG)6 | TGGGTAGTGTTGGAGATGGA | 59 | CAATGACCGCCTGTATTCTG | 59 | 260
199 | CCESSR249 | 128550 | (ATC)6 | CTAGTAGCCATGGTCACCCA | 59 | TTCTTCGTAGGCAGCTCTGA | 59 | 416
200 | CCESSR250 | 128587 | (TC)11 | GAGGCAAACCAAATGGAGAT | 59 | ATTGAAAAGCCTGCTGAACC | 59 | 358
201 | CCESSR251 | 128673 | (CT)9 | TTCTCTGCTGCTGCTCATTT | 59 | GGCACACATTTATCCACTCG | 59 | 245
202 | CCESSR252 | 128894 | (AAAAG)4 | GCCCAAAGCTGTTAGAGACC | 59 | TCCTTGTCATACGCTTGCTC | 59 | 344
203 | CCESSR253 | 129060 | (TCT)7 | CAATTTGCAGTCAGTCCGTT | 59 | ACATCTGCCACTGTCTGAGC | 59 | 216
204 | CCESSR254 | 129129 | (GAG)7 | ACAGATGCTTTATGGGAGGG | 59 | GGAAAGGTATTGCTGGGGTA | 59 | 272
205 | CCESSR255 | 129269 | (GCA)9 | TGAAGGTTCACTGATTGGGA | 59 | TGCTGTTGTAACTGCTGCTG | 59 | 477
206 | CCESSR256 | 129431 | (AAG)9 | CGCCACCAGTACAGATCAAG | 59 | TCCGCATATCGAACACTCAT | 59 | 496
207 | CCESSR257 | 129439 | (CA)10 | AGTCACCACACAGATGCTGC | 60 | GGCCATTCAAGGTTTCTTGT | 59 | 296
208 | CCESSR258 | 129524 | (GAA)8 | CTAATCTTCCACGGCTCTCC | 59 | TTTTGTTCAGCAGCAAGGTC | 59 | 412
209 | CCESSR259 | 129581 | (AAAAG)4 | CCAACCTCACGTCTCACCTA | 59 | GCCAGTGACACAGACGACTT | 59 | 102
210 | CCESSR260 | 129646 | (TC)11 | CTTCAGCCACAACAGATGCT | 59 | AAAAGAGCACACAGTGCAGC | 59 | 179
211 | CCESSR261 | 129668 | (CT)12 | CATGATTATGCGTTGGCTTC | 59 | GGTTGTCATGTTGAATGAGG | 56 | 163
212 | CCESSR262 | 129745 | (CAC)10 | GCCTCCAGTCAACAAACAAA | 59 | TGGTGCTTCTCTTCCTTGTG | 59 | 252
213 | CCESSR263 | 129793 | (CA)10 | TAGCGGGGAAAATTGATAGG | 59 | AGGGGCTTTTTCTCCATCTT | 59 | 487
214CCESSR264129797(TCC)6 TCGATGATGGCTACAGCTTC 59 TTGCTCCTGAGAAACACTGG 59312
215CCESSR265129906(TA)11 GACCAGAGAGAGACACCTACTTTT 57 CTCTTCCTAGCAGCAGCAGA 59186
216CCESSR266129913(CAG)6 TTCCTTTAAAACCGTCAGGG 59 CTGCATGCTCTGTTGTTCCT 59264
217CCESSR267129917(AAGAGC)5 GAGATGGAGAGACCATCATCC 58 AACTAAGACTTTTGCCGCGT 59267
218CCESSR268129950(TCCACT)4 TTCCCCATCTAAGGCAAAGT 59 CTGATTCAGGGCTCTCCATT 59403
219CCESSR269129957(GAGCAG)4 AACCCACCTTCCTCTCCTTT 59 CATGCATATCGGACACAACA 59294
220CCESSR270129964(GAT)7 AAGGCTCGACATGTTTTTCC 59 CCTACGCAACATCATCAACC 59221
221CCESSR271129972(AG)9 CTTTCCGCCTTTTCGTACTC 59 AACACACGCACAACACACAG 59392
222CCESSR272130030(AG)19 GGCTGCAAGAACATAGAGCA 59 GAAATCCTTGTTGGCTGGTT 59369
223CCESSR273130066(GAG)6 AAAGAAAGAGCCCTCCACAA 59 GCTCCAAAACCACCATTACC 59212
224CCESSR274130235(TGC)6 GGTTTATGACGTTGTGGACG 59 GGCCTATGATGCGGTAGAAT 59198
225CCESSR275130276(GGC)7 GGCGGCTACTCAGAGACTTC 59 TGGAACTCCAATTCCTTTCC 59300
226CCESSR276130329(AT)13 AGGCTGCGTTCCATATAAGA 57 CCCAGAAACTCCATTTCGAT 59410
227CCESSR277130342(ATC)6 CTCTGCAGAACAACCCTTCA 59 CCAACAGCTAAAAGTGCCAA 59296
228CCESSR278130353(CA)11 TCTCTCTCTCTCCATTCGCA 59 GGTACGAGGTTGATCGGAGT 59154
229CCESSR279130353(CAC)6 ATGGCTCCATCATCTCATCA 59 ATTGGTGGGTCCTTCAATTC 59215
230CCESSR280130368(AG)17 GGCTGTCTTAAGGCCCTAGTT 59 ATCAAGTGATGCTGTCGAGC 59436
231CCESSR281130475(CTC)7 CGCCAACACCCCTTATCTAT 59 CATTGAGGAATTGGAACGTG 59485
232CCESSR282130516(CCT)6 ACCCTGCTCCACCTTATCAC 59 GACAGAAACCTTCATCCCGT 59357
233CCESSR283130575(AGA)7 GTGGTCAGTATGGTTGCTGG 59 AGCCACAAAAGAACCATTCC 59456
234CCESSR284130635(AG)13 AGCACCGTCAGACTCTCCAT 60 TAAGCCAAACGTCGTCGTAG 59283
235CCESSR285130647(CAC)6 GAGGCCGAGTTACGAGAGTC 59 AGCATGCCAACCCTTTATTC 59366
236CCESSR286130785(TAAAA)4 TGCTCGGACCCTCTAAGTTT 59 TGGAGTCGTTTAAAAAGGGG 59334
237CCESSR287130804(AG)10 CGTTACAGAATTTGCGGATG 59 CAACTCTAAGCCGTCGATGA 59360
238CCESSR288130947(GCAGAG)4 TTTTGAACACGTCAAGGCTC 59 CGAAGAGGGGACTGAGAAAG 59273
239CCESSR289130972(TCA)9 GCACAACCCATATGACGAAG 59 GGAGGAACAGTTGGAGAATGA 59107
240CCESSR290130978(AG)16 TTATTATTGCACAGCCCCG 60 AACAATCCATTCTTGCTCCC 59388
241CCESSR291130993(TC)14 CGGTGGGGTTCTGATAATTT 59 CGGTTTCAAGCTAACGAACA 59313
242CCESSR292131127(TC)11 ATCGTGTTTAGCAGTGACGC 59 GCAGATATCATATACTCCCGTC 55143
243CCESSR293131130(AAG)6 CACTCACTCAACACTTCGCA 59 CCATCTTCCACCACCTTTCT 59426
244CCESSR294131306(TCGCA)4 CTTTGGGGGCTGAGAGTAAG 59 GCTCCATATTGCAACGTCTT 58432
245CCESSR295131334(CCA)7a(AAC)6 TAATACATCCACTCCGCCAA 59 TACTCATCAACAACCTCGCC 59320
246CCESSR296131479(TG)9 GCAATTGTGGATGGTCTGAG 59 AAACACACTGGGAAAGAGCA 58424
247CCESSR297131560(ACC)6 CAATCTGTTCCCACGTATGC 59 ACAAGTGCTCCAACAAACCA 59328
248CCESSR298131583(AT)12 CCACACACTGCAGATTGGTT 60 AAGATCTCTGGGGCATCAAC 59453
249CCESSR299131633(CTT)6 AGGTTCGAGCCAGGAAGATA 59 TTGCCTTGGTTGATGACTTC 59141
250CCESSR300131669(AG)12 TCCGGGAAGGATAGTGGTAG 59 TTGCTGAAACAGTCTCCCTC 58203
251CCESSR301131714(GA)22 ATTAGTGCGGAATTCCCTTG 59 TGATAATCTCGAGTGACCGC 59491
252CCESSR302131742(TA)18 TAATGCACGCACAAACACAC 59 ATCATTGCAAATCTCCTCCC 59112
253CCESSR303131761(CA)10 CATGGATTGGAATTCTGCTG 59 CAGATTAGCTCCGCAATCAA 59454
254CCESSR304131806(ATG)8 GCACCTAAGAGAGGAGTGCC 59 ATGAGTTTGGCTGGATGTGA 59401
255CCESSR305131831(GCCTGC)4 CCGATTTCGCTCGTAGTGTA 59 TATCAATACGGACTGCTGGG 59462
256CCESSR306131899(GAT)9 TCCGCTGGTCACAAGAAAT 59 ATCAAACTTCGACACCCCTC 59184
257CCESSR307131901(AAC)6 AAACTTTGATATCCCGGCTG 59 GCCGCCTTTTGTAGAAGAAC 59175
258CCESSR308131958(TCA)9 GATTCCAGTATTTCGGCTCC 59 TTGCTCAGATTAGGCTGTCG 59211
259CCESSR309131966(AC)10 TTTGAACTTGCTGGAGCATC 59 TGGTGAACTTCTGTTTTCCC 58392
260CCESSR310132036(AAG)6 CTCCTCAACACTCCTGACCA 59 CCCTCCTGAAGCTGCTTATC 59194
261CCESSR311132205(GT)10 CCTTGATTGTCACGTGTATGC 59 AAGATATGCCCAGGTCATGC 60424
262CCESSR312132207(TCGT)7 AAAGGATACGGCTCACAAGG 59 ACCCTGTAGGTTGGTGGAGA 59198
263CCESSR313132325(AAG)8 TGAGCACTCAAGGGAAATTG 59 TTCGTCCAGGGAGTAGCTTT 59462
264CCESSR314132326(AAG)6 GCTTTTGTCCAAGGGATGAT 59 GGAGCTCCCAGGAGGTAGTA 58336
265CCESSR315132398(TA)11 ACGTTTATTACGGCGGATTC 59 ATCGCCGTCCCATAACTATT 58174
266CCESSR316132466(AGA)6 TCTGTCTGGGCAACACTATAC 55 AGAAAGGGAGGTGGAGGAAT 59338
267CCESSR317132484(CT)9 TTTGCAGTTATGACTTGGGC 59 TCCTCACTGGAAAGGGATTC 59449
268CCESSR318132502(GTCTT)4 GGGGATCAGCGTAAGAATGT 59 TTTCCACCCACGTAATAGCA 59346
269CCESSR319132516(TCA)6 CCGTCCACGTTCACATCTAC 59 AGTAGCGCGTAGTGACCTGA 59130
270CCESSR320132606(AGC)6 GACCAGAAGAACAGCGATGA 59 CTCCTTCTTCTTCTCGGCAG 59285

*: Unigene ID as downloaded from the SGN FTP site (ftp://ftp.sgn.cornell.edu/coffee/).

Types, Frequency and Distribution of SSRs in the coffee transcriptome

The coffee EST unigene database analyzed here comprised 13,175 unigenes with a total length of 8,923 kb and an average length of ca. 677 bases/unigene [15]. These ESTs contained a total of 2,589 SSRs (with a minimum number of repeats of six for DNRs, four for TNRs and three for all other HO-NRs) located in 2,028 unigenes. The 2,589 identified SSRs comprised 502 DNRs, 1,285 TNRs, 503 TtNRs, 144 PNRs and 155 HNRs, which differed significantly in their relative distribution and abundance across the unigene ESTs (Tables S1 & S2). The mean number of repeat iterations (RI) for all the SSRs was 5.2; the average was highest for DNRs (9.6 RI), followed by TNRs (4.6 RI). Among the individual SSRs, AC had the maximum average repeat length of 10.5 RI, followed by AG (8.6 RI), AT (8.3 RI) and CG (6.3 RI); all the TNRs had RI in the range of 4.3 to 4.8, whereas all other larger SSRs had an average RI of less than three. The identified SSRs having a repeat core of 18 bp or more were selected as candidate ‘usable’ SSRs for further primer designing/marker conversion. Overall, in-silico analysis of the EST unigenes revealed one EST-SSR (having a minimum repeat core of 12 bp) per 3.4 kb and one usable SSR (having a minimum repeat core of 18 bp) per 15.9 kb of robusta transcriptome (Table S2). Among the individual SSRs, the most abundant EST-SSR motif was AG, followed by AAG.
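The screening criteria described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function names (find_ssrs, kb_per_ssr) and the regular-expression approach are assumptions made for illustration, and only perfect repeats are detected. The thresholds are those stated in the text (six repeats for DNRs, four for TNRs, three for larger motifs), each of which corresponds to a minimum repeat core of 12 bp.

```python
import re

# Minimum repeat counts per motif length, as stated in the text:
# six for DNRs, four for TNRs, three for tetra-, penta- and hexanucleotides.
MIN_REPEATS = {2: 6, 3: 4, 4: 3, 5: 3, 6: 3}


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # ([ACGT]{k}) captures one motif; \1{n,} demands n or more further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit
            # (e.g. "AGAG" is an AG repeat, "AAA" a mononucleotide run).
            if any(motif == motif[:d] * (unit_len // d)
                   for d in range(1, unit_len) if unit_len % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // unit_len))
    return sorted(hits)


def kb_per_ssr(total_bases, n_ssrs):
    """Average transcriptome length (in kb) per detected SSR."""
    return total_bases / 1000 / n_ssrs


# Hypothetical test sequence with one (AG)7 and one (GAA)5 repeat.
example = "TTGC" + "AG" * 7 + "CCGT" + "GAA" * 5 + "TTCA"
print(find_ssrs(example))                        # [(4, 'AG', 7), (22, 'GAA', 5)]
print(round(kb_per_ssr(8_923_000, 2589), 1))     # 3.4
```

With the totals given above (8,923 kb and 2,589 SSRs), the second call reproduces the reported density of one SSR per 3.4 kb.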

Development of microsatellite markers from usable EST-SSRs

Only 483 (18.7%) of the 2,589 identified SSRs had a repeat core >18 bp, and these were used for marker conversion. Primer pairs could be designed for 320 of these SSRs, of which 50 randomly chosen pairs were tested further in validation studies. These included SSRs with DNRs (30%), TNRs (64%), PNRs, HNRs (2% each) and complex SSRs (see Table 1 for marker ID, primer sequences, repeat motifs, amplicon size, sequence ID and functional identity). Of the selected 50 primer pairs, 44 could be successfully amplified as single-locus SSR markers, indicating an 88% primer-to-marker conversion ratio. Considering this high conversion ratio, another ca. 200 useful genetic markers are expected from the remaining 270 putative EST-SSRs (Primer IDs: CCESSR51 to CCESSR320) that are listed in Table 3.
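The arithmetic behind these figures can be checked with a short Python sketch. The function names are hypothetical, and the projection simply assumes that the untested primer pairs convert at the same rate as the tested ones; the estimate of ca. 200 markers given in the text is more cautious than this naive projection.

```python
def conversion_ratio(validated, tested):
    """Fraction of tested primer pairs that yielded single-locus markers."""
    return validated / tested


def projected_markers(untested_pairs, ratio):
    """Naive projection: untested pairs are assumed to convert at the same rate."""
    return untested_pairs * ratio


usable_share = 483 / 2589            # SSRs with a repeat core >18 bp
ratio = conversion_ratio(44, 50)     # validated / tested primer pairs

print(f"{usable_share:.1%}")                   # 18.7%
print(f"{ratio:.0%}")                          # 88%
print(round(projected_markers(270, ratio)))    # 238
```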

Identification and development of genomic SSRs using affinity capture

Sequencing of 66 randomly chosen recombinant clones from the small-insert SSR-enriched robusta genomic library (prepared in this study) revealed 81 potential SSRs distributed in 62 sequences. A redundancy analysis of these sequences indicated a total of 60 non-redundant sequences, of which 56 were SSR+ive (93.3% of non-redundant sequences) and contained 72 non-redundant SSRs. The non-redundant dataset contained 10 sequences with more than one SSR, either in compound formation or separated by >50 bp. The non-redundant SSRs contained 56.9% AG, 33.3% AC, 2.8% AT, 2.8% AAC, 1.4% AAG and 2.7% A/T repeat motifs (Table S3). From the 56 SSR+ive sequences, a total of 41 primer pairs could be designed successfully (with five pairs containing two SSRs each). Of these, 25 pairs (encompassing 28 SSRs) resulted in robust PCR amplifications (Table 2), and all of them could further be validated as single-locus markers, indicating a ∼61% primer-to-marker conversion ratio.

Validation of EST- SSRs for use in genetic studies

All the new 44 EST-SSRs resulted in good amplicons exhibiting low to medium allelic diversity when tested on a panel of 16 elite robusta and arabica genotypes (Figure 1). Overall, a maximum of six and seven alleles (NA) with an average of 2.1 and 3 alleles/SSR were obtained for the tested markers of which 65.9% and 81.8% were polymorphic/informative for tetraploids and robusta genotypes, respectively (Table 4). Fifteen markers in the case of tetraploids and eight for robustas were found to be monomorphic. Moreover, 14 markers resulted in double alleles (i.e. consistent presence of two allelic amplicons across the tested samples) indicative of duplicated loci in case of all the tested tetraploid arabicas. In general, no private alleles were evident except in one robusta genotype (Sln274) for marker CCESSR14.
Figure 1

Representative GeneScan profiles showing the SSR alleles obtained using the new SSR markers for some of the coffee genotypes tested in the study for marker validation.

The right side set of 3 panels is for the genomic-SSR CCRM-33, and the similar set on the left is for genic-SSR CCESSR05. The 3 panels in each set represent 8 genotypes each of arabica and robusta coffee, and 8 of the related coffee species, respectively, from right to left end.

Table 4

Allelic diversity attributes of the newly developed 44 EST-SSRs when tested over cultivated and wild related coffee genera.

Species: C. arabica (n = 8) C. canephora (n = 8) Coffea spp. (n = 12) Psilanthus spp. (n = 2)
Primer IDNA PAAllele range (in bp) Ho He PICNA PAAllele range (in bp) Ho He PICNA PAAllele range (in bp)NA PAAllele range (in bp)
CCESSR0120251–2540.000.50**0.3620251–2540.500.400.3073246–26040246–281
CCESSR0250181–2090.330.67*0.5750183–2070.380.71*0.62100178–20910183
CCESSR0360236–260DL50232–2441.000.780.68121232–26030226–242
CCESSR0420252–2550.000.50**0.3620252–2550.500.400.3062247–26120247–252
CCESSR0520153–159DL40153–1620.630.680.6140153–16220154–159
CCESSR0610223MM60225–2420.880.720.63175105–24320225–227
CCESSR0720211–1330.380.330.2620211–2330.500.400.3084211–23944220–240
CCESSR0820151–167DL20164–1670.500.400.3071151–17620161–167
CCESSR0930138–2020.000.62**0.5050129–2120.500.76*0.67100129–21210188
CCESSR1030158–2350.830.68**0.5520221–1381.000.53*0.38133158–24730212–227
CCESSR1130147–155DL30147–1550.380.540.4340147–16010147
CCESSR1210218MM10218MM41205–22110208
CCESSR1340122–1400.750.590.5140131–1410.500.730.62113115–14430122–130
CCESSR1450243–2570.500.610.5461231–2560.430.600.54140145–25720231–136
CCESSR1610190MM10190MM1019010181
CCESSR1720182–1910.880.530.3730179–1910.500.580.4852179–19120182–185
CCESSR1810169MM10169MM1016921169–172
CCESSR1930230–236DL40224–2340.750.700.5691224–23940224–232
CCESSR2020153–162DL40153–1660.880.740.6482153–18921166–177
CCESSR2120170–1750.000.230.1920170–1750.000.40*0.3094168–21141168–176
CCESSR2210158MM70159–1801.000.850.77122158–18020151–153
CCESSR2320168–1740.880.530.3720174–1770.000.230.1971166–17831167–172
CCESSR2640214–2270.860.780.6730214–2270.570.470.3993212–23610218
CCESSR2710132MM50122–1390.500.650.56135115–19530122–195
CCESSR2810197MM10189MM51186–21322194–207
CCESSR293096–106DL5096–1111.000.75*0.665096–1113096–106
CCESSR3110150MM40150–1820.630.740.6472140–18222167–182
CCESSR3210181MM40179–1850.380.66*0.57125177–20421187–189
CCESSR3320236–250DL20236–2410.880.530.3760230–25620223–236
CCESSR3420185–1910.130.53*0.3720174–1850.130.130.11103135–22422193–222
CCESSR3510115MM40122–1280.750.730.6291115–13020111–120
CCESSR3610112MM20112–1140.250.230.1941102–12420102–114
CCESSR3820177–1790.630.460.3430177–1810.250.240.21114170–18520177–179
CCESSR3910150MM10147MM20147–15020143–150
CCESSR4020220–222DL30220–2230.250.240.2173217–11510222
CCESSR4130212–219DL40214–2230.710.760.6590212–22610139
CCESSR4210129MM10129MM51120–14110132
CCESSR4320184–190DL30187–1990.330.320.2750178–19930174–187
CCESSR4430250–2590.880.580.4530238–2591.000.59*0.4681236–25930247–159
CCESSR4520144–151DL10151MM83137–16620125–151
CCESSR4710211MM10211MM20211–21310207
CCESSR4820199–211DL20192–2050.130.130.1140192–21110182
CCESSR4920192–203DL30192–2030.500.430.3571188–20320185–197
CCESSR5010122MM20116–1220.380.330.2640116–12820122–128
Range 1–6 0 0.00–0.88 0.23–0.78 0.19–0.67 1–7 0–1 0.00–1.00 0.13–0.85 0.11–0.77 1–17 0–5 1–4 0–4
Average 2.14 0.00 0.47 0.54 0.43 3.00 0.02 0.54 0.53 0.44 7.52 1.57 2.11 0.34
SD (±) 1.21 0.00 0.36 0.13 0.16 1.57 0.15 0.26 0.22 0.19 3.62 1.59 0.89 0.81
SE (±) 0.18 0.00 0.10 0.03 0.03 0.24 0.02 0.04 0.04 0.03 0.55 0.24 0.14 0.12

Note: NA: Number of amplified alleles; PA: Number of Private Alleles; Ho: Observed heterozygosity; He: Expected heterozygosity; PIC: Polymorphism Information Content; PI: Probability of Identity; NA: Not amplified; *: Significant HW dis-equilibrium at P<0.05; **: Highly significant HW dis-equilibrium at P<0.01; The putative DL (duplicated loci) markers were not considered for calculation of various estimates as these appear to be fixed exhibiting no segregation.

The PIC values were comparable (0.19–0.67 and 0.11–0.77), and no significant differences were seen in the observed/expected heterozygosity (Ho: t-value = 0.70, P = 0.49; He: t-value = 0.68, P = 0.40) for the new markers across the tested tetraploids and diploid robustas, respectively. However, significant differences were observed in the total number of amplified alleles (NA: t = 3.74, P<0.005), as well as in the behaviour of the polymorphic markers (Pms) when tested for HWE and LD in the tetraploid and robusta genotypes (Table 4). In general, more markers were in HWE, and only a relatively small proportion of markers exhibited LD and heterozygote excess and/or deficiency, in robustas in comparison to tetraploid arabicas.
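For readers unfamiliar with the diversity attributes listed in Table 4, the following Python sketch shows one common way of computing NA, Ho, He and PIC for a single diploid locus. The formulas are assumptions, since the software and estimators are not restated in this section: He is computed as unbiased gene diversity (with the 2n/(2n−1) small-sample correction) and PIC with the commonly used formula 1 − Σp_i² − ΣΣ2p_i²p_j². The genotype list is invented; it was chosen only because it is consistent with the values reported for CCESSR01 in C. canephora (NA = 2, Ho = 0.50, He = 0.40, PIC = 0.30).

```python
from collections import Counter
from itertools import combinations


def diversity_stats(genotypes):
    """genotypes: list of (allele_a, allele_b) tuples for one diploid locus.

    Returns (number_of_alleles, Ho, He, PIC).
    """
    n = len(genotypes)
    counts = Counter(allele for g in genotypes for allele in g)
    freqs = [c / (2 * n) for c in counts.values()]
    # Observed heterozygosity: share of individuals carrying two different alleles.
    ho = sum(a != b for a, b in genotypes) / n
    homozygosity = sum(p * p for p in freqs)
    # Unbiased expected heterozygosity (assumed small-sample correction).
    he = (1 - homozygosity) * (2 * n) / (2 * n - 1)
    # PIC: gene diversity minus the uninformative-mating term.
    pic = 1 - homozygosity - sum(2 * p * p * q * q
                                 for p, q in combinations(freqs, 2))
    return len(counts), ho, he, pic


# Hypothetical panel of 8 diploid genotypes: 4 heterozygotes, 4 homozygotes.
panel = [(251, 254)] * 4 + [(251, 251)] * 4
na, ho, he, pic = diversity_stats(panel)
print(na, round(ho, 2), round(he, 2), round(pic, 2))   # 2 0.5 0.4 0.3
```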

Validation of genomic SSRs for use in genetic studies

A total of 25 putative genomic SSRs were also validated as genetic markers (Table 5). When tested on the panel of 16 elite robusta and arabica (tetraploid) genotypes, five of these markers in arabicas and one in robustas were found to be monomorphic. Twelve of the polymorphic markers in arabicas resulted in double alleles (putative duplicated loci). In total, a maximum of seven and eight alleles (NA) with an average of 2.7 and 4.3 alleles/marker were obtained for the tested polymorphic markers, of which 32% and 96% were informative in arabicas and robustas, respectively (Figure 1). The PIC values varied considerably, with the mean PIC value being 0.47 (range 0.12–0.78) for tetraploids, which was significantly less than the 0.60 (0.12–0.85) observed for robusta (Table 5). Further, the Student’s t-test revealed significant differences in NA (t = 4.09, P = 0.00) but non-significant differences in PIC estimates (t = 1.26, P = 0.13), as well as in the observed/expected heterozygosity estimates (Ho/He) for the comparable markers of arabica and robusta genotypes.
Table 5

Allelic diversity attributes of the newly developed 25 genomic SSRs when tested over cultivated and wild related coffee genera.

Species: C. arabica (n = 8) C. canephora (n = 8) Coffea spp. (n = 12) Psilanthus spp. (n = 2)
Primer IDNA PAAllele range (in bp) Ho He PICNA PAAllele range (in bp) Ho He PICNA PAAllele range (in bp)NA PAAllele range (in bp)
CCRM0230252 −262DL30256– 2680.710.690.6772252– 27821162 −272
CCRM0621143 −1560.140.140.1210143MM1014310143
CCRM0750124– 1400.710.820.7851124– 1461.000.780.7463115– 13620124 −128
CCRM103096– 1060.290.380.3220105– 1060.000.230.236298– 100NA
CCRM1430109– 188DL72110–1320.630.88** 0.8385109–15510123
CCRM1510298MM30243– 2490.170.62** 0.6172243– 361NA
CCRM1630198– 2020.000.70** 0.6780190– 2400.880.890.84113180– 22630180 −192
CCRM1710230MM72226– 2481.000.860.81104216– 25021210 −238
CCRM1940217– 2340.430.690.6271196– 2520.830.920.85134188– 24630210 −226
CCRM2171256– 290DL70256– 2880.630.900.85166234– 32020258 −320
CCRM2230194– 203DL51194– 2540.250.67** 0.63104194– 25421201 −226
CCRM2320140– 162DL30140– 1510.380.490.4671140– 16220140– 162
CCRM2430202– 213DL30211– 2150.630.570.5681202– 21720205– 213
CCRM2820203– 2060.000.260.2320202– 2060.130.130.1260202– 21030202– 210
CCRM3110111MM40101– 1100.380.440.4271101– 11522113– 118
CCRM3320110– 1160.140.140.3330110– 1160.880.640.6481106– 12220112– 116
CCRM3420130–146DL40144– 1630.750.750.7291130– 14631146– 163
CCRM3510156MM32152– 1560.000.43** 0.421015610156
CCRM3620139–146DL71139– 1751.000.900.85113139– 18310139
CCRM3720140– 150DL30140– 1521.000.630.6341140– 15021140– 142
CCRM3841175–217DL31175– 2060.380.340.33125163– 21420175– 200
CCRM4020144–151DL50159– 1770.630.720.6880151– 17720151– 153
CCRM4110101MM2095– 1010.140.140.142095–11010101
CCRM4250130– 161DL51138– 1570.750.750.72132121– 16540134– 155
CCRM4530183– 1950.860.690.6450187– 2120.500.760.7271183– 20120187– 189
Range 1 −7 0 −1 0 −0.71 0.14 −0.81 0.12 −0.78 1 −8 0 −2 0.00 −1.00 0.13 −0.92 0.12 −0.85 1 −16 0 −6 0 −4 0 −2
Mean 2.68 0.12 0.32 0.45 0.47 4.28 0.48 0.57 0.63 0.60 7.92 2.08 2.04 0.30
SD (±) 1.46 0.33 0.32 0.28 0.24 1.97 0.71 0.33 0.25 0.22 3.67 1.78 0.77 0.56
SE (±) 0.30 0.07 0.07 0.06 0.05 0.40 0.15 0.07 0.05 0.05 0.75 0.36 0.16 0.11

Note: NA: Number of amplified alleles; PA: Number of Private Alleles; Ho: Observed heterozygosity; He: Expected heterozygosity; PIC: Polymorphism Information Content; PI: Probability of Identity; NA: Not amplified;

**: Highly significant HW dis-equilibrium at P<0.01; The putative DL (duplicated loci) markers were not considered for calculation of various estimates as these appear to be fixed exhibiting no segregation.

Further, it was notable that while >83% of the Pms were in HWE and only a few markers showed significant heterozygote deficiency to varying extents in both arabicas and robustas, the proportion of marker pairs that exhibited LD was significantly higher in arabicas (28.0%; 8 of 28 pairs) than that seen in robustas (14.2%; 36 of 254 marker pairs).
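The comparison of allele numbers between arabica and robusta reported for the genomic SSRs (t = 4.09) can be retraced from the NA columns of Table 5. The Python sketch below assumes a paired Student's t-test over all 25 markers; the type of test is an assumption, as it is not specified in this section, although under it the statistic rounds to the reported value. The function name is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

# NA columns of Table 5 (CCRM02 ... CCRM45), same marker order in both lists.
NA_ARABICA = [3, 2, 5, 3, 3, 1, 3, 1, 4, 7, 3, 2, 3,
              2, 1, 2, 2, 1, 2, 2, 4, 2, 1, 5, 3]
NA_ROBUSTA = [3, 1, 5, 2, 7, 3, 8, 7, 7, 7, 5, 3, 3,
              2, 4, 3, 4, 3, 7, 3, 3, 5, 2, 5, 5]


def paired_t(x, y):
    """Paired t statistic for y - x (assumed test; n - 1 degrees of freedom)."""
    diffs = [b - a for a, b in zip(x, y)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


print(round(mean(NA_ARABICA), 2))                 # 2.68
print(round(mean(NA_ROBUSTA), 2))                 # 4.28
print(round(paired_t(NA_ARABICA, NA_ROBUSTA), 2)) # 4.09
```

The two means match the "Mean" row of Table 5, which serves as a check that the columns were read correctly from the flattened table.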

Mapping of new EST- and genomic SSRs

The 69 new SSR markers were also tested for their suitability in linkage mapping. In total, 11 of the 44 EST-SSRs (25%) and seven of the 25 genomic SSRs (28%) could be mapped onto an existing first-generation framework linkage map of robusta coffee [2], [26]. This map comprised a total of 374 mapped markers (71 SSRs, 185 RAPDs and 118 AFLPs) on 11 major and 5 minor linkage groups. The new markers developed in the present study were mapped using the existing SSRs on the map as anchor markers. The 18 new markers that could be mapped occupied positions on eight distinct linkage groups: eight markers on CLG03; two markers each on CLG06, CLG11 and CLG15; and one marker each on CLG02, CLG04, CLG05 and CLG08 (Tables 1 & 2). The positions of these 18 markers on the robusta linkage groups, along with the positions of the SSRs used as anchors (CM62, CM115, CM12, CM100, Cof_EST01_150, CaM46, CaM44, CM39_302, CM39_273), are shown in Figure 2.
Figure 2

Map positions of 18 new SSR markers developed in this study (11 CCESSR and 7 CCRM markers) on robusta linkage map; mapping population was derived from a cross between CxR (a commercial robusta hybrid) and a local selection Kagganahalla [26].

The SSR markers of the existing map used as anchor markers are shown in italic and bold face.

Cross-species/−genera transferability and marker conservation

The new SSR markers, when tested on 13 related Coffea and two Psilanthus species, exhibited robust cross-species amplifications with alleles of comparable sizes in the tested taxa (Figure 1, Tables 4 & 5, Table S4). The EST-SSRs showed 100% transferability across the tested Coffea and Psilanthus spp., whereas the genomic SSRs indicated 96% amplification and transferability for Coffea spp. and 92% for the related Psilanthus spp. The analysis also indicated some private alleles (PAs), which could possibly be species-specific (Tables 4 & 5).

Generic affinities within/between cultivated and wild coffee germplasm by new SSRs

The SSR allelic data were examined for their utility in ascertaining the genetic diversity and generic inter-relationships between the cultivated as well as the wild coffee genepool. The average genetic distance values calculated using the EST-SSR allelic data were, in general, significantly lower than, but comparable to, those obtained using the genomic SSRs for the tested arabicas, robustas, as well as for the different Coffea and Psilanthus species. The NJ phenetic tree generated using the genetic distance estimates from the EST-SSR allelic data clearly resolved the tested germplasm into two distinct clusters, one representing all the tetraploid arabicas and the other comprising all the diploid robusta genotypes (Figure 3a), with significant branch support. The selections formed a single cluster within the tetraploid cluster, while pure arabicas and hybrid-selections appeared in distinct sub-clusters. Similarly, in the clustering analysis of 14 related species (12 Coffea and two Psilanthus spp.; Figure 3b) along with two genotypes each from C. arabica and C. canephora, the tetraploid Erythrocoffeas (C. arabica) and diploid Erythrocoffeas (C. canephora) formed coherent clusters. Moreover, the grouping of the related taxa was, in general, as per their botanical type, with a few changes. Though all the entries from Erythrocoffeas fell into one cluster, it contained two entries from Pachycoffeas (C. dewevrei with C. canephora and C. liberica with C. congensis). The remaining four Pachycoffeas (C. excelsa, C. arnoldiana, C. aruwemiensis, C. abeokutae) grouped with each other with good bootstrap support. C. salvatrix, a Mozambicoffea, also grouped with these Pachycoffeas, while the other three Mozambicoffeas (C. racemosa, C. eugenioides, C. kapakata) and the two Paracoffeas (Psilanthus spp.) appeared as independent strong groups. The single Melanocoffea species tested in this study, C. stenophylla, did not group with any of the above species clusters but was found to be closer to the Coffea species than to the Psilanthus spp.
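As an illustration of how pairwise genetic distances can be derived from SSR allelic data before tree building, the Python sketch below computes a Jaccard (allele-sharing) distance on allele presence/absence. This is an assumption for illustration only: the distance measure actually used is not restated in this section, the genotype names and allele sizes are invented, and the neighbour-joining step itself is not shown.

```python
def jaccard_distance(alleles_a, alleles_b):
    """1 - (shared alleles / all alleles seen in either genotype)."""
    union = alleles_a | alleles_b
    if not union:
        return 0.0
    return 1 - len(alleles_a & alleles_b) / len(union)


def distance_matrix(profiles):
    """profiles: dict mapping genotype name -> set of (marker, allele_size)."""
    names = sorted(profiles)
    return {(a, b): jaccard_distance(profiles[a], profiles[b])
            for a in names for b in names}


# Hypothetical allele profiles for three genotypes scored at two markers.
profiles = {
    "g1": {("m1", 251), ("m2", 183)},
    "g2": {("m1", 251), ("m2", 207)},
    "g3": {("m1", 254), ("m2", 207)},
}
matrix = distance_matrix(profiles)
print(round(matrix[("g1", "g2")], 2))   # 0.67
print(matrix[("g1", "g3")])             # 1.0
```

Scoring alleles as present or absent avoids having to assign allele dosage, which is convenient when diploid and tetraploid genotypes are compared in the same matrix.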
Figure 3

Unrooted phenetic trees based on the allelic diversity across the tested 44 EST-SSRs showing generic affinities between the: a) C. arabica and C. canephora genotypes, and b) 14 Coffea and two Psilanthus taxa (only bootstrap values >50 are shown).

Similar results were obtained using the data from genomic SSRs (CCRMs, data not shown).

Discussion

SSR motifs in coffee transcriptome, and development of new EST-SSR markers

In the present study, 15.4% of the coffee ESTs were found to contain SSRs, which is comparable with our earlier study [11], but much higher than the 2.7–10.8% reported for 18 representative dicotyledonous species [27] and the 7–10% reported for monocot species [28]. Notwithstanding this apparent enrichment/higher abundance, the SSRs in the coffee transcriptome were very comparable to those of other plant species in several respects: greater abundance of TNRs than DNRs; abundance of AG among the DNRs, followed by AT; CG as the least abundant among the DNRs; abundance of AAG among TNRs (among the dicots); and predominance of GC-rich TNRs (but not CCG/GGC) over non-GC-rich TNRs. A total of 18.7% of the detected EST-SSRs were found to be suitable candidates for primer design, a comparatively lower proportion (∼50%) than we reported earlier [11]. The main attributes that rendered the majority of the identified SSRs unsuitable for marker development were a shorter repeat core (<18 bp) and/or flanking sequences of low complexity (AT/GC-rich and/or regions prone to secondary structure formation) or of shorter length, which seriously constrained the design of optimal primer pairs. However, in this study, the primer-to-marker conversion ratio (ca. 88%) was higher than in many earlier similar studies in other crops. Such differences in marker conversion ratios are expected due to differences in the quality of the primers designed, GC content of the genome, genome complexity, and/or genome size [13].

SSR enrichment and development of genomic SSRs

The genomic DNA library constructed in this study resulted in a very high proportion of SSR+ive sequences, with a very low degree of redundancy (9.1%; 6 out of 66 identified SSR-positive sequences). This was notable, as in earlier similar studies the apparent high success rates were generally confounded by a high degree of redundancy [29]. Similarly, the proportion of SSR-positive sequences found suitable for primer designing was also higher (87%) in our study than the average of 54±3% recorded in other species [30]. These observations suggest that the enrichment approach used in this study may be a desirable strategy for efficiently entrapping and targeting the SSRs even in genomes like coffee that are relatively poor in SSR motifs [2].

Utility of new EST- and genomic SSRs as genetic markers

The SSRs provide desirable markers for studying genetic diversity, germplasm characterization, constructing reference panels/bar codes, individualization of genotypes, linkage mapping, population biology, and taxonomic relationships of related taxa [2]. Therefore, it becomes desirable to validate the new markers for their utility in genetic studies, which unfortunately has been lacking in the majority of published studies describing the development of coffee-specific SSR markers. Various genetic parameters, viz., allelic diversity, PIC, Ho, He, HWE and LD, calculated for all the new EST and genomic SSRs, together with their mapability on a linkage map, amply suggested their possible utility as genetic markers (see Tables 4 & 5). In general, the extent and pattern of allelic/genetic diversity revealed by the new markers conform to that reported earlier for the coffee genomic SSRs [5], [6], [31] and the EST-SSRs [8], [11]. Different genetic parameters/tests such as Ho, He, LD and HWE are important indicators of the origin, evolution and distribution of diversity in the available genepool. The heterozygosity measures (H) for the new SSR markers indicated heterozygote decay (deficiency) in the tested germplasm. The HWE and LD analyses of the polymorphic markers were in general agreement with our earlier observations with genomic as well as EST-SSRs [5], [8], [11]. Overall, these studies indicated that the tested robusta germplasm comprised allogamous, relatively unrelated genotypes, while the autogamous tetraploids comprised mostly hybrid varieties/selections with overlapping/shared pedigrees. The results thus suggest the suitability of the new markers for reliably ascertaining genetic diversity in the coffee gene pool.

Cross-species/−generic transferability

All the new EST- and genomic SSR markers revealed very high and robust cross-species/−generic amplifications with alleles of comparable sizes when tested on 12 other Coffea and two Psilanthus taxa. The data revealed that the markers described here show much higher transferability across taxa than earlier published genomic−/EST-SSR markers [2], [5], [7], [11]. This is significant, as successful cross-species amplification is generally restricted to related species within a genus and declines when tested across different genera [32]. Further, it was interesting to note that the new SSRs that were monomorphic/uninformative for the tested arabica/robusta germplasm exhibited considerable polymorphism across the tested related taxa (the only exceptions were the markers CCESSR16 and CCESSR18, which showed a very low conservation even across the Coffea spp.). Thus the new SSR markers described here strengthen the possibility of their use as Conserved Orthologous Sets (COS) for genetic characterization of different related wild coffee taxa, and also for coffee taxonomic/synteny studies.

Diversity analysis and genetic relatedness within/between Coffea and Psilanthus species

The EST−/genomic-SSRs described in this study were able to group all the 16 genotypes (representing the cultivated genepool) in phenetic clustering that was indicative of their species status and known pedigrees (Figure 3a). Similarly, the analysis of 14 Coffea and two Psilanthus species revealed generic affinities that were largely in agreement with their known taxonomic relationships (Figure 3b), based on their geographical distribution as well as Chevalier’s botanical classification [33]. Importantly, the analysis distinctly separated the two Paracoffea species (P. bengalensis and P. wightiana) from all the other Coffea spp. These results are similar to the earlier published studies undertaken to ascertain species relationships using SSRs [2], [7], [8], [11], as well as other marker approaches [34]–[36]. These results thus amply demonstrate that the new SSR markers developed in the present study can be considerably informative in exploring the taxonomic relationships of the coffee species complex.

Conclusions

The present study describes a total of 69 new validated SSRs: 44 EST-SSRs developed from the coffee transcriptome using an in-silico methodology, and 25 genomic SSRs developed using an SSR enrichment approach. In addition, it provides primer pairs for an additional 270 putative EST-SSRs. Analysis of the identified SSR-positive ESTs also provided insights into the relative abundance and distribution pattern of different SSR motifs in the coffee transcriptome, which was found to be relatively rich in SSRs. Among the identified EST-SSRs, TNRs followed by DNRs were more abundant than other SSRs, and among the different types of SSR motifs, AG was the most abundant. All 69 markers were found to be polymorphic in the tested coffee/related germplasm, and their utility as efficient genetic markers could be demonstrated for diversity analysis, germplasm individualization, linkage mapping, cross-species transferability and taxonomic studies. As many of these SSRs showed very high cross-species transferability, they can aid in conservation, management and resolving taxonomic relationships, as Conserved Orthologous Sets (COS) for Coffea and Psilanthus species and, more importantly, as efficient and informative genetic landmarks on molecular linkage maps.

Supporting Information

Summary statistics of screening of the coffee unigene ESTs for SSRs. (PDF)

Summary statistics of distribution and abundance of detected SSRs in the unigene ESTs and SSR frequency estimates for coffee transcriptome. (PDF)

Characteristics and distribution of the detected SSR motifs (without MNRs) across the non-redundant 56 SSR+ive sequences generated using SSR enrichment approach. (PDF)

Inter-species and inter-generic transferability of the new EST-SSRs and genomic SSR markers. (PDF)
  19 in total

1.  In silico analysis on frequency and distribution of microsatellites in ESTs of some cereal species.

Authors:  Rajeev K Varshney; Thomas Thiel; Nils Stein; Peter Langridge; Andreas Graner
Journal:  Cell Mol Biol Lett       Date:  2002       Impact factor: 5.787

2.  SSR cross-amplification and variation within coffee trees (Coffea spp.).

Authors:  V Poncet; P Hamon; J Minier; C Carasco; S Hamon; M Noirot
Journal:  Genome       Date:  2004-12       Impact factor: 2.166

3.  Microsatellite libraries enriched for several microsatellite sequences in plants.

Authors:  K J Edwards; J H Barker; A Daly; C Jones; A Karp
Journal:  Biotechniques       Date:  1996-05       Impact factor: 1.993

4.  SSR mining in coffee tree EST databases: potential use of EST-SSRs as markers for the Coffea genus.

Authors:  Valérie Poncet; Myriam Rondeau; Christine Tranchant; Anne Cayrel; Serge Hamon; Alexandre de Kochko; Perla Hamon
Journal:  Mol Genet Genomics       Date:  2006-08-19       Impact factor: 3.291

5.  EST-SSR development from 5 Lactuca species and their use in studying genetic diversity among L. serriola biotypes.

Authors:  Dilpreet S Riar; Sachin Rustgi; Ian C Burke; Kulvinder S Gill; Joseph P Yenish
Journal:  J Hered       Date:  2010-12-10       Impact factor: 2.645

6.  Simple sequence repeat diversity in diploid and tetraploid Coffea species.

Authors:  Pilar Moncada; Susan McCouch
Journal:  Genome       Date:  2004-06       Impact factor: 2.166

7.  Development of genomic microsatellite markers in Coffea canephora and their transferability to other coffee species.

Authors:  Valérie Poncet; Magali Dufour; Perla Hamon; Serge Hamon; Alexandre de Kochko; Thierry Leroy
Journal:  Genome       Date:  2007-12       Impact factor: 2.166

8.  Coffee and tomato share common gene repertoires as revealed by deep sequencing of seed and cherry transcripts.

Authors:  Chenwei Lin; Lukas A Mueller; James Mc Carthy; Dominique Crouzillat; Vincent Pétiard; Steven D Tanksley
Journal:  Theor Appl Genet       Date:  2005-11-05       Impact factor: 5.699

9.  MoccaDB - an integrative database for functional, comparative and diversity studies in the Rubiaceae family.

Authors:  Olga Plechakova; Christine Tranchant-Dubreuil; Fabrice Benedet; Marie Couderc; Alexandra Tinaut; Véronique Viader; Petra De Block; Perla Hamon; Claudine Campa; Alexandre de Kochko; Serge Hamon; Valérie Poncet
Journal:  BMC Plant Biol       Date:  2009-09-29       Impact factor: 4.215

10.  Development of new genomic microsatellite markers from robusta coffee (Coffea canephora Pierre ex A. Froehner) showing broad cross-species transferability and utility in genetic studies.

Authors:  Prasad Suresh Hendre; Regur Phanindranath; V Annapurna; Albert Lalremruata; Ramesh K Aggarwal
Journal:  BMC Plant Biol       Date:  2008-04-30       Impact factor: 4.215

