High-throughput novel microsatellite marker of faba bean via next generation sequencing.

Tao Yang, Shi-ying Bao, Rebecca Ford, Teng-jiao Jia, Jian-ping Guan, Yu-hua He, Xue-lian Sun, Jun-ye Jiang, Jun-jie Hao, Xiao-yan Zhang, Xu-xiao Zong.

Abstract

BACKGROUND: Faba bean (Vicia faba L.) is an important food legume crop, grown for human consumption globally including in China, Turkey, Egypt and Ethiopia. Although genetic gain has been made through conventional selection and breeding efforts, this could be substantially improved through the application of molecular methods. For this, a set of reliable molecular markers representative of the entire genome is required.
RESULTS: A library with 125,559 putative SSR sequences was constructed and characterized for repeat type and length from a mixed genome of 247 spring and winter sown faba bean genotypes using 454 sequencing. A suite of 28,503 primer pairs was designed and 150 were randomly selected for validation. Of these, 94 produced reproducible amplicons that were polymorphic among 32 faba bean genotypes selected from diverse geographical locations. The number of alleles per locus ranged from 2 to 8, the expected heterozygosities ranged from 0.0000 to 1.0000, and the observed heterozygosities ranged from 0.0908 to 0.8410. Validation by UPGMA cluster analysis of the 32 genotypes, based on Nei's genetic distance, showed the high quality and effectiveness of the novel SSR markers developed via next generation sequencing technology.
CONCLUSIONS: Large scale SSR marker development was successfully achieved using next generation sequencing of the V. faba genome. These novel markers are valuable for constructing genetic linkage maps, future QTL mapping, and marker-assisted trait selection in faba bean breeding efforts.

Year:  2012        PMID: 23137291      PMCID: PMC3542174          DOI: 10.1186/1471-2164-13-602

Source DB:  PubMed          Journal:  BMC Genomics        ISSN: 1471-2164            Impact factor:   3.969


Background

Faba bean (Vicia faba L.) is an important temperate legume, grown for human consumption and animal feed due to its high protein and fibre content [1,2]. The crop also replaces available nitrogen in the soil when used in rotation with cereals and oilseeds, and thus is expected to be a highly beneficial component in future temperate low-carbon agricultural systems. China is the largest faba bean producer (40.36%) with an average dry grain production (2005–2009) of 1,720,000 metric tonnes (mt) from 945,400 hectares, followed by Ethiopia (476,026 mt), France (331,122 mt), Egypt (274,040 mt) and Australia (196,800 mt) [3]. However, faba bean suffers from several major biotic and abiotic factors that constrain productivity. Although significant genetic gain to overcome these has been made through traditional breeding practices [1], progress through the use of genomics and associated biotechnologies is limited. This is due mainly to the large genome size (13 Gb; [4]), which is approximately 25 times larger than that of the model legume Medicago truncatula and 2.5 times larger than that of Pisum sativum [1], together with a lack of financial investment in this crop species. Recent advances in next generation sequencing (NGS) technologies enable the generation of large volumes of sequence efficiently and cost-effectively [5,6]. This has led to a revolution in biological and agricultural applications, including identification of genes correlated with key breeding traits through high-density SNP markers and genome-wide association studies (GWAS) [7,8]. Another outcome is the ability to accurately identify sequences flanking simple sequence repeat (SSR) regions for use as locus-specific markers for downstream genotyping. Otherwise known as microsatellites, SSRs are tandemly repeated motifs of 1 to 6 nucleotides found in both coding and non-coding regions [9,10].
These have become a marker of choice in many genotyping applications due to their relatively high abundance, high level of allelic variation, co-dominant inheritance, analytical simplicity and transferability of results across laboratories [11]. A limited number of characterized SSR loci (<120), which have been validated over relatively few genetic backgrounds, are available for faba bean. Initially, Pozarkova et al. developed primers to 25 SSR loci detected in chromosome 1 DNA libraries [12]. Subsequently, Zeid et al. developed primers to 54 SSR loci [13] and Gong et al. developed 11 EST-SSR loci primers [14]. Most recently, EST sequences within the public domain databases were screened and an additional 21 novel SSR loci were characterized and validated among 32 faba bean accessions [15]. Besides providing a cost-effective and valuable source for molecular marker generation, the identification of SSRs within ESTs is an effective approach for gene discovery and transcript pattern characterization, particularly if, through mapping, an EST-SSR or EST marker is significantly associated with a QTL [16-18]. This may be achieved by searching for SSR-associated sequences within the ESTs of a well-characterised crop or model plant species. Together with the advantage of in silico analysis, this approach has the potential to substantially broaden the field of comparative studies to species where limited or no sequence information is available. The present study identified high-quality putative SSR loci and flanking primer sequences cheaply and efficiently using the Roche 454 GS FLX Titanium platform. The resultant SSR sequences were characterized and validated through successful amplification of randomly selected target loci across a selection of faba bean genotypes of diverse geographic origin.

Methods

Plant material

A total of 247 faba bean accessions were selected from the National Genebank of China held at the Institute of Crop Science (ICS), Chinese Academy of Agricultural Sciences (CAAS), Beijing. Of these, 100 originated from China, 54 were from other Asian countries, 39 were from Europe, 30 were from Africa, 14 were from the Americas, 9 breeding lines were sourced from the ICARDA (International Center for Agricultural Research in the Dry Areas) faba bean breeding program and one was from Oceania (Additional file 1: Table S1).

DNA isolation, library preparation and 454 sequencing

Seeds were left on moist filter paper in the dark at 22°C and, after seven days, sprouts from each of the 247 genotypes were collected. A single sprout of approximately the same weight from each genotype was pooled, and total gDNA was extracted using the CTAB method [19,20]. Genome libraries were constructed using eight biotin-labeled probes and selective hybridization with streptavidin-coated beads [21-23]. The probes were: pGA, pAC, pAAT, pAAC, pAAG, pATGT, pGATA and pAAAT. The quality of the libraries was inspected by randomly selecting and sequencing 276 clones. The cloning vector was pEASY-T1 (TransGen Biotechnology Co., Ltd), and the primers used for sequencing were F: 5′-GTAAAACGACGGCCAGT-3′ and R: 5′-CAGGAAACAGCTATGAC-3′. Libraries were considered to be of high quality if the sequence lengths ranged from 200 to 1000 bp, as evidenced on agarose gel. Subsequently, all libraries were pooled in equal amounts and subjected to 454 sequencing with GS-FLX Titanium reagents at Beijing Autolab Biotechnology Co., Ltd (China). All processing and analyses of the sequencing data were performed with GS-FLX Software v2.0.01 (454 Life Sciences, Roche, Germany). Using a series of normalization, correction and quality-filtering algorithms, the 454 sequencing data were processed to screen and filter for weak signals and low-quality reads, and the read ends were trimmed for 454 adaptor sequences using the EMBOSS [24] software package. The sequencing data were then submitted to the National Center for Biotechnology Information (NCBI) short read archive and given the accession number SRP006387.

SSR loci search and primer design

The MISA (MIcroSAtellite identification) tool (http://pgrc.ipk-gatersleben.de/misa/) was configured to locate SSRs with a minimum length of 10 bp: monomers (×10), 2-mers (×6), 3-mers (×5), 4-mers (×5), 5-mers (×5) and 6-mers (×5). This tool allowed the identification and localization of perfect microsatellites as well as compound microsatellites. The maximum size of interruption allowed between two different SSRs in a compound sequence was 100 bp. Subsequently, Primer 3.0 (http://www-genome.wi.mit.edu/genome_software/other/primer3.html) was used to design primer pairs to the flanking sequences of each unique SSR.
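The search thresholds described above can be illustrated with a short Python sketch. This is not MISA itself (MISA is a separate script with its own configuration); it is a minimal regular-expression scan that assumes only the minimum repeat counts quoted in the text. The names `MIN_REPEATS` and `find_ssrs` are hypothetical, and compound SSRs are not handled here.

```python
import re

# Minimum number of repeats per motif length, as quoted in the text:
# monomers x10, 2-mers x6, 3- to 6-mers x5.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    Illustrative only: shorter motifs take priority, and a stretch already
    assigned to one SSR is not reported again for another motif.
    """
    seq = seq.upper()
    hits = []
    claimed = []  # (start, end) spans already assigned to an SSR
    for size in sorted(MIN_REPEATS):
        min_rep = MIN_REPEATS[size]
        # Lookahead so that every start position is examined.
        pattern = re.compile(r"(?=(([ACGT]{%d})\2{%d,}))" % (size, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(2)
            # Skip motifs that are themselves repeats of a shorter unit
            # (for example "AA" or "ACAC").
            if any(motif == motif[:k] * (size // k)
                   for k in range(1, size) if size % k == 0):
                continue
            start, end = m.start(1), m.end(1)
            if any(s < end and start < e for s, e in claimed):
                continue
            claimed.append((start, end))
            hits.append((start, motif, (end - start) // size))
    return sorted(hits)


if __name__ == "__main__":
    demo = "GGT" + "A" * 10 + "CGT" + "AC" * 6 + "TTG" + "AAC" * 5 + "G"
    for start, motif, count in find_ssrs(demo):
        print(start, "(%s)%d" % (motif, count))
```

Each hit is reported with a 0-based start position; a run shorter than the threshold for its motif length (for example five copies of AC) is ignored.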

SSR characterization and validation

The number of SSRs of each type, their length (motif bp × number of repeats) and their position were extracted from the MISA output files using a bespoke program [25] and plotted with OpenOffice.org Calc.

Marker assessment

Polymerase chain reactions (PCR) were performed in 20 μl reaction volumes containing 0.5 U of Taq DNA polymerase (Zhexing, Beijing, China), 1 × PCR Buffer II, 1.5 mM MgCl2, 25 μM of dNTP, 0.4 μM primer, and 50 ng of genomic DNA. Microsatellite loci were amplified on a Heijingang Thermal Cycler (Eastwin, Beijing, China) with the following program: 5 min initial denaturation at 95°C; 35 cycles of 30 s at 95°C, 30 s at the optimized annealing temperature (Table 2) and 45 s of elongation at 72°C; and a final extension at 72°C for 10 min. PCR products were initially assessed for size polymorphism on 6% denaturing polyacrylamide gels and visualized by silver nitrate staining.

Table 1. Occurrence of microsatellites in the genome survey

| Category | Number |
|---|---|
| Total number of sequences examined | 532,599 |
| Total size of examined sequences (bp) | 162,448,842 |
| Total number of identified SSRs | 250,393 |
| Number of SSR containing sequences | 125,559 |
| Number of sequences containing more than one SSR | 61,266 |
| Number of SSRs present in compound formation | 122,988 |

The genotyping data were subsequently used to determine genetic relationships among 32 V. faba accessions (eleven from China, seven from Asia, five from Europe, five from Africa, three from the Americas and one from Oceania; Additional file 1: Table S1). The number of alleles (Na), expected heterozygosity (He) and observed heterozygosity (Ho) were calculated using POPGEN1.32 [26]. The cluster analysis of the 32 genotypes was based on Nei's unbiased measure of genetic distance [27] and used the unweighted pair-group method with arithmetic average (UPGMA); the dendrogram was drawn with MEGA4 [28].
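To make the three per-locus statistics concrete, the following Python sketch computes Na, Ho and He for one locus from diploid genotypes. The study used POPGEN1.32 for this step; the sketch is not that program. It assumes Nei's (1978) small-sample estimator for expected heterozygosity, which may differ from the estimator behind the reported values, and `locus_stats` and the allele sizes in the example are hypothetical.

```python
from collections import Counter


def locus_stats(genotypes):
    """Return (Na, Ho, He) for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per individual;
    use None for an individual with missing data.
    He uses Nei's (1978) unbiased estimator (an assumption of this sketch):
        He = 2n / (2n - 1) * (1 - sum(p_i ** 2))
    """
    scored = [g for g in genotypes if g is not None]
    n = len(scored)
    if n == 0:
        raise ValueError("no scored genotypes at this locus")
    allele_counts = Counter(allele for g in scored for allele in g)
    na = len(allele_counts)                      # number of alleles
    ho = sum(1 for a, b in scored if a != b) / n  # observed heterozygosity
    sum_p2 = sum((c / (2 * n)) ** 2 for c in allele_counts.values())
    he = (2 * n) / (2 * n - 1) * (1 - sum_p2)     # expected heterozygosity
    return na, ho, he


if __name__ == "__main__":
    # Hypothetical amplicon sizes (bp) for four individuals at one locus.
    example = [(150, 150), (150, 154), (154, 154), (150, 154)]
    na, ho, he = locus_stats(example)
    print("Na=%d Ho=%.4f He=%.4f" % (na, ho, he))
```

Running the same function over every locus would give a table with one row per marker, in the same layout as the Na, He and Ho columns reported in the Results.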

Results

Quality inspection of the DNA library

The recombination rate within the constructed SSR-enriched V. faba library was 73.9%. Among the 276 clones sequenced, 31.9% contained SSR sequences within an insert that ranged from 0.2 to 1.0 kb in size.

454 sequencing and characterization reads

A total of 578,251 reads were generated from the pooled library, and 532,599 read sequences were used for further analysis after adaptor removal. Adenine was the most abundant nucleotide (30%), followed by thymine (27%), guanine (22%) and cytosine (21%). The mean GC content was 43%. The average length of read sequence was 305 bp, with a maximum length of 635 bp (Figure 1).

Figure 1. Frequency distribution of 454 read sequence lengths.

Identification of SSR loci

After MISA analysis, the number of sequences containing an SSR was 125,559, and in total 250,393 SSR loci were detected. The number of sequences containing more than one SSR locus was 61,266 and the number of SSRs present in compound formation was 122,988 (Table 1). The total size of SSR motif sequences was 8,759,185 bp, with an average motif length of 69 bp. Of these, 25% comprised more than one discrete repeat and a high proportion (49%) was located within compound repeats. The majority of identified SSR motifs (83%) were located between the 5’-terminus and mid regions of the cloned sequences, and within 200 bp of the 5’-terminus (Figure 2). A total of 28,503 primer pairs were designed for future assessment of locus amplification (Additional file 2: Table S2).

Figure 2. The frequency of the SSR motif start position from the 5’ terminus of the cloned insert within the enriched libraries.
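Several of the summary figures quoted in this section follow directly from the counts in Table 1. The short Python sketch below recomputes them as a consistency check; the dictionary keys and the function name `survey_ratios` are hypothetical labels introduced here, and only the counts themselves come from the text.

```python
# Counts copied from Table 1 of the text.
TABLE1 = {
    "reads_examined": 532_599,
    "bases_examined": 162_448_842,
    "ssrs_identified": 250_393,
    "reads_with_ssr": 125_559,
    "ssrs_in_compound": 122_988,
}


def survey_ratios(t=TABLE1):
    """Derive simple ratios from the Table 1 counts."""
    return {
        # mean read length in bp
        "mean_read_length": t["bases_examined"] / t["reads_examined"],
        # share of examined reads that contain at least one SSR
        "reads_with_ssr": t["reads_with_ssr"] / t["reads_examined"],
        # share of all SSRs that sit in a compound arrangement
        "ssrs_in_compound": t["ssrs_in_compound"] / t["ssrs_identified"],
        # mean number of SSRs per SSR-containing read
        "ssrs_per_ssr_read": t["ssrs_identified"] / t["reads_with_ssr"],
    }


if __name__ == "__main__":
    for name, value in survey_ratios().items():
        print("%-20s %.4f" % (name, value))
```

The mean read length and the compound share derived this way agree with the 305 bp and 49% quoted in the text.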

Abundance and length frequencies of SSR repeat motifs

The most common SSR motifs comprised trinucleotide and dinucleotide repeats (Figure 3). The majority of the trinucleotide repeats were from 15 to 30 bp in length. Within the 1,188 characterised mononucleotide SSRs, (A/T)n was almost three times more common than (C/G)n, particularly at the 11–12 bp length. The dinucleotide repeats (AC/GT)n and (AG/CT)n were predominant, representing 99.2% of all of the dinucleotides characterised. Trinucleotide (AAC/GTT)n repeats were the most abundant (96.5%). Twenty-two unique tetranucleotide repeat motifs were identified, with the most common being AGAT/ATCT (66.4%), ACAG/CTGT (19.3%) and ACAT/ATGT (9.1%). Pentanucleotide and hexanucleotide motifs were far less frequent, together comprising only 0.1% of the total SSRs detected. The dominant pentanucleotide motif was AGAGT/ATCTC (23.8%) and the most common hexanucleotide motif was ACACGC/CGTGTG (49.5%) (Additional files 3, 4, 5, 6, 7 and 8: Figures S1-S6).

Figure 3. Frequencies of different nucleotide repeat sizes within the clones analysed.

Compound SSR analysis

Two types of compound SSR were identified: those without an interruption between two motifs (e.g. (CA)12(ACG)37, noted as C* type) and those with an interruption between two motifs (e.g. (AAC)7gtcaat(AAC)5, noted as C type). In total, 1,893 C* type and 59,369 C type compound SSR loci were detected among those sequenced, reflecting the complexity of the faba bean genome.
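The distinction between the two compound types comes down to the gap between adjacent repeats. The Python sketch below classifies a pair of neighbouring SSRs under that reading, using the 100 bp maximum interruption stated in the Methods. It is an illustration rather than MISA's own logic, and `classify_pair` and its tuple layout are hypothetical.

```python
MAX_INTERRUPTION = 100  # bp, the limit stated in the Methods


def classify_pair(first, second):
    """Classify two neighbouring SSRs on the same sequence.

    Each SSR is (start, motif, repeats) with a 0-based start, and `first`
    lies upstream of `second`.
    Returns "C*" (no interruption), "C" (interrupted by 1..100 bp) or
    None (too far apart to count as one compound SSR).
    """
    start1, motif1, repeats1 = first
    end1 = start1 + len(motif1) * repeats1
    gap = second[0] - end1
    if gap < 0:
        raise ValueError("SSRs overlap or are given in the wrong order")
    if gap == 0:
        return "C*"
    if gap <= MAX_INTERRUPTION:
        return "C"
    return None


if __name__ == "__main__":
    # (CA)12(ACG)37: the second repeat starts where the first one ends.
    print(classify_pair((0, "CA", 12), (24, "ACG", 37)))
    # (AAC)7gtcaat(AAC)5: a 6 bp interruption separates the repeats.
    print(classify_pair((0, "AAC", 7), (27, "AAC", 5)))
```

The two calls correspond to the examples given in the text for the C* and C types respectively.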

Validation of SSR assay

Of the 150 primer pairs selected for validation of SSR locus amplification, 102 (68%) produced a reproducible and clear amplicon of the expected size. Of these, 94 (63% of the 150 tested) were polymorphic among the thirty-two genotypes assessed (Table 2). The number of alleles per locus ranged from 2 to 8, the expected heterozygosities ranged from 0.0000 to 1.0000, and the observed heterozygosities ranged from 0.0908 to 0.8410 (Table 3).
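Because the percentages in this paragraph can be read against different denominators, a small Python check of the three possible ratios may help. Only the counts 150, 102 and 94 come from the text; the function name `validation_rates` is hypothetical.

```python
def validation_rates(tested=150, amplified=102, polymorphic=94):
    """Return the success rates implied by the validation counts."""
    return {
        "amplified_of_tested": amplified / tested,
        "polymorphic_of_tested": polymorphic / tested,
        "polymorphic_of_amplified": polymorphic / amplified,
    }


if __name__ == "__main__":
    for name, rate in validation_rates().items():
        print("%-26s %.1f%%" % (name, 100 * rate))
```

On these counts the 63% figure corresponds to polymorphic markers as a share of all 150 primer pairs tested, while the share among the 102 amplifying pairs is higher.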

Table 2. Characteristics of 94 polymorphic SSR markers developed in Vicia faba L. (F = forward primer, R = reverse primer, Size = size of cloned allele, Ta = annealing temperature)

| Primer | Repeat | F (5’–3’) | R (5’–3’) | Size (bp) | Ta (°C) |
|---|---|---|---|---|---|
| CAAS1 | (AAAGGG)7 | AGTCAGGGGGTCGATTTTTC | TCTTGCGCAGTTTTGACATC | 212 | 55 |
| CAAS2 | (GAA)9 | TACAAAAGCTCTGGGGCCTA | CCAATTCCTCTGGGCAACT | 202 | 56 |
| CAAS3 | (AG)7 | CTGGTGCGTAAGGTTGATGA | CAAACCACCACCAATCACAG | 132 | 53 |
| CAAS4 | (CA)11 | ATTGCAAGTCCTGAGGCAAG | ATAATGGCGCCACAAAGTGT | 160 | 57 |
| CAAS5 | (ACA)15 | TACATCAGTCCCGCAAATCA | CCATGTAGCCGATTCCACTT | 150 | 55 |
| CAAS6 | (A)10 | TGCAAAGTAATTCCGAAACAA | CGCACATGAATTGGGGTAAT | 150 | 56 |
| CAAS7 | (A)10 | GACCCAAGCCTTCACCACTA | TGTGTGGGATCCATTTTGAA | 200 | 59 |
| CAAS8 | (AAC)14 | AATTTGTTCAGCATCTCGGG | CTGGTTGGTTCCTGGTGAGT | 150 | 56 |
| CAAS9 | (AAC)9 | GTGATGCTTTGCCTGTGCTA | ATGGACGTTTGTAGGTGGGA | 200 | 56 |
| CAAS10 | (AAG)5 | CTGTTCGTCATCATCATCGG | CGTAAATCAACCCCAACACC | 150 | 53 |
| CAAS11 | (ACA)10 | TCCCGCTATTCTTGCTCTGT | GCTCAAAAATGCTTGTCTTTCA | 170 | 54 |
| CAAS12 | (TGT)9 | GAGGAGGATCCCACAATGAA | GCCAAAAGAGCCATGGTAGA | 210 | 56 |
| CAAS13 | (CAA)11aaatcccaaaaactgcaaattgtatgccatcttaaaccatac(CAA)7 | CAAAAATCCCAAAAACTGCAA | TCGATTTTTCGACTTGGGTC | 130 | 56 |
| CAAS14 | (AAC)6 | CCGTAGATCTCAAAAACCATGA | GGAGGAAGGAAGCTCGAATC | 170 | 60 |
| CAAS15 | (AAC)8 | AACCAACATCAATGGCATCA | TCTTTTCCTTTTTCCTCTTCCA | 140 | 60 |
| CAAS16 | (CA)7 | TCAAATTTCCCTTTGCAAAAAT | GACCAAGGTCAACCACCTTT | 350 | 56 |
| CAAS17 | (CA)8 | TCAAACACCTACACACCCACA | TCTCGGTCAATCTCACATGC | 250 | 56 |
| CAAS18 | (CA)9 | ATGGGAGGGCAAATTTTAGG | AGTGAGTGGAGCGCTTGTTT | 350 | 56 |
| CAAS19 | (CAA)6 | AACATTTTTCCAATCGAGGC | TGTAGGCTTACGGCCAAAGA | 200 | 56 |
| CAAS20 | (CAT)5 | ACTGGAAAATCCCAATGCAC | AGCAAACTTGCACCCAACAT | 190 | 56 |
| CAAS21 | (CTT)8 | GAATTTTCAAAACATGAGTCCCA | CCGGATCTGAAAAGACTTGC | 175 | 60 |
| CAAS22 | (G)10 | TGATGAACAGAACTGCGCTC | ATTGGAGAGAGGCGAAATCA | 190 | 56 |
| CAAS23 | (GA)6 | ACCGCATGCTAGGGAGTCTA | TGGGTGACTCACTTTTGTGG | 220 | 58 |
| CAAS24 | (GA)6gca(AG)6(TG)8 | TCACTCACAAGCCACTAAGTCAA | GATGCGACACTATCCCCACT | 200 | 56 |
| CAAS25 | (GT)15 | TCCATAATCAATTGGCTAAGCTC | AAGACTAACTCTCGACTGTATTTAGGC | 150 | 58 |
| CAAS26 | (GT)7 | CGGCTTGGTTAACTGGATGT | TCTTCCTTTTCTTCAATGCG | 160 | 58 |
| CAAS27 | (TA)6 | TTGGCATCATGCTCTAATCG | CTTGAAGTCGTGCCAGATGA | 280 | 60 |
| CAAS28 | (TC)8 | CCATTGATGCAGGAAAGGAT | CAGCTTTGACAGCTCCAACA | 160 | 58 |
| CAAS29 | (TCA)5 | TGCAAGTCAGTAGCCAAGACA | CTCGTCTCTCCTCATTCCCA | 180 | 58 |
| CAAS30 | (TG)10 | GGTTTTTAGGTGATTTTCGCA | GCGAAACCTCGTATGGTTGT | 170 | 59 |
| CAAS31 | (TG)12 | CAACGCGCTAGAGGAAGAAG | CCACTGCCCTAGCACACTAA | 160 | 56 |
| CAAS32 | (TG)7 | TTTGGGGTACAACACTGGGT | CCTCACTCCTCTATATAAACAACACTT | 200 | 59 |
| CAAS33 | (TGA)5 | GCAGTGATTCTGGCAGTGAA | TGCAGCAACATTTCCATCAT | 190 | 56 |
| CAAS34 | (TGT)5 | TTTCTCGCAATTGTTCTCACA | TTCGATGAAATCCATCTTCTGA | 200 | 57 |
| CAAS35 | (TTG)8 | AGGCAGAAGTTTGGAAGCAA | TCTCACTTCGGCTTCAGGAT | 180 | 56 |
| CAAS36 | (A)11 | AGCACTAGAGTTCCAAGCCA | TTTTTATCGTTTCTTGTCACGC | 130 | 52 |
| CAAS37 | (A)11 | CAACGCAAGAACACGTGAAT | TAGAGGCCAATTCAAGCCAT | 190 | 54 |
| CAAS38 | (AAC)5 | CGCCTCAGAACCAAGTTCAT | TGCTTTGTTTTGGTTTTGTGA | 170 | 56 |
| CAAS39 | (AAG)5 | CTGTTCGTCATCATCATCGG | CGTAAATCAACCCCAACACC | 170 | 54 |
| CAAS40 | (AAG)6 | CCAAAGCCACTTCCAAACAT | TTCAGCCGGGCTTCTTTC | 110 | 54 |
| CAAS41 | (AC)10 | GAAACCCACTTGGTCGTGTC | TTCATTTGGGTAGGCTCCAA | 190 | 56 |
| CAAS42 | (AC)10 | CAAGTGTCGACGCAAGAGAT | TGACTTTTTGACTGCTCCCA | 250 | 56 |
| CAAS43 | (AC)7 | GAGGAAGTGTGAAAGGTCGC | TCATTTTAAAGTGGTGTATGTGTGT | 170 | 54 |
| CAAS44 | (AC)7 | ACACACACACGCACACACAC | CATGAACCTTTGATAGTTTTCCA | 150 | 56 |
| CAAS45 | (AGA)5 | ATGGCTTTGACAAAAGGGAA | CTCCTTCACCCGACAATGTT | 180 | 57 |
| CAAS46 | (AGA)6 | AGATCGCAGGCGTAGAAAGA | TGCTTCAACCACAACACCAT | 200 | 58 |
| CAAS47 | (C)11 | CAAATTGGTTTGCATATCCG | AGCCCTTCACATCCATTGAG | 200 | 56 |
| CAAS48 | (CA)10 | CCTCCTCCTTTAATTTGTGGC | TGAATCGTGAATGCTCTCTGA | 200 | 56 |
| CAAS49 | (CA)10 | ACCTCCATAGCAGCAGCATT | GGCCAATTCTTAACGTGCTT | 140 | 56 |
| CAAS50 | (CA)10 | CACTGGACCATTTTGCATTC | ATGAGATCCGGAGCAGATGT | 140 | 56 |
| CAAS51 | (CA)11 | AAGCATTAAAACTCCCATAGCG | ATGTGTGCGTGTGTCATGTG | 140 | 52 |
| CAAS52 | (CA)12 | CATTCCATGTTGCGTTTTTG | GGATAAGAGGGTGGTGGTGA | 200 | 56 |
| CAAS53 | (CA)13 | GGCCCATTTGTTAAGGGTTT | AATGAGATCTGGCCTGGATG | 200 | 56 |
| CAAS54 | (CA)6 | CCATTGGACCTCTTTGCATT | CCAGAGTGGATGATGATCTGA | 150 | 54 |
| CAAS55 | (CA)6 | ACTCACATACACGCACACACA | AATGCTCTCATCCCTTTTGC | 150 | 56 |
| CAAS56 | (CA)6 | CACATACACGCACACACACA | AATGCTCTCATCCCTTTTGC | 150 | 56 |
| CAAS57 | (CA)8 | GCCCGAGACACTTTGGTTTA | CCAGAATGGATGAGGACCTG | 210 | 56 |
| CAAS58 | (CA)9 | CTCCTGGTCCATGTATGAATGA | TGTGTGTATGTGTATGCGTGC | 150 | 54 |
| CAAS59 | (CAA)10 | GGCCAACATAGGTGAGCATT | GTGTTGTAGGCCTTTGGTCC | 200 | 56 |
| CAAS60 | (CAA)8 | ATGCAAAATGAAATGCGACA | TGTAGTTGTCTGTTTAATGGTTGTTG | 190 | 56 |
| CAAS61 | (G)11 | AGAGGAAAAAGGCAAATGGC | CCCTTCATCAATCACACCAA | 130 | 54 |
| CAAS62 | (GA)14 | AATGTTGGGACGGAGTTCAG | TTGTTGATTCATTCATCCCTTG | 130 | 56 |
| CAAS63 | (GA)15 | CGCAGAGAAACACTCCATGA | GAAGTTGAATGTCATTTGTGTCAA | 100 | 56 |
| CAAS64 | (GA)6 | AAAATATAATAAACAAAGCAAAAGTGC | CAGGTTTGTGGTTTCACCCT | 200 | 54 |
| CAAS65 | (GA)6 | CGATATTCCTCGGTTTCCAA | CATGGGTCGTCTTCTCCACT | 200 | 54 |
| CAAS66 | (GA)6 | CATCACTTTCCAGCCTGTCA | ATTTTCTGCCTCCCCTTTGT | 190 | 58 |
| CAAS67 | (GA)7 | GGGTTTCAGAGAAAGGGGTC | CGCAAGCGTATTGGGTATTT | 130 | 56 |
| CAAS68 | (GA)8 | ATGGAGGTTGCGATTTGAAG | CATCATCTCCACACTTTTTCCA | 130 | 54 |
| CAAS69 | (GT)10 | ATTACAAATGTCGGTGCCGT | AGCACAACGATAAGATGATATGC | 170 | 54 |
| CAAS70 | (GT)8 | TCGCGATAGAGGTTTTGGAA | AACAACAACGATTCATCACAAGA | 200 | 56 |
| CAAS71 | (GTT)15 | CCATGTAGCCGATTCCACTT | TTCGGCAACGTAGGAAAAAT | 160 | 54 |
| CAAS72 | (T)10 | TTTTCCAGTGTCAACCCATCT | ACATGAGGCCAAAAACTGCT | 170 | 54 |
| CAAS73 | (TG)13 | TTGCACCTCTGTTGAAGACG | TCACCAACACTCTAATCCTCAATC | 190 | 54 |
| CAAS74 | (AC)6 | CCCACCGTATTACACAAGGG | GCGAGGAAGAAGATGACGTT | 200 | 56 |
| CAAS75 | (AG)15 | TCGATTGCACAATAAATGGTTT | GAGGTCGACTCCCATTGAAA | 180 | 54 |
| CAAS76 | (AG)6 | GCCTGTTAATGAGAAGAACTGGA | TTTCAAAATTTAGTTTCTCTCTGTCTC | 200 | 56 |
| CAAS77 | (CA)21 | TAGCAGCCAACAATCAGTGG | GGTGATGTTGCTCATGTTCG | 180 | 56 |
| CAAS78 | (CA)7 | TCAAATTTCCCTTTGCAAAAAT | TCGAACACAACTTCTTCATTTCTC | 180 | 56 |
| CAAS79 | (CA)7 | TCAAATTTCCCTTTGCAAAAAT | CATGGAAAATCTTTTATTTTGTGTG | 100 | 58 |
| CAAS80 | (CA)8 | GTGTGAAAACTCACCCGGTC | TGTGTGTAAGTGTGTGTATGTGTGTG | 130 | 54 |
| CAAS81 | (GA)15 | AACTTACAGGGGCCACACAC | TGTGCATTATACTTTACGTATGTTCCT | 100 | 52 |
| CAAS82 | (GA)17 | TTTGCTTGACAATGGTGGAA | ATTCAACAAGCAAGGGTTGG | 120 | 52 |
| CAAS83 | (T)10 | GATTTGCGTTTAGGGTTCCA | GAACAAACTACGTTTTATTGTCCAGA | 180 | 52 |
| CAAS84 | (TA)6 | TGTCGACACCACAGCTATTTT | TGTGGTTCGTTGTTTTGGTG | 200 | 56 |
| CAAS85 | (TCA)6 | TTGAAGTGAATAAGATGAAGAAGTGT | GTTGCCTTTCCTTGCATGAT | 130 | 56 |
| CAAS86 | (TG)10 | TCGCGATAGAGGTTTTGGAA | CACAAACAACAACGATTCATCA | 200 | 56 |
| CAAS87 | (TG)14 | CTCTACCATGGGCCATTTCT | AGAGATAGAGAGAGAGACAGAGATGAA | 90 | 54 |
| CAAS88 | (TG)18 | TCCTACCGATCTCTCTCTCCC | GTGGCATAACCGCGTAAGTT | 130 | 56 |
| CAAS89 | (TG)18 | TGTCTCGCCTTCAATCTTCC | CTTGCTAAGTGAGACTGCTGCT | 190 | 54 |
| CAAS90 | (TG)19 | TCCATAGTCGATGAGGACCG | TTGTCTCATTGTCTTTCTTTTCTTTC | 100 | 54 |
| CAAS91 | (TG)6 | ATCTTCGGCTTGGTTGATTG | GAGGCGGCCACATTAGACT | 200 | 56 |
| CAAS92 | (TG)9 | CGAGATCTGGAGTGGATTTAGA | TTTTCATATGCCACATGCTCA | 170 | 56 |
| CAAS93 | (TTC)5 | GGCATTGCTTACTTACCGGA | CGACGTCGACATTAACATGC | 200 | 56 |
| CAAS94 | (TTG)9 | TCCTCAACACGTGATGCAAT | TGTAGGACCAGGAAGGTCGT | 180 | 56 |

Table 3. Informativeness of SSR loci following amplification from 32 geographically diverse accessions of Vicia faba L.

| Locus | Na | He | Ho |
|---|---|---|---|
| CAAS1 | 3 | 0.0000 | 0.3591 |
| CAAS2 | 3 | 0.2857 | 0.5703 |
| CAAS3 | 7 | 0.4444 | 0.8099 |
| CAAS4 | 4 | 0.0000 | 0.6111 |
| CAAS5 | 3 | 0.1111 | 0.6471 |
| CAAS6 | 4 | 0.2188 | 0.6324 |
| CAAS7 | 6 | 0.6774 | 0.7372 |
| CAAS8 | 7 | 0.6250 | 0.8016 |
| CAAS9 | 4 | 0.1290 | 0.7250 |
| CAAS10 | 4 | 0.7419 | 0.7277 |
| CAAS11 | 4 | 0.3929 | 0.6890 |
| CAAS12 | 4 | 0.1000 | 0.6718 |
| CAAS13 | 5 | 0.3871 | 0.6256 |
| CAAS14 | 3 | 0.4062 | 0.6493 |
| CAAS15 | 4 | 0.6129 | 0.6901 |
| CAAS16 | 6 | 0.6667 | 0.7708 |
| CAAS17 | 3 | 0.0000 | 0.5159 |
| CAAS18 | 4 | 0.3333 | 0.6887 |
| CAAS19 | 5 | 0.0500 | 0.7474 |
| CAAS20 | 4 | 0.2593 | 0.5926 |
| CAAS21 | 4 | 0.1562 | 0.4712 |
| CAAS22 | 3 | 0.2222 | 0.6038 |
| CAAS23 | 2 | 0.0938 | 0.0908 |
| CAAS24 | 6 | 0.1000 | 0.8000 |
| CAAS25 | 5 | 0.4375 | 0.7399 |
| CAAS26 | 3 | 0.0000 | 0.6333 |
| CAAS27 | 5 | 0.2963 | 0.7701 |
| CAAS28 | 4 | 0.5294 | 0.6471 |
| CAAS29 | 4 | 0.3793 | 0.4483 |
| CAAS30 | 4 | 0.2917 | 0.4991 |
| CAAS31 | 4 | 0.4167 | 0.3608 |
| CAAS32 | 5 | 0.6875 | 0.7882 |
| CAAS33 | 3 | 0.2188 | 0.6195 |
| CAAS34 | 3 | 0.4091 | 0.5613 |
| CAAS35 | 4 | 0.3226 | 0.6753 |
| CAAS36 | 3 | 0.3182 | 0.6131 |
| CAAS37 | 2 | 0.1053 | 0.1024 |
| CAAS38 | 2 | 0.4500 | 0.5013 |
| CAAS39 | 4 | 0.3226 | 0.5960 |
| CAAS40 | 3 | 0.0000 | 0.3579 |
| CAAS41 | 3 | 0.0645 | 0.5812 |
| CAAS42 | 5 | 0.7500 | 0.7599 |
| CAAS43 | 3 | 0.0000 | 0.6400 |
| CAAS44 | 4 | 0.3333 | 0.6078 |
| CAAS45 | 4 | 0.1034 | 0.6068 |
| CAAS46 | 3 | 0.0625 | 0.2758 |
| CAAS47 | 5 | 0.0000 | 0.6885 |
| CAAS48 | 3 | 0.5333 | 0.6706 |
| CAAS49 | 3 | 0.0938 | 0.6424 |
| CAAS50 | 4 | 0.2759 | 0.6733 |
| CAAS51 | 4 | 1.0000 | 0.7270 |
| CAAS52 | 3 | 0.7000 | 0.5757 |
| CAAS53 | 5 | 0.5806 | 0.7832 |
| CAAS54 | 5 | 0.6129 | 0.7441 |
| CAAS55 | 3 | 0.0000 | 0.4504 |
| CAAS56 | 2 | 0.5000 | 0.4944 |
| CAAS57 | 5 | 0.2188 | 0.5045 |
| CAAS58 | 3 | 0.4167 | 0.5616 |
| CAAS59 | 5 | 0.5200 | 0.6686 |
| CAAS60 | 3 | 0.8182 | 0.6104 |
| CAAS61 | 3 | 0.2667 | 0.4881 |
| CAAS62 | 2 | 0.6250 | 0.4583 |
| CAAS63 | 3 | 0.1176 | 0.5704 |
| CAAS64 | 4 | 0.4194 | 0.7229 |
| CAAS65 | 4 | 0.4643 | 0.7266 |
| CAAS66 | 4 | 0.3871 | 0.7123 |
| CAAS67 | 4 | 0.0000 | 0.4719 |
| CAAS68 | 2 | 0.2500 | 0.2283 |
| CAAS69 | 6 | 0.9524 | 0.8072 |
| CAAS70 | 2 | 0.0000 | 0.5034 |
| CAAS71 | 6 | 0.1429 | 0.8097 |
| CAAS72 | 2 | 0.1000 | 0.4808 |
| CAAS73 | 5 | 0.2000 | 0.6220 |
| CAAS74 | 3 | 0.1250 | 0.2651 |
| CAAS75 | 5 | 0.2222 | 0.6797 |
| CAAS76 | 4 | 0.1724 | 0.3358 |
| CAAS77 | 5 | 0.3600 | 0.6106 |
| CAAS78 | 5 | 0.6000 | 0.7734 |
| CAAS79 | 5 | 0.2812 | 0.7941 |
| CAAS80 | 4 | 0.6400 | 0.7192 |
| CAAS81 | 5 | 0.0500 | 0.7167 |
| CAAS82 | 4 | 0.6875 | 0.6230 |
| CAAS83 | 4 | 0.6000 | 0.7590 |
| CAAS84 | 3 | 0.0625 | 0.4172 |
| CAAS85 | 3 | 0.3750 | 0.5928 |
| CAAS86 | 3 | 0.0323 | 0.4691 |
| CAAS87 | 5 | 0.9091 | 0.8139 |
| CAAS88 | 6 | 0.8571 | 0.8269 |
| CAAS89 | 8 | 0.0000 | 0.8410 |
| CAAS90 | 4 | 0.5294 | 0.6471 |
| CAAS91 | 5 | 0.8710 | 0.6267 |
| CAAS92 | 4 | 0.3750 | 0.5382 |
| CAAS93 | 4 | 0.1562 | 0.7217 |
| CAAS94 | 5 | 0.2400 | 0.7412 |

Notes: Number of alleles (Na), expected heterozygosity (He) and observed heterozygosity (Ho), each scored across the 32 accessions.

The dendrogram showed that the 32 faba bean genotypes fell into four distinct clusters (Figure 4). Cluster 1 comprised accessions from China and other Asian countries, except for one accession from Africa. Cluster 2 comprised accessions from Europe and nearby regions such as Syria. Cluster 3 comprised accessions from Africa, and Cluster 4 contained accessions from America, Oceania and Africa. The pattern of diversity was similar to that previously observed using AFLP [29] and ISSR [30] markers.

Figure 4. UPGMA dendrogram of 32 genotypes of faba bean.
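For readers unfamiliar with UPGMA, the clustering step behind a dendrogram of this kind can be sketched in a few lines of Python. The study computed Nei's unbiased genetic distances and drew the tree with MEGA4; the sketch below is neither of those tools. It takes any symmetric distance matrix, and the four labels and distances in the example are invented purely for illustration.

```python
def upgma(labels, matrix):
    """Average-linkage (UPGMA) clustering of a symmetric distance matrix.

    Returns (tree, merge_distances): tree is a nested tuple of labels and
    merge_distances lists the inter-cluster distance at each merge, in order.
    """
    clusters = {i: (label, 1) for i, label in enumerate(labels)}  # id -> (tree, size)
    dist = {(i, j): matrix[i][j]
            for i in clusters for j in clusters if i < j}
    next_id = len(labels)
    merge_distances = []
    while len(clusters) > 1:
        # Closest pair of clusters; ties are broken by the smaller ids.
        (i, j), d_ij = min(dist.items(), key=lambda kv: (kv[1], kv[0]))
        (tree_i, n_i), (tree_j, n_j) = clusters.pop(i), clusters.pop(j)
        merge_distances.append(d_ij)
        new_dist = {}
        for k in clusters:
            d_ik = dist[(min(i, k), max(i, k))]
            d_jk = dist[(min(j, k), max(j, k))]
            # Size-weighted mean, so every original pair counts equally.
            new_dist[(k, next_id)] = (n_i * d_ik + n_j * d_jk) / (n_i + n_j)
        dist = {p: v for p, v in dist.items() if i not in p and j not in p}
        dist.update(new_dist)
        clusters[next_id] = ((tree_i, tree_j), n_i + n_j)
        next_id += 1
    (tree, _size), = clusters.values()
    return tree, merge_distances


if __name__ == "__main__":
    # Invented distances among four hypothetical accessions.
    names = ["A", "B", "C", "D"]
    d = [[0.0, 0.2, 0.6, 0.8],
         [0.2, 0.0, 0.6, 0.8],
         [0.6, 0.6, 0.0, 0.4],
         [0.8, 0.8, 0.4, 0.0]]
    print(upgma(names, d))
```

In a drawn dendrogram each node is usually placed at half the merge distance, which is a presentation choice the sketch leaves to the caller.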

Discussion

This study demonstrated that massively parallel sequencing technology offers an opportunity to quickly identify large numbers of high-quality SSRs with diverse motifs from a genetically orphaned species such as Vicia faba. Given the huge number of marker loci identified in this study, future SSR marker optimisation may be best focussed on those comprising trinucleotide repeats. These repeats are generally more robust since they are reported to give fewer “stutter bands” than those based on dinucleotide repeats [31,32]. Also, trinucleotide repeats in particular have been demonstrated to be highly polymorphic and stably inherited in the human genome [33-35]. While tri- and dinucleotide repeats made up the major proportion of SSRs, only a very small share was contributed by mono-, tetra-, penta- and hexanucleotide repeats. A similar trend was observed in other species [36]. The conversion of SSR-containing sequences into single locus markers may have a low success rate due to complex and/or insufficient flanking sequence. For example, just 20% of the identified dinucleotide repeats from spruce were converted to clear, discrete markers [37]. Similar observations were made for pine [38], wheat [39] and previously for V. faba [12]. Another factor affecting the development of clear markers is the complexity of the repeat motifs; indeed, a high proportion of the SSRs in the current study comprised compound repeats (49.1%). Nevertheless, this study has provided the sequence data required to potentially develop tens of thousands of novel SSR markers for the faba bean genome. Previously, a total of 304,680 reads were generated and 802 EST-SSR primer pairs were designed from transcriptome sequencing of faba bean [40]. From this, 81 primer pairs were developed, of which 48% produced polymorphic markers on the genotypes assessed. In our study, 68% (102) of the 150 primer pairs tested amplified accurately, and 63% (94) of the 150 were polymorphic among the genotypes tested.
This may be indicative of the larger number of SSR loci detected, inclusive of non-transcribed sequences. Hence these markers may be more representative of the entire genome for germplasm diversity assessment and conservation purposes [41]. Meanwhile, the identification of EST-SSRs within sequences provides a future opportunity to mine the expressed sequences for significant physical and functional association with traits of interest in marker-assisted faba bean breeding.

Conclusion

This work represents a major advance in the identification of large numbers of informative SSR loci in V. faba by application of 454 GS FLX Titanium sequencing technology.

Abbreviations

SSR: Simple sequence repeat; QTL: Quantitative trait locus; MAS: Marker-assisted selection; NGS: Next generation sequencing; EST: Expressed sequence tag; NCBI: National Center for Biotechnology Information; CTAB: Cetyltrimethylammonium bromide; MISA: Microsatellite identification; Na: Number of alleles; He: Expected heterozygosity; Ho: Observed heterozygosity.

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

TY performed bioinformatic analysis and primer design and drafted the manuscript. SYB created the SSR-enriched DNA library and participated in 454 sequencing. RF assisted in designing the experiment and preparing the manuscript. TJJ tested SSR markers. JPG and YHH prepared all the seeds of V. faba. XLS and JYJ took charge of quality inspection of the DNA library. JJH and XYZ participated in conceiving the study and drafting the manuscript. XXZ designed and coordinated the study, and assisted in preparing the manuscript. All authors read and approved the final manuscript.

Additional file 1

Table S1. Information on the Vicia faba L. germplasm used in this study.

Additional file 2

Table S2. The primer pairs successfully designed by Primer3.

Additional file 3

Figure S1. Frequencies of different SSR repeat motif types among mononucleotide repeats.

Additional file 4

Figure S2. Frequencies of different SSR repeat motif types among dinucleotide repeats.

Additional file 5

Figure S3. Frequencies of different SSR repeat motif types among trinucleotide repeats.

Additional file 6

Figure S4. Frequencies of different SSR repeat motif types among tetranucleotide repeats.

Additional file 7

Figure S5. Frequencies of different SSR repeat motif types among pentanucleotide repeats.

Additional file 8

Figure S6. Frequencies of different SSR repeat motif types among hexanucleotide repeats.
  31 in total

Review 1.  Microsatellites for linkage analysis of genetic traits.

Authors:  C M Hearne; S Ghosh; J A Todd
Journal:  Trends Genet       Date:  1992-08       Impact factor: 11.639

2.  Genome-wide association studies of 14 agronomic traits in rice landraces.

Authors:  Xuehui Huang; Xinghua Wei; Tao Sang; Qiang Zhao; Qi Feng; Yan Zhao; Canyang Li; Chuanrang Zhu; Tingting Lu; Zhiwu Zhang; Meng Li; Danlin Fan; Yunli Guo; Ahong Wang; Lu Wang; Liuwei Deng; Wenjun Li; Yiqi Lu; Qijun Weng; Kunyan Liu; Tao Huang; Taoying Zhou; Yufeng Jing; Wei Li; Zhang Lin; Edward S Buckler; Qian Qian; Qi-Fa Zhang; Jiayang Li; Bin Han
Journal:  Nat Genet       Date:  2010-10-24       Impact factor: 38.330

3.  Estimation of average heterozygosity and genetic distance from a small number of individuals.

Authors:  M Nei
Journal:  Genetics       Date:  1978-07       Impact factor: 4.562

Review 4.  Applications of next-generation sequencing technologies in functional genomics.

Authors:  Olena Morozova; Marco A Marra
Journal:  Genomics       Date:  2008-08-24       Impact factor: 5.736

5.  Molecular variation among Chinese and global winter faba bean germplasm.

Authors:  Xuxiao Zong; Xiuju Liu; Jianping Guan; Shumin Wang; Qingchang Liu; Jeffrey G Paull; Robert Redden
Journal:  Theor Appl Genet       Date:  2009-01-24       Impact factor: 5.699

6.  Analysis of a diverse global Pisum sp. collection and comparison to a Chinese local P. sativum collection with microsatellite markers.

Authors:  Xuxiao Zong; Robert J Redden; Qingchang Liu; Shumin Wang; Jianping Guan; Jin Liu; Yanhong Xu; Xiuju Liu; Jing Gu; Long Yan; Peter Ades; Rebecca Ford
Journal:  Theor Appl Genet       Date:  2008-09-25       Impact factor: 5.699

7.  Abundance, variability and chromosomal location of microsatellites in wheat.

Authors:  M S Röder; J Plaschke; S U König; A Börner; M E Sorrells; S D Tanksley; M W Ganal
Journal:  Mol Gen Genet       Date:  1995-02-06

8.  Survey of trinucleotide repeats in the human genome: assessment of their utility as genetic markers.

Authors:  J M Gastier; J C Pulido; S Sunden; T Brody; K H Buetow; J C Murray; J L Weber; T J Hudson; V C Sheffield; G M Duyk
Journal:  Hum Mol Genet       Date:  1995-10       Impact factor: 6.150

9.  QTL mapping of ten agronomic traits on the soybean ( Glycine max L. Merr.) genetic map and their association with EST markers.

Authors:  W-K Zhang; Y-J Wang; G-Z Luo; J-S Zhang; C-Y He; X-L Wu; J-Y Gai; S-Y Chen
Journal:  Theor Appl Genet       Date:  2004-01-22       Impact factor: 5.699

10.  Genome-wide distribution and organization of microsatellites in plants: an insight into marker development in Brachypodium.

Authors:  Humira Sonah; Rupesh K Deshmukh; Anshul Sharma; Vinay P Singh; Deepak K Gupta; Raju N Gacche; Jai C Rana; Nagendra K Singh; Tilak R Sharma
Journal:  PLoS One       Date:  2011-06-21       Impact factor: 3.240
