Literature DB >> 28096766

Genetic variation, population structure and linkage disequilibrium in Switchgrass with ISSR, SCoT and EST-SSR markers.

Yu Zhang1,2, Haidong Yan1, Xiaomei Jiang1, Xiaoli Wang3, Linkai Huang1, Bin Xu4, Xinquan Zhang1, Lexin Zhang1.   

Abstract

BACKGROUND: To evaluate genetic variation, population structure, and the extent of linkage disequilibrium (LD), 134 switchgrass (Panicum virgatum L.) samples were analyzed with 51 markers, including 16 ISSRs, 20 SCoTs, and 15 EST-SSRs.
RESULTS: In this study, a high level of genetic variation was observed in the switchgrass samples and they had an average Nei's gene diversity index (H) of 0.311. A total of 793 bands were obtained, of which 708 (89.28 %) were polymorphic. Using a parameter marker index (MI), the efficiency of the three types of markers (ISSR, SCoT, and EST-SSR) in the study were compared and we found that SCoT had a higher marker efficiency than the other two markers. The 134 switchgrass samples could be divided into two sub-populations based on STRUCTURE, UPGMA clustering, and principal coordinate analyses (PCA), and upland and lowland ecotypes could be separated by UPGMA clustering and PCA analyses. Linkage disequilibrium analysis revealed an average r2 of 0.035 across all 51 markers, indicating a trend of higher LD in sub-population 2 than that in sub-population 1 (P < 0.01).
CONCLUSIONS: The population structure revealed in this study will guide the design of future association studies using these switchgrass samples.

Entities:  

Keywords:  Genetic variation; Linkage disequilibrium; Panicum virgatum L; Population structure

Mesh:

Substances:

Year:  2016        PMID: 28096766      PMCID: PMC5226102          DOI: 10.1186/s41065-016-0007-z

Source DB:  PubMed          Journal:  Hereditas        ISSN: 0018-0661            Impact factor:   3.271


Background

Genetic diverstiy is a significant factor that contributes to crop improvement. Evaluation of genetic variation in contemporary germplasm through breeding programs may be indirectly favorable for genetic progress in future cultivars [1]. Thus, estimation of plant diversity is crucial for the efficacious use of genetic resources in breeding programs. Molecular markers, as particular segments of DNA that represent different functional classes, play an essential role in all aspects of plant breeding, and have been widely used to estimate genetic variation. Compared with conventional phenotyping methods, molecular markers have numerous advantages as they are easily detectable and stable in plant tissues regardless of environmental influences [2]. The inter simple sequence repeat marker (ISSR) is highly polymorphic and is useful in studies of genetic diversity, genome mapping and evolutionary biology [3]. This PCR-based technique is used in various types of plants and can overcome many defects of other marker methods, such as high-cost of amplified fragment length (AFLP) and the low reproducibility of random amplified polymorphic DNA (RAPD) [4]. Start codon targeted marker (SCoT) is a reliable and simple gene-targeted marker located on the translational start codon [5]. This technique involves designing single primers from the short conserved region flanking the ATG start codon [6] without knowing any further genomic sequence information. It has been used in peanut and mango crops for genetic diversity and cultivar relationship analysis [7]. Expressed sequence tag-simple sequence repeats marker (EST-SSR) detects variation based on the expressed portion of the genome from EST databases, thus explaining the low cost of development compared with the genomic simple sequence repeat marker (SSR) [8]. These EST-SSR primers can be used across various species for comparative mapping and the construction of genetic linkage maps [9, 10]. Each marker type has unique advantages and these three marker systems have found extensive application in the evaluation of genetic variation, population structure, and assisted selection for crop improvement [3, 11–14]. Many studies have shown that these markers are mainly used to develop genetic linkage maps [15, 16], however, fewer studies have focused on constructing linkage disequilibrium (LD) maps. Remarkably, LD and linkage are two different genetic terms, where LD refers to correlation between alleles in a population, while LD means the correlated inheritance of loci through physical connection on a chromosome [17]. Some factors can affect the LD level, including allele frequency and recombination. Unlike linkage analysis, LD mapping relies on a natural population which is used to identify the relationships between genetic and phenotypic variation. LD mapping, that is association analysis, represents a useful tool to identify trait-marker relationships, and the first LD mapping of a quantitative trait was the analysis of flowering time and the dwarf8 gene in maize [18]. Linkage disequilibrium (LD), referring to the nonrandom association of alleles between linked or unlinked loci, is the basis of association mapping to identify genetic regions associated with agronomic traits [17]. Recently, LD studies have been performed in various plants, such as rice (Oryza sativa L.) [19], barley (Hordeum vulgare L.) [20], Maize (Zea mays L.) [21], chickpea (Cicer arietinum L.) [22], perennial ryegrass (Lolium perenne L.) [23], and the model legume, Medicago truncatula [24]. The level of LD is constantly regarded as a standard to reflect mapping resolution. Association mapping in populations with low LD requires a high number of markers, whereas a high LD means low mapping resolution [25]. In addition, information about population structure within germplasm collections is also crucial for the interpretation and identification of associations between genetic and functional diversity, and to assess whether the inter-sample relatedness is suitable for association studies [26-28]. Therefore, population structure is also included as an effect in models used for association analysis. [15, 29]. Switchgrass (Panicum virgatum L.), as a warm season C4 perennial grass that is native to North America [30], is regarded as an important biofuel crop for its remarkable biomass yield and good adaptability on marginal lands thereby not competing with food crops on farmland [31-33]. In this study, we explored two distinct forms of switchgrass, upland and lowland ecotypes. The upland accessions are distributed in northern cold areas with lower biomass than lowland varieties. Generally, upland switchgrass is shorter (≤2.4 m, tall) than lowland types (≥ 2.7 m) in favorable environments. However, lowland cultivars appear more sensitive to moisture stress than upland cultivars [34]. Constructing association maps comparing the physiological and genetic basis of varying stresses can provide an available reference for the genetic improvement of switchgrass, and the evaluation of the level of LD and population structure can aid association analyses. To date, however, LD analysis across the switchgrass genome remains inadequate [35]. In our study, we present 134 switchgrass accessions supplied by Plant Genetic Resources Conservation Unit, Griffin, Georgia USA to identify the levels of genetic variation, population structure, and extent of LD using 51 markers including 16 ISSRs, 20 SCoTs, and 15 EST-SSRs. These results will provide a valuable molecular basis for enriching switchgrass genetic variation, and the information on the level of LD and population structure may guide association mapping using this representative collection. Here we constructed a three-marker molecular dataset with important applications for diversity analysis, establishment of population structure and evaluation of linkage disequilibrium in switchgrass which is an allogamous species.

Results and discussion

Genetic variation analysis

The ISSR, SCoT, and EST-SSR primers were screened using the selected four genotypes [PI421999 (AM-314/MS-155), PI422006 (Alamo), PI642190 (Falcon), and PI642207 (70SG 016)]. After the initial screening, the numbers of selected ISSR, SCoT, and EST-SSR primers used in further studies were reduced to 16, 20, and 15 pairs, respectively (Table 1).
Table 1

The ISSR, SCoT, and EST-SSR primers used in this study and amplification results

PrimerPrimer sequence (5′ → 3′)Annealing (°C)Total number of amplified bands (TNB)The number of polymorphic bands (NPB)Percentage of polymorphic bands (PPB) %)
ISSR-UBC812(GA)8A52.0131292.31
ISSR-UBC827(AC)8G53.01010100.00
ISSR-UBC828(TG)8A52.09888.89
ISSR-UBC829(TG)8C52.0121083.33
ISSR-UBC830(TG)8G55.0141392.86
ISSR-UBC835(AG)8YC52.0151386.67
ISSR-UBC836(AG)8YT54.0171482.35
ISSR-UBC844(CT)8RC52.0141285.71
ISSR-UBC848(CA)8RG53.0141392.86
ISSR-UBC854(TC)8RG52.01515100.00
ISSR-UBC868(GAA)6 55.0131292.31
ISSR-UBC876(GATA)2(GACA)2 52.0161381.25
ISSR-UBC879(CTTCA)3 53.0171482.35
ISSR-UBC887DVD(TC)7 52.01414100.00
ISSR-UBC890VHV(GT)7 52.0131184.62
ISSR-UBC891HVH(TG)7 52.0141285.71
SCoT2CAACAATGGCTACCACCC55.0171482.35
SCoT3CAACAATGGCTACCACCG55.0343191.18
SCoT4CAACAATGGCTACCACCT55.0211990.48
SCoT5CAACAATGGCTACCACGA55.0211990.48
SCoT6CAACAATGGCTACCACGC55.0222091.91
SCoT7CAACAATGGCTACCACGG55.0222091.91
SCoT9CAACAATGGCTACCAGCA55.0201995.00
SCoT10CAACAATGGCTACCAGCC55.0212095.00
SCoT12ACGACATGGCGACCAACG55.0252496.00
SCoT13ACGACATGGCGACCATCG55.0252288.00
SCoT15ACGACATGGCGACCGCGA55.0181688.89
SCoT16ACCATGGCTACCACCGAC55.0262492.31
SCoT18ACCATGGCTACCACCGCC55.0201890.00
SCoT21ACGACATGGCGACCCACA55.0211990.48
SCoT28CCATGGCTACCACCGCCA55.0252288.00
SCoT31CCATGGCTACCACCGCCT55.0221986.36
SCoT34ACCATGGCTACCACCGCA55.0211990.48
SCoT35CATGGCTACCACCGGCCC55.0211990.48
SCoT37CAATGGCTACCACTAGCC55.0232086.96
SCoT48ACAATGGCTACCACTGGC55.0201890.00
EST-SSR-cnl35f: AAGTGAGCACAACGACACGA58.09888.89
r:CGATCCAAAGAAGCAAAGATG
EST-SSR-cnl37f:CTGCCTCGCGTGAAAGATA59.010990.00
r:CCTCCTCGATCTGGATGGT
EST-SSR-cnl42f:GTTGGTCTGCTGCTCACTCG59.09888.89
r:CCGACGATGTTGAAGGAGAG
EST-SSR-cnl47f: GACTCGCACGATTTCTCCTC57.09888.89
r:GCCAGACAACCAATTCAGGT
EST-SSR-cnl51f:CTAGGGTTTCCCACCTCTCA59.08675.00
r:AATGTCCTTGGCGTTGCT
EST-SSR-cnl55f:GCTGATAGCGAGGTGGGTAG58.0141178.57
r:CTGCCGGTTGATCTTGTTCT
EST-SSR-cnl61f:CACGAGTGCAGAGCTAGACG60.05480.00
r:ACAACAACCCGACTGCTACC
EST-SSR-cnl86f:CAACAACGTCAACGCCTTC59.011872.73
r:GCGTCTTGAACCTCTTGTCC
EST-SSR-cnl100f:CGTCGTCCTCTGCTGTGAG58.05480.00
r:AGGTCGTCCATCTGCTGCT
EST-SSR-cnl115f:CGAGAAGAAGGTGGTGTCGT59.07685.71
r:AGGTCGTGGAAGGTCTTGG
EST-SSR-cnl119f:ATCGTCTCCTCCTCCTCCA57.066100.00
r:ATGCCTCGGTGGACTGGTA
EST-SSR-cnl130f:AAATGTTGAGCAACGGGAGCT59.07685.71
r:ACTTCATAGGGCGGAGGTCT
EST-SSR-cnl144f:AGAAGGCGGCTCAGAAGAAG58.01010100.00
r:GCTCCAACTCAGAATCAACAA
EST-SSR-cnl147f:GGCTAGGGTTTCGACTCCTC60.09777.78
r:AGATGGCGAACTCGACCTG
EST-SSR-cnl158f:CTCATCCCACCACCACCAC59.099100.00
r:CCCTGAAGAAGTCGAACACG
Total79370889.28
The ISSR, SCoT, and EST-SSR primers used in this study and amplification results These three marker systems (ISSR, SCoT, and EST-SSR) have been used for cultivar identification and genetic variation assessment in many plant species [36-39]. In this study, these markers were successfully used to differentiate switchgrass accessions. A total of 51 primer pairs were used and 793 bands were produced, with a mean of 15.5 bands per primer, among which 89.28 % were polymorphic. Our results suggested that ISSR, SCoT, and EST-SSR analyses could contribute to the detection of genetic variation. In addition, Nei’s (1973) gene diversity index (H) and Shannon’s information index (I) was 0.311 and 0.471, respectively, and the similarity coefficient, ranging from 0.162 to 0.857 with an average of 0.510 was similar to other studies on switchgrass, in which the similarity coefficients were estimated to be between 0.45 to 0.81 [40] or 0.53 to 0.78 [41]. This indicates that switchgrass has abundant genetic variation and is a highly heterogenous species [42]. The AMOVA of the distance matrix for the genotypes permitted a partitioning of the overall variation into two levels: between upland and lowland ecotypes and within a population. The results revealed genetic differentiation between upland and lowland ecotypes (P < 0.001), with 31.42 % of genetic variation between ecotypes and 68.58 % of genetic variation within ecotypes. Similar results were obtained in other switchgrass germplasm collections [40, 43, 44] and in other perennial, and cross-pollinated plants [45].

Marker efficiency analysis

In this study, we extracted genomic DNA from an individual so that we were able to obtain complete genetic information including allele numbers, gene frequency and observed heterozygosity for marker efficiency analysis. A parameter marker index (MI) was used to compare the efficiencies of the three assays in the collection of 134 switchgrass genotypes (Table 2). There was almost no disparity between the average band informativeness (Ibav) indice for ISSRs, SCoTs, and EST-SSRs, which were 0.38, 0.43, and 0.36, respectively. However, the effective multiplex ratio (EMR) index for ScoT (20.10) was twice as high as that of the ISSRs (12.25) and three times as high as that of the EST-SSRs (7.33). The MI calculation indicated an efficient and distinctive nature of the SCoTs with the MI for these markers (8.64) higher than the other two assays examined here (4.66 for ISSRs and 2.64 for EST-SSRs).
Table 2

Comparison of usefulness between ISSR, SCoT, and EST-SSR markers for 134 switchgrass accessions

ItemsISSRSCoTEST-SSR
No. of primers162015
No. of total bands220445128
No. of average bands per primers13.7522.258.53
Percentage of polymorphic bands (PPB)0.890.900.86
Average band informativeness (Ibav)0.380.430.36
Effective multiplex ratio (EMR)12.2520.107.33
Marker index (MI)4.668.642.64
Comparison of usefulness between ISSR, SCoT, and EST-SSR markers for 134 switchgrass accessions A parameter MI, has been widely used to evaluate the overall utility of each marker system [46]. The high MI in the SCoTs results from its high EMR, making these markers appropriate for fingerprinting [47] or evaluating genetic variation in breeding populations [48, 49]. In addition, the SCoTs performed well in other species. Compared with ISSR and inter-retrotransposon amplified polymorphism (IRAP), SCoT markers were more informative than IRAP and ISSR for the assessment of diversity among Persian oak (Quercus brantii Lindl.) individuals [50]. Results from the evaluation on the genetic variation of mango (Mangifera indica L.) cultivars indicated that the SCoT analysis represents actual relationships better than the ISSR analysis [51].

Population structure analysis

After removing low frequency bands (considering MAF ≤ 0.05), we analyzed the data from 51 pairs of ISSR, SCoT, and EST-SSR primers to understand the population structure of the entire switchgrass collection based on a Bayesian clustering approach using STRUCTURE [52]. The number of subpopulations (K) was identified based on maximum likelihood and ΔK values. For the 134 switchgrass genotypes the maximum ΔK was observed at K = 2 (Fig. 1), with genotypes falling into two subpopulations. Using a membership probability threshold of 0.75, 76 genotypes were assigned to subpopulation 1 (G1), out of which, 69 genotypes belonged to upland ecotypes, and the remaining 7 were lowland. Subpopulation 2 (G2) contained 42 genotypes, and all of them were upland ecotypes. The remaining 16 genotypes were classified into an admixed group as they had membership probabilities lower than 0.75 in any given subpopulation. With the maximum membership probability, 91 accessions were assigned to G1 and 43 accessions to G2 (Fig. 2).
Fig. 1

STRUCTURE analysis of the number of populations for K. The number of subpopulations (K) was identified based on maximum likelihood and ΔK values. The most likely value of K identified by STRUCTURE was observed at K = 2

Fig. 2

Two subgroups inferred from STRUCTURE analysis. The vertical coordinate of each subgroup means the membership coefficients for each accessions; the digits of the horizontal coordinate represent the 134 switchgrass accessions corresponding to Table 3; Red zone: G1, Green zone: G2

STRUCTURE analysis of the number of populations for K. The number of subpopulations (K) was identified based on maximum likelihood and ΔK values. The most likely value of K identified by STRUCTURE was observed at K = 2 Two subgroups inferred from STRUCTURE analysis. The vertical coordinate of each subgroup means the membership coefficients for each accessions; the digits of the horizontal coordinate represent the 134 switchgrass accessions corresponding to Table 3; Red zone: G1, Green zone: G2
Table 3

The 134 switchgrass samples used for marker (ISSR, SCoT, and EST-SSR) genotyping

CodePlant IDPlant nameEcotypeOriginCodePlant IDPlant nameEcotypeOrigin
1PI315723BN-8358-62LLNorth Carolina, US68PI64224470SG 057ULNorth Dakota, US
2PI414065BN-14668-65LLArkansas, US69PI64224570SG 058ULNorth Dakota, US
3PI421521KANLOWLLKansas, US70PI64224770SG 060ULNorth Dakota, US
4PI421999AM-314/MS-155LLKansas, US71PI64224870SG 061ULNorth Dakota, US
5PI422006ALAMOLLTexas, US72PI64224970SG 062ULNorth Dakota, US
6PI607837TEM-SLC 01LLTexas, US73PI64225070SG 063ULNorth Dakota, US
7PI607838TEM-SLC 02LLTexas, US74PI64225170SG 064ULNorth Dakota, US
8PI315724BN-10860-61ULKansas, US75PI64225270SG 065ULNorth Dakota, US
9PI315727BN-11357-63ULNorth Carolina, US76PI64225470SG 067ULNorth Dakota, US
10PI414066GRENVILLEULNew Mexico, US77PI64225670SG 069ULNorth Dakota, US
11PI414067BN-8624-67ULNorth Carolina, US78PI64225770SG 071ULNorth Dakota, US
12PI414068BN-18758-67ULKansas, US79PI64225970SG 073ULNorth Dakota, US
13PI421138CarthageULNorth Carolina, US80PI64226070SG 074ULNorth Dakota, US
14PI421520BlackwellULOklahoma,US81PI64226170SG 075ULNorth Dakota, US
15PI421901MIAMIULFlorida, US82PI64226270SG 076ULNorth Dakota, US
16PI422001STUARTULFlorida, US83PI64226470SG 078ULNorth Dakota, US
17PI422003PMT-785ULFlorida, US84PI64226570SG 079ULNorth Dakota, US
18PI422016-ULFlorida, US85PI64226770SG 081ULNorth Dakota, US
19PI431575KY1625ULKentucky, US86PI64226870SG 082ULNorth Dakota, US
20PI442535156ULBelgium87PI64226971SG 001ULNorth Dakota, US
21PI469228Cave-in-RockULIllinois, US88PI64227071SG 002ULNorth Dakota, US
22PI476290T2086ULNorth Carolina, US89PI64227171SG 004ULNorth Dakota, US
23PI476291T2099ULMaryland, US90PI64227271SG 005ULNorth Dakota, US
24PI414069BN-309-69ULNew York, US91PI64227571SG 008ULNorth Dakota, US
25PI414070BN-12323-69ULKansas, US92PI64227671SG 009ULNorth Dakota, US
26PI476292T2100ULArkansas, US93PI64227771SG 010ULNorth Dakota, US
27PI476293T2101ULNew Jersey, US94PI64227871SG 011ULNorth Dakota, US
28PI476294T4613ULColorado, US95PI64227971SG 012ULNorth Dakota, US
29PI476295T4614ULColorado, US96PI64228071SG 013ULNorth Dakota, US
30PI476296T16971ULMaryland, US97PI64228171SG 014ULNorth Dakota, US
31PI476297CaddoULOklahoma,US98PI64228271SG 015ULNorth Dakota, US
32PI477003Ncbraska 28ULNebraska, US99PI64228371SG 016ULNorth Dakota, US
33PI478002T6011ULNorth Dakota, US100PI64228471SG 017ULNorth Dakota, US
34PI537588DACOTAHULOregon, US101PI64228571SG 018ULNorth Dakota, US
35PI549094TRAILBLAZERULNebraska, US102PI64228671SG 019ULNorth Dakota, US
36PI591824SHAWNEEULNebraska, US103PI64228771SG 020ULNorth Dakota, US
37PI598136SUNBURSTULSouth Dakota, US104PI64228871SG 021ULNorth Dakota, US
38PI642190FALCONULNew Mexico, US105PI64228971SG 022ULNorth Dakota, US
39PI642191SUMMERULSouth Dakota, US106PI64229071SG 023ULNorth Dakota, US
40PI642192PATHFINDERULNebraska, US107PI64229171SG 024ULNorth Dakota, US
41PI64219570SG 003ULNorth Dakota, US108PI64229271SG 025ULNorth Dakota, US
42PI64219670SG 004ULNorth Dakota, US109PI64229371SG 026ULNorth Dakota, US
43PI64219770SG 005ULNorth Dakota, US110PI64229471SG 027ULNorth Dakota, US
44PI64219870SG 006ULNorth Dakota, US111PI64229571SG 028ULNorth Dakota, US
45PI64219970SG 007ULNorth Dakota, US112PI64229671SG 029ULNorth Dakota, US
46PI64220070SG 008ULNorth Dakota, US113PI64229771SG 030ULNorth Dakota, US
47PI64220170SG 010ULNorth Dakota, US114PI64229871SG 031ULNorth Dakota, US
48PI64220370SG 012ULNorth Dakota, US115PI64229971SG 032ULNorth Dakota, US
49PI64220470SG 013ULNorth Dakota, US116PI64230171SG 034ULNorth Dakota, US
50PI64220770SG 016ULNorth Dakota, US117PI64230271SG 035ULNorth Dakota, US
51PI64220870SG 017ULNorth Dakota, US118PI64230371SG 036ULNorth Dakota, US
52PI64220970SG 018ULNorth Dakota, US119PI64230471SG 037ULNorth Dakota, US
53PI64221070SG 019ULNorth Dakota, US120PI64230571SG 038ULNorth Dakota, US
54PI64221270SG 021ULNorth Dakota, US121PI64230671SG 039ULNorth Dakota, US
55PI64221370SG 022ULNorth Dakota, US122PI64230771SG 040ULNorth Dakota, US
56PI64221470SG 023ULNorth Dakota, US123PI64230971SG 041BULNorth Dakota, US
57PI64221770SG 026ULNorth Dakota, US124PI64231071SG 042ULNorth Dakota, US
58PI64221870SG 028ULNorth Dakota, US125PI64231171SG 043ULNorth Dakota, US
59PI64222970SG 041ULNorth Dakota, US126PI64231271SG 044ULNorth Dakota, US
60PI64223270SG 044ULNorth Dakota, US127PI64836670SG 053ULNorth Dakota, US
61PI64223370SG 045ULNorth Dakota, US128PI64836770SG 070ULNorth Dakota, US
62PI64223470SG 046ULNorth Dakota, US129PI657660Central lowa GermplasmULMissouri, US
63PI64223570SG 047ULNorth Dakota, US130PI657661BlackwellULKansas, US
64PI64223670SG 048ULNorth Dakota, US131PI657662NEBRASKA28ULNebraska, US
65PI64223770SG 049ULNorth Dakota, US132PI657663BlackwellULKansas, US
66PI64224270SG 055ULNorth Dakota, US133PI657664GRENVILLEULNew Mexico, US
67PI64224370SG 056ULNorth Dakota, US134PI6593459086103ULNew York, US

Note: “UL” refers to upland ecotype switchgrass, “LL” refers to lowland ecotype switchgrass

The UPGMA cluster analysis from 51 markers generated a dendrogram, demonstrating that the 134 genotypes could be clearly divided into two groups (Fig. 3). The dendrogram clustered all of the lowland ecotypes (LL) into the first. The second group contained all of the upland ecotypes (UL). Other methods have also been used to cluster upland and lowland switchgrass ecotypes. Missaoui et al adopted restriction fragment length polymorphism (RFLP) markers to analyze the genetic relationships among 21 switchgrass genotypes, resulting in three upland and eighteen lowland genotypes clusteringinto two different groups [53]. Huang et al identified differences between the coding sequences of a nuclear gene encoding plastid acetyl-CoA carboxylase in upland and lowland ecotypes genetic variation analysis at gene level, provided by Huang et al researching about a nuclear gene encoding plastid acetyl-CoA carboxylase [54]. In this study, we preliminarily presented population structure analysis of 7 lowland and 127 upland genotypes using 51 ISSR, SCoT, and EST-SSR primer pairs, resulting in an apparently separate cluster among the two ecotypes, confirming the genetic differences between upland and lowland ecotypes. However, as we do not have as many lowland switchgrass samples as upland, we highly recommend more lowland ecotype or other nuclear markers should be used in conjunction with ISSR, SCoT and EST-SSR to more appropriately classify upland and lowland ecotypes.
Fig. 3

Radiation of genetic relationships for 134 switchgrass accessions based on UPGMA. G1 and G2 are the two subgroups identified by STRUCTURE with the maximum membership probability. The numbers at the branches are confidence values based on Felsenstein’s bootstrap produced by FreeTree software, as a general rule, the higher bootstrap value for a given interior branch indicates a closer relationship

Radiation of genetic relationships for 134 switchgrass accessions based on UPGMA. G1 and G2 are the two subgroups identified by STRUCTURE with the maximum membership probability. The numbers at the branches are confidence values based on Felsenstein’s bootstrap produced by FreeTree software, as a general rule, the higher bootstrap value for a given interior branch indicates a closer relationship Based on modified Rogers distances (MRD), PCA separated the 134 genotypes into two major groups, which was consistent with assignments generated by STRUCTURE and the UPGMA dendrogram (Fig. 4). Seven genotypes formed group 1 (Fig. 4, upper right), and the other 127 genotypes, belonging to group 2, were mainly distributed at the lower portion of the plot. The accessions belonging to G1 inferred by the STRUCTURE analysis were all distributed on the right portion of the resulting plot, while G2 was distributed on the left portion of the plot. The distribution of G1 accessions was less tightly clustered than G2, indicating accessions in G1 had higher diversity than G2 (Fig. 4).
Fig. 4

Principal coordinate analysis of 134 switchgrass accessions based on ISSRs, SCoTs, and EST-SSRs. G1 and G2 are the two subgroups identified by STRUCTURE with the maximum membership probability

Principal coordinate analysis of 134 switchgrass accessions based on ISSRs, SCoTs, and EST-SSRs. G1 and G2 are the two subgroups identified by STRUCTURE with the maximum membership probability Before analyzing LD and association mapping, the analysis of population structure emphasizes the need for the genetic analysis of different ecotypes [28]. The UPGMA cluster and PCA analysis demonstrated that 134 genotypes could be clearly divided into two groups (Figs. 1 and 4), and the lowland and upland germplasm clusters were almost completely separated, which was consistent with the results of several other switchgrass studies [41, 55, 56]. For the UPGMA cluster analysis, the first group only included lowland ecotypes, while the second group contained upland ecotypes and could be further classified into two subgroups. Subgroup 1 (G1) contained 83 genotypes, while the remaining 43 belonged to subgroup 2 (G2). The 46 accessions of the 70SG series and 42 accessions of the 71SG series dispersed into these two subgroups are from the same geographical distribution of North Dakota, United States. This indicates that most of the germplasm sub-clustered in accordance with different regions [43, 55], and the assignment of 132 accessions (98.51 % of the total) by the UPGMA cluster analysis was consistent with their classification using PCA (Fig. 4). Unexpectedly, in the STRUCTURE analysis, the 127 upland genotypes were assigned to two subpopulations, possibly because the UPGMA and STRUCTURE programs calculate parameters in different ways. Clusters are generated in STRUCTURE based on both transitory Hardy–Weinberg disequilibrium and LD caused by admixture between populations [55], while the UPGMA dendrogram generates clusters based on the genetic distance among populations [57].

Linkage disequilibrium estimation

After the deletion of low frequency alleles (MAF ≤ 5 %), the 51 ISSRs, SCoTs, and EST-SSRs with unknown chromosome information were used to evaluate the extent of LD among the switchgrass samples. In the collection, interallelic r2 values, the association between any pair of alleles from different loci, were calculated and ranged from 0.000 to 1.000 with an average r2 of 0.035. Across all 51 loci, 247,456 locus pairs were detected in the 134 switchgrass samples. Among all of the locus pairs, 7107 of 135,718 (5.24 %) showed LD at the P < 0.001 level for G1 and 5415 locus pairs (3.99 %) were found at r2 > 0.1 at P < 0.001. For G2, 84,154 locus pairs were detected, 4833 were significant pairs (P < 0.001, 5.74 %), while 4235 locus pairs (5.03 % of 84,154) were found at r2 > 0.1 at P < 0.001. The mean r2 for all materials was 0.480 (P < 0.001), and the LD in G2 (0.668, ranging from 0.068 to 1.000) was significantly (P < 0.001) larger than that in G1 (0.291, ranging from 0.066 to 1.000) (P < 0.01). Populations with high levels of outcrossing have relatively low LD [58]. Among outcrossing maize (Zea mays L.), Remington et al. [59] found lower levels of LD among 47 SSR loci (9.7 % of SSR pairs performing LD at P < 0.01), compared to LD data from an SSR survey of inbred lines of maize, which showed high levels of LD [60]. For switchgrass, LD data comparisons showed a trend towards higher LD in G2 (mean r2 = 0.668) including 42 genotypes all belonging to upland ecotypes, compared with G1 (mean r2 = 0.291), which contained 76 genotypes, including 7 lowland ecotypes.

Method

Plant material

A total of 134 switchgrass genotypes, representing most of the natural geographical distribution areas of switchgrass supplied by the Plant Genetic Resources Conservation Unit, Griffin, Georgia USA were used in this study. These included 7 lowland genotypes originating from 5 US states and 127 upland genotypes originating from Belgium and 15 US states (Table 3). The full accession data and information on switchgrass germplasm comes from ARS GRIN (http://www.ars-grin.gov/). The 134 genotypes, including one seedling from each accession, were grown and maintained in the experimental farm of the Sichuan Agricultural University during the 2012 growing season. The 134 switchgrass samples used for marker (ISSR, SCoT, and EST-SSR) genotyping Note: “UL” refers to upland ecotype switchgrass, “LL” refers to lowland ecotype switchgrass

DNA extraction and marker genotyping

Genomic DNA was extracted from tender leaves of each individual using a modified cetyltrimethylammonium bromide (CTAB) method [61]. ISSR [designed by the University of British Columbia (UBC set No. 9)], EST-SSR [62], and SCoT primer [45] sequences were aligned to the Panicum reference genome using the bl2seq blast program in NCBI (www.ncbi.nlm.nih.gov/BLAST/), which was designed to eliminate redundancies. Initially, four germplasms were used to screen marker primers [PI421999 (AM-314/MS-155), PI422006 (Alamo), PI642190 (Falcon), and PI642207 (70SG 016)]. The selected primers were synthesized by the Shanghai Sangon Biological Engineering Technology and Service Company (Shanghai, China) to genotype the collection. ISSR-PCR was carried out according to Li et al [63] as follows: the total reaction volume was 15 μL and contained 20 ng template DNA, approximately 1.0 μM primer, 7.5 μL Mix (10 × PCR buffer, Mg2+, dNTPs; Tiangen Biotech, Beijing, China), and 1 U Taq polymerase. Amplifications were performed in a BioRad iCycle PCR machine (BIO-RAD Certified) under the following conditions: 95 °C for 5 min, followed by 35 cycles of the following: 95 °C for 45 s, 52–55 °C for 45 s, and 72 °C for 90 s. A final extension was conducted at 72 °C for 7 min. All PCR bands were visualized on 1 % polyacrylamide gel electrophoresis in 1 × TBE buffer. Silver staining was used to visualize the bands. The SCoT-PCR amplification reaction was conducted in a total volume of 15 μL according to Collard and Mackill [5], and containing 10 ng template DNA, 0.8 mM primers, 1.2 mM MgCl2, 0.4 mM dNTPs, and 1 U Taq DNA polymerase (Tiangen Biotech, Beijing, China). PCR amplification had an initial denaturation step of 5 min at 95 °C, followed by 45 s at 95 °C, 45 s at 55 °C, 1.5 min at 72 °C for 30 cycles, and 7 min at 72 °C. PCR products were visualized following agarose gel (1.5 %) electrophoresis at 120Vfor 1.5 h in 1 × TBE buffer, followed by staining with GelRed (Tiangen Biotech, Beijing, China). The EST-SSR PCR consisted of a denaturation for 5 min at 94 °C then 35 cycles of 30 s at 94 °C, 30 s at 53–55 °C, and 2 min at 72 °C, with a final extension of 5 min at 72 °C [62] and products were visualized as described above.

Genetic variation and marker efficiency analysis

For each marker, polymorphic alleles were scored as “1” for presence and “0” for absence at the same mobility, and this data was used to construct an original data matrix. Using Excel 2007 and POPGENE v.1.32 [64], corresponding diversity parameters were estimated including: total number of bands (TNB), number of polymorphic bands (NPB), percentage of polymorphic bands (PPB), Nei’s (1973) gene diversity index (H), and Shannon’s information index (I). AMOVA v.1.55 was employed to reveal genetic variation among groups and within a population [65]. The data input to POPGENE and AMOVA was produced by DCFA v.1.1 [66]. The comparative efficiency of ISSRs, SCoTs, and EST-SSRs in these 134 switchgrass genotypes was assessed with MI. MI is the product of the EMR and the Ibav for the polymorphic markers [67]. EMR is explained as the average number of polymorphic bands [68]. Ibav is defined as: pi is the proportion of the i-th amplification site, n represents the total number of amplification site. The model-based program STRUCTURE v.2.3.4 (http://pritchardlab.stanford.edu/structure.html) [69] was applied to assess the population structure of the 134 switchgrass genotypes with 51 ISSRs, SCoTs, and EST-SSRs. The number of subpopulations (K) was set from 1 to 10 based on admixture models and correlated band frequencies. With 5 × 105 Markov Chain Monte Carlo replications carried out for each run after a burn-in period of 106 iterations, 20 independent runs were performed per K. When there was a clear maximum value for posterior probability [LnP(D)] output in STRUCTURE, a K value was selected in the range of 1 to 10 subpopulations. The most probable K value was the ΔK, an ad hoc quantity related to the rate of change in LnP(D) between successive K inferred by STRUCTURE [70]. The replication of K showing the maximum likelihood was applied to subdivide the genotypes into different groups with membership probabilities ≥ 0.75. Genotypes with less than 0.75 membership probabilities were assigned to an admixed group. Bar charts from the STRUCTURE data were displayed using Distruct 1.1 [71]. A dendrogram was drawn using FreeTree and TreeView programs (http://web.natur.cuni.cz/flegr/freetree.php) [72] based on Nei-Li genetic similarity coefficient with unweighted pair group method average (UPGMA) clustering. To reveal relationships among the 134 switchgrass genotypes, a figure of two-dimensional scatterplots representing all of the genotypes was obtained for principal coordinate analysis (PCA) using NTsys-pc v.2.1 [73]. All of the switchgrass individuals were analyzed to calculate MRD [74]. The resulting genetic distance matrices were double-centered and used to obtain eigenvectors by the modules DCENTER and EIGEN using NTsys-pc.

Evaluation of linkage disequilibrium

The significance of pairwise LD was evaluated using squared band-frequency correlations (r2) between all combinations of marker loci using the package TASSEL version 2.1 (http://www.maizegenetics.net/bioinformatics) [75]. Rare bands with a band frequency of less than 5 % were removed to avoid biased evaluations of LD because of their large variances. Other pairs of bands were evaluated with a minor band frequency of at least 5 % (MAF ≥ 0.05) with the GDA 1.1 program [76].

Conclusions

The results of this study showed a great level of genetic variation among switchgrass germplasm. The switchgrass accessions were clearly divided into two groups containing upland and lowland ecotypes. For the first time, we revealed the extent of LD and population structure in switchgrass. The implications of these results in terms of utilizing association mapping for genes or QTL discovery in switchgrass were discussed. For further association mapping using a collection of switchgrass samples, we highly recommend the inclusion of more lowland ecotypes or the use of other nuclear markers in conjunction with ISSR, SCoT and EST-SSR.
  37 in total

Review 1.  Directed molecular evolution in plant improvement.

Authors:  M Lassner; J Bedbrook
Journal:  Curr Opin Plant Biol       Date:  2001-04       Impact factor: 7.834

2.  Structure of linkage disequilibrium and phenotypic associations in the maize genome.

Authors:  D L Remington; J M Thornsberry; Y Matsuoka; L M Wilson; S R Whitt; J Doebley; S Kresovich; M M Goodman; E S Buckler
Journal:  Proc Natl Acad Sci U S A       Date:  2001-09-18       Impact factor: 11.205

Review 3.  Structure of linkage disequilibrium in plants.

Authors:  Sherry A Flint-Garcia; Jeffry M Thornsberry; Edward S Buckler
Journal:  Annu Rev Plant Biol       Date:  2003       Impact factor: 26.379

4.  TASSEL: software for association mapping of complex traits in diverse samples.

Authors:  Peter J Bradbury; Zhiwu Zhang; Dallas E Kroon; Terry M Casstevens; Yogesh Ramdoss; Edward S Buckler
Journal:  Bioinformatics       Date:  2007-06-22       Impact factor: 6.937

5.  Genetic diversity in tetraploid switchgrass revealed by AFLP marker polymorphisms.

Authors:  J Todd; Y Q Wu; Z Wang; T Samuels
Journal:  Genet Mol Res       Date:  2011-11-29

6.  Start codon targeted polymorphism for evaluation of functional genetic variation and relationships in cultivated peanut (Arachis hypogaea L.) genotypes.

Authors:  Faqian Xiong; Ruichun Zhong; Zhuqiang Han; Jing Jiang; Liangqiong He; Weijian Zhuang; Ronghua Tang
Journal:  Mol Biol Rep       Date:  2010-11-23       Impact factor: 2.316

7.  Hierarchical classification of switchgrass genotypes using SSR and chloroplast sequences: ecotypes, ploidies, gene pools, and cultivars.

Authors:  J E Zalapa; D L Price; S M Kaeppler; C M Tobias; M Okada; M D Casler
Journal:  Theor Appl Genet       Date:  2010-11-23       Impact factor: 5.699

8.  Comparative assessment of DNA fingerprinting techniques (RAPD, ISSR and AFLP) for genetic analysis of cashew (Anacardium occidentale L.) accessions of India.

Authors:  S Archak; A B Gaikwad; D Gautam; E V V B Rao; K R M Swamy; J L Karihaloo
Journal:  Genome       Date:  2003-06       Impact factor: 2.166

9.  Population structure and linkage disequilibrium unravelled in tetraploid potato.

Authors:  Björn B D'hoop; M João Paulo; Krissana Kowitwanich; Mariëlle Sengers; Richard G F Visser; Herman J van Eck; Fred A van Eeuwijk
Journal:  Theor Appl Genet       Date:  2010-06-20       Impact factor: 5.699

10.  Natural allelic diversity, genetic structure and linkage disequilibrium pattern in wild chickpea.

Authors:  Maneesha S Saxena; Deepak Bajaj; Alice Kujur; Shouvik Das; Saurabh Badoni; Vinod Kumar; Mohar Singh; Kailash C Bansal; Akhilesh K Tyagi; Swarup K Parida
Journal:  PLoS One       Date:  2014-09-15       Impact factor: 3.240

View more
  5 in total

1.  Assessment of genetic diversity in Lepidium sativum L. using inter simple sequence repeat (ISSR) marker.

Authors:  Vinay Kumar; Hemant Kumar Yadav
Journal:  Physiol Mol Biol Plants       Date:  2018-11-20

2.  Assessment of genetic diversity in Vigna unguiculata L. (Walp) accessions using inter-simple sequence repeat (ISSR) and start codon targeted (SCoT) polymorphic markers.

Authors:  David Okeh Igwe; Celestine Azubike Afiukwa; Benjamin Ewa Ubi; Kenneth Idika Ogbu; Omena Bernard Ojuederie; George Nkem Ude
Journal:  BMC Genet       Date:  2017-11-17       Impact factor: 2.797

3.  Genetic diversity and population structure of Miscanthus lutarioriparius, an endemic plant of China.

Authors:  Sai Yang; Shuai Xue; Weiwei Kang; Zhuxi Qian; Zili Yi
Journal:  PLoS One       Date:  2019-02-01       Impact factor: 3.240

4.  Genetic diversity and population structure analysis of Ghanaian and exotic cassava accessions using simple sequence repeat (SSR) markers.

Authors:  Joseph Adjebeng-Danquah; Joseph Manu-Aduening; Isaac Kwadwo Asante; Richard Yaw Agyare; Vernon Gracen; Samuel Kwame Offei
Journal:  Heliyon       Date:  2020-01-31

5.  Full-length transcriptome sequencing analysis and development of EST-SSR markers for the endangered species Populus wulianensis.

Authors:  Qichao Wu; Fengqi Zang; Xiaoman Xie; Yan Ma; Yongqi Zheng; Dekui Zang
Journal:  Sci Rep       Date:  2020-10-01       Impact factor: 4.379

  5 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.