Association mapping of seed oil and protein content in Sesamum indicum L. using SSR markers.

Chun Li1, Hongmei Miao1, Libin Wei1, Tide Zhang1, Xiuhua Han1, Haiyang Zhang1.   

Abstract

Sesame is an important oil crop because of its high oil content and quality. Seed oil and protein contents are two important traits in sesame. To identify molecular markers associated with seed oil and protein contents, we performed association mapping among 369 worldwide germplasm accessions under 5 environments using 112 polymorphic SSR markers. The general linear model (GLM) was applied with the criteria of −logP ≥ 3.0 and high stability under all 5 environments. Among the 369 sesame accessions, the oil content ranged from 27.89% to 58.73% and the protein content ranged from 16.72% to 27.79%. A significant negative correlation between oil content and protein content was found in the population. A total of 19 markers for oil content were detected, with R² values ranging from 4% to 29%; 24 markers for protein content were detected, with R² values ranging from 3% to 29%, of which 19 markers were associated with both traits. Moreover, some of the markers were confirmed using the mixed linear model (MLM) method, which suggested that the oil and protein contents are controlled mostly by major genes. Allele effect analysis showed that the allele associated with high oil content was always associated with low protein content, and vice versa. Of the 19 markers associated with oil content, 17 were located near plant lipid pathway genes and 2 were located immediately next to a fatty acid elongation gene and a gene encoding Stearoyl-ACP Desaturase, respectively. The findings provide a valuable foundation for oil synthesis gene identification and molecular marker-assisted selection (MAS) breeding in sesame.

Year:  2014        PMID: 25153139      PMCID: PMC4143287          DOI: 10.1371/journal.pone.0105757

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Sesame (Sesamum indicum L.) is an ancient and important oilseed crop cultivated mainly in the tropical and subtropical regions of Asia, Africa and South America. The world harvested area of sesame reaches 7.3×10⁷ hm², and the total production was roughly 3.7×10⁷ tons per year from 2001 to 2010 (FAO data). Compared with other main oil crops, e.g., soybean (18% average oil content) [1], oilseed rape (41%) [2], sunflower (40–44%) [3] and peanut (51%) [4], sesame is among the oil crops with the highest oil content and quality. Sesame seeds contain 55–58% oil and almost 18% protein. Among the fatty acids in sesame seeds, oleic acid (18∶1) (39.6%) and linoleic acid (18∶2) (46.0%) are the two main components, with an ideal ratio of almost 1∶1 [5], [6]. Apart from seed yield, the content of seed storage oil and protein is the key agronomic trait in sesame breeding [7]. In the past two decades, in order to explain the high quality of sesame oil and protein, many researchers have focused on exploring seed development and the synthesis of fatty acids and storage proteins, as well as identifying lipid synthesis related genes and molecular markers in sesame [8]–[12]. Of the three available cDNA libraries, two were constructed for seed development analysis [13], [14]. However, oil and protein contents are complex quantitative traits and are affected by both genotype and environment [15]. At present, the mechanism underlying the high oil content and quality of sesame seeds is still unclear. No loci for oil and protein content traits have been found in the sesame linkage maps. Even though Wei et al. [12] performed an association analysis of seed oil and protein content and fatty acid composition within 216 Chinese sesame accessions using 79 molecular primer pairs (including SSRs, SRAPs and AFLPs), only one associated marker (M15E10-3) was identified under two environments.
Therefore, in order to precisely detect the genes or markers associated with oil and protein content traits and to improve sesame breeding, more efficient markers and germplasm resources with larger phenotypic variation need to be applied [16], [17]. Currently, linkage analysis (QTL mapping) and association mapping are the two main tools for dissecting complex phenotypic variation. Compared with traditional linkage analysis based on mapping populations, association mapping offers higher precision for locating QTLs and selecting molecular markers [16], [18]. To date, association mapping has been extensively used for analyzing important agronomic and quantitative traits in wheat, maize, cotton, oilseed rape and other crops [18]–[22]. In the past several years, a large number of highly polymorphic simple sequence repeat (SSR), or microsatellite, markers have been developed in sesame [7], [23], [24]. Accordingly, association mapping is becoming a reliable and powerful approach for detecting the genes or markers associated with key traits and for improving molecular marker-assisted selection (MAS) in sesame breeding programs. The objectives of this study are: (1) to perform association mapping of seed oil content (OC) and protein content (PC) traits in worldwide sesame accessions using the GLM and MLM models, (2) to characterize sesame oil and protein contents under various environments, and (3) to determine the key SSR markers associated with seed quality. In this report, a natural population covering 369 worldwide accessions from China and 15 other countries and 112 pairs of polymorphic SSR markers were applied. The results provide a foundation for investigating seed development-related genes and seed quality in sesame.

Results

Seed oil and protein content variation in the natural population

Both the seed oil content (OC) and the protein content (PC) are often influenced by environment. To reduce the environmental effect, we collected phenotypic data for the 369 sesame accessions under 5 environments: the Pingyu and Yuanyang locations in 2011, and the Pingyu, Yuanyang and Xinyang locations in 2012. The descriptive parameters under each environment were calculated (Table 1). The results showed that the OC and PC varied significantly among the 369 accessions. In total, the OC of the natural population ranged from 27.89% to 58.73%, with environment means of 49.59%–53.14%; meanwhile, the PC ranged from 16.72% to 27.79%, with environment means of 20.28%–22.51%. All the datasets showed a normal or nearly normal distribution. To determine the heritability of the phenotypes, we performed a variance analysis of oil and protein contents (Table 2). Results indicated that the OC and PC traits were significantly influenced by genotype and environment (i.e., year and location). No significant interactions between variety and environmental factors (year and location) were detected. Moreover, the OC and PC traits showed a significant negative correlation under the 5 environments: the correlation coefficient (r) between OC and PC ranged from −0.66 to −0.72 (P<0.01) in 2011 and from −0.52 to −0.74 (P<0.01) in 2012 (data not shown).
Table 1

Phenotypic variation of the seed oil and protein contents in the 369 accessions under 5 environments.

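Trait | Year | Place | Min. (%) | Max. (%) | Mean (%) | Std. dev. | Skewness | Kurtosis
OC | 2011 | Pingyu | 27.89 | 55.03 | 49.59 | 4.61 | −2.19 | 1.69
OC | 2011 | Yuanyang | 34.28 | 57.53 | 51.94 | 3.72 | −2.26 | 2.85
OC | 2012 | Pingyu | 31.95 | 54.80 | 49.82 | 3.43 | −2.54 | 5.13
OC | 2012 | Yuanyang | 30.77 | 57.88 | 52.23 | 4.22 | −2.53 | 3.98
OC | 2012 | Xinyang | 32.74 | 58.73 | 53.14 | 4.59 | −2.32 | 2.67
PC | 2011 | Pingyu | 18.74 | 26.49 | 21.65 | 1.30 | 0.90 | 1.28
PC | 2011 | Yuanyang | 17.60 | 25.10 | 20.28 | 1.16 | 0.97 | 1.55
PC | 2012 | Pingyu | 20.22 | 27.79 | 22.51 | 1.04 | 0.78 | 1.68
PC | 2012 | Yuanyang | 16.72 | 27.44 | 20.94 | 1.25 | 1.05 | 5.27
PC | 2012 | Xinyang | 17.24 | 26.84 | 20.59 | 1.57 | 0.85 | 1.30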
Table 2

Analysis of Variance of the oil and protein contents in the population under 5 environments.

Source of variation | DF | OC MS | OC F value | PC MS | PC F value
Year | 1 | 27.33 | 7.46** | 211.90 | 412.83**
Location | 2 | 1564.58 | 427.02** | 551.56 | 1074.55**
Variety | 368 | 62.78 | 17.13** | 5.23 | 10.19**
Year×Variety | 368 | 3.19 | 0.87 | 0.55 | 1.07
Location×Variety | 736 | 2.94 | 0.80 | 0.65 | 1.27
Residual | 367 | 3.66 | | 0.51 |

** The significance at P<0.01.

DF denotes degree of freedom; MS denotes mean square.
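
As a rough consistency check on Table 2, the reported F values can be compared with the ratio of each mean square to the residual mean square. The sketch below assumes that this ratio is how the F values were obtained (the text does not state the error term used), and the dictionary and function names are illustrative only. Small gaps are expected because the table reports rounded values.

```python
# Mean squares (MS) and reported F values copied from Table 2.
# Assumption: each F value is the effect MS divided by the residual MS.
RESIDUAL_MS = {"OC": 3.66, "PC": 0.51}

TABLE2 = {
    # source of variation: {trait: (MS, reported F value)}
    "Year": {"OC": (27.33, 7.46), "PC": (211.90, 412.83)},
    "Location": {"OC": (1564.58, 427.02), "PC": (551.56, 1074.55)},
    "Variety": {"OC": (62.78, 17.13), "PC": (5.23, 10.19)},
    "Year x Variety": {"OC": (3.19, 0.87), "PC": (0.55, 1.07)},
    "Location x Variety": {"OC": (2.94, 0.80), "PC": (0.65, 1.27)},
}


def f_ratio(ms_effect, ms_residual):
    """F statistic as the ratio of an effect MS to the residual MS."""
    return ms_effect / ms_residual


def max_relative_gap():
    """Largest relative difference between recomputed and reported F."""
    gaps = []
    for traits in TABLE2.values():
        for trait, (ms, f_reported) in traits.items():
            f_calc = f_ratio(ms, RESIDUAL_MS[trait])
            gaps.append(abs(f_calc - f_reported) / f_reported)
    return max(gaps)


if __name__ == "__main__":
    print(f"largest relative gap: {max_relative_gap():.4f}")
```

Under this assumption, every recomputed F value agrees with the tabulated one to within about 1%.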

Linkage disequilibrium

Linkage disequilibrium (LD) refers to the non-random association of alleles at different genetic loci. A total of 112 SSR markers were used for estimating the LD level among the 369 sesame germplasm accessions (Table S2). These SSR markers were distributed in 33 contigs/scaffolds with a total length of 180.86 Mb, which represented approximately 67% of the assembled genome size (270 Mb) and 50% of the estimated genome size (360 Mb). The average SSR density was 1 SSR per 1.6 Mb. To assess the associations between the polymorphic loci of the 112 SSR markers, LD P-values were determined for the 6,216 locus pairs (i.e., 112×(112−1)/2) using Fisher's exact test and two indexes, D′ and r² (Figure 1). The average values of D′ and r² for the 6,216 pairs were 0.1649 and 0.0173, respectively. Of the 6,216 pairs, 2,584 pairs (41.57%) showed significant linkage disequilibrium (P<0.01), 363 pairs (5.84%) showed a higher linkage disequilibrium of D′>0.5, and 33 pairs (0.5%) gave a D′ value of 1.0 (i.e., complete linkage). The data indicated that linkage disequilibrium existed among the sesame accessions.
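
For readers unfamiliar with the two LD indexes, the following minimal sketch computes D, D′ and r² for a pair of loci and reproduces the pair count given above. It assumes the simplest biallelic case with known haplotype and allele frequencies strictly between 0 and 1; SSR loci are often multi-allelic, so the software used in the study presumably applies a multi-allelic extension. The function names are illustrative and not taken from the paper.

```python
def n_locus_pairs(n_loci):
    """Number of unordered locus pairs among n_loci markers."""
    return n_loci * (n_loci - 1) // 2


def ld_stats(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B (each in (0, 1))
    """
    d = p_ab - p_a * p_b
    # D' scales |D| by its maximum possible value given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


if __name__ == "__main__":
    print(n_locus_pairs(112))                 # pairs among 112 markers
    print(ld_stats(0.5, 0.5, 0.5))            # alleles always co-occur
    print(ld_stats(0.25, 0.5, 0.5))           # independent loci
    print(round(100 * 2584 / 6216, 2))        # share of significant pairs (%)
```

With the first example input the two alleles always occur together, so both D′ and r² equal 1; with the second the loci are independent and both indexes are 0.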
Figure 1

Disequilibrium matrix of the 112 polymorphic SSR sites, plotted on both the X-axis and the Y-axis.

The matrix is divided into two regions by the diagonal line. The upper right region indicates the D′ value of each SSR pair, and the corresponding blocks in the lower left region indicate the significance of D′. Value intervals of D′ and P are shown in different colors according to the color bars on the right.

SSR marker diversity and population structure

Before the association analysis, the polymorphism of the 112 SSR markers within the 369 germplasm accessions was investigated (Table S2). Results showed that the number of alleles ranged from 2 to 5, with an average of 2.47 per locus. The PIC values of the markers ranged from 0.028 (Hs373) to 0.669 (Y1994), with an average value of 0.302. The percentage of heterozygotes per marker varied from 0.27% (Hs373) to 23.12% (Hs4325), with an average of 10.14%. The data indicated that the natural population had high heterozygosity and was suitable for association mapping analysis. Subsequently, we estimated the population structure, as admixture of subpopulations could result in LD and produce false-positive results. The most likely number (K) of subgroups in the 369 germplasm accessions was estimated using the 112 SSR markers (Figure 2). As K increased from 1 to 10, the value of LnP(D) increased steadily; meanwhile, ΔK declined sharply after a clear peak value of 643.5 at K = 2. The results indicated that the population was roughly composed of two divergent subgroups according to the Bayesian posterior probability analysis (Figure 3). Of the 369 accessions, 126 accessions were grouped into subgroup 1 (the green ones in Figure 3) and 243 accessions into subgroup 2 (the red ones in Figure 3). Most (47 out of 51) of the foreign lines were located in subgroup 2. The two subgroups in the population were considered to have a complex ‘admixture’ relationship, and no significant correlation between geographic origin and subgroup was found among the Chinese lines.
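
The PIC (polymorphism information content) values above are not defined in the text. A minimal sketch, assuming the commonly used definition PIC = 1 − Σ p_i² − Σ_{i<j} 2 p_i² p_j², where p_i is the frequency of allele i at a locus, is given below; the function name and the example frequencies are illustrative, not data from the study.

```python
def pic(freqs):
    """Polymorphism information content of one locus.

    freqs: allele frequencies at the locus (should sum to 1).
    PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2
    """
    sum_sq = sum(p * p for p in freqs)
    cross = 0.0
    for i in range(len(freqs)):
        for j in range(i + 1, len(freqs)):
            cross += 2 * freqs[i] ** 2 * freqs[j] ** 2
    return 1 - sum_sq - cross


if __name__ == "__main__":
    print(pic([0.5, 0.5]))     # two equally frequent alleles
    print(pic([0.95, 0.05]))   # one rare allele gives a low PIC
```

Under this definition a biallelic marker cannot exceed a PIC of 0.375, so a value as high as 0.669 implies at least three alleles at the locus, in line with the 2 to 5 alleles reported per marker.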
Figure 2

Estimated (a) LnP(D) and (b) ΔK values for a given K.

Figure 3

Genetic composition of individuals based on Bayesian posterior probability.

According to the Bayesian posterior probability, the natural population of 369 worldwide sesame lines is divided into two groups in green and red colors, respectively.

Marker-trait associations

In this study, the association analysis was performed using the general linear model (GLM) method. The stringent criterion of −logP≥3.0 under all 5 environments was used for determining the significance of the association between the markers and the OC and PC traits. For the OC trait, 19 markers were detected and the R² values ranged from 4% to 29% (Table 3 & Table S2). According to the primer locations in the sesame genome, the 19 markers were mapped to 11 contigs/scaffolds. Of the 19 markers, Hs485 and Hs586 are located in scaffold00001, Hs4381, Hs02 and Hs4082 are located in C01, and Hs4061, Hs345 and Hs1956 are in C04. The distance between markers located in the same contig/scaffold was shorter than 2 Mb (Table S2). For the PC trait, 24 markers were detected, with R² values ranging from 3% to 29%, and mapped to 15 contigs/scaffolds (Table 4 & Table S2). Comparison indicated that 19 of these 24 markers were also associated with the OC trait; the other 5 markers, Hs425 (in C08), Hs560 (in C11), Hs4265 (in C12), and Hs4089 and Hs1514 (in C19), were unique to the PC trait. In addition, to assess stability, we performed association mapping using the mixed linear model (MLM) method, with the criterion of −logP≥3.0 under at least 3 environments.
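
The selection rule can be sketched as a simple filter over per-environment P values. The sketch reads the criterion as −log10(P) ≥ 3.0 in every environment, i.e., P ≤ 0.001; the base-10 logarithm is an assumption, since the text writes only "logP". The first two rows are P values copied from Table 3, while `hypothetical_marker` is invented for illustration and is not part of the study.

```python
import math

THRESHOLD = 3.0  # cut-off on -log10(P); base 10 is assumed here


def passes_all_environments(p_values, threshold=THRESHOLD):
    """True if -log10(P) reaches the threshold in every environment."""
    return all(-math.log10(p) >= threshold for p in p_values)


# P values for 2011 Pingyu, 2011 Yuanyang, 2012 Pingyu, 2012 Yuanyang, 2012 Xinyang.
markers = {
    "Hs345": [3.05e-25, 4.37e-19, 3.41e-10, 3.11e-19, 4.08e-26],    # Table 3
    "Hs4082": [3.36e-04, 6.08e-04, 2.69e-08, 3.44e-07, 3.33e-09],   # Table 3
    "hypothetical_marker": [1.0e-06, 5.0e-03, 2.0e-05, 1.0e-04, 3.0e-07],
}

stable = [name for name, p in markers.items() if passes_all_environments(p)]

if __name__ == "__main__":
    print(stable)
```

The hypothetical marker is rejected because one of its P values (5.0e-03) falls short of the threshold, even though the other four environments pass.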
Table 3

Association mapping of the OC trait using GLM method under the 5 environments.

Marker | 2011 Pingyu (F marker, P marker, R²) | 2011 Yuanyang (F marker, P marker, R²) | 2012 Pingyu (F marker, P marker, R²) | 2012 Yuanyang (F marker, P marker, R²) | 2012 Xinyang (F marker, P marker, R²)
Hs345 | 35.93, 3.05E-25, 0.28 | 26.22, 4.37E-19, 0.22 | 13.41, 3.41E-10, 0.13 | 26.44, 3.11E-19, 0.22 | 37.34, 4.08E-26, 0.29
Hs4381 | 61.72, 8.30E-24, 0.25 | 43.17, 1.48E-17, 0.19 | 40.31, 1.53E-16, 0.18 | 46.18, 1.32E-18, 0.20 | 75.46, 3.73E-28, 0.29
Hs1956 | 47.51, 7.46E-19, 0.22 | 39.98, 2.83E-16, 0.19 | 16.15, 2.05E-07, 0.09 | 24.86, 8.80E-11, 0.13 | 40.25, 2.29E-16, 0.19
Hs485 | 31.50, 4.06E-18, 0.20 | 20.63, 2.29E-12, 0.14 | 21.10, 1.27E-12, 0.15 | 21.48, 7.82E-13, 0.15 | 36.07, 1.98E-20, 0.23
Hs1036 | 41.73, 6.75E-17, 0.20 | 25.69, 4.19E-11, 0.13 | 17.87, 4.24E-08, 0.10 | 26.00, 3.19E-11, 0.13 | 43.84, 1.24E-17, 0.20
Hs205 | 17.18, 3.11E-15, 0.19 | 10.41, 2.42E-09, 0.12 | 10.37, 2.68E-09, 0.13 | 12.46, 3.65E-11, 0.15 | 16.29, 1.75E-14, 0.18
Hs4209 | 14.91, 2.67E-13, 0.17 | 8.15, 2.64E-07, 0.10 | 9.28, 2.55E-08, 0.12 | 12.37, 4.41E-11, 0.14 | 23.92, 9.21E-21, 0.25
Hs672 | 16.81, 1.19E-12, 0.16 | 11.35, 1.09E-08, 0.11 | 16.82, 1.16E-12, 0.16 | 20.27, 4.38E-15, 0.18 | 26.53, 2.58E-19, 0.22
Hs393 | 17.90, 2.01E-13, 0.16 | 10.10, 9.18E-08, 0.10 | 12.19, 2.61E-09, 0.12 | 16.74, 1.32E-12, 0.15 | 29.18, 4.76E-21, 0.24
Hs4061 | 30.93, 3.93E-13, 0.14 | 19.32, 1.06E-08, 0.09 | 21.42, 1.60E-09, 0.11 | 23.28, 3.05E-10, 0.11 | 30.72, 4.66E-13, 0.14
Y2129 | 22.99, 4.73E-10, 0.13 | 13.89, 1.64E-06, 0.08 | 15.29, 4.57E-07, 0.09 | 24.86, 9.25E-11, 0.13 | 30.65, 6.69E-13, 0.16
Hs618 | 25.22, 5.54E-11, 0.12 | 12.90, 3.86E-06, 0.06 | 7.54, 6.19E-04, 0.04 | 17.66, 4.78E-08, 0.09 | 22.91, 4.26E-10, 0.11
Hs377 | 4.14, 2.01E-05, 0.11 | 3.16, 6.87E-04, 0.08 | 3.92, 4.49E-05, 0.10 | 6.26, 7.83E-09, 0.15 | 7.09, 3.63E-10, 0.17
Hs635 | 22.15, 8.38E-10, 0.11 | 17.23, 7.11E-08, 0.08 | 10.34, 4.30E-05, 0.05 | 18.79, 1.71E-08, 0.09 | 31.73, 1.98E-13, 0.15
Hs02 | 20.98, 2.38E-09, 0.10 | 17.19, 7.34E-08, 0.08 | 10.31, 4.43E-05, 0.05 | 12.41, 6.09E-06, 0.06 | 36.20, 4.55E-15, 0.16
Hs376 | 19.25, 1.15E-08, 0.10 | 11.49, 1.45E-05, 0.06 | 12.53, 5.52E-06, 0.07 | 16.27, 1.72E-07, 0.08 | 21.26, 1.89E-09, 0.10
Hs1638 | 17.45, 6.23E-08, 0.09 | 7.29, 7.96E-04, 0.04 | 18.27, 2.97E-08, 0.10 | 19.25, 1.23E-08, 0.10 | 24.86, 8.66E-11, 0.13
Hs586 | 16.84, 1.01E-07, 0.08 | 13.48, 2.25E-06, 0.07 | 7.30, 7.82E-04, 0.04 | 10.90, 2.52E-05, 0.06 | 20.14, 5.06E-09, 0.10
Hs4082 | 8.18, 3.36E-04, 0.04 | 7.56, 6.08E-04, 0.04 | 18.30, 2.69E-08, 0.09 | 15.51, 3.44E-07, 0.08 | 20.61, 3.33E-09, 0.10

The markers associated with the OC trait under 5 environments are listed in the table.

Table 4

Association mapping of PC trait using GLM method under the 5 environments.

Marker | 2011 Pingyu (F marker, P marker, R²) | 2011 Yuanyang (F marker, P marker, R²) | 2012 Pingyu (F marker, P marker, R²) | 2012 Yuanyang (F marker, P marker, R²) | 2012 Xinyang (F marker, P marker, R²)
Hs345 | 25.46, 1.41E-18, 0.21 | 21.52, 6.24E-16, 0.19 | 13.41, 3.41E-10, 0.13 | 26.44, 3.11E-19, 0.22 | 37.34, 4.08E-26, 0.29
Hs4381 | 50.30, 5.14E-20, 0.21 | 33.86, 3.25E-14, 0.15 | 40.31, 1.53E-16, 0.18 | 46.18, 1.32E-18, 0.20 | 75.46, 3.73E-28, 0.29
Hs485 | 26.93, 9.51E-16, 0.18 | 21.07, 1.32E-12, 0.14 | 21.10, 1.27E-12, 0.15 | 21.48, 7.82E-13, 0.15 | 36.07, 1.98E-20, 0.23
Hs205 | 16.43, 1.33E-14, 0.18 | 7.90, 4.52E-07, 0.10 | 10.37, 2.68E-09, 0.13 | 12.46, 3.65E-11, 0.15 | 16.29, 1.75E-14, 0.18
Hs1036 | 31.71, 2.48E-13, 0.16 | 17.49, 5.98E-08, 0.09 | 17.87, 4.24E-08, 0.10 | 26.00, 3.19E-11, 0.13 | 43.84, 1.24E-17, 0.20
Hs393 | 16.01, 4.43E-12, 0.15 | 13.34, 3.73E-10, 0.12 | 12.19, 2.61E-09, 0.12 | 16.74, 1.32E-12, 0.15 | 29.18, 4.76E-21, 0.24
Hs672 | 17.02, 8.43E-13, 0.15 | 11.63, 6.73E-09, 0.11 | 16.82, 1.16E-12, 0.16 | 20.27, 4.38E-15, 0.18 | 26.53, 2.58E-19, 0.22
Hs1956 | 26.91, 1.50E-11, 0.14 | 22.88, 4.93E-10, 0.12 | 16.15, 2.05E-07, 0.09 | 24.86, 8.80E-11, 0.13 | 40.25, 2.29E-16, 0.19
Hs377 | 5.73, 5.69E-08, 0.14 | 4.46, 6.21E-06, 0.11 | 3.92, 4.49E-05, 0.10 | 6.26, 7.83E-09, 0.15 | 7.09, 3.63E-10, 0.17
Y2129 | 26.65, 1.99E-11, 0.14 | 16.79, 1.17E-07, 0.09 | 15.29, 4.57E-07, 0.09 | 24.86, 9.25E-11, 0.13 | 30.65, 6.69E-13, 0.16
Hs4209 | 11.18, 5.03E-10, 0.13 | 7.07, 2.56E-06, 0.09 | 9.28, 2.55E-08, 0.12 | 12.37, 4.41E-11, 0.14 | 23.92, 9.21E-21, 0.25
Hs4061 | 25.55, 4.14E-11, 0.12 | 17.49, 5.59E-08, 0.08 | 21.42, 1.60E-09, 0.11 | 23.28, 3.05E-10, 0.11 | 30.72, 4.66E-13, 0.14
Hs1638 | 20.72, 3.30E-09, 0.11 | 14.13, 1.29E-06, 0.08 | 18.27, 2.97E-08, 0.10 | 19.25, 1.23E-08, 0.10 | 24.86, 8.66E-11, 0.13
Hs618 | 22.58, 5.71E-10, 0.11 | 12.16, 7.75E-06, 0.06 | 7.54, 6.19E-04, 0.04 | 17.66, 4.78E-08, 0.09 | 22.91, 4.26E-10, 0.11
Hs1514 | 19.02, 1.53E-08, 0.10 | 15.50, 3.69E-07, 0.08 | 6.23, 2.20E-03, 0.04 | 14.59, 8.48E-07, 0.08 | 14.77, 7.19E-07, 0.08
Hs4089 | 19.68, 7.67E-09, 0.09 | 18.49, 2.25E-08, 0.09 | 6.44, 1.80E-03, 0.03 | 17.68, 4.71E-08, 0.09 | 13.84, 1.61E-06, 0.07
Hs425 | 17.07, 8.21E-08, 0.08 | 8.38, 2.75E-04, 0.04 | 7.90, 4.39E-04, 0.04 | 17.02, 8.55E-08, 0.08 | 18.86, 1.61E-08, 0.09
Hs560 | 4.53, 1.96E-04, 0.07 | 4.98, 6.48E-05, 0.08 | 4.69, 1.33E-04, 0.07 | 6.43, 1.91E-06, 0.10 | 7.36, 1.97E-07, 0.11
Hs635 | 13.54, 2.12E-06, 0.07 | 13.28, 2.72E-06, 0.07 | 10.34, 4.30E-05, 0.05 | 18.79, 1.71E-08, 0.09 | 31.73, 1.98E-13, 0.15
Hs02 | 14.67, 7.44E-07, 0.07 | 10.16, 5.08E-05, 0.05 | 10.31, 4.43E-05, 0.05 | 12.41, 6.09E-06, 0.06 | 36.20, 4.55E-15, 0.16
Hs4265 | 5.17, 1.34E-04, 0.06 | 6.58, 7.07E-06, 0.08 | 6.70, 5.50E-06, 0.08 | 8.54, 1.16E-07, 0.10 | 11.90, 1.13E-10, 0.14
Hs376 | 11.29, 1.76E-05, 0.06 | 11.04, 2.22E-05, 0.06 | 12.53, 5.52E-06, 0.07 | 16.27, 1.72E-07, 0.08 | 21.26, 1.89E-09, 0.10
Hs586 | 11.07, 2.15E-05, 0.06 | 11.38, 1.60E-05, 0.06 | 7.30, 7.82E-04, 0.04 | 10.90, 2.52E-05, 0.06 | 20.14, 5.06E-09, 0.10
Hs4082 | 12.36, 6.41E-06, 0.06 | 8.72, 2.00E-04, 0.04 | 18.30, 2.69E-08, 0.09 | 15.51, 3.44E-07, 0.08 | 20.61, 3.33E-09, 0.10

The markers associated with the PC trait under 5 environments are listed in the table.

For the OC trait, 9 markers were detected in the MLM model, and 8 of them (Hs345, Hs4381, Hs485, Hs1036, Hs4061, Hs635, Hs376 and Hs586) were confirmed by both the GLM and MLM methods. In particular, 4 markers (Hs345, Hs4381, Hs485 and Hs1036) had high R² values (≥10%) under all 5 environments (Table 3 and Table S3). For the PC trait, 9 markers were found and confirmed in both models, of which 7 markers had high R² values (≥10%) under all 5 environments (Table 4 and Table S3). These results suggested that the OC and PC traits are controlled mostly by major genes in sesame.

Marker effect on the phenotypic variation

To explore the association between the above markers and phenotypic variation, and their potential utility in sesame MAS breeding programs, we estimated the allelic effects of five markers associated with both traits (Table 5). For each marker, the estimated effects were consistent with the allele character under the 5 environments. Hs345 had the largest effect on the variation of seed oil and protein content. The Hs345-1∶1 and Hs345-2∶2 alleles showed different effects on the OC trait, as the average oil content in the corresponding accessions ranged from 52.05% to 42.82%. In the genotypes carrying the Hs4381-1∶1 allele, oil and protein contents were 43.89% and 23.25%, respectively, while the samples with Hs485-2∶2 contained 52.00% oil and 21.00% protein. Furthermore, a specific allele had a largely negative or positive effect on the OC or PC trait. For example, the allele effect of Hs345-2∶2 on the OC trait ranged from −8.13 to −12.18, whereas its effect on the PC trait ranged from 2.05 to 3.82. Comparison suggested that the alleles of all 5 markers had opposite effects on the OC and PC traits. An allele that increased the oil content consistently had a negative effect on protein content, and vice versa. Therefore, these markers could be used for screening sesame lines with either high oil or high protein content.
Table 5

Allele effects of 5 markers simultaneously associated with OC and PC traits under 5 environments.

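Marker | Allele a | 2011 Pingyu (OC, PC) | 2011 Yuanyang (OC, PC) | 2012 Pingyu (OC, PC) | 2012 Yuanyang (OC, PC) | 2012 Xinyang (OC, PC)
Hs345 | 1∶1 | −0.04, −0.17 | −0.59, 0.25 | −1.17, 0.44 | −1.61, 0.57 | −0.93, 0.39
Hs345 | 2∶2 | −10.74, 2.30 | −8.13, 2.12 | −8.87, 2.05 | −12.18, 2.96 | −11.73, 3.82
Hs345 | 1∶2 | −5.53, 1.38 | −4.59, 1.67 | −5.13, 1.29 | −5.84, 2.08 | −6.66, 2.58
Hs4061 | 1∶1 | 0.00, 0.00 | 0.00, 0.00 | 0.00, 0.00 | 0.00, 0.00 | 0.00, 0.00
Hs4061 | 2∶2 | 8.78, −2.41 | 4.46, −1.95 | 6.37, −1.84 | 8.54, −2.33 | 7.94, −2.96
Hs4061 | 1∶2 | 2.58, −0.98 | −0.18, −1.06 | 1.78, −0.80 | 2.58, −1.14 | 1.60, −0.86
Hs1956 | 1∶1 | 1.12, −0.36 | 1.01, −0.49 | 0.85, −0.32 | 1.18, −0.39 | 1.67, −0.81
Hs1956 | 2∶2 | −5.89, 1.21 | −4.13, 0.79 | −4.03, 0.63 | −5.03, 0.95 | −5.15, 1.34
Hs1956 | 1∶2 | −0.74, 0.15 | 0.11, −0.28 | −0.86, 0.26 | −0.60, 0.49 | −0.85, 0.02
Hs4381 | 1∶1 | 0.00, 0.00 | 0.00, 0.00 | 0.00, 0.00 | 0.00, 0.00 | 0.00, 0.00
Hs4381 | 2∶2 | 9.75, −2.45 | 6.33, −1.73 | 6.97, −1.82 | 9.06, −2.17 | 9.80, −3.41
Hs4381 | 1∶2 | 5.02, −1.05 | 2.34, −0.49 | 3.16, −0.90 | 5.12, −0.75 | 4.58, −1.51
Hs485 | 1∶1 | −4.80, 0.47 | −3.23, 0.25 | −3.87, 0.51 | −6.88, 1.62 | −7.19, 2.24
Hs485 | 2∶2 | 2.04, −1.28 | 0.82, −1.21 | 1.71, −0.96 | −0.02, 0.02 | 0.20, −0.29
Hs485 | 1∶2 | −3.99, 0.33 | −3.85, −0.03 | −2.59, −0.12 | −4.64, 1.38 | −5.76, 1.74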

Note: a two of the most common alleles are listed.
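
The opposite direction of the OC and PC effects can be checked directly from Table 5. The sketch below does this for two alleles with large effects, Hs345-2∶2 and Hs4381-2∶2, using values copied from the table; the helper names are illustrative, and the check is not claimed to be the authors' procedure.

```python
# Allele effects as (OC, PC) pairs for 2011 Pingyu, 2011 Yuanyang,
# 2012 Pingyu, 2012 Yuanyang and 2012 Xinyang, copied from Table 5.
EFFECTS = {
    "Hs345-2:2": [(-10.74, 2.30), (-8.13, 2.12), (-8.87, 2.05),
                  (-12.18, 2.96), (-11.73, 3.82)],
    "Hs4381-2:2": [(9.75, -2.45), (6.33, -1.73), (6.97, -1.82),
                   (9.06, -2.17), (9.80, -3.41)],
}


def opposite_in_all_environments(pairs):
    """True if the OC and PC effects have opposite signs in every environment."""
    return all(oc * pc < 0 for oc, pc in pairs)


def effect_range(pairs, index):
    """(min, max) of the OC effects (index 0) or PC effects (index 1)."""
    values = [pair[index] for pair in pairs]
    return min(values), max(values)


if __name__ == "__main__":
    for allele, pairs in EFFECTS.items():
        print(allele, opposite_in_all_environments(pairs),
              effect_range(pairs, 0), effect_range(pairs, 1))
```

For Hs345-2∶2 the computed ranges match those quoted in the text: −12.18 to −8.13 for OC and 2.05 to 3.82 for PC.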

Comparative genome analysis

To clarify the distribution of the associated markers and obtain more information about them, we performed a comparative genome analysis of the 19 SSR markers associated with the OC trait (Table 6). All genes closest to the markers were explored. Of the 19 markers, 11 are located in gene regions and 8 are in intergenic regions. The flanking genes had various functions, such as ligase (C01.560), transcription factor (C04.26, C13.438) and kinase (C14.22). Moreover, we found that the markers Hs4082 and Hs345 were located immediately next to C01.883 (ABC transporter G family member 3 gene) and C04.786 (Stearoyl-ACP Desaturase gene), respectively, both of which have been shown to be involved in plant lipid biosynthesis. We also screened the sequences within 500 kb upstream and downstream of each marker. Of the 19 markers, 17 were found to be close to plant lipid pathway genes. These results further support our association mapping conclusions.
Table 6

Information of candidate genes related to the markers associated with OC traits.

Marker | Related genes a (annotation) | Nearby lipid genes b: locus, Arabidopsis thaliana homologous gene c, annotation, Ref.
Hs4381 | C01.460 (Thioredoxin M3) | C01.526, At1G55260, Lipid Transfer Protein type 5 [25]
Hs02 | C01.560 (E3 ubiquitin-protein ligase RHF2A); C01.561 (Unknown) | C01.548, At3G25110, Acyl-ACP Thioesterase A [26]; C01.575, At3G07400, Lipid Acylhydrolase-like; C01.601, At2G30720, Acyl-CoA Thioesterase [27]
Hs4082 | C01.882 (Cell cycle protein); C01.883 (ABC transporter G family member 3) | C01.873, At1G53390, ABC Transporter [28]; C01.883, At2G28070, ABC Transporter [28]; C01.928, At3G44830, Diacylglycerol Acyltransferase [29]
Hs635 | C02.696 (RAS-related protein Rab11A) | C02.739, At1G47620, Midchain Alkane Hydroxylase [30]
Hs4209 | C04.25 (Transmembrane domain containing protein); C04.26 (TCP transcription factor 12) | C04.38, At1G31770, ABC Transporter [31]; C04.56, At1G71010, Phosphatidylinositol-Phosphate Kinase [32]; C04.81, At1G10900, Phosphatidylinositol-Phosphate Kinase [32]; C04.96, At1G77660, Phosphatidylinositol-Phosphate Kinase [32]
Hs4061 | C04.681 (Inositol trisphosphate 5-phosphatase 1) | —
Hs345 | C04.785 (Unknown); C04.786 (Stearoyl-ACP desaturase) | C04.767, At1G05630, Phosphoinositide 5-Phosphatase; C04.786, At2G43710, Stearoyl-ACP Desaturase [33]
Hs1956 | C04.845 (FAR1-related sequence) | The same as Hs345
Hs393 | C13.438 (bZIP transcription factor 40) | C13.388, At1G10900, Phosphatidylinositol-Phosphate Kinase [32]; C13.471, At5G10160, Hydroxyacyl-ACP Dehydrase [34]; C13.504, At1G17840, ABC Transporter [35]; C13.514, At4G36480, Subunit of Serine Palmitoyltransferase [36]
Hs205 | C14.21 (Thylakoid membrane phosphoprotein); C14.22 (Serine/threonine-protein kinase WNK4) | C14.49, At4G33550, Lipid Transfer Protein type 3 [37]; C14.66, At1G49430, Long-Chain Acyl-CoA Synthetase [38]; C14.111, At2G46210, Sphingobase-D8 Desaturase [39]; C14.132, At2G46090, Diacylglycerol Kinase
Hs672 | C14.367 (Cyclin-P3.1 F-box protein); C14.368 (Unknown) | C14.359, At2G45150, CDP-DAG Synthase [40]; C14.413, At4G33355, Lipid Transfer Protein type 1; C14.428, At2G44810, Acylhydrolase (DAD1-like)
Y2129 | C15.825 (Unknown) | C15.92, At2G29980, Linoleate Desaturase [41]
Hs1638 | C15.840 (Unknown) | The same as Y2129
Hs618 | C25.41 (IAA-alanine resistance protein 1); C25.42 (Beta-D-xylosidase) | C25.69, At5G08415, Lipoate Synthase [42]
Hs376 | C26.474 (ADP-ribosylation factor) | C26.417, At1G15110, Base-Exchange-type Phosphatidylserine Synthase [43]; C26.454, At1G71010, Phosphatidylinositol-Phosphate Kinase; C26.515, At1G31812, Acyl CoA Binding Protein [44]
Hs377 | C26.475 (Unknown) | The same as Hs376
Hs485 | sf00001.885 (Unknown); sf00001.886 (Unknown) | sf00001.95, At4G04930, Dihydrosphingosine Delta-4 Desaturase [45]
Hs586 | sf00001.754 (ADP-ribosylation factor) | —
Hs1036 | sf00044.12 (ABC transporter I family member 15) | sf00044.12, At3G20320, Acid-Binding Protein [46]; sf00044.41, At2G25170, Chromatin remodeling factor [47]; sf00044.61, At4G19860, Acyl acceptor Acyltransferase; sf00044.90, At1G13210, Translocase; sf00044.113, At1G74320, Choline Kinase [48]

Note: a Related genes are the genes containing or close to the screened markers.

b Nearby lipid genes refer to the genes located within 500 kb upstream and downstream of the marker.

c Only one of the homologous genes is listed in the table.

— refers to no known lipid genes in the location.

Discussion

To clarify the genetic mechanisms of fatty acid and protein synthesis in sesame seeds, we performed association mapping of the OC and PC traits among 369 sesame accessions using 112 genic-SSR markers. These accessions were collected from 19 provinces of China and 15 other countries, included many released Chinese and foreign cultivars, and represented the genetic diversity of sesame germplasm for an association mapping study. As an alternative to traditional linkage analysis (QTL mapping), association analysis based on linkage disequilibrium (LD) has been applied to detect and locate quantitative trait loci (QTLs) in many crops, with the GLM and MLM models applied individually for evaluating marker associations. In this article, 19 SSR markers associated with the OC trait were detected under all 5 environments in the GLM model, while 24 markers were found to be associated with the protein content.

Sesame genetic diversity and the population structure

Sesame is a diploid, self-pollinated oil crop with a karyotype of 2n = 2x = 26. As all cultivars originate from the sole cultivated species, Sesamum indicum L., regional sesame resources have a largely narrow genetic diversity [24], [49]–[51]. In addition, many reports indicate that there is no clear association between genotype and geographical origin, as many sesame accessions from different geographic locations are clustered into the same group in the dendrogram [51]–[53]. Apart from the natural history of predomesticated ancestors, the diversity pattern of domesticated species can be influenced by breeding practice, germplasm collection and human activity [53]–[55]. In this article, in order to guarantee broad genetic variation, we selected the 369 worldwide accessions for genetic analysis of seed nutrition according to geographical origin, molecular clustering and morphological diversity (Table S1). The heterozygosity ranged from 0.27% (Hs373) to 23.12% (Hs4325). The result indicated that the natural population could be used as core germplasm for association mapping (Table S2). Population structure analysis showed that many sesame accessions collected from the same geographic locations were not grouped together, which further supported the unclear association between genotype and geographical origin in sesame germplasm resources [51]–[53]. When performing association mapping in a population, LD patterns between the functional loci and markers should be analyzed first [56]. We analyzed the P-values of linkage disequilibrium between the polymorphisms of the 112 SSR marker loci using Fisher's exact test (Figure 1). As 363 (5.84%) pairwise comparisons had high LD (D′>0.5), linkage disequilibrium existed in the 369 sesame accessions. Accordingly, we believe that the natural population is suitable for association analysis of the oil and protein contents.

Oil and protein content variation and associated SSR markers

Among large-scale sesame germplasm resources, the oil and protein contents vary significantly. Yermanos et al. [57] evaluated 721 sesame samples collected from more than 19 countries, and found that the oil content varied from 40.4% to 59.8% with a mean of 53.1%. The protein content ranged from 19% to 31% with an average of about 25% [58]. In our population, the oil content varied from 27.89% to 58.73% with an average of 51.34%, while the protein content varied from 16.72% to 27.79% with an average of 21.19% (Table 1). The data reflected the great variation of seed composition in sesame germplasm [57], [59], [60]. Comparison analysis showed a strong negative correlation between oil content and protein content. Interestingly, this tight relationship was also seen in the association analysis. As shown in the GLM analysis in Tables 3 and 5, all 19 markers associated with OC were also detected for the PC trait, and the alleles exhibited opposite effects on OC and PC. These phenomena are also present in other oil crops, such as oilseed rape, cotton, soybean and peanut [1], [61]–[63]. Zhao et al. [62] found 6 QTLs with pleiotropic effects on both oil content and protein content in oilseed rape. In a cotton backcross inbred population, of 17 QTLs for oil content and 20 QTLs for protein content, 8 QTLs co-localized in the same chromosome regions and controlled oil and protein contents with opposite additive effects [63]. Hwang et al. (2014) detected 25 SNPs associated with seed oil in 13 different genomic regions through a genome-wide association study (GWAS), and 7 SNPs were significantly associated with both protein and oil traits. For six of the seven marker loci, a negative relationship existed between the effect on protein and that on oil [64]. Meanwhile, QTLs or markers associated with increases in both protein and oil contents were also found. For example, Zhao et al. (2006) found by conditional QTL mapping that 2 QTLs controlling oil content were independent of protein content [65]; Hwang et al. (2014) found that a SNP at the 4.92 Mb position on Chr 9 was associated with increased protein and oil contents [64]. In many crops, the seed oil content seems to be controlled mostly by major genes [66]–[69]. In this study, we detected 19 markers associated with the OC trait in sesame using the GLM method, and the R² values ranged from 4.0% to 29.0%. Chen et al. (2010) screened 27 QTLs related to oil content in oilseed rape, each explaining a high proportion of the variation (4.20–30.20%) [66]. In Arabidopsis thaliana, a single QTL or marker could explain 17–19% of the variation in seed oil content [68]. In soybean, the proportion explained reached 14.3–45.6% [69]. In this report, 112 polymorphic SSR markers were used for association mapping. Compared with other common molecular markers, such as SRAP and AFLP, SSR markers are more suitable for sesame diversity analysis because of the narrow genetic basis [24], [49]–[51]. The marker distribution and density were analyzed using the sesame genome assembly data. The contigs/scaffolds carrying the 112 markers covered approximately 180 Mb of the sesame genome, corresponding to ∼67% of the assembly size (270 Mb) and 50% of the estimated genome size (∼360 Mb, of which 90 Mb was believed to be repeat sequences) (Table S2) [70], [71]. Therefore, the association mapping using 112 markers is meaningful and reliable, even though some QTLs might have been missed. To detect more QTLs, new SSR or SNP markers could be applied in further research.
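
The overall averages and genome-coverage percentages quoted in this section can be reproduced from figures given elsewhere in the article. The sketch below assumes that the overall averages are unweighted means of the five environment means in Table 1, which matches a pooled mean only if every environment contributes the same number of observations; the variable and function names are illustrative.

```python
# Environment means (%) copied from Table 1, in the order
# 2011 Pingyu, 2011 Yuanyang, 2012 Pingyu, 2012 Yuanyang, 2012 Xinyang.
OC_MEANS = [49.59, 51.94, 49.82, 52.23, 53.14]
PC_MEANS = [21.65, 20.28, 22.51, 20.94, 20.59]

# Genome sizes in Mb as stated in the text.
COVERED_MB = 180.86
ASSEMBLY_MB = 270
ESTIMATED_MB = 360


def mean(values):
    """Unweighted arithmetic mean."""
    return sum(values) / len(values)


def percent(part, whole):
    """Share of `whole` represented by `part`, in percent."""
    return 100 * part / whole


if __name__ == "__main__":
    print(round(mean(OC_MEANS), 2), round(mean(PC_MEANS), 2))
    print(round(percent(COVERED_MB, ASSEMBLY_MB)),
          round(percent(COVERED_MB, ESTIMATED_MB)))
```

Under that assumption the sketch gives 51.34% and 21.19% for the overall oil and protein means, and about 67% and 50% for the coverage of the assembled and estimated genome, in agreement with the text.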

Candidate genes and oil components

To clarify the locations of the associated markers and obtain more genome information for the OC and PC traits, we screened the genes close to the associated markers using the sesame genome assembly data. As a result, 36 candidate genes related to the lipid pathway were identified (Table 6) [25]–[48]. Most candidate genes were involved in three pathways: (1) fatty acid and TAG (triacylglycerol) synthesis and elongation, e.g., C01.526, C01.548 and C01.928; (2) TAG degradation, e.g., C01.601; and (3) fatty acid dehydrogenation, e.g., C04.786 (Stearoyl-ACP desaturase, which determines the ratio of saturated to unsaturated fatty acids) (Table S4). Therefore, the seed oil content in the sesame accessions could be regulated by three factors, i.e., oil synthesis ability, oil degradation ability and oil component ratio (e.g., 18∶1 and 18∶2 fatty acids). Across accessions, allelic variation in any of the genes involved in fatty acid and TAG synthesis, TAG degradation or dehydrogenation could affect oil content. To test this hypothesis, further studies of seed oil synthesis should be performed in the future.

Conclusion

We systematically performed association mapping of seed oil and protein content traits in 369 worldwide sesame accessions using 112 SSR markers. A significant negative correlation between the oil content and the protein content existed in the population. Using the GLM method, 19 SSR markers were associated with the oil content trait and explained a high proportion of the phenotypic variation, and 24 SSR markers were associated with the protein content trait. These association results provide an efficient platform for seed development research and MAS breeding in sesame.

Methods

Plant materials

A population of 369 core sesame germplasm accessions was chosen according to the results of genetic diversity analysis and phenotypic variation [51]. These core genotypes comprised 318 lines from 19 provinces of China and 51 lines from 15 other countries worldwide, all preserved at the Sesame Germplasm Bank of the Henan Sesame Research Center (HSRC), Henan Academy of Agricultural Sciences (HAAS) (Table S1). All accessions were grown with three replications at three experimental stations, Yuanyang (113.96°E, 35.05°N), Pingyu (114.62°E, 32.97°N), and Xinyang (114.08°E, 32.13°N), in 2011 and 2012. Five or six young leaves of each accession were collected and stored at −70°C for DNA extraction.

Oil and protein content analysis

After harvest, ∼20 g of seeds were collected from five plants per line, and the seed OC and PC were measured with an infrared analyzer (Perten DA7250, Sweden) according to the manufacturer's instructions. The standard curve for measuring the sesame oil and protein contents had been established from the chemical analysis results of 300 sesame accessions (unpublished data, HSRC). Three replications of each sample were assayed for phenotypic analysis. The mean value, broad-sense heritability and correlation coefficient of the phenotypic data were analyzed using the Statistical Analysis System software (SAS Institute Inc. 2002) [72].
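As an illustration of the correlation coefficient mentioned in the phenotypic analysis, the sketch below computes a Pearson correlation in plain Python. It is not the SAS procedure used in the study, the function name `pearson_r` is introduced here for illustration, and the oil and protein values are invented toy numbers rather than data from the paper.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-accession means (percent), NOT data from the study.
oil = [48.2, 50.1, 52.4, 54.0, 55.3]
protein = [24.1, 22.8, 21.5, 20.2, 19.6]
r = pearson_r(oil, protein)
print(f"r = {r:.3f}")  # strongly negative for this toy data set
```

A value of r close to −1 in such a calculation corresponds to the kind of negative oil-protein relationship reported for the population.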

SSR genotyping

The 112 polymorphic SSR primer pairs were selected from our SSR marker bank [24], [51], [73] (Table S2). DNA extraction, PCR amplification, electrophoresis and SSR genotyping analysis were performed according to the methods described by Zhang et al. [24]. The total number of polymorphic alleles at each SSR locus was calculated from the results of all 369 lines. Polymorphic SSR alleles present in only 4 (1%) or fewer accessions were recorded as rare alleles.
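The rare-allele rule described above (an allele carried by 4 or fewer accessions) can be sketched as a simple count. The following Python example is only an illustration: the data layout, the function name `classify_alleles`, and the toy genotypes are assumptions made here and are not taken from the paper.

```python
from collections import Counter

RARE_MAX_ACCESSIONS = 4  # threshold stated in the text (4 accessions, about 1% of 369)


def classify_alleles(genotypes):
    """Count accessions carrying each allele and split alleles into common/rare.

    genotypes maps accession name -> set of allele labels scored at one SSR locus.
    """
    carriers = Counter()
    for alleles in genotypes.values():
        carriers.update(set(alleles))  # count each accession once per allele
    rare = sorted(a for a, n in carriers.items() if n <= RARE_MAX_ACCESSIONS)
    common = sorted(a for a, n in carriers.items() if n > RARE_MAX_ACCESSIONS)
    return carriers, common, rare


# Hypothetical toy locus: allele sizes (bp) for ten made-up accessions.
toy_locus = {f"acc{i}": {"180"} for i in range(1, 9)}
toy_locus["acc9"] = {"186"}
toy_locus["acc10"] = {"180", "192"}

carriers, common, rare = classify_alleles(toy_locus)
print("total alleles:", len(carriers), "common:", common, "rare:", rare)
```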

Linkage disequilibrium (LD)

Because population structure can produce spurious associations between phenotypes and marker loci, we analyzed the extent and structure of LD within the population before selecting the appropriate association mapping strategy. To assess whether the 112 polymorphic SSR markers segregated independently, LD analysis was conducted following the corresponding procedure in the TASSEL software [49]. Both D′ and r2 were used to quantify LD [74], [75]. The significance (P value) of D′ for each SSR pair was determined with 100,000 permutations.
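To make the two LD measures concrete, the sketch below computes D, D′ and r2 for a pair of loci using the standard textbook definitions. It is a deliberate simplification: it assumes two biallelic loci with known haplotypes, and it does not reproduce how TASSEL treats multi-allelic SSR loci or its permutation test. The function name `ld_stats` and the toy haplotypes are introduced here for illustration.

```python
from collections import Counter


def ld_stats(haplotypes):
    """Return (D, D_prime, r_squared) for two biallelic loci.

    haplotypes is a list of (allele_at_locus1, allele_at_locus2) pairs,
    with alleles coded 0/1. Both loci must be polymorphic in the sample.
    """
    n = len(haplotypes)
    counts = Counter(haplotypes)
    p_a = sum(1 for a, _ in haplotypes if a == 1) / n  # freq. of allele 1 at locus 1
    p_b = sum(1 for _, b in haplotypes if b == 1) / n  # freq. of allele 1 at locus 2
    if p_a in (0.0, 1.0) or p_b in (0.0, 1.0):
        raise ValueError("both loci must be polymorphic")
    p_ab = counts[(1, 1)] / n                          # freq. of the 1-1 haplotype
    d = p_ab - p_a * p_b                               # raw disequilibrium coefficient
    # D' scales |D| by its maximum possible value given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    r_squared = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r_squared


# Hypothetical haplotypes: allele 1 at locus 1 only ever occurs with allele 1 at locus 2.
toy = [(1, 1)] * 2 + [(0, 1)] * 2 + [(0, 0)] * 4
d, d_prime, r_squared = ld_stats(toy)
print(f"D = {d:.3f}, D' = {d_prime:.2f}, r2 = {r_squared:.3f}")
```

The toy data show why both statistics are usually reported: D′ reaches 1 whenever one of the four haplotypes is absent, whereas r2 stays below 1 unless the allele frequencies at the two loci also match.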

Population structure and relatedness analysis

The population structure was determined using STRUCTURE 2.2 [76]. The mixture model and the independent allele frequency model were used to analyze the population dataset. For each number of populations (K, from 1 to 10), five runs of STRUCTURE were carried out; each run started with a burn-in of 10,000 iterations, followed by 100,000 iterations. When running STRUCTURE, we assumed that the inferred populations were in Hardy-Weinberg equilibrium (HWE) and that the loci were unlinked. To correct for the relatedness of individuals in further analyses, the relatedness between individuals (relative kinship) was evaluated using SPAGeDi 1.2 software [77]. The matrix of relative kinship coefficients (K matrix) was applied for association analysis using the mixed linear model (MLM, Q+K) method.
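For orientation, models of the Q and Q+K type are commonly written in the general form below. This is standard textbook notation for structured association mapping, not a formula taken from the paper, and all symbols are introduced here for illustration.

```latex
% Q model (GLM): marker effects plus population-structure covariates
\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{Q}\mathbf{v} + \mathbf{e}

% Q+K model (MLM): adds a random polygenic effect whose covariance follows the kinship matrix
\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{Q}\mathbf{v} + \mathbf{Z}\mathbf{u} + \mathbf{e},
\qquad \mathbf{u} \sim N(\mathbf{0},\, 2\mathbf{K}\sigma_g^2),
\qquad \mathbf{e} \sim N(\mathbf{0},\, \mathbf{I}\sigma_e^2)
```

Here y is the vector of phenotypes (OC or PC), X relates accessions to marker alleles with fixed effects β, Q holds the subpopulation memberships inferred by STRUCTURE with fixed effects v, Z relates accessions to the random polygenic effects u, K is the relative kinship matrix, and e is the residual. The factor 2 in front of K is one common convention; some formulations absorb it into K or into the variance component.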

Association mapping and marker distribution in sesame genome

Associations between the SSR markers and the oil and protein content traits were investigated using both the general linear model (GLM, Q) and the mixed linear model (MLM, Q+K) in TASSEL 2.1, as described by Bradbury et al. [49]. The mean value of the markers at P<0.005 was used to determine the significance of marker-trait associations. To determine the distribution of the associated markers in the sesame genome, we aligned the SSR markers and transcripts against the updated sesame genome data [70], [71]. In the present genome assembly, the N50 scaffold number was 14 and the N90 scaffold number was 64, and 29,798 gene models were identified. Within the related scaffolds or contigs, sesame lipid synthesis-related genes were identified by homology comparison, using A. thaliana genes from the Acyl Lipids pathway database [78] as queries.

Supporting Information

Origins of 369 sesame accessions. (DOCX)

Diversity statistics of 112 SSR markers in 369 sesame accessions and locations in the sesame draft genome. (DOCX)

Association mapping of the OC and PC traits using MLM method. (DOCX)

Functions and involved pathways of the candidate genes. (DOCX)