Literature DB >> 31057576

Genome-Wide Association Study Reveals Novel Genomic Regions Associated With High Grain Protein Content in Wheat Lines Derived From Wild Emmer Wheat.

Jia Liu1,2, Lin Huang1,2, Changquan Wang3, Yaxi Liu1,2, Zehong Yan1,2, Zhenzhen Wang1, Lan Xiang1, Xiaoying Zhong1, Fangyi Gong1, Youliang Zheng1,2, Dengcai Liu1,2, Bihua Wu1,2.   

Abstract

Grain protein content (GPC) and yield are of two important traits in wheat, but their negative correlation has hampered their simultaneous improvement in conventional breeding. Wild emmer wheat (Triticum turgidum ssp. dicoccoides) is an important genetic resource for wheat quality improvement. In this study, we report a genome-wide association study (GWAS) using 13116 DArT-seq markers to characterize GPC in 161 wheat lines derived from wild emmer. Using a general linear model, we identified 141 markers that were significantly associated with GPC, and grouped into 48 QTL regions. Using both general linear model and mixed linear model, we identified four significant markers that were grouped into two novel QTL regions on chromosomes 2BS (QGpc.cd1-2B.1) and 7BL (QGpc.cd1-7B.2). The two QTLs have no negative effects on thousand kernel weight (TKW) and should be useful for simultaneous improvement of GPC and TKW in wheat breeding. Searches of public databases revealed 61 putative candidate/flanking genes related to GPC. The putative proteins of interest were grouped in four main categories: enzymes, kinase proteins, metal transport-related proteins, and disease resistance proteins. The linked markers and associated candidate genes provide essential information for cloning genes related to high GPC and performing marker-assisted breeding in wheat.

Entities:  

Keywords:  GPC; GWAS; common wheat; wide hybridization; wild emmer wheat

Year:  2019        PMID: 31057576      PMCID: PMC6477094          DOI: 10.3389/fpls.2019.00464

Source DB:  PubMed          Journal:  Front Plant Sci        ISSN: 1664-462X            Impact factor:   5.753


Introduction

Wheat provides approximately 20% of calories and 25% of proteins in the human diet. The nutritional quality of wheat grains has a significant impact on human health and well-being. The grain protein content (GPC) is an important trait for both nutritional value and end-use quality of wheat (Veraverbeke and Delcour, 2002). While the protein and micronutrients levels in modern wheat grains are inherently low and therefore, breeding for improvements in the nutritional quality of wheat, such as increased protein and micronutrient levels in the grain are possible and have the potential to alleviate hunger and nutrient deficiencies. Breeding wheat varieties with high in both GPC and grain yield is desirable (Zhao et al., 2009). However, the GPC was negative correlated with grain yield, where phenotypic correlations for these two traits usually range between −0.30 and −0.60 (Klindworth et al., 2009). This characteristic has hampered the simultaneous improvement of these two traits in the conventional wheat breeding. The transfer of alien genes from wheat related species has become an effective approach in the development of new wheat varieties for high GPC or yield. Wild emmer wheat (Triticum turgidum ssp. dicoccoides, 2n = 4x = AABB) is the tetraploid progenitor of cultivated wheat, offering wide genotypic variations relevant for improvement of various agronomic traits in wheat (Cakmak et al., 2004; Peleg et al., 2008b; Gomez-Becerra et al., 2010), such as GPC (e.g., Uauy et al., 2006a; Wang et al., 2018), disease resistance (Hua et al., 2009; Li et al., 2009), and drought resistance (Peleg et al., 2005, 2008a). The introgression is feasible due to the occurrence of homologous recombination between the A and B genomes of wild emmer and modern wheat. Dissection of the genetic control of GPC in wild emmer wheat was conducted using tetraploid wheat mapping population, derived from a cross between durum wheat (cv. Langdon) and wild emmer accessions (G18-16 or FA-15-3) (Joppa et al., 1997; Gonzalez-Hernandez et al., 2004; Peleg et al., 2009). Although several quantitative trait loci (QTLs) affecting GPC were reported from these two accessions, the introgression and characterization of wild emmer GPC-QTL in a hexaploid wheat background have been less reported. The most important wild emmer GPC-QTL is the Gpc-B1 on chromosome 6BS. This gene encodes a NAC-domain transcription factor that accelerates senescence and increases nutrient remobilization from leaves to developing grains (Uauy et al., 2006b). The introgression of Gpc-B1 in breeding programs has shown significant improvement in GPC (Kumar et al., 2011; Mishra et al., 2015; Vishwakarma et al., 2016). However, the presence of Gpc-B1 was associated with reductions in grain weight and yield in some wheat varieties and environments (Uauy et al., 2006a; Brevis and Dubcovsky, 2010; Tabbita et al., 2013). Therefore, the exploring of QTLs for GPC with less negative effect on yield-related traits is required. In recent years, genome-wide association analysis (GWAS) based on linkage disequilibrium (LD) has been extensively used to decipher the genetic bases of complex traits in crops (Su et al., 2016; Wang et al., 2016; Ates et al., 2018). Advantages of GWAS over traditional QTL mapping include high resolution mapping, cost and time efficiency, and the potential to utilize large sets of germplasm resources such as landraces, elite cultivars, and advanced breeding lines. Besides, GWAS was available to analysis traits in multi-parent populations consisted of backbone parent and its derived lines in Yu and Tian (2012); Yu et al. (2014); Xiao et al. (2016). In wheat, this approach has been widely used to identify loci controlling disease resistance (Joukhadar et al., 2013; Kollers et al., 2014; Liu et al., 2017a), yield-related traits (Breseghello and Sorrells, 2006; Tadesse et al., 2015), end-use quality traits (Battenfield et al., 2016), root traits (Beyer et al., 2018), and grain zinc concentration (Velu et al., 2018). In our previous work, the agronomically stable advanced wheat lines were obtained from common wheat cultivar Chuannong 16 (CN16 hereafter) as female crossed with wild emmer accession D1 as male through successive selfing. We found that majority of the tested wheat lines showed higher chlorophyll contents and lower chlorophyll degradation rates in flag leaves than the controls (Liu et al., 2016). Several wheat lines without Gpc-B1 gene showed thousand-kernel weight (TKW) and GPC simultaneous improvement (Wang, 2015), implying the presence of novel loci conferring high GPC and TKW. In the current study, GWAS was used to study genetic basis of GPC in a multi-parent population which consisted of wild emmer as backbone parent and its derived wheat lines. The objectives of this study were to characterize the wild emmer GPC loci that associated with no reductions in TKW in a hexaploid wheat population, and to scan associated candidate genes using recently published wheat reference sequences (Avni et al., 2017; Appels et al., 2018). The identified genes and markers will provide important information for cloning GPC-related genes and be useful in marker-assisted breeding for enhanced GPC in wheat.

Materials and Methods

Plant Materials

The low-gluten wheat cultivar CN16 (T. aestivum, AABBDD, 2n = 6x = 42) was crossed as the female parent with a high-protein-content wild emmer accession D1 (originating from Israel) as the male parent. The resulting pentaploid F1 hybrid was advanced to F12 to generate 106 first recombinant inbred lines (RILs) include 106 individuals with a number of stable 42 chromosomes. Some RILs with high GPC were selected and crossed to each of low/medium-gluten common wheat varieties Mianmai 46 (MM46), Chuanmai 50 (CM50), Kechengmai 2 (KCM2), and Chuanyu 18 (CY18)/YunB 58863 (YB58863) and eight generations of self-fertilization to create a second RILs with 55 individuals (Supplementary Table S1). A total of 161 wheat lines were used in this study. All of the materials were maintained at the Triticeae Research Institute, Sichuan Agricultural University, China. Wheat plants were grown in a randomized complete block design with three replicates over two growing seasons (2015 and 2016) at Chongzhou (2015CZ and 2016CZ) and one growing season (2015) at Wenjiang (2015WJ). Individual plants were spaced 10 cm apart within a 2 m row, with 30 cm between rows. Each replicate contained twenty individuals in a 2 m row. Mature seeds were harvested from the middle six plants of each row and used for GPC and TKW measurement.

Determination of Grain Protein Content and Thousand Kernel Weight

Wheat seeds were dried to constant weight and ground to a fine powder with a Chopin CD1 AUTO (Renault, Boulogne-Billancourt, France) (Jirsa et al., 2008). A 0.5 g powder samples were measured for GPC by1241 Grain Analyzer (FOSS A/S, Hillerød, Denmark) following the method as described by Pettersson and Eckersten (2007). TKW was evaluated with an electronic balance by weighing three samples of 300 kernels (GB T 5519-2008, 2008). Analyses of variance (ANOVA) and pearson correlation coefficients for measured traits were performed using the SPSS version 22.0 (SPSS Inc., Chicago, IL, United States).

Genotyping and Linkage Disequilibrium and Population Structure Analysis

The genomic DNA was extracted using CTAB method (Murray and Thompson, 1980). All of the tested materials were genotyped using the DArT (Diversity Arrays Technology, Canberra, ACT, Australia[1]). The obtained DArT markers data were filtered based on call rate (minimum threshold value of 85%) and reproducibility (minimum threshold value of 95%). A total of 24022 DArT markers were recalled from marker data. For the further analysis, markers with missing data greater than 10% and minor allele frequency (MAF) less than 5% were filtered, which resulted in 13116 DArT markers were retained and used. Linkage disequilibrium (LD) analysis was performed for each of the 21 wheat chromosomes associated with mapped DArT markers using software TASSEL 3.0 (Bradbury et al., 2007). The LD squared allele-frequency correlation (r2), which contains both mutational and recombination history information, was evaluated for syntenic loci (p < 0.001). The mean r2 over different genetic distances was calculated for the A, B, and D subgenomes as well as the whole genome. The LD decay plots were generated using r2 and the genetic map distance between markers. Population structure was estimated using Structure software 2.3.4 (Pritchard et al., 2000). An admixture model with ten replicates for each number of genetic groups (K = 1–10) and 100,000 iterations of burn-in followed by 100,000 Markov Chain Monte Carlo (MCMC) iterations were used. The outputs of the genetic group analysis were extracted in Structure Harvester (Earl and von Holdt, 2012). The optimal K-value was determined using the delta K method as described by Evanno et al. (2005). Furthermore, principal component analysis (PCA) of filtered markers was performed with the TASSEL 3.0 (Bradbury et al., 2007) and the first two PCA values were plotted.

Genome-Wide Association Study for GPC

To eliminate the environmental impact, the best linear unbiased prediction (BLUP) across all tested environments was carried out using the META-R (Alvarado et al., 2015). The association analysis of DArT markers and GPC was performed using a general linear model (GLM) and mixed linear model (MLM) in TASSEL. The first three PCA values were used as covariate in the model to adjust population stratification and kinship matrix (K) was calculated using Scaled IBS method (Yu et al., 2006; Liu et al., 2018). A Bonferroni-corrected p value threshold at α = 1 was used as the cutoff (Yang et al., 2014; Liu et al., 2017b). For the 13116 DArT markers, the p-value threshold at α = 1 was 7.62 × 10−5, with a corresponding −log10p value of 4.12. Significant markers were visualized with a Manhattan plot using Haploview 4.2 software (Barrett et al., 2004). Important p-value distributions (expected vs. observed p-values on a −log10 scale) were shown with a quantile-quantile plot. To identify the candidate genes linked to significant markers, we performed a BLAST search against the International Wheat Genome Sequencing Consortium database (IWGSC[2]) and the International Wild Emmer Wheat Genome Sequencing Consortium database[3] using the significant marker sequences. When a DArT marker sequence and a wheat contig were 100% identical, the sequence was extended to 5 kb for each marker using the IWGSC BLAST results (Liu et al., 2017b). The extended sequence was explored to predict flanking genes using BLAST search against the Triticeae Multi-omics center database[4].

Results

Variation for GPC in the Wheat Lines Derived From Wild Emmer

The variation for GPC in the wheat lines and their core parents were summarized in Table 1. Most genotypes displayed relatively stable GPC across the three environments (2015WJ, 2015CZ, and 2016CZ) (Figure 1). The GPC was highly correlated (p < 0.01) across environments (r = 0.210–0.732) based on the two-tailed pearson product-moment correlation coefficient test (Fieller et al., 1957; Table 2). The male parent D1 exhibited significantly higher GPC (mean range 22.45–23.43%) compared to female parent CN16 (mean range 12.00–12.90%) across all test environments (Table 1). The GPC of the wheat lines derived from wild emmer were ranged from 11.97 to 19.73% (mean range 14.47–14.70%) across three environments. The lowest mean GPC (11.97%) was recorded in 2015 at Chongzhou, whereas the highest mean GPC (19.73%) was recorded in 2015 at Wenjiang. Most of the wheat lines showed significantly higher GPC than CN16 (Table 1 and Figure 2). A large number of wheat lines had GPC ranged from 14 to 15% in each environment (Figure 2A–C). A total of 92 (59%) wheat lines displayed consistently high GPC (≥14%) across three tested environments (Figure 2D). The mean TKW of wheat lines were higher than that of female parent CN16 under 2015WJ, 2015CZ, and 2016CZ environments (Table 1). Pearson’s correlation analysis showed that there was no significant negative correlation between GPC and TKW across environments (Table 2). Therefore, according to the phenotypic BLUP data of GPC and TKW, a total of 132 wheat lines with strong gluten (GPC ≥14%) and high TKW (≥46.24 g) were identified (Supplementary Figure S1).
Table 1

The variation for GPC and TKW in wheat lines derived from wild emmer under three environments.

TraitEnvironmentParentsDerived lines
CN16D1Mean ± SDRangeCV (%)
GPC2015WJ12.89a ± 0.2522.45c ± 0.4414.70b ± 0.8212.35–19.735.58
2015CZ12.00a ± 0.1022.96c ± 0.4614.47b ± 0.7811.97–19.185.39
2016CZ12.90a ± 0.0823.43c ± 0.9414.48b ± 0.9412.25–17.336.49
TKW2015WJ43.19b ± 4.9919.21a ± 4.3945.78b ± 4.9431.35–57.0010.79
2015CZ49.33b ± 1.5415.00a ± 0.6752.82b ± 4.4037.08–64.638.33
2016CZ42.70b ± 4.9731.63a ± 0.7148.94c ± 4.5535.00–60.889.30
FIGURE 1

GPC analysis of the wheat lines derived from wild emmer grown at 2015WJ, 2015CZ, and 2016WJ.

Table 2

Pearson correlation coefficient of GPC and TKW among all measured environments in the wheat lines derived from wild emmer.

Correlations2015WJ GPC2015CZ GPC2016CZ GPC2015WJ TKW2015CZ TKW
2015CZ GPC0.732∗∗
2016CZ GPC0.263∗∗0.210∗∗
2015WJ TKW−0.02−0.0350.171
2015CZ TKW−0.0170.1150.1290.436∗∗
2016CZ TKW0.0290.135−0.0620.0310.367∗∗
FIGURE 2

Frequency distributions of GPC values under three environments in the wheat lines derived from wild emmer. (A), 2015WJ; (B), 2015CZ; (C), 2016CZ; (D) Wayne chart: the number of wheat lines with GPC ≥14% in tested environments (Blue color, 2015WJ; Yellow color, 2015CZ; Green color, 2016CZ).

GPC analysis of the wheat lines derived from wild emmer grown at 2015WJ, 2015CZ, and 2016WJ. Frequency distributions of GPC values under three environments in the wheat lines derived from wild emmer. (A), 2015WJ; (B), 2015CZ; (C), 2016CZ; (D) Wayne chart: the number of wheat lines with GPC ≥14% in tested environments (Blue color, 2015WJ; Yellow color, 2015CZ; Green color, 2016CZ). The variation for GPC and TKW in wheat lines derived from wild emmer under three environments. Pearson correlation coefficient of GPC and TKW among all measured environments in the wheat lines derived from wild emmer.

Analysis of DArT Markers, Population Structure and Linkage Disequilibrium (LD)

A total of 13116 polymorphic DArT markers (MAF≥5% and missing ≤10%) were used to estimate the underlying population structure. Of these, 4710, 6718, and 1688 markers were mapped on the A, B, and D genomes, respectively. The total map length was 5941.15 cM, and the average distance between markers for the A, B, and D genomes was 0.44 cM, 0.30 cM, and 1.11 cM, respectively (Supplementary Table S2). Chromosome 1B contained the largest number of markers (1708) and the average distance between markers was 0.33 cM, whereas Chromosome 4D contained the smallest number of markers (52) and the average distance between markers was 3.47 cM. Base on structure analysis, the highest delta K-value was 3, the 163 genotypes were divided into three main sub-groups, namely Gp1, Gp2, and Gp3, representing 41, 75, and 47 genotypes, respectively. Gp1 only included individuals from the 106 RILs; Gp2 included individuals from both RIL populations; the two core parents and most of the individuals from the 55 RILs were belonged to Gp3 (Supplementary Figure S2A and Supplementary Table S1). PCA was also performed to investigate population structure. The first two PCs explained 12.64 and 7.54% variation in the population of wheat lines (Supplementary Figure S2B). The PCA results were basically agreed with that of structure analysis, which clearly divided the genotypes into three main sub-groups. LD between pairs of markers was estimated for each of the 21 chromosomes (Supplementary Table S3). In the whole genome, 55.55% of pairwise DArT markers had a significant LD (p < 0.001), and 37.37% of significant pairwise markers had an r2 > 0.2. In the A, B, and D subgenomes, 51.12, 62.58, and 39.71% of pairwise DArT markers had a significant LD (p < 0.001), and 32.06, 43.90, and 25.98% of the significant pairwise markers had an r2 > 0.2, respectively. The extent of LD in each chromosome was different. Chromosome 1B showed the highest percentage of pairwise markers with significant LDs (75.64%), with r2 > 0.2 (62.14% of significant pairwise markers). In contrast, chromosome 4D had the lowest percentage of pairwise markers with significant LDs (17.58%), with r2 > 0.2 (7.35% of significant pairwise markers). The mean r2-values for the A, B, and D genomes as well as the whole genome were rapidly decreased with increasing pairwise distance (Supplementary Figure S3). The LD decay distances were approximately 9, 12, 13, and 12 cM for the A, B, and D genomes as well as the whole genome, respectively (Supplementary Figure S3).

GWAS for GPC in the Wheat Lines Derived From Wild Emmer

A total of 141 markers had significant results in the three environments by GLM with PVEs ranged from 9.90 to 22.79% (Supplementary Table S4). These markers were distributed on all wheat chromosomes except chromosomes 1A, 1B, 4A, and 7A (Figure 3A). In the MLM, four significant markers (three on 2B and one on 7B) were detected with PVEs of 11.92–12.93%, and these markers were also detected by GLM (Figure 3B and Tables 3, 4). To determine the confidence interval for the potential QTL identified in this study, markers significantly associated to the GPC that were co-locating together and/or adjacent within intervals inferior to the LD decay distance of ± 5 cM from the association peak were considered as a same QTL region. In total, the 141 significant markers detected were grouped into 48 loci (Supplementary Table S4). Of these, four significant markers by both models were grouped into two loci on chromosomes 2B (QGpc.cd1-2B.1) and 7B (QGpc.cd1-7B.2) (Table 3 and Figure 3B) that potentially contain GPC genes. In particular, the mean PVE of three markers on chromosome 2B in these two stable major loci was 12.78 and 20.49% in MLM and GLM, and the PVE value of marker 1166404 located on chromosome 7B was the highest at 22.79% in GLM. Quantile-quantile plots of p-values comparing the uniform distribution of the expected −log10p value to the observed −log10p value for GPC showed that the MLM was more conservative than GLM (Figure 3C).
FIGURE 3

Manhattan plots of genome-wide association scan for GPC in three environments. Manhattan plot showing –log10 p values of the markers for GPC. Red lines indicate the −log10 p threshold of 4.12. Chromosomes carrying significant markers detected by GLM (A) and MLM (B); (C), Quantile-quantile plots of GPC. Green plot, GLM model; Red plot, MLM model; Black line, the expected values.

Table 3

Significant DArT markers for GPC identified by MLM and GLM.

QTLMarkerChr.Position (cM)EnvironmentMLM
GLM
−log10(p)PVE (%)−log10(p)PVE (%)
QGpc.cd1-2B.111165352B3.93BLUP4.3912.557.8420.19
30222132B8.70BLUP4.4712.938.4221.35
10886962B8.70BLUP4.4412.877.8919.93
QGpc.cd1-7B.211664047B216.46BLUP4.2911.929.0722.79
Table 4

MTAs identified by GWAS base on the BLUP data.

ModelNo. sig.aAverage -log(p)Range -log(p)Average PVE (%)Range PVE (%)No. Sharedb
Mixed linear model (MLM)44.44.29–4.4712.5711.92–12.934
General linear model (GLM)1415.764.12–9.4514.159.90–22.79
Manhattan plots of genome-wide association scan for GPC in three environments. Manhattan plot showing –log10 p values of the markers for GPC. Red lines indicate the −log10 p threshold of 4.12. Chromosomes carrying significant markers detected by GLM (A) and MLM (B); (C), Quantile-quantile plots of GPC. Green plot, GLM model; Red plot, MLM model; Black line, the expected values. Significant DArT markers for GPC identified by MLM and GLM. MTAs identified by GWAS base on the BLUP data.

Candidate Genes That May Be Associated With GPC

Sequences of DArT markers related with GPC were utilized to predicate candidate genes using the recently annotated wheat reference sequence (RefSeq v.1.0) and wild emmer wheat reference sequence (WEWSeq v.1.0). The enzymes and proteins related to 61 putative candidate genes were predicated (Supplementary Table S5). These candidate genes could be roughly divided into four groups according to the types of protein they encoded. The first group of candidate genes included enzymes associated with nitrogen transport, such as aspartic peptidase, peroxidase, and peptidase. The second group consisted of genes including kinase proteins. The third group included metal transport related proteins, such as zinc transporter, and heavy metal transport proteins. The last group consisted of disease resistance proteins, such as MYB transcription factor and NAC domain-containing proteins (Supplementary Table S5).

Discussion

Wild emmer wheat has been found to show higher concentration of protein in grain when compared to common cultivated wheat (Cakmak et al., 2004). Several QTL analyses were employed to dissect the genetic basis of grain protein using a tetraploid wheat RIL population (Joppa et al., 1997; Gonzalez-Hernandez et al., 2004; Peleg et al., 2009). In this study, the 161 RILs derived from wild emmer were phenotyped in multiple environments and genotyped using the DArT markers to dissect the genetic basis of grain protein accumulation in hexaploid wheat. GWAS identified 141 significant markers for GPC by GLM, and 61 associated candidate genes were predicated using the sequences annotation (RefSeq v.1.0; WEWSeq v.1.0). Two large effect QTL regions on chromosomes 2B and 7B were detected by both GLM and MLM. Our findings are consistent with previous studies that the group 2 and 7 chromosomes harboring genes for nutritional quality traits in wheat (Crespo-Herrera et al., 2017; Velu et al., 2018). Peleg et al. (2009) identified ten GPC-QTLs related to GPC using a mapping population derived from LDN crossed with wild emmer G18-16 that showed high GPC phenotype; these loci were distributed on 10 wild emmer chromosomes except 1A, 1B, 3A, and 4B; A few genomic regions (2A, 5A, 6B, and 7A) were found to contain clusters of QTLs for GPC. In the current study, 48 significant GPC-QTLs were scattered across all wheat chromosomes (except 1A, 1B, 4A, and 7A) using a GLM model. We found the GPC-QTLs on chromosomes 2A, 3B, and 6A are located in the same region as compared to Peleg et al. (2009) and the remaining QTLs in our studies were novel. However, GPC-QTLs on chromosomes 4A and 7A in the G18-16 tetraploid mapping population were not detected in our hexaploid RILs. Candidate genes prediction revealed that some QTL were related to nitrogen metabolism and disease resistance. For example, QTLs on chromosome 5B were associated with glutathione S-transferase (GST) and proline transporter. Previous research showed that GSTs are critical for nitrogen fixation in plant (Dalton et al., 2009). In particular, the marker 1064158 (55.28 cM) located on chromosome 5B was consistent with the results reported by previous study (Kumar et al., 2018b). QGpc.cd1-2A.2 and QGpc.cd1-5A were associated with methyltransferase, cytochrome P450 and NAC domain-containing protein. QGpc.cd1-6A.1 were associated with nitrogenase-stabilizing. Kim et al. (2005) reported that the function of nitrogenase (NA) is closely related to plant nitrogen metabolism, and nitrogenase activity has a positive effect on wheat growth, total plant N-yield, and protein content. Moreover, although QGpc.cd1-1D.1 was only detected under one model, it showed a very high PVE value 19.06% and was associated with disease resistance protein. Note worthily, we detected two QTLs QGpc.cd1-6B.2 and QGpc.cd1-6B.3 on 6BS that associated with zinc transporter and glutenin macropolymer (GMP) synthase. In barley, the zinc transporter (for example HvMTP1) was associated with enhancement of grain Zn content (Menguer et al., 2018). The glutenin macropolymer (GMP) content of flour was confirmed to be significant related to high protein content (Weegels et al., 1996). Uauy et al. (2006b) reported the Gpc-B1 on 6BS that was associated with N, Zn, and Fe remobilization in plants. However, this locus was not detected in the current study and the physical position of the two QTLs on 6BS was larger than 10 Mb away from the functional Gpc-B1. Different hypotheses may explain this observation: (1) the relatively low density of DArT markers; (2) the low frequency of Gpc-B1 in the current wheat lines that cannot detected by GWAS. Additional studies are required to test these hypotheses. Using a MLM model, we identified two significant stable GPC-QTLs QGpc.cd1-2B.1 (3.93–8.70 cM) and QGpc.cd1-7B.2 (216.46 cM), explaining 11.92–12.93 and 19.93–22.79% of the phenotypic variance. A GPC-QTL (95.4 ± 13.8 cM) on wild emmer chromosome 2B was detected by Peleg et al. (2009). In durum wheat, Suprayogi et al. (2009) identified a QTL on 2B which was associated with higher GPC in all tested environments. Recently, two environmental stable GPC-QTLs QGpc.mna-2B (Tsilo et al., 2010) and QGpc.2B-yume (Terasawa et al., 2016) were reported in wheat. QGpc.mna-2B was located between the markers Xwmc245-Xgwm271 at position 64.0–65.0 cM on chromosome 2B; whereas QGpc.2B-yume was sited near the marker Xgpw4382 at position 67.1 cM. The position of QGpc.cd1-2B.1 (3.93–8.70 cM) in our study is different to the previous reported GPC-QTLs on chromosome 2B. On chromosome 7B, at least five GPC-QTLs with stable effects across environments have been reported in different studies (reviewed by Kumar et al., 2018a). Of these, most GPC-QTLs were located on the short arm of chromosome 7B. Blanco et al. (2006) reported a QTL located on 7BL explained 9.1% of the phenotypic variance of the GPC which close to the SSR marker Xgwm577 at position 137 cM. QGpc.cd1-7B.2 in our study is located at position 216.46 cM on the long arm of chromosome 7B. Therefore, we concluded that the two wild emmer QTLs on 2B and 7B conferring high GPC were novel. In this study, the highly associated markers sequences were used to predicate putative candidate genes for QGpc.cd1-2B.1 and QGpc.cd1-7B.2. We found that the ABA-responsive binding factor and processing peptidase genes may play crucial roles in conferring of higher protein content in wheat grain. For example, the ABA-dependent NAC transcription factor (OsNAC5) in rice was reported to be associated with Fe, Zn and amino acids remobilization from green tissues to seeds (Sperotto et al., 2009). In addition, the QGpc.cd1-7B.2 was associated with processing peptidase. Peptidases are key enzymes that involved in nitrogen metabolism, such as nitrate reductase, endopeptidase, aminopeptidase, and carboxypeptidase. It was known that leaf senescence and nitrogen metabolism of senescing tissues are two important factors that determining the GPC in cereals (Jukanti and Fischer, 2008). Moreover, the peptidase plays a role in regulating the redistribution and utilization of nitrogen in wheat plants that affects the grain protein quality of wheat (Waters et al., 1980). Functional analyses are needed to further understand the pathways of these candidate genes for regulating high GPC in wheat. The wheat grain yield and GPC are negatively correlated and the negative correlation was mainly due to the dilution effect that making the simultaneous increase of the two traits challenging (Cox et al., 1985; Day et al., 1985; Simmonds, 1995). In our previous studies, we confirmed that the introgression of high GPC and TKW traits from wild emmer wheat to common wheat is feasible by wide hybridization (Wu et al., 2008; Wang et al., 2018). The TKW was positively correlated with grain yield as reported by previous studies (Bhatta et al., 2018a,b). In the present study, no significant negative correlation between GPC and TKW were detected in the wheat lines derived from wild emmer, suggesting presence of novel GPC-QTLs that are not associated or less associated with TKW. Our results indicate the improvement of GPC without sacrificing yield could be possible by dissection and introgression of wild emmer QTLs. In addition, our study indicates the potential value of QGpc.cd1-2B.1 and QGpc.cd1-7B.2 for GPC improvement and provides comprehensive use of wild emmer wheat for offering new source of alleles to increase GPC in wheat.

Conclusion

Wild emmer is a valuable resource for wheat quality improvement. The wild emmer derived wheat population showed high GPC and a weak correlation between GPC and TKW, indicating simultaneous improvement of GPC and TKW could be possible. A GWAS identified 48 QTLs (141 MTAs), of which two novel major QTL regions (chromosomes 2B and 7B) for high GPC were stable detected in two models. The QTLs have no negative effects on TKW and would facilitate the development of cultivars with both high GPC and TKW. Furthermore, at significant loci and flanking regions, we identified 61 putative candidate genes that might play crucial roles in conferring of higher protein content in wheat grain. The identified markers and genes offer information for cloning genes related to GPC and may be used in wheat breeding programs.

Author Contributions

BW, JL, and LH designed the study. JL, ZW, LX, XZ, and FG carried out experiments. JL analyzed the data and drafted the manuscript. CW, YL, and ZY provided the genotyping data. YZ and DL provided their constructive comments and suggestions. LH, BW, DL, and YZ revised the manuscript. All authors read and approved the final manuscript.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
  40 in total

1.  Association mapping in structured populations.

Authors:  J K Pritchard; M Stephens; N A Rosenberg; P Donnelly
Journal:  Am J Hum Genet       Date:  2000-05-26       Impact factor: 11.025

2.  Haploview: analysis and visualization of LD and haplotype maps.

Authors:  J C Barrett; B Fry; J Maller; M J Daly
Journal:  Bioinformatics       Date:  2004-08-05       Impact factor: 6.937

3.  A unified mixed-model method for association mapping that accounts for multiple levels of relatedness.

Authors:  Jianming Yu; Gael Pressoir; William H Briggs; Irie Vroh Bi; Masanori Yamasaki; John F Doebley; Michael D McMullen; Brandon S Gaut; Dahlia M Nielsen; James B Holland; Stephen Kresovich; Edward S Buckler
Journal:  Nat Genet       Date:  2005-12-25       Impact factor: 38.330

4.  Detecting the number of clusters of individuals using the software STRUCTURE: a simulation study.

Authors:  G Evanno; S Regnaut; J Goudet
Journal:  Mol Ecol       Date:  2005-07       Impact factor: 6.185

5.  The high grain protein content gene Gpc-B1 accelerates senescence and has pleiotropic effects on protein content in wheat.

Authors:  Cristobal Uauy; Juan Carlos Brevis; Jorge Dubcovsky
Journal:  J Exp Bot       Date:  2006-07-10       Impact factor: 6.992

6.  Wheat root colonization and nitrogenase activity by Azospirillum isolates from crop plants in Korea.

Authors:  Chungwoo Kim; Mihály L Kecskés; Rosalind J Deaker; Kate Gilchrist; Peter B New; Ivan R Kennedy; Seunghwan Kim; Tongmin Sa
Journal:  Can J Microbiol       Date:  2005-11       Impact factor: 2.419

7.  Detection of QTLs for grain protein content in durum wheat.

Authors:  A Blanco; R Simeone; A Gadaleta
Journal:  Theor Appl Genet       Date:  2006-02-02       Impact factor: 5.699

8.  Association mapping of kernel size and milling quality in wheat (Triticum aestivum L.) cultivars.

Authors:  Flavio Breseghello; Mark E Sorrells
Journal:  Genetics       Date:  2005-08-03       Impact factor: 4.562

Review 9.  Wheat protein composition and properties of wheat glutenin in relation to breadmaking functionality.

Authors:  Wim S Veraverbeke; Jan A Delcour
Journal:  Crit Rev Food Sci Nutr       Date:  2002       Impact factor: 11.176

10.  A NAC Gene regulating senescence improves grain protein, zinc, and iron content in wheat.

Authors:  Cristobal Uauy; Assaf Distelfeld; Tzion Fahima; Ann Blechl; Jorge Dubcovsky
Journal:  Science       Date:  2006-11-24       Impact factor: 47.728

View more
  9 in total

1.  Enriching LMW-GS alleles and strengthening gluten properties of common wheat through wide hybridization with wild emmer.

Authors:  Lan Xiang; Lin Huang; Fangyi Gong; Jia Liu; Yufan Wang; Yarong Jin; Yu He; Jingshu He; Qiantao Jiang; Youliang Zheng; Dengcai Liu; Bihua Wu
Journal:  3 Biotech       Date:  2019-09-05       Impact factor: 2.406

2.  Genome-Wide Association Study for Grain Micronutrient Concentrations in Wheat Advanced Lines Derived From Wild Emmer.

Authors:  Jia Liu; Lin Huang; Tingxuan Li; Yaxi Liu; Zehong Yan; Guan Tang; Youliang Zheng; Dengcai Liu; Bihua Wu
Journal:  Front Plant Sci       Date:  2021-05-14       Impact factor: 5.753

3.  Identification of Novel Genomic Regions and Superior Alleles Associated with Zn Accumulation in Wheat Using a Genome-Wide Association Analysis Method.

Authors:  Zhengfu Zhou; Xia Shi; Ganqing Zhao; Maomao Qin; Maria Itria Ibba; Yahuan Wang; Wenxu Li; Pan Yang; Zhengqing Wu; Zhensheng Lei; Jiansheng Wang
Journal:  Int J Mol Sci       Date:  2020-03-11       Impact factor: 5.923

4.  Genome-Wide Characterization of OFP Family Genes in Wheat (Triticum aestivum L.) Reveals That TaOPF29a-A Promotes Drought Tolerance.

Authors:  Dezhou Wang; Zhichen Cao; Weiwei Wang; Wengen Zhu; Xiaocong Hao; Zhaofeng Fang; Shan Liu; Xiaoyan Wang; Changping Zhao; Yimiao Tang
Journal:  Biomed Res Int       Date:  2020-11-07       Impact factor: 3.411

5.  Identification of QTLs for Grain Protein Content in Russian Spring Wheat Varieties.

Authors:  Irina N Leonova; Antonina A Kiseleva; Alina A Berezhnaya; Anatoly I Stasyuk; Ivan E Likhenko; Elena A Salina
Journal:  Plants (Basel)       Date:  2022-02-05

6.  Comparison of the Agronomic, Cytological, Grain Protein Characteristics, as Well as Transcriptomic Profile of Two Wheat Lines Derived From Wild Emmer.

Authors:  Fangyi Gong; Tiangang Qi; Tian Zhang; Yusen Lu; Jia Liu; Xiaoying Zhong; Jingshu He; Yunfang Li; Youliang Zheng; Dengcai Liu; Lin Huang; Bihua Wu
Journal:  Front Genet       Date:  2022-01-28       Impact factor: 4.599

7.  Identification of Glu-D1 Alleles and Novel Marker-Trait Associations for Flour Quality and Grain Yield Traits under Heat-Stress Environments in Wheat Lines Derived from Diverse Accessions of Aegilops tauschii.

Authors:  Ikram Elsadig Suliman Mohamed; Nasrein Mohamed Kamal; Hala Mohamed Mustafa; Modather Galal Abdeldaim Abdalla; Ashraf M A Elhashimi; Yasir Serag Alnor Gorafi; Izzat Sidahmed Ali Tahir; Hisashi Tsujimoto; Hiroyuki Tanaka
Journal:  Int J Mol Sci       Date:  2022-10-10       Impact factor: 6.208

8.  Consensus genomic regions associated with grain protein content in hexaploid and tetraploid wheat.

Authors:  Pooja Saini; Imran Sheikh; Dinesh Kumar Saini; Reyazul Rouf Mir; Harcharan Singh Dhaliwal; Vikrant Tyagi
Journal:  Front Genet       Date:  2022-09-28       Impact factor: 4.772

9.  Genome-Wide Investigation and Functional Verification of the ZIP Family Transporters in Wild Emmer Wheat.

Authors:  Fangyi Gong; Tiangang Qi; Yanling Hu; Yarong Jin; Jia Liu; Wenyang Wang; Jingshu He; Bin Tu; Tao Zhang; Bo Jiang; Yi Wang; Lianquan Zhang; Youliang Zheng; Dengcai Liu; Lin Huang; Bihua Wu
Journal:  Int J Mol Sci       Date:  2022-03-05       Impact factor: 5.923

  9 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.