Wheat grain storage protein (GSP) content and composition are the main determinants of the end-use value of bread wheat (Triticum aestivum L.) grain. The accumulation of glutenins and gliadins, the two main classes of GSP in wheat, is believed to be mainly controlled at the transcriptional level through a network of transcription factors. This regulation network could lead to stable cross-environment allometric scaling relationships between the quantity of GSP classes/subunits and the total quantity of nitrogen per grain. This work conducted a genetic mapping study of GSP content and composition and allometric scaling parameters of grain N allocation using a bread wheat worldwide core collection grown in three environments. The core collection was genotyped with 873 markers for genome-wide association and 167 single nucleotide polymorphism markers in 51 candidate genes for candidate association. The candidate genes included 35 transcription factors (TFs) expressed in grain. This work identified 74 loci associated with 38 variables, of which 19 were candidate genes or were tightly linked with candidate genes. Besides structural GSP genes, several loci putatively trans-regulating GSP accumulation were identified. Seven candidate TFs, including four wheat orthologues of barley TFs that control hordein gene expression, were associated or in strong linkage disequilibrium with markers associated with the composition or quantity of glutenin or gliadin, or allometric grain N allocation parameters, confirming the importance of the transcriptional control of GSP accumulation. Genome-wide association results suggest that the genes regulating glutenin and gliadin compositions are mostly distinct from each other and operate differently.
Bread wheat (Triticum aestivum L.) is the most important staple crop in the world, providing on average 20% of the total calories and 22% of the total protein in the human diet (FAOSTAT, 2011). Wheat grain has numerous food and non-food uses mainly based on the unique properties of the proteins it contains (Shewry, 2009). When mixed with water, wheat grain storage proteins (GSPs) form a network, called gluten, with distinctive cohesiveness and viscoelasticity determined by both the total grain protein concentration (GPC) and the relative composition of the storage protein fractions (MacRitchie, 1999). For instance, flour used for breadmaking must contain enough proteins to ensure the dough has suitable functional properties. However, the high-yielding genotypes selected in wheat breeding programmes tend to have lower GPC (Oury and Godin, 2007; Aguirrezábal ). It is thus important to develop wheat cultivars with well-balanced grain protein compositions to compensate for the low GPC of modern high-yielding accessions (Shewry, 2007; Aguirrezábal ).
The major wheat GSPs are the glutenin and gliadin prolamins, which make up 60–80% of total grain proteins. Gliadins are monomeric proteins that fall into the four classes ω5-, ω1,2-, α/β-, and γ-gliadins (Wieser, 2007). Glutenins are composed of high-molecular-weight (HMW-GS) and low-molecular-weight (LMW-GS) subunits, which form very large macropolymers during grain desiccation (Don ). It is generally accepted that glutenins have a prominent role in strengthening wheat dough by conferring elasticity, while gliadins contribute to the viscous properties of dough by conferring extensibility. The proportions of grain N allocated to each GSP fraction in a particular accession will therefore affect the properties and quality of the resultant flour. Wheat prolamins are encoded by several loci on the group one and six chromosomes (Shewry and Halford, 2003) and many studies have described the relationships between allelic variability at these loci and the functional properties of dough (e.g. Metakovsky ; Branlard ; Eagles ). The major non-prolamin GSPs are α-amylase/trypsin inhibitors and β-amylases (albumins), triticins (globulins), and puroindolines (amphiphilic proteins). Albumin and globulin storage proteins have a limited effect on dough properties but each account for about 10% of the total protein in mature grain (Singh ). Puroindolines are responsible for most of the observed variation in grain hardness and account for 5–10% of total grain protein in mature grain (Morris, 2002).
Conserved cis-motifs have been identified on GSP promoters from wheat, maize (Zea mays L.), barley (Hordeum vulgare L.), and rice (Oryza sativa L.) (Colot ; Zheng ) and several TFs interacting with these motifs have been isolated and characterized (Albani ; Vicente-Carbajosa ; Mena ; Dong ). Full activation of GSP genes is achieved by the synergistic interaction of different combinations of these TFs with the cis-motifs in what resembles a regulatory network (Rubio-Somoza , b; Yamamoto ). It is also possible that other TFs are involved in the control of GSP gene expression, even without directly binding the cis-motifs on their promoters. The amounts of the different classes of GSP per grain have been found to scale allometrically with the total amount of N per grain independently of N availability and environmental conditions (Martre ; Triboi ). Such scaling relationships in biology are often based on regulatory networks (West ; Maritan, 2002; West and Brown, 2005). Therefore, if, as is generally accepted, GSP synthesis is mainly regulated at the transcriptional level (Verdier and Thompson, 2008; Weber ), it can be hypothesized that the allometric scaling of grain N allocation rests on a transcriptional network (Ravel ).
Other mechanisms might affect the relative proportions of gliadin classes and glutenin subunits in grain without being directly involved in GSP gene regulation. Genes related to N assimilation, like those encoding glutamine synthetase and GOGAT enzymes, may affect the quantity of N per grain (Masclaux ; Hirel ; Habash ; Quraishi ) and hence GPC and GSP composition. GSP accumulation can also be controlled at the translational level via the availability of the different amino acids (Weber ). Thus genes involved in grain metabolism, particularly in N, S, and amino acid metabolism or genes coding for transporters, might have an effect on GSP composition (Galili and Höfgen, 2002; Tabe ; Weichert ). Genes known to control grain development and cell fate could indirectly affect GPC. For instance, barley and maize mutants of the supernumerary aleurone layer gene, SAL (Shen ), have impaired grain structures and an altered number of protein-rich aleurone layers.
As the molecular control of cereal GSP quantity, composition, and allocation remains unclear, it is necessary to identify the factors controlling these important traits, particularly at the genetic level. Several quantitative trait loci (QTL) have been identified that are associated with GPC and GSP composition in wheat (Charmet ; Ravel ; Zhang ) and while most of these colocalize with gliadin and glutenin structural loci, seven QTL are potentially involved in the trans-regulation of GSP synthesis. In parallel, gene expression analysis of ditelosomic lines of the spring wheat cultivar Chinese Spring showed that on average eight chromosomal regions were associated with changes in transcript levels of each of the four HMW-GS genes of this genotype (Storlie ).
Association mapping has proven to be an efficient strategy to decipher the genetic basis of complex traits (Ingvarsson and Street, 2010). Here, association mapping was used to analyse and validate candidate genes and to scan the genome for new loci involved in the genetic control of cereal GSP composition. Associations with 38 variables related to GPC, GSP composition, and grain N allocation were calculated, including the scaling parameters of an ecophysiological model predicting the effect of the environment on GPC and GSP composition (Martre , 2006a). The current work discusses the respective contributions of cis and trans loci regulating GSP composition in wheat and compares the genetic control of gliadin and glutenin composition.
Materials and methods
Plant material and field experiments
A collection of 196 accessions (Supplementary Data S1, available at JXB online) was selected from the INRA worldwide bread wheat (Triticum aestivum L.) core collection of 372 accessions (Balfourier ) using passport data (geographic origin and registration date) and previous field evaluation data (Bordes ) to maximize the geographical diversity represented and minimize the risk of crop failure due to lodging (accessions with crop heights between 0.5 and 1.3 m were selected). All seeds were obtained from the INRA Clermont-Ferrand Genetic Resources Centre for Small Grain Cereals (www4.clermont.inra.fr/umr1095_eng/crg). This panel was originally grown in 12 environments (Bordes ). Based on yield and grain protein concentration data, three extreme environments were selected for this study. They comprise two very different growing seasons and sites and two contrasting N treatments at one of the two sites.
The 196 accessions were grown in 2005–2006 at Clermont-Ferrand, France (CF, 45° 46’ N 03° 09’ E, 329 m above sea level) and in 2006–2007 at Le Moulon, France (LM, 48° 10’ N 2° 36’ E, 165 m above sea level). In CF, crops were sown on 8 November 2005 at a density of 150 seed m–2. Experimental plots were nine rows wide (with 17cm inter-row spacing) and 5 m long (plot area 7.65 m2). The weather conditions during the growing season were similar to the long-term average for this site, with conditions close to optimal. From sowing to harvest maturity, the crops received over 500mm of rainfall. On 27 April 2006 and 12 May 2006, the crops received 6 and 7g N m–2, respectively, in the form of ammonium nitrate. The accessions were randomized in two blocks of 121 genotypes for shorter and taller genotypes, respectively. In LM, all the accessions were sown on 26 October 2006 at a density of 220 seed m–2. The experimental design consisted of two main blocks corresponding to the low and high N treatments, and each block was divided into 6 subblocks in which 36 accessions were randomized. Plots were eight rows wide (with 20cm inter-row spacing) and 5.5 m long (plot area 8.8 m2). The high N treatment of the LM+ plot was 6g N m–2 applied on 22 March 2007 and 4 May 2007, while the low N treatment of the LM– plot was 4g N m–2 on the same dates. All crop inputs, including pest, weed, and disease control and potassium and phosphate fertilizers, were applied at levels to minimize yield loss. The cultivation took place in good conditions and except for a short period of mild water deficit during stem extension, the plants received a good supply of water (adding to over 600mm precipitation). All crops were combine-harvested when grain was ripe with a moisture content of 0.15g H2O (g fresh matter)–1.
Grain milling and determining grain dry mass and protein concentration
Single grain dry mass and grain water content were determined by oven drying 10g grain at 80 °C for 48h and reweighing the sample at zero water content. Grain samples (30g) were milled using a Cyclotec sample mill (Foss, Höganäs, Sweden) equipped with a 1-mm mesh screen. Wholemeal flour samples (1g) were oven dried at 80 °C for 48h, then 5mg of dry flour were weighed in tin capsules and the total N concentration was determined with the Dumas combustion method (Association of Analytical Communities International approved method no. 992.23) using a FlashEA 1112 N/Protein Analyzer (Thermo Electron, Waltham, MA, USA). GPC was calculated by multiplying grain N concentration by 5.62 (Mossé ).
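The conversion described above can be sketched in a few lines of Python. This is only an illustration: the function names and the example sample values are hypothetical, and computing the quantity of N per grain as grain dry mass multiplied by N concentration is an assumption about how Ntot was derived, not a statement of the authors' procedure. Only the factor 5.62 is taken from the text.

```python
# Nitrogen-to-protein conversion factor stated in the text.
N_TO_PROTEIN = 5.62


def grain_protein_concentration(n_percent_dry_mass):
    """Convert grain N concentration (% of dry mass) to GPC (% of dry mass)."""
    return n_percent_dry_mass * N_TO_PROTEIN


def n_per_grain(grain_dry_mass_mg, n_percent_dry_mass):
    """Quantity of N per grain (mg), assuming Ntot = dry mass x N concentration."""
    return grain_dry_mass_mg * n_percent_dry_mass / 100.0


# Hypothetical sample: 40 mg grain containing 2.5% N on a dry-mass basis.
gpc = grain_protein_concentration(2.5)
ntot = n_per_grain(40.0, 2.5)
print(f"GPC = {gpc:.2f}% of dry mass, Ntot = {ntot:.2f} mg per grain")
```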
Sequential extraction and quantification of the non-prolamin, gliadin, and glutenin protein fractions
The sequential extraction procedure of Osborne (1907) adapted by Triboi was modified as follows. Each 2ml tube contained one stainless steel bead (5mm diameter) and samples were stirred by placing the tubes on a rotating wheel (40rpm) during each extraction and washing step. The non-prolamin protein fraction was extracted for 30min at 4 °C from 100mg wholemeal flour with 1.5ml of 50mM phosphate buffer (pH 7.8) containing 0.1M NaCl. After centrifugation for 10min (18 000 g) at 4 °C, the supernatant was collected and the pellet was washed twice for 10min each with 1.5ml of the same buffer. After centrifugation in the same conditions, all supernatants were pooled. The same steps were used to extract the gliadin protein fraction from the previous pellet with 70% (v/v) ethanol. Finally, the glutenin protein fraction was extracted in 50mM borate buffer (pH 8.5) containing 2% SDS (w/v) and 1% dithiothreitol (w/v). The supernatants (80 μl) of each protein fraction were oven dried overnight at 60 °C in tin capsules and their total N concentration was determined with the Dumas combustion method as described above. Protein fractions from samples of the same flour from cultivar Récital were extracted, analysed as a control in each of the 21 sets of extractions and used to determine the coefficient of variation for each of the protein fractions, which were 3.48, 5.10, 2.19, 2.61, and 1.96% for the non-prolamin, gliadin, and glutenin protein fractions, storage proteins, and total proteins, respectively.
Separation and quantification of the gliadin protein classes
Gliadin classes were separated and quantified by HPLC using an Agilent 1290 Infinity LC system (Agilent Technologies, Santa Clara, CA, USA). The gliadin extracts (1ml) were filtered through regenerated cellulose syringe filters (0.45 μm pore diameter, UptiDisc, Interchim, Montluçon, France) and 4 μl were injected onto a C8 reversed-phase Zorbax 300 StableBond column (2.1×100mm, 3.5 μm, 300 Å, Agilent Technologies) maintained at 50 °C. Solvent A was 0.1% trifluoroacetic acid in ultra-pure water and solvent B was 0.09% trifluoroacetic acid in acetonitrile. Gliadin classes were separated by using a linear gradient from 24 to 50% solvent B in 13min at a 1ml min–1 flow rate. Proteins were detected by UV absorbance at 214nm. After the gradient, the column was washed with 80% solvent B for 2min and then equilibrated at 24% solvent B for 2min, at the same flow rate. Chromatograms were processed with Agilent Technologies ChemStation software. The signal obtained from a blank injection (4 μl) of 70% (v/v) ethanol was subtracted from the chromatograms before integrating the data. The HPLC peaks corresponding to each of the four gliadin classes were identified as in Wieser . The quantity of each gliadin class as a percentage of total gliadin was calculated by dividing the areas under each HPLC peak by the total area under the chromatogram trace. The quantity of each gliadin class per grain was calculated by multiplying the proportion of each gliadin class in total gliadin by the total quantity of gliadin per grain as quantified by Dumas analysis.
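The two-step calculation described above (peak-area proportions, then scaling by the Dumas gliadin quantity) can be sketched as follows. The function name, the dictionary keys, and the peak areas are hypothetical, and the sketch assumes that the four class areas together make up the total area under the chromatogram trace.

```python
def gliadin_class_quantities(peak_areas, total_gliadin_n):
    """Split the total gliadin N per grain (mg) among gliadin classes.

    peak_areas maps each class name to its blank-corrected HPLC peak area.
    Assumption: the listed class areas sum to the total chromatogram area.
    """
    total_area = sum(peak_areas.values())
    if total_area <= 0:
        raise ValueError("total peak area must be positive")
    return {
        name: (area / total_area) * total_gliadin_n
        for name, area in peak_areas.items()
    }


# Hypothetical peak areas (arbitrary units) and 0.24 mg gliadin N per grain.
areas = {"omega5": 5.0, "omega1,2": 10.0, "alpha/beta": 50.0, "gamma": 35.0}
for name, quantity in gliadin_class_quantities(areas, 0.24).items():
    print(f"{name}: {quantity:.3f} mg N per grain")
```

By construction, the class quantities sum back to the total gliadin quantity per grain.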
Separation and quantification of glutenin subunits
Glutenins were extracted following the method of Fu and Sapirstein (Fu and Kovacs, 1999) modified as follows. Wholemeal flour samples (30mg) were stirred for 15min at room temperature with 1ml of 80mM Tris-HCl buffer (pH 7.5) containing 50% (v/v) propan-1-ol. After centrifugation (15,900 g) for 10min at 15 °C, the supernatant was discarded and the pellet was dispersed by sonication (model 75038, Ultrasonic Processor, Sonics and Materials, Newtown, CT, USA) for 15 s at 30% maximum power in 600 μl of Tris-HCl buffer containing 2% (w/v) SDS and 1% (w/v) DTT. The mixture was incubated at 60 °C for 30min then centrifuged for 10min (12,500 g) at 20 °C. High- and low-molecular-weight glutenin subunits were then separated and quantified on a protein microfluidics chip using a LabChip 90 System (Caliper Life Sciences, Hopkinton, MA, USA) as described previously (Rhazi ). The HMW-GS to LMW-GS ratio (HMW/LMW) was calculated as the ratio of the fluorescence intensities of the two types of subunits. The quantities of LMW-GS and HMW-GS per grain were calculated by dividing the total quantity of glutenin per grain as determined by Dumas analysis by (1 + HMW/LMW) and (1 + LMW/HMW), respectively.
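The partition of total glutenin into its two subunit types follows directly from the ratio, as the short sketch below illustrates. The function name and the example values are hypothetical; the two divisions are those stated in the text.

```python
def partition_glutenin(total_glutenin_n, hmw_to_lmw):
    """Return (HMW-GS, LMW-GS) quantities per grain from the HMW/LMW ratio.

    LMW-GS = total / (1 + HMW/LMW) and HMW-GS = total / (1 + LMW/HMW).
    """
    if hmw_to_lmw <= 0:
        raise ValueError("HMW/LMW ratio must be positive")
    lmw = total_glutenin_n / (1.0 + hmw_to_lmw)
    hmw = total_glutenin_n / (1.0 + 1.0 / hmw_to_lmw)
    return hmw, lmw


# Hypothetical grain with 0.40 mg glutenin N and an HMW/LMW ratio of 0.25.
hmw, lmw = partition_glutenin(0.40, 0.25)
print(f"HMW-GS = {hmw:.2f} mg, LMW-GS = {lmw:.2f} mg")
```

The two quantities sum to the total glutenin quantity and their ratio equals the measured HMW/LMW, which is a convenient consistency check.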
Phenotypic data analysis
All statistical and regression analyses were done using R 2.12.2 for Windows (R Development Core Team, 2007). Environmental and genetic differences in grain dry mass (GDM), total quantity of grain nitrogen (Ntot), GPC, and grain protein composition were analysed using the Kruskal–Wallis nonparametric rank test (α = 0.01) followed by a Behrens–Fisher multiple comparison test.
Allometric scaling parameters for grain protein allocation were determined by fitting data to a power function equation:

$$F_i = \alpha N^{\beta}$$

where $F_i$ is the quantity of N per grain of the protein fraction considered (mg grain–1), $N$ is the total quantity of N or GSP per grain (mg grain–1), $\alpha$ is the scaling coefficient (mg grain–1), and $\beta$ is the scaling exponent (dimensionless). For gliadins, glutenins, and GSPs, the allocation scaling parameters were computed from Ntot, whereas for the glutenin subunits and gliadin classes they were computed from the total quantity of GSPs per grain. The power function equation was fitted either to the 196 accessions independently for each of the three environments or for each accession across the three environments, using log-transformed data and standard major axis regression (Warton ) with the R package smatr (Warton ). The residuals to the regression lines ($R_i$) were computed on the log-transformed data, where the intercept of the regression line is $\log(\alpha)$:

$$R_i = 0.5 \times \left(\log(F_i) - \beta \log(N) - \log(\alpha)\right) \times \sqrt{1 + 1/\beta^{2}}$$
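A standard major axis fit of the power function on log-transformed data can be sketched in plain Python as below. This is not the smatr implementation used in the study; it is a minimal illustration under stated assumptions: base-10 logarithms (the text does not specify the base), the scaling coefficient entering the log-log residual as log(α), and hypothetical function names and data.

```python
import math


def sma_fit(x, y):
    """Standard major axis fit of y on x; returns (slope, intercept, r)."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need at least two paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((v - mean_x) ** 2 for v in x)
    syy = sum((v - mean_y) ** 2 for v in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must both vary")
    r = sxy / math.sqrt(sxx * syy)
    # SMA slope: ratio of the standard deviations, signed like the covariance.
    slope = math.copysign(math.sqrt(syy / sxx), sxy)
    intercept = mean_y - slope * mean_x
    return slope, intercept, r


def fit_allometry(n_total, fraction):
    """Fit F = alpha * N**beta by SMA on log10 data; returns (alpha, beta, r)."""
    log_n = [math.log10(v) for v in n_total]
    log_f = [math.log10(v) for v in fraction]
    beta, log_alpha, r = sma_fit(log_n, log_f)
    return 10.0 ** log_alpha, beta, r


def allometric_residuals(n_total, fraction, alpha, beta):
    """Residuals to the fitted line, scaled as in the residual formula of the text."""
    scale = 0.5 * math.sqrt(1.0 + 1.0 / beta ** 2)
    return [
        scale * (math.log10(f) - beta * math.log10(n) - math.log10(alpha))
        for n, f in zip(n_total, fraction)
    ]


# Hypothetical data lying exactly on F = 0.25 * N**1.37.
n_values = [0.6, 0.8, 1.0, 1.2]
f_values = [0.25 * n ** 1.37 for n in n_values]
alpha, beta, r = fit_allometry(n_values, f_values)
print(f"alpha = {alpha:.3f}, beta = {beta:.3f}, r = {r:.3f}")
```

Unlike ordinary least squares, the standard major axis slope treats both variables symmetrically, which suits allometric data where both the fraction and the total quantity of N are measured with error.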
Genotyping wheat accessions
Genotype data for the 196 accessions used in this study were already available for 578 diversity array technology (DArT, Triticarte, Canberra, Australia; www.triticarte.com.au), 282 simple sequence repeat, and 13 single-nucleotide polymorphism (SNP) markers (Bordes ). The current work developed 193 additional SNP markers in 60 candidate genes: 13 genes involved in N or S assimilation/metabolism, six genes that control grain development and are likely to regulate the expression of GSP genes in both cereals and dicots (Verdier and Thompson, 2008), three genes modifying grain hardness, five HMW-GS genes, one LMW-GS gene, 13 wheat orthologues of TFs known to control barley hordein expression (Rubio-Somoza ; Moreno-Risueno ) or Arabidopsis thaliana seed storage globulins (Verdier and Thompson, 2008), and 19 putative TFs with expression patterns during grain development that resemble GSP gene expression (Romeuf, 2010).
For each accession, genomic DNA was extracted from leaves harvested from a pool of six 3-week-old seedlings using the BioSprint 96 DNA Plant kit (Qiagen, Hilden, Germany). Out of 193 additional SNP markers (Supplementary Data S1), 149 were included in a set of 384 SNPs typed in the GoldenGate assay (Illumina, Cambridge, UK) according to the manufacturer’s recommendations. The 44 remaining markers could not be typed with this assay for technical reasons, for instance because of the presence of polymorphisms within flanking sequences. To type them, either simplex technologies based on an allele-specific primer were used or a few gene fragments with a high level of polymorphism were directly sequenced.
Genetic map and association analysis
The map used as reference was the consensus map described by Bordes built with MetaQTL software (Veyrieras ) using published data and the reference map of Somers . All the novel SNP markers developed during this study were mapped based on their linkage disequilibrium calculated as r² values (Weir, 1996) with TASSEL version 2.1 software (Bradbury ). They were placed close to the DArT or simple sequence repeat markers with which the linkage disequilibrium was the highest.
The 578 DArT markers were used to investigate the population structure of the entire core collection with the STRUCTURE software program (Pritchard ). For association analyses, rare alleles (<2.5% of all alleles at a locus) were considered as missing data. Like the core collection, the panel of 196 accessions was also structured into five groups corresponding to the ancestral geographical groups, so the results from the core collection structure were used in the analysis. Associations between markers and phenotypic traits were tested as described by Bordes using the general linear model (GLM) implemented in TASSEL version 2.0.1. The Q matrix, which gives the contribution of a genotype to each of the five inferred ancestor groups, was introduced in the model as a covariate to control for the structure of the collection used and avoid false positives (GLM–Q model). Two main effects, marker and environment, and their interaction were included in the model. An association was considered significant when the genotype × environment interaction was not significant (P > 0.05) and the P-value of the marker effect was <0.001. This analysis was extended by analysing each of the three environments using the GLM–Q model including only the marker effect. In this case, associations were considered as statistically significant when the P-value of the genotype effect was <0.001 for at least two environments. Maps and associated regions were represented with the MapChart version 2.1 software program (Voorrips, 2002).
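The rare-allele filter and the two significance rules described above can be expressed as simple decision functions. The sketch below is illustrative only: it does not reproduce the GLM itself or any TASSEL functionality, the function names and example values are hypothetical, and the P-values are assumed to come from a model fitted elsewhere.

```python
from collections import Counter


def mask_rare_alleles(calls, min_freq=0.025):
    """Set alleles below min_freq at a locus to None (treated as missing)."""
    counts = Counter(c for c in calls if c is not None)
    total = sum(counts.values())
    if total == 0:
        return list(calls)
    return [
        c if c is not None and counts[c] / total >= min_freq else None
        for c in calls
    ]


def significant_across_environments(p_marker, p_interaction,
                                    marker_threshold=0.001,
                                    interaction_threshold=0.05):
    """Joint model rule: marker effect significant, interaction not significant."""
    return p_interaction > interaction_threshold and p_marker < marker_threshold


def significant_within_environments(p_values, marker_threshold=0.001,
                                    min_environments=2):
    """Per-environment rule: marker effect significant in at least two environments."""
    hits = sum(1 for p in p_values if p < marker_threshold)
    return hits >= min_environments


# Hypothetical locus: allele "G" is carried by 2 of 100 accessions (2% < 2.5%).
calls = ["A"] * 98 + ["G"] * 2
masked = mask_rare_alleles(calls)
print(masked.count(None), "calls set to missing")
print(significant_across_environments(p_marker=5e-4, p_interaction=0.20))
print(significant_within_environments([1e-4, 5e-4, 0.2]))
```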
Results
To investigate the association between accession genotypes and wheat storage protein phenotypes, 196 accessions were selected from the INRA worldwide wheat core collection (Balfourier ). The resulting subcore collection consisted of accessions originating from 38 different countries and included landraces and old cultivars dating from the nineteenth century and the first half of the twentieth century and modern varieties bred after 1960. These accessions were grown in three different environments in the field at Clermont-Ferrand in 2006 under high N conditions (CF+) and at Le Moulon in 2007 under high (LM+) and low (LM–) N conditions. Single GDM and GPC were determined and Ntot was calculated for each accession in each environment. The gliadin and glutenin protein fractions were extracted sequentially and then quantified by elemental analysis. ω5-, ω1,2-, α/β-, and γ-gliadins were separated by reversed-phase HPLC, and LMW-GS and HMW-GS were separated by microfluidics electrophoresis.
Environmental and genetic deviations from allometric grain nitrogen allocation relationships
Overall the GDM, Ntot, and GPC of the accessions were significantly different between the three environments, and for a given environment they were different between accessions (Table 1). Under high N conditions, median GPC was 10% higher for CF+ than for LM+, but Ntot was similar, so the observed variations of GPC between these two environments were mainly due to variations in GDM. In Le Moulon, all the accessions were grown under both high and low N conditions and median GPC was 10% lower for LM– than for LM+ reflecting the lower (–22%) Ntot for LM–. The genetic variability for GDM, Ntot, and GPC was much higher for LM– than for LM+.
Table 1. Single grain dry mass, quantity of N per grain (Ntot), and grain protein concentration (GPC) for 196 accessions of the INRA worldwide wheat core collection grown in the field in 2006 at Clermont-Ferrand under high N condition and in 2007 at Le Moulon under high and low N conditions
Values are median (5–95% quantile range). GDM, grain dry mass; GPC, grain protein concentration; CF, Clermont-Ferrand; LM, Le Moulon; +, high N condition; –, low N condition. P-values from the Kruskal–Wallis rank sum test are given for the environment and genetic effects. Different letters within a column indicate significantly different values (P < 0.01) in the Behrens–Fisher post-hoc test.

| Environment | GDM (mg grain–1) | Ntot (mg grain–1) | GPC (% of GDM) |
|---|---|---|---|
| CF+ | 37.4 (30.4–42.8)a | 1.033 (0.74–1.32)a | 15.7 (12.9–19.1)a |
| LM+ | 43.8 (35.4–52.3)b | 1.06 (0.82–1.36)a | 14.1 (11.5–16.9)b |
| LM– | 37.9 (27.0–46.9)a | 0.83 (0.61–1.16)b | 12.7 (10.2–15.9)c |
| P-value, environment | 3.9×10^–36 | 2.7×10^–34 | 1.2×10^–37 |
| P-value, genotype | 2.3×10^–8 | 5.9×10^–9 | 4.9×10^–9 |
The quantities of GSP, gliadin, and glutenin per grain showed very significant environmental and genetic variations (Table 2). The percentages of gliadin and glutenin contributing to Ntot and the gliadin to glutenin ratios were similar in the three environments, but showed very significant genetic variations. The observed data were then fitted to a power model, which described the allometric scaling relationship between total grain N and how much N is allocated to an individual protein fraction. For gliadins, glutenins, and GSPs, the scaling allocation parameters (i.e. the scaling coefficient and the scaling exponent) were computed with respect to Ntot. However, as distinct pools of N are allocated to non-prolamin proteins and GSPs during grain filling (Martre ), it was thought more pertinent to compute the scaling allocation parameters for the glutenin subunits and gliadin classes from the total quantity of GSP per grain.
Table 2. Grain storage protein, gliadin, and glutenin as quantity per grain and percentage of total grain N and the gliadin to glutenin ratio for 196 wheat accessions grown in the field in 2006 at Clermont-Ferrand under high N condition and in 2007 at Le Moulon under high and low N conditions
Values are median (5–95% quantile range). GSP, grain storage protein; CF, Clermont-Ferrand; LM, Le Moulon; +, high N condition; –, low N condition. P-values from the Kruskal–Wallis rank sum test are given for the environment and genetic effects. Different letters within a column indicate significantly different values (P < 0.01) in the Behrens–Fisher post-hoc test.

| Environment | GSP N (mg grain–1) | Gliadin N (mg grain–1) | Glutenin N (mg grain–1) | GSP (% of total grain N) | Gliadin (% of total grain N) | Glutenin (% of total grain N) | Gliadin to glutenin ratio |
|---|---|---|---|---|---|---|---|
| CF+ | 0.652 (0.481–0.840)a | 0.238 (0.158–0.344)a | 0.407 (0.311–0.523)a | 63.8 (58.0–68.0)a | 23.4 (19.6–28.3)a | 40.3 (35.3–44.3)a | 0.58 (0.47–0.76)a |
| LM+ | 0.675 (0.529–0.872)a | 0.247 (0.175–0.342)a | 0.430 (0.346–0.561)b | 64.0 (57.3–67.9)a | 23.2 (19.2–27.1)a | 40.6 (35.5–44.7)a | 0.568 (0.46–0.73)a |
| LM– | 0.522 (0.367–0.742)b | 0.191 (0.121–0.277)b | 0.331 (0.238–0.454)c | 62.7 (56.8–66.9)b | 23.0 (18.6–27.2)a | 39.9 (35.2–43.6)a | 0.571 (0.45–0.72)a |
| P-value, environment | 1.1×10^–34 | 5.5×10^–24 | 2.8×10^–38 | 2.5×10^–4 | 0.025 | 0.028 | 0.16 |
| P-value, genotype | 5.9×10^–9 | 2.6×10^–14 | 1.1×10^–6 | 6.1×10^–9 | 2.6×10^–18 | 3.8×10^–19 | 3.9×10^–23 |
The genetic variations in the quantity of GSP, gliadin, and glutenin per grain were well explained by the variations in Ntot (Fig. 1) and the power function equation explained 75–94% of the observed variation (Supplementary Table S1). For these three protein fractions, the scaling exponents of the power relationships were not different across the environments, but there was a shift along the common slope reflecting the differences in Ntot between the environments. By contrast, only the scaling coefficients for gliadins were different between environments (Supplementary Table S1). For glutenins, the scaling exponent was not significantly different from 1 (i.e. the amount of glutenin varied in direct proportion to Ntot). In contrast, for gliadins, the scaling exponent was 1.37±0.013, indicating that the amount of gliadin increased disproportionally with Ntot. The difference in the scaling exponent between gliadins and glutenins is consistent with values reported previously for the cultivar Thésée studied in a wide range of environments (Martre ).
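To make the meaning of the scaling exponents concrete, the sketch below computes how a protein fraction changes when Ntot changes by a given factor under the power function. The doubling of Ntot is a hypothetical scenario chosen for illustration; the function names are likewise hypothetical, and only the exponents 1 (glutenins) and 1.37 (gliadins) come from the text.

```python
def fraction_fold_change(n_ratio, beta):
    """Fold change in a protein fraction when Ntot is multiplied by n_ratio."""
    return n_ratio ** beta


def share_fold_change(n_ratio, beta):
    """Fold change in the fraction's share of Ntot for the same change in Ntot."""
    return n_ratio ** (beta - 1.0)


# Hypothetical doubling of the total quantity of N per grain.
for name, beta in (("glutenin", 1.0), ("gliadin", 1.37)):
    quantity = fraction_fold_change(2.0, beta)
    share = share_fold_change(2.0, beta)
    print(f"{name}: quantity x{quantity:.2f}, share of Ntot x{share:.2f}")
```

With an exponent of 1 the share of Ntot is unchanged, whereas an exponent of 1.37 gives roughly a 2.58-fold increase in quantity for a doubling of Ntot, so the gliadin share of grain N rises as grains accumulate more N.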
Fig. 1.
Quantities of grain storage protein (GSP, A), gliadin (B), and glutenin (C) per grain versus the total quantity of N per grain (Ntot) for 196 accessions of the INRA worldwide wheat core collection grown in the field in 2006 in Clermont-Ferrand under high N conditions (CF+; black symbols and continuous line) and in 2007 in Le Moulon under high (LM+; red triangle and dot-dashed line) and low (LM–; blue squares and dashed line) N conditions. Lines are standard major axis regressions of log-transformed data (power function equation). Statistics of the regressions are given in Supplementary Table S1.
The genetic variability of the scaling exponents and coefficients of the allometric allocation of grain N to GSPs, gliadins, and glutenins was determined by fitting the power model for each accession across the three environments (Fig. 2). A large genetic variability was found for the scaling coefficients for the three protein fractions, while the scaling exponents were not statistically different among the accessions (Table 3). The scaling coefficients correlated with the percentages of GSP, gliadin, and glutenin in Ntot (Supplementary Table S2 and Supplementary Data S1). In contrast, the scaling exponents did not correlate with the quantities or proportions of the different protein fractions. The gliadin and glutenin scaling exponents only correlated with the GSP scaling exponent (r = 0.7).
Fig. 2.
Allometric relationships between the quantities of grain storage protein (GSP, A), gliadin (B), and glutenin (C) per grain versus the total quantity of N per grain (Ntot) for 163 (A), 143 (B), and 172 (C) accessions of the INRA worldwide wheat core collection. Only accessions for which the P-value of the allometric relationship was <0.3 and the range of variation of the protein fractions across the three environments was >3 × the coefficient of variation (or >1.5 × for gliadins) are plotted. Lines were fitted to log-transformed data (power function equation) using standard major axis regressions. Medians and quantiles of the scaling coefficients and exponents are given in Table 3.
Table 3.
Scaling exponents and coefficients of the allometric relationships between the quantity of grain storage protein, gliadin, and glutenin as quantity per grain and percentage of total grain N
Values are median (5–95% quantile range) for 163, 143, and 172 wheat accessions for GSPs, gliadins, and glutenins, respectively. GSP, grain storage protein; Ntot, total quantity of N per grain. In calculating the medians and quantiles, accessions were only included if the P-value of the allometric relationship was <0.3 and the range of variation of the protein fractions across the three environments was >3 × (or >1.5 × for gliadins) the coefficient of variation. For each accession, standard major axis regressions were fitted across the three environments on log-transformed data. The probabilities (P-values) that the scaling exponents or coefficients are not significantly different among the accessions are given. Fitted regressions are shown in Fig. 2.
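| | Scaling exponent (dimensionless): GSP vs. Ntot | Scaling exponent (dimensionless): Gliadin vs. Ntot | Scaling exponent (dimensionless): Glutenin vs. Ntot | Scaling coefficient (mg grain⁻¹): GSP vs. Ntot | Scaling coefficient (mg grain⁻¹): Gliadin vs. Ntot | Scaling coefficient (mg grain⁻¹): Glutenin vs. Ntot |
|---|---|---|---|---|---|---|
| Median (5–95% quantile range) | 1.08 (0.81–1.40) | 1.20 (0.66–1.78) | 1.05 (0.75–1.53) | 0.637 (0.573–0.675) | 0.233 (0.202–0.269) | 0.403 (0.353–0.444) |
| P-value | 0.98 | 0.33 | 0.99 | <0.001 | <0.001 | <0.001 |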
Fig. 3.
Quantities of high- (HMW-GS, A) and low- (LMW-GS, B) molecular-weight glutenin subunits, and ω5- (C), ω1,2- (D), α/β- (E), and γ-gliadins (F) per grain versus the quantity of grain storage protein (GSP) per grain for 196 accessions of the INRA worldwide wheat core collection grown in the field in 2006 at Clermont-Ferrand under high N conditions (CF+; black symbols and continuous line) and in 2007 at Le Moulon under high (LM+; red triangle and dot-dashed line) and low (LM–; blue squares and dashed line) N conditions. In C and D, insets show the data with expanded x and y axes. Lines are standard major axis regressions fitted to log-transformed data (power function equation). Statistics of the regressions are given in Supplementary Table S2.
The proportions of glutenin subunits and gliadin classes in total GSP were different in the three environments (Table 4). The percentages of each gliadin class making up the total gliadin fraction, the gliadin to glutenin ratios, and the HMW-GS to LMW-GS ratios were all highly correlated (|r| > 0.75) to the percentage of the corresponding fractions in Ntot or GSP (Supplementary Table S3 and Supplementary Data S1). The proportion of γ-gliadin in either total GSP or total gliadin, the percentage of HMW-GS in total GSP and the HMW-GS to LMW-GS ratio were not statistically different among the accessions (Table 4).
Table 4.
Glutenin and gliadin composition for 196 wheat accessions grown in the field in 2006 at Clermont-Ferrand under high N condition and in 2007 at Le Moulon under high and low N conditions
Values are median (5–95% quantile range). LMW-GS, low-molecular-weight glutenin subunits; HMW-GS, high-molecular-weight glutenin subunits; GSP, grain storage protein; CF, Clermont-Ferrand; LM, Le Moulon; +, high N condition; –, low N condition. P-values from the Kruskal–Wallis rank sum test are given for the environment and genetic effects. Different letters within a column indicate significantly different values (P < 0.01) in the Behrens–Fisher post-hoc test.
Quantity of N per grain (mg N grain⁻¹)

| Environment | LMW-GS | HMW-GS | ω5-Gliadin | ω1,2-Gliadin | α/β-Gliadin | γ-Gliadin |
|---|---|---|---|---|---|---|
| CF+ | 0.209 (0.149–0.317) a | 0.187 (0.135–0.264) a | 0.008 (0.002–0.017) a | 0.022 (0.011–0.043) a | 0.120 (0.076–0.183) a | 0.084 (0.060–0.119) a |
| LM+ | 0.264 (0.182–0.377) b | 0.165 (0.107–0.231) b | 0.006 (0.002–0.012) b | 0.019 (0.010–0.035) b | 0.110 (0.074–0.164) a | 0.111 (0.077–0.146) b |
| LM– | 0.200 (0.126–0.285) a | 0.133 (0.083–0.197) c | 0.004 (0.001–0.010) c | 0.012 (0.005–0.026) c | 0.081 (0.050–0.125) b | 0.092 (0.058–0.131) a |
| P-value, environment | 7.3×10⁻³⁰ | 1.5×10⁻³² | 4.6×10⁻²² | 4.2×10⁻³³ | 2.5×10⁻³⁴ | 1.2×10⁻²³ |
| P-value, genotype | 5.3×10⁻⁷ | 0.056 | 9.6×10⁻¹³ | 2.5×10⁻¹² | 4.4×10⁻¹¹ | 8.8×10⁻¹⁶ |

Percentage of total grain storage protein (% GSP)

| Environment | LMW-GS | HMW-GS | ω5-Gliadin | ω1,2-Gliadin | α/β-Gliadin | γ-Gliadin |
|---|---|---|---|---|---|---|
| CF+ | 33.8 (25.6–41.9) a | 29.5 (20.6–36.8) a | 1.2 (0.3–2.4) a | 3.4 (1.9–5.8) a | 18.8 (14.6–23.7) a | 13.1 (10.8–16.7) a |
| LM+ | 39.5 (29.6–47.8) b | 24.1 (15.7–33.7) b | 0.9 (0.3–1.7) b | 2.8 (1.7–4.7) b | 16.5 (12.5–21.0) b | 15.9 (13.4–18.7) b |
| LM– | 37.8 (29.4–48.1) b | 25.6 (17.1–34.1) b | 0.8 (0.3–1.8) b | 2.25 (1.3–4.1) c | 15.5 (12.0–20.3) c | 17.2 (14.5–20.3) c |
| P-value, environment | 3.5×10⁻²² | 5.8×10⁻²⁰ | 3.8×10⁻¹⁴ | 1.1×10⁻²³ | 4.8×10⁻²⁵ | 9.3×10⁻⁶³ |
| P-value, genotype | 9.0×10⁻⁴ | 0.014 | 1.8×10⁻¹⁶ | 1.3×10⁻¹⁸ | 1.5×10⁻¹⁷ | 0.076 |

Glutenin subunit ratio and gliadin classes as percentages of total gliadin

| Environment | Ratio of HMW-GS to LMW-GS | ω5-Gliadin (% of total gliadin) | ω1,2-Gliadin (% of total gliadin) | α/β-Gliadin (% of total gliadin) | γ-Gliadin (% of total gliadin) |
|---|---|---|---|---|---|
| CF+ | 0.88 (0.51–1.44) a | 3.10 (0.83–6.44) a | 9.1 (5.9–15.8) a | 51.3 (41.2–58.1) a | 35.8 (28.8–43.2) a |
| LM+ | 0.60 (0.35–1.15) b | 2.45 (0.71–4.86) b | 7.5 (4.9–12.7) b | 45.7 (36.0–52.1) b | 44.1 (37.6–50.4) b |
| LM– | 0.68 (0.37–1.10) b | 2.12 (0.77–4.78) b | 6.3 (3.9–10.7) c | 43.4 (35.1–50.1) c | 47.8 (41.2–54.3) c |
| P-value, environment | 1.5×10⁻²³ | 2.6×10⁻¹² | 3.6×10⁻²⁶ | 2.9×10⁻⁴⁰ | 3.2×10⁻⁷⁴ |
| P-value, genotype | 0.042 | 3.3×10⁻¹⁹ | 1.0×10⁻¹⁷ | 1.1×10⁻¹¹ | 0.35 |
The allometric model explained 61–72% of the genetic variability in the quantity of α/β- and γ-gliadins per grain with respect to GSP (Fig. 3 and Supplementary Table S4). For these gliadin classes, the scaling exponents were similar across environments, but the scaling coefficients were modified by the environment. The genetic variability of ω5-gliadins, which accounted for only 1–6% of the gliadin protein fraction (Table 4), was poorly explained by the quantity of GSP per grain (r² ≈ 0.15, Supplementary Table S3). For HMW-GS and LMW-GS, the scaling exponents were different across the three environments and for LMW-GS the scaling exponent was higher for LM+ than for LM–.
The residuals to the scaling allocation relationships were calculated for the protein fractions for which r² > 0.6 (i.e. GSPs, glutenins, gliadins, and α/β- and γ-gliadins). For each accession, these residuals quantified the difference between the actual quantity of N allocated to a given protein fraction and that predicted from Ntot or GSP values alone. For the different protein fractions, the correlations of the residuals among the environments (r = 0.38–0.81) suggested that they are under genetic control. Thus, although the residuals were strongly correlated (r > 0.81) with the amount of a given protein fraction as a percentage of Ntot or GSP (Supplementary Table S5 and Supplementary Data S1), they were used as variable traits in the association study.
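The residual defined above (observed minus predicted quantity) can be illustrated with a short sketch. It assumes that residuals are taken on the original scale, which the units reported for them (mg N grain–1) suggest but which is not stated explicitly; the function name, coefficient, exponent, and data are hypothetical.

```python
def allometric_residuals(x, y, a, b):
    """Return observed minus predicted values for the model y = a * x**b.

    x: predictor per accession (e.g. Ntot or GSP per grain)
    y: observed quantity of the protein fraction per grain
    a, b: scaling coefficient and exponent of the fitted relationship
    """
    return [yi - a * xi ** b for xi, yi in zip(x, y)]


# Hypothetical values for three accessions (illustration only).
ntot = [1.0, 2.0, 4.0]
glutenin = [0.30, 0.45, 1.00]
residuals = allometric_residuals(ntot, glutenin, a=0.25, b=1.0)
```

A positive residual indicates an accession that allocates more N to the fraction than expected from its Ntot (or GSP) alone, and a negative residual indicates the opposite.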
Genome-wide association of GSP parameters
All of the 38 variables described above were included in the association analysis. The population structure explained 2–14% of the phenotypic variability, depending on the variable, although for GPC it explained as much as 25% of the variability. Because most of the variables were correlated with each other, they were grouped together, giving 11 groups of variables (Fig. 4 and Supplementary Data S1). The GDM, GPC, and Ntot groups each consisted of only one variable. The qFRA group consisted of the quantities per grain of GSP, gliadin, and glutenin, qGli the quantities per grain of the four gliadin classes, and qGlt the quantities per grain of the two types of glutenin subunits. In SPC, the proportions of GSP, gliadin, and glutenin in Ntot, the gliadin to glutenin ratio, and the residuals to the allocation relationships for these fractions were grouped. cGli consisted of the proportions of gliadin classes in total GSP and total gliadin and the residuals to the allocation relationships for α/β- and γ-gliadins, and cGlt of the proportion of glutenin subunits in total GSP and the glutenin fraction. The ASC and ASE groups consisted of the allometric scaling coefficients and exponents, respectively, for GSPs, gliadins, and glutenins. The 196 accessions were genotyped using 873 genetic markers genome wide (Bordes ).
Fig. 4.
Results of genetic mapping in 196 accessions of the INRA worldwide wheat core collection with 883 polymorphic markers for genome-wide screening and 195 polymorphic markers in 51 candidate genes for candidate gene association, 6 glutenin genes and 3 grain hardness genes for 38 variables in 11 groups related to grain protein composition. Approximate genetic distances (cM) are shown on the left of the chromosomes and the names of the markers on the right. Vertical bars on the left of the marker names indicate markers that were mapped at the same position. The positions of centromeres on chromosomes are indicated in red. GSP loci are indicated on the right of group 1 and 6 chromosomes. For clarity, in high-marker density regions some markers are not shown on the map. Chromosome 6D is not shown as no association was found for this chromosome. Groups of variables are: GDM, single grain dry mass; GPC, grain protein concentration; Ntot, total quantity of N per grain; qFRA, quantities of GSP, gliadin, or glutenin fractions per grain; qGli, quantities of each gliadin class per grain; qGlu, quantities of each glutenin subunit per grain; SPC, quantities of GSP, gliadin, or glutenin as percentages of Ntot, gliadin to glutenin ratio, and residuals to the relationships between the quantities of GSP, gliadin, or glutenin per grain and Ntot; cGli, quantities of each gliadin class as percentages of GSP or total gliadin, and residuals to allometric relationships between the quantities of α/β-gliadin or γ-gliadin per grain and the quantity of GSP per grain; cGlt, percentages of each glutenin subunit in GSP and HMW-GS to LMW-GS ratio; ASC, allometric scaling coefficients of the relationships between the quantities of GSP, gliadin, or glutenin per grain and Ntot; ASE, allometric scaling exponent of the relationships between the quantity of GSP, gliadin, or glutenin per grain and Ntot. 
Association maps including all the markers and the associated variables are given in Supplementary Data S1.
Two associated markers were considered as belonging to the same associated locus/region when they were less than 1 cM apart or shared at least one associated group and were less than 3 cM apart. Altogether, 74 loci associated with one or more groups of variables were identified throughout the genome (Fig. 4 and Supplementary Data S1). Forty-four chromosomal regions modified the proportions or allocation coefficients of the gliadins or glutenins and their respective classes or subunits. Among these regions, eight were specifically associated with glutenin composition and seven with gliadin composition, while only five regions, including the 1RS.1BL translocation region (to be discussed further), were associated with both gliadin and glutenin composition. Three regions on chromosomes 3AL, 5BS, and 7BL were associated with the scaling exponent group, but not with any other group of variables. These regions were only associated with the scaling exponent of the glutenin–Ntot relationship. Nine regions were associated with the scaling coefficients, but only two regions on 3BS and 7BS were specifically associated with this group of variables.
As illustrated for the regions shown in Table 5, gliadin and glutenin compositions appeared to be regulated differently. For glutenins, the loci on 3DL and 7DS that strongly affected the HMW-GS to LMW-GS ratio also had an effect on the percentages of GSP and glutenin in Ntot, and the scaling coefficient of the glutenin–Ntot relationship. In contrast, the loci strongly associated with the proportions of gliadin classes in total GSP or gliadin, for example, markers wPt8412 and wPt5408 on 6BS, were distinct from those affecting the proportion of total gliadin in Ntot, in particular marker cfd43 on 2DS.
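The rule used to merge associated markers into loci (less than 1 cM apart, or less than 3 cM apart with at least one shared associated group) can be sketched as below. This is an illustration under stated assumptions rather than the procedure actually implemented: it assumes that markers are compared only within a chromosome and that merging is transitive (single linkage). The `Marker` class, the function names, and the marker names and positions are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Marker:
    name: str
    chrom: str
    pos_cm: float
    groups: frozenset  # groups of variables associated with the marker


def same_locus(m1, m2):
    """Pairwise rule: <1 cM apart, or <3 cM apart and sharing a group."""
    if m1.chrom != m2.chrom:
        return False
    dist = abs(m1.pos_cm - m2.pos_cm)
    return dist < 1.0 or (dist < 3.0 and bool(m1.groups & m2.groups))


def group_loci(markers):
    """Merge markers into loci, applying the pairwise rule transitively."""
    parent = list(range(len(markers)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(markers)):
        for j in range(i + 1, len(markers)):
            if same_locus(markers[i], markers[j]):
                parent[find(i)] = find(j)

    loci = {}
    for i, marker in enumerate(markers):
        loci.setdefault(find(i), []).append(marker.name)
    return sorted(sorted(names) for names in loci.values())


# Hypothetical markers (names and positions are illustrative only).
markers = [
    Marker("m1", "3D", 10.0, frozenset({"cGlt"})),
    Marker("m2", "3D", 10.5, frozenset({"SPC"})),
    Marker("m3", "3D", 12.9, frozenset({"SPC"})),
    Marker("m4", "3D", 20.0, frozenset({"SPC"})),
    Marker("m5", "7D", 10.2, frozenset({"cGlt"})),
]
loci = group_loci(markers)
```

In this example m1 and m2 merge because they are 0.5 cM apart, and m3 joins them through m2 (2.4 cM apart with a shared group), whereas m4 and m5 remain separate loci.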
Table 5.
Percentage of variance explained and P-values for the genetic associations between grain protein composition variables and the markers cfd43, wPt5506, wPt8412/wPt5408, and wPt5765 on chromosomes 2D, 3D, 6B, and 7D, respectively
Associations were calculated using multilocal data analysed with a general linear model including genotype and environment effects and their interactions. GSP, grain storage proteins; HMW-GS, high-molecular-weight glutenin subunit; LMW-GS, low-molecular-weight glutenin subunit. GDM, single grain dry mass; GPC, grain protein concentration; ASC, allometric scaling coefficient; Ntot, total quantity of N per grain; SPC, quantities of GSP, gliadin, or glutenin as percentages of Ntot, gliadin to glutenin ratio, and residuals to the relationships between the quantities of GSP, gliadin, or glutenin per grain and Ntot; cGli, quantities of each gliadin class as percentages of GSP or total gliadin, and residuals to allometric relationships between the quantities of α/β-gliadin or γ-gliadin per grain and the quantity of GSP per grain; cGlt, percentages of each glutenin subunit in GSP and HMW-GS to LMW-GS ratio; qGlt, total quantity of glutenin per grain. Only associations with P < 0.001 are shown.
| Chromosome | Marker | Group of variables | Variable | P-value | Percentage of variance explained (r², %) |
|---|---|---|---|---|---|
| 2DS | cfd43 | SPC | Gliadin to glutenin ratio | 2.9×10⁻⁴ | 9.7 |
| | | | Gliadin (% of Ntot) | 2.5×10⁻⁵ | 10.5 |
| 3DL | wPt5506 | SPC | GSP (% of Ntot) | 1.9×10⁻¹² | 13.6 |
| | | | Residuals of GSP vs. Ntot (mg N grain⁻¹) | 6.4×10⁻¹¹ | 11.6 |
| | | | Glutenin (% of Ntot) | 2.0×10⁻⁷ | 9.4 |
| | | | Residuals of glutenin vs. Ntot (mg N grain⁻¹) | 2.4×10⁻⁸ | 7.7 |
| | | cGlt | HMW-GS to LMW-GS ratio | 5.0×10⁻¹¹ | 8.9 |
| | | | LMW-GS (mg N grain⁻¹) | 1.2×10⁻⁶ | 6.2 |
| | | | LMW-GS (% of GSP) | 7.9×10⁻⁹ | 6.7 |
| | | | HMW-GS (% of GSP) | 7.7×10⁻⁷ | 5.3 |
| | | cGli | Residuals of α/β-gliadin vs. GSP (mg N grain⁻¹) | 2.9×10⁻⁴ | 5.4 |
| | | ASC | Scaling coefficient of GSP vs. Ntot (mg N grain⁻¹) | 5.4×10⁻⁹ | 19.3 |
| | | | Scaling coefficient of glutenin vs. Ntot (mg N grain⁻¹) | 8.1×10⁻⁵ | 8.4 |
| 6BS | wPt8412/wPt5408 | cGli | ω1,2-gliadin (% of GSP) | 5.5×10⁻⁴ | 4.2 |
| | | | ω1,2-gliadin (% of gliadin) | 1.7×10⁻⁵ | 6.6 |
| | | | α/β-gliadin (% of GSP) | 2.6×10⁻⁶ | 7.2 |
| | | | α/β-gliadin (% of gliadin) | 7.4×10⁻⁸ | 8.8 |
| | | | Residuals of α/β-gliadin vs. GSP (mg N grain⁻¹) | 2.6×10⁻⁶ | 7.2 |
| | | | γ-gliadin (% of gliadin) | 7.7×10⁻⁵ | 2.8 |
| 7DS | wPt5765 | SPC | GSP (% of Ntot) | 4.4×10⁻¹¹ | 12.6 |
| | | | Glutenin (% of Ntot) | 5.0×10⁻⁷ | 9.1 |
| | | | Residuals of glutenin vs. Ntot (mg N grain⁻¹) | 4.9×10⁻⁶ | 7.4 |
| | | qGlt | LMW-GS (mg N grain⁻¹) | 4.0×10⁻⁴ | 3.5 |
| | | cGlt | HMW-GS to LMW-GS ratio | 1.4×10⁻⁸ | 7.1 |
| | | | LMW-GS (% of GSP) | 3.4×10⁻⁶ | 4.7 |
| | | | HMW-GS (% of GSP) | 1.2×10⁻⁴ | 3.5 |
| | | ASC | Scaling coefficient of GSP vs. Ntot (mg N grain⁻¹) | 1.8×10⁻⁵ | 11.0 |
| | | | Scaling coefficient of glutenin vs. Ntot (mg N grain⁻¹) | 2.9×10⁻⁴ | 7.2 |
Association with grain storage protein loci and genes
Gliadin and glutenin genes occur in several tightly linked gene clusters, termed blocks, on the homeologous group 1 and group 6 chromosomes, with intrablock recombination being rare (Fig. 4) (Shewry and Halford, 2003). HMW-GSs are encoded at the three homeologous Glu-1 loci, while genes encoding most of the LMW-GS and gliadins are tightly clustered at the three homeologous Glu-3 and Gli-1 loci, respectively. On chromosome 1AS, six markers mapping near Gli-A1 and Glu-A3 loci were associated with GPC, GSP quantity, composition, and allocation, and one marker near Gli-A3 was associated with GPC. Several markers on 1BS were associated with GSP composition with r² values up to 35%. These markers defined three zones: an interval between markers wPt7094 and wPt3753 encompassing the Gli-B1 and Glu-B3 loci; an approximately 1.5-cM region between wPt1911 and wPt1684 encompassing the Gli-B3 locus; and a zone between gwm413 and cfd65. Eighteen accessions from the panel bear a 1RS.1BL translocation (Bordes ), where the whole or part of the wheat chromosome 1BS is replaced by the whole or part of the rye (Secale cereale L.) chromosome 1RS. None of the associations in the latter two zones or with markers wPt0328 and wPt3753 in the first zone were significant when the accessions bearing the 1RS.1BL translocation were removed from the analysis. On group 6 chromosomes, two markers associated with gliadin composition were mapped near Gli-A2 and Gli-B2.
This work developed SNP markers in five HMW-GS genes and one LMW-GS gene. The HMW-GS genes Glu-B1.1 and Glu-B1.2 were associated with the HMW-GS to LMW-GS ratio and the percentage of HMW-GS in total GSP. Glu-D1.2 was associated with GPC and the percentage of gliadin in Ntot. A marker mapped in the vicinity of Glu-D1.2 and the TF gene SPA-D was associated with GPC, the quantity of GSP, gliadin, glutenin, and LMW-GS per grain, the percentage of HMW-GS in GSP, and the HMW-GS to LMW-GS ratio. The LMW-GS gene Glu-A3 was associated with the percentage of glutenin in Ntot, the scaling coefficient of the glutenin–Ntot allometric relationship calculated for each accession across environments, and the residual to this relationship calculated for all accessions.
Candidate gene association for grain storage protein composition
For candidate gene association, this work used 167 SNPs in 51 candidate genes: 13 wheat orthologues of the barley TFs controlling GSP gene expression, 19 TF genes whose expression reaches a maximum at the same time as GSP genes, six genes controlling grain development, and 13 genes encoding N and S assimilation or metabolism enzymes (Supplementary Table S6). Eleven of these candidate genes were associated with at least one group of variables. In addition, six loci that did not include a candidate gene were in strong linkage disequilibrium with a candidate gene.
Six associated loci were strongly linked with six of the 13 wheat orthologues of the barley GSP transcription regulation network (Rubio-Somoza , b; Moreno-Risueno ) for which polymorphic SNPs were found (Supplementary Table S6). MCB1-B on 1BL was associated with the quantity of LMW-GS per grain (Fig. 4). MCB1-A on 1AL and GAMYB-D on 3DL were associated with the HMW-GS to LMW-GS ratio and the percentage of HMW-GS in GSP. MCB1-A was also associated with Ntot and the quantity of GSP and γ-gliadin per grain. SPA-B was strongly associated (r² > 17%, P < 4×10⁻⁵) with the quantity of ω5-gliadin per grain in all environments and with the proportion of ω5-gliadin in GSP in two environments (Supplementary Data S1). Moreover, when N was limiting (LM–), SPA-B was also strongly associated (r² = 18.7%, P = 8×10⁻⁵) with the HMW-GS to LMW-GS ratio and the quantity of LMW-GS per grain and as a percentage of GSP. A marker associated with GPC, Ntot, and the quantities of GSP, total gliadin, γ-gliadin, and LMW-GS per grain was mapped on 5BL in the vicinity of PBF-B (Fig. 4). SAD-A on 6AL was strongly linked with markers associated with GPC and the percentage of GSP in Ntot.
Based on their expression pattern during grain development (Romeuf, 2010), 19 putative candidate TFs and histone- and chromatin-modifying genes were also studied (Supplementary Table S6). DOF19-B on 1BL was associated with Ntot and the quantity of GSP per grain. DREB1-B on 3BS was specifically associated with the scaling coefficient of the gliadin–Ntot relationship (Supplementary Data S1). Moreover, in LM+, DREB1-B was also associated with the percentage of GSP in Ntot (r² = 7.6%, P = 6×10⁻⁵) and the residuals of the GSP–Ntot (r² = 6.8%, P = 2×10⁻⁴) and gliadin–Ntot (r² = 9.9%, P = 8×10⁻⁶) relationships (Supplementary Data S1). On 4BL, PcG3-B was associated with the quantity of γ-gliadin per grain. Finally, NAC22-B and NAC27-B on 7BL and 5BS, respectively, were in strong linkage disequilibrium with two markers associated with the scaling exponent of the glutenin–Ntot relationship. Under low N conditions in LM–, NAC22-B was also associated with the residuals to the glutenin–Ntot relationship and the gliadin to glutenin ratio (Supplementary Data S1). The region on 3DL that was very significantly associated with glutenin composition (Table 5) was in strong linkage disequilibrium with both Dr1-D and GAMYB-D.
Six key genes controlling grain development were also analysed, among which only SAL-B showed significant associations with Ntot and the quantity of GSP, gliadin, glutenin, α/β- and γ-gliadins, and LMW-GS per grain. Of the 13 N and S assimilation/metabolism genes analysed, only FdGOGAT-B (Fig. 4), DeK-A, and DeK-B (Fig. 4) showed significant and consistent cross-environment associations. They were all associated with GPC. FdGOGAT-B was also associated with Ntot and the quantities of GSP, gliadin, glutenin, and α/β- and γ-gliadins per grain. NAM-B1-B, which has been demonstrated to play a key role in determining GPC in wheat (Uauy ), was associated neither with GPC nor with any of the other variables analysed here. However, the association with the marker in this gene that discriminates between functional and non-functional alleles (Hagenblad ) could not be tested, as the frequency of the functional allele was only 0.01. Nevertheless, a marker not far from NAM-B1-B was strongly associated with GPC (r² > 8.1%, P = 4×10⁻⁴) and Ntot (r² > 17%, P = 8×10⁻⁵). GDH2-D on 3DL was in linkage disequilibrium with a marker associated with GPC. On chromosome 3BS, GAD1-B was in strong linkage disequilibrium with a marker associated with GDM and Ntot, the quantity of α/β-gliadin per grain, and the scaling coefficient of the glutenin–Ntot allometric relationship.
Discussion
Wheat GPC and GSP composition are the main determinants of dough functional properties (MacRitchie, 1999). While much work has been done to find genetic factors affecting these properties, very few studies have directly investigated the genetic control of the different proteins responsible for dough quality (Guillaumie ; Charmet ; Zhang ). The current work used association mapping to analyse the genetics of wheat GPC, combining genome-wide association with 873 genetic markers and candidate gene association with 51 candidates, including genes encoding 36 TFs and histone and chromatin modifying proteins. Overall, this allowed the identification of 44 loci associated with GSP composition and allocation. Of these, four included candidate genes and five were in strong linkage disequilibrium with candidate genes.
Loci associated with dough characteristics as measured by mixograph tests have previously been identified using the INRA worldwide wheat core collection (Bordes ), which includes the 196 accessions used in this study. Loci on chromosomes 3DL and 7DS that were the most strongly associated with glutenin composition in this study were also strongly associated with dough characteristics in Bordes . Most of the other genomic regions that were strongly associated with dough characteristics were also associated or in linkage disequilibrium with markers associated with GSP composition in the current work. Several genomic regions controlling GSP synthesis were newly discovered. Somewhat unexpectedly, the results indicate that gliadin and glutenin compositions are genetically controlled by distinct loci and processes. The candidate TF genes were preferentially associated with GSP composition and grain N allocation traits, while N and S assimilation/metabolism genes were preferentially associated with GPC and/or Ntot.
Gliadin and glutenin compositions are controlled by distinct loci
The proportion of gliadin and glutenin in Ntot and the gliadin to glutenin ratio were not modified by the environment, while the proportions of gliadin classes and glutenin subunits in total GSP or, respectively, in total gliadins or total glutenins showed large environmental variations. Therefore, the regulation of GSP composition in response to the environment differs according to whether the allocation considered is of grain N between glutenin and gliadin or of GSP between glutenin subunits and gliadin classes. In this context, it is therefore relevant to analyse independently the SPC, cGli, and cGlt groups of variables. This work hypothesized that the genetic control of the residuals to the scaling relationships was partly independent from that of the fraction proportion. This was confirmed for some fractions by the fact that associations were found with the residual independently of the fraction proportion.
Gliadin (group cGli) and glutenin (group cGlt) compositions were associated with 14 and 18 loci, respectively. Gliadin and glutenin promoters share several cis-motifs known to interact with conserved TFs controlling GSPs in cereals and dicots (van Herpen ; Verdier and Thompson, 2008; Fauteux and Stromvik, 2009; Pistón ). Furthermore, van Herpen showed that an α-gliadin gene promoter had the same expression pattern as an LMW-GS gene promoter. However, this work found that the only loci associated with both gliadin and glutenin compositions were on 1AS and 1BS, where several gliadin and glutenin genes are located, and to a lesser extent the 3DL locus, which was strongly associated with glutenin composition and weakly associated with the residuals to the α/β-gliadin-GSP allometric relationship. Thus, the trans-regulation of gliadin classes and glutenin subunits appears to be controlled by different loci. In good agreement with these results, nucleotide polymorphisms in the promoter region of SPA-A were shown to modify the scaling coefficient of the gliadin–Ntot relationship, while the scaling parameters of the glutenin–Ntot relationship were not modified (Ravel ).
The processes regulating gliadin and glutenin compositions also seem to operate differently, as the loci that affected glutenin composition the most were also associated with the proportion of total glutenin in Ntot, but only one locus (on 1BS close to Gli-B1) was associated with both gliadin composition and the proportion of gliadin in Ntot. These results suggest that the mechanisms regulating HMW-GS and LMW-GS are more independent of each other than those regulating the different gliadin classes. In the promoter region of the wheat HMW-GS gene that has been analysed, the endosperm box has a structure different from those found in wheat LMW-GS and gliadin genes and in other cereal GSP genes (Norre ). Such differences might explain why there is a smaller degree of co-regulation of LMW-GS and HMW-GS than of the gliadin classes. Altogether, these results strongly suggest that the genes regulating glutenin and gliadin compositions are mostly distinct and operate differently.
Parameters of the ecophysiological model of grain nitrogen allocation were associated with several candidate transcription factors
A previous genetic analysis showed that two of the three QTL for the quantity of gliadin classes per grain and four of the six QTL for the quantity of glutenin subunits per grain colocated with QTL for the rate of grain N accumulation (Charmet ). GSP composition can be predicted from scaling laws of grain N allocation (Martre , 2006b), which integrate the effects of the rate and duration of N accumulation on GSP composition (Triboi ). The exponent of the scaling relationship represents the relative variation of a protein fraction with Ntot changes across different environments, while the multiplicative coefficient can be interpreted as the proportion of the fraction in Ntot if it varied proportionally to Ntot. By construction, these parameters integrate the response of GSP genes to the environment (Aguirrezábal ), which means they might be solely genetically controlled. The picture that emerges is that the gene regulatory network involved in the control of the synthesis of GSPs is coordinated such that the grain reacts to the environment in a predictable manner, yielding a meta-mechanism as defined by Tardieu (2003) at the grain level. Analysing the genetic determinants of the parameters of these relationships should improve the understanding of how cereal GSP accumulation is regulated in response to the environment.
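As an illustration of how such scaling parameters can be obtained, the sketch below fits a power law of the form fraction = a × Ntot^b by ordinary least squares on the log-log scale, where a is the scaling coefficient and b the scaling exponent. This is a minimal sketch and not necessarily the regression procedure used in this study; the function name `fit_allometric` and all numerical values are hypothetical and were invented for the example.

```python
import math

def fit_allometric(ntot, fraction):
    """Least-squares fit of fraction = a * ntot**b on the log-log scale.

    Returns (a, b): a is the scaling coefficient, b the scaling exponent.
    """
    xs = [math.log(v) for v in ntot]
    ys = [math.log(v) for v in fraction]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                      # slope of log(fraction) vs log(ntot)
    a = math.exp(mean_y - b * mean_x)  # back-transformed intercept
    return a, b

# Hypothetical values for one accession grown in three environments
# (mg N per grain); invented for illustration, not data from the study.
ntot = [0.8, 1.0, 1.3]
gliadin_n = [0.30 * n ** 1.2 for n in ntot]

a, b = fit_allometric(ntot, gliadin_n)
print(f"coefficient a = {a:.3f}, exponent b = {b:.3f}")
```

When b equals 1 the fraction varies proportionally to Ntot and a is simply its proportion in Ntot, which matches the interpretation of the multiplicative coefficient given above.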
The scaling exponents for the allocation of Ntot to gliadins and glutenins were similar across environments, but showed a large genetic variability. Only for gliadins was the scaling coefficient modified by the environment, and the scaling coefficients of gliadins and glutenins showed no significant genetic variability. None of the scaling coefficients and exponents were associated with the same loci, suggesting that their genetic control is different.
The scaling coefficients were strongly correlated (r > 0.75) with the proportions of the corresponding protein fractions in Ntot and, as a consequence, of the 12 loci associated with the scaling coefficients, eight were associated with both these types of variables. Of the four remaining loci, two were associated only with the scaling coefficients. Interestingly, one of the latter markers was in the candidate TF gene DREB1-B, which was associated with the scaling parameters of the gliadin–Ntot relationship. Although its role in abiotic stress responses in wheat is not clear yet, DREB1 is a putative dehydration response element-binding TF induced by cold, salinity, and drought (Shen ).
In contrast with the scaling coefficients, the scaling exponents were not correlated with any other variable, correlating only among themselves. They were always associated with loci independently of other variables. Only three loci were found to be associated with this model parameter, and all with the glutenin–Ntot relationship parameter. The NAC22-B gene was mapped on chromosome 7BL at the same position as one marker associated with this parameter, and a second locus associated with this parameter was close to another NAC TF gene (NAC27-B) on chromosome 5BS. In silico inference of transcription regulation networks based on expression data has shown that two NAC TFs (NAC27 and NAC18) are highly connected to the TF network that controls GSP gene expression (Vincent ). blastx results indicated that NAC22-B and NAC27-B have >48% sequence identity and >97% sequence coverage with rice LOC_Os01g29840 and LOC_Os01g01470, respectively. The same analysis on Arabidopsis thaliana sequences showed >65% sequence identity and >45% sequence coverage with ANAC058 and ANAC087, respectively. The function of these NAC TFs is not known yet, but in all three species they are highly expressed in the grain during the phase of GSP accumulation (Ooka ; Romeuf, 2010).
Two loci associated with scaling coefficients and all the markers associated with scaling exponents were not associated with any other variable. This means that using the parameters of a model that are independent of the environment allowed this work to identify a level of genetic regulation of GSP synthesis that could not be detected through basic composition data.
cis- and trans-regulating loci for GSP composition
Differences in GSP composition among accessions can be caused by natural variability in the number of GSP genes at each locus (Dong ) or by sequence polymorphisms in their promoters (van Herpen ; Pistón ), which are cis-determinants of GSP composition. Genetic variability in TF genes trans-regulating GSP gene expression (Ravel ) or controlling grain development (Verdier and Thompson, 2008) or in enzymes controlling N and S assimilation/metabolism (Galili and Höfgen, 2002) can also explain changes in GSP composition.
Loci in strong linkage disequilibrium with some of the genes coding for GSPs or including the genes themselves were associated with GSP quantity, composition, and allocation. For some of the GSP loci, no association was detected, most likely because of the lack of suitable polymorphism markers. Other studies have demonstrated that genetic variability in the structural GSP genes has an effect on the quantity and composition of GSPs. In particular, Charmet found that a region encompassing the Glu-B1 locus was associated with the quantity of Glu-A1, Glu-D1x-y, and Glu-B1x per grain, and Zhang showed that the Glu-A3, Glu-B1, and Glu-D1 loci were associated with the proportion of glutenin in flour. However, in all these studies, cis-regulating loci explained only part of the variability in GSP composition. Several of the loci associated with GSP composition are not linked to structural GSP genes, which suggests that trans-regulation mechanisms also play an important role in controlling GSP composition.
There is currently a consensus on the predominant role of transcriptional regulation in controlling GSP accumulation in both cereals and dicots (Verdier and Thompson, 2008; Weber ), so this work included 32 TF genes in the candidate gene association approach. Among the 13 wheat orthologues of barley TFs that control GSP gene expression (Rubio-Somoza , b), SPA-B, MCB1-A, and GAMYB-D were associated with either glutenin or gliadin composition and MCB1-B was associated with the quantities of glutenin and gliadin. SPA-B was associated with the proportions and quantities of gliadin classes, which is in agreement with previous reports of SPA-A and SPA-B haplotypes being associated with dough viscoelasticity (Ravel ). Genetic mapping in barley has shown that PBF is associated with GPC (Haseneyer ), and in the current work PBF-B was in strong linkage disequilibrium with a marker associated with GPC. Among the 19 TFs that reach maximal expression at the same time as GSP genes, DREB1-B was associated with the scaling coefficient of the gliadin–Ntot relationship and NAC22-B and NAC27-B were in strong linkage disequilibrium with markers associated with the scaling exponent of the glutenin–Ntot relationship. It is interesting to note that TFs known to control GSP expression in cereals by binding GSP gene promoters were associated with GSP composition, while TFs with unknown functions were associated with the grain N allocation parameters, indicative of both direct and indirect transcriptional regulation for GSP composition and allocation.
This study investigated the possible regulation of GSPs by assimilate availability. Of 13 candidate genes coding for enzymes involved in the assimilation or metabolism of N and S, Fd-GOGAT, DeK-A, and DeK-B were associated with GPC and/or Ntot, and GDH2-D and GAD1-B were in strong linkage disequilibrium with a marker associated with GPC and/or Ntot. This may suggest that enzymes involved in N and S assimilation and metabolism mainly affect GSP quantity, but not the relative proportions of GSP. However, GAD1-B was also in strong linkage disequilibrium with a marker associated with the scaling coefficient of the glutenin–Ntot allometric relationship.
Identification of potential functional clusters
Part of the preliminary work of this study was to identify SNPs in candidate genes and map these genes on the 21 chromosomes of the wheat genome. This led to the mapping of 35 candidate TF genes by linkage disequilibrium. NAM-B1-B was mapped on chromosome 6BL, although previous studies have shown that in the tetraploid species Triticum turgidum ssp. durum it is located at a centromeric position on 6BS (Uauy ). Similarly, the SNP marker within DREB1-B has been previously mapped on chromosome 3BL close to marker gwm566 (Wei ). In the current work using Chinese Spring ditelosomic lines, these two markers were mapped close together on 3BS (data not shown), so the three homeologous copies of DREB1 were assigned to the short arms of group 3 chromosomes. For SPA, PBF, and GAMYB, the mapping positions were identical to those already reported (Guillaumie ; Ravel ; Haseneyer ). The mapping positions of the other TFs are reported here for the first time. This work noted that 22 TFs were in strong linkage disequilibrium with either another candidate TF or a glutenin gene, forming nine putative functional gene clusters. Furthermore, among these nine clusters, eight colocalized or were in strong linkage disequilibrium with loci significantly associated with GSP composition or allocation, suggesting that GSP composition might be controlled, at least partially, by functional clusters of structural GSPs and trans-regulating TF genes. These results are consistent with recent work that identified numerous islands of coexpressed and/or cofunctional genes on wheat chromosome 3B (Rustenholz ).
The method used in this study to map the markers did not allow accurate estimations of genetic distances to be made. Like the consensus map established by Somers , the length of the map was approximately 2200 cM. The size of the wheat genome is about 16 000 Mb, thus on average 1 cM corresponds roughly to 7.3 Mb. Therefore the physical length of these putative functional clusters might be up to several Mb.
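The average conversion between genetic and physical distance quoted above can be reproduced with a few lines of arithmetic. The sketch below uses only the two approximate figures given in the text; the variable names and the helper `physical_span_mb` are hypothetical, and the result is a genome-wide average that ignores local variation in recombination rate along chromosomes.

```python
genome_size_mb = 16_000  # approximate wheat genome size quoted in the text (Mb)
map_length_cm = 2_200    # approximate genetic map length quoted in the text (cM)

# Genome-wide average physical distance per unit of genetic distance
mb_per_cm = genome_size_mb / map_length_cm

def physical_span_mb(genetic_span_cm):
    """Rough physical size (Mb) of a region spanning genetic_span_cm cM,
    assuming the genome-wide average applies locally."""
    return genetic_span_cm * mb_per_cm

print(f"1 cM corresponds to about {mb_per_cm:.1f} Mb on average")
```

Under this average, a cluster spanning even a fraction of a centimorgan would cover several Mb, which is consistent with the order of magnitude stated above.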
In conclusion, this study demonstrates that association mapping is a powerful tool, not only to identify useful markers for crop selection, but also to dissect the genetic control of complex processes and traits. The analysis of GSP composition led unexpectedly to the identification of distinct trans-regulatory determinants of gliadin and glutenin compositions operating in different ways. This study also investigated the genetic control of the scaling parameters of an ecophysiological model predicting the effect of the environment on GPC and GSP composition. Several TFs of unknown function expressed in grain were involved in this genetic control. Functional studies of these TFs are needed in the future. Four regions very strongly associated with GSP composition were identified. One region on 3DS was in strong linkage disequilibrium with two candidate TFs, but it remains to be determined if these genes are responsible for the associations that were detected. The other three regions were not in linkage disequilibrium with any of the candidate genes studied. Fine mapping of these regions and the analysis of their gene content using information from rice and Brachypodium distachyon (L.) synteny would help to identify more putative candidate genes.
Supplementary material
Supplementary data are available at JXB online.
Supplementary Table S1. Summary statistics of the allometric regression analysis of the quantity of GSP, gliadin, and glutenin per grain versus Ntot.
Supplementary Table S2. Correlations between the scaling exponents and coefficients of the allometric relationships between the quantity of GSP, gliadin, or glutenin per grain and Ntot for each accession across three environments and the percentages of GSP, gliadin, or glutenin in Ntot.
Supplementary Table S3. Correlations between the percentages of ω5-, ω1,2-, α/β-, and γ-gliadins in total gliadin, the gliadin to glutenin ratio or the HMW-GS to LMW-GS ratio and the percentage of protein fractions in Ntot or GSP.
Supplementary Table S4. Summary statistics of the allometric regression analysis of the quantity of HMW-GS, LMW-GS, ω5-, ω1,2-, α/β-, and γ-gliadins per grain versus the quantity of GSP per grain.
Supplementary Table S5. Correlations between the residuals of the relationships between the quantity of GSP, gliadin, or glutenin and either Ntot or the total GSP per grain and the percentage of protein fractions in Ntot or GSP.
Supplementary Table S6. Putative candidate genes for which genetic association with grain protein composition was analysed.
Supplementary Data S1. Association maps for multilocal analysis and each environment, including all the markers and associated variables.