Alexander E Lipka, Michael A Gore, Maria Magallanes-Lundback, Alex Mesberg, Haining Lin, Tyler Tiede, Charles Chen, C Robin Buell, Edward S Buckler, Torbert Rocheford, Dean DellaPenna.
Abstract
Tocopherols and tocotrienols, collectively known as tocochromanols, are the major lipid-soluble antioxidants in maize (Zea mays L.) grain. Given that individual tocochromanols differ in their degree of vitamin E activity, variation in tocochromanol composition and content in grain among diverse maize inbred lines has important nutritional and health implications for enhancing the vitamin E and antioxidant contents of maize-derived foods through plant breeding. Toward this end, we conducted a genome-wide association study of six tocochromanol compounds and 14 of their sums, ratios, and proportions with a 281-line maize inbred association panel that was genotyped for 591,822 SNP markers. In addition to providing further insight into the association between ZmVTE4 (γ-tocopherol methyltransferase) haplotypes and α-tocopherol content, we also detected a novel association between ZmVTE1 (tocopherol cyclase) and tocotrienol composition. In a pathway-level analysis, we assessed the genetic contribution of 60 a priori candidate genes encoding the core tocochromanol pathway (VTE genes) and reactions for pathways supplying the isoprenoid tail and aromatic head group of tocochromanols. This analysis identified two additional genes, ZmHGGT1 (homogentisate geranylgeranyltransferase) and one prephenate dehydratase paralog (of four in the genome), that also modestly contribute to tocotrienol variation in the panel. Collectively, our results provide the most favorable ZmVTE4 haplotype and suggest three new gene targets for increasing vitamin E and antioxidant levels through marker-assisted selection.
Keywords: GWAS; biofortification; candidate gene; vitamin E
Year: 2013 PMID: 23733887 PMCID: PMC3737168 DOI: 10.1534/g3.113.006148
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Figure 1. Tocochromanol biosynthetic pathway in maize grain. Enzymes in red correspond to genes that are within ±250 kb of the associated SNPs identified in our study. Compound abbreviations: HPA, p-hydroxyphenylpyruvic acid; HGA, homogentisic acid; GGDP, geranylgeranyl diphosphate; Phytyl-P, phytyl monophosphate; Phytyl-DP, phytyl diphosphate; MGGBQ, 2-methyl-6-geranylgeranylbenzoquinol; MPBQ, 2-methyl-6-phytylbenzoquinol; DMGGBQ, 2,3-dimethyl-5-geranylgeranylbenzoquinol; DMPBQ, 2,3-dimethyl-5-phytylbenzoquinol; SAM, S-adenosylmethionine. Reactions: 1) p-hydroxyphenylpyruvate dioxygenase (HPPD); 2) homogentisate geranylgeranyl transferase (HGGT1); 3) GGDP reductase (GGDR); 4) phytol kinase (VTE5); 5) unspecified kinase; 6) homogentisate phytyl transferase (VTE2); 7) tocopherol cyclase (VTE1); 8) MPBQ/MGGBQ methyl transferase (VTE3); and 9) γ-tocopherol methyl transferase (VTE4).
Means and ranges (in μg·g−1 seed) for untransformed BLUPs of 20 tocochromanol grain traits evaluated on a maize inbred association panel, and estimated heritability on a line-mean basis, in two summer environments in West Lafayette, Indiana, across 2 years
| Trait | No. Lines | BLUP Mean | BLUP SD | BLUP Range | Heritability Estimate | Heritability SE |
|---|---|---|---|---|---|---|
| γT | 251 | 30.18 | 15.00 | 5.04–85.94 | 0.88 | 0.02 |
| γT3 | 252 | 12.08 | 8.73 | 1.46–55.25 | 0.89 | 0.01 |
| αT | 252 | 8.19 | 5.50 | 0.70–31.35 | 0.91 | 0.01 |
| αT3 | 250 | 7.52 | 3.01 | 2.86–22.38 | 0.87 | 0.02 |
| δT | 251 | 1.11 | 0.63 | 0.22–3.32 | 0.78 | 0.03 |
| δT3 | 250 | 0.59 | 0.68 | 0.09–6.06 | 0.91 | 0.01 |
| Total T3 | 252 | 20.34 | 10.66 | 3.77–74.59 | 0.90 | 0.01 |
| Total T | 252 | 39.95 | 15.05 | 13.16–95.07 | 0.85 | 0.02 |
| Total T/Total T3 | 247 | 2.44 | 1.70 | 0.37–9.34 | 0.92 | 0.01 |
| Total T3 + T | 252 | 60.54 | 18.10 | 25.70–125.14 | 0.83 | 0.02 |
| δT/(γT + αT) | 251 | 0.03 | 0.02 | 0.0027–0.08 | 0.89 | 0.01 |
| δT/γT | 251 | 0.04 | 0.02 | 0.0012–0.18 | 0.91 | 0.01 |
| δT/αT | 248 | 0.28 | 0.35 | 0.013–2.04 | 0.94 | 0.01 |
| γT/(γT + αT) | 251 | 0.75 | 0.17 | 0.18–0.97 | 0.95 | 0.01 |
| δT3/(γT3 + αT3) | 251 | 0.03 | 0.02 | 0.0039–0.22 | 0.94 | 0.01 |
| δT3/γT3 | 251 | 0.05 | 0.04 | 0.0046–0.25 | 0.89 | 0.01 |
| δT3/αT3 | 250 | 0.09 | 0.11 | 0.0080–0.80 | 0.93 | 0.01 |
| γT3/(γT3 + αT3) | 251 | 0.55 | 0.18 | 0.096–0.92 | 0.95 | 0.01 |
| αT/γT | 250 | 0.40 | 0.42 | 0.018–2.21 | 0.88 | 0.01 |
| αT3/γT3 | 251 | 1.24 | 1.43 | 0.18–11.75 | 0.96 | 0.01 |
BLUPs, best linear unbiased predictors; SD, standard deviation of the BLUPs; SE, standard error of the heritabilities; γT, γ-tocopherol; γT3, γ-tocotrienol; αT, α-tocopherol; αT3, α-tocotrienol; δT, δ-tocopherol; δT3, δ-tocotrienol; total T3, total tocotrienols; total T, total tocopherols; total T3 + T, total tocochromanols.
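The 14 derived traits in the table above are simple sums, ratios, and proportions of the six measured compounds. The following is a minimal sketch of how they could be computed for one line. It assumes that total tocopherols and total tocotrienols are the sums of the three measured forms of each; the function name, dictionary keys, and sample values are hypothetical and are not data from the study.

```python
def derived_traits(c):
    """Compute the 14 sums, ratios, and proportions from six compound levels.

    `c` maps hypothetical keys (aT, gT, dT, aT3, gT3, dT3) to levels in ug/g seed.
    Assumption: totals are the sums of the three measured forms.
    """
    aT, gT, dT = c["aT"], c["gT"], c["dT"]
    aT3, gT3, dT3 = c["aT3"], c["gT3"], c["dT3"]
    total_t = aT + gT + dT
    total_t3 = aT3 + gT3 + dT3
    return {
        "Total T3": total_t3,
        "Total T": total_t,
        "Total T/Total T3": total_t / total_t3,
        "Total T3 + T": total_t3 + total_t,
        "dT/(gT + aT)": dT / (gT + aT),
        "dT/gT": dT / gT,
        "dT/aT": dT / aT,
        "gT/(gT + aT)": gT / (gT + aT),
        "dT3/(gT3 + aT3)": dT3 / (gT3 + aT3),
        "dT3/gT3": dT3 / gT3,
        "dT3/aT3": dT3 / aT3,
        "gT3/(gT3 + aT3)": gT3 / (gT3 + aT3),
        "aT/gT": aT / gT,
        "aT3/gT3": aT3 / gT3,
    }

# Hypothetical measurements for a single line (not values from the study).
sample = {"gT": 30.0, "gT3": 12.0, "aT": 8.0, "aT3": 6.0, "dT": 1.0, "dT3": 0.5}
traits = derived_traits(sample)
```

Because each trait was evaluated on a slightly different number of lines, the tabulated means of the derived traits need not equal the same functions applied to the tabulated compound means.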
Figure 2. Comparison of minor allele frequencies (MAFs) for SNPs between temperate and tropical lines in the maize association panel. (A) Contour plot of MAFs for 591,822 SNPs between temperate (n = 207) and tropical (n = 45) maize lines. For each SNP, the minor allele across all lines was identified, followed by calculation of the frequency of this allele for temperate and tropical lines. The grayscale indicates the percentage of SNPs with each set of MAFs. Colored symbols (triangles and asterisks) indicate MAFs of SNPs that are statistically significant for at least one of the 20 tocochromanol traits at a genome-wide FDR of 5%. Significant SNPs within ±250 kb of ZmVTE4, ZmVTE1, and ZmHGGT1 are colored red, blue, and purple, respectively. All other statistically significant SNPs are colored black. The SNPs that were included in the optimal models of the MLMM analysis are indicated with asterisks. (B) Contour plot of MAFs for 591,822 SNPs between temperate and tropical lines with additional SNPs significant at a genome-wide FDR of 10%, as in (A).
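The two-step calculation described in the Figure 2 caption (identify the minor allele across all lines, then compute its frequency within each group) can be sketched as follows. This is an illustration only: it assumes one allele call per homozygous inbred line, and the calls, group labels, and helper names are hypothetical.

```python
from collections import Counter

def minor_allele(calls):
    """Return the less frequent allele across all lines (ties broken alphabetically)."""
    counts = Counter(a for a in calls if a is not None)
    return min(counts, key=lambda a: (counts[a], a))

def allele_freq(calls, allele):
    """Frequency of `allele` among the non-missing calls."""
    observed = [a for a in calls if a is not None]
    return sum(a == allele for a in observed) / len(observed)

# Hypothetical calls at one SNP, one allele per inbred line.
calls = ["G", "G", "G", "A", "G", "A", "A", "G", "G", "G"]
groups = ["temperate"] * 6 + ["tropical"] * 4

# Step 1: the minor allele is defined on the whole panel.
minor = minor_allele(calls)

# Step 2: its frequency is computed separately within each group.
temperate = [c for c, g in zip(calls, groups) if g == "temperate"]
tropical = [c for c, g in zip(calls, groups) if g == "tropical"]
maf_temperate = allele_freq(temperate, minor)
maf_tropical = allele_freq(tropical, minor)
```

Defining the minor allele on the whole panel means its frequency within one group can exceed 0.5, which is why both plot axes can span the full 0 to 1 range.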
Figure 3. GWAS for α-tocopherol (αT) content in maize grain. (A) Scatter plot of association results from a unified mixed model analysis of αT and LD estimates (r²) across the ZmVTE4 chromosome region. Negative log10-transformed P-values (left y-axis) from a GWAS for αT and r² values (right y-axis) are plotted against physical position (B73 RefGen_v2) for a 7-Mb region on chromosome 5 that encompasses ZmVTE4. The blue vertical lines are –log10 P-values for SNPs that are statistically significant for αT at 5% FDR, whereas the gray vertical lines are –log10 P-values for SNPs that are nonsignificant at 5% FDR. Triangles are the r² values of each SNP relative to the peak SNP (indicated in red) at 200,367,532 bp. The black horizontal dashed line indicates the –log10 P-value of the least statistically significant SNP at 5% FDR. The black vertical dashed lines indicate the positions of four genes (from left to right): a WRKY transcription factor (GRMZM5G823157), ZmVTE4 (GRMZM2G035213), a pentatricopeptide repeat-containing protein (GRMZM2G325019), and an amino acid permease (GRMZM2G161641). (B) Scatter plot of association results from a conditional unified mixed model analysis of αT and LD estimates (r²) across the ZmVTE4 chromosome region, as in (A). The three SNPs (ss196416269, S5_200369534, and S5_200369481) from the optimal MLMM were included as covariates in the unified mixed model to control for the ZmVTE4 effect. (C) Gene model diagram for ZmVTE4 with αT-associated SNPs. Blue vertical lines indicate the physical position (RefGen_v2) of SNPs within ±3 kb of the open reading frame start or stop position for ZmVTE4 that are significantly associated with αT at 5% FDR. Significant SNPs at 10% FDR are shown as gray vertical lines. The peak SNP is indicated by a red triangle, whereas the three SNPs included in the optimal MLMM are indicated by inverted red triangles.
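The horizontal dashed line in Figure 3 marks the least significant SNP that still passes the 5% FDR. The text shown here does not say which FDR procedure was used, so the sketch below assumes the Benjamini-Hochberg step-up procedure, a common choice. The P-values and the function name are hypothetical.

```python
import math

def bh_threshold(pvalues, fdr=0.05):
    """Largest P-value declared significant by Benjamini-Hochberg, or None if none pass."""
    m = len(pvalues)
    cutoff = None
    for rank, p in enumerate(sorted(pvalues), start=1):
        # Step-up rule: the largest p with p <= fdr * rank / m sets the cutoff.
        if p <= fdr * rank / m:
            cutoff = p
    return cutoff

# Hypothetical P-values for eight SNPs (not results from the study).
pvals = [1e-8, 2e-6, 0.004, 0.03, 0.2, 0.5, 0.7, 0.9]
cutoff = bh_threshold(pvals)

# Height of the dashed significance line on a -log10(P) axis.
line_height = -math.log10(cutoff) if cutoff is not None else None
```

With these inputs the three smallest P-values pass, so the dashed line would sit at the –log10 of the third one.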
Figure 4. Summary of local LD and haplotype blocks for a 3.9-kb genomic region that surrounds ZmVTE4 (Chr 5: 200,367,029–200,370,851 bp). The LD plot, generated in Haploview (Barrett ), indicates r² values between pairs of SNPs multiplied by 100; white, r² = 0; shades of gray, 0 < r² < 1; black, r² = 1. Haplotype blocks (blocks 1−4) in the ZmVTE4 genomic region were defined with the confidence interval method (Gabriel ). The three SNPs included in the optimal MLMMs for α-tocopherol and its three derived trait ratios are indicated with red circles.
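For readers unfamiliar with the pairwise LD values shown in Figure 4, the following is a minimal sketch of the r² calculation between two biallelic SNPs. It assumes each inbred line is homozygous and can be treated as a single haplotype coded 0 or 1; Haploview itself works from genotype data, so this is a simplification. The genotype vectors are hypothetical.

```python
def r_squared(snp_a, snp_b):
    """LD r^2 between two biallelic SNPs coded 0/1, one value per inbred line."""
    n = len(snp_a)
    p_a = sum(snp_a) / n                               # frequency of allele 1 at SNP A
    p_b = sum(snp_b) / n                               # frequency of allele 1 at SNP B
    p_ab = sum(a * b for a, b in zip(snp_a, snp_b)) / n  # frequency of the 1-1 haplotype
    d = p_ab - p_a * p_b                               # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    return d * d / denom if denom > 0 else float("nan")

# Hypothetical allele codes for eight inbred lines.
snp1 = [0, 0, 1, 1, 0, 1, 0, 1]
snp2 = [0, 0, 1, 1, 0, 1, 1, 0]

ld = r_squared(snp1, snp2)       # 0.25
plot_value = round(ld * 100)     # value as displayed in the LD plot: 25
```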
Haplotype effects of three ZmVTE4 SNPs identified with an optimal MLMM for αT levels
| ss196416269 | S5_200369481 | S5_200369534 | Haplotype Frequency: Overall | Haplotype Frequency: Temperate | Haplotype Frequency: Tropical | Haplotype Mean αT, μg·g−1 | SD |
|---|---|---|---|---|---|---|---|
| A | C | G | 50 | 44 | 6 | 3.02 | 2.48 |
| G | C | G | 151 | 125 | 26 | 9.18 | 4.01 |
| A | G | G | 1 | 1 | − | 3.51 | − |
| G | G | G | 28 | 15 | 13 | 15.95 | 3.90 |
| G | C | T | 22 | 22 | − | 2.77 | 2.83 |

| Statistic | Value |
|---|---|
| R2LR (%) | 60.2 |
| Partial R2LR (%) | 48.4 |
| P-value | 2.2 × 10−38 |
| Fold change | 5.76 |
MLMM, multilocus mixed model; αT, α-tocopherol; SNP, single-nucleotide polymorphism; SD, standard deviation of the untransformed BLUPs; BLUPs, best linear unbiased predictors.
The most favorable allele for each of the ZmVTE4 SNPs is G (underlined in the source table).
R2LR, likelihood-ratio based R2 statistic, percentage of total phenotypic variation explained by the unified mixed model.
Partial R2LR, likelihood-ratio based partial R2 statistic, percentage of total phenotypic variation explained by the haplotypes.
The P-value was from a unified mixed linear model that tested for an association between haplotypes and αT levels.
Fold change was calculated as the ratio between the most favorable (G, G, G) and least favorable (G, C, T) haplotypes for αT levels.
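The fold change reported in the table can be checked directly from the haplotype means. The short sketch below takes the highest and lowest haplotype means listed in the table (15.95 and 2.77 μg·g−1) as the means of the most and least favorable haplotypes named in the footnote; the variable names are illustrative.

```python
# Haplotype means for alpha-tocopherol (ug/g), read from the table above.
most_favorable_mean = 15.95   # haplotype (G, G, G)
least_favorable_mean = 2.77   # haplotype (G, C, T)

# Fold change is the ratio of the two means.
fold_change = most_favorable_mean / least_favorable_mean
print(round(fold_change, 2))  # 5.76
```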
Figure 5. GWAS for the ratio of δT3 to the sum of γT3 and αT3 [δT3/(γT3 + αT3)] in maize grain. (A) Scatter plot of association results from a unified mixed model analysis of δT3/(γT3 + αT3) and LD estimates (r²) across the ZmVTE1 chromosome region. Negative log10-transformed P-values (left y-axis) from a GWAS for δT3/(γT3 + αT3) and r² values (right y-axis) are plotted against physical position (B73 RefGen_v2) for a 4-Mb region on chromosome 5 that encompasses ZmVTE1. The blue vertical lines are –log10 P-values for SNPs that are statistically significant for δT3/(γT3 + αT3) at 5% FDR, whereas the gray vertical lines are –log10 P-values for SNPs that are nonsignificant at 5% FDR. Triangles are the r² values of each SNP relative to the peak SNP (indicated in red) at 133,501,858 bp. The black horizontal dashed line indicates the –log10 P-value of the least statistically significant SNP at 5% FDR. The black vertical dashed lines indicate the positions of two genes (from left to right): a transcription factor (GRMZM2G105494) and ZmVTE1 (GRMZM2G009785). (B) Scatter plot of association results from a conditional unified mixed model analysis of δT3/(γT3 + αT3) and LD estimates (r²) across the ZmVTE1 chromosome region, as in (A). The SNP (S5_133501858) from the optimal multilocus mixed model (MLMM) was included as a covariate in the unified mixed model to control for the ZmVTE1 effect. (C) Gene model diagram for ZmVTE1 with δT3/(γT3 + αT3)-associated SNPs. Blue vertical lines indicate the physical position (RefGen_v2) of SNPs within ±3 kb of the open reading frame start or stop position for ZmVTE1 that are significantly associated with δT3/(γT3 + αT3) at 5% FDR. Significant SNPs at 10% FDR are shown as gray vertical lines. The peak SNP is indicated by a red triangle, whereas the SNP included in the optimal MLMM is indicated by an inverted red triangle.
Statistically significant results from the pathway-level analysis of 20 tocochromanol grain traits when SNPs identified from the MLMM analysis are excluded or included as covariates in the unified mixed model
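| Category | Candidate Gene | Function | Significant Associations: No Covariates | Significant Associations: Three ZmVTE4 SNPs | Significant Associations: Two ZmVTE1 SNPs | Significant Associations: Three ZmVTE4 and Two ZmVTE1 SNPs |
|---|---|---|---|---|---|---|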
| AHG | GRMZM2G573867 | 3-dehydroquinate synthase | αT, δT/αT, αT/γT, γT/(γT + αT) | αT, δT/αT | ||
| AHG | GRMZM2G124365 | chorismate mutase | αT3, Total T3, αT, δT/αT | αT3, Total T3, αT, δT/αT | ||
| AHG | GRMZM2G138624 | isochorismatase hydrolase | αT3 | αT3 | ||
| AHG | GRMZM2G437912 | prephenate dehydratase | Total T3 | Total T3 | Total T3, γT3 | Total T3, γT3 |
| AHG | GRMZM2G070218 | shikimate kinase | δT/αT | |||
| PG | GRMZM2G493395 | 1-deoxy-D-xylulose 5-phosphate synthase | αT/γT, γT/(γT + αT) | αT/γT, γT/(γT + αT) | ||
| PG | AC209374.4_FG002 | 2-C-methyl-D-erythritol 2,4-cyclodiphosphate synthase | αT, αT/γT, γT/(γT + αT), δT/αT | αT, αT/γT, γT/(γT + αT), δT/αT | ||
| PG | GRMZM2G172032 | 2-C-methyl-D-erythritol 4-phosphate cytidyltransferase | δT3/(γT3 + αT3), δT3 | δT3/(γT3 + αT3) | ||
| PG | GRMZM2G027059 | 4-hydroxy-3-methylbut-2-enyldiphosphate reductase | δT3, αT | δT3 | αT | |
| PG | GRMZM2G137409 | hydroxymethylbutenyl 4-diphosphate synthase | δT3/γT3 | |||
| PG | GRMZM2G133082 | isopentenyl pyrophosphate isomerase | γT/(γT + αT) | γT/(γT + αT) | ||
| TP | GRMZM2G009785 | tocopherol cyclase (ZmVTE1) | δT/γT, δT3, δT3/(γT3 + αT3), δT3/αT3, δT3/γT3 | δT/γT, δT3, δT3/(γT3 + αT3), δT3/αT3, δT3/γT3 | ||
| TP | GRMZM2G035213 | γ-tocopherol methyltransferase (ZmVTE4) | αT3, αT3/γT3, γT3/(γT3 + αT3), αT, αT/γT, δT/αT, γT/(γT + αT) | αT3, αT3/γT3, γT3/(γT3 + αT3), αT, αT/γT, δT/αT, γT/(γT + αT), δT/(γT + αT) | ||
| TP | GRMZM2G173358 | homogentisic acid geranylgeranyl transferase | Total T3, Total T/Total T3, γT3, γT3/(γT3 + αT3), αT3 | Total T3, Total T/Total T3, γT3, γT3/(γT3 + αT3) | Total T3, Total T/Total T3, γT3, γT3/(γT3 + αT3), αT3 | Total T3, Total T/Total T3, γT3, γT3/(γT3 + αT3) |
SNP, single-nucleotide polymorphism; MLMM, multilocus mixed model; AHG, aromatic head group; PG, prenyl group synthesis; TP, tocochromanol pathway.
At least one SNP within ±250 kb of the gene open reading frame (ORF) start or stop position is associated with at least one of the indicated traits at a candidate gene-wide 5% false-discovery rate (FDR) using the unified mixed model without covariates.
At least one SNP within ±250 kb of the gene ORF start or stop position is associated with at least one of the indicated traits at a pathway-wide 5% FDR using the unified mixed model with the three ZmVTE4 SNPs identified in the MLMM analysis included as covariates.
At least one SNP within ±250 kb of the gene ORF start or stop position is associated with at least one of the indicated traits at a pathway-wide 5% FDR using the unified mixed model with the two ZmVTE1 SNPs identified in the MLMM analysis included as covariates.
At least one SNP within ±250 kb of the gene ORF start or stop position is associated with at least one of the indicated traits at a pathway-wide 5% FDR using the unified mixed model with the three ZmVTE4 SNPs and two ZmVTE1 SNPs identified in the MLMM analysis included as covariates.
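The footnotes above assign a SNP to a candidate gene when it lies within ±250 kb of the gene's ORF start or stop position. A minimal sketch of that window test is given below. The function name and the coordinates in the example are hypothetical and are not gene positions from the study.

```python
WINDOW_BP = 250_000  # +/-250 kb window, as stated in the table footnotes

def snp_near_gene(snp_pos, orf_start, orf_stop, window=WINDOW_BP):
    """True if the SNP lies within `window` bp of the ORF start or stop (or inside the ORF)."""
    lo, hi = sorted((orf_start, orf_stop))  # handles genes annotated on either strand
    return lo - window <= snp_pos <= hi + window

# Hypothetical ORF coordinates on one chromosome.
orf_start, orf_stop = 1_000_000, 1_004_000

inside = snp_near_gene(1_200_000, orf_start, orf_stop)    # 196 kb past the stop
outside = snp_near_gene(1_300_000, orf_start, orf_stop)   # 296 kb past the stop
```

A real analysis would also need to match chromosomes before comparing positions; that check is left out here for brevity.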