Matthew J Hill, Bryan W Penning, Maureen C McCann, Nicholas C Carpita.
Abstract
BACKGROUND: Genome-Wide Association Studies (GWAS) are used to identify genes and alleles that contribute to quantitative traits in large and genetically diverse populations. However, traits with complex genetic architectures create an enormous computational load for discovery of candidate genes with acceptable statistical certainty. We developed a streamlined computational pipeline for GWAS (COMPILE) that accelerates identification and annotation of candidate maize genes associated with a quantitative trait and then matches these maize genes to their closest rice and Arabidopsis homologs by sequence similarity.
Keywords: Computational biology; European corn borer; Flowering time; GWAS; Genome; Maize; Ostrinia nubilalis; QTL; Zea mays; γ-Tocopherol synthesis
Year: 2022 PMID: 35778686 PMCID: PMC9250234 DOI: 10.1186/s12870-022-03668-9
Source DB: PubMed Journal: BMC Plant Biol ISSN: 1471-2229 Impact factor: 5.260
Fig. 1 Manhattan plot showing GWAS results for ratio of α-tocopherol to γ-tocopherol conversion. Data from Lipka et al. [16] were input into COMPILE for GWAS analysis and gene discovery. Negative log10 p-values are plotted against physical position (B73_RefGen_v4) on all 10 chromosomes. Values in red are significant at a Benjamini-Hochberg false discovery rate of 5%. A visual marker for the Bonferroni threshold at α = 0.1 (averaged from the individual threshold y-values of each chromosome) is indicated by the blue horizontal line. The vertical line marks the position of the maize tocopherol O-methyltransferase (ZmVTE4). Dotted lines indicate the positions of genes identified by COMPILE but not identified in the original study: a MADS-box 36 transcription factor (Zm00001d043589), a putative glycosyl transferase gene (Zm00001d019057), and a long non-coding RNA whose expression occurred very early in embryo development, and only between 84 and 96 h post-pollination in the nucellus [17]
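The Benjamini-Hochberg criterion named in the caption can be illustrated with a minimal Python sketch. This is the textbook step-up procedure, not COMPILE's own code; the function names and the p-values in the test are hypothetical and chosen only for illustration.

```python
import math


def benjamini_hochberg(p_values, fdr=0.05):
    """Return one boolean per p-value: True if significant under the BH step-up rule."""
    m = len(p_values)
    # Rank markers by ascending p-value, remembering their original positions.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k such that p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    # Every marker ranked at or below the cutoff is declared significant.
    significant = [False] * m
    for idx in order[:cutoff_rank]:
        significant[idx] = True
    return significant


def neg_log10(p):
    """Y-axis value of a Manhattan plot for a single p-value."""
    return -math.log10(p)
```

Because the rule is step-up, a marker whose p-value misses its own rank threshold is still declared significant when a higher-ranked marker passes its threshold.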
Table 1 Genes associated with QTL for α/γ-tocopherol ratio in maize kernels by COMPILEᵃ
| Chrom. | Marker Position | MLM | Distance to Gene | Maize Gene Number | Gene Name | BLAST Match | BLAST Description | BLAST Score | e-Value |
|---|---|---|---|---|---|---|---|---|---|
| 3 | 204,491,509 | 4.76E–06* | −2050 | Zm00001d043589 | MADS36 | Os01g0726400 | MADS box floral identity | 299 | 1.93E–103 |
| | | | | | | At5g60910 | AGAMOUS-like 8 | 107 | 6.29E–28 |
| 8 | 132,442,986 | 1.54E–06* | −231 | Zm00001d008091 | None | [lncRNA] | [Long non-coding RNA] | – | – |
ᵃ Phenotype data are from Lipka et al. [16]. All genes are identified by COMPILE as significant at a Benjamini-Hochberg FDR of 5%. Entries in bold indicate genes identified as significant by Lipka et al. [16] at a Benjamini-Hochberg FDR of 5% (***) or of 10% (**). Manual annotation of the long non-coding RNA is in brackets. *Not identified as significant in Lipka et al. [16]
Fig. 2 Manhattan plot showing NCRPIS 2.7 and Goodman 2.7 GWAS results for flowering time. Data from Romay et al. [8] were used in GWAS analysis and gene discovery. Negative log10 p-values are plotted against physical position (B73_RefGen_v4) on all 10 chromosomes. Markers significant at a Benjamini-Hochberg false discovery rate of 10% are shown in red. A visual marker for the Bonferroni threshold at α = 0.1 (averaged from the individual threshold y-values of each chromosome) is indicated by the blue horizontal line. Vertical solid lines indicate positions of significant QTL and gene annotations for flowering time, as shown by Romay et al. [8]. Dotted lines indicate positions of novel genes identified. Gene identities are described in Romay et al. [8]. a Results for GWAS conducted using the 2279-member NCRPIS population and NCRPIS 2.7 marker data. b Results for GWAS conducted using the 282-member Goodman AP collection within the NCRPIS population and Goodman 2.7 marker data. c Results for GWAS conducted using the 282-member Goodman AP collection within the NCRPIS population and the nearest neighbor of similar phenotype (564 lines) and NCRPIS 2.7 marker data. d Results for GWAS conducted using the 282-member Goodman AP collection within the NCRPIS population and three nearest neighbors of similar phenotype (1128 lines) and NCRPIS 2.7 marker data
Table 2 Genes associated with QTL for growing-degree-day-adjusted days to flowering identified by COMPILEᵃ
| Chrom. | Marker Position | MLM | Distance to Gene | Maize Gene Number | Gene Name | BLAST Match | BLAST Description | BLAST Score | e-Value |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 94,342,987 | 5.19E–08 | −3107 | Zm00001d029918 | None | Os10g0324600 | Zinc finger, C2H2 domain | 48.9 | 4.31E–06 |
| | | | | | | At5g61190 | C2H2-type zinc finger domain | 41.6 | 2.00E–03 |
| 2 | 208,886,674 | 1.41E–06 | 2175 | Zm00001d006461 | None | Os07g0508300 | SH3 domain-containing protein like | 642 | 0.0 |
| | | | | | | AT4G18060 | SH3 domain-containing protein | 462 | 3.30E–164 |
| 8 | 116,806,871 | 7.35E–10 | 1222 | Zm00001d010470 | UbiE3 | AT5G13530 | E3 Ubiquitin-protein ligase | 2198 | 0.0 |
| | | | | | | Os05g0392050 | UbiE3 Ubiquitin-ligase | 349 | 4.91E–108 |
| 9 | 135,980,380 | 5.87E–07 | −242 | Zm00001d047573 | F-box | Os03g0321300 | Cyclin-like F-box domain protein | 533 | 0.0 |
| | | | | | | AT5G46170 | F-box family protein | 308 | 1.47E–101 |
| 10 | 10,315,088 | 7.02E–08 | −914 | Zm00001d023565 | TCP | AT1G58100 | TCP family transcription factor | 204 | 9.25E–62 |
| | | | | | | Os12g0173300 | Transcription factor, TCP protein | 366 | 2.12E–124 |
| 10 | 92,331,033 | 1.41E–08 | 361 | Zm00001d024885 | WD40-like | Os03g0403400 | TolB-like domain β-propeller | 1065 | 0.0 |
| | | | | | | At1g21680 | DPP6 N-terminal domain-like | 351 | 1.23E–110 |
| 10 | 133,736,251 | 2.03E–07 | 244 | Zm00001d025915 | Syntaxin81 | At1g51740 | Syntaxin81 | 185 | 6.42E–59 |
| | | | | | | Os04g0530400 | t-Snare domain containing protein | 255 | 1.34E–86 |
ᵃ Phenotype data are from Romay et al. [8]. All genes are identified as significant by COMPILE at a Benjamini-Hochberg FDR of 5%. Entries in bold are those identified as significant by Romay et al. [8]. Genes without p-value or gene distance annotations represent genes not identified by COMPILE. For these genes, “Marker Position” annotations in brackets represent gene midpoint coordinates. Entries in normal text were identified as significant by COMPILE but not by Romay et al. [8]
Fig. 3 Micrographs of internodes of maize and histogram of insect damage. a Maize internodes without European corn borer damage. b Maize internodes with holes remaining after penetration of the corn borer larvae. c Frequency distributions for insect damage in the Goodman AP. The two populations were normalized and assigned a damage index from 1, with zero damage, to 10, in increments of 0.032 holes/cm of internode length, with an average of 0.11 holes/cm of internode length, corresponding to an average index of 3.28. Mo17 with an index of 0.05 falls in bin 2, and B73 with an index of 0.15 falls into bin 5
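As a rough illustration of the binning described in the caption, the Python sketch below converts a damage density (holes/cm of internode length) into a damage index. The exact rule is an assumption: the caption gives only the bin width (0.032 holes/cm) and the index range (1 to 10), so the formula `1 + floor(density / 0.032)`, capped at 10, was chosen because it reproduces the two stated examples (0.05 falls in bin 2, 0.15 in bin 5). The function name and constants are hypothetical.

```python
import math

BIN_WIDTH = 0.032  # holes per cm of internode length, taken from the caption
MAX_INDEX = 10     # highest damage index, taken from the caption


def damage_index(holes_per_cm):
    """Map a damage density to an index from 1 to 10 (assumed binning rule)."""
    if holes_per_cm < 0:
        raise ValueError("damage density cannot be negative")
    # Assumed rule: bin 1 starts at zero damage, and each further bin
    # spans BIN_WIDTH holes/cm; anything beyond the last bin is capped.
    return min(MAX_INDEX, 1 + math.floor(holes_per_cm / BIN_WIDTH))
```

Under this assumed rule, bin 1 covers densities from 0 up to, but not including, 0.032 holes/cm.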
Fig. 4 Manhattan plot showing GWAS results for insect damage index in the Goodman AP using Goodman 2.7 data. Negative log10 p-values are plotted against physical position (B73_RefGen_v4). A visual marker for the Bonferroni threshold at α = 0.1 (averaged from the individual threshold y-values of each chromosome) is indicated by the blue horizontal line. One location (red circle) was significant. Gene identities are described in Table 3
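One way to read the "averaged" Bonferroni marker is sketched below in Python. It rests on an assumption the caption does not confirm: that each chromosome's threshold is −log10(α / number of markers on that chromosome), and that the plotted line is the arithmetic mean of those per-chromosome values. The function name and the marker counts in the test are hypothetical.

```python
import math


def averaged_bonferroni_line(markers_per_chromosome, alpha=0.1):
    """Mean of per-chromosome Bonferroni thresholds on the -log10(p) scale.

    Assumption: each chromosome is corrected for its own marker count.
    """
    per_chromosome = [
        -math.log10(alpha / n_markers) for n_markers in markers_per_chromosome
    ]
    return sum(per_chromosome) / len(per_chromosome)
```

Because the average is taken on the log scale, the resulting line generally differs from a single genome-wide threshold computed from the total marker count.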
Table 3 Genes associated with QTL for European corn borer stem penetration
| Chrom. | Marker Position | MLM | Distance to Gene | Maize Gene Number | Gene Name | BLAST Match | BLAST Description | BLAST Score | e-Value |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 211,798,518 | 3.67E-05 | 1079 | Zm00001d032079 | RboS-L | Os09g0438000 | Riboflavin synthase-like | 1418 | 0.0 |
| | | | | | | At1g19230 | Riboflavin synthase-like | 1093 | 0.0 |
| 1 | 283,651,183 | 4.84E-05 | −1304 | Zm00001d034084 | WRKY31ᵃ | At3g01080 | WRKY DNA-binding (WRKY58) | 218 | 1.52E-66 |
| | | | | | | Os12g0507300 | WRKY DNA binding (WRKY96) | 231 | 4.72E-72 |
| 1 | 283,651,183 | 4.84E-05 | 1803 | Zm00001d034085 | DUF1336ᵃ | At3g29180 | DUF1336 | 427 | 4.06E-147 |
| | | | | | | Os03g55180 | DUF1336 | 577 | 0.0 |
| 2 | 1,413,891 | 6.11E-06 | 852 | Zm00001d001813 | RLK-Lᵃ | Os04g42620 | Ser/Thr RLK-like | 634 | 0.0 |
| | | | | | | Os03g0759000 | LysM RLK-like | 407 | 3.26E-141 |
| 2 | 28,958,597 | 4.46E-05 | −2883 | Zm00001d002991 | MYB-Lᵃ | Os06g0190900 | MYB-SANT-like protein | 164 | 1.65E-502 |
| | | | | | | [At3g11290] | Unknown | 38 | 0.007 |
| 2 | 84,078,171 | 7.44E-05 | −4593 | Zm00001d004120 | P4H | At2g43080 | P4H isoform 1 | 313 | 6.39E-109 |
| | | | | | | Os04g0346000 | P4H1 | 118 | 1.18E-34 |
| 2 | 239,762,997 | 5.45E-06 | −644 | Zm00001d007788 | CHUP1 | Os07g0188266 | FKBP-type peptidyl-prolyl isomerase | 285 | 6.45E-93 |
| | | | | | | At3g25690 | CHUP1 | 42 | 6E-04 |
| 3 | 207,872,378 | 4.36E-05 | 2565 | Zm00001d043701 | IAH | Os01g0706900 | Similar to Auxin amidohydrolase | 639 | 0.0 |
| | | | | | | At1g51760 | IAA-Ala peptidase | 488 | 1.71E-171 |
| 5 | 981,132 | 2.63E-05 | 418 | Zm00001d012838 | GDA1-L | Os05g0498700 | GDA1-like | 461 | 6.23E-164 |
| | | | | | | At3g27090 | DCD domain protein | 328 | 1.25E-111 |
| 7 | 91,467,624 | 4.37E-05 | −5168 | Zm00001d020093 | UGE5 | At4g10960 | UDP-D-Glc epimerase5 (UGE5) | 228 | 5.79E-75 |
| | | | | | | Os09g0323000 | UDP-D-Glc/UDP-D-Gal 4-epimerase | 261 | 3.36E-88 |
| 8 | 144,702,617 | 9.44E-06 | −4196 | Zm00001d011256 | OHP3 | At1g34000 | One-helix LHC protein2 (OHP2) | 160 | 3.13E-50 |
| | | | | | | Os01g0589800 | High-light inducible protein | 145 | 5.29E-46 |
| 8 | 144,702,617 | 9.44E-06 | 16,216 | Zm00001d011257 | AGP22-Lᵃ | At5g53250 | Arabinogalactan protein22 | 45.1 | 4.50E-8 |
| | | | | | | Os01g0592500 | DUF1070 family protein | 39.3 | 6.46E-6 |
| 9 | 19,551,832 | 2.67E-06 | 109 | Zm00001d045360 | None | Os06g0147300 | Unknown | 107 | 3.55E-31 |
| | | | | | | At5g13000 | Unknown | 27 | 5.5 |
| 10 | 3,599,714 | 6.09E-05 | 881 | Zm00001d023332 | WRKY63ᵃ | Os09t0334500 | WRKY74 | 86 | 1E-17 |
| | | | | | | At2g40750 | WRKY54 | 76 | 3E-15 |
ᵃ Gene identity defined manually by alignment with rice and Arabidopsis homologs closest in sequence
Fig. 5 Expression of genes associated with insect damage in developing stems of field-grown and greenhouse-grown maize. a Expression in maize B73 of genes associated with larval penetrations during stem development. Transcript levels in rind tissues from Internodes 9 through 2 from field-grown plants were normalized and compared as counts per 20 M reads. Values are the mean ± variance or S.D. of two or three independent rind collections, respectively. Genes with expression greater than 500 reads per 20 M were ordered by their ratio of expression (black diamonds) in secondary cell-wall-forming tissues (Internodes 5 and 4) to elongating tissue (Internodes 8 and 6). b Differential expression in maize B73 and Mo17 of genes associated with larval penetrations during stem development. Transcript levels in rind tissues of greenhouse-grown plants taken at elongation stages (Internodes 8 and 6) and secondary wall synthesis stages (Internodes 5 and 4) of each inbred were pooled and normalized and compared as counts per 20 M reads
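The normalization and ordering described in the caption can be approximated with the short Python sketch below. It assumes simple library-size scaling to 20 million reads and a ratio of summed expression in Internodes 5 and 4 to summed expression in Internodes 8 and 6; the caption does not specify the exact normalization used, and the function names and input layout (a dict keyed by internode number) are hypothetical.

```python
READS_PER_LIBRARY = 20_000_000  # the "20 M reads" scale used in the caption


def counts_per_20m(raw_count, library_size):
    """Scale a raw read count to counts per 20 M reads (assumed simple scaling)."""
    return raw_count * READS_PER_LIBRARY / library_size


def wall_to_elongation_ratio(expression_by_internode):
    """Ratio of expression in secondary wall-forming to elongating internodes.

    expression_by_internode maps internode number to normalized counts.
    """
    secondary_wall = expression_by_internode[5] + expression_by_internode[4]
    elongating = expression_by_internode[8] + expression_by_internode[6]
    return secondary_wall / elongating
```

Using sums rather than means gives the same ratio here, because both stages contribute two internodes each.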