Ravi V Mural, Guangchao Sun, Marcin Grzybowski, Michael C Tross, Hongyu Jin, Christine Smith, Linsey Newton, Carson M Andorf, Margaret R Woodhouse, Addie M Thompson, Brandi Sigmon, James C Schnable.
Abstract
Classical genetic studies have identified many cases of pleiotropy where mutations in individual genes alter many different phenotypes. Quantitative genetic studies of natural genetic variants frequently examine one or a few traits, limiting their potential to identify pleiotropic effects of natural genetic variants. Widely adopted community association panels have been employed by plant genetics communities to study the genetic basis of naturally occurring phenotypic variation in a wide range of traits. High-density genetic marker data (18M markers) from 2 partially overlapping maize association panels comprising 1,014 unique genotypes grown in field trials across at least 7 US states and scored for 162 distinct trait data sets enabled the identification of 2,154 suggestive marker-trait associations and 697 confident associations in the maize genome using a resampling-based genome-wide association strategy. The precision of individual marker-trait associations was estimated to be 3 genes based on a reference set of genes with known phenotypes. Examples were observed both of genetic loci associated with variation in diverse traits (e.g., above-ground and below-ground traits) and of individual loci associated with the same or similar traits across diverse environments. Many significant signals are located near genes whose functions were previously entirely unknown or estimated purely via functional data on homologs. This study demonstrates the potential of mining community association panel data using new higher-density genetic marker sets combined with resampling-based genome-wide association tests to develop testable hypotheses about gene functions, identify potential pleiotropic effects of natural genetic variants, and study genotype-by-environment interaction.
Keywords: community association populations; maize; pleiotropy; quantitative genetics
Year: 2022 PMID: 35997208 PMCID: PMC9396454 DOI: 10.1093/gigascience/giac080
Source DB: PubMed Journal: Gigascience ISSN: 2047-217X Impact factor: 7.658
Figure 1: Characteristics of Maize Association Panel trait data sets. (A) Number of accessions that are represented in any of the 3 diversity panels. (B) Representation of 8 broad phenotypic categories among the 162 traits collected here. Category assignments for individual traits are provided in Supplementary Table S3. (C) Geographic distribution of trials where trait data sets were collected. The size of each circle indicates the number of traits collected at that geographic location. Circle colors indicate the types of trait data sets collected at that location, following the color key in panel B. (D) Distribution of the number of genotypes scored for a given trait. (E) Distributions of narrow-sense heritability values across the same 8 broad phenotypic categories shown in panel B, using the color key from panel B. (F) Correlations among the 162 trait data sets analyzed in this study. Trait data sets are clustered based upon absolute Spearman correlation value. Phenotype classes are indicated by the color bar above the x-axis, using the color key from panel B.
Studies from which maize trait data sets were drawn
| Reference | Study type | Phenotypes scored^a | Accessions evaluated^b | Panel |
|---|---|---|---|---|
| Peiffer et al. 2014 | Reproductive & Vegetative | 11 | 737 | Ames Panel |
| Hirsch et al. 2014 | Reproductive & Vegetative | 3 | 427 | WiDiv-503 |
| Leiboff et al. 2015 | Agronomic, Cellular/Biochemical, & Vegetative | 9 | 378 | SAM |
| Lin et al. 2017 | Cellular/Biochemical, Root, & Vegetative | 16 | 363 | SAM |
| Gustafson et al. 2018 | Disease | 7 | 447 | WiDiv-503 |
| Gage et al. 2018 | Reproductive | 16 | 817 | WiDiv-942 |
| Mazaheri et al. 2019 | Cellular/Biochemical & Vegetative | 5 | 788 | WiDiv-942 |
| Qiao et al. 2019 | Cellular/Biochemical | 4 | 429 | WiDiv-503 |
| Sekhon et al. 2019 | Agronomic | 3 | 364 | WiDiv-503 |
| Zheng et al. 2019 | Agronomic, Root | 13 | 359 | SAM |
| Azodi et al. 2020 | Reproductive & Vegetative | 3 | 388 | WiDiv-503 |
| Lin et al. 2020 | Cellular/Biochemical & Reproductive | 8 | 439 | WiDiv-503 |
| Renk et al. 2021 | Seed Composition | 16 | 499 | WiDiv-503 |
| Schneider et al. 2021 | Root | 1 | 599 | WiDiv-503 |
| Zhou et al. 2021 | Reproductive | 17 | 339 | SAM |
| Sun et al. 2021 | Disease | 1 | 687 | WiDiv-942 |
| Previously unpublished | Agronomic, Disease, Reproductive, & Vegetative | 29 | 752 | WiDiv-942 |
^a Number of phenotypes from the respective study that were used in this study, after removing phenotypes whose values were identical to those already included from another study.
^b The highest number of accessions with phenotype data used in this study from the respective publication.
Figure 2: Characteristics of Maize Association Panel marker data sets. (A) Genotype frequency and minor allele frequency of the marker data set. (B) Genome-wide LD decay with a maximum distance of 600 kilobases between 2 SNPs. (C) Genetic relationships among the accessions used in this study, visualized using multidimensional scaling/principal coordinate analysis of the distance matrix. The x- and y-axes represent the first and second principal coordinates. Each point is color coded by the heterotic group to which the accession belongs. (D) The same analysis as in panel C, with the x- and y-axes representing the first and third principal coordinates. Each point is color coded by the heterotic group to which the accession belongs.
Figure 3: GWAS summary: multitrait peaks detected across phenotypic categories. (A) Combined Manhattan plot for GWAS using all 1,014 individuals screened using 18M markers. Dashed gray and red lines indicate the 5% and 10% cutoffs for statistical significance, calculated based on RMIP value. Chromosomes are shown along the x-axis. The y-axis shows RMIP values, ranging from 0 to 1. (B) An upset plot showing the number of shared GWAS hits between phenotypic categories. (C) Percent representation of GWAS hits relative to the number of trait data sets analyzed. The number on top of each pair of bars corresponds to the ratio of GWAS hits to the number of trait data sets analyzed in that phenotypic category. Note: The ratio was higher for the disease traits, but the traits in this category are essentially the same trait analyzed at different time points in a time series; thus, most of the hits overlap among the traits, leading to an inflated ratio.
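The RMIP values plotted in panel A can be illustrated with a small sketch. It assumes that RMIP (expanded as resampling marker inclusion probability in the Figure 5 caption) is the fraction of resampling runs in which a marker is selected. The function names, the toy marker IDs, the run count of 100, and the mapping of the 5% and 10% cutoffs onto the "suggestive" and "confident" labels from the abstract are all illustrative assumptions, not details confirmed by this record.

```python
from collections import Counter


def rmip(selected_per_run):
    """Return {marker: fraction of resampling runs in which it was selected}."""
    n_runs = len(selected_per_run)
    if n_runs == 0:
        raise ValueError("need at least one resampling run")
    counts = Counter()
    for selected in selected_per_run:
        counts.update(set(selected))  # count a marker at most once per run
    return {marker: count / n_runs for marker, count in counts.items()}


def classify(rmip_values, suggestive=0.05, confident=0.10):
    """Label markers by RMIP cutoff (cutoff-to-label mapping is an assumption)."""
    labels = {}
    for marker, value in rmip_values.items():
        if value >= confident:
            labels[marker] = "confident"
        elif value >= suggestive:
            labels[marker] = "suggestive"
    return labels


# Toy input: 100 hypothetical runs, each giving the set of selected markers.
runs = [{"m1", "m2", "m3"}] + [{"m1", "m2"}] * 4 + [{"m1"}] * 7 + [set()] * 88

values = rmip(runs)      # m1 -> 0.12, m2 -> 0.05, m3 -> 0.01
labels = classify(values)
print(labels)
```

In this toy input, `m1` clears the 10% cutoff, `m2` sits exactly on the 5% cutoff, and `m3` falls below both and receives no label.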
Summary of unique associations with RMIP ≥5% within each of the 8 phenotypic groups analyzed
| Phenotype group | No. of phenotypes analyzed | No. of phenotypes with hits | No. of peaks | No. of single trait peaks | No. of multitrait peaks | No. of multitrait peaks within each category^a | No. of peaks associated across each category |
|---|---|---|---|---|---|---|---|
| Agronomic | 11 | 11 | 161 | 155 | 6 | 4 | 2 |
| Cellular/Biochemical | 21 | 17 | 92 | 69 | 23 | 20 | 3 |
| Disease | 8 | 8 | 72 | 44 | 28 | 28 | 0 |
| Flowering Time | 15 | 15 | 176 | 128 | 48 | 32 | 16 |
| Inflorescence | 47 | 41 | 459 | 420 | 39 | 32 | 7 |
| Root | 15 | 15 | 113 | 81 | 32 | 25 | 7 |
| Seed Composition | 16 | 16 | 128 | 108 | 20 | 19 | 1 |
| Vegetative | 29 | 28 | 295 | 247 | 48 | 28 | 20 |
| Total Unique | 162 | 151 | 1,466^b | 1,252 | 214^b | 188 | 26^b |
^a Excluding 26 peaks that overlap between 2 or more phenotype groups/categories. Of these 26 peaks, 22 are associated with traits belonging to 2 phenotype categories and 4 are associated with traits belonging to 3 phenotype categories.
^b The total unique value is less than the sum of the values in the respective column because some peaks were associated with phenotypes in multiple categories and are counted in each category in which they show significance.
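The relationship between the column sums and the "Total Unique" row can be checked with a few lines of arithmetic. The sketch below uses only numbers from the table and its footnotes; the variable names are illustrative.

```python
# Per-group counts copied from the table:
# (peaks, single-trait peaks, multitrait peaks within the category,
#  peaks associated across categories)
table = {
    "Agronomic": (161, 155, 4, 2),
    "Cellular/Biochemical": (92, 69, 20, 3),
    "Disease": (72, 44, 28, 0),
    "Flowering Time": (176, 128, 32, 16),
    "Inflorescence": (459, 420, 32, 7),
    "Root": (113, 81, 25, 7),
    "Seed Composition": (128, 108, 19, 1),
    "Vegetative": (295, 247, 28, 20),
}

# Within each group, every peak is single-trait, multitrait within the
# category, or shared with another category.
for group, (peaks, single, within, across) in table.items():
    assert peaks == single + within + across, group

summed_peaks = sum(row[0] for row in table.values())
summed_across = sum(row[3] for row in table.values())

# From the footnote: 26 unique cross-category peaks, of which 22 span
# 2 categories and 4 span 3 categories.
unique_across = 22 + 4
category_memberships = 22 * 2 + 4 * 3

# A cross-category peak is listed once per category it belongs to, so the
# column sum overcounts it by (memberships - 1).
overcount = category_memberships - unique_across
unique_peaks = summed_peaks - overcount

print(summed_peaks, summed_across, overcount, unique_peaks)
# prints: 1496 56 30 1466
```

The summed cross-category column (56) matches the 56 category memberships implied by the footnote, and removing the 30 duplicate listings from the summed peak count (1,496) recovers the 1,466 unique peaks reported in the table.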
Figure 4: Probability that genes at different distances from the GWAS peak SNP are linked to phenotypes. (A) Gene positions of unique trait associations. The first 7 genes closest to the GWAS peaks were selected and are shown on the x-axis. (B) Gene order of unique trait associations. The distance of the genes from the trait-associated markers is shown on the x-axis.
Figure 5: Combined GWAS identifies peak associated with seed starch and fat. (A) View of resampling marker inclusion probability values for markers in a window from 108,211,603 to 108,213,234 on chromosome 6 spanning 200 kilobases upstream and downstream of the pleiotropic peak identified for seed starch and oil content. Only markers with resampling marker inclusion probability values ≥0.01 are shown. (B) The LD relationships between the significant SNPs within the peak. (C) Distributions of observed oil and starch content values reported in [32] for lines carrying either allele of the peak SNP located at position 108,212,338 bp.
Figure 6: GWAS peaks associated with multiple traits. (A) Local Manhattan plot spanning ±200 kilobases around the pleiotropic peak on chromosome 3 from 160,559,294 to 160,989,691 bp. This peak is associated with MADS69 (Zm00001d042315). The phenotypes associated with this peak belong to the Flowering Time and Vegetative categories. The phenotypes associated with this peak are Anthesis1_L, Anthesis4_H, Anthesis6_H, Anthesis7_H, Anthesis_A, Anthesis_G, Anthesis_J, BiomassYield_G, ExtantLeafNumber1_J, ExtantLeafNumber2_J, PlantHeight_D, PlantHeight_G, Silking_A, Silking_J, Silking_L, and StalkDiameter_D. The vertical dashed lines show the peak boundary. (B) Local Manhattan plot spanning ±200 kilobases around the pleiotropic peak on chromosome 8 from 135,928,821 to 136,325,345 bp. This peak is associated with Rap2.7 (Zm00001d010987). The phenotypes associated with this peak belong to the Flowering Time and Vegetative categories. The phenotypes associated with this peak are Anthesis1_L, Anthesis5_H, Anthesis6_H, Anthesis7_H, Anthesis_A, Anthesis_G, Anthesis_J, ExtantLeafNumber1_J, LeafWidth_J, PlantHeight_D, SilkingGDD_L, and Silking_L. The vertical dashed lines show the peak boundary. (C) Local Manhattan plot spanning ±200 kilobases around the pleiotropic peak on chromosome 8 from 126,884,534 to 126,891,234 bp. This peak is associated with ZCN8 (Zm00001d010752). The phenotypes associated with this peak belong to the Flowering Time and Vegetative categories. The phenotypes associated with this peak are Anthesis7_H, Anthesis_G, Anthesis_J, ExtantLeafNumber1_J, and ExtantLeafNumber2_J. The vertical dashed lines show the peak boundary. (D) Local Manhattan plot spanning ±200 kilobases around the pleiotropic peak on chromosome 8 from 134,706,389 to 134,759,977 bp. This peak is associated with lg4 (Zm00001d010948). The phenotypes associated with this peak belong to the Flowering Time, Root, and Vegetative categories. The phenotypes associated with this peak are Anthesis4_H, Anthesis7_H, Anthesis_A, Anthesis_G, Anthesis_J, ExtantLeafNumber1_J, ExtantLeafNumber2_J, RootArea1_O, RootArea2_O, RootArea4_O, RootWidth3_O, Silking_A, and Silking_J. The vertical dashed lines show the peak boundary.