William Andres Lopez-Arboleda, Stephan Reinert, Magnus Nordborg, Arthur Korte.
Abstract
Understanding the genetic architecture of complex traits is a major objective in biology. The standard approach for doing so is genome-wide association studies (GWAS), which aim to identify genetic polymorphisms responsible for variation in traits of interest. In human genetics, consistency across studies is commonly used as an indicator of reliability. However, if traits are involved in adaptation to the local environment, we do not necessarily expect reproducibility. On the contrary, results may depend on where you sample, and sampling across a wide range of environments may decrease the power of GWAS because of increased genetic heterogeneity. In this study, we examine how sampling affects GWAS in the model plant species Arabidopsis thaliana. We show that traits like flowering time are indeed influenced by distinct genetic effects in local populations. Furthermore, using gene expression as a molecular phenotype, we show that some genes are globally affected by shared variants, whereas others are affected by variants specific to subpopulations. Remarkably, the former are essentially all cis-regulated, whereas the latter are predominantly affected by trans-acting variants. Our results illustrate that conclusions about genetic architecture can be extremely sensitive to sampling and population structure.
Keywords: GWAS; evolutionary genomics; genetic architecture; regulation of gene expression
Year: 2021 PMID: 34240182 PMCID: PMC8557469 DOI: 10.1093/molbev/msab208
Source DB: PubMed Journal: Mol Biol Evol ISSN: 0737-4038 Impact factor: 16.240
Fig. 1. GWAS of flowering time across Europe. (A) Origin of the 888 European Arabidopsis thaliana accessions, with eight designated subpopulations in different colors. (B) The distribution of flowering time in the subpopulations. (C) Manhattan plots of GWAS results for the whole European population and three of the eight subpopulations. Dashed and dash-dotted lines indicate Bonferroni- and permutation-based 5% significance thresholds, respectively. Candidate genes that are in close proximity to significantly associated markers are indicated in red.
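The two kinds of threshold named in the caption can be illustrated with a minimal sketch. This is not the authors' procedure, which is not described in this record: the `scan` callable is a hypothetical stand-in for whatever association test is run per marker, and plain phenotype shuffling is an assumed simplification that ignores population structure.

```python
import random


def bonferroni_threshold(n_markers, alpha=0.05):
    """Per-marker P-value cutoff that controls family-wise error at alpha."""
    return alpha / n_markers


def permutation_threshold(phenotypes, scan, n_perm=100, alpha=0.05, seed=0):
    """Empirical cutoff from phenotype permutations.

    `scan` is a caller-supplied (hypothetical) function that takes a list of
    phenotypes and returns one P value per marker. For each permutation the
    smallest P value is recorded; the cutoff is the alpha-quantile of these
    minima.
    """
    rng = random.Random(seed)
    minima = []
    for _ in range(n_perm):
        shuffled = phenotypes[:]      # copy so the caller's list is untouched
        rng.shuffle(shuffled)
        minima.append(min(scan(shuffled)))
    minima.sort()
    # Position of the alpha-quantile among the sorted minima (rounded down).
    k = max(0, int(alpha * n_perm) - 1)
    return minima[k]


# Bonferroni cutoff for an assumed one million markers.
print(bonferroni_threshold(1_000_000))
```

A permutation-based cutoff adapts to the correlation among markers and to the trait distribution, which is one reason it can differ from the Bonferroni line drawn in the same plot.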
Significant SNPs (chromosome:position) in the GWAS of Different Subpopulations
| SNP | 1:24339560 | 3:3458977 | 4:10949262 | 4:11016778 | 5:18590501 | 5:23100540 | 5:23234243 |
|---|---|---|---|---|---|---|---|
| Candidate gene | | | | | | | |
| Europe | 2.4e-10 (0.44) | 4.3e-03 (0.18) | 4.1e-01 (0.34) | 1.4e-04 (0.25) | 1.7e-09 (0.20) | 9.9e-10 (0.03) | 1.4e-06 (0.07) |
| SIP | 1.7e-02 (0.45) | 5.1e-02 (0.47) | 9.3e-01 (0.19) | 2.0e-09 (0.12) | 2.9e-02 (0.03) | 5.2e-01 (0.01) | 7.4e-01 (0.04) |
| NIP | 1.2e-02 (0.35) | 1.2e-01 (0.35) | 8.2e-01 (0.32) | 6.8e-02 (0.22) | 1.8e-08 (0.14) | 5.2e-01 (0.04) | 5.8e-01 (0.10) |
| Germany | 3.1e-02 (0.28) | 9.3e-01 (0.05) | 6.4e-01 (0.49) | 9.4e-01 (0.28) | 2.6e-02 (0.06) | 2.4e-01 (0.06) | |
| France/UK | 7.0e-04 (0.41) | 2.9e-01 (0.08) | 6.4e-01 (0.50) | 9.5e-01 (0.17) | 1.4e-01 (0.05) | 8.2e-01 (0.08) | |
| Central Europe | 7.4e-02 (0.46) | 4.1e-08 (0.22) | 1.2e-08 (0.37) | 1.1e-01 (0.12) | 8.2e-02 (0.05) | ||
| Skåne | 5.1e-02 (0.24) | 2.1e-01 (0.12) | 5.7e-01 (0.36) | 8.5e-02 (0.49) | 1.5e-01 (0.33) | 2.7e-01 (0.01) | |
| Northern Sweden | 3.1e-01 (0.20) | 9.7e-01 (0.13) | 9.1e-01 (0.22) | 4.2e-01 (0.37) | 1.0e-01 (0.48) | 9.8e-10 (0.20) | 4.3e-09 (0.24) |
| Eastern Europe | 2.6e-01 (0.44) | 6.2e-01 (0.09) | 7.4e-01 (0.26) | 2.8e-01 (0.14) | 7.4e-08 (0.08) | 4.1e-01 (0.01) |
Note: Entries are “P value (minor allele frequency),” with genome-wide significance using a 5%-permutation-based threshold shown in red. Candidate genes were assigned to the SNPs from a list of 306 flowering time genes (Bouché et al. 2016) using a 10-kb window.
FT (FLOWERING LOCUS T, Corbesier et al. 2007).
TSF (TWIN SISTER OF FT, Yamaguchi et al. 2005).
JMJ14 (JUMONJI 14, Lu et al. 2010).
DOG1 (DELAY OF GERMINATION 1, Huo et al. 2016).
CIR1 (CIRCADIAN 1, Zhang et al. 2007).
VIN3 (VERNALIZATION INSENSITIVE 3, Sung and Amasino 2004).
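The window-based assignment of candidate genes described in the table note can be sketched as follows. This is an illustration only: the gene names and coordinates are placeholders, not real annotations, and the sketch assumes the 10-kb window extends on either side of the gene, which the note does not specify.

```python
def assign_candidates(snps, genes, window=10_000):
    """Map each SNP to the genes lying within `window` bp of it.

    snps:  iterable of (chromosome, position) tuples
    genes: dict of name -> (chromosome, start, end)
    """
    hits = {}
    for chrom, pos in snps:
        near = [
            name
            for name, (g_chrom, start, end) in genes.items()
            if g_chrom == chrom and start - window <= pos <= end + window
        ]
        hits[(chrom, pos)] = sorted(near)
    return hits


# Placeholder gene coordinates (hypothetical, for illustration only).
example_genes = {
    "geneA": (1, 24_345_000, 24_347_000),
    "geneB": (1, 24_400_000, 24_402_000),
}
example_hits = assign_candidates([(1, 24_339_560)], example_genes)
print(example_hits)
```

SNPs with no gene inside the window map to an empty list, so they can be reported without a candidate.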
Fig. 2. Sharing of subsignificant () associations. (A) Histogram of the number of associated SNPs in each subpopulation and shared between subpopulations. (B) Histogram of the number of associated genomic regions in each subpopulation and shared between subpopulations.
Fig. 3. Manhattan plots from GWAS on expression levels for three different genes. The columns show the results from genes representing different scenarios. The rows display the GWAS results of the analysis in the two subpopulations (SW and IP, respectively), or in the merged population (ALL). Horizontal dash-dotted lines indicate the significance threshold of . Vertical dashed lines show the position of the gene whose expression is being used as a molecular phenotype.
Fig. 4. Summary of the difference between shared and nonshared GWAS results for expression data. The top panel shows associations that are shared between the two subpopulations, whereas the bottom panels show associations that are specific to one subpopulation. The plots show the chromosomal location of the genes whose expression is mapped on the x axis, and the chromosomal position of significantly associated SNPs on the y axis. Associations in cis are shown in orange, whereas trans-associations are shown in purple. The pie charts show the number of genes in each category.
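The kind of tally summarized in this figure can be sketched in a few lines. Everything here is an assumption made for illustration: this record does not state how cis was defined, so `cis_window` is a hypothetical cutoff, an association counts as shared only if the same gene and SNP pair appears in both subpopulations, and the gene names and coordinates are placeholders.

```python
from collections import Counter


def cis_or_trans(snp, gene, cis_window=50_000):
    """Label an association by where the SNP lies relative to the mapped gene.

    cis_window is an assumed cutoff, not a value taken from the source.
    """
    chrom, pos = snp
    g_chrom, start, end = gene
    nearby = chrom == g_chrom and start - cis_window <= pos <= end + cis_window
    return "cis" if nearby else "trans"


def summarize(assoc_a, assoc_b, genes, cis_window=50_000):
    """Count cis and trans labels among shared and subpopulation-specific hits.

    assoc_a, assoc_b: sets of (gene_name, (chromosome, position)) pairs
    genes:            dict of gene_name -> (chromosome, start, end)
    """
    groups = {
        "shared": assoc_a & assoc_b,
        "only_a": assoc_a - assoc_b,
        "only_b": assoc_b - assoc_a,
    }
    return {
        label: Counter(cis_or_trans(snp, genes[name], cis_window) for name, snp in hits)
        for label, hits in groups.items()
    }


# Placeholder data (hypothetical genes and associations).
toy_genes = {"g1": (1, 100_000, 102_000), "g2": (2, 500_000, 503_000)}
subpop_a = {("g1", (1, 101_000)), ("g2", (4, 9_000_000))}
subpop_b = {("g1", (1, 101_000)), ("g2", (5, 1_000_000))}
toy_summary = summarize(subpop_a, subpop_b, toy_genes)
print(toy_summary)
```

Matching on exact SNP identity is the strictest notion of sharing; matching on genomic regions instead would be a natural variation.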