Katherine Hartmann, Michał Seweryn, Wolfgang Sadee.
Abstract
Genome-wide association studies (GWAS) have implicated 58 loci in coronary artery disease (CAD). However, the biological basis for these associations, the relevant genes, and the causative variants often remain uncertain. Since the vast majority of GWAS loci reside outside coding regions, most exert regulatory functions. Here we explore the complexity of each of these loci, using tissue-specific RNA sequencing data from GTEx to identify genes that exhibit altered expression patterns in the context of GWAS-significant loci, expanding the list of candidate genes from the 75 currently annotated by GWAS to 245, with almost half of these transcripts being non-coding. Tissue-specific allelic expression imbalance data, also from GTEx, allow us to uncover GWAS variants that mark functional variation in a locus, e.g., rs7528419 residing in the SORT1 locus, in liver specifically, and rs72689147 in the GUCY1A1 locus, across a variety of tissues. We consider the GWAS variant rs1412444 in the LIPA locus in more detail as an example, probing tissue- and transcript-specific effects of genetic variation in the region. By evaluating linkage disequilibrium (LD) between tissue-specific eQTLs, we reveal evidence for multiple functional variants within loci. We identify 3 variants (rs1412444, rs1051338, rs2250781) that, when considered together, each improve the ability to account for LIPA gene expression, suggesting multiple interacting factors. These results refine the assignment of 58 GWAS loci to likely causative variants in a handful of cases and, for the remainder, help to re-prioritize associated genes and RNA isoforms, suggesting that ncRNAs may be a relevant transcript in almost half of CAD GWAS results. Our findings support a multi-factorial system where a single variant can influence multiple genes and each gene is regulated by multiple variants.
Year: 2022 PMID: 35192625 PMCID: PMC8863290 DOI: 10.1371/journal.pone.0244904
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Summary of CAD GWAS loci.
(A) For each of the 58 loci identified by GWAS, number of candidate genes annotated by GWAS and additional genes added by eQTL, then sQTL, and finally position-based reprioritization, if implicating genes other than those annotated previously by GWAS (see S1 Fig for further details about the approach and S3 File for a comprehensive table). Tier 1 (n = 7) denotes those loci where a GWAS-annotated gene is supported by QTL-based reprioritization or position and no other candidate genes are introduced; Tier 2 (n = 50) where QTL-based reprioritization or position introduces new associated genes while supporting all candidates at this locus (Tier 2A), only some including the GWAS gene (Tier 2B), or new genes except the GWAS genes (Tier 2C); and Tier 3 (n = 1) where no eQTLs or sQTLs are identified and no gene physically overlaps the SNP, so that annotation by GWAS is not supported and no other genes are implicated. (B) Corresponding figure for a recent large-scale GWAS for insulin resistance. (C) For each of the 245 candidate genes displayed along the x-axis (names available in S1 File), the number of transcripts assigned to the gene, the number of antisense transcripts (note: antisense genes are not included among the 245 candidate genes unless their expression is associated with or they physically overlap a GWAS variant), GO terms, papers indexed in PubMed, and cis-eQTLs and sQTLs published in v8 of GTEx. The blue bar highlights those genes with only non-coding transcripts.
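The tier definitions in panel (A) can be read as a small decision rule. The following is a minimal sketch of that reading, not the authors' pipeline: the function name `classify_locus` and its two set-valued inputs are hypothetical, and "supported" is assumed to mean any gene implicated by eQTL, sQTL, or physical overlap with the SNP.

```python
def classify_locus(gwas_genes, supported_genes):
    """Assign a tier to a locus (hypothetical helper, assumptions in the lead-in).

    gwas_genes: genes annotated to the locus by GWAS.
    supported_genes: genes implicated by eQTL, sQTL, or position.
    """
    gwas = set(gwas_genes)
    supported = set(supported_genes)

    if not supported:
        # No QTL evidence and no overlapping gene: nothing is implicated.
        return "Tier 3"

    confirmed = gwas & supported   # GWAS genes that are also supported
    new = supported - gwas         # genes introduced by QTL or position

    if not new:
        # Only GWAS-annotated genes are supported; no new candidates.
        return "Tier 1"
    if confirmed and confirmed == gwas:
        return "Tier 2A"           # new genes, all GWAS candidates supported
    if confirmed:
        return "Tier 2B"           # new genes, only some GWAS candidates supported
    return "Tier 2C"               # new genes, no GWAS candidate supported


# ABCG5 and ABCG8 were both annotated by GWAS, but only ABCG8 was supported.
print(classify_locus({"ABCG5", "ABCG8"}, {"ABCG8"}))
# Placeholder gene names, purely illustrative.
print(classify_locus({"GENE_A"}, {"GENE_A", "GENE_B"}))
```

Under this reading, a locus stays in Tier 1 even when one of several GWAS-annotated genes lacks support, as long as no new gene is introduced.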
Tier 1 CAD GWAS loci.
| Locus | SNP | OR | Risk Allele (Freq) | Gene | eQTL Tissue(s) | sQTL Tissue(s) | Position |
|---|---|---|---|---|---|---|---|
| 16 | rs6903956 | 1.65 | A (0.08) | ADTRP | Testis | | ADTRP (intron) |
| 32 | rs11203042 | 1.04 (1.02–1.06) | T (0.45) | LIPA | Adipose (subq) | Adipose (subq) | LIPA (intron) |
| 32 | rs1412444 | 1.07 (1.05–1.09) | T (0.37) | LIPA | Adipose (subq) | Adipose (subq) | LIPA (intron) |
| 38 | rs9319428 | 1.04 (1.02–1.06) | A (0.31) | FLT1 | Nerve (tibial) | | FLT1 (intron) |
| 42 | rs17514846 | 1.05 (1.03–1.07) | A (0.44) | FES | Adipose (subq) | Adipose (subq) | FURIN (intron) |
| | | | | FURIN | Artery (aorta) | | |
| 54 | rs7212798 | | | BCAS3 | | | BCAS3 (intron) |
| 57 | rs11830157 | | | KSR2 | | | KSR2 (intron) |
| 08 | rs6544713 | 1.05 (1.03–1.07) | T (0.32) | ABCG8 | Colon (transverse) | | ABCG8 (intron) |
Tissue names in grey font indicate that the GWAS SNP is associated with a decrease in gene expression (eQTL) or normalized intron-excision ratio (sQTL), while those in black font are associated with an increased expression/normalized intron-excision ratio, as reported by GTEx.
a Values reported in the original publication [28] in a Han Chinese population. rs6903956 was not significant in Nikpay et al. [1].
b ABCG8 and ABCG5 were both annotated by GWAS. ABCG5 was not supported by QTL or position.
Fig 2. Allelic expression imbalance at GWAS variants marks functional SNPs.
Deviation of the observed from the expected allelic ratio in individuals heterozygous for the given GWAS variant. (A) Locus 3: rs7528419 (SORT1) exhibits AEI in 53/57 liver samples. Subcutaneous adipose, also shown, demonstrates a near-normal distribution of deviation from the expected allelic ratio and is representative of the 46 other tissues with at least 5 samples. (B) Locus 14: rs72689147 (GUCY1A3) exhibits AEI in 114/121 samples across 10 different tissues.
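To make "deviation from the expected allelic ratio" concrete, the sketch below scores a single heterozygous sample from its reference and alternate read counts and tests it against a 50:50 expectation with an exact binomial test. This is one common way to call AEI and is an assumption here; the caption does not state which test was used. The function names and read counts are hypothetical.

```python
from math import comb


def allelic_deviation(ref_reads, alt_reads, expected=0.5):
    """Observed reference-allele fraction minus the expected fraction."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads covering the heterozygous site")
    return ref_reads / total - expected


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely
    than the observed count k under Binomial(n, p).
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    threshold = pmf[k] * (1 + 1e-9)  # small tolerance for float ties
    return min(1.0, sum(x for x in pmf if x <= threshold))


# Hypothetical read counts for one heterozygous sample.
ref, alt = 80, 20
print(allelic_deviation(ref, alt))
print(binomial_two_sided_p(ref, ref + alt))
```

A per-sample call like this would then be tallied across samples within a tissue, which is the kind of summary reported as 53/57 liver samples for rs7528419.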
Fig 3. Number of eQTL signals.
Correlation plots show the absolute value of beta for variant effects on RNA expression versus R² with the top eQTL (most significant p-value), including all significant eQTLs in the given gene-tissue combination. Blue dots represent the top eQTL; red dots represent the GWAS variant(s). (A) FLT1 in tibial nerve: eQTLs are accounted for by a single eQTL signal marked by the GWAS variant (beta for all eQTLs is linearly correlated with R²). For CELSR2 (liver), GUCY1A3 (tibial artery), and LIPA (blood), the relationship between beta and R² suggests multiple functional variants. (B) At least three distinct LD blocks are represented by LIPA eQTLs in whole blood. Correlations are shown, left to right, between the absolute value of beta and R² with rs1412444 (GWAS SNP), rs1051338, or rs2250781. Tightly linked SNPs (D' > 0.9; R² > 0.9) are shown in the same color.
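The two LD measures used for the color grouping, D' and R², can be computed from haplotype and allele frequencies using their standard definitions. The sketch below is illustrative only: the function names and the example frequencies are hypothetical, and only the 0.9 cutoffs come from the caption.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D', r2) for two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and allele B at SNP 2.
    p_a, p_b: frequencies of allele A and allele B.
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("both SNPs must be polymorphic")
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2


def tightly_linked(d_prime, r2, cutoff=0.9):
    """Apply the caption's grouping rule: D' > 0.9 and R2 > 0.9."""
    return d_prime > cutoff and r2 > cutoff


# Hypothetical frequencies: D' is high, but r2 is low because allele frequencies differ.
d_prime, r2 = ld_stats(p_ab=0.1, p_a=0.3, p_b=0.1)
print(d_prime, r2, tightly_linked(d_prime, r2))
```

Requiring both measures matters because D' alone can be near 1 for SNPs with very different allele frequencies, in which case one SNP is a poor proxy for the other.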
Assessing multiple regulatory variants for LIPA.
| Variable of interest | ANOVA p-value | Model 1 | AIC | Model 2 | AIC |
|---|---|---|---|---|---|
| rs1412444 | 8.8e-16 | XP ~ sex + age | 3310 | XP ~ rs1412444 + sex + age | 1778 |
| rs13332328 | 8.8e-16 | XP ~ sex + age | 3310 | XP ~ rs13332328 + sex + age | 1782 |
| rs1051338 | 8.8e-16 | XP ~ sex + age | 3310 | XP ~ rs1051338 + sex + age | 1779 |
| rs2250781 | 8.8e-16 | XP ~ sex + age | 3310 | XP ~ rs2250781 + sex + age | 1800 |
| rs13332328 | 1.0 | XP ~ rs1412444 + sex + age | 1778 | XP ~ rs1412444 + rs13332328 + sex + age | 1781 |
| rs1051338 | 0.23 | XP ~ rs1412444 + sex + age | 1778 | XP ~ rs1412444 + rs1051338 + sex + age | 1777 |
| rs2250781 | 0.04 | XP ~ rs1412444 + sex + age | 1778 | XP ~ rs1412444 + rs2250781 + sex + age | 1773 |
| rs1051338 | 0.19 | XP ~ rs1412444 + rs2250781 + sex + age | 1773 | XP ~ rs1412444 + rs2250781 + rs1051338 + sex + age | 1773 |
| rs2250781 | 0.04 | XP ~ rs1412444 + rs1051338 + sex + age | 1777 | XP ~ rs1412444 + rs1051338 + rs2250781 + sex + age | 1773 |
| rs1412444 | 1e-3 | CAD ~ covariates | 387.7 | CAD ~ rs1412444 + covariates | 385.1 |
| rs13332328 | 1e-3 | CAD ~ covariates | 387.7 | CAD ~ rs13332328 + covariates | 385.3 |
| rs1051338 | 6e-4 | CAD ~ covariates | 387.7 | CAD ~ rs1051338 + covariates | 384.3 |
| rs2250781 | 4e-4 | CAD ~ covariates | 387.7 | CAD ~ rs2250781 + covariates | 383.5 |
| rs13332328 | 0.79 | CAD ~ rs1412444 + covariates | 385.1 | CAD ~ rs1412444 + rs13332328 + covariates | 386 |
| rs1051338 | 0.36 | CAD ~ rs1412444 + covariates | 385.1 | CAD ~ rs1412444 + rs1051338 + covariates | 384.5 |
| rs2250781 | 0.56 | CAD ~ rs1412444 + covariates | 385.1 | CAD ~ rs1412444 + rs2250781 + covariates | 385.9 |
| rs1051338 | 0.11 | CAD ~ rs1412444 + rs2250781 + covariates | 385.9 | CAD ~ rs1412444 + rs2250781 + rs1051338 + covariates | 385.1 |
| rs2250781 | 0.17 | CAD ~ rs1412444 + rs1051338 + covariates | 384.5 | CAD ~ rs1412444 + rs1051338 + rs2250781 + covariates | 385.1 |
ANOVA comparing generalized linear models with different SNP combinations accounting for LIPA expression and CAD (defined as history of myocardial infarction and/or >50% stenosis of vessel). Covariates in CATHGEN include sex, age, hypercholesterolemia, smoking, and number of diseased vessels.
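The AIC columns can be read alongside the ANOVA p-values: the model with the lower AIC is preferred, and the size of the difference indicates how strongly. The sketch below computes the AIC difference and the standard relative likelihood, exp(-|ΔAIC|/2), for two rows of the table. The helper `compare_aic` is hypothetical, and the relative-likelihood summary is a general convention rather than something reported in the source.

```python
from math import exp


def compare_aic(aic_model1, aic_model2):
    """Compare two models by AIC; the lower AIC is preferred.

    Returns (preferred, delta, relative_likelihood), where delta is
    AIC(Model 1) - AIC(Model 2) and relative_likelihood describes the
    higher-AIC model relative to the lower-AIC model.
    """
    delta = aic_model1 - aic_model2  # positive when Model 2 has the lower AIC
    if delta > 0:
        preferred = "Model 2"
    elif delta < 0:
        preferred = "Model 1"
    else:
        preferred = "tie"
    relative_likelihood = exp(-abs(delta) / 2)
    return preferred, delta, relative_likelihood


# AIC values taken from the LIPA expression (XP) rows of the table.
print(compare_aic(1778, 1773))  # adding rs2250781 to rs1412444 + sex + age
print(compare_aic(1778, 1781))  # adding rs13332328 to rs1412444 + sex + age
```

In these two rows the AIC comparison points the same way as the ANOVA p-values: adding rs2250781 lowers the AIC (p = 0.04), while adding rs13332328 raises it (p = 1.0).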
Fig 4. LIPA expression, CAD, and genotype.
Comparison of LIPA expression in CATHGEN for those with and without CAD based on rs1412444 genotype. LIPA exhibits higher expression only in those without CAD in the homozygous minor group (p-value = 0.02).
Fig 5. Tissue-specific eQTLs for LIPA.
Heatmap of LD for the SNPs reported by GTEx as genome-wide significant eQTLs for LIPA. Lighter-colored squares in the heatmap represent LD blocks, with SNPs clustered by R² and not by genomic position. Colored bars at the top mark eQTLs in each tissue, with darker colors denoting more significant p-values.