Eddie Cano-Gamez, Gosia Trynka.
Abstract
Genome-wide association studies (GWAS) have successfully mapped thousands of loci associated with complex traits. These associations could reveal the molecular mechanisms altered in common complex diseases and result in the identification of novel drug targets. However, GWAS have also left a number of outstanding questions. In particular, the majority of disease-associated loci lie in non-coding regions of the genome and, even though they are thought to play a role in gene expression regulation, it is unclear which genes they regulate and in which cell types or physiological contexts this regulation occurs. This has hindered the translation of GWAS findings into clinical interventions. In this review we summarize how these challenges have been addressed over the last decade, with a particular focus on the integration of GWAS results with functional genomics datasets. Firstly, we investigate how the tissues and cell types involved in diseases can be identified using methods that test for enrichment of GWAS variants in genomic annotations. Secondly, we explore how to find the genes regulated by GWAS loci using methods that test for colocalization of GWAS signals with molecular phenotypes such as quantitative trait loci (QTLs). Finally, we highlight potential future research avenues such as integrating GWAS results with single-cell sequencing read-outs, designing functionally informed polygenic risk scores (PRS), and validating disease-associated genes using genetic engineering. These tools will be crucial to identify new drug targets for common complex diseases.
Keywords: GWAS; QTL; SNP enrichment; TWAS; colocalization analysis; eQTL; single-cell RNA seq
Year: 2020 PMID: 32477401 PMCID: PMC7237642 DOI: 10.3389/fgene.2020.00424
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
FIGURE 1. Challenges in interpreting GWAS associations. From the top: the Manhattan plot illustrates the association between genetic variants and a trait (e.g., a disease) at a genome-wide level (left panel) and within an example locus (right panel). Variants above the dotted line represent genome-wide significant associations. The panels below illustrate the main challenges in interpreting GWAS associations: high LD between variants (encoded in shades of red), variable levels of regulatory activity of the genomic regions across cell types (peaks of different heights represent different levels of activity of chromatin marks), and multiple genes within the associated locus.
FIGURE 2. Overview of SNP enrichment analysis using chromatin annotations. SNP enrichment analysis integrates association signals from GWAS (Manhattan plot on the top left) with functional genomics data such as chromatin annotations (heatmap on the bottom left). GWAS SNPs are overlapped with regulatory elements (right panel) and, if in a given tissue the overlap occurs more frequently than expected by chance, the tissue is assigned a high enrichment score.
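The overlap test described in Figure 2 can be illustrated with a small permutation sketch. The Python below is a hypothetical, simplified illustration and not the algorithm of any published enrichment tool: it counts how many GWAS index SNPs fall inside regulatory elements, then builds a null distribution by circularly shifting the SNP positions within the region. The function names, the half-open interval convention, and the circular-shift null are all assumptions made for this example.

```python
import random
from bisect import bisect_right


def count_overlaps(snps, intervals):
    """Count SNP positions inside any [start, end) interval.

    `intervals` must be sorted by start and non-overlapping.
    """
    starts = [start for start, _ in intervals]
    hits = 0
    for pos in snps:
        i = bisect_right(starts, pos) - 1  # last interval starting at or before pos
        if i >= 0 and pos < intervals[i][1]:
            hits += 1
    return hits


def enrichment_p_value(snps, intervals, region_length, n_perm=1000, seed=0):
    """Empirical p-value for observing at least this many SNP/annotation overlaps.

    Null model (an assumption of this sketch): all SNPs are shifted by the same
    random offset, wrapping around the region, which preserves their spacing.
    """
    rng = random.Random(seed)
    observed = count_overlaps(snps, intervals)
    at_least = 0
    for _ in range(n_perm):
        shift = rng.randrange(region_length)
        shifted = [(pos + shift) % region_length for pos in snps]
        if count_overlaps(shifted, intervals) >= observed:
            at_least += 1
    # Add-one correction so the empirical p-value is never exactly zero.
    return observed, (at_least + 1) / (n_perm + 1)
```

Running `enrichment_p_value` once per tissue, with that tissue's annotation intervals, would give one score per tissue that can then be ranked. Real methods additionally account for LD, annotation size, and local genomic structure, which this sketch ignores.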
Methods for SNP enrichment analysis.
| Method | Signal tested | Input data |
| --- | --- | --- |
| SNPsea | Accumulation of GWAS variants near genes with high tissue specificity | Gene expression, GWAS index variants |
| EpiGWAS | Accumulation of GWAS variants near highly active regulatory elements | Chromatin marks, GWAS index variants |
| GREGOR | Accumulation of GWAS variants in regulatory elements | Chromatin marks, GWAS index variants |
| GoShifter | Intersection of GWAS variants with regulatory annotations (based on local shifting of annotations) | Functional annotations, GWAS index variants |
| fGWAS | Higher GWAS effect sizes observed if a locus and a SNP overlap a functional annotation | Functional annotations, GWAS summary statistics |
| CHEERS | Accumulation of GWAS variants in regulatory elements with high tissue specificity | Chromatin marks (quantitative), GWAS index variants |
| GARFIELD | Higher GWAS effect sizes observed in variants that overlap regulatory annotations | Chromatin annotations, full GWAS summary statistics |
| RolyPoly | Higher GWAS effect sizes observed near highly expressed genes | Gene expression, full GWAS summary statistics |
| LDSC | Accumulation of heritability in variants overlapping a functional annotation | Chromatin annotations, full GWAS summary statistics |
| LDSC-SEG | Accumulation of heritability near tissue-specific genes | Gene expression, full GWAS summary statistics |
FIGURE 3. Overview of eQTL mapping and colocalization. (A) In eQTL mapping, gene expression is profiled in thousands of individuals and the expression level of each gene is tested for association with genotypes at nearby (cis) SNPs. (B) Colocalization compares the association patterns of GWAS and eQTLs at a locus to determine whether both signals are driven by the same causal variants. (C) GWAS and eQTL signals can overlap for three reasons: two independent causal variants in LD (linkage), a single causal variant affecting the GWAS trait via gene expression modulation (causality), or a single causal variant affecting both traits independently (pleiotropy). A positive colocalization supports causality or pleiotropy over linkage.
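The comparison in panel (B) can be made concrete with a simplified Bayesian sketch. The code below is loosely modeled on the single-causal-variant framework used by summary-statistic tools such as COLOC, but it is an illustration under stated assumptions, not a reference implementation: the prior effect standard deviation (`prior_sd`), the prior probabilities (`p1`, `p2`, `p12`), and all function names are chosen for the example. It takes per-SNP effect sizes and standard errors for the GWAS trait and the eQTL, converts them to approximate Bayes factors, and returns posterior probabilities for five hypotheses: no association (H0), GWAS only (H1), eQTL only (H2), two distinct causal variants (H3, the "linkage" case), and one shared causal variant (H4).

```python
import math


def log_abf(beta, se, prior_sd=0.15):
    """Log approximate Bayes factor for one SNP from its effect size and SE."""
    z = beta / se
    r = prior_sd ** 2 / (prior_sd ** 2 + se ** 2)  # shrinkage factor
    return 0.5 * (math.log(1 - r) + r * z * z)


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) that tolerates empty input."""
    if not values:
        return float("-inf")
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))


def coloc_posteriors(gwas, eqtl, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities of H0..H4 for one locus.

    `gwas` and `eqtl` are lists of (beta, se) for the same SNPs in the same
    order. Assumes at most one causal variant per trait (a key simplification).
    """
    l1 = [log_abf(b, s) for b, s in gwas]
    l2 = [log_abf(b, s) for b, s in eqtl]
    n = len(l1)
    shared = logsumexp([a + b for a, b in zip(l1, l2)])
    # Distinct causal variants: every pair (i, j) with i != j.
    distinct = logsumexp([l1[i] + l2[j] for i in range(n) for j in range(n) if i != j])
    log_terms = {
        "H0": 0.0,
        "H1": math.log(p1) + logsumexp(l1),
        "H2": math.log(p2) + logsumexp(l2),
        "H3": math.log(p1) + math.log(p2) + distinct,
        "H4": math.log(p12) + shared,
    }
    total = logsumexp(list(log_terms.values()))
    return {h: math.exp(v - total) for h, v in log_terms.items()}
```

A high posterior for H4 corresponds to the "positive colocalization" in panel (C); a high posterior for H3 points to linkage. Note that H4 alone cannot distinguish causality from pleiotropy.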
Methods for colocalization analysis.
| Method | Approach | Input data |
| --- | --- | --- |
| Regulatory trait concordance (RTC) | Conditional regression | Individual genotypes |
| Proportionality test | Test for concordance of effects | Individual genotypes |
| Sherlock | Genome-wide comparison of association “signatures” | Summary statistics |
| COLOC | Bayesian test | Summary statistics |
| gwas-pw | Bayesian test | Summary statistics |
| eCAVIAR | Bayesian fine-mapping and colocalization | Summary statistics |
| enloc | Bayesian test for enrichment, fine-mapping and colocalization | Summary statistics |
| MOLOC | Bayesian test for multiple traits | Summary statistics |
FIGURE 4. Overview of transcriptome-wide association studies (TWAS). TWAS leverage information from eQTL catalogs and GWAS to directly associate traits with genes. (A) TWAS use eQTL maps (which contain tissue-specific gene expression and genotypes for thousands of individuals) as a training set to build gene expression predictors. These predictors take the SNPs in cis to a gene and estimate its expression levels. (B) The resulting predictors are used to impute gene expression values across the hundreds of thousands of individuals in a GWAS (which contains genotypes but no gene expression data). Finally, the imputed gene expression values are directly tested for association with the GWAS trait, resulting in a set of genes that positively or negatively influence it.
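The two-stage logic of Figure 4 (train a predictor on an eQTL reference panel, impute expression in the GWAS cohort, then test the imputed values against the trait) can be sketched as follows. This is a minimal illustration on simulated data, not any published TWAS implementation. In particular, the predictor here uses per-SNP marginal regression slopes as weights, which is a deliberate simplification; published TWAS methods typically fit multi-SNP predictors. The simulation parameters (three cis SNPs, allele frequency 0.3, effect sizes 0.8 and 0.5) and all function names are invented for the example.

```python
import math
import random


def mean(xs):
    return sum(xs) / len(xs)


def slope(x, y):
    """Least-squares slope of y on x (0.0 if x has no variance)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    if sxx == 0:
        return 0.0
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx


def train_weights(genotypes, expression):
    """Stage A: one weight per cis SNP, here the marginal regression slope."""
    n_snps = len(genotypes[0])
    return [slope([g[j] for g in genotypes], expression) for j in range(n_snps)]


def impute_expression(genotypes, weights):
    """Stage B: predicted expression is the weighted sum of allele dosages."""
    return [sum(w * d for w, d in zip(weights, g)) for g in genotypes]


def association_t(x, y):
    """Slope and t-statistic from simple linear regression of y on x."""
    n = len(x)
    b = slope(x, y)
    mx, my = mean(x), mean(y)
    intercept = my - b * mx
    rss = sum((yi - intercept - b * xi) ** 2 for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    se = math.sqrt(rss / (n - 2) / sxx)
    return b, b / se


def simulate(n, rng, maf=0.3):
    """Toy data: SNP 0 drives expression, and expression drives the trait."""
    genotypes, expression, trait = [], [], []
    for _ in range(n):
        g = [sum(rng.random() < maf for _ in range(2)) for _ in range(3)]
        e = 0.8 * g[0] + rng.gauss(0, 0.5)
        genotypes.append(g)
        expression.append(e)
        trait.append(0.5 * e + rng.gauss(0, 1))
    return genotypes, expression, trait


rng = random.Random(42)
ref_genotypes, ref_expression, _ = simulate(300, rng)    # eQTL reference panel
gwas_genotypes, _, gwas_trait = simulate(2000, rng)      # GWAS cohort, no expression
weights = train_weights(ref_genotypes, ref_expression)
imputed = impute_expression(gwas_genotypes, weights)
effect, t_stat = association_t(imputed, gwas_trait)
```

The sign of `effect` indicates whether higher predicted expression is associated with a higher or lower trait value, matching the "positively or negatively influence" wording of the caption. In practice this test would be repeated for every gene and tissue with a usable predictor, with correction for multiple testing.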