Sarah L Spain, Jeffrey C Barrett.
Abstract
Genome-wide association studies (GWAS) have identified thousands of robust and replicable genetic associations for complex disease. However, the identification of the causal variants that underlie these associations has been more difficult. This problem of fine-mapping association signals predates GWAS, but the last few years have seen a surge of studies aimed at pinpointing causal variants using both statistical evidence from large association data sets and functional annotations of genetic variants. Combining these two approaches can often determine not only the causal variant but also the target gene. Recent contributions include analyses of custom genotyping arrays, such as the Immunochip, statistical methods to identify credible sets of causal variants and the addition of functional genomic annotations for coding and non-coding variation to help prioritize variants and discern functional consequence and hence the biological basis of disease risk.
Year: 2015 PMID: 26157023 PMCID: PMC4572002 DOI: 10.1093/hmg/ddv260
Source DB: PubMed Journal: Hum Mol Genet ISSN: 0964-6906 Impact factor: 6.150
Figure 1. An overview of procedures for fine-mapping of GWAS loci.
Figure 2. Conditional association analysis using genotype-level data, conditioning on the lead SNP (the SNP with the lowest P-value in the GWAS, indicated by the orange circles), for (A) one independent signal and (B) two independent signals. The top plots show the results of the association analysis and the bottom plots the results after conditioning on the lead SNP.
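The conditioning step described in the Figure 2 caption can be sketched in a few lines. This is a minimal illustration, not the method used in any of the cited studies: it assumes a quantitative trait analysed by ordinary least squares, adjusts for the lead SNP by residualizing both the phenotype and the test SNP on it (the Frisch-Waugh-Lovell approach), and uses a normal approximation for the P-value. The function names, allele frequencies, effect sizes, and the simulated data are all hypothetical.

```python
import math
import random


def residualize(y, x):
    """Residuals of y after least-squares regression on x with an intercept."""
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return [(yi - mean_y) - slope * (xi - mean_x) for xi, yi in zip(x, y)]


def conditional_test(pheno, snp, lead):
    """Effect of `snp` on `pheno` after conditioning on the `lead` SNP.

    Returns (beta, standard error, two-sided P-value). The P-value uses a
    normal approximation, which is an assumption suited to large samples.
    A SNP perfectly correlated with the lead SNP cannot be tested this way
    (its residual variance is zero).
    """
    res_y = residualize(pheno, lead)
    res_x = residualize(snp, lead)
    sxx = sum(v * v for v in res_x)
    beta = sum(a * b for a, b in zip(res_x, res_y)) / sxx
    rss = sum((a - beta * b) ** 2 for a, b in zip(res_y, res_x))
    dof = len(pheno) - 3  # intercept, lead SNP, test SNP
    se = math.sqrt(rss / dof / sxx)
    p_value = math.erfc(abs(beta / se) / math.sqrt(2))
    return beta, se, p_value


def draw_genotype(rng, maf):
    """Genotype coded as 0/1/2 copies of the minor allele."""
    return sum(rng.random() < maf for _ in range(2))


def simulate(n=2000, seed=1):
    """Hypothetical data: a causal lead SNP, a proxy in LD with it, and a
    second, independent causal SNP (the situation in panel B)."""
    rng = random.Random(seed)
    lead = [draw_genotype(rng, 0.3) for _ in range(n)]
    proxy = [g if rng.random() < 0.9 else draw_genotype(rng, 0.3) for g in lead]
    second = [draw_genotype(rng, 0.4) for _ in range(n)]
    pheno = [0.5 * a + 0.5 * b + rng.gauss(0, 1) for a, b in zip(lead, second)]
    return pheno, lead, proxy, second
```

In this simulated setting, a SNP that is associated only through LD with the lead SNP should lose its signal after conditioning, whereas an independent second signal should remain. For case-control data, logistic regression with the lead SNP as a covariate would be the analogous analysis.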
Figure 3. Fine-mapping from many variants in an associated region to a credible set of most likely causal variants. The plots illustrate the associated variants under two potential GWAS association peaks, with −log10(P-value) plotted against the chromosome position. The grey lines indicate the position of genome-wide significance at 5 × 10⁻⁸, showing the number of variants that would be prioritized by P-value alone. The points plotted in yellow are the variants in high LD (r²) with the lead variant. The points coloured in orange are the variants included in the 95% credible set for the most likely causal variants.
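One common way to build a credible set like the one in Figure 3 is sketched below. It is only an illustration under stated assumptions, not the procedure of any specific tool listed in this review: it assumes exactly one causal variant in the region, equal prior probability for every variant, and Wakefield's approximate Bayes factor computed from summary statistics with a prior effect-size variance of 0.04. The function names, variant identifiers, and effect estimates are hypothetical.

```python
import math


def wakefield_log_abf(beta, se, prior_var=0.04):
    """Log approximate Bayes factor in favour of association (Wakefield).

    `prior_var` is the assumed prior variance of the true effect size.
    """
    shrink = prior_var / (prior_var + se ** 2)
    z_squared = (beta / se) ** 2
    return 0.5 * (math.log(1.0 - shrink) + z_squared * shrink)


def credible_set(variants, coverage=0.95, prior_var=0.04):
    """Smallest set of variants whose posterior probabilities sum to `coverage`.

    `variants` is a list of (variant_id, beta, standard_error) tuples.
    Assumes a single causal variant and equal priors across variants.
    Returns (variant_id, posterior) pairs in decreasing posterior order.
    """
    log_abfs = [wakefield_log_abf(b, s, prior_var) for _, b, s in variants]
    largest = max(log_abfs)  # subtract the maximum to avoid overflow in exp
    weights = [math.exp(value - largest) for value in log_abfs]
    total = sum(weights)
    ranked = sorted(
        ((vid, w / total) for (vid, _, _), w in zip(variants, weights)),
        key=lambda pair: pair[1],
        reverse=True,
    )
    chosen, cumulative = [], 0.0
    for vid, posterior in ranked:
        chosen.append((vid, posterior))
        cumulative += posterior
        if cumulative >= coverage:
            break
    return chosen


# Hypothetical summary statistics: (variant id, effect estimate, standard error).
EXAMPLE_VARIANTS = [
    ("rsA", 0.300, 0.03),
    ("rsB", 0.295, 0.03),
    ("rsC", 0.270, 0.03),
    ("rsD", 0.050, 0.03),
]
```

Calling `credible_set(EXAMPLE_VARIANTS)` ranks the variants by posterior probability and stops once the cumulative probability reaches 95%. With these made-up numbers, three of the four variants pass genome-wide significance, yet the credible set is smaller, which is the contrast the figure draws between prioritizing by P-value and by posterior probability.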
Figure 4. Functional annotation schematic illustrating the annotation possibilities in the process of associated variant to target gene mapping. VEP, variant effect predictor; CADD, Combined Annotation-Dependent Depletion; TFBM, transcription factor binding motif.
Selection of fine-mapped GWAS loci from recent studies, concentrating on large studies using the custom genotyping chips
| Phenotype | References | Target gene | Array | Method (further details) |
|---|---|---|---|---|
| T1D | Onengut-Gumuscu | AFF3 | Immunochip | Bayesian analysis and 3C |
| | | BCAR2 | | |
| | | PTPN22 | | |
| | | IFIH1 | | (three independent signals) |
| | | CTSH | | |
| | | TYK2 | | (two independent signals) |
| | | FUT2 | | |
| T2D | DIAGRAM Consortium | JAZF1 | Metabochip | TE meta, MANTRA (overlaps enhancer region) |
| | | SLC30A8 | | (coding variant, functional experiment) |
| LDL cholesterol | Musunuru | SORT1 | – | siRNA knockdown (functional experiment) |
| Crohn’s disease | Farh | SMAD3 | Immunochip | PICS, TF binding |
| Multiple sclerosis | | IKZF3 | Immunochip | PICS, eQTL analysis |
| | | IL2RA | | PICS, enhancer map |
| Breast cancer | Ghoussaini | IGFBP5 | iCOGS | Enhancer map (variant flanks transcriptional enhancer and interacts with promoter) |
| | Dryden | IGFBP5 | – | CHi-C |
| Breast cancer | French | CCND1 | iCOGS | 3C, allele-specific ChIP |
| Breast cancer | Orr | KLF4 | Additional genotyping | |
| Breast cancer | Meyer | FGFR2 | iCOGS | DHS (variants alter TF binding) |
| Prostate cancer | Kote-Jarai | TERT | iCOGS | ChIP-seq and expression analysis |
| Breast cancer | Bojesen | TERT | iCOGS | mQTL TCGA (decreased methylation levels increase cancer risk) |
| RA | Okada | CDK6/CDK4 | Immunochip | TE meta |
| Psoriasis | Tsoi | IL13 | Immunochip | Functional SNP in high LD with lead SNP |
| Psoriasis | Tsoi | TRAF3IP2 | Immunochip | Functional lead SNP |
| | | STAT2 | | Functional SNP in high LD with lead SNP |
| | | PRSS53 | | Functional SNP in high LD with lead SNP |
| | | CARD14 | | Functional lead SNP |
| | | TYK2 | | Functional lead SNP |
| | | YDJC | | Functional SNP in high LD with lead SNP |
| | | ERAP2 | | Functional SNP in high LD with lead SNP |
| Psoriasis | Tsoi | NFKBIZ | Immunochip | eQTL |
| | | FUT11 | | eQTL |
| | | MYOZ1 | | Expression analysis (psoriatic versus normal skin) |
| | | NFKBIZ | | |
The target gene is the gene implicated by the functional annotation or fine-mapping analysis. Blank cells continue the entry from the row above. TE = trans-ethnic meta-analysis; TF = transcription factor; 3C = chromosome conformation capture; CHi-C = capture Hi-C, similar to 3C; DHS = DNase hypersensitivity sites; RA = rheumatoid arthritis.
| Function | Tool | URL |
|---|---|---|
| Functional annotation of genetic variation | VEP | |
| | ANNOVAR | |
| Reference panels for imputation and LD estimation | 1000 Genomes Project | |
| Bayesian method to identify credible sets using genotype level data | BIMBAM | |
| Bayesian method optimized for trans-ethnic meta-analysis | MANTRA | Available by request from author ( |
| Bayesian methods using summary statistics | CAVIARBF | |
| | PAINTOR | |
| Bayesian methods including functional annotation | fGWAS | |
| | PICS | |
| Non-coding genome annotation projects | ENCODE | |
| | Roadmap | |
| | FANTOM5 | |
| Databases using regulatory information to infer function | RegulomeDB | |
| | HaploReg | |
| eQTLs | GTEx | |
| Enrichment analysis | GoShifter | |
| Drug target databases | ChEMBL | |
| | DrugBank | |
| | Therapeutic Target Database | |
| PubMed text mining of literature | GRAIL | |
| Protein–protein interactions | DAPPLE | |
| Pathway prioritization protocol | MEAGA | |