Benjamin Goudey, Gad Abraham, Eder Kikianty, Qiao Wang, Dave Rawlinson, Fan Shi, Izhak Haviv, Linda Stern, Adam Kowalczyk, Michael Inouye.
Abstract
Interaction analysis of GWAS can detect signal that would be ignored by single variant analysis, yet few robust interactions in humans have been detected. Recent work has highlighted interactions in the MHC region between known HLA risk haplotypes for various autoimmune diseases. To better understand the genetic interactions underlying celiac disease (CD), we have conducted exhaustive genome-wide scans for pairwise interactions in five independent CD case-control studies, using a rapid model-free approach to examine over 500 billion SNP pairs in total. We found 14 independent interaction signals within the MHC region that met stringent replication criteria across multiple studies and were independent of known CD risk HLA haplotypes. The strongest independent CD interaction signal corresponded to genes in the HLA class III region, in particular PRRC2A and GPANK1/C6orf47, which are known to contain variants associated with non-Hodgkin's lymphoma and early menopause, co-morbidities of celiac disease. Replicable evidence for statistical interaction outside the MHC was not observed. Both within and between European populations, we observed striking consistency of two-locus models and model distribution. Within the UK population, models of CD based on both interactions and additive single-SNP effects increased explained CD variance by approximately 1% over those of single SNPs. The interaction signal detected across the five cohorts indicates the presence of novel associations in the MHC region that cannot be detected using additive models. Our findings have implications for the determination of genetic architecture and, by extension, the use of human genetics for validation of therapeutic targets.
Year: 2017 PMID: 28282431 PMCID: PMC5345796 DOI: 10.1371/journal.pone.0172826
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Summary of datasets used in this study.
| Dataset | Population | SNPs | Case samples | Case platform | Control samples | Control platform | Reference |
|---|---|---|---|---|---|---|---|
| UK1 | UK | 301,546 | 763 | Illumina Hap300v1-1 | 1420 | Illumina Hap550 | [ |
| UK2 | UK | 515,413 | 1826 | Illumina 670-Quad | 3777 | Illumina 1.2M-Duo | [ |
| FIN | Finland | 513,952 | 647 | Illumina 670-Quad | 1829 | Illumina 610-Quad | [ |
| NL | Netherlands | 515,169 | 803 | Illumina 670-Quad | 846 | Illumina 670-Quad | [ |
| IT | Italy | 515,641 | 497 | Illumina 670-Quad | 543 | Illumina 670-Quad | [ |
| Overlapping SNPs | | 290,277 | | | | | |
a. The number of samples/SNPs is reported after quality control procedures were applied.
b. All platforms contain a common set of Hap300 markers; the Hap550 and 610-Quad contain a common set of Hap550 markers.
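The abstract's figure of over 500 billion SNP pairs can be sanity-checked against the SNP counts in the table above. The sketch below assumes that an exhaustive scan tests every unordered pair of post-QC SNPs within each dataset, that is n(n-1)/2 pairs for n SNPs; the variable names are illustrative and the exact SNP sets scanned by the authors may differ.

```python
from math import comb

# Post-QC SNP counts per dataset, copied from the dataset summary table.
snp_counts = {
    "UK1": 301_546,
    "UK2": 515_413,
    "FIN": 513_952,
    "NL": 515_169,
    "IT": 515_641,
}

# Assumption: an exhaustive pairwise scan tests every unordered pair,
# i.e. "n choose 2" pairs for a dataset with n SNPs.
pair_counts = {name: comb(n, 2) for name, n in snp_counts.items()}
total_pairs = sum(pair_counts.values())

for name, pairs in pair_counts.items():
    print(f"{name}: {pairs:,} pairs")
print(f"Total across datasets: {total_pairs:,} pairs")
```

Under this assumption the total across the five datasets exceeds 500 billion, in line with the abstract.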
Summary of the number of significant SNP pairs detected and of those that also replicate in at least one other cohort.
| Dataset | Two SNPs inside MHC | One SNP inside MHC | Both SNPs outside MHC |
|---|---|---|---|
| UK1 | 5,930 (5,454) | 0 (0) | 1 (0) |
| UK2 | 100,863 (30,351) | 0 (0) | 23 (0) |
| FIN | 24,080 (8,058) | 0 (0) | 5 (0) |
| NL | 2,505 (819) | 0 (0) | 0 (0) |
| IT | 1,006 (397) | 0 (0) | 6 (0) |
| Unique pairs | 134,384 (45,079) | 0 (0) | 35 (0) |
For each dataset, we show the number of SNP pairs detected and, in brackets, the number that are also significant in at least one other cohort. Pairs are separated into those where both SNPs are inside the MHC region, those with one SNP inside the MHC region and one outside, and those where both SNPs are outside the MHC region.
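To make the bracketed replication counts easier to compare across cohorts, the short sketch below converts them into fractions of detected pairs. It only restates the within-MHC column of the table above; the function and variable names are illustrative.

```python
# (detected, replicated) counts for pairs with both SNPs inside the MHC,
# copied from the table above.
mhc_pairs = {
    "UK1": (5_930, 5_454),
    "UK2": (100_863, 30_351),
    "FIN": (24_080, 8_058),
    "NL": (2_505, 819),
    "IT": (1_006, 397),
}


def replication_fraction(detected: int, replicated: int) -> float:
    """Share of detected pairs that are also significant in another cohort."""
    return replicated / detected if detected else 0.0


fractions = {
    name: replication_fraction(detected, replicated)
    for name, (detected, replicated) in mhc_pairs.items()
}

for name, fraction in fractions.items():
    print(f"{name}: {fraction:.1%} of detected MHC pairs replicate")
```

By this measure roughly 92% of the UK1 pairs replicate, while the other four cohorts fall between roughly 30% and 40%.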
Independent validated interaction signals detected in UK1.
| Pair | SNP | RS | Chr | Position (bp) | Closest Gene | Distance to gene (bp) | eQTL | Evidence of TF binding | Previously reported |
|---|---|---|---|---|---|---|---|---|---|
| 1 | hg18.chr6:g.31701455G>A | rs2260000 | 6 | 31701455 | SNORA38 | -2489 | - | Low | - |
| | hg18.chr6:g.31736712C>T | rs805262 | 6 | 31736712 | C6orf47 | -184 | - | Low | - |
| 2 | hg18.chr6:g.32777745G>A | rs2647050 | 6 | 32777745 | HLA-DQB1 | -35301 | - | Low | Emphysema imaging phenotypes |
| | hg18.chr6:g.32778934A>G | rs2856705 | 6 | 32778934 | HLA-DQB1 | -36490 | - | - | - |
| 3 | hg18.chr6:g.29719410A>G | rs29232 | 6 | 29719410 | GABBR1 | -10519 | - | - | Nasopharyngeal carcinoma |
| | hg18.chr6:g.29775252T>C | rs7776082 | 6 | 29775252 | ZFP57 | -22342 | - | - | - |
| 4 | hg18.chr6:g.32685358G>A | rs660895 | 6 | 32685358 | HLA-DRB1 | -19767 | GTEx/Regulome | High | Rheumatoid arthritis, IgA nephropathy |
| | hg18.chr6:g.32877641C>T | rs2219893 | 6 | 32877641 | HLA-DOB | 10876 | - | Low | - |
| 5 | hg18.chr6:g.31192414T>C | rs1062470 | 6 | 31192414 | CDSN | 1571 | GTEx | Low | - |
| | hg18.chr6:g.31317489G>A | rs3130712 | 6 | 31317489 | HLA-C | 27015 | Regulome | High | - |
| 6 | hg18.chr6:g.32259421T>C | rs2070600 | 6 | 32259421 | AGER | -656 | - | Low | Pulmonary function, Emphysema imaging phenotypes |
| | hg18.chr6:g.32514320A>C | rs3129871 | 6 | 32514320 | HLA-DRA | 1276 | - | Low | Multiple sclerosis |
| 7 | hg18.chr6:g.31429190T>C | rs2596501 | 6 | 31429190 | HLA-B | 431 | GTEx/Regulome | High | - |
| | hg18.chr6:g.31886251C>T | rs2227956 | 6 | 31886251 | HSPA1L | 877 | Regulome | High | - |
| 8 | hg18.chr6:g.32484449C>A | rs3763313 | 6 | 32484449 | BTNL2 | -1564 | Regulome | High | HIV |
| | hg18.chr6:g.32871088G>A | rs2621377 | 6 | 32871088 | HLA-DOB | 17429 | - | - | - |
| 9 | hg18.chr6:g.27520365A>G | rs7772160 | 6 | 27520365 | ZNF184 | 134 | GTEx | - | - |
| | hg18.chr6:g.27588896C>T | rs6918131 | 6 | 27588896 | ZNF184 | -40020 | GTEx | Low | - |
| 10 | hg18.chr6:g.31630648A>G | rs6929796 | 6 | 31630648 | NFKBIL1 | -3937 | GTEx | Low | - |
| | hg18.chr6:g.31644203T>C | rs2844484 | 6 | 31644203 | LTA | 3651 | GTEx/Regulome | High | - |
| 11 | hg18.chr6:g.33147603C>T | rs9277341 | 6 | 33147603 | HLA-DPA1 | -1829 | Regulome | High | - |
| | hg18.chr6:g.33191099G>A | rs1810472 | 6 | 33191099 | HLA-DPB2 | 2829 | Regulome | High | - |
| 12 | hg18.chr6:g.32919361G>A | rs6924102 | 6 | 32919361 | PSMB8 | -433 | GTEx | Medium | |
| | hg18.chr6:g.32919607A>C | rs2071543 | 6 | 32919607 | PSMB8 | -187 | GTEx | Low | IgA nephropathy |
| 13 | hg18.chr6:g.31572718T>C | rs3828903 | 6 | 31572718 | MICB | 2082 | GTEx/Regulome | High | - |
| | hg18.chr6:g.32845485A>G | rs9368741 | 6 | 32845485 | HLA-DQB2 | -6177 | - | Low | - |
| 14 | hg18.chr6:g.31002738T>C | rs2532934 | 6 | 31002738 | VARS2 | -524 | Regulome | High | |
| | hg18.chr6:g.32517508G>A | rs3129882 | 6 | 32517508 | HLA-DRA | 1912 | Regulome | High | Parkinson's disease, Systemic sclerosis |
a. Closest gene and distance to that gene are derived from the UCSC RefGene annotation for hg18.
b. The eQTL column indicates whether a given SNP is reported as a significant eQTL in at least one tissue type in GTEx (gtexportal.org) or has been marked as linked to gene expression in RegulomeDB (regulome.stanford.org). The genes reported as being regulated by these eQTLs are reported in
c. Level of evidence that a given SNP is likely to affect binding as described by RegulomeDB. For ease of interpretation, we have binned the RegulomeDB scores further into ‘high’ (1a-1f), ‘medium’ (2a-3), and ‘low’ (4–6) evidence categories.
d. Phenotype that has been previously associated with the given SNP as reported in the NHGRI-EBI GWAS Catalog.
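The binning described in footnote c can be written down as a small lookup. This is a minimal sketch, assuming RegulomeDB scores arrive as strings whose first character is the category digit (for example `1f`, `2b`, `5`); the function name is hypothetical and not part of any RegulomeDB tooling.

```python
def regulome_evidence_bin(score: str) -> str:
    """Map a RegulomeDB score string to the coarse bins used in footnote c.

    Assumed input format: a category digit optionally followed by a letter,
    e.g. '1a', '2c', '3', '6'.
    """
    cleaned = score.strip().lower()
    if not cleaned:
        raise ValueError("empty RegulomeDB score")
    category = cleaned[0]
    if category == "1":
        return "high"  # scores 1a-1f
    if category in ("2", "3"):
        return "medium"  # scores 2a-3
    if category in ("4", "5", "6"):
        return "low"  # scores 4-6
    raise ValueError(f"unrecognised RegulomeDB score: {score!r}")


for example in ("1f", "2a", "3", "5"):
    print(example, "->", regulome_evidence_bin(example))
```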
Disease variance explained by models with additive and interacting genetic effects.
| Dataset | Single SNPs: variance explained | Single SNPs: AUC | Combined: variance explained | Combined: AUC | Validated interaction pairs: variance explained | Validated interaction pairs: AUC |
|---|---|---|---|---|---|---|
| UK1 | 0.329 | 0.883 | 0.341 | 0.888 | 0.339 | 0.887 |
| UK2 | 0.269 | 0.855 | 0.288 | 0.864 | 0.267 | 0.854 |
| FIN | 0.323 | 0.880 | 0.342 | 0.888 | 0.320 | 0.879 |
| IT | 0.273 | 0.857 | 0.295 | 0.868 | 0.265 | 0.853 |
| NL | 0.271 | 0.856 | 0.280 | 0.860 | 0.266 | 0.853 |
Predictive power and disease variance explained by single SNPs and VEPs in cross-validation and in external validation, using SparSNP models. Models were optimized on the UK1 dataset (n = 2183 samples) in cross-validation (290K SNPs) and tested without modification on the other datasets. The proportion of disease variance explained (on the liability scale) assumes a population prevalence of 1%. Two-sided DeLong significance tests for the difference between the AUC of single SNPs + pairs and the AUC of single SNPs: UK2 P = 1.14×10⁻⁶, FIN P = 0.0351, IT P = 0.0535, NL P = 0.1772.
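The link between AUC and liability-scale variance at a given prevalence can be illustrated with a published approximation (Wray et al., 2010, on the genetic interpretation of AUC). The sketch below is an assumption about how such a conversion can be done, not a statement of the authors' procedure; the function name is hypothetical, and the source only specifies the 1% prevalence.

```python
from statistics import NormalDist


def liability_variance_from_auc(auc: float, prevalence: float) -> float:
    """Approximate liability-scale variance explained from an AUC.

    Uses the normal-liability approximation of Wray et al. (2010):
    2*Q^2 / ((v - i)^2 + Q^2 * i * (i - T) + v * (v - T)).
    """
    std_normal = NormalDist()
    t = std_normal.inv_cdf(1.0 - prevalence)  # liability threshold T
    z = std_normal.pdf(t)  # normal density at the threshold
    i = z / prevalence  # mean liability of cases
    v = -i * prevalence / (1.0 - prevalence)  # mean liability of controls
    q = std_normal.inv_cdf(auc)  # probit of the AUC
    denominator = (v - i) ** 2 + q**2 * i * (i - t) + v * (v - t)
    return 2.0 * q**2 / denominator


# AUC values for the single-SNP models, copied from the table above.
for dataset, auc in {"UK1": 0.883, "UK2": 0.855}.items():
    estimate = liability_variance_from_auc(auc, prevalence=0.01)
    print(f"{dataset}: AUC {auc} -> variance explained ~ {estimate:.3f}")
```

With a prevalence of 1%, this approximation gives values close to the tabulated variance explained for the single-SNP models (about 0.33 for UK1 and 0.27 for UK2), allowing for rounding of the reported AUCs.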