Jennifer L Asimit1,2, Felicity Payne1, Andrew P Morris3, Heather J Cordell4, Inês Barroso1.
Abstract
Shared genetic aetiology may explain the co-occurrence of diseases in individuals more often than expected by chance. On identifying associated variants shared between two traits, one objective is to determine whether such overlap may be explained by specific genomic characteristics (e.g., functional annotation). In clinical studies, inter-rater agreement approaches assess concordance among expert opinions on the presence/absence of a complex disease for each subject. We adapt a two-stage inter-rater agreement model to the genetic association setting to identify features predictive of overlap variants, while accounting for their marginal trait associations. The resulting corrected overlap and marginal enrichment test (COMET) also assesses enrichment at the individual trait level. Multiple categories may be tested simultaneously and the method is computationally efficient, not requiring permutations to assess significance. In an extensive simulation study, COMET identifies features predictive of enrichment with high power and has well-calibrated type I error. In contrast, testing for overlap with a single-trait enrichment test has inflated type I error. COMET is applied to three glycaemic traits using a set of functional annotation categories as predictors, followed by further analyses that focus on tissue-specific regulatory variants. The results support previous findings that regulatory variants in pancreatic islets are enriched for fasting glucose-associated variants, and give insight into differences/similarities between characteristics of variants associated with glycaemic traits. Also, despite regulatory variants in pancreatic islets being enriched for variants that are marginally associated with fasting glucose and fasting insulin, there is no enrichment of shared variants between the traits.
Year: 2016 PMID: 28000695 PMCID: PMC5302181 DOI: 10.1038/ejhg.2016.171
Source DB: PubMed Journal: Eur J Hum Genet ISSN: 1018-4813 Impact factor: 4.246
Figure 1. Flow chart of inter-rater approach to overlap analysis of two traits.
Details of possible SNP-specific binary covariates and their distribution among common SNPs in the 1000 Genomes CEU samples
| Covariate | Description | VEP consequence terms | CEU unpruned | CEU pruned |
|---|---|---|---|---|
| Q1 | SNP in a transcribed but not translated region | Mature miRNA variant; non-coding transcript exon variant; intron variant; NMD transcript variant; non-coding transcript variant | 0.513 | 0.517 |
| Q2 | SNP in a translated region but does not change the amino acid | Stop retained variant; synonymous variant; incomplete terminal codon variant | 0.00334 | 0.00380 |
| Q3 | SNP is potentially deleterious | Inframe insertion; inframe deletion; missense variant; initiator codon variant; splice region variant; stop lost | 0.00341 | 0.00422 |
| Q4 | SNP is potentially loss of function | Stop gained; frameshift variant; splice donor variant; splice acceptor variant; start lost | 0.000104 | 0.000139 |
| Q5 | SNP is in a potentially regulatory or regulatory region | 5′ UTR variant; 3′ UTR variant; TF-binding site variant; regulatory region ablation; regulatory region amplification; regulatory region variant | 0.135 | 0.144 |
| Q6 | SNP is intergenic | Intergenic variant; upstream gene variant; downstream gene variant | 0.635 | 0.638 |
CEU unpruned gives the proportion of common CEU 1000 Genomes SNPs that are positive for the covariate; CEU pruned gives the corresponding proportion among SNPs pruned at r² > 0.1. Results are based on VEP v81.
Estimates of type I error (with 95% confidence intervals) for detecting a category as positively enriched with overlap signals, at coefficient significance level 0.05, under the null enrichment setting with equal numbers of cases and controls
| N | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|
| 3000 5000 | 0.032 (0.021, 0.043) | 0.059 (0.044, 0.074) | 0.061 (0.046, 0.076) | 0.096 (0.078, 0.114) | 0.046 (0.033, 0.059) | 0.142 (0.120, 0.163) | 0.063 (0.048, 0.079) | 0.114 (0.094, 0.134) | 0.038 (0.026, 0.050) | 0.061 (0.046, 0.076) |
| 5000 10 000 | 0.048 (0.037, 0.061) | 0.056 (0.042, 0.070) | 0.058 (0.044, 0.072) | 0.115 (0.095, 0.135) | 0.044 (0.031, 0.057) | 0.110 (0.091, 0.129) | 0.052 (0.038, 0.066) | 0.074 (0.058, 0.090) | 0.048 (0.035, 0.061) | 0.061 (0.046, 0.076) |
| 10 000 20 000 | 0.053 (0.039, 0.067) | 0.061 (0.046, 0.076) | 0.058 (0.044, 0.072) | 0.103 (0.084, 0.122) | 0.049 (0.036, 0.062) | 0.113 (0.093, 0.133) | 0.044 (0.031, 0.057) | 0.078 (0.061, 0.095) | 0.048 (0.035, 0.061) | 0.071 (0.055, 0.087) |
Study r has Nr cases and Nr controls, r = 1, 2; the first column gives N1 followed by N2.
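The bracketed intervals in the table above are consistent with a normal-approximation (Wald) interval for a proportion. The following is a minimal sketch of that calculation, not the authors' code. It assumes 1000 simulation replicates per setting (the number stated for the power simulations in the Figure 3 caption, taken here to apply to the type I error estimates as well) and a critical value of 1.96; the function name `wald_ci` is hypothetical.

```python
import math


def wald_ci(p_hat, n_sims, z=1.96):
    """Normal-approximation (Wald) interval for a proportion.

    p_hat  : estimated rejection rate (e.g. empirical type I error)
    n_sims : number of simulation replicates (assumed, see lead-in)
    z      : standard normal critical value for the desired coverage
    """
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n_sims)
    return p_hat - half_width, p_hat + half_width


# First cell of the first row: estimate 0.032, assumed 1000 replicates.
lower, upper = wald_ci(0.032, 1000)
print(f"{lower:.3f}, {upper:.3f}")  # 0.021, 0.043
```

Under these assumptions the first two cells of the first row, 0.032 (0.021, 0.043) and 0.059 (0.044, 0.074), are reproduced to three decimals. An estimate is compatible with the nominal level when its interval covers 0.05.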
Figure 2. QQ-plots for the covariates in the most appropriate overlap model fit to simulated equal-sized case–control data (N1 = 3000 each and N2 = 5000 each). The model is fit to simulated data having p1 = 0.04, p2 = 0.02, p12 = 5 × 10⁻⁴, and no covariate categories are set up as enriched.
Figure 3. COMET power for detecting Q5 as a category positively enriched with overlap signals at coefficient significance level 0.05. In each of the 1000 simulations, the Q5 category (1.4% of common CEU SNPs LD-pruned at r² > 0.1) was set to have a certain proportion of shared causal variants. The selected proportion of causal variants in this category, p′12, is indicated in each column, followed by the proportion among the causal variants, p′12/p12, as a percentage. Studies 1 and 2 are equal-sized case–control studies with N1 cases and N1 controls, and N2 cases and N2 controls, respectively. Type I error is denoted by bold font.
Results of the marginal models and pair-wise inter-rater overlap models fit to fasting glucose, fasting insulin and 2-h glucose
| Fasting insulin (FI) | 0.844 −0.466 0.450 5 | 0.189 0.258 0.293 12 | 0.113 0.0833 0.0687 271 | 0.194 0.0673 0.0778 836 | |||
| Fasting glucose (FG) | 0.974 −1.13 0.5790 3 | 0.491 0.00719 0.305 11 | 0.064 0.108 0.0709 960 | < | |||
| 2-h glucose (2G) | 0.624 −0.183 0.580 3 | 0.741 −0.376 0.580 3 | 0.356 0.0375 0.102 122 | 0.0675 0.168 0.112 399 | < | ||
| (FG, FI) | 0.308 0.183 0.364 29 | 1 0 NA 0 | 0.216 0.799 1.01 1 | 0.623 −0.112 0.357 10 | 0.167 0.361 0.374 32 | 0.339 0.414 0.433 | |
| (FG, 2G) | 0.544 −0.0637 0.578 15 | 1 0 NA 0 | 0.832 −0.517 0.537 11 | 0.302 0.661 0.640 | |||
| (FI, 2G) | 0.305 0.327 0.639 9 | 1 0 NA 0 | 0.783 −0.598 0.763 2 | 0.265 0.415 0.659 10 | 0.946 0.0516 0.763 | ||
Tests of positive enrichment are performed for all covariates and bold font indicates significance at level 0.05. Cell values of (1, 0, NA) indicate that the covariate was excluded from the final overlap model. Two-sided P-values are given for intercept estimates.
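Several cells in the table above are numerically consistent with a one-sided Wald test of positive enrichment. The sketch below assumes that the four values in a cell are, in order, the P-value, the coefficient estimate, its standard error and a count; this layout is an inference from the numbers, since the column headers did not survive extraction. The function name `one_sided_wald_p` is hypothetical and this is not the authors' implementation.

```python
import math


def one_sided_wald_p(estimate, se):
    """Upper-tail P-value for H1: coefficient > 0 (positive enrichment).

    Uses the standard normal approximation: P(Z > estimate / se).
    """
    z = estimate / se
    # Upper-tail normal probability via the complementary error function.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Fasting insulin row, second cell: estimate 0.258, SE 0.293 (assumed layout).
print(f"{one_sided_wald_p(0.258, 0.293):.3f}")  # 0.189
```

Under the assumed layout this matches the reported 0.189, and the same calculation gives about 0.113 for the cell with estimate 0.0833 and SE 0.0687. Cells with negative estimates yield P-values above 0.5, as expected for a test of positive enrichment, although small differences from the reported values can arise from rounding of the printed estimates.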
P-values for the positive enrichment of glycaemic trait-associated variants (at Bayesian decision criteria R = 20, π0 = 0.99) in six different tissues, as well as all tissues available in RegulomeDB
| Fasting insulin (FI) | 0.294 | 0.183 | 0.687 | 0.390 | 0.401 | |||
| Fasting glucose (FG) | 0.139 | 0.739 | 0.0584 | |||||
| 2-h glucose (2G) | 0.140 | 0.910 | 0.071 | 0.389 | 0.604 | 0.0900 | ||
| FI and FG | 0.263 (0.153) | 0.733 | 0.237 | 0.587 | 1.000 | 0.393 | 0.223 (0.416) | |
| FI and 2G | 0.492 (0.418) | 0.691 | 0.506 | 1.000 | 1.000 | 0.643 | 0.433 (0.552) | 0.680 |
| FG and 2G | 0.315 | 1.000 | 0.582 | 0.344 |
Pancreas tissue includes tissues from both pancreatic ducts and pancreatic islets. Bold values indicate significance at level 0.05.