V Kartik Chundru, Riccardo E Marioni, James G D Prendergast, Costanza L Vallerga, Tian Lin, Allan J Beveridge, Jacob Gratten, David A Hume, Ian J Deary, Naomi R Wray, Peter M Visscher, Allan F McRae.
Abstract
Genetic variants disrupting DNA methylation at CpG dinucleotides (CpG-SNPs) provide a set of known causal variants that can serve as models to test fine-mapping methodology. We use 1716 CpG-SNPs to test three fine-mapping approaches (Bayesian imputation-based association mapping, Bayesian sparse linear mixed model, and the J-test), assessing the impact of imputation errors and the choice of reference panel by using both whole-genome sequence (WGS) and genotype array data on the same individuals (n = 1166). The choice of imputation reference panel had a strong effect on imputation accuracy: the 1000 Genomes Project Phase 3 (1000G) reference panel (n = 2504 from 26 populations) gave a mean nonreference discordance rate between imputed and sequenced genotypes of 3.2%, compared to 1.6% when using the Haplotype Reference Consortium (HRC) reference panel (n = 32,470 Europeans). These imputation errors affected whether the CpG-SNP was included in the 95% credible set, with a difference of ∼23% and ∼7% between the WGS and the 1000G and HRC imputed datasets, respectively. All of the fine-mapping methods failed to reach the expected 95% coverage of the CpG-SNP. This is attributed to secondary cis genetic effects that cannot be statistically separated from the CpG-SNP, and to a masking mechanism in which the effect of the methylation-disrupting allele at the CpG-SNP is hidden by the effect of a nearby SNP in strong linkage disequilibrium with the CpG-SNP. The reduced accuracy in fine-mapping a known causal variant in a low-level biological trait with imputed genetic data has implications for the study of higher-order complex traits and disease.
Keywords: CpG-SNPs; DNA-methylation; fine-mapping; imputation
Year: 2019 PMID: 31040117 PMCID: PMC6614908 DOI: 10.1534/genetics.118.301861
Source DB: PubMed Journal: Genetics ISSN: 0016-6731 Impact factor: 4.562
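The nonreference discordance rate quoted in the abstract can be illustrated with a minimal sketch. It assumes the commonly used definition (discordant genotype calls divided by all calls, excluding those where both datasets call homozygous reference) and assumes genotypes are coded as hard calls of 0, 1, or 2 alternate alleles; the function name and the toy data are hypothetical, and the authors' exact computation may differ.

```python
def nonreference_discordance(sequenced, imputed):
    """Nonreference discordance rate between two genotype vectors.

    Both inputs are sequences of hard-call genotypes coded as the number of
    alternate alleles (0, 1, 2) for the same individuals (assumed coding).
    Calls where both datasets are homozygous reference are excluded, so
    rare variants are not rewarded for trivially matching 0/0 calls.
    """
    discordant = 0
    concordant_nonref = 0
    for truth, guess in zip(sequenced, imputed):
        if truth == 0 and guess == 0:
            continue  # concordant homozygous reference: not counted
        if truth == guess:
            concordant_nonref += 1
        else:
            discordant += 1
    denominator = discordant + concordant_nonref
    return discordant / denominator if denominator else float("nan")


# Toy example with made-up genotypes for eight individuals at one SNP.
wgs_calls = [0, 0, 1, 1, 2, 2, 0, 1]
imputed_calls = [0, 0, 1, 2, 2, 2, 1, 1]
rate = nonreference_discordance(wgs_calls, imputed_calls)
```

Excluding concordant homozygous-reference calls keeps the rate informative for low-frequency variants, where most individuals carry no alternate allele.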
Figure 1. Coverage of the CpG-SNP using three fine-mapping methods. The three methods perform similarly, with only a very small difference in coverage of the CpG-SNP. The coverage of the CpG-SNPs is at a maximum when using whole-genome sequence data, followed closely by the HRC imputed data, with the 1000 Genomes Project imputed data having a much lower coverage of the CpG-SNP.
Figure 2. Coverage of the CpG-SNP in those probes where the CpG-SNP is genotyped on the array, and those where it is imputed. The coverage of the CpG-SNP was higher in the probes where the CpG-SNP was genotyped. This result was replicated in an independent dataset imputed using the HRC reference panel (Systems Genomics of Parkinson’s Disease Cohort). When the CpG-SNP is imputed, there is a large difference in the coverage between datasets imputed using the 1000 Genomes Project Phase 3 reference panel (LBC-1KG), and those imputed using the HRC reference panel (LBC-HRC, Replication-HRC).
Figure 3. The phenotypic variance explained by the CpG-SNP in the three datasets plotted against one another. Although they are highly correlated, in the top row we observe that the phenotypic variance explained is on average higher in the LBC-WGS dataset than the two imputed datasets, and in the bottom row we observe that the phenotypic variance explained is on average higher in the LBC-HRC dataset than in the LBC-1KG dataset.
The coverage of the CpG-SNP and the size of the credible sets for the probes with a single independent association detected by both conditional analyses (87% of all probes), using the whole-genome sequence dataset.
| Method | Coverage (%) | Mean SNPs/set | Median SNPs/set | 95% quantile |
|---|---|---|---|---|
| J-test | 82 | 4 | 1 | 14 |
| BIMBAM | 87 | 5 | 1 | 19 |
| BSLMM | 80 | 4 | 1 | 10 |
Assuming that the CpG-SNP is the single underlying causal variant for the DNA methylation levels, we would expect the CpG-SNP to be captured in at least 95% of the credible sets.
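The notion of coverage used here can be made concrete with a small sketch. It assumes each probe comes with a per-SNP posterior probability of being causal (under a single-causal-variant model) and builds the credible set by adding SNPs in decreasing order of probability until the cumulative mass reaches 95%. The function names, SNP labels, and probabilities are hypothetical; the three methods in the table each derive their sets in their own way.

```python
def credible_set(posteriors, level=0.95):
    """Smallest set of SNPs whose normalised posterior mass reaches `level`.

    `posteriors` maps SNP identifier -> posterior probability of being the
    causal variant (assumed input; values are renormalised to sum to 1).
    """
    total = sum(posteriors.values())
    ranked = sorted(posteriors.items(), key=lambda item: item[1], reverse=True)
    selected, cumulative = [], 0.0
    for snp, prob in ranked:
        selected.append(snp)
        cumulative += prob / total
        if cumulative >= level:
            break
    return selected


def coverage(probes, level=0.95):
    """Fraction of probes whose credible set contains the known causal SNP.

    `probes` is a list of (causal_snp, posteriors) pairs.
    """
    hits = sum(1 for causal, post in probes if causal in credible_set(post, level))
    return hits / len(probes)


# Toy example: the same made-up posteriors, with two different "true" SNPs.
toy_posteriors = {"snp_a": 0.6, "snp_b": 0.3, "snp_c": 0.08, "snp_d": 0.02}
toy_set = credible_set(toy_posteriors)
toy_coverage = coverage([("snp_a", toy_posteriors), ("snp_d", toy_posteriors)])
```

If the posteriors are well calibrated and there really is one causal variant per probe, the coverage computed this way should be close to or above the nominal level, which is the expectation the table is compared against.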
Figure 4. The effect of the CpG-SNP and top SNP on the methylation levels, independent of one another. A and B show the change in methylation levels with a change in the genotype of the CpG-SNP and the top SNP, respectively, with both having a large effect. C is split into three blocks indicating individuals with 0, 1, or 2 minor alleles at the top SNP, and within each block the points indicate the methylation levels of individuals with 0, 1, or 2 minor alleles at the CpG-SNP, showing that almost no variation in methylation levels is explained by the CpG-SNP after fixing the top SNP. D is the same as C, except the SNPs are reversed, showing that even after fixing the CpG-SNP there is extra variation in the methylation levels explained by the top SNP.
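The masking pattern in panels C and D can be mimicked numerically. The sketch below is an illustration on made-up data, not the authors' analysis: it computes the variance in methylation explained by one SNP alone, and the extra variance it explains once the other SNP is already in a linear model, using the identity that the gain in R² equals (1 − R² of the first SNP) times the squared partial correlation. All names and values are hypothetical.

```python
def _mean(values):
    return sum(values) / len(values)


def residuals(y, x):
    """Residuals of a simple least-squares regression of y on x (with intercept)."""
    mx, my = _mean(x), _mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx if sxx else 0.0
    return [b - (my + slope * (a - mx)) for a, b in zip(x, y)]


def r_squared(y, x):
    """Proportion of variance in y explained by x (squared correlation)."""
    mx, my = _mean(x), _mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return (sxy ** 2) / (sxx * syy) if sxx and syy else 0.0


def extra_variance_explained(y, snp, fixed_snp):
    """Gain in R^2 from adding `snp` to a model that already has `fixed_snp`."""
    partial = r_squared(residuals(y, fixed_snp), residuals(snp, fixed_snp))
    return (1.0 - r_squared(y, fixed_snp)) * partial


# Made-up genotypes (minor-allele counts) and methylation levels.
top_snp = [0, 0, 0, 1, 1, 1, 2, 2, 2]
cpg_snp = [0, 0, 1, 1, 1, 2, 2, 2, 1]  # correlated with top_snp (LD)
methylation = [0.80, 0.82, 0.78, 0.61, 0.59, 0.60, 0.41, 0.39, 0.40]

cpg_alone = r_squared(methylation, cpg_snp)
cpg_given_top = extra_variance_explained(methylation, cpg_snp, top_snp)
top_given_cpg = extra_variance_explained(methylation, top_snp, cpg_snp)
```

In this toy data the CpG-SNP explains a large share of the variance on its own but almost nothing once the top SNP is fixed, while the top SNP still explains substantial variance after fixing the CpG-SNP, which is the asymmetry the figure describes.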