Muhammad Ahsan, Weronica E Ek, Mathias Rask-Andersen, Torgny Karlsson, Allan Lind-Thomsen, Stefan Enroth, Ulf Gyllensten, Åsa Johansson.
Abstract
Associations between epigenetic alterations and disease status have been identified for many diseases. However, there is no strong evidence that epigenetic alterations are directly causal for disease pathogenesis. In this study, we combined SNP and DNA methylation data with measurements of protein biomarkers for cancer, inflammation or cardiovascular disease, to investigate the relative contribution of genetic and epigenetic variation to biomarker levels. A total of 121 protein biomarkers were measured and analyzed in relation to DNA methylation at 470,000 genomic positions and to over 10 million SNPs. We performed epigenome-wide association study (EWAS) and genome-wide association study (GWAS) analyses, and integrated biomarker, DNA methylation and SNP data using between 698 and 1033 samples, depending on data availability for the different analyses. We identified 124 and 45 loci (Bonferroni-adjusted P < 0.05) in the EWAS and GWAS, respectively, with effect sizes up to 0.22 standard units' change per 1% change in DNA methylation level in the EWAS and up to four standard units' change per copy of the effective allele in the GWAS. Most GWAS loci were cis-regulatory, whereas most EWAS loci were located in trans. Eleven EWAS loci were associated with multiple biomarkers, including one in NLRC5 associated with CXCL11, CXCL9, IL-12, and IL-18 levels. All EWAS signals that overlapped with a GWAS locus were driven by underlying genetic variants, and three EWAS signals were confounded by smoking. While some cis-regulatory SNPs for biomarkers also appeared to affect DNA methylation levels, cis-regulatory SNPs for DNA methylation were not observed to affect biomarker levels. We present associations between protein biomarker and DNA methylation levels at numerous loci in the genome. The associations are likely to reflect the underlying pattern of genetic variants or specific environmental exposures, or to represent effects secondary to the pathogenesis of disease.
Year: 2017 PMID: 28915241 PMCID: PMC5617224 DOI: 10.1371/journal.pgen.1007005
Source DB: PubMed Journal: PLoS Genet ISSN: 1553-7390 Impact factor: 5.917
Fig 1. Manhattan and QQ plots for MIC-A.
The red line in the Manhattan plots (left panel) indicates the threshold of significance (Bonferroni-adjusted P-value = 0.05). In the QQ plots (right panel), the black line is the 1:1 line and the red line is the regression line illustrating the inflation of low P-values; the slope of the red line equals the inflation factor (lambda). A) Results from the primary EWAS between MIC-A and DNA methylation. B) Results from the primary GWAS. C) Results from the EWAS adjusted for all independent SNPs (N = 3) identified in the GWAS. D) Association between MIC-A GS and DNA methylation levels across the genome, and a zoom-in on a region of chromosome 6.
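The two quantities named in this caption, the Bonferroni threshold and the inflation factor, can be illustrated with a short Python sketch. It is not the authors' pipeline: the function names are hypothetical, and lambda is computed under one common convention (a through-origin regression of observed on expected 1-df chi-square quantiles), which may differ from how the red line in the figure was fitted.

```python
from statistics import NormalDist


def bonferroni_threshold(alpha, n_tests):
    """Per-test P-value cut-off that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def chi2_1df_from_p(p):
    """Chi-square (1 df) statistic whose upper-tail probability equals p."""
    # The lower-tail normal quantile at p/2 has the same square as the
    # upper-tail quantile at 1 - p/2, and is numerically safer for tiny p.
    z = NormalDist().inv_cdf(p / 2.0)
    return z * z


def inflation_factor(p_values):
    """Slope of the through-origin regression of observed on expected quantiles.

    Assumption: every P-value lies in (0, 1].
    """
    observed = sorted(chi2_1df_from_p(p) for p in p_values)
    n = len(observed)
    # Expected quantiles under the null hypothesis (uniform P-values).
    expected = sorted(chi2_1df_from_p((i + 0.5) / n) for i in range(n))
    sxy = sum(x * y for x, y in zip(expected, observed))
    sxx = sum(x * x for x in expected)
    return sxy / sxx


# Illustrative threshold, using the approximate number of CpG positions
# (470,000) quoted in the abstract as the number of tests.
threshold = bonferroni_threshold(0.05, 470_000)
```

A lambda close to 1 indicates that the bulk of the test statistics follow the null distribution; values well above 1 suggest inflation, for example from unmodelled confounding.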
Fig 2. Correlation links between genomic regions.
The links between the CpG sites represent squared correlation coefficients (R²) ranging from zero to 0.77, reflected by the intensity of the color, with red representing positive correlations and blue negative correlations. A) The correlation pattern between DNA methylation levels at different CpG sites across the autosomal chromosomes. Only CpG sites associated with any of the biomarkers are included. The outer circle shows the chromosome number and the second circle shows the cytobands for each chromosome, directed clockwise. The third circle is an extract from the Manhattan plot showing the -log10(P-values), with the scale ranging from 7 to 24. B) The correlation pattern between CpG sites that are associated with multiple biomarkers, and the correlation pattern between these biomarkers. The second circle shows the names of the CpG sites and the biomarkers included in these analyses. Inside the cytoband circle, the CpG sites (magenta) and the genes encoding the respective biomarkers (cyan) are shown with lines to their genomic positions.
CpG sites that are associated with multiple biomarkers, and previous associations between the CpG locus and diseases.
| CpG name | Gene(s) annotation CpG | Disease-related gene function | Biomarker | Pathogenic association of biomarker | Estimate (b) | P-value |
|---|---|---|---|---|---|---|
| cg00329615 | Immunoglobulin superfamily member 11 ( | | IL-12 | Inflammation, cancer | 3.48 | 3.8 × 10⁻⁸ |
| | | | MIA | Cancer | 3.52 | 1.8 × 10⁻⁸ |
| | | | TNFRSF4 | Cancer | 4.16 | 3.1 × 10⁻¹⁰ |
| cg02664787 | Opioid binding protein/cell adhesion molecule-like ( | | PlGF | Cardiovascular, cancer | -4.87 | 1.5 × 10⁻⁸ |
| | | | TNF-RI | Cardiovascular, cancer | -4.46 | 5.3 × 10⁻⁸ |
| cg03636183 (c) | Coagulation factor II (Thrombin) receptor-like 2 ( | | WFDC2 | Cancer | -3.41 | 3.1 × 10⁻⁸ |
| | | | IL-12 (c) | Inflammation, cancer | 4.36 | 9.9 × 10⁻¹⁰ |
| cg04134748 | PR domain containing 16 ( | | Flt3L | Inflammation, cancer | 3.96 | 5.3 × 10⁻¹² |
| | | | SRC | Cardiovascular | -3.77 | 9.1 × 10⁻⁹ |
| cg05304729 | Myeloid cell nuclear differentiation antigen ( | | CXCL11 | Inflammation, cancer | -6.51 | 1.1 × 10⁻⁹ |
| | | | CXCL9 | Inflammation, cancer | -5.36 | 2.5 × 10⁻⁸ |
| cg07839457 (a) | NLR family, CARD domain containing 5 ( | | CXCL11 | Inflammation, cancer | -3.26 | 3.5 × 10⁻¹¹ |
| | | | CXCL9 | Inflammation, cancer | -2.84 | 1.2 × 10⁻¹⁰ |
| | | | IL-12 | Inflammation, cancer | -2.62 | 2.9 × 10⁻⁸ |
| | | | IL-18 | Cardiovascular, inflammation | -2.6 | 7.0 × 10⁻⁸ |
| cg09801824 | DnaJ (Hsp40) homolog, subfamily B, member 12 | | CXCL11 | Inflammation, cancer | 9.91 | 3.6 × 10⁻⁹ |
| | | | CXCL9 | Inflammation, cancer | 9.67 | 1.2 × 10⁻¹⁰ |
| cg12785694 | Structural maintenance of chromosome 4 ( | | CXCL11 | Inflammation, cancer | -4.43 | 6.7 × 10⁻⁸ |
| | | | IL-12 | Inflammation, cancer | -4.41 | 8.1 × 10⁻⁹ |
| cg12790638 | Hyaluronan and proteoglycan link protein 3 | | Adrenomedullin | Cardiovascular, cancer | 3.95 | 7.2 × 10⁻⁸ |
| | | | Cystatin B | Cardiovascular, cancer | 3.7 | 1.5 × 10⁻⁸ |
| | | | VEGF-A | Cardiovascular, cancer, inflammation | 3.98 | 5.4 × 10⁻⁸ |
| cg16936953 | Vacuole membrane protein 1 ( | | CXCL13 | Cancer | -3.68 | 5.6 × 10⁻⁹ |
| | | | Follistatin | Cardiovascular, cancer | -3.55 | 1.0 × 10⁻⁷ |
| | | | GDF-15 | Cardiovascular, cancer | -3.74 | 2.5 × 10⁻¹⁵ |
| | | | IL27-A | Cardiovascular | -3.04 | 8.9 × 10⁻⁹ |
| cg20497205 | Opsin 5 | | CX3CL1 | Cardiovascular, inflammation | 9.04 | 4.1 × 10⁻⁸ |
| | | | HB-EGF | Cardiovascular, cancer | 8.19 | 4.1 × 10⁻⁸ |
a. The NLRC5 and VMP1/MIR21 regions consist of multiple associated CpG sites. However, these CpG sites are not independent and therefore only the most significant CpG site is included in this table.
b. The estimates represent the change in biomarker values (in standard deviations) per 100% change in DNA methylation level (from 0 to 100%).
Fig 3. Association between DNA methylation and biomarkers explained by SNPs.
A) Overview of possible scenarios. If the association is fully explained (caused) by the SNPs, there will be no correlation between the DNA methylation and the biomarker after conditioning on the SNPs. If the association is partially explained, there will still be a correlation between the DNA methylation and the biomarker after conditioning on the SNPs, but it will be weaker. If the SNPs are not causing the association, the correlation between the DNA methylation and the biomarker will not be influenced by conditioning on the SNPs. B) and C) The squared marginal and partial (after conditioning on the SNPs) correlation coefficients between the DNA methylation and the biomarkers for pairs where a cis-regulatory SNP influencing the biomarker levels has been identified: in B) for pairs where the CpG site and the gene encoding the biomarker map to the same locus, and in C) for pairs where the CpG site and the gene encoding the biomarker map to different regions. For calculating the marginal and partial correlation coefficients, only individuals with DNA methylation, biomarker and SNP data available were included (N = 698).
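The comparison of marginal and partial correlations described in this caption can be sketched in Python as follows. This is a simplified illustration rather than the study's analysis: it conditions on a single SNP coded as an allele count (the study conditioned on all independent SNPs), and the function names and toy data are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_r(x, y, z):
    """Correlation between x and y after conditioning on one variable z."""
    r_xy = pearson_r(x, y)
    r_xz = pearson_r(x, z)
    r_yz = pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy data (made up): the SNP drives both methylation and the biomarker,
# and the remaining variation in the two traits is unrelated.
snp = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
noise_m = [1, -1, 1, -1] * 3
noise_b = [1, 1, -1, -1] * 3
methylation = [g + 0.5 * e for g, e in zip(snp, noise_m)]
biomarker = [2 * g + 0.5 * e for g, e in zip(snp, noise_b)]

marginal_r2 = pearson_r(methylation, biomarker) ** 2
partial_r2 = partial_r(methylation, biomarker, snp) ** 2
```

In this toy example the squared marginal correlation is substantial while the squared partial correlation is essentially zero, which corresponds to the "fully explained" scenario in panel A.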
CpG sites and SNPs in the same chromosomal region that are associated with the same biomarkers.
| Biomarker | Chr | SNP name (a) | SNP position | P-value (SNP-biomarker) | CpG name(s) (c) | CpG position | P-value (CpG-biomarker) | P-value adjusted for SNPs (d) | P-value (CpG-SNPs) (e) |
|---|---|---|---|---|---|---|---|---|---|
| CCL19 | 6 | rs2395201 | 32451897 | 5.9 × 10⁻¹⁷ | cg26805579 | 32372962 | 1.8 × 10⁻⁸ | 0.038 | 1.5 × 10⁻⁹³ |
| CCL24 | 7 | rs10755885 | 75463505 | 1.8 × 10⁻³⁹ | cg12943082 | 75419311 | 8.5 × 10⁻¹³ | 0.0057 | 3.6 × 10⁻¹⁶ |
| | | rs73359714 | 75447156 | 7.5 × 10⁻³⁷ | | | | | |
| | | rs11465295 | 75442407 | 2.0 × 10⁻¹² | | | | | |
| CHI3L1 | 1 | rs2153101 | 203168474 | 2.9 × 10⁻⁴¹ | cg07423149 | 203156246 | 1.3 × 10⁻²⁴ | 5.9 × 10⁻⁵ | 3.1 × 10⁻⁷⁵ |
| Cystatin B | 21 | rs35285321 | 45201832 | 7.2 × 10⁻¹⁵ | cg09468832 | 45199094 | 1.5 × 10⁻⁸ | 0.12 | 7.8 × 10⁻⁸⁴ |
| | | | | | cg08529987 | 45206724 | 4.6 × 10⁻⁹ | 0.0024 | 6.6 × 10⁻⁴⁶ |
| IL6RA | 1 | rs12133641 | 154428283 | 3.0 × 10⁻⁷³ | cg21262032 | 154437693 | 1.2 × 10⁻⁸ | 0.18 | 8.9 × 10⁻¹⁸ |
| MIA | 19 | rs3869574 | 41284915 | 2.4 × 10⁻²⁶ | cg09274963 | 41633902 | 1.1 × 10⁻⁸ | 0.0098 | 7.6 × 10⁻⁴⁵ |
| MIC-A | 6 | rs6938453 | 31377793 | 5.9 × 10⁻²⁹ | cg02884661 | 31382931 | 1.2 × 10⁻¹⁶ | 0.13 | 4.3 × 10⁻³¹ |
| | | rs52979004 | 31402666 | 4.2 × 10⁻¹⁵ | cg04628742 | 29973221 | 2.4 × 10⁻⁹ | 0.032 | 6.3 × 10⁻¹¹ |
| | | rs2516470 | 31407331 | 8.8 × 10⁻¹⁰ | cg12001709 | 31466798 | 5.1 × 10⁻⁹ | 0.26 | 1.4 × 10⁻¹³ |
| RETN | 19 | rs149552675 | 7624687 | 1.1 × 10⁻¹⁰ | cg02346997 | 7733883 | 3.1 × 10⁻⁸ | 1.6 × 10⁻⁶ | 2.5 × 10⁻⁵ |
| | | | | | cg22322184 | 7734203 | 9.1 × 10⁻⁹ | 1.9 × 10⁻⁷ | 0.0011 |
| ST2 | 2 | rs12712136 | 102936366 | 1.2 × 10⁻⁴⁶ | cg05295703 | 102895712 | 8.4 × 10⁻⁹ | 0.95 | 2.3 × 10⁻⁴⁹ |
The table shows the results for the biomarkers with overlapping (with regard to chromosomal loci) signals from the EWAS and the GWAS, the association between biomarker and CpG sites when adjusted for the SNPs, and the association between CpG methylation and SNPs.
a. Only independent top SNPs are included
b. N: sample size for each set of analyses.
c. Only independent top CpG sites are included
d. P-value between the CpG site and the biomarker after adjusting for all independent SNPs.
e. P-value for the association between the CpG site and all independent SNPs (combined)
Fig 4. A directed acyclic graph of the direct effects and correlations between protein levels, genetic variants, DNA methylation and smoking.
Single-headed arrows represent direct associations and double-headed arrows indicate correlations. Two of the correlations between DNA methylation and protein levels could be completely explained by confounding by smoking, which influences both DNA methylation and protein levels. Similarly, 23 of the biomarker-DNA methylation correlations could be explained by genetic variants. The numbers in the figure represent the numbers that we found among the proteins and EWAS-significant CpG sites in our study.
Fig 5. A directed acyclic graph of the causal inference.
X and Y are two correlated phenotypes (the DNA methylation level at a CpG site and a biomarker level). The instrumental variable (IV) is the cis-regulatory genetic variant that directly influences X. If the IV influences Y through X, there is a causal relation between X and Y. If the IV does not influence Y through X, there might be an unknown confounding factor influencing both X and Y (U2), or there might be reverse causation (Y influences X). To be a valid instrument, the genetic variant must fulfil three criteria: 1) it must not be associated with an unknown confounding factor (U1) that influences Y; 2) it must not be directly associated with Y; and 3) it should be associated with X.
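The logic of this graph can be illustrated with a ratio (Wald) instrumental-variable estimate, sketched below in Python. The source does not state which estimator the authors used, so the ratio estimator, the function names and the toy data are all assumptions made for illustration only.

```python
def covariance(a, b):
    """Population covariance between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / n


def ols_slope(x, y):
    """Naive regression slope of y on x; biased when a confounder is present."""
    return covariance(x, y) / covariance(x, x)


def wald_ratio(g, x, y):
    """Ratio estimate of the effect of x on y, using g as the instrument."""
    cov_gx = covariance(g, x)
    if cov_gx == 0:
        # Criterion 3 fails: the variant is not associated with x.
        raise ValueError("instrument is not associated with the exposure")
    return covariance(g, y) / cov_gx


# Toy data (made up): u is an unmeasured confounder, uncorrelated with g.
g = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]    # allele count of the variant
u = [1, -1, 1, -1] * 3                       # confounder of x and y
x = [2 * gi + ui for gi, ui in zip(g, u)]    # e.g. DNA methylation level
y = [0.5 * xi + 3 * ui for xi, ui in zip(x, u)]  # true effect of x on y: 0.5

iv_estimate = wald_ratio(g, x, y)
naive_estimate = ols_slope(x, y)
```

In this toy setting the ratio estimate recovers the simulated effect of X on Y, whereas the naive regression slope is pulled upward by the confounder. The estimate is only interpretable if the three criteria listed in the caption hold.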