Shaza B Zaghlool, Brigitte Kühnel, Mohamed A Elhadad, Sara Kader, Anna Halama, Gaurav Thareja, Rudolf Engelke, Hina Sarwath, Eman K Al-Dous, Yasmin A Mohamoud, Thomas Meitinger, Rory Wilson, Konstantin Strauch, Annette Peters, Dennis O Mook-Kanamori, Johannes Graumann, Joel A Malek, Christian Gieger, Melanie Waldenberger, Karsten Suhre.
Abstract
DNA methylation and blood circulating proteins have been associated with many complex disorders, but the underlying disease-causing mechanisms often remain unclear. Here, we report an epigenome-wide association study of 1123 proteins from 944 participants of the KORA population study and replication in a multi-ethnic cohort of 344 individuals. We identify 98 CpG-protein associations (pQTMs) at a stringent Bonferroni level of significance. Overlapping associations with transcriptomics, metabolomics, and clinical endpoints suggest implication of processes related to chronic low-grade inflammation, including a network involving methylation of NLRC5, a regulator of the inflammasome, and associated pQTMs implicating key proteins of the immune system, such as CD48, CD163, CXCL10, CXCL11, LAG3, FCGR3B, and B2M. Our study links DNA methylation to disease endpoints via intermediate proteomics phenotypes and identifies correlative networks that may eventually be targeted in a personalized approach of chronic low-grade inflammation.
Year: 2020 PMID: 31900413 PMCID: PMC6941977 DOI: 10.1038/s41467-019-13831-w
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Fig. 1 Study design and data integration.
We conducted an EWAS with proteomics in KORA and replicated the associations in QMDiab (see Fig. 2). We used Ingenuity Pathway Analysis (IPA) to connect CpG-linked genes to their associated proteins through literature-reported observations and complemented the network with associations to gene expression (BIOS), metabolomics (QMDiab), and clinical endpoints (KORA), finally adding all previously reported disease associations. The resulting networks, using the same color code as here, are presented in Figs. 3 and 4.
Fig. 2 Summary of the step-wise EWAS.
A series of pEWASs was carried out in the discovery cohort KORA, followed by replication in the QMDiab study (Supplementary Data 2 and 3). In the initial step, methylation levels of 470,837 CpGs (M-values, winsorized) were tested for association with 1123 blood circulating protein concentrations (log-scaled, winsorized), leading to 38,492 associations that reached stringent Bonferroni significance, 12,606 of which were replicated. Potential driving factors (sex, white blood cell composition, genetic variants, age, smoking, BMI, and diabetes) were successively regressed out from the CpG and the protein levels, and the residuals were used in the next EWAS step (see Methods). At each step, a number of associations (pQTMs) fell below the significance threshold (indicated by the black arrows). These associations were likely driven by the factor used in the previous regression step. Eventually, 318 pQTMs remained that were not driven by any of the factors listed here, 98 of which were replicated (Table 1).
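The step-wise logic of this legend can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the family-wise alpha of 0.05, the one-covariate-at-a-time least-squares fit, and the toy data are all assumptions made for illustration, and the actual regression models are described in the Methods rather than here. The sketch shows how an association that is entirely driven by a covariate (here a made-up binary `sex` variable) vanishes once that covariate is regressed out of both the CpG and the protein values.

```python
from math import sqrt
from statistics import mean

N_CPGS = 470_837    # CpG sites tested (from the legend)
N_PROTEINS = 1_123  # protein measurements tested (from the legend)
ALPHA = 0.05        # assumed family-wise error rate; not stated in the legend


def bonferroni_threshold(alpha=ALPHA):
    """Per-test p-value cutoff when every CpG is tested against every protein."""
    return alpha / (N_CPGS * N_PROTEINS)


def residualize(y, x):
    """Residuals of y after a simple least-squares fit on a single covariate x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:  # constant covariate: only the mean can be removed
        return [yi - my for yi in y]
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return [yi - my - slope * (xi - mx) for xi, yi in zip(x, y)]


def pearson(a, b):
    """Pearson correlation; returns 0.0 if either input has no variance."""
    ma, mb = mean(a), mean(b)
    saa = sum((ai - ma) ** 2 for ai in a)
    sbb = sum((bi - mb) ** 2 for bi in b)
    if saa == 0 or sbb == 0:
        return 0.0
    sab = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    return sab / sqrt(saa * sbb)


def stepwise_residuals(cpg, protein, covariates):
    """Regress each covariate out of both variables in turn.

    Returns a list of (covariate_name, correlation_after_this_step) pairs,
    starting with the unadjusted correlation under the name "none".
    """
    trace = [("none", pearson(cpg, protein))]
    for name, values in covariates:
        cpg = residualize(cpg, values)
        protein = residualize(protein, values)
        trace.append((name, pearson(cpg, protein)))
    return trace


# Toy data (hypothetical): both variables depend on sex plus unrelated noise.
sex = [0, 1, 0, 1, 0, 1, 0, 1]
noise_cpg = [0.1, 0.1, -0.1, -0.1, 0.1, 0.1, -0.1, -0.1]
noise_protein = [0.1, 0.1, 0.1, 0.1, -0.1, -0.1, -0.1, -0.1]
cpg = [2 * s + e for s, e in zip(sex, noise_cpg)]
protein = [3 * s + e for s, e in zip(sex, noise_protein)]

for step, r in stepwise_residuals(cpg, protein, [("sex", sex)]):
    print(f"after regressing out {step}: r = {r:.3f}")
print(f"Bonferroni cutoff: {bonferroni_threshold():.2e}")
```

In this toy case the unadjusted correlation is close to 1 and drops to essentially zero after adjusting for `sex`, which is the pattern the legend describes for associations that fall below the threshold at a given step.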
Fig. 3 The methylation-proteome network.
Circular plot of all 98 replicated cis- and trans- pEWAS associations.
Fig. 4 Pappalysin-1 network.
This network comprises 72 CpG sites (green circles, same color and shape code as in Fig. 1) that associated with blood circulating levels of pappalysin-1 (PAPPA) (blue octagon), five of which were also associated with RNA expression in BIOS (gray squares) (Table 1). Ingenuity Pathway Analysis (IPA) was used to connect these CpG sites to PAPPA through protein-protein links (yellow diamonds) that were supported by experimental findings, reflecting the well-documented role of PAPPA as an activator of the IGF and NFκB pathways. PAPPA levels also associated with relevant clinical phenotypes in KORA, reaching multiple-testing-corrected significance levels of p < 5.6 × 10⁻⁴ for CpG sites and p < 3.6 × 10⁻³ for proteins (purple rectangles). PAPPA has also been linked to various diseases in numerous previous studies (pink hexagons). Full literature support of these links is provided in Supplementary Note 3.
Replicated PAPPA pQTMs.
| CpG | pQTM (this study) | PpQTM | BetapQTM | eQTM (BIOS) | PeQTM | BetaeQTM |
|---|---|---|---|---|---|---|
| cg07708453 ( | PAPPA (4148-49_2) chr9:118,916,083-119,164,601 | 3.10 × 10⁻¹⁶ | 0.262 | ENSG00000116731 PRDM2 | 2.52 × 10⁻⁶ | 0.130 |
| cg19393755 ( | PAPPA | 2.03 × 10⁻¹⁴ | −0.246 | ENSG00000179604 CDC42EP4ᵃ | 3.58 × 10⁻¹¹ | 0.132 |
| cg10831642 ( | PAPPA | 8.19 × 10⁻¹² | −0.246 | ENSG00000107957 SH3PXD2A | 9.03 × 10⁻³² | 0.287 |
| cg26272069 ( | PAPPA | 9.25 × 10⁻¹² | −0.224 | ENSG00000204681 GABBR1 | 3.80 × 10⁻⁸ | −0.073 |
| cg20290167 ( | PAPPA | 5.58 × 10⁻¹¹ | −0.212 | ENSG00000176845 METRNL | 2.77 × 10⁻⁵ | 0.091 |
| – | (Total: 72 PAPPA pQTMs) | – | – | n.a. | n.a. | n.a. |
The p-value (PpQTM, linear regression) and regression coefficient (BetapQTM) from the discovery study are reported. The chromosomal position of the CpG sites and the related protein coding genes are given, together with the associated protein and SOMAmer identifiers (SeqId). The BIOS QTL server[5] was used to identify overlapping CpG methylation to gene expression associations (eQTMs). The respective p-values (PeQTM, linear regression) and regression coefficients (BetaeQTM) of the association between the respective CpG and transcript are reported. All associations are located in trans, that is, the CpG and the blood circulating protein coding region were >1 Mbp apart. The five pQTMs shown here are only those with an overlapping eQTM. The full list is provided in Supplementary Data 4.
ᵃ These genes belong to the same cytogenetic band as those reported by Illumina as being regulated by the respective CpG sites and are within physical proximity (<21,000 and 5000 bp downstream, respectively).
Other replicated pQTMs.
| CpG | pQTM (this study) | PpQTM | BetapQTM | eQTM (BIOS) | PeQTM | BetaeQTM |
|---|---|---|---|---|---|---|
| cg10604476 ( | ICAM5 (5124-69_3) chr19:10,400,657-10,407,454 | 6.09 × 10⁻²⁵ | 0.356 | ENSG00000105376 ICAM5 | 2.22 × 10⁻⁹ | 0.186 |
| cg03650189 ( | ICAM5 | 1.22 × 10⁻²⁴ | 0.344 | ENSG00000105376 ICAM5 | 1.68 × 10⁻⁵ | 0.129 |
| cg22910295 ( | ICAM5 | 3.72 × 10⁻²⁴ | 0.339 | ENSG00000105376 ICAM5 | 2.37 × 10⁻⁶ | −0.134 |
| cg15011409 ( | ICAM5 | 5.96 × 10⁻²³ | 0.331 | ENSG00000105376 ICAM5 | 2.66 × 10⁻¹² | 0.198 |
| cg21994045 ( | ICAM5 | 4.18 × 10⁻¹⁷ | 0.291 | ENSG00000105376 ICAM5 | 1.21 × 10⁻⁵ | −0.148 |
| cg10773601 ( | CLEC11A (4500-50_2) chr19:51,226,586-51,228,974 | 8.06 × 10⁻²⁷ | −0.341 | ENSG00000105472 CLEC11A | 1.06 × 10⁻⁷¹ | 0.424 |
| cg16651537 ( | CLEC11A | 1.67 × 10⁻²⁴ | −0.326 | ENSG00000105472 CLEC11A | 3.54 × 10⁻⁶³ | 0.434 |
| cg05575921 ( | PIGR (3216-2_2) chr1:207,101,863-207,119,811 | 8.08 × 10⁻¹⁶ | −0.264 | ENSG00000180104 EXOC3ᵃ | 1.19 × 10⁻⁶ | 0.063 |
| cg18419358 (n.a.) chr6:158,384,009 | GP1BA (4990-87_1) chr17:4,835,592-4,838,325 | 2.52 × 10⁻¹² | 0.228 | n.a. | n.a. | n.a. |
| cg27535410 ( | PRTN3 (3514-49_2) chr19:840,963-848,175 | 1.21 × 10⁻¹¹ | −0.219 | n.a. | n.a. | n.a. |
| cg13028630 ( | C4A/C4B (4481-34_2) chr6:31,937,353-32,079,643 | 1.38 × 10⁻¹¹ | −0.246 | n.a. | n.a. | n.a. |
| cg09488502 ( | SIGLEC14 (5125-6_3) chr19:52,145,806-52,150,054 | 4.89 × 10⁻¹¹ | 0.220 | n.a. | n.a. | n.a. |
The p-value (PpQTM, linear regression) and regression coefficient (BetapQTM) from the discovery study are reported. The chromosomal position of the CpG sites and the related protein coding genes are given, together with the associated protein and SOMAmer identifiers (SeqId). The BIOS QTL server[5] was used to identify overlapping CpG methylation to gene expression associations (eQTMs). The respective p-values (PeQTM, linear regression) and regression coefficients (BetaeQTM) of the association between the respective CpG and transcript are reported. Some of the associations are located in cis, that is, the CpG and the blood circulating protein coding region were within 1 Mbp of each other (e.g. CLEC11A and ICAM5), while others were located in trans (e.g. AHRR).
ᵃ These genes belong to the same cytogenetic band as those reported by Illumina as being regulated by the respective CpG sites and are within physical proximity (<21,000 and 5000 bp downstream, respectively).
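The cis/trans distinction used in these table notes can be sketched in a few lines of Python. This is an illustrative reading of the 1 Mbp rule, not the authors' code: measuring the distance from the CpG to the nearest boundary of the gene region (rather than, say, to a transcription start site) and treating exactly 1 Mbp as cis are assumptions, and the function names are hypothetical. Coordinates are parsed in the `chrN:start-end` form used in the tables.

```python
CIS_WINDOW_BP = 1_000_000  # 1 Mbp cutoff named in the table notes


def parse_position(text):
    """Parse 'chr6:158,384,009' into ('chr6', 158384009)."""
    chrom, pos = text.replace(",", "").split(":")
    return chrom, int(pos)


def parse_region(text):
    """Parse 'chr17:4,835,592-4,838,325' into ('chr17', 4835592, 4838325)."""
    chrom, span = text.replace(",", "").split(":")
    start, end = (int(value) for value in span.split("-"))
    return chrom, start, end


def classify_pqtm(cpg_position, gene_region, window=CIS_WINDOW_BP):
    """Return 'cis' if the CpG lies within `window` bp of the gene, else 'trans'.

    Assumption: distance is measured to the nearest gene boundary and is
    zero when the CpG falls inside the gene region.
    """
    cpg_chrom, pos = parse_position(cpg_position)
    gene_chrom, start, end = parse_region(gene_region)
    if cpg_chrom != gene_chrom:
        return "trans"
    if start <= pos <= end:
        distance = 0
    else:
        distance = min(abs(pos - start), abs(pos - end))
    return "cis" if distance <= window else "trans"


# Coordinates taken from the cg18419358 / GP1BA row of the table above.
print(classify_pqtm("chr6:158,384,009", "chr17:4,835,592-4,838,325"))  # trans

# Hypothetical CpG position near CLEC11A, used only to exercise the cis branch.
print(classify_pqtm("chr19:51,230,000", "chr19:51,226,586-51,228,974"))  # cis
```

The only row in this excerpt that keeps its CpG coordinate is cg18419358, which is why the cis example relies on a made-up position.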
Replicated NLRC5 pQTMs.
| CpG | pQTM (this study) | PpQTM | BetapQTM | eQTM (BIOS) | PeQTM | BetaeQTM |
|---|---|---|---|---|---|---|
| cg07839457 ( | CD48 (3292-75_1) chr1:160,648,536-160,681,641 | 1.10 × 10⁻²¹ | −0.306 | n.a. | n.a. | n.a. |
| cg08159663 ( | CD48 | 3.38 × 10⁻¹² | −0.246 | ENSG00000140853 NLRC5 | 8.26 × 10⁻¹⁰ | 0.213 |
| cg07839457 ( | B2M (3485-28_2) chr15:45,003,675-45,011,075 | 7.60 × 10⁻²⁰ | −0.292 | n.a. | n.a. | n.a. |
| cg16411857 ( | B2M | 1.94 × 10⁻¹⁵ | −0.256 | n.a. | n.a. | n.a. |
| cg00218406 ( | B2M | 3.97 × 10⁻¹³ | −0.234 | ENSG00000206337 HCP5 | 7.78 × 10⁻¹⁰² | 0.454 |
| cg08099136 ( | B2M | 4.80 × 10⁻¹¹ | −0.214 | ENSG00000204264 PSMB8 | 1.96 × 10⁻²³ | 0.258 |
| cg07839457 ( | CXCL10 (4141-79_1) chr4:76,942,273-76,944,650 | 8.06 × 10⁻¹⁹ | −0.283 | n.a. | n.a. | n.a. |
| cg07839457 ( | FCGR3B (3311-27_1) chr1:161,592,986-161,601,753 | 1.02 × 10⁻¹⁸ | −0.307 | n.a. | n.a. | n.a. |
| cg08159663 ( | FCGR3B | 1.44 × 10⁻¹⁴ | −0.290 | ENSG00000140853 NLRC5 | 8.26 × 10⁻¹⁰ | 0.213 |
| cg16411857 ( | FCGR3B | 3.79 × 10⁻¹³ | −0.260 | n.a. | n.a. | n.a. |
| cg07839457 ( | LAG3 (5099-14_3) chr12:6,881,678-6,887,621 | 1.29 × 10⁻¹⁷ | −0.274 | n.a. | n.a. | n.a. |
| cg08159663 ( | LAG3 | 9.22 × 10⁻¹³ | −0.246 | ENSG00000140853 NLRC5 | 8.26 × 10⁻¹⁰ | 0.213 |
| cg07839457 ( | CD163 (5028-59_1) chr12:7,623,409-7,656,489 | 1.83 × 10⁻¹⁵ | −0.256 | n.a. | n.a. | n.a. |
| cg07839457 ( | CXCL11 (3038-9_2) chr4:76,954,835-76,962,568 | 1.13 × 10⁻¹³ | −0.246 | n.a. | n.a. | n.a. |
The p-value (PpQTM, linear regression) and regression coefficient (BetapQTM) from the discovery study are reported. The chromosomal position of the CpG sites and the related protein coding genes are given, together with the associated protein and SOMAmer identifiers (SeqId). The BIOS QTL server[5] was used to identify overlapping CpG methylation to gene expression associations (eQTMs). The respective p-values (PeQTM, linear regression) and regression coefficients (BetaeQTM) of the association between the respective CpG and transcript are reported. All associations are located in trans, that is, the CpG and the blood circulating protein coding region were >1 Mbp apart.
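To make the hub structure of this table easier to see, the CpG-protein pairs can be grouped by CpG and annotated with the overlapping eQTM gene where one is listed. The following Python sketch is only an illustration of that bookkeeping: the pairs and eQTM genes are copied from the table above, while the function names and the data layout are hypothetical.

```python
from collections import defaultdict

# (CpG, protein) pairs copied from the table above, in table order.
PQTMS = [
    ("cg07839457", "CD48"), ("cg08159663", "CD48"),
    ("cg07839457", "B2M"), ("cg16411857", "B2M"),
    ("cg00218406", "B2M"), ("cg08099136", "B2M"),
    ("cg07839457", "CXCL10"),
    ("cg07839457", "FCGR3B"), ("cg08159663", "FCGR3B"),
    ("cg16411857", "FCGR3B"),
    ("cg07839457", "LAG3"), ("cg08159663", "LAG3"),
    ("cg07839457", "CD163"),
    ("cg07839457", "CXCL11"),
]

# CpG -> eQTM gene for the rows that list one; the other rows read "n.a.".
EQTMS = {
    "cg08159663": "NLRC5",
    "cg00218406": "HCP5",
    "cg08099136": "PSMB8",
}


def proteins_by_cpg(pairs):
    """Group associated proteins under each CpG, preserving table order."""
    grouped = defaultdict(list)
    for cpg, protein in pairs:
        grouped[cpg].append(protein)
    return dict(grouped)


def annotate_with_eqtm(pairs, eqtms):
    """Attach the eQTM gene to each pQTM, or None when no eQTM is listed."""
    return [(cpg, protein, eqtms.get(cpg)) for cpg, protein in pairs]


for cpg, proteins in proteins_by_cpg(PQTMS).items():
    gene = EQTMS.get(cpg, "n.a.")
    print(f"{cpg}: {len(proteins)} protein(s) {proteins}, eQTM gene: {gene}")
```

Grouped this way, cg07839457 appears with all seven proteins in the table, while the CpGs with a listed eQTM gene each associate with between one and three proteins.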
Fig. 5 NLRC5 network.
Our pEWAS (green lines) identified multiple proteins (blue octagons) and CpG sites (green circles) that are related to major anti-inflammatory pathways, and that could be directly connected via intermediate genes and proteins using IPA (yellow diamonds). NLRC5 and beta-2-microglobulin both regulate the major histocompatibility complex (MHC) class I genes through interaction with various interleukins and NFκB. NLRC5 methylation also associated with various metabolic inflammatory markers (orange triangles). Finally, the associated proteins were also associated with various clinical phenotypes in KORA at multiple-testing-corrected significance levels (purple rectangles). NLRC5 methylation is a hallmark of chronic inflammation and has been reported in association with several inflammation-related diseases (pink hexagons). Full annotations of these links are provided in Supplementary Note 4.