Literature DB >> 32281463

The DNA methylome of inflammatory bowel disease (IBD) reflects intrinsic and extrinsic factors in intestinal mucosal cells.

Iolanda Agliata1, Nora Fernandez-Jimenez2, Chloe Goldsmith3, Julien C Marie3, Jose R Bilbao2,4, Robert Dante5, Hector Hernandez-Vargas3,6.   

Abstract

Abnormal DNA methylation has been described in human inflammatory conditions of the gastrointestinal tract, such as inflammatory bowel disease (IBD). As other complex diseases, IBD results from the balance between genetic predisposition and environmental exposures. As such, DNA methylation may be the consequence (and potential effector) of both, genetic susceptibility variants and/or environmental signals such as cytokine exposure. We attempted to discern between these two non-excluding possibilities by performing a combined analysis of published DNA methylation data in intestinal mucosal cells of IBD and control samples. We identified abnormal DNA methylation at different levels: deviation from mean methylation signals at site and region levels, and differential variability. A fraction of such changes is associated with genetic polymorphisms linked to IBD susceptibility. In addition, by comparing with another intestinal inflammatory condition (i.e., coeliac disease) we propose that aberrant DNA methylation can also be the result of unspecific processes such as chronic inflammation. Our characterization suggests that IBD methylomes combine intrinsic and extrinsic responses in intestinal mucosal cells, and could point to knowledge-based biomarkers of IBD detection and progression.

Entities:  

Keywords:  DNA methylation; biomarkers; coeliac disease (CeD); inflammatory bowel disease (IBD); methylation quantitative trait loci (mQTLs)

Year:  2020        PMID: 32281463      PMCID: PMC7518701          DOI: 10.1080/15592294.2020.1748916

Source DB:  PubMed          Journal:  Epigenetics        ISSN: 1559-2294            Impact factor:   4.528


Background

Inflammatory bowel disease (IBD) comprises Crohn’s disease (CD) and Ulcerative Colitis (UC), two chronic and progressive inflammatory conditions of the gastrointestinal (GI) tract that affect 2.2 million people in Europe and 1.4 million in United States [1,2]. The exact aetiology is not known, but IBD is characterized by various genetic abnormalities that result in aggressive response from both innate (i.e., macrophages and neutrophils) and acquired (i.e., T and B cells) immunity [3]. In CD, although inflammation may involve the entire GI tract, the ileum is mainly affected [4]. In UC, chronic and relapsing inflammation affects the colon and rectum [5] and is associated with increased risk of colon cancer development [6]. While genetics explains a fraction of inheritance of IBD (13,1% variance in CD and 8,2% in UC) [7], environmental factors may influence susceptibility through non-genetic mechanisms, such as DNA methylation [8,9]. Indeed, several recent studies have provided a detailed characterization of genomic abnormalities in IBD, including DNA methylation [10-12]. Although there is a clear crosstalk between DNA methylation and gene expression, the cause–effect relationship between these two processes is dependent on the biological context [9,13]. There is evidence for gene expression preceding DNA methylation changes [14-16], as well as evidence for DNA methylation as an effector of genetic variants and the resulting pathological phenotype [8]. Unifying both possibilities, DNA methylation may represent a mechanism to condition or to perpetuate the response to anti- and pro-inflammatory signals. For example, exposure to cytokines such as interleukin 6 (IL6) and transforming growth factor beta (TGF-β) has been associated with stable DNA methylation changes in epithelial cells [14,17-19]. However, it is unclear to what extent the altered DNA methylation of epithelial cells in IBD could be due to persistent cytokine exposure and/or to the direct consequence of genetic susceptibility variants (i.e., SNPs). Explaining the origin of DNA methylation changes in IBD may be of interest when exploiting their potential as biomarkers. Currently, the most used biomarkers for IBD are C-Reactive Protein and Calprotectin, although they are not specific for inflammation of intestinal origin, limiting their clinical use [20]. Instead, DNA methylation is known to be tissue-specific [21,22], and it may represent a sensor of cytokine exposures [23-26] and thus a better biomarker of IBD. Moreover, DNA markers are advantageous in terms of stability, improved isolation and storage, relative to RNA or protein [27]. With these assumptions, we performed a combined analysis of intestinal epithelium methylomes in IBD. Our goal was to identify candidate loci that can be potentially useful as biomarkers, using base-resolution methylation data in mucosal biopsies from a large aggregated dataset of CD and UC patients, an approach that may open the way to personalized prevention strategies.

Results

Genome-wide changes in DNA methylation are a common feature of IBD

To identify DNA methylation changes in cells of the intestinal mucosa associated with IBD, we reanalysed bead-array methylation data from different datasets (Tables 1 & 2). To increase coverage while enhancing data harmonization, we only included datasets based on the last two versions of Illumina methylation bead arrays (i.e., HM450 and EPIC, see Methods for other inclusion criteria) which share ~400 k informative features. Samples from these datasets included paediatric and adult IBD patients, from both sexes, and involved the two main forms of the condition (i.e., CD and UC).
Table 1.

Dataset characteristics.

AccessionConditionidatSamplesAgeOriginPMID
MTAB_5463UC/CDyes111/1046-15Europe29031501
GSE32146UC/CDno2514-17USANA
MTAB_3703/3709UC/CDyes1212-14Europe2376367
GSE81211UCyes1216-68South Korea27517910
GSE105798CDyes11NASouth KoreaNA
GSE42921UC/CDno235-19USANA

Characteristics of the datasets included in the study. Accession: either ArrayExpress or Gene Expression Omnibus (GEO) accession numbers. Condition: ulcerative colitis (UC) or Crohn’s disease (CD). PMID: PubMed ID. Idat: raw-level bead-array data availability. NA: not available.

Table 2.

Samples used in the study.

AccessionArrayControlsCasesProtocolSegmentInflammation
MTAB_5463EPIC2084IEC purification by sortingsig: 53, ter: 51Histology score availablea
MTAB_5463HM4503378IEC purification by sortingsig: 48, ter:48, asc: 15Histology score availablea
GSE32146HM4501015Intestinal mucosaColonNA
GSE81211HM45038Intestinal mucosaRectumEndoscopic score availableb
GSE105798HM45038Intestinal mucosaNANA
GSE42921HM4501211Intestinal mucosaColonNA

Characteristics of the samples included in each study. Accession: either ArrayExpress or Gene Expression Omnibus (GEO) accession numbers. Array: Illumina bead array version. ainflammation score available for each sample. bactive vs. inactive status, not attributable to each sample. NA: not available. In Segment, sig: sigmoid colon, ter: terminal ileum, asc: ascendent colon.

Dataset characteristics. Characteristics of the datasets included in the study. Accession: either ArrayExpress or Gene Expression Omnibus (GEO) accession numbers. Condition: ulcerative colitis (UC) or Crohn’s disease (CD). PMID: PubMed ID. Idat: raw-level bead-array data availability. NA: not available. Samples used in the study. Characteristics of the samples included in each study. Accession: either ArrayExpress or Gene Expression Omnibus (GEO) accession numbers. Array: Illumina bead array version. ainflammation score available for each sample. bactive vs. inactive status, not attributable to each sample. NA: not available. In Segment, sig: sigmoid colon, ter: terminal ileum, asc: ascendent colon. After filtering (see Methods), we tested for the association between IBD and DNA methylation at 392810 CpG sites (81 control and 204 IBD patients) using a linear model. In such a model, we adjusted for sex, age, dataset, and surrogate variables identified during data preprocessing (Figure S1). To account for statistical inflation, we used criteria of effect size (change in mean methylation of at least 10% between controls and IBD) and FDR-adjusted p value <0.05. Using these criteria, we identified 4205 differentially methylated positions (DMPs), out of which 436 were hypo- and 3769 were hypermethylated in IBD (Figure 1(a), Tables 3 and S1). DMPs were robust to IBD type (Figure 1(b)), and other clinical and technical features (Figure 1(c), S2, and S3). An important fraction of these sites was previously identified, in particular in the large dataset published by Howell et al. [10]. However, our dataset combination strategy has led to the identification of new associations. Moreover, the consistency of these findings across independent studies provides additional confidence on their robustness.
Figure 1.

DNA methylation distinguishes IBD from healthy intestinal epithelial cells.

Table 3.

Top DMPs.

Probe IDFDRDelta BetaSymbolDistance
cg164650271.14E-14-0.20PHACTR1122016
cg192694261.80E-12-0.22GGPS10
cg078394576.58E-12-0.23NLRC5435
cg162406831.24E-100.22ZNF436-AS10
cg241293564.02E-09-0.22HLA-DMA0
cg227181391.05E-080.25HMGCS20
cg269742145.56E-08-0.23LIPA0
cg028067153.03E-07-0.20HLA-DMA0
cg093218177.33E-07-0.25HLA-DPA10
cg018049342.60E-05-0.21HLA-DPA10
cg230459086.30E-05-0.21PDE4B0
cg060610869.70E-04-0.20FOXP414522

Differentially methylated positions (DMPs) with a mean difference between groups of at least 20% (delta-beta -CBrk- 20, FDR -OBrk- 0.05). Probe ID: Illumina probe reference, logFC: logarithmic fold-change between groups (IBD vs. control), FDR: false discovery rate, Symbol: gene symbol, Distance: distance in base pairs to the closest gene. Full list of DMPs can be found in Supplementary Table S1.

Top DMPs. Differentially methylated positions (DMPs) with a mean difference between groups of at least 20% (delta-beta -CBrk- 20, FDR -OBrk- 0.05). Probe ID: Illumina probe reference, logFC: logarithmic fold-change between groups (IBD vs. control), FDR: false discovery rate, Symbol: gene symbol, Distance: distance in base pairs to the closest gene. Full list of DMPs can be found in Supplementary Table S1. DNA methylation distinguishes IBD from healthy intestinal epithelial cells. A subset of DMPs mapped close to each other, suggesting a non-random association with particular genomic loci. To explore this observation, we performed a region-level analysis in the same combined dataset. This led to the identification of 55 differentially methylated regions (DMRs), 31 hypo and 24 hyper methylated in IBD (Tables 4 and S2). As expected, many of these regions corresponded to gene loci also identified using the probe-level strategy (Figure 2(b)).
Table 4.

Top DMRs.

SymbolCoordinates# CpGFDRbeta FC
BST2chr19:17516282-1751700857.82E-60-0.08
HMGCS2chr1:120311439-12031165341.37E-540.06
HLA-DPA1chr6:33040535-3304169785.01E-44-0.07
LGALS3chr14:55602634-5560445441.14E-290.06
LTBP3chr11:65318624-6531868321.65E-360.05
DACT2chr6:168665386-16866553331.92E-310.05
DYSFchr2:71823484-7182351727.51E-290.05
IARS2chr1:220292508-22029259023.27E-270.06
CIITAchr16:10969805-1097125061.27E-37-0.06
VILLchr3:38033516-3803393431.11E-290.06
HLA-Bchr6:31322298-3132385653.04E-25-0.05
THBS1chr15:39874776-3987624844.62E-160.07
CLDN4chr7:73223852-7322393525.85E-230.06
CLHC1chr2:55360999-5536131024.80E-23-0.08
ZNF467chr7:149462836-14946321924.80E-240.05
SHHchr7:155617333-15561739821.30E-240.07
ARL14chr3:160395420-16039571921.65E-210.08
TMEM232chr5:110062343-11006283774.54E-220.07
HOXA10chr7:27217057-2721760621.09E-230.05
FAAP20chr1:2120985-212172465.17E-210.07
EPHB3chr3:184297380-18429752235.82E-18-0.09
PSMD1chr2:231989800-23198982427.41E-17-0.08
MUC4chr3:195552341-19555242921.53E-16-0.08
SBNO2chr19:1130866-113096522.53E-18-0.07
PARP9chr3:122281881-12228197532.17E-19-0.08
HOXD8chr2:176996285-17699770755.07E-14-0.05
CUX1chr7:101579003-10157993635.12E-110.05
ATXN7L1chr7:105279391-10527988234.08E-170.05
ELLchr19:18589848-1858989424.50E-14-0.05
PLA2G2Achr1:20305344-2030568522.08E-13-0.08
SUMO1P1chr20:52356911-5235711723.12E-13-0.07
TRIM69chr15:45018591-4501890538.08E-21-0.05
HLA-Cchr6:31238245-3123875139.95E-16-0.06
SFT2D3chr2:128453108-12845348457.96E-130.08
STAT1chr2:191875807-19187667321.99E-10-0.08
RPS6KA2chr6:166970252-16697072723.48E-11-0.05
PHACTR1chr6:12594257-1259501921.33E-21-0.07
SLC45A4chr8:142255356-14225553721.54E-11-0.06
CANT1chr17:76991208-7699125321.02E-140.05
DRD3chr3:113933104-11393317521.63E-09-0.07
AFF3chr2:100170766-10017113641.54E-090.05
LRRC47chr1:3720588-372074421.29E-090.06
MIR34Achr1:9224003-922419822.40E-10-0.06
TRIM5chr11:5710654-571106827.64E-15-0.06
FOXP4chr6:41499640-4149979927.46E-08-0.05
ATP9Achr20:50312386-5031263221.01E-07-0.05
SLCO3A1chr15:92612836-9261328028.59E-08-0.05
CCDC155chr19:49891494-4989157423.13E-06-0.06
SVILchr10:29948324-2994842821.17E-05-0.06
TPM1chr15:63337352-6333749628.85E-06-0.05
SHTN1chr10:118652838-11865298121.18E-05-0.06
CES1P1chr16:55794595-5579473123.09E-05-0.06
GNL3chr3:52722445-5272245224.73E-050.06
CYP2E1chr10:135341870-13534262051.16E-050.05
ASPGchr14:104552032-10455220938.13E-050.05

Differentially methylated regions (DMRs) with at least five CpG sites, a maximum beta change between groups (beta FC) of at least 10%, and a minimum FDR of 0.05, are shown below. # CpGs: number of CpG sites per region, FDR: false discovery rate, Beta FC: methylation beta value fold change. Full list of DMRs can be found in Supplementary Table S2. The table is sorted by beta FC, from hypo- to hypermethylation in IBD.

Figure 2.

Mean DNA methylation and variability distinguishes IBD from healthy intestinal epithelial cells.

Top DMRs. Differentially methylated regions (DMRs) with at least five CpG sites, a maximum beta change between groups (beta FC) of at least 10%, and a minimum FDR of 0.05, are shown below. # CpGs: number of CpG sites per region, FDR: false discovery rate, Beta FC: methylation beta value fold change. Full list of DMRs can be found in Supplementary Table S2. The table is sorted by beta FC, from hypo- to hypermethylation in IBD. Mean DNA methylation and variability distinguishes IBD from healthy intestinal epithelial cells. In addition, to mean methylation differences at the probe and region levels (i.e., DMPs and DMRs), methylation variation has been associated with disease and cancer susceptibility [28]. To explore this, we used the iEVORA algorithm in the same datasets, to identify differentially variable and methylated CpGs (DVMCs). Using stringent criteria of differential methylation and variation, we identified 4532 DVMCs (Figure 2(a) and Table S3), most of them located in the vicinity of a known promoter (80%, within 2 kb of a transcription start site). Of note, for most of these sites (75%), IBD samples displayed higher variability than control tissues. In addition, more than half of them displayed lower methylation in IBD samples relative to control mucosa (63%). In summary, the intestinal mucosa of IBD displays large non-random methylome abnormalities characterized by high variability, but also by absolute changes in mean DNA methylation at particular loci.

Genomic and biological context of IBD-associated DNA methylation changes in intestinal epithelia

DMPs distinguishing IBD from control tissues were assessed for genomic distribution, in terms of gene-centric and CpG island (CGI)-centric context. DMPs were relatively absent from CGIs, gene promoters, or the vicinity of transcription start sites (TSS) (Figure 3(a-c)). Instead, hypo and hypermethylated DMPs were highly concentrated in non-CGI regions (i.e., open sea) (Figure 3(a)). Pathway analysis of DMRs revealed over-representation of pathways related to metabolism and signal transduction, including Adipogenesis, Haemostasis, G alpha signalling events, Pathways in cancer, and TGF-beta Receptor Signalling (Table 5).
Figure 3.

Genomic distribution of IBD-related DMPs.

Table 5.

Pathway analysis.

PathwayAdjusted P-valueCombined ScoreDataset
Adipogenesis genes_Mus musculus_WP4475.78E-0633.17WikiPathways 2016
Adipogenesis genes_Homo sapiens_WP2365.78E-0632.1WikiPathways 2016
TGF Beta Signalling Pathway_Mus musculus_WP1134.28E-0426.34WikiPathways 2016
TGF-beta Receptor Signalling_Homo sapiens_WP5607.32E-0424.45WikiPathways 2016
Alpha6-Beta4 Integrin Signalling Pathway_Mus musculus WP4882.44E-0318.52WikiPathways 2016
Haemostasis_Homo sapiens_R-HSA-1095821.31E-0329.15Reactome
G alpha(12/13) signalling events_Homo sapiens_R-HSA-4164824.29E-0321.79Reactome
Pathways in cancer _Homo sapiens_hsa 052005.90E-0426.06KEGG 2016
Aldosterone synthesis and secretion_Homo sapiens_hsa049255.90E-0424.68KEGG 2016
Pathway analysis. Genomic distribution of IBD-related DMPs. Overall, abnormal DNA methylation in IBD is relatively absent from CGIs. At the biological level, DNA methylation changes are enriched in inflammation-related pathways. Such changes may occur downstream of cytokine signalling. Alternatively, they may represent early changes linked to genetic susceptibility.

IBD DMPs are genomically closer to IBD risk polymorphisms and are enriched on blood mQTLs

DNA methylation may represent an intermediary between genotype and disease susceptibility, and such genetic influences on DNA methylation within a defined genomic context are known as methylation quantitative trait loci (mQTLs). Among differentially methylated genes with a significant genetic association, we found JAK3, KRT8, and HLA genes, confirming the findings of previous studies [7,29-31]. Moreover, some DMPs display a bimodal DNA methylation distribution (see Methods). After ruling out technical artefacts, such bimodal distribution may suggest that their methylation levels are directly dependent on genotype. To explore a genotype-methylation association, we calculated the genomic distance between DMPs identified in our analysis and single nucleotide polymorphisms (SNPs) associated with IBD risk [29,30,32]. Of note, DMPs were overall significantly closer to a known IBD risk SNP, compared to all HM450 sites taken together (Figure 4). This difference was preserved after independently comparing hyper or hypomethylated DMPs (although more evident in the latter), and consistent across three independent SNP datasets (Figures 4 and S2C).
Figure 4.

Genomic distances between IBD-related DMPs and known risk SNPs.

Genomic distances between IBD-related DMPs and known risk SNPs. We also tested the overlap between IBD-DMPs and CpGs participating in blood mQTLs as defined by McRae et al. [33]. Although this was not a significant enrichment, 544 out of the 4205 DMPs participated in the 52916 mQTLs reported previously (Supplementary Table S4). To ascertain whether the SNPs putatively associated to our DMPs were also associated to IBD, we interrogated the largest fine-mapping study performed to date on the disease that claims to identify associations at a base-pair resolution level [29]. We found that 4 of the 544 mQTLs identified here bear an IBD-associated polymorphism, namely rs11264305, rs17228058, rs3806308, and rs3807306, located in or close to ADAM15, SMAD3, RNF186, and IRF5, respectively. Briefly, we found that SNP-CpG pairs overlap regulatory loci, discernible by H3K27ac histone marks and the presence of a CpG island (in the case of ADAM15). These findings suggest that at least a fraction of IBD abnormal methylome is in direct relationship with upstream genetic susceptibility variants.

IBD and epithelial and immune cell fractions of the coeliac duodenum share DMPs

As the IBD methylome is both, related to inflammation and genetic susceptibility, it may also be largely unspecific. We therefore chose coeliac disease (CeD), a chronic inflammatory condition of the GI tract with a well-characterized genetic component, to get further insight into methylome specificity. In addition, DNA methylation data for epithelial and immune components of CeD were analysed separately [34]. When we crossed IBD-DMPs with epithelial CeD-DMPs we found that, out of 4205 IBD-DMPs and 43 CeD epithelial-DMPs, 8 were common (representation factor = 17.7, p < 1.5e-08) (Table 6). Interestingly, 5/8 common DMPs mapped to the HLA region on chromosome 6. On the other hand, 31 IBD-DMPs were common with the 310 CeD immune-DMPs (representation factor = 9.5, p < 1e-20). These common hits were enriched for TGF-β signalling pathway (WikiPathways, adjusted p value = 0.04419), and were spread across the genome. All common DMPs followed the same direction (i.e., hypo or hypermethylation) in both diseases, indicating that methylation alterations were concordant. However, methylation fold changes were larger in CeD, probably due to the fact that the coeliac DMPs were identified in separated cell populations, while IBD methylation was assessed in whole intestinal tissue potentially blurring cell-specific signatures.
Table 6.

IBD DMPs previously identified to be differentially methylated in both CeD duodenal epithelia and immune fractions.

CpGNearest geneMean ControlMean IBDDelta BetaFDR IBDFDR CeD
Epithelial fraction     
 cg00403478HLA-DPA16.2E-014.6E-01−0.303.9E-047.51E-03
 cg00676801STAT17.8E-016.7E-01−0.389.7E-051.40E-02
 cg02181920TAP15.2E-014.0E-01−0.222.2E-061.65E-02
 cg02286081HLA-DPB17.9E-016.6E-01−0.271.2E-021.84E-02
 cg06471536PREP5.5E-014.4E-01−0.171.4E-093.76E-02
 cg07839457NLRC55.5E-013.3E-01−0.256.6E-121.71E-02
 cg08735211HLA-DMA7.5E-015.9E-01−0.312.8E-063.24E-02
 cg09321817HLA-DPA17.4E-014.8E-01−0.387.3E-077.51E-03
Immune fraction     
 cg01829342FRMD64.4E-015.5E-010.191.3E-033.45E-02
 cg01971120RNU5 F-12.5E-013.7E-010.201.1E-111.93E-02
 cg01977473GRAMD2B3.0E-014.1E-010.126.3E-133.74E-02
 cg02360367SKI3.5E-014.8E-010.111.3E-119.49E-03
 cg02909176ATP11A3.3E-014.5E-010.131.0E-101.07E-02
 cg05364072MAP7D15.2E-016.3E-010.225.0E-061.86E-02
 cg07843390GNG72.9E-014.0E-010.161.5E-124.04E-02
 cg09558069S100A53.4E-014.4E-010.133.7E-114.38E-02
 cg09670127BCL11A3.3E-014.3E-010.133.0E-072.56E-02
 cg09799714PDZD32.4E-013.5E-010.152.4E-141.44E-02
 cg10331073APBB1IP2.7E-013.9E-010.101.6E-074.32E-02
 cg14997942DNAJC13.3E-014.6E-010.231.1E-111.97E-02
 cg15876825VGLL43.9E-015.3E-010.116.9E-094.24E-02
 cg15924102LINC011322.9E-014.3E-010.135.2E-143.81E-02
 cg16312609MIR47082.8E-013.9E-010.171.1E-132.85E-02
 cg17292100SPAG12.5E-013.5E-010.121.3E-132.85E-02
 cg17980364TMEM1352.8E-013.9E-010.121.2E-171.46E-02
 cg18423737SMAD75.6E-016.9E-010.138.6E-103.00E-02
 cg18556822NOSTRIN2.4E-013.6E-010.181.7E-154.88E-03
 cg19697512PPCDC3.9E-015.2E-010.157.7E-074.24E-02
 cg19794481MIR1413.1E-014.3E-010.122.3E-123.18E-02
 cg20059312NGEF3.7E-014.9E-010.101.1E-154.59E-02
 cg20555562TEX292.9E-014.2E-010.158.6E-082.79E-02
 cg20859933ZBTB443.4E-014.6E-010.131.6E-101.86E-02
 cg21646082CEP853.1E-014.2E-010.113.2E-114.77E-02
 cg22601415ANK33.6E-014.9E-010.117.6E-094.83E-02
 cg22659049LIMCH14.3E-015.9E-010.131.7E-034.24E-02
 cg24216990GUCA1A2.5E-013.5E-010.151.4E-153.69E-02
 cg26049390RDH103.1E-014.2E-010.141.1E-042.16E-02
 cg26075184NKX2-32.2E-013.7E-010.155.4E-111.52E-02
 cg26094842VTI1A3.0E-014.0E-010.191.1E-092.16E-02
IBD DMPs previously identified to be differentially methylated in both CeD duodenal epithelia and immune fractions. In summary, there is a significant overlap in DNA methylation changes associated with IBD and CeD, including the HLA region.

Discussion

IBD is a complex pathology with a wide range of clinical trajectories. Despite such heterogeneity, we show here that non-random changes in DNA methylation associated with IBD are robust to main clinical parameters and consistent across several studies. There are intrinsic limitations of DNA methylation analyses relative to standard genetic profiling, such as confounding, reverse causation, and cellular heterogeneity [13,21]. Interpretability becomes even more complex when aggregating data from independent studies. Despite our efforts in limiting the effect of potential confounders, we are aware that the residual effect of cell composition, anatomical location, inflammation, etc., and/or the differences in sample size from the different studies may have influenced our results. Different characteristics of DNA methylation, such as its relative stability, make this mark an ideal sensor of disease risk and progression. Indeed, several studies have been able to use DNA methylation as a marker of IBD in blood samples [31,35,36]. Both in blood and intestinal mucosa, a deeper mechanistic insight is necessary to better distinguish those methyl marks that are dependent on genetic susceptibility from those that are a consequence of environmental cues. We suggest here that IBD methylome is indeed a combination of both components, on the one hand, many associations at the site and region levels were enriched in inflammatory pathways, suggesting that methyl marks could have been introduced downstream of cytokine signalling (either up- or downs-stream of gene expression changes). On the other hand, at least a fraction of DNA methylation changes was linked to a neighbouring risk polymorphism, indicating an effector role for DNA methylation in the interface between genotype and phenotype. In agreement with the largest study selected for our meta-analysis [10], genes near abnormal DNA methylation were enriched in immune and inflammatory pathways, highlighting the role of chronic inflammation in both, UC and CD. In particular, TGF-β is a cytokine able to modulate the inflammatory response, and it was enriched in IBD-DMRs. Moreover, it was enriched in those DMPs common between IBD and CeD, in agreement with the crucial role of TGF-β pathway in regulating the intestinal T cell response. An additional element that emerged from our pathway analysis is the potential crosstalk between IBD and adipogenesis. In fact, patients with IBD, particularly those with CD, develop ectopic adipose tissue (fat-wrapping or creeping-fat) covering a large part of the small and large intestine [37]. It has been proposed that in obese or overweight IBD patients it is the mesenteric adipose tissue that contributes to intestinal and systemic inflammation [37]. In our study, we identified 4532 CpG sites that simultaneously display differential variation and differential methylation (DVMCs) associated with IBD. In most cases, IBD mucosal cells displayed higher variation at those DVMCs relative to control cells. Although this hypervariability may represent cellular variation (e.g., changes in inflammatory or stromal components of the intestinal mucosa), it has been suggested that a stochastic component of methylation variation at certain genomic locations may characterize pathological conditions [28,38]. Of note, differential variation in DNA methylation has been found in other pathologies, including cancer [38-40]. In particular, they have been described as predictors of cancer development in non-tumour tissues [28,39] or associated with exposure to known carcinogens [41]. This is an interesting finding, considering that one fraction of IBD patients has an increased susceptibility to develop colon cancer [42]. In terms of genomic distribution, we found that DMPs are relatively absent from CGIs. Instead, they could be associated with other regulatory regions such as enhancers, for example, in association with SNPs. Indeed, GWAS performed in multiple complex diseases have shown that SNPs of susceptibility are enriched in enhancer regions, and DNA methylation could be an intermediary in this process [43,44]. Illustrating this, the presence of differentially methylated sites in the vicinity of known susceptibility loci supports the notion of DNA methylation as an intermediary between genotype and phenotype (mQTLs). In addition, among DMRs with a significant genetic association, we find JAK3, KRT8, HLA genes, all of them associated with a role in IBD pathogenesis [45-49]. The presence of CpGs participating in both IBD-DMPs as well as mQTLs suggests that a considerable number of the DMPs identified in our metanalysis are regulated by SNP-genotypes in cis. However, very few of these are associated with IBD. This observation points to the possibility that, although fine-mapping aims to identify the SNPs responsible for the disease-association, other nearby SNPs in strong linkage disequilibrium could be the ones implicated in the mQTLs, drawing the methylation patterns reported. Additionally, we describe a picture in which most of the IBD-DMPs seem to be genotype-independent, since they do not participate in any mQTL, at least in blood. Regarding the SNPs associated to IBD as well as to the methylation levels of IBD-DMPs, it is interesting that the methylation of a CpG island 4 kb upstream of the cg24032190-DMP identified in the first intron of SMAD3 has been reported to be allele-specific and to regulate the expression of the gene [50]. Therefore, we propose another DMP in the same region that could mediate the association between the locus and IBD; and hypothesize that this could also be the case for the genomic regions surrounding ADAM15, RNF186, and IRF5. Regarding coeliac epithelial DMPs also found altered in IBD, it is important to note that most of them were located in the HLA region. This locus presents strong linkage disequilibrium and encodes a number of genes related to immune response and immune regulation through self-recognition [49,51], and strongly predisposes to autoimmune diseases such as CeD. In our previous work [34], we claimed to have found a genotype-independent methylation signature in coeliac duodenal epithelia. The finding of a signature in the HLA region common to IBD and CeD reinforces this idea, given that the HLA association with IBD is much weaker (variance explained <5%) than with CeD, and moreover, different HLA haplotypes drive these associations [45]. Additionally, this common methylation signature points to a non-specific pattern, probably responding to common inflammatory forces in the two disorders.

Conclusions

Our findings illustrate an aberrant DNA methylation landscape in IBD, independent of IBD subtype and other clinical and pathological features. The enrichment of abnormal DNA methylation in inflammatory pathways and genes suggests a direct role for this mark downstream of cytokine signalling and/or a risk genotype. Such a landscape may be a more general indicator of intestinal chronic inflammation, although evidence from purified epithelial cells suggests that those changes are not primarily explained by an inflammatory status [10]. Such effect of inflammation, as well as cell heterogeneity in general could not be directly accounted for in our analyses. However, we expect that such limitation will be compensated with the future addition of new IBD datasets with adequate and complete annotations. In addition, technological progress in other forms of methylation (e.g., 5hmC) and a higher coverage of the genome will add to the overall goal of identifying biomarkers in IBD.

Methods

Dataset selection

Dataset selection criteria included: methylome data obtained from intestinal mucosa (including colon and terminal ileum), availability of healthy controls and IBD samples (CD, UC, or both), in data obtained using Human Infinium Bead Arrays (Illumina’s HM450 or EPIC arrays), an established technology to detect DNA methylation [52]. Tables 1 & 2 shows the main characteristics of the datasets fulfiling these criteria. Dataset MTAB_3703/3709 was eventually excluded from the analyses as only 6 samples were of non-foetal origin, with only 3 samples from large intestine.

Data preprocessing

All methylation data and sample information were downloaded from Gene Expression Omnibus (GEO) and Array Express public repositories, and analysed using R/Bioconductor packages [53]. Normalized data was loaded into R directly from each repository, except when raw idat files were also available. In that case, idat files were normalized using the “Funnorm“ function of the minfi package [54]. Each dataset was independently assessed for data quality and distribution, before merging. Merged data was filtered for sex chromosomes, known cross-reactive probes [55], and probes associated with common SNPs that may reflect underlying polymorphisms rather than methylation profiles [56]. In addition, the ‘nmode.mc’ function of the ENmIx package was used for the identification of multimodal sites [57]. These sites were not removed at this step but were used instead to classify significant associations in a later step.

Quality control and cross-validation

After filtering, 392810 CpG sites common to all datasets were used to identify principal components (PC) of variation and plotted using PC regression and multidimensional scaling (MDS) plots. Strong associations were observed between PCs and known variables (i.e., dataset, sex, age, and anatomical location), with age and anatomical location partially confounded by the dataset of origin. As additional quality control, DNA methylation values were used to predict age and sex and contrast with downloaded phenotype information (Figure S1). Sex was inferred from the median total intensity signal on XY chromosomes and permitted the identification of eight sex mismatches that were removed from the analysis. Age prediction was performed using Horvath’s coefficients [58], as implemented in the wateRmelon package [59]. There was a strong positive correlation between reported and predicted age (Figure S1). For two datasets where age was not available, predicted age corresponded to adult samples, as reported in the corresponding repositories. The common merged and filtered matrix of methylation beta values and their corresponding phenotype data was taken to the next step. As validation of our aggregated analysis, we performed independent region-level analyses to test for the association between IBD and DNA methylation in three datasets, where enough power made it possible (dataset 1: all datasets with available idat files, 2: dataset based on EPIC bead array data, and 3: dataset GSE42921). There was a significant overlap among those three analyses, with 905 common gene symbols (Figure S2). We also performed a leave-one-out cross-validation approach. To this end, we successively removed each of the six datasets of the study and performed differential methylation analysis at the probe and region levels (Figure S2). Two different diagrams are shown due to limitations of this visualization, but they illustrate that there is a common set of CpG sites differentially methylated across all or most datasets, and an important overlap with our final list of differentially methylated probes. Similar results were obtained when differential methylation was studied at the region level (DMRs).

Latent variables and batch correction

In addition to the obvious batch effect of the dataset of origin, DNA methylation is known to be influenced by genotype, sex, age, and cell composition. As all of these factors are potential confounders, we tried to minimize or account for their effect using different strategies. Those factors where data was available (i.e., dataset, sex, predicted age) were modelled in a linear regression. In the particular case of sex where the effect on DNA methylation is strong, we removed an important part of such effect by filtering out all probes mapping to chromosomes X and Y, as described above. The effect of genotype was addressed a posteriori, in our mQTL analyses. For all other factors (except inflammation, where annotated data was not available for most samples), we were able to assess their association with the main components of variation before and after adjustment for latent variables identified using surrogate variable analysis (SVA) [60]. In particular, cell composition has been shown to be suited to be addressed using this strategy [61]. In our case, cell composition can be dependent on both, inflammation and anatomical location. Anatomical location was indeed strongly associated with the first component of variation (PC1) (Figure S2), an effect that was attenuated after SVA. A similar reduction in the strength of association with main PCs was observed for the effect of dataset, age, and sex. Of note, our variable of interest (IBD vs. control) was associated with the first three PCs after SVA adjustment, while the effect of all other co-variates and batches was minimized (Figure S1). In total, 29 surrogate variables were identified and they were modelled in our linear regression, together with dataset, sex, and age. There was no association (using linear regression) between surrogate variables (SVs, Figure S2) and our main variable. However, dataset of origin and anatomical location were strongly associated with several SVs (Figure S2).

Differential methylation

Associations were tested for 392810 CpG sites, across 285 samples (81 control and 204 IBD samples). Methylation data was modelled at the probe and region levels using a linear model with Bayesian adjustment [62]. Sex and dataset were modelled together with subject status (i.e., control or IBD patient). Surrogate variables identified in the previous step were also included in the linear model to account for unknown sources of variation. Quantile-quantile (QQ) plots were used to inspect the distribution of resulting p values and estimate statistical inflation (Figure S2). Differentially methylated positions (DMPs) and regions (DMRs) were selected based on a methylation change (delta beta) of at least 10% or 5% (for DMPs and DMRs, respectively) when comparing control vs. IBD samples and a false discovery rate – (FDR) adjusted p value below 0.05. DMRs were identified with the DMRcate package using the recommended proximity-based criteria [63]. A DMR was defined by the presence of at least two differentially methylated CpG sites with a maximum gap of 1000 bp. To identify CpG positions exhibiting significant differential variation and differential methylation (DVMCs), data was analysed using iEVORA, an algorithm that identifies DNA methylation outlier events shown to be indicative of malignancy [28]. iEVORA is based on Bartlett’s test (BT) that examines the differential variance in DNA methylation, but because BT is very sensitive to single outliers, it is complemented with re-ranking of significant events according to t-statistic (TT, t test), to balance the procedure. The significance is thus assessed at the level of differential variability, but the significance of differential variability with larger changes in the average DNA methylation are favoured over those with smaller shifts. We used adjusted q(BT) <0.001 and p(TT) <0.05 as thresholds for significant DVMCs. To study genomic context, we used HM450 annotations, with hg19 as the human reference genome, UCSC and previously reported genomic features [65]. Differentially methylated genes (DMPs, DMRs, and DVMCs) were further analysed to determine functional pathways and ontology enrichment using Enrichr [56]. We tested the association between two gene lists by calculating a hypergeometric distribution using the ‘phyper’ function implemented in R base. To this end, we used the gene list lengths, their overlap, and a conservative total number of sites (400 k for data based on HM450 bead arrays). Based on the same distribution, we calculated the random expectation and the corresponding proportion between the observed overlap and such expectation. This value is referred to as ‘representation factor’ throughout the text.

SNPs-DMPs associations in IBD and CeD

To identify methylation quantitative trait loci (mQTL), single nucleotide polymorphisms (SNPs) associated with IBD risk were obtained from a fine-mapping study of IBD with single-variant resolution [29]. Two independent GWAS were also considered in some of the analyses: (1). Jostins L et al. [32], and (2). Lange KM de et al. [30]. Genomic distances between 368 unique SNPs pooled from these three studies and IBD-associated DMPs were calculated using the R package GenomicRanges. In addition, we searched for those CpGs that apart from being differentially methylated in IBD according to our metanalysis, were previously reported to be differentially methylated in a previous work performed by our group in CeD [34]. CeD is a genetic, inflammatory condition of the duodenum in which the Human Leucocyte Antigen (HLA) region explains around 40% of the heritability, and HLA-DQ2/-DQ8 molecules are necessary for gliadin presentation and activation of the autoimmune response. Briefly, we looked for the overlap between the bimodal IBD-DMP list presented here and the coeliac DMPs found in both the epithelial and the immune cell fractions of the duodenum. We also searched for the IBD-DMPs that were previously reported to participate in blood mQTLs in cis (2 Mb, p < 1e-6), according to the largest to-date mQTL database available [33], and found the overlap between them and the SNPs associated to IBD [29]. All the overlaps were reported using in-house R scripts. We also calculated the representation factor and the associated probability of the overlaps (hypergeometric test), in order to establish whether they were significant. Click here for additional data file.
  63 in total

1.  The sva package for removing batch effects and other unwanted variation in high-throughput experiments.

Authors:  Jeffrey T Leek; W Evan Johnson; Hilary S Parker; Andrew E Jaffe; John D Storey
Journal:  Bioinformatics       Date:  2012-01-17       Impact factor: 6.937

2.  Accelerated age-related CpG island methylation in ulcerative colitis.

Authors:  J P Issa; N Ahuja; M Toyota; M P Bronner; T A Brentnall
Journal:  Cancer Res       Date:  2001-05-01       Impact factor: 12.701

Review 3.  Immunogenetic biomarkers in inflammatory bowel diseases: role of the IBD3 region.

Authors:  Manuel Muro; Ruth López-Hernández; Anna Mrowiec
Journal:  World J Gastroenterol       Date:  2014-11-07       Impact factor: 5.742

4.  ENmix: a novel background correction method for Illumina HumanMethylation450 BeadChip.

Authors:  Zongli Xu; Liang Niu; Leping Li; Jack A Taylor
Journal:  Nucleic Acids Res       Date:  2015-09-17       Impact factor: 16.971

5.  De novo identification of differentially methylated regions in the human genome.

Authors:  Timothy J Peters; Michael J Buckley; Aaron L Statham; Ruth Pidsley; Katherine Samaras; Reginald V Lord; Susan J Clark; Peter L Molloy
Journal:  Epigenetics Chromatin       Date:  2015-01-27       Impact factor: 4.954

6.  Fine-mapping inflammatory bowel disease loci to single-variant resolution.

Authors:  Hailiang Huang; Ming Fang; Luke Jostins; Maša Umićević Mirkov; Gabrielle Boucher; Carl A Anderson; Vibeke Andersen; Isabelle Cleynen; Adrian Cortes; François Crins; Mauro D'Amato; Valérie Deffontaine; Julia Dmitrieva; Elisa Docampo; Mahmoud Elansary; Kyle Kai-How Farh; Andre Franke; Ann-Stephan Gori; Philippe Goyette; Jonas Halfvarson; Talin Haritunians; Jo Knight; Ian C Lawrance; Charlie W Lees; Edouard Louis; Rob Mariman; Theo Meuwissen; Myriam Mni; Yukihide Momozawa; Miles Parkes; Sarah L Spain; Emilie Théâtre; Gosia Trynka; Jack Satsangi; Suzanne van Sommeren; Severine Vermeire; Ramnik J Xavier; Rinse K Weersma; Richard H Duerr; Christopher G Mathew; John D Rioux; Dermot P B McGovern; Judy H Cho; Michel Georges; Mark J Daly; Jeffrey C Barrett
Journal:  Nature       Date:  2017-06-28       Impact factor: 49.962

Review 7.  Best practices in DNA methylation: lessons from inflammatory bowel disease, psoriasis and ankylosing spondylitis.

Authors:  Jessica M Whyte; Jonathan J Ellis; Matthew A Brown; Tony J Kenna
Journal:  Arthritis Res Ther       Date:  2019-06-03       Impact factor: 5.156

8.  DNA Methylation and Transcription Patterns in Intestinal Epithelial Cells From Pediatric Patients With Inflammatory Bowel Diseases Differentiate Disease Subtypes and Associate With Outcome.

Authors:  Kate Joanne Howell; Judith Kraiczy; Komal M Nayak; Marco Gasparetto; Alexander Ross; Claire Lee; Tim N Mak; Bon-Kyoung Koo; Nitin Kumar; Trevor Lawley; Anupam Sinha; Philip Rosenstiel; Robert Heuschkel; Oliver Stegle; Matthias Zilbauer
Journal:  Gastroenterology       Date:  2017-10-12       Impact factor: 22.682

9.  Dynamic imbalance between cancer cell subpopulations induced by transforming growth factor beta (TGF-β) is associated with a DNA methylome switch.

Authors:  Marion Martin; Pierre-Benoit Ancey; Marie-Pierre Cros; Geoffroy Durand; Florence Le Calvez-Kelm; Hector Hernandez-Vargas; Zdenko Herceg
Journal:  BMC Genomics       Date:  2014-06-05       Impact factor: 3.969

Review 10.  Epigenome-wide Association Studies and the Interpretation of Disease -Omics.

Authors:  Ewan Birney; George Davey Smith; John M Greally
Journal:  PLoS Genet       Date:  2016-06-23       Impact factor: 5.917

View more
  3 in total

Review 1.  Assessing Differential Variability of High-Throughput DNA Methylation Data.

Authors:  Hachem Saddiki; Elena Colicino; Corina Lesseur
Journal:  Curr Environ Health Rep       Date:  2022-08-30

Review 2.  Epigenomic and transcriptomic analysis of chronic inflammatory diseases.

Authors:  Sabrina Ka Man Tam; Danny Chi Yeu Leung
Journal:  Genes Genomics       Date:  2021-02-27       Impact factor: 1.839

3.  Modulation of macrophage inflammatory function through selective inhibition of the epigenetic reader protein SP140.

Authors:  Nicola R Harker; David F Tough; Wouter J de Jonge; Mohammed Ghiboub; Jan Koster; Peter D Craggs; Andrew Y F Li Yim; Anthony Shillings; Sue Hutchinson; Ryan P Bingham; Kelly Gatfield; Ishtu L Hageman; Gang Yao; Heather P O'Keefe; Aaron Coffin; Amish Patel; Lisa A Sloan; Darren J Mitchell; Thomas G Hayhow; Laurent Lunven; Robert J Watson; Christopher E Blunt; Lee A Harrison; Gordon Bruton; Umesh Kumar; Natalie Hamer; John R Spaull; Danny A Zwijnenburg; Olaf Welting; Theodorus B M Hakvoort; Anje A Te Velde; Johan van Limbergen; Peter Henneman; Rab K Prinjha; Menno P J de Winther
Journal:  BMC Biol       Date:  2022-08-19       Impact factor: 7.364

  3 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.