Genetic Risk Scores Identify Genetic Aetiology of Inflammatory Bowel Disease Phenotypes.

M D Voskuil¹,², L M Spekhorst¹, K W J van der Sloot¹,³, B H Jansen¹, G Dijkstra¹, C J van der Woude⁴, F Hoentjen⁵, M J Pierik⁶, A E van der Meulen⁷, N K H de Boer⁸, M Löwenberg⁹, B Oldenburg¹⁰, E A M Festen¹,², R K Weersma¹.

Abstract

BACKGROUND AND AIMS: Inflammatory bowel disease [IBD] phenotypes are very heterogeneous between patients, and current clinical and molecular classifications do not accurately predict the course that IBD will take over time. Genetic determinants of disease phenotypes remain largely unknown but could aid drug development and allow for personalised management. We used genetic risk scores [GRS] to disentangle the genetic contributions to IBD phenotypes.
METHODS: Clinical characteristics and imputed genome-wide genetic array data of patients with IBD were obtained from two independent cohorts [cohort A, n = 1097; cohort B, n = 2156]. Genetic risk scoring [GRS] was used to assess genetic aetiology shared across traits and IBD phenotypes. Significant GRS-phenotype (false-discovery rate [FDR] corrected p <0.05) associations identified in cohort A were put forward for replication in cohort B.
RESULTS: Crohn's disease [CD] GRS were associated with fibrostenotic CD [R2 = 7.4%, FDR = 0.02] and ileocaecal resection [R2 = 4.1%, FDR = 1.6E-03], and this remained significant after correcting for previously identified clinical and genetic risk factors. Ulcerative colitis [UC] GRS [R2 = 7.1%, FDR = 0.02] and primary sclerosing cholangitis [PSC] GRS [R2 = 3.6%, FDR = 0.03] were associated with colonic CD, and these two associations were largely driven by genetic variation in MHC. We also observed pleiotropy between PSC genetic risk and smoking behaviour [R2 = 1.7%, FDR = 0.04].
CONCLUSIONS: Patients with a higher genetic burden of CD are more likely to develop fibrostenotic disease and undergo ileocaecal resection, whereas colonic CD shares genetic aetiology with PSC and UC that is largely driven by variation in MHC. These results further our understanding of specific IBD phenotypes.

Keywords: Genetics; inflammatory bowel disease; phenotypes

Year: 2021 PMID: 33152062 PMCID: PMC8218708 DOI: 10.1093/ecco-jcc/jjaa223

Source DB: PubMed Journal: J Crohns Colitis ISSN: 1873-9946 Impact factor: 9.071

1. Introduction

Inflammatory bowel disease [IBD], with ulcerative colitis [UC] and Crohn’s disease [CD] as its two major forms, is a chronic, relapsing, immune-mediated disease characterised by inflammation and ulceration of the gut mucosa. Patients with UC have continuous inflammation limited to the mucosal layer of the colon. In CD, the inflammation is discontinuous, may occur anywhere in the gastrointestinal tract, and involves all layers of the gut.[1,2] The clinical course of IBD is highly unpredictable and heterogeneous. Some patients have long periods of disease remission that do not even require therapy, but a significant proportion of patients experience frequent relapse of inflammation or progress to complicated CD disease behaviour such as fibrostenotic or penetrating disease.[1,2] This latter group often requires treatment escalation with potent immunosuppressive therapy, hospitalisation, or surgical resection. However, current clinical classifications cannot accurately predict IBD disease course.[3] Genome-wide association studies [GWAS] have identified around 240 independent genetic susceptibility loci for IBD and have implicated genes involved in autophagy, T cell response, and bacterial handling as important contributors to the development of IBD.[4,5] IBD also shares genetic aetiology with other immune-mediated diseases such as ankylosing spondylitis and coeliac disease.[6,7] The heterogeneous character of IBD suggests that different biological mechanisms lead to inflammation, and subgroups of patients may have different effector mechanisms contributing to their disease phenotypes. Identification of these patient-specific biological mechanisms could aid drug development and allow for personalised diagnostic work-up or treatment. 
Although similar patterns of disease phenotypes have been observed within families, genetic determinants of these clinical aspects of disease remain largely unknown outside their role in disease susceptibility.[8-10] Genetic risk scores [GRS] aggregate the effects of the thousands of trait-associated genetic variants discovered by GWAS. By combining the effects of many genetic variants with small effect sizes, GRS are powerful tools to identify genetic contributions to phenotypes.[11,12] GRS also have the potential to identify pleiotropic effects of genetic variants, which may aid drug discovery or drug repurposing. GRS may also help identify patients at risk for specific clinical aspects of IBD. In this study, we performed a within-cases genotype–phenotype study using two independent cohorts of patients with IBD. We constructed GRS for 13 traits, both related and unrelated to IBD, to reveal genetic determinants that contribute to IBD phenotypes. In contrast to previous genetic studies that used the Immunochip, we used novel genome-wide genetic array data with the potential to capture regions of the genome not covered by the Immunochip.[10]

2. Methods

2.1. Phenotype data

For the discovery phase of this study, patients were included from the 1000IBD cohort [cohort A]. 1000IBD consists of patients with IBD treated at the University Medical Center Groningen, for whom detailed phenotypes are prospectively collected and multi-omics profiles are generated.[13] For the replication phase of this study, we included patients from the Dutch IBD biobank cohort [cohort B], a prospective nationwide biobank of patients with IBD.[14] To ensure that both cohorts were independent, patients included in cohort A were excluded from cohort B. In both cohorts, each patient was diagnosed with IBD by his or her gastroenterologist using endoscopic data, histological data, radiological data, or a combination of these, and phenotyped according to the Montreal classification.[15] For each patient, their Montreal classification, surgical history [ileocaecal resection or colectomy], presence of extra-intestinal manifestations, primary sclerosing cholangitis [PSC] status, and smoking status were dichotomised into binary phenotypes. Only non-missing phenotype data were used, and missing data were not imputed. Cohorts A and B were compared using either the Wilcoxon rank sum or the chi square test. The ethical boards of each separate recruiting centre approved the study, and all patients included in this study gave written informed consent.

2.2. Genotype data

All patients were genotyped using the Global Screening Array [Infinium Global Screening Array, Illumina, San Diego, CA, USA; see Supplementary Methods, available as Supplementary data at ], as previously described.[13,14] In short, the Global Screening Array is a genotyping platform including over 700 000 genetic variants, which comprises a multi-ethnic genome-wide backbone combined with content derived from exome-sequencing studies and meta-analyses of several phenotype-specific consortia, including the International IBD Genetics Consortium. Extensive pre-imputation quality control was performed on the genotype data [Supplementary Methods] and, after pre-phasing with the Eagle2 algorithm, genetic data were imputed to the Haplotype Reference Consortium reference panel using the Michigan Imputation server.[16] After post-imputation quality control measurements were performed, 12 130 010 genetic variants with a minor allele frequency >0.1% remained. To limit bias from population stratification, only those patients with genetic data clustering with individuals from European ancestry were included, using the 1KG European dataset as the external reference panel.[17]

2.3. Genetic variant phenotype associations

Genetic variants in MST1, MHC, and NOD2, with known associations with age at diagnosis, CD disease location, CD disease behaviour, UC disease extent, and surgical history, were selected from the imputed genetic data[10] [Supplementary Table 1A, available as Supplementary data at ]. In total, we identified 11 out of 13 previously described variants. The remaining two genetic variants [MST1, rs35261698; MHC, rs77005575] were excluded during quality control. In cohort A, we tested for genotype–phenotype associations with CD disease location [Montreal L1 vs. L2; L3 vs. L2], CD disease behaviour [Montreal B2 vs. B1; B3 vs. B1 + B2], UC disease extent [Montreal E2 vs. E1; E3 vs. E1 + E2], and surgical history [ileocaecal resection or colectomy]. We performed logistic regression analyses in PLINK 1.9 [CoG Genomics],[18] adjusting for the covariates age and sex and for the first five principal components of the genetic data. To test for association with age at diagnosis [CD, UC, and IBD], we performed linear regression analyses in PLINK, adjusting for the same covariates as above. Since we only sought to replicate previously identified associations, genotype–phenotype associations with a p-value <0.05 were considered significant and no GWAS was performed. Effect sizes of genetic variants are described as odds ratio [OR] for logistic regression or beta [β] for linear regression.
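The contrasts listed above [for example Montreal B2 vs. B1, or B3 vs. B1 + B2] each turn a multi-level classification into a binary outcome. A minimal Python sketch of that coding step follows. The function name `dichotomise` is hypothetical, and treating the first-named group as the case group [coded 1] is an assumption for illustration; the text does not state which group was coded as the reference. Patients outside both groups, or with missing data, are left out of the contrast, in line with the statement that missing data were not imputed.

```python
def dichotomise(values, cases, controls):
    """Code each patient as 1 (case), 0 (control) or None (not in this contrast).

    values:   one Montreal category per patient, or None when missing
    cases:    set of categories coded as 1 (assumed: the first-named group)
    controls: set of categories coded as 0
    """
    coded = []
    for value in values:
        if value in cases:
            coded.append(1)
        elif value in controls:
            coded.append(0)
        else:
            # Missing data, or a category that is not part of this contrast.
            coded.append(None)
    return coded


# Toy example with made-up patients.
behaviour = ["B1", "B2", "B3", None, "B1"]

b2_vs_b1 = dichotomise(behaviour, cases={"B2"}, controls={"B1"})
b3_vs_b1_b2 = dichotomise(behaviour, cases={"B3"}, controls={"B1", "B2"})
print(b2_vs_b1)
print(b3_vs_b1_b2)
```

Patients coded `None` would simply be dropped before the regression for that contrast is fitted.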

2.4. HLA imputation

Previous IBD genetic studies have revealed that HLA alleles explain substantially more of the disease variance than is explained by index genetic variants in the MHC region.[19] We submitted phased genotypes of 6114 markers within the MHC region to the HLA*IMP:03 server, which imputed four-digit classical alleles of 11 HLA region genes for each individual[20] [Supplementary Figures 1 and 2, available as Supplementary data at ]. Only imputed HLA alleles with a posterior probability ≥99% and a frequency ≥ 5% were included.
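The two inclusion filters above can be sketched in Python as follows. This is an illustration only: the input layout [one tuple per imputed allele call], the function name `common_confident_alleles`, and the choice to compute allele frequency among the confident calls of the same gene are all assumptions, since the text does not specify how the frequency was calculated.

```python
from collections import Counter


def common_confident_alleles(calls, min_posterior=0.99, min_frequency=0.05):
    """Return {(gene, allele): frequency} for alleles passing both filters.

    calls: iterable of (patient_id, gene, allele, posterior_probability);
           this layout is hypothetical, not the HLA*IMP:03 output format.
    """
    # Filter 1: keep only calls imputed with high posterior probability.
    confident = [c for c in calls if c[3] >= min_posterior]

    # Filter 2: keep alleles that are common among the confident calls
    # of the same gene (assumed definition of "frequency").
    calls_per_gene = Counter(gene for _, gene, _, _ in confident)
    calls_per_allele = Counter((gene, allele) for _, gene, allele, _ in confident)
    return {
        key: count / calls_per_gene[key[0]]
        for key, count in calls_per_allele.items()
        if count / calls_per_gene[key[0]] >= min_frequency
    }


# Toy data (made up): 40 confident calls plus one low-confidence call.
toy_calls = (
    [("p", "HLA-DRB1", "15:01", 0.999)] * 36
    + [("p", "HLA-DRB1", "03:01", 0.995)] * 3
    + [("p", "HLA-DRB1", "07:01", 0.999)] * 1   # too rare: 1/40
    + [("p", "HLA-DRB1", "04:01", 0.90)] * 1    # posterior too low
)
print(common_confident_alleles(toy_calls))
```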

2.5. Genetic risk scores

We selected 13 published GWAS [or meta-analyses] that indexed traits related to IBD, gastrointestinal diseases, immunological disease, IBD phenotypes, and negative control phenotypes [Supplementary Table 2, available as Supplementary data at ]. We obtained summary statistics of these GWAS from publicly available repositories, or through collaboration, and these were used as ‘base’ data.[5,21-30] Using PRSice2 software,[31] GRS were calculated for each of the base datasets. GRS were calculated by computing the sum of risk alleles corresponding to a base phenotype in each patient, weighted by the effect size estimate derived from the base GWAS. Genetic variants were pruned for linkage disequilibrium [R2 >0.1 within a 500-kb window], using the 1KG European dataset as external reference panel. The optimal GRS for each phenotype in cohort A was calculated using p-value thresholds [pT] from 1.0E-08 to 0.5, in steps of 5.0E-08. The explained variance [Nagelkerke’s R2] was derived from a linear model in which the IBD phenotype [target phenotype] was regressed on each GRS, adjusting for the covariates age, sex, and diagnosis [CD vs. UC] and the first five principal components from the genetic data. In total, GRS were calculated for 13 traits—CD, UC, primary sclerosing cholangitis [PSC], rheumatoid arthritis, asthma, coeliac disease, idiopathic pulmonary fibrosis, diverticulosis, ever smoking, former smoking, CD prognosis [poor prognosis defined by the need for repeated surgery or the use of two or more immunosuppressives], bone mineral density, and serum vitamin D levels—and targeted on a total of 24 phenotypes in cohort A.
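The score definition above [a sum of risk alleles weighted by base-GWAS effect sizes, restricted to variants below a p-value threshold] can be illustrated with a short Python sketch. It is not the PRSice2 implementation: the function name `genetic_risk_score`, the inclusive `<=` comparison, and the toy numbers are assumptions, and linkage disequilibrium pruning is taken to have been done beforehand.

```python
def genetic_risk_score(dosages, betas, pvalues, p_threshold):
    """Return one weighted risk-allele sum per patient.

    dosages:     per-patient lists of risk-allele counts (0, 1 or 2) per variant
    betas:       base-GWAS effect size per variant (e.g. a log odds ratio)
    pvalues:     base-GWAS p-value per variant
    p_threshold: only variants with p-value <= p_threshold contribute
                 (the inclusive comparison is an assumption of this sketch)
    """
    keep = [j for j, p in enumerate(pvalues) if p <= p_threshold]
    return [sum(row[j] * betas[j] for j in keep) for row in dosages]


# Toy data (made up): 3 patients x 3 already-pruned variants.
dosages = [[0, 1, 2], [2, 0, 1], [1, 1, 0]]
betas = [0.5, 0.2, 0.1]
pvalues = [1e-9, 1e-4, 0.3]

# A more lenient threshold admits more variants into the score.
for p_t in (1e-8, 1e-3, 0.5):
    print(p_t, genetic_risk_score(dosages, betas, pvalues, p_t))
```

Scanning a grid of thresholds and keeping the one whose score explains the most variance in the target phenotype corresponds to the "optimal pT" described above.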

2.6. Statistical analyses

To obtain the optimal pT, and thus the best predictive GRS, multiple models were fitted on each target phenotype and genetic variants were added to the model at each new pT. Although there was a high correlation between each model [ie. only a small number of variants was added at each threshold], the significance of the best-fit GRS should be corrected for this multiple testing. To obtain the empirical p-value of each GRS–phenotype association in cohort A, 10 000 rounds of permutation were performed. GRS–phenotype associations with an empirical p-value <0.05 were considered significant. Because we tested for associations with 24 phenotypes, of which 16 were independent [groups of] phenotypes, we performed Bonferroni correction for this number of phenotypes (empirical p-value * 16 = false-discovery rate [FDR]). An FDR <0.05 after Bonferroni correction was considered significant. All significant associations were put forward for replication in the independent cohort B. Associations were considered replicated when a GRS generated in cohort A was also significantly [p-value <0.05] associated with the same phenotype in cohort B, with a consistent direction of effect. Meta-analyses were performed to assess inter-cohort heterogeneity and obtain a meta p-value for each significant GRS–phenotype association. To facilitate interpretability, all GRS were standardised. To ensure that the significant GRS–phenotype associations were based on independent sets of genetic variants, these GRS were recalculated using a larger pruning window of 1 Mb. Next, we excluded all genetic variants within the gene regions of NOD2, MHC, and MST1 from the finally selected GRS, and tested whether the GRS remained significantly associated with the phenotype. Significant GRS–phenotype associations, clinical factors, and previously identified genetic predictors [ie. NOD2, MHC, and MST1 for CD disease location and CD disease behaviour; NOD2 and MHC for ileocaecal resection; and MHC for PSC] were included in a multivariate model. Finally, we repeated the multivariate analyses including HLA alleles previously identified as genetic predictors of CD and/or UC, instead of individual genetic variants in MHC.
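The permutation and Bonferroni steps described above can be sketched in Python as follows. Several simplifications are assumed and are not taken from the paper: the squared Pearson correlation stands in for the covariate-adjusted Nagelkerke's R2, the empirical p-value uses the common (hits + 1) / (permutations + 1) estimator, and all function names are hypothetical. The key idea is that each permuted phenotype is compared against the best fit over all thresholds, so the search for the optimal pT is accounted for.

```python
import random


def r_squared(x, y):
    """Squared Pearson correlation (stand-in for the paper's Nagelkerke's R2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 0.0 if sxx == 0 or syy == 0 else sxy * sxy / (sxx * syy)


def best_fit(scores_by_threshold, phenotype):
    """Best explained variance over all p-value thresholds."""
    return max(r_squared(scores, phenotype) for scores in scores_by_threshold)


def empirical_p(scores_by_threshold, phenotype, n_perm=10_000, seed=0):
    """Share of phenotype permutations whose best fit matches or beats the observed one."""
    rng = random.Random(seed)
    observed = best_fit(scores_by_threshold, phenotype)
    shuffled = list(phenotype)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if best_fit(scores_by_threshold, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def bonferroni(p_value, n_tests=16):
    """Multiply by the number of independent phenotype groups, capped at 1."""
    return min(1.0, p_value * n_tests)


# Toy example (made-up scores at two thresholds for 8 patients).
scores = [[0.1, 0.2, 0.3, 0.4, 0.9, 1.0, 1.1, 1.2],
          [0.5, 0.1, 0.4, 0.2, 0.6, 0.9, 0.3, 0.8]]
phenotype = [0, 0, 0, 0, 1, 1, 1, 1]
p_emp = empirical_p(scores, phenotype, n_perm=1000)
print(p_emp, bonferroni(p_emp))
```

Under this estimator, 10 000 permutations cannot yield an empirical p-value below roughly 1.0E-04, and multiplying 1.0E-04 by 16 gives the 1.6E-03 reported for the strongest associations.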

2.7. Data availability

Raw data are [in part] available at [https://ega-archive.org/studies/EGAS00001002702] or upon request.

3. Results

3.1. Phenotype data

Clinical characteristics were obtained from 1097 patients from cohort A and 2156 patients from cohort B [data displayed in Table 1]. Patients in cohort B were younger than those in cohort A [p <1.0E-04]. Patients in cohort B were less often diagnosed with CD compared with cohort A, but had younger onset of CD, more ileocolonic [Montreal L3] localisation, and more inflammatory [Montreal B1] disease behaviour [all p < 1.0E-04]. Patients with UC in cohort B had more left-sided colitis compared with cohort A [p = 0.02]. Patients in cohort B had less often undergone colonic resection, but more often undergone ileocaecal resection [both p <1.0E-04]. In cohort B, there were more current smokers but fewer former smokers compared with cohort A. In cohort A, more patients had an IBD-PSC phenotype [p <1.0E-04]. Finally, extra-intestinal manifestations differed between the two cohorts [all p <0.04].

Table 1.

Phenotype distributions of discovery and replication cohorts.

Characteristic	Cohort A [n = 1097]	Cohort B [n = 2156]	p-value
Sex, no. [%]			0.35
Female	559 [57%]	1266 [59%]
Age, median [IQR], years	48 [36–60]	44 [33–56]	<1.0E-04
Type of IBD diagnosis, no. [%]			<1.0E-04
Crohn’s disease	506 [52%]	1393 [65%]
Ulcerative colitis	417 [43%]	763 [35%]^a
IBD-unclassified	48 [5%]	NA
Montreal A			0.04
A1	79 [16%]	209 [15%]
A2	320 [65%]	977 [70%]
A3	96 [19%]	207 [15%]
Montreal L			<1.0E-04
L1	179 [58%]	187 [18%]
L2	101 [33%]	368 [34%]
L3	26 [9%]	514 [48%]
L4 [upper GI involvement]	51 [17%]	112 [10%]	0.16
Montreal B			<1.0E-04
B1	238 [48%]	927 [67%]
B2	178 [36%]	270 [19%]
B3	82 [16%]	196 [14%]
Bp	156 [31%]	381 [27%]	0.14
Montreal E			0.02
E1	45 [11%]	39 [6%]
E2	135 [32%]	230 [37%]
E3	242 [57%]	361 [57%]
Primary sclerosing cholangitis	78 [7%]	42 [2%]	<1.0E-04
Surgery
Colonic resection	349/980 [36%]	408 [19%]	<1.0E-04
Ileocaecal resection	180/980 [18%]	525 [24%]	2.0E-04
Smoking status
Current	190/942 [19%]	397 [29%]	0.09
Former	516/910 [53%]	698 [36%]	<1.0E-04
Ever	575/980 [59%]	836 [43%]	<1.0E-04
Extra-intestinal manifestations
Ocular manifestations	35 [3%]	103 [5%]	0.033
Cutaneous manifestations	147 [13%]	229 [11%]	0.019
Arthropathies	304 [28%]	410 [19%]	<1.0E-04
Arthritis	42 [4%]	150 [7%]	3.4E-04
Thromboembolism	12 [1%]	81 [4%]	<1.0E-04
Osteoporosis	59 [5%]	452 [21%]	<1.0E-04

Characteristics of patients with IBD from the discovery cohort [cohort A] and replication cohort [cohort B]. Percentages were calculated from non-missing data. Montreal refers to the Montreal classification.[15]

GI, gastrointestinal; IBD, inflammatory bowel disease; IQR, inter-quartile range; NA, not available.

^a UC and IBD-unclassified were grouped in cohort B.

3.2. Genetic variant phenotype associations

We first sought to replicate known associations between genetic variants and IBD phenotypes. We replicated [p <0.05] the associations of NOD2 with age at diagnosis in CD and IBD, CD disease location, CD disease behaviour, and the need for surgery in CD. Consistent with previous data, the association of NOD2 with the need for surgery in CD was largely mediated through younger age at diagnosis and ileal disease location [Supplementary Table 3, available as Supplementary data at ]. We also replicated the associations between MHC and UC disease extent and CD disease location. Finally, we replicated the association between MST1 and age at diagnosis in CD [Supplementary Table 1B].

3.3. Genetic risk scores

We constructed a total of 24 GRS models for each base trait [GRS targeted on phenotypes]. Supplementary Table 4, available as Supplementary data at , gives the estimates from all GRS linear regression analyses. As displayed in Table 2, seven GRS models remained significantly associated with an IBD phenotype after 10 000 permutations and Bonferroni correction, and were significantly replicated with a consistent effect direction in independent cohort B. The explained variance of the significant GRS models across different pT is shown in Supplementary Figure 3, available as Supplementary data at . All seven GRS models remained significantly associated with an IBD phenotype when repeating the analyses with a larger pruning window of 1 Mb [Supplementary Table 5, available as Supplementary data at ].

Table 2.

Overview of significant GRS–phenotype associations

Genetic risk score	Phenotype	pT	Variants [n]	Discovery [cohort A] empirical p-value	Discovery [cohort A] FDR	Discovery [cohort A] R²	Replication [cohort B] p-value	Meta-analysis Z-score	Meta-analysis p-value	Direction
Crohn’s disease	Ileocaecal resection	8.0E-04	5379	1.0E-04	1.6E-03	4.1%	2.0E-05	5.8	8.2E-09	++
excluding NOD2, MST1, MHC			3836	1.0E-04	1.6E-03	4.3%
Crohn’s disease	Fibrostenotic Crohn’s	1.0E-08	219	1.0E-03	0.02	6.9%	2.2E-03	4.3	1.6E-05	++
excluding NOD2, MST1, MHC			170	1.0E-04	1.6E-03	11.0%
Ulcerative colitis	Colonic Crohn’s	1.0E-04	1334	1.0E-03	0.02	9.1%	6.0E-03	-4.1	3.8E-05	--
excluding NOD2, MST1, MHC			939	0.01	0.22
Primary sclerosing cholangitis	Colonic Crohn’s	0.01	12487	2.0E-03	0.03	3.6%	0.04	-3.5	5.5E-04	--
excluding NOD2, MST1, MHC			10913	0.09
Primary sclerosing cholangitis	IBD-PSC	1.6E-03	2365	1.0E-04	1.6E-03	7.5%	2.0E-06	6.1	8.9E-10	++
Primary sclerosing cholangitis	Smoking history	9.6E-03	9735	2.5E-03	0.04	1.7%	7.6E-03	-3.9	8.5E-05	--
Crohn’s disease prognosis	IBD-PSC	1.5E-06	8	4.0E-04	6.4E-03	5.5%	2.1E-05	-5.5	3.4E-08	--
excluding MHC			3	0.71

Associations between genetic risk scores and clinical IBD phenotypes that remained significant after Bonferroni correction in the discovery cohort and showed positive replication with consistent direction of effect in the independent replication cohort. For IBD phenotypes with previously known genetic predictors [NOD2, MHC, and MST1], analyses were repeated after excluding these genes from the genetic data. ‘Variants’ refers to the number of genetic variants included in the optimal GRS for that phenotype [most explained variance]. Empirical p-value refers to the p-value after 10 000 rounds of permutation. Meta-analyses p-value refers to meta-analyses of results from both discovery [FDR] and replication [p-value] cohorts.

IBD, inflammatory bowel disease; CD, Crohn’s disease; FDR, false-discovery rate; GRS, genetic risk score; PSC, primary sclerosing cholangitis; pT, p-value threshold for optimal GRS.

3.4. Genetic disease susceptibility predicts disease phenotypes

We reasoned that the genetic burden of CD and UC might be predictive of specific clinical phenotypes when including more genetic variants than just those that reached genome-wide significance in previous GWAS. The composite genetic risk of CD [CD GRS] was significantly associated with CD disease behaviour [B2 vs. B1; pT = 1.0E-08, FDR = 0.02, R2 = 6.9%] [Figure 1] and the risk of ileocaecal resection [pT = 8.0E-04, FDR = 1.6E-03, R2 = 4.1%], and these associations remained significant after excluding NOD2, MHC, and MST1 from the genetic data [both FDR = 1.6E-03]. Multivariate logistic regression analyses that included genetic factors previously identified as risk factors for CD disease behaviour revealed that age {p = 1.9E-06; odds ratio (OR) 1.09 (95% confidence interval [CI] 1.05–1.13)}, age at diagnosis (p = 1.8E-04; OR 0.93 [95% CI 0.90–0.97]), CD disease location (p = 1.8E-04; OR 3.47 [95% CI 1.84–6.76]), and CD GRS (p = 1.8E-03; OR 1.79 [95% CI 1.26–2.61]) were all independently associated with CD disease behaviour [Supplementary Table 6A, available as Supplementary data at ].

Figure 1.

Boxplot showing the composite genetic risk of CD by CD disease behaviour. The boxplot represents the range of the composite genetic risk of Crohn’s disease for patients with UC and CD. Patients with CD are stratified by CD disease behaviour. Montreal refers to the Montreal classification.[15] Montreal B1: non-stricturing, non-penetrating Crohn’s disease; B2: fibrostenotic Crohn’s disease; B3: penetrating Crohn’s disease. CD, Crohn’s disease; UC, ulcerative colitis.

Boxplot representing the range of the composite genetic risk of PSC for patients with UC and CD. Patients with CD are stratified by CD disease location. Montreal refers to the Montreal classification.[15] Montreal L1: ileal Crohn’s disease; L2: colonic Crohn’s disease; L3: ileocolonic Crohn’s disease. Patients with confirmed diagnosis of PSC are displayed in red. CD, Crohn’s disease; PSC, primary sclerosing cholangitis; UC, ulcerative colitis.

Multivariate regression analyses revealed that age (p = 8.9E-04; OR 1.08 [95% CI 1.03–1.13]), age at diagnosis (p = 0.01; OR 0.94 [95% CI 0.90–0.98]), CD disease location (p = 2.3E-07; OR 25.8 [95% CI 8.5–103]), CD disease behaviour (p = 7.8E-06; OR 6.47 [95% CI 2.91–15.1]), and CD GRS (p = 3.0E-03; OR 1.99 [95% CI 1.29–3.21]) were all independently associated with the risk of ileocaecal resection in patients with CD [Supplementary Table 6B]. The composite genetic risk of UC [UC GRS] was significantly associated with CD disease location [L1/L3 vs. L2; pT = 1.0E-04, FDR = 0.02, R2 = 9.1%], but this association was lost after excluding NOD2, MHC, and MST1 from the genetic data [FDR = 0.22]. Multivariate logistic regression analyses revealed that the UC GRS was independently associated with CD disease location (p = 2.0E-05; OR 0.60 [95% CI 0.48–0.76]) [Supplementary Table 6C].
Multivariate regression analyses were then repeated including the imputed HLA alleles that had previously been shown to be associated with CD or UC.[19] None of the selected HLA alleles were significantly associated with CD disease behaviour [Supplementary Table 7A, available as Supplementary data at ]. In addition to age, age at diagnosis, CD disease location, CD disease behaviour, and CD GRS, carriage of HLA-DRB1*03:01 (p = 0.01; OR 6.6 [95% CI 1.53–27.9]) and HLA-C*06:02 (p = 0.02; OR 0.21 [95% CI 0.05–0.75]) were independently associated with the risk of ileocaecal resection in patients with CD [Supplementary Table 7B]. In addition, UC GRS and carriage of HLA-DRB1*03:01 (p = 0.01; OR 0.46 [95% CI 0.27–0.81]) were both independently associated with CD disease location [Supplementary Table 7C].

3.5. Shared genetic aetiology

To further our understanding of the molecular mechanisms leading to specific disease phenotypes and to aid drug repurposing, we correlated GRS of diseases related to IBD [phenotypes] to specific IBD phenotypes. The composite genetic risk of PSC [PSC GRS] was significantly associated with IBD-PSC [pT = 1.6E-03, FDR = 1.6E-03, R2 = 7.5%]. Multivariate logistic regression analyses revealed that age (p = 2.8E-06; OR 1.06 [95% CI 1.03–1.08]), age at diagnosis (p = 4.8E-08; OR 0.93 [95% CI 0.90–0.95]), male sex (p = 0.03; OR 1.79 [95% CI 1.05–3.11]), UC (p = 1.0E-08; OR 7.34 [95% CI 3.85–15.22]), and PSC GRS (p = 8.1E-09; OR 1.87 [95% CI 1.51–2.32]) were all independently associated with the risk of IBD-PSC [Supplementary Table 6D]. Moreover, the PSC GRS was significantly associated with CD disease location [pT = 0.01, L1/L3 vs. L2; FDR = 0.03, R2 = 3.6%], but this association was lost after excluding NOD2, MHC, and MST1 from the genetic data [p = 0.09]. Indeed, multivariate logistic regression analyses showed that genetic variation in MHC and MST1 and the PSC GRS (p = 0.02; OR 0.73 [95% CI 0.56–0.94]) were all independently associated with CD disease location [Supplementary Table 6E]. Finally, the PSC GRS showed association with the smoking history of IBD patients [ever vs. never; pT = 9.6E-03, FDR = 0.04, R2 = 1.7%]. Multivariate logistic regression analyses revealed that age (p = 2.9E-03; OR 1.02 [95% CI 1.01–1.04]), age at diagnosis (p = 2.9E-05; OR 1.04 [95% CI 1.02–1.05]), CD (p = 4.2E-09; OR 2.37 [95% CI 1.78–3.18]), and PSC GRS (p = 5.5E-04; OR 0.78 [95% CI 0.67–0.90]) were all independently associated with smoking history [Supplementary Table 6F].
Several variants in MHC confer risk of PSC,[32] and recent GWAS have identified genetic variation in MHC to be associated with CD prognosis.[28] The composite genetic risk of a poor CD prognosis [CDprog GRS] was significantly associated with IBD-PSC [pT = 1.5E-06, FDR = 6.4E-03, R2 = 5.5%]. After excluding MHC from the genetic data, the association between CDprog GRS and IBD-PSC was lost [p = 0.71]. Multivariate logistic regression analyses that included variants in MHC that explain most of the association with PSC[32] revealed that age (p = 1.9E-06; OR 1.06 [95% CI 1.03–1.08]), age at diagnosis (p = 1.4E-08; OR 0.92 [95% CI 0.90–0.95]), male sex (p = 0.02; OR 1.86 [95% CI 1.09–3.20]), UC (p = 4.9E-09; OR 7.55 [95% CI 3.99–15.53]), and CDprog GRS (p = 3.0E-06; OR 0.61 [95% CI 0.50–0.75]) were all independently associated with the risk of IBD-PSC [Supplementary Table 6G].
We identified an association between the composite genetic risk of CD and CD disease location [L1 vs. L2]. However, this association failed the Bonferroni significance threshold in the discovery cohort [FDR = 0.10].

4. Discussion

In this study we used GRS to study the aggregated effect of thousands of trait-associated genetic variants on IBD phenotypes. We show that increased genetic risk of CD is associated with fibrostenotic CD. We also validate the putatively shared genetic aetiology of PSC and UC with colonic CD. Finally, our results add to the existing hypothesis of an interaction between smoking and PSC. Recent genotype–phenotype studies have explored genetic determinants of specific clinical IBD phenotypes and showed that genetic variants in the known IBD susceptibility loci NOD2, MHC, and MST1 were associated with age at onset and CD disease location.[10] In our study, we first replicated these previously identified genetic variants as predictors of clinical IBD phenotypes.[10] Using two independent cohorts of IBD patients with genome-wide genetic array data, we then showed that the composite genetic risk of CD is associated with fibrostenotic CD in patients with CD, even after excluding NOD2, MHC, and MST1. Our data suggest that the variants most strongly associated with CD also play a role in fibrostenotic disease. Moreover, the composite genetic risk of CD appears to be associated with the risk of ileocaecal resection and remains significant after correcting for CD disease location and CD disease behaviour. We further validated the association between colonic CD and the genetic risk of UC. This observation is in line with the hypothesis that there is a continuum of phenotypes ranging from ileal CD to colonic CD to UC, rather than CD and UC being two distinct diseases.[10,33] Our data suggest that genetic variation in MHC is largely driving this association. Although our multivariate analyses did not identify carriage of MHC genetic variants as an independent predictor of colonic CD, excluding MHC from the genetic data leads to loss of the association. 
Indeed, HLA alleles may explain substantially more phenotypic variance than individual genetic variants in MHC, and we identified carriage of HLA-DRB1*03:01 as an independent predictor of CD disease location. The fact that we could not reliably impute SNP rs77005575, the strongest MHC predictor of colonic CD, may explain why we did not capture additional signals in MHC. We observed pleiotropy between the genetic risk of PSC and the smoking history of patients with IBD. In our study, genetic risk of PSC was negatively correlated to the risk of smoking. Indeed, recent epidemiological studies have identified a decreased risk of PSC among smokers,[34] which might in part be explained by genetic factors. Shared genetic aetiology between diseases other than IBD and IBD phenotypes may point to a biology that could provide new therapeutic options or aid drug repurposing. A number of apparently unrelated disease processes may result in tissue fibrosis including, for example, intestinal fibrosis in CD and idiopathic pulmonary fibrosis [IPF]. We hypothesised we would find pleiotropy between the composite genetic risk of idiopathic pulmonary fibrosis [IPF GRS] and fibrostenotic CD, but could not identify this in our dataset. Moreover, we found no significant associations between the composite genetic risk of bone mineral density and serum vitamin D levels and the risk of osteoporosis in our IBD cohorts. A recent study that defined poor CD prognosis by the need for repeated surgery or the use of two or more immunosuppressives found no association between CD genetic risk and CD prognosis.[28] In contrast, four novel genetic variants distinct from those that confer risk to CD susceptibility have been identified as predictors of poor CD prognosis.[28] We could not identify an association between the aggregated effects of these novel genetic variants and IBD phenotypes such as CD disease location or CD disease behaviour. 
Genotype–phenotype studies are, however, dependent on the criteria used to define phenotypes, such as disease prognosis. In addition, differences in local medical practice and changes in medical practice over time may introduce significant bias. This may explain why we could not identify genetic contributions to the need for surgery, which we used as a proxy for poor prognosis. More objective outcome measures, e.g. the number of flares, once corrected for confounders, may improve the identification of genetic predictors of disease prognosis. Moreover, the relatively small sample size of the initial CD prognosis GWAS may limit the accuracy of this GRS.

Current clinical classifications fail to accurately predict IBD disease course.[3] We posit that GRS have the potential to uncover biological mechanisms contributing to disease phenotypes, and these mechanisms may in turn be used for drug development or improved patient stratification. Although current discriminative accuracy remains low, integrating other factors such as transcriptomic and microbial signatures may significantly improve discriminative power and might outperform current clinical classification systems in their ability to predict IBD disease course.[3] Patient-specific molecular profiles may be used to select more homogeneous groups of patients for future clinical trials. We hypothesise that therapies in development or registered for UC might be successfully repurposed for patients with colonic CD, which is biologically associated with UC. In a clinical setting, patients with a high molecular risk of fibrostenotic disease might be treated more aggressively early in their disease course, in particular in the presence of other environmental risk factors.

We fitted GRS models on target phenotypes to obtain the optimal pT and thus the best predictive GRS for each phenotype.
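The threshold-selection step described above can be illustrated with a minimal sketch. It is not the pipeline used in this study: it assumes a plain dosage matrix, omits clumping and covariates, and uses the squared Pearson correlation as a stand-in for the variance explained [R2]. The function name `grs_r2_by_threshold`, the threshold grid, and the simulated data are all hypothetical and serve only to show the logic of scoring at several pT values and keeping the best-fitting one.

```python
import numpy as np


def grs_r2_by_threshold(dosages, betas, pvals, phenotype, thresholds):
    """Compute a GRS at each p-value threshold and its R2 with the phenotype.

    dosages   : (n_samples, n_variants) risk-allele dosages (0, 1 or 2)
    betas     : (n_variants,) effect sizes from the discovery GWAS
    pvals     : (n_variants,) discovery GWAS p-values
    phenotype : (n_samples,) target phenotype
    Returns (best_pT, {pT: R2}); thresholds that select no variants are skipped.
    """
    r2_by_pt = {}
    for p_t in thresholds:
        keep = pvals <= p_t
        if not keep.any():
            continue  # no variant passes this threshold
        score = dosages[:, keep] @ betas[keep]  # weighted allele count
        if np.std(score) == 0:
            continue  # a constant score carries no information
        r = np.corrcoef(score, phenotype)[0, 1]
        r2_by_pt[p_t] = r ** 2  # simplification: squared correlation as R2
    best_pt = max(r2_by_pt, key=r2_by_pt.get)
    return best_pt, r2_by_pt


# Simulated toy data (illustrative assumption, not study data).
rng = np.random.default_rng(0)
n_samples, n_variants = 500, 200
dosages = rng.integers(0, 3, size=(n_samples, n_variants)).astype(float)
true_effects = np.zeros(n_variants)
true_effects[:10] = 0.5  # ten variants with a real effect
betas = true_effects + rng.normal(0, 0.05, n_variants)
pvals = np.where(true_effects != 0, 1e-6, rng.uniform(0.01, 1, n_variants))
phenotype = dosages @ true_effects + rng.normal(0, 1, n_samples)

thresholds = [5e-8, 1e-5, 1e-3, 0.05, 0.5, 1.0]
best_pt, r2_by_pt = grs_r2_by_threshold(
    dosages, betas, pvals, phenotype, thresholds
)
for p_t, r2 in r2_by_pt.items():
    print(f"pT={p_t:g}  R2={r2:.3f}")
print("best pT:", best_pt)
```

In this sketch the best pT is simply the threshold with the highest in-sample R2. A real analysis would additionally account for linkage disequilibrium, adjust for covariates, and correct for testing multiple thresholds.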
GRS models with relatively high optimal pT may suggest [pleiotropic] biological signals of [sets of] genetic variants that fail to reach genome-wide significance. Large cohort studies of patients with multilayered molecular data are needed to explore clinically relevant molecular risk cut-off values.

The use of two independent, well-characterised clinical cohorts of IBD patients, both genotyped using the same methods, strengthens the results of this study. However, the relatively small sizes of our cohorts and the fact that we calculated GRS for only 13 traits may have precluded comprehensive identification of genetic contributions to IBD phenotypes.

In conclusion, we show that GRS can identify genetic contributions to the clinical heterogeneity of IBD. Molecular phenotyping of well-characterised cohorts of IBD patients, including genetic, microbial, and environmental factors, holds promise to further our understanding of the heterogeneous character of IBD and to allow clinical trials to study personalised disease management strategies.

Funding

FH has received research grants from Dr Falk, Janssen-Cilag, Abbvie, and Takeda. RKW is supported by a Diagnostics Grant from the Dutch Digestive Foundation [D16-14]. NKHB has received unrestricted research grants from Dr Falk, TEVA Pharma BV, MLDS, and Takeda. EAMF is supported by an MLDS Career Development grant [CDG 14-04]. RKW has received unrestricted research grants from Takeda, Tramedico, and Ferring. EAMF has received an unrestricted research grant from Takeda.

Conflict of Interest

FH has served on advisory boards or as speaker for Abbvie, Janssen-Cilag, MSD, Takeda, Celltrion, Teva, Sandoz, and Dr Falk, and has received consulting fees from Celgene. NKHB has served as a speaker for AbbVie and MSD and as a consultant and principal investigator for TEVA Pharma BV and Takeda. All other authors report no conflicts of interest.

Acknowledgements

The authors thank Kate McIntyre, Scientific Editor in the Department of Genetics, University Medical Center Groningen, for editing and formatting this manuscript.

Author Contributions

MDV, LMS, and KWJS had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis. Concept and design: MDV, EAMF, RKW. Acquisition, analysis, or interpretation of data: MDV, LMS, KWJS, BHJ, GD, CJW, FH, MJP, AEM, NKHB, ML, BO, EAMF, RKW. Drafting of the manuscript: MDV, EAMF, RKW. Critical revision of the manuscript: GD, CJW, FH, MJP, AEM, NKHB, ML, BO.

33 in total

1. Genome-wide association study meta-analysis identifies seven new rheumatoid arthritis risk loci.

Authors: Eli A Stahl; Soumya Raychaudhuri; Elaine F Remmers; Gang Xie; Stephen Eyre; Brian P Thomson; Yonghong Li; Fina A S Kurreeman; Alexandra Zhernakova; Anne Hinks; Candace Guiducci; Robert Chen; Lars Alfredsson; Christopher I Amos; Kristin G Ardlie; Anne Barton; John Bowes; Elisabeth Brouwer; Noel P Burtt; Joseph J Catanese; Jonathan Coblyn; Marieke J H Coenen; Karen H Costenbader; Lindsey A Criswell; J Bart A Crusius; Jing Cui; Paul I W de Bakker; Philip L De Jager; Bo Ding; Paul Emery; Edward Flynn; Pille Harrison; Lynne J Hocking; Tom W J Huizinga; Daniel L Kastner; Xiayi Ke; Annette T Lee; Xiangdong Liu; Paul Martin; Ann W Morgan; Leonid Padyukov; Marcel D Posthumus; Timothy R D J Radstake; David M Reid; Mark Seielstad; Michael F Seldin; Nancy A Shadick; Sophia Steer; Paul P Tak; Wendy Thomson; Annette H M van der Helm-van Mil; Irene E van der Horst-Bruinsma; C Ellen van der Schoot; Piet L C M van Riel; Michael E Weinblatt; Anthony G Wilson; Gert Jan Wolbink; B Paul Wordsworth; Cisca Wijmenga; Elizabeth W Karlson; Rene E M Toes; Niek de Vries; Ann B Begovich; Jane Worthington; Katherine A Siminovitch; Peter K Gregersen; Lars Klareskog; Robert M Plenge
Journal: Nat Genet Date: 2010-05-09 Impact factor: 38.330

2. Toward an integrated clinical, molecular and serological classification of inflammatory bowel disease: report of a Working Party of the 2005 Montreal World Congress of Gastroenterology.

Authors: Mark S Silverberg; Jack Satsangi; Tariq Ahmad; Ian D R Arnott; Charles N Bernstein; Steven R Brant; Renzo Caprilli; Jean-Frédéric Colombel; Christoph Gasche; Karel Geboes; Derek P Jewell; Amir Karban; Edward V Loftus; A Salvador Peña; Robert H Riddell; David B Sachar; Stefan Schreiber; A Hillary Steinhart; Stephan R Targan; Severine Vermeire; B F Warren
Journal: Can J Gastroenterol Date: 2005-09 Impact factor: 3.522

3. Clinical characteristics of Crohn's disease in 72 families.

Authors: J F Colombel; B Grandbastien; C Gower-Rousseau; S Plegat; J P Evrard; J L Dupas; J P Gendre; R Modigliani; J Bélaïche; J Hostein; J P Hugot; H van Kruiningen; A Cortot
Journal: Gastroenterology Date: 1996-09 Impact factor: 22.682

Review 4. Crohn's disease.

Authors: Joana Torres; Saurabh Mehandru; Jean-Frédéric Colombel; Laurent Peyrin-Biroulet
Journal: Lancet Date: 2016-12-01 Impact factor: 79.321

5. A meta-analysis of genome-wide association scans identifies IL18RAP, PTPN2, TAGAP, and PUS10 as shared risk loci for Crohn's disease and celiac disease.

Authors: Eleonora A M Festen; Philippe Goyette; Todd Green; Gabrielle Boucher; Claudine Beauchamp; Gosia Trynka; Patrick C Dubois; Caroline Lagacé; Pieter C F Stokkers; Daan W Hommes; Donatella Barisani; Orazio Palmieri; Vito Annese; David A van Heel; Rinse K Weersma; Mark J Daly; Cisca Wijmenga; John D Rioux
Journal: PLoS Genet Date: 2011-01-27 Impact factor: 5.917

6. The 1000IBD project: multi-omics data of 1000 inflammatory bowel disease patients; data release 1.

Authors: Floris Imhann; K J Van der Velde; R Barbieri; R Alberts; M D Voskuil; A Vich Vila; V Collij; L M Spekhorst; K W J Van der Sloot; V Peters; H M Van Dullemen; M C Visschedijk; E A M Festen; M A Swertz; G Dijkstra; R K Weersma
Journal: BMC Gastroenterol Date: 2019-01-08 Impact factor: 3.067

7. High-density mapping of the MHC identifies a shared role for HLA-DRB1*01:03 in inflammatory bowel diseases and heterozygous advantage in ulcerative colitis.

Authors: Philippe Goyette; Gabrielle Boucher; Dermot Mallon; Eva Ellinghaus; Luke Jostins; Hailiang Huang; Stephan Ripke; Elena S Gusareva; Vito Annese; Stephen L Hauser; Jorge R Oksenberg; Ingo Thomsen; Stephen Leslie; Mark J Daly; Kristel Van Steen; Richard H Duerr; Jeffrey C Barrett; Dermot P B McGovern; L Philip Schumm; James A Traherne; Mary N Carrington; Vasilis Kosmoliaptsis; Tom H Karlsen; Andre Franke; John D Rioux
Journal: Nat Genet Date: 2015-01-05 Impact factor: 41.307

8. Reference-based phasing using the Haplotype Reference Consortium panel.

Authors: Po-Ru Loh; Petr Danecek; Pier Francesco Palamara; Christian Fuchsberger; Yakir A Reshef; Hilary K Finucane; Sebastian Schoenherr; Lukas Forer; Shane McCarthy; Goncalo R Abecasis; Richard Durbin; Alkes L Price
Journal: Nat Genet Date: 2016-10-03 Impact factor: 38.330

9. Genome-wide association study in 79,366 European-ancestry individuals informs the genetic architecture of 25-hydroxyvitamin D levels.

Authors: Xia Jiang; Paul F O'Reilly; Hugues Aschard; Yi-Hsiang Hsu; J Brent Richards; Josée Dupuis; Erik Ingelsson; David Karasik; Stefan Pilz; Diane Berry; Bryan Kestenbaum; Jusheng Zheng; Jianan Luan; Eleni Sofianopoulou; Elizabeth A Streeten; Demetrius Albanes; Pamela L Lutsey; Lu Yao; Weihong Tang; Michael J Econs; Henri Wallaschofski; Henry Völzke; Ang Zhou; Chris Power; Mark I McCarthy; Erin D Michos; Eric Boerwinkle; Stephanie J Weinstein; Neal D Freedman; Wen-Yi Huang; Natasja M Van Schoor; Nathalie van der Velde; Lisette C P G M de Groot; Anke Enneman; L Adrienne Cupples; Sarah L Booth; Ramachandran S Vasan; Ching-Ti Liu; Yanhua Zhou; Samuli Ripatti; Claes Ohlsson; Liesbeth Vandenput; Mattias Lorentzon; Johan G Eriksson; M Kyla Shea; Denise K Houston; Stephen B Kritchevsky; Yongmei Liu; Kurt K Lohman; Luigi Ferrucci; Munro Peacock; Christian Gieger; Marian Beekman; Eline Slagboom; Joris Deelen; Diana van Heemst; Marcus E Kleber; Winfried März; Ian H de Boer; Alexis C Wood; Jerome I Rotter; Stephen S Rich; Cassianne Robinson-Cohen; Martin den Heijer; Marjo-Riitta Jarvelin; Alana Cavadino; Peter K Joshi; James F Wilson; Caroline Hayward; Lars Lind; Karl Michaëlsson; Stella Trompet; M Carola Zillikens; Andre G Uitterlinden; Fernando Rivadeneira; Linda Broer; Lina Zgaga; Harry Campbell; Evropi Theodoratou; Susan M Farrington; Maria Timofeeva; Malcolm G Dunlop; Ana M Valdes; Emmi Tikkanen; Terho Lehtimäki; Leo-Pekka Lyytikäinen; Mika Kähönen; Olli T Raitakari; Vera Mikkilä; M Arfan Ikram; Naveed Sattar; J Wouter Jukema; Nicholas J Wareham; Claudia Langenberg; Nita G Forouhi; Thomas E Gundersen; Kay-Tee Khaw; Adam S Butterworth; John Danesh; Timothy Spector; Thomas J Wang; Elina Hyppönen; Peter Kraft; Douglas P Kiel
Journal: Nat Commun Date: 2018-01-17 Impact factor: 14.919

10. Identification of 153 new loci associated with heel bone mineral density and functional involvement of GPC6 in osteoporosis.

Authors: John P Kemp; John A Morris; Carolina Medina-Gomez; Vincenzo Forgetta; Nicole M Warrington; Scott E Youlten; Jie Zheng; Celia L Gregson; Elin Grundberg; Katerina Trajanoska; John G Logan; Andrea S Pollard; Penny C Sparkes; Elena J Ghirardello; Rebecca Allen; Victoria D Leitch; Natalie C Butterfield; Davide Komla-Ebri; Anne-Tounsia Adoum; Katharine F Curry; Jacqueline K White; Fiona Kussy; Keelin M Greenlaw; Changjiang Xu; Nicholas C Harvey; Cyrus Cooper; David J Adams; Celia M T Greenwood; Matthew T Maurano; Stephen Kaptoge; Fernando Rivadeneira; Jonathan H Tobias; Peter I Croucher; Cheryl L Ackert-Bicknell; J H Duncan Bassett; Graham R Williams; J Brent Richards; David M Evans
Journal: Nat Genet Date: 2017-09-04 Impact factor: 38.330
