Isabelle Cleynen, Gabrielle Boucher, Luke Jostins, L Philip Schumm, Sebastian Zeissig, Tariq Ahmad, Vibeke Andersen, Jane M Andrews, Vito Annese, Stephan Brand, Steven R Brant, Judy H Cho, Mark J Daly, Marla Dubinsky, Richard H Duerr, Lynnette R Ferguson, Andre Franke, Richard B Gearry, Philippe Goyette, Hakon Hakonarson, Jonas Halfvarson, Johannes R Hov, Hailang Huang, Nicholas A Kennedy, Limas Kupcinskas, Ian C Lawrance, James C Lee, Jack Satsangi, Stephan Schreiber, Emilie Théâtre, Andrea E van der Meulen-de Jong, Rinse K Weersma, David C Wilson, Miles Parkes, Severine Vermeire, John D Rioux, John Mansfield, Mark S Silverberg, Graham Radford-Smith, Dermot P B McGovern, Jeffrey C Barrett, Charlie W Lees.
Abstract
BACKGROUND: Crohn's disease and ulcerative colitis are the two major forms of inflammatory bowel disease; treatment strategies have historically been determined by this binary categorisation. Genetic studies have identified 163 susceptibility loci for inflammatory bowel disease, mostly shared between Crohn's disease and ulcerative colitis. We undertook the largest genotype association study to date in widely used clinical subphenotypes of inflammatory bowel disease, with the goal of further understanding the biological relations between diseases.
Year: 2015 PMID: 26490195 PMCID: PMC4714968 DOI: 10.1016/S0140-6736(15)00465-1
Source DB: PubMed Journal: Lancet ISSN: 0140-6736 Impact factor: 202.731
Phenotype distribution of primary cohort
| | Crohn's disease | Ulcerative colitis | Inflammatory bowel disease | |
|---|---|---|---|---|
| Sex | ||||
| Male | 7227 (44%) | 6339 (51%) | 13 738 (47%) | |
| Female | 9257 (56%) | 6027 (49%) | 15 448 (53%) | |
| Missing | 418 (3%) | 231 (2%) | 652 (2%) | |
| Age at diagnosis (years) | ||||
| Median (quartiles) | 25 (19–36) | 31 (22–24) | 28 (20–40) | |
| <17 (A1) | 2568 (18%) | 1233 (11%) | 3903 (15%) | |
| 17–40 (A2) | 9166 (64%) | 6594 (58%) | 15 854 (61%) | |
| >40 (A3) | 2626 (18%) | 3469 (31%) | 6141 (24%) | |
| Missing | 2542 (15%) | 1301 (10%) | 3940 (13%) | |
| Family history | ||||
| Yes | 3471 (27%) | 2232 (21%) | 5778 (24%) | |
| No | 9575 (73%) | 8260 (79%) | 18 005 (76%) | |
| Missing | 3856 (23%) | 2105 (17%) | 6055 (20%) | |
| Smoking status | 21 718 | |||
| Smoker | 3319 (28%) | 1162 (12%) | 4512 (21%) | |
| Ex-smoker | 1665 (14%) | 2739 (28%) | 4436 (20%) | |
| Non-smoker | 6752 (58%) | 5853 (60%) | 12 770 (59%) | |
| Missing | 5166 (31%) | 2843 (23%) | 8120 (27%) | |
| Disease location | ||||
| Ileal (L1) | 3878 (31%) | .. | .. | |
| Colonic (L2) | 2933 (24%) | .. | .. | |
| Ileocolonic (L3) | 5520 (44%) | .. | .. | |
| Other | 154 (1%) | .. | .. | |
| Upper GI (L4) | 1695 (14%) | .. | .. | |
| Missing | 2777 (18%) | .. | .. | |
| Disease extent | ||||
| Proctitis (E1) | .. | 1271 (12%) | .. | |
| Left-sided (E2) | .. | 4087 (38%) | .. | |
| Extensive (E3) | .. | 5212 (48%) | .. | |
| Other | .. | 205 (2%) | .. | |
| Missing | .. | 1822 (14%) | .. | |
| Disease behaviour | ||||
| Inflammatory (B1) | 6196 (50%) | .. | .. | |
| Stricturing (B2) | 3250 (26%) | .. | .. | |
| Penetrating (B3) | 3054 (24%) | .. | .. | |
| Missing | 2762 (18%) | .. | .. | |
| Surgery | ||||
| Yes | 7257 (52%) | 1932 (18%) | .. | |
| No | 6605 (48%) | 8575 (82%) | .. | |
| Missing | 3040 (18%) | 2090 (17%) | .. | |
GI=gastrointestinal.
Includes 255 patients with indeterminate colitis and 84 patients with missing exact diagnosis.
Excludes data obtained with patient questionnaires (2658 patients).
Surgery in ulcerative colitis refers to colectomy. Denominators for data are: 12 485 for disease location (L1, L2, L3, and other); 11 717 for upper GI (L4) over non-missing information; and 12 500 for disease behaviour over non-missing B1, B2, and B3.
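As a quick check of how the percentages in the table relate to their denominators, the sketch below recomputes the ulcerative colitis disease-extent rows. It assumes, as the footnotes suggest, that category percentages are taken over non-missing records whereas the missing percentage is taken over all patients in the column. The function name and the column total derived from the sex rows are illustrative choices, not part of the paper.

```python
def percentages(counts, denominator=None):
    """Return rounded percentages of each count over a denominator.

    If no denominator is given, the sum of the counts (that is, the
    non-missing records) is used.
    """
    total = denominator if denominator is not None else sum(counts.values())
    return {label: round(100 * n / total) for label, n in counts.items()}


# Ulcerative colitis disease extent, copied from the table above.
extent = {"E1": 1271, "E2": 4087, "E3": 5212, "Other": 205}
print(percentages(extent))

# Assumption: all ulcerative colitis patients = male + female + missing sex.
uc_total = 6339 + 6027 + 231
print(percentages({"Missing": 1822}, denominator=uc_total))
```

Under these assumptions the recomputed values agree with the 12%, 38%, 48%, 2%, and 14% shown in the disease-extent rows.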
Figure 1: Evolution of clinical subphenotypes
(A) Proportion of patients with Crohn's disease who have inflammatory (Montreal classification B1), stricturing (B2), or penetrating (B3) disease over time from diagnosis to most recent follow-up. (B) Proportion of patients with Crohn's disease who have ileal (L1), colonic (L2), or ileocolonic (L3) disease over time from diagnosis to most recent follow-up. (C) Survival plot of time from diagnosis of Crohn's disease to resectional surgery stratified by disease location. (D) Survival plot of time from diagnosis of ulcerative colitis to colectomy stratified by disease extent (extensive disease, E3; non-extensive disease, E1 and E2).
Associations between genotype and age at diagnosis achieving genome-wide significance
| SNP | MAF | IBD: p value | IBD: β (SE) | Crohn's disease: p value | Crohn's disease: β (SE) | Ulcerative colitis: p value | Ulcerative colitis: β (SE) |
|---|---|---|---|---|---|---|---|
| rs35261698 | 0·306 | 6·34 × 10−12 | −0·06 (0·01) | 3·65 × 10−07 | −0·06 (0·01) | 3·90 × 10−06 | −0·06 (0·01) |
| rs2172252 | 0·288 | 1·35 × 10−12 | −0·06 (0·01) | 2·93 × 10−08 | −0·07 (0·01) | 9·51 × 10−06 | −0·06 (0·01) |
| rs3197999 | 0·281 | 2·73 × 10−12 | −0·06 (0·01) | 2·37 × 10−08 | −0·07 (0·01) | 2·18 × 10−05 | −0·06 (0·01) |
| rs3115674 | 0·116 | 3·42 × 10−02 | −0·03 (0·01) | .. | .. | 3·35 × 10−02 | −0·04 (0·02) |
| rs4151651 | 0·034 | .. | .. | .. | .. | 1·15 × 10−02 | −0·07 (0·03) |
| rs3129891 | 0·209 | 1·15 × 10−06 | −0·05 (0·01) | .. | .. | 1·43 × 10−08 | −0·09 (0·02) |
| rs9268832 | 0·393 | 7·42 × 10−09 | −0·05 (0·01) | 4·56 × 10−07 | −0·06 (0·01) | 2·19 × 10−03 | −0·04 (0·01) |
| rs482044 | 0·401 | .. | .. | 1·51 × 10−02 | 0·03 (0·01) | .. | .. |
| rs2066844 (p.R702W) | 0·045 | 3·58 × 10−07 | −0·08 (0·02) | 1·21 × 10−07 | −0·1 (0·02) | .. | .. |
| rs2066845 (p.G908R) | 0·016 | 2·10 × 10−04 | −0·1 (0·03) | 5·50 × 10−03 | −0·09 (0·03) | 8·41 × 10−03 | −0·15 (0·06) |
| rs2066847 (p.L1007fsX) | 0·024 | 6·64 × 10−16 | −0·16 (0·02) | 2·04 × 10−16 | −0·17 (0·02) | .. | .. |
Loci are listed by single nucleotide polymorphism. Age at diagnosis assessed by linear regression analysis on normalised data for Crohn's disease and ulcerative colitis; IBD assessed by meta-analysis of Crohn's disease and ulcerative colitis data. Effect size is given in standard deviation units (standard error of effect). MAF=minor allele frequency. IBD=inflammatory bowel disease. ..=non-significant associations (nominal p>0·05).
Genome-wide significant associations.
The most significant association per locus per subphenotype, if genome-wide significant.
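The footnote states that the IBD estimates come from a meta-analysis of the Crohn's disease and ulcerative colitis results. The record does not say which meta-analysis model was used, so the following is only a minimal sketch of one common choice, a fixed-effect inverse-variance combination of per-disease β and SE values. The function name is hypothetical, and because the tabulated values are rounded, feeding them in should not be expected to reproduce the IBD column exactly.

```python
import math


def fixed_effect_meta(estimates):
    """Combine (beta, se) pairs by inverse-variance weighting.

    Assumption: a fixed-effect model; the paper's actual method may differ.
    Returns the pooled beta, its standard error, the z statistic, and a
    two-sided p value from the normal distribution.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    pooled_beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    z = pooled_beta / pooled_se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return pooled_beta, pooled_se, z, p


# Illustrative input: two studies with the same rounded effect and SE.
beta, se, z, p = fixed_effect_meta([(-0.06, 0.01), (-0.06, 0.01)])
print(beta, se, z, p)
```

With equal standard errors the pooled effect is the simple average and the pooled standard error shrinks by a factor of √2, which is why a combined analysis can reach genome-wide significance when neither disease does alone.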
Associations between genotype and disease location, behaviour, extent, surgery, and colectomy achieving genome-wide significance
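| SNP | MAF | Disease location: p value | Disease location: OR (95% CI), ileocolonic | Disease location: OR (95% CI), ileal | Disease behaviour: p value | Disease behaviour: OR (95% CI) | Surgery: p value | Surgery: HR (95% CI) | Disease extent: p value | Disease extent: OR (95% CI) | Colectomy: p value | Colectomy: HR (95% CI) |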
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| rs2172252 | 0·288 | 3·10 × 10−02 | 1·07 (1·00–1·13) | 1·10 (1·02–1·19) | .. | .. | .. | .. | .. | .. | .. | .. |
| rs3197999 | 0·281 | 2·10 × 10−02 | 1·08 (1·02–1·15) | 1·10 (1·02–1·19) | .. | .. | .. | .. | .. | .. | .. | .. |
| rs3115674 | 0·116 | 3·00 × 10−03 | 0·88 (0·80–0·97) | 0·81 (0·72–0·91) | 4·00 × 10−03 | 0·89 (0·82–0·96) | .. | .. | 5·22 × 10−15 | 1·43 (1·30–1·58) | .. | .. |
| rs4151651 | 0·034 | 2·42 × 10−10 | 0·71 (0·62–0·81) | 0·58 (0·50–0·68) | 2·50 × 10−02 | 0·87 (0·77–0·98) | .. | .. | .. | .. | 6·05 × 10−12 | 1·72 (1·47–2·00) |
| rs6930777 | 0·112 | 8·13 × 10−23 | 0·68 (0·62–0·75) | 0·58 (0·52–0·65) | 2·00 × 10−03 | 0·89 (0·82–0·96) | .. | .. | .. | .. | 2·49 × 10−07 | 1·36 (1·21–1·52) |
| rs3129891 | 0·209 | .. | .. | .. | 8·00 × 10−03 | 0·92 (0·87–0·98) | .. | .. | 3·22 × 10−10 | 1·24 (1·17–1·32) | .. | .. |
| rs9268832 | 0·393 | 1·40 × 10−02 | 1·03 (0·97–1·09) | 0·94 (0·87–1·02) | 4·00 × 10−03 | 0·93 (0·89–0·97) | 1·45 × 10−02 | 0·95 (0·90–0·99) | 6·59 × 10−05 | 1·12 (1·06–1·19) | .. | .. |
| rs482044 | 0·401 | 2·38 × 10−09 | 1·15 (1·08–1·22) | 1·25 (1·16–1·35) | 8·46 × 10−06 | 1·11 (1·07–1·15) | 1·84 × 10−02 | 1·05 (1·01–1·10) | 1·57 × 10−05 | 0·88 (0·83–0·93) | 2·19 × 10−07 | 0·79 (0·73–0·87) |
| rs77005575 | 0·439 | 1·00 × 10−15 | 1·23 (1·16–1·30) | 1·33 (1·23–1·44) | 2·82 × 10−10 | 1·16 (1·12–1·21) | 9·20 × 10−04 | 1·08 (1·03–1·12) | 3·24 × 10−03 | 0·92 (0·87–0·98) | 1·55 × 10−04 | 0·85 (0·78–0·93) |
| rs2066844 (p.R702W) | 0·045 | 2·50 × 10−26 | 1·61 (1·43–1·81) | 1·94 (1·72–2·18) | 1·76 × 10−06 | 1·21 (1·12–1·31) | 4·67 × 10−03 | 1·10 (1·03–1·18) | .. | .. | .. | .. |
| rs2066845 (p.G908R) | 0·016 | 2·77 × 10−09 | 1·59 (1·31–1·93) | 1·82 (1·50–2·21) | 7·17 × 10−05 | 1·28 (1·14–1·44) | 2·87 × 10−03 | 1·17 (1·06–1·30) | .. | .. | .. | .. |
| rs2066847 (p.L1007fsX) | 0·024 | 1·01 × 10−35 | 1·89 (1·62–2·21) | 2·50 (2·14–2·92) | 5·73 × 10−10 | 1·31 (1·21–1·42) | 2·04 × 10−13 | 1·31 (1·22–1·40) | .. | .. | 3·55 × 10−02 | 1·32 (1·02–1·70) |
Loci are listed by single nucleotide polymorphism. Disease location assessed by multinomial logistic regression analysis; disease behaviour by ordinal logistic regression analysis (effect size is odds ratio [95% CI] for B2 versus B1, which is also equivalent to B3 vs B2+B1); and disease extent by binomial logistic analysis. Surgery and colectomy assessed by survival analysis under a Weibull distribution. MAF=minor allele frequency. OR=odds ratio. HR=hazard ratio. ..=non-significant associations (nominal p>0·05).
Genome-wide significant associations.
The most significant association per locus per subphenotype, if genome-wide significant.
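When reading odds ratios reported with 95% CIs, it can help to move back to the log scale on which the regression models operate. The sketch below, which assumes a symmetric Wald-type interval on the log scale (the record does not state how the intervals were built), recovers an approximate standard error and z statistic from a tabulated OR and CI. The function name is hypothetical. The tabulated location p values come from a multinomial model across categories, so a single-category z statistic is not expected to reproduce them.

```python
import math


def log_or_summary(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate log-scale summary from an OR and its 95% CI.

    Assumption: the CI is exp(log OR +/- z_crit * SE), i.e. a Wald interval.
    Returns log OR, SE, z statistic, and a two-sided normal p value.
    """
    log_or = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    p = math.erfc(abs(z) / math.sqrt(2))
    return log_or, se, z, p


# rs2066847 (p.L1007fsX), ileal disease: OR 2.50 (2.14-2.92) from the table.
log_or, se, z, p = log_or_summary(2.50, 2.14, 2.92)
print(round(log_or, 3), round(se, 3), round(z, 1))
```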
Figure 2: Effect of single nucleotide polymorphisms, HLA alleles, and polygenic risk scores on phenotypes of inflammatory bowel disease
(A) Effect sizes for genotype–phenotype associations for risk of Crohn's disease and ulcerative colitis (odds ratio relative to controls), Crohn's disease location (odds ratio of ileal vs colonic disease), Crohn's disease behaviour (proportional odds ratio), disease extent of ulcerative colitis (odds ratio of extensive vs non-extensive disease), and age at diagnosis (linear coefficients) for MST1, MHC, and NOD2 variants. All effect sizes are per allele, and are adjusted for associations with correlated phenotypes by including them as additional predictors in the regression model, along with principal components to control for stratification. See appendix A for more details on these regression models. Genome-wide significant associations are depicted by filled circles, and error bars depict 95% CIs. (B) Effect sizes of genetic risk scores for disease location, disease behaviour, and age at diagnosis including all 163 susceptibility loci. Effect sizes are calculated by linear regression of the risk score against the phenotype, adjusted for the effect of the other phenotypes and for principal components, and error bars depict 95% CIs. Filled circles represent effects that are significant after correcting for 15 phenotype-score combinations (p<0·003). Effect sizes are measured on scales standardised to unit variance (and thus represent the number of standard deviations that the mean phenotype increases by per standard deviation increase in the risk score).
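The significance threshold quoted for panel B is consistent with a Bonferroni correction. As a minimal sketch, assuming a family-wise α of 0·05 (the caption gives only the number of combinations and the resulting threshold), the per-test cut-off can be computed as follows; the function name is hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Assumption: family-wise alpha of 0.05 across 15 phenotype-score combinations.
threshold = bonferroni_threshold(0.05, 15)
print(round(threshold, 4))
```

The result, roughly 0·0033, matches the p<0·003 threshold reported in the caption once rounded down.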
Figure 3: Violin plot showing the genetic substructure of inflammatory bowel disease location
The violin represents the range of the log CD versus UC score for the indicated subphenotype (calculated with the R package “vioplot”), with dots representing the mean of that group and error bars the 95% CIs. Although the effects are small compared with the variation within groups, the mean effects can still be measured accurately (right side of the figure). The Crohn's disease versus ulcerative colitis (CD vs UC) risk score places colonic Crohn's disease between ileal Crohn's disease and ulcerative colitis. The intermediate phenotypes also fall in between: ileocolonic Crohn's disease lies between ileal and colonic Crohn's disease, and inflammatory bowel disease unclassified (IBD-U) lies between ulcerative colitis and colonic Crohn's disease.
Variance explained by demographic and genetic factors for disease location in adult onset of Crohn's disease
| Factor | Coefficient | SE | p value | Variance explained (R2) |
|---|---|---|---|---|
| Ever smoker | −0·041 | 0·108 | 7·00 × 10−1 | 0·01% |
| Smoker at diagnosis | 0·473 | 0·117 | 5·34 × 10−5 | 1·53% |
| Age at diagnosis | −0·033 | 0·005 | 1·50 × 10−9 | 2·14% |
| Year of birth | −0·010 | 0·005 | 6·94 × 10−2 | 0·19% |
| rs6930777 (MHC) | −0·302 | 0·082 | 2·19 × 10−4 | 0·54% |
| rs77005575 (MHC) | 0·190 | 0·055 | 5·21 × 10−4 | 0·55% |
| NOD2 | 0·532 | 0·070 | 2·60 × 10−14 | 3·23% |
| Genetic risk score | 0·165 | 0·038 | 1·61 × 10−5 | 1·01% |
| Genetic parameters | .. | .. | .. | 5·5% |
| Genetics and smoking | .. | .. | .. | 6·8% |
| All parameters | .. | .. | .. | 8·03% |
Number of risk alleles at the three NOD2 hits.
Crohn's disease versus ulcerative colitis (CD vs UC) genetic risk score (without NOD2 and MHC).
The total R2 for these parameters, excluding the principal component used to account for population stratification. In view of the correlation structure, this is not expected to be equivalent to the sum of R2 obtained for each parameter.
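The point about correlation structure can be illustrated with the textbook two-predictor case. The sketch below is not the paper's model; it uses the standard formula for the joint R2 of two standardised predictors, given each predictor's correlation with the outcome and the correlation between the predictors. All names and input values are illustrative.

```python
def joint_r2(r_y1, r_y2, r_12):
    """Joint R2 of a linear model with two standardised predictors.

    r_y1, r_y2: correlation of each predictor with the outcome.
    r_12: correlation between the two predictors (must satisfy |r_12| < 1).
    """
    if abs(r_12) >= 1:
        raise ValueError("predictors must not be perfectly correlated")
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_12 * r_y1 * r_y2) / (1 - r_12 ** 2)


# Each predictor alone explains 0.3 ** 2 = 9% of the variance.
print(joint_r2(0.3, 0.3, 0.0))  # uncorrelated predictors: the R2 values add up
print(joint_r2(0.3, 0.3, 0.5))  # positively correlated: joint R2 is below the sum
```

Only when the predictors are uncorrelated does the joint R2 equal the sum of the individual values, which is why the totals in the table need not match the column sum.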
Variance explained by demographic and genetic factors for disease extent in adult onset of ulcerative colitis
| Factor | Coefficient | SE | p value | Variance explained (R2) |
|---|---|---|---|---|
| Ever smoking | 0·1268 | 0·0856 | 1·38 × 10−1 | 0·12% |
| Current smoking | 0·1229 | 0·1179 | 2·97 × 10−1 | 0·06% |
| Age at diagnosis | −0·0234 | 0·0055 | 2·41 × 10−5 | 1·10% |
| Year of birth | 0·0008 | 0·0055 | 8·79 × 10−1 | 0·00% |
| rs3115674 | 0·3795 | 0·0832 | 5·08 × 10−6 | 0·80% |
| Genetic risk score | 0·0784 | 0·0502 | 1·18 × 10−1 | 0·09% |
| Genetic parameters | .. | .. | .. | 0·9% |
| Genetics and smoking | .. | .. | .. | 1·1% |
| All parameters | .. | .. | .. | 2·39% |
Crohn's disease versus ulcerative colitis (CD vs UC) genetic risk score (without NOD2 and MHC).
The total R2 for these parameters, excluding the principal component used to account for population stratification. In view of the correlation structure, this is not expected to be equivalent to the sum of R2 obtained for each parameter.
Figure 4: Histograms of Crohn's disease versus ulcerative colitis (CD vs UC) genetic risk score in patients with inflammatory bowel disease
Risk scores were created from the 163 known inflammatory bowel disease risk loci, with per-locus contributions estimated to maximally distinguish all Crohn's disease from ulcerative colitis. Distributions of ulcerative colitis samples are shown in blue, ileal Crohn's disease samples in green, and colonic Crohn's disease with hatched lines (middle area in dark green shows overlap of blue and green distributions). The overlap of all three distributions shows the shared genetic aetiology of inflammatory bowel disease, and the intermediate position of colonic Crohn's disease between ulcerative colitis and ileal Crohn's disease shows that it is genetically distinct from the others. Vertical dashed lines show boundaries for outlier analysis: ulcerative colitis cases above 2 were selected as being likely Crohn's disease and Crohn's disease cases below −2 as likely ulcerative colitis.
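The scoring and outlier rule described in the caption can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the score is assumed to be an additive weighted sum of risk-allele counts, the weights shown are made up, and the score is assumed to be already on the scale to which the ±2 boundaries apply. Function names are hypothetical.

```python
def risk_score(allele_counts, weights):
    """Additive score: sum of per-locus weight times risk-allele count (0, 1, or 2)."""
    if len(allele_counts) != len(weights):
        raise ValueError("one weight is needed per locus")
    return sum(w * c for w, c in zip(weights, allele_counts))


def reclassification_flag(diagnosis, score, upper=2.0, lower=-2.0):
    """Apply the caption's outlier boundaries to a CD vs UC score.

    Ulcerative colitis cases above `upper` are flagged as likely Crohn's
    disease; Crohn's disease cases below `lower` as likely ulcerative colitis.
    """
    if diagnosis == "UC" and score > upper:
        return "likely CD"
    if diagnosis == "CD" and score < lower:
        return "likely UC"
    return None


# Made-up weights for three loci (positive values favour CD, negative favour UC).
weights = [0.5, -0.2, 0.1]
print(risk_score([0, 1, 2], weights))
print(reclassification_flag("UC", 2.3))
print(reclassification_flag("CD", -2.5))
```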