
Amino acid position 11 of HLA-DRβ1 is a major determinant of chromosome 6p association with ulcerative colitis.

J-P Achkar, L Klei, P I W de Bakker, G Bellone, N Rebert, R Scott, Y Lu, M Regueiro, A Brzezinski, M I Kamboh, C Fiocchi, B Devlin, M Trucco, S Ringquist, K Roeder, R H Duerr.

Abstract

The major histocompatibility complex (MHC) on chromosome 6p is an established risk locus for ulcerative colitis (UC) and Crohn's disease (CD). We aimed to better define MHC association signals in UC and CD by combining data from dense single-nucleotide polymorphism (SNP) genotyping and from imputation of classical human leukocyte antigen (HLA) types, their constituent SNPs and corresponding amino acids in 562 UC, 611 CD and 1428 control subjects. Univariate and multivariate association analyses were performed, controlling for ancestry. In univariate analyses, absence of the rs9269955 C allele was strongly associated with risk for UC (P = 2.67 × 10−13). rs9269955 is a SNP in the codon for amino acid position 11 of HLA-DRβ1, located in the P6 pocket of the HLA-DR antigen binding cleft. This amino acid position was also the most significantly UC-associated amino acid position in omnibus tests (P = 2.68 × 10−13). Multivariate modeling identified rs9269955-C and 13 other variants as the set that best predicted UC vs control status. In contrast, there was only suggestive association evidence between the MHC and CD. Taken together, these data demonstrate that variation at HLA-DRβ1 amino acid 11, in the P6 pocket of the HLA-DR complex antigen binding cleft, is a major determinant of chromosome 6p association with UC.

Year:  2011        PMID: 22170232      PMCID: PMC3341846          DOI: 10.1038/gene.2011.79

Source DB:  PubMed          Journal:  Genes Immun        ISSN: 1466-4879            Impact factor:   2.676


Introduction

The major histocompatibility complex (MHC) on chromosome 6p contains the highly polymorphic human leukocyte antigen (HLA) genes and other immunoregulatory genes.[1, 2] Genetic variants in the MHC have been associated with susceptibility for many infectious and immune-mediated diseases including the inflammatory bowel diseases (IBD), ulcerative colitis (UC) and Crohn’s disease (CD).[3, 4] Features of the MHC such as dense gene clustering with broad linkage disequilibrium, extensive polymorphism, and heterogeneity among different populations have made localization of causal variants challenging.[2] HLA polymorphisms were the focus of attention in several IBD candidate gene association studies of relatively small sample size and meta-analyses of these studies found HLA associations in UC that were mostly different from those found in CD.[3-5] Subsequently, linkage between IBD and the chromosome 6p IBD3 locus was found in genome-wide linkage scans[6-8]. Recent genome-wide association studies (GWAS) have confirmed the MHC as one of 47 UC loci and 71 CD loci with significant evidence for association (P < 5×10−8).[9, 10] The most significant association signal in a recent meta-analysis of six GWAS that included 6,687 UC cases and 19,718 controls of European ancestry was at a single nucleotide polymorphism (SNP) in the MHC class II region (rs9268853, P = 1.35×10−55).[10] In contrast, the most significant MHC association signal in a meta-analysis of six CD GWAS that included a similar combined sample size (6,333 CD cases and 15,056 controls) was less significant than the UC signal and was located in the MHC class III region near the lymphotoxin A (LTA) locus (rs1799964, P = 3.98×10−11).[9, 10] Here, we explore the MHC association signal in the discovery stage of a new UC and CD GWAS with excellent coverage (>10,000 SNPs) across the extended MHC. 
We used our MHC SNP data and an existing reference dataset to impute classical HLA allele types, their constituent SNPs, and corresponding amino acids in our UC, CD and control samples. This allowed us to evaluate if the observed SNP associations in the MHC can be explained by variation specifically in the classical HLA genes.

Results

Analysis of genotyped MHC SNPs in IBD

First, we tested 10,347 genotyped SNPs in the MHC region from 29,299 to 33,884 kb on chromosome 6 using NCBI36/hg18 coordinates for association with UC and CD with ileal involvement. Among 35 SNPs that reached genome-wide significance (P < 5 × 10−8) in the UC analysis, the most significant SNP was rs2647025 (OR=1.95 [1.62–2.35, 95% confidence interval (CI)] for the G allele; P = 1.94×10−12), located in the promoter region of HLA-DQB1 (Figure 1A). This SNP is correlated with rs9268853 (r2 = 0.63 in HapMap 3-CEU[11]), which was the MHC region SNP with the most significant association in a recent UC GWAS meta-analysis[10], and it is also correlated with rs2395185 (r2 = 0.60 in our dataset), which was the MHC region SNP with the most significant association in the NIDDK IBD Genetics Consortium UC GWAS[12], both at distances of > 200 kb.
Figure 1

Major histocompatibility complex regional association plots for ulcerative colitis. (A) Association results for genotyped SNPs from the Illumina Omni1-quad BeadChip. The intensity of the red shading indicates the strength of the pairwise r2 correlation to the most associated SNP, rs2647025. (B) Association results for both genotyped (◇ symbols) and imputed (■ symbols) nucleotides focused in on the region of peak association in panel A. Horizontal lines represent the classical HLA alleles in this region. The intensity of the red shading indicates the strength of the pairwise r2 correlation to the most associated SNP marker, rs9269955-C. (C) Association results for imputed amino acids in HLA-DRβ1.

In contrast, there was only suggestive evidence for association between MHC region SNPs and CD with ileal involvement (Figure 2). The most significant association signal was found at rs17880124 (OR=2.23 [1.52–3.27, 95%CI] for the G allele; P = 3.82×10−5) which is located in an exon of the MHC class I polypeptide-related sequence A (MICA) gene. Of note, the association observed in UC was many orders of magnitude stronger than that in CD with ileal involvement despite a similar number of cases. Therefore, we focused on the UC signal through imputation of classical HLA alleles and their corresponding nucleotide and amino acid sequences.
Figure 2

Major histocompatibility complex regional association plot for Crohn’s disease with ileal involvement. Association results are for genotyped SNPs from the Illumina Omni1-quad BeadChip. The intensity of the red shading indicates the strength of the pairwise r2 correlation to the most associated SNP, rs17880124.

Analysis of imputed classical HLA alleles in UC

The following imputed genetic markers were included in our UC vs. control analyses: 156 classical HLA alleles at four-digit resolution, 95 classical HLA allele groups at two-digit resolution, 1,765 binary SNP features at 1,573 nucleotide positions, and 561 binary HLA amino acid features at 357 amino acid positions. The most significant association signal in UC mapped to rs9269955 (Figure 1B), which is a tri-allelic SNP within the coding region of HLA-DRB1 (position 32,660,116 using NCBI36/hg18 coordinates). In combination with the nucleotide position directly adjacent to it (rs17878703 at position 32,660,115), rs9269955 determines the codon for amino acid position 11 of the HLA-DRβ1 protein, where six different amino acid alleles are observed in the population at large (Table 1). Chromosome 6 position 32,660,114 is the third position in this codon, and it is not known to be polymorphic. Rs9269955-C (to indicate the presence of the C allele) is associated with protection against UC (OR = 0.51 [0.43–0.61, 95% CI], P = 2.67×10−13). In combination with the adjacent rs17878703 alleles, rs9269955-C encodes three of the six observed amino acids (aspartic acid, valine, or glycine) at HLA-DRβ1 amino acid 11 (Table 1). This SNP is correlated with rs2395185 (r2 = 0.88 in our dataset), which was the MHC region SNP with the most significant association in the NIDDK IBD Genetics Consortium UC GWAS.[12]
Table 1

Univariate results for DNA sequence (shown in order of positions 32,660,114 to 32,660,116) and codon determinants (shown in order of corresponding positions 32,660,116 to 32,660,114) for HLA-DRβ1, amino acid 11.

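| Position | Allele | DNA sequence (Positions 32,660,114 – 32,660,116) | Codon (Positions 32,660,116 – 32,660,114) | Frequency (UC) | Frequency (Controls) | OR (95% CI) | P value |
|---|---|---|---|---|---|---|---|
| rs9269955 (position 32,660,116) | C | --C | | 0.188 | 0.300 | 0.51 (0.43–0.61) | 2.67 × 10−13 |
| rs9269955 (position 32,660,116) | A | --A | | 0.451 | 0.431 | 1.11 (0.96–1.28) | 1.51 × 10−1 |
| rs9269955 (position 32,660,116) | G | --G | | 0.362 | 0.268 | 1.52 (1.31–1.77) | 5.56 × 10−8 |
| rs17878703 (position 32,660,115) | T | -T- | | 0.003 | 0.011 | 0.25 (0.07–0.81) | 2.14 × 10−2 |
| rs17878703 (position 32,660,115) | C | -C- | | 0.092 | 0.139 | 0.61 (0.48–0.77) | 3.38 × 10−5 |
| rs17878703 (position 32,660,115) | A | -A- | | 0.238 | 0.266 | 0.86 (0.72–1.01) | 6.97 × 10−2 |
| rs17878703 (position 32,660,115) | G | -G- | | 0.667 | 0.584 | 1.46 (1.26–1.70) | 9.25 × 10−7 |
| HLA-DRβ1, amino acid 11 | Asp | ATC | GAU | 0.003 | 0.011 | 0.25 (0.07–0.81) | 2.14 × 10−2 |
| HLA-DRβ1, amino acid 11 | Val | AAC | GUU | 0.093 | 0.151 | 0.55 (0.44–0.70) | 1.11 × 10−6 |
| HLA-DRβ1, amino acid 11 | Gly | ACC | GGU | 0.092 | 0.139 | 0.61 (0.48–0.77) | 3.38 × 10−5 |
| HLA-DRβ1, amino acid 11 | Ser | AGA | UCU | 0.451 | 0.431 | 1.11 (0.96–1.28) | 1.52 × 10−1 |
| HLA-DRβ1, amino acid 11 | Leu | AAG | CUU | 0.145 | 0.115 | 1.32 (1.07–1.63) | 8.98 × 10−3 |
| HLA-DRβ1, amino acid 11 | Pro | AGG | CCU | 0.216 | 0.153 | 1.48 (1.24–1.77) | 1.61 × 10−5 |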

DNA, deoxyribonucleic acid; UC, ulcerative colitis; OR, odds ratio; CI, confidence interval; A, adenine; C, cytosine; G, guanine; T, thymine; U, uracil; Asp, aspartic acid; Val, valine; Gly, glycine; Ser, serine, Leu, leucine; Pro, proline.
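
The relationship between the two sequence columns in Table 1 can be checked mechanically. The sketch below assumes, as the reversed position order in the table implies, that the codon is the reverse complement of the forward-strand triplet, written as RNA. The function names are illustrative, and the lookup covers only the six codons listed in Table 1.

```python
# Illustrative helpers (names are assumptions, not from the paper).
# Maps a forward-strand DNA base to the complementary RNA base.
COMPLEMENT = {"A": "U", "C": "G", "G": "C", "T": "A"}

# Only the six codons reported in Table 1 are included here.
CODON_TO_RESIDUE = {
    "GAU": "Asp", "GUU": "Val", "GGU": "Gly",
    "UCU": "Ser", "CUU": "Leu", "CCU": "Pro",
}


def codon_from_forward_triplet(dna: str) -> str:
    """Reverse-complement a forward-strand triplet (positions 32,660,114
    to 32,660,116) into the codon read from 32,660,116 to 32,660,114."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))


def residue_at_position_11(dna: str) -> str:
    """Look up the HLA-DRβ1 amino acid 11 residue for a forward-strand triplet."""
    return CODON_TO_RESIDUE[codon_from_forward_triplet(dna)]


if __name__ == "__main__":
    for triplet in ["ATC", "AAC", "ACC", "AGA", "AAG", "AGG"]:
        print(triplet, codon_from_forward_triplet(triplet), residue_at_position_11(triplet))
```

Every triplet ending in C (the rs9269955-C allele at position 32,660,116) maps to a codon starting with G, which is why that allele corresponds to aspartic acid, valine, or glycine.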

To analyze the role of specific amino acid positions in the HLA genes in UC, we conducted omnibus tests for association with degrees-of-freedom equal to the number of distinct residues for that amino acid position minus one (Table 2). The most significant finding was for HLA-DRβ1 amino acid 11 (P = 2.68×10−13), consistent with the results noted above (Figure 1C). Several other amino acid associations were highly significant including other amino acid positions in HLA-DRβ1, HLA-DQα1 or HLA-DQβ1 (Table 2).
Table 2

Omnibus amino acid tests for ulcerative colitis versus control. Amino acid positions with omnibus P < 5 × 10−8 are shown.

| HLA amino acid position | Codon middle nucleotide position (chromosome 6, NCBI36/hg18) | Degrees of freedom | Omnibus P value |
|---|---|---|---|
| HLA-DRβ1, amino acid 181 | 32,657,335 | 1 | 7.48 × 10−9 |
| HLA-DRβ1, amino acid 104 | 32,657,566 | 1 | 4.70 × 10−12 |
| HLA-DRβ1, amino acid 98 | 32,657,584 | 1 | 4.68 × 10−12 |
| HLA-DRβ1, amino acid 37 | 32,660,037 | 4 | 1.46 × 10−8 |
| HLA-DRβ1, amino acid 30 | 32,660,058 | 5 | 6.01 × 10−10 |
| HLA-DRβ1, amino acid 13 | 32,660,109 | 5 | 1.39 × 10−10 |
| HLA-DRβ1, amino acid 11 | 32,660,115 | 5 | 2.68 × 10−13 |
| HLA-DQα1, amino acid 47 | 32,717,191 | 3 | 2.73 × 10−10 |
| HLA-DQα1, amino acid 50 | 32,717,200 | 2 | 2.95 × 10−11 |
| HLA-DQα1, amino acid 53 | 32,717,209 | 2 | 2.12 × 10−11 |
| HLA-DQα1, amino acid 175 | 32,717,988 | 2 | 2.28 × 10−10 |
| HLA-DQα1, amino acid 215 | 32,718,464 | 1 | 5.95 × 10−12 |
| HLA-DQβ1, amino acid 185 | 32,737,733 | 1 | 8.62 × 10−11 |

Because these results highlighted HLA-DRβ1 amino acid 11, we further analyzed the six amino acids at this position and the corresponding classical HLA-DRB1 allele groups at two-digit resolution (Table 3). The three amino acids (aspartic acid, valine, and glycine) encoded by the rs9269955-C allele in combination with the adjacent rs17878703 alleles, are all associated with protection against development of UC.
Table 3

Ulcerative colitis versus control associations for HLA-DRβ1 amino acid 11 residues and corresponding classical HLA-DRB1 alleles. The multivariate best model for HLA-DRβ1 amino acid 11 residues alone was identified with stepwise regression. UC, ulcerative colitis; OR, odds ratio; CI, confidence interval; Asp, aspartic acid; Gly, glycine; Leu, leucine; Pro, proline; Ser, serine; Val, valine.

Amino Acid at HLA-DRβ1 position 11

| Amino Acid | Frequency (UC) | Frequency (Controls) | Univariate OR (95% CI) | Univariate P value | Multivariate OR (95% CI) | Multivariate P value |
|---|---|---|---|---|---|---|
| Asp | 0.003 | 0.011 | 0.25 (0.07–0.81) | 2.14 × 10−2 | 0.21 (0.06–0.69) | 1.03 × 10−2 |
| Gly | 0.092 | 0.139 | 0.61 (0.48–0.77) | 3.38 × 10−5 | 0.55 (0.43–0.69) | 7.53 × 10−7 |
| Leu | 0.145 | 0.115 | 1.32 (1.07–1.63) | 8.98 × 10−3 | | |
| Pro | 0.216 | 0.153 | 1.48 (1.24–1.77) | 1.61 × 10−5 | | |
| Ser | 0.451 | 0.431 | 1.11 (0.96–1.28) | 1.52 × 10−1 | | |
| Val | 0.093 | 0.151 | 0.55 (0.44–0.70) | 1.11 × 10−6 | 0.50 (0.40–0.64) | 2.14 × 10−8 |

Corresponding HLA-DRB1 group

| Amino Acid | HLA-DRB1 group | Frequency (UC) | Frequency (Controls) | OR (95% CI) | P value |
|---|---|---|---|---|---|
| Asp | HLA-DRB1*09 | 0.003 | 0.011 | 0.24 (0.07–0.81) | 2.11 × 10−2 |
| Gly | HLA-DRB1*07 | 0.092 | 0.139 | 0.61 (0.48–0.77) | 3.38 × 10−5 |
| Leu | HLA-DRB1*01 | 0.145 | 0.115 | 1.32 (1.07–1.63) | 8.98 × 10−3 |
| Pro | HLA-DRB1*15 | 0.193 | 0.122 | 1.64 (1.36–1.98) | 2.87 × 10−7 |
| Pro | HLA-DRB1*16 | 0.023 | 0.031 | 0.75 (0.47–1.20) | 2.34 × 10−1 |
| Ser | HLA-DRB1*03 | 0.102 | 0.107 | 0.93 (0.74–1.16) | 5.05 × 10−1 |
| Ser | HLA-DRB1*08 | 0.027 | 0.029 | 0.93 (0.61–1.43) | 7.49 × 10−1 |
| Ser | HLA-DRB1*11 # | 0.155 | 0.130 | 1.30 (1.06–1.60) | 1.11 × 10−2 |
| Ser | HLA-DRB1*12 | 0.026 | 0.018 | 1.59 (0.97–2.61) | 6.57 × 10−2 |
| Ser | HLA-DRB1*13 | 0.110 | 0.118 | 0.94 (0.75–1.18) | 6.09 × 10−1 |
| Ser | HLA-DRB1*14 | 0.031 | 0.030 | 1.02 (0.68–1.54) | 9.11 × 10−1 |
| Val | HLA-DRB1*04 | 0.092 | 0.138 | 0.62 (0.48–0.78) | 6.93 × 10−5 |
| Val | HLA-DRB1*10 | 0.001 | 0.014 | 0.06 (0.01–0.46) | 6.99 × 10−3 |

# All HLA-DRB1*11 alleles are associated with serine at HLA-DRβ1 amino acid position 11, except HLA-DRB1*11:22 and HLA-DRB1*11:30, which are associated with valine and leucine, respectively.

Among 28 imputed classical HLA-DRB1 alleles tested at four-digit resolution, three were significantly associated with UC (DRB1*15:01, OR = 1.59 [1.31–1.93, 95% CI], P = 3.68×10−6; DRB1*01:03, OR = 38.39 [7.50–196.60, 95% CI], P = 1.20×10−5; DRB1*07:01, OR = 0.61 [0.48–0.77, 95% CI] P = 3.38×10−5). Because the above findings highlighted HLA-DRB1 association in UC, we then evaluated the quality of our classical HLA-DRB1 allele imputation at two-digit resolution by performing HLA-DRB1 genotyping via SSO probes and also next-generation sequencing using genomic DNA from 384 of our study subjects. This analysis demonstrated that the imputation procedure we applied was 98.8% accurate (see Supplementary Materials). We next determined the most parsimonious model to explain the association of HLA-DRβ1 amino acid 11 with UC using forward stepwise model selection for the six observed amino acids. The best model included only three of the six amino acids: valine, glycine and aspartic acid. The overall P value for this best model was 3.60×10−13 as compared to a P value of 2.68×10−13 for the full model that included all six amino acid alleles, suggesting that most of the association signal for UC at this position can be accounted for by only these three amino acids. Of note, valine, glycine and aspartic acid are the same three amino acids encoded by the most significant SNP allele, rs9269955-C, when it is combined with the adjacent rs17878703 SNP alleles. This provides good internal validation between these different analytic approaches and highlights that variation at HLA-DRβ1 amino acid 11 explains much of the HLA association with UC.

UC versus control best multivariate model

When we performed analyses conditioned on including either rs9269955-C or the HLA-DRβ1 amino acid 11 variants, there were residual UC versus control association signals due to effects of other variants in the HLA region. This finding is consistent with prior observations in UC that multiple independent association signals exist in the MHC. We used a forward stepwise model selection procedure to select the best set of markers to predict UC (Table 4). This best model has an overall P value of 4.28×10−40 and includes rs9269955-C and 13 other markers that span the chromosome 6 region from 29.45 to 33.81 Mb.
Table 4

Ulcerative colitis versus control association for MHC marker terms in best model identified with stepwise regression.

| Marker | Chromosome 6 position (NCBI36/hg18) | Gene | A1 | A2 | Frequency (UC) | Frequency (controls) | Univariate P value | Univariate OR (95% CI) | Multivariate P value | Multivariate OR (95% CI) |
|---|---|---|---|---|---|---|---|---|---|---|
| rs9269955-C | 32,660,116 | HLA-DRB1 | Absent | Present | 0.812 | 0.700 | 2.67 × 10−13 | 1.95 (1.63–2.33) | 9.07 × 10−4 | 5.97 (2.08–17.17) |
| rs1049414 | 33,056,585 | BRD2 | A | G | 0.730 | 0.678 | 3.51 × 10−5 | 1.43 (1.21–1.69) | 1.84 × 10−5 | 1.53 (1.26–1.85) |
| rs440454 | 32,035,321 | SKIV2L | A | G | 0.339 | 0.247 | 1.57 × 10−7 | 1.51 (1.29–1.76) | 2.38 × 10−8 | 2.35 (1.74–3.17) |
| rs9273363 | 32,734,250 | HLA-DQA1/HLA-DQB1 | C | A | 0.835 | 0.752 | 1.44 × 10−8 | 1.71 (1.42–2.06) | 1.55 × 10−8 | 2.15 (1.65–2.81) |
| rs2844677 | 31,063,338 | MUC21 | G | A | 0.965 | 0.930 | 3.60 × 10−6 | 2.36 (1.64–3.39) | 2.01 × 10−3 | 1.83 (1.25–2.69) |
| rs1136759-T | 32,660,109 | HLA-DRB1 | Present | Absent | 0.184 | 0.276 | 6.35 × 10−10 | 0.57 (0.47–0.68) | 6.52 × 10−3 | 4.39 (1.51–12.76) |
| rs915654 | 31,646,476 | NFKBIL1/LTA | A | T | 0.382 | 0.330 | 4.70 × 10−4 | 1.31 (1.13–1.52) | 6.69 × 10−6 | 1.49 (1.25–1.77) |
| rs28435656 | 31,988,616 | C2 | G | A | 0.787 | 0.843 | 1.27 × 10−4 | 0.71 (0.59–0.84) | 6.99 × 10−6 | 2.27 (1.59–3.24) |
| rs7772982 | 29,448,986 | OR5V1/OR12D3 | C | T | 0.214 | 0.179 | 3.45 × 10−3 | 1.31 (1.09–1.57) | 4.65 × 10−4 | 1.41 (1.16–1.72) |
| rs3135391 | 32,518,965 | HLA-DRA | A | G | 0.181 | 0.115 | 1.18 × 10−6 | 1.61 (1.33–1.96) | 8.12 × 10−6 | 1.95 (1.45–2.61) |
| rs1130380-C | 32,740,672 | HLA-DQB1 | Absent | Present | 0.495 | 0.562 | 4.04 × 10−4 | 0.78 (0.67–0.89) | 2.85 × 10−4 | 1.47 (1.19–1.80) |
| rs6933763 | 32,830,830 | HLA-DQA2/HLA-DQB2 | G | A | 0.133 | 0.085 | 4.32 × 10−7 | 1.81 (1.44–2.29) | 3.41 × 10−4 | 1.61 (1.24–2.08) |
| rs9266196-C | 31,432,808 | HLA-B | Present | Absent | 0.374 | 0.329 | 4.72 × 10−3 | 1.23 (1.07–1.43) | 1.93 × 10−3 | 1.35 (1.12–1.64) |
| rs6457740 | 33,805,103 | IP6K3 | G | A | 0.755 | 0.710 | 5.86 × 10−3 | 1.25 (1.07–1.47) | 4.82 × 10−3 | 1.28 (1.08–1.53) |

Markers are listed according to the order in which they came into the model. The frequencies and odds ratios are given for the A1 allele. For markers with more than two alleles, presence or absence of the specified allele was compared. The reference sequence gene is listed for intragenic markers and the two flanking reference sequence genes are listed for intergenic markers. A1, allele 1; A2, allele 2; OR, odds ratio; CI, confidence interval; A, adenine; C, cytosine; G, guanine; T, thymine.
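
Table 1 reports rs9269955-C with the C allele present, whereas Table 4 reports the same marker with "Absent" as the A1 state. A minimal sketch of how the two presentations relate, assuming the usual reciprocal relationship for a binary marker; the function name is hypothetical.

```python
def flip_reference_state(odds_ratio, ci_low, ci_high, frequency):
    """Re-express a binary-marker association for the complementary state.

    The odds ratio and both confidence bounds are inverted (the bounds swap
    places), and the frequency becomes 1 - frequency.
    """
    return 1.0 / odds_ratio, 1.0 / ci_high, 1.0 / ci_low, 1.0 - frequency


# rs9269955-C present in UC cases (Table 1): OR 0.51 (0.43-0.61), frequency 0.188
or_absent, low, high, freq_absent = flip_reference_state(0.51, 0.43, 0.61, 0.188)
print(f"{or_absent:.2f} ({low:.2f}-{high:.2f}), frequency {freq_absent:.3f}")
# prints "1.96 (1.64-2.33), frequency 0.812"
```

Table 4 lists 1.95 (1.63–2.33) with a frequency of 0.812; the small differences in the second decimal are what one would expect when inverting values that were already rounded for publication.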

UC versus CD with ileal involvement best multivariate model

In order to compare HLA associations between UC and CD with ileal involvement, we performed an analysis using UC subjects as cases and CD with ileal involvement subjects as controls. Initial association analyses for all markers in our study were performed and then we applied stepwise model selection to determine the best model for a UC versus CD with ileal involvement comparison (Table 5A). The model that was selected included 11 markers and had an overall model P value of 4.48×10−33. Not unexpectedly, there was no overlap between these markers and those that were chosen in the UC versus control best model described above (Table 4).
Table 5

Best model association results for ulcerative colitis versus Crohn’s disease with ileal involvement (Table 5A), ulcerative colitis versus control using markers from 5A (Table 5B) and Crohn’s disease with ileal involvement versus control using markers from 5A (Table 5C) for MHC marker terms identified with stepwise regression.

Table 5A. Ulcerative colitis versus Crohn’s disease with ileal involvement.
| Marker | Chromosome 6 position (NCBI36/hg18) | Gene(s) | A1 | A2 | Frequency (UC) | Frequency (Ileal CD) | Univariate P value | Univariate OR (95% CI) | Multivariate P value | Multivariate OR (95% CI) |
|---|---|---|---|---|---|---|---|---|---|---|
| rs2647025 | 32743927 | HLA-DQB1/HLA-DQA2 | G | A | 0.836 | 0.682 | 2.00 × 10−16 | 2.35 (1.92–2.88) | 2.84 × 10−13 | 2.24 (1.80–2.78) |
| rs16899682 | 31551678 | HCG26/MICB | C | G | 0.031 | 0.014 | 5.09 × 10−3 | 2.34 (1.29–4.23) | 5.27 × 10−4 | 3.14 (1.64–6.00) |
| rs2257269 | 31431332 | HLA-B | G | A | 0.634 | 0.541 | 6.66 × 10−6 | 1.48 (1.25–1.75) | 1.88 × 10−4 | 1.43 (1.19–1.73) |
| rs41544112 | 32737898 | HLA-DQB1 | C | T | 0.966 | 0.939 | 3.93 × 10−3 | 1.81 (1.21–2.72) | 7.52 × 10−4 | 2.12 (1.37–3.27) |
| rs3130609 | 33097499 | HLA-DOA/HLA-DPA1 | C | T | 0.984 | 0.949 | 1.86 × 10−5 | 3.25 (1.89–5.56) | 2.05 × 10−3 | 2.45 (1.39–4.34) |
| rs16899168 | 31366666 | HLA-C/HLA-B | G | A | 0.977 | 0.956 | 7.83 × 10−3 | 1.93 (1.19–3.12) | 8.02 × 10−3 | 2.02 (1.20–3.39) |
| rs210134 | 33648187 | ZBTB9/BAK1 | G | A | 0.731 | 0.678 | 3.39 × 10−3 | 1.31 (1.09–1.57) | 9.68 × 10−4 | 1.39 (1.14–1.68) |
| HLA-B, amino acid 99-Y | 31432174 | HLA-B | Present | Absent | 0.994 | 0.977 | 2.12 × 10−3 | 4.01 (1.65–9.74) | 5.29 × 10−3 | 3.77 (1.48–9.57) |
| rs3130559 | 31205280 | PSORS1C1 | C | T | 0.819 | 0.750 | 4.45 × 10−5 | 1.53 (1.25–1.87) | 4.43 × 10−3 | 1.38 (1.11–1.72) |
| rs2256974 | 31663371 | LST1 | A | C | 0.201 | 0.151 | 1.52 × 10−3 | 1.42 (1.14–1.76) | 1.23 × 10−3 | 1.48 (1.17–1.89) |
| rs3135365 | 32497233 | BTNL2/HLA-DRA | C | A | 0.237 | 0.186 | 4.87 × 10−3 | 1.32 (1.09–1.61) | 2.90 × 10−3 | 1.40 (1.12–1.74) |

Markers are listed according to the order in which they came into the model. The frequencies and odds ratios are given for the A1 allele. For markers with more than two alleles, presence or absence of the specified allele was compared. The reference sequence gene is listed for intragenic markers and the two flanking reference sequence genes are listed for intergenic markers. A1, allele 1; A2, allele 2; UC, ulcerative colitis; Ileal CD, Crohn’s disease with ileal involvement; OR, odds ratio; CI, confidence interval; A, adenine; C, cytosine; G, guanine; T, thymine; Y, tyrosine.

We then used the 11 markers from the UC versus CD with ileal involvement best model to perform two further analyses: UC versus control and CD with ileal involvement versus control (Tables 5B and 5C). The model P value for UC versus control was 1.59×10−19 which is less significant than the P value of 4.28×10−40 for the unrestricted UC best model (Table 4). The model P value for CD with ileal involvement versus control was 1.42×10−5. Divergent effects for each UC versus CD with ileal involvement best model marker in the UC versus control compared to the CD with ileal involvement versus control analyses are apparent when the odds ratios for each marker are compared.

Discussion

The MHC locus demonstrates the strongest evidence for association to UC among 47 well-established UC loci identified in a GWAS meta-analysis[10], and is also one of 71 well-established CD loci identified by GWAS meta-analysis.[9] In order to better understand MHC association signals in UC and CD, we used dense MHC SNP data from the discovery stage of an ongoing, new UC and CD GWAS to impute classical HLA types, their constituent SNPs and corresponding amino acids, and we performed detailed analyses of the genotyped and imputed data. Our univariate tests of binary SNP and SNP allele markers, and our omnibus tests of polymorphic HLA amino acid positions both highlighted HLA-DRβ1, amino acid position 11 as the MHC feature most significantly associated with UC. The C allele of rs9269955 was the SNP allele most significantly associated with UC (presence of rs9269955-C is associated with protection and absence is associated with risk for UC). In combination with the immediately adjacent SNP, it encodes the valine, glycine or aspartic acid amino acid residues at HLA-DRβ1, amino acid 11, which were all associated with protection against UC. Furthermore, in multivariate analysis, the most parsimonious model to explain the association with UC at amino acid 11 consisted of valine, glycine and aspartic acid as the only terms.

HLA-DRB1 has extensive polymorphism as demonstrated by its 928 alleles and the 704 proteins for which it codes (International Immunogenetics Information System/HLA Database: http://www.ebi.ac.uk/imgt/hla)[13]. Valine at amino acid 11 corresponds to the common DRB1*04 (DR4) or lower frequency DRB1*10 (DR10) allele groups, glycine to DRB1*07 (DR7), and aspartic acid to DRB1*09 (DR9). The HLA-DR4, -DR7 and -DR9 allele groups were associated with protection against UC in a meta-analysis of prior studies.[3] They almost always occur on haplotypes carrying the HLA-DRB4 gene which encodes the DR53 antigen, and HLA-DRB4*01:01 has been associated with protection against UC in Japan.[14] In addition, the previously reported HLA-DR2 association with risk for UC[3, 5] is consistent with our observation that proline at position 11 in HLA-DRβ1 is associated with risk for UC. Based on the complementary findings from our different analyses and their correlation with results from prior studies, we conclude that variation at amino acid position 11 of HLA-DRβ1 is a major determinant of chromosome 6p association with ulcerative colitis.

The potential biological significance of the UC association of amino acid position 11 relates to the peptide binding specificity of HLA class II molecules and their role in antigen presentation to T cells.[15, 16] The three-dimensional structure of the class II molecule HLA-DR1 heterodimer (DRA/DRB1*0101) has been well characterized and its peptide binding groove has been shown to be determined by polymorphic molecules that form nine pockets with different chemical and size characteristics.[15, 17] In one of these pockets (P6), amino acid position 11 appears to be the only variable residue and thus determines the binding specificity of that pocket.[18] Of note, hydrophobic amino acid residues at DRβ1 amino acid 11 were found to be associated with protection against development of sarcoidosis.[19] This finding suggests that such hydrophobic interactions could affect peptide binding in the P6 pocket.[19] We therefore hypothesize that variation at amino acid position 11 of HLA-DRβ1 could have an effect on peptide binding in the HLA-DR complex antigen binding cleft that alters risk for the development of UC.

It is important to note that the MHC association signal in UC is complex and not completely explained by amino acid position 11 in HLA-DRβ1. In fact, our forward stepwise model selection identified 13 other terms besides rs9269955-C. This model is highly significant with an overall P value of 4.28×10−40, but it will need to be validated in additional large cohorts. Included in our model was another missense SNP allele in HLA-DRB1, the T allele of rs1136759. rs1136759 and two adjacent flanking SNPs encode variation at HLA-DRβ1, amino acid 13, which is located in the P4 pocket of the HLA-DR complex antigen binding cleft. The finding that two of the terms in the best model for prediction of UC risk relate to the HLA-DRβ1 complex antigen binding cleft emphasizes the probable importance of HLA-DRB1 in the pathogenesis of UC. Four other MHC class II loci variants, including SNPs in HLA-DQB1 (rs1130380-C) and HLA-DRA (rs3135391), between HLA-DQA1 and HLA-DQB1 (rs9273363), and between HLA-DQA2 and HLA-DQB2 (rs6933763), were associated with UC in our multivariate model. The HLA-DRB, -DQB and -DPB genes are all highly polymorphic and encode β-chains of the class II molecule αβ heterodimer, while the α-chains are encoded by the HLA-DQA, -DPA and -DRA genes.[4]

Three polymorphisms in MHC class III loci (rs440454, rs28435656, and rs915654) were included as terms in our UC versus control model. The MHC class III region is one of the most gene dense regions in the human genome. Two of the SNPs in our model, rs440454 and rs28435656, are in linkage disequilibrium (r2 = 0.54 in HapMap 3-CEU[11]) and located in an MHC class III segment that contains four genes within 30 kb including superkiller viralicidic activity 2-like (SKIV2L) and RD RNA binding protein (RDBP).[20] rs440454 is in perfect linkage disequilibrium (r2 = 1.0 in HapMap 3-CEU[11]) with SNP rs419788 that was associated with risk for lupus.[21] rs28435656 is located in the complement component 2 (C2) gene which is located immediately adjacent to the region that includes SKIV2L and RDBP. Finally, rs915654 is located 5 prime to the lymphotoxin A (LTA) locus which has been associated with CD and diabetes.[22] All these findings suggest a role for MHC class III genes in UC pathogenesis which warrants further investigation.

Another association of potential pathogenic interest identified in our UC versus control model is rs2844677, a synonymous SNP in the coding region of the mucin 21, cell surface associated (MUC21) gene. MUC21 is a recently identified gene that is expressed in normal colon among other tissues and produces a transmembrane mucin involved in cell adhesion.[23, 24]

In the last part of our analysis, we compared MHC region association signals between UC and CD with ileal involvement. The finding that the 11 studied markers each had odds ratios with effects in opposite directions for the two IBD phenotypes, together with the results from our initial association analysis in which the most significant associations in UC were different from those for ileal CD, demonstrates that the association signals for UC and ileal CD are quite different. This conclusion correlates with results of prior studies which have shown that the only consistent associations with risk for both UC and CD have been for HLA-DRB1*01:03 and HLA-B52.[3, 4] In contrast, alleles of the HLA-DR2 split antigen DR15 have been associated in opposite directions, with HLA-DRB1*15:01 associated with protection against CD and HLA-DRB1*15:02 associated with increased risk for UC.[3, 5]

In summary, we have performed detailed analyses to better understand MHC association signals in UC and CD. Our most significant finding is that a specific variation at amino acid position 11 of HLA-DRβ1, the only variable amino acid in the P6 pocket of the HLA-DR complex antigen binding cleft, explains a substantial portion of the MHC association signal and corresponds with several previously established classical HLA class II associations in UC. The observed alteration at amino acid position 11 of HLA-DRβ1 may affect peptide binding and result in an altered immune activation underlying protection against UC. We have also developed a novel multivariate model that further defines the contribution of MHC variation to risk for UC and highlights other genes of potential importance in UC pathogenesis. Finally, our multivariate modeling suggests different effects of MHC polymorphisms in UC and CD.

Materials and Methods

Study subjects

Our study sample included 574 UC, 630 CD with at least ileal involvement, and 1,508 control subjects of European ancestry that were recruited for genetic studies at the Cleveland Clinic or the University of Pittsburgh under institutional review board-approved protocols. All subjects provided written informed consent. IBD diagnoses and assessment of disease location were confirmed by IBD physicians via review of primary medical records using standard endoscopic, radiographic and histologic criteria.

Genotyping and quality control

Study subjects were genotyped using the Illumina Omni1-quad BeadChip (Illumina, San Diego, CA) at the Feinstein Institute for Medical Research of the North Shore-Long Island Jewish Health System. Data from samples with preliminary genotype call rates > 0.98 using cluster positions provided by Illumina were reclustered using the Illumina GenomeStudio software, and the new cluster positions were applied to all samples. Initial quality control of the genotyping data included removal of one sample from each pair with estimated identity-by-descent proportion > 0.10, removal of samples with genotype missing rates > 0.05, or with discordant SNP-determined and reported gender or ambiguous SNP-determined gender, and removal of SNPs with genotype missing rates > 0.05, minor allele frequencies in controls < 0.005, or Hardy-Weinberg P values in controls < 1×10−6. These quality control steps were performed using the PLINK software.[25] Subsequently, tag SNPs with genotype missing rates < 0.1% and physical separation of at least 0.4 megabases (Mb) were used in spectral analysis of ancestry that identified 929 controls with a relatively homogenous ‘European’ ancestral background. Additional SNPs with minor allele frequencies < 0.005 or Hardy-Weinberg P values < 0.001 in these 929 controls were removed from the dataset.
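
The SNP-level filters described above can be sketched as follows. This is an illustration only: the study used PLINK, whereas the sketch substitutes a simple 1-degree-of-freedom chi-square Hardy-Weinberg test, and it takes a single set of genotype counts rather than separating controls from all samples. All function and parameter names are hypothetical.

```python
import math


def minor_allele_frequency(n_aa, n_ab, n_bb):
    """Minor allele frequency from genotype counts."""
    freq_b = (2 * n_bb + n_ab) / (2 * (n_aa + n_ab + n_bb))
    return min(freq_b, 1.0 - freq_b)


def hwe_chi_square_p(n_aa, n_ab, n_bb):
    """P value of a 1-df chi-square test for Hardy-Weinberg equilibrium.

    This is a large-sample approximation chosen for brevity; it is an
    assumption of this sketch, not the test the authors ran.
    """
    n = n_aa + n_ab + n_bb
    q = (2 * n_bb + n_ab) / (2 * n)
    p = 1.0 - q
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


def snp_passes_qc(n_aa, n_ab, n_bb, n_missing,
                  max_missing=0.05, min_maf=0.005, min_hwe_p=1e-6):
    """Apply the missing-rate, MAF and HWE thresholds quoted in the text."""
    n_called = n_aa + n_ab + n_bb
    missing_rate = n_missing / (n_called + n_missing)
    return (missing_rate <= max_missing
            and minor_allele_frequency(n_aa, n_ab, n_bb) >= min_maf
            and hwe_chi_square_p(n_aa, n_ab, n_bb) >= min_hwe_p)
```

For example, `snp_passes_qc(250, 500, 250, 0)` keeps a SNP in perfect Hardy-Weinberg proportions, while `snp_passes_qc(500, 0, 500, 0)` rejects one with no heterozygotes.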

Ancestry matching

To control for potential confounding due to variation in genetic ancestry, study subjects were grouped into 11 approximately homogenous clusters, based on genetic distances derived from GemTools.[26, 27] Ancestry was inferred based on SNPs with genotype missing rates < 0.1% and a physical separation of at least 0.2 Mb. In all of the association analyses, we controlled for ancestry by including cluster membership as a blocking variable. The inflation across the genome-wide SNP data was minimal (genomic control lambda[28] = 1.02 for UC vs. control and 1.03 for CD with ileal involvement vs. control), confirming that the samples were well matched.
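
The text does not spell out how the genomic control lambda was computed. A minimal sketch, assuming the commonly used median-based estimator applied to 1-degree-of-freedom chi-square association statistics; the function name and constant name are illustrative.

```python
import statistics

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_control_lambda(chi_square_stats):
    """Ratio of the observed median test statistic to its expected median
    under the null. Values near 1 indicate little inflation."""
    return statistics.median(chi_square_stats) / CHI2_1DF_MEDIAN
```

With this estimator, a genome-wide median statistic of about 0.464 would correspond to the reported lambda of 1.02.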

Imputation of classical HLA, SNP, and amino acid allele dosages

We followed a previously described procedure[29] to impute classical HLA alleles and their corresponding amino acid sequences in our cases and controls, using the genotyped SNPs in our GWAS as input. This imputation procedure is conceptually similar to HLA*IMP[32] in that haplotype information across the region is used to predict classical HLA alleles based on genotyped SNPs. A prior study provided empirical evidence that the imputations have good accuracy,[29] comparable to that of the work on which HLA*IMP is based.[32] As the reference panel, we used a data set of 263 HLA-A, -B, -C, -DRB1, -DQA1, -DQB1, -DPA1 and -DPB1 classical alleles at four-digit resolution, 3,852 SNPs, and 372 amino acid positions in 2,767 unrelated founder individuals of European descent collected by the MHC Working Group of the Type 1 Diabetes Genetics Consortium.[30] All variants were encoded as biallelic markers, allowing us to use standard tools for imputation. For variants with more than two alleles, each allele was coded as present or absent and analyzed in a separate test. We used default parameters for BEAGLE (http://faculty.washington.edu/browning/beagle/beagle.html): ten iterations of phasing/imputation, testing four haplotype pairs for each individual at each iteration. For each variant, we used the posterior probabilities of carrying 0 (AA), 1 (AB) or 2 (BB) copies to calculate the effective dosage for allele B (= 2 × Pr(BB) + Pr(AB)). To obtain allele dosages for MHC region Omni1-quad SNPs, we used BEAGLECALL.[31] Three iterations of BEAGLECALL were run, with increasing stringency of genotype calling filters (callthreshold=0.9 and missingcohort=0.1 in iteration 1, callthreshold=0.98 and missingcohort=0.02 in iteration 2, and callthreshold=0.985 and missingcohort=0.015 in iteration 3).
We combined dosage information for markers in the Type 1 Diabetes Genetics Consortium reference panel with dosage information for additional Omni1-quad SNPs that appeared in both genome builds NCBI36/hg18 and GRCh37/hg19 into a combined set of genetic features in the MHC region from 29,299 to 33,884 kilobases (kb) on chromosome 6 using NCBI36/hg18 coordinates. HLA-DRB1 imputation quality at two-digit resolution was assessed by sequence-specific oligonucleotide (SSO) probes and next-generation sequencing of genomic DNA collected from 384 of our study subjects (see Supplementary Materials).
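
The dosage formula and the present/absent coding of multi-allelic variants described above can be sketched as follows. This is an illustrative sketch only: the function names, the residue labels "X", "Y" and "Z", and the choice to store a copy count (0, 1 or 2) per allele are assumptions, not details taken from the study.

```python
def effective_dosage(p_aa, p_ab, p_bb):
    """Expected number of B alleles: 2 * Pr(BB) + Pr(AB)."""
    if abs(p_aa + p_ab + p_bb - 1.0) > 1e-6:
        raise ValueError("genotype posterior probabilities must sum to 1")
    return 2.0 * p_bb + p_ab


def biallelic_indicators(genotype, alleles):
    """Recode one multi-allelic genotype as a separate marker per allele.

    genotype: pair of allele labels carried by an individual.
    alleles: all labels observed at the variant.
    Returns the number of copies (0, 1 or 2) of each allele, so each
    allele can be tested as its own present/absent biallelic marker.
    """
    return {allele: genotype.count(allele) for allele in alleles}


# Hypothetical posterior probabilities for one individual at one variant.
dosage = effective_dosage(0.1, 0.3, 0.6)

# Hypothetical three-residue amino acid position.
coded = biallelic_indicators(("X", "Z"), ["X", "Y", "Z"])
```

With hard genotype calls the dosage reduces to the allele count; with uncertain imputation it takes fractional values between 0 and 2.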

Association analyses

Association analyses were performed using allele dosage data from 562 UC, 611 CD with ileal involvement, and 1,428 control samples that passed quality control. We examined the association between binary markers in the HLA region and case status (UC versus control, and CD with ileal involvement versus control) using logistic regression with a log-additive model. Forward stepwise model selection was used to determine a set of markers in the post-imputation data that jointly predicted disease versus control status without including multiple markers in tight linkage disequilibrium. Markers with an allele frequency < 0.001 were excluded. The Bayesian Information Criterion (BIC) was used to find a model that balanced goodness of fit against model complexity. The stepwise procedure started by taking the best marker (lowest P value) into the regression model and iteratively added markers until the BIC ceased to improve. This procedure was performed in R (http://www.r-project.org) using the “glm” and “step” functions. For each polymorphic amino acid position in the HLA region, we also conducted an omnibus test for association using multivariate logistic regression with degrees of freedom equal to the number of distinct residues at that position minus one. For the position yielding the smallest P value, we used stepwise regression, limited to that position, to select a parsimonious model for the site. Finally, using stepwise regression, we determined a model for differentiating UC from CD with ileal involvement. In this model, subjects with CD with ileal involvement served as controls and UC subjects served as cases. For each multivariate model, we provide the P value associated with the best model. This P value pertains to the null hypothesis that none of the terms in the model has any explanatory value, versus the alternative hypothesis that at least one term is associated with the phenotype. The degrees of freedom associated with this test equal the number of markers in the multivariate model.
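
The forward stepwise selection by BIC described above can be sketched in a few lines. The analysis itself was run with the R functions "glm" and "step"; the Python version below is only an assumption-laden illustration with a hand-written Newton-Raphson logistic fit. It omits the ancestry cluster covariates, picks the first marker by BIC rather than by P value, and uses hypothetical toy data and marker names.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def max_loglik(cols, y, iters=25):
    """Maximised log-likelihood of a logistic model: intercept plus cols."""
    rows = [[1.0] + [c[i] for c in cols] for i in range(len(y))]
    p = len(rows[0])
    beta = [0.0] * p
    for _ in range(iters):  # Newton-Raphson updates
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for x, yi in zip(rows, y):
            mu = 1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, x))))
            for j in range(p):
                grad[j] += (yi - mu) * x[j]
                for k in range(p):
                    hess[j][k] += mu * (1.0 - mu) * x[j] * x[k]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    ll = 0.0
    for x, yi in zip(rows, y):
        eta = sum(b * v for b, v in zip(beta, x))
        ll += yi * eta - math.log1p(math.exp(eta))
    return ll


def bic(names, markers, y):
    """BIC = -2 log-likelihood + (number of parameters) * ln(sample size)."""
    cols = [markers[n] for n in names]
    return -2.0 * max_loglik(cols, y) + (len(cols) + 1) * math.log(len(y))


def forward_select(markers, y):
    """Add the marker that lowers BIC most; stop when BIC stops improving."""
    chosen = []
    best = bic(chosen, markers, y)
    while True:
        trials = [(bic(chosen + [n], markers, y), n)
                  for n in markers if n not in chosen]
        if not trials:
            break
        score, name = min(trials)
        if score >= best:
            break
        chosen.append(name)
        best = score
    return chosen


# Hypothetical toy data: 40 subjects, marker_a tracks case status,
# marker_b is balanced within every (marker_a, status) group.
y = [1] * 16 + [0] * 4 + [1] * 4 + [0] * 16
markers = {
    "marker_a": [1] * 20 + [0] * 20,
    "marker_b": [1] * 8 + [0] * 8 + [1, 1, 0, 0] * 2 + [1] * 8 + [0] * 8,
}
selected = forward_select(markers, y)
```

In this toy example only marker_a enters the model, because adding marker_b leaves the likelihood unchanged while the BIC penalty grows by ln(40).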