High-density genetic mapping identifies new susceptibility loci for rheumatoid arthritis.

Steve Eyre, John Bowes, Dorothée Diogo, Annette Lee, Anne Barton, Paul Martin, Alexandra Zhernakova, Eli Stahl, Sebastien Viatte, Kate McAllister, Christopher I Amos, Leonid Padyukov, Rene E M Toes, Tom W J Huizinga, Cisca Wijmenga, Gosia Trynka, Lude Franke, Harm-Jan Westra, Lars Alfredsson, Xinli Hu, Cynthia Sandor, Paul I W de Bakker, Sonia Davila, Chiea Chuen Khor, Khai Koon Heng, Robert Andrews, Sarah Edkins, Sarah E Hunt, Cordelia Langford, Deborah Symmons, Pat Concannon, Suna Onengut-Gumuscu, Stephen S Rich, Panos Deloukas, Miguel A Gonzalez-Gay, Luis Rodriguez-Rodriguez, Lisbeth Ärlsetig, Javier Martin, Solbritt Rantapää-Dahlqvist, Robert M Plenge, Soumya Raychaudhuri, Lars Klareskog, Peter K Gregersen, Jane Worthington.

Abstract

Using the Immunochip custom SNP array, which was designed for dense genotyping of 186 loci identified through genome-wide association studies (GWAS), we analyzed 11,475 individuals with rheumatoid arthritis (cases) of European ancestry and 15,870 controls for 129,464 markers. We combined these data in a meta-analysis with GWAS data from additional independent cases (n = 2,363) and controls (n = 17,872). We identified 14 new susceptibility loci, 9 of which were associated with rheumatoid arthritis overall and five of which were specifically associated with disease that was positive for anticitrullinated peptide antibodies, bringing the number of confirmed rheumatoid arthritis risk loci in individuals of European ancestry to 46. We refined the peak of association to a single gene for 19 loci, identified secondary independent effects at 6 loci and identified association to low-frequency variants at 4 loci. Bioinformatic analyses generated strong hypotheses for the causal SNP at seven loci. This study illustrates the advantages of dense SNP mapping analysis to inform subsequent functional investigations.

Year:  2012        PMID: 23143596      PMCID: PMC3605761          DOI: 10.1038/ng.2462

Source DB:  PubMed          Journal:  Nat Genet        ISSN: 1061-4036            Impact factor:   38.330


Rheumatoid arthritis is a common, complex disease affecting up to 1% of the adult population. It is an archetypal autoimmune disease, typified by the presence of serum autoantibodies, including antibodies directed against the Fc portion of immunoglobulins (rheumatoid factor) and against citrullinated peptides (anti-citrullinated peptide antibodies (ACPA)). Genetic studies of rheumatoid arthritis, including recent application of genome-wide association studies (GWAS), have identified 32 risk loci among individuals of European ancestry, including HLA-DRB1, PTPN22, and other loci with shared autoimmune associations[1, 2]. The Immunochip Consortium was formed to design a custom Illumina Infinium array that leveraged the remarkable genetic overlap of susceptibility loci identified across a range of autoimmune diseases. The custom array allows investigators to perform gene-finding and fine-mapping experiments in a co-ordinated manner. Full details have been described previously[3]. Briefly, the array consisted of all known single nucleotide polymorphisms (SNPs) from the 1000 Genomes Project as well as private resequencing efforts for 186 loci known to be involved in 12 autoimmune diseases. For these loci there is a unique opportunity to fine-map autoimmune disease associations. Additional SNPs were included as part of a deep replication effort. This provided the opportunity not only to identify novel rheumatoid arthritis associations with other autoimmune disease loci, or with variants with suggestive statistical evidence for association from a previous meta-analysis of GWAS data, but also to refine the GWAS signal and reduce the number of potential causal variants in the 31 confirmed non-HLA loci. We tested 129,464 polymorphic markers passing quality control, with a minor allele frequency >1%, in 11,475 cases (7,222 ACPA positive, 3,297 ACPA negative and 957 unassigned) and 15,870 controls (Table 1 and Supplementary Tables 1 and 2).
We performed analysis on the total rheumatoid arthritis dataset, and also in subsets stratified by ACPA status (Supplementary Table 3). We also had access to GWAS data for an additional 2,363 ACPA positive cases and 17,872 controls, independent of the current study (Table 1). We observed strong evidence of association for the previously identified susceptibility loci (Table 2 and Supplementary Tables 3 and 4).
Table 1

Sample Collections

Rheumatoid arthritis cases and controls for the Immunochip analysis were assembled from a number of different studies from 6 centres across 5 countries (online methods). Genotype data for additional samples analysed in previously published rheumatoid arthritis ACPA positive GWAS were available from 4 studies. Rheumatoid arthritis cases were classified as anti-citrullinated peptide antibody (ACPA) positive or ACPA negative. F:M = female: male.

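Platform | Collection | Cases: All | Cases: % Female | Cases: ACPA+ | Cases: ACPA− | Controls: All | Controls: % Female
Immunochip | UK | 3870 | 74 | 2406 | 1000 | 8430 | 53
Immunochip | Swedish EIRA | 2762 | 70 | 1762 | 987 | 1940 | 73
Immunochip | US | 2536 | 75 | 1803 | 593 | 2134 | 65
Immunochip | Dutch | 648 | 66 | 330 | 301 | 2004 | 42
Immunochip | Swedish Umea | 852 | 70 | 524 | 242 | 963 | 69
Immunochip | Spanish | 807 | 74 | 397 | 216 | 399 | 65
Immunochip | TOTAL | 11475 | 73 | 7222 | 3339 | 15870 | 57
GWAS | BRASS (US) | 479 | 82 | 479 | - | 1627 | 45
GWAS | Canada | 586 | 76 | 586 | - | 1553 | 54
GWAS | NARAC2 (US) | 746 | 48 | 746 | - | 6567 | 49
GWAS | WTCCC (UK) | 552 | 74 | 552 | - | 8125 | 46
GWAS | TOTAL | 2363 | 68 | 2363 | - | 17872 | 48
All | TOTAL | 13838 | 72 | 9585 | 3297 | 33742 | 52
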
Table 2

Non-HLA loci associated with rheumatoid arthritis at genome-wide significance level

Novel loci are shown with either the best SNP on Immunochip (Table 2a), if P<5×10−8, or the most associated SNP from the combined analysis of GWAS and Immunochip data (Table 2b).

SNP | Gene | Chr | MAF | Risk allele | P | OR | LD region (r2>0.9)* | Region size | Localization of LD region (r2>0.9) relative to nearest genes

Novel loci on Immunochip
rs34536443 (b) | TYK2 | 19p13 | 0.04 | G | 2.3 × 10−14 | 0.62 | 10,427,721-10,492,274 | 64.55 kb | 47.96 kb 5′ to exon 13 of RAVER1; complete ICAM3; complete TYK2
rs13397 (b) | IRAK1 | Xq28 | 0.12 | A | 1.2 × 10−12 | 1.27 | 153,196,345-153,248,248 | 51.9 kb | 5′ to exon 2 of TMEM187; HCFC1; 25 kb 3′ of IRAK1
rs8026898 (b) | TLE3 # | 15q23 | 0.29 | A | 1.4 × 10−10 | 1.17 | 69,984,462-70,010,647 | 26.19 kb | 329.48 kb 3′ of TLE3
rs8043085 (b) | RASGRP1 | 15q14 | 0.25 | A | 1.4 × 10−10 | 1.17 | 38,828,140-38,844,106 | 15.97 kb | Intron 2 of RASGRP1
rs2240336 (b) | PADI4 | 1p36 | 0.42 | A | 5.9 × 10−09 | 0.88 | 17,673,102-17,674,402 | 1.30 kb | Intron 9 of PADI4
rs8192284 (a) (rs2228145) | IL6R | 1q21 | 0.42 | C | 1.3 × 10−08 | 0.9 | 154,418,749-154,428,283 | 9.54 kb | Intron 6 to intron 9 of IL6R
rs13330176 (b) | IRF8 | 16q24 | 0.22 | A | 4.0 × 10−08 | 1.15 | 86,016,026-86,019,087 | 3.06 kb | 59.83 kb 3′ of IRF8

Novel loci adding GWAS data
rs12764378 (d) | ARID5B # | 10q21 | 0.23 | A | 4.5 × 10−10 | 1.14 | 63,786,554-63,800,004 | 13.45 kb | Intron 4 of ARID5B
rs9979383 (c) | RUNX1 # | 21q22 | 0.36 | G | 5.0 × 10−10 | 0.9 | 36,712,588-36,715,761 | 3.17 kb | 5′ region of RUNX1
rs12936409/rs2872507 (c) | IKZF3 | 17q12 | 0.47 | A | 2.8 × 10−09 | 1.1 | 37,912,377-38,080,912 | 168.54 kb | IKZF3; GSDMB; Intron 1 to 164.92 kb 3′ of ORMDL3
rs883220 (d) | POU3F1 # | 1p34 | 0.26 | A | 2.1 × 10−08 | 0.89 | 38,614,867-38,644,861 | 30.00 kb | 102.42 kb 5′ of POU3F1
rs2834512 (d) | RCAN1 # | 21q22 | 0.12 | A | 2.1 × 10−08 | 0.86 | 35,909,625-35,930,915 | 21.29 kb | Intron 1 of RCAN1
rs595158 (c) | CD5 | 11q12 | 0.49 | C | 3.4 × 10−08 | 1.09 | 60,888,001-60,922,634 | 34.63 kb | Intron 5 to 27.31 kb 3′ of CD5; Intron 1 to 9.73 kb 3′ of VPS37C
rs2275806 (d) | GATA3 # | 10p14 | 0.41 | G | 4.6 × 10−08 | 1.11 | 8,095,340-8,097,368 | 2.03 kb | 227 bp 5′ to exon 2 of GATA3

Known loci on Immunochip
rs2476601 (b) | PTPN22 | 1p13 | 0.09 | A | 7.5 × 10−77 | 1.78 | 114,303,808-114,377,568 | 73.76 kb | Complete RSBN1; Exon 14 to 52.62 kb 3′ of PTPN22
rs71624119 (a) | ANKRD55 | 5q11 | 0.25 | A | 5.6 × 10−20 | 0.81 | 55,440,730-55,442,249 | 1.52 kb | Intron 6 of ANKRD55
rs6920220 (b) | TNFAIP3 | 6q23 | 0.2 | A | 2.3 × 10−13 | 1.2 | 137,959,235-138,006,504 | 47.27 kb | 181.85 kb 5′ of TNFAIP3
rs932036 (a) | RBPJ | 4p15 | 0.3 | A | 2.0 × 10−10 | 1.14 | 26,085,480-26,128,710 | 43.23 kb | 36.37 kb 5′ of RBPJ
rs59466457 (b) | CCR6 | 6q27 | 0.44 | A | 2.7 × 10−10 | 1.15 | 167,526,096-167,540,842 | 14.75 kb | Intron 1 of CCR6
rs13426947 (a) | STAT4 | 2q32 | 0.19 | A | 7.2 × 10−10 | 1.15 | 191,900,449-191,935,804 | 35.36 kb | Intron 5 to 18 of STAT4
rs2812378 (b) | CCL21 | 9p13 | 0.34 | G | 7.2 × 10−10 | 1.15 | 34,707,373-34,710,338 | 2.97 kb | CCL21
rs6032662 (b) | CD40 | 20q13 | 0.24 | G | 1.4 × 10−09 | 0.86 | 44,730,245-44,747,947 | 17.70 kb | 16.67 kb 5′ to intron 1 of CD40
rs2843401 (b) | MMEL1 | 1p36 | 0.33 | A | 6.6 × 10−09 | 0.87 | 2,516,781-2,709,164 | 192.38 kb | Complete MMEL1, C1ORF93; TTC34
rs10209110 (a) | AFF3 | 2q11 | 0.49 | A | 1.1 × 10−08 | 0.9 | 100,640,432-100,730,111 | 89.68 kb | 5′ region to intron 2 of AFF3
rs34695944 (b) | REL | 2p16 | 0.37 | G | 2.6 × 10−08 | 1.13 | 61,072,664-61,164,331 | 91.67 kb | Complete REL
rs11571302 (b) | CTLA4 | 2q33 | 0.48 | A | 4.5 × 10−08 | 0.89 | 204,738,919-204,745,003 | 6.08 kb | 236 bp 3′ of CTLA4; 56.47 kb 5′ of ICOS
rs39984 (a) | GIN1 | 5q21 | 0.32 | A | 9.3 × 10−08 | 0.88 | 102,595,778-102,625,335 | 29.56 kb | Intron 1 to 10.97 kb 3′ of C5orf30; 139.92 kb 5′ of GIN1
rs35677470 (a) | DNASE1L3 | 3p14 | 0.08 | A | 1.7 × 10−07 | 1.19 | 58,181,499-58,183,636 | 2.14 kb | Exon 8 to intron 9 of DNASE1L3; 134.97 kb 5′ of PXK
rs3807306 (b) | IRF5 | 7q32 | 0.49 | C | 1.9 × 10−07 | 0.89 | 128,580,680-128,580,680 | 1 bp | Intron 1 of IRF5
rs3218251 (b) | IL2RB | 22q12 | 0.25 | A | 1.9 × 10−07 | 1.13 | 37,544,245-37,545,505 | 1.26 kb | Intron 1 of IL2RB
rs4938573 (b) | DDX6 | 11q23 | 0.18 | G | 5.3 × 10−07 | 0.87 | 118,662,993-118,745,884 | 82.89 kb | Complete SETP16; 1.14 kb 5′ of DDX6
rs6546146 (b) | SPRED2 | 2p14 | 0.38 | A | 8.0 × 10−07 | 0.9 | 65,556,324-65,598,300 | 41.98 kb | Intron 1 to intron 4 of SPRED2
rs629326 (b) | TAGAP | 6q25 | 0.41 | C | 1.1 × 10−06 | 0.9 | 159,489,791-159,496,713 | 6.92 kb | 23.61 kb 5′ of TAGAP
rs10739580 (b) | TRAF1 | 9q33 | 0.33 | G | 1.7 × 10−06 | 1.12 | 123,640,500-123,708,286 | 67.79 kb | Complete TRAF1
rs10795791 (a) | IL2RA | 10p15 | 0.4 | G | 3.0 × 10−06 | 1.09 | 6,106,266-6,108,340 | 2.08 kb | 1.93 kb 5′ of IL2RA
rs4840565 (a) | BLK | 8p23 | 0.27 | G | 3.9 × 10−06 | 1.1 | 11,338,383-11,352,485 | 14.10 kb | 13.13 kb 5′ to intron 1 of BLK
rs798000 (b) | CD2 | 1p13 | 0.34 | G | 6.2 × 10−06 | 1.11 | 117,280,696-117,280,696 | 1 bp | 16.31 kb 5′ of CD2
rs1980422 (f) | CD28 | 2q33 | 0.23 | G | 8.7 × 10−06 | 1.12 | 204,610,004-204,634,569 | 24.57 kb | 7.45 kb 3′ of CD28; 97.94 kb 5′ of CTLA4
rs2014863 (a) | PTPRC | 1q31 | 0.36 | C | 2.1 × 10−05 | 1.09 | 198,791,907-198,810,008 | 18.10 kb | 65.36 kb 3′ of PTPRC
rs10683701 (b) | KIF5A | 12q13 | 0.33 | - | 2.3 × 10−05 | 0.9 | 58,034,835-58,105,094 | 70.26 kb | snoU13; 52.90 kb 5′ to intron 5 of OS9; 54.42 kb 3′ of KIF5A
rs947474 (a) | PRKCQ | 10p15 | 0.17 | G | 2.5 × 10−05 | 0.9 | 6,390,450-6,390,450 | 1 bp | 78.66 kb 3′ of PRKCQ
rs10494360 (b) | FCGR2A | 1q23 | 0.12 | G | 3.0 × 10−05 | 1.14 | 161,463,876-161,480,649 | 16.77 kb | 11.34 kb 5′ to exon 5 of FCGR2A
rs6911690 (b) | PRDM1 | 6q21 | 0.12 | G | 1.2 × 10−04 | 0.87 | 106,435,981-106,508,640 | 72.66 kb | 25.55 kb 5′ of PRDM1
rs78560100 (a) | IL2-IL21 | 4q27 | 0.07 | C | 5.8 × 10−04 | 1.13 | 123,030,583-123,503,591 | 473.01 kb | KIAA1109; ADAD1; IL2; 30.19 kb 3′ of IL21
rs570676 (b) | TRAF6 | 11p12 | 0.38 | A | 2.1 × 10−03 | 0.93 | 36,486,064-36,519,624 | 33.56 kb | Intron 3 to 22.51 kb 3′ of TRAF6

Previously identified loci are shown with the most significantly associated SNP on Immunochip (Table 2c).

a: data from all rheumatoid arthritis samples on Immunochip.
b: data from Immunochip for ACPA positive individuals.
c: data from adding GWAS samples and all rheumatoid arthritis Immunochip data.
d: data from ACPA positive Immunochip and GWAS data.
*: co-ordinates based on GRCh37 assembly.
#: region not included for dense mapping on Immunochip.

We identified fourteen novel rheumatoid arthritis loci for populations of European ancestry (TYK2, IRAK1, TLE3, RASGRP1, PADI4, IL6R, IRF8, ARID5B, IKZF3, RUNX1, POU3F1, RCAN1, CD5, GATA3) at genome-wide levels of significance (P<5×10−8) (Figure 1): 7 with Immunochip data alone (Table 2) and a further 7 when Immunochip data were combined with the GWAS meta-analysis data (Table 2). These loci add 4% to the estimate of heritability explained by confirmed loci, bringing the total to 51%, of which HLA explains 36%. When we removed all known loci from the Immunochip data, we still observed an excess of nominally associated alleles, consistent with the possibility that there are many additional undiscovered alleles[4] (Supplementary Figure 1). Interestingly, if a study-wide significance threshold of 9.0×10−7 is applied (calculated from the number of effective independent tests when accounting for linkage disequilibrium (LD)), significant association is also observed at two additional loci: ELMO1 (rs75351767, Pall=2.94×10−7) and BACH2 (rs72928038, Pall=8.23×10−7) (Supplementary Table 3). A further 8 loci are implicated at suggestive levels of significance (P<1×10−5) in either the full or ACPA positive sub-group analysis, including PTPN2 (rs62097857, Pall=4.4×10−6), TNIP1 (rs6579837, Ppos=1.7×10−6) and TNFSF4 (rs61828284, Ppos=5.4×10−6) (Supplementary Table 3).
Figure 1

Manhattan plot of association statistics highlighting all autosomal loci associated to rheumatoid arthritis in the study

P values of association to ACPA positive rheumatoid arthritis from the meta-analysis of the Immunochip and GWAS data are shown. Known and new rheumatoid arthritis associated loci are shown in red and black respectively. Three associated loci (identified by a *) only reach P<5×10−8 when ACPA positive and ACPA negative cases are included in the analysis. The dashed grey line indicates genome-wide significance (P=5×10−8).

Previously, we have fine-mapped MHC associations observed in GWAS data of partially overlapping samples by applying imputation of HLA classical alleles and amino acids[5]. The Immunochip platform includes denser SNP coverage within the MHC region, which facilitates more accurate imputation. In a preliminary analysis applying the same imputation and fine-mapping approach to ACPA positive cases and controls typed on Immunochip, we observed the same associations that we reported previously. The most significant polymorphic nucleotide was again rs17878703, mapping to position 11 of the HLA-DRB1 peptide sequence (p<10−677). Testing individual amino acid positions within HLA-DRB1 revealed the strongest association at position 11 (p<10−745); conditioning on the position 11 effect, we observed association at position 71 (p=6×10−60); finally, conditioning on effects at both positions 11 and 71, we observed significant association at position 74 (p=7×10−19). Adjusting for all HLA-DRB1 alleles to identify independent effects outside this gene, we observed significant associations at HLA-B corresponding to the presence of aspartate at position 9 in the peptide sequence (p=1×10−17). Adjusting for all HLA-DRB1 alleles and Asp-9 in HLA-B, we observed associations at HLA-DPB1 corresponding to the presence of phenylalanine at position 9 in the peptide sequence (p=1×10−17). While it has been demonstrated that ACPA positive and ACPA negative disease have different allelic associations at the MHC and at PTPN22[6], previous studies have not been powered to address this issue definitively at additional non-MHC loci. Here we analysed 3,297 ACPA negative cases and identified association at genome-wide significance to ANKRD55 (rs71624119, p=5.2×10−12, OR=0.78) in addition to HLA (rs4143332, p=2.9×10−15, OR=1.37) (Supplementary Table 3). Strikingly, ANKRD55 has a similar effect in ACPA negative disease as in ACPA positive disease.
Comparing association in ACPA positive and negative subgroups, we see that for the 45 non-HLA loci, around half show a significantly larger effect size in ACPA positive disease (comparison of OR, p<0.05), with 5 of these loci having a markedly stronger association with this form of disease (PTPN22, CCR6, CD40, RASGRP1 and TAGAP). Eleven loci show no statistical difference in association to either form of rheumatoid arthritis (Supplementary Table 5). This preliminary analysis indicates that differences in the serological subtype of disease may well be reflected in a difference in genetic predisposition, potentially providing a basis for stratified medicine. The majority of the 14 new loci associated with rheumatoid arthritis susceptibility, along with previously confirmed loci, were found to contain genes strongly linked to immune function using GRAIL analysis (Supplementary Table 6 and Supplementary Figure 2), for example, CD5, IRF8 and TYK2. We also report novel association with IRAK1, previously associated with systemic lupus erythematosus (SLE)[7]. This is the first X chromosome locus associated with rheumatoid arthritis, and is of relevance given the female predominance of both diseases (9:1 and 3:1 ratio of females:males in SLE and rheumatoid arthritis, respectively). Interestingly, this locus has been shown to occasionally escape X inactivation in female cells[8]. Three of the novel loci confirmed here for the first time in samples of European ancestry have previously been associated in either samples of East Asian ancestry (PADI4, ARID5B) or when using a multiethnic approach (IKZF3)[9-11]. The SNPs associated in this study are moderately correlated with those identified in samples of East Asian origin: PADI4 SNPs rs2240336 and rs766449, r2=0.25, D′=1; ARID5B SNPs rs12764378 and rs10821944, r2=0.52, D′=0.86.
PADI genes are involved in the citrullination of peptides and as such are strong candidates for involvement in disease, given the presence of ACPA auto-antibodies. Although the association at PADI4 (rs2240336) is greater in ACPA positive disease (OR=0.88 P=6.49×10−9) compared to ACPA negative cases (OR=0.93, P=0.01) our formal test comparing OR did not show a statistically significant difference (P=0.14). We applied conditional logistic regression to test for secondary effects within each locus. In 6 non-HLA loci (13%) (TNFAIP3, CD28, REL, STAT4, TYK2, RASGRP1) we observed additional independent association signals (Supplementary Figure 3). In total we observed 51 independent risk alleles in 45 non-HLA rheumatoid arthritis loci. To test the possibility that the two risk alleles tag an untyped SNP, we carried out haplotype analysis of the six loci but found no evidence for haplotype specific effects at any locus (Supplementary Table 7). At only four loci, REL, CD28, TYK2 and TNFAIP3, did we observe associations with low frequency variants (MAF<0.05) (Supplementary Table 8). Out of the 46 rheumatoid arthritis loci, 39 were densely genotyped by Immunochip. For 12 loci we observed that the most strongly associated SNP was not tightly linked to the previously reported leading SNP at that locus, shifting the association signal (Supplementary Table 9). For the 39 confirmed non-HLA rheumatoid arthritis loci on Immunochip, dense mapping refines the association to a single gene for 19 loci (Supplementary Table 10). Our analysis also identified 7 non-synonymous SNPs within exonic regions (Table 3), as well as a number showing strong regulatory potential (Supplementary Table 11), that are highly correlated (r2>0.9) with the lead SNP and which are strong candidates for the aetiological variant. 
The most associated SNP at the IL6R locus (rs8192284) is non-synonymous, shows high correlation with circulating IL6R levels and, as well as being associated with a decreased risk of coronary heart disease[12, 13], is in strong LD (r2=0.97, D′=1) with the SNP recently reported to be associated with asthma (rs4129267)[14]. Interestingly, the risk allele at the asthma associated SNP (OR=1.09, p=2.4×10−8) is protective for rheumatoid arthritis (OR=0.9, p=1.3×10−8). The protein encoded by IL6R, the IL6 receptor, is the target of the biologic drug tocilizumab, which has been shown to be an effective treatment for rheumatoid arthritis. Abatacept, another biologic drug with therapeutic efficacy in clinical trials, targets another rheumatoid arthritis susceptibility gene, CTLA4. These examples highlight the potential for targeting genes within risk loci.
Table 3

Potential causal exonic SNPs located by Immunochip dense genotyping. Conservation is by phastCons17way, study 99th percentile = 0.998; 95th percentile = 0.367. An essential splice site is a splice donor variant within the 2 base pair region at the 5′ end of an intron. A splice site is a sequence variant within 1-3 base pairs of the exon or 3-8 base pairs of the intron.

Chr | POS | Gene | SNP | MAF | r2 with lead SNP | Location | Allele | Amino acid change | Polyphen | SIFT | Conservation
1 | 2,535,613 | MMEL1 | rs4648562 | 0.33 | 1 | Essential splice site | A | - | - | - | 1
1 | 114,377,568 | PTPN22 | rs2476601 | 0.12 | lead | Non-synonymous coding | A | Arg620Trp | benign | tolerated | 0.992
1 | 154,426,970 | IL6R | rs2228145 | 0.38 | lead | Non-synonymous coding | C | Asp358Ala | benign | tolerated | 0.008
3 | 58,183,636 | DNASE1L3 | rs35677470 | 0.09 | lead | Non-synonymous coding | A | Arg206Cys | probably damaging | deleterious | 0.992
11 | 60,893,235 | CD5 | rs2229177 | 0.47 | 0.96 | Non-synonymous coding | C | Ala471Val | probably damaging | deleterious | 0.835
19 | 10,449,358 | ICAM3 | rs7258015 | 0.23 | 1 | Non-synonymous coding/Splice site | C | Arg115Gly | benign | tolerated | 0
19 | 10,463,118 | TYK2 | rs34536443 | 0.04 | lead | Non-synonymous coding | G | Pro1104Ala | probably damaging | deleterious | 0.189

Testing for statistical interactions between the 46 lead SNPs in confirmed rheumatoid arthritis loci revealed preliminary evidence for 6 significant pairwise interactions after Bonferroni correction (p<5×10−4) (Supplementary Table 12). The GATA3-PRKCQ interaction is supported by earlier biological observations[15]. From 38 rheumatoid arthritis associated SNPs or proxies assessed for eQTL analysis, 18 showed an eQTL effect on at least one probe, giving a total of 51 SNP-probe combinations with significant eQTL effect (Supplementary Table 13). From these 18 SNPs, 11 showed an independent or primary eQTL effect on one or more probes (20 SNP-probe combinations), whereas 7 SNPs were not significant after conditioning on the strongest eQTL signal in the locus. Using a previously described approach, we assessed whether the 46 independent rheumatoid arthritis associated regions, defined by previously known and novel SNP associations discovered here, harboured genes that were specifically expressed in distinct immune cell-types[16]. We observed in a large expression data set of 223 sorted mouse immune cells[17] that genes in these regions were most significantly specifically expressed in CD4+ effector memory T-cell subsets (p<10−7) (Supplementary Figure 4). Of the diseases sharing susceptibility loci with rheumatoid arthritis, systematic fine mapping has only been published, to date, for celiac disease[3]. Previously the two diseases were found to share 6 confirmed non-HLA loci (MMEL1, REL, CD28/CTLA4, TNFAIP3, TAGAP and IL2/21)[2]; Immunochip data now identify an additional 4 confirmed loci common to both diseases (DDX6, STAT4, PRKCQ and IRAK1) and a further 4 potential rheumatoid arthritis loci (BACH2, p=8.2×10−7; ELMO1, p=2.9×10−7; PTPN2, p=4.4×10−6; PVT1, p=2×10−5) in common with confirmed celiac disease loci (Supplementary Table 14).
Of the ten rheumatoid arthritis/celiac disease loci, 4 share the same lead SNP (CD28, IL2/21, TNFAIP3 and IRAK1) and a fifth (MMEL1) shares highly correlated SNPs (r2>0.88) and, for all of these variants, the risk allele is the same for both diseases. For two loci (PRKCQ and DDX6), the lead SNPs are only moderately correlated (r2>0.62), with the minor allele being protective in both diseases. The effects in STAT4 appear quite different, with 3 independent effects in celiac disease and two different independent associations in rheumatoid arthritis. The strongest association signal for risk of celiac disease at TAGAP is with the minor allele of a SNP (rs182429) in moderate LD (r2=0.44) with the rheumatoid arthritis risk SNP (rs629326). Indeed, when considering overlap of rheumatoid arthritis susceptibility loci with other autoimmune diseases, only the PADI4 and CCL21 non-MHC loci currently show unique association, suggesting that they may be important in determining that the autoimmune reaction is directed at synovial joints. In summary, through fine mapping on a custom-made array designed to capture variation across a number of loci associated with autoimmune diseases, we have identified 14 novel European ancestry rheumatoid arthritis loci, refined the peak of association to a single gene at 19 loci, identified 7 SNPs which might potentially be functional, found independent effects at 6 loci and detected association with SNPs with low MAF (<0.05) at 4 loci. In one third of cases, imputation of GWAS signals without fine-mapping would have implicated a different genetic region as being disease causal, illustrating the importance of dense fine-mapping analysis prior to embarking on expensive functional studies.

Methods

Genotyping

All samples were genotyped for the Immunochip custom array in accordance with Illumina protocols at six centres: UK (Sanger Centre, Hinxton, Cambridge, UK and the University of Virginia, USA), US and Spain (Feinstein Institute, New York, USA), Sweden EIRA (The Genome Institute, Singapore), Sweden Umea (Department of Medical Sciences, SNP&SEQ Technology Platform, Uppsala University Hospital, Uppsala, Sweden) and The Netherlands (Department of Genetics, University Medical Centre Groningen).

Genotype calling and quality control

Genotype calling was performed on all samples at The University of Manchester as a single project using the Genotyping Module (v1.8.4) of the GenomeStudio Data Analysis Software package. Initial genotype clustering was performed using the default Illumina cluster file (Immunochip_Gentrain_June2010.egt) and manifest file Immuno_BeadChip_11419691_B.bpm (NCBI build 36) using the GenTrain2 clustering algorithm. Poor performing samples (call rate < 0.90), labelled duplicates (selection informed by 10th percentile GenCall score (p10 GC)) and samples identified post-genotyping as inappropriate for inclusion were excluded at this point (Supplementary Table 1). Automated reclustering was performed on all remaining samples to calibrate clusters on the study sample set. Poor quality assays were excluded prior to downstream quality control processes by extensive manual review of clustering performance. A subset of good quality SNPs was identified based on the ranking of quality metrics: cluster separation (<0.4), signal intensity (<1.0), call rate (<0.98) and allele frequency. In addition, SNPs that mapped to the Y chromosome or mitochondria, were non-polymorphic, were duplicates, or were zeroed in the default Illumina cluster file were also excluded. This resulted in a dataset of 165,549 good quality SNPs (Supplementary Table 2). To facilitate the meta-analysis and reduce differential missingness, each of the six population datasets was processed as a discrete entity. SNPs were excluded from each of the datasets if they had a call rate < 0.99 (cases or controls) or a MAF < 0.01, or if they deviated from HWE (p < 5.7×10−7). Samples were excluded if they had a call rate < 0.99 or were identified as outliers based on autosomal heterozygosity (Supplementary Table 3 and 4). Samples were also excluded if they were considered to be outliers based on ethnicity inferred by principal component analysis (PCA).
PCA was performed using EIGENSOFT v4.2 with HapMap phase 2 samples as reference populations on a subset of SNPs with a MAF > 0.05 and filtered to minimise inter-marker LD (excluding the MHC region, 23 regions of high LD and previously confirmed rheumatoid arthritis susceptibility regions) (Supplementary Figure 1). Cryptic relatedness was assessed within each dataset by calculating identity-by-descent (IBD) in PLINK v1.07 with the PCA SNP set. A single sample from any related pair (PI_HAT > 0.1875) was removed from the analysis (informed by call rate). In addition, IBD was inferred across all six datasets to exclude cross-dataset related individuals (Supplementary Table 5). The genomic control inflation factor (λGC) was calculated within each Immunochip dataset using SNPs included as deep replication for a study investigating the genetic basis for reading and writing ability (submitted by J.C. Barrett). This set of SNPs was filtered as described for the PCA SNP set, leaving a total of 1,469 SNPs distributed evenly across the genome. The λGC for the datasets was estimated at: 1.07 (UK), 1.03 (US), 0.97 (SE-E), 0.94 (SE-U), 1.12 (NL), and 1.10 (ES). Using the same SNPs to estimate λGC1000, where the factor is scaled to the equivalent of 1000 cases and 1000 controls, in the Immunochip meta-analysis resulted in a rescaled λ of 1.02 (1.23 without rescaling). All novel findings remained significantly associated when including gender and λGC as covariates in the analysis (Supplementary Table 15).
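
The rescaling of λGC to 1000 cases and 1000 controls can be illustrated with a short sketch. It assumes the commonly used formula λ1000 = 1 + (λ − 1) × (1/n_cases + 1/n_controls) / (1/1000 + 1/1000); the text does not state which formula was applied, and the function name lambda_1000 is hypothetical. With the Immunochip totals quoted above (11,475 cases, 15,870 controls) and the unscaled value of 1.23, this formula gives a value that rounds to the reported 1.02.

```python
def lambda_1000(lambda_gc, n_cases, n_controls):
    """Rescale a genomic-control inflation factor to 1000 cases and 1000 controls.

    Assumption: the standard sample-size rescaling is used; the paper does not
    spell out its exact formula.
    """
    # Ratio of the study's sampling-variance term to that of a 1000/1000 study.
    scale = (1.0 / n_cases + 1.0 / n_controls) / (1.0 / 1000 + 1.0 / 1000)
    return 1.0 + (lambda_gc - 1.0) * scale


# Sample sizes and unscaled lambda are the values quoted in the text.
rescaled = lambda_1000(1.23, 11475, 15870)
print(round(rescaled, 2))
```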

Immunochip meta-analysis

Association statistics were calculated in each dataset using logistic regression under an additive model (SNPs coded 0, 1 or 2 with respect to minor allele dosage) and incorporating the top ten principal components as covariates. Odds ratios and standard errors were combined across the six datasets using inverse-variance meta-analysis assuming a fixed effect.
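
As a minimal sketch of the fixed-effect inverse-variance combination described above, the following assumes that a log odds ratio and standard error per dataset are already available from the logistic regressions. The function name and the example numbers are illustrative only and are not results from this study.

```python
import math


def fixed_effect_meta(log_ors, std_errs):
    """Combine per-dataset log odds ratios by inverse-variance weighting.

    Returns the pooled log OR, its standard error, the z statistic and a
    two-sided p-value from the normal approximation.
    """
    weights = [1.0 / se ** 2 for se in std_errs]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log_ors)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return pooled, pooled_se, z, p_value


# Hypothetical per-dataset estimates for one SNP across six datasets.
betas = [0.15, 0.12, 0.18, 0.10, 0.14, 0.16]
ses = [0.03, 0.04, 0.04, 0.08, 0.07, 0.08]
beta, se, z, p = fixed_effect_meta(betas, ses)
print(f"pooled OR = {math.exp(beta):.3f}, p = {p:.2e}")
```

Datasets with smaller standard errors receive proportionally more weight, so the largest collections dominate the pooled estimate.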

Independent effects

Initial evidence for secondary effects was assessed at each of the previously known and newly identified loci using a forward stepwise logistic regression. The index SNP at each region was included as a covariate and the association statistics were re-calculated for the remaining test SNPs. This process was repeated until no SNPs reached the minimum level of significance. The criteria for declaring an independent effect were: p-value < 5×10−4; not highly correlated with the index SNP; and a conditioned p-value that does not differ substantially from the unconditioned value. We next tested whether the two-SNP model fitted the risk at the locus significantly better than the one-SNP model using a likelihood ratio test. The effect estimates for each two-SNP haplotype were calculated by including indicator variables for carriage of haplotypes. The indicator variables were constructed by phasing the genotype data for each region satisfying the above criteria using the SHAPEIT algorithm[18].
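
The comparison of the two-SNP and one-SNP models can be sketched as a likelihood ratio test on the fitted log-likelihoods. This sketch assumes that the two-SNP model adds exactly one parameter (one degree of freedom) and that the log-likelihoods come from logistic regressions fitted elsewhere; the function name and the numbers are hypothetical.

```python
import math


def likelihood_ratio_test_1df(loglik_one_snp, loglik_two_snp):
    """Likelihood ratio test for nested models differing by one parameter.

    The statistic 2 * (llf_two - llf_one) is referred to a chi-square
    distribution with 1 degree of freedom, whose upper tail equals
    erfc(sqrt(x / 2)).
    """
    stat = max(2.0 * (loglik_two_snp - loglik_one_snp), 0.0)
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods from the one-SNP and two-SNP fits.
stat, p = likelihood_ratio_test_1df(-15230.4, -15221.9)
print(f"LRT statistic = {stat:.1f}, p = {p:.2e}")
```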

GWAS meta-analysis

GWAS case-control collections were previously described[1]. Six collections were included in the present study: BRASS, CANADA, EIRA, NARAC1, NARAC2, WTCCC. After quality control and data filtering, the datasets were imputed using IMPUTE with haplotype-phased HapMap Phase 2 European CEU founders as a reference panel[19]. We used IBS estimates to remove related samples across the Immunochip and GWAS collections, using GWAS genotype data instead of imputed data. In each of the twelve collections, we selected a set of SNPs with missing-genotype rate<0.5%, minor allele frequency>5% and Hardy-Weinberg PHWE>5×10−7. Then, we extracted SNPs that passed these filters and were shared between the 12 collections. After further LD pruning and resolving flipping issues, the data from the 12 collections were merged to calculate the IBS statistics. When related samples were identified (siblings or duplicates), the sample from the GWAS collection was removed to preferentially keep Immunochip data in the subsequent association analyses. Filtering and IBS calculation were performed using PLINK[20]. Two GWAS datasets, EIRA and NARAC1, were excluded because of strong overlap (>90% rheumatoid arthritis cases) with the Immunochip SE-E and US collections, respectively. This resulted in a total sample size of 13,838 rheumatoid arthritis cases and 33,742 controls, distributed in 10 collections (Table 1). The software SNPTEST v2.2 was used to conduct logistic regression analysis of rheumatoid arthritis case-control status in each GWAS collection, conditioning upon the first 5 eigenvectors from PCA, and after excluding SNPs with low statistical information (info score<0.7) or MAF<1%. We also excluded SNPs that were not represented in the filtered Immunochip data. The λGC for the individual datasets was estimated at: 1.04 (BRASS), 1.02 (CANADA), 1.04 (NARAC2) and 1.05 (WTCCC).
There was a slight inflation in λGC in these cohorts when using the 1,469 SNPs included on Immunochip to investigate the genetic basis of reading and writing ability: 1.11 (BRASS), 1.15 (CANADA), 1.07 (NARAC2) and 1.05 (WTCCC). We conducted an inverse-variance weighted meta-analysis to combine the results across the 10 collections. We also computed Cochran’s Q statistics and I2 statistics to assess heterogeneity across collections. Meta-analysis and heterogeneity statistics computation was adapted from the MANTEL program[21].
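
The heterogeneity statistics mentioned above can be sketched as follows, using the standard definitions of Cochran's Q (the weighted sum of squared deviations from the pooled estimate) and I2 = max(0, (Q − df)/Q) × 100. This is an illustration and not the MANTEL implementation; the function name and inputs are hypothetical.

```python
def heterogeneity(log_ors, std_errs):
    """Cochran's Q, its degrees of freedom, and I2 (percent) across collections."""
    weights = [1.0 / se ** 2 for se in std_errs]
    pooled = sum(w * b for w, b in zip(weights, log_ors)) / sum(weights)
    # Q: inverse-variance weighted squared deviations from the pooled effect.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, log_ors))
    df = len(log_ors) - 1
    # I2: share of total variation attributable to between-collection heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, df, i2


# Hypothetical log odds ratios and standard errors for one SNP in four collections.
q, df, i2 = heterogeneity([0.10, 0.22, 0.05, 0.18], [0.05, 0.06, 0.05, 0.07])
print(f"Q = {q:.2f} on {df} df, I2 = {i2:.0f}%")
```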

Serological subtype statistical analysis

Multinomial logistic regression was applied to compute odds ratios (OR), 95% confidence intervals and P values for association between the minor allele at every locus and either ACPA-positive (OR_ACPA-positive) or ACPA-negative (OR_ACPA-negative) rheumatoid arthritis, assuming additivity on the log-odds scale (i.e. every locus was coded as 0, 1 or 2, corresponding to the copy number of the minor allele). The minor allele was defined according to the allele frequency in the total population, including cases and controls. To test for differences between OR_ACPA-positive and OR_ACPA-negative, the linear combination β+ − β−, where β+ is log(OR_ACPA-positive) and β− is log(OR_ACPA-negative), was calculated along with its standard error. This enables a P value for the difference in association to be calculated.
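The test of the linear combination can be sketched as follows. The text does not name the test statistic, so this example assumes a Wald test with a two-sided normal P value. The covariance argument is also an assumption of the sketch: both estimates come from the same multinomial model with shared controls, so they are generally correlated. The function name and arguments are hypothetical.

```python
import math

def or_difference_test(beta_pos, beta_neg, var_pos, var_neg, cov=0.0):
    """Wald test of beta_pos - beta_neg (hypothetical helper).

    beta_pos, beta_neg: log odds ratios for ACPA-positive and ACPA-negative disease
    var_pos, var_neg:   variances of the two estimates
    cov:                covariance between the two estimates
    """
    diff = beta_pos - beta_neg
    # Variance of a difference: Var(a) + Var(b) - 2 Cov(a, b).
    var = var_pos + var_neg - 2.0 * cov
    if var <= 0:
        raise ValueError("variance of the difference must be positive")
    se = math.sqrt(var)
    z = diff / se
    # Two-sided P value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return diff, se, z, p
```

In practice the variances and covariance would be read from the estimated covariance matrix of the fitted multinomial model.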

GRAIL analysis

We performed GRAIL analysis (http://www.broadinstitute.org/mpg/grail/grail.php) using HG18 and Dec2006 PubMed datasets, default settings and the 46 genome-wide significant rheumatoid arthritis susceptibility loci (most associated SNP) as seeds.

Interaction analysis

We performed an analysis of epistasis using the most significantly associated SNP from each of the 46 loci (Table 2). Logistic regression was performed in PLINK to model epistasis in each of the six datasets, with the top 10 principal components (PCs) included as covariates. For each pair of SNPs, a likelihood ratio test was employed to compute the P value of the interaction term in each dataset. Epistasis results were combined using METAL and Bonferroni corrected.
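To make the scale of the multiple-testing burden concrete, the sketch below counts the SNP pairs, derives a Bonferroni threshold and computes a one-degree-of-freedom likelihood ratio test P value from two log-likelihoods. It assumes that all pairs of the 46 lead SNPs were tested and that the family-wise error rate was 0.05; neither is stated in the text. The function names are hypothetical and this is not PLINK or METAL code.

```python
import math

def n_pairs(n_snps):
    """Number of unordered SNP pairs."""
    return n_snps * (n_snps - 1) // 2

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests

def lrt_pvalue_1df(loglik_null, loglik_full):
    """Likelihood ratio test for one extra parameter (the interaction term).

    loglik_null: log-likelihood of the model without the interaction term
    loglik_full: log-likelihood of the model with the interaction term
    """
    stat = max(0.0, 2.0 * (loglik_full - loglik_null))
    # Survival function of a chi-square distribution with 1 df.
    return math.erfc(math.sqrt(stat / 2.0))

pairs = n_pairs(46)                            # 1035 pairs among 46 SNPs
threshold = bonferroni_threshold(0.05, pairs)  # assumed alpha of 0.05
```

Under these assumptions a pair would need a combined P value below roughly 4.8×10⁻⁵ to survive correction.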

eQTL analysis

eQTL analysis was performed on peripheral blood from 1,469 unrelated individuals (1,240 samples run on the Illumina HT12v3 platform and 229 samples run on the Illumina H8v2 platform) from the United Kingdom and the Netherlands. Details of the eQTL analysis have been described previously[22]. In short, we assessed the effect of all rheumatoid arthritis associated SNPs (Table 2) on the expression of genes located within 250 kb on either side of the SNP (cis-eQTLs). All individuals in the eQTL study were genotyped on the Illumina Hap300K platform and then imputed to HapMap 2 using Impute 2.0 software. Since not all SNPs from the Illumina Immunochip platform were genotyped or imputed in the 1,469 eQTL samples, we used the following strategy (Supplementary Figure 5). First, we investigated whether the SNP was present in the eQTL data and had passed QC for eQTL mapping (MAF ≥ 5%, HWE P value ≥ 0.001, call rate ≥ 95%). Of the 50 rheumatoid arthritis SNPs, 26 were present in the HapMap-imputed datasets and were directly assessed for eQTL effects (Supplementary Table 13). For the other 24 SNPs, which were not present in our HapMap-imputed data, we checked whether the rheumatoid arthritis SNP was available in the 1000 Genomes database. If so, we queried all SNPs within 10 Mb of the rheumatoid arthritis SNP that were also present in the eQTL data and would pass the eQTL QC measures, and picked the SNP in highest LD with the rheumatoid arthritis SNP that was present in HapMap after QC. A threshold of r² > 0.8 was used for the LD. For 12 SNPs, no proxy met our criteria, and these SNPs were not included in the eQTL analysis. For the remaining 12 SNPs, the best proxy SNP was included in the eQTL table (Supplementary Figure 5). We also performed a cis-eQTL analysis for the top associated gene expression probe, as well as two conditional analyses: (1) conditioning on the effect of the rheumatoid arthritis SNP (gSNP), and (2) conditioning on the effect of the top eQTL SNP (eSNP) (Supplementary Table 13).
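The proxy-selection strategy described above can be sketched as follows. The QC cut-offs and the r² threshold are those given in the text; the `Snp` record, its field names and the function names are hypothetical, and the candidate list is assumed to have been restricted already to SNPs within the search window that are present in the eQTL data.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Snp:
    snp_id: str
    maf: float             # minor allele frequency in the eQTL samples
    hwe_p: float           # Hardy-Weinberg equilibrium P value
    call_rate: float       # genotype call rate
    r2_with_target: float  # LD (r^2) with the rheumatoid arthritis SNP

def passes_eqtl_qc(snp: Snp) -> bool:
    """QC for eQTL mapping: MAF >= 5%, HWE P >= 0.001, call rate >= 95%."""
    return snp.maf >= 0.05 and snp.hwe_p >= 0.001 and snp.call_rate >= 0.95

def pick_proxy(candidates: List[Snp], min_r2: float = 0.8) -> Optional[Snp]:
    """Return the QC-passing candidate in highest LD with the target SNP.

    Returns None when no candidate exceeds the r^2 threshold, in which case
    the target SNP would be left out of the eQTL analysis.
    """
    eligible = [s for s in candidates
                if passes_eqtl_qc(s) and s.r2_with_target > min_r2]
    return max(eligible, key=lambda s: s.r2_with_target, default=None)
```

Returning `None` rather than the best sub-threshold candidate mirrors the rule that SNPs without an adequate proxy were excluded.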
The rheumatoid arthritis associated SNP was labelled as having a primary effect on gene expression if it was either the top eQTL in the locus or a good proxy of the top eQTL SNP (r² > 0.8). It was labelled as an independent eQTL if it showed an effect after conditioning on the primary eQTL. Of the 20 rheumatoid arthritis SNPs that showed an eQTL effect, 13 had either an independent or a primary eQTL effect on one or more probes (22 SNP-probe combinations). A further 7 SNPs were not significant after conditioning on the strongest eQTL signal in the locus, suggesting that they are not primary eQTLs.
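The labelling rule can be written as a small decision function. This is a sketch of the logic as described here, not the analysis code; the function name, argument names and the "not primary" label string are hypothetical.

```python
def label_eqtl(is_top_eqtl: bool,
               r2_with_top_eqtl: float,
               significant_after_conditioning: bool,
               min_r2: float = 0.8) -> str:
    """Classify a disease-associated SNP for one expression probe.

    is_top_eqtl:                    the SNP is itself the top eQTL in the locus
    r2_with_top_eqtl:               LD (r^2) between the SNP and the top eQTL SNP
    significant_after_conditioning: the SNP still shows an effect after
                                    conditioning on the primary eQTL
    """
    # Primary: the top eQTL itself, or a good proxy for it.
    if is_top_eqtl or r2_with_top_eqtl > min_r2:
        return "primary"
    # Independent: an effect remains once the primary eQTL is accounted for.
    if significant_after_conditioning:
        return "independent"
    # Otherwise the signal is explained by the strongest eQTL in the locus.
    return "not primary"
```

The primary check is applied first, so a SNP in strong LD with the top eQTL is never labelled independent.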
  22 in total

1.  Genome-wide association study meta-analysis identifies seven new rheumatoid arthritis risk loci.

Authors:  Eli A Stahl; Soumya Raychaudhuri; Elaine F Remmers; Gang Xie; Stephen Eyre; Brian P Thomson; Yonghong Li; Fina A S Kurreeman; Alexandra Zhernakova; Anne Hinks; Candace Guiducci; Robert Chen; Lars Alfredsson; Christopher I Amos; Kristin G Ardlie; Anne Barton; John Bowes; Elisabeth Brouwer; Noel P Burtt; Joseph J Catanese; Jonathan Coblyn; Marieke J H Coenen; Karen H Costenbader; Lindsey A Criswell; J Bart A Crusius; Jing Cui; Paul I W de Bakker; Philip L De Jager; Bo Ding; Paul Emery; Edward Flynn; Pille Harrison; Lynne J Hocking; Tom W J Huizinga; Daniel L Kastner; Xiayi Ke; Annette T Lee; Xiangdong Liu; Paul Martin; Ann W Morgan; Leonid Padyukov; Marcel D Posthumus; Timothy R D J Radstake; David M Reid; Mark Seielstad; Michael F Seldin; Nancy A Shadick; Sophia Steer; Paul P Tak; Wendy Thomson; Annette H M van der Helm-van Mil; Irene E van der Horst-Bruinsma; C Ellen van der Schoot; Piet L C M van Riel; Michael E Weinblatt; Anthony G Wilson; Gert Jan Wolbink; B Paul Wordsworth; Cisca Wijmenga; Elizabeth W Karlson; Rene E M Toes; Niek de Vries; Ann B Begovich; Jane Worthington; Katherine A Siminovitch; Peter K Gregersen; Lars Klareskog; Robert M Plenge
Journal:  Nat Genet       Date:  2010-05-09       Impact factor: 38.330

2.  PLINK: a tool set for whole-genome association and population-based linkage analyses.

Authors:  Shaun Purcell; Benjamin Neale; Kathe Todd-Brown; Lori Thomas; Manuel A R Ferreira; David Bender; Julian Maller; Pamela Sklar; Paul I W de Bakker; Mark J Daly; Pak C Sham
Journal:  Am J Hum Genet       Date:  2007-07-25       Impact factor: 11.025

3.  Identification of IRAK1 as a risk gene with critical role in the pathogenesis of systemic lupus erythematosus.

Authors:  Chaim O Jacob; Jiankun Zhu; Don L Armstrong; Mei Yan; Jie Han; Xin J Zhou; James A Thomas; Andreas Reiff; Barry L Myones; Joshua O Ojwang; Kenneth M Kaufman; Marisa Klein-Gitelman; Deborah McCurdy; Linda Wagner-Weiner; Earl Silverman; Julie Ziegler; Jennifer A Kelly; Joan T Merrill; John B Harley; Rosalind Ramsey-Goldman; Luis M Vila; Sang-Cheol Bae; Timothy J Vyse; Gary S Gilkeson; Patrick M Gaffney; Kathy L Moser; Carl D Langefeld; Raphael Zidovetzki; Chandra Mohan
Journal:  Proc Natl Acad Sci U S A       Date:  2009-03-27       Impact factor: 11.205

4.  Meta-analysis identifies nine new loci associated with rheumatoid arthritis in the Japanese population.

Authors:  Yukinori Okada; Chikashi Terao; Katsunori Ikari; Yuta Kochi; Koichiro Ohmura; Akari Suzuki; Takahisa Kawaguchi; Eli A Stahl; Fina A S Kurreeman; Nao Nishida; Hiroko Ohmiya; Keiko Myouzen; Meiko Takahashi; Tetsuji Sawada; Yuichi Nishioka; Masao Yukioka; Tsukasa Matsubara; Shigeyuki Wakitani; Ryota Teshima; Shigeto Tohma; Kiyoshi Takasugi; Kota Shimada; Akira Murasawa; Shigeru Honjo; Keitaro Matsuo; Hideo Tanaka; Kazuo Tajima; Taku Suzuki; Takuji Iwamoto; Yoshiya Kawamura; Hisashi Tanii; Yuji Okazaki; Tsukasa Sasaki; Peter K Gregersen; Leonid Padyukov; Jane Worthington; Katherine A Siminovitch; Mark Lathrop; Atsuo Taniguchi; Atsushi Takahashi; Katsushi Tokunaga; Michiaki Kubo; Yusuke Nakamura; Naoyuki Kamatani; Tsuneyo Mimori; Robert M Plenge; Hisashi Yamanaka; Shigeki Momohara; Ryo Yamada; Fumihiko Matsuda; Kazuhiko Yamamoto
Journal:  Nat Genet       Date:  2012-03-25       Impact factor: 38.330

5.  Integrating autoimmune risk loci with gene-expression data identifies specific pathogenic immune cell subsets.

Authors:  Xinli Hu; Hyun Kim; Eli Stahl; Robert Plenge; Mark Daly; Soumya Raychaudhuri
Journal:  Am J Hum Genet       Date:  2011-09-29       Impact factor: 11.025

6.  Bayesian inference analyses of the polygenic architecture of rheumatoid arthritis.

Authors:  Eli A Stahl; Daniel Wegmann; Gosia Trynka; Javier Gutierrez-Achury; Ron Do; Benjamin F Voight; Peter Kraft; Robert Chen; Henrik J Kallberg; Fina A S Kurreeman; Sekar Kathiresan; Cisca Wijmenga; Peter K Gregersen; Lars Alfredsson; Katherine A Siminovitch; Jane Worthington; Paul I W de Bakker; Soumya Raychaudhuri; Robert M Plenge
Journal:  Nat Genet       Date:  2012-03-25       Impact factor: 38.330

7.  A genome-wide association study suggests contrasting associations in ACPA-positive versus ACPA-negative rheumatoid arthritis.

Authors:  Leonid Padyukov; Mark Seielstad; Rick T H Ong; Bo Ding; Johan Rönnelid; Maria Seddighzadeh; Lars Alfredsson; Lars Klareskog
Journal:  Ann Rheum Dis       Date:  2010-12-14       Impact factor: 19.103

8.  Meta-analysis of genome-wide association studies in celiac disease and rheumatoid arthritis identifies fourteen non-HLA shared loci.

Authors:  Alexandra Zhernakova; Eli A Stahl; Gosia Trynka; Soumya Raychaudhuri; Eleanora A Festen; Lude Franke; Harm-Jan Westra; Rudolf S N Fehrmann; Fina A S Kurreeman; Brian Thomson; Namrata Gupta; Jihane Romanos; Ross McManus; Anthony W Ryan; Graham Turner; Elisabeth Brouwer; Marcel D Posthumus; Elaine F Remmers; Francesca Tucci; Rene Toes; Elvira Grandone; Maria Cristina Mazzilli; Anna Rybak; Bozena Cukrowska; Marieke J H Coenen; Timothy R D J Radstake; Piet L C M van Riel; Yonghong Li; Paul I W de Bakker; Peter K Gregersen; Jane Worthington; Katherine A Siminovitch; Lars Klareskog; Tom W J Huizinga; Cisca Wijmenga; Robert M Plenge
Journal:  PLoS Genet       Date:  2011-02-24       Impact factor: 5.917

9.  Dense genotyping identifies and localizes multiple common and rare variant association signals in celiac disease.

Authors:  Gosia Trynka; Karen A Hunt; Nicholas A Bockett; Jihane Romanos; Vanisha Mistry; Agata Szperl; Sjoerd F Bakker; Maria Teresa Bardella; Leena Bhaw-Rosun; Gemma Castillejo; Emilio G de la Concha; Rodrigo Coutinho de Almeida; Kerith-Rae M Dias; Cleo C van Diemen; Patrick C A Dubois; Richard H Duerr; Sarah Edkins; Lude Franke; Karin Fransen; Javier Gutierrez; Graham A R Heap; Barbara Hrdlickova; Sarah Hunt; Leticia Plaza Izurieta; Valentina Izzo; Leo A B Joosten; Cordelia Langford; Maria Cristina Mazzilli; Charles A Mein; Vandana Midah; Mitja Mitrovic; Barbara Mora; Marinita Morelli; Sarah Nutland; Concepción Núñez; Suna Onengut-Gumuscu; Kerra Pearce; Mathieu Platteel; Isabel Polanco; Simon Potter; Carmen Ribes-Koninckx; Isis Ricaño-Ponce; Stephen S Rich; Anna Rybak; José Luis Santiago; Sabyasachi Senapati; Ajit Sood; Hania Szajewska; Riccardo Troncone; Jezabel Varadé; Chris Wallace; Victorien M Wolters; Alexandra Zhernakova; B K Thelma; Bozena Cukrowska; Elena Urcelay; Jose Ramon Bilbao; M Luisa Mearin; Donatella Barisani; Jeffrey C Barrett; Vincent Plagnol; Panos Deloukas; Cisca Wijmenga; David A van Heel
Journal:  Nat Genet       Date:  2011-11-06       Impact factor: 38.330

10.  Five amino acids in three HLA proteins explain most of the association between MHC and seropositive rheumatoid arthritis.

Authors:  Soumya Raychaudhuri; Cynthia Sandor; Eli A Stahl; Jan Freudenberg; Hye-Soon Lee; Xiaoming Jia; Lars Alfredsson; Leonid Padyukov; Lars Klareskog; Jane Worthington; Katherine A Siminovitch; Sang-Cheol Bae; Robert M Plenge; Peter K Gregersen; Paul I W de Bakker
Journal:  Nat Genet       Date:  2012-01-29       Impact factor: 38.330

