
Replication of recently identified associated single-nucleotide polymorphisms from six autoimmune diseases in Genetic Analysis Workshop 16 rheumatoid arthritis data.

Harshal Deshmukh, Xana Kim-Howard, Swapan K Nath

Abstract

Many autoimmune diseases share similar underlying pathology and have a tendency to cluster within families, giving rise to the concept of shared susceptibility genes among them. In the Genetic Analysis Workshop 16 rheumatoid arthritis (RA) data we sought to replicate the genetic association between single-nucleotide polymorphisms (SNPs) identified in recent genome-wide association studies (GWAS) on RA and five other autoimmune diseases. We identified 164 significantly associated non-HLA SNPs (p < 10⁻⁵) from 16 GWAS and 13 candidate gene studies on six different autoimmune diseases, including RA, systemic lupus erythematosus, type 1 diabetes, Crohn disease, multiple sclerosis, and celiac disease. Using both direct and imputation-based association tests, we replicated 16 shared susceptibility regions involving RA and at least one of the other autoimmune diseases. We also identified hidden population structure within cases and controls in the Genetic Analysis Workshop 16 RA data and assessed the effect of population structure on the shared autoimmunity regions. Because multiple autoimmune diseases share a common genetic origin, these regions could be of considerable interest for further genetic and clinical association studies.

Year:  2009        PMID: 20018022      PMCID: PMC2795929          DOI: 10.1186/1753-6561-3-s7-s31

Source DB:  PubMed          Journal:  BMC Proc        ISSN: 1753-6561


Background

Autoimmune diseases affect 5% of the human population [1]. Although there is considerable heterogeneity among these disorders, their manifestations are believed to arise from immune-mediated attack against self-antigens. Despite their clinical heterogeneity, recent studies examining gene expression profiles in peripheral blood mononuclear cells (PBMC) of individuals with autoimmune disorders reveal common features that are either shared within a disease group or among disease groups, as exemplified in rheumatoid arthritis (RA) [2] or in systemic lupus erythematosus (SLE) [3]. The major symptoms of RA arise through immune-mediated destruction of peripheral joints; however, these features are typically accompanied by systemic complications such as rheumatoid nodules and vasculitis. Immune-mediated destruction is also the central feature of autoimmune diseases such as SLE, type 1 diabetes (T1D), multiple sclerosis (MS), and celiac disease (CLD). Given the similarities in the basic pathology of these autoimmune disorders, it is not surprising to see autoimmune diseases clustering within families, which leads to the hypothesis of common autoimmunity genes being shared between diseases. An example of such a shared gene is Runx1, which has been shown to be associated with SLE, psoriasis, and RA [4]. Increasing numbers of GWAS for autoimmune disorders have enhanced the possibility of identifying such shared autoimmune regions. The goals of the present study are 1) to identify population structure in Genetic Analysis Workshop (GAW) 16 RA cases and controls, 2) to replicate the genetic association in RA identified from recent GWAS on six common autoimmune diseases [RA, Crohn disease (CD), CLD, SLE, MS, and T1D], and 3) to study the effect of admixture on associated regions.

Methods

After searching the PubMed database, we identified 16 recently published GWAS and 13 other candidate gene association studies [5-28] on RA, CD, SLE, MS, CLD, and T1D. SNPs that showed significant association at a genome-wide "suggestive" threshold (p < 10⁻⁵) were chosen for replication in the GAW16 RA data. This preselected threshold was adopted as "suggestive" following the recommendation of Duggal et al. [29] for adjusting p-values to control the family-wide type 1 error in genome-wide association studies. The rationale for choosing this threshold was to maximize true associations from the GWAS. We performed an association analysis using predefined quality control criteria (minor allele frequency (MAF) ≥ 1%, SNP missingness rate ≤ 10%, and Hardy-Weinberg equilibrium p ≥ 0.001 in controls) and identified significant SNPs for RA either by direct association using PLINK [30] or by imputation using fastPHASE [31]. To identify the hidden population structure in cases and controls, we estimated and compared the likelihood of the data under different numbers of ancestral populations (k). We used STRUCTURE [32] to estimate the best k separately for cases and controls. We identified 343 ancestry informative markers (AIMs) from two previously published reports [33,34] that were available in the GAW16 RA data. These AIMs were used both to estimate population structure and the admixture proportion in each individual and to correct for the effect of population substructure in genetic association. We employed two different methods for controlling the effect of population substructure: the structured association test (SAT) [35] with 10,000 permutations and covariate-adjusted logistic regression. We also included sex as a covariate in the logistic regression model; however, it did not significantly affect the association results and was excluded from the final model.
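As a rough illustration of the quality control criteria above, the following Python sketch applies the three filters to the genotype counts of a single SNP. It is not the PLINK implementation: the function names (`hwe_pvalue`, `passes_qc`) and the genotype-count inputs are assumptions made for this example, and the Hardy-Weinberg check uses a simple 1-df chi-square test as a stand-in for whatever test the software applies.

```python
import math


def hwe_pvalue(n_aa, n_ab, n_bb):
    """Chi-square (1 df) test of Hardy-Weinberg equilibrium from genotype counts."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2.0 * n)   # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if min(expected) == 0:
        return 1.0                       # monomorphic SNP: nothing to test
    observed = (n_aa, n_ab, n_bb)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(stat / 2.0))


def passes_qc(case_genos, control_genos, n_missing,
              maf_min=0.01, miss_max=0.10, hwe_min=0.001):
    """Apply the MAF, missingness, and HWE-in-controls filters to one SNP.

    case_genos and control_genos are (n_AA, n_AB, n_BB) tuples of called
    genotypes; n_missing is the number of samples with no call.
    """
    n_aa = case_genos[0] + control_genos[0]
    n_ab = case_genos[1] + control_genos[1]
    n_bb = case_genos[2] + control_genos[2]
    n_called = n_aa + n_ab + n_bb
    freq_a = (2 * n_aa + n_ab) / (2.0 * n_called)
    maf = min(freq_a, 1.0 - freq_a)
    missing_rate = n_missing / float(n_called + n_missing)
    return (maf >= maf_min
            and missing_rate <= miss_max
            and hwe_pvalue(*control_genos) >= hwe_min)
```

The default thresholds mirror the criteria stated in the text; the HWE filter is evaluated in controls only, as described.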
To corroborate the evidence of population structure we performed principal-component analysis using EIGENSOFT. We evaluated the statistical significance of each eigenvector using Tracy-Widom (TW) statistics as described by Patterson et al. and calculated the total variation explained by the significant eigenvectors [36]. Finally, we sought to replicate regions that showed association signals across the GAW16 data and at least one of the GWAS. If an associated SNP was not present in the GAW16 data (because it either failed or was not genotyped in the study), we looked at the surrounding region in the GAW16 data (a 100-kb region centered on the published associated SNP). If any of the SNPs from these regions showed significance at a replication threshold of p < 0.05, we imputed this region using HapMap data (60 unrelated CEU parents) and assessed association.
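The regional lookup described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `proxies_in_window` and the record layout (dictionaries with `snp`, `chrom`, `pos`, and `p` keys) are hypothetical, and the SNP names and p-values in the usage note are invented for the example.

```python
def proxies_in_window(chrom, published_pos, gaw_results,
                      window_bp=100_000, alpha=0.05):
    """Return genotyped SNPs near a published SNP that pass the replication threshold.

    The window is centered on the published position, so a 100-kb window
    spans 50 kb on either side. gaw_results is a list of dicts with the
    keys 'snp', 'chrom', 'pos', and 'p' (hypothetical layout).
    """
    half = window_bp // 2
    return [
        r for r in gaw_results
        if r["chrom"] == chrom
        and abs(r["pos"] - published_pos) <= half
        and r["p"] < alpha
    ]


# Hypothetical association results around a published SNP at position 1,000,000.
example_results = [
    {"snp": "snpA", "chrom": 1, "pos": 960_000, "p": 0.01},    # inside, significant
    {"snp": "snpB", "chrom": 1, "pos": 1_040_000, "p": 0.20},  # inside, not significant
    {"snp": "snpC", "chrom": 1, "pos": 1_060_000, "p": 0.001}, # outside the window
    {"snp": "snpD", "chrom": 2, "pos": 1_000_000, "p": 0.001}, # wrong chromosome
]
hits = proxies_in_window(1, 1_000_000, example_results)
```

A non-empty result would flag the region as a candidate for imputation; here only `snpA` qualifies.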

Results

We identified substantial population substructure in the GAW16 RA samples. Figure 1A and 1B show the estimated likelihood of the data under STRUCTURE for cases and controls, respectively. The best-fitting model for cases favored a two-population model (ancestry proportion = 0.955, 0.045), and the best-fitting model for controls favored a three-population model (ancestry proportion = 0.771, 0.115, 0.074). However, the combined case-control data favored a three-population model (ancestry proportion = 0.528, 0.257, 0.215). For controls, the likelihoods of the two-, three-, and four-population models are similar; for cases, the likelihoods of the two- and three-population models are similar. We ran principal-components analysis on the combined case-control data and calculated TW statistics [36] for the top 10 eigenvectors; 4 significant eigenvectors (p < 0.05) explained 23% of the variation in the whole dataset. This suggests substantial population structure within the GAW16 data.
Figure 1

Likelihood of data under number of hidden populations (K) estimated separately for controls (A) and cases (B). K denotes number of populations.

We initially selected 164 non-HLA associated SNPs from 16 recently published GWAS and 13 candidate gene association studies (p < 10⁻⁵) to check for replication in the GAW16 dataset. We found associated SNPs for SLE (n = 49), CD (n = 39), T1D (n = 32), RA (n = 37), CLD (n = 4), and MS (n = 9). Of these 164 SNPs, 92 were found in the GAW16 data and evaluated by a direct allelic association test. The remaining 72 SNPs were assessed by indirect association (by imputation). Of these 164 SNPs, 29 were significantly replicated (p < 0.05). Nine of these SNPs replicated at p-values between 0.05 and 0.01, 11 at p-values between 0.01 and 10⁻⁵, and 8 at p < 10⁻⁵. Table 1 shows susceptibility loci with the p-values for autoimmune diseases (CD, CLD, T1D, SLE, and RA) identified from various GWAS. The last two columns show association-based p-values for the same loci in the entire GAW16 RA data and p-values adjusted for population admixture.
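For readers unfamiliar with the direct allelic association test mentioned above, the sketch below shows one common form of it: a 1-df chi-square test on the 2×2 table of allele counts in cases and controls. It is an illustration under stated assumptions rather than the PLINK code; the function name `allelic_test`, the genotype-count inputs, and the counts in the usage note are all hypothetical.

```python
import math


def allelic_test(case_genos, control_genos):
    """Allelic chi-square test (1 df) from genotype counts.

    Each argument is a tuple (n_AA, n_AB, n_BB). Returns a tuple
    (odds_ratio, chi2, p_value) for allele A versus allele B.
    """
    a = 2 * case_genos[0] + case_genos[1]        # A alleles in cases
    b = 2 * case_genos[2] + case_genos[1]        # B alleles in cases
    c = 2 * control_genos[0] + control_genos[1]  # A alleles in controls
    d = 2 * control_genos[2] + control_genos[1]  # B alleles in controls
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A row or column is empty, so the test is undefined.
        return (float("nan"), 0.0, 1.0)
    chi2 = n * (a * d - b * c) ** 2 / float(denom)
    p_value = math.erfc(math.sqrt(chi2 / 2.0))   # chi-square upper tail, 1 df
    odds_ratio = (a * d) / float(b * c) if b * c else float("inf")
    return (odds_ratio, chi2, p_value)


# Hypothetical counts: 100 cases and 100 controls.
odds_ratio, chi2, p_value = allelic_test((30, 50, 20), (20, 50, 30))
```

With these invented counts the statistic is 4.0, which falls just under the p < 0.05 replication threshold used in the text.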
Table 1

Replication of association in multiple autoimmune diseases

| Chromosome number | Cytogenetic position | Gene | SNP | Physical position | Associated diseases | Uncorrected GAW p-value (a) | Corrected p-value: adjusting with ancestry as covariate in a logistic regression model | Corrected p-value: SAT (b) |
|---|---|---|---|---|---|---|---|---|
| 1 | 1p31 | IL23R | rs11465804 | 67414547 | CD | 1.09 × 10⁻³ | 1.04 × 10⁻³ | 2.04 × 10⁻³ |
| 1 | 1p13 | PTPN22 | rs2476601 | 114089610 | SLE, RA, T1D | 1.12 × 10⁻¹² | 1.76 × 10⁻¹⁰ | 2.66 × 10⁻¹⁰ |
| 2 | 2q24 | IFIH1 | rs1990760 | 162949558 | T1D | 6.54 × 10⁻³ | 2.74 × 10⁻² | 2.44 × 10⁻² |
| 2 | 2q32.2-q32.3 | STAT4 | rs6752770 | 191681808 | RA, SLE | 7.00 × 10⁻³ | 1.36 × 10⁻² | 3.36 × 10⁻² |
| 3 | 3p21 | MST1 | rs3197999 | 49696536 | CD | 2.31 × 10⁻² | 3.57 × 10⁻² | 3.57 × 10⁻² |
| 4 | 4q27 | KIAA1109 | rs13151961 | 123473107 | Celiac, T1D, RA | 4.81 × 10⁻² | 2.74 × 10⁻² | 3.74 × 10⁻² |
| 5 | 5p13 | PTGER4 | rs4613763 | 40428485 | CD | 1.96 × 10⁻³ | 7.56 × 10⁻³ | 5.56 × 10⁻³ |
| 6 | 6q23 | near TNFAIP3 | rs6933404 | 138000928 | SLE | 3.13 × 10⁻⁴ | 2.01 × 10⁻³ | 3.01 × 10⁻³ |
| 6 | 6q23 | near TNFAIP3 | rs13192841 | 138008907 | SLE | 2.93 × 10⁻⁴ | 5.71 × 10⁻⁴ | 6.47 × 10⁻⁴ |
| 6 | 6q23 | near TNFAIP3 | rs12527282 | 138008945 | SLE | 2.28 × 10⁻⁴ | 3.37 × 10⁻⁴ | 2.27 × 10⁻⁴ |
| 6 | 6q23 | near TNFAIP3 | rs2327832 | 138014761 | SLE | 1.06 × 10⁻⁴ | 7.51 × 10⁻⁴ | 6.51 × 10⁻⁴ |
| 6 | 6q23 | near TNFAIP3 | rs602414 | 138053358 | SLE | 6.03 × 10⁻⁴ | 1.29 × 10⁻² | 1.29 × 10⁻² |
| 6 | 6q27 | CCR6 | rs2301436 | 167408399 | CD | 1.67 × 10⁻² | 1.74 × 10⁻² | 4.25 × 10⁻² |
| 8 | 8p23.1 | XKR6 | rs11783247 | 10826285 | SLE | 4.50 × 10⁻² | 1.76 × 10⁻² | 5.77 × 10⁻² |
| 8 | 8p21.1 | C8orf12 | rs7836059 | 11309574 | SLE | 8.87 × 10⁻³ | 1.36 × 10⁻² | 6.78 × 10⁻² |
| 8 | 8p21.3 | C8orf13-BLK | rs2736340 | 11381382 | SLE | 1.45 × 10⁻⁵ | 2.38 × 10⁻⁵ | 0 |
| 8 | 8p21.3 | C8orf13-BLK | rs13277113 | 11386595 | SLE | 3.46 × 10⁻⁶ | 5.69 × 10⁻⁶ | 0 |
| 8 | 8p23.1 | BLK | rs2618476 | 11389950 | SLE | 3.21 × 10⁻⁶ | 4.10 × 10⁻⁶ | * (c) |
| 8 | 8p23.1 | BLK | rs2248932 | 11429059 | SLE | 9.79 × 10⁻³ | 6.49 × 10⁻³ | 6.69 × 10⁻³ |
| 9 | 9q33.2 | PHF19 | rs1953126 | 122680321 | RA | 2.76 × 10⁻⁸ | 4.97 × 10⁻⁸ | 0 |
| 9 | 9q33.2 | PHF19 | rs1609810 | 122682172 | RA | 1.79 × 10⁻⁸ | 3.38 × 10⁻⁸ | * |
| 9 | 9q33.2 | PHF19 | rs881375 | 122692719 | RA | 2.27 × 10⁻⁸ | 4.55 × 10⁻⁸ | 0 |
| 9 | 9q33.2 | PHF19 | rs6478486 | 122695150 | RA | 1.79 × 10⁻⁸ | 3.38 × 10⁻⁸ | * |
| 9 | 9q33.2 | near PHF19 | rs3761847 | 120769793 | RA | 1.24 × 10⁻⁸ | 3.88 × 10⁻⁸ | 0 |
| 9 | 9q33.2 | C5 | rs2900180 | 122776861 | RA | 6.24 × 10⁻⁹ | 1.88 × 10⁻⁸ | 0 |
| 10 | 10q24 | NKX2-3 | rs11190140 | 101281583 | CD | 4.93 × 10⁻² | 8.10 × 10⁻² | 8.80 × 10⁻² |
| 19 | 19q13 | RSHL1 | rs8111071 | 50999246 | CD | 5.91 × 10⁻⁵ | 1.66 × 10⁻⁴ | 0 |
| 22 | 22q11.21 | UBE2L3 | rs5754217 | 20264229 | SLE | 8.94 × 10⁻³ | 6.34 × 10⁻³ | 6.57 × 10⁻³ |
| 22 | 22q13.2 | SCUBE1 | rs2071725 | 41934258 | SLE | 2.23 × 10⁻² | 1.83 × 10⁻² | 1.57 × 10⁻² |

(a) Allelic association test

(b) Structured association test

(c) *, Imputed SNP

Discussion

There is a growing understanding that susceptibility to autoimmune diseases is due to a complex interaction of multiple genes and environmental factors, and many of these may be shared among autoimmune diseases. In this analysis we attempted to replicate previously identified associations in multiple autoimmune diseases and inferred regions of shared autoimmunity between the GAW16 data and any other autoimmune disease. We did not explore the HLA region in our study because this region has already been extensively investigated and is a well-known complex region of shared autoimmunity among various autoimmune disorders [37,38]. GWAS have emerged as an effective tool to identify common polymorphisms underlying complex diseases. One of the major sources of bias in GWAS is population stratification, a variation of ancestry proportions between cases and controls. This stratification can lead to differences in allele frequency between cases and controls that are unrelated to disease status, consequently leading to an increased type 1 error [9]. We used 343 AIMs and applied them to cases and controls separately to infer population structure. We have demonstrated substantial population substructure in both cases and controls. In fact, we identified more substructure in controls than in cases. This would have a major impact on association studies if not corrected properly. We identified 16 different cytogenetic regions of shared autoimmunity between the GAW16 data and at least one of the proposed autoimmune diseases. There were eight shared regions with SLE (1p13, 2q32.2-q32.3, 6p21.32, 6q23, 8p21.3, 8p23.1, 22q11.21, 22q13.2), six shared regions with CD (1p31, 3p21, 5p13, 6q27, 10q24, 19q13), four shared regions with RA (1p13, 2q32.2-q32.3, 4q27, 9q33.2), four shared regions with T1D (1p13, 2q24, 2q33, 4q27), and one shared region with CLD (4q27).
Interestingly, PTPN22 (1p13), STAT4 (2q32.2-q32.3), and KIAA1109 (4q27) were all associated with multiple autoimmune diseases. It should also be noted that SLE shared the most susceptibility genes with RA, suggesting common underlying pathologic processes driven by common loci. These associations are consistent and robust, and they persisted after correcting for population structure. It is also noteworthy that none of the nine associated SNPs from MS were replicated in the GAW16 RA data. However, our study was not an exhaustive replication with RA and the five other autoimmune diseases because SNPs were chosen using a predefined threshold (p < 10⁻⁵). It is possible that SNPs that showed weak to moderate association (p between 0.05 and 10⁻⁵) with another autoimmune disease could have been highly associated with RA. Also, the studies from which the list of 164 non-HLA SNPs was selected do not all control for population admixture, so it is possible that we missed analyzing an important SNP in the GAW16 data. We did not evaluate that possibility. Future research should look more exhaustively at SNPs found by GWAS and candidate gene analyses that do not pass genome-wide significance but are significant at the p < 0.05 level.

Conclusion

It has long been suspected that autoimmune diseases may share common pathogenesis and susceptibility genes, and several recent studies [4,5] support this hypothesis. Identification of these shared regions can help in identifying novel genetic pathways in autoimmune disease causation, can increase understanding of the higher prevalence of different autoimmune disorders in families, and may identify targeted regions for gene therapy. Our study successfully identified 16 areas of shared susceptibility involving RA and other autoimmune diseases. These can be further explored by association and clinical studies to solve the conundrum of shared autoimmunity among various autoimmune diseases.

List of abbreviations used

AIM: Ancestry informative marker; CD: Crohn disease; CLD: Celiac disease; GAW: Genetic Analysis Workshop; MS: Multiple sclerosis; PBMC: Peripheral blood mononuclear cells; RA: Rheumatoid arthritis; SAT: Structured association test; SLE: Systemic lupus erythematosus; T1D: Type 1 diabetes; TW: Tracy-Widom

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

SKN conceived of the study, and participated in its design and coordination and helped to draft the manuscript. HAD and XK-H did the analysis and drafted the manuscript.

References (37 in total; first 10 listed)

1.  Association mapping in structured populations.

Authors:  J K Pritchard; M Stephens; N A Rosenberg; P Donnelly
Journal:  Am J Hum Genet       Date:  2000-05-26       Impact factor: 11.025

2.  Inference of population structure using multilocus genotype data.

Authors:  J K Pritchard; M Stephens; P Donnelly
Journal:  Genetics       Date:  2000-06       Impact factor: 4.562

Review 3.  Autoimmune disease: why and where it occurs.

Authors:  P Marrack; J Kappler; B L Kotzin
Journal:  Nat Med       Date:  2001-08       Impact factor: 53.440

4.  Refining the impact of TCF7L2 gene variants on type 2 diabetes and adaptive evolution.

Authors:  Agnar Helgason; Snaebjörn Pálsson; Gudmar Thorleifsson; Struan F A Grant; Valur Emilsson; Steinunn Gunnarsdottir; Adebowale Adeyemo; Yuanxiu Chen; Guanjie Chen; Inga Reynisdottir; Rafn Benediktsson; Anke Hinney; Torben Hansen; Gitte Andersen; Knut Borch-Johnsen; Torben Jorgensen; Helmut Schäfer; Mezbah Faruque; Ayo Doumatey; Jie Zhou; Robert L Wilensky; Muredach P Reilly; Daniel J Rader; Yu Bagger; Claus Christiansen; Gunnar Sigurdsson; Johannes Hebebrand; Oluf Pedersen; Unnur Thorsteinsdottir; Jeffrey R Gulcher; Augustine Kong; Charles Rotimi; Kári Stefánsson
Journal:  Nat Genet       Date:  2007-01-07       Impact factor: 38.330

5.  Association of systemic lupus erythematosus with C8orf13-BLK and ITGAM-ITGAX.

Authors:  Geoffrey Hom; Robert R Graham; Barmak Modrek; Kimberly E Taylor; Ward Ortmann; Sophie Garnier; Annette T Lee; Sharon A Chung; Ricardo C Ferreira; P V Krishna Pant; Dennis G Ballinger; Roman Kosoy; F Yesim Demirci; M Ilyas Kamboh; Amy H Kao; Chao Tian; Iva Gunnarsson; Anders A Bengtsson; Solbritt Rantapää-Dahlqvist; Michelle Petri; Susan Manzi; Michael F Seldin; Lars Rönnblom; Ann-Christine Syvänen; Lindsey A Criswell; Peter K Gregersen; Timothy W Behrens
Journal:  N Engl J Med       Date:  2008-01-20       Impact factor: 91.245

6.  Variation in interleukin 7 receptor alpha chain (IL7R) influences risk of multiple sclerosis.

Authors:  Frida Lundmark; Kristina Duvefelt; Ellen Iacobaeus; Ingrid Kockum; Erik Wallström; Mohsen Khademi; Annette Oturai; Lars P Ryder; Janna Saarela; Hanne F Harbo; Elisabeth G Celius; Hugh Salter; Tomas Olsson; Jan Hillert
Journal:  Nat Genet       Date:  2007-07-29       Impact factor: 38.330

7.  Population structure and eigenanalysis.

Authors:  Nick Patterson; Alkes L Price; David Reich
Journal:  PLoS Genet       Date:  2006-12       Impact factor: 5.917

8.  Meta-analysis of genome-wide association study data identifies additional type 1 diabetes risk loci.

Authors:  Jason D Cooper; Deborah J Smyth; Adam M Smiles; Vincent Plagnol; Neil M Walker; James E Allen; Kate Downes; Jeffrey C Barrett; Barry C Healy; Josyf C Mychaleckyj; James H Warram; John A Todd
Journal:  Nat Genet       Date:  2008-11-02       Impact factor: 38.330

9.  Novel Crohn disease locus identified by genome-wide association maps to a gene desert on 5p13.1 and modulates expression of PTGER4.

Authors:  Cécile Libioulle; Edouard Louis; Sarah Hansoul; Cynthia Sandor; Frédéric Farnir; Denis Franchimont; Séverine Vermeire; Olivier Dewit; Martine de Vos; Anna Dixon; Bruno Demarche; Ivo Gut; Simon Heath; Mario Foglio; Liming Liang; Debby Laukens; Myriam Mni; Diana Zelenika; André Van Gossum; Paul Rutgeerts; Jacques Belaiche; Mark Lathrop; Michel Georges
Journal:  PLoS Genet       Date:  2007-03-05       Impact factor: 5.917

10.  A large-scale rheumatoid arthritis genetic study identifies association at chromosome 9q33.2.

Authors:  Monica Chang; Charles M Rowland; Veronica E Garcia; Steven J Schrodi; Joseph J Catanese; Annette H M van der Helm-van Mil; Kristin G Ardlie; Christopher I Amos; Lindsey A Criswell; Daniel L Kastner; Peter K Gregersen; Fina A S Kurreeman; Rene E M Toes; Tom W J Huizinga; Michael F Seldin; Ann B Begovich
Journal:  PLoS Genet       Date:  2008-06-27       Impact factor: 5.917
