Genome-wide linkage and association analysis of rheumatoid arthritis in a Canadian population.

Zhi Wei, Mingyao Li

Abstract

Rheumatoid arthritis (RA) is an autoimmune disease with a moderately strong genetic component. Previous linkage and candidate gene studies have identified several regions that predispose to RA, including HLA-DRB1 and PTPN22. We conducted genome-wide linkage analysis with 128 affected individuals from 60 families in a Canadian cohort who were genotyped using the Illumina linkage panel, and genome-wide association analysis with 158 affected individuals from the same cohort who were genotyped using the Affymetrix 100 K platform. A multipoint nonparametric linkage scan revealed three linkage peaks with LOD scores greater than 1.5. We also identified 13 SNPs that were significantly associated with RA at a genome-wide level of 0.05 after Bonferroni adjustment for multiple testing. Several of the significantly associated SNPs are located close to previously identified linkage regions, but not in the linkage peaks identified in the same cohort. We could not replicate association with HLA-DRB1 and PTPN22. Our results indicate that high coverage and sufficient sample size are crucial for the success of genome-wide association studies.

Year:  2007        PMID: 18466515      PMCID: PMC2359870          DOI: 10.1186/1753-6561-1-s1-s19

Source DB:  PubMed          Journal:  BMC Proc        ISSN: 1753-6561


Background

Rheumatoid arthritis (RA) is a complex autoimmune genetic disorder in which the immune system attacks normal tissues as if they were invading pathogens. Twin and family studies have suggested that the heritability of RA is ~60%. A well-established RA susceptibility locus is the HLA region on chromosome 6p, which is estimated to account for one-third of the genetic component of RA etiology. Apart from the HLA region, a number of other chromosomal regions have been replicated across genome-wide linkage scans; the leading regions include chromosomes 1p13, 1q41–43, 6q16, 16p, and 18q [1]. Linkage analysis has low power to detect genetic variants that confer modest disease risks. For complex diseases such as RA, tests of genetic association with the disease may be more effective. Genetic association analyses have led to the identification of PTPN22 [2], a gene whose association has been replicated in many subsequent studies. Additional susceptibility loci for RA that have been implicated by association analyses include PADI4, SLC22A4, RUNX1, and CTLA4. In this investigation, we performed genome-wide linkage and association analyses of the Canadian Rheumatoid Arthritis Genetic Study (CRAGS) data made available to Genetic Analysis Workshop 15 participants. We sought to identify genetic variants that predispose to RA and to characterize their genetic contributions.

Methods

Data sets and initial data quality checking

The CRAGS provided two data sets. The first data set includes 60 families (128 affected individuals) that were genotyped using the Illumina linkage panel on 5429 SNPs across the 22 autosomes. The second data set includes 158 affected individuals (78 affected sib pairs (ASPs) and one affected avuncular pair) that were genotyped using the Affymetrix 100 K platform on 113,237 SNPs across the 22 autosomes. Of the 113,237 SNPs, 87,181 had >85% of genotypes called and a minor allele frequency (MAF) of >0.05. These 87,181 SNPs, which passed the initial quality control, had an average MAF of 0.247 and an average genotyping success rate of 96.8%.
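The two quality-control thresholds above can be sketched as a simple per-SNP filter. This is a minimal illustration, not the authors' pipeline: the `SnpSummary` type, the function names, and the genotype coding (0/1/2 copies of one allele, `None` for a missing call) are assumptions made for the example, and the allele frequency is a naive count that ignores relatedness among the affected individuals.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SnpSummary:
    snp_id: str
    call_rate: float  # fraction of individuals with a non-missing genotype
    maf: float        # minor allele frequency among called genotypes


def summarize_genotypes(snp_id: str, genotypes: List[Optional[int]]) -> SnpSummary:
    """Summarize one SNP; genotypes are allele counts 0/1/2, None if missing."""
    if not genotypes:
        raise ValueError("no genotypes supplied")
    called = [g for g in genotypes if g is not None]
    call_rate = len(called) / len(genotypes)
    if not called:
        return SnpSummary(snp_id, 0.0, 0.0)
    freq = sum(called) / (2 * len(called))  # naive allele-count estimate
    return SnpSummary(snp_id, call_rate, min(freq, 1.0 - freq))


def passes_qc(snp: SnpSummary, min_call_rate: float = 0.85, min_maf: float = 0.05) -> bool:
    """Strict inequalities, matching '>85%' and '>0.05' in the text."""
    return snp.call_rate > min_call_rate and snp.maf > min_maf


# Hypothetical SNP with one missing call out of four: fails on call rate.
demo = summarize_genotypes("snp_demo", [0, 1, 2, None])
print(demo, passes_qc(demo))
```

Keeping the summary step separate from the threshold test makes it easy to report the average MAF and call rate of the SNPs that pass, as the text does.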

Test of Hardy-Weinberg equilibrium in the presence of disease association

Assessing Hardy-Weinberg equilibrium (HWE) is often an important step in checking the quality of genotype data. The standard test of HWE assumes that the genotypes are randomly sampled from the general population. However, in the CRAGS, all individuals are affected. As a result, when a marker is associated with the disease, the corresponding genotypes may no longer be a random sample. Assessing departure from HWE in the presence of disease association is particularly important for genome-wide association studies, in which the disease variants are either directly genotyped or are in linkage disequilibrium (LD) with the genotyped markers. The standard HWE test might reject many markers, some of which may be in LD with the disease variants.

Here we develop a likelihood framework that allows the assessment of departure from HWE while taking into account potential association with the disease. Assume a homogeneous sample of ASPs is collected and genotyped at a diallelic marker with two alleles A and a (with frequencies $p$ and $q$, respectively). Let $g \in \{0, 1, 2\}$ represent the number of copies of allele A, and let $P_g$ be the corresponding genotype frequency. Under HWE, the genotype frequencies in the general population are $P_0 = q^2$, $P_1 = 2pq$, and $P_2 = p^2$. For an ASP with genotypes $g_1$ and $g_2$, the retrospective likelihood is

$$\Pr(g_1, g_2 \mid \text{ASP}) = \frac{r_{g_1} r_{g_2} \Pr(g_1, g_2)}{\sum_{(g_1', g_2')} r_{g_1'} r_{g_2'} \Pr(g_1', g_2')},$$

where $r_g$ is the genotype relative risk of genotype $g$ compared with genotype 0 (so $r_0 = 1$) and the sum runs over the six unordered genotype pairs. When HWE is assumed, the parameters to be estimated are $\{r_1, r_2, p\}$; when departure from HWE is allowed, the parameters to be estimated are $\{r_1, r_2, P_1, P_2\}$. Table 1 lists the joint genotype probability for a sib pair under the null and the alternative models, respectively. For a sample of ASPs, the overall likelihood of the data, $L$, is simply the product of the likelihoods for each ASP.

We can test for residual departure from HWE using a likelihood ratio statistic $T = 2[\ln \hat{L}_1 - \ln \hat{L}_0]$, where $\hat{L}_1$ and $\hat{L}_0$ are the likelihoods maximized under the alternative and the null models, respectively. Under the null hypothesis of HWE, $T$ approximately follows a $\chi^2$ distribution with one degree of freedom, because the alternative model has one more free parameter than the null model. This test assesses departure from HWE after adjusting for possible association with the disease, thereby reducing the chance that a disease-associated marker is flagged as problematic.
Table 1

Joint genotype probability $\Pr(g_1, g_2)$ for a sib pair (genotypes are unordered)

| $(g_1, g_2)$ | Assume HWE | Allow departure from HWE |
| --- | --- | --- |
| (0, 0) | $q^4 + pq^3 + \tfrac{1}{4}p^2q^2$ | $P_0^2 + \tfrac{1}{2}P_0P_1 + \tfrac{1}{16}P_1^2$ |
| (0, 1) | $pq^2(1 + q)$ | $P_0P_1 + \tfrac{1}{4}P_1^2$ |
| (0, 2) | $\tfrac{1}{2}p^2q^2$ | $\tfrac{1}{8}P_1^2$ |
| (1, 1) | $pq(1 + pq)$ | $\tfrac{1}{2}P_0P_1 + \tfrac{1}{4}P_1^2 + \tfrac{1}{2}P_1P_2 + 2P_0P_2$ |
| (1, 2) | $p^2q(1 + p)$ | $\tfrac{1}{4}P_1^2 + P_1P_2$ |
| (2, 2) | $\tfrac{1}{4}p^2q^2 + p^3q + p^4$ | $\tfrac{1}{16}P_1^2 + \tfrac{1}{2}P_1P_2 + P_2^2$ |
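The joint probabilities in Table 1 and the ASP likelihood can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' code. All function names are hypothetical. The likelihood is assumed to take the usual retrospective form, in which each joint probability is weighted by $r_{g_1} r_{g_2}$ and normalized over the six unordered pairs. The numerical maximization over the parameters is left out.

```python
import math


def joint_probs_hwe(p):
    """Table 1, 'Assume HWE' column; p is the frequency of allele A."""
    q = 1.0 - p
    return {
        (0, 0): q**4 + p * q**3 + 0.25 * p**2 * q**2,
        (0, 1): p * q**2 * (1 + q),
        (0, 2): 0.5 * p**2 * q**2,
        (1, 1): p * q * (1 + p * q),
        (1, 2): p**2 * q * (1 + p),
        (2, 2): 0.25 * p**2 * q**2 + p**3 * q + p**4,
    }


def joint_probs_general(P1, P2):
    """Table 1, 'Allow departure from HWE' column; P0 = 1 - P1 - P2."""
    P0 = 1.0 - P1 - P2
    return {
        (0, 0): P0**2 + 0.5 * P0 * P1 + P1**2 / 16,
        (0, 1): P0 * P1 + 0.25 * P1**2,
        (0, 2): P1**2 / 8,
        (1, 1): 0.5 * P0 * P1 + 0.25 * P1**2 + 0.5 * P1 * P2 + 2 * P0 * P2,
        (1, 2): 0.25 * P1**2 + P1 * P2,
        (2, 2): P1**2 / 16 + 0.5 * P1 * P2 + P2**2,
    }


def asp_log_likelihood(pair_counts, probs, r1, r2):
    """Log-likelihood of ASP counts; pair_counts maps (g1, g2) -> number of ASPs.

    Assumed retrospective form: r_g1 * r_g2 * Pr(g1, g2), normalized over pairs.
    """
    r = (1.0, r1, r2)  # r_0 = 1 by definition
    weights = {pair: r[pair[0]] * r[pair[1]] * pr for pair, pr in probs.items()}
    total = sum(weights.values())
    return sum(n * math.log(weights[pair] / total)
               for pair, n in pair_counts.items() if n > 0)


def chi2_1df_pvalue(t):
    """Upper-tail p-value of a chi-square statistic with one degree of freedom."""
    return math.erfc(math.sqrt(max(t, 0.0) / 2.0))


# Sanity check: both columns agree when P1 = 2pq and P2 = p^2.
p = 0.3
hwe = joint_probs_hwe(p)
gen = joint_probs_general(2 * p * (1 - p), p * p)
print(all(math.isclose(hwe[k], gen[k]) for k in hwe))
```

Given log-likelihoods maximized under each model by any numerical optimizer, the test statistic is `T = 2 * (ll_alt - ll_null)` and `chi2_1df_pvalue(T)` gives the corresponding p-value.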

Linkage and association analysis

We performed genome-wide, nonparametric multipoint linkage analysis using the SPAIR statistic [3] as implemented in MERLIN [4] on the 60 families that were genotyped using the Illumina linkage panel. The SPAIR statistic combines information from pairs of affected individuals and can detect regions of excess identity-by-descent (IBD) sharing. We performed single-marker association analysis using LAMP [5,6], which uses a maximum-likelihood model to extract information on genetic association from samples of unrelated individuals, sibships, and larger pedigrees. Briefly, the program estimates the disease-SNP haplotype frequencies and three penetrances using all available data by maximizing the likelihood of the marker data conditional on the disease phenotypes. A likelihood ratio test with approximately two degrees of freedom is constructed by comparing the likelihood maximized under the alternative model, which allows for LD between the disease and SNP loci, with the likelihood maximized under the null model, which assumes linkage equilibrium. We assumed a fixed disease prevalence of 0.8%. Assuming a different prevalence changed the parameter estimates slightly but did not appear to affect the overall ranking of SNPs.

Results

Our multipoint nonparametric linkage analysis revealed three linkage signals at a LOD score threshold of 1.5, corresponding to a -log10(p-value) > 2.37 (Figure 1). These linkage peaks are on chromosome 12 (LOD = 1.89 at 123 cM, asymptotic p-value = 0.002), chromosome 6 (LOD = 1.83 at 161.7 cM, asymptotic p-value = 0.002), and chromosome 9 (LOD = 1.69 at 141 cM, asymptotic p-value = 0.003). We did not observe evidence of linkage in the HLA region, despite the fact that approximately one-third of the total genetic contribution in RA is attributed to genes in the HLA region. Because the Affymetrix 100 K platform includes a denser set of SNPs in the HLA region and more ASPs in the CRAGS were genotyped, we also conducted nonparametric linkage analysis on chromosome 6 with the Affymetrix SNPs. The analysis was conducted using MERLIN [4], in which LD between SNPs was modeled by considering haplotypes within clusters of tightly linked markers. We obtained results similar to those from the Illumina SNPs, suggesting that the lack of linkage evidence is probably due to the limited sample size of the CRAGS.
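The correspondence between LOD scores and asymptotic p-values quoted above can be checked with a short calculation. The sketch assumes the usual one-sided approximation for allele-sharing LOD scores, in which $2 \ln(10) \cdot \mathrm{LOD}$ is compared with a $\chi^2$ distribution with one degree of freedom and the tail probability is halved. The function name is hypothetical.

```python
import math


def lod_to_pvalue(lod):
    """One-sided asymptotic p-value for a non-negative LOD score.

    Assumes 2*ln(10)*LOD follows a 50:50 mixture of a point mass at zero
    and a chi-square distribution with one degree of freedom.
    """
    if lod < 0:
        raise ValueError("LOD score must be non-negative")
    chi2 = 2.0 * math.log(10.0) * lod
    # Upper tail of chi-square(1) is erfc(sqrt(x / 2)); halve it for one side.
    return 0.5 * math.erfc(math.sqrt(chi2 / 2.0))


for label, lod in [("threshold", 1.5), ("chr 12", 1.89), ("chr 6", 1.83), ("chr 9", 1.69)]:
    pval = lod_to_pvalue(lod)
    print(f"{label}: LOD = {lod}, p = {pval:.4f}, -log10(p) = {-math.log10(pval):.2f}")
```

Under this assumption, the threshold of 1.5 maps to a -log10(p-value) of about 2.37, and the three peak LOD scores round to the p-values of 0.002, 0.002, and 0.003 reported in the text.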
Figure 1

Genome-wide linkage and association analysis. The solid curve is the -log10(p-value) of the multipoint LOD score from MERLIN. The gray line is the -log10(p-value) of the likelihood ratio test of association from LAMP. SNPs that are significantly associated with RA after Bonferroni adjustment are circled.

Among the 87,181 SNPs that were genotyped by the Affymetrix 100 K platform and passed initial data quality checking, 145 had a p-value < 0.001 using our test of HWE. These SNPs were excluded from subsequent association analysis because LAMP assumes HWE at the tested SNP in the general population. For the remaining 87,036 SNPs, we performed single-marker association analysis using LAMP (Figure 1). We corrected for multiple testing using the Bonferroni criterion and controlled the family-wise error rate at $\alpha_{\text{genome}} = 0.05$. We identified 13 significantly associated SNPs at the genome-wide level, but none of them fell in the linkage peaks identified using the 60 families (Table 2).
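The per-SNP significance threshold implied by the Bonferroni criterion follows directly from the SNP counts given above. The sketch below is only arithmetic on those counts, and the variable and function names are illustrative.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that controls the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


n_passed_qc = 87_181    # SNPs passing call-rate and MAF filters
n_failed_hwe = 145      # SNPs with HWE test p-value < 0.001
n_tested = n_passed_qc - n_failed_hwe

threshold = bonferroni_threshold(0.05, n_tested)
print(n_tested, f"{threshold:.2e}")
```

A SNP is declared genome-wide significant when its single-marker p-value falls below this threshold, which is roughly 5.7 × 10^-7 for 87,036 tests.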
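Table 2

Significantly associated SNPs after Bonferroni correction with $\alpha_{\text{genome}} = 0.05$ using LAMP

| Chr | SNP | Position (bp) | MAF | LRT | p-Value |
| --- | --- | --- | --- | --- | --- |
| 1 | rs12564931 | 83,251,950 | 0.3028 | 30.49 | $2.39 \times 10^{-7}$ |
| 2 | SNP_A-1732798 | 142,778,205 | 0.0784 | 36.24 | $1.35 \times 10^{-8}$ |
| 4 | rs4834009 | 126,300,977 | 0.1523 | 35.00 | $2.51 \times 10^{-8}$ |
| 4 | rs10517834 | 166,748,108 | 0.3014 | 34.77 | $2.82 \times 10^{-8}$ |
| 5 | rs10520893 | 23,724,883 | 0.0839 | 34.49 | $3.24 \times 10^{-8}$ |
| 7 | rs6593179 | 54,518,391 | 0.2803 | 29.98 | $3.09 \times 10^{-7}$ |
| 9 | rs680471 | 32,506,949 | 0.2953 | 33.43 | $5.05 \times 10^{-8}$ |
| 9 | rs4111290 | 66,457,420 | 0.2821 | 30.16 | $2.82 \times 10^{-7}$ |
| 10 | rs10509272 | 67,769,079 | 0.2739 | 29.75 | $3.47 \times 10^{-7}$ |
| 13 | rs10492477 | 65,139,738 | 0.1029 | 48.26 | $3.31 \times 10^{-11}$ |
| 15 | rs2090622 | 19,204,681 | 0.3038 | 35.69 | $1.78 \times 10^{-8}$ |
| 15 | rs10519774 | 31,024,477 | 0.2566 | 29.79 | $3.40 \times 10^{-7}$ |
| 18 | rs1115947 | 62,064,985 | 0.2853 | 32.00 | $1.13 \times 10^{-7}$ |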
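The LRT statistics and p-values in Table 2 can be cross-checked against each other. For a $\chi^2$ distribution with exactly two degrees of freedom, the upper-tail probability has the closed form $\exp(-x/2)$. The sketch below assumes that reference distribution (the text describes the degrees of freedom as approximately two) and uses the four SNPs that are also named in the text. The function name is hypothetical.

```python
import math


def lrt_pvalue_2df(lrt):
    """Upper-tail p-value of a chi-square statistic with two degrees of freedom."""
    if lrt < 0:
        raise ValueError("LRT statistic must be non-negative")
    return math.exp(-lrt / 2.0)


# (LRT, reported p-value) pairs taken from Table 2.
reported = {
    "rs10492477": (48.26, 3.31e-11),
    "rs4834009": (35.00, 2.51e-8),
    "rs10520893": (34.49, 3.24e-8),
    "rs10509272": (29.75, 3.47e-7),
}

for snp, (lrt, p_reported) in reported.items():
    print(f"{snp}: LRT = {lrt}, computed p = {lrt_pvalue_2df(lrt):.3g}, reported p = {p_reported:.3g}")
```

The computed values agree with the reported p-values to within rounding, which is consistent with the reported p-values having been derived from a two-degree-of-freedom reference distribution.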
The most strongly associated SNP is rs10492477, located at 13q21. This SNP maps to the PCDH9 gene, which belongs to the protocadherin gene family, a subfamily of the cadherin superfamily. PCDH9 is predominantly expressed in brain, but is also expressed in hairy cell leukemia cells. Hairy cell leukemia can cause polyarthritis through immunity-driven inflammation, which can precede or follow the clinical onset of leukemic symptoms and usually presents as RA [7]. PCDH9 has not been reported as an RA susceptibility locus, suggesting that it is a new candidate gene. The next most strongly associated SNP is SNP_A-1732768, located at 142.8 Mb on chromosome 2. This SNP is ~15 Mb away from the linkage region identified through linkage analysis in Caucasian families in the North American Rheumatoid Arthritis Consortium [8]. In addition, rs4834009 (chromosome 4, 126.3 Mb), rs10520893 (chromosome 5, 23.7 Mb), and rs10509272 (chromosome 10, 67.8 Mb) are all within ~15 Mb of the linkage regions identified by Amos et al. [8]. Although they did not reach genome-wide significance, several other SNPs showed a trend toward association, including SNPs on chromosomes 6, 8, 11, 12, 16, 17, and 20.

Unexpectedly, we did not observe significant association between RA and PTPN22, even though the association with PTPN22 has been replicated extensively. Further examination of the data revealed that of the 42 SNPs in the PTPN22 region that were examined by HapMap, only four were included in the Affymetrix 100 K array set. Surprisingly, we did not observe evidence of association between RA and the HLA complex either. Of the 102 SNPs genotyped in the HLA region, 85 passed our data quality checking, and the most strongly associated SNP had a p-value of 0.05. A recent study of the extended MHC region identified 6338 SNPs [9], whereas only 1.6% of them are included in the Affymetrix 100 K array set. Because association analysis depends critically on the degree of LD between the tested marker and the unobserved disease locus, it is not surprising that, given the limited coverage of the HLA region, the current data did not show evidence of association.

Conclusion

We conducted genome-wide linkage analysis using SNPs genotyped by the Illumina linkage panel and genome-wide association analysis using SNPs genotyped by the Affymetrix 100 K platform on a set of affected relative pairs with RA in the CRAGS. Multipoint nonparametric linkage analysis identified three linkage peaks with a maximum LOD score greater than 1.5. Our single-marker association analysis showed strong evidence of association on chromosomes 1, 2, 4, 5, 7, 9, 10, 13, 15, and 18. Several significantly associated SNPs are located at or close to previously detected RA linkage regions, but not in the linkage peaks identified in the CRAGS. For the well-known RA susceptibility loci HLA-DRB1 and PTPN22, we did not find evidence of association. Further examination of the data revealed that neither region is well covered by the Affymetrix 100 K platform. Another possible reason is that the sample size available to this investigation is limited. Although genome-wide association is a promising approach for finding susceptibility genes for complex diseases, its success depends critically on several factors, including the effect size of the disease genes, LD around the disease loci, and the sample size of the study. Our results indicate that future genome-wide association studies should employ a platform that has better coverage across the genome.

Competing interests

The author(s) declare that they have no competing interests.

References

1.  Pathways to gene identification in rheumatoid arthritis: PTPN22 and beyond.

Authors:  Peter K Gregersen
Journal:  Immunol Rev       Date:  2005-04

2.  A missense single-nucleotide polymorphism in a gene encoding a protein tyrosine phosphatase (PTPN22) is associated with rheumatoid arthritis.

Authors:  Ann B Begovich; Victoria E H Carlton; Lee A Honigberg; Steven J Schrodi; Anand P Chokkalingam; Heather C Alexander; Kristin G Ardlie; Qiqing Huang; Ashley M Smith; Jill M Spoerke; Marion T Conn; Monica Chang; Sheng-Yung P Chang; Randall K Saiki; Joseph J Catanese; Diane U Leong; Veronica E Garcia; Linda B McAllister; Douglas A Jeffery; Annette T Lee; Franak Batliwalla; Elaine Remmers; Lindsey A Criswell; Michael F Seldin; Daniel L Kastner; Christopher I Amos; John J Sninsky; Peter K Gregersen
Journal:  Am J Hum Genet       Date:  2004-06-18

3.  Allele-sharing models: LOD scores and accurate linkage tests.

Authors:  A Kong; N J Cox
Journal:  Am J Hum Genet       Date:  1997-11

4.  Merlin--rapid analysis of dense genetic maps using sparse gene flow trees.

Authors:  Gonçalo R Abecasis; Stacey S Cherny; William O Cookson; Lon R Cardon
Journal:  Nat Genet       Date:  2001-12-03

5.  Joint modeling of linkage and association: identifying SNPs responsible for a linkage signal.

Authors:  Mingyao Li; Michael Boehnke; Gonçalo R Abecasis
Journal:  Am J Hum Genet       Date:  2005-04-05

6.  Efficient study designs for test of genetic association using sibship data and unrelated cases and controls.

Authors:  Mingyao Li; Michael Boehnke; Gonçalo R Abecasis
Journal:  Am J Hum Genet       Date:  2006-03-20

7.  Chronic immunity-driven polyarthritis in hairy cell leukemia. Report of a case and review of the literature.

Authors:  J P Vernhes; T Schaeverbeke; J Fach; L Lequen; B Bannwarth; J Dehais
Journal:  Rev Rhum Engl Ed       Date:  1997-10

8.  High-density SNP analysis of 642 Caucasian families with rheumatoid arthritis identifies two new linkage regions on 11p12 and 2q33.

Authors:  C I Amos; W V Chen; A Lee; W Li; M Kern; R Lundsten; F Batliwalla; M Wener; E Remmers; D A Kastner; L A Criswell; M F Seldin; P K Gregersen
Journal:  Genes Immun       Date:  2006-05-04

9.  A high-resolution HLA and SNP haplotype map for disease association studies in the extended human MHC.

Authors:  Paul I W de Bakker; Gil McVean; Pardis C Sabeti; Marcos M Miretti; Todd Green; Jonathan Marchini; Xiayi Ke; Alienke J Monsuur; Pamela Whittaker; Marcos Delgado; Jonathan Morrison; Angela Richardson; Emily C Walsh; Xiaojiang Gao; Luana Galver; John Hart; David A Hafler; Margaret Pericak-Vance; John A Todd; Mark J Daly; John Trowsdale; Cisca Wijmenga; Tim J Vyse; Stephan Beck; Sarah Shaw Murray; Mary Carrington; Simon Gregory; Panos Deloukas; John D Rioux
Journal:  Nat Genet       Date:  2006-09-24