Saedis Saevarsdottir1,2,3,4, Lilja Stefansdottir5, Patrick Sulem5, Gudmar Thorleifsson5, Egil Ferkingstad5, Gudrun Rutsdottir5, Bente Glintborg6,7, Helga Westerlind2, Gerdur Grondal3,4,8, Isabella C Loft9, Signe Bek Sorensen10, Benedicte A Lie11,12, Mikael Brink13, Lisbeth Ärlestig13, Asgeir Orn Arnthorsson5, Eva Baecklund14, Karina Banasik15, Steffen Bank10, Lena I Bjorkman16, Torkell Ellingsen17,18, Christian Erikstrup19, Oleksandr Frei20,21,22, Inger Gjertsson23, Daniel F Gudbjartsson5,24, Sigurjon A Gudjonsson5, Gisli H Halldorsson5,24, Oliver Hendricks25,26, Jan Hillert27, Estrid Hogdall28, Søren Jacobsen7,29, Dorte Vendelbo Jensen30, Helgi Jonsson3,4, Alf Kastbom31, Ingrid Kockum27, Salome Kristensen32,33, Helga Kristjansdottir8, Margit H Larsen34, Asta Linauskas33,35, Ellen-Margrethe Hauge36,37, Anne G Loft36,37, Bjorn R Ludviksson3,38, Sigrun H Lund5, Thorsteinn Markusson5,3, Gisli Masson5, Pall Melsted5,24, Kristjan H S Moore5, Heidi Munk17,18, Kaspar R Nielsen39, Gudmundur L Norddahl5, Asmundur Oddsson5, Thorunn A Olafsdottir5,3, Pall I Olason5, Tomas Olsson27, Sisse Rye Ostrowski7,34, Kim Hørslev-Petersen25, Solvi Rognvaldsson5, Helga Sanner40,41, Gilad N Silberberg42, Hreinn Stefansson5, Erik Sørensen34, Inge J Sørensen29, Carl Turesson43, Thomas Bergman2, Lars Alfredsson27,44, Tore K Kvien45,46, Søren Brunak15, Kristján Steinsson8, Vibeke Andersen10,17,47, Ole A Andreassen20,21, Solbritt Rantapää-Dahlqvist13, Merete Lund Hetland6,7, Lars Klareskog42, Johan Askling2, Leonid Padyukov42, Ole Bv Pedersen9, Unnur Thorsteinsdottir5,3, Ingileif Jonsdottir5,3,38, Kari Stefansson1,3.
Abstract
OBJECTIVES: To find causal genes for rheumatoid arthritis (RA) and its seropositive (RF and/or ACPA positive) and seronegative subsets.
Keywords: autoantibodies; polymorphism, genetic; rheumatoid arthritis
Year: 2022 PMID: 35470158 PMCID: PMC9279832 DOI: 10.1136/annrheumdis-2021-221754
Source DB: PubMed Journal: Ann Rheum Dis ISSN: 0003-4967 Impact factor: 27.973
Table 1. RA study populations from six Northwestern European countries included in the present study*
| | Total Ca | Total Co | Sweden Ca | Sweden Co | Denmark Ca | Denmark Co | Iceland Ca | Iceland Co | Norway Ca | Norway Co | UK Biobank Ca | UK Biobank Co | FinnGen Ca | FinnGen Co |
| RA overall | 31 313 | 995 377 | 8658 | 9418 | 7662 | 86 964 | 3613 | 341 788 | 881 | 28 517 | 5798 | 402 767 | 4701 | 125 923 |
| Seropositive RA | 18 019 | 991 604 | 6455 | 9423 | 4850 | 86 964 | 1746 | 313 704 | 587 | 28 517 | 913 | 407 652 | 3468 | 145 344 |
| Seronegative RA | 8515 | 1 015 471 | 1852 | 9436 | 2652 | 86 966 | 1069 | 322 808 | 455 | 28 517 | 1051 | 407 514 | 1436 | 143 312 |
| Serology lacking | 4779 | – | 351 | – | 160 | – | 798 | – | 0 | – | 3834 | – | 0 | – |
*The following ICD-10 codes were used, in addition to clinical diagnoses validated by physicians, from case–control studies on RA or Scandinavian rheumatology quality and patient registers: RA overall (M05.8, M05.9, M06.0, M06.8, M06.9), seropositive RA (M05.8, M05.9 and/or positive rheumatoid factor (RF) and/or anti-CCP antibody measurement), seronegative RA (M06.0, M06.8 or M06.9 with negative RF measurement (and negative anti-CCP measurement if available)). See Methods for further details.
Ca, number of cases; Co, number of controls; RA, rheumatoid arthritis.
Table 2. Sequence variants outside the HLA locus that associate with RA overall, seropositive (rheumatoid factor and/or anti-CCP antibody positive) and/or seronegative RA in GWAS meta-analysis within six Northwestern-European countries (table 1). Association results are shown for the lead signals for all three RA groups, and the heterogeneity between the seropositive and seronegative subsets.† Effect alleles with novel associations are marked with *.
| Chr | Position | Effect allele* | Annotation | Seropositive RA OR | Seronegative RA OR | RA overall OR |
| chr1 | 2 800 059 | rs897628-T* | Missense | 0.90 | 0.98 | 0.94 |
| chr1 | 113 834 946 | rs2476601-A | Missense | 1.59 | 1.29 | 1.41 |
| chr1 | 161 506 414 | rs9427397-T* | Missense | 1.11 | 1.02 | 1.07 |
| chr2 | 60 881 694 | rs67574266-A | 5-prime UTR | 1.08 | 1.01 | 1.05 |
| chr2 | 111 119 036 | rs72836346-C* | Upstream gene | 1.14 | 1.01 | 1.10 |
| chr2 | 191 073 180 | rs140675301-A* | Missense | 2.27 | 1.23 | 1.63 |
| chr2 | 191 094 763 | rs4853458-A | Intron | 1.11 | 1.10 | 1.10 |
| chr2 | 203 880 280 | rs11571297-C | Regulatory | 0.89 | 0.95 | 0.92 |
| chr3 | 58 197 909 | rs35677470-A | Missense | 1.13 | 1.16 | 1.10 |
| chr4 | 26 083 889 | rs10517086-A | Intergenic | 1.11 | 1.06 | 1.09 |
| chr5 | 56 148 856 | rs7731626-A | Intron | 0.87 | 0.87 | 0.88 |
| chr6 | 137 678 425 | rs35926684-G | Regulatory | 1.12 | 1.02 | 1.09 |
| chr6 | 159 085 568 | rs2451258-C | Regulatory | 0.91 | 0.99 | 0.96 |
| chr6 | 167 127 770 | rs3093017-C | Intron | 1.11 | 1.04 | 1.07 |
| chr7 | 50 313 596 | rs10261758-G* | Intron | 1.07 | 1.04 | 1.07 |
| chr7 | 128 938 247 | rs2004640-G* | Splice donor | 0.92 | 0.94 | 0.94 |
| chr8 | 11 480 078 | rs2409780-C | Regulatory | 1.09 | 1.05 | 1.08 |
| chr8 | 100 105 506 | rs1471293-A* | 5-prime UTR | 1.08 | 1.04 | 1.05 |
| chr9 | 120 933 192 | rs35942002-A | Upstream gene | 1.09 | 1.05 | 1.06 |
| chr10 | 6 056 986 | rs706778-T | Intron | 1.09 | 1.07 | 1.07 |
| chr10 | 31 122 426 | rs1538981-C | Regulatory | 0.91 | 0.99 | 0.94 |
| chr11 | 64 340 005 | rs479777-C* | Upstream gene | 0.93 | 0.92 | 0.94 |
| chr11 | 118 870 448 | rs7117261-T | Regulatory | 0.90 | 0.94 | 0.92 |
| chr11 | 128 627 057 | rs73013527-C | Intergenic | 1.08 | 1.04 | 1.06 |
| chr12 | 111 446 804 | rs3184504-T | Missense | 1.10 | 1.08 | 1.08 |
| chr13 | 28 029 870 | rs76428106-C* | Intron | 1.35 | 1.15 | 1.23 |
| chr13 | 39 788 092 | rs8002731-C | Intron | 0.92 | 0.94 | 0.93 |
| chr14 | 92 651 884 | rs117068593-T* | Missense | 0.93 | 0.94 | 0.93 |
| chr15 | 69 751 888 | rs11636401-G* | TF binding site | 0.91 | 0.95 | 0.93 |
| chr16 | 85 982 485 | rs9939427-A | Intergenic | 1.10 | 1.06 | 1.07 |
| chr16 | 88 981 246 | rs62045818-C* | Upstream gene | 0.93 | 1.00 | 0.96 |
| chr17 | 39 908 216 | rs11078928-C | Splice acceptor | 1.07 | 1.05 | 1.04 |
| chr19 | 10 352 442 | rs34536443-C | Missense | 0.69 | 0.81 | 0.75 |
| chr19 | 10 359 299 | rs12720356-C* | Missense | 0.87 | 0.90 | 0.90 |
| chr19 | 10 354 167 | rs35018800-A* | Missense | 0.63 | 0.86 | 0.77 |
| chr21 | 35 340 290 | rs8129030-T | Regulatory | 0.92 | 0.96 | 0.95 |
| chr21 | 44 236 891 | rs11558819-T* | Missense | 0.91 | 0.98 | 0.95 |
*Sequence variants that remain significant after adjustment for previously reported sequence variants (online supplemental table 1). Bold indicates candidate causal genes (summarised in figure 2).
†We performed a meta-analysis using logistic regression analysis assuming a multiplicative model, reporting OR and two-sided p values adjusted for year of birth, sex and origin (Iceland) or the first 20 principal components (other countries). Variants were split into five classes based on their genome annotation, and the significance threshold for each class was based on the number of variants in that class. The adjusted significance thresholds are 1.3×10⁻⁷ for variants with high impact (splice donor, splice acceptor, stop gained, frameshift, stop lost, initiator codon), 2.6×10⁻⁸ for variants with moderate impact (missense, splice region, stop retained, inframe indels), 2.4×10⁻⁹ for low-impact variants (synonymous, 5’ UTR, 3’ UTR, upstream and downstream), 1.2×10⁻⁹ for other low-impact variants in DNase I hypersensitivity sites (intronic, intergenic, regulatory-region) and 5.92×10⁻¹⁰ for all other variants not in DNase I hypersensitivity sites. The primary signal at each locus (1 Mb) was selected based on conditional association analysis of all variants at each locus, using Bonferroni corrected p values (0.05×P/class-specific p value threshold). We report the coding signal when two markers are equivalent after conditional analysis. Secondary signals are sequence variants that remained GWAS significant after adjustment for the lead signal and other independent (secondary) signals at the locus. When different but correlated variants are lead in RA overall and seropositive RA, the seropositive RA signal is presented here. See further in online supplemental tables 2 and 3.
GWAS, genome-wide association study; Phet, a p value for test of heterogeneity between the effects in seropositive and seronegative RA subsets; RA, rheumatoid arthritis.
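The class-specific thresholds in the footnote can be read as a lookup from annotation to threshold. The following is a minimal sketch of that lookup, not the authors' pipeline: the threshold values and class groupings are taken from the footnote, while the function names, the `in_dhs` flag (whether a variant lies in a DNase I hypersensitivity site) and the p values in the usage note are hypothetical and introduced only for illustration.

```python
# Class-specific significance thresholds, copied from the table footnote.
THRESHOLDS = {
    "high": 1.3e-7,
    "moderate": 2.6e-8,
    "low": 2.4e-9,
    "other_dhs": 1.2e-9,        # other variants inside DNase I hypersensitivity sites
    "other_non_dhs": 5.92e-10,  # all other variants outside such sites
}

# Annotation groupings as listed in the footnote (lower-cased labels).
HIGH = {"splice donor", "splice acceptor", "stop gained",
        "frameshift", "stop lost", "initiator codon"}
MODERATE = {"missense", "splice region", "stop retained", "inframe indel"}
LOW = {"synonymous", "5-prime utr", "3-prime utr",
       "upstream gene", "downstream gene"}


def impact_class(annotation, in_dhs=False):
    """Map an annotation label to one of the five classes.

    `in_dhs` is a hypothetical flag; it only matters for annotations
    (intron, intergenic, regulatory, ...) outside the first three classes.
    """
    label = annotation.strip().lower()
    if label in HIGH:
        return "high"
    if label in MODERATE:
        return "moderate"
    if label in LOW:
        return "low"
    return "other_dhs" if in_dhs else "other_non_dhs"


def is_significant(p_value, annotation, in_dhs=False):
    """True if p_value is below the threshold of the variant's class."""
    return p_value < THRESHOLDS[impact_class(annotation, in_dhs)]


def bonferroni_adjusted(p_value, annotation, in_dhs=False):
    """Bonferroni corrected p value as written in the footnote:
    0.05 x P / class-specific threshold, capped at 1."""
    threshold = THRESHOLDS[impact_class(annotation, in_dhs)]
    return min(1.0, 0.05 * p_value / threshold)
```

With a made-up p value of 1×10⁻⁸, `is_significant(1e-8, "Missense")` is true, whereas `is_significant(1e-8, "Intron", in_dhs=True)` is false. The same p value can therefore pass or fail depending on the annotation class.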
Figure 2. Identification of sequence variants that associate with seropositive RA and the multiomics approaches used to recognise candidate causal genes. (A) Schematic overview of the experimental approach used to identify sequence variants that associate with seropositive RA and their systematic annotation, applying a multiomics approach to identify candidate causal genes, that is, based on whether lead variants or correlated variants (r² >0.8) affect protein coding (online supplemental tables 2–4), mRNA expression (cis-eQTL (online supplemental tables 5 and 6)) or levels of proteins in plasma (pQTL (online supplemental table 7)). (B) Out of 33 lead variant associations outside the HLA locus (online supplemental table 3), 25 candidate causal genes were identified as listed, ranked by effect (OR). All effects are shown for the risk increasing allele based on GWAS in RA study populations from Northwestern Europe (table 1). Associations that are previously unreported in RA are marked with *. Grey boxes highlight where data point to a candidate causal gene. GWAS, genome-wide association study; RA, rheumatoid arthritis.
Figure 1. Effects of the lead sequence variants associated with seropositive RA (18 019 cases) compared with RA overall (31 313 cases, left graph) and seronegative RA (8515 cases, right graph). The x-axis and the y-axis show the estimated ORs on a logarithmic scale for the associations with the three phenotypes. All effects are shown for the RA risk increasing allele based on the current meta-analysis of study populations from six countries in Northwestern Europe (table 1). Error bars represent 95% CIs. The red line represents the slope (SD) based on a simple linear regression through the origin using MAF×(1−MAF) as weights, where MAF is the minor allele frequency. See further results in table 2 and online supplemental tables 2 and 3.
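The regression line described in the legend can be sketched as a weighted least-squares fit with no intercept, where the slope is Σ(w·x·y) / Σ(w·x²) and w = MAF×(1−MAF). This is an illustrative sketch rather than the authors' code. The ORs are taken from table 2, but the MAF values are placeholders (allele frequencies are not given here). Two further assumptions are made: seropositive RA is placed on the x-axis, and each variant is oriented to the allele that increases seropositive RA risk.

```python
import math


def risk_oriented_log_ors(or_x, or_y):
    """Flip both ORs to the allele that increases risk for phenotype x,
    then return their natural logarithms."""
    if or_x < 1.0:
        or_x, or_y = 1.0 / or_x, 1.0 / or_y
    return math.log(or_x), math.log(or_y)


def weighted_slope_through_origin(xs, ys, weights):
    """Weighted least-squares slope b for the model y = b * x (no intercept)."""
    numerator = sum(w * x * y for x, y, w in zip(xs, ys, weights))
    denominator = sum(w * x * x for x, w in zip(xs, weights))
    return numerator / denominator


# (seropositive OR, RA overall OR, MAF). ORs are from table 2 for
# rs2476601-A, rs140675301-A and rs34536443-C; the MAFs are placeholders.
variants = [
    (1.59, 1.41, 0.10),
    (2.27, 1.63, 0.002),
    (0.69, 0.75, 0.03),
]

pairs = [risk_oriented_log_ors(or_pos, or_all) for or_pos, or_all, _ in variants]
xs = [x for x, _ in pairs]
ys = [y for _, y in pairs]
weights = [maf * (1.0 - maf) for _, _, maf in variants]

slope = weighted_slope_through_origin(xs, ys, weights)
print(f"slope = {slope:.2f}")
```

A slope below 1 under these assumptions would indicate that effects on RA overall are, on average, smaller than the effects on seropositive RA for the same alleles.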
Figure 3. STAT4 missense variant rs140675301 is associated with seropositive RA (18 019 cases), is not correlated with previously reported variants at the locus and leads to an amino acid change in a highly conserved area of the protein. (A) Locus plot for the association of variants at the STAT4 locus with seropositive RA. The upper graph illustrates that the intronic variant rs4853458, which is the lead variant at the locus, is not correlated (r² <0.2) with the missense variant rs140675301, which is coloured in purple. The missense variant rs140675301 is highly correlated (r² >0.8) with only one variant, the intronic variant rs189948717 (coloured in red), which has a smaller effect (seropositive RA: OR=1.81, p=3.69×10⁻⁶). Neither of these variants has previously been reported in any disease. The lower graph highlights that the lead variant at the locus (rs4853458, coloured in purple) has many correlated variants, coloured by degree of correlation (r²) with rs4853458. (B) Secondary structure of STAT4 (viewed from two angles) based on a structural model with the STAT1 crystal structure (PDB code: 1yvl.1.A; Mao et al, Molecular Cell 2005;17:761–71) as template. Glu128Val (red) is located in a loop connecting the N-terminal domain (blue), important for tetramer formation of STATs and nuclear translocation, and the coiled coil domain (green), which provides a carbonised hydrophilic surface that binds to regulatory factors.24 α-Helices are drawn as cylinders. Invariant residues are marked with asterisks. (C) Multiple sequence alignment of the conserved STAT4 loop between the N-terminal domain (α8) and the coiled coil (α9) domain, performed with Clustal Omega (https://www.ebi.ac.uk/Tools/msa/clustalo/). RA, rheumatoid arthritis.
Figure 4. The JAK-STAT pathway. The figure and table show which receptors, JAK and STAT subtypes certain cytokines bind to, highlighting proteins encoded by and/or affected by causal genes in seropositive RA, based on the multiomics analysis of sequence variants associated with risk of seropositive RA (shown in bold). Binding of a cytokine to its receptor activates the associated Janus kinases (JAK). The JAK in turn phosphorylates (P) the receptor, which provides a docking site for signal transducers and activators of transcription (STATs) and other signalling molecules to bind to the receptor. STATs also become phosphorylated and translocate to the nucleus, where they regulate gene expression. *Protein targeted by drugs that are registered for RA. **Proteins targeted by drugs registered or in pipeline for other diseases. RA, rheumatoid arthritis.