Literature DB >> 33722698

Genetics of osteoarthritis.

G Aubourg1, S J Rice1, P Bruce-Wootton1, J Loughlin2.   

Abstract

Osteoarthritis genetics has been transformed in the past decade through the application of large-scale genome-wide association scans. So far, over 100 polymorphic DNA variants have been associated with this common and complex disease. These genetic risk variants account for over 20% of osteoarthritis heritability and the vast majority map to non-protein coding regions of the genome where they are presumed to act by regulating the expression of target genes. Statistical fine mapping, in silico analyses of genomics data, and laboratory-based functional studies have enabled the identification of some of these targets, which encode proteins with diverse roles, including extracellular signaling molecules, intracellular enzymes, transcription factors, and cytoskeletal proteins. A large number of the risk variants correlate with epigenetic factors, in particular cartilage DNA methylation changes in cis, implying that epigenetics may be a conduit through which genetic effects on gene expression are mediated. Some of the variants also appear to have been selected as humans adapted to bipedalism, suggesting that a proportion of osteoarthritis genetic susceptibility results from antagonistic pleiotropy, with risk variants having a positive role in joint formation but a negative role in the long-term health of the joint. Although data from an osteoarthritis genetic study has not yet directly led to a novel treatment, some of the osteoarthritis associated genes code for proteins that have available therapeutics. Genetic investigations are therefore revealing fascinating fundamental insights into osteoarthritis and can expose options for translational intervention.
Copyright © 2021 The Author(s). Published by Elsevier Ltd.. All rights reserved.

Entities:  

Keywords:  DNA methylation; Epigenetics; Functional analysis; GWAS; Genetics; SNPs

Mesh:

Year:  2021        PMID: 33722698      PMCID: PMC9067452          DOI: 10.1016/j.joca.2021.03.002

Source DB:  PubMed          Journal:  Osteoarthritis Cartilage        ISSN: 1063-4584            Impact factor:   7.507


Introduction

Comprehensive molecular genetic investigations into multifactorial human diseases have been a scientific highlight of the past decade. Such analyses have garnered insights into fundamental aspects of disease pathophysiology, identifying gene targets and biological pathways for therapeutic intervention. These genetic studies involve the genome-wide association scan (GWAS) of DNA variants, principally mapping alleles at single nucleotide polymorphisms (SNPs), in cases and controls. The use of large cohorts, often involving hundreds of thousands of individuals, and high-density SNP genotyping arrays supplemented by statistical imputation has enabled tens of thousands of DNA variants to be associated with a broad range of diseases. Osteoarthritis (OA) has been subjected to such detailed genetic investigation and each year new OA susceptibility risk loci are reported, along with subsequent mechanistic investigations that help to clarify how genetic risk impacts the cells and tissues of the articulating joint. In this review we summarize the current status of OA genetics, from discovery of risk loci and their functional investigation, through to epigenetics and the clinical utility of the new knowledge.

OA is a polygenic disease

Investigators conducting OA GWAS have focused on cases diagnosed with the typical age-related form of the disease, in which onset occurs from the fifth decade of life. Furthermore, individuals with post-traumatic OA, or another obviously non-genetic cause of disease, are often excluded from analyses. The logic here is that the case group will then consist of individuals who are more likely to have a genetic component underlying their disease. This is the most common form of OA, which consequently confers the highest burden upon society. Selection of OA cases has involved a range of diagnostic tools including X-ray evidence of joint space narrowing, joint pain, or the need to replace the diseased joint due to severe end-stage OA. Cases have been sub-categorized based on disease at particular skeletal sites (principally knees, hips, and hands) and GWAS have been performed both by joint site stratification, and by combining all cases. Studies have encompassed European cohorts, individuals of European descent, Asian cohorts, and African Americans. To increase confidence in the results, studies are collaborative and usually involve a discovery component followed by replication and meta-analysis. Such rigor has enabled geneticists to identify 124 SNPs associated with OA to date (Table I)2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25. These SNPs encompass 95 independent loci spread across the genome, with some loci (for example, at the type XI collagen gene COL11A1) having several SNPs marking separate associations at the locus. OA is therefore a highly polygenic disease.
Table I

The 124 SNPs significantly associated with osteoarthritis risk. The nearest protein coding gene to the SNP is named. EA, effect allele; NEA, non-effect allele; OR, odds ratio; ns, not stated. The physical location of the SNP (hg19 genome assembly) is reported along with the chromatin-state description of the region in the Roadmap ChIP-seq dataset in mesenchymal stem cells (MSCs, E006), cultured chondrocytes (E046), adipose-derived MSCs (E025), cultured adipocytes (E023), and osteoblasts (E129)

ChromosomeAssociation SNPNearerst protein coding geneEA/NEAORVariant (chromatin state)Position (hg19)P-valueCausal geneReference
1rs3753841COL11A1A/G1.08Missense (Transcribed)10,33,79,9185.20 × 10−1022
rs2126643COL11A1C/T1.10Intronic (Transcribed)10,34,04,3842.10 × 10−1422
rs4338381COL11A1A/G1.10Intronic (Promoter)10,35,72,9274.37 × 10−1524
chr1:150214028ANP32EC/CT1.03Intergenic15,02,14,0282.54 × 10−824
rs550034492RABGAP1LTA(17)/T1.03Intronic (Transcribed)17,41,92,4031.05 × 10−824
rs11583641COLGALT2C/T1.083′UTR (Transcribed)18,39,06,2455.58 × 10−10COLGALT224
rs2820436ZC3H11BA/C1.07Intergenic (Enhancer)21,96,40,6802.01 × 10−921
rs2785988SLC30A10A/C1.08Intergenic (Repressed)21,97,44,1383.90 × 10−1022
rs2820443SLC30A10C/T1.06Intergenic (Enhancer)21,97,53,5092.01 × 10−921
1.066.01 × 10−1124
rs10916199ZNF678A/GnsIntronic (Heterochromatin)22,79,02,4722.4 × 10−13WNT9A25
rs10218792KIF26BG/T1.04Intronic (Transcribed)24,57,50,9322.03 × 10−824
2rs2061027LTBP1A/G1.04Intronic (Transcribed)33,43,3363.16 × 10−1324
rs2061026LTBP1A/G1.06Intronic (Enhancer)33,43,5491.40 × 10−1122
rs2862851TGFAC/TnsIntronic (Repressed)7,07,12,8025.20 × 10−1114
rs3771501TGFAA/G1.06Intronic (Repressed)7,07,17,6531.66 × 10−821
1.054.24 × 10−1624
rs12470967SDPRA/G1.06Intergenic19,26,71,9811.50 × 10−824
rs62182810RAPH1A/G1.03Intronic (Transcribed)20,43,87,4821.65 × 10−924
3rs7639618COL6A4P1G/A1.43Missense (Repressed)1,52,16,4297.3 × 10−113
rs62262139RBM6A/G1.04Intronic (Transcribed)5,00,22,0499.09 × 10−1124
rs11177GNL3A/G1.12Missense (Transcribed)5,27,21,3051.25 × 10−108
rs6976GNL3T/C1.123′UTR (Transcribed)5,27,28,8047.24 × 10−118
rs3774355ITIH1A/G1.09Intronic (Repressed)5,28,17,7788.20 × 10−1424
rs678ITIH1T/A1.08Missense (Repressed)5,28,20,9811.60 × 10−922
rs12107036TP63G/A1.21Intronic18,96,00,1602.15 × 10−38
4rs11732213SLBPT/C1.06Intronic (Transcribed)17,04,2448.81 × 10−1024
rs1913707RAB28A/G1.08Intergenic1,30,39,4402.96 × 10−1124
rs34811474ANAPC4G/A1.04Intronic (Transcribed)2,54,08,8382.17 × 10−924
rs11335718ANXA3-/C1.11Intronic (Transcribed)7,95,28,5434.26 × 10−821
rs13107325SLC39A8T/C1.10Missense (Transcribed)10,31,88,7098.29 × 10−1924
5rs10471753PIK3R1G/CnsIntergenic (Enhancer)6,78,18,9523.80 × 10−914
rs35611929AP3B1A/G1.06Intronic (Transcribed)7,74,67,8248.29 × 10−1924
rs3884606FGF18G/A1.04Intronic (Enhancer)17,08,71,0748.25 × 10−924
6rs1800562HFEG/A1.95Missense (Transcribed)2,60,93,1415.0 × 10−1422
rs115740542HIST1H2BCC/T1.06Intergenic (Promoter)2,61,23,5028.59 × 10−924
rs10947262BTNL2C/T1.31Intronic (Repressed)3,23,73,3125.0 × 10−94
rs7775228HLA-DQB1T/C1.34Intergenic (Repressed)3,26,58,0792.43 × 10−84
rs9277552HLA-DPB1C/T1.063′UTR (Transcribed)3,30,55,5012.37 × 10−1024
rs12154055CDC5LG/A1.03Intergenic4,44,49,6972.71 × 10−824
rs10948155SUPT3H/RUNX2C/TnsIntergenic4,46,87,9875.20 × 10−11RUNX214
rs10948172SUPT3H/RUNX2G/A1.14Intronic (Enhancer)4,47,77,9617.92 × 10−8RUNX28
9.00 × 10−1122
rs2396502SUPT3H/RUNX2C/A1.09Intronic (Transcribed)4,53,57,6992.12 × 10−1224
rs1997995SUPT3H/RUNX2G/A1.09Intronic (Transcribed)4,53,74,1831.1 × 10−1122
rs12206662SUPT3H/RUNX2G/AnsIntronic (Transcribed)4,53,76,2211.3 × 10−914
rs80287694BMP5G/A1.12Intronic5,56,36,9402.66 × 10−924
rs12209223FILIP1A/C1.16Intronic7,61,64,5892.9 × 10−1522
1.173.88 × 10−1624
rs9350591FILIP1T/C1.18Intergenic7,62,41,5272.42 × 10−98
7rs143083812SMOT/C2.84Missense (Transcribed)1,28,84,4107.90 × 10−1222
rs11764536HDAC9C/A1.26Intronic1,84,09,9931.60 × 10−922
rs788748IGFBP3A/G0.70Intergenic4,60,26,1812.0 × 10−813
rs11409738DYNC1L1TA/T1.04Intronic (Transcribed)9,57,19,8342.13 × 10−1024
rs3815148COG5C/A1.14Intronic (Transcribed)10,69,38,4204.11 × 10−95
rs4730250DUS4LG/A1.17Intronic (Transcribed)10,72,07,6959.20 × 10−911
rs7792864RNF32C/G2.35Intergenic15,63,46,0874.00 × 10−917
8rs330050PPP1R3BG/C1.04Intergenic (Repressed)90,87,6791.93 × 10−1124
rs4733724GSDMCA/G1.11Intergenic (Enhancer)13,01,23,7287.20 × 10−1222
rs60890741GSDMCC/CA1.11Intronic13,07,68,5034.50 × 10−924
rs11780978PLECA/G1.13Intronic (Transcribed)14,50,34,8521.98 × 10−9PLEC21
9rs10116772GLIS3C/A1.03Intronic (Transcribed)42,90,5413.71 × 10−823
rs10974438GLIS3A/C1.03Intronic (Transcribed)42,91,9281.34 × 10−824
rs116882138MOB3BA/G1.25Intergenic2,73,13,5575.09 × 10−821
rs1078301COL27A1T/A1.07Intergenic (Repressed)11,69,09,1461.4 × 10−1022
rs919642COL27A1T/A1.05Intergenic (Repressed)11,69,11,1478.55 × 10−1524
rs1330349TNCC/G1.08Intronic (Transcribed)11,78,40,7424.10 × 10−1124
rs2480930TNCA/G1.09Intronic (Transcribed)11,78,42,3076.60 × 10−1222
rs4836732ASTN2C/T1.20Intronic (Transcribed)11,92,66,6956.11 × 10−108
rs13283416ASTN2G/T1.10Intronic11,93,01,6075.3 × 10−1422
rs34687269ASTN2A/T1.09Intronic11,94,84,1321.67 × 10−1224
rs10760442LMX1BG/A1.09Intronic (Repressed)12,93,83,9007.60 × 10−1222
rs62578127LMX1BC/T1.09Intronic (Repressed)12,93,86,8602.77 × 10−1224
11rs17659798METTL15A/C1.06Intergenic2,88,74,9972.06 × 10−1024
rs11031191DCDC5T/G1.03Intergenic3,07,74,2801.42 × 10−824
rs2070852F2C/GnsIntronic (Transcribed)4,67,44,9254.7 × 10−825
rs10896015LTBP3G/A1.09Intronic (Enhancer)6,53,23,7257.70 × 10−1022
1.082.74 × 10−924
rs34419890SPTBN2T/C1.13Intergenic (Repressed)6,65,01,6241.99 × 10−824
rs1149620TSKUT/A1.04Intronic (Transcribed)7,65,06,5726.93 × 10−1024
12rs4764133ERP27T/CnsIntergenic1,50,64,3631.80 × 10−15MGP18
ns4.8 × 10−1425
rs10492367PTHLH/KLHL42T/G1.14Intergenic2,80,14,9701.25 × 10−2424
rs10843013PTHLH/KLHL42C/A1.14Intergenic2,80,25,1965.4 × 10−1922
rs12049916CCDC91G/AnsIntronic2,83,59,9852.00 × 10−825
rs79056043LRIG3G/A1.18Intronic (Transcribed)5,82,89,5981.33 × 10−924
rs317630CPSF6T/C1.04Intronic (Transcribed)6,96,37,8471.97 × 10−824
rs11105466ATP2B1A/G1.04Intergenic (Enhancer)9,03,26,9192.16 × 10−824
rs2171126CRADDT/C1.03Intronic (Transcribed)9,41,67,2209.07 × 10−1024
rs835487CHST11G/A1.13Intronic10,50,60,7671.64 × 1088
rs11059094MLXIPT/C1.08Intronic (Transcribed)12,26,06,8377.38 × 10−1124
rs1060105SBNO1C/T1.07Missense (Transcribed)12,38,06,2191.9 × 10−822
rs56116847SBNO1A/G1.06Intronic (Transcribed)12,38,35,2333.19 × 10−1024
rs4765540FAM101AC/T1.083′UTR (Transcribed)12,47,99,6423.4 × 10−922
13rs11842874MCF2LA/G1.17Intergenic (Repressed)11,36,94,5092.04 × 10−87
15rs35912128USP8T/-1.08Intronic (Transcribed)5,07,31,1322.18 × 10−824
rs4775006ALDH1A2A/C1.06Intergenic (Enhancer)5,82,15,7278.40 × 10−1024
rs3204689ALDH1A2C/G1.463′UTR (Transcribed)5,82,46,8023.99 × 10−10ALDH1A212
rs12901372SMAD3C/G1.08Intronic (Enhancer)6,70,78,1685.60 × 10−1122
rs12901071SMAD3A/G1.08Intronic (Enhancer)6,73,70,3893.12 × 10−1020
rs35206230CSKT/C1.04Intergenic (Transcribed)7,50,97,7801.48 × 10−1224
16rs9930333FTOG/T1.05Intronic (Transcribed)5,37,99,9771.52 × 10−924
rs8044769FTOC/T1.11Intronic (Transcribed)5,38,39,1356.85 × 10−88
rs6499244NFAT5/WWP2A/T1.063′UTR (Transcribed)6,97,35,2713.88 × 10−1124
rs34195470WWP2G/A1.07Intronic (Enhancer)6,99,55,6902.7 × 10−1122
rs864839JPH3T/G1.08Intronic (Repressed)8,76,71,4192.01 × 10−821
rs1126464DPEP1G/C1.04Missense (Transcribed)8,97,04,3651.56 × 10−1024
17rs35087650SMG6TT/-1.07Intronic (Transcribed)20,58,3501.18 × 10−924
rs8067763SOX9G/A1.06Intergenic (Repressed)1,00,12,9392.39 × 10−924
rs2953013NF1C/A1.05Intronic (Transcribed)2,94,96,3433.07 × 10−1024
rs62063281MAPTG/A1.10Intronic (Repressed)4,40,38,7855.30 × 10−1224
rs547116051MAPTC/-1.83Intronic4,40,57,8871.50 × 10−824
rs7222178NACA2A/T1.09Intergenic5,96,52,2821.70 × 10−922
A/T1.103.78 × 10−1124
rs2521349MAP2K6A/G1.13Intronic (Transcribed)6,75,03,5019.95 × 10−1021
18rs10502437TMEM241G/A1.03Intronic (Transcribed)2,09,70,7062.50 × 10−824
19rs11880992DOT1LA/GnsIntronic (Transcribed)21,76,4033.20 × 10−1614
rs12982744DOT1LC/G1.17Intronic (Transcribed)21,77,1937.8 × 10−910
rs1560707SLC44A2T/G1.04Intronic (Transcribed)1,07,50,7381.35 × 10−1324
chr19:18,898,330COMPG/C16.70Missense1,88,98,3304.00 × 10−1215
rs375575359ZNF345del/T1.21Intronic (Heterochromatin)3,73,53,3477.54 × 10−921
rs75621460TGFB1A/G1.16Intergenic (Enhancer)4,18,33,7841.62 × 10−15TGFB124
rs4252548IL11T/C1.30Missense (Transcribed)5,58,79,6722.1 × 10−1122
1.321.96 × 10−1224
20rs143384GDF5A/G1.105′UTR (Promoter)3,40,25,7561.40 × 10−1922
1.104.77 × 10−2324
rs143383GDF5T/C1.795′UTR (Promoter)3,40,25,9831.8 × 10−13GDF52
1.176.2 × 10−116
rs6094710NCOA3A/G1.28Intergenic (Enhancer)4,60,95,6492.0 × 10−10NCOA311
21rs6516886RWDD2BT/A1.10Intergenic (Transcribed)3,03,93,6645.84 × 10−8RWDD2B21
rs2836618ERGA/G1.09Intergenic4,00,48,2953.20 × 10−1124
22rs117018441EP300T/G5.89Intergenic (Transcribed)4,15,53,9171.8 × 10−2522
rs532464664CHADLins8/-7.70Missense (Promoter)4,16,34,0884.50 × 10−1815
rs528981060SCUBE1A/G1.68Intronic (Repressed)4,36,62,2412.0 × 10−824
The 124 SNPs significantly associated with osteoarthritis risk. The nearest protein coding gene to the SNP is named. EA, effect allele; NEA, non-effect allele; OR, odds ratio; ns, not stated. The physical location of the SNP (hg19 genome assembly) is reported along with the chromatin-state description of the region in the Roadmap ChIP-seq dataset in mesenchymal stem cells (MSCs, E006), cultured chondrocytes (E046), adipose-derived MSCs (E025), cultured adipocytes (E023), and osteoblasts (E129) Effect sizes of OA risk-conferring alleles are small, with the vast majority of odds ratios (ORs) being <1.5 and with no common large-effect loci of the magnitude seen, for example, at the human leucocyte antigen (HLA) in autoimmune arthritic diseases such as rheumatoid arthritis. The largest OR so far reported for an OA locus is 16.70 to a variant within the cartilage oligomeric matrix protein gene COMP, but this variant is restricted to an extensive Icelandic pedigree and is absent from other European populations (Table I). Overall, OA is an archetypal example of a common, polygenic disease in which disease occurs due to the inheritance of multiple risk alleles of modest individual impact. It also fits with the “liability threshold” model of polygenic diseases. This model considers that many distinct genetic variants can increase susceptibility to a discontinuous disease or trait, such as whether an individual has OA or does not. If an individual reaches the threshold number of those variants and their concomitant effects are exerted, they will consequently develop the disease.

OA risk alleles primarily reside in non-protein coding regions of the genome

There are two main mechanisms by which genetic variation can act on a phenotype. The first is through a direct change to a protein. This could, for example, be the result of a genetic variant changing the DNA code in the coding sequence of a gene, which introduces an amino acid substitution that alters protein function. This is the mechanism by which DNA variants act in many Mendelian diseases but is rare for common diseases. The second mechanism is through altering the regulation of gene expression, leading to an increase or decrease of the level of a gene's mRNA, and subsequently the levels of the encoded protein. This is the mechanism by which the vast majority of variants act in common diseases. Genetic loci at which SNP genotypes are associated with such changes in gene expression levels are termed expression quantitative trait loci (eQTLs), and most of the genetic variants that have so far been associated with common diseases are located in non-coding regions of the genome. In OA, less than 10% of the 124 SNPs so far associated with the disease are located in a protein coding sequence (Fig. 1).
Fig. 1

The genomic locations of the SNPs reported to be associated with OA. UTR, untranslated region.

The genomic locations of the SNPs reported to be associated with OA. UTR, untranslated region.

Functional follow-up of OA genetic signals – statistical fine mapping and in silico analysis

Across the field of common disease research, it has proven challenging biologically to interpret the results of GWAS and consequently the translation of genetic discoveries into effective therapies remains elusive. Significant GWAS hits are reported in the form of lead, or “sentinel”, SNPs which mark a genomic locus associated with disease. These loci can be large and complex regions of high linkage disequilibrium (LD), and this information alone provides little insight into causal variants, target genes, and the molecular mechanisms underpinning disease pathology. Direct functional follow-up of the loci in relevant tissues and disease models is therefore essential, yet this process is laborious, expensive, and can require large numbers of patient samples. Furthermore, causal variants can exert their pathological effects in a spatiotemporal manner, and it is vital that appropriate investigative tools and models are applied. As a result, statistical methods of refining GWAS signals, along with the integration of genome and epigenome-wide public datasets are increasingly being applied post-GWAS to prioritise variants and genes for subsequent functional analyses. Due to the small effect sizes conferred by SNP genotype in most polygenic diseases, simply selecting the SNP at each locus with the lowest P-value is unlikely to provide insight into the most probable causal variant. Bayesian fine-mapping considers all SNPs at a single locus that reach genome-wide significance (P < 5 × 10−8) along with accompanying variants in LD. Posterior probabilities (PP) of causality are applied to identify a credible set of variants, which can contain either single or multiple causal SNPs,. To date, fine-mapping approaches have been applied to several GWAS of OA,,. This includes the largest OA GWAS to date, which mined the full UK Biobank dataset to investigate over 77,000 individuals with the disease. At 6 of the 52 novel OA risk loci reported, the investigators were able to identify a single causal variant with >95% PP; SNPs rs34811474, rs13107325, rs547116051, rs75621460, rs4252548 and rs528981060 in Table I. Post-GWAS, the resolution of fine-mapping can be further enhanced by the integration of genomic and epigenomic functional data. Again, this approach is being increasingly utilised in the OA field, aided in recent years by the generation and subsequent public availability of large datasets generated in human articular cartilage, chondrocytes, and mesenchymal stem cells (Fig. 2, sections A to H). Such datasets have proved invaluable to the OA and musculoskeletal research community and include gene and transcript expression data,, foetal and aged chromatin accessibility data (ATAC-seq),, chromatin state data (ChIP-seq), long range chromatin interactions (Capture HiC), and DNA methylation (DNAm) data44, 45, 46, 47, 48. The inclusion of such data into post-GWAS analysis helps to assign a biological function to OA-associated variants. A prime example of the successful mining of such datasets post-GWAS was reported in a recent study of hand OA. The integration of relevant datasets prioritised the Wnt ligand gene WNT9A, its regulatory elements, and causal variant rs1158850 at the locus. Other public resources have been applied to the OA field to further support in silico analyses. This includes TRANSFAC to identify SNP-mediated changes to transcription factor binding motifs, and GTEx to colocalise eQTLs with associated SNPs. However, as mentioned above, it is vital that such data integrations are correctly interpreted, as many public databases do not include data generated in articular joint tissues and, as such, the continuation of data generation in human musculoskeletal tissues is vital to advance the field.
Fig. 2

Example of how an OA GWAS risk SNP can lead to mechanistic insights and an associated OA risk gene. (A) Manhattan plot of an OA GWAS. Each dot represents a SNP. The -log10 P-value represents the significance of each SNP being preferentially carried by OA patients vs healthy controls. The dotted line represents the genome wide significance threshold. SNPs rs1, rs2 and rs3 are all OA significant risk SNPs. (B) Cartilage mQTL analysis for rs1. Each circle represents a CpG site available on the methylation array. The -log10 P-value represents the significance that methylation at a CpG is associated with rs1 genotype. The dotted line represents the statistical significance threshold after multiple testing correction. The red circle represents a CpG site (cg1) significantly associated with rs1 genotype (this is an mQTL). This mQTL analysis should be done for each SNP that is found to be associated with OA (rs2 and rs3). This analysis is usually limited to the CpGs within the physical proximity of each association SNP (e.g., 1 megabase (Mb) region). (C) Arrows representing the location and transcriptional direction of each gene within the region. rs1 is located within GENE3. (D) Linkage disequilibrium (LD) in CEU population from the HapMap project. Image correlates with the topologically associated domains (TADs), shown by each pyramid, within the region. Red represents regions in LD, with each pyramid encompassing a region that is inherited as a block. In this example both rs1 and cg1 are within the same TAD as are GENE2 and GENE3. These genes would be prioritized as candidate OA risk genes for further analysis. (E) LD link r2 values between rs1 and each SNP within the region. Each bar represents a SNP and the height corresponds to the r2 value between 0 and 1. All SNPs in high LD with rs1 are just as likely to be the functional SNP. (F) UCSC (http://genome.ucsc.edu/) track from GTEx showing all eQTLs operating within the region. Each line represents an eQTL in a tissue type for one of the four genes listed. The position of the bar on the x-axis represents which SNP this operates at. All SNPs with an eQTL are prioritized as functional. (G) UCSC track of ChIP-seq data in chondrocytes and osteoblasts. Green represents transcription sites, yellow enhancer sites and red transcription start sites. Enhancers and transcription start sites are prioritized as functional. (H) Genome wide ATAC-seq data for knee articular cartilage available on UCSC browser. (I) Association between genotype at rs1 and GENE2 expression levels using qPCR. Each dot represents cartilage tissue from a different patient. C allele is associated with increased expression. (J) Violin plot showing association between genotype at rs1 and methylation levels at cg1. The width of the blue violin represents the proportion of patients within that range, the bar the mean and the dotted line the quartile range. C allele at rs1 is associated with decreased methylation at cg1. (K) Allelic expression imbalance (AEI) of GENE2. rs1 C allele is preferentially transcribed compared to the T allele. (L) Association between GENE2 expression levels and cg1 methylation levels, highlighting a methylation-expression QTL (meQTL). Each dot represents cartilage tissue from a different patient. Increased GENE2 expression is associated with decreased cg1 methylation levels. (M) Association between GENE2 AEI and cg1 methylation, again highlighting an meQTL. An increased ratio of rs1 C/T allele transcripts is associated with decreased methylation at cg1. (N) In vitro Lucia reporter analysis with DNA insert containing rs1 and cg1. rs1 C and T alleles are compared to each other with cg1 methylated or unmethylated. C allele and decreased methylation act synergistically to increase enhancer activity. (O) In vitro deletion of cg1 region using CRISPR-Cas9 and subsequent gene expression analysis of GENE2 and GENE3. Each dot represents a biological in vitro replicate. GENE3 expression levels are unchanged whilst GENE2 expression levels are reduced. (P) In vitro methylation and demethylation of cg1 using CRISPR-dCas9 DNMT3a/TET1. On the x-axis, controls are listed by the abbreviation C and cases by the lightning bolt. Methylation changes do not influence GENE3 expression levels. Increased methylation of cg1 decreases GENE2 expression and conversely, decreased methylation increases expression levels. This concludes that methylation at cg1 is only driving GENE2 expression. GENE2 is deemed the target of the OA risk marked by SNP rs1.

Example of how an OA GWAS risk SNP can lead to mechanistic insights and an associated OA risk gene. (A) Manhattan plot of an OA GWAS. Each dot represents a SNP. The -log10 P-value represents the significance of each SNP being preferentially carried by OA patients vs healthy controls. The dotted line represents the genome wide significance threshold. SNPs rs1, rs2 and rs3 are all OA significant risk SNPs. (B) Cartilage mQTL analysis for rs1. Each circle represents a CpG site available on the methylation array. The -log10 P-value represents the significance that methylation at a CpG is associated with rs1 genotype. The dotted line represents the statistical significance threshold after multiple testing correction. The red circle represents a CpG site (cg1) significantly associated with rs1 genotype (this is an mQTL). This mQTL analysis should be done for each SNP that is found to be associated with OA (rs2 and rs3). This analysis is usually limited to the CpGs within the physical proximity of each association SNP (e.g., 1 megabase (Mb) region). (C) Arrows representing the location and transcriptional direction of each gene within the region. rs1 is located within GENE3. (D) Linkage disequilibrium (LD) in CEU population from the HapMap project. Image correlates with the topologically associated domains (TADs), shown by each pyramid, within the region. Red represents regions in LD, with each pyramid encompassing a region that is inherited as a block. In this example both rs1 and cg1 are within the same TAD as are GENE2 and GENE3. These genes would be prioritized as candidate OA risk genes for further analysis. (E) LD link r2 values between rs1 and each SNP within the region. Each bar represents a SNP and the height corresponds to the r2 value between 0 and 1. All SNPs in high LD with rs1 are just as likely to be the functional SNP. (F) UCSC (http://genome.ucsc.edu/) track from GTEx showing all eQTLs operating within the region. Each line represents an eQTL in a tissue type for one of the four genes listed. The position of the bar on the x-axis represents which SNP this operates at. All SNPs with an eQTL are prioritized as functional. (G) UCSC track of ChIP-seq data in chondrocytes and osteoblasts. Green represents transcription sites, yellow enhancer sites and red transcription start sites. Enhancers and transcription start sites are prioritized as functional. (H) Genome wide ATAC-seq data for knee articular cartilage available on UCSC browser. (I) Association between genotype at rs1 and GENE2 expression levels using qPCR. Each dot represents cartilage tissue from a different patient. C allele is associated with increased expression. (J) Violin plot showing association between genotype at rs1 and methylation levels at cg1. The width of the blue violin represents the proportion of patients within that range, the bar the mean and the dotted line the quartile range. C allele at rs1 is associated with decreased methylation at cg1. (K) Allelic expression imbalance (AEI) of GENE2. rs1 C allele is preferentially transcribed compared to the T allele. (L) Association between GENE2 expression levels and cg1 methylation levels, highlighting a methylation-expression QTL (meQTL). Each dot represents cartilage tissue from a different patient. Increased GENE2 expression is associated with decreased cg1 methylation levels. (M) Association between GENE2 AEI and cg1 methylation, again highlighting an meQTL. An increased ratio of rs1 C/T allele transcripts is associated with decreased methylation at cg1. (N) In vitro Lucia reporter analysis with DNA insert containing rs1 and cg1. rs1 C and T alleles are compared to each other with cg1 methylated or unmethylated. C allele and decreased methylation act synergistically to increase enhancer activity. (O) In vitro deletion of cg1 region using CRISPR-Cas9 and subsequent gene expression analysis of GENE2 and GENE3. Each dot represents a biological in vitro replicate. GENE3 expression levels are unchanged whilst GENE2 expression levels are reduced. (P) In vitro methylation and demethylation of cg1 using CRISPR-dCas9 DNMT3a/TET1. On the x-axis, controls are listed by the abbreviation C and cases by the lightning bolt. Methylation changes do not influence GENE3 expression levels. Increased methylation of cg1 decreases GENE2 expression and conversely, decreased methylation increases expression levels. This concludes that methylation at cg1 is only driving GENE2 expression. GENE2 is deemed the target of the OA risk marked by SNP rs1.

Functional follow-up of OA genetic signals – laboratory studies

Following statistical fine mapping and in silico analysis, focussed in vitro and in vivo studies of individual loci are essential to establish causal effects of allelic variation and to validate prioritised gene targets. In the OA field, a range of molecular genetic techniques have been applied for this purpose using patient joint tissues and relevant primary and immortalised cells. Arthroplasty samples of osteoarthritic joint tissues are invaluable to study the effect of genotype upon differential expression of genes (eQTLs) and to identify correlations between genotype and DNA methylation (mQTLs; see below). Due to interindividual variability in gene expression, it is often difficult to detect eQTLs through direct stratification of gene expression levels by SNP genotype. This observation is supported by the large numbers of patient samples required to identify such tissue-specific effects in the GTEx database. A complementary approach, which is considerably more sensitive, is allelic expression imbalance (AEI) analysis, in which the relative ratio of mRNA transcripts produced from each allele of a SNP are quantified,,50, 51, 52, 53, 54, 55, 56, 57. This has proven useful in identifying or narrowing down the effector gene, or genes, at a gene-rich OA locus, such as at chr3p21.1, where significant AEI was identified in patient cartilage for five of the seven investigated genes, including the nuclear protein gene GNL3 and the signal peptidase gene SPCS1. A separate investigation of OA risk conferred by the T allele of rs4764133, identified a reduction in cartilage expression of the matrix Gla protein gene MGP, relative to the non-risk allele C, confirming this gene as the mediator of OA risk at this locus. Of note, this effect was also observed in other joint tissues, whilst the opposite effect was observed in whole blood samples, highlighting the tissue-specificity of the molecular mechanisms underlying disease risk and the biological pleiotropy exerted by risk SNPs. Gene reporter assays have been widely applied to validate regions for regulatory activity in vitro, and can be adapted to investigate the impact of both SNP genotype and DNAm upon regulatory function,. Furthermore, such assays can be used to validate putative microRNA (miRNA) binding sites and miRNA gene targets. In 2019, a massively-parallel reporter assay (MPRA) was applied to investigate 35 known OA risk loci. Klein and colleagues investigated all SNPs in LD at each locus, a total of 1,605, and observed significant differential allelic regulation at six variants. The advent of CRISPR-Cas9 and subsequent development of the Cas9 toolbox has revolutionised targeted editing of the genome and epigenome. For functional analyses of OA risk loci, CRISPR has been used to delete both putative regulatory elements and functional SNPs in Tc28a2 chondrocytes, confirming the RUNX family transcription factor gene RUNX2, the transforming growth factor gene TGFB1, and the collagen galactosyltransferase gene COLGALT2 as targets of OA genetic risk,,. The development of a catalytically dead Cas9 (dCas9) fused to enzymes which either methylate (DNMT3a) or demethylate (TET1) CpGs has allowed precision editing of DNAm at targeted CpGs for the first time. This is increasingly being applied to cell models of OA, and has demonstrated causal relationships between CpG methylation and gene expression at loci where correlative relationships have previously been identified in patient samples. A recent study focussing on the OA risk residing at chr21q21 used a dCas9-TET1 construct for targeted demethylation of a hypermethylated Methylation quantitative trait locus (mQTL) region within the promoter of the RWD domain-containing protein gene RWDD2B. A mean reduction in methylation of 21.5% within the cell population resulted in a 3.8-fold increase in gene expression, specific to RWDD2B, confirming the gene as the target of OA-associated genetic and epigenetic effects at the locus. Functional follow-up studies of GWAS results therefore enable the identification of target genes of OA associated SNPs. Table II lists some of the functional tools and approaches used in the analysis of risk loci whilst Fig. 2 provides a schematic of how one can progress from an association SNP (sections A to H) towards a functional variant (sections I to M) and target gene (sections N to P). The OA targets so far discovered via such approaches include COLGALT2, RUNX2, PLEC (encoding plectin), MGP, ALDH1A2 (encoding aldehyde dehydrogenase), TGFB1, GDF5 (encoding growth differentiation factor 5), and RWDD2B (Table I),,,52, 53, 54, 55, 56, 57,,. These genes encode proteins with diverse roles, including extracellular signaling molecule (GDF5, TGFB1), extracellular calcium regulator (MGP), intracellular enzyme (ALDH1A2), transcription factor (RUNX2) and cytoskeletal protein (PLEC), indicating that OA genetic risk is operating on a broad range of biological functions.
Table II

A list of functional tools and how they are used in OA genetic studies

ToolDefinitionUtility and interpretation of data
Linkage disequilibrium (LD)The non-random association of two alleles within a population. LD ranges from 0 to 1. At perfect LD (r2 = 1) alleles are inherited together as a haplotypeSNPs in LD with the association SNP, and the genomic regions in which they reside, are prioritized for further investigation
ATAC-sequencingAssesses chromatin accessibility at a genome-wide level in cells of interestOpen regions signify potential regulatory elements and are prioritized as functionally important
ChIP-sequencingCategorizes DNA-protein binding regions into transcription start sites, enhancers or other active regulatory sequencesRegions labelled as active are prioritized as functionally important
Expression quantitative trait locus (eQTL)SNP genotype correlates with expression of a gene. Can be measured by allelic expression imbalance (AEI) analysis or by stratifying expression of a gene by GWAS SNP genotypeIf expression correlates with genotype, the gene may be the target of the functional effect mediated by the association signal
Methylation quantitative trait locus (mQTL)SNP genotype correlates with methylation levels at a CpGIf methylation correlates with genotype, that CpG may be an intermediate of the functional effect mediated by the association signal
Methylation and expression quantitative trait locus (meQTL)CpG methylation levels correlate with the expression of a geneThe GWAS SNP may act by differentially methylating DNA to regulate gene expression
Reporter analysisin vitro experiment assessing reporter protein expression driven by a DNA insert cloned at a promoter or enhancer siteCan be used to confirm that a sequence is a regulator of gene expression, that the two alleles of a SNP have differential regulatory activity, and that these effects can be modulated by DNA methylation
CRISPR-cas9Sequence-specific guide RNAs are used to delete or alter the sequence of a DNA region of interest in a cell line or primary cell to assess effect on gene expressionCan reveal a particular gene from amongst many as being the target of the association signal
CRISPR dCas9-DNMT3a/TET1Sequence-specific guide RNAs are used to hyper- or hypo-methylate a DNA sequence to assess effect on gene expressionCan confirm the role of DNA methylation as an intermediate between an association signal and the expression of its target gene
A list of functional tools and how they are used in OA genetic studies Of the above eight genes, six of the respective protein products fall within a network of protein interactions (Fig. 3). Interestingly, two additional proteins in this network, CDC5L (cell division cycle 5 like) and SMAD3 (SMAD family member 3), are also encoded by genes that reside at OA risk loci (Table I),,. This protein–protein interaction (PPI) network indicates that multiple variants on distinct chromosomes can lead to the dysregulation of genes that play vital roles in shared chondrocyte pathways. Furthermore, the network of OA risk gene products (PPI enrichment P = 1.0 × 10−16) is enriched for the Gene Ontology (GO; http://geneontology.org/) processes extracellular matrix organization, tissue development, and cellular response to growth factor stimulus (Fig. 3). Such PPI networks offer mechanistic insight into the liability threshold model of polygenic diseases mentioned earlier, by highlighting that when a pathway is impacted by multiple risk alleles, it is likely to be perturbed.
Fig. 3

A STRING protein–protein interaction network of OA risk gene products. Yellow nodes indicate proteins encoded by OA risk genes COLGALT2, RUNX2, PLEC, MGP, TGFB1 and GDF5. Blue nodes are proteins encoded by unconfirmed causal genes SMAD3 and CDC5L at OA risk loci. Grey nodes are protein interactors. Nodes were functionally annotated with the top three Gene Ontology (GO) processes: extracellular matrix (ECM) organization, green (P = 5.23 × 10−14); tissue development, orange (P = 5.06 × 10−13); and cellular response to growth factor stimulus, purple (P = 1.63 × 10−13). Edge thickness (lines between proteins) indicates the confidence of interactions, based upon experimental evidence, co-expression, databases, and text-mining. STRING website; https://string-db.org/.

A STRING protein–protein interaction network of OA risk gene products. Yellow nodes indicate proteins encoded by OA risk genes COLGALT2, RUNX2, PLEC, MGP, TGFB1 and GDF5. Blue nodes are proteins encoded by unconfirmed causal genes SMAD3 and CDC5L at OA risk loci. Grey nodes are protein interactors. Nodes were functionally annotated with the top three Gene Ontology (GO) processes: extracellular matrix (ECM) organization, green (P = 5.23 × 10−14); tissue development, orange (P = 5.06 × 10−13); and cellular response to growth factor stimulus, purple (P = 1.63 × 10−13). Edge thickness (lines between proteins) indicates the confidence of interactions, based upon experimental evidence, co-expression, databases, and text-mining. STRING website; https://string-db.org/.

Interaction between OA genetics and epigenetics

In recent years, the increase in the numbers of reported OA genetic risk variants, along with technological advances allowing in-depth epigenetic studies in cartilage, have uncovered an interplay between genetics and epigenetics in OA. This was recently the subject of an in-depth review and will only be discussed briefly here. There are three main mechanisms of epigenetic regulation of gene expression: post-translational modification of histones, non-coding RNAs (ncRNAs, such as miRNAs), and DNAm, of which the latter is the most well-studied. Studies of the cartilage DNA methylome have led to the discovery of OA-associated methylation quantitative trait loci (mQTLs), at which there is a correlation between genotype at an OA risk SNP and cis DNAm,,,. DNAm is thought to act as a conduit through which the functional effect of SNP genotype is exerted upon expression of the target gene (Fig. 2). These analyses of the cartilage epigenome have allowed the epigenetic prioritisation of regulatory regions (harbouring the mQTLs), causal SNPs (located within the regions) and effector genes (regulated by the regions). We have recently repeated the mQTL analysis in our dataset of 87 hip and knee arthroplasty cartilage samples to include all OA GWAS SNPs reported to date, including both novel variants and those previously analysed (Table III). Of the 124 SNPs across the 95 independent loci reported in the literature (Table I), we were able to investigate 114, which were either directly genotyped on our array, or for which a suitable proxy was available. Genotype at all SNPs was investigated for correlations with CpG methylation in a 1 megabase (Mb) region flanking the association signal as previously described. Twenty-eight SNPs were shown to mediate mQTLs across 23 different loci (some SNPs had shared effects upon DNAm at CpGs, Table III). This is consistent with our previous observations that 25% of OA risk loci are also mQTLs,,.
Table III

Cartilage methylation quantitative trait loci (mQTLs) at OA association SNPs. The nearest protein coding gene to the mQTL is named. The physical location of the CpG (hg19 genome assembly) is reported. r2∗, linkage disequilibrium (LD) value between association SNP and the proxy SNP used for mQTL analysis; an r2 of 1.0 is perfect LD. r2∗∗, linkage disequilibrium value between the SNPs marking two mQTLs at a locus. rs6976/rs11177∗, SNPs are in perfect LD. FDR, P-values were adjusted to account for the tests performed using a false discovery rate (FDR) estimation based on Benjamini-Hochberg correction

LocusGWAS SNPProxy SNPr2∗SNP chromosomeSNP position (hg19)CpG IDCpG position (hg19)FDRSlopeGeneReferencer2∗∗
1rs11583641rs109114721.00118,39,06,245cg1813158218,39,12,3052.50E-030.52COLGALT263
2rs10916199rs107994280.97122,79,02,472cg0979673922,79,24,0556.89E-070.38SNAP4725
cg1152039522,79,24,1156.83E-030.21
3rs62182810rs23054170.94220,43,87,482cg1011487720,44,27,1994.66E-080.98RAPH163
4rs6976/rs11177∗35,27,28,804cg180994085,25,52,5933.85E-07−0.79GNL3640.89
cg130606425,25,56,6432.86E-020.26
cg272940085,27,48,3592.23E-020.23
cg184040415,28,24,2832.51E-04−0.21
rs67835,28,20,981cg180994085,25,52,5934.43E-06−0.76Novel to this analysis
cg130606425,25,56,6431.36E-020.28
cg272940085,27,48,3592.95E-020.23
cg184040415,28,24,2832.75E-06−0.24
5rs11732213rs7987561.00417,04,244cg2098736915,79,5725.00E-030.39SLBP63
4cg2500779915,79,6575.35E-030.73
6rs9277552rs92774640.7763,30,55,501cg021976343,30,48,8751.43E-020.96HLA-DPB163
cg254917043,30,48,8794.38E-030.86
cg139212453,30,53,7916.21E-030.40
cg023755853,30,91,1113.81E-02−0.64
7rs10948155rs5291250.8264,46,87,957cg139797084,46,95,3182.05E-04−0.36SUPT3H/RUNX2Novel to this analysis0.60
cg209137474,46,95,4273.56E-07−0.66
rs1094817264,47,77,691cg139797084,46,95,3182.05E-04−0.3664
cg209137474,46,95,4273.56E-07−0.66
8rs1997995rs92964590.9464,53,74,183cg254944804,53,87,9174.34E-020.32SUPT3H/RUNX2Novel to this analysis
9rs60890741rs125428560.96813,07,68,504cg1817054513,10,80,1371.59E-02−0.35ASAP163
10rs11780978814,50,34,852cg1940517714,50,01,4282.48E-170.77PLEC54
cg2078495014,50,02,5221.31E-050.47
cg0187083414,50,02,8351.09E-050.38
cg24891660†14,50,03,6533.81E-020.42
cg0742747514,50,08,1106.18E-18−1.24
cg0233183014,50,08,2884.48E-08−0.99
cg0425539114,50,08,3978.46E-16−0.84
cg1459884614,50,08,9097.25E-22−1.24
cg2329925414,50,08,9572.81E-18−1.24
cg1029994114,50,09,1373.44E-02−0.33
cg21511203†14,50,15,0372.93E-020.25
cg07212837†14,50,68,7734.34E-02−0.11
11rs2070852114,67,44,925cg033390774,71,65,0577.53E-030.38F225
12rs10896015rs122700541.00116,53,23,725cg218908206,53,08,6454.45E-020.38LTBP3Novel to this analysis
13rs4764133rs116143330.98121,50,64,363cg209170831,51,14,2331.76E-020.27MGP63
14rs317630rs4908721.00126,96,37,847cg223756636,97,25,4351.31E-111.06CPSF663
15rs10601051212,38,06,219cg2174528712,34,64,5112.23E-020.39SBNO1Novel to this analysis0.12
cg1016951512,37,07,5366.89E-070.77
rs56116847rs284668870.961212,38,35,233cg1016951512,37,07,5364.36E-02−0.46Novel to this analysis
16rs3204689rs3204689155,82,46,802cg120319625,83,53,8491.12E-07−0.62ALDH1A264
17rs35206230rs13789401.00157,50,97,780cg200407477,47,15,1051.97E-03−0.29CSK63
cg102534847,51,65,8961.10E-020.36
18rs6499244rs13640630.90166,97,35,271cg267362006,99,51,7061.06E-02−0.48WWP2630.26
cg266619226,99,51,8206.16E-03−0.45
rs34195470rs9048090.79166,99,55,690cg267362006,99,51,7061.20E-26−1.02Novel to this analysis
cg266619226,99,51,8204.85E-22−0.90
19rs2953013172,94,96,343cg132631042,96,71,2934.86E-02−0.33NF163
20rs62063281rs169406651.00174,40,38,785cg171177184,36,63,2081.20E-261.65MAPT63
cg108266884,37,14,9921.07E-03−0.35
cg152957324,39,42,1283.89E-04−0.39
cg111172664,39,71,4615.30E-03−0.46
cg165203124,39,71,4712.26E-03−0.31
cg188789924,39,74,3441.95E-141.40
cg182280764,39,83,3627.91E-11−1.10
cg073681274,42,30,9944.34E-020.44
cg156333884,42,66,5301.16E-18−1.52
cg236165314,42,69,2585.20E-05−0.41
21rs10502437182,09,70,706cg155358962,10,79,3811.68E-021.20TMEM241Novel to this analysis
22rs143383rs60877040.91203,40,25,756cg147522273,40,00,4818.54E-030.78GDF5
23rs6516886rs28321551.00213,03,93,664cg18001,4273,03,91,7843.81E-02−0.53RWDD2B54
cg202202423,03,92,1881.05E-05−0.68
cg247513783,03,96,3493.09E-04−0.23
cg161402733,04,55,6161.86E-020.34
Cartilage methylation quantitative trait loci (mQTLs) at OA association SNPs. The nearest protein coding gene to the mQTL is named. The physical location of the CpG (hg19 genome assembly) is reported. r2∗, linkage disequilibrium (LD) value between association SNP and the proxy SNP used for mQTL analysis; an r2 of 1.0 is perfect LD. r2∗∗, linkage disequilibrium value between the SNPs marking two mQTLs at a locus. rs6976/rs11177∗, SNPs are in perfect LD. FDR, P-values were adjusted to account for the tests performed using a false discovery rate (FDR) estimation based on Benjamini-Hochberg correction Several OA risk loci have been identified which colocalise with genes encoding histone-modifying proteins including the transcriptional regulator SUPT3H, the nuclear receptor coactivator NCOA3 and the histone methyltransferase DOT1L8, 9, 10, 11,. DOT1L expression is essential for cartilage homeostasis, and the identification of this region as a risk locus strongly supports the requirement for tight regulation of chromatin state to maintain cartilage integrity. Similarly, OA risk SNPs have been identified in the region of cartilage-specific ncRNAs that are known to be vital for homeostasis of the articular joint surface and are dysregulated in OA, including the miRNAs miR-140 and miR-45523,25. Our knowledge of the interplay between genetics and epigenetics is rapidly developing as novel technologies emerge that allow interrogation of the epigenome. Future analyses require in-depth studies utilising large sample sizes to further enhance our understanding of the role of epigenetics as a contributor to OA development.

Evolutionary origins and developmental effects of OA genetic risk

One of the most fascinating investigations of OA genetics in recent years has been into the role of natural selection as humans adapted to bipedalism and its relationship to OA risk. As noted above, the vast majority of OA SNPs reside in non-coding regions of the genome and are presumed to regulate gene expression. In a series of publications, Capellini and colleagues have shown that a large proportion of these SNPs are enriched in chondrocyte regulatory elements that are active during joint formation in the embryo, that these elements have been subjected to selection to facilitate bipedalism, and that their DNA sequences have then been constrained to further changes to preserve the derived joint shape,65, 66, 67. It would appear, therefore, that a proportion of OA risk alleles are subtly changing these regulatory elements (for example, by quantitatively altering the binding of gene regulatory proteins), that this is tolerated due to only modest effects on joint morphology but that these changes become pathological and detrimental as we age. Since these negative effects only impact the elderly, there is no selective pressure acting against the alleles. It is proposed that these OA risk alleles have achieved appreciable frequency in the population by drift or by antagonistic pleiotropy. The latter is when an allele has a positive phenotypic effect in the young but a negative phenotypic effect in the elderly and is a process thought to contribute to the occurrence of many common age-associated diseases. These deleterious effects of allelic drift and of pleiotropic alleles will become increasingly important to public health as populations age and the prevalence of OA continues to increase. The concept that OA has etiological routes in joint development is not a new one69, 70, 71, 72 and many studies have considered the role of OA SNPs as modulators of joint shape73, 74, 75, 76, 77. However, the data from the Capellini team provides evidence that a proportion of OA genetic risk is instigated at the start of life. This has profound implications in that joint development may be as important, if not more so, than joint maintenance when considering fundamental causes of the disease. There are also consequences for the functional and genomic studies of OA GWAS signals. At present, when primary tissues are investigated these tend to be donated by OA patients who have undergone arthroplasty. In future, developmental tissues should also be analyzed. This will enable further understanding of the interaction between our DNA and its own self-regulatory mechanisms, how this relationship adapts throughout the life-course, and when and how functional effects are exerted, leading to disease. As an example, one could examine foetal tissue to assess if OA associated eQTLs identified in OA patients are operating during development. If so, this would imply that the gene regulatory elements that are the targets of OA genetic risk are also functionally active during development and that they then go on to predispose to OA as we age.

Clinical utility of OA genetics

Although the biological complexity of OA is daunting, the progress being made in our molecular genetic understanding of the disease has the potential to inform and accelerate development of prognostic markers and tailored OA therapeutics. Indeed, several disease modifying osteoarthritis drugs (DMOADs) that are currently in clinical trials, including intra-articular TGF-β and FGF-18 growth factor therapies and Wnt inhibitors, are targeting proteins whose genes have been highlighted by GWAS,. The link between OA genetics and epigenetics opens up the latter as a therapeutic option. As noted earlier, several OA risk loci colocalise with genes encoding histone-modifying proteins,,,, and histone deacetylase (HDAC) inhibitors do show efficacy as inhibitors of the expression of catabolic molecules, such as matrix metalloproteinases (MMPs) and IL-1, in OA chondrocyte and mouse OA models. The ability to modulate DNAm using CRISPR-Cas9 tools is also promising. In the functional study cited above, a dCas9-TET1 construct was used for demethylation of a hypermethylated mQTL within the promoter of RWDD2B. The resulting increase in expression of RWDD2B reversed the impact of the OA genetic risk at this locus. Although undertaken in an immortalized cell line, this study highlights the possibility of using epigenome editing to counter the gene expression effects of a risk locus. The use of DMOADs and epigenetic interventions will need to consider how best to deliver a therapeutic into the joint tissue and ensure its permanency, and which patients to target for which therapies. This latter issue may be aided by the application of OA polygenic risk scores (PRS). As ever more GWAS loci are reported for a given disease, there is an enhanced ability to stratify an individual's probability of developing disease based upon frequency and effect size of their inherited variants, generating a PRS. The development of PRS has the potential to accurately screen for disease risk among populations and can inform interventions,. There are therefore reasonable grounds to be optimistic that the data emerging from OA genetic studies, and from the genomic analyses that complement these, will be applied to patient treatment. Based on the likely developmental origin of some OA risk, translation of genetic discoveries will need to consider the time in an individual's life at which it is most optimal to start a treatment.

Concluding remarks

Many OA genetics reviews have been written over the years and each new one provides further insight into this common, complex disease. This reflects the speed at which the field is advancing, with new discoveries being reported on a regular basis. The bedrock of this is the highly successful application of GWAS to large cohorts, which have so far yielded over 100 OA GWAS SNPs (Table I). From these studies, it is apparent that OA genetic risk is operating on a range of biological mechanisms encompassing articular joint formation, homeostasis and maintenance (Fig. 3), with impacts on the expression of synovial joint tissue genes being common. Although impressive, the current proportion of the heritability accounted for by known OA risk loci is just over 20%, meaning that there are still a large number of loci to be discovered. Efforts are underway to fill the gap and at an impressive scale, with large cohorts from across the globe being investigated (https://www.genetics-osteoarthritis.com/home/index.html). What is particularly exciting about these studies is that ethnic groups who have been underrepresented in OA GWAS are now starting to be included, which should enable a clearer picture to be generated of the genetic architecture of OA at a more global population level. Ongoing GWAS should continue to be complemented by follow-up genomic data analyses. This should not be restricted to cartilage nor to a particular age group: as touched on above, the molecular genetics of OA should be considered as a risk running throughout the life-course and not restricted to older individuals. Although for many loci it is possible to highlight a causal SNP and then prioritize a target gene by statistical fine mapping and in silico analysis, this is not always the case. The application of laboratory-based functional tools is essential to validate a target and to elucidate one when fine-mapping draws a blank. Going forward, the use of cells with enhanced chondrogenic potential combined with three dimensional (3D) models of cartilage, and of cartilage with bone, will provide more robust and realistic cell and organ models for the functional analyses of OA SNPs and their target genes. These models will also enable the inclusion of mechanical load as an experimental parameter, further aligning them with the in vivo reality of an articulating joint. A complementary approach to such in vitro functional studies is the investigation of risk alleles in mice. This has so far proven particularly insightful for the OA risk that maps to the GDF5 locus and similar reports are appearing in the literature. A recent database of genes associated with OA in animal models can complement these investigations. The degree to which the functional characterization of OA GWAS signals will benefit from animal models is open to debate, but clearly there are grounds for optimism,. Finally, although no OA genetic study has yet led to a diagnostic tool or treatment, some of the OA-associated genes encode proteins that are active in pathways which have therapeutics available (Fig. 3), several of which are in clinical trials,. Genetic discoveries do therefore have translatable potential and as more GWAS loci are reported, the current gap between discovery and utility will narrow.

Author contributions

All authors were involved in drafting the article and revising it critically for intellectual content, and all authors approved the final version to be submitted.

Conflict of interest

The authors declare that they have no conflicts of interest related to the content of this manuscript.
  88 in total

1.  Heritability of Threshold Characters.

Authors:  E R Dempster; I M Lerner
Journal:  Genetics       Date:  1950-03       Impact factor: 4.562

2.  Discovery and analysis of methylation quantitative trait loci (mQTLs) mapping to novel osteoarthritis genetic risk signals.

Authors:  S J Rice; K Cheung; L N Reynard; J Loughlin
Journal:  Osteoarthritis Cartilage       Date:  2019-06-05       Impact factor: 6.576

3.  Whole-genome sequencing identifies rare genotypes in COMP and CHADL associated with high risk of hip osteoarthritis.

Authors:  Unnur Styrkarsdottir; Hannes Helgason; Asgeir Sigurdsson; Gudmundur L Norddahl; Arna B Agustsdottir; Louise N Reynard; Amanda Villalvilla; Gisli H Halldorsson; Aslaug Jonasdottir; Audur Magnusdottir; Asmundur Oddson; Gerald Sulem; Florian Zink; Gardar Sveinbjornsson; Agnar Helgason; Hrefna S Johannsdottir; Anna Helgadottir; Hreinn Stefansson; Solveig Gretarsdottir; Thorunn Rafnar; Ina S Almdahl; Anne Brækhus; Tormod Fladby; Geir Selbæk; Farhad Hosseinpanah; Fereidoun Azizi; Jung Min Koh; Nelson L S Tang; Maryam S Daneshpour; Jose I Mayordomo; Corrine Welt; Peter S Braund; Nilesh J Samani; Lambertus A Kiemeney; L Stefan Lohmander; Claus Christiansen; Ole A Andreassen; Olafur Magnusson; Gisli Masson; Augustine Kong; Ingileif Jonsdottir; Daniel Gudbjartsson; Patrick Sulem; Helgi Jonsson; John Loughlin; Thorvaldur Ingvarsson; Unnur Thorsteinsdottir; Kari Stefansson
Journal:  Nat Genet       Date:  2017-03-20       Impact factor: 38.330

4.  Genome-wide DNA methylation study identifies significant epigenomic changes in osteoarthritic cartilage.

Authors:  Matlock A Jeffries; Madison Donica; Lyle W Baker; Michael E Stevenson; Anand C Annan; Mary Beth Humphrey; Judith A James; Amr H Sawalha
Journal:  Arthritis Rheumatol       Date:  2014-10       Impact factor: 10.995

Review 5.  Epigenetics in osteoarthritis: Potential of HDAC inhibitors as therapeutics.

Authors:  Nazir M Khan; Tariq M Haqqi
Journal:  Pharmacol Res       Date:  2017-08-18       Impact factor: 7.658

Review 6.  Etiology of osteoarthritis: genetics and synovial joint development.

Authors:  Linda J Sandell
Journal:  Nat Rev Rheumatol       Date:  2012-01-10       Impact factor: 20.543

7.  Methylation quantitative trait locus analysis of osteoarthritis links epigenetics with genetic risk.

Authors:  Michael D Rushton; Louise N Reynard; David A Young; Colin Shepherd; Guillaume Aubourg; Fiona Gee; Rebecca Darlay; David Deehan; Heather J Cordell; John Loughlin
Journal:  Hum Mol Genet       Date:  2015-10-13       Impact factor: 6.150

8.  Prioritization of PLEC and GRINA as Osteoarthritis Risk Genes Through the Identification and Characterization of Novel Methylation Quantitative Trait Loci.

Authors:  Sarah J Rice; Maria Tselepi; Antony K Sorial; Guillaume Aubourg; Colin Shepherd; David Almarza; Andrew J Skelton; Ioanna Pangou; David Deehan; Louise N Reynard; John Loughlin
Journal:  Arthritis Rheumatol       Date:  2019-06-27       Impact factor: 10.995

9.  Multi-Tissue Epigenetic and Gene Expression Analysis Combined With Epigenome Modulation Identifies RWDD2B as a Target of Osteoarthritis Susceptibility.

Authors:  Eleanor Parker; Ines M J Hofer; Sarah J Rice; Lucy Earl; Sami A Anjum; David J Deehan; John Loughlin
Journal:  Arthritis Rheumatol       Date:  2020-12-01       Impact factor: 10.995

View more
  5 in total

1.  Familial Clustering and Genetic Analysis of Severe Thumb Carpometacarpal Joint Osteoarthritis in a Large Statewide Cohort.

Authors:  Catherine M Gavile; Nikolas H Kazmers; Kendra A Novak; Huong D Meeks; Zhe Yu; Joy L Thomas; Channing Hansen; Tyler Barker; Michael J Jurynec
Journal:  J Hand Surg Am       Date:  2022-09-29       Impact factor: 2.342

Review 2.  Genetic biomarkers in osteoarthritis: a quick overview.

Authors:  Ignacio Rego-Pérez; Francisco J Blanco; Alejandro Durán-Sotuela; Paula Ramos-Louro
Journal:  Fac Rev       Date:  2021-11-10

3.  Associations between adipokines gene polymorphisms and knee osteoarthritis: a meta-analysis.

Authors:  Yuqing Wang; Fanqiang Meng; Jing Wu; Huizhong Long; Jiatian Li; Ziying Wu; Hongyi He; Haochen Wang; Ning Wang; Dongxing Xie
Journal:  BMC Musculoskelet Disord       Date:  2022-02-22       Impact factor: 2.362

4.  Can genetics guide exercise prescriptions in osteoarthritis?

Authors:  Osvaldo Espin-Garcia; Madhu Baghel; Navraj Brar; Jackie L Whittaker; Shabana Amanda Ali
Journal:  Front Rehabil Sci       Date:  2022-07-29

5.  Identification of TMEM129, encoding a ubiquitin-protein ligase, as an effector gene of osteoarthritis genetic risk.

Authors:  Abby Brumwell; Guillaume Aubourg; Juhel Hussain; Eleanor Parker; David J Deehan; Sarah J Rice; John Loughlin
Journal:  Arthritis Res Ther       Date:  2022-08-08       Impact factor: 5.606

  5 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.