Catriona L K Barnes, Caroline Hayward, David J Porteous, Harry Campbell, Peter K Joshi, James F Wilson.
Abstract
Orkney and Shetland, the population isolates that make up the Northern Isles of Scotland, are of particular interest to multiple sclerosis (MS) research. While MS prevalence is high in Scotland, Orkney has the highest global prevalence, higher than more northerly Shetland. Many hypotheses for the excess of MS cases in Orkney have been investigated, including vitamin D deficiency and homozygosity: neither was found to cause the high prevalence of MS. It is possible that this excess prevalence may be explained through unique genetics. We used polygenic risk scores (PRS) to look at the contribution of common risk variants to MS. Analyses were conducted using ORCADES (97/2118 cases/controls), VIKING (15/2000 cases/controls) and Generation Scotland (30/8708 cases/controls) data sets. However, no evidence of a difference in MS-associated common variant frequencies was found between the three control populations, aside from HLA-DRB1*15:01 tag SNP rs9271069. This SNP had a significantly higher risk allele frequency in Orkney (0.23, p value = 8 × 10⁻¹³) and Shetland (0.21, p value = 2.3 × 10⁻⁶) than mainland Scotland (0.17). This difference in frequency is estimated to account for 6 (95% CI 3, 8) out of 150 observed excess cases per 100,000 individuals in Shetland and 9 (95% CI 8, 11) of the observed 257 excess cases per 100,000 individuals in Orkney, compared with mainland Scotland. Common variants therefore appear to account for little of the excess burden of MS in the Northern Isles of Scotland.
Year: 2021 PMID: 34088990 PMCID: PMC8560837 DOI: 10.1038/s41431-021-00914-w
Source DB: PubMed Journal: Eur J Hum Genet ISSN: 1018-4813 Impact factor: 4.246
Summary statistics for ORCADES, VIKING and Generation Scotland.
| Population | Sex | Cases (n) | Controls (n) | Total (n) | Mean age of cases (SD) | Mean age of controls (SD) | Mean age, total (SD) |
|---|---|---|---|---|---|---|---|
| ORCADES | M | 28 | 843 | 871 | 54.30 (9.40) | 54.85 (15.20) | 54.83 (15.04) |
| ORCADES | F | 69 | 1275 | 1344 | 49.13 (12.36) | 53.85 (15.54) | 53.61 (15.43) |
| ORCADES | All | 97 | 2118 | 2215 | 50.64 (11.76) | 54.25 (15.41) | 54.09 (15.28) |
| VIKING | M | 3 | 839 | 842 | 60.95 (8.83) | 51.34 (15.47) | 51.37 (15.46) |
| VIKING | F | 12 | 1251 | 1263 | 53.73 (13.10) | 48.93 (15.06) | 48.97 (15.05) |
| VIKING | All | 15 | 2090 | 2105 | 55.28 (12.39) | 49.90 (15.27) | 49.93 (15.26) |
| Generation Scotland | M | 5 | 3574 | 3579 | 45.80 (7.92) | 45.89 (15.26) | 45.89 (15.26) |
| Generation Scotland | F | 25 | 5134 | 5159 | 50.80 (9.53) | 46.48 (14.78) | 46.50 (14.76) |
| Generation Scotland | All | 30 | 8708 | 8738 | 49.97 (9.35) | 46.23 (14.98) | 46.25 (14.97) |
Count and mean age for the Orkney Complex Disease Study (ORCADES), Viking Health Study – Shetland (VIKING) and Generation Scotland cohorts, split by gender and multiple sclerosis (MS) disease status.
Fig. 1 Forest plots of z-scored polygenic risk scores (PRSs) for multiple sclerosis cases and controls in Generation Scotland, ORCADES and VIKING.
PRS calculated using three SNP sets are used for comparison: A. the full SNP set (n = 127), B. the SNP set without HLA-DRB1*15:01 tag SNP rs9271069 (n = 126), and C. a risk score for rs9271069 alone.
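The scoring behind these plots can be illustrated with a minimal sketch. It assumes the common construction of a PRS as a weighted sum of risk-allele dosages, followed by z-scoring across the individuals being compared; the source does not state the exact weighting or standardisation used, and the weights and genotypes below are hypothetical values, not study data.

```python
from statistics import mean, pstdev


def polygenic_risk_score(dosages, weights):
    """Weighted sum of risk-allele dosages (0, 1 or 2 copies per SNP)."""
    if len(dosages) != len(weights):
        raise ValueError("one weight per SNP is required")
    return sum(d * w for d, w in zip(dosages, weights))


def z_scores(values):
    """Centre on the mean and scale by the population standard deviation."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]


# Hypothetical per-SNP weights (e.g. log odds ratios) and genotype dosages.
weights = [0.10, 0.25, 0.90]
cohort = [
    [0, 1, 0],
    [1, 1, 1],
    [2, 0, 2],
    [0, 2, 1],
]

raw = [polygenic_risk_score(person, weights) for person in cohort]
standardised = z_scores(raw)
```

Dropping one SNP from `weights` and from each genotype row gives the analogue of the "without rs9271069" set, and keeping only that SNP gives the single-variant score.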
Logistic regression results for predicting MS risk using polygenic risk scores.
| Model | Estimate | SE | z value | Pr(>\|z\|) | Sig. | AIC | Nagelkerke's R² | AUC | AUC SE |
|---|---|---|---|---|---|---|---|---|---|
| VIKING: Full SNP set | 0.59 | 0.26 | 2.26 | 0.024 | * | 167.73 | 0.075 | 0.762 | 0.055 |
| VIKING: HLA only | 0.31 | 0.23 | 1.37 | 0.172 | | 170.99 | 0.055 | 0.706 | 0.066 |
| VIKING: SNP set without HLA | 0.51 | 0.27 | 1.86 | 0.063 | | 169.27 | 0.066 | 0.753 | 0.047 |
| VIKING: Null model | | | | | | 170.72 | 0.045 | 0.696 | 0.056 |
| ORCADES: Full SNP set | 0.60 | 0.11 | 5.58 | 2.36 × 10⁻⁸ | *** | 696.36 | 0.070 | 0.705 | 0.029 |
| ORCADES: HLA only | 0.40 | 0.09 | 4.31 | 1.60 × 10⁻⁵ | *** | 710.31 | 0.047 | 0.658 | 0.032 |
| ORCADES: SNP set without HLA | 0.45 | 0.11 | 4.11 | 3.98 × 10⁻⁵ | *** | 710.85 | 0.046 | 0.666 | 0.030 |
| ORCADES: Null model | | | | | | 725.93 | 0.019 | 0.600 | 0.029 |
| Generation Scotland: Full SNP set | 0.63 | 0.19 | 3.40 | 0.001 | *** | 372.32 | 0.069 | 0.765 | 0.042 |
| Generation Scotland: HLA only | 0.32 | 0.16 | 1.96 | 0.050 | | 380.24 | 0.048 | 0.731 | 0.045 |
| Generation Scotland: SNP set without HLA | 0.53 | 0.19 | 2.81 | 0.005 | ** | 375.83 | 0.059 | 0.743 | 0.045 |
| Generation Scotland: Null model | | | | | | 381.73 | 0.038 | 0.714 | 0.052 |
Logistic regression results for predicting MS risk using PRS in the ORCADES, VIKING and Generation Scotland cohorts. MS status was used as the dependent variable in each model. Age, sex, principal component 1 and principal component 2 were used as covariates for all models. Three separate PRS were used as the independent variable within each population: the full SNP set (n = 127 SNPs), the full SNP set without HLA-DRB1*15:01 SNP rs9271069 (n = 126), and the HLA-DRB1*15:01 SNP rs9271069 alone (n = 1). The null model was fitted without any PRS. AIC values, Nagelkerke's R² values and AUC values (with associated standard errors) are also included for each model. For readability, only the coefficients of the genetic effect are shown; the coefficients of the covariates are not included. For the null models, only the AIC, Nagelkerke's R² and AUC values are shown, as the null model contains no genetic effect. The Sig. column denotes the significance of the model estimate, where *** indicates a p value < 0.001, ** indicates a p value < 0.01 and * indicates a p value < 0.05.
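The three fit statistics reported for each model can be sketched from their standard definitions. The functions below are illustrative only: the function names are hypothetical, the log-likelihoods are assumed to come from an already fitted logistic regression, and the source does not say which software or AUC estimator was used.

```python
import math


def aic(log_lik, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_lik


def nagelkerke_r2(log_lik_model, log_lik_intercept_only, n):
    """Cox-Snell R^2 rescaled by its maximum attainable value."""
    cox_snell = 1 - math.exp(2 * (log_lik_intercept_only - log_lik_model) / n)
    max_r2 = 1 - math.exp(2 * log_lik_intercept_only / n)
    return cox_snell / max_r2


def auc(case_scores, control_scores):
    """Probability that a random case outscores a random control (ties count half)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))
```

A lower AIC for a PRS model than for the null model, as in the ORCADES rows, indicates that the PRS improves fit beyond the covariates alone.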
Comparison of PRS of MS controls between populations.
| SNP set | Population 1 | Controls (n) | Mean PRS | Population 2 | Controls (n) | Mean PRS | t statistic | p value |
|---|---|---|---|---|---|---|---|---|
| Full | GS | 8708 | −0.047 | ORCADES | 2120 | 0.094 | −5.78 | 8.34 × 10⁻⁹ |
| Full | GS | 8708 | −0.047 | VIKING | 2158 | 0.052 | −4.05 | 5.26 × 10⁻⁵ |
| Full | ORCADES | 2120 | 0.094 | VIKING | 2158 | 0.052 | 1.35 | 0.18 |
| Without rs9271069 | GS | 8708 | −0.01 | ORCADES | 2120 | −0.033 | 1.03 | 0.3 |
| Without rs9271069 | GS | 8708 | −0.01 | VIKING | 2158 | 0.034 | −1.77 | 0.08 |
| Without rs9271069 | ORCADES | 2120 | −0.033 | VIKING | 2158 | 0.034 | −2.22 | 0.03 |
| rs9271069 only | GS | 8708 | −0.063 | ORCADES | 2120 | 0.151 | −8.54 | 2.11 × 10⁻¹⁷ |
| rs9271069 only | GS | 8708 | −0.063 | VIKING | 2158 | 0.07 | −5.33 | 1.05 × 10⁻⁷ |
| rs9271069 only | ORCADES | 2120 | 0.151 | VIKING | 2158 | 0.07 | 2.53 | 0.01 |
Two-sided t test results comparing z-score transformed polygenic risk scores (PRS) of multiple sclerosis controls between Generation Scotland, ORCADES and VIKING, using PRS produced using the full SNP set (n = 127), the SNP set without HLA-DRB1*15:01 tag SNP rs9271069 (n = 126) and a risk score for rs9271069 alone.
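A comparison of this kind can be sketched as follows. The source states only that a two-sided t test was used, so the unequal-variance (Welch) form and the normal approximation for the p value are assumptions made here for a dependency-free illustration; with thousands of controls per cohort the normal approximation is close to the t distribution.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Two-sample t statistic that does not assume equal variances."""
    standard_error = math.sqrt(
        variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b)
    )
    return (mean(sample_a) - mean(sample_b)) / standard_error


def two_sided_p_normal(t):
    """Two-sided p value under a standard normal approximation (large samples)."""
    return math.erfc(abs(t) / math.sqrt(2))


# Hypothetical z-scored PRS values for two small control groups (not study data).
group_a = [1, 2, 3, 4, 5]
group_b = [2, 3, 4, 5, 6]
t_stat = welch_t(group_a, group_b)
p_value = two_sided_p_normal(t_stat)
```

A negative statistic means the first group has the lower mean PRS, which matches the sign convention of the Generation Scotland versus ORCADES rows in the table.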
The contribution of common risk variants to excess MS prevalence in the Northern Isles.
| | | Generation Scotland | VIKING (95% CI) | ORCADES (95% CI) |
|---|---|---|---|---|
| Expected excess MS risk due to all common risk variants | Log(OR) | 0 | | |
| Expected excess MS risk due to all common risk variants | Equivalent number of cases | 0 | 9 (5, 14) | 8 (5, 11) |
| Expected excess MS risk due to common risk variants without HLA-DRB1*15:01 (rs9271069) | Log(OR) | 0 | | |
| Expected excess MS risk due to common risk variants without HLA-DRB1*15:01 (rs9271069) | Equivalent number of cases | 0 | 3 (0, 6) | −1 (−3, 2) |
| Expected excess MS risk due to HLA-DRB1*15:01 (rs9271069) | Log(OR) | 0 | | |
| Expected excess MS risk due to HLA-DRB1*15:01 (rs9271069) | Equivalent number of cases | 0 | 6 (3, 8) | 9 (8, 11) |
| Observed excess MS risk in populations | Log(OR) | 0 | 0.71 (0.51, 0.91) | 1.02 (0.83, 1.21) |
| Observed excess MS risk in populations | Equivalent number of cases | 0 | 150ᵃ | 257ᵃ |
Expected and observed excess MS risk (log of odds ratios) in both VIKING (Shetland; n controls = 2158) and ORCADES (Orkney; n controls = 2120) when compared to Generation Scotland (n controls = 8708). The differences in expected MS risk between Generation Scotland and either ORCADES or VIKING are highlighted in bold. Expected log(OR) values were calculated from the logistic regression results by multiplying the model coefficient by the mean PRS calculated from the full SNP set (n = 127), the full SNP set without HLA-DRB1*15:01 SNP rs9271069 (n = 126) and HLA-DRB1*15:01 SNP rs9271069 alone (n = 1). Observed log(OR) values were calculated from the prevalence data reported by Visser et al. The logistic regression model was fitted to the cohort PRS data, using MS status as the dependent variable, PRS as the independent variable, and age, sex and the first two principal components as covariates.
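The arithmetic described in this caption can be sketched as below. This is an illustration under stated assumptions rather than the authors' exact procedure: the function names are hypothetical, the expected log(OR) is taken as the coefficient multiplied by the difference in mean PRS relative to the reference population, and the baseline prevalence needed to convert a log(OR) into cases per 100,000 is a parameter the reader must supply, since it is not given in this text.

```python
import math


def log_odds_ratio(prevalence, reference_prevalence):
    """Observed log(OR) of one population's prevalence against a reference."""
    odds = prevalence / (1 - prevalence)
    reference_odds = reference_prevalence / (1 - reference_prevalence)
    return math.log(odds / reference_odds)


def expected_log_or(coefficient, mean_prs_difference):
    """Per-SD PRS coefficient scaled by the shift in mean PRS (in SD units)."""
    return coefficient * mean_prs_difference


def excess_cases_per_100k(baseline_prevalence, log_or):
    """Extra cases per 100,000 implied by a log(OR) applied to a baseline."""
    baseline_odds = baseline_prevalence / (1 - baseline_prevalence)
    shifted_odds = baseline_odds * math.exp(log_or)
    shifted_prevalence = shifted_odds / (1 + shifted_odds)
    return (shifted_prevalence - baseline_prevalence) * 100_000


# Hypothetical prevalences of 0.2% and 0.1% (not study data).
observed = log_odds_ratio(0.002, 0.001)
excess = excess_cases_per_100k(0.001, observed)  # recovers the 100 per 100,000 gap
```

Passing an expected log(OR) instead of the observed one gives the share of the excess that the chosen SNP set would be expected to explain.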
ᵃ Taken directly from the observed data.