Paloma Gómez-Fernández1, Aitzkoa Lopez de Lapuente Portilla1,2, Ianire Astobiza1, Jorge Mena1,3, Andoni Urtasun1, Vivian Altmann4, Fuencisla Matesanz5, David Otaegui6, Elena Urcelay7, Alfredo Antigüedad8, Sunny Malhotra9, Xavier Montalban9, Tamara Castillo-Triviño6, Laura Espino-Paisán7, Orhan Aktas10, Mathias Buttmann11,12, Andrew Chan13, Bertrand Fontaine14, Pierre-Antoine Gourraud15,16, Michael Hecker17, Sabine Hoffjan18, Christian Kubisch19, Tania Kümpfel20, Felix Luessi21, Uwe K Zettl17, Frauke Zipp21, Iraide Alloza1,3, Manuel Comabella9, Christina M Lill4,21,22,23, Koen Vandenbroeck1,3,24.
Abstract
The IL22RA2 locus is associated with risk for multiple sclerosis (MS), but causative variants are yet to be determined. In a single nucleotide polymorphism (SNP) screen of this locus in a Basque population, rs28385692, a rare coding variant substituting Leu for Pro at position 16, emerged as significantly associated (p = 0.02). This variant is located in the signal peptide (SP) shared by the three secreted protein isoforms produced by IL22RA2 (IL-22 binding protein-1 (IL-22BPi1), IL-22BPi2 and IL-22BPi3). Genotyping was extended to a Europe-wide case-control dataset and yielded high significance in the full dataset (p = 3.17 × 10⁻⁴). Importantly, logistic regression analyses conditioning on the main known MS-associated SNP at this locus, rs17066096, revealed that this association was independent of the primary association signal in the full case-control dataset. In silico analysis predicted both disruption of the alpha helix of the H-region of the SP and decreased hydrophobicity of this region, ultimately affecting the SP cleavage site. We tested the effect of the p.Leu16Pro variant on the secretion of IL-22BPi1, IL-22BPi2 and IL-22BPi3 and observed that the Pro16 risk allele significantly lowers secretion levels of each of the isoforms to around 50%-60% in comparison to the Leu16 reference allele. Thus, our study suggests that genetically coded decreased levels of IL-22BP isoforms are associated with augmented risk for MS.
Keywords: IL-22 binding protein isoform; IL22RA2; autoimmune; multiple sclerosis; mutation; signal peptide
Year: 2020 PMID: 31936765 PMCID: PMC7017210 DOI: 10.3390/cells9010175
Source DB: PubMed Journal: Cells ISSN: 2073-4409 Impact factor: 6.600
Clinical and demographic features of patients and controls included in the genetic study. 1 SD: standard deviation. 2 RR: relapsing remitting MS. 3 ScP: secondary progressive MS. 4 PP: primary progressive MS. 5 ND: not determined. 6 EDSS: expanded disability status scale.
| Population | Group | Number (% Female) | Age, Average ± SD 1 | RR 2 & ScP 3/PP 4/Other/ND 5 | Age at Onset, Average ± SD | EDSS 6, Mean ± SD |
|---|---|---|---|---|---|---|
| Bilbao | Cases | 647 (72.3) | 42.5 ± 12.01 | 79.6/9/1.4/10 | 30.42 ± 10.17 | 2.9 ± 2.3 |
| | Controls | 573 (60.3) | 44.2 ± 9 | - | - | - |
| Donostia | Cases | 572 (64.8) | 46.4 ± 4.8 | 84.8/3.8/4.8/6.6 | 33.01 ± 11.05 | 2.79 ± 2.7 |
| | Controls | 250 (66) | 50.52 ± 13.26 | - | - | - |
| Barcelona | Cases | 676 (63.3) | 40.17 ± 12.93 | 81.5/14.8/3.7 | 31.6 ± 9.9 | 3.91 ± 2.5 |
| | Controls | 910 (52.7) | 40.2 ± 12.9 | - | - | - |
| Madrid | Cases | 899 (63.7) | 44.8 ± 10.55 | 79.7/6.9/4.7/8.7 | 29.8 ± 8.65 | 2.56 ± 2.13 |
| | Controls | 697 (55.1) | 40.96 ± 16.71 | - | - | - |
| Andalucía | Cases | 1474 (61) | 43 ± 12 | 47.4/1/9/42.6 | 28.87 ± 10.25 | ND |
| | Controls | 1777 (64.4) | 40.22 ± 12.9 | - | - | - |
| Germany | Cases | 3762 (70.2) | 42.2 ± 13.6 | ND | ND | ND |
| | Controls | 2972 (60.1) | 41.1 ± 14.05 | - | - | - |
| France | Cases | 1344 (63.6) | 44.3 ± 11.8 | ND | ND | ND |
| | Controls | 768 (60.4) | 39.6 ± 13 | - | - | - |
Figure 1. A single nucleotide polymorphism (SNP) screen of the IL22RA2 locus in the Bilbao dataset. SNPs, depicted with dots in different colors depending on r2 values with respect to the index SNP rs202573, are plotted as a function of their log-converted p-value (left Y axis) and their position on chromosome 6 according to hg19 assembly of the human genome (X axis). The recombination rate across the locus is provided on the right Y axis. The red line represents the significance threshold (p = 0.05). SNPs genotyped in the primary screening [3] are shown in italics. rs202573, which was genotyped both in the previous and in the present study, is underlined.
Association values of SNPs included in the mapping analysis in the Bilbao dataset. 1 Position is according to the hg19 genome build. 2 RAF: risk allele frequency. 3 OR: odds ratio. 4 CI: confidence interval.
| SNP | Position 1 | Risk Allele | RAF 2 Cases | RAF Controls | Other Allele | p-value | OR 3 (95% CI 4) |
|---|---|---|---|---|---|---|---|
| rs4896239 | 137,448,873 | C | 0.52 | 0.50 | T | 0.19 | 1.116 (0.942–1.31) |
| rs17066096 | 137,452,908 | G | 0.29 | 0.27 | A | 0.26 | 1.132 (0.92–1.34) |
| rs12194034 | 137,458,262 | A | 0.23 | 0.22 | T | 0.65 | 1.047 (0.86–1.274) |
| rs1543509 | 137,465,656 | C | 0.15 | 0.14 | T | 0.92 | 1.012 (0.797–1.285) |
| rs28366 | 137,466,087 | C | 0.24 | 0.23 | T | 0.52 | 1.066 (0.88–1.297) |
| rs276466 | 137,466,614 | A | 0.78 | 0.78 | G | 0.99 | 1.001 (0.799–1.25) |
| rs10484798 | 137,470,756 | A | 0.76 | 0.72 | G | 0.05 | 1.23 (1.0–1.508) |
| rs13217897 | 137,471,327 | G | 0.83 | 0.79 | A | 0.02 | 1.291 (1.05–1.591) |
| rs202573 | 137,473,672 | A | 0.33 | 0.28 | G | 0.007 | 1.273 (1.067–1.518) |
| rs2064501 | 137,477,823 | T | 0.50 | 0.49 | C | 0.65 | 1.039 (0.879–1.226) |
| rs11154914 | 137,480,411 | G | 0.19 | 0.16 | A | 0.06 | 1.23 (0.99–1.524) |
| rs28385692 | 137,482,840 | C | 0.02 | 0.01 | T | 0.05 | 1.972 (0.983–3.954) |
| rs13197049 | 137,491,211 | A | 0.83 | 0.80 | T | 0.03 | 1.260 (1.021–1.556) |
| rs6570136 | 137,494,622 | A | 0.46 | 0.45 | G | 0.85 | 1.017 (0.847–1.222) |
| rs7745487 | 137,496,672 | A | 0.18 | 0.15 | G | 0.10 | 1.201 (0.96–1.496) |
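As an illustration of how an allelic odds ratio and its 95% confidence interval of the kind tabulated above can be derived from a 2×2 table of allele counts, the following is a minimal Python sketch using Woolf's logit method. The allele counts are hypothetical and are not the study's data, the function name `allelic_odds_ratio` is introduced here for illustration only, and the study's own software may have used a different estimator.

```python
import math


def allelic_odds_ratio(case_risk, case_other, ctrl_risk, ctrl_other, z=1.96):
    """Return (OR, lower, upper) for a 2x2 allele-count table.

    Uses Woolf's logit method: the standard error of ln(OR) is the
    square root of the sum of reciprocal cell counts.
    """
    odds_ratio = (case_risk * ctrl_other) / (case_other * ctrl_risk)
    se_log_or = math.sqrt(
        1 / case_risk + 1 / case_other + 1 / ctrl_risk + 1 / ctrl_other
    )
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical allele counts (NOT taken from the study):
# 40 risk / 1960 other alleles in cases, 20 risk / 1980 other in controls.
odds_ratio, ci_low, ci_high = allelic_odds_ratio(40, 1960, 20, 1980)
print(f"OR = {odds_ratio:.3f} (95% CI {ci_low:.3f}-{ci_high:.3f})")
```

For a rare allele such as the risk allele of rs28385692, the small cell counts dominate the standard error, which is one reason the confidence interval for that SNP is much wider than for the common variants in the same table.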
Association values of all SNPs included in the fine mapping conditioned to the index SNP, rs202573. 1 OR: odds ratio. 2 CI: confidence interval.
| SNP | OR 1 (95% CI 2) | p-value |
|---|---|---|
| rs4896239 | 0.95 (0.798–1.131) | 0.5638 |
| rs17066096 | 1.059 (0.8746–1.282) | 0.5569 |
| rs12194034 | 1.025 (0.8412–1.249) | 0.8055 |
| rs1543509 | 0.9821 (0.7719–1.25) | 0.8835 |
| rs28366 | 1.047 (0.8594–1.275) | 0.6497 |
| rs276466 | 0.9779 (0.7803–1.225) | 0.8458 |
| rs10484798 | 0.8452 (0.6874–1.039) | 0.1105 |
| rs13217897 | 0.8249 (0.6608–1.03) | 0.08906 |
| rs2064501 | 0.8337 (0.6674–1.041) | 0.109 |
| rs11154914 | 1.043 (0.7864–1.384) | 0.7687 |
| rs28385692 | 1.799 (0.8654–3.739) | 0.1158 |
| rs13197049 | 0.846 (0.6766–1.058) | 0.1423 |
| rs6570136 | 0.8669 (0.7018–1.071) | 0.1855 |
| rs7745487 | 1.001 (0.7436–1.347) | 0.9947 |
Association values of the haplotypes in blocks. LD blocks were calculated with the confidence interval method using Haploview [25,26].
| Block | Haplotype | Frequency | Case, Control Frequencies | p-value |
|---|---|---|---|---|
| rs4896239 + rs28385692 | TA | 0.501 | 0.479, 0.519 | 0.102 |
| | CG | 0.267 | 0.273, 0.262 | 0.6031 |
| | CA | 0.232 | 0.248, 0.219 | 0.1642 |
| rs12194034 + rs1543509 + rs28366 + rs276466 | TTTA | 0.408 | 0.404, 0.411 | 0.763 |
| | TTTG | 0.22 | 0.216, 0.224 | 0.7026 |
| | ATCA | 0.217 | 0.215, 0.218 | 0.8637 |
| | TCTA | 0.148 | 0.160, 0.139 | 0.2266 |
| rs13217897 + rs202573 + rs2064501 + rs11154914 + rs13197049 | GGCAA | 0.495 | 0.475, 0.512 | 0.1298 |
| | AGTAT | 0.193 | 0.178, 0.206 | 0.1565 |
| | GATGA | 0.172 | 0.192, 0.156 | 0.0561 |
| | GATAA | 0.124 | 0.139, 0.111 | 0.0806 |
| rs6570136 + rs7745487 | GG | 0.551 | 0.543, 0.558 | 0.5396 |
| | AG | 0.278 | 0.265, 0.289 | 0.2853 |
| | AA | 0.171 | 0.192, 0.153 | 0.0373 |
Figure A1. Linkage disequilibrium structure and haplotypes of IL22RA2. On the left, LD plot of the 15 SNPs analyzed in the Bilbao cohort. Blocks were calculated using the confidence interval algorithm implemented in Haploview [25,26]. Only samples genotyped for all SNPs were considered for the haplotype calculation. The numbers inside the squares indicate r2 values, and darker shades of gray represent higher degrees of linkage disequilibrium. On the right, all the haplotypes that were present in the Bilbao cohort, considering all SNPs in a single LD block, are represented. The only haplotype containing the risk (C) allele of the non-synonymous SNP rs28385692 is boxed. SNPs with significant associations and rs28385692 are indicated on top of the right panel.
Figure 2. Forest plots representing effect size estimates (OR, 95% confidence interval) of the risk alleles of rs28385692 (a), rs202573 (b) and rs17066096 (c) in the study populations: Central Spain (Madrid area), South of Spain (Andalucía), North of Spain (Basque Country), East of Spain (Barcelona area), Germany, and France, and in the combined dataset. The size of each dot is proportional to the sample size of the corresponding population.
Association values of the three SNPs in the discovery + validation datasets conditioned on rs17066096 and rs28385692. 1 OR: odds ratio. 2 CI: confidence interval.
| SNP | Reference (minor) Allele | p-value (conditioned on rs17066096) | OR 1 (95% CI 2) (conditioned on rs17066096) | p-value (conditioned on rs28385692) | OR (95% CI) (conditioned on rs28385692) |
|---|---|---|---|---|---|
| rs17066096 | G | NA | NA | 0.001042 | 1.098 (1.039–1.162) |
| rs202573 | A | 0.2424 | 1.029 (0.981–1.079) | 0.3093 | 1.033 (0.9702–1.1) |
| rs28385692 | C | 0.001146 | 1.098 (1.101–1.476) | NA | NA |
Figure A2. Haplotypes formed by the three SNPs genotyped in the Basque Country, Madrid, Andalucía, Barcelona and Germany populations. The three SNPs were considered in a single haplotype block. On the left, LD plot showing pairwise r2 values between the SNPs. On the right, the 5 SNP haplotypes present in the combined cohorts. The only haplotype containing the C allele of rs28385692 is boxed.
Functional predictions of associated SNPs and proxies. SNPs with significant associations in the mapping exercise or in the discovery + validation cohorts and their proxies (r2 > 0.8 in 1000 Genomes Phase III CEU population) were assessed using VEP and RegulomeDB. 1 Minor allele frequency is based on European populations in the 1000 Genomes Project Phase III.
| SNP | Proxy | Major Allele | Minor Allele (Frequency) 1 | Ensembl Consequence | SIFT | PolyPhen | RegulomeDB |
|---|---|---|---|---|---|---|---|
| rs10484798 | rs28362847 | G | A (0.21) | regulatory_region_variant | - | - | 5: TF binding or DNase peak |
| | rs10484798 | G | A (0.21) | intron_variant | - | - | 6: other |
| rs13197049 | rs13217897 | G | A (0.17) | intron_variant | - | - | 3a: TF binding + any motif + DNase peak |
| | rs17175239 | A | G (0.17) | intergenic_variant | - | - | 5: TF binding or DNase peak |
| | rs1961618 | C | T (0.17) | intron_variant | - | - | 5: TF binding or DNase peak |
| | rs12664889 | C | A (0.17) | intron_variant | - | - | 7: no data |
| | rs13197049 | A | T (0.17) | intron_variant | - | - | 7: no data |
| | rs11154913 | A | G (0.17) | intron_variant | - | - | 5: TF binding or DNase peak |
| | rs13193435 | C | A (0.17) | intron_variant | - | - | 5: TF binding or DNase peak |
| | rs7749054 | T | G (0.17) | intergenic_variant | - | - | 6: other |
| | rs13197049 | A | T (0.17) | intron_variant | - | - | 7: no data |
| | rs7766677 | A | C (0.17) | intergenic_variant | - | - | 7: no data |
| rs13217897 | rs13217897 | G | A (0.17) | intron_variant | - | - | 3a: TF binding + any motif + DNase peak |
| | rs13193435 | C | A (0.17) | intron_variant | - | - | 5: TF binding or DNase peak |
| | rs1961618 | C | T (0.17) | intron_variant | - | - | 5: TF binding or DNase peak |
| | rs17175239 | A | G (0.17) | intergenic_variant | - | - | 5: TF binding or DNase peak |
| | rs7766677 | A | C (0.17) | intergenic_variant | - | - | 6: other |
| | rs11154913 | A | G (0.17) | intron_variant | - | - | 7: no data |
| | rs7749054 | T | G (0.17) | intergenic_variant | - | - | 7: no data |
| | rs12664889 | C | A (0.17) | intron_variant | - | - | 7: no data |
| | rs13197049 | A | T (0.17) | intron_variant | - | - | 7: no data |
| rs17066096 | rs17066063 | G | A (0.23) | TF_binding_site_variant | - | - | 3a: TF binding + any motif + DNase peak |
| | rs62420820 | G | A (0.23) | regulatory_region_variant | - | - | 3a: TF binding + any motif + DNase peak |
| | rs72975618 | C | T (0.23) | TF_binding_site_variant | - | - | 4: TF binding + DNase peak |
| | rs1322553 | A | G (0.23) | regulatory_region_variant | - | - | 5: TF binding or DNase peak |
| | rs12214115 | G | T (0.23) | regulatory_region_variant | - | - | 5: TF binding or DNase peak |
| | rs12214014 | C | T (0.23) | regulatory_region_variant | - | - | 5: TF binding or DNase peak |
| | rs17066096 | A | G (0.23) | intergenic_variant | - | - | 6: other |
| rs202573 | rs202571 | T | C (0.31) | intron_variant | - | - | 7: no data |
| | rs202573 | G | A (0.31) | intron_variant | - | - | 7: no data |
| rs28385692 | rs28385692 | T | C (0.03) | missense_variant | Tolerated (0.11) | Benign (0.376) | 5: TF binding or DNase peak |
Features of the individual and consensus computational tools used in this study. Adapted from [42].
| ID Server Name | Individual or Consensus Tool | Website | MLT Based On | Input Parameters | Deleterious Threshold & Outputs | Ref. |
|---|---|---|---|---|---|---|
| | Consensus | | MMAF, linear kernel, radial kernel and polynomial kernel | A score of SIFT, PolyPhen-2, GERP++, Mutation Taster, Mutation Assessor, FATHMM, LRT, SiPhy and PhyloP | D (Deleterious), N (Neutral) and U (Unknown) | |
| | Consensus | | RF | A score of PANTHER, PhD-SNP, SIFT, and SNAP | Disease related or Polymorphic non-synonymous SNVs | |
| | Consensus | | Weighted majority vote consensus | A score of PolyPhen-1, PolyPhen-2, SIFT, MAPP, PhD-SNP and SNAP | Confidence scores and neutral or deleterious | |
| | Consensus | | RF | A score of MutPred, FATHMM, VEST, PolyPhen, SIFT, PROVEAN, Mutation Assessor, Mutation Taster, LRT, GERP, SiPhy, phyloP, and phastCons | Disease variants or rare neutral variants | |
| | Individual | | A linear kernel support vector machine (SVM) | SNVs | Functional, deleterious, and pathogenic variants | |
| | Individual | | Physicochemical properties and alignment score | The original amino acid, the position of the substitution and the new amino acid | Score (0–1). The prediction is damaging if the score is ≤0.05 and tolerated if the score is >0.05 | |
| | Individual | | Conservation method | Genome build, chromosome position, reference allele and substituted allele or Protein ID and variant | (VC) Variant conservation score and (VS) Variant specificity score. Level of functional impact (high, medium, low, neutral) | |
| | Individual | | Alignment scores | The original amino acid, the position of the substitution and the new amino acid | Score (0–1). The prediction is damaging if the score is ≤0.05 and tolerated if the score is >0.05 | |
| | Individual | | ANN | Protein sequence | Non-neutral and neutral, Score and accuracy | |
| | Individual | | Alignment Scores HMM | Protein sequence and substitution | All GO annotations & Phylogenetic annotation | |
| | Individual | | Gradient boosting algorithm | Chromosome position, and protein variation (position, and first amino acid, and second amino acid variant) | If the probability is >0.5 then the SNV is predicted to be Pathogenic, otherwise Benign | |
| | Individual | | Empirical rules | Protein, SNP identifier or Protein sequence in FASTA format and positions of the substitution | Probably damaging or Benign, or Possibly damaging, Sensitivity, specificity, Multiple sequence alignment and 3D Visualization | |
Figure A3. Pathogenicity prediction of rs28385692 using the PredictSNP (a), Meta-SNP (b), and Ensembl VEP (c) web servers. The results indicate the predicted severity of the p.Leu16Pro variant conferred by the SNP rs28385692 for the three IL-22BP isoforms.
Figure 3. Prediction of the effect of the p.Leu16Pro amino acid change on the IL-22BP signal peptide structure and cleavage site of mature IL-22BP. (a) The signal peptide cleavage site, indicated with green (wt, Leu16) or red (Pro16) arrows, was predicted using SignalP 3.0, Phobius and PsiPred software. The cleavage site for the canonical sequence is predicted to occur between positions 21 and 22 by the three software applications used. The p.Leu16Pro variant causes a shift in the predicted cleavage site to between positions 20 and 21 according to Phobius, but not according to SignalP 3.0 and PsiPred, each of which predicted the same cleavage site as for the canonical sequence. All software agreed in predicting a decrease in the length of the H-region in the mutant form. (b) Representation of the composition, hydrophobicity and charge of the amino acids that comprise the signal peptide of IL-22BP. The three domains of the IL-22BP signal peptide are represented based on the overall results obtained in (a), and consist of the N-region, a hydrophilic positively charged N-terminal region; the H-region, a hydrophobic core region; and the C-region, a polar uncharged C-terminal region that is recognized by signal peptidase.
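To make the predicted loss of hydrophobicity concrete, the following minimal Python sketch compares the mean Kyte-Doolittle hydropathy of a hydrophobic core segment before and after a Leu-to-Pro substitution. The Kyte-Doolittle scale is a standard published scale, but it is an assumption that it resembles what the prediction tools named in the caption compute internally. The peptide used here is a hypothetical stand-in and is not the IL-22BP signal peptide sequence; the helper names are likewise introduced only for illustration.

```python
# Kyte-Doolittle hydropathy values (higher = more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(segment):
    """Average Kyte-Doolittle score over all residues in the segment."""
    return sum(KYTE_DOOLITTLE[residue] for residue in segment) / len(segment)


def substitute(sequence, position, new_residue):
    """Return a copy of the sequence with the residue at a 1-based position replaced."""
    return sequence[: position - 1] + new_residue + sequence[position:]


# Hypothetical hydrophobic core (NOT the IL-22BP sequence).
h_region = "LLALLLGLLLAA"
variant = substitute(h_region, 6, "P")  # Leu -> Pro at position 6 of this segment

delta = mean_hydropathy(variant) - mean_hydropathy(h_region)
print(f"wild type: {mean_hydropathy(h_region):.2f}")
print(f"variant:   {mean_hydropathy(variant):.2f}")
print(f"change:    {delta:.2f}")
```

On this scale a single Leu-to-Pro exchange lowers the summed score by 5.4 units, so the shorter the hydrophobic core, the larger the drop in its average hydropathy. Proline's helix-breaking effect is a separate structural consideration that this sketch does not model.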
Figure A4. Signal peptide prediction in the Leu16 and Pro16 variant IL-22BP proteins. SignalP 5.0, PrediSi and Signal-3L predicted the cleavage site for both wild-type and p.Leu16Pro mutant to be the 21st residue of the signal peptide. Panels (a–c) show cleavage sites from SignalP 5.0, PrediSi and Signal-3L, respectively. Panels (d,e) represent the secondary structure predictions from RaptorX and SABLE, respectively.
Figure 4. The Pro16 variant in the SP of the three IL-22BP isoforms is associated with decreased secretion levels compared to the Leu16 variant. (a) HEK293 cells were transfected with the indicated expression plasmids; 24 h later, cells were lysed and the conditioned medium was collected. Intracellular and secreted IL-22BP isoform protein levels were measured by ELISA (mean ± SEM; n = 3; p-values by unpaired t-test). (b) HEK293 cells were transfected with the indicated expression plasmids; 24 h later, cells were lysed and the conditioned medium was subjected to acetone precipitation (AP). Both cell lysates (CL) and AP were resolved by SDS-PAGE under non-reducing conditions and immunoblotted against FLAG (Ponceau staining served as loading control). For AP, the immunoblot membrane was subjected to longer exposure times. (c) HEK293 cells were transiently transfected with the indicated expression vectors (EV denotes empty vector); 24 h later, cells were lysed and RNA was purified. Intracellular IL-22BP protein was immunoaffinity-purified with FLAG agarose and detected by Western blot (WB) in the FLAG-purified fraction, in cell lysates (CL) and in the pass-through fraction (PT). GRP94 detection and Ponceau staining served as loading controls. Transfection efficiency was measured by IL22RA2 RT-qPCR relative to the housekeeping gene ACTB (mean ± SEM of three technical replicates). Note that, as previously observed [20], immunoreactive bands corresponding to intracellular IL-22BP isoforms appear as a series of 43 to 56 kDa bands due to differential N-glycosylation, with secreted IL-22BPi2 gaining ~8 kDa (56 vs. 48 kDa) due to complex N-glycosylation.
Figure A5. The Leu16 to Pro mutation in the signal peptide of IL-22BPi2 is associated with lower intracellular levels. HEK293 cells were transfected with the IL22RA2v2 wild-type (Leu16), mutant (Pro16) or empty expression plasmids; 24 h later, cells were fixed, permeabilized and immunostained with anti-IL-22BP antibody. Cells were analyzed by flow cytometry.
Figure 5. The native signal peptide of IL-22BP does not efficiently mediate secretion of IL-22BPi2. HEK293 cells were transfected with IL-17, IL-22BPi2, and IL-17SP_IL-22BPi2 expression plasmids; 24 h later, cells were lysed and the conditioned medium was subjected to acetone precipitation (AP). Both cell lysates (CL) and AP were resolved by SDS-PAGE under non-reducing conditions and immunoblotted against FLAG and IL-22BP, with tubulin as loading control. Intracellular IL-22BP-reactive bands were scanned, normalized to the corresponding tubulin bands, and represented as fold change relative to wild-type IL-22BP (mean ± SD, n = 2).