Martin Kerick, Marialbert Acosta-Herrera, Carmen Pilar Simeón-Aznar, José Luis Callejas, Shervin Assassi, Susanna M Proudman, Mandana Nikpour, Nicolas Hunzelmann, Gianluca Moroncini, Jeska K de Vries-Bouwstra, Gisela Orozco, Anne Barton, Ariane L Herrick, Chikashi Terao, Yannick Allanore, Carmen Fonseca, Marta Eugenia Alarcón-Riquelme, Timothy R D J Radstake, Lorenzo Beretta, Christopher P Denton, Maureen D Mayes, Javier Martin.
Abstract
Copy number (CN) polymorphisms of complement C4 play distinct roles in many conditions, including immune-mediated diseases. We investigated the association of C4 CN with systemic sclerosis (SSc) risk. Imputed total C4, C4A, C4B, and HERV-K CN were analyzed in 26,633 individuals and validated in an independent cohort. Our results showed that higher C4 CN confers protection against SSc, and that deviations from CN parity of C4A and C4B augmented risk. The protection contributed per copy of C4A and C4B differed by sex: stronger protection was afforded by C4A in men and by C4B in women. C4 CN correlated well with its gene expression and serum protein levels, and less C4 was detected for both in SSc patients. Conditional analysis suggests that C4 genetics contributes strongly to the SSc association within the major histocompatibility complex locus and highlights classical alleles and amino acid variants of HLA-DRB1 and HLA-DPB1 as C4-independent signals.
Year: 2022 PMID: 36198672 PMCID: PMC9534873 DOI: 10.1038/s41525-022-00327-8
Source DB: PubMed Journal: NPJ Genom Med ISSN: 2056-7944 Impact factor: 6.083
Fig. 1 C4 and HERV-K copy numbers and systemic sclerosis risk.
a depicts relative systemic sclerosis (SSc) risk vs total C4 copy number (CN), stratified by C4A CN. The SSc risk score is calculated per individual as the sum of effect sizes (betas) multiplied by the design matrix. Betas of C4A, C4B and C4A:C4B were taken from the most complex model “d” (see “Methods”). Crosses are calculated as average relative risk per rounded C4 CN ± 2 standard deviations (y axis). Linear regression lines are colored by C4A CN and drawn to visualize the interaction effect of C4A and C4B. The y axis contains a color code to aid comparison with (b). b depicts the relative SSc risk of combinations of C4A and C4B CNs. Relative risk is calculated as in (a). Outer circles are drawn according to population frequency ranges of each C4A, C4B combination and highlight the more common combinations. Diagonal dotted lines help to identify combinations of equal total C4 CN. c depicts relative SSc risk in male individuals vs total C4 CN, stratified by C4B CN. Relative risk is calculated as in (a) using effect sizes of C4A, C4B, C4A:C4B, Sex:C4A, and Sex:C4B. Crosses are calculated as average relative risk per rounded C4 CN ± 2 standard deviations (y axis). Cubic regression lines are colored by C4B CN and drawn to visualize the interaction effect of C4A and C4B. d depicts relative SSc risk vs total C4 CN, stratified by HERV-K CN. Relative risk is calculated as in (a) using effect sizes of C4A, C4B, C4A:C4B, HERV-K:C4A, and HERV-K:C4B. Crosses are calculated as average relative risk per rounded C4 CN ± 2 standard deviations (y axis). Linear regression lines are colored by HERV-K CN.
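The risk score described for panel a is a linear predictor: each individual's design-matrix row (C4A, C4B, and their product for the C4A:C4B interaction) is multiplied by the corresponding betas and summed. The following is a minimal Python sketch of that calculation. The beta values are placeholders chosen only so the arithmetic is easy to follow; they are not the model “d” estimates, which are not reproduced in this section, and the function name `risk_score` is hypothetical.

```python
# Sketch of a per-individual risk score: sum over terms of beta * design value.
# ASSUMPTION: the betas below are illustrative placeholders, NOT fitted values
# from model "d"; no claim is made about their true sign or magnitude.

def risk_score(c4a, c4b, betas):
    """Linear predictor for one individual from C4A and C4B copy-number dosages."""
    # Design-matrix row: two main effects plus the C4A:C4B interaction term.
    row = {"C4A": c4a, "C4B": c4b, "C4A:C4B": c4a * c4b}
    return sum(betas[term] * value for term, value in row.items())

placeholder_betas = {"C4A": -0.25, "C4B": -0.25, "C4A:C4B": -0.125}

# Two individuals with the same total C4 CN (4) but different C4A/C4B split.
score_2_2 = risk_score(2, 2, placeholder_betas)
score_3_1 = risk_score(3, 1, placeholder_betas)
print(score_2_2, score_3_1)
```

Because the interaction term depends on the product of the two dosages, individuals with the same total C4 CN can receive different scores, which is what the stratified regression lines in the figure are drawn to show.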
Logistic-regression analysis for total C4, C4A, C4B, and HERV-K copy numbers.
| Model | Model terms | Beta (1st cohort) | s.e. (1st cohort) | P (1st cohort) | Beta (2nd cohort) | s.e. (2nd cohort) | P (2nd cohort) | Beta (meta-analysis) | s.e. (meta-analysis) | P (meta-analysis) |
|---|---|---|---|---|---|---|---|---|---|---|
| a: all | total C4 | −0.23 | 0.03 | 6.3E-17 | −0.20 | 0.13 | 0.12 | −0.23 | 0.03 | 1.9E-17 |
| | HERV-K | 0.12 | 0.02 | 2.9E-10 | 0.16 | 0.10 | 0.10 | 0.12 | 0.02 | 7.9E-11 |
| b: all | C4A | −0.31 | 0.03 | 3.7E-19 | −0.36 | 0.18 | 0.04 | −0.31 | 0.03 | 4.7E-20 |
| | C4B | −0.20 | 0.03 | 7.0E-11 | −0.14 | 0.14 | 0.33 | −0.19 | 0.03 | 4.6E-11 |
| | HERV-K | 0.16 | 0.02 | 2.6E-13 | 0.23 | 0.11 | 0.04 | 0.16 | 0.02 | 3.4E-14 |
| b: female | C4A | −0.27 | 0.04 | 6.5E-12 | −0.81 | 0.33 | 0.01 | −0.28 | 0.04 | 2.6E-12 |
| | C4B | −0.22 | 0.03 | 1.1E-10 | −0.10 | 0.27 | 0.70 | −0.22 | 0.03 | 2.3E-13 |
| | HERV-K | 0.15 | 0.02 | 1.4E-10 | 0.41 | 0.19 | 0.03 | 0.15 | 0.02 | 1.5E-14 |
| b: male | C4A | −0.49 | 0.08 | 1.7E-09 | −1.19 | 0.48 | 0.01 | −0.51 | 0.08 | 1.1E-10 |
| | C4B | −0.09 | 0.07 | 2.3E-01 | −0.73 | 0.43 | 0.09 | −0.11 | 0.07 | 1.2E-01 |
| | HERV-K | 0.18 | 0.05 | 2.8E-04 | 0.66 | 0.28 | 0.02 | 0.19 | 0.05 | 7.5E-05 |
| c: all | C4AShort | −0.29 | 0.18 | 1.0E-01 | −0.43 | 0.32 | 0.18 | −0.32 | 0.16 | 3.9E-02 |
| | C4ALong | −0.16 | 0.02 | 2.0E-10 | −0.11 | 0.18 | 0.55 | −0.16 | 0.02 | 1.1E-15 |
| | C4BShort | −0.20 | 0.03 | 1.4E-10 | −0.12 | 0.28 | 0.67 | −0.20 | 0.03 | 2.5E-11 |
| | C4BLong | −0.04 | 0.03 | 1.8E-01 | 0.09 | 0.15 | 0.56 | −0.04 | 0.03 | 2.3E-01 |
Shown are beta values from the logistic-regression analysis of three different models (blocks of rows, see “Methods”). All models contained sex and five genetic principal components as co-variables. The logistic-regression analysis for the first cohort additionally contained cohort as a co-variable. Model b was also calculated separately for females and males. Models contain copy numbers, calculated per individual as dosages from the imputed C4 alleles.
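The meta-analysis columns can be reproduced approximately from the per-cohort betas and standard errors. This section does not state which meta-analysis method was used, so the sketch below assumes a fixed-effect inverse-variance model with a two-sided Wald P value; the function name `fixed_effect_meta` is hypothetical. Because the inputs are the rounded values printed in the table, the resulting P value will not match the published one exactly.

```python
import math

def fixed_effect_meta(estimates):
    """Combine (beta, se) pairs by inverse-variance weighting (ASSUMED method)."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided P under a standard normal
    return beta, se, p

# Model a, total C4: (beta, s.e.) for the 1st and 2nd cohort as printed in the table.
beta, se, p = fixed_effect_meta([(-0.23, 0.03), (-0.20, 0.13)])
print(f"beta={beta:.2f} se={se:.2f} p={p:.1e}")
```

The first cohort dominates the combined estimate because its standard error is much smaller, which is why the meta-analysis betas in the table stay close to the first-cohort values.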
Fig. 2 C4 expression and C4 protein concentrations in whole blood.
a depicts residualized total C4 expression levels by total C4 copy number (CN), stratified by HERV-K CN. C4 expression is calculated as the sum of C4A and C4B expression as obtained by RNA sequencing. The residualized expression was calculated by regressing out 20 principal components for controls and 18 for cases. Data were grouped by rounded C4 and HERV-K CN dosage. b depicts normalized C4 protein levels in plasma by total C4 CN, stratified by HERV-K CN. C4 protein levels were normalized across 10+ laboratory sites. c depicts residualized total C4 expression levels (as in a) for SSc and controls, stratified by sex. Significant comparisons are highlighted by asterisks (*P < 0.05, **P < 0.01, ***P < 0.001). d depicts normalized C4 protein levels (as in b) for SSc and controls, stratified by sex. Significant comparisons are highlighted by asterisks (*P < 0.05, **P < 0.01, ***P < 0.001). e depicts normalized C4 protein levels in blood from 119 adult men (blue) and 447 adult women (red) as a function of age with locally estimated scatterplot smoothing (LOESS). Protein levels are normalized to the number of C4 gene copies in an individual’s genome. All boxplots are drawn with default settings in R 4.0.3: lines mark the first, second, and third quartiles (Q1, Q2, Q3), and whiskers extend to the most extreme data points within Q1 − 1.5 × interquartile range (IQR) and Q3 + 1.5 × IQR. Boxplot notches show the 95% confidence interval of the median.
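Residualizing expression, as described for panel a, amounts to fitting a linear model of expression on the principal components and keeping the residuals. The NumPy sketch below illustrates this on synthetic data. It assumes ordinary least squares with an intercept, which this section does not specify, and `residualize` is a hypothetical helper name.

```python
import numpy as np

def residualize(expression, pcs):
    """Return expression with the linear effect of the PCs regressed out (OLS)."""
    X = np.column_stack([np.ones(len(expression)), pcs])  # intercept + PCs
    coef, *_ = np.linalg.lstsq(X, expression, rcond=None)
    return expression - X @ coef

# Synthetic stand-in data; none of these numbers come from the study.
rng = np.random.default_rng(0)
n_samples, n_pcs = 200, 20            # 20 PCs, as stated for controls
pcs = rng.normal(size=(n_samples, n_pcs))
signal = rng.normal(size=n_samples)   # component unrelated to the PCs
expression = signal + pcs @ rng.normal(size=n_pcs)

resid = residualize(expression, pcs)
# By construction the residuals are orthogonal to every PC column.
max_dot = float(np.abs(pcs.T @ resid).max())
print(max_dot)
```

The residuals retain whatever variation the principal components cannot explain linearly, so any copy-number effect that is not collinear with the components remains available for the downstream comparison.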
Fig. 3 MHC region conditional association with systemic sclerosis.
Association is calculated in the first dataset (N = 26,633) using logistic regression with cohort, genetic background (PC1-5), and sex as covariates, and is depicted as position (GRCh38) by significance (Manhattan plot), in gray if no additional covariates were used. The dotted line represents the genome-wide significance cutoff P = 5 × 10⁻⁸. a Manhattan plot with marked C4 eQTLs obtained from the GTEx v8 database. b Manhattan plot with additional conditioning on ten independent C4 eQTLs (obtained by forward selection in the first dataset), depicted in blue. The arrow marks the position of HLA-DPB1. c Manhattan plot with additional conditioning on 16 independent C4 eQTLs (obtained by forward selection to explain expression variance in the second dataset, N = 857), depicted in blue. The arrows mark the positions of HLA-DPB1 and HLA-DRB1. d Manhattan plot with additional conditioning on 16 independent C4 eQTLs (obtained by forward selection to explain expression variance in the second dataset) and 8 independent amino acids of DRB1 and DPB1 (obtained by forward selection in the first dataset conditioning on the 16 independent C4 eQTLs), depicted in blue.
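Forward selection of variants to explain expression variance, as mentioned for panel c, can be sketched as a greedy loop: at each step, add the variant that most increases the variance explained by a linear model, and stop when the improvement becomes negligible. The NumPy example below is an illustration on synthetic genotypes. The R² criterion, the `min_gain` stopping threshold, and the function names are assumptions, since this section does not describe the selection procedure in detail.

```python
import numpy as np

def r_squared(y, X):
    """R^2 of an OLS fit of y on X (X must already include an intercept column)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return 1.0 - resid.var() / y.var()

def forward_select(y, genotypes, min_gain=0.01):
    """Greedily add the variant with the largest R^2 gain (ASSUMED stopping rule)."""
    n, m = genotypes.shape
    selected, best_r2 = [], 0.0
    while len(selected) < m:
        candidates = [j for j in range(m) if j not in selected]
        scores = []
        for j in candidates:
            X = np.column_stack([np.ones(n), genotypes[:, selected + [j]]])
            scores.append(r_squared(y, X))
        top = int(np.argmax(scores))
        if scores[top] - best_r2 < min_gain:
            break  # no remaining variant adds enough explained variance
        selected.append(candidates[top])
        best_r2 = scores[top]
    return selected, best_r2

# Synthetic example: 10 variants, of which only columns 2 and 7 affect expression.
rng = np.random.default_rng(1)
genotypes = rng.integers(0, 3, size=(500, 10)).astype(float)
expression = 1.0 * genotypes[:, 2] - 0.7 * genotypes[:, 7] + rng.normal(scale=0.5, size=500)

selected, r2 = forward_select(expression, genotypes)
print(selected, round(r2, 2))
```

The selected variants would then be added as covariates to the association model, so that any signal remaining in the Manhattan plot is independent of them.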