Yixiao Zeng1,2, Kaiqiong Zhao2,3, Kathleen Oros Klein2, Xiaojian Shao4, Marvin J Fritzler5, Marie Hudson2,6,7, Inés Colmegna6,8, Tomi Pastinen9,10, Sasha Bernatsky6,8, Celia M T Greenwood1,2,3,9,11.
Abstract
High levels of anti-citrullinated protein antibodies (ACPA) are often observed prior to a diagnosis of rheumatoid arthritis (RA). We undertook a replication study to confirm CpG sites showing evidence of differential methylation in subjects positive vs. negative for ACPA, in a new subset of 112 individuals sampled from the population cohort and biobank CARTaGENE in Quebec, Canada. Targeted custom capture bisulfite sequencing of whole blood was conducted at approximately 5.3 million CpGs located in regulatory or hypomethylated regions; library and protocol improvements had been instituted between the original and this replication study, enabling better coverage and additional identification of differentially methylated regions (DMRs). Using binomial regression models, we identified 19,472 ACPA-associated differentially methylated cytosines (DMCs), of which 430 overlapped with the 1909 DMCs reported by the original study; grouping adjacent DMCs into regions yielded 814 DMRs of relevance. Furthermore, we performed an integrative analysis of the DMRs that overlap with RA-related loci published in the GWAS Catalog; protein-coding genes associated with these DMRs were enriched in the biological process of cell adhesion and involved in immune-related pathways.
Keywords: DNA methylation; anti-citrullinated protein antibody positivity; cell adhesion; differentially methylated cytosines; differentially methylated regions; rheumatoid arthritis; targeted bisulfite sequencing
Year: 2021 PMID: 34573331 PMCID: PMC8472734 DOI: 10.3390/genes12091349
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.096
Demographics of participants in initial study (Dataset 1) and current replication study (Dataset 2).
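| | Dataset 1: All | Dataset 1: ACPA-positive | Dataset 1: ACPA-negative | Dataset 1: Self-reported RA | Dataset 2: All | Dataset 2: ACPA-positive | Dataset 2: ACPA-negative | Dataset 2: Self-reported RA |
|---|---|---|---|---|---|---|---|---|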
| ACPA OD, mean (range) | 39.0 (2.8–228.5) | 65.0 (40.1–210.4) | 7.0 (2.8–19.0) | 133.5 (4.4–228.5) | 55.8 (3.5–191.6) | 98.3 (60.2–178.9) | 7.3 (3.5–18.8) | 117.0 (3.5–191.6) |
| Age, mean (sd) | 54.8 (7.7) | 54.8 (8.0) | 54.7 (7.5) | 55.3 (9.5) | 54.2 (7.9) | 54.4 (7.7) | 54.2 (8.1) | 53.0 (8.1) |
| Female, n (%) | 78 (64.5) | 33 (61.1) | 39 (63.9) | 6 (100) | 62 (55.4) | 29 (58) | 30 (55.6) | 3 (37.5) |
| Smoker, n (%) | | | | | | | | |
| Current | 26 (21.5) | 12 (22.2) | 13 (21.3) | 1 (16.7) | 18 (16.1) | 6 (12) | 10 (18.5) | 2 (25) |
| Past | 47 (38.8) | 23 (42.6) | 22 (36.1) | 2 (33.3) | 48 (42.9) | 20 (40) | 24 (44.4) | 4 (50) |
| Never | 4 (3.3) | 0 (0) | 4 (6.6) | 0 (0) | 4 (3.6) | 3 (6) | 1 (1.9) | 0 (0) |
| Missing | 44 (36.4) | 19 (35.2) | 22 (36.1) | 3 (50) | 42 (37.5) | 21 (42) | 19 (35.2) | 2 (25) |
| Blood cell proportions, mean (range) | | | | | | | | |
| monocyte | 0.077 (0.022) | 0.077 (0.020) | 0.077 (0.023) | 0.079 (0.040) | 0.079 (0.019) | 0.079 (0.019) | 0.081 (0.019) | 0.075 (0.019) |
| lymphocyte | 0.280 (0.068) | 0.281 (0.072) | 0.280 (0.063) | 0.280 (0.096) | 0.286 (0.072) | 0.283 (0.069) | 0.299 (0.067) | 0.219 (0.093) |
| neutrophil | 0.613 (0.078) | 0.614 (0.079) | 0.612 (0.071) | 0.615 (0.140) | 0.604 (0.082) | 0.607 (0.082) | 0.590 (0.076) | 0.680 (0.093) |
| eosinophil | 0.023 (0.015) | 0.023 (0.017) | 0.024 (0.014) | 0.020 (0.008) | 0.025 (0.017) | 0.026 (0.016) | 0.025 (0.019) | 0.021 (0.012) |
| basophil | 0.007 (0.004) | 0.007 (0.005) | 0.006 (0.004) | 0.008 (0.004) | 0.006 (0.004) | 0.006 (0.004) | 0.005 (0.004) | 0.004 (0.003) |
Note: subjects without cell-type composition information were removed from both datasets. One subject in Dataset 1 with a low-positive ACPA level (OD = 29.67) was also removed, leaving only medium-positive (40 < OD ≤ 60) and high-positive (OD > 60) subjects in the ACPA-positive group of Dataset 1. All subjects in the positive group of Dataset 2 were high-positive (OD > 60). Abbreviations: ACPA: anti-citrullinated protein antibodies; RA: rheumatoid arthritis; OD: optical density.
Number of CpGs covered in the two datasets, and overlaps in captured sites.
| | Dataset 1 | Dataset 2 | Overlaps |
|---|---|---|---|
| # of CpGs covered in at least two samples with at least one read | 5,041,032 | 5,307,142 | 3,948,157 |
| # of CpGs covered after quality control | 1,305,080 | 4,259,820 | 1,095,002 |
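The first row of the table applies a simple coverage rule: a CpG counts as covered if at least two samples have at least one read. A minimal sketch of that rule and of the overlap between datasets follows. The data layout (a dict from CpG to per-sample read depths), the function name `covered_cpgs`, and the toy depths are all assumptions made for illustration; the quality-control step behind the second row is not described here and is not modeled.

```python
from typing import Dict, List, Set, Tuple

CpG = Tuple[str, int]  # (chromosome, position); a hypothetical key format


def covered_cpgs(depths: Dict[CpG, List[int]],
                 min_samples: int = 2,
                 min_reads: int = 1) -> Set[CpG]:
    """Return CpGs with at least `min_reads` reads in at least `min_samples` samples."""
    return {
        cpg
        for cpg, per_sample in depths.items()
        if sum(d >= min_reads for d in per_sample) >= min_samples
    }


# Toy read depths (invented for illustration), one list entry per sample.
dataset1 = {("chr1", 100): [3, 0, 5], ("chr1", 200): [0, 0, 1], ("chr2", 50): [2, 2, 0]}
dataset2 = {("chr1", 100): [1, 1], ("chr2", 50): [0, 4], ("chr3", 10): [6, 7]}

kept1 = covered_cpgs(dataset1)
kept2 = covered_cpgs(dataset2)
overlap = kept1 & kept2  # CpGs covered in both datasets
print(len(kept1), len(kept2), len(overlap))  # prints: 2 2 1
```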
Figure 1. Histograms of mean sequencing depth across the CpGs covered by both datasets in the targeted region.
Summary of different models fitted and corresponding identified DMCs/DMRs.
| Models | #CpGs Tested | #DMCs (#DMRs) | #HyperDMCs (#HyperDMRs) | #HypoDMCs (#HypoDMRs) |
|---|---|---|---|---|
| I. ACPA-positive vs. ACPA-negative (Dataset 2) | 4,259,820 | 19,472 (814) | 8581 (334) | 10,891 (480) |
| II. ACPA-positive vs. ACPA-negative (Dataset 1) | 1,305,080 | 853 (44) | 569 (31) | 284 (13) |
| Overlaps by position | 1,095,002 | 157 (10) | 43 (3) | 16 (1) |
| III. ACPA-positive vs. ACPA-negative (Dataset 2, with genetic adjustment) | 19,472 * | 6314 (302) | 2415 (115) | 3899 (187) |
| IV. ACPA-positive vs. ACPA-negative (Dataset 1, with genetic adjustment) | 853 † | 515 (28) | 371 (22) | 144 (6) |
| Overlaps by position | 157 | 31 (3) | 14 (1) | 1 (0) |
| V. Self-reported RA vs. Asymptomatic (Dataset 2) | 4,282,792 | 18,874 (843) | 10,909 (578) | 7965 (265) |
| VI. Self-reported RA vs. Asymptomatic (Dataset 1) | 1,295,623 | 258 (15) | 99 (5) | 159 (10) |
| Overlaps by position | 1,099,279 | 55 (4) | 15 (1) | 11 (1) |
* Of these, 6026 DMCs found not to be affected by cis-SNPs were refitted without genetic adjustment. † Of these, 503 DMCs found not to be affected by cis-SNPs were refitted without genetic adjustment.
Figure 2. QQ-plots of p-values for genome-wide ACPA-methylation associations. (a) 1,305,080 CpGs with good coverage in Dataset 1; (b) 4,259,820 CpGs with good coverage in Dataset 2; (c) 1,095,002 CpGs with good coverage in both datasets.
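For readers unfamiliar with QQ-plots of association p-values, the sketch below computes the plotted coordinates: observed p-values are sorted and paired with the quantiles expected under a uniform null, both on a -log10 scale. The function name `qq_points` and the midpoint plotting position `(rank - 0.5) / n` are assumptions; other plotting positions are also in common use, and the authors' exact choice is not stated.

```python
import math
from typing import List, Tuple


def qq_points(p_values: List[float]) -> List[Tuple[float, float]]:
    """Return (expected, observed) -log10 p-value pairs for a QQ-plot.

    All p-values must be strictly positive.
    """
    n = len(p_values)
    points = []
    for rank, p in enumerate(sorted(p_values), start=1):
        expected = (rank - 0.5) / n  # uniform-null quantile (one common convention)
        points.append((-math.log10(expected), -math.log10(p)))
    return points


# Toy p-values, invented for illustration.
for expected, observed in qq_points([0.5, 0.02, 0.3, 0.001]):
    print(f"{expected:.3f}\t{observed:.3f}")
```

Points lying above the diagonal at the right-hand end indicate more small p-values than the null would produce.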
Figure 3. Number of sign-consistent overlapping DMCs between Model I and Model V, assessed by permutation analysis using regioneR. The X-axis shows the number of overlapping sites on a log10 scale, and the Y-axis shows the probability density, so that the histogram has a total area of one. The black curve on the left shows the estimated null distribution, from 10,000 permutations, of the number of overlapping sites with the same direction of effect when significant DMCs for ACPA and RA are randomly selected from the 2,106,243 CpGs tested in both analyses. The red vertical line denotes the 0.05 significance threshold for p-values. The green line on the right shows the observed number of overlapping DMCs with the same direction of effect (1441), which is much larger than in any of the 10,000 permutations.
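The permutation logic behind this kind of figure can be sketched in a few lines. The code below is not regioneR and does not reproduce the authors' analysis: it is a simplified Python stand-in that ignores direction of effect, redraws both DMC sets uniformly from a shared universe of tested CpGs, and reports an empirical one-sided p-value. The function name, the integer CpG identifiers, and the toy sets are hypothetical.

```python
import random
from typing import Sequence, Set, Tuple


def overlap_permutation_test(universe: Sequence[int],
                             set_a: Set[int],
                             set_b: Set[int],
                             n_perm: int = 1000,
                             seed: int = 0) -> Tuple[int, float]:
    """Return (observed overlap, empirical one-sided p-value)."""
    rng = random.Random(seed)
    observed = len(set_a & set_b)
    hits = 0
    for _ in range(n_perm):
        # Redraw both sets, keeping their sizes, from the shared universe.
        rand_a = set(rng.sample(universe, len(set_a)))
        rand_b = set(rng.sample(universe, len(set_b)))
        if len(rand_a & rand_b) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Toy example: CpGs are labeled 0..9999; the two "DMC" sets share 50 sites.
universe = range(10_000)
set_a = set(range(0, 100))
set_b = set(range(50, 150))
observed, p_value = overlap_permutation_test(universe, set_a, set_b, n_perm=200)
print(observed, p_value)
```

With the add-one correction, the smallest attainable p-value is 1 / (n_perm + 1), so an observed overlap that exceeds every permutation is reported as a bound rather than as zero.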
Figure 4. Scatter plots illustrating agreement between results for the 157 CpGs identified as demonstrating ACPA-methylation associations in both Dataset 1 and Dataset 2. (a) Estimated model coefficients (log odds ratios) from the EWAS binomial regressions; points are colored blue if the coefficient signs agree, and pink otherwise; (b) p-values from the EWAS binomial regressions; (c) estimated coefficients (log odds ratios) after genetic adjustments; (d) p-values after genetic adjustments, including lines indicating the significance threshold. Shapes and colors for points in (b–d) correspond to those assigned in panel (a), so that their changes can be tracked.
Figure 5. Estimated coefficients and confidence intervals for association between methylation and ACPA status from binomial regressions in Dataset 1 and Dataset 2, with and without inclusion of meQTL covariates.
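To make the plotted quantities concrete: for a single CpG and a binomial (logit) regression with only an intercept and a binary ACPA indicator, the fitted coefficient equals the log odds ratio of the 2×2 table of pooled methylated and unmethylated read counts, and its Wald standard error is the square root of the summed reciprocal counts. The sketch below computes this simplified case. It is not the authors' model, which presumably adjusts for further covariates (the caption mentions meQTL covariates); the function name and the toy counts are hypothetical.

```python
import math
from typing import Tuple


def log_odds_ratio_ci(meth_pos: int, unmeth_pos: int,
                      meth_neg: int, unmeth_neg: int,
                      z: float = 1.96) -> Tuple[float, float, float]:
    """Log odds ratio of methylation (ACPA-positive vs. negative) with a Wald interval.

    All four pooled read counts must be positive.
    """
    log_or = math.log((meth_pos * unmeth_neg) / (unmeth_pos * meth_neg))
    se = math.sqrt(1 / meth_pos + 1 / unmeth_pos + 1 / meth_neg + 1 / unmeth_neg)
    return log_or, log_or - z * se, log_or + z * se


# Toy pooled read counts at one CpG, invented for illustration.
estimate, lower, upper = log_odds_ratio_ci(60, 40, 40, 60)
print(f"log OR = {estimate:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

A positive coefficient corresponds to higher methylation in the ACPA-positive group (hypermethylation), a negative one to hypomethylation.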
Figure 6. Null distributions and corresponding 0.05 p-value thresholds for overlap or agreement, from 10,000 permutations in different cases: (a) the number of overlapping ACPA-associated DMCs between the two datasets; (b) the mean distance between ACPA-associated DMCs in the two datasets; (c) the number of overlapping ACPA-associated gDMCs between the two datasets; (d) the mean distance between ACPA-associated gDMCs in the two datasets; (e) the number of overlapping RA-associated DMCs between the two datasets; (f) the mean distance between RA-associated DMCs in the two datasets. The X-axis represents the number of overlapping sites in (a,c,e) and the mean distance in (b,d,f); the Y-axis specifies the density, so that the histogram has a total area of one. The green vertical bars mark the observed value in each case. The sampling universe for permutations was the set of CpGs tested in both datasets.
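Panels (b), (d) and (f) use a distance-based statistic rather than an exact-position overlap. One plausible reading, sketched below, is the mean distance from each DMC in one dataset to its nearest DMC in the other, with a lower-tail permutation p-value because closer than expected is the interesting direction. This definition, the single-chromosome simplification, the function names and the toy positions are assumptions; the original analysis may define the statistic differently.

```python
import bisect
import random
from typing import Sequence, Tuple


def mean_nearest_distance(sites_a: Sequence[int], sites_b: Sequence[int]) -> float:
    """Mean distance from each site in A to its nearest site in B (one chromosome)."""
    sorted_b = sorted(sites_b)
    total = 0
    for pos in sites_a:
        i = bisect.bisect_left(sorted_b, pos)
        # Nearest neighbor is the element just before or at the insertion point.
        candidates = sorted_b[max(i - 1, 0):i + 1]
        total += min(abs(pos - c) for c in candidates)
    return total / len(sites_a)


def distance_permutation_test(universe: Sequence[int],
                              sites_a: Sequence[int],
                              sites_b: Sequence[int],
                              n_perm: int = 1000,
                              seed: int = 0) -> Tuple[float, float]:
    """Return (observed mean distance, empirical lower-tail p-value)."""
    rng = random.Random(seed)
    observed = mean_nearest_distance(sites_a, sites_b)
    hits = 0
    for _ in range(n_perm):
        rand_a = rng.sample(universe, len(sites_a))
        rand_b = rng.sample(universe, len(sites_b))
        if mean_nearest_distance(rand_a, rand_b) <= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Toy positions, invented for illustration.
print(mean_nearest_distance([10, 100], [12, 90, 300]))  # prints: 6.0
```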
The number of DMCs (DMRs) identified in the initial and this replication study, and overlap in captured sites.
| | Initial Study | Replication Study | Overlaps | Overlaps |
|---|---|---|---|---|
| ACPA-positive vs. ACPA-negative | | | | |
| # of CpGs tested | 4,635,909 | 4,259,820 | | |
| # of DMCs (DMRs) identified | 1909 (509) | 19,472 (814) | 410 (23) | 230 (11) |
| Self-reported RA vs. ACPA asymptomatic | | | | |
| # of CpGs tested | 4,109,916 | 4,282,792 | | |
| # of DMCs (DMRs) identified | 955 (249) | 18,874 (843) | 156 (9) | 110 (6) |
Summary of protein-coding genes associated with identified ACPA and RA DMRs, and those associated with RA causal SNPs from GWAS Catalog.
| Source | # of Mapped Genes | Overlap with GWAS Genes |
|---|---|---|
| 585 SNPs from GWAS Catalog | 295 | |
| 814 ACPA-associated DMRs | 403 | |
| 843 RA-associated DMRs | 376 | |
Figure 7. Gene Ontology terms showing over-representation at the corrected p-value threshold, for the list of genes derived from the ACPA-associated DMRs. Abbreviations: BP: Biological Process; CC: Cellular Component; MF: Molecular Function.
Gene Ontology terms that collectively show over-representation in all gene sets, at the corrected p-value threshold.
| Source | Term ID | Term Name | | | |
|---|---|---|---|---|---|
| GO:BP | GO:0098609 | cell–cell adhesion | | | |
| GO:BP | GO:0007155 | cell adhesion | | | |
| GO:BP | GO:0022610 | biological adhesion | | | |
| GO:CC | GO:0005886 | plasma membrane | | | |
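Over-representation of a Gene Ontology term in a gene list is commonly assessed with a hypergeometric (one-sided Fisher) test, and the sketch below shows that calculation for a single term. Whether the authors' enrichment tool uses exactly this test, which background it assumes, and how the p-values were corrected are not stated here, so the function name and all numbers in the example are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_background: int, n_term: int,
                           n_query: int, n_hits: int) -> float:
    """P(X >= n_hits) when n_query genes are drawn without replacement from
    n_background genes, of which n_term are annotated to the term."""
    upper = min(n_term, n_query)
    total = comb(n_background, n_query)
    tail = sum(comb(n_term, k) * comb(n_background - n_term, n_query - k)
               for k in range(n_hits, upper + 1))
    return tail / total


# Toy numbers, invented for illustration: 20,000 background genes, 800 annotated
# to the term, a query list of 400 genes containing 40 annotated genes.
p = hypergeom_enrichment_p(20_000, 800, 400, 40)
print(f"{p:.3e}")
```

Because many terms are tested at once, raw p-values such as this one would still need a multiple-testing correction before being compared with a threshold.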
Gene symbols involved in cell–cell adhesion (GO:0098609) for each of the profiled gene sets, with the number of associated SNPs/DMRs.
| Gene List | Genes Involved | #SNPs/DMRs |
|---|---|---|
| GWAS Catalog | | 86 |
| ACPA-associated DMRs | | 55 |
| RA-associated DMRs | | 49 |
Figure 8. Overlaid IPA canonical pathways for the gene sets involved in the biological process of cell–cell adhesion (GO:0098609). Genes for RA GWAS SNPs, ACPA-associated DMRs and RA-associated DMRs are surrounded by the blue, orange and green boxes, respectively. The genes shared across sets are placed in the intersection areas. The genes in highlighted canonical pathways are connected to the pathway names by lines.
Figure 9. Procedures for data collection and analysis.
Figure 10. An example of a hypermethylated DMR.
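The abstract describes DMRs as clusters formed by grouping adjacent DMCs into regions. A minimal sketch of one way to do this is given below: DMCs on the same chromosome are merged while consecutive positions lie within a maximum gap, and runs with too few DMCs are discarded. The gap of 100 bp, the minimum of three DMCs, and the function name are hypothetical parameters chosen for illustration, not the study's settings, and the sketch does not check that DMCs in a region share a direction of effect.

```python
from typing import Dict, List, Tuple

Region = Tuple[str, int, int, int]  # (chromosome, start, end, number of DMCs)


def cluster_dmcs(dmcs: List[Tuple[str, int]],
                 max_gap: int = 100,
                 min_dmcs: int = 3) -> List[Region]:
    """Group DMCs into regions by chaining positions at most `max_gap` bp apart."""
    by_chrom: Dict[str, List[int]] = {}
    for chrom, pos in dmcs:
        by_chrom.setdefault(chrom, []).append(pos)

    regions: List[Region] = []
    for chrom in sorted(by_chrom):
        positions = sorted(by_chrom[chrom])
        run = [positions[0]]
        for pos in positions[1:]:
            if pos - run[-1] <= max_gap:
                run.append(pos)  # still adjacent: extend the current run
            else:
                if len(run) >= min_dmcs:
                    regions.append((chrom, run[0], run[-1], len(run)))
                run = [pos]  # gap too large: start a new run
        if len(run) >= min_dmcs:
            regions.append((chrom, run[0], run[-1], len(run)))
    return regions


# Toy DMC positions, invented for illustration.
toy = [("chr1", 100), ("chr1", 150), ("chr1", 220), ("chr1", 1000),
       ("chr2", 5), ("chr2", 50)]
print(cluster_dmcs(toy))  # prints: [('chr1', 100, 220, 3)]
```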