Epigenetic activation of antiviral sensors and effectors of interferon response pathways during SARS-CoV-2 infection.

Jan Bińkowski1, Olga Taryma-Leśniak1, Karolina Łuczkowska2, Anna Niedzwiedź3, Kacper Lechowicz4, Dominik Strapagiel5, Justyna Jarczak6, Veronica Davalos7, Aurora Pujol8, Manel Esteller9, Katarzyna Kotfis4, Bogusław Machaliński2, Miłosz Parczewski10, Tomasz K Wojdacz11.   

Abstract

Recent studies have shown that methylation changes identified in blood cells of COVID-19 patients have the potential to be used as biomarkers of SARS-CoV-2 infection outcomes. However, different studies have reported different subsets of epigenetic lesions that stratify patients according to the severity of infection symptoms and, more importantly, the significance of those epigenetic changes in the pathology of the infection is still not clear. We used methylomics and transcriptomics data from the largest cohort of COVID-19 patients to date, drawn from four geographically distant populations, to identify causal interactions of the blood cell methylome in the pathology of COVID-19. We identified a subset of methylation changes that is uniformly present in all COVID-19 patients regardless of symptoms. Those changes are not present in patients suffering from upper respiratory tract infections with symptoms similar to COVID-19. Most importantly, the identified epigenetic changes affect the expression of genes involved in interferon response pathways, and the expression of those genes differs between patients admitted to intensive care units and those who were only hospitalized. In conclusion, the DNA methylation changes involved in the pathophysiology of SARS-CoV-2 infection, which are specific to COVID-19 patients, can not only be utilized as biomarkers in disease management but also present a potential treatment target.
Copyright © 2022 The Authors. Published by Elsevier Masson SAS. All rights reserved.

Keywords:  COVID-19; Coronavirus; DNA methylation; Epigenetics; SARS-CoV-2

Year:  2022        PMID: 36076479      PMCID: PMC9271528          DOI: 10.1016/j.biopha.2022.113396

Source DB:  PubMed          Journal:  Biomed Pharmacother        ISSN: 0753-3322            Impact factor:   7.419


Introduction

The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) had infected almost 510 million individuals and claimed over 6.2 million lives worldwide as of April 25, 2022 (https://origin-coronavirus.jhu.edu/map.html). The vast majority of SARS-CoV-2 infections are characterized by a range of mild symptoms including fever, cough, and general malaise [1]. About 14% of patients progress to severe disease (with dyspnoea, hypoxia, or greater than 50% lung involvement on imaging tests) and a further 5% develop critical disease, characterized by acute respiratory distress syndrome (ARDS) and multi-organ failure (MOF) requiring mechanical ventilation in Intensive Care Units (ICUs) [2]. The severity of COVID-19, the disease caused by SARS-CoV-2 infection, has been generally associated with dysregulation of immune activity [3], including increased levels of cytokines [4], [5] produced by a subset of inflammatory monocytes [6], [7], lymphopenia [8], and T cell exhaustion [9], [10]. Epidemiological studies repeatedly show that sex and older age, plus one of the following conditions: diabetes, obesity, hypertension, and cardiovascular pathology, are the risk factors for a severe outcome of the disease [7], [11]. Those risk factors, however, cannot explain the high heterogeneity and unpredictability of the COVID-19 outcomes observed in clinics. Prediction of the individual response to SARS-CoV-2 infection is one of the biggest challenges in the clinical management of COVID-19 patients [12], [13], [14], [15]. A number of genome-wide association studies (GWAS) have identified genetic loci that appear to modulate the risk of severe outcomes of the infection [16], [17], [18]. One of those loci, a haplotype derived from Neandertal DNA, was associated with COVID-19 mortality in populations with a high prevalence of that haplotype [19]. However, populations in which this haplotype is almost absent also display high mortality rates.
That suggests a key role of environmental factors in the disease outcomes. DNA methylation is one of the major epigenetic mechanisms of gene expression regulation, essential during development and for the maintenance of cell-type identity [20]. In principle, it is a process of covalent addition of methyl groups to cytosines in the DNA strand, and the cell-specific genome-wide methylation pattern is referred to as the methylome. DNA methylation is an enzymatic process that can be influenced by environmental factors, and several studies have reported significant differences between the methylomes of specific populations as well as of individuals (as reviewed, for example, in [21], [22]). Moreover, the acquisition of specific DNA methylation changes in blood has already been shown by us and others to predispose to disease [23]. Recent studies have confirmed that SARS-CoV-2 infection influences the methylomes of blood cells [24], [25], [26]. However, the results of those studies agree only to some extent, suggesting that the response of the methylome to the infection may be host specific and vary between patients and populations of patients. We have consolidated methylomics and transcriptomics data from studies that aimed to elucidate the influence of SARS-CoV-2 on blood cells, including two studies from the USA [24], [26], a study from Spain [25], and a patient cohort from our institution, with the aim of investigating the causal effect of the interaction between the host methylome and the pathogen. Overall, we observed a uniform response of the blood cell methylome to SARS-CoV-2 in 478 hospitalized COVID-19 patients from geographically distant and ethnically mixed populations. The identified methylation signature was specific to COVID-19 patients and was not present in the blood of healthy controls or of patients with upper respiratory tract infections. We then assessed the expression of genes associated with those methylation changes and showed that the expression of at least a subset of those genes is regulated by SARS-CoV-2-induced methylation changes.

Material and methods

Data collection

All datasets used are deposited in the Gene Expression Omnibus (GEO) database. Methylation data are available for the COVID-19 cohorts under GSE168739, GSE174818, GSE167202, and GSE202296; for other infectious respiratory disease cohorts under GSE174818 and GSE167202; and for healthy individuals under GSE167202, GSE112618, GSE123914, and GSE153211. Expression (RNA-seq) data of COVID-19 and non-COVID-19 patients are available under GSE157103.

Ethics

Written consent was obtained from all participants of the Polish cohort before enrollment. All procedures were in accordance with the ethical standards approved by the Bioethics Committee of the Pomeranian University, Szczecin, with approval number KB-0012/101/2020 -A.

Methylation profiling data processing

Raw microarray files (.idat) were processed using the ChAMP R package [27]. As the data exceeded the available computational capacity, we prepared an R script to process chunks of data sequentially. Each chunk (n = 50) was drawn randomly without replacement from the sampling frame and then processed into a beta-matrix using the standard ChAMP pipeline for the EPIC array. Finally, all beta-matrices were combined into one data matrix file for further analysis. Code is available from: https://github.com/ClinicalEpigeneticsLaboratory/distChAMP. Raw beta values were corrected for cell-fraction proportions using the refBase function of the ChAMP package. However, to predict the proportions of WBC fractions we used the better-performing robust partial correlation method implemented in the EpiDISH R package [28], [29], [30] instead of the default reference-based algorithm. As reference methylation profiles, we used the “centDHSbloodDMC.m” dataset for seven main WBC types (B-cells, CD4T, CD8T, NK, Monocytes, Neutrophils and Eosinophils) [28]. We defined a differentially methylated probe (DMP) as a probe with an absolute average methylation difference of more than 0.05 between the compared groups. We used the ANOVA test to calculate the p-value for a specific CpG if the distribution of beta values in each of the comparison groups was normal (Shapiro-Wilk test) and the variances were equal (Bartlett test). If the beta values were not normally distributed or the variances differed between groups, we used the non-parametric Kruskal-Wallis test. To include additional covariates in the analysis we used logit or multinomial logit regression models, and the significance of the methylation level change was then assessed using the Wald test. We controlled the false discovery rate at the level of 0.05 using the step-up Benjamini-Hochberg procedure.
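
The last two steps of this procedure (the 0.05 methylation-difference threshold and the step-up Benjamini-Hochberg correction) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `benjamini_hochberg` and `is_dmp` are hypothetical, and the per-probe p-values are assumed to have been computed beforehand with ANOVA or Kruskal-Wallis as described above.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Step-up Benjamini-Hochberg: return (adjusted p-values, rejection flags)."""
    m = len(p_values)
    if m == 0:
        return [], []
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    rejected = [q <= alpha for q in adjusted]
    return adjusted, rejected


def is_dmp(mean_beta_case, mean_beta_control, q_value, min_delta=0.05, alpha=0.05):
    """Hypothetical helper: apply the two DMP criteria to a single probe."""
    delta = mean_beta_case - mean_beta_control
    return abs(delta) > min_delta and q_value <= alpha
```

For example, `benjamini_hochberg([0.01, 0.04, 0.03, 0.20])` rejects only the first hypothesis at the 0.05 level, because the adjusted values of the second and third probes (both 0.04 × 4 / 3 ≈ 0.053) fall just above the threshold.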

Gene expression data processing

The RNA-seq data from COVID-19 patients and patients with other respiratory infections (GSE157103) were analysed as TPM (Transcripts Per Million). A logit regression model was used to test the statistical significance of the observed expression differences between the studied groups of patients. We considered the expression to be altered if the absolute value of the log2 fold-change between the studied groups was ≥ 1 and the FDR-corrected p-value calculated for the gene expression variable in the logit model was ≤ 0.05 (Wald test). The models were adjusted for sex, age, WBC fractions and steroid intake. To ensure independence between covariates, we removed all covariates with VIF (variance inflation factor) > 6 (in the case of continuous variables) or p-value ≤ 0.05 (Fisher exact test) in pairwise comparisons of covariates (in the case of categorical variables).
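
The two-part criterion for altered expression can be sketched as follows. The text does not state how the fold-change was computed from TPM values, so the use of group means and a pseudocount of 1 is an assumption made only for illustration. The function names are hypothetical, and the FDR-corrected p-value is assumed to come from the logit model described above.

```python
import math
from statistics import mean


def log2_fold_change(tpm_case, tpm_control, pseudocount=1.0):
    """log2 ratio of group means; the pseudocount (an assumption) avoids log(0)."""
    return math.log2((mean(tpm_case) + pseudocount) / (mean(tpm_control) + pseudocount))


def is_altered(tpm_case, tpm_control, q_value, min_abs_lfc=1.0, alpha=0.05):
    """Apply both thresholds from the text: |log2FC| >= 1 and FDR-corrected p <= 0.05."""
    lfc = log2_fold_change(tpm_case, tpm_control)
    return abs(lfc) >= min_abs_lfc and q_value <= alpha
```

With these toy inputs, `is_altered([7, 7], [1, 1], 0.01)` is true (log2 fold-change of 2), whereas the same fold-change with a corrected p-value of 0.2 is not counted as altered.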

High-dimensional data visualisation

Data with more than three dimensions were visualised as heatmaps clustered using the unsupervised Ward’s algorithm or, alternatively, as scatterplots based on data transformed by t-SNE. In the t-SNE algorithm, we set the perplexity equal to the smallest number of samples in the analysed groups and the gradient calculation was performed using the “exact” method. The remaining parameters were left at the defaults implemented in the scikit-learn library. In both methods, we assumed Euclidean distance.
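
The t-SNE settings described here map onto keyword arguments of scikit-learn's `TSNE` class (`perplexity`, `method`, `metric`). The sketch below only derives those settings from the group labels; the helper name `tsne_settings` is hypothetical, and the actual embedding call is shown as a comment because it requires scikit-learn and a beta-value matrix.

```python
from collections import Counter


def tsne_settings(group_labels):
    """Build t-SNE keyword arguments: perplexity = size of the smallest group."""
    counts = Counter(group_labels)
    if not counts:
        raise ValueError("at least one sample label is required")
    return {
        "perplexity": float(min(counts.values())),  # smallest group size
        "method": "exact",                          # exact gradient calculation
        "metric": "euclidean",                      # Euclidean distance
    }


# Usage, assuming scikit-learn is installed and `betas` is a samples x probes array:
#   from sklearn.manifold import TSNE
#   embedding = TSNE(**tsne_settings(labels)).fit_transform(betas)
```

Note that scikit-learn expects the perplexity to be smaller than the number of samples, which holds whenever at least two groups are compared.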

The functional enrichment analysis

Gene Set Enrichment Analysis was performed using the GENE2FUNC function of FUMA GWAS (Functional Mapping and Annotation of Genome-Wide Association Studies) [31] with the Molecular Signatures Database (MSigDB) as a reference [32], [33]. This database contains the “hallmark” collection of gene sets [34], which is one of the most reliable resources for GSEA because it was generated by in silico identification of the overlaps between gene sets in other MSigDB collections, followed by manual evaluation and annotation of the genes [34]. Protein-protein interactions were assessed with the STRING platform [47]. The evidence channels used in the STRING analysis were: (1) textmining, which searches for protein names in PubMed abstracts, in an in-house collection of more than three million full-text articles, and in other text collections [35], [36], (2) experiments, which includes evidence from actual laboratory experiments, based on interaction databases organized in the IMEx consortium [37] and BioGRID [38], (3) database, which collects evidence imported from pathway databases (asserted by a human expert curator), and (4) co-expression, in which gene expression data originating from a variety of expression experiments are normalized, filtered and then correlated [39] (Table 3).

Results

Blood cell composition of clinical sample needs to be considered in analyses of causal methylation changes

In response to SARS-CoV-2 infection the ratios of sub-populations of blood cells may change over time, and the magnitude of those changes differs between infected individuals [5], [40]. A growing body of data suggests that, to identify causal methylation pattern alterations in individual cell types, a correction for the proportions of different cells in each sample should be performed, especially when analysing data from independent studies and populations [28], [41], [42], [43]. Thus, we first evaluated to what extent the analysis of SARS-CoV-2 infection-related methylation changes in our study could be affected by sample-specific white blood cell (WBC) proportions. Using reference-based Robust Partial Correlations (RPC; EpiDISH) [28] we inferred the proportions of the cells from EPIC microarray data for each of the blood samples studied. The levels of each of the cell types varied significantly between cohorts of COVID-19 patients as well as healthy controls (Supplementary Fig. 1a). Moreover, the mean or variance of methylation levels differed significantly between the groups (p-value ≤ 0.05, Kruskal-Wallis and Levene test, respectively) for all cell fractions except eosinophils. We then performed cell-fraction correction (CFC) of individual methylation profiles with the refBase function of the ChAMP package [27], [29], [30], modified to use reference-based RPC (implemented in the EpiDISH package) [28], [29]. The proportions of specific blood cell fractions in patients as well as healthy controls showed relatively minor differences after CFC, indicating the need to perform this type of correction before the analysis of causal methylation changes (Supplementary Fig. 1b).
To confirm the positive effect of cell-fraction correction we also regressed beta values for 10,000 randomly selected CpG sites on the percentages of blood cell fractions across all methylation profiles in the study and calculated the adjusted R-square value of that regression. This value in principle reflects the strength of the linear association between beta values (methylation levels) and the proportions of the cells in the sample. In our data, up to 94% (median: 0.06; interquartile range (IQR): 0.20) of the variation in methylation level at specific CpG sites was potentially attributable to differences in cell proportions, and the influence of cell fractions on beta values was significantly reduced by the cell-fraction correction that we performed (Supplementary Fig. 1c). These results again show the need to perform cell-fraction correction when studying the methylome of blood cells, and our subsequent analyses described below are based on corrected data.
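
For reference, the adjusted R-square used here penalises the ordinary R-square for the number of predictors (in this setting, the cell fractions). A minimal sketch of the standard formula follows. The function names are hypothetical and the fitted values are assumed to come from any ordinary least-squares fit.

```python
def r_squared(observed, fitted):
    """Ordinary R-square: 1 - residual sum of squares / total sum of squares."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant")
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n_samples, n_predictors):
    """Adjusted R-square: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_samples - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more samples than predictors + 1")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / dof
```

The adjustment matters most when the sample is small relative to the number of predictors: an R-square of 0.5 obtained with 11 samples and 5 predictors corresponds to an adjusted value of 0.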

Uniform response of blood methylome to SARS-CoV-2 infection

The majority of the COVID-19 patients in our study required hospitalization: all patients in the Polish (n = 32) and USA 1 (n = 102) [24] cohorts, as well as 213 patients in the Spanish [25] and 131 in the USA 2 [26] cohorts. We began our analysis of the effect of SARS-CoV-2 on the methylomes of blood cells by comparing the methylation profiling data of those patients with the data from 119 healthy blood samples, which we considered a healthy control group (the outline of this analysis is shown in Supplementary Fig. 2a, and Table 1 includes the clinical characteristics of the cohorts). A differentially methylated probe (DMP) in our study was a probe with a statistically significant (FDR corrected p-value ≤ 0.05; Kruskal-Wallis or ANOVA) change in average methylation level (hyper- or hypomethylation) in each of the studied cohorts relative to controls, with the same direction of change across all cohorts. Additionally, as we used data from different studies and expected a certain level of batch effect and technical noise, we included in this analysis only probes with a methylation level difference of more than 5 percentage points. This analysis identified 1773 DMPs (Supplementary Table 1), of which 1601 (90.3%) were hypo- and 172 (9.7%) hypermethylated. Not surprisingly, unsupervised clustering based on the identified DMPs effectively distinguished the two analysed groups (Fig. 1a); importantly, patients did not cluster in an experiment-dependent fashion, indicating a lack of batch effect in our data set. Both clustering and t-Distributed Stochastic Neighbour Embedding (t-SNE) based visualisation (Fig. 1b) showed high heterogeneity of methylation levels at the identified DMPs across patients, with even some of the methylation profiles of healthy individuals mixing with those of the COVID-19 patients.
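
The cross-cohort DMP definition used in this section (significant in every cohort, above the 5-percentage-point threshold in every cohort, and with the same direction everywhere) can be expressed as a small filter. The sketch below is illustrative only: the function name and the input layout (one `(delta_beta, q_value)` pair per cohort) are assumptions, not the authors' data structures.

```python
def consistent_dmp(per_cohort, min_delta=0.05, alpha=0.05):
    """Return "hypo" or "hyper" if a probe passes in every cohort, else None.

    per_cohort: list of (delta_beta, q_value) pairs, one per cohort, where
    delta_beta is mean beta in patients minus mean beta in controls.
    """
    if not per_cohort:
        return None
    directions = set()
    for delta, q_value in per_cohort:
        # Every cohort must pass both the significance and effect-size criteria.
        if q_value > alpha or abs(delta) <= min_delta:
            return None
        directions.add("hyper" if delta > 0 else "hypo")
    # The direction of change must agree across all cohorts.
    return directions.pop() if len(directions) == 1 else None
```

A probe that loses methylation in all four cohorts is reported as "hypo", while a probe that is significant everywhere but changes direction in one cohort is discarded.
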
Table 1

Clinical characteristics of COVID-19 patients of all studied populations, non-COVID-19 patients, as well as healthy controls.

Characteristic | COVID-19 PL | COVID-19 ES [25] | COVID-19 USA-1 [24] | non-COVID-19 USA-1 [24] | COVID-19 USA-2 [26] | non-COVID-19 USA-2 [26] | Healthy controls [26], [56], [57], [58]
Total, n | 32 | 407 | 102* | 26 | 164 | 65 | 119
Age (years), mean (95% CI) | 48.7 (44.5–52.8) | 42.1 (41.1–43.1) | 61.3 (58.1–64.5) | 63.8 (57.3–70.4) | 50.5 (47.9–53.2) | 54.1 (50.0–58.2) | 53.2 (50.6–55.7)
Sex, n (%)
  Male | 19 (59.4%) | 185 (45.5%) | 64 (62.7%) | 13 (50%) | 93 (56.7%) | 35 (53.8%) | 18 (15.1%)
  Female | 13 (40.6%) | 222 (54.5%) | 38 (37.3%) | 13 (50%) | 71 (43.3%) | 30 (46.2%) | 101 (84.9%)
BMI, mean (95% CI) | 28.6 (26.9–28.7) | < 30 | 30.4 (28.4–32.4) | 30.4 (26.7–34.0) | N/A | N/A | N/A
Smoking history | 14 (43.8%) | N/A | 18 (17.6%) | 10 (38.5%) | N/A | N/A | N/A
Diabetes mellitus, n (%) | 2 (6.3%) | 0 (0%) | 36 (35.3%) | 6 (23.1%) | N/A | N/A | N/A
Hypertension, n (%) | 9 (28.1%) | 0 (0%) | N/A | N/A | N/A | N/A | N/A
Pulmonary disease, n (%) | 2 (9.4%) | 0 (0%) | 21 (20.6%) | 4 (15.4%) | 21 (12.8%) | 8 (12.3%) | N/A
ICU, n (%) | 0 (0%) | 99 (24.3%) | 51 (50%) | 16 (61.5%) | N/A |  | 
Severity group, n (%)
  Hospitalised | 32 (100%) | 213 (52.3%) | 102 (100%) | N/A | 131 (79.9%) | N/A | N/A
  Asymptomatic/mild symptoms | 0 (0%) | 194 (47.7%) | 0 (0%) | N/A | 33 (20.1%) | N/A | N/A
COVID-19 pneumonia
  Yes | 25 (78.1%) | 203 (49.9%) | N/A | N/A | N/A | N/A | N/A
  No | 7 (21.9%) | 184 (45.2%) | N/A | N/A | N/A | N/A | N/A
  Unknown | 0 (0%) | 20 (4.9%) | N/A | N/A | N/A | N/A | N/A

* Expression data with clinical characteristics used as covariates in logistic regression were available for 99 patients.

Fig. 1

Comparison of methylation levels in 1773 DMPs displaying identical methylation changes in all COVID-19 cohorts and GSEA analysis of the genes annotated to those DMPs.

(a) Heatmap illustrating unsupervised clustering of beta values at identified DMPs in COVID-19 patients and healthy controls.

(b) t-SNE based visualisation of beta values of identified DMPs in COVID-19 patients and healthy controls.

(c) Illustration of GSEA results from FUMA GWAS platform with hallmark MSigDB database as a reference.

The identified DMPs were annotated to 1112 genes. To approximate the biological function of those genes we used the GENE2FUNC function of FUMA GWAS (Functional Mapping and Annotation of Genome-Wide Association Studies) [31] and performed gene set enrichment analysis (GSEA) with the Molecular Signatures Database (MSigDB) as a reference [32], [33], which contains the “hallmark” collection of gene sets [34]. Although our analyses identified a rather large gene set, those genes were statistically significantly enriched in only two terms: the interferon alpha (adjusted p value = 2.74E-07) and interferon gamma response pathways (adjusted p value = 1.51E-02) (Fig. 1c), with 21 of the genes enriched in each of the pathways and 15 genes shared between both pathways. Not surprisingly, this suggests that the identified methylation changes take part in the interferon response activated upon viral infection (as reviewed e.g. in [44]). However, given the accuracy of GSEA normally observed in similar experiments, the identification of only two pathways in our analysis indicates a very high specificity of the methylation changes induced by SARS-CoV-2.
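
Enrichment of a gene list in a predefined gene set is commonly tested as over-representation with a hypergeometric test, and GENE2FUNC is generally described as working this way. Treating that as an assumption about the platform rather than a statement from this study, the upper-tail probability can be sketched as below. The function name and the toy numbers in the usage note are hypothetical and are not the study's data.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_in_set, n_selected, n_overlap):
    """P(overlap >= n_overlap) when n_selected genes are drawn without
    replacement from n_background genes, n_in_set of which belong to the set."""
    upper = min(n_in_set, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_in_set, k) * comb(n_background - n_in_set, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total
```

As a toy example, with 10 background genes, a gene set of 4, and 3 selected genes of which 2 fall in the set, the probability of an overlap at least this large is 40/120 = 1/3. In practice such p-values would then be corrected for the number of gene sets tested.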

Majority of methylome changes observed in COVID-19 patients is present in patients with other respiratory infections

Two studies, USA 1 (n = 26) [24] and USA 2 (n = 65) [26], reported blood cell methylation profiling data from patients who tested negative for SARS-CoV-2 but developed symptoms similar to those of COVID-19. The disease etiology of patients in the USA 1 cohort was unknown, but most of the cases developed pneumonia, and in the USA 2 cohort all patients were positive for acute upper respiratory viral infections other than SARS-CoV-2 (for details see Supplementary Table 2). Considering the differences in disease etiology between the two cohorts, we analysed the methylomes of patients from each of those cohorts separately. In general, unsupervised clustering (Fig. 2a-b) as well as t-SNE (Fig. 2c-d) analysis showed that COVID-19 patients and patients with other respiratory infections had a very similar methylation status at the subset of DMPs we identified between hospitalized COVID-19 patients and controls. That suggests that most of the methylation changes found in our initial analysis are part of the general response of the immune system to respiratory tract infections. Nevertheless, especially in the case of the USA 1 cohort, both unsupervised clustering (Fig. 2a) and t-SNE (Fig. 2c) suggested a certain level of separation of patients with other respiratory diseases from COVID-19 patients, and hence the presence of a subset of DMPs differentially methylated between these two groups.
Fig. 2

Comparison of methylation changes at DMPs identified in all COVID-19 cohorts between COVID-19 patients and patients with other respiratory tract infections. (a-b) Heatmap illustrating unsupervised clustering based on beta values of this subset of DMPs for USA 1 (a) and USA 2 (b) cohort. (c-d) t-SNE based visualisation of beta values of this subset of DMPs for USA 1 (c) and USA 2 (d) cohort.

A subset of methylation changes is specific to SARS-CoV-2 infection

Given the specificity of the GSEA results, and as the above results suggested that there might be a subset of methylation changes specific to SARS-CoV-2 infection, we modified our initial analysis and compared the methylomes of the COVID-19 patients in each cohort with both healthy blood and the two cohorts of patients with other respiratory infections (Supplementary Fig. 2b). This analysis identified 20 hypo- and 4 hypermethylated DMPs (Fig. 3a and Supplementary Table 3) displaying methylation changes that were identical in each of the analysed cohorts of COVID-19 patients and different from both healthy controls and patients with other respiratory tract infections. Those DMPs mapped to 17 genes, which are listed in Table 2. Twelve of those genes, AIM2, CYSTM1, DTX3L, PARP9, CMPK2, EPSTI1, IFI44L, IFIT3, IRF7, LRBA, MX1, and TRIM22, have been described as associated with the immune response (source: GeneCards), and all except LRBA were annotated as interferon-regulated genes (Interferome database [45]). All the DMPs mapped to those genes were hypomethylated. Interestingly, three of the identified COVID-19 specific DMPs (cg22930808, cg00959259, cg07815522) were located in the PARP9/DTX3L promoter region (TSS1500, 5’UTR), in the N-shore of the CpG island present in that genomic region (Fig. 3b and Supplementary Table 3), and four DMPs (cg16785077, cg13155430, cg21549285, cg22862003) mapped to the shores or shelves of two CpG islands present in the promoter of MX1 (Fig. 3c and Supplementary Table 3). In the case of PARP9/DTX3L, the observed methylation changes affected adjacent CpG sites and formed a differentially methylated region (DMR) (Fig. 3b), while at the MX1 gene the DMPs were distributed across the promoter region (Fig. 3c). Four of the remaining genes, AQP8, CDC42EP3, GPR176, and PLBD1, were associated with hypermethylated DMPs and had no described function in the immune system. One hypomethylated DMP mapped to the FLJ43663 (LINC-PINT) gene, which encodes a non-protein-coding RNA that has been described as having a neuroprotective function in the brains of patients with neurodegenerative disorders [46]. Those five genes also did not show expression changes specific to COVID-19 patients in the gene expression analyses described below.
Fig. 3

Comparison of methylation levels at COVID-19 specific DMPs and results of GSEA analysis based on genes annotated to those DMPs. (a) Comparison of average methylation levels at COVID-19 specific DMPs between all studied cohorts. (b-c) Median methylation levels at all CpG sites targeted by EPIC array in the promoters of PARP9/DTX3L (b) and MX1 (c) genes. The arrows indicate identified DMPs.

Table 2

Biological function of proteins encoded by genes associated with COVID-19 specific DMPs (source: GeneCards).

Gene name | Full gene name | Encoded protein function
IRF7 | Interferon regulatory factor 7 | Key transcriptional regulator of type I interferon (IFN)-dependent immune responses and plays a critical role in the innate immune response against DNA and RNA viruses
MX1 | MX Dynamin Like GTPase 1 | Cellular antiviral response: induced by type I and type II interferons and antagonizes the replication process of several different RNA and DNA viruses
AIM2 | Absent In Melanoma 2 | Involved in innate immune response by recognizing cytosolic double-stranded DNA and inducing caspase-1-activating inflammasome formation in macrophages
IFI44L | Interferon Induced Protein 44 Like | Exhibits a low antiviral activity against hepatitis C virus
PARP9 | Poly(ADP-Ribose) Polymerase Family Member 9 | Plays a role in DNA damage repair and in immune responses including interferon-mediated antiviral defences; in macrophages, positively regulates pro-inflammatory cytokines production in response to IFNG stimulation
IFIT3 | Interferon-induced protein with tetratricopeptide repeats 3 | Acts as an inhibitor of cellular as well as viral processes, cell migration, proliferation, signalling, and viral replication
TRIM22 | Tripartite Motif Containing 22 | Participates in antiviral cell innate immunity; it is interferon-induced
DTX3L | Deltex E3 Ubiquitin Ligase 3L | Plays a role in DNA damage repair and in interferon-mediated antiviral responses; in association with PARP9, plays a role in antiviral responses
AQP8 | Aquaporin 8 | Facilitates the transport of water across biological membranes along an osmotic gradient
LRBA | Lipopolysaccharide-responsive and beige-like anchor protein | May be involved in coupling signal transduction and vesicle trafficking to enable polarized secretion and/or membrane deposition of immune effector molecules
FLJ43663 (LINC-PINT) | Long Intergenic Non-Protein Coding RNA, P53 Induced Transcript | 
PLBD1 | Phospholipase B Domain Containing 1 | May act as an amidase or a peptidase (By similarity)
EPSTI1 | Epithelial-stromal interaction protein 1 | Plays a role in M1 macrophage polarization and is required for the proper regulation of gene expression during M1 versus M2 macrophage differentiation (By similarity)
CMPK2 | Cytidine/Uridine Monophosphate Kinase 2 | May participate in dUTP and dCTP synthesis in mitochondria
CDC42EP3 | Cdc42 effector protein 3 | Probably involved in the organization of the actin cytoskeleton
CYSTM1 | Cysteine Rich Transmembrane Module Containing 1 | Among its related pathways are Innate Immune System
GPR176 | G protein-coupled receptor 176 | Orphan receptor involved in normal circadian rhythm behaviour

Most of the genes associated with COVID-19 specific DMPs interact in interferon response pathways

The GSEA with MSigDB hallmark gene sets as a reference again linked those genes to the interferon alpha (adjusted p value = 2.73e-11) and interferon gamma response pathways (adjusted p value = 1.57e-7) (Fig. 4a), with six of the genes, CMPK2, EPSTI1, IFI44L, IFIT3, IRF7, and MX1, taking part in both processes.
Fig. 4

GSEA analysis of the genes associated with COVID-19 specific DMPs. (a) GSEA analysis based on “hallmark” collection in Molecular Signatures Database (MSigDB). (b) PPI networks analysis of genes associated with COVID-19 specific DMPs. Lines show interaction between proteins according to: textmining (green), experiments (pink), databases (blue), and co-expression (black) “evidence channels” (see Material and methods (Section 2.6.) for the definitions of the evidence channels).

Next, we used the STRING platform [47] to assess potential protein-protein interactions (PPI) between the proteins encoded by the identified genes. This analysis found evidence for interactions between ten proteins from our gene set (Fig. 4b) at a statistically significant level (Benjamini-Hochberg-adjusted p-value < 1.0e-16), and enrichment of those genes in six local network clusters (ontology terms used in STRING) associated with interferon alpha/beta signalling pathways (Supplementary Table 4). Moreover, four “evidence channels”, which are the source databases utilized in STRING to find evidence for an interaction (for a detailed description of those databases see Material and methods (Section 2.6.)), also indicated interactions between those genes (Supplementary Table 5).

Identified COVID-19 specific methylation changes are likely a proxy of broader methylation changes

Considering the high heterogeneity of the studied cohorts and the technical limitations of methylation screening technology and of the bioinformatic analysis of this type of data, it is plausible that a number of the infection-specific methylation changes did not reach the minimum methylation difference threshold (five percentage points) in our analysis. We attempted to overcome those limitations and assessed whether methylation levels at COVID-19 specific DMPs are correlated with methylation levels at any other CpGs screened by the microarray we used. We performed this analysis for each of the patient cohorts separately and found that in all analysed cohorts methylation changes at 30 additional probes were strongly correlated (|r| ≥ 0.7; Supplementary Table 6) with at least one of the COVID-19 specific DMPs. Interestingly, nine of the probes mapped to already identified COVID-19 specific genes, including MX1, IFI44L, FLJ43663, PARP9/DTX3L, AQP8, IRF7, and AIM2. Both unsupervised clustering (Supplementary Fig. 3a-b) and t-SNE-based visualisation (Supplementary Fig. 3c-d) based on this subset of 30 DMPs clearly separated COVID-19 patients from patients with other respiratory diseases in both the USA 1 and USA 2 cohorts, indicating that methylation changes at the identified probes are also specific to SARS-CoV-2 infection. Moreover, FUMA-based GSEA again showed significant enrichment of those genes in the interferon alpha (adjusted p value = 1.56e-25) and interferon gamma response pathways (adjusted p value = 5.23e-19), with seven additional genes identified in this analysis, OAS1, IRF9, BST2, MVB12A, IFIH1, DDX60, and PSMB8, also associated with interferon pathways (Supplementary Fig. 3e). These results suggest that the methylation changes identified so far may not reflect the full picture of the interaction between SARS-CoV-2 and the methylome of blood cells, and studies based on more accurate clinical designs are necessary to elucidate the full impact of the virus on the blood cell methylome.
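
The correlation screen described above (|r| ≥ 0.7 in every cohort, analysed separately) can be sketched as follows. The text does not name the correlation coefficient, so Pearson's r is an assumption here, and the function names and input layout are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


def correlated_in_all_cohorts(cohorts, min_abs_r=0.7):
    """cohorts: list of (dmp_betas, candidate_betas) pairs, one per cohort.

    A candidate probe passes only if |r| reaches the threshold in every cohort.
    """
    return all(abs(pearson_r(dmp, cand)) >= min_abs_r for dmp, cand in cohorts)
```

Requiring the threshold to hold in each cohort separately, rather than in pooled data, prevents between-cohort differences in average methylation from creating a spurious correlation.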

SARS-CoV-2 infection specific methylation changes are present in blood cells of asymptomatic patients and patients with mild symptoms

Two of the cohorts in our study included patients who tested positive for SARS-CoV-2 but displayed no or mild infection symptoms and did not require hospitalization (Spanish cohort: n = 194 and USA 2: n = 33). We compared methylation levels at the 24 COVID-19 specific DMPs between those patient groups and healthy controls using a logit regression model (adjusted for age and sex; for the details of this analysis see Material and methods (Section 2.3.) and Supplementary Table 7). The methylation levels at the majority of those DMPs, except for four (cg10728454, cg07992500, cg18519762, cg22488164), were statistically significantly different (FDR corrected p-value ≤ 0.05) between asymptomatic SARS-CoV-2 positive patients and healthy controls, with differences close to or higher than five percentage points. Most importantly, the methylation changes at almost all of those DMPs displayed a general trend of increasing methylation difference from healthy controls to hospitalized patients, with non-hospitalized patients showing methylation levels intermediate between those two groups (Fig. 5).
Fig. 5

Comparison of the methylation levels at COVID-19 specific DMPs between non-hospitalized COVID-19 patients, healthy controls and hospitalised COVID-19 patients.
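The two criteria used in this comparison, an FDR corrected p-value ≤ 0.05 and a methylation difference of about five percentage points, can be sketched as follows. The sketch assumes the Benjamini-Hochberg procedure for the FDR correction (the text does not name the method) and takes raw p-values as given rather than fitting the regression model; probe names and values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def flag_probes(results, alpha=0.05, min_delta=0.05):
    """results: list of (probe, raw_p, mean_beta_cases, mean_beta_controls).

    A probe is flagged when its adjusted p-value is <= alpha and the absolute
    difference in mean beta is >= min_delta (0.05 = five percentage points).
    """
    adjusted = benjamini_hochberg([r[1] for r in results])
    flagged = []
    for (probe, _, cases, controls), adj_p in zip(results, adjusted):
        if adj_p <= alpha and abs(cases - controls) >= min_delta:
            flagged.append(probe)
    return flagged

# Hypothetical toy input: three probes with raw p-values and group means.
toy = [
    ("probe_a", 0.001, 0.40, 0.52),  # significant, large difference
    ("probe_b", 0.004, 0.61, 0.60),  # significant, negligible difference
    ("probe_c", 0.300, 0.30, 0.45),  # large difference, not significant
]
print(flag_probes(toy))
```

Note that the text describes differences "close to or higher than" five percentage points, so a strict cutoff like `min_delta` is a simplification.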


Hypomethylation of genes associated with COVID-19 specific DMPs activates expression of those genes

The STRING analysis indicated that proteins encoded by ten of the genes with COVID-19 specific DMPs interact (Fig. 4b). To assess the association between methylation and expression of the genes in this gene set, we first compared their expression levels between COVID-19 patients (n = 62) and healthy blood samples (n = 24) [48]. This analysis showed that nine of the genes, including MX1, CMPK2, PARP9, TRIM22, AIM2, IFI44L, IFIT3, IRF7, and EPSTI1, were significantly upregulated in the blood of COVID-19 patients (p ≤ 0.05, ANOVA; |log2(FC)| ≥ 1) (Fig. 6a). The expression of DTX3L was also statistically significantly upregulated (p ≤ 0.05, ANOVA), but the effect size was slightly lower than for the other genes (Fig. 6a).
Fig. 6

Analysis of association of methylation changes with the expression of genes mapped to COVID-19 specific DMPs. (a) Volcano plot describing the expression of the analysed genes between COVID-19 patients and healthy controls in COVID19db dataset. The vertical lines correspond to two-fold expression change and the horizontal line represents a p-value of 0.05. (b) Scatter plots describing association of methylation levels with expression of genes up regulated in COVID-19 patients. Each dot in the scatterplot represents expression and methylation level of the gene for COVID-19 (green) or other respiratory infections (red) patients. (c) Comparison of expression levels of genes upregulated in COVID-19 patients between non-ICU and ICU COVID-19 patients.

Then, we compared the expression of those genes between COVID-19 patients (n = 102) and patients with other respiratory diseases (n = 26) using RNA-seq data from [49]. The logistic regression analysis adjusted for WBC fraction, age, sex, and steroid intake (for the details of this analysis see Material and methods (Section 2.4.)) showed that the expression was statistically significantly upregulated (FDR corrected p-value ≤ 0.05) specifically in COVID-19 patients, with |log2(FC)| ≥ 1 observed for all the genes except AIM2, for which |log2(FC)| was 0.92 (Supplementary Table 8). Only one gene, IRF7, did not show a statistically significant expression change in this analysis. Interestingly, despite the strong association of methylation and expression, the expression of all genes in this gene set differed markedly between COVID-19 patients (Fig. 6b), suggesting a potential association of the expression levels with the disease outcome. We therefore compared the expression levels between 50 patients described as only hospitalized and 46 admitted to intensive care units (ICU). The expression of all of those genes differed significantly between non-ICU and ICU patients (FDR corrected p-value ≤ 0.05); however, the effect size threshold (|log2(FC)| ≥ 1) was reached for only four of the genes: IFI44L, MX1, CMPK2, and IFIT3 (Fig. 6c and Supplementary Table 9). Together, these results indicate that there are other factors influencing the expression of the genes epigenetically activated during SARS-CoV-2 infection, which may also explain the remarkable unpredictability and heterogeneity of COVID-19 clinical outcomes in the presence of uniform methylation changes.
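The expression comparisons in this section combine a significance cutoff with an effect size cutoff of |log2(FC)| ≥ 1, that is, at least a two-fold change. A minimal sketch of that rule is given below. Function and gene names are hypothetical, expression means are assumed to be positive values on a linear scale, and the adjusted p-values are taken as given rather than computed.

```python
from math import log2

def classify_expression(mean_case, mean_control, adj_p,
                        alpha=0.05, min_abs_log2fc=1.0):
    """Return (log2 fold change, label) for one gene.

    mean_case and mean_control must be positive linear-scale expression means.
    """
    lfc = log2(mean_case / mean_control)
    if adj_p > alpha:
        label = "not significant"
    elif lfc >= min_abs_log2fc:
        label = "upregulated"
    elif lfc <= -min_abs_log2fc:
        label = "downregulated"
    else:
        # Significant difference, but smaller than a two-fold change.
        label = "significant, below effect-size cutoff"
    return lfc, label

# Hypothetical toy genes: (mean in cases, mean in controls, adjusted p-value).
toy_genes = {
    "gene_x": (8.0, 2.0, 0.001),
    "gene_y": (3.0, 2.0, 0.010),
    "gene_z": (2.0, 8.0, 0.200),
}
for name, (case, control, adj_p) in toy_genes.items():
    lfc, label = classify_expression(case, control, adj_p)
    print(name, round(lfc, 2), label)
```

Under this rule, a gene with |log2(FC)| = 0.92, as reported for AIM2, would fall just below the effect size cutoff even though its expression change is statistically significant.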

Discussion

We combined methylomics and transcriptomics data from four geographically distant and ethnically heterogeneous cohorts of COVID-19 patients with the aim of assessing the causal interaction of the SARS-CoV-2 virus with the host methylome. The study involved the largest cohort of COVID-19 patients so far, drawn from geographically distant and, to some extent, ethnically heterogeneous populations. Our analysis identified 1773 CpG sites that displayed remarkably stable methylation changes in the blood of 478 hospitalized COVID-19 patients, and those changes were independent of individual WBC proportions. The pathway enrichment analysis based on this methylation signature showed that the identified methylation changes mapped to genes significantly enriched only in terms related to interferon alpha and interferon gamma response. Nevertheless, the majority of those changes were also present in patients displaying symptoms of upper respiratory tract infection similar to SARS-CoV-2 infection, indicating that this methylation signature is part of a general response of the blood cell methylome to infection. However, the methylation status of 24 CpG sites from this signature, which annotated to 17 genes, differed in the blood of COVID-19 patients from both healthy blood and the blood of patients with other upper respiratory tract infections displaying symptoms similar to COVID-19. This suggests that the methylome of blood cells is involved in the pathogenesis of COVID-19 in a virus-specific manner and that epigenetic changes regulate a specific subset of genes during SARS-CoV-2 infection. Twelve of the seventeen genes associated with COVID-19 specific methylation changes (AIM2, CYSTM1, DTX3L, PARP9, CMPK2, EPSTI1, IFI44L, IFIT3, IRF7, LRBA, MX1, and TRIM22) have been associated with immune response, and eleven (all except LRBA) were shown to be regulated by interferon [45]. Moreover, proteins encoded by ten of those genes (AIM2, DTX3L, PARP9, CMPK2, EPSTI1, IFI44L, IFIT3, IRF7, MX1, and TRIM22) have been shown to interact. Detailed analysis of the function of those genes showed that all of them have previously been shown to be effectors in the interferon response pathways [45], and the PARP9 gene has additionally been reported as a non-canonical sensor for RNA virus that initiates and amplifies type I interferon production [50]. Interestingly, the MX1 and PARP9/DTX3L promoter regions harboured four and three CpG sites, respectively, that were hypomethylated in COVID-19 patients. The epigenetic regulation of MX1 has been shown in viral infections, but to the best of our knowledge the DNA methylation-mediated epigenetic regulation of the PARP9/DTX3L bidirectional promoter has not been previously described. Importantly, transcripts of PARP9 (alias BAL1) and DTX3L (alias BBAP), as well as MX1, have well-described functions in interferon signaling pathways and in the pathology of SARS-CoV-2 infection [50], [51], [52], [53], [54], [55]. Most importantly, we showed that all of the genes in this gene set were differentially expressed between blood cells of COVID-19 patients and healthy controls, and nine of them (all except IRF7) were also upregulated in the blood of COVID-19 patients but not in patients with other respiratory infections. This indicates that methylation is essential for the activation of those genes and that they likely undergo activation specifically upon SARS-CoV-2 infection. However, despite the uniform methylation of genes associated with SARS-CoV-2 infection across all studied patient cohorts, the expression of the genes mapped to those epigenetic changes differed significantly between patients and, interestingly, patients with high expression were significantly less likely to be admitted to the ICU. This suggests that the response of the blood cell methylome to the infection appears to be essential for the regulation of the expression of these genes during infection, but that other factors also influence the response of the blood transcriptome of COVID-19 patients to SARS-CoV-2. The heterogeneity of the gene expression could potentially explain the unpredictability of COVID-19 clinical outcomes, but it also suggests that other mechanisms are needed for the transcriptome response to the infection. Nevertheless, further studies with designs driven by clinical outcomes are needed to elaborate on those interactions.

Conclusions

In conclusion, we identified methylation changes in blood cells from COVID-19 patients that are specific to SARS-CoV-2 infection and are present in patients from four geographically distant regions. Furthermore, those changes appear to regulate the expression of associated genes, but our results also indicate that there are other, likely host-specific, factors modulating the expression of those genes during the infection. One more important conclusion from our study is that the power of the study, as well as a consistent clinical description of the patients, is critical to elaborate the molecular effects of SARS-CoV-2 infection. Only by basing our analyses on four cohorts of COVID-19 patients, a cohort of healthy subjects, and data from non-COVID-19 respiratory disease were we able not only to identify disease-related methylation changes but, most importantly, to cross-validate findings between the populations.

Funding

This study was funded by grant number: PPN/PPO/2018/1/00088/U and of the West Pomeranian Province 2014–2020, PROTO_LAB/K1/2020/U/18. The funding agencies had no influence on study design, execution, or publication of results.

Data sharing statement

Complete raw DNA methylation data of the 32 COVID-19 patients from the Polish cohort are deposited in the GEO repository under accession number GSE202296.

CRediT author statement

Jan Bińkowski: Conceptualization, Methodology, Formal analysis, Visualization, Writing – original draft, Writing – review & editing. Olga Taryma-Leśniak: Conceptualization, Methodology, Formal analysis, Visualization, Writing – original draft, Writing – review & editing. Karolina Łuczkowska: Methodology, Resources, Data curation. Anna Niedzwiedź: Methodology, Resources, Data curation. Kacper Lechowicz: Methodology, Resources, Data curation. Dominik Strapagiel: Resources, Data curation, Writing – review & editing. Justyna Jarczak: Methodology, Resources, Data curation. Veronica Davalos: Conceptualization, Writing – review & editing. Aurora Pujol: Conceptualization, Writing – review & editing. Manel Esteller: Conceptualization, Writing – review & editing. Katarzyna Kotfis: Methodology, Resources, Data curation, Writing – review & editing. Bogusław Machaliński: Methodology, Resources, Data curation. Miłosz Parczewski: Conceptualization, Methodology, Resources, Data curation, Writing – review & editing. Tomasz K. Wojdacz: Conceptualization, Methodology, Formal analysis, Writing – original draft, Writing – review & editing, Supervision, Funding acquisition. All authors contributed to interpretation of findings and approved the final version of the manuscript.

Conflict of interest statement

Manel Esteller is an advisor of Quimatryx and Ferrer International. The remaining authors declare no conflict of interest.
