Lanyi Fu1, Maolin Yao1, Xuedong Liu1, Dong Zheng1. 1. Laboratory of Genetics and Molecular Biology, College of Wildlife and Protected Area, Northeast Forestry University, Harbin 150040, China.
Abstract
The coronavirus disease (COVID-19) pandemic caused by SARS-CoV-2 is ongoing. Individuals with sarcoidosis tend to develop severe COVID-19; however, the underlying pathological mechanisms remain elusive. To determine common transcriptional signatures and pathways between sarcoidosis and COVID-19, we investigated the whole-genome transcriptome of peripheral blood mononuclear cells (PBMCs) from patients with COVID-19 and sarcoidosis and conducted bioinformatic analysis, including gene ontology and pathway enrichment, protein-protein interaction (PPI) network, and gene regulatory network (GRN) construction. We identified 33 abnormally expressed genes that were common between COVID-19 and sarcoidosis. Functional enrichment analysis showed that these differentially expressed genes were associated with cytokine production involved in the immune response and T cell cytokine production. We identified several hub genes from the PPI network encoded by the common genes. These hub genes have high diagnostic potential for COVID-19 and sarcoidosis and can be potential biomarkers. Moreover, GRN analysis identified important microRNAs and transcription factors that regulate the common genes. This study provides a novel characterization of the transcriptional signatures and biological processes commonly dysregulated in sarcoidosis and COVID-19 and identified several critical regulators and biomarkers. This study highlights a potential pathological association between COVID-19 and sarcoidosis, establishing a theoretical basis for future clinical trials.
On March 11, 2020, the World Health Organization declared coronavirus disease (COVID-19) a pandemic. The disease is caused by SARS-CoV-2, a positive-stranded RNA virus (Gorbalenya et al., 2020). Recent reports have demonstrated that individuals with pre-existing health conditions (e.g., cancer, heart disease, diabetes) have a higher risk of SARS-CoV-2 infection compared with the general population (Al-Quteimat and Amer, 2020; Peric and Stulnig, 2020; Radke et al., 2020). Researchers worldwide are striving to identify therapeutic strategies that are effective for this population (A. Nashiry et al., 2021; M.A. Nashiry et al., 2021; Satu et al., 2021; Taz et al., 2021). Sarcoidosis is a risk factor for COVID-19 (Manansala et al., 2021). However, the pathological mechanisms for this increased risk have not been identified. Identifying these mechanisms might help to develop therapies for the treatment of COVID-19 in patients with sarcoidosis.
Sarcoidosis is a systemic granulomatous disease that generally causes severe pulmonary dysfunction (Bargagli and Prasse, 2018). Chest discomfort, dyspnea, and dry cough are the most common clinical symptoms (Al-Kailany et al., 2021). According to reports from the United States, sarcoidosis has a mortality rate of 7.6% (Li et al., 2021). Individuals with sarcoidosis are at high risk of COVID-19, which is difficult to treat owing to the lack of effective drugs and therapeutic regimens (Kondle et al., 2021).
Understanding the pathological mechanisms interlinking sarcoidosis and COVID-19 would provide a clear pathophysiological basis for treatment. Toward this end, in this study, we collected qualifying samples from the Gene Expression Omnibus (GEO) database and used their transcriptome data to explore the unique transcriptional signatures of sarcoidosis and COVID-19, and further investigated the common pathological pathways that underlie both diseases. These screened transcriptional signatures and enriched pathways may provide key targets for the treatment of COVID-19 in patients with sarcoidosis.
Materials and methods
Datasets
We acquired three gene expression datasets (GSE164805, GSE152418, and GSE42832) from GEO (https://www.ncbi.nlm.nih.gov/geo/; Barrett et al., 2013). The GSE164805 and GSE152418 datasets provide the whole-genome transcriptome of peripheral blood mononuclear cells (PBMCs) from healthy controls and patients with COVID-19. Ten samples, five each from healthy controls and from patients with severe COVID-19, were extracted from the GSE164805 dataset to identify differentially expressed genes (DEGs; Zhang et al., 2021). The GSE152418 dataset contains 34 PBMC samples, 17 from healthy controls and 17 from patients with COVID-19. Among them, two patients were sampled twice in succession; thus, these repeated-measurement samples were excluded before analysis. The RNA sequences of the GSE152418 dataset were obtained using the Illumina NextSeq 500 platform (Blanco-Melo et al., 2020). The GSE42832 dataset contains the transcriptional profiles of PBMCs from healthy controls and patients with sarcoidosis. Ten samples, five from healthy controls and five from patients with sarcoidosis, were extracted from this dataset to identify DEGs (Bloom et al., 2013).
Identification of differentially expressed genes
The R package “limma” was used (with default parameters) to identify DEGs associated with COVID-19 and sarcoidosis from the GSE164805 and GSE42832 datasets, respectively. The limma package uses linear models and moderated t-statistics to analyze microarray data. It also provides normalization functions and supporting features that are especially useful for a linear modeling approach (Smyth, 2005). We screened the DEGs of controls and COVID-19 patients in the GSE152418 dataset using the R package “DESeq2,” which uses shrinkage estimation for dispersions and fold changes to improve estimate stability and interpretability (Love et al., 2014). An absolute log2 fold change > 1.0 and an adjusted P-value < 0.05 were used as the cut-off criteria to identify DEGs in all three datasets. The Benjamini-Hochberg procedure was used to control the false discovery rate. The DEGs common to the GSE164805, GSE152418, and GSE42832 datasets were identified and visualized in R.
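The cut-off and overlap steps described above can be sketched in a few lines. The example below is a minimal Python illustration and not the R workflow used in this study: the function names, the toy gene labels (G1 to G4), and all fold changes and P-values are hypothetical. It applies a Benjamini-Hochberg adjustment, keeps genes with an absolute log2 fold change > 1.0 and an adjusted P-value < 0.05, and then intersects the three resulting gene sets.

```python
from typing import Dict, List, Set, Tuple


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(stats: Dict[str, Tuple[float, float]],
                lfc_cutoff: float = 1.0,
                alpha: float = 0.05) -> Set[str]:
    """stats maps gene -> (log2 fold change, raw P-value)."""
    genes = list(stats)
    adj = benjamini_hochberg([stats[g][1] for g in genes])
    return {g for g, q in zip(genes, adj)
            if abs(stats[g][0]) > lfc_cutoff and q < alpha}


# Hypothetical toy inputs standing in for the three datasets.
covid_a = {"G1": (2.1, 0.001), "G2": (-1.5, 0.004),
           "G3": (0.4, 0.0005), "G4": (1.2, 0.6)}
covid_b = {"G1": (1.8, 0.002), "G2": (-1.1, 0.01),
           "G3": (1.9, 0.003), "G4": (0.2, 0.9)}
sarcoid = {"G1": (1.3, 0.003), "G2": (0.3, 0.2),
           "G3": (1.4, 0.004), "G4": (2.0, 0.7)}

# Genes passing the cut-offs in every dataset form the common DEG set.
common = select_degs(covid_a) & select_degs(covid_b) & select_degs(sarcoid)
print(sorted(common))
```

In this toy input only G1 passes both thresholds in all three sets. In practice the adjustment is applied across every tested gene, so the adjusted values depend on the full result table rather than on a handful of genes.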
Functional analysis of differentially expressed genes
We performed Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis to investigate the biological functions of the identified DEGs. GO analysis covers three aspects: cellular component (CC), molecular function (MF), and biological process (BP) (Doms and Schroeder, 2005). First, we used the R package “org.Hs.eg.db” to convert the IDs of overlapping DEGs. Subsequently, we used the R packages “clusterProfiler” (Yu et al., 2012) and “GOplot” (Walter et al., 2015) for the GO and KEGG analysis. The species was limited to Homo sapiens. GO terms and pathways were declared to be significantly enriched when the adjusted P-value was <0.05.
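Over-representation analysis of a gene list against a GO term or pathway is commonly computed as a one-sided hypergeometric test. The sketch below assumes that model and is written in Python for illustration only; it does not reproduce the R packages named above. The function name and the numbers in the example (background size, term size, list size, overlap) are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(k: int, n: int, K: int, N: int) -> float:
    """Probability of observing at least k annotated genes.

    N: background genes, K: background genes annotated to the term,
    n: genes in the query list, k: query genes annotated to the term.
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the hypergeometric mass from k up to the largest possible overlap.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical example: 20 background genes, 5 annotated to a term,
# 4 query genes, 2 of which carry the annotation.
p = hypergeom_enrichment_p(k=2, n=4, K=5, N=20)
print(round(p, 4))
```

The raw P-values from such a test would then be adjusted for multiple testing across all terms, consistent with the adjusted P-value < 0.05 criterion stated above.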
Protein-protein interaction network analysis
Protein-protein interaction (PPI) networks help to untangle complex molecular mechanisms. The PPI networks of sarcoidosis-specific genes, COVID-19-specific genes, and common genes between sarcoidosis and COVID-19 were each constructed using the Search Tool for Retrieval of Interacting Genes/Proteins (STRING: https://string-db.org/) database (Szklarczyk et al., 2019). All DEGs were input into the STRING database as seed genes, and their neighboring interacting partners were also included in the network. The generated PPI networks were exported to Cytoscape software (Shannon et al., 2003), and the hub genes in the three networks were identified using the cytoHubba plugin (Chin et al., 2014). We selected the top 10 genes ranked by maximal clique centrality (MCC) as hub genes. The diagnostic potential of hub genes in COVID-19 or sarcoidosis was tested using receiver operating characteristic (ROC) curve analysis. An area under the ROC curve (AUC) of >0.5 was considered statistically significant.
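To make the hub-gene ranking concrete, the following Python sketch computes a maximal clique centrality score on a toy network. It assumes the definition in which a node's score is the sum of (|C| - 1)! over all maximal cliques C that contain it, so that a node whose neighbors share no edges scores its degree. The helper names and the four-node network are hypothetical, and the code is an illustration rather than the cytoHubba implementation.

```python
from math import factorial
from typing import Dict, Iterable, List, Set, Tuple


def build_graph(edges: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map, ignoring self-loops."""
    adj: Dict[str, Set[str]] = {}
    for a, b in edges:
        if a == b:
            continue
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def maximal_cliques(adj: Dict[str, Set[str]]) -> List[Set[str]]:
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques: List[Set[str]] = []

    def expand(r: Set[str], p: Set[str], x: Set[str]) -> None:
        if not p and not x:
            cliques.append(r)
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def mcc_scores(adj: Dict[str, Set[str]]) -> Dict[str, int]:
    """Sum (clique size - 1)! over the maximal cliques containing each node."""
    scores = {node: 0 for node in adj}
    for clique in maximal_cliques(adj):
        weight = factorial(len(clique) - 1)
        for node in clique:
            scores[node] += weight
    return scores


# Hypothetical toy network: a triangle A-B-C plus a pendant edge C-D.
toy = build_graph([("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")])
scores = mcc_scores(toy)
ranked = sorted(scores, key=lambda n: (-scores[n], n))
print(ranked, scores)
```

Node C belongs to both the triangle and the pendant edge, so it ranks first. The basic recursion without pivoting is adequate for small networks but would be slow on dense graphs.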
Gene regulatory network analysis
Gene regulatory networks (GRNs) describe regulatory relationships between transcription factors (TFs) and microRNAs (miRNAs) and their target genes (Wang et al., 2018). We built DEG-TF and DEG-miRNA interaction networks using the JASPAR (Khan et al., 2018) and TarBase (Sethupathy et al., 2006) databases on the NetworkAnalyst website (Zhou et al., 2019) to explore the effects of TFs and miRNAs on DEGs. The list of generated DEG-TF and DEG-miRNA pairs was imported into Cytoscape software for further visualization. Significant hub miRNAs and TFs were detected using the cytoHubba plugin in Cytoscape and ranked by their maximal clique centrality (MCC) value.
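A regulator-target pair list of the kind described above can be summarized by its node and edge counts and by how many targets each regulator touches. The Python sketch below is a simplified illustration under stated assumptions: the pair labels (TF_A, GENE_1, and so on) are hypothetical, and regulators are ranked by their number of targets, which is a simpler measure than the cytoHubba ranking used in this study.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def summarize_network(pairs: Iterable[Tuple[str, str]]) -> Dict[str, object]:
    """Summarize (regulator, target) pairs as node, edge and degree counts."""
    edges = set(pairs)  # duplicate pairs count as a single edge
    regulators = {reg for reg, _ in edges}
    targets = {tgt for _, tgt in edges}
    degree = Counter(reg for reg, _ in edges)
    ranked: List[Tuple[str, int]] = degree.most_common()
    return {
        "nodes": len(regulators | targets),
        "edges": len(edges),
        "regulator_degree": ranked,
    }


# Hypothetical pairs; the last entry repeats the first on purpose.
pairs = [
    ("TF_A", "GENE_1"),
    ("TF_A", "GENE_2"),
    ("TF_B", "GENE_1"),
    ("TF_A", "GENE_1"),
]
summary = summarize_network(pairs)
print(summary)
```

Note that the node count includes both regulators and target genes, so it is not the same as the number of regulators in the network.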
Results
Transcriptomic signature
The DEGs identified in COVID-19 and sarcoidosis from the three GEO datasets are presented as volcano plots in Fig. 1A–C. In total, 13,567, 2157, and 773 DEGs were identified in the GSE164805, GSE152418, and GSE42832 datasets, respectively, of which 33 DEGs were common to the three datasets, indicating a potentially common molecular mechanism contributing to COVID-19 and sarcoidosis pathology (Fig. 1D–E, Table 1). The interactions among the 33 common DEGs based on the STRING database are shown in Fig. 1F. The heatmap in Fig. 1G shows that most of the DEGs were upregulated in the disease group compared with the normal controls.
Fig. 1
Identification of common differentially expressed genes between COVID-19 and sarcoidosis. DEGs identified in COVID-19 and sarcoidosis from GSE164805 (A), GSE152418 (B) and GSE42832 (C) are presented as volcano plots. The blue dots represent down-regulated genes, and the yellow dots represent up-regulated genes. (D) Venn diagram showing the 644 common DEGs between the GSE152418 and GSE164805 datasets. (E) Venn diagram showing the 33 DEGs shared between COVID-19 and sarcoidosis. (F) Protein-protein interaction network constructed using the common DEGs shared between COVID-19 and sarcoidosis. (G) Heatmap showing the expression landscape of the shared DEGs in the sarcoidosis dataset. Abbreviation: DEG, differentially expressed gene.
Table 1
Statistical measurement of common genes between sarcoidosis and COVID-19 datasets.
Gene | logFC | P.value | Adj.P.val | Gene | logFC | P.value | Adj.P.val
FCGR1A | 2.878674 | 3.29E−20 | 1.14E−15 | PPBP | 1.165549 | 0.00018746 | 0.02113713
FCGR1B | 2.647999 | 2.49E−17 | 4.32E−13 | TUBB1 | 1.160117 | 0.00019872 | 0.02153726
DHRS9 | 2.608846 | 7.91E−17 | 9.14E−13 | MUC4 | 1.156990 | 0.00019919 | 0.02153726
AQP9 | 2.184951 | 2.67E−12 | 6.63E−09 | FABP5 | 1.145486 | 0.00023993 | 0.02448231
IL1B | 1.897941 | 1.14E−09 | 1.98E−06 | CTSG | 1.112332 | 0.00037636 | 0.03065154
DYSF | 1.713539 | 3.91E−08 | 3.77E−05 | CACNA1S | 1.105433 | 0.000379 | 0.03072175
TAGAP | 1.628856 | 1.70E−07 | 0.000131 | ANXA3 | 1.086968 | 0.00048986 | 0.03517862
ANKRD22 | 1.613975 | 2.32E−07 | 0.000167 | PYCR1 | 1.077254 | 0.00053426 | 0.03517862
APOBEC3A | 1.591804 | 3.42E−07 | 0.000215 | FFAR2 | 1.057344 | 0.00069084 | 0.03517862
FOLR3 | 1.551470 | 6.83E−07 | 0.000382 | LPAR1 | 1.047851 | 0.00075782 | 0.03517862
MMP25 | 1.352393 | 1.54E−05 | 0.004017 | ATF3 | 1.041001 | 0.00081275 | 0.03517862
TNFAIP6 | 1.296651 | 3.17E−05 | 0.006799 | CYP4F8 | 1.035995 | 0.00087245 | 0.03517862
SLC25A37 | 1.254662 | 5.63E−05 | 0.009862 | XDH | 1.033177 | 0.00089368 | 0.03517862
ARG1 | 1.228584 | 7.86E−05 | 0.012342 | TNNT1 | 1.015741 | 0.00112162 | 0.03517862
EPSTI1 | 1.225689 | 8.12E−05 | 0.012472 | SMTN | 1.014436 | 0.00111006 | 0.03517862
HBD | 1.176368 | 0.000185 | 0.021099 | ITM2C | −1.00671 | 0.00122408 | 0.03517862
ACRBP | 1.168952 | 0.000177 | 0.020864 | | | |
The first and fifth columns give the gene symbols of the common genes between the sarcoidosis and COVID-19 datasets; the second and sixth columns give the log fold change between the groups; the third and seventh columns give the P-value of the t-statistic; the fourth and eighth columns give the adjusted P-values (Adj.P.val), that is, P-values adjusted using the Benjamini-Hochberg procedure.
Functional enrichment of differentially expressed genes
The COVID-19-specific DEGs were most strongly associated with the GO terms and KEGG pathways “regulation of mitotic nuclear division,” “neuronal cell body,” “extracellular matrix structural constituent conferring tensile strength,” and “inflammatory mediator regulation of TRP channels” (Fig. 2A), whereas the sarcoidosis-specific DEGs were most strongly associated with “neutrophil mediated immunity,” “tertiary granule,” “pattern recognition receptor activity,” and “phagosome” (Fig. 2B). Most of the 33 DEGs shared between sarcoidosis and COVID-19 were associated with the BP term “neutrophil degranulation,” the CC term “tertiary granule lumen,” the MF term “IgG binding,” and the “amoebiasis” and “antifolate resistance” KEGG pathways (Fig. 2C). Table 2 summarizes the top six GO items and KEGG pathways associated with the common dysregulated genes in the two diseases.
Fig. 2
Functional insights into differentially expressed genes. We performed GO and KEGG analysis to uncover the molecular function of identified DEGs, including COVID-19-specific DEGs (A), sarcoidosis-specific DEGs (B) and the common DEGs between sarcoidosis and COVID-19 (C). Abbreviations: BP, biological process; CC, cellular component; DEGs, differentially expressed genes; GO, gene ontology; MF, molecular functions; KEGG, Kyoto encyclopedia of genes and genomes.
Table 2
GO items and KEGG pathways enriched by common genes between COVID-19 and sarcoidosis datasets.
Ontology | ID | Description | Count | Pvalue | P.adjust
BP | GO:0043312 | Neutrophil degranulation | 8 | 9.13E−07 | 0.000336
BP | GO:0002283 | Neutrophil activation involved in immune response | 8 | 9.56E−07 | 0.000336
BP | GO:0042119 | Neutrophil activation | 8 | 1.11E−06 | 0.000336
BP | GO:0002446 | Neutrophil mediated immunity | 8 | 1.13E−06 | 0.000336
BP | GO:0006692 | Prostanoid metabolic process | 3 | 4.89E−05 | 0.009682
BP | GO:0006693 | Prostaglandin metabolic process | 3 | 4.89E−05 | 0.009683
BP | GO:0008643 | Carbohydrate transport | 4 | 0.000101 | 0.01714
BP | GO:0070942 | Neutrophil mediated cytotoxicity | 2 | 0.000119 | 0.017465
BP | GO:0002367 | Cytokine production involved in immune response | 3 | 0.000637 | 0.029095
BP | GO:0002369 | T cell cytokine production | 2 | 0.002001 | 0.048849
CC | GO:1904724 | Tertiary granule lumen | 3 | 8.70E−05 | 0.002774
CC | GO:0030139 | Endocytic vesicle | 5 | 0.000102 | 0.002774
CC | GO:0042581 | Specific granule | 4 | 0.000111 | 0.002774
CC | GO:0034774 | Secretory granule lumen | 5 | 0.000133 | 0.002774
CC | GO:0060205 | Cytoplasmic vesicle lumen | 5 | 0.000169 | 0.002774
CC | GO:0031983 | Vesicle lumen | 5 | 0.000172 | 0.002774
CC | GO:0035578 | Azurophil granule lumen | 3 | 0.000389 | 0.005394
CC | GO:0005766 | Primary lysosome | 3 | 0.001822 | 0.018216
MF | GO:0019864 | IgG binding | 2 | 0.000162 | 0.022157
MF | GO:0043177 | Organic acid binding | 4 | 0.000431 | 0.029493
MF | GO:0019865 | Immunoglobulin binding | 2 | 0.000801 | 0.036541
MF | GO:0015144 | Carbohydrate transmembrane transporter activity | 2 | 0.001904 | 0.065201
KEGG | hsa05146 | Amoebiasis | 3 | 0.001391 | 0.099746
KEGG | hsa01523 | Antifolate resistance | 2 | 0.002099 | 0.099746
The first column gives the category of each item. Gene ontology (GO) covers three aspects: cellular component (CC), molecular function (MF), and biological process (BP); the second column gives the identifier of each item; the third column shows the full name of each item; the fourth column gives the number of genes enriched in each item; the fifth column gives the P-values; the sixth column gives the adjusted P-values (P.adjust), that is, P-values adjusted using the Benjamini-Hochberg procedure.
Proteomic signature
The PPI network of sarcoidosis-specific DEGs contained 110 nodes and 146 edges (Fig. 3A); the PPI network of COVID-19-specific DEGs contained 56 nodes and 115 edges (Fig. 3B); and the PPI network of the 33 common DEGs contained 56 nodes and 109 edges (Fig. 3C). The MF GO terms of the 10 hub genes (FKBP1B, ATF3, FABP5, TAGLN3, ENO3, DDIT3, ENO2, FKBP1A, TAGLN, and HINT1) are listed in Table 3. The ROC curves for genes with significant AUC values for predicting disease are shown in Fig. 4. The ROC curves identified MCM3, MCM5, CDC27, CDC16, FABP5, ENO2, TAGLN, and ATF3 as having diagnostic potential for COVID-19, and ACTR2, ACTR3, ARPC1B, ARPC1A, FABP5, ENO2, TAGLN, and ATF3 as having diagnostic potential for sarcoidosis. These results revealed that hub genes identified from the PPI network of common DEGs had higher diagnostic potential in both COVID-19 and sarcoidosis than hub genes identified from the PPI network of disease-specific DEGs.
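For readers unfamiliar with how an AUC is obtained from a single gene's expression values, the Python sketch below uses the rank-based identity: the AUC equals the probability that a randomly chosen patient has a higher value than a randomly chosen control, with ties counted as one half. The function name and the expression values are hypothetical and are not taken from the datasets analyzed here.

```python
from typing import Sequence


def roc_auc(case_values: Sequence[float], control_values: Sequence[float]) -> float:
    """AUC as the fraction of case/control pairs ranked correctly."""
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5  # ties contribute half a correct ranking
    return wins / (len(case_values) * len(control_values))


# Hypothetical expression values of one gene in five patients and five controls.
patients = [5.1, 6.3, 4.8, 7.0, 5.9]
controls = [4.2, 5.0, 4.9, 3.8, 5.5]
auc = roc_auc(patients, controls)
print(auc)
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups. With only a handful of samples per group, as in this toy input, the estimate is highly variable.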
Fig. 3
Identification of proteomic signatures. Protein-protein interaction networks were constructed using sarcoidosis-specific DEGs (A), COVID-19-specific DEGs (B), common DEGs (C) and their neighboring interacting partners. The yellow nodes represent the identified hub genes. Abbreviation: DEGs, differentially expressed genes.
Table 3
Overview of the hub genes obtained from the PPI network of common DEGs.
Gene | Description | Related diseases and pathways | Reference
FKBP1B | FKBP prolyl Isomerase 1B | This gene is a member of the immunophilin protein family. Increased FKBP1b reversed calcium dysregulation and memory impairment in aging rats | PMID: 26224869
ATF3 | Activating transcription factor 3 | This gene is associated with various steps of tumorigenesis. Decreased expression of ATF3 contributes to gastric cancer progression via increasing β-catenin and CEMIP | PMID: 34728784
FABP5 | Fatty acid binding protein 5 | This gene encodes the fatty acid binding protein, which promotes lymph node metastasis in cervical cancer | PMID: 32550890
TAGLN3 | Transgelin 3 | This gene contributes to the sequential progression of vertebrate neurogenesis | PMID: 25565981
ENO3 | Enolase 3 | This gene is a metalloenzyme that acts as a tumor inhibitor in hepatocellular carcinoma development | PMID: 35004693
DDIT3 | DNA damage inducible transcript 3 | This gene is associated with many mineralization processes and inflammatory diseases and suppresses cementoblast differentiation via the NF-κB pathway | PMID: 30488444
ENO2 | Enolase 2 | This gene encodes enolase isoenzymes that promotes cell proliferation and glycolysis in acute lymphoblastic leukemia | PMID: 29689546
FKBP1A | FKBP prolyl isomerase 1A | This gene plays an important role in intercellular communication between endocardium and myocardium via regulating Notch1 | PMID: 23571217
TAGLN | Transgelin | This gene encodes an actin cross-linking/gelling protein that mediates ovarian cancer progression via RhoA/ROCK pathway | PMID: 34538264
HINT1 | Histidine triad nucleotide binding protein 1 | This gene mediates activation of MITF transcriptional activity in human melanoma cells | PMID: 28394346
The first and second columns show the gene symbol and full name of each hub gene. The third and fourth columns describe the diseases and pathways in which the hub gene is involved and the relevant references.
Fig. 4
Receiver-operating characteristic curves to assess the potential diagnostic value of biomarkers for COVID-19 and sarcoidosis. (A) shows the diagnostic potential of MCM3, MCM5, CDC27 and CDC16 in COVID-19. (B) shows the diagnostic potential of ACTR2, ACTR3, ARPC1B and ARPC1A in sarcoidosis. (C) shows the diagnostic potential of FABP5, ENO2, FKBP1A, and TAGLN3 in COVID-19. (D) shows the diagnostic potential of FABP5, ENO2, TAGLN, and ATF3 in sarcoidosis.
Regulatory signature
TFs and miRNAs regulate various biological processes, including pathological processes in sarcoidosis and COVID-19. We constructed miRNA-DEG and TF-DEG regulatory networks and identified several hub regulatory factors for each network. For the 33 common DEGs, the TF-DEG network contained 36 nodes and 123 edges and the miRNA-DEG network contained 33 nodes and 98 edges (Fig. 5A). For COVID-19-specific DEGs, the TF-DEG network contained 24 nodes and 54 edges and the miRNA-DEG network contained 44 nodes and 299 edges (Fig. 5B). For sarcoidosis-specific DEGs, the TF-DEG network contained 46 nodes and 198 edges and the miRNA-DEG network contained 40 nodes and 244 edges (Fig. 5C).
Fig. 5
Construction of gene regulatory networks. (A) Regulatory networks of the common DEGs between sarcoidosis and COVID-19. (B) Regulatory networks of COVID-19-specific DEGs. (C) Regulatory networks of sarcoidosis-specific DEGs. The highlighted nodes represent hub TFs/miRNAs. Abbreviations: DEGs, differentially expressed genes; miRNA, microRNA; TFs, transcription factors.
Discussion
A previous study investigated the cellular pathways involved in sarcoidosis and COVID-19; however, only autophagy-related pathways were considered (Calender et al., 2020). With a broader systems biology approach, our study identified pivotal genes, pathways, regulators, and biomarkers associated with COVID-19 and sarcoidosis, individually and collectively. These findings therefore provide novel potential therapeutic targets for patients with sarcoidosis and COVID-19. Moreover, to the best of our knowledge, our study is the first to collectively analyze the host transcriptional response to COVID-19 and sarcoidosis.
We performed differential expression analysis to identify the common DEGs between sarcoidosis and COVID-19, followed by enrichment analysis to reveal the potential common mechanisms. We identified 33 common DEGs, which were largely associated with cytokine-related signaling pathways. This finding implies that cytokines are strongly relevant to the pathogenesis of COVID-19 and sarcoidosis. Consistent with our findings, a previous study found that patients with COVID-19 have high cytokine signatures, which appeared to be the driving feature of COVID-19 (Blanco-Melo et al., 2020). Similarly, multiple studies have reported exuberant inflammatory cytokine production in patients with sarcoidosis (Beirne et al., 2009; Gerke and Hunninghake, 2008). Cytokines are soluble proteins that are often responsible for many infection-related symptoms, such as myalgia, fever, and headache (Slifka and Whitton, 2000), which are common symptoms of COVID-19.
PPI networks are fundamental in systems biology and help to detect critical molecules among proteins (Ewing et al., 2007). We identified 10 hub proteins in the PPI network of common DEGs. ROC analysis showed that most of the hub proteins had high diagnostic potential (AUC > 0.7), suggesting their potential as biomarkers of COVID-19 and sarcoidosis. Among the hub genes, FKBP prolyl isomerase 1B (FKBP1B) is a member of the immunophilin family. Increased FKBP1b levels have been reported to reverse calcium dysregulation and memory impairment in aging rats (Gant et al., 2015). Activating transcription factor 3 (ATF3) is associated with various steps of tumorigenesis, and its decreased expression contributes to gastric cancer progression via increasing β-catenin and CEMIP levels (Xie et al., 2021). Fatty acid binding protein 5 (FABP5) encodes a fatty acid binding protein, which has been shown to promote lymph node metastasis in cervical cancer (Zhang et al., 2020). Transgelin 3 (TAGLN3) contributes to the sequential progression of vertebrate neurogenesis (Ratié et al., 2014). Enolase 3 (ENO3) is a metalloenzyme that acts as a tumor inhibitor in hepatocellular carcinoma development (Cui et al., 2021). DNA damage inducible transcript 3 (DDIT3) is associated with many mineralization processes and inflammatory diseases, and suppresses cementoblast differentiation via the NF-κB pathway (Liu et al., 2019). Enolase 2 (ENO2) encodes enolase isoenzymes that promote cell proliferation and glycolysis in acute lymphoblastic leukemia (Liu et al., 2018). FKBP prolyl isomerase 1A (FKBP1A) plays an important role in intercellular communication between the endocardium and myocardium via regulating Notch1 expression (Liu et al., 2018). Transgelin (TAGLN) encodes an actin cross-linking/gelling protein that mediates ovarian cancer progression via the RhoA/ROCK pathway (Wei et al., 2021). Histidine triad nucleotide binding protein 1 (HINT1) mediates the activation of MITF transcriptional activity in human melanoma cells (Motzik et al., 2017).
GRNs, which describe the regulatory relationships between TFs and miRNAs and their target genes, are an essential component of bioinformatics. In this study, GRN analysis revealed that 36 TFs and 33 miRNAs could regulate the common DEGs. Among those, hsa-mir-16-5p and FOXC1 showed the strongest interactions with the common DEGs. A previous study demonstrated that hsa-mir-16-5p can regulate ACE2-related networks (Wicik et al., 2020). ACE2 is one of the main cellular receptors for SARS-CoV-2: the virus binds to the ACE2 receptor to enter host cells and then rapidly replicates and spreads, ultimately causing organ damage (Renhong et al., 2020). Thus, hsa-mir-16-5p may affect the clinical outcomes of patients with COVID-19. Forkhead box C1 (FOXC1) mediates chemotherapeutic resistance and the metastasis of gastric adenocarcinoma via the epithelial-to-mesenchymal transition and the hedgehog pathway (Jun et al., 2021).
This study has some limitations. We conducted an in-depth analysis of the transcriptomic data of COVID-19 and sarcoidosis to identify pivotal DEGs, pathways, regulators, and biomarkers. However, our findings have not been validated in vitro, ex vivo, or in vivo, and we hope to address this in the future. Furthermore, we did not perform transcriptomic analyses in patients with both sarcoidosis and COVID-19 owing to the current lack of relevant sample data. With the continued accumulation of samples, this analysis is likely to become possible.
Conclusion
In this study, we identified common pathological mechanisms between COVID-19 and sarcoidosis using multiple analytical approaches. We identified 33 common DEGs between COVID-19 and sarcoidosis. A total of 36 TFs and 33 miRNAs regulated the common DEGs, including hsa-mir-16-5p and FOXC1. We also identified several hub proteins related to the common DEGs. The identified hub genes have potential as biomarkers for the diagnosis of COVID-19 in patients with sarcoidosis. We believe our findings can provide a pathophysiological basis explaining why patients with sarcoidosis tend to develop severe COVID-19, which can inform treatment to improve outcomes in this high-risk population.
CRediT authorship contribution statement
Lanyi Fu: Visualization, Writing – original draft, Data curation, Project administration. Maolin Yao: Formal analysis, Data curation, Writing – original draft, Project administration. Xuedong Liu: Conceptualization, Supervision, Funding acquisition. Dong Zheng: Conceptualization, Supervision, Funding acquisition.
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
References
A Motzik; E Amir; T Erlich; J Wang; B-G Kim; J M Han; J H Kim; H Nechushtan; M Guo; E Razin; S Tshori. Oncogene. 2017-04-10.
Asif Nashiry; Shauli Sarmin Sumi; Salequl Islam; Julian M W Quinn; Mohammad Ali Moni. Brief Bioinform. 2021-03-22.
Tanya Barrett; Stephen E Wilhite; Pierre Ledoux; Carlos Evangelista; Irene F Kim; Maxim Tomashevsky; Kimberly A Marshall; Katherine H Phillippy; Patti M Sherman; Michelle Holko; Andrey Yefanov; Hyeseung Lee; Naigong Zhang; Cynthia L Robertson; Nadezhda Serova; Sean Davis; Alexandra Soboleva. Nucleic Acids Res. 2012-11-27.