Shweta Arora1, Prithvi Singh2, Ravins Dohare3, Rishabh Jha2, Mansoor Ali Syed4. 1. Translational Research Lab, Department of Biotechnology, Faculty of Natural Sciences, Jamia Millia Islamia, New Delhi 110025, India. Electronic address: shweta169213@st.jmi.ac.in. 2. Centre for Interdisciplinary Research in Basic Sciences, Jamia Millia Islamia, New Delhi 110025, India. 3. Centre for Interdisciplinary Research in Basic Sciences, Jamia Millia Islamia, New Delhi 110025, India. Electronic address: ravinsdohare@gmail.com. 4. Translational Research Lab, Department of Biotechnology, Faculty of Natural Sciences, Jamia Millia Islamia, New Delhi 110025, India. Electronic address: smansoor@jmi.ac.in.
Abstract
COVID-19 is a lurking calamitous disease caused by an unusual virus, SARS-CoV-2, causing massive deaths worldwide. Nonetheless, explicit therapeutic drugs or clinically approved vaccines are not available for COVID-19. Thus, a comprehensive research is crucially needed to decode the pathogenic tools, plausible drug targets, committed to the development of efficient therapy. Host-pathogen interactions via host cellular components is an emerging field of research in this respect. miRNAs have been established as vital players in host-virus interactions. Moreover, viruses have the capability to manoeuvre the host miRNA networks according to their own obligations. Besides protein coding mRNAs, noncoding RNAs might also be targeted in infected cells and viruses can exploit the host miRNA network via ceRNA effect. We have predicted a ceRNA network involving one miRNA (miR-124-3p), one mRNA (Ddx58), one lncRNA (Gm26917) and two circRNAs (Ppp1r10, C330019G07RiK) in SARS-CoV infected cells. We have identified 4 DEGs-Isg15, Ddx58, Oasl1, Usp18 by analyzing a mRNA GEO dataset. There is no notable induction of IFNs and IFN-induced ACE2, significant receptor responsible for S-protein binding mediated viral entry. Pathway enrichment and GO analysis conceded the enrichment of pathways associated with interferon signalling and antiviral-mechanism by IFN-stimulated genes. Further, we have identified 3 noncoding RNAs, playing as potential ceRNAs to the genes associated with immune mechanisms. This integrative analysis has identified noncoding RNAs and their plausible targets, which could effectively enhance the understanding of molecular mechanisms associated with viral infection. However, validation of these targets is further corroborated to determine their therapeutic efficacy.
COVID-19 is a lurking calamitous disease caused by an unusual virus, SARS-CoV-2, causing massive deaths worldwide. Nonetheless, explicit therapeutic drugs or clinically approved vaccines are not available for COVID-19. Thus, a comprehensive research is crucially needed to decode the pathogenic tools, plausible drug targets, committed to the development of efficient therapy. Host-pathogen interactions via host cellular components is an emerging field of research in this respect. miRNAs have been established as vital players in host-virus interactions. Moreover, viruses have the capability to manoeuvre the host miRNA networks according to their own obligations. Besides protein coding mRNAs, noncoding RNAs might also be targeted in infected cells and viruses can exploit the host miRNA network via ceRNA effect. We have predicted a ceRNA network involving one miRNA (miR-124-3p), one mRNA (Ddx58), one lncRNA (Gm26917) and two circRNAs (Ppp1r10, C330019G07RiK) in SARS-CoV infected cells. We have identified 4 DEGs-Isg15, Ddx58, Oasl1, Usp18 by analyzing a mRNA GEO dataset. There is no notable induction of IFNs and IFN-induced ACE2, significant receptor responsible for S-protein binding mediated viral entry. Pathway enrichment and GO analysis conceded the enrichment of pathways associated with interferon signalling and antiviral-mechanism by IFN-stimulated genes. Further, we have identified 3 noncoding RNAs, playing as potential ceRNAs to the genes associated with immune mechanisms. This integrative analysis has identified noncoding RNAs and their plausible targets, which could effectively enhance the understanding of molecular mechanisms associated with viral infection. However, validation of these targets is further corroborated to determine their therapeutic efficacy.
Coronaviruses form a varied group of single-stranded positive-sense RNA viruses, known to infect an extensive range of vertebrate hosts, and causing moderate illnesses of the respiratory tract such as the common cold. An emergence of three, profoundly pathogenic human coronaviruses, from zoonotic viruses has been observed in the former three decades. These include - the Middle East Respiratory-Syndrome-related Coronavirus (MERS-CoV), circulating in both camels and humans. It was not so virulent in case of camels but highly pathogenic in humans, affecting ~2500 people around the world, with a case-fatality rate of ~36%; Severe Acute Respiratory-Syndrome-related Coronavirus (SARS-CoV-1), influencing a global population of approximately 8000 people and a case-fatality rate of around 10% in 2002–2003; and the recent severe acute respiratory-syndrome-related coronavirus (SARS-CoV-2), causing Coronavirus Disease-2019 (COVID-19), the global mortality rate of which needs to be determined (Blanco-Melo et al., 2020, Hasan et al., 2014). Viruses displaying nucleotide identity to SARS were identified in hosts such as palm civets, raccoon dogs in 2002–2003 SARS epidemic. Later on, highly conserved viral strains existing in horseshoe bats were found to bind and infect human cells. Therefore, these SARS-like coronaviruses, using bats, civets and humanACE2 (Angiotensin 1 converting enzyme 2) receptors do not have clear zoonotic origins but are strongly suggestive of further outbreaks in human populations. An infection with these tremendously pathogenic coronaviruses may lead to severe lung disorders such as acute respiratory distress syndrome (ARDS) and acute lung injury (ALI), usually resulting in loss of lung function and even death (Cascella et al., 2020).COVID-19, the disease caused by SARS-CoV-2 is a matter of considerable interest and can be calamitous. According to the latest and largest data set of the new coronavirus outbreak, released by The Chinese Centre for Disease Control and Prevention (https://www.cdc.gov/), the severity of the disease varies among patients, but preponderant cases were expressed mild, with the majority of them having cold-like symptoms to mild Pneumonia. Around 14% of entrenched cases were found to be severe, associated with serious Pneumonia and shortness of breath. 5% of patients were extremely critical exhibiting septic shock, respiratory failure, and multi-organ failure, and approximately 2.3% of them culminated in death. An analysis of the data from 45,000 accustomed cases from China has suggested that elderly people and persons with pre-existent illnesses are more prone to foster relentless forms of COVID-19 (https://www.the-scientist.com/news-opinion/why-some-covid-19-cases-are-worse-than others-67160). Thus, this specific template of enhancing severity with age is different from other viral outbreaks, prominently, the 1918 Flu pandemic, manifesting high rate of mortality in young children and people between 20 and 40 years of age. Nonetheless, it is commodiously persistent with records of MERS and SARS coronavirus outbreaks. Moreover, it is also ambiguous that the new coronavirus is more fatal than the other forms that are frequently found circulating among people in winters and frequently causes common cold (Gralinski and Baric, 2015, Harenberg and Favaloro, 2020).Although there is adequate information about the mortality of the clinical disease, but knowledge about its pathobiology is restricted. Despite this, no data displaying the detailed cellular responses to the virus, but data from previous studies with SARS-CoV can be used to hypothesize the plausible course of events taking place during infection. Previous studies have suggested that cells that are probably infected, determine that COVID-19 can be divided into three different phases, each comparable to the different clinical stage of the disease. Phase 1 corresponds to the asymptomatic state i.e. initial 1–2 days of infection, wherein the sniffed SARS-CoV-2 virus binds to epithelial cells of nasal cavity via S-protein mediated binding and initiates replication. Consequently, replicase poly-proteins are translated by the positive strand of RNA genome which is then further cleaved by 16 nsps (nonstructural proteins encoded by the single-stranded virus). ACE2 is the cardinal receptor for both SARS-CoV and SARS-CoV-2 (Hofmann et al., 2004, Wrapp et al., 2020). In vitro data has suggested that ciliated cells are the primarily infected cells in the conducting airways, but the expression of ACE2 on these cells is limited. Moreover, there is no cell type preference in the conducting airways which is suggestive of viral propagation only, with very little immune response during this phase. Nasal swabs can be used to detect the virus at this stage (Mason, 2020). Phase 2 conforms the migration of virus in the upper airways along with conducting airways, leading to the stimulation of a sturdy immune response within next few days of infection. The disease COVID-19 is clinically manifested at this point. The levels of cytokines can be used to determine the extent of innate immune response generated. Β and λ interferons secreted by viral infected epithelial cells, CXCL10 secreted by alveolar type II cells have been reported to be effective disease markers in SARS. Determination of the host innate immune response might be suggestive of the subsequent course of the disease. However, the disease is mild and confined to upper respiratory tract and conducting airways in around 80% of the infectedpersons. Furthermore, approximately 20% of the patient’s progress towards phase 3 wherein virus infects alveolar type II cells, the gas exchange units of lungs. The virus propagates and releases a large number of viral particles in these cells leading to apoptosis of the infected cells. Thus, it acts as a self-replicating pulmonary toxin by releasing new viral particles that further infect adjacent type II cells. In this way, most of the type II cells of lung will be ravaged, triggering the secondary pathways for epithelial regeneration. This would result in diffused alveolar damage with fibrin rich hyaline membranes and the presence of multinucleated giant cells, finally culminating in abrupt stimulation of wound healing processes, leading to the development of pulmonary fibrosis (Gu and Korteweg, 2007, Qian et al., 2013, Wu and McGoogan, 2020, Xu et al., 2020). Disease resumption would require epithelial regeneration and persuasive innate and acquired immune responses. It has been reported that SARS-CoV proliferates inside Type II cells leading to their apoptotic cell death followed by a release of large number of viral particles (Qian et al., 2013). As a result, the released viral particles infect adjacent type II cells, propagate inside them and release more viral particles, after cell death. Thus, it acts as a self-replicating pulmonary toxin, which upon infection, promotes its own replication and death of the infected cell, finally resulting in the release of more and more viral particles. Consequently, most of the type II cells will be lost and secondary pathways for epithelial regeneration would be triggered. Moreover, it would result in the generation of few multinucleated giant cells and diffused alveolar damage with fibrin rich hyaline membranes. In addition, Type I cells would also not be synthesised as Type II cells are the precursor cells for Type I cells. Such altered wound healing responses would result in more profound Scaring and Fibrosis than that occurs in ARDS (Gu and Korteweg, 2007, Xu et al., 2020, Yee et al., 2017). Elderly people are at higher risk due to their curtailed immune response and diminished mucociliary clearance and repairing capability (Cascella et al., 2020, Harenberg and Favaloro, 2020).Although the occurrence of a viral disease is very challenging and requires prompt responses, but specific drugs are not available for several viral diseases including coronaviruses. Hence, the study of host and pathogen interactions will definitely play a significant role in the development of antiviral therapeutics. However, conventional therapeutics is based on the use of neutralizing antibodies, but more recently, genomic medicine has come up with the discovery of microRNAs (miRNAs), long noncoding RNAs (lncRNAs) and circular RNAs (circRNAs), based on gene silencing. miRNAs are small noncoding RNAs (~19–26 nts) that regulate posttranscriptional gene expression by binding to the complementary sites on the 3′UTR of target mRNA and causing either its degradation or inhibits its translation. It has been well cited that several plants, animals, and viruses encode these noncoding RNAs that regulate numerous important aspects of cellular processes involving proliferation, development, differentiation, apoptosis, metabolism, and tumorigenesis. Transcription Factors (TFs) are proteins which assist in regulation of gene transcription by binding to DNA-binding domains in the promoter region of target genes (Lee and Young, 2000). The regulation of both TFs and miRNAs is strongly interconnected as they both share same regulatory logics (Hobert, 2008). Besides, they act in a largely combinatorial manner, cooperatively regulating the same target genes. As TFs and miRNAs may also mutually regulate each other, feed-forward loops (FFLs) comprising genes, TFs, and miRNAs thus exist (Zhang et al., 2015). FFLs can be broadly differentiated into three types according to the master regulator: TF-FFL, miRNA-FFL, and composite FFL (Sun et al., 2012). Moreover, the regulatory network analysis revealed that FFLs comprise over-represented network motifs in the mammalian regulatory network (Shalgi et al., 2007, Tsang et al., 2007). miRNAs encoded by viruses are unique in a way that they regulate the gene expression of both the virus and their host. lncRNAs (noncoding RNAs of >200nts) and circRNAs bind to miRNAs competitively via miRNA response elements and act as competing endogenous RNAs (ceRNAs). Thus, ceRNAs regulate the expression of miRNAs targeting specific mRNAs, thereby acting as a sponge for miRNAs. Some of the contemporary studies have demonstrated the plausible role of miRNAs as an antiviral therapeutic tool in case of HSV, Dengue, HIV-1, Hepatitis-C virus (HCV) and Influenza (Barbu et al., 2020, Ghosal et al., 2014, Miao et al., 2019). Therefore, identification of lncRNA,/circRNA-help in constructing a ceRNA network, which may provide novel and effective targets for the development of innovative therapeutic strategies. Our study has identified a few budding targets of SARS-CoV-2 and their interrelations with miRNA and lncRNAs, which could be effective tools in building antiviral therapy against the deadly SARS-CoV-2.
Materials and methods
Degs screening from mRNA expression profile
An exhaustive searching in NCBI-Gene Expression Omnibus (GEO) (Clough and Barrett, 2016) was performed for extracting the SARS-CoV associated mRNA expression dataset. mRNA time-series expression dataset with accession number GSE19137 comprising a total of 21 samples (i.e., 7 samples with 3 replicates of each) corresponding to 3 uninfected and 18 infected samples was chosen from GEO. The 18 infected samples were studied at different time points of Day1, Day2, Day3, Day5, Day14, and Day28, respectively. The relative expression levels of a gene symbol mapping to multiple probeset IDs were averaged and the probeset IDs with no corresponding gene symbol were eliminated (Ahmad et al., 2019, Shriwash et al., 2019, Singh et al., 2020). A three-stage screening protocol was employed to screen the significant differentially expressed genes (DEGs) between two infection status groups, i.e., uninfected vs infected. In stage-I, GEO2R (https://www.ncbi.nlm.nih.gov/geo/geo2r/) web-based tool was accessed for identifying the DEGs between uninfected and daywise infected samples, respectively. These DEGs were categorized in six different groups, i.e., uninfected vs Day1 samples (Group-1 DEGs); uninfected vs Day2 samples (Group-2 DEGs); uninfected vs Day3 samples (Group-3 DEGs); uninfected vs Day5 samples (Group-4 DEGs); uninfected vs Day14 samples (Group-5 DEGs); uninfected vs Day28 samples (Group-6 DEGs) at a and . To obtain stage-II DEGs, every DEG group in stage-I was compared with each other. Here we assumed DEGs when comparing and group of DEGs where and . Finally, a group of DEGs is equal to . All the DEGs were combined in order to obtain the master group of DEGs followed by the elimination of duplicates to fetch highly significant stage-III DEGs. The stage-III DEGs corresponding to a and were classified as the up and downregulated, respectively.
PPI network construction and key module selection
The Search Tool for the Retrieval of Interacting Genes (STRING, https://string-db.org/) v11.0 (Szklarczyk et al., 2019) web-based tool was used to analyze the interaction of these DEGs in the form of Protein-Protein Interaction (PPI) network and visualized via Cytoscape v3.8.0 (Shannon et al., 2003). Confidence score was considered as a highly significant threshold in STRING for the construction of PPI network. We also analyzed the PPI network topological properties via Network Analyzer plugin available in Cytoscape. Moreover, the Molecular Complex Detection (MCODE) plugin available in Cytoscape was used to analyze the PPI network and to identify the highly correlated gene clusters/modules. The parameters set as the threshold for network scoring and cluster finding were as follows: “Degree cutoff ”, “node score cutoff ”, “k-score”, and “max. depth” (Pan et al., 2018).
Enrichment analysis for hub genes identification
Pathway enrichment analysis of genes belonging to the highest scoring cluster was performed to filter out the hub genes followed by their GO term enrichment analysis. Pathway enrichment analysis of the highest-scoring cluster genes was performed using ReactomePA package (Yu and He, 2016). It performs pathway analysis using the Reactome database implementing hypergeometric models. ReactomePA takes the input of genes as Ensemble IDs, therefore the gene symbols were converted to their corresponding Ensemble IDs using bitr function available in clusterProfiler package (Yu et al., 2012). A q-value was set as the preferred threshold for identifying significantly enriched pathways and their associated genes. GO term enrichment analysis of the significantly enriched pathway genes was performed using Mouse Genome Index (MGI) Mammalian Phenotype Level 4 gene set library available in Enrichr database (Kuleshov et al., 2016). A p-value was set as the preferred threshold for identifying significantly enriched terms and their associated genes.
Construction of circRNA/lncRNA-miRNA-mRNA/TF regulatory network
The mouse miRNAs intended to target our hub genes were collected from StarBase v2.0 (Li et al., 2014) and miRWalk v3.0 (Sticht et al., 2018) databases, respectively. miRNAs with an overall score and binding on region were selected from miRWalk. Significant TF-gene pairs with an integrated top rank score (p-value) were fetched from ChEA v3.0 database (Keenan et al., 2019). miRNAs intended to target these extracted TFs were also collected from StarBase and miRWalk databases with previously described thresholds. The miRNAs targeting both our hub genes and TFs were considered as the final miRNAs. These miRNAs were now used to retrieve the lncRNA-miRNA and circRNA-miRNA interaction pairs from StarBase database. Lastly, all the five types of regulatory interaction pairs were integrated to establish the 5-node circRNA/lncRNA-miRNA-mRNA/TF regulatory network and visualized using Cytoscape.
Results
SARS-CoV-associated DEGs identification
In stage-I, a total of 284 Group-1, 802 Group-2, 55 Group-3, 206 Group-4, 99 Group-5, and 58 Group-6 DEGs were screened based on the p-value and threshold. The variations of all stage-I DEGs expression levels were visualized and assessed by volcano plots as shown in Fig. 1
(A-F). These DEGs were subjected to further filtering with respect to the aforementioned comparison order to obtain the stage-II DEGs using a web-based tool Venny 2.1 (https://bioinfogp.cnb.csic.es/tools/venny/). Venn plot showing the number of stage-II DEGs (3 1 6) is shown in Fig. 2
. List of stage-III DEGs (2 5 6) was obtained after the removal of duplicates from stage-II DEGs and can be seen in Supplementary Table S1. The heatmap (Fig. 3
) of the top 10 up and downregulated DEGs was generated for comparison of their expression levels across daywise samples.
Fig. 1
Volcano plots highlighting (A) 284, (B) 802, (C) 55, (D) 206, (E) 99, and (F) 58 stage-I DEGs. The green and red colored points indicates downregulated and upregulated DEGs. Black colored points indicates no DEGs. The × and y axes represent the and , respectively.
Fig. 2
Venn plot indicating the number of stage-II DEGs. The 6 categories of stage-I DEGs (i.e., Group-1 DEGs, Group-2 DEGs, Group-3 DEGs, Group-4 DEGs, Group-5 DEGs, and Group-6 DEGs) were uniquely compared with each other in order to obtain 316 stage-II DEGs.
Fig. 3
Heatmap plot of top 10 up and down regulated stage-III DEGs. The normalized expression values of each DEG (columns) is plotted across all respective daywise samples (rows). The data was categorized with respect to the sample groups (controls and daywise). Cluster dendrograms for row and column are displayed along the left and top sides of the plot. The annotation bar displaying the infection timepoint is shown in the right panel.
Volcano plots highlighting (A) 284, (B) 802, (C) 55, (D) 206, (E) 99, and (F) 58 stage-I DEGs. The green and red colored points indicates downregulated and upregulated DEGs. Black colored points indicates no DEGs. The × and y axes represent the and , respectively.Venn plot indicating the number of stage-II DEGs. The 6 categories of stage-I DEGs (i.e., Group-1 DEGs, Group-2 DEGs, Group-3 DEGs, Group-4 DEGs, Group-5 DEGs, and Group-6 DEGs) were uniquely compared with each other in order to obtain 316 stage-II DEGs.Heatmap plot of top 10 up and down regulated stage-III DEGs. The normalized expression values of each DEG (columns) is plotted across all respective daywise samples (rows). The data was categorized with respect to the sample groups (controls and daywise). Cluster dendrograms for row and column are displayed along the left and top sides of the plot. The annotation bar displaying the infection timepoint is shown in the right panel.
PPI network analysis and key module selection
The PPI network of screened DEGs constructed using STRING database comprised of 63 nodes and 157 edges as shown in Fig. 4
(A). A total of 63 out of 256 stage-III DEGs participated in the PPI network at the selected highest confidence level, i.e.. Moreover, the most highly significant cluster with a was selected amongst all the 5 identified clusters and consisted of 11 nodes and 50 edges (Fig. 4B). Various topological/centrality distributions of the PPI network can be seen in Fig. 5
(A-F).
Fig. 4
(A) PPI network comprising 63 nodes and 157 edges constructed using STRING database. The red and blue colored nodes represent up and downregulated proteins. (B) Top scoring PPI cluster consists of 11 nodes and 50 edges.
Fig. 5
Topological property/centrality distribution plots showing (A) Node degree distribution, (B) Betweenness centrality, (C) Closeness centrality, (D) Average shortest path length distribution, (E) Neighborhood connectivity, and (F) Topological coefficient of PPI network.
(A) PPI network comprising 63 nodes and 157 edges constructed using STRING database. The red and blue colored nodes represent up and downregulated proteins. (B) Top scoring PPI cluster consists of 11 nodes and 50 edges.Topological property/centrality distribution plots showing (A) Node degree distribution, (B) Betweenness centrality, (C) Closeness centrality, (D) Average shortest path length distribution, (E) Neighborhood connectivity, and (F) Topological coefficient of PPI network.The pathway enrichment analysis of these 11 genes revealed that only 4 genes (Isg15, Ddx58, Oasl1, Usp18) were involved in significantly enriched pathways and GO terms at the aforementioned threshold. The list of significantly enriched pathways and GO terms are shown in Table 1, Table 2
, respectively. Hence, these 4 genes were considered as the hub genes and used in further analysis. The boxplots comparing the daywise gene expression levels of these hub genes can be seen in Fig. 6
. GO analysis results have displayed that changes in biological processes of our hub genes were momentously implicated in abnormal circulating interferon levels, abnormal granulocyte morphology and physiology, increased activated T cell numbers, decreased bone strength, abnormal lymphocyte morphology and increased incidence of chronic myelocytic leukemia. In addition, our hub genes were involved in highly significant pathways such as antiviral mechanism by IFN-stimulated genes and interferon signaling.
Table 1
List of significantly enriched pathways () and genes(s) involved in them. The results were compiled from reactome database using ReactomePA package.
Pathways
q-value
Gene(s)
Count
Antiviral mechanism by IFN-stimulated genes
2.24E−09
Usp18/Isg15/Ddx58/Oasl1
4
Interferon Signaling
1.97E−08
Usp18/Isg15/Ddx58/Oasl1
4
ISG15 antiviral mechanism
3.00E−07
Usp18/Isg15/Ddx58
3
Ub-specific processing proteases
0.002535107
Usp18/Ddx58
2
Deubiquitination
0.003563381
Usp18/Ddx58
2
Negative regulators of DDX58/IFIH1 signaling
0.003563381
Ddx58
1
Regulation of IFNA signaling
0.003563381
Usp18
1
Interferon alpha/beta signaling
0.004093248
Usp18
1
Termination of translesion DNA synthesis
0.004589162
Isg15
1
Translesion synthesis by Y family DNA polymerases bypasses lesions on DNA template
0.005184469
Isg15
1
DNA Damage Bypass
0.005942213
Isg15
1
Table 2
List of significantly enriched GO terms () and gene(s) involved in them. The results were compiled from MGI Phenotype Ontology Database.
Box-and-whisker plots indicating the daywise expression levels of four upregulated hub genes. The expression values across three biological replicates (daywise samples) are averaged in order to produce a single daywise expression value for each gene. The top and bottom lines of the boxes signify the first and third quartiles with median being represented as a line inside each box. Genes and their expression level range are shown on the x and y axes, respectively.
List of significantly enriched pathways () and genes(s) involved in them. The results were compiled from reactome database using ReactomePA package.List of significantly enriched GO terms () and gene(s) involved in them. The results were compiled from MGI Phenotype Ontology Database.Box-and-whisker plots indicating the daywise expression levels of four upregulated hub genes. The expression values across three biological replicates (daywise samples) are averaged in order to produce a single daywise expression value for each gene. The top and bottom lines of the boxes signify the first and third quartiles with median being represented as a line inside each box. Genes and their expression level range are shown on the x and y axes, respectively.
Analysis of circRNA/lncRNA-miRNA-mRNA/TF regulatory network
The 5-node circRNA/lncRNA-miRNA-mRNA/TF regulatory network was composed of 340 nodes and 630 edges as shown in Fig. 7
(A). Various topological /centrality distributions of this 5-node network can be seen in Fig. 8
(A-F). Supplementary Table S2 summarizes all five types of regulatory interactions involved in this network. Within this quintuple network, we noticed the formation of a closed 3-node miRNA feed-forward loop (FFL) and a 4-node ceRNA network, respectively.
Fig. 7
(A) SARS-CoV-associated integrated quintuple circRNA/lncRNA-miRNA-mRNA/TF regulatory network. Red circular nodes represent the miRNAs, blue square nodes represent the hub genes/mRNAs, yellow triangular nodes represent the TFs, magenta diamond nodes represent lncRNAs, and the green conical nodes represent the circRNAs, respectively. (B) 3-node miRNA FFL dissected from the quintuple network. The red circular nodes represent the miRNAs, blue square nodes represent the hub genes/mRNAs, and yellow triangular nodes represent the TFs. (C) The highest-order motif from this FFL consists of one miRNA (mmu-miR-124-3p), one TF (Stat2) and one hub gene (Ddx58). (D) 4-node circRNA/lncRNA-miRNA-mRNA ceRNA regulatory network dissected from the quintuple network. The red circular nodes represent the miRNAs, blue square nodes represent the hub genes/mRNAs, green conical nodes represent the circRNAs, and magenta diamond nodes represent lncRNAs (E) The highest-order motif from this ceRNA network consists of one miRNA (mmu-miR-124-3p), one hub gene (Ddx58), two circRNAs (Ppp1r10, C330019G07Rik), and one lncRNA (Gm26917). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Fig. 8
Topological property/centrality distribution plots showing (A) Node degree distribution, (B) Betweenness centrality, (C) Closeness centrality, (D) Average shortest path length distribution, (E) Neighborhood connectivity, and (F) Topological coefficient of quintuple network.
(A) SARS-CoV-associated integrated quintuple circRNA/lncRNA-miRNA-mRNA/TF regulatory network. Red circular nodes represent the miRNAs, blue square nodes represent the hub genes/mRNAs, yellow triangular nodes represent the TFs, magenta diamond nodes represent lncRNAs, and the green conical nodes represent the circRNAs, respectively. (B) 3-node miRNA FFL dissected from the quintuple network. The red circular nodes represent the miRNAs, blue square nodes represent the hub genes/mRNAs, and yellow triangular nodes represent the TFs. (C) The highest-order motif from this FFL consists of one miRNA (mmu-miR-124-3p), one TF (Stat2) and one hub gene (Ddx58). (D) 4-node circRNA/lncRNA-miRNA-mRNA ceRNA regulatory network dissected from the quintuple network. The red circular nodes represent the miRNAs, blue square nodes represent the hub genes/mRNAs, green conical nodes represent the circRNAs, and magenta diamond nodes represent lncRNAs (E) The highest-order motif from this ceRNA network consists of one miRNA (mmu-miR-124-3p), one hub gene (Ddx58), two circRNAs (Ppp1r10, C330019G07Rik), and one lncRNA (Gm26917). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)Topological property/centrality distribution plots showing (A) Node degree distribution, (B) Betweenness centrality, (C) Closeness centrality, (D) Average shortest path length distribution, (E) Neighborhood connectivity, and (F) Topological coefficient of quintuple network.
miRNA FFL network
The 3-node miRNA FFL comprises 3 major regulatory interaction types (i.e., miRNA-mRNA repression; TF-mRNA regulation; miRNA-TF repression). It was composed of 26 nodes and 66 edges and is shown in Fig. 7(B). The hub gene, Ddx58 is targeted by the maximum number of miRNAs (i.e., 9). Also, Stat1 and Mxd1 are the TFs targeted by the maximum number of miRNAs (i.e.,8). mmu-miR-34a-5p targeted maximum number of TFs (i.e.,4). The hub genes, Usp18 and Isg15 were regulated by a maximum number of TFs (i.e., 8). The highest-order subnetwork observed in the FFL included one miRNA (mmu-miR-124-3p), one TF (Stat2), one hub gene (Ddx58) and is shown in Fig. 7(C).
circRNA/lncRNA-miRNA-mRNA ceRNA network
The 4-node circRNA/lncRNA-miRNA-mRNA ceRNA network comprised a total of 332 nodes and 578 edges as shown in Fig. 7(D). The highest-order subnetwork observed included one miRNA (mmu-miR-124-3p), one hub gene (Ddx58), two circRNAs (Ppp1r10, C330019G07Rik), and one lncRNA (Gm26917) as shown in Fig. 7(E).The analysis of FFL and ceRNA network revealed that one miRNA (mmu-miR-124-3p), one lncRNA (Gm26917), one TF (Stat2), one mRNA (Ddx58), and two circRNAs (Ppp1r10, C330019G07Rik) were the hub items in this quintuple regulatory network.
Discussion
This study focuses on the identification of interactions between miRNAs, lncRNAs, circRNAs, TFs, and mRNAs to establish a regulatory network between them in case of SARS-CoV infection. This regulatory network will help us identify the probable targets and noncoding RNAs which could directly act upon the genome and control the transcriptome on a large scale. In order to analyze the modulations in host response to SARS-CoV, we started with the elucidation of mRNA expression of virus-infectedmice at different time intervals. The mRNA expression data from a total of 21 samples at different time intervals was obtained from GEO dataset GSE19137. We compared the expression data in 5 different combinations and 3 stages. Our analysis identified a total of 284 Group-1, 802 Group-2, 55 Group-3, 206 Group-4, 99 Group-5 and 58 Group-6 DEGs in stage-I, which were then vigorously screened for two more stages leading to the determination of overall 256 DEGs. Among them, 66% were upregulated and 34% were downregulated (Supplementary Table S1). Further, PPI network analysis indicated the presence of 5 prime clusters of induced genes, of which only 1 highly significant cluster with 11 nodes and 50 edges was identified (score = 10) (Fig. 4B). Pathway enrichment analysis of genes present in this cluster divulged the significant involvement of only 4 genes in cogent pathways (). These genes (Isg15, Ddx58, Oasl1, Usp18) were described as the hub genes, all of which were considerably convoluted in antiviral mechanism by IFN-stimulated genes and Interferon signalling pathways (Table 1).We have observed a notable induction of well-delineated direct effectors of the innate immune system known as ISGs (IFN-stimulated genes), inclusive of Isg15, Usp18, Mx2, Oas3, Ifit3, along with several cytokines such as Cxcl3, Cxcl9, Cxcl10 and Il6. However, the genes involved in IFN-1 signalling (Ly6e, Mx1, Oas3, Ifi44l, If16, Ifitm3) and Type-III IFN related genes (Il28, Il29) were not robustly induced (Supplementary Table S1). These results are in concordance with the study by Blanco-Melo et al. (2020) and can be correlated with limited antiviral response of the host, in case of SARS-CoV-2 infection. Even though, IFN-I and IFN-III were not sufficiently induced, we have observed a notable induction of viral RNA sensors such as Oas3, Ifit3 and other direct effectors of innate immune response. This response is still lacking the stimulation of antiviral genes customarily induced after IFN-I/III signalling (Schoggins et al., 2011), limiting the host antiviral response. Moreover, although direct effectors of innate immune response are induced, the antiviral response is still restricted, which could be due to viral evasion of the immune system. Several in vitro studies have established the strategies by which SARS-CoV evades the immune system (Frieman et al., 2008, Totura and Baric, 2012). One of the strategy is masking of the viral genome to evade host immune response. IFIT members are important antiviral sensors that detect viral RNAs. They recognize unmethylated 2′-O RNA and inhibits translation of uncapped viral mRNAs. However, SARS-CoV encodes a 2′-O-methyltransferase activity in nsp16, which protects them from Ifit recognition (Daffis et al., 2010, Menachery et al., 2014). Thus, although ISGs such as Ifits are activated, antiviral response is limited due to modifications of viral genome.Although IFN-I/III genes are expressed to a limited extent, they are sufficient enough to induce ISGs in infected cells. It has been well documented that IFN-III machinery is exclusively furnished for first defence against low doses of virus. However, when the viral dose exceeds, IFN-I comes into action, exaggerating the inflammatory process and generating disease immunopathology, thus indicating an important role of IFN-I, rather than IFN-III in triggering inflammation (Andreakos et al., 2017, Galani et al., 2017, Kikkert, 2020). Moreover, there exists a positive regulatory mechanism that augments IFN signalling cascade, thereby maintaining and amplifying ISGs. In this cascade, Stat1 and Irf9 (IFN regulatory factor), associates with Stat2 and forms Isgf3 transcription factor that binds to ISG promoter elements and amplifies the expression of ISGs. This enhanced and prolonged-expression of ISGs promotes viral resistance and locks the cells into an autocrine signalling loop that sustains IFN signal transduction and releases pro-inflammatory cytokines, which are responsible for the display of pathobiological features associated with the disease (Kikkert, 2020, Mesev et al., 2019). Most of the patients develop atypical pneumonia specified by fever, fatigue, respiratory symptoms such as cough, sore throat, and shortness of breath. Acute diffuse alveolar damage (DAD), oedema, diffused fibrin, and hyaline membranes in the alveolar spaces, typical of ALI have been observed in early-stage of disease. However, in later stages, there is an organized phase DAD and eventual deposition of fibrous tissue with denuded airways, haemorrhage, and increased macrophage population (Gralinski and Baric, 2015). Researchers have found that late-term progression of the disease is more likely to be associated with immunopathological damage, caused due to significant upregulation of pro-inflammatory cytokines such as IL-2, IL-7, IL-10, TNF-α, MCP-1 and G-CSF. This storm of cytokines are inducers of viral sepsis and inflammation-induced lung injury, leading to other complexities such as respiratory failure, shock, ARDS, organ failure and potentially death (Prompetchara et al., 2020). Briefly, it has been found that viral infection reduces anti-viral IFN responses, as is also observed in our findings, leading to uncontrolled replication of virus and trigger of inflammatory conditions. As a result, there is an intrusion of monocytes, macrophages and neutrophils that leads to enhanced production of pro-inflammatory cytokines, responsible for immunopathology of lung (Prompetchara et al., 2020). Thus, enhanced expression of ISGs after 2 days of infection, when the viral titres are increased (Fig. 6) can be correlated with the abovesaid autocrine signalling loop and development of viral resistance.ISGs are cynical in controlling viral infections. They encode direct antiviral effectors, that are abeyant in positively and negatively regulating IFN signalling pathways and other host responses. Well-characterized ISGs usually target conserved facets of viral replication such as modulation of viral nucleic acid integrity (targeted by Oas/RNase L, ADAR, Oligoadenylate synthase proteins that activates RNase L, endoribonucleases and degrades cellular and viral RNA), viral entry (Ifitm3), virus egress (Bst2), viral protein translocation (PKR, IFIT family members). Most ISGs possess antiviral effects, while some such as Isg15 exerts both Pro and antiviral effects. Isg15 is one of the most robustly induced ISG that can alter host immunity and directly inhibit viral replication. It is a Ubiquitin-like protein, existing in both conjugated and non-conjugated forms. ISGylation is a process by which it conjugates to the target protein following a sequence of reactions catalysed by IFN-I induced E1 activating, E2 conjugating and E3 ligase enzymes. The process can be reversed by Ubiquitin-specific Protease 18 (Usp18). Moreover, Usp18 also acts as a negative regulator of IFN signalling while competing with Jak1 (Janus Kinase 1) for binding, after binding with the second chain of IFNα/β receptor 2 (Ifnar2). Thus, Usp18 has different effects on Isg15 conjugation and as well as altered IFN signalling, that needs to be interpreted very cautiously. Additionally, an unconjugated form of Isg15 functions as a cytokine stimulating IFN-γ production, NK cells proliferation, maturation of dendritic cells and acting as a chemotactic signal for neutrophils. It also stabilizes Usp18 and regulates IFN signalling by mediating autophagic degradation of viral sensors (RIG-I, also known as Ddx58, Retinoic acid-inducible gene1) (Choi et al., 2015, Perng and Lenschow, 2018). During the course of a viral infection, RIG-I (i.e. pattern recognition receptor, encoded by DDX58 gene in humans) senses viral RNPs leading to its oligomerization and association with mitochondrial antiviral signalling protein (MAVs), with subsequent activation and induction of ISGs (Perng and Lenschow, 2018, p. 15). In addition, it has also been reported that SARS-CoV-2 spike protein (S), binds with angiotensin-converting enzyme 2 (ACE2) and stimulates viral entry into the host, following association with host proteases, mainly transmembrane serine protease2 (TMPRSS2). Both ACE2 and TMPRSS2 are expressed on nasal secretory cells, type II pneumocytes and absorptive enterocytes. ACE2 is upregulated by Type-I IFNs and Type-II IFNs to a lesser extent (Ziegler et al., 2020). Thus, it could also be an ISG possessing antiviral effect promoting viral egress. Studies have reported that binding of SARS-CoV to mouseACE2, reduces its expression and causes lung failure. However, it is very weakly induced in case of murine cells. This can also be related to our results where there is no notable induction of IFNs, ACE2 and TMPRSS2. Further, the S protein of humanSARS-CoV has a very high binding affinity for humanACE2 than that of mice and rats (Li et al., 2004, Tseng et al., 2007). Thus, restrictions posed by host factors also play an important part in selection of genes to study the dynamics of host pathogen interactions in SARS-CoV-2 infection. Moreover, targeting ISGs and strategies to fine tune their levels of expression would be beneficial in developing an effective therapeutic tool against the virus.In this respect, other parts of the innate immune system package such as small noncoding (micro, circular) and long noncoding RNAs are considered. These elements are of considerable importance as they have been implicated in the regulation of numerous pathological and physiological processes in animals and humans. It has been found that several host miRNAs are deregulated in order to confound the immune system against a viral infection. In addition, lncRNAs also play powerful roles in inflammatory pathways, thereby regulating gene expression through multiple mechanisms including ceRNA theory. As conceded in this theory, lncRNAs act as “miRNA sponges” and reduce the degradative effects of miRNAs on their target mRNAs (Ghosal et al., 2014). Based on this, our study has identified a quintuple network, consisting of one miRNA (miR-124-3p), one lncRNA (Gm26917), one TF (Stat2), one mRNA (Ddx58) and two circRNAs (Ppp1r10, C330019G07Rik). miR-124-3p is known to modulate TLR-mediated innate immune responses by targeting Stat3 and reduce the secretion of Il6 and TNF-α (Qin et al., 2016). It has been found to be upregulated in case of HIV-1 infection, causing downregulation of p21 and TASK1, consequently leading to an increase in the release of virions (Farberov et al., 2015).Viruses have their own ways to promote their replication. One way is to hijack the host RNA helicases (Kikkert, 2020). RIG-I/Ddx58 receptors have a helicase domain which interacts with viral nsp13 and initiates the viral life cycle. Ddx58 is highly upregulated (as observed in our analysis), in SARS-CoV-2 infected cells, which could also be one of the probable reason for increase in viral replication. Ddx58 is also involved in processes of mRNA splicing and miRNA biogenesis and thus, its upregulation would cause reprogramming of miRNA splicing events resulting in downregulation of miRNAs such as miR-124-3p which has its predicted site in 3′-UTR of Ddx58 (Cheng et al., 2018, Dardenne et al., 2014). Thus, overexpression of miR-124-3p would cause the degradation of Ddx58, thereby resulting in the reduction of viral replication. Moreover, RIG-I/Ddx58 also augments Stat1 (TF), which upon phosphorylation combines with Stat2 and IFN regulatory factor 9 (Ifn9) to form heterotrimers and initiate the transcription of ISGs to exert antiviral effects (Jiang et al., 2011). Moreover, to evade ISG mediated antiviral effects, the virus may also induce cytosolic translocation of Drosha (an enzyme involved in miRNA biogenesis) (Shapiro et al., 2014), which could reprogram the splicing events to yield another lncRNAs and circRNAs that could sponge miR-124-3p and hamper its degradative actions on Ddx58.Overall, it can be speculated that despite of non-significant induction of IFNs and hence absence of considerable upregulation of ACE2 and TMPRSS2 in mice, the viruses still gain entry into the cell. Although the secreted IFNs are very limited in amount but are sufficient enough to weakly induce ACE2 that promotes binding of virus to the host cell. Later on, SARS-CoV-2 identifies the target cells and utilises the host system to promote their replication. This process begins with an increase in expression of ISGs, as a part of host innate immune response, but the virus modulates the system in its own way to capture the IFN releasing mechanisms and reduce the secretion of IFNs. Although IFNs are secreted in very low amounts, but are sufficient enough to induce ISGs, along with viral sensor RIG-I/Ddx58. Since ACE2 is also IFN induced, it can also be considered as an ISG. ISGs generally have antiviral effects but some such as Isg15 also possess proviral effects. Increased Isg15 may result in the generation of an autocrine loop that could amplify and prolong their own secretion, ultimately leading to the development of viral resistance. Further, the virus also confiscates host RNA helicases such as the helicase domain in Ddx58 to enhance viral replication independent of IFN related pathways. However, Ddx58 is also implicated in splicing events of miRNA biogenesis, thus association of viral helicase with Ddx58 may alter these splicing events in order to provide replication advantage to viruses. Drosha is another enzyme involved in miRNA biogenesis and it would also be steered by viruses to reprogram splicing events associated with miRNA biogenesis to synthesize other noncoding RNAs that could sponge the degradative effect of miR-124-3p on Ddx58 and hence favour viral replication.Although the study is carried out on mRNA expression data of SARS-CoV (HumanSARS-CoVTOR2, SARS-related human coronavirus (NC_004718)) infectedmice (Lau et al., 2015). This strain belongs to the genus Betacoronaviruses and phylogenetic analysis have shown that all bat derived coronaviruses (causative of COVID-19), fall under all five subgenera of the Genus Betacoronaviruses and share 99% of genomic similarities (Lu et al., 2020). Thus, it can be speculated that SARS-COV-2 also utilizes the similar strategies to modulate host immune responses, such as dampening of IFN-I response, but additional mechanisms need to be uncovered. Moreover, the deregulation of IFN-I/III regulated genes has been observed in SARS-CoV infection, leading to uncontrolled viral replication and other immunopathologic conditions. Since these genes are also found to be deregulated in studies associated with SARS-CoV-2 infection (Blanco-Melo et al., 2020), it is assumed to have similar effects in SARS-CoV-2 infection. Thus, we could see from the above network that IFNs play a pivotal role in antiviral responses and strategies to carefully balance the levels of IFNs are essentially required to maintain an equilibrium between tissue tolerance and viral infection. Furthermore, this network could be a very effective tool in understanding the dynamics of host and virus interactions taking place in a SARS-CoV-2 infected cell.The key limitations of our study is that it is based on the data available on public databases, however, it needs further in vitro and in vivo validation to determine its clinical efficiency. Additionally, the study highlights the prospective interactions between mRNAs, miRNA and ceRNAs based on database search, but does not include the expression data of miRNA, lncRNA and circRNA. Availability of this data would enhance the visionary attributes of interactions and would be more beneficial in establishing a clear understanding of the mechanisms involved.
Conclusions
The present study identifies a quintuple ceRNA network involving miRNA (miR-124-3p), lncRNA (Gm26917), circRNAs (Ppp1r10, C330019G07Rik), mRNA (Ddx58) and a TF (Stat2). This network is an in-silico representation of the host-pathogen interactions occurring in SARS-CoV infected cell. The study has identified the host factors and other targets that are critical for development of antiviral responses and has pinpointed the viral processes that could be evolved to evade antiviral immunity. Despite, the study is based on SARS-CoV infection data, this data can be very useful in determining the sequence of events taking place in SARS-CoV-2 infection, due to high level of genomic similarities between both the viruses. Moreover, an exploration of host-virus interactions would be very beneficial in designing effective therapeutic strategies based on these targets. However, the interactions identified in the study are not validated both in vitro and in vivo and hence needs to be experimentally substantiated.
Funding
The present study was supported by the Science and Engineering Research Board (SERB), Department of Science and Technology (grant numbers: SB/S2/RJN-199/2014 to Mansoor A Syed and EEQ/2016/000509 to Ravins Dohare).
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Authors: Min Yee; William Domm; Robert Gelein; Karen L de Mesy Bentley; R Matthew Kottmann; Patricia J Sime; B Paige Lawrence; Michael A O'Reilly Journal: Am J Respir Cell Mol Biol Date: 2017-04 Impact factor: 6.914
Authors: Jillian S Shapiro; Sonja Schmid; Lauren C Aguado; Leah R Sabin; Ari Yasunaga; Jaehee V Shim; David Sachs; Sara Cherry; Benjamin R tenOever Journal: Proc Natl Acad Sci U S A Date: 2014-04-28 Impact factor: 11.205
Authors: Damian Szklarczyk; Annika L Gable; David Lyon; Alexander Junge; Stefan Wyder; Jaime Huerta-Cepas; Milan Simonovic; Nadezhda T Doncheva; John H Morris; Peer Bork; Lars J Jensen; Christian von Mering Journal: Nucleic Acids Res Date: 2019-01-08 Impact factor: 16.971
Authors: Daniel Wrapp; Nianshuang Wang; Kizzmekia S Corbett; Jory A Goldsmith; Ching-Lin Hsieh; Olubukola Abiona; Barney S Graham; Jason S McLellan Journal: Science Date: 2020-02-19 Impact factor: 47.728
Authors: Shaniya Ahmad; Mohd Murshad Ahmed; P M Z Hasan; Archana Sharma; Anwar L Bilgrami; Kailash Manda; Romana Ishrat; Mansoor Ali Syed Journal: Genes (Basel) Date: 2020-11-10 Impact factor: 4.096