
Identification of novel regulatory pathways across normal human bronchial epithelial cell lines (NHBEs) and peripheral blood mononuclear cell lines (PBMCs) in COVID-19 patients using transcriptome analysis.

Lavanya C¹, Aajnaa Upadhyaya¹, Arpita Guha Neogi¹, Vidya Niranjan¹.

Abstract

SARS-CoV-2 is one of the most infectious and deadly coronaviruses and has gripped the world, causing the COVID-19 pandemic. Despite the numerous studies conducted on this virus, many uncertainties about the disease remain. These are exacerbated by the rapid mutations acquired by the virus, which enable the disease to present differently in different people and introduce new factors of uncertainty. This study aims to identify regulatory pathways across two cell lines, namely the peripheral blood mononuclear cell (PBMC) line and the normal human bronchial epithelial (NHBE) cell line. Both cell lines were considered because they support viral replication. Furthermore, the NHBE cell line captures vital changes in the lungs, which are the main organs affected in COVID-19 patients, and the PBMC cell line is closely linked to the body's immune system. RNA-Seq analysis, differential gene expression analysis and gene set enrichment analysis for pathway identification were performed. Pathway analysis sheds light on the various systems of the body affected by COVID-19. Gene regulatory networks associated with the significant pathways were also constructed. These networks aid in identifying various gene targets, along with their interactions. Studying the functionality of the pathways and the gene interactions associated with them, aided by long COVID studies, will provide considerable clarity about the current COVID-19 scenario. In the long term, this will help in the design of therapeutic approaches against SARS-CoV-2 and can also contribute to drug repurposing studies. Ultimately, this study identifies and analyses the relationship of various undiscovered or lesser-explored pathways in the human body to SARS-CoV-2 and establishes a clearer picture of the association to help streamline further studies and approaches.


Keywords: Differential gene expression; Drug repurposing; Gene regulatory networks; Neurodegenerative disorders; Pathway analysis

Year: 2022 PMID: 35669390 PMCID: PMC9159965 DOI: 10.1016/j.imu.2022.100979

Source DB: PubMed Journal: Inform Med Unlocked ISSN: 2352-9148

Abbreviations: Severe Acute Respiratory Syndrome Coronavirus-2; Middle East Respiratory Syndrome Coronavirus; Coronavirus Disease 2019; Peripheral Blood Mononuclear Cells; Normal Human Bronchial Epithelial Cells; Angiotensin Converting Enzyme 2; National Center for Biotechnology Information; European Nucleotide Archive; Genome Reference Consortium Human genome; General Feature Format; General Transfer Format; FASTA Nucleic Acid; Sequence Read Archive; Hierarchical Indexing for Spliced Alignment of Transcripts; Sequence Alignment Map; Binary Alignment Map; Differential Gene Expression; Kyoto Encyclopedia of Genes and Genomes; KEGG Orthology-Based Annotation System; Gene Set Enrichment Analysis; ENCyclopedia Of DNA Elements; Gene Ontology; Neurodegenerative disease

Introduction

Coronaviruses, known for their spiked structure, belong to the Coronaviridae family, whose members infect the upper gastrointestinal and respiratory tracts of mammals [1]. The SARS-CoV outbreak of 2002 and 2003, as well as MERS-CoV, which has been causing outbreaks centred on the Middle East since 2012, have aided researchers in their understanding of coronaviruses. COVID-19, which started in late December 2019, was declared a pandemic by the World Health Organization on March 11, 2020 [2]. Although there is no effective cure for the infection, which affects general health, vaccines have shown promising results. In the absence of established treatment regimens, an overwhelming amount of research has been carried out on drug discovery. This includes, but is not limited to, understanding the role of repurposed drugs such as hydroxychloroquine, favipiravir and remdesivir [3], along with carbon fullerenes and nanotubes as potential binding agents against the protein targets [4]. Apart from synthetic compounds, phyto-actives are under intense investigation for their anti-viral activity [5]. The potential targets also include signalling pathways such as the NF-kB (inflammation- and apoptosis-related) pathways, which are targeted during drug discovery [6]. Researchers are still trying to understand the virus and its potential mutations to find a permanent solution to this global concern. However, the key to understanding this virus lies in its genetic material. Coronaviruses have the largest genomes among the RNA viruses, approximately 26–32 kb. SARS-CoV-2 has a positive-sense single-stranded RNA as its genetic material [7]. Transcriptome studies of other coronaviruses have revealed much information about the kind of infection and the pathways affected in the body, and they have also exposed novel approaches. However, it is uncertain whether SARS-CoV-2 has the same transcriptome properties as SARS-CoV-1 and whether the prior techniques would be effective against this virus.
As a result, transcriptomics is used in this study to reveal previously undisclosed information regarding COVID-19 and SARS-CoV-2. Transcriptomics is an NGS-based approach that helps analyse the actively transcribed regions of the genome. RNA-Seq involves sequencing the entire RNA population without relying on any prior annotation of the genome. More reads are generated if a sequence is transcriptionally active, which plays a significant role in the analysis [8]. RNA-Seq analysis yields valuable information about the differentially expressed genes, which helps in discovering novel pathways associated with these genes. To characterize the genes involved and analyse the affected pathways, transcriptomes of cell lines derived from lungs and blood were considered for this study and were later subjected to gene set enrichment analysis. Normal human bronchial epithelial cell lines were considered because the virus directly affects the lungs of the infected person. Peripheral blood mononuclear cells (PBMCs) were considered for the following reasons: (i) the virus replicates in these cell lines, (ii) dynamic changes in the disease during the infection can be detected in these cell lines [9], and (iii) these cell lines are directly related to the immune system. This study examines the pathways at a finer level to obtain valuable information about the virus and COVID-19 using two different cell lines. Since not much work has been done concerning these two cell lines, this study can provide potential insights into the virus and viral infection.

Materials and methodology

Fig. 1 represents the overall methodology followed in this study.

Fig. 1

Pipeline methodology flowchart.

Dataset collection

Three healthy and patient samples were obtained from each of the NHBE and PBMC cell lines. Run Selector is a tool available through the Sequence Read Archive (SRA) that refines web-based search results using about two dozen fields for filtering SRA data. In this study, the Run Selector search was narrowed down to transcriptome data, and the RNA-Seq assay type was selected. The hyperlinks inside the data table lead to the NCBI page for the SRA Study or BioProject. The healthy and patient sample IDs from both cell lines were collected. The file transfer protocol (FTP) links used to fetch these samples were taken from the European Nucleotide Archive (ENA). The human reference genome used was the most recent version, GRCh38 (also called hg38), released in 2013. GRCh38 v32 was used, with all the GTF and GFF files downloaded from GENCODE. NHBE samples were taken from an open-access project. In that project, biological replicates were created using uninfected human lung biopsies derived from one female aged 60 and one male aged 72. Technical replicates were processed from lung samples derived from a single deceased male COVID-19 patient aged 74. The biosamples were accessed from the BioProject ID PRJNA615032 (https://www.ncbi.nlm.nih.gov/sra/?term=PRJNA615032) [10,11]. The samples for PBMCs were collected from a study that included 6 convalescent patients, 10 RTP patients and 10 healthy controls for the analysis of the immunological characteristics of PBMCs. In that study, transcriptome sequencing was used to comprehensively characterize the transcriptional changes in the three groups. The biosamples were accessed from the SRA study accession SRP304889 (https://www.ncbi.nlm.nih.gov/sra/?term=SRP304889).
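The step of collecting FTP links from ENA can be illustrated with a short Python sketch. It assumes the ENA Portal API file report endpoint and its `fastq_ftp` field, which are not named in the study itself; the helper names are hypothetical, and the code only builds the query URL and parses a tab-separated report, leaving the actual download to the reader.

```python
import csv
import io
from urllib.parse import urlencode

# Assumption: the ENA Portal API file report endpoint (not named in the study).
ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def filereport_url(accession: str) -> str:
    """Build a file report query for a project or study accession."""
    params = {
        "accession": accession,
        "result": "read_run",
        "fields": "run_accession,fastq_ftp",
        "format": "tsv",
    }
    return f"{ENA_FILEREPORT}?{urlencode(params)}"


def parse_filereport(tsv_text: str) -> dict[str, list[str]]:
    """Map each run accession to its list of FASTQ FTP links."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    links = {}
    for row in reader:
        # Assumption: paths are ';'-separated and listed without a scheme.
        paths = (row.get("fastq_ftp") or "").split(";")
        links[row["run_accession"]] = ["ftp://" + p for p in paths if p]
    return links


if __name__ == "__main__":
    # The two accessions used in this study.
    for accession in ("PRJNA615032", "SRP304889"):
        print(filereport_url(accession))
```

The text returned by such a URL can be passed to `parse_filereport` to obtain one list of links per run, which can then be filtered down to the healthy and patient sample IDs.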

Indexing

Indexing the sequences helps differentiate between multiple sequences by using various techniques and algorithms. These techniques also help reduce the computer memory occupied by the human genome, thus enabling faster and more efficient lookup of the sequence. This characteristic allows the aligner to narrow down the region of the genome to search for a query sequence, which in turn saves time and memory [8]. In this study, the human reference genome was indexed before the sample reads were mapped and aligned, for better results. Once the human reference was indexed, the NHBE and PBMC sample sets were mapped and aligned to it.
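How an index narrows down the search can be seen in a toy Python sketch. It uses a plain k-mer hash index, which is a deliberately simpler technique than the compressed index an aligner such as HISAT2 builds; the function names and the example sequences are hypothetical.

```python
from collections import defaultdict


def build_kmer_index(reference: str, k: int) -> dict[str, list[int]]:
    """Record every start position of every k-mer in the reference."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return dict(index)


def candidate_positions(read: str, index: dict[str, list[int]], k: int) -> list[int]:
    """Use the first k bases of the read as a seed to look up candidate loci."""
    return index.get(read[:k], [])


def locate(read: str, reference: str, index: dict[str, list[int]], k: int) -> list[int]:
    """Keep only the candidates where the whole read matches exactly."""
    return [
        pos
        for pos in candidate_positions(read, index, k)
        if reference[pos:pos + len(read)] == read
    ]


if __name__ == "__main__":
    reference = "ACGTACGTTAGC"  # toy sequence, not real genome data
    index = build_kmer_index(reference, k=4)
    print(candidate_positions("ACGTT", index, k=4))  # seed hits
    print(locate("ACGTT", reference, index, k=4))    # verified hits
```

The index is built once and reused for every read, so each read is compared against a handful of candidate positions rather than the whole reference.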

Mapping and alignment

After the human reference genome is indexed, the sample sequences must either be mapped or aligned to the human reference genome, or a de novo assembly must be conducted. This is one of the primary steps in transcriptome analysis. Mapping is done to locate the origins of the reads in the human reference genome [8] and to locate and identify the distances between the genes in the chromosome [12]. Therefore, the three sample sets of both cell lines in this work, including the healthy and patient conditions, were mapped and aligned to the collected human reference genome using the HISAT2 tool. HISAT2 is a mapping and alignment tool that yields higher accuracy than other alignment tools. It relies on fast indexing schemes based on the Burrows–Wheeler transform and builds on the Bowtie implementation [12]. The mapping results from HISAT2 are stored in the Sequence Alignment Map (SAM) format (the output file has a .sam extension). SAM is a tab-delimited format that is human-readable and easy to examine but slow to parse. It must therefore be converted to the Binary Alignment Map (BAM) format to reduce the size of the samples and parse the sequences faster [8]. BAM is the compressed, binary version of SAM. BAM format compresses up to 128 Mb. The SAM/BAM format contains a header section, in which each line starts with the symbol '@', and an alignment section with 11 mandatory fields [8]. Samtools was used to convert SAM into BAM. It was downloaded and installed using Anaconda. Samtools looks for matches and mismatches at each genome coordinate of the reads and removes duplicates while compressing the files and changing the file format from .sam to .bam [8]. The BAM files obtained are indexed to obtain a companion file, called an index file or BAM index file. This file has the same name, suffixed with .bai. It allows programs to skip over some areas of the sequences and jump directly to specific parts of the BAM file. The .bai file is valid only when there is a corresponding BAM file. For each alignment file in this study, the .bam.bai file was kept in the same directory as the BAM file of the sample.
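The indexing, alignment and conversion steps can be outlined as a sequence of command lines. The Python sketch below only assembles the commands; the file names are hypothetical, paired-end reads are assumed, and a coordinate sort is included because `samtools index` requires a coordinate-sorted BAM file. The exact options used in the study are given in the published protocol and may differ.

```python
import shlex


def alignment_commands(reference_fasta: str, index_base: str,
                       reads_1: str, reads_2: str, sample: str) -> list[list[str]]:
    """Return the command lines for one sample, in execution order."""
    sam = f"{sample}.sam"
    bam = f"{sample}.sorted.bam"
    return [
        # Build the HISAT2 index of the reference (done once per reference).
        ["hisat2-build", reference_fasta, index_base],
        # Align the paired-end reads and write SAM output.
        ["hisat2", "-x", index_base, "-1", reads_1, "-2", reads_2, "-S", sam],
        # Sort by coordinate and write compressed BAM.
        ["samtools", "sort", "-o", bam, sam],
        # Create the companion index file <bam>.bai next to the BAM file.
        ["samtools", "index", bam],
    ]


if __name__ == "__main__":
    commands = alignment_commands(
        "GRCh38.fa", "grch38_index",            # hypothetical file names
        "sample_1.fastq.gz", "sample_2.fastq.gz", "sample",
    )
    for command in commands:
        print(shlex.join(command))
```

Each list can be passed to `subprocess.run(command, check=True)` once HISAT2 and Samtools are installed, for example through Anaconda as in this study.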

Differential gene expression

RNA-Seq provides count data for quantitative readouts, statistical analysis and visualization with the aid of tools such as DESeq and other R/Bioconductor packages [[13], [14], [15]]. In transcriptome analysis, different conditions are compared using differential gene expression (DGE) analysis [16]. Here, the samples were subjected to pairwise DGE analysis, which used the count table generated from RNA-Seq, in a tool called OmicsBox. OmicsBox is a bioinformatics software solution with various modules for different types of NGS-based approaches [17]. The transcriptomics module was used in this study, and under this module, the DGE analysis tool was selected. The alignment results in the form of BAM files and the human reference genome annotation in the GFF format were uploaded directly to OmicsBox. OmicsBox compares genes between the experimental and contrast conditions, or over time, using well-known and versatile statistical packages such as NOISeq, edgeR and maSigPro [[18], [19], [20], [21]]. Differential expression analysis identifies which genes are expressed under specified conditions and the extent to which they are expressed. These genes offer biological insight into the processes affected by the conditions of interest. The count data used for differential expression analysis represent the number of sequence reads originating from a particular gene. A higher count implies more reads associated with a particular gene and supports a strong assumption that the gene has a higher expression level in the sample. In this work, the count data were first normalized to account for differences in library size and RNA composition between samples. Following this, the normalized counts were used to make plots for quality control (QC) at the gene and sample levels. Finally, differential gene expression analysis was performed. A counts-per-million (CPM) filter was applied to the gene counts, comparing replicates within the same sample group.
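The library-size part of the normalization and the CPM filter can be written out in a few lines of Python. This is a minimal sketch under stated assumptions: the thresholds `min_cpm` and `min_samples` are hypothetical, the count table is a toy example, and the plain scaling shown here does not correct for RNA composition, which dedicated packages such as edgeR handle separately.

```python
def counts_per_million(counts: dict[str, list[int]]) -> dict[str, list[float]]:
    """Scale each sample's counts by its library size (total reads), times 1e6."""
    n_samples = len(next(iter(counts.values())))
    library_sizes = [
        sum(row[j] for row in counts.values()) for j in range(n_samples)
    ]
    return {
        gene: [row[j] / library_sizes[j] * 1e6 for j in range(n_samples)]
        for gene, row in counts.items()
    }


def filter_low_expression(counts: dict[str, list[int]],
                          min_cpm: float = 1.0,
                          min_samples: int = 2) -> dict[str, list[int]]:
    """Keep genes whose CPM reaches min_cpm in at least min_samples samples."""
    cpm = counts_per_million(counts)
    return {
        gene: row
        for gene, row in counts.items()
        if sum(value >= min_cpm for value in cpm[gene]) >= min_samples
    }


if __name__ == "__main__":
    # Toy count table: gene -> counts in two replicates (not study data).
    toy_counts = {"G1": [900, 1800], "G2": [100, 200], "G3": [0, 0]}
    print(counts_per_million(toy_counts))
    print(sorted(filter_low_expression(toy_counts)))
```

Because CPM divides by the library size, the second replicate, sequenced twice as deeply in this toy table, yields the same CPM values as the first.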

Gene set enrichment and pathway analysis

The differentially expressed genes obtained after DGE analysis in OmicsBox were subjected to gene set enrichment (GSE). This analysis is a popular approach for interpreting and analysing the information obtained from DGE, which was used here for pathway summary. The approach is flexible and robust, dramatically reduces background noise and can be used for highly heterogeneous datasets. It also helps in detecting changes in pathway activity based on the differentially expressed genes, and the use of GSE methods in this pipeline helped model the associated pathways [22]. The KOBAS 3.0 tool was used for pathway analysis, and the Enrichr tool was used to optimize the pathways obtained. Visualization and networking of pathways were done using InBio Map™.

KOBAS version 3.0, named KOBAS intelligent version (KOBAS-i), is a web server and software that annotates an input set of genes or proteins by mapping them to genes with known pathways in the KEGG PATHWAY database. It uses a hypergeometric test to identify significant pathways. Moreover, it uses five different pathway databases, namely KEGG PATHWAY, PID, BioCyc, Reactome and Panther, and five human disease databases, namely OMIM, KEGG DISEASE, FunDO, GAD and NHGRI GWAS Catalog. KOBAS-i accepts different types of input, such as gene IDs, symbols, FASTA sequences or tabular BLAST output [23,24]. The two programs in KOBAS 3.0 are 'annotate' and 'identify'. The annotate program identifies the coding regions and locations of the genes. The enrichment module provides information about the pathways and GO terms that are statistically significantly associated with the input gene list or expression data [23]. Two different enrichment analyses are available: gene-list enrichment and exp-data enrichment. For this study, the KEGG pathway database was chosen as the filter to map the input genes. The upregulated and downregulated gene lists were uploaded separately after selecting the Gene-set enrichment option, owing to the input limit of 3000 gene symbols set by KOBAS-i. The tool outputs the statistically significant pathways associated with the statistically significant genes obtained after gene set enrichment analysis in the form of bar and bubble plots. KOBAS 3.0 is freely available at: http://kobas.cbi.pku.edu.cn/.

The Enrichr tool was used to optimize the pathways. Enrichr currently contains an extensive and diverse collection of 102 gene set libraries available for analysis [25]. It incorporates features such as submission of fuzzy sets, uploading of BED (Browser Extensible Data) files and visualization in the form of clustergrams. The clustergram generates correlation plots of the most significant pathways. The upregulated and downregulated gene lists from both cell lines were uploaded to Enrichr to output a bar plot representing the most significant pathways obtained from different pathway databases. Enrichr is freely available at http://amp.pharm.mssm.edu/Enrichr [[26], [27], [28]].

For networking, interpreting and visualising the pathways obtained from KOBAS 3.0 and Enrichr, another tool called InBio Map™ was used. With this tool, relevant annotations from the enrichment analysis results were selected. Custom network visualizations were created using annotations, and gene expression datasets from this study were given as input. Networks were filtered based on topology or proteins. InBio Map™ is freely available at https://inbio-discover.com/ [29]. Further details and the codes used are available in the published protocol [30].
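The hypergeometric test mentioned above can be written out directly. The Python sketch below is a generic over-representation calculation, not the KOBAS implementation; the function name and the toy numbers are hypothetical. It gives the probability of seeing at least `k` pathway genes in an input list of `n` genes drawn from a background of `N` genes, of which `K` belong to the pathway.

```python
from math import comb


def hypergeom_enrichment_pvalue(k: int, n: int, K: int, N: int) -> float:
    """Upper-tail hypergeometric probability P(X >= k).

    k: input genes that fall in the pathway
    n: total input genes
    K: background genes annotated to the pathway
    N: total background genes
    """
    total = comb(N, n)
    tail = sum(
        comb(K, i) * comb(N - K, n - i)
        for i in range(k, min(n, K) + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Toy numbers: 2 of 3 input genes hit a 4-gene pathway, background of 10.
    print(hypergeom_enrichment_pvalue(k=2, n=3, K=4, N=10))
```

In practice one such p-value is computed per pathway, and the resulting list is then corrected for multiple testing before pathways are called significant.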

Results and discussion

Results obtained from OmicsBox

Count table generation

The count tables contain 38,199 genes for both cell lines. Gene expression in RNA-Seq studies is approximated using the count table. A count is the number of reads that overlap a certain feature, such as a gene. Supplementary Tables S2A–S2B provide the count tables for both cell lines. Supplementary Figs. S1A–S1D present bar charts depicting counts per category and box plots depicting count distributions.

Differential gene expression

Tables containing the differentially expressed genes obtained from the two cell lines were generated and are provided in Supplementary Tables S2C and S2D. These tables show whether a gene is upregulated (FDR ≤0.05, logFC ≥1) or downregulated (FDR ≤0.05, logFC < −1). Genes that did not pass the filtering step are not shown. The logFC values describe the gene expression changes between the experimental conditions (healthy and patient). The logCPM column gives the average log2 counts per million. The false discovery rate (FDR) is generally estimated by the Benjamini–Hochberg method. Typically, p-values indicate how significant the results are, and a p-value of at most 0.05 is considered statistically significant. The FDR is an adjusted p-value used to trim false-positive results. Since thousands of genes are tested, false positives may enter by chance. Therefore, the FDR offers more confidence than a p-value alone, and genes with FDR values below 0.05 are considered significant. The likelihood ratio (LR) statistic for the generalized linear model (LR test) compares the goodness of fit of two competing statistical models based on the ratio of their likelihoods. The results from the differential gene expression analysis of both cell lines are summarized in the bar plots shown in Fig. 2. The heatmap showing the regulation of the top 50 differentially expressed genes is presented in Fig. 3.
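The Benjamini–Hochberg adjustment and the thresholds quoted above can be sketched in Python. This is a generic implementation for illustration, not the code run inside OmicsBox; the function names are hypothetical, and the cut-offs are the ones stated in the text (FDR ≤0.05, logFC ≥1 for upregulation, logFC < −1 for downregulation).

```python
def benjamini_hochberg(pvalues: list[float]) -> list[float]:
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify(log_fc: float, fdr: float,
             fdr_cutoff: float = 0.05, lfc_cutoff: float = 1.0) -> str:
    """Label a gene using the FDR and logFC thresholds from the text."""
    if fdr <= fdr_cutoff and log_fc >= lfc_cutoff:
        return "upregulated"
    if fdr <= fdr_cutoff and log_fc < -lfc_cutoff:
        return "downregulated"
    return "not significant"


if __name__ == "__main__":
    toy_pvalues = [0.01, 0.04, 0.03, 0.005]  # toy values, not study data
    print(benjamini_hochberg(toy_pvalues))
    print(classify(log_fc=2.3, fdr=0.02))
```

The adjustment scales each p-value by the number of tests divided by its rank, which is why a raw p-value just below 0.05 can fail the FDR threshold when thousands of genes are tested.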

Fig. 2

Fig. 2A: DGE results from the NHBE cell line. Fig. 2B: DGE results from the PBMC cell line. The first rectangle represents the total number of features (genes) obtained from the count table. The next block represents the number of genes that were obtained after applying the normalisation filter. The third block represents the genes that did not show any expression pattern. The last two blocks show the number of upregulated and downregulated genes, respectively.

Fig. 3

Fig. 3A represents the heatmap for the differentially expressed genes of the NHBE cell line. The first half of the genes (up to FER1L6-CS2) are upregulated in the healthy samples and downregulated in the patient samples. The second half of the genes (from HNRNPH3) are downregulated in the healthy samples and upregulated in the patient samples. Fig. 3B represents the heatmap for the differentially expressed genes from the PBMC cell line. The first half of the genes (up to ZNF503) are upregulated in the healthy samples and downregulated in the patient samples. The second half of the genes (from LINC01089) are downregulated in the healthy samples and upregulated in the patient samples.

The volcano plots shown in Fig. 4 represent the regulation of genes based on their FDR and logFC values. The FDR values are less than 0.05 for both upregulated and downregulated genes, and the logFC values are higher than 1 for the upregulated genes and lower than −1 for the downregulated genes. Genes that do not show any expression pattern are denoted in black. In Figs. 3 and 4, the upregulated genes are denoted in red, and the downregulated genes are denoted in green. The MA and MDS plots representing the differentially expressed genes of both cell lines are provided in Supplementary Figs. S1E–S1H.

Fig. 4

Fig. 4A represents the volcano plot visualising the differentially expressed genes from the NHBE cell line and Fig. 4B represents the volcano plot visualising the differentially expressed genes from the PBMC cell line.

Results from pathway analysis

KOBAS 3.0 pathway analysis

In KOBAS 3.0, the gene lists from both samples were uploaded separately, and the bar and bubble plots containing the significant pathways were obtained. This tool allows the user to visualize the output in different ways, as follows. Enriched pathways visualized in cirFunMap: Each node represents an enriched pathway, and the node colour indicates the cluster to which it belongs. The node size corresponds to six levels of enriched p-values, ranging from small to large: [0.05,1], [0.01,0.05], [0.001,0.01], [0.0001,0.001], [1e-10,0.0001] and [0,1e-10]. Enriched pathways visualized in barplot: Each row represents an enriched pathway, and the length of the bar represents the enrich ratio, which is given by the ratio of the number of input genes to the background gene number. Here, the genes present in the whole human genome are considered the background genes; however, users can input their own set of background genes. The colour of the bars corresponds to the cluster to which they belong. If a cluster contains more than five pathways, only the five pathways with the highest enrich ratio are displayed. Hence, only the most significant pathways are displayed in the plot. Fig. 5 represents the bar plots derived from the upregulated and downregulated genes of the NHBE cell line. Fig. 6 represents the bar plots derived from the upregulated and downregulated genes of the PBMC cell line.
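The two quantities that drive these plots can be illustrated with a small Python sketch. It is an interpretation of the description above rather than KOBAS code: the function names are hypothetical, a p-value lying exactly on a shared interval boundary is assumed to fall in the less significant interval, and the enrich ratio is read as the input genes annotated to a pathway divided by the background genes annotated to that pathway.

```python
def node_size_level(p_value: float) -> int:
    """Map an enriched p-value to a size level, 1 (smallest) to 6 (largest)."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p-value must lie in [0, 1]")
    # Lower bounds of the five larger intervals listed in the text.
    bounds = [0.05, 0.01, 0.001, 0.0001, 1e-10]
    return 1 + sum(p_value < bound for bound in bounds)


def enrich_ratio(input_genes_in_pathway: int,
                 background_genes_in_pathway: int) -> float:
    """Bar length: input gene number over background gene number."""
    return input_genes_in_pathway / background_genes_in_pathway


if __name__ == "__main__":
    for p in (0.5, 0.03, 0.005, 0.0005, 1e-6, 1e-12):
        print(p, node_size_level(p))
    print(enrich_ratio(5, 20))
```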

Fig. 5

Fig. 5A shows the bar plots visualising the significant pathways from the upregulated genes of the NHBE cell line. Fig. 5B shows the bar plots visualising the significant pathways from the downregulated genes of the NHBE cell line.

Fig. 6

Fig. 6A shows the bar plots visualising the significant pathways from the upregulated genes of the PBMC cell line. Fig. 6B shows the bar plots visualising the significant pathways from the downregulated genes of the PBMC cell line.

Bubble plots visualising the enriched pathways were also obtained for both cell lines and are provided in Supplementary Figs. S1I–S1L.

Enrichr

The gene lists from the two cell lines were uploaded to Enrichr, which then generated significant pathways from different pathway databases. All the pathways generated by Enrichr and the significant genes associated with them are provided in the supplementary tables, along with the corresponding p-values and corrected p-values. A corrected p-value provides more confidence in the results obtained. The bar chart represents the top 10 enriched pathways and their p-values. Blue bars indicate pathways with significant p-values (<0.05). An asterisk (*) next to the p-value indicates that the pathway also has a significant adjusted p-value (<0.05). One of the most significant pathways represented in Fig. 7A is the coronavirus disease pathway. The figure implies that most of the significant genes correlate with this disease, which in turn implies that the other associated pathways must be closely observed to draw conclusive results about the coronavirus infection.

Fig. 7

The above bar plots show the most significant pathways obtained from the KEGG database for the NHBE and PBMC cell lines.

Fig. 7B shows the bar plot obtained from the gene set enrichment of the PBMC cell line. It implies that the significant genes obtained from the upregulated and downregulated gene lists of the PBMC cell line correlate with the immunological pathways; the immune system is one of the most heavily affected systems of the human body in COVID-19.

Discussions

Based on the results obtained, all the significant pathways can be broadly classified into glucose metabolism pathways, immunological pathways, pathways of neurodegeneration, cellular physiology pathways and signalling pathways. Gene networks for some of the categories mentioned above were drawn by loading the significant genes from the significant pathways into InBio Discover™.

Association of glucose metabolism pathways with COVID-19

According to the American Diabetes Association, people with diabetes are more likely to get COVID-19. They are also more prone to severe complications after contracting this disease [31]. Diabetes is one of the major comorbidities correlated with severe COVID-19. The replication rate of the causative virus increases with the increase in glucose levels in the host. Such an increase in the replication rate is followed by a cytokine storm and ACE-2 upregulation [32]. Several studies have attempted to model the important role of the pathways associated with glucose metabolism in increased SARS-CoV-2 replication, which in turn elicits an immune response in the host. In the present results, the pathways classified under glucose metabolism, which are relevant to diabetes complications, are the glycolysis/gluconeogenesis, insulin resistance, insulin signalling, type 1 diabetes mellitus and AGE-RAGE signalling pathways. The insulin resistance and insulin signalling pathways were downregulated, whereas the other pathways were upregulated. Targeting the pathways related to glucose metabolism will eventually shed light on the changes in all systems of a body under viral attack. This will help design new anti-viral approaches that target the pathogen. A gene network consisting of the significant genes belonging to the above-mentioned pathways was constructed to examine the interaction between the families of genes, as shown in Fig. 8.

Fig. 8 represents the gene network of the glucose metabolism pathways derived from the significantly regulated genes of the NHBE cell line.

Association of neurodegenerative pathways with COVID-19

Hospital admission data indicate that the mortality rate is higher in COVID-19 patients who have dementia and other neurodegenerative diseases than in patients without neurodegenerative comorbidities. To date, no particular cause for this phenomenon has been identified. However, SARS-CoV-2 infection possibly damages the nasal epithelium, which is connected to the central nervous system. It was further observed that, owing to this damage, neurodegenerative diseases may occur in the future [33]. During the 15th International Virtual Conference on Alzheimer's Disease (AD) and Parkinson's Disease (PD) 2021, the clinical and social implications of COVID-19 for neurodegenerative disorders were evaluated. A group of physicians and researchers is currently conducting clinical trials to establish conclusive results on this question [34]. Our study has shown neurodegenerative disorders, such as prion disease, Parkinson's disease (PD), Huntington's disease (HD), Alzheimer's disease (AD) and amyotrophic lateral sclerosis (ALS), to be among the highly significant pathways associated with COVID-19. All the above pathways were observed to be upregulated in the NHBE cell line. The significant genes associated with these pathways were ND6, NOTCH1, NDUFB10, UQCRB, NDUFA12, ATP6, ATP12, ITPR1, UQCR10, UBE2L4, TUBA1B, TUBB6, LRP1, and UCHL1 (S2E). The genes involved in these significant pathways belonged to scattered families. Gene association and interaction studies must be conducted to obtain a consolidated gene network. These findings suggest that COVID-19 is significantly related to neurodegenerative disorders, which must be studied further to obtain conclusive results.

Association of immunological pathways with COVID-19

The immune system is one of the most heavily affected systems in COVID-19. Many studies have shown that SARS-CoV-2 elicits a host immune response. SARS-CoV-2 infection has been observed to trigger both the adaptive and innate immune systems. Furthermore, B and T cells are essential for generating an immune response against the virus. Moreover, COVID-19 is associated with cytokine storms, and the genes associated with cytokine storms have also been identified [35]. Most of the upregulated immunological pathways were observed in the PBMC cell line. The pathways in this category include the antigen processing and presentation pathway, the phagosomal pathway, the hematopoietic cell lineage pathway, the influenza A pathway, the allograft rejection pathway and the lysosomal pathway. Lysosomes create an acidic environment that helps in combating SARS-CoV-2 [36]. All the above-mentioned pathways are significant. The CD74 gene is associated with pulmonary inflammation in COVID-19. The HLA family of genes regulates the immune system to combat pathogens. The regulation of the HLA family of genes depends on many factors, such as disease severity and the ethnicity of the test and control groups [37]. A gene network was obtained from the significant genes of the pathways belonging to this category, as shown in Fig. 9. Knowledge about the genes responsible for a host's immune response helps in monitoring the spread of the infection and can aid further mutation studies.

Fig. 9 represents the gene network for the immunological pathways derived from the significantly regulated genes of the PBMC cell line.

Association of cellular signalling and physiological pathways with COVID-19

Pathogens utilize the regulatory and signalling pathways of a host organism to sustain themselves, and this exerts a profound impact on signalling pathways during a viral infection. Grimes et al. (2020) observed that the MAPK (mitogen-activated protein kinase) signalling pathway is upregulated, serving as a pro-inflammatory response against pathogenic attacks [38]. The AMPK (adenosine monophosphate-activated protein kinase) pathway controls the autophagy response against viral infections. Downregulation of this pathway may result in a cytokine storm, which is an essential host immune response against COVID-19. The pathways in this category are the focal adhesion, ECM–receptor interaction, cellular senescence, TNF signalling, AMPK signalling and MAPK signalling pathways. All of these pathways are involved in cell proliferation, differentiation and the maintenance of cell integrity. In the NHBE cell line, some of the most significant pathways associated with the upregulated genes are the focal adhesion, ECM–receptor interaction and cellular senescence pathways, whereas the AMPK and TNF (tumour necrosis factor) signalling pathways are associated with the downregulated genes. In the PBMC cell line, the upregulated genes indicate that the MAPK signalling pathway is one of the most significant pathways. Studying the regulation of cellular signalling pathways associated with COVID-19 will help identify the different receptors and cell targets in order to narrow down and verify the various therapeutic approaches for COVID-19. A gene network for the above pathways is shown in Fig. 10.

Fig. 10

visualises the gene network of the cellular signalling and physiology pathways derived from the significantly regulated genes of both the cell lines.
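The direction of regulation reported for this category can be summarised in a small lookup structure. This is only an illustrative sketch: the pathway assignments are those stated in the text, while the dictionary layout and the function names `pathways_for` and `where_regulated` are assumptions introduced here.

```python
# Regulation direction per cell line, as stated in the text for this category.
# No downregulated pathways are listed for PBMC in the text, so none are given.
regulation = {
    "NHBE": {
        "up": ["focal adhesion", "ECM–receptor interaction", "cellular senescence"],
        "down": ["AMPK signalling", "TNF signalling"],
    },
    "PBMC": {
        "up": ["MAPK signalling"],
    },
}


def pathways_for(cell_line, direction):
    """Return the pathways reported for a cell line and direction ('up'/'down')."""
    return regulation.get(cell_line, {}).get(direction, [])


def where_regulated(pathway):
    """Return (cell_line, direction) pairs in which a pathway is reported."""
    return [
        (cell_line, direction)
        for cell_line, directions in regulation.items()
        for direction, names in directions.items()
        if pathway in names
    ]
```

For example, `where_regulated("AMPK signalling")` identifies the NHBE cell line as the one in which the AMPK pathway is associated with downregulated genes.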

Furthermore, a gene network developed from the most significant genes associated with COVID-19 is presented in Fig. 11. These genes appear to be potential therapeutic targets against viral infection, but further gene functionality studies must be conducted to obtain conclusive results.

Fig. 11

shows the gene network derived from the significant genes associated with the COVID-19 disease.

Future work and conclusion

The coronavirus pandemic continues to spread across the world on an unpredictable trajectory. However, knowledge of the pathways involved in COVID-19 makes the disease more predictable and allows more decisive control measures to be taken against it. Gene regulatory pathways have recently become a major focus of interest. The pathways identified here can help determine the long-term effects of coronavirus infection. They can also help evaluate therapeutic approaches based on the gene targets found in these pathways. Understanding changes in glucose metabolism pathways can help assess the factors contributing to long COVID effects in diabetic patients. Although evidence from further studies is required, the link between neurodegenerative diseases and COVID-19 can inform better care for patients with neurodegenerative disorders during the pandemic. Early detection of patients with characteristic host immune response markers can help track the progression of the disease in such patients and guide appropriate measures. Comprehensive assessment of cellular physiology and signalling pathways paves the way toward novel therapeutic approaches and drug repurposing studies. Other important cell signalling targets in the bronchial epithelial cells of COVID-19 patients must be identified in order to verify current treatment strategies and propose new ones. Gene networks provide information about the interactions of various genes and can also shed light on potential protein interactions. Long COVID studies are urgently needed to determine the exact impact of COVID-19 on individuals and societies. SARS-CoV-2 mutates rapidly, and different variants can present different symptoms and affect different pathways in the body.
Under such circumstances, expanding research on the scope of the disease by going back to the basics will generate new and valuable information about the virus. This knowledge will help us combat the disease in a better and more appropriate manner.

Funding

The research work conducted has no source of funding.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this study.

31 in total

1. Fast gapped-read alignment with Bowtie 2.

Authors: Ben Langmead; Steven L Salzberg
Journal: Nat Methods Date: 2012-03-04 Impact factor: 28.547

2. Gene Set Knowledge Discovery with Enrichr.

Authors: Zhuorui Xie; Allison Bailey; Maxim V Kuleshov; Daniel J B Clarke; John E Evangelista; Sherry L Jenkins; Alexander Lachmann; Megan L Wojciechowicz; Eryk Kropiwnicki; Kathleen M Jagodnik; Minji Jeon; Avi Ma'ayan
Journal: Curr Protoc Date: 2021-03

3. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2.

Authors: Michael I Love; Wolfgang Huber; Simon Anders
Journal: Genome Biol Date: 2014 Impact factor: 13.583

4. Differential analyses for RNA-seq: transcript-level estimates improve gene-level inferences.

Authors: Charlotte Soneson; Michael I Love; Mark D Robinson
Journal: F1000Res Date: 2015-12-30

5. Enrichr: interactive and collaborative HTML5 gene list enrichment analysis tool.

Authors: Edward Y Chen; Christopher M Tan; Yan Kou; Qiaonan Duan; Zichen Wang; Gabriela Vaz Meirelles; Neil R Clark; Avi Ma'ayan
Journal: BMC Bioinformatics Date: 2013-04-15 Impact factor: 3.169

6. Elevated Glucose Levels Favor SARS-CoV-2 Infection and Monocyte Response through a HIF-1α/Glycolysis-Dependent Axis.

Authors: Ana Campos Codo; Gustavo Gastão Davanzo; Lauar de Brito Monteiro; Gabriela Fabiano de Souza; Stéfanie Primon Muraro; João Victor Virgilio-da-Silva; Juliana Silveira Prodonoff; Victor Corasolla Carregari; Carlos Alberto Oliveira de Biagi Junior; Fernanda Crunfli; Jeffersson Leandro Jimenez Restrepo; Pedro Henrique Vendramini; Guilherme Reis-de-Oliveira; Karina Bispo Dos Santos; Daniel A Toledo-Teixeira; Pierina Lorencini Parise; Matheus Cavalheiro Martini; Rafael Elias Marques; Helison R Carmo; Alexandre Borin; Laís Durço Coimbra; Vinícius O Boldrini; Natalia S Brunetti; Andre S Vieira; Eli Mansour; Raisa G Ulaf; Ana F Bernardes; Thyago A Nunes; Luciana C Ribeiro; Andre C Palma; Marcus V Agrela; Maria Luiza Moretti; Andrei C Sposito; Fabrício Bíscaro Pereira; Licio Augusto Velloso; Marco Aurélio Ramirez Vinolo; André Damasio; José Luiz Proença-Módena; Robson Francisco Carvalho; Marcelo A Mori; Daniel Martins-de-Souza; Helder I Nakaya; Alessandro S Farias; Pedro M Moraes-Vieira
Journal: Cell Metab Date: 2020-07-17 Impact factor: 27.287

1. Identification of novel regulatory pathways across normal human bronchial epithelial cell lines (NHBEs) and peripheral blood mononuclear cell lines (PBMCs) in COVID-19 patients using transcriptome analysis.

Authors: Lavanya C; Aajnaa Upadhyaya; Arpita Guha Neogi; Vidya Niranjan
Journal: Inform Med Unlocked Date: 2022-06-02

1 in total

7. The Architecture of SARS-CoV-2 Transcriptome.

Review 8. SARS-CoV-2 and Coronavirus Disease 2019: What We Know So Far.

Review 9. Multifaceted Role of AMPK in Viral Infections.

Review 10. COVID-19 and olfactory dysfunction: A possible associative approach towards neurodegenerative diseases.
