
Identification of the susceptibility genes for COVID-19 in lung adenocarcinoma with global data and biological computation methods.

Li Gao1, Guo-Sheng Li1, Jian-Di Li1, Juan He1, Yu Zhang2, Hua-Fu Zhou3, Jin-Liang Kong4, Gang Chen1.   

Abstract

INTRODUCTION: The risk of infection with COVID-19 is high in lung adenocarcinoma (LUAD) patients, and there is a dearth of studies on the molecular mechanism underlying the high susceptibility of LUAD patients to COVID-19 from the perspective of the global differential expression landscape.
OBJECTIVES: To fill the research void on the molecular mechanism underlying the high susceptibility of LUAD patients to COVID-19 from the perspective of the global differential expression landscape.
METHODS: We identified differentially expressed genes (DEGs) correlated with the susceptibility of LUAD patients to COVID-19. These were obtained by calculating standard mean difference (SMD) values for 49 SARS-CoV-2-infected LUAD samples and 24 non-affected LUAD samples, as well as 3931 LUAD samples and 3027 non-cancer lung samples from 40 pooled RNA-seq and microarray datasets. Hub susceptibility genes significantly related to COVID-19 were then selected by weighted gene co-expression network analysis. The hub genes were further analyzed via an examination of their clinical significance in multiple datasets, a correlation analysis of the immune cell infiltration level, and their interactions with the interactome sets of the A549 cell line.
RESULTS: A total of 257 susceptibility genes were identified, and these genes were associated with RNA splicing, mitochondrial functions, and proteasomes. Ten genes, MEA1, MRPL24, PPIH, EBNA1BP2, MRTO4, RABEPK, TRMT112, PFDN2, PFDN6, and NDUFS3, were confirmed to be the hub susceptibility genes for COVID-19 in LUAD patients, and the hub susceptibility genes were significantly correlated with the infiltration of multiple immune cells.
CONCLUSION: The susceptibility genes for COVID-19 in LUAD patients discovered in this study may increase our understanding of the high risk of COVID-19 in LUAD patients.
© 2021 The Author(s).

Keywords:  CI, confidence interval; COVID-19; COVID-19, coronavirus disease 2019; DEG; DEG, differentially expressed genes; FC, fold change; FPKM, fragments per kilobase per million; GTEx, Genotype-tissue Expression; HPA, human protein atlas; IHC, immunohistochemistry; Immune infiltration; LUAD; LUAD, lung adenocarcinoma; PPI, protein-to-protein interaction; SARS-CoV-2, severe acute respiratory syndrome coronavirus 2; SMD, standard mean difference; SROC, summarized receiver’s operating characteristics; Susceptibility; TF, transcription factor; TPM, transcripts per million reads; WGCNA; WGCNA, weighted gene co-expression network analysis

Year:  2021        PMID: 34840672      PMCID: PMC8605816          DOI: 10.1016/j.csbj.2021.11.026

Source DB:  PubMed          Journal:  Comput Struct Biotechnol J        ISSN: 2001-0370            Impact factor:   7.271


Introduction

Coronavirus disease 2019 (COVID-19) has swept across the globe since it was first identified in December 2019 in Wuhan, China, and severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the virus that causes COVID-19 [1], [2]. To date, COVID-19 has resulted in over 72 million diagnosed cases and more than 1 million deaths, posing a serious threat to the public [3]. Factors such as age and comorbidities may affect the severity of clinical manifestations of COVID-19, which include mild or no pneumonia, fever, headaches, hemoptysis, myalgia, fatigue, and sputum production [4], [5]. The spread of the virus is attributed to person-to-person transmission via saliva droplets, direct contact with COVID-19 patients, or aerosols [2], [6]. Although the public is susceptible to COVID-19, elderly people and people with health conditions, such as cardiovascular disease, chronic obstructive pulmonary disease, hypertension, cancer, and diabetes mellitus, are more predisposed to the more severe symptoms of COVID-19 [7], [8], [9]. A recent cohort study by Liang et al. reported a higher incidence of cancer in COVID-19 patients than in the general population and found that COVID-19 patients with cancer were also more likely to suffer from acute complications than those without cancer [10]. Lung cancer has been found to be the most common type of cancer in COVID-19 patients [11]. Considering the association between lung cancer and COVID-19, and given that lung adenocarcinoma (LUAD) is the most frequent histological subtype of lung cancer, it is necessary to investigate the molecular mechanism underlying the high susceptibility of LUAD patients to COVID-19. Although the molecular basis for the susceptibility of LUAD patients to COVID-19 has been investigated by several studies, these studies all focused on ACE2, the SARS-CoV-2 receptor [12], [13], [14].
The crucial importance of ACE2 in the high vulnerability of LUAD patients to COVID-19 need not be emphasized. Apart from ACE2, numerous other genes and pathways may play essential roles in the susceptibility of LUAD patients to COVID-19, and these factors have not been extensively investigated by prior studies. To fill this research void, the present work identifies genes correlated with the susceptibility of LUAD patients to COVID-19 via biological computational methods and by calculating standard mean difference (SMD) values for differentially expressed genes (DEGs) in 49 SARS-CoV-2-infected LUAD samples and 24 non-affected LUAD samples, as well as 3931 LUAD samples and 3027 non-cancer lung samples from 40 pooled RNA-seq and microarray datasets, which together form the most complete LUAD dataset assembled so far. The susceptibility genes for COVID-19 in LUAD were annotated with their biological functions, and hub susceptibility genes significantly related to COVID-19 were selected via a weighted gene co-expression network analysis (WGCNA). Then, the hub genes were further analyzed via the examination of their clinical significance in multiple datasets, a correlation analysis with immune cell infiltration levels, and their interactions with the interactome sets of the A549 cell line. The present study is anticipated to stimulate strategies that can be used to help LUAD patients with COVID-19.

Materials and methods

Accumulation of global RNA-seq and microarray datasets for LUAD

We searched the TCGA and Genotype-tissue Expression (GTEx) databases to obtain level-three fragments per kilobase per million (FPKM) and transcripts per million reads (TPM) RNA-seq data for 533 LUAD and 347 normal lung samples (288 normal cases and 533 tumor cases from the TCGA database; 59 normal cases from the GTEx database). The FPKM expression matrix was converted to a TPM expression matrix and transformed with log2(x + 0.001). Other databases, including GEO, ArrayExpress, SRA, and Oncomine, were searched for microarrays containing gene expression data for LUAD and non-cancer lung samples. The following search strategy was used to retrieve the microarrays: “cancer OR carcinoma OR tumour OR tumor OR malignan* OR neoplas*” AND “Lung OR pulmonary OR respiratory OR respiration OR aspiration OR bronchi OR bronchioles OR alveoli OR pneumocytes.” The details of each RNA-seq and microarray dataset, including accession ID, country, platform, sample type, and sample numbers, were extracted and compiled. The processing of all included microarrays followed the steps of probe matching, log2 transformation if possible, averaging of repeated items, and normalization between arrays. If several GSE datasets were generated from the same GPL platform, they were merged into one dataset, and the batch effect was removed from these datasets with the sva and limma packages in R software v.3.6.1.
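The FPKM-to-TPM conversion and the log2(x + 0.001) transformation described above can be illustrated with a minimal Python sketch. It assumes the usual per-sample rescaling in which TPM values sum to one million; the gene names and FPKM values are hypothetical, and the function names are ours rather than those of any package used in the study.

```python
import math

def fpkm_to_tpm(fpkm_by_gene):
    """Rescale one sample's FPKM values so that they sum to one million."""
    total = sum(fpkm_by_gene.values())
    if total <= 0:
        raise ValueError("sample has no expressed genes")
    return {gene: value / total * 1e6 for gene, value in fpkm_by_gene.items()}

def log2_offset(values_by_gene, offset=0.001):
    """Apply log2(x + offset); the offset keeps zero values finite."""
    return {gene: math.log2(value + offset) for gene, value in values_by_gene.items()}

# Hypothetical FPKM values for a single sample (not taken from the study).
sample_fpkm = {"GENE_A": 10.0, "GENE_B": 30.0, "GENE_C": 0.0}
sample_tpm = fpkm_to_tpm(sample_fpkm)
sample_log_tpm = log2_offset(sample_tpm)
```

Because TPM is computed within each sample, the conversion has to be applied column by column when working with a full expression matrix.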

The DEGs of LUAD identified from all collected datasets

The standard mean difference (SMD) is used to summarize continuous variables with different units of measurement and large differences in mean. We applied SMD calculations to the gathered datasets to comprehensively evaluate the expression trends of specific genes in human cancers. In the present work, differential expression analysis for all RNA-seq and microarray datasets was performed with the limma package in R software v.3.6.1. Genes with significantly aberrant expression in the LUAD samples of any included dataset according to differential expression analysis (log2 fold change (FC) value of >1 or <−1 and adjusted P value < 0.05) were reserved for further estimation of SMD values. The SMD values with 95% confidence intervals (CIs) were computed from the number of samples and the mean and standard deviation of expression values in the LUAD and non-cancer groups. The meta package in R software v.3.6.1 was utilized for the calculation of SMD values for all reserved genes. Up-regulated reserved genes (log2FC > 1, adj. P < 0.05) with SMD values and 95% CIs > 0 were defined as up-regulated DEGs in LUAD, and down-regulated reserved genes (log2FC < −1, adj. P < 0.05) with SMD values and 95% CIs < 0 were defined as down-regulated DEGs in LUAD.
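For readers unfamiliar with the SMD, the following Python sketch computes a bias-corrected SMD (Hedges' g) and an approximate 95% CI for a single dataset from the group sizes, means, and standard deviations. It uses a common large-sample variance approximation and is an assumption on our part: the meta package may apply slightly different small-sample details, and the pooling of SMDs across datasets is not shown. The input numbers are hypothetical.

```python
import math

def hedges_g(n1, mean1, sd1, n2, mean2, sd2):
    """Return (g, ci_low, ci_high) for group 1 versus group 2."""
    df = n1 + n2 - 2
    pooled_sd = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df)
    d = (mean1 - mean2) / pooled_sd          # Cohen's d
    j = 1 - 3 / (4 * df - 1)                 # small-sample correction factor
    g = j * d
    var_d = (n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))
    se = math.sqrt(j**2 * var_d)
    return g, g - 1.96 * se, g + 1.96 * se

# Hypothetical summary statistics: 50 tumor samples versus 40 non-cancer samples.
g, ci_low, ci_high = hedges_g(50, 8.0, 1.0, 40, 7.0, 1.0)

# A gene counts as up-regulated by the SMD criterion only if the whole CI lies above zero.
smd_supports_upregulation = ci_low > 0
```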

The susceptibility genes for COVID-19 in LUAD

In the current study, two datasets from the GEO database, GSE147507 and GSE163547, were included to obtain the DEGs between SARS-CoV-2-infected LUAD samples and non-affected LUAD samples [15], [16]. The limma package for R software has been widely used in differential expression analyses of microarrays as well as RNA-seq data. It was crucial to normalize the expression values of all the samples before making meaningful comparisons between different groups of samples on the same measurement scale [17]. Therefore, the R package biomaRt was first used to infer the transcripts per million (TPM) expression matrix from the original count expression matrix of GSE147507. Following the normalization of the expression matrix, differential expression analysis was applied to the two datasets. The screening criteria for DEGs were log2FC values of >1 or <−1 and adjusted P values < 0.05. The genes in the intersection of up-regulated DEGs in LUAD (see “The DEGs of LUAD identified from all collected datasets”) and up-regulated DEGs in SARS-CoV-2-infected LUAD samples were designated as the susceptibility genes for COVID-19 in LUAD patients. To understand the enrichment of susceptibility genes in terms of biological processes, cellular components, molecular functions, and KEGG pathways, we performed functional annotation of these susceptibility genes using the clusterProfiler R package. P < 0.05 indicated significant functional annotation. The interrelationships between susceptibility genes and biological processes and pathways significantly related to COVID-19 were depicted via a protein-to-protein interaction (PPI) network plotted with STRING.
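The screening and intersection steps can be sketched in Python as follows. This is only an illustration under stated assumptions: the gene names and statistics are hypothetical, the dictionaries stand in for differential expression result tables, and the additional SMD criterion applied to the LUAD DEGs is omitted for brevity.

```python
def upregulated(results, lfc_cutoff=1.0, padj_cutoff=0.05):
    """Return the genes with log2FC above the cutoff and adjusted P below the cutoff."""
    return {
        gene
        for gene, (log2fc, padj) in results.items()
        if log2fc > lfc_cutoff and padj < padj_cutoff
    }

# Hypothetical results, gene -> (log2FC, adjusted P value).
luad_vs_noncancer = {
    "GENE_A": (1.8, 0.001),
    "GENE_B": (1.2, 0.20),    # fails the adjusted P criterion
    "GENE_C": (2.5, 0.01),
}
infected_vs_uninfected = {
    "GENE_A": (1.4, 0.03),
    "GENE_C": (0.6, 0.001),   # fails the fold-change criterion
    "GENE_D": (3.0, 0.0001),
}

# Susceptibility genes are up-regulated in both comparisons.
susceptibility_genes = upregulated(luad_vs_noncancer) & upregulated(infected_vs_uninfected)
```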

Identification of hub susceptibility genes for COVID-19 in LUAD through WGCNA

WGCNA is a powerful method that facilitates the investigation of the intricate gene correlations and associations between the expression profiles of genes and phenotypes of human diseases [18]. The core principle of the WGCNA method is to transform the gene expression matrix into pairwise correlation matrices, based on which co-expressed genes in the same module can be identified [19]. In this work, correlation analysis was further conducted to analyze the relationships between gene modules and clinical traits [20]. Herein, a scale-free topology network was built using, as input, the expression of all intersected susceptibility genes in 77 COVID-19 patient samples and 118 non-affected control samples from the GSE161731 dataset. The positive or negative diagnosis of COVID-19, coded as a binary variable, served as the corresponding phenotypic data. The minimum module size was set at 15 genes, and genes that shared high connectivity and similar expression patterns were clustered into the same co-expression modules. The 10 genes with the highest gene–trait significance values (P < 0.05) in the module, that is, those showing the strongest positive correlation with the phenotype of COVID-19, were regarded as the hub susceptibility genes for COVID-19 in LUAD patients. All steps of WGCNA were performed with the WGCNA package in R software v.3.6.1.
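The hub-gene selection step can be illustrated with a short Python sketch that treats gene–trait significance as the Pearson correlation between a gene's expression and the binary diagnosis and then ranks genes by that value. This is an assumption about the usual definition and covers only the ranking; network construction, soft thresholding, and module detection are not reproduced. Gene names, expression values, and function names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def top_hub_genes(expression, trait, n_top):
    """Rank genes by correlation with the trait and keep the n_top highest."""
    significance = {gene: pearson(values, trait) for gene, values in expression.items()}
    ranked = sorted(significance, key=significance.get, reverse=True)
    return [(gene, significance[gene]) for gene in ranked[:n_top]]

# Hypothetical data: 1 = COVID-19 positive, 0 = control.
trait = [1, 1, 1, 0, 0, 0]
expression = {
    "GENE_A": [5.0, 6.0, 5.5, 2.0, 2.5, 3.0],  # higher in positive samples
    "GENE_B": [3.0, 3.1, 2.9, 3.0, 3.2, 2.8],  # no group difference
    "GENE_C": [1.0, 1.5, 1.2, 4.0, 4.2, 3.9],  # lower in positive samples
}
hubs = top_hub_genes(expression, trait, n_top=2)
```

In the study the ranking was restricted to the module most positively correlated with the diagnosis and `n_top` corresponds to 10.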

The clinical significance of the hub susceptibility genes for COVID-19 in LUAD patients

Forest plots of SMD and the summarized receiver operating characteristics (SROC) curves were created for the 10 hub susceptibility genes for COVID-19 in LUAD patients based on the compiled expression data in all collected LUAD datasets, which was accomplished with the meta package in R software and Stata v.14.0. The protein expression levels of hub susceptibility genes in LUAD and normal lung tissues were evaluated with immunohistochemistry (IHC) images obtained from the human protein atlas (HPA) database. The prognostic value of the hub susceptibility genes for LUAD patients was assessed through Kaplan–Meier survival curves in the KM plotter database. The prognostic data used to draw the Kaplan–Meier survival curves were aggregated from 14 RNA-seq and microarray datasets, including CAARRAY, GSE14814, GSE19188, GSE29013, GSE30219, GSE31210, GSE3141, GSE31908, GSE37745, GSE43580, GSE4573, GSE50081, GSE8894, and TCGA. All LUAD patients with prognostic information on overall survival were divided into low- and high-expression groups based on the median expression value of the hub susceptibility genes. The validation of survival analyses was conducted with the Lung Cancer Explorer database for hub susceptibility genes that had significant prognostic value from the KM plotter database.
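The median-based grouping and the Kaplan–Meier estimate can be sketched in Python as below. The sketch assumes that patients strictly above the median form the high-expression group (how the KM plotter handles ties is not stated here) and uses the standard product-limit estimator; patient identifiers, expression values, and follow-up times are hypothetical.

```python
import statistics

def median_split(expression_by_patient):
    """Split patients into (low, high) sets around the median expression value."""
    cutoff = statistics.median(expression_by_patient.values())
    high = {p for p, value in expression_by_patient.items() if value > cutoff}
    low = set(expression_by_patient) - high
    return low, high

def kaplan_meier(times, events):
    """Product-limit estimate as a list of (time, survival probability) at event times."""
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored at t
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Hypothetical expression values and follow-up data (event 1 = death, 0 = censored).
low, high = median_split({"P1": 1.0, "P2": 2.0, "P3": 3.0, "P4": 4.0})
curve = kaplan_meier(times=[5, 8, 12, 12, 20], events=[1, 0, 1, 1, 0])
```

Running `kaplan_meier` separately on the low- and high-expression groups gives the two curves that a log-rank test would then compare.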

Exploration of whether there is relevance between COVID-19-related host protein expression and hub susceptibility genes

LUAD patients from the TCGA database were divided into two groups with different expression levels of hub susceptibility genes through the k-means clustering method of the NbClust package in R software v.4.1.0. The expression of TMPRSS2, a processing enzyme required for the SARS-CoV-2 infection of lung epithelia, was compared between the two groups by unpaired Student’s t-tests in GraphPad Prism v.8.0.1.
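A two-group k-means clustering of patients by hub gene expression can be sketched in plain Python as follows. This is a simplified Lloyd-style iteration, not the NbClust implementation; the deterministic starting centroids (the patients with the lowest and highest summed expression) are our assumption, and the expression values are hypothetical.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two expression vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_two_groups(points, max_iter=100):
    """Cluster points into group 0 (lower expression) and group 1 (higher expression)."""
    order = sorted(range(len(points)), key=lambda i: sum(points[i]))
    centroids = [list(points[order[0]]), list(points[order[-1]])]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            0 if squared_distance(p, centroids[0]) <= squared_distance(p, centroids[1]) else 1
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical expression of two hub genes in five patients.
patients = [[1.0, 1.2], [0.8, 1.0], [1.1, 0.9], [3.0, 3.2], [3.1, 2.9]]
labels, centroids = kmeans_two_groups(patients)
```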

The effects of cigarette smoking on the expression of hub susceptibility genes in LUAD

Smoking-related effects on the expression level of the 10 hub susceptibility genes were explored by comparing the expression of these genes in LUAD patient groups with different histories of smoking. This analysis was performed using the UALCAN tool, where an independent Student’s t-test was employed for comparison between subgroups of LUAD patients.

Correlations between hub susceptibility genes and immune cell infiltration in COVID-19 patients

The CIBERSORT method was employed to deduce the levels of 22 immune-infiltrating cells in the 77 COVID-19 samples obtained from the GSE161731 dataset. The correlations between the immune infiltration levels of 22 cells and the expression of hub susceptibility genes in 77 COVID-19 samples were calculated through Pearson’s correlation tests.

In-depth analysis of the hub susceptibility genes for COVID-19 in LUAD

The molecular mechanisms by which the hub genes endow LUAD cases with susceptibility to COVID-19 were further investigated by predicting upstream miRNAs and transcription factors (TFs). We also estimated the interrelationships between hub susceptibility genes for COVID-19 in LUAD and the interactome sets of the A549 and Calu-3 cell lines. These analyses were enabled by the ChIP Enrichment Analysis (ChEA) and Coronascape databases.

Results

A total of 114 datasets pertaining to LUAD were included to collect DEGs in 3931 LUAD and 3027 non-cancer lung samples. The PRISMA flow diagram for selecting eligible datasets is shown in Supplementary Fig. 1. The details of the 114 datasets and two datasets related to SARS-CoV-2 infection in LUAD cells (GSE147507 and GSE163547) are provided in Supplementary Table 1. According to the filtering criteria for DEGs, 6455 up-regulated DEGs with positive SMD values and 4527 down-regulated DEGs with negative SMD values were reserved (Supplementary Table 2). The differential expression analysis of GSE147507 and GSE163547 yielded 851 up-regulated DEGs and 2036 down-regulated DEGs in SARS-CoV-2-infected LUAD samples versus non-affected LUAD samples (Supplementary Fig. 2A and B; Supplementary Table 3). The intersection of the 6455 up-regulated DEGs in LUAD samples and the 851 up-regulated DEGs in SARS-CoV-2-infected LUAD samples revealed 257 susceptibility genes for COVID-19 in LUAD (Supplementary Fig. 2C). Functional enrichment analyses of the 257 susceptibility genes for COVID-19 in LUAD indicated that they were significantly enriched in biological processes and molecular functions, such as the cellular amino acid metabolic process, ncRNA processing, ribonucleoprotein complex biogenesis, catalytic activity acting on RNA, and methyl-CpG binding, as well as KEGG pathways, including proteasomes, the biosynthesis of amino acids, and one carbon pool by folate (Supplementary Fig. 3; Table 1). The interactions between the component genes of biological processes and pathways related to RNA splicing, mitochondrial function, and the proteasome are depicted in the PPI network (Supplementary Fig. 4).
Table 1

Functional enrichment annotation for susceptibility genes for COVID-19 in LUAD.

Category | ID | Description | GeneRatio | pvalue | p.adjust | qvalue | Count
BP | GO:0006520 | cellular amino acid metabolic process | 20/225 | 1.7E−08 | 4.2E−05 | 3.8E−05 | 20
BP | GO:0034470 | ncRNA processing | 16/225 | 9.7E−07 | 1.1E−03 | 9.7E−04 | 16
BP | GO:0022613 | ribonucleoprotein complex biogenesis | 18/225 | 1.6E−06 | 1.1E−03 | 9.7E−04 | 18
BP | GO:0042254 | ribosome biogenesis | 13/225 | 1.7E−06 | 1.1E−03 | 9.7E−04 | 13
BP | GO:0034660 | ncRNA metabolic process | 20/225 | 2.1E−06 | 1.1E−03 | 9.7E−04 | 20
BP | GO:0031145 | anaphase-promoting complex-dependent catabolic process | 8/225 | 8.0E−06 | 3.4E−03 | 3.0E−03 | 8
BP | GO:1902036 | regulation of hematopoietic stem cell differentiation | 7/225 | 3.3E−05 | 1.2E−02 | 1.1E−02 | 7
BP | GO:0002479 | antigen processing and presentation of exogenous peptide antigen via MHC class I, TAP-dependent | 7/225 | 4.3E−05 | 1.3E−02 | 1.1E−02 | 7
BP | GO:0033209 | tumor necrosis factor-mediated signaling pathway | 10/225 | 5.1E−05 | 1.3E−02 | 1.1E−02 | 10
BP | GO:0061418 | regulation of transcription from RNA polymerase II promoter in response to hypoxia | 7/225 | 5.1E−05 | 1.3E−02 | 1.1E−02 | 7
CC | GO:0034709 | methylosome | 6/240 | 3.7E−09 | 1.2E−06 | 1.0E−06 | 6
CC | GO:0098798 | mitochondrial protein complex | 14/240 | 2.0E−06 | 3.3E−04 | 2.9E−04 | 14
CC | GO:0005687 | U4 snRNP | 4/240 | 8.2E−06 | 8.7E−04 | 7.7E−04 | 4
CC | GO:0005759 | mitochondrial matrix | 18/240 | 2.1E−05 | 1.7E−03 | 1.5E−03 | 18
CC | GO:0000793 | condensed chromosome | 11/240 | 5.5E−05 | 3.0E−03 | 2.6E−03 | 11
CC | GO:0034719 | SMN-Sm protein complex | 4/240 | 5.5E−05 | 3.0E−03 | 2.6E−03 | 4
CC | GO:0046540 | U4/U6 × U5 tri-snRNP complex | 5/240 | 1.0E−04 | 4.2E−03 | 3.7E−03 | 5
CC | GO:0097526 | spliceosomal tri-snRNP complex | 5/240 | 1.0E−04 | 4.2E−03 | 3.7E−03 | 5
CC | GO:0000502 | proteasome complex | 6/240 | 1.3E−04 | 4.2E−03 | 3.7E−03 | 6
CC | GO:0034708 | methyltransferase complex | 7/240 | 1.4E−04 | 4.2E−03 | 3.7E−03 | 7
MF | GO:0140098 | catalytic activity, acting on RNA | 15/237 | 5.7E−06 | 2.6E−03 | 2.4E−03 | 15
MF | GO:0008327 | methyl-CpG binding | 4/237 | 1.2E−04 | 2.4E−02 | 2.2E−02 | 4
MF | GO:0051082 | unfolded protein binding | 7/237 | 1.6E−04 | 2.4E−02 | 2.2E−02 | 7
MF | GO:0016840 | carbon-nitrogen lyase activity | 3/237 | 4.1E−04 | 4.6E−02 | 4.2E−02 | 3
KEGG | hsa03050 | Proteasome | 6/117 | 4.7E−05 | 7.7E−03 | 7.1E−03 | 6
KEGG | hsa01230 | Biosynthesis of amino acids | 7/117 | 9.6E−05 | 7.9E−03 | 7.4E−03 | 7
KEGG | hsa00670 | One carbon pool by folate | 4/117 | 1.7E−04 | 9.1E−03 | 8.4E−03 | 4
KEGG | hsa05014 | Amyotrophic lateral sclerosis | 15/117 | 2.2E−04 | 9.1E−03 | 8.4E−03 | 15
KEGG | hsa03040 | Spliceosome | 9/117 | 3.2E−04 | 9.3E−03 | 8.6E−03 | 9
KEGG | hsa03013 | RNA transport | 10/117 | 3.4E−04 | 9.3E−03 | 8.6E−03 | 10
KEGG | hsa03008 | Ribosome biogenesis in eukaryotes | 7/117 | 1.0E−03 | 2.4E−02 | 2.2E−02 | 7
KEGG | hsa05022 | Pathways of neurodegeneration - multiple diseases | 16/117 | 1.2E−03 | 2.6E−02 | 2.4E−02 | 16

Note: BP: biological process; CC: cellular component; MF: molecular function; KEGG: Kyoto Encyclopedia of Genes and Genomes. Only top ten significant terms were displayed in the table.

Identification of hub susceptibility genes for COVID-19 in LUAD

Based on a hierarchical clustering of topological overlap matrix (TOM)-based dissimilarity and an amalgamation of modules with close relationships, the 257 susceptibility genes were merged into three modules (Fig. 1). A correlation analysis between the module Eigengenes and the trait data of COVID-19 patients and control subjects suggested that the brown module showed the most notable positive correlation with a diagnosis of COVID-19 (r = 0.334, P < 0.001; Table 2). The identification of hub susceptibility genes for COVID-19 in LUAD was restricted to the brown module, and 10 genes, MEA1, MRPL24, PPIH, EBNA1BP2, MRTO4, RABEPK, TRMT112, PFDN2, PFDN6, and NDUFS3, with the highest gene-trait significance were confirmed as the hub susceptibility genes for COVID-19 in LUAD (Table 3).
Fig. 1

Weighted gene co-expression network analysis results for susceptibility genes for COVID-19 in LUAD. A. Sample dendrogram and heatmap of the diagnostic information on COVID-19 based on the GSE161731 dataset. The name of each sample was labeled in the dendrogram. The red bar indicates the diagnosis of COVID-19. B. The selection of the best soft thresholding power. The red line represents the cut-off value of the evaluation parameters of the scale-free network (R2 = 0.9). C. Cluster dendrogram and the merged gene modules. Bars in different colors distinguish different gene modules. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Table 2

Correlations between module Eigengenes and the diagnosis of COVID-19 patients.

Module | Correlation coefficient | P value
blue | 0.311591 | 9.25E−06
brown | 0.3338057 | 1.85E−06
grey | −0.2565495 | 2.94E−04

Table 3

Genes significantly correlated with the diagnosis of COVID-19 from brown module.

Probes | Module color | Gene-trait significance | P value of gene-trait significance | Module membership | P value of module membership
MEA1 | brown | 0.449270087 | 4.46E−11 | 0.809177779 | 1.89E−46
MRPL24 | brown | 0.409941997 | 2.66E−09 | 0.85769273 | 1.17E−57
PPIH | brown | 0.394952303 | 1.11E−08 | 0.863048632 | 3.78E−59
EBNA1BP2 | brown | 0.394183684 | 1.19E−08 | 0.872920332 | 4.55E−62
MRTO4 | brown | 0.362115455 | 1.97E−07 | 0.81075742 | 9.21E−47
RABEPK | brown | 0.346438643 | 7.01E−07 | 0.711785814 | 1.97E−31
TRMT112 | brown | 0.328666689 | 2.72E−06 | 0.827330611 | 3.14E−50
PFDN2 | brown | 0.318871762 | 5.54E−06 | 0.884069056 | 1.13E−65
PFDN6 | brown | 0.314612188 | 7.49E−06 | 0.848636997 | 2.83E−55
NDUFS3 | brown | 0.312333732 | 8.78E−06 | 0.882734205 | 3.19E−65
MRPL57 | brown | 0.307083049 | 1.26E−05 | 0.883942162 | 1.25E−65
PSMA7 | brown | 0.297202059 | 2.45E−05 | 0.778829158 | 5.89E−41
NME1 | brown | 0.282170279 | 6.43E−05 | 0.4790863 | 1.39E−12
SNRPF | brown | 0.266739081 | 0.00016365 | 0.906207262 | 4.48E−74
GNL2 | brown | 0.252828898 | 0.000362504 | 0.874285624 | 1.72E−62
TIMM23 | brown | 0.246588005 | 0.000510675 | 0.815121776 | 1.22E−47
NDUFA6 | brown | 0.246509495 | 0.000512853 | 0.82519979 | 9.18E−50
SNRPD3 | brown | 0.246276027 | 0.000519381 | 0.870535107 | 2.43E−61
ADSL | brown | 0.206493534 | 0.003776313 | 0.864879379 | 1.13E−59
SNRPD1 | brown | 0.186960302 | 0.008868284 | 0.819175371 | 1.77E−48
ERH | brown | 0.173434462 | 0.015320465 | 0.708157272 | 5.40E−31
ROMO1 | brown | 0.165878881 | 0.020473293 | 0.581007567 | 5.37E−19

Note: The top ten genes were designated as the hub susceptibility genes for COVID-19 in LUAD.

Obvious up-regulated expression and prognostic significance of hub susceptibility genes in LUAD

All 10 hub susceptibility genes for COVID-19 exhibited remarkable overexpression in LUAD samples compared to non-cancer lung samples, and the differential expression of most of the hub susceptibility genes for COVID-19 could be used to moderately discriminate LUAD from non-cancer lung samples (Supplementary Figs. 5–8). The IHC staining results further confirmed the higher expression levels of TRMT112, PFDN6, and NDUFS3 in LUAD tissues than in normal lung tissues (Supplementary Fig. 9). The up-regulation of five hub susceptibility genes, MEA1, MRPL24, PFDN2, PFDN6, and NDUFS3, served as a significant indicator of worse overall survival for LUAD patients (P < 0.05; Fig. 2). Among the five genes, MEA1, PFDN2, and PFDN6 were verified to exert a significant impact on the survival of LUAD patients in the Lung Cancer Explorer database (Fig. 3).
Fig. 2

Prognostic analysis results for five hub susceptibility genes in LUAD. A. Kaplan–Meier survival curves on the impact of MEA1 expression on the overall survival of LUAD patients. B. Kaplan–Meier survival curves on the impact of MRPL24 expression on the overall survival of LUAD patients. C. Kaplan–Meier survival curves on the impact of PFDN2 expression on the overall survival of LUAD patients. D. Kaplan–Meier survival curves on the impact of PFDN6 expression on the overall survival of LUAD patients. E. Kaplan–Meier survival curves on the impact of NDUFS3 expression on the overall survival of LUAD patients. HR: hazard ratio. The black and red lines delineate the overall survival probability of LUAD patients in the low and high expression groups, respectively. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Fig. 3

The validation of the survival analysis for susceptibility genes in LUAD. A. Forest plots of the hazard ratio value for MEA1. B. Forest plots of the hazard ratio value for PFDN2. C. Forest plots of the hazard ratio value for PFDN6. TE: Estimate of treatment effect. SETE: Standard error of treatment estimate.

The insignificant links between COVID-19-related host protein expression and hub susceptibility genes

LUAD patients were clustered into the k1 group (212 LUAD patients) with a low expression of 10 hub susceptibility genes and the k2 group (323 LUAD patients) with a high expression of 10 hub susceptibility genes (Fig. 4A and B). TMPRSS2 expression was slightly higher in the k2 group than in the k1 group (5.163 ± 1.768; 5.287 ± 1.659), though without statistical significance (P = 0.411; Fig. 4C).
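As a rough plausibility check of this comparison, the unpaired Student's t-test can be recomputed from the summary statistics alone. The Python sketch below assumes that the first mean ± SD pair belongs to the k1 group (n = 212) and the second to the k2 group (n = 323), which the text does not state explicitly, and it replaces the exact t distribution with a normal approximation, which is reasonable only because the degrees of freedom are large.

```python
import math

def pooled_t_test(n1, mean1, sd1, n2, mean2, sd2):
    """Equal-variance two-sample t-test from summary statistics.

    Returns (t, degrees of freedom, approximate two-sided P value).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean2 - mean1) / se
    # Normal approximation to the t distribution (assumption; df is large here).
    p = math.erfc(abs(t) / math.sqrt(2))
    return t, df, p

# Group sizes and TMPRSS2 summary statistics as reported in the text above.
t_stat, df, p_value = pooled_t_test(212, 5.163, 1.768, 323, 5.287, 1.659)
```

Under these assumptions the approximate P value comes out at roughly 0.41, in line with the reported lack of statistical significance.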
Fig. 4

K-means clustering of LUAD patients based on hub susceptibility genes and TMPRSS2 expression in different groups. A. Cluster plot. LUAD patient samples in clusters 1 or 2 are represented by blue dots and green triangles, respectively. B. Heatmap of the expression characteristics of hub susceptibility genes in two clusters of LUAD samples. C. Box plot of TMPRSS2 expression in LUAD samples of clusters 1 and 2. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Parallel expression patterns based on LUAD patients from the TCGA database indicated that smokers or reformed smokers with a history of more than 15 years exhibited higher levels of MRPL24, EBNA1BP2, PFDN2, and PFDN6 than nonsmokers (P < 0.05; Supplementary Fig. 10). The proportions of various immune cells in the 77 COVID-19 patient samples are illustrated in the composition map shown in Fig. 5. The correlation analysis of the expression of the 10 hub susceptibility genes and the infiltration levels of 22 immune cells in the COVID-19 samples showed that the fractions of plasma cells, CD4 activated memory T cells, and follicular helper T cells increased with the elevation of EBNA1BP2, PFDN6, and NDUFS3 expression in COVID-19 samples, while the fractions of macrophages (M0), gamma delta T cells, and neutrophils decreased with higher PPIH, MRPL24, and TRMT112 expression in COVID-19 samples (Fig. 6).
Fig. 5

The scale-stacked bar plot of the proportions of various immune cells in 77 COVID-19 patient samples. The fractions of the infiltration levels of various immune cells are represented in bars of different colors.

Fig. 6

The correlation diagram of the relationships between the infiltration levels of various immune cells and the expression of hub susceptibility genes in COVID-19 samples. Positive and negative correlations are indicated in blue and red colors, respectively. The size of nodes indicated the absolute value size of the correlation coefficient. Significant correlation results are marked with a red box. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

The topological network in Supplementary Fig. 11 displays upstream miRNAs or TFs that may regulate the transcription of the 10 hub susceptibility genes. In particular, TFs such as SP1, ELK1, and GABPA, and miRNAs such as hsa-miR-206, hsa-miR-548a-5p, and hsa-miR-548c-5p, could target more than one hub susceptibility gene. The 10 hub susceptibility genes and eight interactome sets of the A549 cell line exhibited overlaps at both the gene level and the shared term level (Fig. 7); no interrelationships were found between the 10 hub susceptibility genes and the interactome sets of the Calu-3 cell line (data not shown).
Fig. 7

Overlap between hub susceptibility genes and eight interactome sets of the A549 cell lines. A. Overlaps at the gene level, in which identical genes are linked by purple curves; B. Overlaps at the shared term level, in which genes belonging to the same ontology term are linked by blue curves. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)


Discussion

The COVID-19 pandemic has posed great challenges for the clinical management of cancer patients, especially LUAD patients [21], [22], [23]. More attention should be paid to LUAD patients with COVID-19 to improve their quality of life. Previous researchers have devoted great effort to mitigating the impact of COVID-19 on LUAD patients. Luo et al. carried out a bioinformatics study with multiple databases to analyze the prognostic value and mechanism of the proprotein convertase FURIN in LUAD [24]. Uddin et al. investigated the association of ACE2 expression with immune signatures, immune ratios, and pathways in LUAD through computational analysis of ACE2 expression profiles from the TCGA and GEO databases [11]. Tang et al. characterized the expression of ACE2, TMPRSS2, and AAK1 in LUAD and their influence on immune infiltration via differential expression analysis, pathway enrichment analysis, and estimation of immune cell infiltration [25].

However, the mechanism underlying the high susceptibility of LUAD patients to COVID-19 had not been clarified. We are the first group to expound on the mechanism of the high vulnerability of LUAD patients to COVID-19 by profiling the landscape of differentially expressed genes between LUAD samples infected with SARS-CoV-2 and non-infected LUAD samples. Unlike the traditional practice of gathering DEGs merely from datasets of COVID-19 and uninfected samples, we gathered the susceptibility genes for COVID-19 in LUAD from 49 SARS-CoV-2-infected LUAD samples and 24 non-affected LUAD samples, as well as 3931 LUAD samples and 3027 non-cancer lung samples in globally compiled RNA-seq and microarray datasets.
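The selection logic described above, keeping only genes that are up-regulated in both comparisons, can be illustrated with a minimal sketch. It is not the authors' pipeline: the study pooled standardized mean difference (SMD) values across 40 datasets, whereas this example computes a single-dataset SMD (Cohen's d with a pooled standard deviation) on toy values. The gene names, the expression values, the zero threshold, and the helper names `smd` and `upregulated` are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def smd(case, control):
    """Standardized mean difference (Cohen's d, pooled sample SD)."""
    n1, n2 = len(case), len(control)
    pooled_var = ((n1 - 1) * stdev(case) ** 2
                  + (n2 - 1) * stdev(control) ** 2) / (n1 + n2 - 2)
    return (mean(case) - mean(control)) / sqrt(pooled_var)


def upregulated(expr_case, expr_control, threshold=0.0):
    """Return genes whose SMD (case vs. control) exceeds the threshold."""
    shared = expr_case.keys() & expr_control.keys()
    return {g for g in shared
            if smd(expr_case[g], expr_control[g]) > threshold}


# Toy expression values per gene (hypothetical, for illustration only).
luad = {"GENE_A": [6, 7, 8], "GENE_B": [3, 3.5, 4], "GENE_C": [5, 6, 7]}
non_cancer = {"GENE_A": [4, 5, 6], "GENE_B": [3, 4, 5], "GENE_C": [2, 3, 4]}
infected_luad = {"GENE_A": [9, 10, 11], "GENE_B": [6, 7, 8], "GENE_C": [1, 2, 3]}
uninfected_luad = {"GENE_A": [7, 8, 9], "GENE_B": [4, 5, 6], "GENE_C": [2, 3, 4]}

# Genes up-regulated in LUAD vs. non-cancer lung tissue.
up_in_luad = upregulated(luad, non_cancer)
# Genes up-regulated in infected vs. uninfected LUAD samples.
up_in_infected = upregulated(infected_luad, uninfected_luad)

# Candidate susceptibility genes: up-regulated in both comparisons.
candidates = up_in_luad & up_in_infected
print(sorted(candidates))  # ['GENE_A']
```

In a real analysis the threshold would typically be combined with a confidence interval or significance test on the pooled SMD rather than a bare sign check.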
Narrowing the search range to the DEGs up-regulated both in LUAD and in SARS-CoV-2-infected LUAD samples might enhance the relevance of the identified genes to COVID-19. Functional enrichment analyses of the selected susceptibility genes implied potential biological processes, molecular functions, and pathways through which these genes may make LUAD patients more likely to develop COVID-19. We noted that several of the enriched biological process terms were associated with RNA splicing and mitochondrial functions. Previous studies have provided evidence regarding the impact of COVID-19 on RNA splicing and mitochondrial functions. Singh et al. suggested that SARS-CoV-2 might cause mitochondrial dysfunction by downregulating ribosomal, mitochondrial complex I, and mitochondrial fission-promoting genes [26]. Banerjee et al. reported that NSP16, a non-structural protein encoded by SARS-CoV-2, could inhibit global mRNA splicing by binding to the mRNA recognition domains of the U1 and U2 splicing RNAs [27]. The most significant pathway enriched among the susceptibility genes was the proteasome. Proteasomes play crucial roles in viral replication processes, and proteasome inhibitors were proposed by Longhitano et al. as promising therapeutic strategies for COVID-19 [28], [29], [30]. The above functional annotation results imply that these susceptibility genes may increase LUAD patients’ risk of COVID-19 by participating in the biological processes and pathways of RNA splicing, mitochondrial functions, and proteasomes.

Another notable finding of the current study is that several of the hub susceptibility genes, including MEA1, PFDN2, and PFDN6, were shown in both training and validation sets to be significantly related to the poor survival of LUAD patients.
Many studies have reported DEGs with prominent prognostic value in LUAD; the prognostic significance of the hub susceptibility genes identified in the present study is also noteworthy, and these genes might serve as alternative survival indicators for LUAD patients in future clinical practice.

Several studies have pointed out that cigarette smoking boosts the membrane expression of genes, including ACE2, in lung epithelial cells, thus increasing the risk of contracting COVID-19 [31], [32]. To determine whether cigarette smoking also affected the expression of the 10 hub susceptibility genes, we compared gene expression between non-smoker and smoker groups, and we were surprised to find higher expression of four hub susceptibility genes, MRPL24, EBNA1BP2, PFDN2, and PFDN6, in LUAD smokers. The results imply that the increased susceptibility of LUAD smokers to COVID-19 compared to LUAD non-smokers might partly be attributed to the stimulating effects of cigarette smoking on the expression of these susceptibility genes.

Innate and adaptive immune cells have been found to extensively infiltrate fatal COVID-19 lungs [33]. In respiratory epithelial cells and cardiomyocytes, SARS-CoV-2 could induce innate immune responses mediated by double-stranded RNA [34], which demonstrates the considerable immune response stimulated by SARS-CoV-2. Therefore, we also checked the correlations between the expression of susceptibility genes and the infiltration level of immune cells in COVID-19 samples. Of the 10 hub susceptibility genes for COVID-19, PPIH is one of the host proteins engaged in the regulation of the calcineurin/NFAT pathway, thus playing a vital role in immune cell activation [35], [36]. The work of Susanne et al. indicated the redundant interaction between immunophilins, including PPIH, and CoV non-structural protein 1 [37].
MRTO4 was also found to be associated with the virus-induced immune response and has been identified as one of the key genes in human antigen-presenting cells activated by the polio vaccine [38]. The connection between TRMT112 and T cells can be traced to the study by Kohei et al., in which TRMT112 was differentially expressed between T cell subsets from paroxysmal nocturnal hemoglobinuria patients and healthy control subjects [39]. Consistent with the findings of prior studies, the significant relationships between the 10 hub susceptibility genes and the fractions of immune cells in the current study hint at the potential involvement of the hub susceptibility genes in the immune activities of the human body against SARS-CoV-2.

PPI maps of SARS-CoV-2 proteins and human proteins generated via AP-MS and BioID have yielded interesting results regarding the virus–host interface and have facilitated the understanding of the pathogenicity of SARS-CoV-2 [40], [41], [42], [43], [44]. Therefore, it was necessary to explore the overlaps between hub susceptibility genes for COVID-19 and the interactome sets of A549 cells in response to COVID-19. The interrelations between the hub susceptibility genes and the interactome sets of the A549 cell lines provide useful clues regarding the molecular mechanisms by which these hub genes render LUAD patients vulnerable to COVID-19 infection.
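The correlation analysis between hub gene expression and immune cell fractions discussed above can be sketched as follows. This is an illustration under stated assumptions, not the authors' method: the text does not say which correlation coefficient was used, so a Spearman rank correlation is assumed here, and the per-sample expression values and immune cell fractions are hypothetical. The helper names `ranks`, `pearson`, and `spearman` are introduced only for this example.

```python
from math import sqrt


def ranks(values):
    """Assign 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical per-sample values: expression of one hub gene and the
# estimated fraction of one immune cell type in the same samples.
gene_expression = [2.1, 3.4, 1.8, 4.0, 2.9]
cell_fraction = [0.10, 0.22, 0.08, 0.30, 0.15]

rho = spearman(gene_expression, cell_fraction)
print(f"Spearman rho = {rho:.3f}")
```

Repeating this for every gene and cell type pair, together with a significance test and multiple-testing correction, would yield a matrix of the kind summarized in a correlation diagram.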

Conclusion

In summary, we identified a set of susceptibility genes for COVID-19 in LUAD. The hub susceptibility genes, MEA1, MRPL24, PPIH, EBNA1BP2, MRTO4, RABEPK, TRMT112, PFDN2, PFDN6, and NDUFS3, may increase the vulnerability of LUAD patients to COVID-19 by interfering with multiple biological processes and pathways, such as RNA splicing, mitochondrial functions, and the proteasome or immune functions of the human body. In vitro and in vivo experiments should be carried out in future work to validate the functional roles and immunity correlations of the hub susceptibility genes in increasing the risk of COVID-19 infection in LUAD patients. Additionally, future studies could examine the antibody levels for viral antigens, particularly the anti-SPIKE antibody, after natural infection or after vaccination in LUAD samples with high or low expression of the 10 susceptibility genes. The absence of these experiments and analyses is a limitation of the present work. Nevertheless, the findings of the current study may shed new light on the high susceptibility of LUAD patients to COVID-19.

Data availability

The data underlying this article are available in the article and its online supplementary material.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
  44 in total

Review 1.  Transcriptional regulation by calcium, calcineurin, and NFAT.

Authors:  Patrick G Hogan; Lin Chen; Julie Nardone; Anjana Rao
Journal:  Genes Dev       Date:  2003-09-15       Impact factor: 11.361

Review 2.  Managing Lung Cancer during Coronavirus Disease 2019 Pandemic.

Authors:  Yara Ibrahim Khalifeh; Arafat Hussein Tfayli
Journal:  Turk Thorac J       Date:  2021-03-01

3.  The ubiquitin proteasome system is necessary for efficient proliferation of porcine reproductive and respiratory syndrome virus.

Authors:  Yu Pang; Mao Li; Yanrong Zhou; Wei Liu; Ran Tao; Hejin Zhang; Shaobo Xiao; Liurong Fang
Journal:  Vet Microbiol       Date:  2020-12-08       Impact factor: 3.293

4.  Early Transmission Dynamics in Wuhan, China, of Novel Coronavirus-Infected Pneumonia.

Authors:  Qun Li; Xuhua Guan; Peng Wu; Xiaoye Wang; Lei Zhou; Yeqing Tong; Ruiqi Ren; Kathy S M Leung; Eric H Y Lau; Jessica Y Wong; Xuesen Xing; Nijuan Xiang; Yang Wu; Chao Li; Qi Chen; Dan Li; Tian Liu; Jing Zhao; Man Liu; Wenxiao Tu; Chuding Chen; Lianmei Jin; Rui Yang; Qi Wang; Suhua Zhou; Rui Wang; Hui Liu; Yinbo Luo; Yuan Liu; Ge Shao; Huan Li; Zhongfa Tao; Yang Yang; Zhiqiang Deng; Boxi Liu; Zhitao Ma; Yanping Zhang; Guoqing Shi; Tommy T Y Lam; Joseph T Wu; George F Gao; Benjamin J Cowling; Bo Yang; Gabriel M Leung; Zijian Feng
Journal:  N Engl J Med       Date:  2020-01-29       Impact factor: 176.079

5.  SARS-CoV-2 induces double-stranded RNA-mediated innate immune responses in respiratory epithelial-derived cells and cardiomyocytes.

Authors:  Yize Li; David M Renner; Courtney E Comar; Jillian N Whelan; Hanako M Reyes; Fabian Leonardo Cardenas-Diaz; Rachel Truitt; Li Hui Tan; Beihua Dong; Konstantinos Dionysios Alysandratos; Jessie Huang; James N Palmer; Nithin D Adappa; Michael A Kohanski; Darrell N Kotton; Robert H Silverman; Wenli Yang; Edward E Morrisey; Noam A Cohen; Susan R Weiss
Journal:  Proc Natl Acad Sci U S A       Date:  2021-04-20       Impact factor: 11.205

6.  Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study.

Authors:  Fei Zhou; Ting Yu; Ronghui Du; Guohui Fan; Ying Liu; Zhibo Liu; Jie Xiang; Yeming Wang; Bin Song; Xiaoying Gu; Lulu Guan; Yuan Wei; Hui Li; Xudong Wu; Jiuyang Xu; Shengjin Tu; Yi Zhang; Hua Chen; Bin Cao
Journal:  Lancet       Date:  2020-03-11       Impact factor: 79.321

7.  Identification of angiotensin-converting enzyme 2 (ACE2) protein as the potential biomarker in SARS-CoV-2 infection-related lung cancer using computational analyses.

Authors:  Abdus Samad; Tamanna Jafar; Jahirul Hasnat Rafi
Journal:  Genomics       Date:  2020-09-08       Impact factor: 5.736

Review 8.  COVID-19 vaccines: The status and perspectives in delivery points of view.

Authors:  Jee Young Chung; Melissa N Thone; Young Jik Kwon
Journal:  Adv Drug Deliv Rev       Date:  2020-12-24       Impact factor: 17.873

9.  A SARS-CoV-2 protein interaction map reveals targets for drug repurposing.

Authors:  David E Gordon; Gwendolyn M Jang; Mehdi Bouhaddou; Jiewei Xu; Kirsten Obernier; Kris M White; Matthew J O'Meara; Veronica V Rezelj; Jeffrey Z Guo; Danielle L Swaney; Tia A Tummino; Ruth Hüttenhain; Robyn M Kaake; Alicia L Richards; Beril Tutuncuoglu; Helene Foussard; Jyoti Batra; Kelsey Haas; Maya Modak; Minkyu Kim; Paige Haas; Benjamin J Polacco; Hannes Braberg; Jacqueline M Fabius; Manon Eckhardt; Margaret Soucheray; Melanie J Bennett; Merve Cakir; Michael J McGregor; Qiongyu Li; Bjoern Meyer; Ferdinand Roesch; Thomas Vallet; Alice Mac Kain; Lisa Miorin; Elena Moreno; Zun Zar Chi Naing; Yuan Zhou; Shiming Peng; Ying Shi; Ziyang Zhang; Wenqi Shen; Ilsa T Kirby; James E Melnyk; John S Chorba; Kevin Lou; Shizhong A Dai; Inigo Barrio-Hernandez; Danish Memon; Claudia Hernandez-Armenta; Jiankun Lyu; Christopher J P Mathy; Tina Perica; Kala Bharath Pilla; Sai J Ganesan; Daniel J Saltzberg; Ramachandran Rakesh; Xi Liu; Sara B Rosenthal; Lorenzo Calviello; Srivats Venkataramanan; Jose Liboy-Lugo; Yizhu Lin; Xi-Ping Huang; YongFeng Liu; Stephanie A Wankowicz; Markus Bohn; Maliheh Safari; Fatima S Ugur; Cassandra Koh; Nastaran Sadat Savar; Quang Dinh Tran; Djoshkun Shengjuler; Sabrina J Fletcher; Michael C O'Neal; Yiming Cai; Jason C J Chang; David J Broadhurst; Saker Klippsten; Phillip P Sharp; Nicole A Wenzell; Duygu Kuzuoglu-Ozturk; Hao-Yuan Wang; Raphael Trenker; Janet M Young; Devin A Cavero; Joseph Hiatt; Theodore L Roth; Ujjwal Rathore; Advait Subramanian; Julia Noack; Mathieu Hubert; Robert M Stroud; Alan D Frankel; Oren S Rosenberg; Kliment A Verba; David A Agard; Melanie Ott; Michael Emerman; Natalia Jura; Mark von Zastrow; Eric Verdin; Alan Ashworth; Olivier Schwartz; Christophe d'Enfert; Shaeri Mukherjee; Matt Jacobson; Harmit S Malik; Danica G Fujimori; Trey Ideker; Charles S Craik; Stephen N Floor; James S Fraser; John D Gross; Andrej Sali; Bryan L Roth; Davide Ruggero; Jack Taunton; Tanja Kortemme; Pedro Beltrao; Marco Vignuzzi; Adolfo García-Sastre; Kevan M Shokat; Brian K Shoichet; Nevan J Krogan
Journal:  Nature       Date:  2020-04-30       Impact factor: 69.504

10.  An Integrated Systems Biology Approach Identifies the Proteasome as A Critical Host Machinery for ZIKV and DENV Replication.

Authors:  Guang Song; Emily M Lee; Jianbo Pan; Miao Xu; Hee-Sool Rho; Yichen Cheng; Nadia Whitt; Shu Yang; Jennifer Kouznetsova; Carleen Klumpp-Thomas; Samuel G Michael; Cedric Moore; Ki-Jun Yoon; Kimberly M Christian; Anton Simeonov; Wenwei Huang; Menghang Xia; Ruili Huang; Madhu Lal-Nag; Hengli Tang; Wei Zheng; Jiang Qian; Hongjun Song; Guo-Li Ming; Heng Zhu
Journal:  Genomics Proteomics Bioinformatics       Date:  2021-02-19       Impact factor: 7.691

