Identification of therapeutic targets and mechanisms of tumorigenesis in non-small cell lung cancer using multiple-microarray analysis.

Dan Zhao1, Hai-Jun Mu2, Hai Bing Shi2, Hong Xia Bi1, Yun Fei Jiang2, Guo Hua Liu2, Hong Yan Zheng2, Bo Liu2.   

Abstract

Lung cancer is one of the most commonly occurring cancers and the leading cause of cancer-related deaths globally. Non-small cell lung cancer (NSCLC) comprises 85% to 90% of lung cancers. The survival of patients with advanced-stage NSCLC is measured in months. Moreover, the underlying molecular mechanisms remain to be understood. We used 2 sets of microarray data in combination with various bioinformatic approaches to identify the differentially expressed genes (DEGs) in NSCLC patients. We identified a total of 419 DEGs using the Limma package. Gene set enrichment analysis demonstrated that the "Citrate cycle (TCA cycle)," "RNA degradation," and "Pyrimidine metabolism" pathways were significantly enriched in the NSCLC samples. Gene Ontology annotations of the 419 DEGs primarily comprised "glycosaminoglycan binding," "cargo receptor activity," and "organic acid binding." Kyoto Encyclopedia of Genes and Genomes analysis revealed that the DEGs were enriched in pathways related to "Malaria," "Cell cycle," and "IL-17 signaling pathway." Protein-protein interaction network analysis showed that the hub genes were CDK1, CDC20, BUB1, BUB1B, TOP2A, CCNA2, KIF20A, CCNB1, KIF2C, and NUSAP1. Taken together, the identified hub genes and pathways will help understand NSCLC tumorigenesis and develop prognostic markers and therapeutic targets against NSCLC.

Year:  2020        PMID: 33126319      PMCID: PMC7598833          DOI: 10.1097/MD.0000000000022815

Source DB:  PubMed          Journal:  Medicine (Baltimore)        ISSN: 0025-7974            Impact factor:   1.817


Introduction

Lung cancer is the leading cause of cancer mortality and the second most commonly diagnosed cancer globally. However, with the development of efficient therapeutic interventions, lung cancer has gone from a disease with an extremely poor prognosis to a treatable one. Non-small cell lung cancer (NSCLC) comprises 85% to 90% of lung cancers. NSCLC includes adenocarcinoma, squamous cell carcinoma, large cell carcinoma, and other rare subtypes. Therefore, it is important to identify molecular biomarkers and mechanism(s) of pathogenesis associated with NSCLC that will help develop novel therapeutic approaches to improve patient outcome. The advent of in silico technology has allowed significant progress to be made in understanding the molecular mechanisms and biomarkers involved in a variety of diseases. The Gene Expression Omnibus database is a gene expression database created and maintained by NCBI that is widely utilized for identifying key genes and potential mechanisms associated with tumorigenesis and its progression. In this study, we used integrated bioinformatic analysis to determine the biological characteristics of NSCLC. The microarray datasets GSE18842 and GSE118370 were used to identify the differentially expressed genes (DEGs) in NSCLC. We performed enrichment analysis using Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways to determine potential mechanisms of tumorigenesis in patients with NSCLC. Protein-protein interaction (PPI) network analysis using STRING predicted the correlation between the proteins of the DEGs. In summary, we have screened 10 hub genes associated with NSCLC and demonstrated the molecular mechanisms involved in the progression of NSCLC.

Materials and methods

Microarray analysis

Microarray data were obtained from the Gene Expression Omnibus database (http://www.ncbi.nlm.nih.gov/geo/). The following search terms were used: “NSCLC,” “gene expression,” and “microarray.” We used 2 gene expression datasets (GSE18842 and GSE118370). Both datasets were generated on the GPL570 [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array platform. GSE18842 comprised a total of 91 samples (45 control samples and 46 NSCLC samples) and GSE118370 comprised 12 samples (6 controls and 6 NSCLC samples). This study was approved by the Ethics Committee of The Third Affiliated Hospital, Qiqihar Medical University.

Identification of DEGs

The Limma package was downloaded from Bioconductor for data consolidation (data standardization and log2 conversion) and DEG screening. Data were pre-processed in the R software (version 3.5.2) by converting probes into gene symbols, consolidating the data, and performing batch normalization with “sva,” so that the merged data were devoid of batch effects. After batch normalization, Limma was used to identify DEGs between NSCLC and control samples. Genes with adjusted P-values < .05 and |log2FC| ≥2 were considered significantly differentially expressed.
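The screening rule above can be illustrated with a minimal Python sketch. It is not the authors' R/Limma code. It assumes that per-gene log2 fold changes and raw P-values are already available, and it assumes the Benjamini-Hochberg procedure for the adjusted P-values (the source does not name the adjustment method). The gene names and numbers are hypothetical.

```python
from typing import Dict, List, Tuple


def bh_adjust(p_values: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(
    stats: Dict[str, Tuple[float, float]],
    lfc_cutoff: float = 2.0,
    alpha: float = 0.05,
) -> Dict[str, str]:
    """stats maps gene -> (log2FC, raw P). Returns gene -> 'up' or 'down'."""
    genes = list(stats)
    adjusted = bh_adjust([stats[g][1] for g in genes])
    degs = {}
    for gene, q in zip(genes, adjusted):
        lfc = stats[gene][0]
        # Both criteria from the text: |log2FC| >= 2 and adjusted P < .05.
        if abs(lfc) >= lfc_cutoff and q < alpha:
            degs[gene] = "up" if lfc > 0 else "down"
    return degs


# Hypothetical toy statistics, not values from the study.
toy_stats = {
    "GENE_A": (2.6, 0.0004),
    "GENE_B": (-3.1, 0.0010),
    "GENE_C": (1.2, 0.0002),   # significant but fold change too small
    "GENE_D": (2.2, 0.2000),   # large fold change but not significant
}
degs = select_degs(toy_stats)
```

In this toy input only GENE_A and GENE_B pass both filters, which shows why the fold-change and significance criteria have to be applied jointly.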

Gene set enrichment analysis (GSEA)

GSEA is a method for evaluating whether a predefined set of genes shows statistically significant differences between 2 groups. The reference gene set collection in this study was c2.cp.kegg.v6.2.symbols.gmt, against which the normalized enrichment score was calculated. If the P-value and the false discovery rate q-value were both <0.05, the gene set was considered to be significantly enriched.
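As a rough illustration of the statistic behind GSEA, the sketch below computes a weighted running-sum enrichment score over a ranked gene list. It is a simplified stand-in, not the GSEA software used in the study: it omits the permutation step that yields the normalized enrichment score, P-value, and q-value. The function name, the weight of 1, and the toy ranking are assumptions.

```python
from typing import List, Set, Tuple


def enrichment_score(
    ranked: List[Tuple[str, float]],
    gene_set: Set[str],
    weight: float = 1.0,
) -> float:
    """Running-sum enrichment score for a list sorted by decreasing metric."""
    hits = [abs(score) ** weight if gene in gene_set else 0.0
            for gene, score in ranked]
    hit_total = sum(hits)
    n_miss = sum(1 for gene, _ in ranked if gene not in gene_set)
    if hit_total == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranked list without covering it")
    running, best = 0.0, 0.0
    for (gene, _), hit in zip(ranked, hits):
        # Step up on genes in the set (weighted by the metric), down otherwise.
        running += hit / hit_total if gene in gene_set else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranking metric (for example a tumor-vs-normal statistic).
toy_ranking = [("G1", 3.0), ("G2", 2.5), ("G3", 1.0),
               ("G4", -0.5), ("G5", -2.0), ("G6", -3.0)]
es_top = enrichment_score(toy_ranking, {"G1", "G2"})
es_bottom = enrichment_score(toy_ranking, {"G5", "G6"})
```

A set concentrated at the top of the ranking gives a positive score and one at the bottom gives a negative score, which is the sense in which the gene sets reported later are "positively correlated" with NSCLC.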

GO and KEGG pathway enrichment analysis for DEGs

GO and KEGG pathway enrichment analyses of the DEGs were performed in the R software. clusterProfiler was used to identify and visualize the GO terms and KEGG pathways associated with the DEGs. A P-value < .05 indicated significant enrichment.
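Over-representation analysis of this kind is commonly based on a hypergeometric test. The following Python sketch shows that calculation for a single term under that assumption. It is not clusterProfiler itself, and the counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(
    n_universe: int, n_term: int, n_degs: int, n_overlap: int
) -> float:
    """P(X >= n_overlap) when n_degs genes are drawn without replacement
    from n_universe background genes, n_term of which carry the annotation."""
    upper = min(n_term, n_degs)
    total = comb(n_universe, n_degs)
    tail = sum(
        comb(n_term, k) * comb(n_universe - n_term, n_degs - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20 background genes, 5 annotated to the term,
# 4 DEGs, 3 of which are annotated to the term.
p_term = hypergeom_enrichment_p(20, 5, 4, 3)
```

The gene ratio plotted in enrichment dot plots corresponds to n_overlap / n_degs in this notation. In practice the per-term P-values would also be adjusted for multiple testing.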

PPI network for DEGs

The STRING database (STRING 11.0; www.string-db.org) was used to predict the PPI network of the DEGs. Interactions with a combined score >0.9 were retained. PPI networks were generated using the Cytoscape software (version 3.7.1). Significant modules and hub genes were confirmed by molecular complex detection (MCODE 1.5.1; a Cytoscape plug-in). The most significant module identified by MCODE had a score of ≥10.
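The score cut-off can be sketched in a few lines of Python. The example assumes the interactions have been exported as (protein, protein, combined score) triples with scores on a 0 to 1 scale. The protein names and scores are placeholders, not STRING output.

```python
from typing import FrozenSet, Iterable, Set, Tuple


def build_ppi_network(
    interactions: Iterable[Tuple[str, str, float]],
    cutoff: float = 0.9,
) -> Tuple[Set[str], Set[FrozenSet[str]]]:
    """Keep undirected edges whose combined score exceeds the cutoff."""
    edges: Set[FrozenSet[str]] = set()
    for a, b, score in interactions:
        # Skip self-loops; frozenset makes (a, b) and (b, a) the same edge.
        if a != b and score > cutoff:
            edges.add(frozenset((a, b)))
    nodes: Set[str] = set().union(*edges) if edges else set()
    return nodes, edges


# Hypothetical interactions with made-up scores.
toy_interactions = [
    ("P1", "P2", 0.999),
    ("P2", "P1", 0.999),  # duplicate of the first edge in reverse order
    ("P1", "P3", 0.950),
    ("P4", "P5", 0.900),  # not strictly greater than the cutoff
    ("P6", "P7", 0.420),
]
nodes, edges = build_ppi_network(toy_interactions)
```

Counting `len(nodes)` and `len(edges)` after filtering gives the kind of node and edge totals reported for a PPI network.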

Hub gene screening

The hub genes were identified using Cytoscape (version 3.7.1). The cytoHubba plug-in predicts the important nodes and subnetworks in a network by employing multiple topological algorithms.
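Node degree is one of the topological measures offered by cytoHubba. The source does not state which algorithm was used, so the sketch below is only an assumption-labeled illustration of degree-based hub ranking, with hypothetical node names.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def top_hubs_by_degree(
    edges: Iterable[Tuple[str, str]], k: int = 10
) -> List[Tuple[str, int]]:
    """Rank nodes by number of interaction partners and return the top k."""
    degree: Counter = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    # Sort by descending degree, then by name, so ties are deterministic.
    ranked = sorted(degree.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


# Hypothetical undirected edge list (each edge listed once).
toy_edges = [("N1", "N2"), ("N1", "N3"), ("N1", "N4"),
             ("N2", "N3"), ("N4", "N5")]
hubs = top_hubs_by_degree(toy_edges, k=2)
```

Other topological scores rank nodes differently, so a degree-based list should be read as one possible view of the network rather than a reproduction of the reported hub genes.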

Results

Identification of DEGs in NSCLC

A total of 419 DEGs were identified, comprising 187 upregulated and 232 downregulated genes (adjusted P < .05, |log2FC| ≥2), as shown in the volcano plot (Fig. 1B). The top 50 DEGs are shown in the heatmap (Fig. 1A).
Figure 1

Heatmap and volcano plot of DEGs screened by the Limma package. (A) Heatmap: red areas represent highly expressed genes and green areas represent lowly expressed genes in NSCLC tumor samples compared with normal samples. (B) Volcano plot of the DEGs, comprising 187 upregulated genes and 232 downregulated genes. Red dots represent upregulated genes and blue dots represent downregulated genes in NSCLC tumor samples compared with normal samples. DEGs = differentially expressed genes, NSCLC = non-small cell lung cancer.

PPI network analysis

To identify the most significant DEG clusters, a PPI network consisting of 190 nodes and 1085 edges was generated using STRING and Cytoscape (Fig. 2).
Figure 2

PPI network of DEGs created by STRING. Circles represent genes and lines represent PPIs. DEG = differentially expressed gene, PPI = protein-protein interaction.

Bioinformatic analysis of DEGs

The DEGs were analyzed by GO and KEGG pathway enrichment. GO analysis revealed that DEGs were significantly enriched in 25 terms (Fig. 3). The top 3 terms were: “glycosaminoglycan binding,” “cargo receptor activity,” and “organic acid binding.” KEGG analysis revealed that DEGs were significantly enriched in 10 terms (Fig. 4). The top 3 terms were: “Malaria,” “Cell cycle,” and “IL-17 signaling pathway.”
Figure 3

GO enrichment result of DEGs. The x-axis represents gene ratio and the y-axis represents GO terms. Circle size represents gene count. Circle color represents the adjusted P-value. DEG = differentially expressed gene, GO = Gene Ontology.

Figure 4

KEGG enrichment result of DEGs. The x-axis represents gene ratio and the y-axis represents KEGG terms. Circle size represents gene count. Circle color represents the adjusted P-value. DEG = differentially expressed gene, KEGG = Kyoto Encyclopedia of Genes and Genomes.

GSEA

GSEA was performed to explore the underlying biological functions involved in NSCLC; “Citrate cycle (TCA cycle),” “RNA degradation,” and “Pyrimidine metabolism” were the significantly enriched gene sets that correlated with NSCLC (P < .05; Fig. 5).
Figure 5

GSEA plots showing the most enriched gene sets in NSCLC tumor samples compared with normal samples. (A) The most significantly enriched gene set positively correlated with NSCLC was Citrate cycle (TCA cycle) (NES = 1.90, P-value < .05, FDR q-value < 0.05); (B) The second most significantly enriched gene set positively correlated with NSCLC was RNA degradation (NES = 1.80, P-value < .05, FDR q-value < 0.05); (C) The third most significantly enriched gene set positively correlated with NSCLC was Pyrimidine metabolism (NES = 1.76, P-value < .05, FDR q-value < 0.05). FDR = false discovery rate, GSEA = gene set enrichment analysis, NES = normalized enrichment score.

Hub gene analysis

We downloaded the STRING-generated PPI network of the 419 DEGs and then used Cytoscape to construct the whole network. The most significant module of the whole network comprised 10 nodes, that is, the hub genes (Fig. 6A). To evaluate the prognostic value of the hub genes in lung cancer patients, we performed log-rank survival curve analysis using data from TCGA RNA sequencing datasets (GEPIA tool). The overall survival of patients with higher expression of the hub genes was shorter than that of patients with lower expression (Fig. 6B).
Figure 6

The most significant module of DEGs. (A) The PPI network of DEGs was constructed using Cytoscape, and the most significant module, with 10 nodes, was obtained from the PPI network. (B) Overall survival analysis for the 10 hub genes in lung cancer using the GEPIA tool. DEG = differentially expressed gene, PPI = protein-protein interaction.
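The log-rank comparison between high- and low-expression groups can be sketched as follows. This is a generic two-group log-rank test in plain Python, not the GEPIA implementation. How patients are split into groups (for example at the median expression) is an assumption outside the code, and the survival times are hypothetical.

```python
from math import erfc, sqrt
from typing import List, Sequence, Tuple


def logrank_test(
    times_a: Sequence[float], events_a: Sequence[int],
    times_b: Sequence[float], events_b: Sequence[int],
) -> Tuple[float, float]:
    """Two-group log-rank test. events: 1 = death observed, 0 = censored.
    Returns (chi-square statistic, P-value with 1 degree of freedom)."""
    samples: List[Tuple[float, int, int]] = [
        (t, e, 0) for t, e in zip(times_a, events_a)
    ] + [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in samples if e == 1})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        at_risk = [s for s in samples if s[0] >= t]
        n = len(at_risk)
        n_a = sum(1 for s in at_risk if s[2] == 0)
        d = sum(1 for s in at_risk if s[0] == t and s[1] == 1)
        d_a = sum(1 for s in at_risk if s[0] == t and s[1] == 1 and s[2] == 0)
        # Expected deaths in group A if both groups shared one hazard.
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = observed_minus_expected ** 2 / variance
    p_value = erfc(sqrt(chi2 / 2))  # chi-square survival function, 1 df
    return chi2, p_value


# Hypothetical survival times in months; all deaths observed.
high_expr_times, high_expr_events = [2, 3, 4, 5, 6], [1, 1, 1, 1, 1]
low_expr_times, low_expr_events = [10, 12, 14, 16, 18], [1, 1, 1, 1, 1]
chi2, p_value = logrank_test(high_expr_times, high_expr_events,
                             low_expr_times, low_expr_events)
```

With these toy inputs every death in the high-expression group precedes every death in the low-expression group, so the test returns a small P-value. Real cohorts include censored patients, which the `events` flags accommodate.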

Discussion

The last 2 decades have seen significant advances in the treatment of NSCLC, together with a better understanding of the biology of NSCLC and mechanisms of tumor progression, early detection, and multimodal care. The use of small molecule tyrosine kinase inhibitors and immunotherapy has led to unprecedented survival benefits in patients. However, treatment success and survival rates for NSCLC patients remain low (especially for metastasized tumors). Therefore, there is a need for novel drugs and combination therapies that benefit a wider patient population and improve outcomes in NSCLC patients. We focused on bioinformatic analysis to identify DEGs between NSCLC and normal samples. In this study, we analyzed data from 2 microarray datasets and identified a total of 419 DEGs (187 upregulated and 232 downregulated genes). Among these, 10 candidates comprised the hub genes. GO enrichment of the DEGs revealed that they were mostly enriched in glycosaminoglycan binding, cargo receptor activity, and organic acid binding; KEGG enrichment showed predominant functional enrichment in malaria, cell cycle, and IL-17 signaling pathway. GSEA showed that NSCLC was associated primarily with the citrate cycle (TCA cycle), RNA degradation, and pyrimidine metabolism pathways. Previous studies have shown that these GO, KEGG, and GSEA pathways play important roles in the development and progression of tumors. All 10 hub genes may be key biomarkers in NSCLC tumorigenesis. The genes were CDK1, CDC20, BUB1, BUB1B, TOP2A, CCNA2, KIF20A, CCNB1, KIF2C, and NUSAP1. CCNB1 affects NF-κB signaling, which is associated with angiogenesis. We used PubMed and GeneCards to understand the biological function of TOP2A. TOP2A has been studied the most among the hub genes and it affects multiple cancer types.
Raghavan et al (2012) demonstrated that AZD5438, an inhibitor of CDK1, CDK2, and CDK9, enhances the radiosensitivity of NSCLC cells, which suggests that CDK1 contributes to radioresistance. Overexpression of CDC20 correlates with poor prognosis in primary NSCLC patients. BUB1 inhibition and the resulting activation of DNA damage signaling regulate the localization of apoptin. Lee et al (2017) showed that BUB1B was associated with poor survival rates, possibly because such patients tended to have more invasive disease. TOP2A induces drug resistance via Wnt signaling in NSCLC. Biagioni et al (2012) showed that CCNA2 affects cell cycle regulation. KIF20A overexpression confers a malignant phenotype to ovarian tumors by promoting proliferation and inhibiting apoptosis. Lu et al (2014) revealed that KIF2C promotes the rapid restructuring of the microtubule cytoskeleton. NUSAP1 is involved in the development, progression, and metastasis of multiple cancers. These proteins play an important role in various tumors, especially NSCLC, and may help identify specific and sensitive diagnostic markers and therapeutic targets.

Conclusion

This study identified a host of DEGs and hub genes in NSCLC patients that may act as candidates for biomarkers and therapeutic targets, thereby enabling a better understanding of the molecular mechanism(s) in NSCLC. However, the study has some limitations, such as a relatively small sample size and lack of knowledge on the correlation between these genes and clinical information. Thus, further experiments are needed to verify the candidates in this study and determine their biological roles in NSCLC.

Author contributions

Hai-Jun Mu was responsible for the methodology. Dan Zhao, Yun Fei Jiang, and Guo Hua Liu were involved in the investigation. Dan Zhao and Hai-Jun Mu wrote the manuscript. Hai Bing Shi, Bo Liu, Hong Yan Zheng, and Hong Xia Bi were involved in conceptualization and in review and editing of the manuscript. All the authors have read and approved the final manuscript.

References (32 in total; 10 listed)

1.  DNA damage response signaling triggers nuclear localization of the chicken anemia virus protein Apoptin.

Authors:  Thomas J Kucharski; Isabelle Gamache; Ole Gjoerup; Jose G Teodoro
Journal:  J Virol       Date:  2011-09-21       Impact factor: 5.103

2.  Overexpression of CDC20 predicts poor prognosis in primary non-small cell lung cancer patients.

Authors:  Tatsuya Kato; Yataro Daigo; Masato Aragaki; Keidai Ishikawa; Masaaki Sato; Mitsuhito Kaji
Journal:  J Surg Oncol       Date:  2012-04-04       Impact factor: 3.454

3.  Mitochondrial Dysfunctions Regulated Radioresistance through Mitochondria-to-Nucleus Retrograde Signaling Pathway of NF-κB/PI3K/AKT2/mTOR.

Authors:  Yuehua Wei; Lulu Chen; Hui Xu; Conghua Xie; Yunfeng Zhou; Fuxiang Zhou
Journal:  Radiat Res       Date:  2018-06-04       Impact factor: 2.841

4.  Effect of nitric oxide synthase on multiple drug resistance is related to Wnt signaling in non-small cell lung cancer.

Authors:  Yang Li; Chengyuan Ma; Xu Shi; Zhongmei Wen; Dan Li; Munan Sun; Hui Ding
Journal:  Oncol Rep       Date:  2014-07-23       Impact factor: 3.906

5.  AZD5438, an inhibitor of Cdk1, 2, and 9, enhances the radiosensitivity of non-small cell lung carcinoma cells.

Authors:  Pavithra Raghavan; Vasu Tumati; Lan Yu; Norman Chan; Nozomi Tomimatsu; Sandeep Burma; Robert G Bristow; Debabrata Saha
Journal:  Int J Radiat Oncol Biol Phys       Date:  2012-07-12       Impact factor: 7.038

Review 6.  From targets to targeted therapies and molecular profiling in non-small cell lung carcinoma.

Authors:  A Thomas; A Rajan; A Lopez-Chavez; Y Wang; G Giaccone
Journal:  Ann Oncol       Date:  2012-11-06       Impact factor: 32.976

7.  Altered biochemical specificity of G-quadruplexes with mutated tetrads.

Authors:  Kateřina Švehlová; Michael S Lawrence; Lucie Bednárová; Edward A Curtis
Journal:  Nucleic Acids Res       Date:  2016-10-26       Impact factor: 16.971

8.  SPINK1 promotes cell growth and metastasis of lung adenocarcinoma and acts as a novel prognostic biomarker.

Authors:  Liyun Xu; Changchang Lu; Yanyan Huang; Jihang Zhou; Xincheng Wang; Chaowu Liu; Jun Chen; Hanbo Le
Journal:  BMB Rep       Date:  2018-12       Impact factor: 4.778

9.  Identification of Key Genes and Pathways in Female Lung Cancer Patients Who Never Smoked by a Bioinformatics Analysis.

Authors:  Ke Shi; Na Li; Meilan Yang; Wei Li
Journal:  J Cancer       Date:  2019-01-01       Impact factor: 4.207

Review 10.  Prognostic values of F-box members in breast cancer: an online database analysis and literature review.

Authors:  Xiaochen Wang; Tao Zhang; Shizhen Zhang; Jinlan Shan
Journal:  Biosci Rep       Date:  2019-01-03       Impact factor: 3.840

