
Identification of differentially expressed genes and enriched pathways in lung cancer using bioinformatics analysis.

Tingting Long1, Zijing Liu2, Xing Zhou1, Shuang Yu1, Hui Tian1, Yixi Bao1.   

Abstract

Lung cancer is the leading cause of cancer‑associated mortality worldwide. The aim of the present study was to identify the differentially expressed genes (DEGs) and enriched pathways in lung cancer by bioinformatics analysis, and to provide potential targets for diagnosis and treatment. Valid microarray data of 31 pairs of lung cancer tissues and matched normal samples (GSE19804) were obtained from the Gene Expression Omnibus database. Significance analysis of the gene expression profile was used to identify DEGs between cancer tissues and normal tissues, and a total of 1,970 DEGs, which were significantly enriched in biological processes, were screened. Through the Gene Ontology function and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis, 77 KEGG pathways associated with lung cancer were identified, among which the Toll‑like receptor pathway was observed to be important. Protein‑protein interaction network analysis extracted 1,770 nodes and 10,667 edges, and identified 10 genes with key roles in lung cancer with highest degrees, hub centrality and betweenness. Additionally, the module analysis of protein‑protein interactions revealed that 'chemokine signaling pathway', 'cell cycle' and 'pathways in cancer' had a close association with lung cancer. In conclusion, the identified DEGs, particularly the hub genes, strengthen the understanding of the development and progression of lung cancer, and certain genes (including advanced glycosylation end‑product specific receptor and epidermal growth factor receptor) may be used as candidate target molecules to diagnose, monitor and treat lung cancer.

Year:  2019        PMID: 30664219      PMCID: PMC6390056          DOI: 10.3892/mmr.2019.9878

Source DB:  PubMed          Journal:  Mol Med Rep        ISSN: 1791-2997            Impact factor:   2.952


Introduction

Lung cancer is one of the most frequent malignancies worldwide and is the most common cause of global cancer-associated mortality, with over a million deaths each year (1). Global mortality from lung cancer increased from 3.5 million in 1990 to 4.2 million in 2015 (2) and it was estimated that there would be 2.1 million new lung cancer cases and 1.8 million deaths in 2018, representing close to 1 in 5 (18.4%) cases of cancer-associated mortality (3). The five-year survival rate for lung cancer patients is very low, which may be attributed to the lack of effective therapeutic methods and the difficulty of early diagnosis (4). Lung cancer is considered to be a heterogeneous disease, and a number of factors, including genetic mutations, environmental factors and individual habits, can contribute to cancer occurrence, progression and metastasis (5). A number of genes and cellular pathways have been reported to participate in these processes (6,7). Thus, understanding the precise molecular mechanisms underlying lung cancer progression is important for the development of diagnostic and therapeutic strategies. Microarray technology has increasingly become a promising tool in medical oncology research (8). Previous studies of gene expression profiling in cancer have used microarray technology (9), but only a few of them have been conducted on lung cancer (10). In addition, comparative analysis of the differentially expressed genes (DEGs) remains relatively limited (10), and a reliable biomarker profile discriminating cancer from normal tissues remains to be identified. The expression changes of genes in the development and progression of lung cancer require further analysis. In addition, the interactions among the identified DEGs, particularly the important signaling pathways and the interaction networks, should be elucidated.
In the present study, original data (GSE19804) were downloaded from the Gene Expression Omnibus (GEO; www.ncbi.nlm.nih.gov/geo), a public repository for archiving and retrieving microarray data. Following the elimination of mismatched chips, the DEGs between lung cancer tissues and normal tissues were identified by comparing gene expression profiles. Subsequently, the DEGs were screened using GeneSpring software for Gene Ontology (GO) and pathway enrichment analysis. By investigating their hub nodes and modules using a protein-protein interaction (PPI) network, the present study aimed to further understand the molecular mechanisms of lung cancer development and to identify candidate biomarkers for diagnosis and prognosis, as well as potential therapeutic targets. At the same time, the present study focused on Toll-like receptor (TLR) pathways, which exert important immune regulatory functions and have been implicated in tumor progression (11,12).

Materials and methods

Microarray data

The gene expression profile GSE19804, a comprehensive analysis of the molecular signature of non-smoking patients with non-small cell lung cancer (NSCLC), was obtained from the GEO database. The majority of the tumors were adenocarcinomas (93%), and 78% of the samples were in stage I or II. Gene expression profile analysis was performed based on the GPL570 platform (HG-U133-Plus-2; Affymetrix Human Genome U133 Plus 2.0 Array) by Lu et al (13) and then subjected to bioinformatics analysis. There were 120 chips in this dataset. Following quality control by signal strength distribution normalization, correlation analysis and principal component analysis in Agilent GeneSpring GX v.11.5 (14), mismatched chips were eliminated and the remaining 31 pairs of cancerous and normal tissues were used for subsequent analysis.

Identification of DEGs

The raw data used for the analysis were pre-processed using the Affy package (version 1.48.0; bioconductor.org/packages/release/bioc/html/affy.html) in R language (15). Hierarchical clustering analysis was applied to categorize the data into two groups of different expression patterns. Significance analysis by Student's t-test and fold-change (FC) in the expression of genes between each pair of cancerous and normal tissues were jointly used to identify DEGs. Then, the Benjamini and Hochberg method (16) was used to calculate the adjusted P-values [the false discovery rate (FDR)]. The criterion of statistical significance was FDR<0.05 and |Log2 (FC)|>1.
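The selection rule above (FDR<0.05 and |Log2 (FC)|>1) can be sketched in a few lines of Python. This is a minimal illustration and not the authors' pipeline: it assumes per-gene P-values from the t-test are already available, it treats the fold change as a signed ratio in the style of Table I (a negative value meaning lower expression in tumor), and the function names and the two HYPOTHETICAL rows are invented for the example.

```python
import math

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

def signed_fc_to_log2(fold_change):
    """Convert a signed fold change (e.g. -4.0 = 4-fold lower) to the log2 scale."""
    return math.copysign(math.log2(abs(fold_change)), fold_change)

def select_degs(records, fdr_cutoff=0.05, log2fc_cutoff=1.0):
    """records: list of (gene, p_value, signed_fold_change) tuples."""
    fdr = benjamini_hochberg([p for _, p, _ in records])
    selected = []
    for (gene, _, fc), q in zip(records, fdr):
        log2fc = signed_fc_to_log2(fc)
        if q < fdr_cutoff and abs(log2fc) > log2fc_cutoff:
            selected.append((gene, q, log2fc))
    return selected

# Two rows taken from Table I plus two invented rows for contrast.
toy = [
    ("AGER", 1.08e-19, -42.35560),
    ("SPP1", 2.01e-14, 34.41545),
    ("HYPOTHETICAL_A", 0.03, 1.5),   # passes the FDR cutoff, change too small
    ("HYPOTHETICAL_B", 0.40, -3.0),  # large change, fails the FDR cutoff
]
degs = select_degs(toy)
```

Note that |Log2 (FC)|>1 is equivalent to a more than two-fold change in either direction, so both criteria must hold for a gene to be retained.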

GO and pathway enrichment analysis of DEGs

The Database for Annotation, Visualization and Integrated Discovery (DAVID; version 6.8; david.ncifcrf.gov/), an essential tool for systematically extracting biological information from numerous genes (17), was used to perform GO (www.geneontology.org) enrichment and Kyoto Encyclopedia of Genes and Genomes (KEGG; www.genome.jp/) pathway analysis; P<0.05 was considered to indicate statistically significant enrichment.
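The idea behind such an enrichment P-value can be illustrated with a plain hypergeometric test: given a background of N genes, K of which carry an annotation, how likely is it to see at least k annotated genes in a list of n? The sketch below is an approximation for intuition only. DAVID reports a modified, more conservative Fisher's exact P-value (the EASE score), so its numbers will not match this function, and the background and term sizes in the example call are hypothetical.

```python
from math import comb

def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, of which K carry the annotation."""
    total = comb(N, n)
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total

# Hypothetical sizes: 20,000 background genes, 87 annotated to the pathway,
# 1,820 annotated DEGs, 28 of which fall in the pathway.
p_value = hypergeom_upper_tail(k=28, n=1820, K=87, N=20000)
is_enriched = p_value < 0.05
```

Exact integer arithmetic is used for the binomial coefficients, which keeps the sketch dependency-free at the cost of speed on very large gene lists.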

Integration of PPI network and screening of modules

The Search Tool for the Retrieval of Interacting Genes (STRING, version 10.0; string-db.org/) (18), covering 9,643,763 proteins from 2,031 organisms and 932,553,897 interactions, was used to retrieve predicted PPIs. All associations obtained in STRING were provided with a confidence score. Only experimentally validated interactions with a combined score >0.4 were selected to construct the PPI network using Cytoscape software (version 3.4.0; www.cytoscape.org/) (19). The Molecular Complex Detection (MCODE) plugin in Cytoscape was utilized to screen the modules of the PPI network. Furthermore, functional and pathway enrichment analyses were performed on the DEGs in the modules. P<0.05 was considered to indicate a statistically significant difference.
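The network construction step can be sketched as follows: keep only edges whose combined score exceeds the cutoff, then rank proteins by degree (number of distinct interaction partners). This is an illustrative stand-in for the STRING and Cytoscape workflow, not a reproduction of it. The edge list and its scores are invented, and scores are assumed to be on a 0 to 1 scale.

```python
from collections import defaultdict

def build_network(edges, min_score=0.4):
    """Keep edges with score > min_score; return a node -> neighbour-set mapping."""
    adjacency = defaultdict(set)
    for a, b, score in edges:
        if score > min_score and a != b:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency

def rank_by_degree(adjacency, top=10):
    """Return (node, degree) pairs, highest degree first, ties broken by name."""
    ranked = sorted(((node, len(nbrs)) for node, nbrs in adjacency.items()),
                    key=lambda item: (-item[1], item[0]))
    return ranked[:top]

# Invented edges and scores, using gene symbols from the paper as node names.
toy_edges = [
    ("EGFR", "JUN", 0.95), ("EGFR", "FOS", 0.80), ("EGFR", "MYC", 0.70),
    ("JUN", "FOS", 0.99), ("IL6", "JUN", 0.60),
    ("MYC", "CDK1", 0.35),  # dropped: score does not exceed 0.4
]
network = build_network(toy_edges)
hubs = rank_by_degree(network, top=3)
```

Storing neighbours in sets means duplicate edges do not inflate the degree, which matches the definition of degree as the number of neighbouring nodes.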

Results

A total of 120 chips were acquired from the GEO datasets. Subsequent to quality control (signal strength distribution normalization, correlation analysis and principal component analysis in GeneSpring), 62 chips were selected, including data from 31 lung cancer tissues and their matched normal lung tissues. The Pearson's correlation (signal) map (Fig. 1A) and Relative Signal Box Plot map (Fig. 1B) of the pre-treated data presented the performance of normalization. The series from each chip were analyzed separately and the DEG lists were identified. Information on the expression levels of 54,675 probe sets was obtained using the GPL570 platform. A total of 1,970 DEGs (cancer tissues vs. normal tissues), including 534 up-regulated and 1,436 down-regulated genes (data not shown), were selected based on the criteria of FDR<0.05 and |Log2 (FC)|>1 (Fig. 1C). The statistical metrics for key DEGs are shown in Table I.
Figure 1.

Identification of DEGs. (A) Pearson's correlation (signal) map. A correlation coefficient close to 1.0 indicates high repeatability and similar signal distributions. (B) Relative Signal Box Plot map. The red line is the base line; more similar distributions imply higher repeatability of the data. (C) Volcano plot comparing all of the DEGs. The red dots indicate DEGs that were significant at |Log2 (FC)|>1. DEGs, differentially expressed genes; T, lung cancer tissues; N, normal lung tissues.

Table I.

Statistical metrics for key differentially expressed genes.

Probe Set ID  Gene     P-value      Fold change (tumor vs. normal)
210081_at     AGER     1.08×10^−19  −42.35560
232578_at     CLDN18   9.04×10^−14  −41.43548
204712_at     WIF1     1.82×10^−12  −34.45463
209875_s_at   SPP1     2.01×10^−14  34.41545
203980_at     FABP4    4.45×10^−17  −27.01589
219230_at     TMEM100  4.84×10^−14  −24.15423
37892_at      COL11A1  1.10×10^−10  22.98382
209469_at     GPM6A    5.90×10^−25  −22.56606
205725_at     SCGB1A1  1.21×10^−10  −22.50023
213317_at     CLIC5    1.04×10^−16  −22.22463

AGER, advanced glycosylation end-product specific receptor; CLDN18, claudin 18; WIF1, WNT inhibitory factor 1; SPP1, secreted phosphoprotein 1; FABP4, fatty acid binding protein 4; TMEM100, transmembrane protein 100; COL11A1, collagen type XI α1 chain; GPM6A, glycoprotein M6A; SCGB1A1, secretoglobin family 1A member 1; CLIC5, chloride intracellular channel 5.

Hierarchical clustering analysis of DEGs

Hierarchical clustering analysis was performed for DEGs following the extraction of the expression values. As shown in Fig. 2, the 62 specimens were divided into the lung cancer group and the normal group. The volcano plot demonstrated that compared with normal tissues, there were more downregulated genes than upregulated genes in lung cancer tissues. These results indicated that the DEGs possessed distinct expression patterns in tumors and normal tissues.
Figure 2.

Hierarchical clustering analysis of the 1,970 differentially expressed genes. Red and green indicate higher and lower gene expression, respectively. T, lung cancer tissues; N, normal lung tissues.

GO term enrichment analysis

To investigate the function of the DEGs, GO term enrichment analysis was conducted with the online software DAVID. In general, the DEGs were significantly enriched in biological processes (BP), molecular function (MF) and cellular component (CC) (Table II). In particular, upregulated DEGs were significantly enriched in BP, including ‘mitotic cell cycle process’, ‘mitotic cell cycle’, ‘cell division’, ‘chromosome segregation’ and ‘multicellular organism catabolic process’ (Table III). The down-regulated DEGs were also significantly enriched in BP, including ‘circulatory system development’, ‘cardiovascular system development’, ‘vasculature development’, ‘blood vessel development’ and ‘locomotion’ (Table III).
Table II.

Gene Ontology analysis of differentially expressed genes.

Category       Gene set name                                    Gene count  %     P-value
GOTERM-BP-FAT  GO:0016477-cell migration                        257         14.1  3.64×10^−40
GOTERM-BP-FAT  GO:0072358-cardiovascular system development     222         12.2  5.09×10^−40
GOTERM-BP-FAT  GO:0072359-circulatory system development        222         12.2  5.09×10^−40
GOTERM-BP-FAT  GO:0048870-cell motility                         277         15.2  5.51×10^−40
GOTERM-BP-FAT  GO:0051674-localization of cell                  277         15.2  5.51×10^−40
GOTERM-MF-FAT  GO:0005102-receptor binding                      222         12.2  3.09×10^−14
GOTERM-MF-FAT  GO:0005539-glycosaminoglycan binding             56          3.1   4.52×10^−13
GOTERM-MF-FAT  GO:0019838-growth factor binding                 39          2.1   4.62×10^−11
GOTERM-MF-FAT  GO:0098772-molecular function regulator          199         10.9  3.11×10^−10
GOTERM-MF-FAT  GO:0008201-heparin binding                       42          2.3   1.20×10^−9
GOTERM-CC-FAT  GO:0005576-extracellular region                  635         34.9  4.41×10^−23
GOTERM-CC-FAT  GO:0044421-extracellular region part             551         30.3  1.42×10^−22
GOTERM-CC-FAT  GO:0005615-extracellular space                   253         13.9  2.95×10^−20
GOTERM-CC-FAT  GO:0005578-proteinaceous extracellular matrix    97          5.3   8.71×10^−20
GOTERM-CC-FAT  GO:0031012-extracellular matrix                  123         6.8   3.43×10^−19

BP, biological processes; FAT, functional annotation tool; MF, molecular function; CC, cellular component.

Table III.

Gene Ontology functional enrichment analyses of differentially expressed genes associated with lung cancer.

A, Up-regulated

Category       Term/gene function                                                      Gene count  %     P-value
GOTERM_BP_FAT  GO:1903047-mitotic cell cycle process                                   62          12.5  2.47×10^−12
GOTERM_BP_FAT  GO:0000278-mitotic cell cycle                                           64          12.9  9.70×10^−12
GOTERM_BP_FAT  GO:0051301-cell division                                                45          9.1   1.16×10^−10
GOTERM_BP_FAT  GO:0007059-chromosome segregation                                       33          6.7   1.83×10^−10
GOTERM_BP_FAT  GO:0044243-multicellular organism catabolic process                     16          3.2   3.58×10^−10
GOTERM_MF_FAT  GO:0005201-extracellular matrix structural constituent                  12          2.4   7.34×10^−6
GOTERM_MF_FAT  GO:0042802-identical protein binding                                    60          12.1  3.73×10^−5
GOTERM_MF_FAT  GO:0008574-ATP-dependent microtubule motor activity, plus-end-directed  6           1.2   5.13×10^−5
GOTERM_MF_FAT  GO:1990939-ATP-dependent microtubule motor activity                     6           1.2   6.96×10^−5
GOTERM_MF_FAT  GO:0016758-transferase activity, transferring hexosyl groups            16          3.2   2.89×10^−4
GOTERM_CC_FAT  GO:0005578-proteinaceous extracellular matrix                           31          6.3   1.75×10^−7
GOTERM_CC_FAT  GO:0005576-extracellular region                                         176         35.5  5.35×10^−6
GOTERM_CC_FAT  GO:0098643-banded collagen fibril                                       6           1.2   1.26×10^−5
GOTERM_CC_FAT  GO:0005583-fibrillar collagen trimer                                    6           1.2   1.26×10^−5
GOTERM_CC_FAT  GO:0000779-condensed chromosome, centromeric region                     14          1.9   1.87×10^−5

B, Down-regulated

Category       Term/gene function                            Gene count  %     P-value
GOTERM_BP_FAT  GO:0072359-circulatory system development     190         14.3  3.52×10^−44
GOTERM_BP_FAT  GO:0072358-cardiovascular system development  190         14.3  3.52×10^−44
GOTERM_BP_FAT  GO:0001944-vasculature development            146         11.0  9.36×10^−43
GOTERM_BP_FAT  GO:0001568-blood vessel development           137         10.3  9.96×10^−40
GOTERM_BP_FAT  GO:0040011-locomotion                         242         18.3  3.86×10^−38
GOTERM_MF_FAT  GO:0005102-receptor binding                   179         13.5  7.17×10^−16
GOTERM_MF_FAT  GO:0005539-glycosaminoglycan binding          45          3.4   6.57×10^−12
GOTERM_MF_FAT  GO:0098772-molecular function regulator       155         11.7  2.41×10^−10
GOTERM_MF_FAT  GO:0008201-heparin binding                    36          2.7   3.61×10^−10
GOTERM_MF_FAT  GO:0003779-actin binding                      62          4.7   9.32×10^−10
GOTERM_CC_FAT  GO:0044421-extracellular region part          401         30.3  6.04×10^−18
GOTERM_CC_FAT  GO:0005576-extracellular region               459         34.6  8.97×10^−18
GOTERM_CC_FAT  GO:0005615-extracellular space                187         14.1  4.24×10^−16
GOTERM_CC_FAT  GO:0009986-cell surface                       118         8.9   1.97×10^−15
GOTERM_CC_FAT  GO:0031012-extracellular matrix               89          6.7   7.14×10^−14

BP, biological processes; FAT, functional annotation tool; MF, molecular function; CC, cellular component.

KEGG pathway analysis

KEGG pathway analysis was used to identify pathways for these DEGs. A total of 19 and 58 significantly enriched pathways for up-regulated and down-regulated genes, respectively, were identified. The most significantly enriched pathways associated with lung cancer were ‘extracellular matrix (ECM)-receptor interaction’, ‘malaria’, ‘complement and coagulation cascades’, ‘focal adhesion’, ‘protein digestion and absorption’, ‘cell adhesion molecules’, ‘PI3K-Akt signaling pathway’, ‘Rap1 signaling pathway’, ‘tight junction’ and ‘p53 signaling pathway’ (Fig. 3 and Table IV). In addition, the enriched KEGG pathways also included the ‘TLR signaling pathway’.
Figure 3.

Top 10 most significantly enriched pathways of differentially expressed genes associated with lung cancer as analyzed by Kyoto Encyclopedia of Genes and Genomes pathway analysis. ECM, extracellular matrix; PI3K, phosphoinositide-3-kinase; Akt, protein kinase B; Rap1, Ras-proximate-1.

Table IV.

Top 10 most overrepresented Kyoto Encyclopedia of Genes and Genomes pathways of differentially expressed genes.

Gene set name                                    Count  %    P-value
hsa04512: ECM-receptor interaction               28     1.5  1.52×10^−7
hsa05144: Malaria                                20     1.1  2.20×10^−7
hsa04610: Complement and coagulation cascades    24     1.3  3.03×10^−7
hsa04510: Focal adhesion                         47     2.6  5.97×10^−7
hsa04974: Protein digestion and absorption       25     1.4  9.42×10^−6
hsa04514: Cell adhesion molecules (CAMs)         33     1.8  2.59×10^−5
hsa04151: PI3K-Akt signaling pathway             61     3.3  6.57×10^−5
hsa04015: Rap1 signaling pathway                 40     2.2  3.29×10^−4
hsa04530: Tight junction                         29     1.6  4.68×10^−4
hsa04115: p53 signaling pathway                  18     1.0  4.77×10^−4

ECM, extracellular matrix; PI3K, phosphoinositide-3-kinase; Akt, protein kinase B.

Construction of the PPI network and screening of modules

Based on the predicted interactions of identified DEGs, the PPI network was constructed to identify the most important proteins and biological modules that may serve crucial roles in the development of lung cancer. A total of 1,770 nodes and 10,667 edges were screened from the PPI network (Fig. 4). Each gene was assigned a degree representing the number of neighboring nodes to which it is directly connected in the network. The top 10 hub nodes with the highest degrees in lung cancer were epidermal growth factor receptor (EGFR), Jun proto-oncogene activator protein (AP)-1 transcription factor subunit (JUN), Fos proto-oncogene AP-1 transcription factor subunit (FOS), interleukin 6 (IL6), MYC proto-oncogene basic helix-loop-helix protein 39 (MYC), matrix metalloproteinase 9 (MMP9), cyclin dependent kinase 1 (CDK1), cadherin 1 (CDH1), FYN proto-oncogene Src family tyrosine kinase (FYN) and fibroblast growth factor 2 (FGF2) (Table V). EGFR exhibited the highest node degree (198), with a betweenness centrality of 0.088. The high degree of these hub genes indicated that these proteins may serve crucial roles in maintaining the whole protein interaction network. In addition, to explore the significance of these DEGs, the top 3 significant modules were selected, and the functional annotation of the genes associated with the modules was analyzed (Fig. 5 and Table VI). The results demonstrated that these modules were associated with the chemokine signaling pathway (Fig. 5A), cell cycle (Fig. 5B) and pathways in cancer (Fig. 5C).
Figure 4.

Protein-protein interaction network of differentially expressed genes identified by Search Tool for the Retrieval of Interacting Genes.

Table V.

Top 10 hub nodes with highest degrees of interaction in lung cancer.

Name  Node degree  Betweenness centrality  Closeness centrality  Stress centrality  Clustering coefficient
EGFR  198          0.088                   0.464                 2,403,514          0.085
JUN   185          0.055                   0.457                 1,936,238          0.116
FOS   164          0.038                   0.447                 1,365,032          0.120
IL6   139          0.031                   0.427                 1,039,056          0.138
MYC   136          0.030                   0.441                 1,089,702          0.147
MMP9  135          0.028                   0.433                 1,011,434          0.148
CDK1  121          0.032                   0.427                 936,828            0.105
CDH1  119          0.035                   0.426                 1,062,348          0.111
FYN   118          0.035                   0.410                 1,089,798          0.127
FGF2  114          0.013                   0.420                 577,568            0.184

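Two of the metrics in Table V, closeness centrality and the clustering coefficient, are easy to compute by hand on a small graph, which may help in reading the table. The sketch below uses textbook definitions on an invented five-node network. The paper does not state which tool produced its centrality values, so normalization may differ from what is shown here.

```python
from collections import deque

def clustering_coefficient(adjacency, node):
    """Fraction of pairs of neighbours of `node` that are directly connected."""
    nbrs = sorted(adjacency[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k)
                if nbrs[j] in adjacency[nbrs[i]])
    return 2.0 * links / (k * (k - 1))

def closeness_centrality(adjacency, node):
    """Reciprocal of the mean shortest-path length from `node` to reachable nodes."""
    dist = {node: 0}
    queue = deque([node])
    while queue:  # breadth-first search over unweighted edges
        current = queue.popleft()
        for nbr in adjacency[current]:
            if nbr not in dist:
                dist[nbr] = dist[current] + 1
                queue.append(nbr)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0

# Invented toy network; gene symbols are reused only as node labels.
toy_graph = {
    "EGFR": {"JUN", "FOS", "MYC"},
    "JUN": {"EGFR", "FOS", "IL6"},
    "FOS": {"EGFR", "JUN"},
    "MYC": {"EGFR"},
    "IL6": {"JUN"},
}
egfr_clustering = clustering_coefficient(toy_graph, "EGFR")
egfr_closeness = closeness_centrality(toy_graph, "EGFR")
```

A hub with a high degree but a low clustering coefficient, as seen for EGFR in Table V, connects many proteins that do not otherwise interact with each other.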
Figure 5.

Top 3 modules from the protein-protein interaction network. Nodes and links represent human proteins and protein interactions. (A) The enriched pathways of module 1; (B) The enriched pathways of module 2; (C) The enriched pathways of module 3.

Table VI.

Top 3 modules from the protein-protein interaction network.

Module  Term                                       P-value      False discovery rate  Genes
1       Chemokine signaling pathway                6.11×10^−12  6.23×10^−9            ADCY4, CXCL5, PPBP, CCL21, CXCL3, CXCL2, CXCR2, GNG2, CCL5, CXCL12, CCL28
1       Neuroactive ligand-receptor interaction    5.62×10^−5   0.057248              P2RY13, S1PR1, C5AR1, SSTR1, P2RY14, FPR1, FPR2
1       Cytokine-cytokine receptor interaction     2.51×10^−4   0.255655              PPBP, CCL21, CXCR2, CCL5, CXCL12, CCL28
2       Cell cycle                                 2.53×10^−9   1.80×10^−6            CCNB1, CDK1, CCNB2, MAD2L1, BUB1B, CDC20, CCNA2
2       Progesterone-mediated oocyte maturation    2.90×10^−6   0.002056              CCNB1, CDK1, CCNB2, MAD2L1, CCNA2
2       Oocyte meiosis                             2.99×10^−4   0.212472              CDK1, MAD2L1, CDC20, AURKA
3       Pathways in cancer                         7.30×10^−7   8.72×10^−4            AGTR1, EDNRB, VEGFC, IL6, CDKN2A, PLCB4, GNAQ, PPARG, CDH1, FGF2, PIK3R1, TGFB2
3       Chagas disease (American trypanosomiasis)  9.65×10^−6   0.011527              ACE, IL6, PLCB4, GNAQ, PIK3R1, TGFB2
3       African trypanosomiasis                    2.00×10^−5   0.023928              ICAM1, IL6, PLCB4, GNAQ, F2RL1

DEGs and pathway analysis of the TLR pathway

Among all of the significantly enriched pathways for DEGs, the TLR signaling pathway was the focus of the present study due to its close association with cancer. There were 12 DEGs primarily involved in this pathway, namely secreted phosphoprotein 1 (SPP1), IL6, mitogen-activated protein kinase kinase kinase 8 (MAP3K8), TLR8, FOS, C-C motif chemokine ligand 4 (CCL4), TLR4, CCL5, JUN, phosphoinositide-3-kinase regulatory subunit 1 (PIK3R1), AKT serine/threonine kinase 3 (AKT3) and mitogen-activated protein kinase 13 (MAPK13) (Fig. 6 and Table VII). Using the available gene data of this signaling pathway, a network diagram was constructed including a series of receptors, signaling kinases, transcription factors and cytokines. These genes form a complete signaling pathway to serve an important regulatory role. These results suggested that the TLR pathway could be one of the significant pathways involved in cancer treatment, providing a target for drug development.
Figure 6.

Gene network associated with Toll-like receptor pathways. Red and green dots represent up- and downregulated differentially expressed genes, respectively. Grey diamonds marked with yellow stars represent the associated genes, which were added by the system. The dot size represents the difference of one gene between lung cancer tissues and normal tissues.

Table VII.

Main genes associated with Toll-like receptors pathways.

Gene ID  Gene    P-value      Fold change
6696     SPP1    2.01×10^−14  34.41545
3569     IL6     1.05×10^−12  −8.05396
1326     MAP3K8  5.29×10^−15  −4.17527
51311    TLR8    5.01×10^−10  −3.84351
2353     FOS     5.20×10^−12  −3.81020
6351     CCL4    5.31×10^−9   −2.77206
7099     TLR4    3.40×10^−7   −2.50968
6352     CCL5    1.51×10^−6   −2.42054
3725     JUN     1.05×10^−8   −2.36435
5295     PIK3R1  8.86×10^−10  −2.26284
10000    AKT3    9.98×10^−11  −2.22438
5603     MAPK13  7.48×10^−11  2.14546

Discussion

Cancer is essentially a genetic disease, and many genetic alterations accumulate during the multistep process of carcinogenesis, which eventually leads to abnormal unrestrained cell growth and a malignant phenotype (20). Lung cancer is the most common primary pulmonary malignant tumor in terms of incidence and mortality (21). The total number of lung cancer cases has become a major public health concern in China (rates are 40 per 100,000), and the situation in other parts of the world, including Micronesia/Polynesia (rates are >76.5 per 100,000) and Eastern Europe (rates are >61.2 per 100,000) is even worse (3), which is mainly attributable to cigarette smoking (22). Therefore, the early diagnosis and effective treatment of lung cancer are urgently required, which may be achieved via the identification of the DEGs between lung cancer and normal tissues, and by understanding the underlying molecular mechanism. Microarray and high throughput sequencing analysis can screen a large number of genes in the human genome for further functional analysis, and can be widely used to screen biomarkers for early diagnosis and specific therapeutic targets. Therefore, they may aid the diagnosis of lung cancer in the early stages and the development of targeted treatment, thus improving prognosis. Microarray studies possess great potential to provide novel insights into the pathogenesis of complex diseases (23). The present study systematically applied integrated bioinformatics methods to identify new candidates that serve roles in the progression of lung cancer. The data extracted from the GEO dataset contained 31 pairs of lung cancer and normal samples. A total of 534 up-regulated and 1,436 down-regulated DEGs in cancerous tissues, when compared with normal samples, were identified using bioinformatics analysis, indicating that the occurrence and development of cancer are closely associated with genetic mutations (24).
Then, GO and KEGG pathway analyses were used to investigate the interactions of these DEGs. Finally, a PPI network identified specific key genes. The results of the present study may provide potential biomarkers for the diagnosis of lung cancer. For example, it was identified that advanced glycosylation end-product specific receptor (AGER), with a 42-fold decrease in patients with lung cancer, was the most differentially expressed gene in DEG analysis. AGER belongs to the immunoglobulin superfamily and is an oncogenic transmembrane receptor up-regulated in various types of human cancers, including squamous cervical cancer (25) and pancreatic tumor (26). However, it has an inhibitory effect on lung cancer development (27). In patients with lung adenocarcinoma, AGER polymorphisms are associated with disease susceptibility and prognosis, and can predict survival (28,29). Therefore, AGER may be effective as a potential marker for lung cancer; however, further studies are required. GO analysis is helpful for annotating genes and gene products. GO analysis in the present study demonstrated that up-regulated DEGs were mainly involved in the cell cycle and material metabolism, including ‘mitotic cell cycle’, ‘cell division’ and ‘multicellular organism catabolic’ processes, which mainly refer to cellular processes. However, the down-regulated DEGs were mainly involved in organ systems, including the circulatory system, cardiovascular system and vasculature development, all of which are associated with systemic processes. This is consistent with the knowledge that the defective functioning of cell biological processes (30) and the state of the body systems are important causes of tumor development and progression. It also indicated that cellular activity was enhanced and system function was weakened. Therefore, monitoring the expression of these DEGs may aid the discovery of mechanisms for tumorigenesis and tumor progression.
It is known that signal transduction in cancer cells differs substantially from that in normal cells (31). The KEGG pathway database contains information on systematic analysis of gene functions, linking genomics with functional information. Enrichment analysis identified important KEGG pathways associated with lung cancer, including ‘ECM-receptor interaction’, ‘malaria’, ‘complement and coagulation cascades’, ‘focal adhesion’, and ‘protein digestion and absorption’. Pathway disturbances in ‘ECM-receptor interaction’, ‘complement and coagulation cascades’ and ‘adhesive attraction’ (32–34) have been widely noted in lung cancer. The present study identified that ECM-tumor cell interactions may activate intracellular signaling pathways, which are responsible for tumor cell invasion and metastasis (35,36). Among the enriched pathways, the present study focused on the TLR pathway; TLRs are expressed in a wide variety of cancer cells and immune cells (37). This pathway is centrally involved in the initiation of innate immunity and induction of adaptive immune responses (38), which is useful for maintaining organism integrity and is markedly involved in cancer progression, development and defense (39). However, the activation of TLRs in tumor cells induces the synthesis of pro-inflammatory factors and immunosuppressive molecules, which can enhance the resistance of tumor cells to cytotoxic lymphocyte attack and lead to immune evasion (40). Cancer cells can avoid immune surveillance as TLRs trigger cells to release a number of biological factors, including IL6, vascular endothelial growth factor and MMP (41). Notably, these factors were included in the DEGs between lung cancer tissues and normal tissues. Therefore, it was suggested that targeting tumor TLR signaling pathways may provide promising therapeutic methods (40).
A PPI network was constructed with DEGs, which revealed that the top 10 hub genes with highest degrees were EGFR, JUN, FOS, IL6, MYC, MMP9, CDK1, CDH1, FYN and FGF2. EGFR was at the core of the PPI network and exhibited the highest degree with a connectivity of 198, suggesting a role for EGFR as a potential marker for lung cancer. EGFR is a transmembrane receptor tyrosine kinase and is frequently observed in lung cancer patients with poor differentiation and poor prognosis (42). It can lead to signal transduction including cell differentiation, proliferation and apoptosis through phosphorylating other proteins, and has been reported to be overexpressed in patients with NSCLC (43). Activating mutations in the EGFR gene are considered to be favorable prognostic markers (44) and have become a novel personalized treatment target for patients with NSCLC (45). However, the role of EGFR as a marker of lung cancer diagnosis and treatment is unclear and requires further investigation. JUN and FOS are the predominant components of AP-1, which serves a critical role in the transcriptional regulation of genes involved in cell survival, proliferation, migration and transformation (46). These two proteins are overexpressed in the development of various carcinomas, including pulmonary malignancies (47). The IL6 gene is co-expressed with a number of oncogenic genes, and IL6 is produced and secreted by immune and tumor cells (48). It is involved in different physiologic and pathophysiologic processes such as promoting tumorigenesis and modifying various tumor behaviors, including apoptosis, migration, proliferation, angiogenesis and metabolism (49,50). Elevated levels of serum IL6 are associated with poor prognosis in the majority of malignancies (51). The MYC gene is an oncogene with a high frequency of amplification in lung cancer (52) and its protein has important roles in cell proliferation and differentiation. 
It is closely associated with tumor occurrence, progression and prognosis in different types of tumors (53). The other 5 hub genes also serve important roles in cell migration, proliferation, differentiation, apoptosis and cell cycle process (54–56), affecting the development of cancer. MMP9 degrades and restores the ECM, and its overexpression not only promotes NSCLC metastasis directly but can also facilitate metastatic spread (57). CDK1 is expressed at high levels in tumor tissues, and is associated with poor prognosis and shorter survival time in patients with NSCLC (58). The CDH1 gene encodes a tumor suppressor protein and its mutation or deletion results in the promotion of cancer invasion and metastasis (55) in addition to poor prognosis (59). FYN contributes to the progression of cancer by regulating cell cycle, differentiation, adhesion, motility and survival of cells (56). FGF2 has been shown to be activated in lung cancer (60) and is involved in angiogenesis by initiating a signal transduction cascade that promotes cell proliferation, motility and angiogenesis (61). Therefore, all the hub genes may possess key roles in lung cancer and could interact with each other. They may be used as potential effective candidates for early diagnosis or prognosis. The PPI network contained 1,770 nodes and 10,667 edges, and the top 3 modules extracted were ‘chemokine signaling pathway’, ‘cell cycle’ and ‘pathways in cancer’, all of which are associated with lung cancer. The chemokine signaling pathway contains a number of chemokine proteins, including chemokine (C-X-C motif) ligand (CXCL)-2, CXCL3 and CCL18, which are differentially expressed in lung cancer. As chemokine gradients direct cell migration towards the site of inflammation (62), they are key in immune system functioning, and are important for the removal of pathogens, inflammation, cell and organ development, wound repair, occurrence of tumors and metastasis, and transplantation immune rejection (63–65).
Cell cycle disorders and overgrowth of cells are common biological characteristics of tumors, leading to increased cell proliferation and decreased apoptosis (66). It should be noted that the cell cycle is a tightly regulated process and is frequently aberrant in lung cancer (67). The relevant inhibitors of the cell cycle have emerged as novel drugs for the treatment of lung cancer by suppressing the unrestricted cell division and growth of lung cancer cells (67). Finally, pathways in cancer are of similar significance in cancer cell proliferation, apoptosis, metastasis, angiogenesis and survival. However, there were still certain limitations in the present study. First, although a number of genes were revealed to be potential markers for lung cancer, further experiments are still required to evaluate the roles of these genes as novel biomarkers. Secondly, the gene expression profile was composed of 93% adenocarcinomas and 7% squamous cancer, and there is great genetic heterogeneity between adenocarcinoma and squamous cancer. Furthermore, the variation trend of JUN, FOS, IL6, MYC and FGF2 in this study is inconsistent with that reported in previous studies (46,47); this may be due to the limited number of chips. Therefore, further data for lung cancer are required to validate the results of the present study. Thirdly, due to the limitation of the dataset used, it was not possible to construct a miRNA-target gene regulatory network or transcription factor-target gene regulatory network. Finally, survival analysis could not be performed based on the data presented in the present study. In conclusion, the present study provided a comprehensive bioinformatics analysis of DEGs in lung cancer. Analysis of these altered genes provided information regarding the molecular mechanisms of lung cancer and significant biomarkers or targets for the diagnosis and treatment of lung cancer.
However, further molecular biological experiments are required to confirm the function of the DEGs and pathways in different types of lung cancer.
