Identification of potential hub genes associated with the pathogenesis and prognosis of hepatocellular carcinoma via integrated bioinformatics analysis.

Ziqi Meng1, Jiarui Wu1, Xinkui Liu1, Wei Zhou1, Mengwei Ni1, Shuyu Liu1, Siyu Guo1, Shanshan Jia1, Jingyuan Zhang1.   

Abstract

OBJECTIVE: The objective was to identify potential hub genes associated with the pathogenesis and prognosis of hepatocellular carcinoma (HCC).
METHODS: Gene expression profile datasets were downloaded from the Gene Expression Omnibus database. Differentially expressed genes (DEGs) between HCC and normal samples were identified via an integrated analysis. A protein-protein interaction network was constructed and analyzed using the STRING database and Cytoscape software, and enrichment analyses were carried out through DAVID. Gene Expression Profiling Interactive Analysis and Kaplan-Meier plotter were used to determine expression and prognostic values of hub genes.
RESULTS: We identified 11 hub genes (CDK1, CCNB2, CDC20, CCNB1, TOP2A, CCNA2, MELK, PBK, TPX2, KIF20A, and AURKA) that might be closely related to the pathogenesis and prognosis of HCC. Enrichment analyses indicated that the DEGs were significantly enriched in metabolism-associated pathways, whereas the hub genes and module 1 were highly associated with the cell cycle pathway.
CONCLUSIONS: In this study, we identified key genes of HCC, which indicated directions for further research into diagnostic and prognostic biomarkers that could facilitate targeted molecular therapy for HCC.

Keywords:  Gene Expression Omnibus; Hepatocellular carcinoma; bioinformatics analysis; differentially expressed genes; hub genes; survival

Year:  2020        PMID: 32722976      PMCID: PMC7391448          DOI: 10.1177/0300060520910019

Source DB:  PubMed          Journal:  J Int Med Res        ISSN: 0300-0605            Impact factor:   1.671


Introduction

On a global scale, cancer is a major public health problem, and liver cancer is a major contributor to both cancer morbidity and mortality.[1] Liver cancer is the sixth most common cancer and the fourth highest cause of cancer-related mortality worldwide.[2] In the United States, an estimated 42,030 new cases of liver cancer and 31,780 deaths from the disease were expected during 2019.[3] Hepatocellular carcinoma (HCC) is the most common form of primary liver cancer, comprising 75% to 85% of cases.[2] The well-recognized risk factors for HCC include chronic infection with hepatitis B virus (HBV) or hepatitis C virus, exposure to dietary aflatoxin, alcohol-induced cirrhosis, smoking, obesity, and type 2 diabetes.[2,4] In Asia (especially China), chronic HBV infection is the leading etiologic factor of HCC.[5]

Most HCC patients are diagnosed at an advanced stage, and locoregional treatments (chemoembolization) and surgical treatments are relatively disappointing in terms of overall survival (OS) of patients with advanced disease.[6] In addition, traditional chemotherapies have not shown promising outcomes in treatment of HCC and have significant toxicity.[6,7] Meanwhile, the lack of diagnostic markers for early detection and the limited treatment strategies increase the risk of poor prognosis and death.[8] Therefore, there is a pressing need to develop robust diagnostic strategies and effective therapies for HCC patients.[9]

Over the past decades, microarray technology and bioinformatics have been extensively applied to identify the molecular mechanisms of HCC, providing strong research support for the diagnosis, treatment, and prognosis of HCC.[10] Because they can process a large number of datasets quickly, integrated bioinformatics analysis and microarray technology have allowed researchers to comprehensively identify the functions of numerous differentially expressed genes (DEGs) in HCC and to explore the complicated process of HCC occurrence and development.[10,11] He et al.[12] identified four hub genes and two important pathways in the development of HCC from cirrhosis in one Gene Expression Omnibus (GEO) dataset using a bioinformatics method that included DEG screening, enrichment analyses, and construction of a protein–protein interaction (PPI) network. Zhang et al.[13] screened hub genes and pathways correlated with the occurrence and progression of HCC via a series of bioinformatics analyses incorporating DEG identification, functional enrichment analyses, PPI network and module analysis, and weighted correlation network analysis. Zhou et al.[11] identified the pivotal genes and microRNAs in HCC using a bioinformatics approach that included analysis of raw data via GEO2R, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses, and construction of a PPI network. However, to improve the diagnosis and treatment of HCC, novel diagnostic and prognostic biomarkers for HCC are still needed. The flowchart of the study approach is shown in Figure 1.
Figure 1.

Flowchart for identification of core genes and pathways for hepatocellular carcinoma (HCC). GEO, Gene Expression Omnibus; DEG, differentially expressed gene; GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes; PPI, protein–protein interaction; GEPIA, Gene Expression Profiling Interactive Analysis.


Materials and methods

Ethical approval

Ethical approval was not required in this study because we analyzed only published data from the GEO database.

Gene expression profile data

Gene expression profile data (GSE36376,[14] GSE39791,[15] GSE41804,[16] GSE54236,[17,18] GSE57957,[19] GSE62232,[20] GSE64041,[21] GSE69715,[22] GSE76427,[23] GSE84005, GSE87630,[24] GSE112790,[25] and GSE121248[26]) were downloaded from the GEO database (http://www.ncbi.nlm.nih.gov/geo/),[27] a public repository of high-throughput gene expression and other functional genomics datasets. The selection criteria for the included datasets were as follows: (1) tissue samples collected from human HCC and corresponding adjacent or normal tissues; and (2) at least 40 samples per dataset.

Integrated analysis of microarray datasets

The expression matrix of each GEO dataset was normalized and log2 transformed using the R package limma,[28] which was also used to identify DEGs between HCC and corresponding adjacent or normal tissues. DEGs identified from the 13 datasets were integrated using the RobustRankAggreg package[29] in R. Genes with |log2 fold change (FC)| ≥ 1 and an adjusted P-value < 0.05 were considered significant DEGs.
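
The significance filter described above can be sketched as follows. This is a minimal illustration in Python rather than the limma workflow used in the study; the function name `filter_degs`, the input layout, and the example values are hypothetical, and only the two cutoffs come from the text.

```python
from typing import Dict, List, Tuple

# Thresholds stated in the text: |log2 FC| >= 1 and adjusted P-value < 0.05.
LOG2FC_CUTOFF = 1.0
ADJ_P_CUTOFF = 0.05


def filter_degs(
    results: Dict[str, Tuple[float, float]],
) -> List[Tuple[str, float, str]]:
    """Keep genes passing both thresholds and label their direction.

    `results` maps a gene symbol to (log2 fold change, adjusted P-value).
    """
    degs = []
    for gene, (log2fc, adj_p) in results.items():
        if abs(log2fc) >= LOG2FC_CUTOFF and adj_p < ADJ_P_CUTOFF:
            direction = "up" if log2fc > 0 else "down"
            degs.append((gene, log2fc, direction))
    # Sort from most downregulated to most upregulated.
    return sorted(degs, key=lambda row: row[1])


# Hypothetical values for illustration only (not taken from the study).
example = {
    "GENE_A": (2.3, 0.001),   # passes, upregulated
    "GENE_B": (-1.8, 0.010),  # passes, downregulated
    "GENE_C": (0.4, 0.001),   # fails the fold-change cutoff
    "GENE_D": (1.5, 0.200),   # fails the adjusted P-value cutoff
}
print(filter_degs(example))
```

Both conditions must hold at once, so a gene with a large fold change but a non-significant adjusted P-value is excluded, as is a significant gene with a small fold change.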

Enrichment analyses of DEGs

The Database for Annotation, Visualization and Integrated Discovery (DAVID; https://david.ncifcrf.gov/, version 6.8)[30] is a comprehensive functional annotation tool for extracting biological meaning from large gene/protein datasets. In this study, GO and KEGG pathway enrichment analyses of the DEGs were conducted via DAVID. The enrichment results were visualized using the ggplot2[31] and GOplot[32] packages in R.
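
To make the idea of term enrichment concrete, the sketch below computes a one-sided hypergeometric P-value for a single term and applies a Benjamini-Hochberg adjustment across terms. It is an illustration only: DAVID uses its own modified Fisher exact test (the EASE score), which is not reproduced here, and the function names and numbers are hypothetical.

```python
from math import comb
from typing import List


def hypergeom_pvalue(k: int, n: int, K: int, N: int) -> float:
    """P(X >= k): chance of seeing at least k term genes among n DEGs,
    when K of the N background genes are annotated with the term."""
    total = comb(N, n)
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical counts: 12 of 300 DEGs fall in a term covering
# 60 of 20,000 background genes.
p = hypergeom_pvalue(k=12, n=300, K=60, N=20000)
print(p, benjamini_hochberg([p, 0.2, 0.03]))
```

Terms whose adjusted value falls below the chosen cutoff (FDR < 0.05 in this study) would be reported as enriched.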

PPI network and module analysis

The Search Tool for the Retrieval of Interacting Genes/Proteins (STRING; https://string-db.org/)[33] is a database of known and predicted protein interactions, covering both direct and indirect interactions among proteins. This database was used to obtain potential interactions among the DEGs. PPIs with a confidence score ≥0.7 were retained and imported into Cytoscape software[34] to construct the PPI network. Clustering modules in this PPI network were then analyzed using the MCODE (Molecular Complex Detection) plugin in Cytoscape,[35] and pathway enrichment analyses were carried out for the important modules. The enrichment results were visualized using the imageGP platform (http://www.ehbio.com/ImageGP/index.php/Home/Index/GOenrichmentplot.html).
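
The edge filtering and degree calculation behind a PPI network of this kind can be sketched as below. This is a hedged illustration, not the STRING or Cytoscape implementation: the edge-list layout, the function names, and the protein labels and scores are hypothetical, and only the 0.7 cutoff comes from the text.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# (protein A, protein B, combined confidence score between 0 and 1)
Edge = Tuple[str, str, float]


def filter_edges(edges: Iterable[Edge], min_score: float = 0.7) -> List[Edge]:
    """Keep interactions whose confidence score is at least `min_score`."""
    return [edge for edge in edges if edge[2] >= min_score]


def node_degrees(edges: Iterable[Edge]) -> Dict[str, int]:
    """Count distinct interaction partners per protein (undirected graph)."""
    neighbors: Dict[str, Set[str]] = defaultdict(set)
    for a, b, _score in edges:
        if a != b:  # ignore self-loops
            neighbors[a].add(b)
            neighbors[b].add(a)
    return {node: len(partners) for node, partners in neighbors.items()}


# Hypothetical interactions for illustration only.
raw_edges = [
    ("P1", "P2", 0.95),
    ("P1", "P3", 0.80),
    ("P2", "P3", 0.65),  # dropped: below the 0.7 cutoff
    ("P1", "P4", 0.70),
]
kept = filter_edges(raw_edges)
print(node_degrees(kept))
```

Nodes with the highest degree in the filtered network are the natural candidates for hub genes.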

Survival analysis of hub genes

Kaplan–Meier plotter (KM plotter; http://kmplot.com/analysis/) is a database containing clinical data and gene expression data.[36] It is used to further the understanding of the molecular basis of disease and to identify biomarkers associated with survival.[37] The recurrence-free survival and OS information is based on the GEO, European Genome-phenome Archive (EGA), and The Cancer Genome Atlas (TCGA) databases. Hazard ratios (HR) with 95% confidence intervals and log-rank P-values were calculated to assess the association of gene expression with survival and are shown in the plots.[38]
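
For readers unfamiliar with the survival curves produced by this kind of tool, the following is a minimal sketch of the Kaplan–Meier product-limit estimator. It is not the KM plotter code, and it does not compute hazard ratios or log-rank P-values; the function name and the follow-up data are hypothetical.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    events[i] is 1 if death was observed at times[i] and 0 if the
    patient was censored (lost to follow-up or still alive) at that time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Hypothetical follow-up times (months) and event indicators.
months = [5, 8, 8, 12, 15]
died = [1, 1, 0, 1, 0]
for t, s in kaplan_meier(months, died):
    print(t, round(s, 3))
```

In a high- versus low-expression comparison, one such curve would be estimated per group and the two curves compared with a log-rank test.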

Expression level analysis and correlation analysis of hub genes

Gene Expression Profiling Interactive Analysis (GEPIA; http://gepia.cancer-pku.cn/index.html)[39] is a recently developed web-based tool that applies a standard processing pipeline to analyze gene expression data from tumor and normal tissues. Expression of the hub genes in HCC and normal tissues was visualized using boxplots.[38] In addition, correlation analysis was performed in GEPIA to assess pairwise expression correlations between genes.[39]
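
A pairwise expression correlation of the kind described above can be sketched as follows. The text does not state which coefficient was used, so the choice of the Pearson coefficient here is an assumption; the function name and the expression values are hypothetical.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two expression vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / sqrt(var_x * var_y)


# Hypothetical log-scale expression of two genes across six samples.
gene_1 = [2.1, 3.4, 4.0, 5.2, 6.1, 6.8]
gene_2 = [1.9, 3.0, 4.4, 5.0, 5.9, 7.1]
print(round(pearson_r(gene_1, gene_2), 3))
```

A coefficient close to 1 indicates that the two genes tend to rise and fall together across samples, which is the pattern reported for CDK1 and the other hub genes.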

Results

Identification of DEGs

In the present study, 13 datasets were downloaded from GEO that included 1100 cancer tissues and 717 corresponding adjacent or normal tissues (Table 1). After integrated analysis, 380 DEGs (293 downregulated and 87 upregulated) were identified (Figure 2a-m and Appendix). Figure 2n shows the top 20 down- and upregulated genes.
Table 1.

Information for the 13 Gene Expression Omnibus datasets included in the current study.

Dataset | Platform | Number of samples (tumor/control)
GSE36376 | GPL10558-Illumina HumanHT-12 V4.0 expression beadchip | 433 (240/193)
GSE39791 | GPL10558-Illumina HumanHT-12 V4.0 expression beadchip | 144 (72/72)
GSE41804 | GPL570-[HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | 40 (20/20)
GSE54236 | GPL6480-Agilent-014850 Whole Human Genome Microarray 4x44K G4112F (Probe Name version) | 161 (81/80)
GSE57957 | GPL10558-Illumina HumanHT-12 V4.0 expression beadchip | 78 (39/39)
GSE62232 | GPL570-[HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | 91 (81/10)
GSE64041 | GPL6244-[HuGene-1_0-st] Affymetrix Human Gene 1.0 ST Array [transcript (gene) version] | 125 (60/65)
GSE69715 | GPL570-[HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | 103 (37/66)
GSE76427 | GPL10558-Illumina HumanHT-12 V4.0 expression beadchip | 167 (115/52)
GSE84005 | GPL5175-[HuEx-1_0-st] Affymetrix Human Exon 1.0 ST Array [transcript (gene) version] | 76 (38/38)
GSE87630 | GPL6947-Illumina HumanHT-12 V3.0 expression beadchip | 94 (64/30)
GSE112790 | GPL570-[HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | 198 (183/15)
GSE121248 | GPL570-[HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | 107 (70/37)

Figure 2.

Identification of DEGs. Volcano plots of Gene Expression Omnibus datasets (a) GSE36376, (b) GSE39791, (c) GSE41804, (d) GSE54236, (e) GSE57957, (f) GSE62232, (g) GSE64041, (h) GSE69715, (i) GSE76427, (j) GSE84005, (k) GSE87630, (l) GSE112790, and (m) GSE121248; (n) heat map of DEGs. Blue indicates lower expression levels, red indicates higher expression levels, and white indicates no differential expression. Each column represents one dataset and each row represents one gene. The number in each rectangle represents the normalized gene expression level. The color gradient from blue to red represents the shift from downregulation to upregulation. DEG, differentially expressed gene.


GO and KEGG pathway enrichment analyses of DEGs

To deepen our understanding of the DEGs, we performed GO and KEGG pathway enrichment analyses. Thirty-one significantly enriched GO terms were selected based on a false discovery rate (FDR) < 0.05 (Figure 3a and Appendix). These comprised 13 biological process terms, mainly related to metabolic processes, the P450 pathway, and the oxidation-reduction process; 12 molecular function terms, largely involving multiple enzyme activities, heme binding, iron ion binding, and oxygen binding; and 6 cellular component terms, associated with organelle membrane, extracellular exosome, extracellular region, extracellular space, blood microparticle, and membrane attack complex.
Figure 3.

Enrichment analysis of DEGs. (a) GO enrichment analysis of DEGs, (b) top 5 terms of GO enrichment, and (c) KEGG pathway analysis of DEGs. DEG, differentially expressed gene; GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes; FDR, false discovery rate.

In the KEGG pathway enrichment analyses, nine pathways were selected based on FDR < 0.05. As shown in Figure 3c, the DEGs primarily participated in metabolism-associated pathways, including metabolic pathways, retinol metabolism, and drug metabolism-cytochrome P450.

PPI network establishment and module analysis

The PPI network of DEGs comprised 242 nodes and 1267 interactions (Figure 4a); degree was calculated to identify candidate key nodes. Finally, 11 potential key nodes were identified, the degrees of which were all more than four times the corresponding average values: CDK1, CCNB2, CDC20, CCNB1, TOP2A, CCNA2, MELK, PBK, TPX2, KIF20A, and AURKA (Table 2). Moreover, to determine important clustering modules in the PPI network, module analysis was performed using MCODE, and the two modules with the highest scores (score >10) were obtained (Figure 4b, 4c). The enrichment pathways of module 1 and module 2 are shown in Figure 5. Module 1 was highly associated with cell cycle and oocyte meiosis; module 2 was closely connected to drug metabolism-cytochrome P450, linoleic acid metabolism, chemical carcinogenesis, arachidonic acid metabolism, retinol metabolism, metabolism of xenobiotics by cytochrome P450, and metabolic pathways.
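
The hub criterion stated above can be checked with simple arithmetic. The sketch below assumes that the "average value" is the mean node degree of an undirected network, 2 × interactions / nodes; the text does not give the formula, so this is an assumption. The node, interaction, and degree values are those reported in this section and in Table 2.

```python
# Network size reported in the text.
NODES = 242
INTERACTIONS = 1267

# Assumed definition: each interaction contributes to the degree of two nodes.
average_degree = 2 * INTERACTIONS / NODES   # about 10.47
threshold = 4 * average_degree              # about 41.88

# Degrees of the 11 hub genes as listed in Table 2.
hub_degrees = {
    "CDK1": 47, "CCNB2": 46, "CDC20": 45, "CCNB1": 45,
    "TOP2A": 44, "CCNA2": 44, "MELK": 43, "PBK": 43,
    "TPX2": 43, "KIF20A": 43, "AURKA": 43,
}

hubs_pass = all(degree > threshold for degree in hub_degrees.values())
print(round(average_degree, 2), round(threshold, 2), hubs_pass)
```

Under this assumption, every listed hub gene has a degree above four times the mean degree, which is consistent with the selection rule described in the text.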
Figure 4.

PPI network and hub clustering modules. (a) The PPI network of DEGs, (b) module 1 (MCODE score = 38.769), and (c) module 2 (MCODE score = 10.364). Blue circles represent downregulated genes and red circles represent upregulated genes. PPI, protein–protein interaction; DEG, differentially expressed gene; MCODE, Molecular Complex Detection.

Table 2.

Upregulated hub genes with high degrees.

Gene | Degree | Type | MCODE cluster
CDK1 | 47 | up | Module 1
CCNB2 | 46 | up | Module 1
CDC20 | 45 | up | Module 1
CCNB1 | 45 | up | Module 1
TOP2A | 44 | up | Module 1
CCNA2 | 44 | up | Module 1
MELK | 43 | up | Module 1
PBK | 43 | up | Module 1
TPX2 | 43 | up | Module 1
KIF20A | 43 | up | Module 1
AURKA | 43 | up | Module 1

Figure 5.

Pathway analysis of the two modules with the highest scores. The y-axis shows significantly enriched KEGG pathways, and x-axis shows the two modules. KEGG, Kyoto Encyclopedia of Genes and Genomes; FDR, false discovery rate.


Survival analysis, expression, and correlation analysis of hub genes

Survival analysis of the 11 hub genes was performed using the KM plotter. High expression of CDK1 (HR = 2.15, 95% CI: 1.52–3.06; P = 1.1e−05), CCNB2 (HR = 1.91, 95% CI: 1.28–2.87; P = 0.0013), CDC20 (HR = 2.49, 95% CI: 1.72–3.59; P = 5.1e−07), CCNB1 (HR = 2.34, 95% CI: 1.55–3.54; P = 3.4e−05), TOP2A (HR = 1.99, 95% CI: 1.39–2.86; P = 0.00012), CCNA2 (HR = 1.92, 95% CI: 1.36–2.72; P = 0.00018), MELK (HR = 2.22, 95% CI: 1.5–3.27; P = 3.7e−05), PBK (HR = 2.24, 95% CI: 1.5–3.34; P = 4.8e−05), TPX2 (HR = 2.29, 95% CI: 1.62–3.24; P = 1.4e−06), KIF20A (HR = 2.33, 95% CI: 1.63–3.32; P = 1.8e−06), and AURKA (HR = 1.77, 95% CI: 1.25–2.5; P = 0.0011) was associated with unfavorable OS in HCC patients (Figure 6). Furthermore, GEPIA was used to compare expression levels of the hub genes between HCC and normal tissues, and all 11 hub genes were confirmed to be highly expressed in HCC tissues (Figure 7). Correlations between hub genes were also analyzed using GEPIA, which showed that the 11 hub genes were significantly correlated with each other. Figure 8 shows that increased expression of CDK1 was strongly associated with increased expression of the other 10 hub genes.
Figure 6.

Prognostic roles of 11 hub genes in patients with HCC shown as survival curves. (a) CDK1, (b) CCNB2, (c) CDC20, (d) CCNB1, (e) TOP2A, (f) CCNA2, (g) MELK, (h) PBK, (i) TPX2, (j) KIF20A, and (k) AURKA. HCC, hepatocellular carcinoma; HR, hazard ratio.

Figure 7.

Analysis of expression levels of 11 hub genes in human HCC. The red and gray boxes represent cancer and normal tissues, respectively. (a) CDK1, (b) CCNB2, (c) CDC20, (d) CCNB1, (e) TOP2A, (f) CCNA2, (g) MELK, (h) PBK, (i) TPX2, (j) KIF20A, and (k) AURKA. HCC, hepatocellular carcinoma; LIHC, liver hepatocellular carcinoma.

Figure 8.

Correlation analysis of 10 hub genes in HCC with CDK1: (a) CCNB2, (b) CDC20, (c) CCNB1, (d) TOP2A, (e) CCNA2, (f) MELK, (g) PBK, (h) TPX2, (i) KIF20A, and (j) AURKA. HCC, hepatocellular carcinoma.


Discussion

HCC is the most common type of primary liver cancer and one of the leading causes of cancer-related mortality worldwide.[40,41] Although HCC has been studied extensively, its early diagnosis and treatment remain difficult because the molecular mechanisms underlying its occurrence and development are poorly understood.[41] Therefore, in-depth studies of the etiological factors and molecular mechanisms of HCC are of critical importance for its diagnosis and treatment.[11] Bioinformatics analysis and microarray technology are developing rapidly, and together they can be used to identify candidate targets for the diagnosis, therapy, and prognosis of a variety of neoplasms.[42] In this study, we identified 380 DEGs (293 downregulated and 87 upregulated) between HCC and corresponding adjacent or normal tissues. Enrichment analyses indicated that the DEGs were mostly associated with metabolic processes, such as metabolism of retinol, drugs, xenobiotics, tyrosine, tryptophan, and histidine, as well as fatty acid degradation, suggesting that metabolic dysregulation is closely related to HCC. In addition, we obtained 11 hub genes (CDK1, CCNB2, CDC20, CCNB1, TOP2A, CCNA2, MELK, PBK, TPX2, KIF20A, and AURKA) in the PPI network, all of which were upregulated in HCC tissues compared with normal tissues; expression of the highest-degree hub gene, CDK1, was strongly correlated with that of the other hub genes. Overexpression of each of the 11 hub genes was correlated with worse OS.
Recent evidence implies that tumor cells need specific interphase cyclin-dependent kinases (CDKs) to proliferate.[43] Cyclin-dependent kinase 1 (CDK1) belongs to the CDK family of serine/threonine protein kinases and is crucial for the G1/S and G2/M cell cycle phase transitions.[44,45] CDK1 is required for cell proliferation because it is the only CDK that can initiate mitosis.[46] Deregulation of CDK1 is likely related to HCC tumorigenesis,[47] and high expression of CDK1 has been found to correlate with poor OS in HCC.[45]

Cyclins act as the regulatory subunits of the CDKs, regulating temporal transitions among the stages of the cell cycle via CDK activation.[48] Cyclin-A2 (CCNA2), cyclin-B1 (CCNB1), and cyclin-B2 (CCNB2), encoded by the CCNA2, CCNB1, and CCNB2 genes, respectively, all belong to the cyclin family. CCNA2 activates CDK1 at the end of interphase to facilitate the onset of mitosis, and CCNA2 overexpression has been reported in numerous types of cancers.[49] A previous study indicated that CCNA2 amplification and overexpression are associated with carcinogenesis of transgenic mouse liver tumors.[50] Moreover, research has demonstrated that inhibition of CCNA2 can arrest HCC cell proliferation and tumorigenesis.[51] High expression of CCNA2 is associated with reduced survival in patients with breast cancer and HCC.[52,53] CCNB1 and CCNB2 are the principal activators of CDK1 and, together with CDK1, they promote the G2/M transition.[54,55] Expression of CCNB1 changes periodically throughout the cell cycle, and CCNB1 is a crucial initiator of mitosis.[56] Decreased CCNB1 expression is related to inhibition of HCC occurrence and development, and activation of CCNB1 expression can promote proliferation in human HCC cells.[56,57] Furthermore, previous research has shown that CCNB1 is closely connected to the prognosis of HCC patients.[56,58] The dimerization of CCNB2 with CDK1 is an essential component of the cell cycle regulatory machinery, and an increase in expression of CCNB2/CDK1 could promote tumor cell proliferation.[55] Furthermore, CCNB2 is highly expressed in several malignant tumors, and overexpression of CCNB2 is related to poor prognosis in HCC.[59]

Cell division cycle protein 20 (encoded by CDC20) is a regulator of cell cycle checkpoints that plays a crucial role in anaphase initiation and exit from mitosis.[60,61] It degrades several important substrates, including cyclin A and CCNB1, to regulate cell cycle progression.[62,63] CDC20 overexpression is related to progression and poor prognosis in various malignant tumors.[64-67] Thus, it is a potential target in multiple cancer treatments.[68] A recent study found that increased expression of CDC20 is related to HCC development and progression.[67] Additionally, research has indicated that silencing expression of CDC20 and heparanase can activate cell apoptosis; thus, targeted inhibition of both CDC20 and heparanase expression is an ideal approach for the treatment of HCC.[69]

Aurora kinase A (encoded by AURKA) is involved in centrosome duplication, spindle formation, chromosomal amplification and segregation, and cytokinesis, and it plays a significant role in centrosome maturation and mitotic commitment in the late G2 phase.[70,71] Abnormal activity of AURKA promotes tumorigenic progression and transformation via defective control at the mitotic spindle checkpoint.[72] Meanwhile, AURKA is highly expressed in a variety of human cancers, including breast cancer,[73] lung cancer,[74] gastrointestinal cancer,[75] bladder cancer,[76] and oral cancer.[77] One study demonstrated that genetic variations in AURKA might be a reliable predictor of early-stage HCC and a crucial biomarker for HCC development.[78] Moreover, other research has indicated that AURKA contributes to the metastasis and invasiveness of HCC.[79] Therefore, AURKA might represent a new therapeutic target for HCC.

Topoisomerase II alpha (TOP2A), a potential biomarker for cancer therapy, has been detected in various types of cancer.[80-82] It participates in chromosome condensation and chromatid separation.[80] TOP2A encodes topoisomerase II alpha[81] and is reported to be overexpressed in HCC tissues.[83] Furthermore, a study has shown that TOP2A has prognostic value in HCC and that agents targeting it can be used in HCC therapy.[84]

Maternal embryonic leucine zipper kinase (encoded by MELK) is a member of the AMP-activated protein kinase (AMPK)-related family of serine/threonine kinases, which affect many stages of tumorigenesis.[85] Several studies have shown that MELK is an oncogenic kinase essential for early HCC recurrence, and its expression is upregulated in HCC.[86-88] Furthermore, MELK inhibition is associated with suppression of tumor growth, indicating that MELK is a potential therapeutic target for HCC.[89]

PDZ-binding kinase (encoded by PBK) can regulate cell cycle processes.[90] Although PBK is barely detectable in normal somatic tissues, it is often elevated in various tumor tissues and is therefore an important target for cancer screening and targeted therapy.[91,92] Recent research has shown that PBK overexpression promotes migration and invasion of HCC, and it could be a therapeutic target for HCC metastasis.[93]

Expression of targeting protein for Xklp2 (TPX2) is modulated by the cell cycle; it is detected in G1/S phase and disappears after cytokinesis.[94,95] Several studies have indicated that TPX2 is highly expressed in different types of cancers.[96,97] Additionally, expression of TPX2 is related to proliferation and apoptosis in HCC,[98] and TPX2 overexpression promotes the growth and metastasis of HCC.[99]

Kinesin family member 20A (KIF20A) is required during mitosis for the final step of cytokinesis.[100,101] Studies have found that high expression of KIF20A is correlated with progression or poor prognosis of many types of cancers.[102-104] Furthermore, KIF20A is aberrantly expressed in HCC tissues and its expression may be associated with poor OS.[105]

According to the enrichment analyses of the two modules, module 1 was mostly associated with the cell cycle and module 2 was closely related to metabolic pathways. Furthermore, all 11 hub genes belonged to module 1, and most are associated with the cell cycle and enriched in the “cell cycle” pathway. A number of studies have shown that cell cycle disorders are closely related to human cancer.[43] Therefore, the carcinogenesis and progression of HCC may be associated with the cell cycle pathway, and it might be possible to suppress HCC cell cycle progression, inhibit HCC cell proliferation, and reduce HCC malignancy by downregulating expression of the 11 hub genes identified herein.

Compared with previous studies, this work has several advantages. First, the integrated microarray analysis used a relatively large sample size drawn from several GEO datasets (GSE36376,[14] GSE39791,[15] GSE41804,[16] GSE54236,[17,18] GSE57957,[19] GSE62232,[20] GSE64041,[21] GSE69715,[22] GSE76427,[23] GSE84005, GSE87630,[24] GSE112790,[25] and GSE121248[26]). Second, functional enrichment analyses were performed to identify the main biological functions and pathways regulated by the DEGs. Third, we established PPI networks, conducted module analysis, discovered potential biomarkers for the diagnosis and prognosis of HCC, and performed correlation analysis of hub genes.

This work also has limitations. First, our results need to be verified by corresponding experimental studies. Second, we obtained data from the GEO database, and data quality cannot be verified. Finally, our study focused on genes that changed significantly across diverse datasets, without regard to sex, age, or the grading and staging of the tumors from which the samples were derived.

Conclusion

In the present work, we identified 11 hub genes (CDK1, CCNB2, CDC20, CCNB1, TOP2A, CCNA2, MELK, PBK, TPX2, KIF20A, and AURKA) associated with the development and poor prognosis of HCC by integrated bioinformatics analysis. However, because our study was based on data analysis only, further experiments are required to confirm the results. Our findings may provide evidence and new insights to enhance approaches for the early diagnosis, prognosis, and treatment of HCC.

Appendix

Name | logFC | Type | Name | logFC | Type | Name | logFC | Type
CLEC1B | −3.33713 | down | IL13RA2 | −1.41685 | down | CSRNP1 | −1.20759 | down
C9 | −2.93972 | down | PAMR1 | −1.30729 | down | ZGPAT | −1.283655 | down
FCN3 | −3.32589 | down | CYP26A1 | −1.82557 | down | FAM150B | −1.096361 | down
CYP1A2 | −3.61576 | down | JCHAIN | −1.90133 | down | LPA | −1.568535 | down
HAMP | −3.72675 | down | ADIRF | −1.34189 | down | ALPL | −1.135143 | down
SLCO1B3 | −2.84405 | down | NNMT | −1.65555 | down | S100A8 | −1.149369 | down
SPP2 | −2.19217 | down | TAT | −1.77239 | down | GPM6A | −1.287388 | down
APOF | −2.7681 | down | MS4A6A | −1.02381 | down | RCL1 | −1.112209 | down
NAT2 | −2.42415 | down | VNN1 | −1.43431 | down | CYP2B7P | −1.31568 | down
CLRN3 | −2.35658 | down | HSD17B2 | −1.27883 | down | CCBE1 | −1.131678 | down
RDH16 | −2.05491 | down | FAM134B | −1.27241 | down | LINC01093 | −1.711116 | down
SLC25A47 | −2.3928 | down | CTH | −1.2995 | down | ST3GAL6 | −1.008844 | down
SLC22A1 | −2.49578 | down | ACAA1 | −1.06823 | down | TBX15 | −1.105089 | down
THRSP | −2.37999 | down | OTC | −1.12724 | down | BCO2 | −1.572843 | down
CLEC4G | −2.8104 | down | CYP2A7 | −1.7189 | down | LUM | −1.123456 | down
GBA3 | −2.26827 | down | C6 | −1.48624 | down | ESR1 | −1.022446 | down
DNASE1L3 | −2.22313 | down | GREM2 | −1.17719 | down | CYR61 | −1.101151 | down
SHBG | −1.96811 | down | HPD | −1.56635 | down | HBA2 | −1.227362 | down
LY6E | −2.01561 | down | KBTBD11 | −1.69651 | down | KDM8 | −1.06201 | down
CDHR2 | −2.02873 | down | CA2 | −1.30707 | down | GADD45G | −1.126764 | down
TMEM27 | −2.33949 | down | AKR7A3 | −1.25278 | down | ASPG | −1.055061 | down
C7 | −2.2597 | down | RNF125 | −1.03098 | down | FCGR2B | −1.141195 | down
FBP1 | −1.79884 | down | TTC36 | −1.69649 | down | ASPA | −1.025006 | down
SRD5A2 | −1.89056 | down | PROM1 | −1.44661 | down | PBLD | −1.006234 | down
MT1M | −3.02758 | down | ADH6 | −1.22168 | down | HHIP | −1.37843 | down
BBOX1 | −2.04999 | down | ETNPPL | −1.15368 | down | CRP | −1.053533 | down
APOA5 | −1.774 | down | HSD17B13 | −1.50866 | down | FREM2 | −1.522232 | down
IGFBP3 | −1.70456 | down | ANXA10 | −1.62516 | down | ADRA1A | −1.161964 | down
ADH4 | −2.15911 | down | FXYD1 | −1.41243 | down | CNTN3 | −1.176196 | down
KMO | −1.91086 | down | OGDHL | −1.30838 | down | ITLN1 | −1.034492 | down
CYP8B1 | −1.76864 | down | PON1 | −1.17061 | down | UGT2B10 | −1.031179 | down
CXCL14 | −2.31161 | down | ACSM3 | −1.52866 | down | DIRAS3 | −1.123875 | down
GHR | −2.12511 | down | SLC27A5 | −1.33347 | down | STEAP4 | −1.061309 | down
ADGRG7 | −1.85853 | down | LIFR | −1.47372 | down | CYP4A22 | −1.074568 | down
MARCO | −2.25079 | down | HABP2 | −1.06311 | down | TFPI2 | −1.00071 | down
MT1F | −2.59948 | down | GRAMD1C | −1.07675 | down | MT1A | −1.093671 | down
CYP39A1 | −1.86139 | down | TKFC | −1.07859 | down | RAB25 | −1.081375 | down
OIT3 | −2.4803 | down | STEAP3 | −1.09586 | down | RDH5 | −1.006888 | down
MBL2 | −1.62953 | down | IL1RAP | −1.21549 | down | EPCAM | −1.336797 | down
VIPR1 | −1.89347 | down | GCDH | −1.02343 | down | SPINK1 | 3.633978 | up
TDO2 | −1.44452 | down | HAL | −1.262 | down | GPC3 | 2.807155 | up
BHMT | −1.68706 | down | GABARAPL1 | −1.07919 | down | AKR1B10 | 2.588879 | up
PCK1 | −1.85362 | down | ID1 | −1.32236 | down | ASPM | 1.804629 | up
MT1H | −2.20509 | down | INMT | −1.65209 | down | CAP2 | 2.086341 | up
AFM | −1.90272 | down | SKAP1 | −1.06342 | down | TOP2A | 2.232845 | up
HGFAC | −2.18902 | down | FETUB | −1.31249 | down | PRC1 | 1.923672 | up
MT1G | −2.64319 | down | CFHR4 | −1.07478 | down | CDKN3 | 1.778794 | up
CYP2A6 | −2.05548 | down | HSD11B1 | −1.27605 | down | CDC20 | 1.910919 | up
CETP | −1.77384 | down | G6PC | −1.00804 | down | PTTG1 | 1.451774 | up
SMIM24 | −1.81333 | down | MFAP4 | −1.53268 | down | NCAPG | 1.551838 | up
FCN2 | −1.90705 | down | ABCA8 | −1.10284 | down | LCN2 | 1.551605 | up
FOSB | −2.12211 | down | CYP2J2 | −1.03103 | down | CCL20 | 1.667526 | up
ECM1 | −1.72876 | down | AKR1D1 | −1.77452 | down | FAM83D | 1.570755 | up
MT1X | −2.07498 | down | GPD1 | −1.01057 | down | KIF20A | 1.644679 | up
SLC10A1 | −1.70131 | down | HAO1 | −1.0889 | down | PBK | 1.6372 | up
CRHBP | −2.55698 | down | TACSTD2 | −1.09909 | down | AURKA | 1.321582 | up
F9 | −1.86997 | down | GCGR | −1.51767 | down | UBE2T | 1.429052 | up
SRPX | −1.99247 | down | C8orf4 | −1.53773 | down | NUSAP1 | 1.447842 | up
CYP2C9 | −1.7781 | down | DMGDH | −1.11277 | down | AKR1C3 | 1.315793 | up
GNMT | −1.80416 | down | PON3 | −1.07722 | down | MELK | 1.397481 | up
CYP2C8 | −1.84304 | down | MAT1A | −1.15605 | down | SRXN1 | 1.101781 | up
PGLYRP2 | −1.57039 | down | AADAT | −1.45288 | down | HMMR | 1.429779 | up
LECT2 | −1.71324 | down | HPX | −1.1201 | down | COL15A1 | 1.679907 | up
HAO2 | −2.05962 | down | KCNN2 | −1.76035 | down | UBD | 1.793116 | up
FOS | −2.10062 | down | ACADL | −1.16219 | down | PLVAP | 1.303945 | up
ANGPTL6 | −1.40198 | down | SLC13A5 | −1.18455 | down | HSPB1 | 1.057592 | up
CNDP1 | −2.19859 | down | ASS1 | −1.22714 | down | SPP1 | 1.372928 | up
CXCL12 | −1.91941 | down | PRSS8 | −1.15745 | down | CENPF | 1.339564 | up
AGXT2 | −1.39193 | down | CPED1 | −1.24941 | down | SQLE | 1.28364 | up
ACOT12 | −1.27878 | down | FTCD | −1.25547 | down | CEP55 | 1.130246 | up
RSPO3 | −1.62341 | down | TMEM45A | −1.37559 | down | KIF4A | 1.431933 | up
PZP | −1.76877 | down | ALDH6A1 | −1.08996 | down | TRIP13 | 1.223148 | up
COLEC10 | −1.85319 | down | SLC27A2 | −1.02491 | down | S100P | 1.428178 | up
HOGA1 | −1.43807 | down | ETFDH | −1.15312 | down | DLGAP5 | 1.462148 | up
MT1E | −1.80442 | down | GCKR | −1.00475 | down | ALDH3A1 | 1.048498 | up
CYP3A4 | −2.39818 | down | OAT | −1.35234 | down | CDCA5 | 1.222277 | up
SLC39A5 | −1.47867 | down | SFRP5 | −1.04433 | down | SFN | 1.002947 | up
KLKB1 | −1.57229 | down | CYP3A43 | −1.2044 | down | ESM1 | 1.15394 | up
LCAT | −1.87391 | down | SLC6A12 | −1.11241 | down | TTK | 1.378481 | up
IGFALS | −1.94508 | down | SOCS2 | −1.38986 | down | TPX2 | 1.091732 | up
GLYAT | −1.72131 | down | CYP4F2 | −1.0376 | down | PAGE4 | 1.240802 | up
ADH1C | −1.64914 | down | PHYHD1 | −1.0017 | down | COL4A1 | 1.236208 | up
PROZ | −1.52487 | down | SLC7A2 | −1.05182 | down | HJURP | 1.034534 | up
CYP2E1 | −2.04247 | down | C1RL | −1.01827 | down | RACGAP1 | 1.407851 | up
GSTZ1 −1.39923down PLG −1.09969down IGF2BP3 1.019851up
CHST4 −1.72521down CPS1 −1.29626down ANLN 1.53779up
MFSD2A −1.51912down ADAMTSL2 −1.24169down MCM2 1.109517up
IDO2 −1.83679down MTTP −1.02368down UBE2C 1.0809up
SDS −1.75694down CXCL2 −1.43349down NQO1 1.365462up
ENO3 −1.37195down HRG −1.00696down CCNB2 1.303069up
GLS2 −1.75439down ACSL1 −1.14524down CCNA2 1.185444up
DCN −1.94676down MAN1C1 −1.18965down MUC13 1.14796up
PLAC8 −1.80012down PCOLCE −1.00609down MCM6 1.016314up
SERPINA4 −1.2352down MT2A −1.54319down CENPW 1.083208up
ZG16 −1.56869down CD1D −1.02692down TGM3 1.050965up
BCHE −1.77407down XDH −1.11927down RAD51AP1 1.049223up
CFP −1.47416down PPP1R1A −1.10299down THY1 1.046852up
SLC38A4 −1.32606down HBB −1.31952down NUF2 1.25884up
ADH1A −1.27277down RBP5 −1.04885down CKAP2L 1.054397up
CLEC4M −2.35545down CFHR3 −1.10107down MAGEA1 1.282995up
CYP4A11 −1.5036down RELN −1.02856down ECT2 1.065576up
GYS2 −1.66608down NPY1R −1.34248down ACSL4 1.16679up
PHGDH −1.40019down CLDN10 −1.34641down MDK 1.076885up
BGN −1.2236down ATF5 −1.11652down PEG10 1.104051up
CIDEB −1.27052down GNE −1.04957down COX7B2 1.333566up
CYP2C19 −1.55814down CYP4V2 −1.05634down CCNB1 1.362239up
IYD −1.22582down CD5L −1.49237down RRM2 1.542665up
C8A −1.49471down TIMD4 −1.24178down REG3A 1.140254up
STAB2 −1.82665down EGR1 −1.41173down CDK1 1.236442up
CDA −1.14527down GADD45B −1.21416down KIF14 1.054151up
HPGD −1.37821down GPT2 −1.15763down ZIC2 1.320155up
OLFML3 −1.38115down ACMSD −1.02364down BUB1B 1.118801up
PTH1R −1.35746down CCL19 −1.32425down NDC80 1.234218up
EPHX2 −1.29488down RBP1 −1.15142down NEK2 1.144213up
COLEC11 −1.34767down ACADS −1.05741down RBM24 1.220962up
CYP2C18 −1.21134down MYOM2 −1.03989down NMRAL1P1 1.314053up
AMDHD1 −1.14346down DCXR −1.01852down DTL 1.283296up
LYVE1 −1.69466down PLGLB1 −1.07364down SULT1C2 1.181554up
GSPT2 −1.16851down CYP2B6 −1.37318down ROBO1 1.247873up
C8B −1.16715down UROC1 −1.06129down SSX1 1.001365up
ADH1B −1.77846down PDK4 −1.08546down FLVCR1 1.006476up
DPT −1.68413down PPARGC1A −1.08395down CTHRC1 1.120384up
AZGP1 −1.23501down NDRG2 −1.01145down ZWINT 1.066653up
ALDH8A1 −1.37768down IGF1 −1.14785down GINS1 1.03249up
RND3 −1.62821down ASPDH −1.15589down SMPX 1.089408up
SLC19A3 −1.18742down DBH −1.50296down GPR158 1.061576up
WDR72 −1.27875down PRG4 −1.13337down

FC, fold change.
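
To make the logFC column easier to read, the following minimal Python sketch labels a gene by the sign of its logFC and converts the value to a linear ratio. It assumes that logFC is a base-2 logarithm of the HCC-to-normal expression ratio, which the table itself does not state; the function names `direction` and `linear_fold_change` are hypothetical and used only for illustration.

```python
# Illustrative sketch only. ASSUMPTION: logFC is log2(HCC / normal);
# the table does not state the base of the logarithm.

def direction(log_fc: float) -> str:
    """Label a gene as up- or downregulated from the sign of its logFC."""
    return "up" if log_fc > 0 else "down"


def linear_fold_change(log_fc: float) -> float:
    """Convert a log2 fold change to a linear expression ratio."""
    return 2.0 ** log_fc


# Three rows copied from the table above.
rows = {"SPINK1": 3.633978, "MT1M": -3.02758, "CDK1": 1.236442}

for gene, log_fc in rows.items():
    print(f"{gene}: {direction(log_fc)}, ratio {linear_fold_change(log_fc):.2f}")
```

Under that assumption, a logFC of 1 corresponds to a two-fold increase and a logFC of −1 to a two-fold decrease, so every gene listed changes by at least a factor of two.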

Category | ID | Term | −log10(FDR) | Count
BP | GO:0055114 | Oxidation-reduction process | 16.45646128 | 56
BP | GO:0019373 | Epoxygenase P450 pathway | 12.72414085 | 13
BP | GO:0006805 | Xenobiotic metabolic process | 6.801196269 | 16
BP | GO:0017144 | Drug metabolic process | 6.713310124 | 11
BP | GO:0045926 | Negative regulation of growth | 5.354060264 | 9
BP | GO:0071276 | Cellular response to cadmium ion | 4.258416753 | 8
BP | GO:0042738 | Exogenous drug catabolic process | 3.873727759 | 7
BP | GO:0071294 | Cellular response to zinc ion | 3.86110044 | 8
BP | GO:0008202 | Steroid metabolic process | 3.349012692 | 10
BP | GO:0097267 | Omega-hydroxylase P450 pathway | 3.048831706 | 6
BP | GO:0016098 | Monoterpenoid metabolic process | 2.284734835 | 5
BP | GO:0007067 | Mitotic nuclear division | 1.901221899 | 19
BP | GO:0006569 | Tryptophan catabolic process | 1.382839511 | 5
CC | GO:0031090 | Organelle membrane | 12.13504583 | 21
CC | GO:0070062 | Extracellular exosome | 10.96203625 | 117
CC | GO:0005576 | Extracellular region | 8.944226201 | 78
CC | GO:0005615 | Extracellular space | 8.079401711 | 68
CC | GO:0072562 | Blood microparticle | 3.941029653 | 17
CC | GO:0005579 | Membrane attack complex | 2.131478756 | 5
MF | GO:0016705 | Oxidoreductase activity, acting on paired donors, with incorporation or reduction of molecular oxygen | 12.77849851 | 19
MF | GO:0020037 | Heme binding | 11.82105086 | 25
MF | GO:0004497 | Monooxygenase activity | 11.5463498 | 18
MF | GO:0005506 | Iron ion binding | 10.69763162 | 25
MF | GO:0008392 | Arachidonic acid epoxygenase activity | 10.22404973 | 11
MF | GO:0019825 | Oxygen binding | 9.168975245 | 15
MF | GO:0016491 | Oxidoreductase activity | 5.664542324 | 22
MF | GO:0008395 | Steroid hydroxylase activity | 5.613513145 | 10
MF | GO:0070330 | Aromatase activity | 2.805232257 | 8
MF | GO:0004024 | Alcohol dehydrogenase activity, zinc-dependent | 2.38141982 | 5
MF | GO:0016712 | Oxidoreductase activity, acting on paired donors, with incorporation or reduction of molecular oxygen, reduced flavin or flavoprotein as one donor, and incorporation of one atom of oxygen | 1.824019096 | 6
MF | GO:0004745 | Retinol dehydrogenase activity | 1.391280368 | 6

FDR, false discovery rate.
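
Because the table reports −log10(FDR) rather than the FDR itself, it can help to convert a few values back. The sketch below does this for three terms, with values rounded to three decimals. The 0.05 cutoff is an assumption chosen for illustration, not a threshold taken from the table, and the function name `fdr_from_neg_log10` is hypothetical.

```python
# Illustrative sketch only. ASSUMPTION: 0.05 is used as the significance
# cutoff for illustration; the table does not state the cutoff applied.

ALPHA = 0.05


def fdr_from_neg_log10(value: float) -> float:
    """Invert the -log10 transform: FDR = 10 ** (-value)."""
    return 10.0 ** (-value)


# -log10(FDR) values from the table above, rounded to three decimals.
terms = {
    "GO:0055114 Oxidation-reduction process": 16.456,
    "GO:0007067 Mitotic nuclear division": 1.901,
    "GO:0006569 Tryptophan catabolic process": 1.383,
}

for term, neg_log10_fdr in terms.items():
    fdr = fdr_from_neg_log10(neg_log10_fdr)
    print(f"{term}: FDR = {fdr:.3g}, below {ALPHA}: {fdr < ALPHA}")
```

A −log10(FDR) above roughly 1.301 corresponds to an FDR below 0.05, and every term listed in the table exceeds that value.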
