
Transcriptomic analysis and identification of prognostic biomarkers in cholangiocarcinoma.

Hanyu Li¹, Junyu Long¹, Fucun Xie¹, Kai Kang¹, Yue Shi¹, Weiyu Xu¹, Xiaoqian Wu¹, Jianzhen Lin¹, Haifeng Xu¹, Shunda Du¹, Yiyao Xu¹, Haitao Zhao¹, Yongchang Zheng¹, Jin Gu².

Abstract

Cholangiocarcinoma (CCA) is acknowledged as the second most commonly diagnosed primary liver tumor and is associated with a poor patient prognosis. The present study aimed to explore the biological functions, signaling pathways and potential prognostic biomarkers involved in CCA through transcriptomic analysis. Based on the transcriptomic dataset of CCA from The Cancer Genome Atlas (TCGA), differentially expressed protein‑coding genes (DEGs) were identified. Biological function enrichment analysis, including Gene Ontology (GO) analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis, was applied. Through protein‑protein interaction (PPI) network analysis, hub genes were identified and further verified using open‑access datasets and qRT‑PCR. Finally, a survival analysis was conducted. A total of 1,463 DEGs were identified, including 267 upregulated genes and 1,196 downregulated genes. For the GO analysis, the upregulated DEGs were enriched in 'cadherin binding in cell‑cell adhesion', 'extracellular matrix (ECM) organization' and 'cell‑cell adherens junctions'. Correspondingly, the downregulated DEGs were enriched in the 'oxidation‑reduction process', 'extracellular exosomes' and 'blood microparticles'. With regard to the KEGG pathway analysis, the upregulated DEGs were enriched in 'ECM‑receptor interactions', 'focal adhesions' and 'small cell lung cancer'. The downregulated DEGs were enriched in 'metabolic pathways', 'complement and coagulation cascades' and 'biosynthesis of antibiotics'. The PPI network suggested that CDK1 and another 20 genes were hub genes. Furthermore, survival analysis suggested that CDK1, MKI67, TOP2A and PRC1 were significantly associated with patient prognosis. These results enhance the current understanding of CCA development and provide new insight into distinguishing candidate biomarkers for predicting the prognosis of CCA.

Year: 2019 PMID: 31545466 PMCID: PMC6787946 DOI: 10.3892/or.2019.7318

Source DB: PubMed Journal: Oncol Rep ISSN: 1021-335X Impact factor: 3.906

Introduction

Cholangiocarcinoma (CCA) originates from intrahepatic or extrahepatic bile duct epithelial cells and is classified into intrahepatic CCA (iCCA), perihilar CCA (pCCA), and distal CCA (dCCA) according to the anatomic location (1,2). Among CCA tumors, pCCA accounts for 60–70%, dCCA accounts for 20–30% and iCCA accounts for 5–10%. CCA is recognized as the second most commonly diagnosed primary liver tumor and accounts for approximately 1–15% of all hepatobiliary malignancies (3). Although the average incidence of CCA is low, early diagnosis and treatment of CCA are difficult, and the overall patient prognosis is poor (4). Recently, iCCA has become the leading cause of death related to primary liver tumors (5). Systemic drug therapy is currently limited for patients with advanced or metastatic CCA, while surgical treatment is suitable only for those with early-stage CCA, which has a high risk of recurrence (6,7). The median survival time of patients with advanced CCA is less than 2 years, and the 5-year survival rate is only 10% (3,8). Searching for genetic drivers that affect the occurrence and progression of CCA is important for exploring molecular diagnosis and targeted therapy (1). In recent years, biomarker research has achieved progress in the prediction, treatment and prognosis of CCA. For example, KRAS mutations and PRKACB fusion genes have been identified in pCCA and dCCA, and somatic mutations of isocitrate dehydrogenase (IDH) have been identified in iCCA (9). In addition, inducible nitric oxide synthase (iNOS) has been implicated in the occurrence of CCA in an inflammation-dependent manner (10). However, due to strong genetic heterogeneity, the current understanding of the molecular mechanisms of CCA is still not comprehensive. In particular, understanding of the genetic variations that promote CCA initiation and development is still fragmented. Moreover, the key driver genes of carcinogenesis remain unknown (4,11).
Therefore, studying the pathogenesis of CCA and identifying hub genes that are involved in the development of CCA remain a major challenge. The Cancer Genome Atlas (TCGA) is a publicly sponsored project with the purpose of classifying and identifying major carcinogenic genomic alterations among large cohorts of more than 30 human tumor types. To perform an integrated analysis of cancer genome profiles, high-throughput technologies relying on the use of microarrays and next-generation sequencing methods were applied in TCGA (12). RNA sequencing (RNAseq) has become useful for transcriptome (total RNA) profiling and obtaining accurate strand information. RNAseq is conducive to the systematic and comprehensive study of differentially expressed gene interactions and related signaling pathways with high precision. Moreover, protein-protein interaction (PPI) networks are useful for distinguishing hub genes, which are defined as genes with a high degree of connectivity that play an essential role in stabilizing the PPI network structure (13,14). There are numerous oncology studies based on TCGA. From the perspective of CCA, Wang et al thoroughly studied the lncRNA-miRNA-mRNA ceRNA network and identified three lncRNAs, COL18A1-AS1, SLC6A1-AS1 and HULC, as being significantly associated with overall CCA patient survival (15). However, in the present study, we focused on identifying hub genes within the PPI network and exploring their potential roles in CCA on the basis of TCGA combined with multiple datasets. In the present research, transcriptomic iCCA data from TCGA were utilized to identify differentially expressed protein-coding genes (DEGs) between iCCA and normal tissues. Then, we performed Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis to study alterations in biological functions and signaling pathways of iCCA. PPI network construction was performed, followed by identification of hub genes.
Moreover, we identified the differential expression of hub genes by analyzing transcriptomic CCA data from several open-access databases, including Gene Expression Omnibus (GEO) database and ArrayExpress Archive of Functional Genomics Data (ArrayExpress). Further, we performed quantitative polymerase chain reaction (qPCR) in the laboratory to verify these hub genes. Finally, we executed survival analysis of the identified hub genes. The objective of this study was to understand CCA carcinogenesis by exploring the genetic changes involved in disease progression and to identify potential biomarkers that may be helpful for predicting the prognosis of iCCA patients.

Materials and methods

Acquisition of transcriptome data and identification of DEGs

Data for CCA mRNA expression were downloaded from TCGA database (https://portal.gdc.cancer.gov/, RNA-seq, Illumina) on July 6, 2018. Practical Extraction and Reporting Language (Perl) was utilized for sample information extraction, mRNA expression matrix generation, and gene symbol annotation. Only samples with a primary site of liver and intrahepatic bile ducts were included for subsequent analysis. The statistical software R (version 3.4.4) and the ‘DESeq’ package from Bioconductor were used to perform significance analysis of the DEGs between CCA samples and adjacent noncancerous tissues (16,17). Genes with an absolute value of log2 fold change (log2FC) >2 and a corrected P-value <0.0001 were defined as DEGs.
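The DEG criteria above can be illustrated with a minimal Python sketch. It is not the authors' R pipeline; the record type, the function name `classify_deg` and the gene values are hypothetical and only show how the two stated cut-offs (|log2FC| >2 and corrected P<0.0001) combine.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GeneResult:
    symbol: str
    log2fc: float  # log2 fold change, tumor vs. normal
    padj: float    # multiple-testing-corrected P-value


def classify_deg(g: GeneResult, lfc_cutoff: float = 2.0,
                 p_cutoff: float = 1e-4) -> Optional[str]:
    """Return 'up', 'down', or None using the thresholds stated in the text."""
    if g.padj >= p_cutoff or abs(g.log2fc) <= lfc_cutoff:
        return None
    return "up" if g.log2fc > 0 else "down"


# Hypothetical values for illustration only (not results from the study).
results = [
    GeneResult("GENE_A", 3.1, 1e-6),
    GeneResult("GENE_B", -4.2, 5e-9),
    GeneResult("GENE_C", 2.5, 1e-3),  # fails the P-value cut-off
    GeneResult("GENE_D", 1.2, 1e-8),  # fails the fold-change cut-off
]
labels = {g.symbol: classify_deg(g) for g in results}
print(labels)
```

Both conditions must hold at once, so a gene with a large fold change but a weak corrected P-value is not counted as a DEG.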

Functional and pathway enrichment analyses

GO term enrichment analysis, covering biological processes (BP), cellular components (CC) and molecular functions (MF), was applied to analyze the biological significance of the DEGs using the online platform DAVID (https://david.ncifcrf.gov/, date of access: 2019/5/7, species: Human). GO visualization was achieved with the R package ‘GOplot’ (18). KEGG pathway enrichment analysis was also performed with DAVID, and critical pathways enriched in DEGs were identified. Visualization of the KEGG results was conducted with the R package ‘ggplot2’. P<0.05 was considered statistically significant for both GO and KEGG analysis.
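As a rough illustration of what an enrichment P-value and a fold enrichment measure, the sketch below uses a plain one-sided hypergeometric test. This is an assumption made for clarity: DAVID reports a modified Fisher exact statistic (the EASE score), so its values will differ somewhat. The function names and the counts are hypothetical.

```python
from math import comb


def hypergeom_sf(k: int, N: int, K: int, n: int) -> float:
    """P(X >= k): chance of seeing at least k term-annotated genes in a list
    of n genes drawn from a background of N genes, K of which carry the term."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def fold_enrichment(k: int, N: int, K: int, n: int) -> float:
    """Observed fraction of annotated genes in the list over the background fraction."""
    return (k / n) / (K / N)


# Hypothetical toy counts: background of 20 genes, 5 annotated to a term,
# a DEG list of 4 genes of which 2 carry the term.
p_value = hypergeom_sf(2, 20, 5, 4)
fold = fold_enrichment(2, 20, 5, 4)
print(round(p_value, 4), fold)
```

The fold enrichment is the quantity plotted on the x-axis of a dot plot such as Fig. 2C and D, while the P-value (after adjustment) determines the dot color.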

PPI network analysis

Proteins usually perform biological functions synergistically. Strong relationships have been shown to exist between PPIs and the biological functions of gene/protein clusters (19). Therefore, PPIs must be explored by considering functional groups. PPI network analysis is helpful for distinguishing hub genes among a cluster of DEGs that are implicated in CCA progression based on their interaction levels. PPI information of DEGs was obtained from the STRING database, and the highest confidence score of 0.900 was chosen (version 11.0, https://string-db.org/, date of access: 2019/5/7). The PPI network for the upregulated genes was constructed using Cytoscape 3.6.1 software (https://cytoscape.org/). The top 15 genes with the highest degree of connectivity were defined as hub genes.
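The degree-based definition of a hub gene can be sketched as follows. This is a simplified stand-in for the Cytoscape centrality analysis, assuming the network is available as a list of (protein, protein, confidence) tuples; the function name `top_hubs` and the protein labels are hypothetical.

```python
from collections import Counter


def top_hubs(edges, min_score=0.9, top_n=15):
    """Rank nodes by degree after keeping only edges at or above min_score.

    edges: iterable of (node_a, node_b, confidence) tuples.
    Returns up to top_n (node, degree) pairs, highest degree first.
    """
    degree = Counter()
    seen = set()
    for a, b, score in edges:
        pair = frozenset((a, b))
        # Skip low-confidence edges, self-loops and duplicated pairs.
        if score < min_score or len(pair) < 2 or pair in seen:
            continue
        seen.add(pair)
        degree[a] += 1
        degree[b] += 1
    # Ties are broken alphabetically so the ranking is reproducible.
    return sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Hypothetical toy network (not data from the study).
edges = [
    ("P1", "P2", 0.95), ("P1", "P3", 0.99), ("P1", "P4", 0.91),
    ("P2", "P3", 0.92), ("P3", "P4", 0.40), ("P2", "P1", 0.95),
]
hubs = top_hubs(edges, min_score=0.9, top_n=2)
print(hubs)
```

With real data, `min_score=0.9` and `top_n=15` would correspond to the confidence cut-off and hub count stated above.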

Identification of hub genes

CCA-related transcriptomic datasets were obtained from GEO (GSE76297 and GSE26566) and ArrayExpress (E-GEOD-32879 and E-GEOD-45001) (20–24). R (version 3.4.4) and the ‘limma’ package of Bioconductor were used for identification of the DEGs (25). A P-value <0.05 was considered statistically significant. Finally, hub genes in TCGA were compared with the DEGs acquired from the other 4 datasets. Hub genes with similar differential expression among all 5 datasets were selected for further analysis. A Venn diagram indicating the intersection of multiple datasets was made online (http://bioinformatics.psb.ugent.be/cgi-bin/liste/Venn/calculate_venn.htpl).
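The cross-dataset comparison amounts to keeping only those hub genes whose direction of change is reproduced in every validation dataset. A minimal sketch, assuming each dataset has already been reduced to a mapping from gene symbol to 'up' or 'down' (the function name and gene labels are hypothetical):

```python
def consistent_hubs(reference, validations):
    """Keep reference hub genes that are DEGs in the same direction in every
    validation dataset.

    reference:   {gene: 'up' | 'down'} for hub genes in the discovery dataset.
    validations: list of {gene: 'up' | 'down'} dicts, one per validation dataset.
    """
    return {
        gene: direction
        for gene, direction in reference.items()
        if all(v.get(gene) == direction for v in validations)
    }


# Hypothetical toy input (not data from the study).
tcga_hubs = {"G1": "up", "G2": "up", "G3": "down"}
other_datasets = [
    {"G1": "up", "G2": "down", "G3": "down"},  # G2 changes direction here
    {"G1": "up", "G3": "down"},                # G2 is not a DEG here
]
shared = consistent_hubs(tcga_hubs, other_datasets)
print(shared)
```

A gene that is missing from, or changes direction in, any one dataset is dropped, which is the set intersection a Venn diagram would display.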

Statistical analysis

To examine the classification effect of hub genes on cholangiocarcinoma and normal tissue, receiver operating characteristic (ROC) curves and the area under the curve (AUC) were calculated with the R package ‘pROC’ (26). Furthermore, clinical data of iCCA were downloaded from TCGA. We divided patients into two groups based on tumor stage: Stage I+II and stage III+IV. The association between hub gene expression and tumor stage was evaluated by the Mann-Whitney U test. P<0.05 was considered statistically significant. Finally, the ‘survival’ package of Bioconductor was used to generate overall survival curves (27,28). Perl was used to extract the lifetime of each patient from the clinical cart downloaded from TCGA. Clinical data of 118 patients with cholangiocarcinoma were downloaded from PubMed Central (11). Matching sample information was obtained from the GEO dataset GSE89749 (11). For each dataset, the patients were divided into two groups using the median gene expression value as the cut-off value. The relationship between patient overall survival and expression level of hub genes was tested by the Cox proportional-hazards model. P<0.05 was considered statistically significant.
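The AUC used here has a simple interpretation that also links it to the Mann-Whitney U statistic: it is the probability that a randomly chosen tumor sample has a higher expression value than a randomly chosen normal sample, with ties counted as one half. The sketch below computes this empirical AUC directly; it is an illustration rather than the ‘pROC’ implementation, and the expression values are hypothetical.

```python
def empirical_auc(tumor, normal):
    """Fraction of (tumor, normal) pairs in which the tumor value is higher;
    ties count as 0.5. Equals the Mann-Whitney U statistic divided by n1*n2."""
    wins = sum((t > n) + 0.5 * (t == n) for t in tumor for n in normal)
    return wins / (len(tumor) * len(normal))


def direction_free_auc(tumor, normal):
    """AUC for a marker that may be either up- or downregulated in tumors."""
    a = empirical_auc(tumor, normal)
    return max(a, 1.0 - a)


# Hypothetical expression values (not data from the study).
tumor_expr = [5.1, 6.3, 7.0, 4.0]
normal_expr = [3.2, 4.0, 2.8]
auc_value = empirical_auc(tumor_expr, normal_expr)
print(round(auc_value, 3))
```

For a downregulated gene the raw value falls below 0.5, so `direction_free_auc` reports the complement, which is how an AUC >0.900 can be obtained for both upregulated and downregulated hub genes.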

RT-qPCR

Tissue samples were collected as pairs, i.e., tumor tissue and adjacent normal tissue, from 10 patients with iCCA undergoing surgery at Peking Union Medical College Hospital. A total of 6 male and 4 female patients with a mean age of 62 (range, 54–68) years were included. The collected tissue samples were stored in a freezer at −80°C. All patients were enrolled from November 2018 to April 2019. The study was approved by the Clinical Research Ethics Committee of Peking Union Medical College Hospital. Each patient provided written informed consent. Total RNA was isolated from each sample with Trizol LS reagent (Invitrogen; Thermo Fisher Scientific, Inc.), and then used for cDNA synthesis using oligo(dT) primers and SuperScript™ III Reverse Transcriptase (Invitrogen; Thermo Fisher Scientific, Inc.). PCR Master Mix (2X) (Superarray) and the Applied Biosystems QuantStudio5 Real-time PCR system (Thermo Fisher Scientific, Inc.) were utilized for RT-qPCR. The sequences of primers for the selected hub genes and the housekeeping gene (β-actin) are shown in Table SI.

Visualization of differential expression

For hub genes validated by qRT-PCR, R package ‘ggpubr’ (https://rpkgs.datanovia.com/ggpubr/index.html) was used to visualize gene expression based on the expression profile of DEGs in TCGA and the results of qRT-PCR. A t-test was used to calculate differences between groups. P<0.05 was considered statistically significant.

Results

Identification of DEGs in CCA and normal tissues

The transcriptomic dataset of CCA and the corresponding clinical cart were downloaded from the TCGA database. Patients whose lesions had a primary site of liver and intrahepatic bile ducts, i.e., patients with iCCA, were included for further analysis. Therefore, a total of 33 cases were acquired, including 19 female and 14 male patients. Forty-one samples were acquired in total, including 33 tumor tissue samples and 8 normal tissue samples. A total of 1,463 DEGs (|log2FC| >2, corrected P<0.0001) were acquired, including 267 significantly upregulated DEGs and 1,196 significantly downregulated DEGs. A heatmap and a volcano plot showing the expression levels of these genes are shown in Fig. 1.

Figure 1.

Heatmap and volcano plot showing significant DEGs between 33 iCCA tissues and 8 normal tissues in TCGA. (A) Rows represent genes, and columns represent samples. (B) The red spots represent significantly upregulated genes, and the green spots represent significantly downregulated genes. TCGA, The Cancer Genome Atlas; DEGs, differentially expressed protein-coding genes; CCA, cholangiocarcinoma; iCCA, intrahepatic CCA.

Functional and pathway enrichment analyses of DEGs

GO and KEGG pathway enrichment analyses were conducted to explore the functional characteristics of the DEGs. The GO analysis results revealed that the upregulated DEGs were significantly enriched in ‘extracellular matrix organization’, ‘cell-cell adhesion’, ‘cell adhesion’, ‘epithelial cell morphogenesis involved in placental branching’ and ‘mitotic spindle assembly’ in terms of BP. Regarding MF, the upregulated DEGs were enriched in ‘cadherin binding involved in cell-cell adhesion’, ‘structural molecule activity’, ‘collagen binding’, ‘protein binding’ and ‘signal transducer activity’. Under CC, the upregulated DEGs were enriched in ‘cell-cell adherens junction’, ‘extracellular exosome’, ‘midbody’, ‘cell-cell junction’ and ‘cytoplasmic microtubule’. For the downregulated DEGs, significant enrichment was observed in the ‘oxidation-reduction process’, ‘xenobiotic metabolic process’, ‘metabolic process’, ‘steroid metabolic process’ and ‘platelet degranulation’ under BP. For MF, the downregulated DEGs were significantly enriched in ‘oxidoreductase activity’, ‘monooxygenase activity’, ‘iron ion binding’, ‘oxidoreductase activity acting on paired donors, with the incorporation of or reduction in molecular oxygen’ and ‘electron carrier activity’. For CC, the downregulated DEGs were enriched in ‘extracellular exosome’, ‘blood microparticle’, ‘organelle membrane’, ‘mitochondrial matrix’ and ‘extracellular region’. In addition, the KEGG analysis results showed that the upregulated DEGs were significantly enriched in ‘ECM-receptor interactions’, ‘focal adhesion’, ‘small cell lung cancer’, ‘pathways in cancer’ and ‘hypertrophic cardiomyopathy (HCM)’. Meanwhile, the downregulated DEGs were enriched in ‘metabolic pathways’, ‘complement and coagulation cascades’, ‘biosynthesis of antibiotics’, ‘retinol metabolism’ and ‘fatty acid degradation’. The enriched GO terms and KEGG pathways are shown in Fig. 2 and Tables SII–SV.

Figure 2.

Functional enrichment analysis of DEGs. (A) GO cluster plot showing a chord dendrogram of the clustering of the expression spectrum of significantly upregulated genes. (B) GO cluster plot showing a circular dendrogram of the clustering of the expression spectrum of significantly downregulated genes. KEGG pathway enrichment dot plot of the (C) significantly upregulated genes and (D) downregulated genes. The y-axis represents KEGG-enriched terms. The x-axis represents the fold of enrichment. The size of the dot represents the number of genes under a specific term. The color of the dots represents the adjusted P-value. DEGs, differentially expressed protein-coding genes; GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes.

Construction of the PPI network and hub gene identification

PPI network analysis can be used to distinguish critical hub genes among a group of DEGs. Therefore, the STRING database was used to conduct the PPI network analysis. PPI networks for the upregulated genes were constructed by Cytoscape 3.6.1 (Fig. 3).

Figure 3.

(A) PPI network of the significantly upregulated DEGs. The nodes represent the significantly upregulated DEGs. The edges represent the interaction of significantly upregulated DEGs. The triangles represent hub genes validated by qRT-PCR. Bar chart of the (B) upregulated hub genes and the (C) downregulated hub genes. The x-axis represents count of connectivity. The y-axis represents hub gene symbols. PPI, protein-protein interaction; DEGs, differentially expressed protein-coding genes.

Cytoscape 3.6.1 was used to perform a centrality analysis. The top 15 genes with the highest degree of connectivity were defined as hub genes. Under this criterion, 15 hub genes were obtained for the upregulated DEGs, including cyclin-dependent kinase 1 (CDK1), cyclin B2 (CCNB2), kinesin family member 2C (KIF2C), topoisomerase (DNA) IIα (TOP2A), centrosomal protein 55 (CEP55), ribonucleotide reductase regulatory subunit M2 (RRM2), ubiquitin conjugating enzyme E2 C (UBE2C), baculoviral IAP repeat containing 5 (BIRC5), centromere protein F (CENPF), NIMA-related kinase 2 (NEK2), forkhead box M1 (FOXM1), marker of proliferation Ki-67 (MKI67), protein regulator of cytokinesis 1 (PRC1), integrin subunit α2 (ITGA2), and laminin subunit γ2 (LAMC2). Hub genes for the downregulated DEGs consisted of kininogen 1 (KNG1), complement C3 (C3), apolipoprotein A1 (APOA1), albumin (ALB), fibrinogen α chain (FGA), apolipoprotein B (APOB), fibrinogen γ chain (FGG), 3-hydroxyacyl CoA dehydrogenase (EHHADH), α2-HS glycoprotein (AHSG), apolipoprotein A2 (APOA2), complement C4A (C4A), serpin family A member 1 (SERPINA1), apolipoprotein E (APOE), serpin family C member 1 (SERPINC1) and acyl-CoA oxidase 1 (ACOX1). To further verify the differential expression of the critical hub genes in CCA, we evaluated the expression profiles of 30 hub genes in another 4 datasets. GSE76297 consists of 304 specimens in total, 183 of which were utilized in analysis, including 91 CCA tumor tissues and 92 CCA non-tumor tissues. GSE26566 consists of 169 specimens in total, 163 of which were utilized in analysis, including 104 CCA tissues and 59 surrounding liver tissues. E-GEOD-32879 consists of 37 specimens in total, 23 of which were utilized in analysis, including 16 iCCA tissues and 7 non-tumor tissues. E-GEOD-45001 consists of 10 pairs of iCCA tumor tissues and non-tumor tissues, which were all utilized in analysis. 
Consistent with our results, 21 out of 30 hub genes in TCGA were found to share similar differential expression among the other 4 datasets, including 8 upregulated hub genes and 13 downregulated hub genes (Fig. S1). The differential expression of hub genes is shown in Fig. 4A-D.

Figure 4.

Dynamic expression of significantly upregulated hub genes. (A-D) Expression of hub genes from the TCGA database. (E-H) Relative quantification of hub genes based on qRT-PCR results. *P<0.05, **P<0.01, ***P<0.001. CDK1 (A and E), MKI67 (B and F), PRC1 (C and G) and TOP2A (D and H). TCGA, The Cancer Genome Atlas; CDK1, cyclin dependent kinase 1; MKI67, marker of proliferation Ki-67; PRC1, protein regulator of cytokinesis 1; TOP2A, DNA topoisomerase II α.

ROC curves and tumor staging correlation analysis

ROC curves for hub genes were generated based on expression profile of TCGA dataset. AUC was >0.900 for all 21 selected hub genes (Fig. S2). Among them, the expression of 5 downregulated hub genes, including ACOX1, APOA2, APOB, FGA and FGG, was inversely associated with tumor stage (P<0.05, Fig. S3). For other identified hub genes, no association between gene expression and tumor stage was found.

Survival analysis

For upregulated hub genes, the expression of CDK1, MKI67, TOP2A and PRC1 was negatively related to the overall survival time of CCA patients in both TCGA and GSE89749 datasets (P<0.05). No significant result was found for the downregulated hub genes. The survival curves are shown in Fig. 5.

Figure 5.

Survival analysis of significantly upregulated hub genes. CDK1 (A and E), MKI67 (B and F), PRC1 (C and G), TOP2A (D and H). A-D refer to survival curves based on TCGA. E-H refer to survival curves based on GSE89749. Overall survival time is recorded in years. The cut-off value is the median gene expression. TCGA, The Cancer Genome Atlas.
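To make the median cut-off and the survival curves concrete, the sketch below splits patients at the median expression value and computes a Kaplan-Meier estimate for one group. It is only an illustration of how such curves are built: it does not reproduce the Cox proportional-hazards test used for the P-values, and the function names and patient values are hypothetical.

```python
from statistics import median


def split_by_median(expr):
    """expr: {patient_id: expression}. Returns (high, low) sets of patient ids.
    Patients exactly at the median are placed in the low group (an assumption)."""
    cut = median(expr.values())
    high = {p for p, v in expr.items() if v > cut}
    return high, set(expr) - high


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct death time.
    events: 1 = death observed, 0 = censored at that time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        same_time = [e for tt, e in data if tt == t]
        deaths = sum(same_time)
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        # Everyone observed at time t (dead or censored) leaves the risk set.
        at_risk -= len(same_time)
        i += len(same_time)
    return curve


# Hypothetical toy data (not data from the study).
expression = {"a": 1.0, "b": 2.0, "c": 3.0, "d": 4.0}
high_group, low_group = split_by_median(expression)
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
print(sorted(high_group), sorted(low_group), curve)
```

Running `kaplan_meier` once per group gives the two curves shown in each panel; comparing them statistically requires a separate test such as the Cox model named in the methods.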

Identification of CDK1, MKI67, TOP2A and PRC1 by RT-qPCR

Since the survival analysis indicates that the overexpression of CDK1, MKI67, TOP2A and PRC1 predicts poor survival of patients with cholangiocarcinoma, we performed RT-qPCR to validate the expression change of these genes in frozen tissue. As expected, all of the 4 genes were upregulated in the iCCA tissue (Fig. 4E-H).

Discussion

Cholangiocarcinoma (CCA) is recognized as the second most commonly diagnosed primary liver tumor. Due to its strong genetic heterogeneity, the current understanding of the pathogenesis of CCA is not comprehensive. Concerning genetic changes involved in CCA initiation and progression, knowledge in this field remains fragmented. The key drivers involved in CCA carcinogenesis still need to be defined (3,4,11). In the present study, we focused on the genetic changes at the transcription level between intrahepatic CCA (iCCA) and normal tissue. A total of 1,463 differentially expressed protein-coding genes (DEGs) were obtained based on data from The Cancer Genome Atlas (TCGA). Gene Ontology (GO) enrichment analyses showed that ‘cadherin binding involved in cell-cell adhesion’, ‘extracellular matrix organization’ and ‘cell-cell adherens junction’ represented significant GO terms for the upregulated DEGs and that ‘oxidation-reduction processes’, ‘extracellular exosomes’, and ‘blood microparticles’ represented significant GO terms for the downregulated DEGs. In addition, ‘ECM-receptor interactions’, ‘focal adhesions’ and ‘small cell lung cancer’ were significant pathways related to the upregulated DEGs. ‘Metabolic pathways’, ‘complement and coagulation cascades’ and ‘biosynthesis of antibiotics’ were significant pathways for the downregulated DEGs. Hub genes were identified based on the degree of connectivity. Fifteen upregulated hub genes and 15 downregulated hub genes were selected based on the protein-protein interaction (PPI) network. Moreover, the expression profiles of the 30 hub genes were verified using datasets from GEO and ArrayExpress. A total of 21 hub genes showed stable differential expression among 5 datasets including TCGA. ROC curves revealed that all 21 hub genes presented a credible classification effect between tumor and normal tissue with AUC >0.900.
In addition, the expression of ACOX1, APOA2, APOB, FGA and FGG was inversely associated with tumor stage, which indicates that these genes may be involved in the progression of CCA. To further explore the relationships between hub genes and the outcomes of CCA patients, a survival analysis was conducted based on the clinical data and expression profiles of the identified hub genes in both TCGA and GSE89749. Four upregulated hub genes, CDK1, MKI67, TOP2A and PRC1, were identified as being significantly related to overall survival among CCA patients. Moreover, the differential expression of the four genes was validated by RT-qPCR. Therefore, we considered CDK1, MKI67, TOP2A and PRC1 as potential predictors of poor prognosis in patients with CCA. Cyclin-dependent kinases (CDKs) are a family of protein kinases driving the major events of cell cycle control (29). Aberrant expression of CDK1 is involved in cell cycle arrest in many tumor types such as melanoma, colon cancer and pancreatic cancer (30). Studies that have focused on the roles of CDK1 in cholangiocarcinoma are limited. Okumura et al revealed that CDK1 is upregulated by AIB1, i.e., transcriptional coactivator amplified in breast cancer 1, through the Akt pathway. AIB1 was found to be overexpressed in human CCA specimens and to promote cell cycle progression at the G2/M phase by inducing CDK1 (31). In addition, CDK1 may be involved in drug-resistance mechanisms of CCA since western blot analyses indicated that G2/M phase-regulated proteins, including CDK1, were downregulated in gemcitabine-resistant CCA cell lines (32). MKI67 encodes Ki-67, a cell cycle-regulated phosphatase 1-binding protein universally used as a proliferation marker. Ki-67 is a major organizer required for assembly of the perichromosomal compartment in cells (33). The Ki-67 index has been shown to be the most reliable prognostic evaluation factor of gastroenteropancreatic neuroendocrine neoplasms (GEP-NENs) (34).
The Ki-67 index can vary through the disease course (35–37). In combination with tumor type, site and stage, the Ki-67 index is used to stratify patients into different prognostic categories (34). In the present study, we found a prognostic role of MKI67 in CCA. This finding may be clinically valuable, although the underlying mechanisms for Ki-67 variation still require further investigation. DNA topoisomerase IIα (TOP2A) is an isoform of DNA topoisomerase II (Topo II). Topo II is a crucial enzyme for cell division that relieves torsional stress on double-stranded DNA by inducing transient breaks that are subsequently resealed (38). TOP2A is located adjacent to the HER2 oncogene and is frequently coamplified with HER2 in multiple types of cancers, such as breast cancer, bladder cancer and gastric adenocarcinoma (39–42). However, Panvichian et al reported that TOP2A overexpression in hepatocellular carcinoma (HCC) is independent of HER2 gene amplification or expression (43). Our results showed that TOP2A is significantly upregulated in CCA tissues and represents a possible predictive biomarker for poor prognosis. However, no significant change was detected in the transcription of HER2. Therefore, TOP2A may play a role in CCA tumorigenesis independent of HER2. Nateewattana et al reported that andrographolide, a Topo II inhibitor, exhibited a potent cytotoxic effect on CCA cells by suppressing TOP2A expression in vitro (44). Thus, the therapeutic efficacy of Topo II inhibitors, such as andrographolide and anthracycline, in CCA patients should be further explored. Polycomb repressive complex 1 (PRC1) is required for adult stem cell functions and acts as both a tumor suppressor and oncogene (45). Tang et al demonstrated the significant biological implications of PRC1 in tumor pathogenesis and prognosis in non-small cell lung cancer patients by analyzing genome-wide RNAi data and mRNA expression data (46). Bmi1 and EZH2 are representative members of PRC1.
Sasaki et al found that Bmi1 was overexpressed in CCA cell lines and stimulated cell proliferation (47). Overexpression of EZH2 may induce hypermethylation of the p16INK4a promoter, followed by decreased expression of p16INK4a in multistep cholangiocarcinogenesis (48). However, the precise molecular mechanisms underlying the role of PRC1 in CCA remain unclear. Importantly, we noted that both CDK1 and PRC1 were involved in the same GO term, midbody. The midbody is a transient structure formed during cytokinesis and is involved in recruitment and organization of the abscission machinery, which physically regulates the localization of the two daughter cells (49). Midbody dysregulation causes mitotic problems in daughter cell separation, which increases cancer susceptibility and tumorigenesis (50). CITRON, a known serine kinase present at the midbody during cytokinesis, could contribute to tumor occurrence in HCC (51). CDK1 phosphorylates septin 9 (SEPT9), thus playing an important role in mediating the final separation of daughter cells (52). Moreover, PRC1 could accumulate in the midbody during cytokinesis and organize the midbody through microtubule regulation (49). Based on this study, we may speculate that CDK1 and PRC1 contribute to the progression of CCA through midbody-related function. Although we cannot exclude the possibility that the identified hub genes may be implicated in noncarcinogenic aspects of CCA, we attempted to ensure the credibility of the results by including as many datasets as possible. In addition to TCGA, a total of 4 datasets were used to validate the differential expression of hub genes. In addition, the survival analysis of certain hub genes was based on 2 datasets. Moreover, we performed RT-qPCR to verify the selected hub genes based on 10 pairs of tissue samples.
In conclusion, we identified a number of hub genes and comprehensively revealed the biological functions and signaling pathways associated with CCA carcinogenesis through systematic bioinformatic analyses. Moreover, we identified CDK1, MKI67, TOP2A and PRC1 as possible prognostic biomarkers and further discussed the roles that the four genes may play in cancer development. Most of the genes have not been thoroughly studied in CCA. In future research, the clinical application of the identified hub genes as biomarkers for monitoring the prognosis of CCA patients should be further investigated. Moreover, research concerning specific mechanisms of these genes in CCA occurrence and progression is warranted.

49 in total

Review 1. Cholangiocarcinoma.

Authors: Michela Squadroni; Luca Tondulli; Gemma Gatta; Stefania Mosconi; Giordano Beretta; Roberto Labianca
Journal: Crit Rev Oncol Hematol Date: 2016-11-25 Impact factor: 6.312

Review 2. Epidemiology of cholangiocarcinoma.

Authors: Annika Bergquist; Erik von Seth
Journal: Best Pract Res Clin Gastroenterol Date: 2015-02-16 Impact factor: 3.043

3. Differential coexpression analysis using microarray data and its application to human cancer.

Authors: Jung Kyoon Choi; Ungsik Yu; Ook Joon Yoo; Sangsoo Kim
Journal: Bioinformatics Date: 2005-10-18 Impact factor: 6.937

Review 4. Cholangiocarcinoma.

Authors: Nataliya Razumilava; Gregory J Gores
Journal: Lancet Date: 2014-02-26 Impact factor: 79.321

5. Global analysis of Cdk1 substrate phosphorylation sites provides insights into evolution.

Authors: Liam J Holt; Brian B Tuch; Judit Villén; Alexander D Johnson; Steven P Gygi; David O Morgan
Journal: Science Date: 2009-09-25 Impact factor: 47.728

Review 6. Advances in biomarkers of biliary tract cancers.

Authors: Jun Hu; Baobing Yin
Journal: Biomed Pharmacother Date: 2016-04-16 Impact factor: 6.529

7. Over-expression of polycomb group protein EZH2 relates to decreased expression of p16 INK4a in cholangiocarcinogenesis in hepatolithiasis.

Authors: M Sasaki; J Yamaguchi; K Itatsu; H Ikeda; Y Nakanuma
Journal: J Pathol Date: 2008-06 Impact factor: 7.996

8. STRING v10: protein-protein interaction networks, integrated over the tree of life.

Authors: Damian Szklarczyk; Andrea Franceschini; Stefan Wyder; Kristoffer Forslund; Davide Heller; Jaime Huerta-Cepas; Milan Simonovic; Alexander Roth; Alberto Santos; Kalliopi P Tsafou; Michael Kuhn; Peer Bork; Lars J Jensen; Christian von Mering
Journal: Nucleic Acids Res Date: 2014-10-28 Impact factor: 16.971

9. Identification of hub genes involved in the development of hepatocellular carcinoma by transcriptome sequencing.

Authors: Yongchang Zheng; Junyu Long; Liangcai Wu; Haohai Zhang; Lin Li; Ying Zheng; Anqiang Wang; Jianzhen Lin; Xiaobo Yang; Xinting Sang; Ke Hu; Jie Pan; Haitao Zhao
Journal: Oncotarget Date: 2017-07-22

10. A cancer-associated polymorphism in ESCRT-III disrupts the abscission checkpoint and promotes genome instability.

Authors: Jessica B A Sadler; Dawn M Wenzel; Lauren K Strohacker; Marta Guindo-Martínez; Steven L Alam; Josep M Mercader; David Torrents; Katharine S Ullman; Wesley I Sundquist; Juan Martin-Serrano
Journal: Proc Natl Acad Sci U S A Date: 2018-09-04 Impact factor: 11.205
