
Identification of candidate genes for the diagnosis and treatment of cholangiocarcinoma using a bioinformatics approach.

Mi Zhou1, Yabin Zhu1, Ruixia Hou1, Xianbo Mou1, Jun Tan2.   

Abstract

Cholangiocarcinoma (CCA) is a biliary epithelial tumor with poor prognosis. As the key genes and signaling pathways underlying the disease have not been fully elucidated, the aim of the present study was to improve the understanding of the molecular mechanisms associated with CCA. The microarray datasets GSE26566 and GSE89749 were downloaded from the Gene Expression Omnibus and differentially expressed genes (DEGs) between CCA and normal bile duct samples were identified. Gene and pathway enrichment analyses were performed, and a protein-protein interaction network was constructed and analyzed. A total of 159 DEGs and 10 hub genes were identified. The functions and pathways of the DEGs were mainly enriched in 'heparin binding', 'serine-type endopeptidase activity', 'calcium ion binding', 'pancreatic secretion', 'fat digestion and absorption' and 'protein digestion and absorption'. Survival analysis revealed that the upregulated expression of carboxypeptidase B1 and Kruppel like factor 4 was significantly associated with lower overall survival rate. In summary, the present study identified DEGs and hub genes associated with CCA, which may serve as potential diagnostic and therapeutic targets for the disease. Copyright: © Zhou et al.

Keywords:  Kaplan-Meier curves; bioinformatics analysis; cholangiocarcinoma; differentially expressed genes

Year:  2019        PMID: 31612054      PMCID: PMC6781666          DOI: 10.3892/ol.2019.10904

Source DB:  PubMed          Journal:  Oncol Lett        ISSN: 1792-1074            Impact factor:   2.967


Introduction

Cholangiocarcinoma (CCA) is a diverse epithelial malignancy originating in the cholangiocytes, the epithelial cells of the bile duct. The incidence of CCA has increased globally over the past few decades (1). Chronic infection and inflammation due to liver fluke infection, sclerosing cholangitis and hepatitis C and B virus infections serve a key role in cholangiocarcinogenesis, possibly through the accumulation of genetic and epigenetic changes that result in abnormalities in oncogenes and tumor suppressor genes (2,3). The most commonly mutated genes in CCA, including KRAS proto-oncogene GTPase (KRAS), tumor protein p53 (TP53), B-Raf proto-oncogene, serine/threonine kinase (BRAF), BRCA1 associated protein 1 (BAP1) and SMAD family member 4 (SMAD4), are associated with cell signaling pathways (for example, MAPK signaling and TGF-β signaling), cell cycle control and chromatin dynamics (2,4). Previous studies have demonstrated that KRAS point mutation in intrahepatic CCA (iCCA) may affect patient prognosis (5). Furthermore, mutations in KRAS and TP53 in mature cholangiocytes and hepatocytes may cause iCCA (6). A combination of mitogen-activated protein kinase kinase 1/2 and BRAF inhibitors has been shown to be effective in patients with iCCA harboring the BRAF V600E mutation (7). Activation of the transforming growth factor-β/Smad4 signaling pathway accelerates CCA cell invasion and migration via the epithelial-mesenchymal transition (EMT) (8). Knockdown of BAP1 increases CCA cell proliferation, whereas overexpression of wild-type BAP1 significantly inhibits cell proliferation, suggesting that BAP1 exhibits tumor suppressive effects (9). Despite significant efforts to elucidate the pathogenesis of CCA, the precise molecular mechanisms involved remain unclear. Therefore, the aim of the present study was to investigate the molecular mechanisms involved in the pathogenesis of CCA and to identify potential therapeutic targets.
In recent years, microarray technology has attracted attention due to its ability to rapidly and simultaneously quantify the expression levels of several genes, and is particularly suitable for the screening of differentially expressed genes (DEGs) (10). Several microarray-based studies on CCA have identified numerous DEGs (11,12). However, previous reports were limited to independent microarray analysis or single cohort studies (13). Therefore, in order to identify more accurate and practical biomarkers, the present study analyzed two mRNA microarray datasets (GSE26566 and GSE89749), downloaded from the Gene Expression Omnibus (GEO), and screened for DEGs between cholangiocarcinoma and normal bile duct samples. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were subsequently performed and a protein-protein interaction (PPI) network was constructed. The results obtained in the present study may aid the early diagnosis and treatment of CCA.

Materials and methods

Microarray data

The current study aimed to elucidate the potential key candidate genes in CCA. Two gene expression profiles, GSE26566 (11) and GSE89749 (12), were downloaded from the GEO database (ncbi.nlm.nih.gov/geo). The GSE26566 dataset consisted of an mRNA expression profile of 104 CCA and 6 normal bile duct samples, while the GSE89749 dataset included 118 CCA and 2 normal bile duct samples. The normal bile duct samples were obtained from healthy controls. The samples in the GSE26566 dataset were obtained from three countries (Australia, Belgium and the United States of America) and the samples in the GSE89749 dataset were obtained from ten countries (Singapore, Romania, Thailand, Italy, France, Korea, Brazil, Taiwan, China and Japan).

DEGs screening

The DEGs between CCA and normal bile duct samples were identified using GEO2R (www.ncbi.nlm.nih.gov/geo/geo2r), an interactive online tool used to identify DEGs by comparing samples from the GEO database. The cut-off criteria for the selection of DEGs were P<0.01 and a |log fold-change|≥1. Default settings were used as the screening criteria throughout the entire bioinformatics analysis process, as has been reported in previous studies (14,15).
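The two cut-offs can be read as a simple row filter over a table of per-gene statistics. The following is a minimal sketch, assuming each row carries a gene symbol, a P-value and a log fold-change; the function names and the example rows are hypothetical and are not taken from the study, and the sign convention for up- versus downregulation depends on how the sample groups were defined in GEO2R.

```python
def is_deg(p_value, log_fc, p_cutoff=0.01, lfc_cutoff=1.0):
    """Return True when a gene passes both cut-offs stated in the text."""
    return p_value < p_cutoff and abs(log_fc) >= lfc_cutoff


def screen_degs(rows):
    """Split (gene, p_value, log_fc) rows into up- and downregulated DEG sets."""
    up, down = set(), set()
    for gene, p_value, log_fc in rows:
        if is_deg(p_value, log_fc):
            # Assumption: a positive log fold-change means higher expression in CCA.
            (up if log_fc > 0 else down).add(gene)
    return up, down


# Hypothetical rows for illustration only; not values from the study.
example_rows = [
    ("GENE_A", 0.001, 2.3),   # passes both cut-offs, upregulated
    ("GENE_B", 0.020, 3.0),   # fails the P-value cut-off
    ("GENE_C", 0.005, -1.0),  # passes: |log fold-change| equals the threshold
    ("GENE_D", 0.0001, 0.4),  # fails the fold-change cut-off
]
up_genes, down_genes = screen_degs(example_rows)
```

Note that the fold-change criterion is inclusive (≥1) whereas the P-value criterion is strict (<0.01), matching the wording of the cut-offs.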

GO and KEGG analyses of DEGs

GO (geneontology.org) and KEGG (www.genome.jp/kegg) enrichment analyses of the DEGs were performed using the Database for Annotation, Visualization and Integrated Discovery (DAVID version 6.8; david.ncifcrf.gov) with P<0.05 as the cut-off criterion. DAVID provides a comprehensive set of gene functional annotation information to extract biological information (16).
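As an illustration of what an enrichment P-value measures, the sketch below computes a plain one-sided hypergeometric tail probability. This is a simplified stand-in rather than DAVID's own calculation (DAVID reports a modified Fisher's exact P-value, the EASE score); the function name and all counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_term, n_degs, n_overlap):
    """P(X >= n_overlap) when n_degs genes are drawn without replacement
    from n_background genes, of which n_term are annotated to the term."""
    upper = min(n_term, n_degs)
    total = comb(n_background, n_degs)
    tail = sum(
        comb(n_term, k) * comb(n_background - n_term, n_degs - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 10 background genes, 4 annotated to a term,
# 3 DEGs, 2 of which carry the annotation.
p_example = hypergeom_enrichment_p(10, 4, 3, 2)
term_is_enriched = p_example < 0.05  # cut-off used in the text
```

With these toy counts the tail probability is 40/120, so the term would not pass the P<0.05 cut-off.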

PPI network of DEGs

The PPI network was constructed using Cytoscape version 3.7.0 software (www.cytoscape.org) (17) based on the Search Tool for the Retrieval of Interacting Genes (STRING version 11.0; string-db.org). STRING is a system for searching for interactions between proven and predicted proteins (18). In the present study, genes with a combined score of >0.4 were considered significant. Sub-modules of the PPI network were analyzed using the Cytoscape plug-in Molecular Complex Detection (MCODE version 1.5.1) (19) with the criteria set as follows: MCODE score >4, node score cut-off=0.2, degree cut-off=2, max depth=100 and k-core=2. GO and KEGG analyses for the genes in the most significant sub-module were subsequently performed using Metascape version 3.5 (metascape.org) (20).
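Two of the settings above, the combined-score threshold and the k-core value, can be illustrated with a few lines of graph code. The following is a minimal sketch only: it is not the MCODE algorithm (which also weights vertices and grows complexes from seed nodes), the scores are assumed to be on a 0 to 1 scale, and the protein names and scores are hypothetical.

```python
def build_network(scored_edges, min_score=0.4):
    """Keep interactions whose combined score exceeds min_score and
    return an undirected adjacency map {node: set(neighbours)}."""
    adjacency = {}
    for a, b, score in scored_edges:
        if score > min_score and a != b:
            adjacency.setdefault(a, set()).add(b)
            adjacency.setdefault(b, set()).add(a)
    return adjacency


def k_core(adjacency, k=2):
    """Repeatedly drop nodes with fewer than k neighbours until none remain."""
    core = {node: set(nbrs) for node, nbrs in adjacency.items()}
    changed = True
    while changed:
        changed = False
        for node in [n for n, nbrs in core.items() if len(nbrs) < k]:
            for nbr in core.pop(node):
                core[nbr].discard(node)
            changed = True
    return core


# Hypothetical interactions for illustration only.
example_edges = [
    ("P1", "P2", 0.90),
    ("P2", "P3", 0.70),
    ("P1", "P3", 0.50),
    ("P3", "P4", 0.45),
    ("P4", "P5", 0.30),  # dropped: score is not above 0.4
]
network = build_network(example_edges)
core_2 = k_core(network, k=2)
```

In this toy network P5 never enters the graph, and P4 is pruned from the 2-core because it has only one remaining neighbour.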

Hub gene selection and analysis

The top 10 genes ranked by network degree were selected as the hub genes. The corresponding proteins may be key candidate proteins with important physiological regulatory functions. GO and KEGG analyses for the hub genes were subsequently performed using Metascape version 3.5. A network of the genes and their co-expression genes was analyzed using cBioPortal (cbioportal.org), as described previously (21,22). The University of California, Santa Cruz Cancer Genomics Browser (genome-cancer.ucsc.edu) was used for the hierarchical clustering of the hub genes, and for examining the association between changes in the hub genes and the Child-Pugh classification grade or days to death (23). Survival analyses were performed to assess the prognostic value of the hub genes identified in the present study in CCA. Kaplan-Meier analysis in the cBioPortal online platform was used for overall survival rate and disease-free survival analyses of the hub genes based on a dataset from The Cancer Genome Atlas (TCGA; cancergenome.nih.gov/), which contained 51 CCA samples.
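Ranking by network degree amounts to counting the distinct interaction partners of each node and keeping the highest counts. The following is a minimal sketch under that reading; the function name, the alphabetical tie-breaking rule and the example edges are assumptions made for illustration.

```python
def top_hub_genes(edges, top_n=10):
    """Rank nodes by degree (number of distinct neighbours) and
    return the top_n as (gene, degree) pairs."""
    neighbours = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    # Assumption: ties are broken alphabetically to keep the output deterministic.
    ranked = sorted(neighbours, key=lambda g: (-len(neighbours[g]), g))
    return [(gene, len(neighbours[gene])) for gene in ranked[:top_n]]


# Hypothetical edge list for illustration only.
example_edges = [
    ("G1", "G2"),
    ("G1", "G3"),
    ("G1", "G4"),
    ("G2", "G3"),
    ("G4", "G5"),
]
hubs = top_hub_genes(example_edges, top_n=2)
```

Here `G1` has three partners and ranks first; `G2`, `G3` and `G4` tie on two partners, so the tie-breaking rule decides which of them is reported.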

Results

A total of 1,870 and 591 DEGs between CCA and normal bile duct samples were identified in the GSE26566 and GSE89749 datasets, respectively. The overlap between the two datasets included 159 DEGs (135 upregulated and 24 downregulated genes; Fig. 1).
Figure 1.

Venn diagram of the DEGs between cholangiocarcinoma and normal bile duct samples in the GSE26566 and GSE89749 datasets. A total of 159 overlapping DEGs were identified between the two datasets. DEGs, differentially expressed genes.
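The overlap shown in the Venn diagram corresponds to a set intersection of the two DEG lists. The following is a minimal sketch, assuming that a gene counts as overlapping only when it changes in the same direction in both datasets; the text does not state how discordant genes were handled, and the gene names are hypothetical.

```python
def overlap_by_direction(up_a, down_a, up_b, down_b):
    """Return the genes upregulated in both datasets and the genes
    downregulated in both datasets."""
    shared_up = up_a & up_b
    shared_down = down_a & down_b
    return shared_up, shared_down


# Hypothetical DEG sets for two datasets; not values from the study.
up_first, down_first = {"GENE_A", "GENE_B", "GENE_C"}, {"GENE_X", "GENE_Y"}
up_second, down_second = {"GENE_B", "GENE_C", "GENE_X"}, {"GENE_Y", "GENE_Z"}

shared_up, shared_down = overlap_by_direction(
    up_first, down_first, up_second, down_second
)
n_overlap = len(shared_up) + len(shared_down)
```

In this toy example `GENE_X` is excluded because it is downregulated in one set and upregulated in the other.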

GO and KEGG enrichment analysis of DEGs

Candidate DEGs were subjected to function and pathway enrichment analysis in DAVID. GO analysis included biological processes (BP), cell components (CC) and molecular functions (MF). BP results suggested that the DEGs were mainly enriched in ‘cell adhesion’, ‘digestion’ and ‘defense response to gram-positive bacteria’ (Fig. 2A). CC results revealed that the DEGs were mainly enriched in ‘extracellular space’, ‘proteinaceous extracellular matrixes’ and ‘extracellular regions’ (Fig. 2B). MF results suggested that the DEGs were mainly enriched in ‘heparin binding’, ‘calcium ion binding’ and ‘serine-type endopeptidase activity’ (Fig. 2C). The KEGG pathway analysis suggested that the DEGs were mainly associated with ‘pancreatic secretion’, ‘fat digestion and absorption’ and ‘protein digestion and absorption’ (Fig. 2D).
Figure 2.

Gene ontology analysis, including (A) biological process, (B) cell component and (C) molecular function. (D) Kyoto Encyclopedia of Genes and Genomes analysis of the differentially expressed genes.

PPI network of the DEGs

A PPI network containing 84 nodes and 155 edges was constructed using Cytoscape software based on the STRING database (Fig. 3A). The most significant sub-module was extracted from the PPI network complex using MCODE (Fig. 3B). The functions of the genes in the aforementioned sub-module were analyzed using the online tool Metascape. The sub-module consisted of 7 nodes and 17 edges, which were mainly associated with ‘cell chemotaxis’, ‘G-protein coupled receptor signaling pathway’ and ‘positive regulation of response to external stimulus’ (Fig. 3C).
Figure 3.

PPI network and the most significant sub-module of DEGs. (A) The PPI network was constructed using the Search Tool for the Retrieval of Interacting Genes and Cytoscape software. (B) The most significant sub-module consisted of seven nodes and 17 edges. (C) GO and Kyoto Encyclopedia of Genes and Genomes pathway enrichment analysis of the DEGs in the most significant sub-module. PPI, protein-protein interaction; DEGs, differentially expressed genes; GO, Gene Ontology.

The top 10 genes ranked by network degree were selected as hub genes and are presented in Table I. KEGG pathway analysis revealed that hub genes were mainly associated with the activation of the phosphoinositide 3-kinase-protein kinase B (Akt) signaling pathway (Table SI). A network of the hub genes and their co-expression genes was analyzed using cBioPortal (Fig. 4A). PLA2G1B and PNLIP did not interact with other genes in the network; thus, only eight hub genes appeared in the network. Hierarchical clustering revealed that patients with CCA with upregulated hub genes were classified as grade B according to the Child-Pugh classification, and their days to death values were relatively small (Fig. 4B). The expression of MYC and LPAR1 did not show an alteration in the TCGA dataset. As a result, the survival curves of 8 out of the 10 hub genes are presented in the current study. The 6 hub genes that did not have a significant effect on patient outcome are presented in Fig. S1. Patients with CCA with carboxypeptidase B1 (CPB1) and Kruppel like factor 4 (KLF4) upregulation had lower overall survival rates compared with patients without alterations (P=0.008 and 0.046, respectively; Fig. 5A). However, whilst CPB1 upregulation was significantly associated with lower overall survival, it was not associated with disease-free survival (P=0.912; Fig. 5B). The Kaplan-Meier estimate could not be used for the disease-free survival analysis of KLF4 due to the lack of clinical data on the ‘disease-free survival time’ of patients with the upregulation.
Table I.

Functions of the 10 hub genes.

| Gene symbol | Full name | Function |
|---|---|---|
| IL-6 | Interleukin 6 | Encodes a cytokine that functions in inflammation and the maturation of B cells |
| MYC | MYC proto-oncogene | Proto-oncogene that encodes a nuclear phosphoprotein that plays a role in cell cycle progression, apoptosis and cellular transformation |
| SST | Somatostatin | Inhibits the release of numerous secondary hormones |
| CXCL12 | CXC motif chemokine ligand 12 | Plays a role in a number of cellular functions, including embryogenesis, immune surveillance, inflammation response, tissue homeostasis and tumor growth and metastasis |
| NPY | Neuropeptide Y | Influences several physiological processes, including cortical excitability, stress response, food intake, circadian rhythms and cardiovascular function |
| LPAR1 | Lysophosphatidic acid receptor 1 | Encodes the lysophosphatidic acid receptor, which mediates diverse biologic functions, including proliferation, platelet aggregation, smooth muscle contraction, chemotaxis and tumor cell invasion |
| PLA2G1B | Phospholipase A2 group 1B | Encodes a secreted member of the phospholipase A2 class of enzymes. The enzyme may be involved in several physiological processes including cell contraction, cell proliferation and pathological response |
| CPB1 | Carboxypeptidase B1 | Carboxypeptidase B1 is a highly tissue-specific protein and is a useful serum marker for acute pancreatitis and dysfunction of pancreatic transplants |
| PNLIP | Pancreatic lipase | Encodes an enzyme involved in the digestion of dietary fats |
| KLF4 | Kruppel like factor 4 | The encoded zinc finger protein is required for normal development of the barrier function of skin. The encoded protein is thought to control the G1-to-S transition of the cell cycle following DNA damage by mediating the tumor suppressor gene p53 |
Figure 4.

(A) Hub genes and their co-expression genes were analyzed using cBioPortal. Nodes with a bold black outline represent hub genes. Nodes with a thin black outline represent the co-expression genes. (B) Hierarchical clustering of the hub genes was performed using the University of California, Santa Cruz Cancer Genomics Browser. The genomic heat map is presented on the left and the clinical heat map is presented on the right. Grey represents no data. Red and blue represent upregulation and downregulation, respectively. The blue and red cells in the Child-Pugh classification column represent grades A and B, respectively.

Figure 5.

(A) Overall and (B) disease-free survival time analyses of CPB1 and KLF4 were performed using the cBioPortal online platform. P<0.05 was considered to indicate a statistically significant difference. CPB1, carboxypeptidase B1; KLF4, Kruppel like factor 4.
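For readers unfamiliar with the estimator behind these curves, the Kaplan-Meier survival probability at time t is the product, over all event times up to t, of (1 - d_i / n_i), where d_i is the number of deaths at event time i and n_i is the number of patients still at risk just before it. The following is a minimal sketch of that product; it is not the cBioPortal implementation, it omits the log-rank test that yields the P-values, and the follow-up times are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct death time.

    times  -- follow-up time per patient
    events -- True if the patient died at that time, False if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        at_risk -= removed
    return curve


# Hypothetical follow-up data (time units arbitrary); not from the TCGA cohort.
example_times = [5, 8, 8, 12, 20]
example_events = [True, True, False, True, False]
km_curve = kaplan_meier(example_times, example_events)
```

In this toy cohort the estimate steps down at times 5, 8 and 12; the patient censored at time 20 reduces the risk set without producing a step.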

Discussion

CCA is the second most common hepatobiliary malignancy after hepatocellular carcinoma (24). The main etiological factors of CCA include primary sclerosing cholangitis, cirrhosis and hepatitis C and B infections (25). While chronic infection and inflammation in the bile ducts play a major role in CCA, the molecular mechanisms involved in CCA remain poorly understood. The most commonly mutated genes in CCA are SMAD4, TP53, KRAS, BAP1, isocitrate dehydrogenase [NADP(+)] 1 cytosolic, isocitrate dehydrogenase [NADP(+)] 2 mitochondrial and roundabout guidance receptor 2 (2,4). In addition, CCA has been reported to be associated with inflammation, the growth factor signaling pathway, cell signaling pathways and epigenetic regulation (3,26). There are currently no effective clinical biomarkers or targeted molecular therapies for the early diagnosis and treatment of CCA; consequently, the 5-year survival rate is ~10% (27). There is therefore a need to identify diagnostic markers for CCA. Microarray technology has previously been used to identify novel biomarkers in various diseases and may also be applied to uncover diagnostic markers in CCA (10). In the present study, two microarray datasets (GSE26566 and GSE89749) were obtained from the GEO and analyzed to identify DEGs between CCA and normal bile duct samples. In total, 159 DEGs (135 upregulated and 24 downregulated) were identified. GO and KEGG enrichment analyses were performed to investigate interactions among the DEGs and a PPI network was constructed, revealing 10 hub genes. The PPI network demonstrated that interleukin-6 (IL-6) was connected to the largest number of nodes (20 nodes) and directly interacted with c-Myc, somatostatin, C-X-C motif chemokine 12 (CXCL12), neuropeptide Y (NPY) and LPAR1, suggesting that IL-6 may serve an important role in CCA.
Consistent with the results obtained in the current study, a recent study demonstrated that increased IL-6 expression plays a central role in the pathogenesis and progression of CCA (28). In addition, the circulating level of IL-6 in patients with CCA was reported to be increased compared with healthy controls (29). Moreover, overexpression of IL-6 promotes cell survival in malignant cholangiocytes and enhances tumor growth (30). A previous study reported that a combination of increased levels of leucine-rich α-2-glycoprotein 1, carbohydrate antigen 19-9 and IL-6 in the serum could be used to discriminate between biliary tract cancer (CCA and gallbladder carcinoma) and benign biliary disease (31). c-MYC is a proto-oncogene and encodes a nuclear phosphoprotein, which primarily regulates apoptosis, cell cycle progression and cellular transformation (32). A previous study showed that c-MYC is upregulated in human CCA (33). Furthermore, cyclin D1, a c-Myc target gene, is a molecular biomarker of CCA (34). Knockdown of c-Myc significantly reduced the extent of cholangiofibrosis and cholangioma in vivo, highlighting the importance of c-Myc in the progression of CCA (35). C-X-C motif chemokine receptor 4 (CXCR4) selectively binds CXCL12 and the CXCR4/CXCL12 axis has been shown to be involved in tumorigenesis, cell proliferation and angiogenesis in CCA (36). A recent study suggested that serum CXCL12 levels may serve as a potential biomarker for predicting the clinical outcome in CCA (37). However, in the present study, elevated CXCL12 levels were not significantly associated with disease-free or overall survival. This may be due to the cBioPortal survival analyses performed, which were based on the association between gene mutation and prognosis, whereas high expression in serum is generally caused by a mutation or upregulation (38).
The association between CCA and the hub genes NPY, LPAR1, phospholipase A2 group IB, CPB1, pancreatic lipase and KLF4 has not been widely reported. NPY expression has been shown to be upregulated in CCA (39,40); therefore, regulating NPY expression may be beneficial for the treatment of CCA. Previous research demonstrated that KLF4 and microRNA-21 play a key role in mediating the EMT in CCA cells via the Akt/extracellular signal-regulated protein kinase 1 and 2 signaling pathway (41). Hierarchical clustering revealed that the hub genes identified in the present study may be used to differentiate CCA from normal bile duct samples. Furthermore, upregulation of CPB1 and KLF4 was associated with a lower overall survival rate, suggesting that the aforementioned genes may serve important roles in the progression of CCA. In summary, the present study identified DEGs that may be involved in the carcinogenesis or progression of CCA. A total of 159 DEGs and 10 hub genes were identified, which, following further investigation, may serve as diagnostic biomarkers and novel therapeutic targets for CCA.
  41 in total

1.  Genomic spectra of biliary tract cancer.

Authors:  Hiromi Nakamura; Yasuhito Arai; Yasushi Totoki; Tomoki Shirota; Asmaa Elzawahry; Mamoru Kato; Natsuko Hama; Fumie Hosoda; Tomoko Urushidate; Shoko Ohashi; Nobuyoshi Hiraoka; Hidenori Ojima; Kazuaki Shimada; Takuji Okusaka; Tomoo Kosuge; Shinichi Miyagawa; Tatsuhiro Shibata
Journal:  Nat Genet       Date:  2015-08-10       Impact factor: 38.330

Review 2.  Cholangiocarcinoma - evolving concepts and therapeutic strategies.

Authors:  Sumera Rizvi; Shahid A Khan; Christopher L Hallemeier; Robin K Kelley; Gregory J Gores
Journal:  Nat Rev Clin Oncol       Date:  2017-10-10       Impact factor: 66.675

Review 3.  Pathogenesis of cholangiocarcinoma: From genetics to signalling pathways.

Authors:  Sarinya Kongpetch; Apinya Jusakul; Choon Kiat Ong; Weng Khong Lim; Steven G Rozen; Patrick Tan; Bin Tean Teh
Journal:  Best Pract Res Clin Gastroenterol       Date:  2015-02-17       Impact factor: 3.043

4.  Mutational landscape of intrahepatic cholangiocarcinoma.

Authors:  Shanshan Zou; Jiarui Li; Huabang Zhou; Christian Frech; Xiaolan Jiang; Jeffrey S C Chu; Xinyin Zhao; Yuqiong Li; Qiaomei Li; Hui Wang; Jingyi Hu; Guanyi Kong; Mengchao Wu; Chuanfan Ding; Nansheng Chen; Heping Hu
Journal:  Nat Commun       Date:  2014-12-15       Impact factor: 14.919

Review 5.  Cholangiocarcinoma.

Authors:  Nataliya Razumilava; Gregory J Gores
Journal:  Lancet       Date:  2014-02-26       Impact factor: 79.321

Review 6.  MYC, Metabolism, and Cancer.

Authors:  Zachary E Stine; Zandra E Walton; Brian J Altman; Annie L Hsieh; Chi V Dang
Journal:  Cancer Discov       Date:  2015-09-17       Impact factor: 39.397

Review 7.  Cancer genome landscapes.

Authors:  Bert Vogelstein; Nickolas Papadopoulos; Victor E Velculescu; Shibin Zhou; Luis A Diaz; Kenneth W Kinzler
Journal:  Science       Date:  2013-03-29       Impact factor: 47.728

8.  Cytoscape 2.8: new features for data integration and network visualization.

Authors:  Michael E Smoot; Keiichiro Ono; Johannes Ruscheinski; Peng-Liang Wang; Trey Ideker
Journal:  Bioinformatics       Date:  2010-12-12       Impact factor: 6.937

9.  miR-21 and KLF4 jointly augment epithelial‑mesenchymal transition via the Akt/ERK1/2 pathway.

Authors:  Chen-Hai Liu; Qiang Huang; Zhi-Yuan Jin; Cheng-Lin Zhu; Zhen Liu; Chao Wang
Journal:  Int J Oncol       Date:  2017-02-14       Impact factor: 5.650

10.  Exome sequencing identifies distinct mutational patterns in liver fluke-related and non-infection-related bile duct cancers.

Authors:  Waraporn Chan-On; Maarja-Liisa Nairismägi; Choon Kiat Ong; Weng Khong Lim; Simona Dima; Chawalit Pairojkul; Kiat Hon Lim; John R McPherson; Ioana Cutcutache; Hong Lee Heng; London Ooi; Alexander Chung; Pierce Chow; Peng Chung Cheow; Ser Yee Lee; Su Pin Choo; Iain Bee Huat Tan; Dan Duda; Anca Nastase; Swe Swe Myint; Bernice Huimin Wong; Anna Gan; Vikneswari Rajasegaran; Cedric Chuan Young Ng; Sanjanaa Nagarajan; Apinya Jusakul; Shenli Zhang; Priya Vohra; Willie Yu; DaChuan Huang; Paiboon Sithithaworn; Puangrat Yongvanit; Sopit Wongkham; Narong Khuntikeo; Vajaraphongsa Bhudhisawasdi; Irinel Popescu; Steven G Rozen; Patrick Tan; Bin Tean Teh
Journal:  Nat Genet       Date:  2013-11-03       Impact factor: 41.307

  1 in total

1.  Identification of early stage recurrence endometrial cancer biomarkers using bioinformatics tools.

Authors:  María José Besso; Luciana Montivero; Ezequiel Lacunza; María Cecilia Argibay; Martín Abba; Laura Inés Furlong; Eva Colas; Antonio Gil-Moreno; Jaume Reventos; Ricardo Bello; Mónica Hebe Vazquez-Levin
Journal:  Oncol Rep       Date:  2020-06-16       Impact factor: 3.906
