Ning Zhang1, Wenxin Chen1, Zhilu Gan1, Alimujiang Abudurexiti1, Xiaogang Hu1, Wei Sang2. 1. Surgery Department of Urology, The Third People's Hospital of Xinjiang Uygur Autonomous Region, Urumqi, Xinjiang. 2. The Department of Pathology, The First Affiliated Hospital of Xinjiang Medical University, Urumqi, Xinjiang, PR China.
Abstract
Clear cell renal cell carcinoma (ccRCC) is the most common subtype among renal cancer, and more and more researches find that the occurrence of ccRCC is associated with genetic changes, but the molecular mechanism still remains unclear. The present study aimed to identify aggregation trend of differentially expressed genes (DEGs) in ccRCC, which would be beneficial to the treatment of ccRCC and provide research ideas using a series of bioinformatics approach. Gene ontology (GO) and Kyoto Encyclopedia of Gene and Genomes (KEGG) analysis were used to get the enrichment trend of DEGs of GSE53757 and GSE16449. Draw Venn Diagram was applied for co-expression of DEGs. Cytoscape with the Retrieval of Interacting Gene (STRING) datasets and Molecular Complex Detection (MCODE) were performed protein-protein interaction (PPI) of DEGs. The Kaplan-Meier Plotter analysis of top 15 upregulated and top 15 downregulated were selected in Gene Expression Profiling Interactive Analysis (GEPIA). Then, the expression level of hub genes between normal renal tissue and different pathological stages of ccRCC tissue, which significantly correlated with overall survival in ccRCC patients, were also analyzed by Ualcan based on The Cancer Genome Atlas (TCGA) database. In this study, we got 167 co-expression DEGs, including 72 upregulated DEGs and 95 downregulated DEGs. We identified 11 hub genes had significantly correlated with overall survival in ccRCC patients. Among them, KIF23, APLN, ADCY1, GREB1, TLR4, IRF8, CXCL1, CXCL2, deserved our attention.
Clear cell renal cell carcinoma (ccRCC) is the most common subtype among renal cancer, and more and more researches find that the occurrence of ccRCC is associated with genetic changes, but the molecular mechanism still remains unclear. The present study aimed to identify aggregation trend of differentially expressed genes (DEGs) in ccRCC, which would be beneficial to the treatment of ccRCC and provide research ideas using a series of bioinformatics approach. Gene ontology (GO) and Kyoto Encyclopedia of Gene and Genomes (KEGG) analysis were used to get the enrichment trend of DEGs of GSE53757 and GSE16449. Draw Venn Diagram was applied for co-expression of DEGs. Cytoscape with the Retrieval of Interacting Gene (STRING) datasets and Molecular Complex Detection (MCODE) were performed protein-protein interaction (PPI) of DEGs. The Kaplan-Meier Plotter analysis of top 15 upregulated and top 15 downregulated were selected in Gene Expression Profiling Interactive Analysis (GEPIA). Then, the expression level of hub genes between normal renal tissue and different pathological stages of ccRCC tissue, which significantly correlated with overall survival in ccRCC patients, were also analyzed by Ualcan based on The Cancer Genome Atlas (TCGA) database. In this study, we got 167 co-expression DEGs, including 72 upregulated DEGs and 95 downregulated DEGs. We identified 11 hub genes had significantly correlated with overall survival in ccRCC patients. Among them, KIF23, APLN, ADCY1, GREB1, TLR4, IRF8, CXCL1, CXCL2, deserved our attention.
Renal carcinoma is a common tumor in the urinary system, which accounts for 2% to 3% of adult malignancies.[ About 90% of renal carcinoma are renal cell carcinoma (RCC), the majority of which (70–85%) are clear cell renal cell carcinoma (ccRCC).[ ccRCC can be classified into four pathological stages, including grade I to grade IV. And grade IV has poor prognosis, of which 5-year survival rate is 20%.[Due to the poor response to conventional radiotherapy and chemotherapy of RCC patients, surgical resection is still the most effective treatment.[ However, once the disease became metastatic, targeted therapies will be the best choice for them.[ Unfortunately, after nearly 2 years of treatment, most of these patients gradually become resistance.[ In this case, more research efforts were prompted into targeted therapeutics, such as multi-targeted tyrosine kinase inhibitors (TKI) and mammalian target of rapamycin (mTOR) inhibitors,[ which have undergone radical breakthrough in the last decade.The clinical knowledge that ccRCC is a typical hypervascular cancer which tumor cells can promote the growth and progression of tumor through producing pro-angiogenesis factors.[ Thus, the vascular endothelial growth factor (VEGF) signaling, especially TKI, played an important role in targeted therapies of ccRCC.[ Furthermore, other therapeutic strategies, such as immune checkpoint inhibitors, holding greater promises. Newer agents, including varying gene, mRNA, and protein expression signatures were detected within the tumor specimen, and the changes in the dysregulation of mRNA have proved to affect ccRCC pathogenicity.[ Beyond predicting prognosis, more effective biomolecular markers and therapeutic targets are urgent needed for the development of systemic therapies in ccRCC.[Although many researches have focused on the screening of differentially expressed genes (DEGs) associated with ccRCC progression,[ according to gene expression profiles, most of them ignore the high correlations between genes, although genes with similar expression patterns might be functionally related.[ In our study, we downloaded original data of ccRCC from Gene Expression Omnibus (GEO), an online public collection database for different tumors. Two datasets of GSE53757 and GSE16449, were chosen by carefully selected, we compared tumor tissue and normal tissue to get identify DEGs, and analyzed gene ontology (GO) and Kyoto Encyclopedia of Gene and Genomes (KEGG) to get genetic change trend, then obtained genes with clinical guiding significance by survival analysis.
Methods
Data collection
We downloaded the gene expression profiles of GSE53757 and GSE16449 from GEO database (http://www.ncbi.nlm.nih.gov/geo/). The GSE53757 dataset included 144 samples, containing 72 ccRCC samples and 72 normal tissues. The GSE16449 dataset included 70 samples, containing 52 ccRCC samples and 18 normal tissues.
Data processing of DEGs
The analysis was carried out by using GEO2R (https://www.ncbi.nlm.nih.gov/geo/geo2r/), an interactive online tool allowing users to compare two or more groups of samples in a GEO series and it analyzed most GEO series with gene symbol. The adjusted P values were used to decrease the false positive rate using Benjamini and Hochberg false discovery rate method by default. The adjusted P value < .05 and |log FC| > 1 were set as the cutoff criterion. Then Draw Venn Diagram was used for co-expression of DEGs of GSE53757 and GSE16449.
Gene ontology and KEGG pathway analysis of DEGs
The Database for Annotation, Visualization and Integrated Discovery online tool (DAVID, https://david.ncifcrf.gov/) was used to perform GO functional and KEGG pathway enrichment analysis for the DEGs. The GO analysis, including biological process (BP), molecular function (MF), and cellular component (CC), was used to annotate genes and gene products, and also identify characteristic biological attributing to genomic or transcriptomic data. Kyoto Encyclopedia of Genes and Genomes was used to handle genomes, biological pathways, diseases, chemical substances and drugs. The P < .05 was considered to have significant differences.
PPI network and module analysis
The Search Tool for the Retrieval of Interacting Genes (STRING) was an online tool designed to predicted protein-protein interaction (PPI) information. The STRING database was used to analyze the PPIs among the proteins encoded by the DEGs with a combined score ≥0.4, and maximum number of interactors = 0 were set as the cutoff criterion. Then the PPI networks for the upregulated and the downregulated genes were separately visualized by Cytoscape software (version 3.6.1).In addition, the Molecular Complex Detection (MCODE) app was utilized to screen modules of the PPI network in Cytoscape with degree cutoff = 2, node score cutoff = 0.2, k-core = 2, and max. depth = 100.
Survival analysis of hub genes
Hub genes were a series of genes with the highest degree of connectivity in a gene module and determined the characteristics of a module. The Kaplan–Meier Plotter was used to measure hub genes. In the meantime, 30 genes, including 15 top upregulated genes and 15 top downregulated genes, were also inserted into Gene Expression Profiling Interactive Analysis (GEPIA) (http://gepia.cancer-pku.cn/index.html) with confidence score ≥0.4 and maximum number of interactors = 0. The hazard ratio (HR) with 95% confidence intervals and log rank P value were calculated and displayed on the plot.
Comparison of the hub gene expression level
In order to evaluate the expression level of hub genes, which significantly correlated with overall survival in ccRCC patients, we used boxplot by Ualcan (http://ualcan.path.uab.edu/index.html) based on The Cancer Genome Atlas (TCGA) database (https://genome-cancer.ucsc.edu/). The integrated data of 72 normal and 533 primary tumor in TCGA samples were enrolled for analyses. Demographic, clinical and tumor pathological features of ccRCC patients are listed in Table 1. The expression level of hub genes between normal renal tissue and different pathological stages of ccRCC tissue were also analyzed.
Table 1
Patient and tumor characteristics of ccRCC subtype cohorts in TCGA.
Patient and tumor characteristics of ccRCC subtype cohorts in TCGA.
Ethics statement
All analyses were based on the public GEO and TCGA databases,[ we did not need the informed consent of the patients, thus no ethical approval and patient consent are required.
Results
Identification of DEGs
GSE53757 selected 4542 DEGs, including 2441 upregulated DEGs and 2101 downregulated DEGs.GSE16449 selected 5308 DEGs, including 2219 upregulated DEGs and 3089 downregulated DEGs. A total of 167 co-expression of DEGs between GSE53757 and GSE16449 were detected by Draw Venn Diagram analysis, including 72 upregulated DEGs and 95 downregulated DEGs. The heat map, including 25 top upregulated genes and 25 top downregulated genes, was shown in Figure 1. The Volcano plot showed all co-expression genes of GSE53757 and GSE16449 (Fig. 2).
Figure 1
Heat map of the top upregulated 25 genes and the top downregulated 25 genes (red: upregulated, green: downregulated).
Figure 2
Volcano plot filtering of coexpression genes of GSE16449 and GSE57357 (red: up-regulated gene, green: down-regulated gene).
Heat map of the top upregulated 25 genes and the top downregulated 25 genes (red: upregulated, green: downregulated).Volcano plot filtering of coexpression genes of GSE16449 and GSE57357 (red: up-regulated gene, green: down-regulated gene).
GO function and KEGG pathway analysis of DEGs
The top five enriched terms of upregulated and downregulated DEGs were selected in Table 2 (Fig. 3), according to the P values. The DEGs were mainly enriched in BP, including cellular response to glucagon stimulus, negative regulation of phosphatase activity, ureteric bud formation, axo-dendritic transport and receptor-mediated endocytosis for upregulated DEGs, and for downregulated DEGs including immune response, chemokine-mediated signaling pathway, inflammatory response, chemotaxis, and cell chemotaxis. In MF, the upregulated DEGs were particularly enriched in microtubule binding, RNA polymerase II core promoter proximal region sequence-specific DNA binding, retinol binding, L-amino acid transmembrane transporter activity and transcriptional activator activity, RNA polymerase II core promoter proximal region sequence-specific binding, and the downregulated DEGs including chemokine activity, scavenger receptor activity, and DNA binding. In addition, the cell component (CC) analysis also displayed that the upregulated DEGs were significantly enriched in actin filament bundle and synaptic vesicle, and downregulated DEGs were mainly enriched in extracellular region, MHC class II protein complex, lysosomal membrane, endocytic vesicle membrane, and nucleus.
Table 2
Gene ontology analysis of differentially expressed genes associated with ccRCC.
Figure 3
The top GO terms and KEGG pathways enriched separately for the up-regulated DEGs (a) and the down-regulated DEGs (b). The horizontal axis represents the count of enriched DEGs. The vertical axis represents the enriched GO terms and KEGG pathway. BP = biological processes, CC = cellular component, DEGs = differentially expressed genes, GO = gene ontology, KEGG = Kyoto Encyclopedia of Gene and Genomes, MF = molecular function.
Gene ontology analysis of differentially expressed genes associated with ccRCC.The top GO terms and KEGG pathways enriched separately for the up-regulated DEGs (a) and the down-regulated DEGs (b). The horizontal axis represents the count of enriched DEGs. The vertical axis represents the enriched GO terms and KEGG pathway. BP = biological processes, CC = cellular component, DEGs = differentially expressed genes, GO = gene ontology, KEGG = Kyoto Encyclopedia of Gene and Genomes, MF = molecular function.Table 3 shows the most significantly enriched KEGG pathway of upregulated and downregulated DEGs (Fig. 3). The upregulated DEGs were enriched in Dilated cardiomyopathy, GABAergic synapse and Glutamatergic synapse, and the downregulated DEGs were enriched in Phagosome, Inflammatory bowel disease (IBD), Asthma, Leishmaniasis, and Salmonella infection.
Table 3
KEGG pathway analysis of differentially expressed genes associated with ccRCC.
KEGG pathway analysis of differentially expressed genes associated with ccRCC.
Hub genes and module screening from PPI network
Based on the information in the STRING protein relationship, we made the PPI network of the co-expression DEGs (Fig. 4a), including 72 upregulated DEGs and 95 downregulated DEGs. The upregulated DEGs mainly had two modules by using the MCODE plug-in (Fig. 4b and c), including module 1 (AFP, ABCC2, CYP2B6), module 2 (GNG4, APLN, ADCY1). The downregulated DEGs mainly had four modules by using the MCODE plug-in (Fig. 4d–g), including module 3 (HLA-DMB, HLA-DQA1, CAPZA2, HLA-DOA), module 4 (TLR4, IRF8, CD163, GPR183, CXCL1, CXCL2, STAT1, HGF, S1PR1), module 5 (GPR20, CALCRL, GPR84, PTGDR), and module 6 (GUCY1A3, GUCY1B3, ENTPD1).
Figure 4
Protein-Protein interaction network for the up-regulated and down-regulated DEG (a), the Module 1 (b), the Module 2 (c), the Module 3 (d), the Module 4 (e), the Module 5 (f), and the Module 6 (g) (red: up-regulated Module, green: down-regulated Module). DEG = differentially expressed gene.
Protein-Protein interaction network for the up-regulated and down-regulated DEG (a), the Module 1 (b), the Module 2 (c), the Module 3 (d), the Module 4 (e), the Module 5 (f), and the Module 6 (g) (red: up-regulated Module, green: down-regulated Module). DEG = differentially expressed gene.
The Kaplan–Meier Plotter of the top 15 upregulated and top 15 downregulated DEGs with high degree of connectivity
The top 15 upregulated genes including AFP, GDNF, KIF23, GNG4, APLN, ABCC2, ADCY1, CYP2B6, LOXL2, KLC3, CASC5, GREB1, TM4SF5, CDC25A, FRMD5, while the top 15 downregulated genes including TLR4, CCL4, CXCL9, IRF8, CXCL1, STAT1, CD163, CXCL2, MNDA, GPR183, CD93, HGF, HLA-DQA1, GPR84, and CD47. The GEPIA showed that 11 hub genes, including ADCY1, APLN, FRMD5, GNG4, GREB1, KIF23, CXCL1, CXCL2, GPR84, IRF8, and TLR4, had significantly correlated with overall survival in ccRCC patients (Fig. 5a and b).
Figure 5
Prognostic value of top up-regulated 15 DEGs (a) and top down-regulated 15 DEGs (b) in the patient of clear cell renal cell carcinoma. HR = hazard ratio.
Prognostic value of top up-regulated 15 DEGs (a) and top down-regulated 15 DEGs (b) in the patient of clear cell renal cell carcinoma. HR = hazard ratio.
Analysis of hub gene expression level
After analyzed the TCGA database, we found KIF23 and APLN were highly expressed in ccRCC tissue compared normal renal tissue. However, the expressions of other nine genes have opposite results (Fig. 6a and b).
Figure 6
Boxplots of up-regulated six DEGs (KIF23, GNG4, APLN, ADCY1, GREB1, and FRMD5) (a) and down-regulated five DEGs (TLR4, IRF8, CXCL1, CXCL2, and GPR84) (b) across normal and different pathological stages of clear cell renal cell carcinoma in the TCGA data set.
Boxplots of up-regulated six DEGs (KIF23, GNG4, APLN, ADCY1, GREB1, and FRMD5) (a) and down-regulated five DEGs (TLR4, IRF8, CXCL1, CXCL2, and GPR84) (b) across normal and different pathological stages of clear cell renal cell carcinoma in the TCGA data set.
Discussion
Although the use of targeted therapies, such as multi-kinase inhibitors, anti-VEGF antibodies and mTOR, has improved the median progression-free and overall survival of ccRCC patients for nearly 2 years,[ the development of drug resistance is still one of the main causes of treatment failure, and the molecular mechanism of drug resistance phenomenon still remains unclear.[ In this study, we identified DEGs between ccRCC and normal samples and used a series of bioinformatics analyses to screen key gene and pathways associated with cancer. To obtain an in-depth understanding of these DEGs, we performed the GO function and KEGG pathway analysis of them.In order to screen the key candidate proteins in ccRCC, we conducted the GEO database of GSE53757 and GSE16449, and 167 co-expression genes, including 72 upregulated DEGs and 95 downregulated DEGs, were identified finally. After further investigating and validating Kaplan–Meier Plotter, the top 15 upregulated and top 15 downregulated DEGs were identified as the key candidate proteins with high degree of connectivity.After analyzing the TCGA database, six genes in upregulated, including KIF23, GNG4, APLN, ADCY1, GREB1, and FRMD5, and five genes in downregulated, including TLR4, IRF8, CXCL1, CXCL2, and GPR84, may affect the survival time of ccRCC patients. Some of them deserved our attention and discussion. Kinesin family member 23 (KIF23), was a member of kinesin-like protein family. Previous study found that KIF23 was associated with prognostic factors in patients with breast cancer,[ and tightly related to progression of pancreatic ductal adenocarcinoma (PDAC),[ but others found that the expression of KIF23 showed negative relation to the prognosis of pulmonary adenocarcinoma patients.[ APLN, was encoded a peptide that functions as an endogenous ligand for the G-protein coupled apelin receptor. Yang L et al,[ found that APLN might be a therapeutic potential biomarker in muscle-invasive bladder cancer patients, and also highly expressed in breast cancer, and associated with lymph node metastasis and TNM stage.[ Adenylate cyclase 1 (ADCY1), was encoded a member of the of adenylate cyclase gene family that is primarily expressed in the brain. Hua Y et al,[ found that ADCY1 might play essential roles in the metastasis process of RAC through pancreatic secretion and cell adhesion molecules pathways. Growth regulating estrogen receptor binding 1 (GREB1), was an estrogen-responsive gene that is an early response gene in the estrogen receptor-regulated pathway, whose expression was correlated with estrogen levels in breast cancer patients.[ Toll like receptor 4 (TLR4), was a member of the Toll-like receptor (TLR) family which played a fundamental role in pathogen recognition and activation of innate immunity. It was thought to play an important role in hormone-responsive tissues and cancer. Kusuhara Y et al,[ concluded that low TLR4 expression was correlated with tumor progression, and the expression of TLR4 was inversely associated with prognosis of patients with invasive bladder cancer (BCa), and depletion of TLR4 significantly enhanced the invasive capability of BCa cells. Interferon regulatory factor 8 (IRF8), acted as a tumor suppressor gene through the transcriptional repression of β-catenin-TCF/LEF in NSCLC.[ IRF8 methylation may serve as a potential biomarker in NSCLC prognosis. C-X-C motif chemokine ligand 1, C-X-C motif chemokine ligand 2 (CXCL1, CXCL2), were parts of chemokine superfamily that encoded secreted proteins involved in immunoregulatory and inflammatory processes.[ Aberrant expression of these protein were associated with the growth and progression of certain tumors,[ and may suppressed hematopoietic progenitor cell proliferation.One thing should be noted was that 11 hub genes had significantly correlated with overall survival in ccRCC patients, as we showed in Figure 5a and b, based on Kaplan–Meier Plotter. However, after analyzed the TCGA database, only KIF23 and APLN were highly expressed in ccRCC tissue compared normal renal tissue, while the expression of other nine genes have opposite results. As shown in Figure 6a, KIF23 had high expression in tumor tissues compared in normal tissues. Moreover, with the increase of tumor grade, the expression of KIF23 was also enhanced. Interestingly, APLN also had high expression in tumor tissues. While the expression of APLN was decreased with the increase of tumor grade. The reason was mainly due to the choice of different databases. Thus in-depth experimental research studies should be performed to confirm this finding.
Conclusions
We presumed 167 co-expression DEGs by a series of bioinformatics analyses between ccRCC samples and normal tissues, probably related to the development of ccRCC. Besides, these hub genes might be prognostic and recurrence genes for ccRCC patients as validated from two independent datasets of GSE53757 and GSE16449. These identified genes and pathways provide a more detailed molecular mechanism for underlying ccRCC initiation and development. According to the study, upregulation of KIF23, GNG4, APLN, ADCY1, GREB1, FRMD5, and downregulation of TLR4, IRF8, CXCL1, CXCL2, GPR84 maybe considered as biomarkers or therapeutic targets for ccRCC. However, further molecular and biological experiments are required to validate our findings, which will be our subsequent research work.
Author contributions
Data curation: Ning Zhang, Lu Zhi Gan, Alimujiang Abudurexiti, Gang Xiao Hu, Wei Sang.Formal analysis: Ning Zhang, Xin Wen Chen, Lu Zhi Gan, Alimujiang Abudurexiti, Gang Xiao Hu, Wei Sang.Funding acquisition: Wei Sang.Investigation: Ning Zhang.Methodology: Ning Zhang.Resources: Ning Zhang, Wei Sang.Software: Ning Zhang.Supervision: Xin Wen Chen.Writing – original draft: Ning Zhang.Writing – review & editing: Ning Zhang.
Authors: Banumathy Gowrishankar; Ilsiya Ibragimova; Yan Zhou; Michael J Slifker; Karthik Devarajan; Tahseen Al-Saleem; Robert G Uzzo; Paul Cairns Journal: Cancer Biol Ther Date: 2013-12-18 Impact factor: 4.742
Authors: James J Hsieh; Mark P Purdue; Sabina Signoretti; Charles Swanton; Laurence Albiges; Manuela Schmidinger; Daniel Y Heng; James Larkin; Vincenzo Ficarra Journal: Nat Rev Dis Primers Date: 2017-03-09 Impact factor: 52.329
Authors: Francisco E Vera-Badillo; Arnoud J Templeton; Ignacio Duran; Alberto Ocana; Paulo de Gouveia; Priya Aneja; Jennifer J Knox; Ian F Tannock; Bernard Escudier; Eitan Amir Journal: Eur Urol Date: 2014-06-02 Impact factor: 20.096