BACKGROUND: Bladder cancer (BC) is one of the most common malignant neoplasms in the genitourinary tract. We employed the GSE13507 data set from the Gene Expression Omnibus (GEO) database to identify key genes related to tumorigenesis, progression, and prognosis in BC patients. METHODS: The data set used in this study included 10 normal bladder mucosae tissue samples and 165 primary BC tissue samples. Differentially expressed genes (DEGs) between the 2 types of samples were identified by GEO2R. Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed using the online tool DAVID. The online tool STRING was used to construct a protein-protein interaction network. Moreover, the MCODE and cytoHubba plugins in Cytoscape were employed to find the hub genes and modules among these DEGs. RESULTS: We identified 154 DEGs comprising 135 downregulated genes and 19 upregulated genes. The GO enrichment results were mainly related to the contractile fiber part, extracellular region part, actin cytoskeleton, and extracellular region. The KEGG pathway enrichment results mainly comprised type I diabetes mellitus, asthma, systemic lupus erythematosus, and allograft rejection. A module was identified from the protein-protein interaction network. In total, 15 hub genes were selected, and 3 of them (CALD1, CNN1, and TAGLN) were associated with both overall survival and disease-free survival. CONCLUSION: CALD1, CNN1, and TAGLN may be potential biomarkers for diagnosis as well as therapeutic targets in BC patients.
Bladder cancer (BC) is one of the most common malignancies in the urinary tract, where it is promoted by testosterone and inhibited by estrogen, and the risk of BC in men is 3 times higher than that in women. It is well known that the only reliable tool for the diagnosis and post-treatment monitoring of BC patients is cystoscopy, which is expensive and unpleasant, and patients must undergo several cystoscopies each year to check for recurrence. However, this method is of little use for healthy people, or even for high-risk people who might potentially develop BC. BC is an important public health problem, being the 7th and 17th most common tumor in men and women, respectively. In China, the occurrence of BC increased dramatically from 1991 to 2009, and it caused >20,000 deaths in 2009. Despite many aggressive treatment measures, the survival status of advanced or metastatic BC patients is still poor, and until recently, it had not changed greatly for decades.
Newly discovered biomarkers for gene mutations in BC could significantly improve the accuracy of urine tests for identifying BC. In addition, abnormal MUC16 glycoforms could serve as potential biomarkers for targeted therapeutics in BC patients. Calcium activated chloride channel A4 (CLCA4), a tumor suppressor, has been implicated in the progression of several tumors, including BC. It has been reported that a low expression level of CLCA4 is associated with tumor aggressiveness and unfavorable clinical survival, and CLCA4 may inhibit BC cell proliferation, migration, and invasion by suppressing the PI3K/AKT signaling pathway. Real-time polymerase chain reaction analysis indicates that CLCA4 mRNA is highly expressed in human brain, testis, small intestine, colon, and lung tissues.
High-throughput sequencing analysis is being used increasingly, and it has been applied as a very important tool in various branches of medicine, such as for early cancer diagnosis, cancer grading, and progression prediction. In the current study, we used the GSE13507 data set from the Gene Expression Omnibus (GEO) and the online tool GEO2R to determine the differentially expressed genes (DEGs) between BC and normal samples. We then constructed a protein–protein interaction (PPI) network for the DEGs and identified 15 hub genes. Subsequently, we performed Gene Ontology (GO) enrichment analysis to identify enriched biological process, cellular component, and molecular function terms, as well as enrichment analysis of the DEGs based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, and we constructed a module of the hub genes. Moreover, overall survival (OS) and disease-free survival (DFS) analyses of the hub genes were performed using an online resource. The expression levels and Pearson correlation analysis of the genes were employed to visualize the potential relationships among the genes, as well as to provide novel insights regarding potential therapeutic targets in BC patients.
Materials and methods
Patient information
We selected gene expression profiles from the GSE13507 data set in the GEO database, which is a free public database. The GSE13507 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE13507, accessed December 6, 2017) gene expression profiles comprised 256 samples from the GPL6102 platform (Illumina human-6 v2.0 expression beadchip), with 10 normal bladder mucosae tissues, 165 primary BC tissues, 23 recurrent non-muscle invasive tumor tissues, and 58 normal-looking bladder mucosae surrounding cancer tissues. In order to construct a reliable model, the 165 primary BC samples and 10 normal bladder mucosae samples were selected for the analysis. This study is based on publicly available data that already have ethical approval; therefore, no further ethical approval was required.
Data processing
The online tool GEO2R (https://www.ncbi.nlm.nih.gov/geo/geo2r/, accessed December 11, 2017) was used to determine the DEGs between the primary BC and normal bladder mucosae samples. Adjusted P values, computed by default with the Benjamini and Hochberg false discovery rate method, were used to reduce the false positive rate. Adjusted P values <.05 and |logFC| ≥2 were set as cut-off values. In total, 154 DEGs were determined, which comprised 135 downregulated genes and 19 upregulated genes. The top 15 genes were then selected as hub genes using the default MCC method in cytoHubba, which is a Cytoscape plugin.
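The cut-off step described above can be illustrated with a minimal Python sketch. It is not GEO2R's implementation: it only shows a Benjamini and Hochberg adjustment followed by the two thresholds stated in the text (adjusted P <.05 and |logFC| ≥2). The function names and the toy rows are hypothetical and are not taken from GSE13507.

```python
from typing import List, Sequence, Tuple


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest P value so that the
    # adjusted values stay monotone (step-up procedure).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(
    rows: Sequence[Tuple[str, float, float]],
    p_cut: float = 0.05,
    lfc_cut: float = 2.0,
) -> List[Tuple[str, float, float]]:
    """rows holds (gene, logFC, raw P); keep adj P < p_cut and |logFC| >= lfc_cut."""
    adj = benjamini_hochberg([p for _, _, p in rows])
    return [
        (gene, lfc, q)
        for (gene, lfc, _), q in zip(rows, adj)
        if q < p_cut and abs(lfc) >= lfc_cut
    ]


# Toy input with placeholder gene names (hypothetical values).
toy_rows = [
    ("geneA", -2.5, 0.0001),
    ("geneB", 1.0, 0.0002),
    ("geneC", 3.1, 0.5),
    ("geneD", 2.0, 0.001),
]

for gene, lfc, q in select_degs(toy_rows):
    direction = "down" if lfc < 0 else "up"
    print(f"{gene}: logFC={lfc:+.2f}, adj.P={q:.4g} ({direction}regulated)")
```

In this toy input, geneB fails the fold-change threshold and geneC fails the adjusted P threshold, so only geneA and geneD are retained.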
GO and KEGG pathway enrichment analysis of DEGs
GO enrichment analysis is a widely used method for annotating genes and gene products and for assembling biological attributes for high-throughput genome and transcriptome data. KEGG is a database resource used for understanding the high-level functions and utilities of a biological system based on molecular-level information obtained by genome sequencing and other high-throughput experimental techniques. The Database for Annotation, Visualization, and Integrated Discovery v6.7 (DAVID, https://david-d.ncifcrf.gov/, accessed December 13, 2017) was employed to identify the enriched biological process, cellular component, molecular function, and KEGG pathways for the 154 DEGs.
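The idea behind term enrichment can be sketched with a one-sided hypergeometric test: given a background of N genes, of which K are annotated to a term, how surprising is it to see k annotated genes in a list of n? The sketch below is an illustration only. DAVID reports its own statistic (the EASE score, a more conservative modified Fisher exact P value), so the values will not match DAVID's output exactly. The function name and the numbers in the example are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(k: int, n: int, K: int, N: int) -> float:
    """Upper-tail probability P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: number of background genes
    K: background genes annotated to the term
    n: genes in the query list (for example, a DEG list)
    k: query genes annotated to the term
    """
    if not (0 <= K <= N and 0 <= n <= N and 0 <= k <= min(n, K)):
        raise ValueError("inconsistent counts")
    total = comb(N, n)
    # Sum the probabilities of observing k, k+1, ..., min(n, K) hits.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return tail / total


# Toy example: 10 background genes, 3 annotated to a term,
# 4 genes in the list, 2 of which carry the annotation.
print(hypergeom_enrichment_p(k=2, n=4, K=3, N=10))
```

When many terms are tested, the resulting P values would normally be corrected for multiple testing before interpretation.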
PPI network and module analysis
The online Search Tool for the Retrieval of Interacting Genes (STRING) website was employed to construct the interaction network among the proteins. The Molecular Complex Detection (MCODE) plugin in Cytoscape was then utilized to screen modules in the PPI network with the default settings. GO and KEGG pathway enrichment analyses of the genes identified by the MCODE plugin were then performed on the DAVID website.
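For readers unfamiliar with the MCC (maximal clique centrality) ranking used to pick hub genes from the PPI network, the sketch below shows one way to compute it in plain Python. It assumes the definition given in the published description of cytoHubba, where the score of a node is the sum of (|C| - 1)! over all maximal cliques C that contain it. It is not the plugin's code, the helper names are hypothetical, and the toy network uses placeholder node names rather than genes from this study.

```python
from math import factorial
from typing import Dict, Iterable, List, Set, Tuple


def build_adjacency(edges: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map, ignoring self-loops."""
    adj: Dict[str, Set[str]] = {}
    for a, b in edges:
        if a == b:
            continue
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def maximal_cliques(adj: Dict[str, Set[str]]) -> List[Set[str]]:
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques: List[Set[str]] = []

    def expand(r: Set[str], p: Set[str], x: Set[str]) -> None:
        if not p and not x:
            cliques.append(r)
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def mcc_scores(edges: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """MCC(v) = sum over maximal cliques C containing v of (|C| - 1)!."""
    adj = build_adjacency(edges)
    scores = {v: 0 for v in adj}
    for clique in maximal_cliques(adj):
        weight = factorial(len(clique) - 1)
        for v in clique:
            scores[v] += weight
    return scores


# Toy network: a triangle A-B-C plus a pendant edge C-D.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
ranking = sorted(mcc_scores(toy_edges).items(), key=lambda kv: (-kv[1], kv[0]))
print(ranking)
```

Nodes that sit in large, densely connected cliques receive much higher scores than nodes with the same degree in sparse regions, which is why this kind of score favors tightly interacting groups of genes. The basic recursion shown here is adequate for a network of a few hundred nodes; larger networks would call for a pivoting variant.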
Survival analysis, expression levels, and correlations of hub genes
The online Gene Expression Profiling Interactive Analysis (GEPIA, http://gepia.cancer-pku.cn/index.html, accessed December 20, 2017) resource was employed to depict the OS and DFS outcomes based on the hub gene expression levels. The genes related to both OS and DFS were identified for further study based on Pearson correlation analysis and the tissue expression levels in both BC and normal tissues.
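The Pearson correlation coefficient used to compare gene expression levels can be computed as in the short sketch below. The function name and the expression vectors are hypothetical illustrations and are not data from GEPIA or from this study.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / sqrt(sxx * syy)


# Toy expression values for two hypothetical genes across five samples.
gene_one = [2.1, 3.4, 4.0, 5.2, 6.8]
gene_two = [1.9, 3.0, 4.2, 5.0, 7.1]
print(round(pearson_r(gene_one, gene_two), 3))
```

A coefficient close to 1 indicates that the two genes rise and fall together across samples; it does not by itself show that one gene regulates the other.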
Results
Identification of DEGs and hub genes
In total, 165 primary BC and 10 normal bladder mucosae samples were considered in this study. The online GEO2R tool was utilized to determine the DEGs based on the cut-off values of adjusted P <.05 and |logFC| ≥2. We identified 154 DEGs comprising 135 downregulated genes and 19 upregulated genes. A PPI network was constructed using the 154 DEGs. Furthermore, 15 hub genes were identified among the 154 DEGs using the default MCC method. The 15 hub genes were TPM1, TPM2, MYH11, ACTA2, MYL9, CNN1, TAGLN, CALD1, ACTG2, TOP2A, CCNB2, ASPM, NUSAP1, TPX2, and CDC20 (Table 1). Details of the 154 DEGs are presented in Supplementary Table 1.
Table 1
Top 15 hub genes among 154 differentially expressed genes.
GO and KEGG pathway enrichment analysis for 154 DEGs
The 135 downregulated genes were imported into DAVID to obtain the enrichment results based on GO and KEGG pathway analysis. Among the GO enrichment results, the top 20 GO terms were contractile fiber, contractile fiber part, actin cytoskeleton, extracellular region part, extracellular region, myofibril, extracellular matrix, growth factor binding, cytoskeletal protein binding, actin filament bundle, smooth muscle contractile fiber, actomyosin, proteinaceous extracellular matrix, cytoskeleton organization, actin cytoskeleton organization, actin binding, extracellular matrix organization, actin filament-based process, structural constituent of muscle, and sarcomere (Table 2). However, the 19 upregulated genes were not enriched for any GO terms. Details of the 170 GO enrichment results are shown in Supplementary Table 2.
Table 2
Top 20 enriched genes according to GO analysis of differentially expressed genes.
According to the KEGG pathway enrichment analysis of the 135 downregulated genes, the 17 enriched pathways comprised type I diabetes mellitus, asthma, systemic lupus erythematosus, allograft rejection, graft-versus-host disease, viral myocarditis, intestinal immune network for IgA production, autoimmune thyroid disease, vascular smooth muscle contraction, cardiac muscle contraction, antigen processing and presentation, cell adhesion molecules, complement and coagulation cascades, focal adhesion, hypertrophic cardiomyopathy, hematopoietic cell lineage, dilated cardiomyopathy, and arachidonic acid metabolism. The 19 upregulated genes were not enriched for any pathways. Details of the enrichment results are shown in Table 3.
Table 3
Enriched KEGG pathway analysis of differentially expressed genes.
PPI network of hub genes and module obtained using MCODE
STRING was employed to obtain an interaction network comprising the 15 hub genes, where each interacted with at least 5 other proteins. The network is presented in Fig. 1. Based on the PPI network, Cytoscape determined a module using the default MCODE settings, where 20 genes were assembled in the module. The assembled genes were analyzed to determine their enriched GO and KEGG pathways (Table 4).
Figure 1
The protein–protein interaction network of the top 15 hub genes.
Table 4
Top 20 enriched terms detected for the 20 genes by GO analysis and MCODE.
The enriched KEGG pathways comprised vascular smooth muscle contraction, cardiac muscle contraction, hypertrophic cardiomyopathy, dilated cardiomyopathy, and cell cycle. The detailed module and KEGG pathways are shown in Fig. 2.
Figure 2
Module obtained from the protein–protein interaction network and enriched pathways for genes in the module. (A) The module was generated using MCODE. (B) Enriched pathways in the module. MCODE = molecular complex detection.
According to the GO enrichment results for the 20 genes, the top 20 enriched GO terms comprised cytoskeleton, intracellular non-membrane-bounded organelles, non-membrane-bounded organelles, cytoskeletal part, actin cytoskeleton, contractile fiber part, contractile fiber, smooth muscle contractile fiber, cytoskeleton organization, cytoskeletal protein binding, M phase, mitotic cell cycle, cell cycle phase, spindle, muscle contraction, actin binding, muscle system process, microtubule cytoskeleton, actomyosin, and cell cycle process (Fig. 3). Details of the 106 GO enrichment results are shown in Supplementary Table 3.
Figure 3
Expression levels and Pearson correlation analysis of CALD1, CNN1, and TAGLN genes.
Survival curves, expression levels, and correlation analysis of hub genes
Among the 15 hub genes, CALD1, CNN1, TAGLN, TPM2, ACTA2, MYH11, and TPM1 had statistically significant P values, where CALD1, CNN1, and TAGLN were related to both OS and DFS (all P ≤.05). TPM2, ACTA2, MYH11, and TPM1 were related only to OS (all P ≤.05). The detailed results are shown in Fig. 4.
Figure 4
Prognostic analysis of overall survival and disease-free survival for CALD1 (A and B), CNN1 (C and D), TAGLN (E and F), TPM2 (G), ACTA2 (H), MYH11 (I), and TPM1 (J) genes.
Among the 15 hub genes, only the genes that had relationships with both OS and DFS were selected for further analysis, that is, CALD1, CNN1, and TAGLN. The expression levels of these 3 genes are shown in Fig. 3, where all of them were low in tumor tissues but high in normal tissues, and they differed significantly between the normal and BC tissues. In addition, their low expression levels were associated with a better prognosis. The Pearson correlation coefficients between the gene expression levels are shown in Fig. 3 (all R >0.9).
Discussion
BC originates from the epithelial lining of the urinary bladder and it is one of the most common genitourinary tumors. In China, the incidence and mortality of BC have increased rapidly in recent decades. Currently, pathological analyses including the clinical stage and tumor grade are the main determinants used for risk evaluation and therapeutic decision making for BC patients. However, none of the conventional histopathological parameters has satisfactory sensitivity and specificity for detecting, monitoring, and determining the prognosis in BC patients. Due to these limitations, many studies have tried to identify potential molecular markers for early detection, early diagnosis, and the development of effective treatments.
In the present study, we employed gene expression profiles from the GSE13507 data set in the GEO database to identify potential molecular markers in BC patients. The data set comprised 256 samples, with 10 normal bladder mucosae tissue samples, 165 primary BC tissue samples, 23 recurrent non-muscle invasive tumor tissue samples, and 58 normal-looking bladder mucosae surrounding cancer tissue samples. To identify potential molecular markers relative to healthy people, we selected the 10 normal bladder mucosae and 165 primary BC tissue samples. In total, 154 DEGs were detected using the GEO2R online tool, where 135 genes were downregulated and 19 genes were upregulated. Moreover, the MCODE and cytoHubba plugins were employed to produce a module and detect 15 hub genes among these DEGs. To obtain a more in-depth understanding of these DEGs, we performed GO and KEGG pathway enrichment analyses.
The GO enrichment results showed that the downregulated genes were mainly involved with contractile fiber, actin cytoskeleton, extracellular region part, growth factor binding, actin filament bundle, and cytoskeleton organization. Moreover, the KEGG pathway enrichment results indicated that the downregulated genes were mainly involved with type I diabetes mellitus, asthma, systemic lupus erythematosus, allograft rejection, graft-versus-host disease, and viral myocarditis. However, the 19 upregulated genes were not enriched for any GO terms or KEGG pathways. In total, 20 genes were assembled in the module based on the PPI network, and the enriched KEGG pathways for these 20 genes included vascular smooth muscle contraction, cardiac muscle contraction, hypertrophic cardiomyopathy, dilated cardiomyopathy, and the cell cycle. Among the DEGs, 15 hub genes were selected in the PPI network, where 7 genes comprising CALD1, CNN1, TAGLN, TPM2, ACTA2, MYH11, and TPM1 had significant correlations with the patient prognosis. CALD1, CNN1, and TAGLN had relationships with both OS and DFS, so they were subjected to further analysis, where they all had low expression levels in tumor tissues but high expression levels in normal tissues. In addition, their low expression levels were associated with a better prognosis. Based on these results, we hypothesize that CALD1, CNN1, and TAGLN may function as oncogenes.
CALD1 is a novel target of TEA domain family member 4, and it is involved with cell proliferation and migration. In the human CALD1 promoter, 2 glucocorticoid-response element-like sequences may be bound directly by an activated form of the glucocorticoid receptor, which transactivates the CALD1 gene, thereby upregulating the caldesmon protein and regulating cell migration via the reorganization of the actin cytoskeleton. CALD1 was identified as a tumor-specific splicing variant in all of the validated colon, urinary bladder, and prostate organ samples among 102 normal and cancer tissue samples. It has been suggested that CALD1 may indicate general cancer-related splicing events. Splicing variants of CALD1 are differentially expressed in glioma neovascularization versus normal brain microvasculature. The missplicing of CALD1 is an independent epigenetic event that is regulated at the transcriptional level, which is correlated with the breakdown of tight junctions among epithelial cells in the glioma microvasculature.
CNN1 plays a tumor-suppressive role in ovarian cancer and it is a structural molecular signature of cancer initiation and progression. CNN1 functions as a tumor suppressor gene and it is an indicator of cell migration in primary cultured invasive hepatocellular carcinoma cells.
TAGLN is a downstream target of miR-144 and its expression level is upregulated in osteosarcoma cells, where it is inversely correlated with miR-144 expression. The expression of TAGLN in NF-1 associated malignant peripheral nerve sheath tumors is upregulated by hypomethylation in its promoter and subpromoter regions. TAGLN is significantly overexpressed in lung adenocarcinoma and it may be a reliable therapeutic target and a potential biomarker for predicting the prognosis of lung adenocarcinoma patients. TAGLN is coexpressed with TAZ-AXL-CTGF, where it is upregulated and associated with the progression of colon cancer.
Some limitations of this study should be recognized. Validation in other populations, including clinical samples to confirm expression at the mRNA and protein levels, is warranted to further support our findings. In addition, functional experiments are needed to explore the effects of these genes on metastasis and proliferation.
Thus, in this study, we determined the DEGs between normal bladder mucosae and primary BC samples in the GSE13507 data set. The hub genes among the DEGs were associated with the prognoses of BC patients, and 3 were correlated with both OS and DFS. These genes have associations with many diseases, including colon cancer, hepatocellular carcinoma, osteosarcoma, lung adenocarcinoma, and glioma. However, our study has several limitations. Larger samples are required to increase the reliability of the findings, and functional experiments are needed to validate our results. According to our bioinformatics analysis and previous studies, we suggest that the CALD1, CNN1, and TAGLN genes may be potential molecular markers in BC patients.
Conclusions
We identified 3 potential molecular markers for BC diagnosis and they may even be therapeutic targets in BC patients, but more detailed functional validation and investigations of the mechanisms related to these genes are necessary. Our results may provide some powerful insights to facilitate the future individualized treatment of BC patients.
Acknowledgments
The authors are grateful to the organizations and researchers who provided the data used in this study, and it is our pleasure to acknowledge their contributions.