Shasha Wu1, Feixiang Wu2, Zheng Jiang1. 1. Department of Gastroenterology, The First Affiliated Hospital of Chongqing Medical University, Chongqing 400016, P.R. China. 2. Department of Urology, The First Affiliated Hospital of Chongqing Medical University, Chongqing 400016, P.R. China.
Abstract
Colorectal cancer (CRC) is the most common cancer of the digestive system. The aim of the present study was to identify the potential biomarkers and uncover the underlying mechanisms. The gene and miRNA expression profiles were obtained from GEO database. The differentially expressed genes (DEGs) and miRNAs (DE miRNAs) were identified by GEO2R. The gene ontology (GO) enrichment and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis were performed by KOBAS 3.0. The protein-protein interaction (PPI) network and miRNA-gene network were constructed by Cytoscape software. Then, the identified genes were verified by quantitative real-time PCR in both CRC tissue samples and cell lines. A total of 600 upregulated DEGs, 283 downregulated DEGs, 13 upregulated DE miRNAs and 7 downregulated DE miRNAs were identified. GO analysis results showed that upregulated DEGs were significantly enriched in binding, organelle and cellular process. Downregulated DEGs were enriched in binding, extracellular region and chemical homeostasis. KEGG analysis showed that the DEGs were mostly enriched in cell cycle and pathways in cancer. A total of eight genes were identified as biomarkers, including CAD, ITGA2, E2F3, BCL2, PRKACB, IGF1, SGK1 and NR3C1. Experimental validation showed that seven of the eight identified genes had the same expression trend as predicted, except for ITGA2. Besides, hsa-miR-552 and hsa‑miR-30a were identified as key miRNAs. the present study provides a series of biomarkers and mechanisms for the diagnosis and therapy of CRC. We also prove that although bioinformatics analysis is a wonderful approach, experiment validation is necessary.
Colorectal cancer (CRC) is the most common cancer of the digestive system. The aim of the present study was to identify the potential biomarkers and uncover the underlying mechanisms. The gene and miRNA expression profiles were obtained from GEO database. The differentially expressed genes (DEGs) and miRNAs (DE miRNAs) were identified by GEO2R. The gene ontology (GO) enrichment and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis were performed by KOBAS 3.0. The protein-protein interaction (PPI) network and miRNA-gene network were constructed by Cytoscape software. Then, the identified genes were verified by quantitative real-time PCR in both CRC tissue samples and cell lines. A total of 600 upregulated DEGs, 283 downregulated DEGs, 13 upregulated DE miRNAs and 7 downregulated DE miRNAs were identified. GO analysis results showed that upregulated DEGs were significantly enriched in binding, organelle and cellular process. Downregulated DEGs were enriched in binding, extracellular region and chemical homeostasis. KEGG analysis showed that the DEGs were mostly enriched in cell cycle and pathways in cancer. A total of eight genes were identified as biomarkers, including CAD, ITGA2, E2F3, BCL2, PRKACB, IGF1, SGK1 and NR3C1. Experimental validation showed that seven of the eight identified genes had the same expression trend as predicted, except for ITGA2. Besides, hsa-miR-552 and hsa‑miR-30a were identified as key miRNAs. the present study provides a series of biomarkers and mechanisms for the diagnosis and therapy of CRC. We also prove that although bioinformatics analysis is a wonderful approach, experiment validation is necessary.
CRC is the most common digestive system cancer and is the third most common cancer in the world. Early stage CRC does not have obvious symptoms and is often diagnosed in advanced stage. CRC patients without metastases can be cured by surgery, but patients with advanced CRC are mainly treated with chemotherapy (1). Therefore, early diagnosis and treatment of CRC is needed.Genetic changes have been investigated in CRC. Some significant oncogenes and tumor suppressor genes such as APC, KRAS and p53 are mutated in a considerable part of CRCs (2). This could allow screening and diagnosis of early-stage CRC, which may lead to substantial decreases in the mortality rate of CRC (3).miRNAs are a series of small, non-coding regulatory RNAs. By repressing their target mRNAs, miRNAs regulate many cellular pathways, such as proliferation, apoptosis and differentiation. This feature of miRNAs provides a novel approach for cancer therapy (4). miRNAs such as miRNA-92a and miRNA-21 have shown significant connection with CRC, and can serve as biomarkers (5). Ng et al (6) reported plasma miR-92a can effectively discriminate CRC from control subjects. It has been verified that miR-21 can repress the tumor suppressor gene and induce cell invasion (7).Microarray is a multiple lab-on-a-chip. To identify the biomarkers of CRC, we downloaded the gene and miRNA expression profiles of CRC from GEO database. Expression differences were compared between CRC tissues and normal colorectal tissues. By analyzing GO (8) and pathway enrichment (9) and constructing PPI network (10) and miRNA-gene network, we aimed to find key genes and miRNAs which play significant roles in the occurrence and development of CRC and discover new biomarkers for diagnosis and therapy.
Materials and methods
Microarray data
Three gene expression profiles (GSE21815, GSE32323 and GSE44076) and two miRNA expression profiles (GSE39845 and GSE53592) were obtained from GEO database (http://www.ncbi.nlm.nih.gov/geo/) (11). The GSE21815 datasets contained 132 CRC samples and 9 normal samples. GSE32323 included 17 CRC samples and 17 normal samples. GSE44076 consisted of 98 CRC samples and 98 normal samples. The miRNA expression profile of GSE39845 contained 3 CRC samples and 3 normal tissue samples. GSE53592 included 3 CRC samples and 3 normal samples.
Identification of DEGs and DE miRNAs
GEO2R (http://www.ncbi.nlm.nih.gov/geo/info/geo2r.html) is an interactive web tool for comparing two or more groups of samples in a GEO series to identify DEGs or DE miRNAs across experimental conditions. We used GEO2R to identify DEGs and DE miRNAs. The P-value <0.05 and |logFC|>1 were chosen as cut-off criteria.
Functional enrichment analysis of DEGs
KOBAS 3.0 (http://kobas.cbi.pku.edu.cn/) is a latest updated web server for gene/protein functional annotation and functional sets enrichment of gene (12). The GO enrichment and KEGG (13) pathway analysis were performed by KOBAS 3.0 online tool. In addition, P<0.05 was set as the cut-off criterion.
PPI network construction and module analysis
To explore the interactive relationships among the DEGs, PPI network was constructed by the Search Tool for the Retrieval of Interacting Genes (STRING, version 10.0, http://string.embl.de/) and combined score >0.4 was set as the cut-off criterion. Then, PPI network was visualized by Cytoscape software (14). The Molecular Complex Detection (MCODE) app was performed to analyze modules of PPI network. MCODE scores >3 and the number of nodes >4 were set as cut-off criteria. The pathway enrichment analysis of genes in the modules was performed by KOBAS 3.0. P<0.05 and input number >3 were considered to be significant.
miRNA-gene network construction
The target genes of DE miRNAs were predicted by five established miRNA target prediction programs (miRanda, MirTarget2, PicTar, PITA and TargetScan). The genes predicted by at least three programs were chosen as the targets of DE miRNAs. Then, miRNA-gene network was constructed. To identify the key gene biomarkers, we combined both PPI and miRNA-gene network. Genes with degree ≥20 in PPI network and degree ≥3 in miRNA-gene network were selected as gene biomarkers.
Cell culture
The human CRC cell lines HCT116 and HT-29 were obtained from the American Type Culture Collection Cell Center and cultured in RPMI-1640 medium (HyClone Laboratories, Inc., Logan UT, USA) supplemented with 10% fetal bovine serum (FBS; PAN-Biotech, Aidenbach, Germany) and 1% penicillin-streptomycin (Beyotime Institute of Biotechnology, Haimen, China) at 37°C in 5% CO2.
Patient samples
CRC tissue samples and normal colorectal tissue samples were collected from patients with CRC in The First Affiliated Hospital of Chongqing Medical University (Chongqing, China). Ethics approval was obtained from the Clinical Research Ethics Committee of The First Affiliated Hospital of Chongqing Medical University. Informed consents were signed by all patients for the acquisition and use of tissue samples.
Quantitative real-time PCR
Total RNA was extracted from cultured cells (HCT116 and HT-29) by RNAiso Plus (Takara Bio, Dalian, China) and tissues by TRIzol reagent (Invitrogen, Carlsbad, CA, USA). RNA (1 µg) was reverse transcribed by the GoScript™ Reverse Transcription system (Promega, Madison, WI, USA). Real-time PCR was performed by the GoTaq qPCR Master Mix (Promega). All q-PCR values of each gene were normalized against ACTB. The relative expressions of eight DEGs were calculated by the 2−∆∆Ct method (15).
Statistical analysis
The final data were presented as mean ± SD. The results were analyzed by the Students t-test, and P<0.05 was considered to show statistical difference.
Results
Identification of DEGs
A total of 8027, 1328 and 1230 upregulated DEGs were identified with the 653, 1231 and 1422 downregulated DEGs from GSE21815, GSE32323 and GSE44076, respectively. There were 600 upregulated DEGs (Fig. 1A) and 283 downregulated DEGs (Fig. 1B) in the overlap of the three datasets.
Figure 1.
Identification of DEGs and DE miRNAs. (A) Identification of upregulated DEGs. (B) Identification of downregulated DEGs. (C) Identification of upregulated DE miRNAs. (D) Identification of downregulated DE miRNAs.
KOBAS 3.0 was performed to analyze the functional and pathway enrichment of identified DEGs. GO analysis results showed that upregulated DEGs were significantly enriched in binding, protein binding and organic cyclic compound binding at MF level; organelle, intracellular part and cell part at CC level and cellular process, metabolic process and organic substance metabolic process at BP level (Fig. 2). Downregulated DEGs were enriched in binding, receptor binding and protein binding at MF level; extracellular region, cell periphery and plasma membrane at CC level and chemical homeostasis, cellular response to chemical stimulus and single-organism process at BP level (Fig. 2). Most GO terms enriched in the regulation processes. Binding is a basic step of regulation process. The GO terms mostly enriched in binding which is consistent with the knowledge that tumor development and progression is regulated by multiple molecules (16–18). The GO analysis supported the correlation between the identified DEGs and CRC.
Figure 2.
GO analysis of the DEGs. Black bars stand for the upregulated DEGs, while white bars stand for the downregulated DEGs.
KEGG analysis showed that the upregulated DEGs were mostly enriched in cell cycle, DNA replication, p53 signaling pathway and pathways in cancer (Fig. 3A). The downregulated DEGs were enriched in metabolic pathways and pathways in cancer (Fig. 3B). Cell cycle is the series of events that take place in a cell leading to its division and DNA replication to produce two new cells. Abnormal cell cycle regulation will lead to the occurrence of cancer. Many molecules participate in cancers by regulating the cell cycle (19,20). Multiple genes can disturb DNA replication and cause DNA replication stress and genome instability (21). Oncogenes increase DNA replicative stress and induce a DNA damage response early in tumorigenesis (22). Activated by various factors, p53 signaling pathway represses cancer progression by arresting growth, or by promoting cellular death programs (23). Various molecules regulate the cell cycle and apoptosis though activating or repressing p53 signaling pathway (24–26). Pathways in cancer consist of many classical signaling pathways, such as Wnt, PI3K-Akt, MAPK, VEGF, PPAR, TGF-β and p53 signaling pathways, which play significant roles in regulating cell proliferation, apoptosis, invasion, metastasis and adhesion. The KEGG pathway enrichment proved the relationship between the identified DEGs and CRC. All the above analyses supported the reliability of the results in the present study.
Figure 3.
KEGG pathway analysis of DEGs. (A) KEGG pathway analysis of upregulated DEGs. (B) KEGG pathway analysis of downregulated DEGs.
PPI network construction and analysis of modules
The PPI network was constructed by STRING and visualized by Cytoscape (version 3.4.0) (Fig. 4). Degree ≥20 was set as the cut-off criterion. A total of 168 DEGs were chosen as hub genes, such as CAD, ITGA2, BCL2, PRKACB, IGF1, SGK1, E2F3 and NR3C1. Moreover, the whole PPI network was analyzed by MCODE. The top five modules were selected, and the KEGG pathway enrichment analysis of the genes involved in modules were performed by KOBAS 3.0. The pathway enrichment analysis revealed that DEGs in these modules were mostly enriched in TNF signaling pathway, DNA replication, cell cycle, p53 signaling pathway and pathways in cancer (Fig. 5).
Figure 4.
PPI network construction and module analysis. Yellow nodes represent the DEGs, purple nodes represent the genes involved in modules and the lines represent the interaction between two nodes.
Figure 5.
KEGG pathway analysis of the genes in the top five modules.
miRNA-gene network
To further understand the regulatory relationship between identified DEGs and miRNAs, two miRNA profiles were analyzed. A total of 45 and 25 upregulated DE miRNAs (Fig. 1C) were identified with 38 and 20 downregulated DE miRNAs (Fig. 1D) from GSE39845 and GSE53592, respectively. Among them, 13 upregulated DE miRNAs and seven downregulated DE miRNAs were screened out in both datasets. The targeted genes were predicted by the five programs. Besides, to identify the reliable target genes, we compared the targets with DEGs. Only the overlapping genes were selected. Following, miRNA-gene network was constructed (Fig. 6). By analyzing PPI and miRNA-gene network, a total of eight genes were chosen, consisting of 3 upregulated DEGs (CAD, ITGA2 and E2F3) and 5 downregulated DEGs (BCL2, PRKACB, IGF1, SGK1, and NR3C1) (Fig. 7).
Figure 6.
miRNA-gene network. Yellow nodes stand for DEGs, while green nodes stand for DE miRNAs. The lines stand for the regulation relationship between DE miRNAs and DEGs.
Figure 7.
Eight gene biomarker regulation networks. Yellow nodes stand for DEGs, red nodes stand for the eight gene biomarkers and green nodes stand for DE miRNAs. The lines stand for the regulation relationship between two nodes.
Experimental validation of the identified biomarker genes
To verify the bioinformatics analysis for predicting the potential biomarkers in CRC, the eight candidate genes were selected for further quantitative real-time PCR. The expression of eight genes was detected in both tissue samples and CRC cell lines (HCT116 and HT-29). As shown in Fig. 8, seven of the eight identified genes were confirmed to have the same expression trend as predicted by bioinformatics analysis. CAD and E2F3 were overexpressed in tumor tissue samples and CRC cell lines, while the expression of BCL2, PRKACB, IGF1, SGK1 and NR3C1 was reduced in CRC tissue samples and cell lines. However, the expression of ITGA2 in tumor tissue samples and cell lines was lower than that in normal tissue samples. Besides, there was no significant difference in the expression of NR3C1 and PRKACB in HCT116 cell line and normal tissues.
Figure 8.
Quantitative real-time PCR results for identified gene biomarkers. Eight candidate genes were selected for further quantitative real-time PCR. Expression of these DEGs was normalized against ACTB expression. The statistical significance of differences was calculated by the Students t-test. *P<0.05, **P<0.01, ***P<0.001. #The expression of gene could not be detected by q-PCR.
Discussion
Recently, microarray and bioinformatics analysis have been widely used in identifying potential targets for diagnosis and therapy of different cancers. In the present study, a total of 883 DEGs were screened, including 600 upregulated DEGs and 283 downregulated DEGs. We then performed functional enrichment analysis on the DEGs. For the upregulated DEGs, GO terms were significantly enriched in binding at MF level, organelle at CC level and cellular process at BP level. For the downregulated DEGs, most enriched GO terms were binding at MF level, extracellular region part at CC level and chemical homeostasis at BP level. Most of these GO terms were basic regulatory concepts in cell. Pathway enrichment analysis revealed that cell cycle, p53 signaling pathway and pathways in cancer were significantly enriched, which were associated with the occurrence of cancers. In addition, we constructed the PPI network and analyzed the top five modules. KEGG pathway analysis revealed that the DEGs in these modules were also enriched in cell cycle, p53 signaling pathway and pathways in cancer, which confirmed the findings above. These enriched pathways revealed insight into the molecular mechanisms of CRC, which could provide effective therapeutic strategies for CRC.miRNAs have been reported to participate in the occurrence and progress of cancer. In the present study, we identified 13 upregulated miRNAs and seven downregulated miRNAs. We then constructed the miRNA-gene network. To identify the key biomarkers of CRC, we combined the two networks. A gene with higher degree in PPI network and regulated by more miRNAs in miRNA-gene network was considered to play a more important role in CRC. A total of eight genes were identified, consisting of three upregulated DEGs (CAD, ITGA2 and E2F3) and five downregulated DEGs (BCL2, PRKACB, IGF1, SGK1 and NR3C1). Besides, hsa-miR-552 and hsa-miR-30a represented the more important role in regulating target genes.Experimental validation is necessary to confirm our results predicted by bioinformatics analysis. We found that seven of the eight candidate genes have the same expression trend as predicted by qPCR (P<0.05). Although ITGA2 did not overexpress in CRC tissue as predicted, it has been reported to be downregulated in prostate cancer (27). In addition, Ferraro et al (28) proved that depressedITGA2 could reduce cell migration in colon cancer. As for NR3C1, its expression in HCT116 cell line was significantly higher than that in tumor tissues, but could not be detected in HT-29 cell line. Lind et al (29) reported that hypermethylation was significantly associated with the absence or reduced expression of NR3C1, which leads to different expression levels of NR3C1 in different tissues and cells.The pathogenesis of cancer is a complex process driven by many factors, including genetic and epigenetic alterations. There was no report on CAD in CRC. However, increased expression of CAD was associated with local tumor extension and cancer recurrence in prostate cancer (30). It may become a novel biomarker of CRC. Downregulation of E2F3 could inhibit cell proliferation and induce apoptosis in CRC. Besides, E2F3 is a target gene of several miRNAs (31,32). BCL2 is an apoptosis regulator. It has been identified as a cause of several cancers, including CRC, melanoma, breast and lung cancer (33–36). PRKACB is a member of the serine/threonine protein kinase family and encodes a catalytic subunit of cAMP-dependent protein kinase. Kvissel et al (37) reported that PRKACB isoforms play different roles in proliferation and differentiation in prostate cancer. IGF1 polymorphisms could influence the risk of CRC (38). For example, the SNP rs6214 of IGF1 could increase the CRC risk (39). Besides, the expression level of IGF1 correlates significantly with tumor size (P=0.0024) and depth of invasion (P=0.0147) in CRC (40). SGK1 promotes survival, invasiveness, and metastasis of CRC cells (41). It can also enhance the growth and migration of NSCLC cells by activating β-catenin/TCF signaling pathway (42). SGK1 expression is also increased in endometrial cancer. Suppression of the expression of SGK1 can induce autophagy and apoptosis (43). hsa-miR-552 and hsa-miR-30a were identified as key miRNAs in the miRNA-gene network. Downregulation of miR-552 reduced cell proliferation, migration and clonogenicity in CRC (44). Conversely, inhibition of miR-30a promoted cell proliferation, migration and invasion in CRC (45).Cancer is a complex disease caused by multiple factors. A single reason is not sufficient to explain the mechanism of cancer. Combination of mRNAs, miRNAs and interaction networks can help us to investigate and explain the molecular mechanism of cancer (46). It is an effective approach to identify biomarkers of various diseases.In the present study, we applied bioinformatics analysis to identify the biomarkers of CRC and performed experimental validation. But there were still some limitations in this study. First, the amount of CRC tissue samples was not enough. Because of the difficulty to obtain clinical samples and information. Second, we only checked two cell lines, which cannot fully reveal the difference in expression of biomarkers in CRC.In conclusion, eight genes and two miRNAs were identified as the biomarkers of CRC. Besides, this study also provided a series of significant pathways and mechanisms for diagnosis and therapy. Eight genes were verified by experiments, including CAD, ITGA2, E2F3, BCL2, PRKACB, IGF1, SGK1 and NR3C1, which might become the new stars in research of CRC.