Zhihua Chen1, Yilin Lin1, Ji Gao2, Suyong Lin1, Yan Zheng1, Yisu Liu1, Shao Qin Chen1. 1. Department of Gastrointestinal Surgery, The First Affiliated Hospital of Fujian Medical University, Fuzhou, Fujian 350004, P.R. China. 2. School of Nursing, Fujian Medical University, Fuzhou, Fujian 350004, P.R. China.
Colorectal cancer (CRC) ranks third (13.5%) and second (9.5%) in incidence among malignancies worldwide in male and female patients, respectively, and is a serious hazard to human health (1). Previous studies have demonstrated that the molecular pathogenesis of CRC is mostly driven by genetic mutations (2,3). Numerous studies over the past two decades have reported that genetic mutations are associated with the prognosis and treatment of CRC, and targeted therapies have been developed (4–7). The progression of CRC is usually accompanied by the activation of the KRAS and BRAF genes and the inhibition of p53 tumour suppressor gene expression; mutations in these genes are associated with changes in the number and structure of chromosomes (8–10). However, >15% of sporadic CRCs occur through a completely different molecular pathogenesis. For example, serrated precancerous lesions usually manifest as a result of the methylation of the CpG locus and mutation of the BRAF gene (11).
The prognosis of CRC is poor due to a lack of effective diagnostic methods at an early stage. Therefore, effective diagnosis and treatment depend on a better understanding of gene expression during the occurrence and development of CRC and on the identification of genes that may be involved in the occurrence and progression of the cancer.
With the rapid development of science and technology, microarray technology has become increasingly accurate and has been widely used to explore changes in animal and plant gene expression (12–14). Microarray technology aids in the discovery of changes in gene expression during cancer development and progression (15–17). However, it can be difficult to obtain reliable results with a single microarray analysis. The present study aimed to identify genetic changes in CRC in three mRNA microarray datasets from the Gene Expression Omnibus (GEO) database by obtaining differentially expressed genes (DEGs) between CRC and normal tissues.
Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) were used for functional enrichment analysis, and a protein-protein interaction (PPI) network was used to analyse the associations between the DEGs. A total of 142 DEGs and 10 hub genes were identified, which may be candidate biomarkers of CRC.
Materials and methods
Microarray data
GEO (http://www.ncbi.nlm.nih.gov/geo) (18) is a public functional genomic database. Three gene expression datasets (GSE41657, GSE77953, and GSE113513; analysed on Agilent-014850 Whole Human Genome Microarray 4×44K G4112F, Affymetrix Human Genome U133A Array, and Affymetrix Human Gene Expression Array platforms, respectively) were downloaded from GEO. The probes were converted to the corresponding gene symbols based on the annotation information on the platform. The GSE41657 dataset contained 25 CRC and 12 adjacent normal tissue samples. The GSE77953 dataset contained 28 CRC and 13 non-tumour samples. The GSE113513 dataset contained 14 CRC and 14 non-tumour samples.
Identification of DEGs
DEGs between CRC and non-tumour samples were screened using GEO2R (http://www.ncbi.nlm.nih.gov/geo/geo2r). GEO2R is an interactive web tool that compares two or more groups of samples in a GEO series to identify DEGs across experimental conditions. Adjusted P-values (adj. P), calculated using the Benjamini and Hochberg false discovery rate method, were applied to balance the discovery of statistically significant genes against the limitation of false positives. Probe sets without corresponding gene symbols were removed, and genes with more than one probe set were averaged. Log fold change (FC) >1 and adj. P-value <0.01 were considered to indicate a statistically significant difference.
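The screening step can be illustrated with a minimal sketch. It is not the GEO2R implementation: it assumes that each gene already has one log2 fold change and one raw P-value, that the cut-off is applied to the absolute log2 fold change, and that the function names (`benjamini_hochberg`, `screen_degs`) are hypothetical.

```python
from typing import List, Tuple


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value to the smallest so that the
    # adjusted values stay monotone in the rank.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def screen_degs(
    rows: List[Tuple[str, float, float]],
    lfc_cutoff: float = 1.0,
    adj_p_cutoff: float = 0.01,
) -> List[str]:
    """Keep genes passing both thresholds.

    rows holds (gene_symbol, log2_fold_change, raw_p_value) per gene.
    Using the absolute fold change is an assumption of this sketch.
    """
    adj = benjamini_hochberg([p for _, _, p in rows])
    return [
        gene
        for (gene, lfc, _), q in zip(rows, adj)
        if abs(lfc) > lfc_cutoff and q < adj_p_cutoff
    ]
```

Calling `screen_degs` on a list of per-gene rows returns the symbols that satisfy both criteria, in the input order.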
KEGG and GO enrichment analysis of DEGs
The Database for Annotation, Visualization and Integrated Discovery (DAVID; version 6.7; http://david.ncifcrf.gov) (19) is an online bioinformatics database that integrates biological data and analysis tools and provides gene annotation information and protein data. KEGG is a database resource for understanding advanced functions and biological systems from large-scale molecular data generated by high-throughput experimental techniques (20). GO is a major bioinformatics tool for annotating genes and analysing their biological processes (BPs), molecular functions (MFs) and cellular components (CCs) (21). To analyse the function of the DEGs, analysis was performed using DAVID. P<0.05 was considered to indicate a statistically significant difference. The results based on the top ten BPs, MFs, CCs and KEGG were visualized.
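The logic behind an enrichment P-value can be sketched as an upper-tail hypergeometric test. This is a simplified illustration only: DAVID reports its own modified Fisher exact statistic, so the values produced here would not be expected to match DAVID output. The function name and parameter names are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(k: int, n: int, K: int, N: int) -> float:
    """Probability of seeing at least k annotated genes by chance.

    N: genes in the background
    K: background genes annotated with the term
    n: genes in the submitted list (e.g. the DEGs)
    k: submitted genes annotated with the term
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the hypergeometric probabilities for k, k + 1, ..., upper.
    favourable = sum(
        comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)
    )
    return favourable / total
```

A term would then be reported as enriched when this probability falls below the chosen threshold (P<0.05 in the present study).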
PPI network construction and module analysis
A PPI network was predicted using the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING; version 10.0; http://string-db.org) database (22). The PPI network of the DEGs was constructed, and interactions with a combined score >0.4 were considered statistically significant. Cytoscape (version 3.4.0) is an open source bioinformatics platform for visualizing molecular interaction networks (23). Cytoscape plug-in Molecular Complex Detection (MCODE; version 1.4.2) provides topology-based clustering for a network (24). Cytoscape was used to draw a PPI network, and MCODE was used to identify the most important modules in the network. The selection criteria were as follows: MCODE score >5, degree cut-off =2, node score cut-off =0.2, maximum depth =100, and k score =2. Subsequently, KEGG and GO analysis of the genes in this module was performed using DAVID.
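The edge filtering and the degree of connectivity of each node can be sketched as follows. The sketch assumes an edge list of (protein A, protein B, combined score) tuples with scores on a 0 to 1 scale; it does not reproduce the MCODE clustering, and the function names are hypothetical. A degree cut-off of 10 is used as the default because it is the hub gene criterion of the present study.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Edge = Tuple[str, str, float]


def node_degrees(edges: Iterable[Edge], min_score: float = 0.4) -> Dict[str, int]:
    """Count distinct interaction partners per node above the score cut-off."""
    neighbours: Dict[str, Set[str]] = defaultdict(set)
    for a, b, score in edges:
        # Keep only interactions with a combined score > min_score
        # and ignore self-interactions.
        if score > min_score and a != b:
            neighbours[a].add(b)
            neighbours[b].add(a)
    return {node: len(partners) for node, partners in neighbours.items()}


def hub_genes(
    edges: Iterable[Edge], min_degree: int = 10, min_score: float = 0.4
) -> List[str]:
    """Return nodes with degree >= min_degree, highest degree first."""
    degrees = node_degrees(edges, min_score)
    hubs = [gene for gene, degree in degrees.items() if degree >= min_degree]
    return sorted(hubs, key=lambda gene: (-degrees[gene], gene))
```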
Hub gene selection and analysis
The selection criterion for the hub genes was degree of connectivity ≥10. The functions of the genes were identified using the GeneCards online analysis tool (https://www.genecards.org). The BP analysis and visualization of the hub genes were performed using the cBioPortal (http://www.cbioportal.org) online platform (25). The functions of the hub genes were identified using GO and KEGG analysis. Hierarchical clustering of the hub genes was constructed using the University of California Santa Cruz Cancer Genomics Browser (http://genome-cancer.ucsc.edu) (26). The overall survival analysis was performed using Kaplan-Meier curves in cBioPortal. The log-rank test was used for statistical analysis. The expression profiles of DNA topoisomerase IIα (TOP2A) and phosphoribosylaminoimidazole carboxylase (PAICS) were analysed using the Oncomine database (http://www.oncomine.com). The expression levels of genes in normal tissues and CRC tissues were also analysed using the Oncomine database (27).
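For readers unfamiliar with the Kaplan-Meier method, the estimator behind such survival curves can be sketched as below. This is a generic textbook product-limit estimator, not the cBioPortal implementation, and it does not include the log-rank test. The function name and input layout are assumptions of the sketch.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(
    times: Sequence[float], events: Sequence[int]
) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    events[i] is 1 if the patient died at times[i] and 0 if the
    observation was censored at that time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set.
        at_risk -= removed
    return curve
```

Curves computed separately for patients with and without alterations in a given gene are the quantities compared by the log-rank test.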
Results
Determination of DEGs in CRC
DEGs were identified after standardization of the microarray results (GSE41657, GSE77953 and GSE113513). A total of 142 genes overlapped in the three datasets, as demonstrated in the Venn diagram (Fig. 1A).
Figure 1.
DEGs between CRC and non-tumour tissues. (A) Venn diagram of the DEGs in the three data sets (GSE41657, GSE77953, and GSE113513). Log fold change >1 and adjusted P-value <0.01 were the conditions for screening DEGs. A total of 142 DEGs were screened. (B) PPI network of 142 DEGs. Yellow indicates the densest module. (C) The relationships among the most important module genes in the PPI network. DEGs, differentially expressed genes; CRC, colorectal cancer; PPI, protein-protein interaction.
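The overlap represented by the Venn diagram corresponds to a set intersection of the per-dataset DEG lists. A minimal sketch is given below; the function name is hypothetical, and the input lists are assumed to contain gene symbols that have already passed the screening thresholds.

```python
from typing import Iterable, List


def overlapping_degs(*deg_lists: Iterable[str]) -> List[str]:
    """Return the gene symbols present in every supplied DEG list."""
    if not deg_lists:
        return []
    # Strip stray whitespace so identical symbols compare equal.
    sets = [{symbol.strip() for symbol in genes} for genes in deg_lists]
    common = set.intersection(*sets)
    return sorted(common)
```

Passing the three DEG lists (one per dataset) returns the shared genes in alphabetical order.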
PPI network and module analysis
A PPI network of the 142 identified DEGs was constructed (Fig. 1B). A significantly enriched module was obtained using Cytoscape software (Fig. 1C).
GO and KEGG enrichment analysis of the DEGs
DAVID was used for the functional enrichment analysis of DEGs. The results indicated that the DEGs were mainly enriched in ‘cell proliferation’, ‘G2/M transition of mitotic cell cycle’ and ‘one-carbon metabolism’ BPs (Fig. 2A); ‘cytoplasm’, ‘cytosol’ and ‘extracellular exosome’ cellular components (CCs) (Fig. 2B); and ‘protein binding’ and ‘ATP binding’ molecular functions (MFs) (Fig. 2C). KEGG pathway analysis revealed strong enrichment in ‘biosynthesis of antibiotics’, ‘purine metabolism’, ‘pancreatic secretion’ and ‘mineral absorption’ (Fig. 2D).
Figure 2.
GO and KEGG pathway enrichment analysis of the DEGs. (A) The enrichment of biological processes. (B) The enrichment of cellular components. (C) The enrichment of molecular functions. (D) The enrichment analysis of the KEGG pathways. P<0.05 was considered to indicate a statistically significant difference. GO, gene ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes; DEGs, differentially expressed genes.
The functional results of the genes in the most significant module indicated that these genes were mainly enriched in processes associated with cell cycle progression, ATP binding, nucleotide binding, the p53 signalling pathway and oocyte meiosis (Table I).
Table I.
GO and KEGG pathway enrichment analysis of DEGs in the most significant module.
Term | Description | No. of genes | P-value
GO:0007049 | Cell cycle | 11 | 8.42×10^−10
GO:0000279 | M phase | 9 | 6.24×10^−10
GO:0022403 | Cell cycle phase | 9 | 3.84×10^−9
GO:0022402 | Cell cycle process | 9 | 4.38×10^−8
GO:0043228 | Non-membrane-bounded organelle | 11 | 1.83×10^−5
GO:0015630 | Microtubule cytoskeleton | 9 | 1.17×10^−8
GO:0044430 | Cytoskeletal part | 9 | 8.45×10^−7
GO:0005524 | ATP binding | 6 | 2.91×10^−3
GO:0032559 | Adenyl ribonucleotide binding | 6 | 3.10×10^−3
GO:0030554 | Adenyl nucleotide binding | 6 | 3.91×10^−3
hsa04115 | p53 signalling pathway | 3 | 1.72×10^−3
hsa04114 | Oocyte meiosis | 3 | 4.44×10^−3
hsa04110 | Cell cycle | 3 | 5.71×10^−3
GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes.
Hub gene screening and analysis
Genes with a degree of connectivity ≥10 were identified as the hub genes. The names, abbreviations and functions of these genes are presented in Table II. The cBioPortal online platform was used to construct a hub gene network (Fig. 3A), and functional enrichment analysis of the hub genes was performed using DAVID (Table III). The heat map constructed using the UCSC Cancer Genomics Browser indicated that the hub genes may be used to distinguish CRC tissue samples from normal intestinal tissue samples (Fig. 3B). Overall and progression-free survival based on alterations in the hub genes was subsequently analysed using Kaplan-Meier curves. Alterations in TOP2A, cyclin-dependent kinase 1 (CDK1) and CDC28 protein kinase regulatory subunit 2 (CKS2) in patients with CRC were associated with a poor overall survival rate (Fig. 4A). However, patients with alterations in TOP2A and CKS2 did not exhibit significant differences in progression-free survival (Fig. 4B).
Table II.
Functional roles of the 10 hub genes with degree of connectivity ≥10.
No. | Gene symbol | Full name | Function
1 | TOP2A | DNA Topoisomerase II α | Target for anticancer agents; mutations are associated with drug resistance
2 | PAICS | Phosphoribosylaminoimidazole carboxylase | Involved in de novo synthesis of purine nucleotides
3 | CDK1 | Cyclin-dependent kinase 1 | Regulates cell cycle progression, apoptosis and carcinogenesis of tumour cells
4 | CKS2 | CDC28 protein kinase regulatory subunit 2 | Binds to the catalytic subunit of cyclin-dependent kinases; essential for their biological function
5 | CKAP2 | Cytoskeleton associated protein 2 | Possesses microtubule stabilizing properties; involved in regulating aneuploidy, cell cycle and cell death in a p53/TP53-dependent manner
6 | CEP55 | Centrosomal protein 55 | Promotes the proliferation of lung, breast and thyroid cancer
 | PHLPP2 | PH domain and leucine rich repeat protein phosphatase 2 | Regulates Akt and PKC signalling
9 | RRM2 | Ribonucleotide reductase M2 polypeptide | Catalyses the biosynthesis of deoxyribonucleotides; inhibits Wnt signalling
10 | NEK2 | NIMA (never in mitosis gene a)-related kinase 2 | Involved in the regulation of mitosis
Figure 3.
Interaction network of hub genes and hierarchical cluster analysis. (A) Analysis of the hub genes and their co-expressed genes using cBioPortal. (B) Hierarchical clustering of the hub genes. In the sample type column, brown samples are non-cancerous tissue samples, and blue samples are CRC tissue samples. In the gene expression columns, red indicates upregulated expression and blue indicates downregulated expression. CRC, colorectal cancer.
Table III.
GO and KEGG pathway enrichment analysis of hub genes.
Term | Description | No. of genes | P-value
GO:0007049 | Cell cycle | 5 | 6.24×10^−4
GO:0051301 | Cell division | 4 | 5.30×10^−4
GO:0000279 | M phase | 4 | 7.29×10^−4
GO:0015630 | Microtubule cytoskeleton | 5 | 1.06×10^−4
GO:0044430 | Cytoskeletal part | 5 | 8.92×10^−4
GO:0005856 | Cytoskeleton | 5 | 3.63×10^−3
GO:0005524 | ATP binding | 4 | 3.62×10^−2
GO:0032559 | Adenyl ribonucleotide binding | 4 | 3.75×10^−2
GO:0030554 | Adenyl nucleotide binding | 4 | 4.29×10^−2
hsa04115 | p53 signalling pathway | 2 | 3.96×10^−2
GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes.
Figure 4.
Analysis of overall survival and progression-free survival of four hub genes. (A) Overall survival and (B) progression-free survival analyses based on alterations in the hub genes were performed using the cBioPortal online platform. P<0.05 was considered to indicate a statistically significant difference. TOP2A, DNA topoisomerase IIα; PAICS, phosphoribosylaminoimidazole carboxylase; CDK1, cyclin-dependent kinase 1; CKS2, CDC28 protein kinase regulatory subunit 2.
Among these genes, TOP2A and PAICS exhibited the highest degrees of connectivity (34 and 26, respectively). These two genes may serve important roles in the occurrence or development of CRC. Based on the survival analysis, alterations in PAICS were not associated with statistically significant differences in overall or progression-free survival in patients with CRC (overall survival P=0.986; progression-free survival P=0.918; Fig. 4). Overall survival rates were significantly different for patients with TOP2A alterations, but progression-free survival was not significantly different (overall survival P=0.002; progression-free survival P=0.703; Fig. 4). The expression profiles of TOP2A and PAICS in human tissues were analysed using the Oncomine database; TOP2A and PAICS mRNA was upregulated in various cancer tissues compared with normal tissues (Fig. 5A and B). Further analysis revealed significant increases in the expression of TOP2A and PAICS in CRC (Fig. 5C and D).
Figure 5.
Expression of TOP2A and PAICS in cancer and normal tissues. (A and B) The Oncomine online analysis platform was used to analyse the expression of (A) TOP2A and (B) PAICS in human cancer and normal tissues. (C) Heat map of TOP2A expression in clinical CRC and normal tissues. 1. Colorectal Carcinoma vs. Normal. Hong Colorectal. 2. Rectosigmoid Adenocarcinoma vs. Normal. Kaiser Colon. 3. Colon Adenocarcinoma vs. Normal. Notterman Colon. 4. Colorectal Carcinoma vs. Normal, Skrzypczak Colorectal. (D) Heat map of PAICS expression in clinical CRC tissues and normal tissues. Caecum Adenocarcinoma, Colon Adenocarcinoma, Colon Mucinous Adenocarcinoma and Rectal Adenocarcinoma vs. Normal. TCGA Colorectal. TOP2A, DNA topoisomerase IIα; PAICS, phosphoribosylaminoimidazole carboxylase; CRC, colorectal cancer.
Discussion
CRC is a common malignant tumour of the digestive tract, and its mortality rate (9.2%) ranks second among all types of cancer (1). The main causes of CRC include dietary and environmental factors, as well as genetic mutations (3). The simultaneous methylation of the CpG site and mutation of the BRAF gene are also important factors in the development and progression of CRC (11). Failure to detect CRC early may be one of the reasons for poor prognosis in patients. Therefore, there is an urgent need for efficient diagnostic and therapeutic methods.
In the present study, DEGs between CRC and non-cancerous tissues were obtained from three mRNA microarray datasets. The functions of the DEGs were identified by GO and KEGG enrichment analysis. The results demonstrated that the DEGs were mainly enriched in the cell cycle, proliferation, mitotic cell cycle and one-carbon metabolism. Previous studies have reported that dysregulation of the cell cycle and the mitotic cell cycle serves an important role in the development and progression of CRC (28–31). In a study by Chamberlain et al (32), functional B vitamin status was used to assess the association of one-carbon metabolism with CRC, and a similar association between total B vitamin status and CRC risk was identified. Ducker et al (33) also revealed that the one-carbon metabolism of the cytosol can support tumourigenesis. These findings are consistent with the results of the present study. GO cluster and KEGG analysis in the present study also revealed that the changes in the most important module were mainly enriched in processes associated with cell cycle progression, ATP and nucleotide binding and in KEGG pathways associated with the cell cycle, progesterone-mediated oocyte maturation and oocyte meiosis.
A total of 10 DEGs were selected as the hub genes. Among them, TOP2A and PAICS were the top two nodes with the highest degree of connectivity.
TOP2A is involved in DNA replication, transcription and chromosome segregation, and is essential for tumourigenesis and cancer development (34). TOP2A has been demonstrated to be associated with chemoresistance and tumour recurrence and has been recognized as a target for anticancer drugs (35,36). In addition, TOP2A is highly expressed in lung, colon and breast cancer, is involved in the inhibition of apoptosis and in the proliferation and chemoresistance of CRC, and may be considered an important biomarker for the diagnosis, prognosis and treatment of tumours (37–39). In the present study, the PPI network indicated that TOP2A interacted directly with CDK1, ribonucleoside-diphosphate reductase subunit M2 (RRM2) and CKS2, indicating a key role for TOP2A in CRC. PAICS is involved in purine nucleotide synthesis; a previous study has demonstrated that high expression of PAICS in lung and prostate cancer is associated with an altered metabolic state of cells, apoptosis inhibition in cancer cells and enhanced cancer cell invasion (40). In addition, PAICS has been reported to promote tumourigenesis and progression in breast and bladder cancer (41,42). The results of the analysis in the Oncomine database in the present study demonstrated that TOP2A and PAICS expression was significantly upregulated in CRC compared with that in normal tissues; however, to the best of our knowledge, there are currently no studies focusing on the association between PAICS expression and CRC. The association of TOP2A and PAICS with overall and disease-free survival was also analysed; changes in TOP2A were significantly associated with overall survival but were independent of disease-free survival. Changes in PAICS were accompanied by a reduction in overall and disease-free survival; however, these observations were not statistically significant. These results may require further research for verification.
CDK1 is a non-redundant cyclin-dependent kinase that serves an important role in mitosis (43,44).
Perturbations in chromosomal stability and aspects of S phase and G2/M control mediated by CDK2 and CDK1 are pivotal tumourigenic events (45). A previous study has demonstrated that CDK1 is required for the survival of cells overexpressing MYC, and that CDK1 inhibition may have a therapeutic effect in the treatment of human malignancies that overexpress MYC (46). The combination of CKS2 and CDK is essential for promoting cancer cell metastasis in diseases such as colon cancer (47,48). Vascular endothelial growth factor A (VEGFA) belongs to a class of cytokines called angiogenic growth factors that stimulate the formation of new blood vessels; tumour angiogenesis is primarily dependent on VEGFA-driven responses (49). Anti-VEGFA treatment can reduce Treg proliferation in CRC patients, and VEGFA inhibitors have been used to treat CRC (50,51). In the present study, the expression levels of CDK1, CKS2 and VEGFA in CRC were analysed; the results revealed high expression of these genes in CRC, which was consistent with previous studies. In addition, high expression levels of CDK1 and CKS2 were associated with poor survival.
CKAP2 is a spindle-associated protein that is degraded during mitosis (52,53). Upregulation of CEP55 promotes the metastasis of several cancers, such as lung adenocarcinoma, breast and anaplastic thyroid cancer (54–56). RRM2, also known as the ribonucleotide reductase small subunit, is degraded in a cyclin F-dependent manner to maintain a balanced dNTP pool and genomic stability (57). PHLPP2 is a protein phosphatase involved in the regulation of Akt and PKC signalling (58,59). Previous studies have reported that microRNA-224 promotes the proliferation and tumour growth of human CRC cells by inhibiting PHLPP1 and PHLPP2 (60), and that PHLPP2 acts as a tumour suppressor in lung and breast cancer (61,62). The results of the present study also demonstrated that PHLPP2 is downregulated in CRC tissues. NEK2 is a serine/threonine kinase that is a key regulator of mitosis in cells (63).
Upregulation of NEK2 is associated with the development of cancer, such as breast (64), ovarian (65) and colorectal (66,67) cancer. Overexpression of NEK2 promotes the invasion and metastasis of tumour cells and is considered a potential biomarker in nasopharyngeal carcinoma and liver cancer (68,69). The results of the present study indicated that the aforementioned hub genes may be distinguishing factors between CRC and non-tumour tissue samples and candidate tumour biomarkers. In addition, alterations in TOP2A, CDK1 and CKS2 are associated with poor survival, suggesting that these genes may serve an important role in the development, progression or recurrence of cancer.
In summary, the aim of the present study was to identify DEGs that may be involved in the development or progression of CRC. The results demonstrated that TOP2A and CDK1 may be involved in the survival prognosis of patients with CRC. The present study also revealed that these genes may be biomarkers for the diagnosis of CRC. However, the present study had limitations, and other databases such as The Cancer Genome Atlas, as well as in vivo and in vitro experiments, may be needed to clarify the biological functions of these genes in CRC.