Screening and Identification of Differentially Expressed Genes Expressed among Left and Right Colon Adenocarcinoma.

Jing Han1, Xue Zhang1, Yang Yang2, Li Feng1, Gui-Ying Wang2, Nan Zhang3.   

Abstract

PURPOSE: Colon adenocarcinoma (COAD) is the third most common malignancy globally and is further categorized as left colon adenocarcinoma (LCOAD) or right colon adenocarcinoma (RCOAD) depending on the location of the primary tumor. The therapeutic outcome and long-term prognosis for patients with COAD are less than satisfactory, and this may be associated with tumor location. Therefore, it is important to investigate the genetic differences in COAD at different sites.
PATIENTS AND METHODS: Public data associated with COAD were downloaded from the Gene Expression Omnibus (GEO) database. Differentially expressed genes (DEGs) were identified using R software (version 3.5.3), and functional annotation of DEGs was performed using Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses. A protein-protein interaction network was constructed, hub genes were identified and analyzed, and data mining using Gene Expression Profiling Interactive Analysis (GEPIA) was conducted.
RESULTS: A total of 286 DEGs were identified between LCOAD and RCOAD. Additionally, 10 hub genes associated with COAD at different locations were screened, namely, CDKN2A, IGF1R, MDM2, SMAD3, SLC2A1, GRM5, PLCB4, FGFR1, UBE2V2, and TNFRSF10B. The expression of cyclin-dependent kinase inhibitor 2A (CDKN2A) and solute carrier family 2 member 1 (SLC2A1) was significantly associated with pathological stage (P < 0.05). COAD patients with high expression levels of CDKN2A exhibited poorer overall survival (OS) times than those with low expression levels (P < 0.05).
CONCLUSION: CDKN2A expression was significantly different between LCOAD and RCOAD and was closely related to the prognosis of COAD. This finding is of great value for further understanding the pathogenesis of LCOAD and RCOAD.
Copyright © 2020 Jing Han et al.

Year:  2020        PMID: 32420374      PMCID: PMC7201700          DOI: 10.1155/2020/8465068

Source DB:  PubMed          Journal:  Biomed Res Int            Impact factor:   3.411


1. Introduction

Colon adenocarcinoma (COAD) is the third most common malignancy worldwide, accounting for 10.0% of all new cancer cases, and is one of the leading causes of cancer-associated mortality [1]. The incidence of COAD has increased year on year and is closely associated with genetic, environmental, and dietary changes, as well as colonic mucosal hyperplasia and the canceration of adenomatous polyps [2]. With the development of targeted therapy, great progress has been made in the treatment of COAD, but the therapeutic outcome and long-term prognosis of patients remain unsatisfactory. It has been suggested that this may be associated with the location of the tumor; thus, the investigation of differences in the incidence of COAD at different sites is particularly important. Based on tumor location, COAD includes at least two types [3], left colon adenocarcinoma (LCOAD) and right colon adenocarcinoma (RCOAD). LCOAD refers to tumors from the splenic flexure of the colon to the sigmoid colon, and RCOAD refers to tumors between the ileocecal region and the transverse colon [4]. In addition to their different origins, LCOAD and RCOAD also have different clinical manifestations, histological types, molecular characteristics, prognoses, modes of metastasis, and treatment options [3], which are reflected in the following aspects. In terms of clinical manifestation, hematochezia and changes in bowel habits are more frequently associated with LCOAD, while iron-deficiency anemia caused by occult blood loss is more common in patients with RCOAD [5]. The data showed that RCOAD patients were more likely to be female, of older age, with larger tumor diameters, poor differentiation, later Tumor-Node-Metastasis stages, and shorter survival times compared with LCOAD patients [6, 7]. In the past 30 years, the incidence of RCOAD has risen, and its incidence is now reportedly higher than that of LCOAD [8]. From a molecular perspective, RCOAD and LCOAD are two separate entities. 
The fundamental reason for the marked differences between RCOAD and LCOAD lies in their molecular subtypes. For example, RCOAD is characterized by a high gene mutation rate, methylation, BRAF (B-Raf Proto-Oncogene, Serine/Threonine Kinase) mutation, the serrated pathway, and inflammation, and its prognosis is poor [9]. In contrast, LCOAD is characterized by chromosomal instability, amplification of EGFR1 (Epidermal Growth Factor Receptor 1) and EGFR2 (Epidermal Growth Factor Receptor 2), EGF (Epidermal Growth Factor) signal transduction, and Wnt signal transduction. The 13% of LCOAD cases with BRAF mutation have a poor prognosis, while the 87% without BRAF mutation have a good prognosis [9]. RCOAD is associated with KRAS and BRAF mutations, defective mismatch repair genes, and microRNA-31, while LCOAD is closely associated with chromosome instability, p53, NRAS, microRNA-146a, microRNA-147b, and microRNA-1288 [10]. However, Gao et al. [11] showed no significant difference in the expression levels of MLH1, MSH2, MSH6, PMS2, β-tubulin III, p53, Ki67, and topoisomerase IIα, or in BRAF gene mutations, between the two types of COAD. A number of studies have reported significant differences in p53 gene mutation and protein expression between RCOAD and LCOAD [12-14], while another study has shown no significant correlation between p53 protein expression and tumor location [15]. Therefore, it is necessary to identify the differentially expressed genes between RCOAD and LCOAD. Bioinformatics is a comprehensive field that integrates biology, computer science, and mathematics [16]. With the development of sequencing technology, bioinformatics data have rapidly accumulated and are widely used in medicine and drug development. Concurrently, a large volume of gene expression profile data has been generated [17], and efficient data mining has become a bioinformatics research hotspot.
The development of bioinformatics has also provided a novel approach for the discovery and identification of differentially expressed genes (DEGs) between LCOAD and RCOAD [18]. In the present study, COAD gene chip data from the Gene Expression Omnibus (GEO) were analyzed to identify DEGs and hub genes between LCOAD and RCOAD, construct an interaction network of DEGs, and conduct Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses of these genes. These DEGs and hub genes may provide new ideas for studying the differences between LCOAD and RCOAD and for the subsequent development of targeted therapy.

2. Materials and Methods

2.1. Access to Public Data

The GEO (http://www.ncbi.nlm.nih.gov/geo) is an open-source platform for the storage of genetic data [19]. Two expression profiling datasets, GSE81558 (GPL15207 platform) and GSE75317 (GPL570 platform), were downloaded from the GEO database. The GSE81558 dataset includes 9 normal colorectal tissues, 19 liver tissues from colorectal liver metastasis patients, 12 rectum tissues from primary colorectal tumor patients, 9 left colon tissues from primary colorectal tumor patients, and 2 right colon tissues from primary colorectal tumor patients. Because this study aimed to identify the DEGs between left and right colon tumors, only the 9 LCOAD and 2 RCOAD samples were selected from the GSE81558 dataset based on the source type. Similarly, 33 LCOAD samples and 26 RCOAD samples were selected from GSE75317 (GPL570 platform).

2.2. DEGs Identified Using R Software

R software (version 3.5.3) was used to identify DEGs between LCOAD and RCOAD tissue samples. Probe sets without a corresponding gene, and genes mapped by multiple probe sets, were removed. P < 0.05 was considered to indicate a statistically significant difference. The DEGs were presented as volcano plots, generated using SangerBox software (http://sangerbox.com/), and Venn diagrams were constructed using FunRich software (http://www.funrich.org).
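The probe-filtering rule described above can be illustrated with a minimal sketch. The study itself used R; the Python below is only an illustration, and the function name `filter_probes` and the probe identifiers are hypothetical.

```python
from collections import Counter


def filter_probes(probe_to_gene):
    """Keep only probes with an annotated gene that is covered by exactly one probe set.

    probe_to_gene maps a probe-set ID to a gene symbol ("" or None if unannotated).
    """
    # Drop probe sets that have no annotated gene.
    annotated = {probe: gene for probe, gene in probe_to_gene.items() if gene}
    # Count how many probe sets point to each gene.
    probes_per_gene = Counter(annotated.values())
    # Drop every probe whose gene is covered by more than one probe set.
    return {p: g for p, g in annotated.items() if probes_per_gene[g] == 1}


# Hypothetical probe IDs, for illustration only.
example = {
    "probe_1": "CDKN2A",
    "probe_2": "",        # no annotated gene, removed
    "probe_3": "MDM2",
    "probe_4": "MDM2",    # MDM2 has two probe sets, both removed
    "probe_5": "SMAD3",
}
kept = filter_probes(example)
print(kept)
```

Other pipelines collapse multiple probe sets to a single value per gene instead of discarding them; the sketch follows the removal rule stated in the text.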

2.3. Functional Annotation of DEGs Using KEGG and GO Pathway Enrichment Analyses

The Database for Annotation, Visualization, and Integrated Discovery (DAVID) (https://david.ncifcrf.gov/home.jsp; version 6.8) is an online suite of analysis tools with an integrated discovery and annotation function [20]. The GO resource is widely used in bioinformatics and covers three aspects of biology, including biological process (BP), cellular component (CC), and molecular function (MF) [21]. KEGG (https://www.kegg.jp/) is one of the most commonly used biological information databases worldwide [22]. DAVID was used to perform GO and KEGG analyses of DEGs, and P < 0.05 was considered to indicate a statistically significant difference.
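As a rough illustration of what an enrichment P value measures, the sketch below computes a plain hypergeometric upper-tail probability. This is an assumption-laden stand-in: DAVID reports its own modified Fisher exact statistic, so the numbers will not match DAVID output. The background size and term size used in the example are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """Return P(X >= k) for a hypergeometric draw.

    N: genes in the background, K: background genes annotated to the term,
    n: genes in the submitted list, k: listed genes annotated to the term.
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the probability of observing k, k+1, ..., upper annotated genes.
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


# Illustrative values: 286 DEGs with 8 hits on a term; the background of
# 20000 genes and the term size of 200 are hypothetical.
p_example = hypergeom_enrichment_p(k=8, n=286, K=200, N=20000)
print(p_example)
```

A term would then be called enriched when this probability falls below the chosen cutoff (P < 0.05 in the text).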

2.4. Construction of a Protein-Protein Interaction (PPI) Network

The Search Tool for the Retrieval of Interacting Genes (STRING; http://string.embl.de/), an open-source online tool, was used to construct a PPI network of the identified DEGs, and Cytoscape visualization software version 3.6.1 [23] was used to present the network [24]. Interactions with a confidence score >0.4 were retained, a cutoff intended to help isolate the critical module.
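The confidence-score cutoff can be sketched as a simple edge filter followed by a node-degree count, degree being one common way to rank candidate hub genes. The edge list and scores below are made up for illustration and are not taken from the study's network; the function names are hypothetical.

```python
from collections import defaultdict


def build_network(edges, min_score=0.4):
    """Return {node: set of neighbours}, keeping edges whose score exceeds min_score."""
    neighbours = defaultdict(set)
    for a, b, score in edges:
        if score > min_score and a != b:
            neighbours[a].add(b)
            neighbours[b].add(a)
    return neighbours


def degrees(neighbours):
    """Number of distinct interaction partners per node."""
    return {node: len(adjacent) for node, adjacent in neighbours.items()}


# Made-up interaction scores, for illustration only.
edges = [
    ("CDKN2A", "MDM2", 0.9),
    ("MDM2", "SMAD3", 0.5),
    ("CDKN2A", "SMAD3", 0.3),   # below the cutoff, dropped
    ("IGF1R", "FGFR1", 0.7),
]
deg = degrees(build_network(edges))
print(sorted(deg.items(), key=lambda item: -item[1]))
```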

2.5. Identification and Analysis of Hub Genes

Functional annotation of the genes was performed using KEGG and GO analyses in DAVID. A single coexpression network was constructed using cBioPortal (http://www.cbioportal.org) [25]. The Biological Networks Gene Ontology tool (BiNGO) version 3.0.3, a Cytoscape plug-in, was used to analyze and visualize the BPs and MFs of each hub gene [26]. OmicShare (http://www.omicshare.com/tools), an open data analysis platform, was subsequently used to perform clustering analysis of these genes.

2.6. Data Mining Using Gene Expression Profiling Interactive Analysis (GEPIA)

The correlations between gene expression and pathological stage were ascertained using GEPIA (http://gepia.cancer-pku.cn/), a newly developed interactive web server for analyzing the gene expression data of large consortium projects such as The Cancer Genome Atlas and the Genotype Tissue Expression project [27]. Correlations between pathological stage, overall survival (OS), and the expression of hub genes in COAD were also identified using GEPIA. The correlation between SLC2A1 and GLUT1 expression was tested by GEPIA.

2.7. RT-qPCR Assay

A total of 8 participants were recruited, contributing 4 LCOAD and 4 RCOAD samples. After surgery, 4 LCOAD samples from LCOAD patients and 4 RCOAD samples from control individuals were obtained. The research conformed to the Declaration of Helsinki and was authorized by the Human Ethics and Research Ethics Committees of the Fourth Hospital of Hebei Medical University. Informed consent was obtained from all participants. Total RNA was extracted from the 4 LCOAD samples and 4 RCOAD samples using the RNAiso Plus (TRIzol) kit (Thermo Fisher, Massachusetts, USA) and reverse transcribed to cDNA. RT-qPCR was performed using a Light Cycler® 4800 System with specific primers for the ten hub genes. Table 1 presents the primer sequences used in the experiments. The RQ values ($2^{-\Delta\Delta \mathrm{Ct}}$, where Ct is the threshold cycle) of each sample were calculated and are presented as fold change in gene expression relative to the control group. GAPDH was used as an endogenous control.
Table 1

Primers and their sequences for PCR analysis.

Primer          Sequence (5′–3′)
CDKN2A-hF       ATATAGCTTCAAAAAGCAAAGGC
CDKN2A-hR       TTAAAATCAAATCCAGCAACAGG
IGF1R-hF        GAAGTTGAGAAGGAATGAAGACA
IGF1R-hR        AATCACCCAAGAAAACAAGACAG
MDM2-hF         CCAAGGGGGGTAGTAAAGGGTAT
MDM2-hR         TAGAAGGCAAGGAAGAAAGGAGT
SMAD3-hF        CACTCGGGAATGGGAAAAATGAA
SMAD3-hR        AAAAAATAGCCAGGCGTGGTAGC
SLC2A1-hF       GCATGGGTGATGTGTGGTTTGAA
SLC2A1-hR       AGGGTATCCTCTCCTGGTTTTAG
GRM5-hF         AGGACAGTAAACCAGGAAGCAGG
GRM5-hR         GAGGTAATTGAATCATAGGGGCG
PLCB4-hF        TGCTTTAATTTTATTATACCCCC
PLCB4-hR        AAGTCTCAGTCAATCCAGTCCTC
FGFR1-hF        GCCAGAGCAAGTGTGGGTTTTAT
FGFR1-hR        GATGCGTGTGATTCGGAGAGGGT
UBE2V2-hF       AGGTTCACTCCTCATTCTTTTTT
UBE2V2-hR       TTTTCCCTATTTGATGTTTCTGT
TNFRSF10B-hF    AATATACGCAGGATTTGAAGACG
TNFRSF10B-hR    ACATTAAAAAAGGTGAGAAGGGG
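
The relative quantification described in Section 2.7 can be written out as a short worked example. This is a minimal sketch of the standard 2^(−ΔΔCt) calculation with GAPDH as the reference gene; the function name and the Ct values are hypothetical and are not data from the study.

```python
def relative_quantity(ct_target_sample, ct_ref_sample,
                      ct_target_control, ct_ref_control):
    """Fold change of a target gene relative to the control group (2^-ddCt)."""
    # Normalise the target gene to the reference gene (e.g. GAPDH) in each group.
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    # Compare the normalised sample against the normalised control.
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2 ** -delta_delta_ct


# Hypothetical Ct values: the target crosses threshold two cycles earlier
# in the sample than in the control, after GAPDH normalisation.
rq = relative_quantity(ct_target_sample=24.0, ct_ref_sample=18.0,
                       ct_target_control=26.0, ct_ref_control=18.0)
print(rq)
```

An RQ above 1 indicates higher expression than in the control group, and an RQ below 1 indicates lower expression.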

2.8. Overall Survival Analysis of the LCOAD and RCOAD

The present study recruited a total of 106 LCOAD and 106 RCOAD patients from the Fourth Hospital of Hebei Medical University. Clinical and histopathological characteristics and follow-up and survival information were available for all patients and were collected retrospectively from medical records. Patients were eligible for inclusion if they were aged 30 to 100 years, had histologically confirmed colorectal adenocarcinoma [28], had not received tumor treatment, and had no history of surgery [29]. Exclusion criteria were as follows: age <30 or >100 years, concurrent other malignant tumors, operation performed more than 1 month after the last examination, and severe heart disease. The expression level of CDKN2A in LCOAD or RCOAD patients was measured by RT-qPCR. In this clinical study, the patients were followed up for 210 months. The endpoint of the study was death from colon adenocarcinoma. This trial and the informed consent forms were reviewed and approved by the Ethics Review Committee of the Fourth Hospital of Hebei Medical University (approval number 2017MEC115). The Kaplan–Meier method was used to analyze overall survival. All statistical analyses were conducted using SPSS software (version 21.0), and P < 0.05 was considered statistically significant.
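For readers unfamiliar with the Kaplan–Meier method, the product-limit estimate can be sketched in a few lines. The study used SPSS; the Python below is only an illustration, and the follow-up times and event flags are toy values, not patient data.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times: follow-up in months; events: 1 = death, 0 = censored.
    Returns a list of (time, survival probability) at each death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        at_risk -= leaving
    return curve


# Toy follow-up data, for illustration only.
curve = kaplan_meier(times=[5, 8, 8, 12, 20], events=[1, 1, 0, 1, 0])
for month, s in curve:
    print(month, round(s, 3))
```

Comparing two such curves (for example, high versus low CDKN2A expression) additionally requires a significance test such as the log-rank test, which is not shown here.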

3. Results

3.1. Screening of DEGs between LCOAD and RCOAD

Nine LCOAD and two RCOAD samples from the GSE81558 dataset, and 33 LCOAD and 26 RCOAD samples from the GSE75317 dataset, were included in this study. The differences between LCOAD and RCOAD tissues in GSE81558 and GSE75317 are presented as volcano plots in Figures 1(a) and 1(b), respectively. A Venn diagram revealed 286 common DEGs between the two datasets (Figure 1(c)).
Figure 1

Identification of differentially expressed genes. Volcano plots present the difference between LCOAD and RCOAD samples of the (a) GSE81558 and (b) GSE75317 datasets. (c) Venn diagram identifying 286 common genes between the two datasets.
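
The Venn-diagram step amounts to selecting the significant genes in each dataset and intersecting the two sets. The sketch below assumes a per-gene P value is already available for each dataset; the P values shown are made up and do not reproduce the 286-gene result.

```python
def select_degs(p_values, alpha=0.05):
    """Genes whose P value is below the significance cutoff."""
    return {gene for gene, p in p_values.items() if p < alpha}


def common_degs(*deg_sets):
    """Genes called differentially expressed in every dataset."""
    return set.intersection(*map(set, deg_sets))


# Made-up P values for two datasets, for illustration only.
dataset_a = {"CDKN2A": 0.001, "MDM2": 0.03, "GRM5": 0.20, "SMAD3": 0.04}
dataset_b = {"CDKN2A": 0.01, "MDM2": 0.02, "GRM5": 0.01, "SMAD3": 0.30}

common = common_degs(select_degs(dataset_a), select_degs(dataset_b))
print(sorted(common))
```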

3.2. Functional Annotation for DEGs Using KEGG and GO Analyses

The results of GO analysis revealed that variations in the BP were predominantly enriched in protein complex assembly, sialylation, oligosaccharide metabolic process, peptidyl-tyrosine phosphorylation, and apoptotic process. Changes in CC were primarily enriched in intracellular, cell-cell junction, peroxisomal matrix, cytosol, and postsynaptic density. Variations in MF were enriched in metal ion binding, sialyltransferase activity, transcription factor activity, sequence-specific DNA binding, nucleic acid binding, and protein binding (Table 2). KEGG analysis demonstrated that DEGs were largely enriched in transcriptional misregulation in cancer, pathways in cancer, and peroxisome (Table 2).
Table 2

GO and KEGG pathway enrichment analyses of DEGs between left and right COAD.

Term          Description                                                     Count in gene set    P value
GO:0006461    Protein complex assembly                                        8                    0.002
GO:0097503    Sialylation                                                     4                    0.003
GO:0009311    Oligosaccharide metabolic process                               4                    0.006
GO:0018108    Peptidyl-tyrosine phosphorylation                               8                    0.009
GO:0006915    Apoptotic process                                               17                   0.016
GO:0005622    Intracellular                                                   35                   9.01E−04
GO:0005911    Cell-cell junction                                              9                    0.004
GO:0005782    Peroxisomal matrix                                              5                    0.004
GO:0005829    Cytosol                                                         63                   0.017
GO:0014069    Postsynaptic density                                            8                    0.018
GO:0046872    Metal ion binding                                               48                   0.002
GO:0008373    Sialyltransferase activity                                      4                    0.003
GO:0003700    Transcription factor activity, sequence-specific DNA binding    26                   0.005
GO:0003676    Nucleic acid binding                                            26                   0.007
GO:0005515    Protein binding                                                 152                  0.008
hsa05202      Transcriptional misregulation in cancer                         11                   4.20E−04
hsa05200      Pathways in cancer                                              16                   0.002
hsa04146      Peroxisome                                                      5                    0.048

GO: Gene Ontology; KEGG: Kyoto Encyclopedia of Genes and Genomes; DEGs: differentially expressed genes. COAD: colon adenocarcinoma.

3.3. Construction of the PPI Network

The construction of a PPI network revealed 264 edges and 159 nodes in the PPI network (PPI enrichment; P=0.0112; Figure 2). The network possessed significantly more interactions than expected, highlighting a greater number of interactions between DEGs than expected for a random set of proteins of a similar size from the same genome. Such enrichment indicates that the identified proteins are at least partially associated.
Figure 2

Protein-protein interaction network of differentially expressed genes, consisting of 264 edges and 159 nodes.

3.4. Hub Gene Selection and Functional Annotation

The following 10 hub genes were identified using Cytoscape, and KEGG and GO analyses were conducted using DAVID: CDKN2A, IGF1R, MDM2, SMAD3, SLC2A1, GRM5, PLCB4, FGFR1, UBE2V2, and TNFRSF10B (Figure 3). The results of GO analysis showed that variations in the BP were largely enriched in the activation of cysteine-type endopeptidase activity involved in the apoptotic process, activation of cysteine-type endopeptidase activity involved in the apoptotic signaling pathway, protein destabilization, protein K63-linked ubiquitination, and immune response. Variations in the CC were predominantly enriched in receptor complex, integral component of plasma membrane, plasma membrane, and cytosol, whereas those in the MF were enriched in identical protein binding, SUMO transferase activity, ubiquitin protein ligase binding, protein binding, and p53 binding. KEGG pathway analysis revealed that the hub genes were mainly enriched in pathways in cancer, adherens junction, cell cycle, FoxO signaling pathway, and proteoglycans in cancer (Table 3). Summaries of the functions of all hub genes are presented in Table 4.
Figure 3

Hub genes identified within the protein-protein interaction network.

Table 3

GO and KEGG pathway enrichment analyses of hub genes between left and right COAD.

Term          Description                                                                                   Count in gene set    P value
GO:0006919    Activation of cysteine-type endopeptidase activity involved in apoptotic process              3                    8.50E−04
GO:0097296    Activation of cysteine-type endopeptidase activity involved in apoptotic signaling pathway    2                    0.007
GO:0031648    Protein destabilization                                                                       2                    0.019
GO:0070534    Protein K63-linked ubiquitination                                                             2                    0.020
GO:0006955    Immune response                                                                               3                    0.020
GO:0043235    Receptor complex                                                                              3                    0.002
GO:0005887    Integral component of plasma membrane                                                         5                    0.003
GO:0005886    Plasma membrane                                                                               7                    0.006
GO:0005829    Cytosol                                                                                       6                    0.013
GO:0042802    Identical protein binding                                                                     5                    4.05E−04
GO:0019789    SUMO transferase activity                                                                     2                    0.009
GO:0031625    Ubiquitin protein ligase binding                                                              3                    0.010
GO:0005515    Protein binding                                                                               9                    0.026
GO:0002039    p53 binding                                                                                   2                    0.035
hsa05200      Pathways in cancer                                                                            7                    8.51E−07
hsa04520      Adherens junction                                                                             3                    0.003
hsa04110      Cell cycle                                                                                    3                    0.008
hsa04068      FoxO signaling pathway                                                                        3                    0.010
hsa05205      Proteoglycans in cancer                                                                       3                    0.021

GO: Gene Ontology; KEGG: Kyoto Encyclopaedia of Genes and Genomes; DEGs: differentially expressed genes; COAD: colon adenocarcinoma.

Table 4

Summaries for the function of 10 hub genes.

No.   Gene symbol   Full name                                Function
1     CDKN2A        Cyclin-dependent kinase inhibitor 2A     Capable of inducing cell cycle arrest in G1 and G2 phases. Acts as a tumor suppressor. Acts as a negative regulator of the proliferation of normal cells by interacting strongly with CDK4 and CDK6
2     IGF1R         Insulin-like growth factor 1 receptor    The activated IGF1R is involved in cell growth and survival control. IGF1R is crucial for tumor transformation and survival of malignant cell
3     MDM2          MDM2 proto-oncogene                      Inhibits p53/TP53- and p73/TP73-mediated cell cycle arrest and apoptosis by binding its transcriptional activation domain. Inhibits DAXX-mediated apoptosis by inducing its ubiquitination and degradation
4     SMAD3         SMAD family member 3                     Receptor-regulated SMAD (R-SMAD) that is an intracellular signal transducer and transcriptional modulator activated by TGF-beta (transforming growth factor) and activin type 1 receptor kinases
5     SLC2A1        Solute carrier family 2 member 1         Facilitative glucose transporter. This isoform may be responsible for constitutive or basal glucose uptake, has a very broad substrate specificity, and can transport a wide range of aldoses including both pentoses and hexoses
6     GRM5          Glutamate metabotropic receptor 5        Ligand binding causes a conformation change that triggers signaling via guanine nucleotide-binding proteins (G proteins) and modulates the activity of down-stream effectors
7     PLCB4         Phospholipase C beta 4                   The production of the second messenger molecules diacylglycerol (DAG) and inositol 1,4,5-trisphosphate (IP3) is mediated by activated phosphatidylinositol-specific phospholipase C enzymes
8     FGFR1         Fibroblast growth factor receptor 1      Tyrosine-protein kinase that acts as cell-surface receptor for fibroblast growth factors and plays an essential role in the regulation of embryonic development, cell proliferation, differentiation, and migration
9     UBE2V2        Ubiquitin conjugating enzyme E2 V2       Plays a role in the control of progress through the cell cycle and differentiation, plays a role in the error-free DNA repair pathway, and contributes to the survival of cells after DNA damage
10    TNFRSF10B     TNF receptor superfamily member 10b      Promotes the activation of NF-kappa-B. Essential for ER stress-induced apoptosis

3.5. Analysis of Hub Genes

A coexpression network of the hub genes was constructed using cBioPortal. Among these genes, CDKN2A, UBE2V2, MDM2, SMAD3, FGFR1, IGF1R, and PLCB4 exhibited the highest node scores, suggesting that they may possess pivotal functions for distinguishing between LCOAD and RCOAD (Figure 4). Using the BiNGO tool, biological process analysis of the hub genes is illustrated in Figure 5(a), and molecular function analyses of the hub genes are presented in Figure 5(b). Hierarchical clustering revealed that the hub genes were able to differentiate between the LCOAD and RCOAD samples (Figure 6). Within the GSE81558 dataset, when compared with LCOAD, the expression of GRM5 and UBE2V2 was downregulated, and that of CDKN2A, SLC2A1, IGF1R, FGFR1, TNFRSF10B, MDM2, SMAD3, and PLCB4 was upregulated in RCOAD (Figure 6(a)). In the GSE75317 dataset, when compared with LCOAD, expression levels of PLCB4 and UBE2V2 were downregulated, while those of CDKN2A, MDM2, TNFRSF10B, SMAD3, and SLC2A1 were upregulated in RCOAD (Figure 6(b)).
Figure 4

Coexpression network of hub genes obtained using cBioPortal.

Figure 5

(a) Biological process and (b) molecular function analysis of the identified hub genes using the Biological Networks Gene Ontology tool.

Figure 6

Hierarchical clustering. Differentiation between RCOAD and LCOAD samples in the (a) GSE81558 and (b) GSE75317 datasets using the identified hub genes. The color represents the expression level of each gene (green, low expression; black, medium expression; and red, high expression).

3.6. RT-qPCR Analysis Validation of Hub Genes

RT-qPCR showed that GRM5 and PLCB4 were markedly downregulated in RCOAD samples compared with LCOAD samples, whereas the relative expression levels of CDKN2A, IGF1R, MDM2, SMAD3, SLC2A1, FGFR1, UBE2V2, and TNFRSF10B were significantly higher in RCOAD samples (Figure 7). The RT-qPCR results for CDKN2A, MDM2, SMAD3, SLC2A1, and TNFRSF10B were consistent with the microarray results described above.
Figure 7

Relative expression of hub genes between LCOAD and RCOAD by RT-qPCR analysis. (a) CDKN2A, (b) IGF1R, (c) MDM2, (d) SMAD3, (e) SLC2A1, (f) GRM5, (g) PLCB4, (h) FGFR1, (i) UBE2V2, and (j) TNFRSF10B. P < 0.05.

3.7. The Relationship between Pathological Stage, OS, and the Expression of Hub Genes

GEPIA analysis showed that the expression of CDKN2A, MDM2, SLC2A1, and TNFRSF10B was significantly associated with pathological stage (P < 0.05; Figures 8(a), 8(c), 8(e), and 9(d)), while the expression of IGF1R, SMAD3, GRM5, PLCB4, FGFR1, and UBE2V2 was not (Figures 8(b), 8(d), 8(f), and 9(a)–9(c)). The pathological stage of COAD was positively related to the expression of CDKN2A and SLC2A1 and negatively related to the expression of MDM2 and TNFRSF10B. Kaplan–Meier analysis using GEPIA revealed that COAD patients with high expression levels of CDKN2A had poorer overall survival times than those with low expression levels (P < 0.05; Figure 10(a)); there was no statistically significant effect on OS associated with the expression of IGF1R, MDM2, SMAD3, SLC2A1, GRM5, PLCB4, FGFR1, UBE2V2, or TNFRSF10B (P > 0.05; Figures 10(b)–10(i)). Thus, no association with prognosis was detected for the other nine genes. GEPIA analysis also showed a positive correlation between SLC2A1 and GLUT1 expression levels (R = 1, P < 0.001).
Figure 8

Association between pathological stage and the expression levels of (a) CDKN2A, (b) IGF1R, (c) MDM2, (d) SMAD3, (e) SLC2A1, and (f) GRM5.

Figure 9

Association between pathological stage and the expression levels of (a) PLCB4, (b) FGFR1, (c) UBE2V2, and (d) TNFRSF10B.

Figure 10

Kaplan–Meier overall survival analysis using Gene Expression Profiling Interactive Analysis. (a) CDKN2A, (b) IGF1R, (c) MDM2, (d) SMAD3, (e) SLC2A1, (f) PLCB4, (g) FGFR1, (h) UBE2V2, and (i) TNFRSF10B. The expression level of CDKN2A is closely correlated with the prognosis of COAD patients (P < 0.05).

3.8. High Expressions of CDKN2A in Patients with LCOAD or RCOAD Were Independent Prognostic Factors for the Poor Overall Survival

The demographic data and the expression status of CDKN2A are summarized in Table 5. The Kaplan–Meier OS curves are presented in Figure 11. High expression of CDKN2A was a predictor of a shorter OS in both the LCOAD patients (Figure 11(a)) and the RCOAD patients (Figure 11(b)).
Table 5

The demographic data and the expression status of CDKN2A.

Characteristic      Category      n      CDKN2A low (%)    CDKN2A high (%)
Sex                 Male          181    119 (56.1%)       62 (29.2%)
                    Female        31     0 (0.0%)          31 (14.6%)
Age                 <65 years     100    64 (30.2%)        36 (17.0%)
                    ≥65 years     112    55 (25.9%)        57 (26.9%)
Tumor location      LCOAD         106    59 (27.8%)        47 (22.2%)
                    RCOAD         106    60 (28.3%)        46 (21.7%)
Overall survival    <60 months    122    57 (26.9%)        65 (30.7%)
                    ≥60 months    90     62 (29.2%)        28 (13.2%)

Figure 11

The Kaplan–Meier OS curves of the LCOAD and RCOAD patients with the low/high expression of CDKN2A. (a) High expression of CDKN2A was a predictor of a shorter OS in the LCOAD patients (P < 0.05). (b) High expression of CDKN2A was a predictor of a shorter OS in the RCOAD patients (P < 0.05).

4. Discussion

With global changes in diet and lifestyle, COAD-associated morbidity and mortality have increased, making it one of the primary malignant tumors threatening human health. There is no consensus on the relationship between tumor location and the pathological stage and prognosis of COAD. A meta-analysis [30] of 66 studies that analyzed the OS data of 1.43 million COAD patients showed a 19% reduction in mortality among patients with LCOAD, compared with those with RCOAD; this suggested that the location of the primary tumor serves a key role in determining the prognosis of colon adenocarcinoma. However, Weiss et al. [7] found no significant difference in the 5-year OS rates between patients with left and right COAD, following the adjustment for various prognostic factors. In addition, numerous studies have reported differences in the molecular mechanisms of COAD at different locations [10, 31, 32], but it was not clear whether these molecular differences could be translated into clinically meaningful changes in pathological stage and prognosis. Therefore, pathological stage and prognosis may serve important roles in investigating the relationship between the molecular mechanisms of the occurrence and development of COAD at different locations, facilitating the screening, diagnosis, and targeted treatment of patients with COAD [33]. Bioinformatics is the computational science of understanding biological and genetic information for the purpose of expanding the use of biological and medical data [34]. The units of bioinformatics research are DNA, RNA, and protein molecules, which can be reliably utilized for the identification and investigation of DEGs [35, 36]. COAD results from the interaction of multiple genes and the bioinformatic application of gene expression profiles provide the possibility of studying the pathogenesis of COAD at different locations. Furthermore, the biological analysis of gene chip data is another important advancement for data mining [37]. 
In the present study, bioinformatics technology was used to analyze two datasets (GSE81558 and GSE75317), in which a total of 286 DEGs were identified. GO enrichment analysis, KEGG signal pathway analysis, and PPI network analysis were also performed with these DEGs, and the following ten hub genes associated with COAD at different locations were identified using cytoHubba, a plug-in of Cytoscape software, with a degree threshold of ≥10: CDKN2A, IGF1R, MDM2, SMAD3, SLC2A1, GRM5, PLCB4, FGFR1, UBE2V2, and TNFRSF10B. Among these genes, the expression of CDKN2A and SLC2A1 was upregulated in RCOAD, compared with LCOAD. GEPIA showed that the expression of CDKN2A was significantly associated with pathological stage (P < 0.05): as CDKN2A expression levels increased, the pathological stage of COAD also increased. Kaplan–Meier curve analysis using GEPIA revealed that COAD patients with high expression levels of CDKN2A had poorer OS times than those with low expression levels (P < 0.05). Cyclin-dependent kinase inhibitor 2A (CDKN2A) is an important tumor suppressor gene belonging to the family of cyclin-dependent kinase inhibitor genes, which serves a regulatory role in cell proliferation and apoptosis [38]. The pathways associated with CDKN2A include signaling and apoptosis modulation. CDKN2A encodes two inhibitory proteins, p16INK4a and p14ARF. Through the p16INK4a-CDK4/CDK6-pRb and p14ARF-MDM2-p53 pathways, it serves a role in cell cycle regulation. CDKN2A is able to induce cell cycle arrest at the G1 and G2 phases and thus has a tumor-inhibitory effect [39]. CDKN2A binds the proto-oncogene MDM2 and blocks its nucleocytoplasmic shuttling by sequestering MDM2 in the nucleolus. As a result, MDM2-induced degradation of p53 is blocked, enhancing p53-dependent activation and subsequent apoptosis, thereby inhibiting the carcinogenic effect of MDM2 [40].
Additionally, CDKN2A is able to bind BCL6, downregulating BCL6-induced transcriptional inhibition; it can also bind E2F1 and MYC, blocking the transcriptional activation activity of E2F1. However, no effect on MYC-associated transcriptional inhibition has been reported. CDKN2A mutation has been demonstrated to be an important event in a number of tumor types, including pancreatic cancer [41] and gastric cancer. The development of cancer is therefore often accompanied by CDKN2A mutations; the loss of its anticancer function may promote the neoplastic transformation of cells, subsequently inducing proliferation, invasion, and metastasis [42]. In the present study, it was speculated that CDKN2A may be mutated in COAD and that the mutated protein may promote the abnormal proliferation and differentiation of colonic glandular epithelial cells. The results indicated that the expression level of CDKN2A in RCOAD was higher than that in LCOAD and that CDKN2A expression was positively correlated with the pathological stage of patients with COAD. Survival analysis also revealed that when CDKN2A was highly expressed, the OS rate of patients with COAD was low and the prognosis was poor. This suggests a possible molecular basis (and research direction) for the hypothesis that patients with RCOAD have a higher pathological stage and poorer prognosis than those with LCOAD. However, the present study has some limitations. The sample sizes of the two datasets were relatively small. In the hierarchical clustering results, PLCB4 expression was upregulated in RCOAD compared with LCOAD in the GSE81558 dataset but downregulated in the GSE75317 dataset. This discrepancy is likely attributable to the small sample sizes and individual differences. Several genomic studies of the differences between RCOAD and LCOAD have been reported.
Building on these previous studies, our study identified critical differentially expressed genes between LCOAD and RCOAD using bioinformatics methods and further verified them in clinical samples. We found that CDKN2A is a candidate key target in the pathogenesis and treatment of LCOAD and RCOAD. Further work with a large number of clinical samples and animal experiments would provide more comprehensive verification and a deeper understanding of the different molecular mechanisms, clinical pathological staging, and survival differences between RCOAD and LCOAD.
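The hub-gene criterion used in this discussion (degree ≥10 in the PPI network) amounts to counting the distinct interaction partners of each node. The Python snippet below is a minimal sketch of that degree filter under stated assumptions: it is not the cytoHubba implementation, the function name `hub_genes` is hypothetical, and the toy network uses placeholder node names rather than real protein interactions.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def hub_genes(edges: Iterable[Tuple[str, str]], min_degree: int = 10) -> List[str]:
    """Return nodes with at least min_degree distinct partners in an undirected network."""
    # Treat (a, b) and (b, a) as the same edge and ignore self-loops.
    unique_edges = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    degree: Counter = Counter()
    for pair in unique_edges:
        for node in pair:
            degree[node] += 1
    return sorted(node for node, count in degree.items() if count >= min_degree)


# Toy network with placeholder names: "HUB" has ten partners, and N0 and N1
# are also linked to each other (listed twice to show deduplication).
toy_edges = [("HUB", f"N{i}") for i in range(10)] + [("N0", "N1"), ("N1", "N0")]

print(hub_genes(toy_edges))
```

Degree is only one of several node-ranking measures, so a different measure or threshold could yield a different hub list.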

5. Conclusion

We studied differences in gene expression between LCOAD and RCOAD using bioinformatics and verified the results by molecular biology, in an attempt to better understand the pathogenesis of COAD and broaden the search for new therapeutic targets. Our study identified 286 differentially expressed genes and 10 hub genes, with a focus on verifying the differential expression and prognostic value of CDKN2A. CDKN2A expression is higher in RCOAD than in LCOAD. Higher CDKN2A expression is associated with a more advanced pathological stage and poorer overall survival, which is consistent with a better prognosis for LCOAD than for RCOAD. The present study provides a reference point for the in-depth study of COAD-associated genes, the discovery of molecular markers at different locations, and the biological processes in which they are involved.
  42 in total

1.  Gene Expression Omnibus: NCBI gene expression and hybridization array data repository.

Authors:  Ron Edgar; Michael Domrachev; Alex E Lash
Journal:  Nucleic Acids Res       Date:  2002-01-01       Impact factor: 16.971

2.  The KEGG database.

Authors:  Minoru Kanehisa
Journal:  Novartis Found Symp       Date:  2002

3.  Distinct patterns of DNA methylation in conventional adenomas involving the right and left colon.

Authors:  Devin C Koestler; Jing Li; John A Baron; Gregory J Tsongalis; Lynn F Butterly; Martha Goodrich; Corina Lesseur; Margaret R Karagas; Carmen J Marsit; Jason H Moore; Angeline S Andrew; Amitabh Srivastava
Journal:  Mod Pathol       Date:  2013-07-19       Impact factor: 7.842

4.  Cytoscape 2.8: new features for data integration and network visualization.

Authors:  Michael E Smoot; Keiichiro Ono; Johannes Ruscheinski; Peng-Liang Wang; Trey Ideker
Journal:  Bioinformatics       Date:  2010-12-12       Impact factor: 6.937

5.  Alterations of tumor suppressor gene p16INK4a in pancreatic ductal carcinoma.

Authors:  Jyotika Attri; Radhika Srinivasan; Siddhartha Majumdar; Bishan Dass Radotra; Jaidev Wig
Journal:  BMC Gastroenterol       Date:  2005-06-28       Impact factor: 3.067

6.  STRING v10: protein-protein interaction networks, integrated over the tree of life.

Authors:  Damian Szklarczyk; Andrea Franceschini; Stefan Wyder; Kristoffer Forslund; Davide Heller; Jaime Huerta-Cepas; Milan Simonovic; Alexander Roth; Alberto Santos; Kalliopi P Tsafou; Michael Kuhn; Peer Bork; Lars J Jensen; Christian von Mering
Journal:  Nucleic Acids Res       Date:  2014-10-28       Impact factor: 16.971

7.  Differences of protein expression profiles, KRAS and BRAF mutation, and prognosis in right-sided colon, left-sided colon and rectal cancer.

Authors:  Xian Hua Gao; Guan Yu Yu; Hai Feng Gong; Lian Jie Liu; Yi Xu; Li Qiang Hao; Peng Liu; Zhi Hong Liu; Chen Guang Bai; Wei Zhang
Journal:  Sci Rep       Date:  2017-08-11       Impact factor: 4.379

8.  GEPIA: a web server for cancer and normal gene expression profiling and interactive analyses.

Authors:  Zefang Tang; Chenwei Li; Boxi Kang; Ge Gao; Cheng Li; Zemin Zhang
Journal:  Nucleic Acids Res       Date:  2017-07-03       Impact factor: 16.971

9.  Clinicopathologic features and treatment efficacy of Chinese patients with BRAF-mutated metastatic colorectal cancer: a retrospective observational study.

Authors:  Xicheng Wang; Qing Wei; Jing Gao; Jian Li; Jie Li; Jifang Gong; Yanyan Li; Lin Shen
Journal:  Chin J Cancer       Date:  2017-10-16

10.  Use of a Combined Gene Expression Profile in Implementing a Drug Sensitivity Predictive Model for Breast Cancer.

Authors:  Xianglan Zhang; In-Ho Cha; Ki-Yeol Kim
Journal:  Cancer Res Treat       Date:  2016-05-18       Impact factor: 4.679
