Hengzhou Zhu1, Yi Ji1, Wenting Li2, Mianhua Wu2.
Abstract
The aim of the present study was to identify key genes in colorectal cancer (CRC) that could be used to reliably diagnose this disease and to explore the potential underlying mechanisms in silico. The gene expression profiles of primary human cancer datasets GSE21510 and GSE32323 were downloaded from the Gene Expression Omnibus database. The limma R software package was used to identify differentially expressed (DE) genes. Gene Ontology and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed on DE genes using the Database for Annotation, Visualization and Integrated Discovery. The Search Tool for the Retrieval of Interacting Genes/Proteins database was used to construct a protein-protein interaction (PPI) network of the DE genes. Survival rate was analyzed and visualized using The Cancer Genome Atlas (TCGA). A total of 1,126 genes were significantly DE in the present study. All DE genes were enriched in KEGG pathways including 'cell cycle', 'mineral absorption', 'pancreatic secretion', 'pathways in cancer', 'metabolic pathways', 'aldosterone-regulated sodium reabsorption' and 'Wnt signaling pathway'. A total of 5 hub genes enriched in cell cycle and tumor-associated pathways, including E2F2, SKP2, MYC, CDKN1A and CDKN2B, were significantly DE and validated between tumor and normal tissues. CDKN1A and CDKN2B were identified within the PPI network using the Molecular Complex Detection algorithm. Survival and content distribution analyses of 362 clinical samples from TCGA revealed that CDKN1A effectively predicted the prognosis of patients. The present study identified key genes and potential signaling pathways involved in CRC. These findings may provide new insights for survival assessment during the clinical diagnosis of CRC. Copyright: © Zhu et al.
Keywords: bioinformatic analysis; biomarker; colorectal cancer; survival rate
Year: 2019 PMID: 31579079 PMCID: PMC6757265 DOI: 10.3892/ol.2019.10698
Source DB: PubMed Journal: Oncol Lett ISSN: 1792-1074 Impact factor: 2.967
Figure 1. Principal component analysis of GSE21510 and GSE32323 datasets. PC, principal component.
Figure 2. (A) Genes were well clustered between cancer and normal tissue samples as shown in the volcano plot. All genes plotted in red represented DE genes and the remaining genes were plotted in blue. (B) The significantly changed DE genes of GSE21510 and GSE32323 were presented in the heatmap. (C) The common DE genes between GSE21510 and GSE32323. DE genes, differentially expressed genes.
Functional and pathway enrichment analysis of the differentially expressed genes in colorectal cancer.
| Term | Genes | False discovery rate | P-value |
|---|---|---|---|
| hsa04110: Cell cycle | CDK1, E2F2, E2F5, DBF4, RBL1, SKP2, PRKDC, TTK, CHEK1, ANAPC10, CHEK2, PTTG1, MCM4, MCM6, CCNB1, CDKN1A, CCNB2, MAD2L1, CDKN2B, PCNA, BUB1, BUB1B, ORC6, MYC, ORC3 | 6.71×10⁻⁴ | 5.10×10⁻⁷ |
| hsa04978: Mineral absorption | SLC11A2, SLC26A3, CLCN2, MT1M, HMOX1, MT2A, MT1E, MT1H, MT1X, MT1G, MT1F | 0.57 | 4.36×10⁻⁴ |
| hsa04972: Pancreatic secretion | KCNMA1, CLCA1, CLCA4, SLC12A2, PRKCB, CEL, SLC26A3, PLCB4, ATP2A3, PLA2G2A, CPA3, CA2, SLC4A4, PLCB1, SLC9A1 | 2.29 | 1.76×10⁻³ |
| hsa05200: Pathways in cancer | FGFR2, WNT5A, CKS1B, E2F2, PPARD, GNAI1, FGF9, GNA11, CXCL8, BDKRB1, ZBTB16, MMP1, EDNRA, WNT2, FOS, PLCB4, CDKN2B, AXIN2, PLCB1, TRAF5, MYC, MSH6, BMP2, COL4A1, EPAS1, MAP2K2, MSH2, MET, SKP2, LEF1, ITGA2, FZD3, PRKCB, FZD6, LAMA1, CDKN1A, LAMA3, MAPK3, CKS2, PDGFRA | 3.20 | 2.47×10⁻³ |
| hsa01100: Metabolic pathways | B3GALT5, B3GALT4, ADH1C, ADH1B, GPAT3, PRIM1, ASPA, PTGIS, ST3GAL4, CPOX, NANP, LPCAT2, GLCE, PLCE1, NME1, AKR1B10, PLA2G2A, ACAA1, PRPS1, XDH, GCNT3, AHCY, GCNT2, GNE, CTPS2, PPAT, B3GNT6, CDA, GCSH, DNMT3B, MAOA, AK1, MAOB, HGD, GART, TST, CEL, POLD4, GGT6, RPE, HMGCS2, MTR, AHCYL2, PC, ATP5D, CYP2C18, ANPEP, PSPH, CKB, ST6GALNAC6, TDO2, PLCB4, HPSE, P4HA1, MGLL, TWISTNB, PLCB1, ATP6V0D1, HYAL1, POLR1D, ACADS, DHRS9, POLR1C, POLR1B, ST6GALNAC1, ACADVL, ATP6V1C2, ADO, PTGDS, ADK, TGDS, AOC1, UGP2, ALPI, SORD, FUT8, HSD17B2, UGDH, UPP1, PIPOX, GLS2, DGKA, ALDH1A1, CKMT2, FUT3, FUT1, PLCD1, UGT2A3, ACSL4, PAPSS2, PLA2G16, NAT2, SI, PCK1, GBA3, GBA2, MBOAT1, SMPD1, PSAT1, PAICS | 3.87 | 2.99×10⁻³ |
| hsa04310: Wnt signaling pathway | WNT5A, PPARD, MMP7, LEF1, FZD3, PRKCB, FZD6, WNT2, GPC4, PLCB4, SFRP1, SFRP2, WIF1, RUVBL1, AXIN2, PLCB1, MYC | 14.59 | 1.19×10⁻² |
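
To give a sense of how a pathway enrichment P-value of this kind can arise, the following is a minimal sketch of a one-sided hypergeometric over-representation test. It is a simplified stand-in, not the statistic computed by the enrichment tool used in the study. The DE gene count (1,126) and the cell cycle overlap (25 genes) are taken from the text and table; the background size and pathway size are hypothetical values chosen only for illustration, so the resulting number is not expected to match the table.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_pathway, n_de, n_overlap):
    """One-sided over-representation P-value, P(X >= n_overlap).

    X is the number of pathway genes among n_de genes drawn without
    replacement from n_background genes, n_pathway of which belong
    to the pathway.
    """
    total = comb(n_background, n_de)
    upper = min(n_pathway, n_de)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_de - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# 1,126 DE genes and 25 cell cycle hits come from the source.
# The background (20000) and pathway size (124) are ASSUMED values.
p_cell_cycle = hypergeom_enrichment_p(
    n_background=20000, n_pathway=124, n_de=1126, n_overlap=25
)
print(f"Illustrative cell cycle enrichment P-value: {p_cell_cycle:.3g}")
```

With a different background or pathway annotation the P-value changes substantially, which is one reason enrichment results should be reported together with the tool and background used.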
Figure 3. Pathway enrichment analysis and PPI network construction for the (A) 'cell cycle' and (B) 'pathways in cancer' signaling pathways. Red points indicate the nodes (genes) in the signaling pathway. (C) A total of 5 hub genes were identified in the Venn diagram analysis. (D) PPI network of the 5 selected hub genes. PPI, protein-protein interaction; DE genes, differentially expressed genes.
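
The Venn diagram step can be illustrated with a plain set intersection. The sketch below assumes that the diagram compares the 'cell cycle' and 'pathways in cancer' gene lists from the enrichment table; the caption does not state this explicitly, so treat it as an interpretation. The gene lists are copied from the table.

```python
# Gene lists copied from the enrichment table (hsa04110 and hsa05200).
cell_cycle = {
    "CDK1", "E2F2", "E2F5", "DBF4", "RBL1", "SKP2", "PRKDC", "TTK",
    "CHEK1", "ANAPC10", "CHEK2", "PTTG1", "MCM4", "MCM6", "CCNB1",
    "CDKN1A", "CCNB2", "MAD2L1", "CDKN2B", "PCNA", "BUB1", "BUB1B",
    "ORC6", "MYC", "ORC3",
}
pathways_in_cancer = {
    "FGFR2", "WNT5A", "CKS1B", "E2F2", "PPARD", "GNAI1", "FGF9", "GNA11",
    "CXCL8", "BDKRB1", "ZBTB16", "MMP1", "EDNRA", "WNT2", "FOS", "PLCB4",
    "CDKN2B", "AXIN2", "PLCB1", "TRAF5", "MYC", "MSH6", "BMP2", "COL4A1",
    "EPAS1", "MAP2K2", "MSH2", "MET", "SKP2", "LEF1", "ITGA2", "FZD3",
    "PRKCB", "FZD6", "LAMA1", "CDKN1A", "LAMA3", "MAPK3", "CKS2", "PDGFRA",
}

# Genes present in both pathways (the overlapping region of the Venn diagram).
hub_genes = sorted(cell_cycle & pathways_in_cancer)
print(hub_genes)
# → ['CDKN1A', 'CDKN2B', 'E2F2', 'MYC', 'SKP2']
```

Under this assumption, the overlap reproduces the 5 hub genes named in the abstract.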
Figure 4. Expression of (A) CDKN1A, (B) CDKN2B, (C) MYC, (D) E2F2 and (E) SKP2 in clinical colon cancer samples (red, tumor; gray, normal). Expression levels at different pathological stages for (F) CDKN1A, (G) CDKN2B, (H) MYC, (I) E2F2 and (J) SKP2. Differential gene expression was analyzed using one-way ANOVA, with the pathological stage as the variable used for calculating differential expression. Samples were obtained from The Cancer Genome Atlas and Genotype-Tissue Expression datasets. |Log fold-change| cut-off=1; *P<0.05. CDKN1A, cyclin-dependent kinase inhibitor 1A; COAD, colon adenocarcinoma; T, tumor; N, normal; CDKN2B, cyclin-dependent kinase inhibitor 2B; E2F2, E2F transcription factor 2; SKP2, S-phase kinase associated protein 2.
Figure 5. Survival analysis of clinical patients according to the expression of CDKN1A, CDKN2B, E2F2, SKP2 and MYC. Data were obtained from The Cancer Genome Atlas. Patients with high expression of CDKN1A had significantly prolonged survival time (P<0.05). The solid lines represent the log-rank test analysis data. The Cox proportional hazard ratio and the 95% confidence interval information are indicated by the dashed lines. The median was used as the threshold for high and low expression. TPM, transcripts per million; HR, hazard ratio; CDKN1A, cyclin-dependent kinase inhibitor 1A; CDKN2B, cyclin-dependent kinase inhibitor 2B; E2F2, E2F transcription factor 2; SKP2, S-phase kinase associated protein 2.
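
The median-threshold grouping described in this caption can be sketched as follows, together with a basic Kaplan-Meier estimate for each group. This is an illustrative sketch only: the expression values, follow-up times and event flags are made-up toy data, the function names are hypothetical, and the log-rank test and Cox hazard ratio mentioned in the caption are not computed here.

```python
from statistics import median


def split_by_median(expression):
    """Return (high, low) sample indices using the median as the threshold."""
    cut = median(expression)
    high = [i for i, x in enumerate(expression) if x > cut]
    low = [i for i, x in enumerate(expression) if x <= cut]
    return high, low


def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time.

    events[i] is 1 if the event (death) was observed at times[i]
    and 0 if the sample was censored at that time.
    """
    curve = []
    survival = 1.0
    at_risk = len(times)
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored samples leave the risk set
    return curve


# Toy data (NOT from the study): expression in TPM, follow-up in months.
expression_tpm = [12.0, 3.5, 8.1, 20.4, 1.2, 15.3]
months = [60, 14, 30, 72, 9, 48]
events = [0, 1, 1, 0, 1, 1]

high, low = split_by_median(expression_tpm)
high_curve = kaplan_meier([months[i] for i in high], [events[i] for i in high])
low_curve = kaplan_meier([months[i] for i in low], [events[i] for i in low])
print("high expression:", high_curve)
print("low expression:", low_curve)
```

In practice a dedicated survival analysis package would be used to compare the two curves and estimate the hazard ratio; the sketch only shows how the high and low expression groups are formed and how each survival curve steps down at observed events.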