Integrated bioinformatics analysis of key genes involved in progress of colon cancer.

Haojie Yang1, Jiong Wu1, Jingjing Zhang1, Zhigang Yang1, Wei Jin1, Ying Li1, Lei Jin1, Lu Yin2, Hua Liu1, Zhenyi Wang1.   

Abstract

BACKGROUND: Colon cancer is one of the most malignant cancers worldwide. Nearly 20% of patients already have metastasis when colon cancer is diagnosed. However, a limited understanding of its pathogenesis makes the disease difficult to study.
METHODS: In this study, we acquired high‐throughput expression data from the GEO database and performed an integrated bioinformatic analysis, including differentially expressed gene analysis, gene ontology and Kyoto Encyclopedia of Genes and Genomes pathway analysis, protein‐protein interaction analysis, and survival analysis, to investigate the development of colon cancer.
RESULTS: Comparison of colon cancer tissues with normal colon tissues identified 109 dysregulated genes, of which 83 were downregulated and 26 were upregulated. Two clusters were found using the STRING database and the MCODE plugin of Cytoscape. Six genes with prognostic value were then identified using the UALCAN website.
CONCLUSION: We found that SPP1, VIP, COL11A1, CA2, ADAM12, and INHBA could provide significant prognostic value for colon cancer.
© 2019 The Authors. Molecular Genetics & Genomic Medicine published by Wiley Periodicals, Inc.

Keywords:  GEO; TCGA; bioinformatic analysis; colon cancer

Year:  2019        PMID: 30746900      PMCID: PMC6465657          DOI: 10.1002/mgg3.588

Source DB:  PubMed          Journal:  Mol Genet Genomic Med        ISSN: 2324-9269            Impact factor:   2.183


INTRODUCTION

Colon cancer, one of the most malignant cancers worldwide, causes more than 50,000 deaths per year (Haggar & Boushey, 2009). Because of its characteristics, such as insidious onset, rapid progression, and a tendency toward chemotherapy resistance (He et al., 2017; Marin et al., 2012), it imposes a serious social and medical burden and has aroused public concern. Although large‐scale studies have investigated early diagnostic biomarkers and the mechanisms of colon cancer, the way forward in treating colon cancer often remains unclear. Second‐generation gene sequencing (Kamps et al., 2017) has made it much easier to uncover the causes and pathogenesis of colon cancer and to identify novel biomarkers with prognostic value. In this study, we performed an integrated analysis, including differentially expressed gene analysis, gene ontology (GO) analysis, KEGG pathway analysis, and survival analysis, to identify a panel of key candidate genes involved in colon cancer, and we found that SPP1, VIP, COL11A1, CA2, ADAM12, and INHBA could provide significant prognostic value for colon cancer.

MATERIALS AND METHODS

Ethical compliance

The clinical information and sequence data were acquired according to the requirements of GEO and TCGA databanks. Thus, no ethics committee approval or consent procedure was needed.

Data source

High‐throughput expression data of GSE62932 (GPL570, Affymetrix Human Genome U133 Plus 2.0 Array), which included 68 colon tissue samples as of 03 June 2018, were collected from the GEO database. As GEO is a publicly available database, no ethics approval is required. The samples were divided into two groups by sample type: 64 colon cancer tissues and 4 normal colon tissues were used in the following analysis.

Differentially expressed genes in colon cancer

Before the DEGs (differentially expressed genes) in colon cancer were analyzed, the expression data were normalized using RMA (Robust Multichip Average). We then performed DEG analysis using the limma package with cutoffs of p‐value <0.05 and |logFC| ≥ 2 (Robinson, McCarthy, & Smyth, 2010). A heatmap of the DEG expression values was drawn with the pheatmap R package. To better understand how the DEGs are involved in biological processes and signal transduction, GO and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses were carried out with the clusterProfiler R package (Yu, Wang, Han, & He, 2012); a p‐value <0.05 was considered significant.
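
As an illustration of the thresholding step only, the following Python sketch applies the two cutoffs stated above (p‐value <0.05 and |logFC| ≥ 2) to a table of per‐gene statistics. It is not the limma workflow itself: the gene names, values, and the `GeneStat` and `classify_degs` names are hypothetical, and the statistics are assumed to have been computed beforehand.

```python
from dataclasses import dataclass


@dataclass
class GeneStat:
    gene: str
    log_fc: float   # log2 fold change, cancer vs. normal (assumed orientation)
    p_value: float


def classify_degs(stats, p_cutoff=0.05, lfc_cutoff=2.0):
    """Split genes into up- and downregulated lists using the stated cutoffs."""
    up, down = [], []
    for s in stats:
        if s.p_value < p_cutoff and abs(s.log_fc) >= lfc_cutoff:
            (up if s.log_fc > 0 else down).append(s.gene)
    return up, down


# Hypothetical values for illustration only; not taken from GSE62932.
example = [
    GeneStat("GENE_A", 2.6, 0.001),    # passes both cutoffs, upregulated
    GeneStat("GENE_B", -3.1, 0.0004),  # passes both cutoffs, downregulated
    GeneStat("GENE_C", 1.2, 0.0001),   # fold change too small
    GeneStat("GENE_D", -2.4, 0.2),     # not significant
]
up, down = classify_degs(example)
print("up:", up, "down:", down)
```

A gene must pass both cutoffs to be counted, which is why a strongly significant gene with a small fold change (GENE_C here) is excluded.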

Protein–protein network analysis

Because genes interact with each other, the STRING database was used to construct a gene interaction network and identify central genes (Szklarczyk et al., 2017). Cytoscape software was used to visualize the relationships between genes (Shannon et al., 2003). Following the protein–protein network analysis, the MCODE plugin was used to identify clusters within the network with k‐core = 2.
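
To make the k‐core = 2 setting concrete, here is a minimal Python sketch of k‐core pruning: nodes with fewer than k neighbors are removed repeatedly until every remaining node has at least k. This shows only the k‐core idea; MCODE additionally weights vertices and grows clusters from seed nodes, which is not reproduced here. The function name `k_core` and the protein labels are hypothetical.

```python
def k_core(edges, k=2):
    """Return the node set of the k-core of an undirected graph.

    The k-core is the largest subgraph in which every node has degree >= k.
    """
    adj = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    removed = set()
    queue = [n for n, nbrs in adj.items() if len(nbrs) < k]
    while queue:
        n = queue.pop()
        if n in removed:
            continue
        removed.add(n)
        for m in adj[n]:
            if m in removed:
                continue
            adj[m].discard(n)
            if len(adj[m]) < k:
                queue.append(m)  # m dropped below k after losing n
    return set(adj) - removed


# Hypothetical network: a triangle (P1, P2, P3) with a two-node tail (P4, P5).
edges = [("P1", "P2"), ("P2", "P3"), ("P3", "P1"), ("P1", "P4"), ("P4", "P5")]
core = k_core(edges, k=2)
print(sorted(core))
```

In this toy network the tail is peeled away one node at a time, leaving only the triangle, which is the kind of densely connected region a cluster search looks for.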

Survival analysis to screen the candidate genes

Survival analysis was carried out on the UALCAN website (Chandrashekar et al., 2017), a portal for survival analysis based on the TCGA dataset. The colon cancer samples were divided into two groups according to gene expression: high expression (transcripts per million [TPM] values above the median) and low/median expression (TPM values below the median). We then used the Kaplan–Meier method to identify candidate genes with significant prognostic value (p‐value <0.05).
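
The grouping and the Kaplan–Meier estimate described above can be sketched in a few lines of Python. This is an illustrative sketch, not the UALCAN implementation: the cohort values are hypothetical, samples exactly at the median are assigned to the low/median group as an assumption, and the significance test behind the p‐value is not included.

```python
from statistics import median


def split_by_median(tpm):
    """Return (high, low_or_median) index lists for one gene's TPM values."""
    cut = median(tpm)
    high = [i for i, v in enumerate(tpm) if v > cut]
    low = [i for i, v in enumerate(tpm) if v <= cut]  # assumption: ties go here
    return high, low


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    events[i] is 1 if the patient died at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= leaving  # deaths and censored patients both leave the risk set
    return curve


# Hypothetical cohort of six patients; not TCGA data.
tpm = [1.0, 9.0, 3.0, 7.0, 5.0, 11.0]
days = [300, 60, 240, 100, 180, 140]
died = [0, 1, 1, 1, 0, 1]

high, low = split_by_median(tpm)
km_high = kaplan_meier([days[i] for i in high], [died[i] for i in high])
km_low = kaplan_meier([days[i] for i in low], [died[i] for i in low])
print("high:", km_high)
print("low/median:", km_low)
```

Censored patients reduce the number at risk without lowering the survival estimate, which is why the low/median curve in this toy cohort drops only once.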

Validation in TCGA and The Human Protein Atlas

For validation, the candidate genes were assessed at the RNA expression level using TCGA data and at the protein level using The Human Protein Atlas database. The GEPIA website was used to compare relative RNA expression between colon cancer and normal colon tissues, while The Human Protein Atlas database was used to map protein expression in the tissues (Tang et al., 2017; Uhlén et al., 2015).

RESULTS

Differentially expressed genes involved in colon cancer

Samples were first grouped by pathology into colon cancer tissues and normal colon tissues. Differentially expressed gene analysis was then performed with cutoffs of p‐value <0.05 and |logFC| ≥ 2. One hundred and nine genes were dysregulated; among them, 83 genes were downregulated and 26 genes were upregulated. To further evaluate the genes’ functions, we performed GO analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis. The upregulated genes were mainly enriched in the GO terms C‐X‐C chemokine receptor (CXCR) binding; cytokine activity; chemokine activity; chemokine receptor binding; and G‐protein coupled receptor binding, and in the KEGG pathways IL‐17 signaling pathway; rheumatoid arthritis; cytokine‐cytokine receptor interaction; chemokine signaling pathway; and Toll‐like receptor signaling pathway. The downregulated genes were mostly enriched in the GO terms oxidoreductase activity, acting on the CH–OH group of donors, nicotinamide adenine dinucleotide (NAD) or nicotinamide adenine dinucleotide phosphate (NADP) as acceptor; carbonate dehydratase activity; chloride channel activity; inorganic anion transmembrane transporter activity; and oxidoreductase activity, acting on CH–OH group of donors, and in the KEGG pathways retinol metabolism; pentose and glucuronate interconversions; bile secretion; drug metabolism ‐ cytochrome P450; and metabolism of xenobiotics by cytochrome P450 (Figure 1).
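
Enrichment results of this kind are commonly obtained with an over‐representation test based on the hypergeometric distribution. The Python sketch below shows that calculation for a single term; it assumes this test for illustration, does not include multiple‐testing correction, and uses hypothetical counts rather than values from this study. The function name `enrichment_p` is also hypothetical.

```python
from math import comb


def enrichment_p(k, n, K, N):
    """P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: genes in the background
    K: background genes annotated to the term
    n: genes in the query list (e.g. upregulated DEGs)
    k: query genes annotated to the term
    """
    upper = min(n, K)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


# Hypothetical counts for illustration only.
p = enrichment_p(k=2, n=4, K=5, N=20)
print(round(p, 4))
```

A small p‐value means the overlap between the gene list and the term is larger than expected if the list had been drawn at random from the background.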
Figure 1

The dysregulated genes involved in colon cancer. On the left is the heatmap of dysregulated genes. In the middle is the GO analysis of the upregulated and downregulated genes. On the right is the Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis of the upregulated and downregulated genes

To identify the key genes involved in the development of colon cancer, we used the STRING website to estimate the interactions between genes. A network of 1,326 interaction pairs involving 229 proteins was constructed in Cytoscape software (Figure 2). We then used the MCODE plugin to find densely connected regions based on topology, and two dense clusters were discovered. Cluster 1 involved 23 genes and 113 connections (SULF1, NPY1R, CXCL8, FGFR2, CXCL5, LEF1, CCL28, HBB, CXCL3, CXCL10, MMP3, CHL1, INHBA, GCG, LPAR1, CHGA, CD36, SPINK5, SFRP4, ANPEP, GZMB, CXCL11, P2RY14). Cluster 2 involved 23 genes and 94 connections (HOPX, SPP1, NOX4, DPT, COL10A1, MMP11, ADAM12, LAMA1, COL11A1, SFRP1, MMP1, CA2, SST, CA1, THBS2, CFD, MMP9, VIP, ABCG2, CHI3L1, MMP7, COL4A5, CNTN3). The high degree of the nodes confirmed the density of our protein‐protein network, suggesting common competitions for colon cancer.
Figure 2

Protein–protein network of dysregulated genes. On the left is the whole network of interactions between dysregulated genes. In the middle is cluster 1. On the right is cluster 2

Survival analysis of key genes

To find candidate genes that may influence survival outcome, we performed survival analysis on the key genes. Six candidate genes were found to affect overall survival: SPP1, VIP, COL11A1, CA2, ADAM12, and INHBA (Figure 3). Patients whose tissues had higher expression of SPP1, VIP, COL11A1, ADAM12, or INHBA had significantly shorter overall survival than those with lower expression, whereas patients with higher expression of CA2 had a better prognosis.
Figure 3

Survival analysis of key genes. The red curves represent individuals with high expression, while the blue curves represent individuals with median/low expression. SPP1: NC_000004.12; VIP: NC_000006.12; COL11A1: NC_000001.11; CA2: NC_000008.11; ADAM12: NC_000010.11; INHBA: NC_000007.14

Validation in TCGA and the Human Protein Atlas

The RNA expression levels of SPP1, VIP, COL11A1, CA2, ADAM12, and INHBA were validated in the TCGA dataset. Because The Human Protein Atlas dataset lacks information on COL11A1 and INHBA, their protein expression levels were not evaluated (Figure 4). The results also showed that SPP1, COL11A1, ADAM12, and INHBA expression was significantly higher in colon cancer tissues than in normal tissues, in accordance with our earlier analysis.
Figure 4

Validation of key genes in the TCGA dataset and The Human Protein Atlas dataset

DISCUSSION

In this study, we performed several bioinformatics analyses to identify key genes involved in the development of colon cancer. First, 109 dysregulated genes were found by comparing normal colon tissues with colon cancer tissues. Protein–protein network analysis was then used to study the interactions between the differentially expressed genes, followed by cluster analysis. Next, we performed survival analysis on the key genes to find those with prognostic value. Interestingly, we found a total of six genes: SPP1, VIP, COL11A1, CA2, ADAM12, and INHBA. We went a step further and validated them at both the RNA and protein expression levels, and the results were in accordance with our previous study.
SPP1, a secreted phosphoprotein that contains an RGD domain, was first isolated from bone matrix as an extracellular matrix protein by Herring (Heinegård, Hultenby, Oldberg, Reinholt, & Wendel, 1989; Oldberg, Franzén, & Heinegård, 1986). It is vital in bone reconstruction, anti‐inflammation, arteriosclerosis, and immunomodulation. A variety of cell types, including osteoclasts, macrophages, epithelial cells, T cells, and endothelial cells, can secrete SPP1 (Saitoh, Kuratsu, Takeshima, Yamamoto, & Ushio, 1995). In addition, many studies suggest that SPP1 participates in the development and metastasis of malignant tumors. In gastric cancer (Imano et al., 2009), esophageal cancer (Lin et al., 2015), glioma (Ellert‐Miklaszewska et al., 2016), breast cancer (Rodrigues, Teixeira, Schmitt, Paulsson, & Lindmark‐Mänsson, 2007), and lung cancer (Chambers et al., 1996), it was upregulated and might serve as a biomarker. It could also promote ovarian cancer proliferation, migration, and invasion in vitro by activating the Integrin β1/FAK/AKT signaling pathway (Zeng, Zhou, Wu, & Xiong, 2018). A recent study also showed that SPP1 could mediate macrophage polarization and lung cancer evasion and could be a promising drug target (Zhang, Du, Chen, & Xiang, 2017).
COL11A1 encodes one of the two alpha chains of type XI collagen, a minor fibrillar collagen. It has been widely studied in many cancers. It is overexpressed in both adenocarcinoma and squamous cell lung carcinoma compared with the corresponding non‐neoplastic lung tissues (Wang et al., 2002) and in metastatic oral cavity/pharynx squamous cell carcinoma (Schmalbach et al., 2004). It was also involved in lymph node metastasis in breast cancer (Feng et al., 2007) and could be used as a potential biomarker to distinguish malignant from premalignant lesions in stomach and pancreas cancer (Kleinert et al., 2015; Zhao et al., 2009). It is localized in the Golgi apparatus of normal human colon goblet cells (Bowen et al., 2008). COL11A1 may be associated with the APC/beta‐catenin pathway in FAP and sporadic colon cancer (Fischer et al., 2001). In lung cancer, it correlates positively with pathological stage, poor prognosis, and lymph node metastasis (Chong et al., 2006). It promotes ovarian cancer progression and chemoresistance to cisplatin and paclitaxel by activating NF‐κB, and it promotes the expression of TWIST1 and MCL1 (Wu, Huang, Chang, & Chou, 2017). It has also been reported that COL11A1 was upregulated in gastric cancer and non‐small cell lung cancer, where it could boost malignant behavior in vitro (Li, Li, Lin, Zhuo, & Si, 2017; Shen et al., 2016). COL11A1 could be used as a promising biomarker for predicting malignant relapse of breast intraductal papilloma (Freire et al., 2015).
Carbonic anhydrase (CA) II is a member of the carbonic anhydrases, a ubiquitous group of zinc‐bound metalloenzymes that catalyze the reversible hydration of carbon dioxide. Carbonic anhydrase II (hCAII) has important functions in physiological and pathological processes. CA II is highly expressed in different normal organs, but its expression is inhibited in cancer cells (Li, Xie et al., 2012; Sheng, Dong, Zhou, Li, & Dong, 2013). CA II is also associated with osteopetrosis and renal tubular acidosis (Borthwick et al., 2003). CA II helps maintain the balance between carbon dioxide and bicarbonate and controls the pH level in cells. Because the carbon dioxide and bicarbonate balance is basic to life and can influence various cell behaviors, low expression of CA II may play an important role in tumor progression and development.
ADAM12 (a disintegrin and metalloproteinase domain‐containing protein 12) encodes a member of a family of proteins that play important roles in a variety of biological processes involving cell‐cell and cell‐matrix interactions (Roy, Wewer, Zurakowski, Pories, & Moses, 2004). ADAM12 has different isoforms: the shorter isoforms are secreted, while the longer isoforms are membrane‐bound. ADAM12 takes part in the regulation of physiological and pathological processes, including muscle development, neurogenesis, and fertilization. ADAM12 is upregulated in various cancers, including breast, prostate, ovarian, skin, stomach, lung, and brain cancers (Li, Duhachek‐Muggy et al., 2012; Shao et al., 2014). ADAM12 contributes to tumor progression and metastasis by promoting tumor cell proliferation, migration, invasion, and apoptosis resistance.
INHBA is a member of the TGF‐beta (transforming growth factor‐beta) superfamily. The INHBA gene is overexpressed in different kinds of cancer, such as colorectal cancer, pancreatic cancer, and lung cancer, and promotes cell proliferation, invasion, metastasis, and chemoresistance in cancer cells (Okano et al., 2013; Oshima et al., 2014). INHBA also takes part in the development of the eye, tooth, and testis. INHBA can form different protein complexes, which can activate or inhibit follicle‐stimulating hormone secretion from the pituitary gland, respectively.
In conclusion, we performed an integrated analysis to discover the differentially expressed genes involved in the development of colon cancer and identified a panel of genes with prognostic value to better evaluate the outcome of colon cancer patients. We found that SPP1, VIP, COL11A1, CA2, ADAM12, and INHBA exhibited significant prognostic value. More in‐depth studies are needed to determine the biological functions and mechanisms through which these genes affect malignant cell behavior. These genes may also be promising targets for therapy in colon cancer.

CONFLICT OF INTEREST

None declared.
REFERENCES

1.  Cytoscape: a software environment for integrated models of biomolecular interaction networks.

Authors:  Paul Shannon; Andrew Markiel; Owen Ozier; Nitin S Baliga; Jonathan T Wang; Daniel Ramage; Nada Amin; Benno Schwikowski; Trey Ideker
Journal:  Genome Res       Date:  2003-11       Impact factor: 9.043

2.  clusterProfiler: an R package for comparing biological themes among gene clusters.

Authors:  Guangchuang Yu; Li-Gen Wang; Yanyan Han; Qing-Yu He
Journal:  OMICS       Date:  2012-03-28

3.  Activation of TWIST1 by COL11A1 promotes chemoresistance and inhibits apoptosis in ovarian cancer cells by modulating NF-κB-mediated IKKβ expression.

Authors:  Yi-Hui Wu; Yu-Fang Huang; Tzu-Hao Chang; Cheng-Yang Chou
Journal:  Int J Cancer       Date:  2017-08-30       Impact factor: 7.396

4.  Novel candidate tumor marker genes for lung adenocarcinoma.

Authors:  Kan-Kan Wang; Ni Liu; Nikolina Radulovich; Dennis A Wigle; Michael R Johnston; Frances A Shepherd; Mark D Minden; Ming-Sound Tsao
Journal:  Oncogene       Date:  2002-10-24       Impact factor: 9.867

5.  Osteopontin expression in lung cancer.

Authors:  A F Chambers; S M Wilson; N Kerkvliet; F P O'Malley; J F Harris; A G Casson
Journal:  Lung Cancer       Date:  1996-11       Impact factor: 5.705

6.  Relation of INHBA gene expression to outcomes in gastric cancer after curative surgery.

Authors:  Takashi Oshima; Kazue Yoshihara; Toru Aoyama; Shinich Hasegawa; Tsutomu Sato; Naoto Yamamoto; Nozaki Akito; Manabu Shiozawa; Takaki Yoshikawa; Kazushi Numata; Yasushi Rino; Chikara Kunisaki; Katsuaki Tanaka; Makoto Akaike; Toshio Imada; Munetaka Masuda
Journal:  Anticancer Res       Date:  2014-05       Impact factor: 2.480

7.  Reduction of CAII Expression in Gastric Cancer: Correlation with Invasion and Metastasis.

Authors:  Xiao-Jie Li; Hai-Long Xie; San-Ju Lei; Hui-Qiu Cao; Tian-Yun Meng; Yu-Lin Hu
Journal:  Chin J Cancer Res       Date:  2012-09       Impact factor: 5.087

8.  A potential role of collagens expression in distinguishing between premalignant and malignant lesions in stomach.

Authors:  Yuan Zhao; Tianhua Zhou; Aiqing Li; Haomi Yao; Fei He; Liangjing Wang; Jianmin Si
Journal:  Anat Rec (Hoboken)       Date:  2009-05       Impact factor: 2.064

9.  UALCAN: A Portal for Facilitating Tumor Subgroup Gene Expression and Survival Analyses.

Authors:  Darshan S Chandrashekar; Bhuwan Bashel; Sai Akshaya Hodigere Balasubramanya; Chad J Creighton; Israel Ponce-Rodriguez; Balabhadrapatruni V S K Chakravarthi; Sooryanarayana Varambally
Journal:  Neoplasia       Date:  2017-07-18       Impact factor: 5.715

10.  GEPIA: a web server for cancer and normal gene expression profiling and interactive analyses.

Authors:  Zefang Tang; Chenwei Li; Boxi Kang; Ge Gao; Cheng Li; Zemin Zhang
Journal:  Nucleic Acids Res       Date:  2017-07-03       Impact factor: 16.971
