Large-Scale Analysis of Gene Expression Data Reveals a Novel Gene Expression Signature Associated with Colorectal Cancer Distant Recurrence.

Nehad M Alajez

Abstract

Colorectal cancer (CRC) is the fourth-ranked cause of cancer-related deaths worldwide. Despite recent advances in CRC management, distant recurrence (DR) remains the major cause of mortality in patients with preoperative chemotherapy and radiotherapy, underscoring a need to precisely identify novel gene signatures for predicting the risk of systemic relapse. Herein, we integrated two independent CRC gene expression datasets: the GSE71222 dataset, including 26 patients who developed DR and 126 patients who did not develop DR, and the GSE21510 dataset, including 23 patients who developed DR and 76 patients who did not develop DR. Our data revealed 37 common upregulated genes (fold change (FC) ≥ 1.5, P < 0.05) and three common downregulated genes (FC ≤ -1.5, P < 0.05) between DR and non-recurrent patients from the two datasets. We subsequently validated the upregulated gene panel in The Cancer Genome Atlas CRC dataset (379 patients), which identified a five-gene signature (S100A2, VIP, HOXC6, DACT1, KIF26B) associated with poor overall survival (OS, log-rank test P-value: 1.19 × 10⁻⁴) and poor disease-free survival (DFS, log-rank test P-value: 0.002). In a Cox proportional hazards multiple regression model, the five-gene signature and tumor stage retained their significance as independent prognostic factors for CRC DFS and OS. Therefore, our data identified a novel DR gene expression signature associated with worse prognosis in CRC.

Year:  2016        PMID: 27935967      PMCID: PMC5147898          DOI: 10.1371/journal.pone.0167455

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Colorectal cancer (CRC) is one of the most prevalent cancers and is currently ranked as the fourth leading cause of cancer-related death globally and the third leading cause of cancer-related death in the United States in both men and women [1, 2]. The 5-year survival rate for CRC patients with a localized tumor is approximately 90%, which declines to 70% for patients with regional disease, and to 12% for patients with metastatic disease [2]. Multiple molecular alterations occur during CRC development and progression. Therefore, the identification of clinical and pathological parameters that can accurately predict the prognosis of patients with CRC has been a daunting task. Some of the factors to consider for predicting the risk of systemic relapse include the differentiation status of the tumor, depth of tumor invasion, and vascular and perineural invasion [3, 4]. Over the past several years, numerous molecular signatures have been identified for CRC prognosis [5-7]. However, one major problem with many of the established molecular signatures for CRC relapse is the lack of validation across different groups and platforms. Therefore, large-scale analysis of multiple gene expression datasets might lead to the identification of more representative gene expression signatures associated with CRC relapse. Herein, we retrospectively integrated three independent CRC gene expression datasets, which led to the identification of a novel five-gene signature associated with CRC systemic relapse.

Materials and Methods

Patient information and data analysis

The current study was conducted on three different CRC cohorts: (1) the National Center for Biotechnology Information Gene Expression Omnibus (GEO) GSE71222 dataset, which included 26 patients who developed distant recurrence (DR) and 126 patients who did not develop DR; (2) the GSE21510 dataset, which included 23 patients who developed DR and 76 patients who did not develop DR; and (3) The Cancer Genome Atlas (TCGA) CRC dataset, which included a total of 379 CRC patients. Interrogation of the TCGA dataset was conducted as previously described [8-10]. The relationship between gene expression patterns and patient survival in the TCGA dataset was queried through cBioPortal using the formula GENE: EXP > 0, where GENE represents a query gene. The clinical characteristics for the TCGA dataset are shown in Table 1. The clinical characteristics for the GSE71222 and GSE21510 datasets have been described previously [11, 12].
Table 1

The Cancer Genome Atlas CRC dataset patient and tumor characteristics.

Characteristic                    N = 379    %
Age, years
  Median age                      66
  Range                           31–90
Gender
  Male                            206        54.4
  Female                          168        44.3
  Unknown                         5          1.3
Overall survival, months
  Median                          22.04
  Range                           0–147.9
Disease-free survival, months
  Median                          20.27
  Range                           0–147.9
Stage
  I                               56         14.8
  II                              135        35.6
  III                             112        29.6
  IV                              52         13.7
  NA                              24         6.3

Microarray data analysis

The GSE71222 and GSE21510 raw gene expression datasets were retrieved from the GEO and were imported into GeneSpring 13.0 software (Agilent Technologies, Palo Alto, CA, USA). Raw data were subsequently normalized using the percentile shift, and a 1.5 fold-change (FC) cutoff and P < 0.05 were used to determine significantly changed transcripts between groups [13].
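
The thresholding step can be illustrated with a minimal sketch. It assumes a signed fold-change convention (ratios below 1 reported as negative reciprocals, as the negative values in Table 2 suggest) and takes per-transcript P-values as given. The function names and example values are hypothetical, and this is not GeneSpring's implementation.

```python
def signed_fold_change(mean_dr, mean_nr):
    """Signed fold change of DR over NR on the linear (non-log) scale.

    Ratios >= 1 are returned as-is; ratios < 1 are returned as the
    negative reciprocal (e.g. 0.5 -> -2.0). This convention is an assumption.
    """
    if mean_dr <= 0 or mean_nr <= 0:
        raise ValueError("mean expression values must be positive")
    ratio = mean_dr / mean_nr
    return ratio if ratio >= 1.0 else -1.0 / ratio


def classify_transcript(fc, p_value, fc_cutoff=1.5, alpha=0.05):
    """Label a transcript 'up', 'down', or 'unchanged' at the given cutoffs."""
    if p_value >= alpha:
        return "unchanged"
    if fc >= fc_cutoff:
        return "up"
    if fc <= -fc_cutoff:
        return "down"
    return "unchanged"


# Hypothetical transcripts: (mean in DR, mean in NR, P-value)
examples = {
    "transcript_A": (300.0, 150.0, 0.01),   # doubled in DR, significant
    "transcript_B": (100.0, 200.0, 0.03),   # halved in DR, significant
    "transcript_C": (180.0, 150.0, 0.001),  # only 1.2-fold, below the cutoff
    "transcript_D": (400.0, 100.0, 0.20),   # large change, not significant
}
calls = {
    name: classify_transcript(signed_fold_change(dr, nr), p)
    for name, (dr, nr, p) in examples.items()
}
print(calls)
```

Both conditions must hold for a transcript to be called: a large fold change with P ≥ 0.05 is discarded, as is a significant change smaller than 1.5-fold.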

Statistical analysis

Kaplan-Meier survival curves were compared using the log-rank test, and a P-value of ≤ 0.05 was considered statistically significant. The Cox proportional hazards multiple regression model was used to identify the independent prognostic factors and to correct for the effect of potential confounding variables, such as gender (male vs female), age (> 65y vs < 65y), tumor stage (stage 3/4 vs stage 1/2), and cancer type (colon adenocarcinoma vs rectal adenocarcinoma vs mucinous adenocarcinoma of the colon and rectum), on OS and DFS using MedCalc 16.8.4 (MedCalc, Mariakerke, Belgium). Pathway analyses were conducted using the DAVID functional annotation and clustering bioinformatics tool, as described in our previous reports [14, 15]. Statistical analyses and graphing were performed using GraphPad Prism 6.0 software (GraphPad Software, San Diego, CA, USA).
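
For readers unfamiliar with the log-rank test, the following is a minimal two-group sketch in plain Python. It is an illustration of the standard test statistic, not the software used in this study; the function name and the toy follow-up times are hypothetical. The P-value uses the identity that, for a chi-square distribution with one degree of freedom, the upper tail probability equals erfc(sqrt(x/2)).

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test.

    times_*  : follow-up times
    events_* : 1 if the event was observed, 0 if the subject was censored
    Returns (chi-square statistic with 1 degree of freedom, P-value).
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    obs_minus_exp = 0.0  # observed minus expected events in group A
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)  # at risk, A
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)  # at risk, B
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        d_b = sum(1 for ti, e, g in data if ti == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Toy data (hypothetical): group A relapses early, group B late, no censoring.
chi2_demo, p_demo = logrank_test([1, 2, 3], [1, 1, 1], [4, 5, 6], [1, 1, 1])
print(round(chi2_demo, 2), round(p_demo, 3))
```

At each event time the test compares the number of events observed in one group with the number expected if both groups shared the same hazard, so two groups with identical follow-up data give a statistic of zero.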

Results

Generation of a gene expression panel associated with risk of DR

To devise a gene expression panel associated with CRC DR with high confidence, we analyzed two independent CRC gene expression datasets (GSE71222 and GSE21510) and identified the genes associated with patient recurrence. Analysis of the GSE71222 and GSE21510 datasets revealed 180 (1.5 FC, P < 0.05) and 317 (1.5 FC, P < 0.05) differentially expressed transcripts between DR and non-recurrent tumors, respectively (Fig 1a and 1b). We then intersected the differentially expressed genes from the two datasets, which revealed 44 common upregulated transcripts, comprising 37 genes (Fig 1c, Table 2), and three common downregulated genes (Table 2). Pathway analysis performed on the common upregulated genes revealed enrichment in several cellular pathways, including cell motion and regulation of cell differentiation (Fig 1d).
Fig 1

Genes associated with CRC distant recurrence (DR).

Heatmap depicting the expression levels of differentially expressed genes (1.5 fold changes and P ≤ 0.05) between DR and non-recurrent (NR) CRC patients from the GSE71222 (a) and GSE21510 (b) datasets. Each column represents an individual sample and each row represents a single transcript. The expression level of each mRNA in a single sample is depicted according to the color scale. (c) Venn diagram depicting the common upregulated genes between DR and NR CRC samples from the GSE71222 and GSE21510 datasets. (d) Pie chart illustrating the distribution of the top 5 pathway designations for the 44 common upregulated transcripts from (c). The pie size corresponds to the number of matched entities.

Table 2

Common recurrence-related genes in the GSE71222 and GSE21510 datasets.

Gene Symbol    FC (GSE71222)    FC (GSE21510)
Upregulated genes
LAMC2          1.51             1.60
SERPINA3       1.58             2.33
LPL            1.74             2.42
S100A2         1.79             2.12
PROM1          1.99             2.33
COL9A3         1.77             2.13
SERPINB5       1.85             2.57
TNFRSF11B      2.08             2.28
TCN1           2.15             2.73
C4BPA          1.60             2.12
SLC14A1        1.50             1.80
REG1B          2.50             2.42
VIP            1.67             2.06
HOXC6          1.75             2.50
MSX2           1.56             1.63
BMP4           1.50             1.60
TNIK           1.62             1.56
PRUNE2         1.71             1.66
KRT6B          1.90             3.45
NOV            1.62             1.73
TESC           1.71             1.83
DACT1          1.52             1.72
BHLHE41        1.60             2.06
ABHD2          1.59             1.58
AMIGO2         1.90             1.87
DCDC2          1.82             2.18
CD109          1.67             1.86
EPHA4          1.80             2.32
PPP2R2C        1.71             1.85
SOX2           1.58             1.82
EPHB1          1.84             2.03
GPR155         1.72             1.72
SBSPON         1.86             1.93
TMEM71         2.16             2.91
KIF26B         1.97             1.52
C3ORF70        1.50             1.70
CPA6           1.56             1.76
Downregulated genes
PTPRD          -2.09            -2.04
PID1           -1.52            -1.54
ELF5           -1.67            -1.62

Selected genes are based on a fold-change (FC) of 1.5 and P < 0.05 cut-off threshold.
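
The cross-dataset step that produces Table 2 amounts to keeping only genes that pass the cutoff in the same direction in both datasets. The sketch below illustrates this with a few fold-change values copied from Table 2; GENE_X and GENE_Y are hypothetical entries added to show what is filtered out, the function name is an assumption, and P-value filtering is assumed to have been applied already.

```python
# Signed fold changes per gene. S100A2, VIP, KIF26B and PTPRD are taken
# from Table 2; GENE_X and GENE_Y are hypothetical single-dataset hits.
gse71222 = {"S100A2": 1.79, "VIP": 1.67, "KIF26B": 1.97, "PTPRD": -2.09,
            "GENE_X": 1.80}
gse21510 = {"S100A2": 2.12, "VIP": 2.06, "KIF26B": 1.52, "PTPRD": -2.04,
            "GENE_Y": -1.70}


def common_genes(fc_a, fc_b, cutoff=1.5):
    """Return (up, down): genes passing the cutoff in the same direction in both."""
    shared = fc_a.keys() & fc_b.keys()
    up = sorted(g for g in shared if fc_a[g] >= cutoff and fc_b[g] >= cutoff)
    down = sorted(g for g in shared if fc_a[g] <= -cutoff and fc_b[g] <= -cutoff)
    return up, down


common_up, common_down = common_genes(gse71222, gse21510)
print(common_up)    # genes upregulated in both datasets
print(common_down)  # genes downregulated in both datasets
```

Requiring concordant direction in two independent cohorts is what reduces the 180 and 317 dataset-specific transcripts to the much shorter common list.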

Validation of the DR-associated gene panel in the TCGA CRC dataset

We subsequently focused on the potential role of the upregulated genes in CRC recurrence. Each of the 37 upregulated genes was therefore further validated using the TCGA CRC dataset to determine its relationship to overall survival (OS) and disease-free survival (DFS). S100A2, VIP, HOXC6, DACT1, and KIF26B were significantly associated with OS (P ≤ 0.01) and DFS (P ≤ 0.05), while LAMC2, NOV, and AMIGO2 were only associated with DFS (P ≤ 0.05). We subsequently focused on the five-gene panel that was associated with both OS and DFS. The OncoPrint for this gene panel in the TCGA CRC dataset with the proportion of patients overexpressing each gene is presented in Fig 2a. Interestingly, the combination of this five-gene panel revealed a higher prognostic value, in which patients overexpressing at least one of the five genes showed a worse OS (log-rank test P-value: 1.19 × 10⁻⁴, Fig 2b) and worse DFS (log-rank test P-value: 0.002, Fig 2c) than those with lower expression of these genes. Data from the univariate analysis were subsequently entered into the Cox proportional hazards multiple regression model to identify the independent factors for prognosis. The results showed that expression of the five-gene panel and tumor stage retained their significance as independent prognostic factors for CRC DFS and OS (p = 0.0023 and 0.0001 for DFS and p = 0.0086 and <0.0001 for OS, respectively), while age at diagnosis only correlated with OS, p = 0.0004 (Table 3). Network analysis of this five-gene signature revealed multiple network interactions in CRC, such as between VIP and GNG11, GNB3, GNG12, GNB2, GNG5, GNAS, GNG2, GNB4, GNG4, GNG10, and GNB1; between DACT1 and ARRB1, DVL1, CSNK2B, CSNK2A1, and CSNK2A2; and between S100A2 and TP53 (Fig 2d).
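
The grouping rule used for the combined panel (overexpression of at least one of the five genes) can be sketched as follows. This is an illustration only: it assumes that per-patient expression is available as z-scores and that the threshold of 0 mirrors the EXP > 0 formula given in the Methods. The function name and patient values are hypothetical.

```python
SIGNATURE = ("S100A2", "VIP", "HOXC6", "DACT1", "KIF26B")


def signature_group(z_scores, threshold=0.0):
    """Return 'high' if any signature gene exceeds the threshold, else 'low'.

    z_scores maps gene symbol -> expression z-score for one patient.
    Genes missing from the mapping are treated as not overexpressed.
    """
    overexpressed = any(
        z_scores.get(gene, float("-inf")) > threshold for gene in SIGNATURE
    )
    return "high" if overexpressed else "low"


# Hypothetical patients
patients = {
    "patient_1": {"S100A2": 1.4, "VIP": -0.3, "HOXC6": -0.8,
                  "DACT1": -0.1, "KIF26B": -0.5},   # one gene above threshold
    "patient_2": {"S100A2": -0.6, "VIP": -0.2, "HOXC6": -1.1,
                  "DACT1": -0.4, "KIF26B": -0.9},   # none above threshold
}
groups = {pid: signature_group(z) for pid, z in patients.items()}
print(groups)
```

Because a single overexpressed gene is enough to place a patient in the high group, the combined panel flags more patients than any one gene does on its own.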
Fig 2

Validation of the distant recurrence (DR) gene panel in the TCGA dataset.

(a) OncoPrint of the DR five-gene signature in the TCGA CRC dataset. Alteration in the expression of different members of the five-gene signature (rows) in relation to each sample (columns). Relationships to overall and disease-free survival are also shown. CRC cases with upregulated expression of the DR signature showed worse overall (b) and disease-free (c) survival than cases with lower expression. (d) Network view of the VIP/DACT1/S100A2 neighborhood in CRC. VIP, DACT1, and S100A2 are seed genes (indicated with thick borders), and all other genes were identified as altered in CRC.

Table 3

Multivariate analyses for the prognostic value of the 5-gene signature in TCGA CRC dataset.

Parameters              Categories        DFS hazard ratio (95% CI)    P value    OS hazard ratio (95% CI)    P value
Five-gene expression    High vs Low       1.95 (1.27 to 3.01)          0.0023     1.84 (1.16 to 2.90)         0.0086
Age at diagnosis        <65 vs >65        0.86 (0.56 to 1.33)          0.5103     2.47 (1.50 to 4.09)         0.0004
Type                    CA vs RA vs MA    0.87 (0.67 to 1.12)          0.2925     1.14 (0.86 to 1.51)         0.3560
Tumor stage             3/4 vs 1/2        1.92 (1.37 to 2.69)          0.0001     3.18 (1.96 to 5.16)         <0.0001
Gender                  M vs F            1.35 (0.86 to 2.10)          0.1857     1.04 (0.65 to 1.65)         0.8614

CA: Colon Adenocarcinoma; RA: Rectal Adenocarcinoma; MA: Mucinous Adenocarcinoma of the Colon and Rectum
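
As a reading aid for Table 3, the relationship between a hazard ratio, its 95% confidence interval, and its P-value can be checked with a standard Wald approximation on the log scale. The sketch below back-calculates approximate P-values from the published five-gene-signature rows; it is not the MedCalc computation, and the function name is hypothetical.

```python
import math


def p_from_hr_ci(hr, lower, upper, z_crit=1.959964):
    """Approximate two-sided Wald P-value from a hazard ratio and its 95% CI.

    The standard error of log(HR) is recovered from the CI width on the log
    scale, then z = log(HR) / SE is compared with a standard normal.
    """
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(hr) / se
    return math.erfc(abs(z) / math.sqrt(2.0))


# Five-gene expression row of Table 3
p_dfs = p_from_hr_ci(1.95, 1.27, 3.01)  # DFS hazard ratio and 95% CI
p_os = p_from_hr_ci(1.84, 1.16, 2.90)   # OS hazard ratio and 95% CI
print(round(p_dfs, 4), round(p_os, 4))
```

The results should land close to the reported values of 0.0023 and 0.0086; small differences are expected because the published hazard ratios and interval bounds are rounded to two decimals.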

Discussion

In the current study, we retrospectively derived and validated a gene expression signature associated with the risk of systemic relapse in patients with CRC. Analysis of the GSE71222 and GSE21510 datasets identified 37 upregulated and three downregulated genes associated with DR in CRC. Interestingly, several of the identified genes (LAMC2, LPL, SERPINB5, TCN1, VIP, MSX2, PRUNE2, KRT6B, TESC, EPHA4, GPR155, KIF26B, C3ORF70, and PID1) were also found to be differentially expressed in our previous global mRNA expression profiling of CRC compared to adjacent normal mucosa, suggesting a plausible role of these genes in driving CRC in addition to DR [16]. Concordant with our data, Takahashi and colleagues [11] reported a worse prognosis in CRC patients overexpressing Traf2- and Nck-interacting kinase (TNIK). Higher expression of MSX2 was found to be associated with metastasis in different types of human cancers [17]. PROM1, also known as CD133, was among the 37 upregulated genes in both datasets. Interestingly, PROM1 has previously been reported as a cancer stem cell (CSC) marker in CRC [18, 19]. Similarly, two of the identified genes in the current study (SLC14A1 and KIF26B) were identified in an intestinal stem cell signature previously reported to be associated with poor clinical outcome in CRC [20]. Therefore, it is possible that patients with an enriched CSC phenotype are more likely to develop DR. We subsequently validated this gene signature in the TCGA CRC dataset, which includes 379 patients. Our analysis narrowed down the CRC recurrence signature to five genes (S100A2, VIP, HOXC6, DACT1, and KIF26B) whose expression was associated with poor OS (log-rank test P-value: 1.19 × 10⁻⁴) and DFS (log-rank test P-value: 0.002), which was further confirmed in a multivariate analysis. Therefore, we here present a novel gene expression signature for predicting the risk of systemic relapse in CRC.
Concordant with our data, overexpression of S100A2 has been associated with poor clinical outcome in colorectal [21] and oral [22] cancers. The HOXC6 gene is frequently upregulated in prostate cancer, although no association with patient relapse was observed [23]. DACT1 was recently shown to promote CRC tumorigenicity and invasion via stabilization of β-catenin [24]. Concordantly, overexpression of DACT1 was observed during the transition of ductal carcinoma in situ to invasive ductal carcinoma in breast cancer [25].

Conclusion

Herein, we integrated multiple gene expression datasets and devised a novel five-gene signature as an independent predictor of CRC DR. This signature adds to the current prognostic value of tumor staging. However, additional validation is required before this five-gene signature can be used in the clinic.

References (25 in total; first 10 listed)

1.  A molecular signature of metastasis in primary solid tumors.

Authors:  Sridhar Ramaswamy; Ken N Ross; Eric S Lander; Todd R Golub
Journal:  Nat Genet       Date:  2002-12-09       Impact factor: 38.330

2.  Clinical significance of osteoprotegerin expression in human colorectal cancer.

Authors:  Shunsuke Tsukamoto; Toshiaki Ishikawa; Satoru Iida; Megumi Ishiguro; Kaoru Mogushi; Hiroshi Mizushima; Hiroyuki Uetake; Hiroshi Tanaka; Kenichi Sugihara
Journal:  Clin Cancer Res       Date:  2011-01-26       Impact factor: 12.531

3.  Colorectal cancer epidemiology: incidence, mortality, survival, and risk factors.

Authors:  Fatima A Haggar; Robin P Boushey
Journal:  Clin Colon Rectal Surg       Date:  2009-11

4.  Colorectal cancer statistics, 2014.

Authors:  Rebecca Siegel; Carol Desantis; Ahmedin Jemal
Journal:  CA Cancer J Clin       Date:  2014-03-17       Impact factor: 508.702

5.  Perineural Invasion is a Strong Prognostic Factor in Colorectal Cancer: A Systematic Review.

Authors:  Nikki Knijn; Stephanie C Mogk; Steven Teerenstra; Femke Simmer; Iris D Nagtegaal
Journal:  Am J Surg Pathol       Date:  2016-01       Impact factor: 6.394

6.  A human colon cancer cell capable of initiating tumour growth in immunodeficient mice.

Authors:  Catherine A O'Brien; Aaron Pollett; Steven Gallinger; John E Dick
Journal:  Nature       Date:  2006-11-19       Impact factor: 49.962

7.  Overexpression of the S100A2 protein as a prognostic marker for patients with stage II and III colorectal cancer.

Authors:  Taiki Masuda; Toshiaki Ishikawa; Kaoru Mogushi; Satoshi Okazaki; Megumi Ishiguro; Satoru Iida; Hiroshi Mizushima; Hiroshi Tanaka; Hiroyuki Uetake; Kenichi Sugihara
Journal:  Int J Oncol       Date:  2016-01-11       Impact factor: 5.650

8.  Significance of BMI1 and FSCN1 expression in colorectal cancer.

Authors:  Nehad M Alajez
Journal:  Saudi J Gastroenterol       Date:  2016 Jul-Aug       Impact factor: 2.485

9.  Gene expression classification of colon cancer into molecular subtypes: characterization, validation, and prognostic value.

Authors:  Laetitia Marisa; Aurélien de Reyniès; Alex Duval; Janick Selves; Marie Pierre Gaub; Laure Vescovo; Marie-Christine Etienne-Grimaldi; Renaud Schiappa; Dominique Guenot; Mira Ayadi; Sylvain Kirzin; Maurice Chazal; Jean-François Fléjou; Daniel Benchimol; Anne Berger; Arnaud Lagarde; Erwan Pencreach; Françoise Piard; Dominique Elias; Yann Parc; Sylviane Olschwang; Gérard Milano; Pierre Laurent-Puig; Valérie Boige
Journal:  PLoS Med       Date:  2013-05-21       Impact factor: 11.069

10.  Prognostic significance of Traf2- and Nck- interacting kinase (TNIK) in colorectal cancer.

Authors:  Hidenori Takahashi; Toshiaki Ishikawa; Megumi Ishiguro; Satoshi Okazaki; Kaoru Mogushi; Hirotoshi Kobayashi; Satoru Iida; Hiroshi Mizushima; Hiroshi Tanaka; Hiroyuki Uetake; Kenichi Sugihara
Journal:  BMC Cancer       Date:  2015-10-24       Impact factor: 4.430
