
Integrative analysis of differential genes and identification of a "2-gene score" associated with survival in esophageal squamous cell carcinoma.

Lin Wang1,2,3, Gaochao Dong1, Wenjie Xia1,2, Qixing Mao1,2, Anpeng Wang1,2, Bing Chen1,2, Weidong Ma1,2, Yaqin Wu1,2, Lin Xu1, Feng Jiang1.   

Abstract

BACKGROUND: Developments in high-throughput genomic technologies have led to improved understanding of the molecular underpinnings of esophageal squamous cell carcinoma (ESCC). However, there is currently no model that combines the clinical features and gene expression signatures to predict outcomes.
METHODS: We obtained data from the GSE53625 dataset of Chinese ESCC patients who had undergone surgical treatment. The R packages Limma and WGCNA were used to identify differentially expressed genes and to construct their co-expression network, respectively. Cox regression was used to identify prognostic factors, and nomogram prediction models were constructed.
RESULTS: A total of 3654 differentially expressed genes were identified and subjected to bioinformatics enrichment analysis. Multivariate analysis of the clinical cohort revealed that age and adjuvant therapy were independent factors for survival, and these were entered into the clinical nomogram. After integrating the gene expression profiles, we identified a "2-gene score" associated with overall survival. The combined model comprises clinical data and gene expression profiles. The C-index of the combined nomogram for predicting survival was significantly higher than that of the clinical nomogram. The calibration curve showed closer agreement between predicted and observed survival for the combined nomogram than for the clinical nomogram alone.
CONCLUSIONS: The integration of gene expression signatures and clinical variables produced a predictive model for ESCC that performed better than those based exclusively on clinical variables. This approach may provide a novel prediction model for ESCC patients after surgery.
© 2018 The Authors. Thoracic Cancer published by China Lung Oncology Group and John Wiley & Sons Australia, Ltd.

Keywords:  ESCC; nomogram; prediction model; prognosis

Year:  2018        PMID: 30421504      PMCID: PMC6312844          DOI: 10.1111/1759-7714.12902

Source DB:  PubMed          Journal:  Thorac Cancer        ISSN: 1759-7706            Impact factor:   3.500


Introduction

Esophageal cancer is the sixth leading cause of cancer‐related death globally; the two major subtypes are esophageal squamous cell carcinoma (ESCC) and esophageal adenocarcinoma.1, 2 According to epidemiological and biological analysis, ESCC accounts for almost 90% of esophageal cancer cases worldwide, and is prevalent in East Asia, East Africa, and South America. Esophageal adenocarcinoma is more common in the Americas, Europe, and Australia.2, 3 The primary treatment for patients with esophageal cancer includes chemotherapy, chemoradiotherapy, and/or surgical resection.4 Although the incidence of esophageal cancer is declining in most parts of the world, the five‐year survival rate remains < 20%.5, 6, 7 One of the major treatment challenges is the lack of accurate prediction of patient survival, which may lead to inappropriate treatment.

The gold standard for prognostication in oncology remains the tumor node metastasis (TNM) staging system, which assumes that solid tumors first spread from the primary site to the lymphatic system and then to distant organs.8 However, the TNM system has some limitations when used in a clinical setting.9 TNM staging can only incorporate tumor, node, and metastasis status as categorical variables, not continuous variables, which complicates the determination of individual patient prognosis. The TNM system also cannot incorporate other variables that govern prognosis, such as genome or transcriptome differences. As a result, patients classified in the same stage may have variable outcomes. Thus, the development of a more advanced method to predict prognosis based on patient and disease characteristics is necessary.

With the development of high‐throughput sequencing, it is now possible to screen the genomic, epigenetic, or proteomic characteristics of esophageal cancer, which leads to a better understanding of esophageal cancer biology and may improve patient care.10, 11 Jang et al. developed a robust prediction model for recurrence based on an analysis of the expression profile data of small non‐coding RNAs (sncRNAs) from 108 fresh frozen ESCC specimens. They found that the expression of three different sncRNAs was associated with recurrence‐free survival.12 Qin et al. sequenced 10 whole‐genome and 57 whole‐exome matched tumor‐normal ESCC sample pairs, and found that the amplification of somatic copy number alterations (SCNAs) in several miRNA genes was significantly associated with survival.13 In recent years, long non‐coding RNA (lncRNA), a type of RNA molecule > 200 nucleotides in length that lacks protein‐coding capacity, has emerged as a new star in the field of oncology.14, 15, 16 Wu et al. identified that lincRNA‐uc002yug.2 may serve as a predictor for esophageal cancer and prognosis.17 However, a prediction model that combines clinical data and gene expression profiles associated with overall or recurrence‐free survival is lacking.

Nomograms, which integrate various prognostic and determinant variables for individual patients, are now widely used as prognostic devices in oncology and medicine.18, 19, 20 In this manuscript, we used clinical and gene expression profiles from the Gene Expression Omnibus (GEO, GSE53625) to analyze the differentially expressed protein‐coding and long non‐coding genes, respectively. Using the coefficients and regression formula of the multivariate Cox model, we identified several clinical variables and a “2‐gene score” (lncRNA) associated with survival duration. Based on the clinical variables and the “2‐gene score,” we constructed a nomogram to predict prognosis. The accuracy of this prediction model was higher than that of a model based on clinical variables alone. This model incorporated molecular and clinical/pathological prognostic markers and may refine prognosis assessment.

Methods

Data sources and bioinformatics

The GSE53625 gene expression profiles were obtained from GEO (https://www.ncbi.nlm.nih.gov) and included 179 paired tumor‐normal samples from ESCC patients treated by resection. The microarray platform, GPL18109 (Agilent‐038314 CBC Homo sapiens lncRNA + mRNA microarray V2.0; Agilent Technologies, Santa Clara, CA, USA), incorporates lncRNA and messenger RNA (mRNA) probes. We re‐annotated this platform, focusing mainly on the lncRNA probes, according to databases including ENCODE, CombinedLit, EvoFold, H‐InvDB, imsRNA, hox‐HOX, int‐HOX, nc‐HOX, lncRNAdb, XLOC, NRED, and UCSC. The Limma package in R software (R Foundation for Statistical Computing, Vienna, Austria) was used to identify mRNAs and lncRNAs that were differentially expressed between normal and tumor specimens. The list of differentially expressed genes was submitted to the Database for Annotation, Visualization and Integrated Discovery (DAVID) Bioinformatics Resources 6.8 (http://david.abcc.ncifcrf.gov) for Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway and Gene Ontology (GO) biological process enrichment analysis. The co‐expression network of the differentially expressed genes was constructed using the R package WGCNA (R Foundation) and Cytoscape software (National Institute of General Medical Sciences, Bethesda, MD, USA). The pheatmap package in R software (R Foundation) was used to draw the heatmap, while a receiver operating characteristic (ROC) curve was constructed using the ROCR package (https://CRAN.R-project.org/package=ROCR). The nomogram was built using the rms package of R statistical software (http://www.R-project.org/).

Statistical analysis

Statistical analysis was performed using SPSS version 20.0 (IBM Corp., Armonk, NY, USA) and P < 0.05 was considered statistically significant. The Kaplan–Meier method and the log‐rank test were used to assess survival. Univariate and multivariate analyses were performed using the Cox model. In the clinical variable Cox model, the following formula was used: variable_1 = 0.363 (age variable) + 0.564 (TNM stage variable) – 0.582 (adjuvant therapy variable). In the clinical and “2‐gene score” Cox model, the following formula was used: variable_2 = 0.358 (age variable) + 0.605 (TNM stage variable) – 0.605 (adjuvant therapy variable) + 0.723 (RP11‐357H14.20 variable) + 0.295 (RP11‐768G7.2 variable). Based on their variable_1 and variable_2 scores, patients were assigned to low‐risk or high‐risk groups under each model.
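Because both scores are linear combinations of covariates, they can be illustrated with a short Python sketch. This is not the authors' code. The coefficients are those reported above, but the 0/1 coding of each covariate, the median cut‐off used to split patients into risk groups, and all function and key names are assumptions, as the text does not specify them.

```python
from statistics import median

# Coefficients reported in the text for the two Cox-derived scores.
COEF_CLINICAL = {"age": 0.363, "tnm_stage": 0.564, "adjuvant_therapy": -0.582}
COEF_COMBINED = {
    "age": 0.358,
    "tnm_stage": 0.605,
    "adjuvant_therapy": -0.605,
    "RP11-357H14.20": 0.723,
    "RP11-768G7.2": 0.295,
}


def risk_score(patient, coefficients):
    """Linear predictor: sum of coefficient * covariate value.

    ASSUMPTION: every covariate is already coded numerically (for
    example 0/1); the source does not state the coding it used.
    """
    return sum(coef * patient[name] for name, coef in coefficients.items())


def assign_risk_groups(scores):
    """Label each score 'high' if it exceeds the cohort median, else 'low'.

    ASSUMPTION: a median split; the source does not report its cut-off.
    """
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]


# Hypothetical patient: older, advanced stage, received adjuvant therapy,
# high expression of both lncRNAs.
example = {
    "age": 1,
    "tnm_stage": 1,
    "adjuvant_therapy": 1,
    "RP11-357H14.20": 1,
    "RP11-768G7.2": 1,
}
variable_1 = risk_score(example, COEF_CLINICAL)
variable_2 = risk_score(example, COEF_COMBINED)
```

The negative coefficient for adjuvant therapy lowers the score, in line with its protective hazard ratio.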

Results

Clinicopathologic characteristics of patients with esophageal squamous cell carcinoma (ESCC)

A total of 179 patients with ESCC were included in the study. The clinical data and gene expression profiles associated with these patients were obtained from the GEO dataset GSE53625. The baseline characteristics of these patients are listed in Table 1. Approximately half of the patients were aged 60 years or older and over 80% were male. More than half of the patients had a history of alcohol consumption and smoking. The tumor was located in the middle esophagus in over half of the patients. The tumor grade was moderate in 54.7% of patients. The percentages of patients in TNM stages I, II, and III were 5.59%, 43.0%, and 51.4%, respectively. Some patients experienced arrhythmia (24.0%), pneumonia (19.6%), or anastomotic leak (6.70%). Previous research and these clinical data suggest that patients can benefit from adjuvant therapy.
Table 1

Clinicopathologic characteristics of ESCC patients

| Characteristic | No. of patients | % |
| --- | --- | --- |
| Age, years | | |
| ≥ 60 | 88 | 49.2 |
| < 60 | 91 | 50.8 |
| Gender | | |
| Male | 146 | 81.6 |
| Female | 33 | 18.4 |
| Tobacco use | | |
| Yes | 114 | 63.7 |
| No | 65 | 36.3 |
| Alcohol use | | |
| Yes | 106 | 59.2 |
| No | 73 | 40.8 |
| Tumor location | | |
| Upper | 20 | 11.2 |
| Middle | 97 | 54.2 |
| Lower | 62 | 34.6 |
| Tumor grade | | |
| Well | 32 | 17.9 |
| Moderately | 98 | 54.7 |
| Poorly | 49 | 27.4 |
| Invasion of adjacent structure | | |
| Yes | 31 | 17.3 |
| No | 148 | 82.7 |
| Lymphatic metastasis | | |
| Yes | 96 | 53.6 |
| No | 83 | 46.4 |
| TNM stage | | |
| I | 10 | 5.59 |
| II | 77 | 43.0 |
| III | 92 | 51.4 |
| Arrhythmia | | |
| Yes | 43 | 24.0 |
| No | 136 | 76.0 |
| Pneumonia | | |
| Yes | 35 | 19.6 |
| No | 164 | 80.4 |
| Anastomotic leak | | |
| Yes | 12 | 6.70 |
| No | 167 | 93.3 |
| Adjuvant therapy | | |
| Yes | 108 | 60.3 |
| No | 45 | 25.1 |
| Unknown | 26 | 14.6 |

ESCC, esophageal squamous cell carcinoma; TNM, tumor node metastasis.

Comprehensive analysis of the differentially expressed protein‐coding genes

To identify potential esophageal cancer‐related genes, we used the Limma package of R software to identify differentially expressed genes in the GSE53625 array data. Genes with a fold change > 2 and an adjusted P value < 0.05 were considered differentially expressed. A total of 3654 differentially expressed protein‐coding and long non‐coding genes were identified (Fig 1a). Among these genes, 3205 were coding genes (Fig 1b), of which 1311 were upregulated in tumors, while 1894 were downregulated (Appendix S1 and S2). We used GO and KEGG pathway analysis (DAVID Bioinformatics Resources 6.8) to explore the main functions of the differentially expressed protein‐coding genes.21 As shown in Figure 1c, processes related to epidermis development, epithelial cell differentiation, ectoderm development, and epithelium development ranked highest in the GO Biological Process enrichment analysis. Extracellular matrix (ECM)‐receptor interaction, focal adhesion, and cell cycle achieved the highest scores in KEGG pathway enrichment analysis (Fig 1d). These results indicated that epithelial cell differentiation, ECM‐receptor interaction, focal adhesion, and cell cycle may play important roles in the progression of ESCC, which is consistent with previous reports.10, 22, 23
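The filtering rule described above (fold change > 2 and adjusted P < 0.05) can be sketched in Python as follows. This is an illustration rather than the authors' Limma workflow. It assumes that the reported logFC values are log2 fold changes, which is Limma's convention, so that a fold change > 2 corresponds to |logFC| > 1. The function name and the final example call are hypothetical.

```python
def classify_gene(log_fc, adj_p, lfc_cutoff=1.0, p_cutoff=0.05):
    """Classify one gene as 'up', 'down', or 'not_significant'.

    ASSUMPTION: log_fc is a log2 fold change (tumor vs. normal), so
    |log_fc| > 1 is equivalent to a fold change > 2 in either direction.
    """
    if adj_p >= p_cutoff or abs(log_fc) <= lfc_cutoff:
        return "not_significant"
    return "up" if log_fc > 0 else "down"


# Values for two lncRNAs as reported in Table 3 of the source.
casc2 = classify_gene(-1.05, 1.23e-25)       # CASC2
ac093850 = classify_gene(4.96, 6.63e-96)     # AC093850.2

# Hypothetical gene that fails the fold-change threshold.
weak_change = classify_gene(0.4, 0.001)
```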
Figure 1

Systematic analysis of differential transcribed genes and bioinformatics analysis of the differentially expressed coding genes. (a) Use of the Limma package (R software) to screen and analyze the differentially expressed genes of paired samples, including coding and non‐coding. (b) The heatmap reveals the significantly differentially expressed coding genes between tumor and normal specimens. (c,d) Bioinformatic analysis of differentially expressed coding genes according to Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway.


Comprehensive analysis of the differential non‐coding genes

Based on the array data, we also identified 449 differentially expressed non‐coding genes (Fig 2a). When comparing the expression profiles of the tumor specimens and matched normal samples, 224 non‐coding genes were upregulated and 225 were downregulated (Appendix S3 and S4). According to the non‐coding RNA database classification,24 we observed that over 60% of the differentially expressed non‐coding RNAs were long intergenic non‐coding RNAs (lincRNAs) (Fig 2b). Antisense, pseudogene, and processed transcript lncRNAs accounted for 17.1%, 10.2%, and 9.80%, respectively. Increasing evidence shows that lncRNAs play vital roles in cancer processes, which emphasizes the need for investigation of lncRNA function. Methods based on the construction of a coding‐non‐coding co‐expression network have been widely used to predict the probable functions of lncRNAs.25, 26 Following this approach, we constructed a network between the differentially expressed protein‐coding genes and the differentially expressed non‐coding genes to facilitate the prediction of lncRNA function (Fig 2c); 386 coding genes and 79 lncRNAs were included in this network (Appendix S5 and S6). Using the highly correlated coding genes for GO Biological Process enrichment analysis, we observed that lncRNAs may play important roles in the processes associated with peptide cross‐linking, keratinization, and nucleosome assembly (Fig 2d).
Figure 2

Systematic analysis of differentially expressed non‐coding genes and the prediction of function. (a) The heatmap comprises significantly differentially expressed non‐coding genes. (b) The classification of differentially expressed long non‐coding RNAs (lncRNAs): long intergenic non‐coding RNA (lincRNA), antisense, pseudogene, processed transcript, misc RNA, sense_intronic, to be experimentally confirmed (TEC), and unknown. (c) The network between the differentially expressed protein‐coding genes and non‐coding genes based on the WGCNA package. (d) The predicted function of lncRNAs according to correlation analysis.

Univariate and multivariate analyses of clinical and biological variables based on the Cox model

We first constructed a Cox regression model based on clinicopathologic characteristics. Clinical features including age, gender, tobacco use, alcohol use, tumor location, tumor grade, invasion of adjacent structures, lymphatic metastasis, TNM stage, arrhythmia, pneumonia, anastomotic leak, and adjuvant therapy were entered into univariate analysis (Table 2). We observed that age, tumor grade, invasion of adjacent structures, lymphatic metastasis, TNM stage, and adjuvant therapy were prognostic factors (all P < 0.05). Because TNM stage was highly correlated with invasion of adjacent structures and lymphatic metastasis, these two variables, rather than TNM stage itself, were entered into multivariate analysis. As shown in Figure 3a (left; Table 2), age (hazard ratio [HR] 1.575, 95% CI 1.006–2.467; P = 0.047) and adjuvant therapy (HR 0.520, 95% CI 0.289–0.934; P = 0.029) were independent prognostic factors. Because TNM stage is considered the gold standard for prognostication in clinical practice,8 it was nevertheless included in all subsequent analyses. We constructed a Cox model using the new formula: variable_1 = 0.363 (age variable) + 0.564 (TNM stage variable) – 0.582 (adjuvant therapy variable). As shown in Figure 3b (left), patients in the low‐risk group survived longer than those in the high‐risk group (P < 0.0001). We also estimated the specificity and sensitivity of variable_1 using an ROC curve. The area under the ROC curve (AUC) of this new variable was 0.717.
Table 2

Univariate and multivariable analyses based on the clinical Cox model

| Parameter | Univariate HR | Univariate 95% CI | Univariate P | Multivariable HR | Multivariable 95% CI | Multivariable P |
| --- | --- | --- | --- | --- | --- | --- |
| Age (≥ 60) | 1.680 | 1.146–2.461 | 0.008 | 1.575 | 1.006–2.467 | 0.047* |
| Gender (female) | 1.277 | 0.789–2.044 | 0.307 | | | |
| Tobacco use | 1.334 | 0.905–1.967 | 0.145 | | | |
| Alcohol use | 1.158 | 0.788–1.700 | 0.456 | | | |
| Tumor location | | | 0.257 | | | |
| Tumor location (middle vs. upper) | 1.669 | 0.905–3.078 | 0.101 | | | |
| Tumor location (lower vs. upper) | 1.135 | 0.740–1.741 | 0.561 | | | |
| Tumor location (middle vs. lower) | 0.680 | 0.385–1.202 | 0.184 | | | |
| Tumor grade | | | 0.048 | 0.829 | 0.516–1.330 | 0.436 |
| Tumor grade (well vs. poorly) | 0.605 | 0.338–1.082 | 0.090 | | | |
| Tumor grade (moderately vs. poorly) | 0.613 | 0.401–0.939 | 0.024 | | | |
| Tumor grade (moderately vs. well) | 1.014 | 0.587–1.750 | 0.961 | | | |
| Invasion of adjacent structure | 1.628 | 1.017–2.605 | 0.042 | 0.852 | 0.610–1.189 | 0.346 |
| Lymphatic metastasis | 2.129 | 1.420–3.192 | < 0.001 | 1.528 | 0.931–2.508 | 0.094 |
| TNM stage | | | 0.001 | | | |
| TNM stage (I vs. III) | 0.276 | 0.087–0.879 | 0.029 | | | |
| TNM stage (II vs. III) | 0.492 | 0.327–0.739 | 0.001 | | | |
| TNM stage (II vs. I) | 1.782 | 0.549–5.788 | 0.336 | | | |
| Arrhythmia | 0.893 | 0.580–1.375 | 0.607 | | | |
| Pneumonia | 0.702 | 0.354–1.390 | 0.310 | | | |
| Anastomotic leak | 0.770 | 0.357–1.658 | 0.504 | | | |
| Adjuvant therapy | 0.442 | 0.256–0.762 | 0.003 | 0.520 | 0.289–0.934 | 0.029* |

*Indicates P < 0.05. CI, confidence interval; HR, hazard ratio; TNM, tumor node metastasis.

Figure 3

Two Cox regression modeling approaches to predict esophageal squamous cell carcinoma (ESCC) survival after surgery. (a) Multivariate analysis of the overall survival of ESCC patients based on the two Cox models, one based exclusively on the clinical variables, the other based on the integration of clinical variables and a 2‐gene score. (b) Kaplan–Meier survival curves of the low‐risk and high‐risk groups under each of the two models. (c) Receiver operating characteristic (ROC) curves of the two models, reflecting the specificity and sensitivity of the two comprehensive variables. AUC, area under the ROC curve; CI, confidence interval; HR, hazard ratio.

We also constructed a novel Cox model based on the clinical features and gene expression profiles associated with patient survival. An increasing amount of research has indicated that lncRNAs are closely correlated with prognosis. We therefore evaluated the differentially expressed lncRNAs as candidate genes in univariable analyses. As shown in Table 3, we identified 31 lncRNAs as prognostic factors. After incorporating these candidate lncRNAs alongside the clinicopathologic variables, we observed four independent prognostic factors (Fig 3a, right): age (HR 2.029, 95% CI 1.173–3.508; P = 0.011), adjuvant therapy (HR 0.408, 95% CI 0.192–0.868; P = 0.020), RP11‐357H14.20 (HR 2.235, 95% CI 1.237–4.038; P = 0.008), and RP11‐768G7.2 (HR 2.215, 95% CI 1.258–3.903; P = 0.006) (Table 4). Moreover, we calculated a new variable (variable_2) according to the novel Cox model: variable_2 = 0.358 (age variable) + 0.605 (TNM stage variable) – 0.605 (adjuvant therapy variable) + 0.723 (RP11‐357H14.20 variable) + 0.295 (RP11‐768G7.2 variable). We also estimated the predictive ability of variable_2. As shown in Figure 3b (right), patients in the high‐risk cohort had poorer long‐term prognosis. The AUC allowed us to estimate the specificity and sensitivity of variable_2. As shown in Figure 3c, the AUC of variable_2 was 0.769, higher than that of variable_1 (0.717), indicating that the combination of clinical features and gene expression patterns is a more accurate predictor than clinicopathologic characteristics alone.
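For readers unfamiliar with the AUC, the following Python sketch computes it as the probability that a randomly chosen patient with the event has a higher score than a randomly chosen patient without it. This is only an illustration: the authors used the ROCR package in R, the text does not say how the binary outcome for the ROC analysis was defined, and the function name and the toy data are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for patients with the event, 0 for patients without it.
    scores: risk score for each patient (higher means higher risk).
    ASSUMPTION: a fixed binary outcome; the source does not state how
    survival was dichotomized for its ROC analysis.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("both outcome classes are required")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # correctly ranked pair
            elif p == n:
                wins += 0.5      # tied scores count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: four patients, two with the event.
toy_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect discrimination, which gives context for the reported values of 0.717 and 0.769.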
Table 3

Univariate analysis of gene expression profiles correlated with overall survival of ESCC patients

| Number | Ensembl name | logFC | adj.P.Val | ENSG | Type | HR | 95% CI | P |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | CASC2 | −1.05 | 1.23E‐25 | ENSG00000177640 | Antisense | 1.506 | 1.026–2.209 | 0.036 |
| 2 | FLJ40288 | −2.98 | 6.24E‐82 | ENSG00000183470 | lincRNA | 0.649 | 0.441–0.954 | 0.028 |
| 3 | KB‐1183D5.11 | 1.09 | 3.19E‐11 | ENSG00000215498 | Processed_transcript | 0.676 | 0.461–0.992 | 0.046 |
| 4 | RP11‐357H14.2 | 1.85 | 1.88E‐34 | ENSG00000233283 | Processed_transcript | 1.703 | 1.159–2.503 | 0.007 |
| 5 | RP11‐438N16.1 | 2.19 | 6.6E‐27 | ENSG00000249550 | lincRNA | 0.657 | 0.447–0.965 | 0.032 |
| 6 | RP11‐129M6.1 | 1.38 | 2.22E‐14 | ENSG00000251363 | lincRNA | 0.677 | 0.461–0.996 | 0.047 |
| 7 | AC006296.1 | −1.73 | 9.65E‐21 | ENSG00000251412 | lincRNA | 0.644 | 0.438–0.946 | 0.025 |
| 8 | AC007880.1 | −2.22 | 4.47E‐26 | ENSG00000234572 | lincRNA | 0.652 | 0.444–0.957 | 0.029 |
| 9 | AC092168.4 | −1.21 | 6.69E‐32 | ENSG00000228488 | lincRNA | 0.586 | 0.398–0.865 | 0.007 |
| 10 | AC093850.2 | 4.96 | 6.63E‐96 | ENSG00000230838 | lincRNA | 1.471 | 1.002–2.159 | 0.049 |
| 11 | AF003626.1 | −1.18 | 2.87E‐36 | ENSG00000230153 | lincRNA | 0.629 | 0.428–0.925 | 0.018 |
| 12 | AP000344.3 | −2.11 | 1.18E‐32 | ENSG00000234928 | lincRNA | 0.654 | 0.445–0.961 | 0.031 |
| 13 | AP000473.6 | 1.23 | 3.22E‐16 | ENSG00000237735 | lincRNA | 0.605 | 0.410–0.892 | 0.011 |
| 14 | CTD‐2382E5.1 | 1.13 | 8.53E‐12 | ENSG00000246740 | Antisense | 0.644 | 0.439–0.946 | 0.025 |
| 15 | FRMPD2P1 | −1.93 | 3.84E‐33 | ENSG00000150175 | Pseudogene | 0.614 | 0.418–0.904 | 0.013 |
| 16 | LINC00028 | −1.41 | 6.61E‐33 | ENSG00000233354 | lincRNA | 0.582 | 0.395–0.858 | 0.006 |
| 17 | MAMDC2‐AS1 | −1.71 | 6.45E‐54 | ENSG00000204706 | Antisense | 1.685 | 1.144–2.483 | 0.008 |
| 18 | RP11‐120J1.1 | −1.62 | 8.68E‐33 | ENSG00000225472 | Antisense | 0.635 | 0.431–0.936 | 0.022 |
| 19 | RP11‐225N10.1 | −1.58 | 3.08E‐47 | ENSG00000240063 | Antisense | 0.680 | 0.463–0.999 | 0.049 |
| 20 | RP11‐226F19.5 | 1.10 | 3.34E‐21 | ENSG00000259062 | Antisense | 1.486 | 1.013–2.181 | 0.043 |
| 21 | RP11‐242F24.1 | 1.03 | 1.75E‐46 | ENSG00000228750 | lincRNA | 1.513 | 1.028–2.226 | 0.036 |
| 22 | RP1‐12803.4 | −3.71 | 2.95E‐82 | ENSG00000230248 | lincRNA | 0.638 | 0.432–0.936 | 0.022 |
| 23 | RP11‐411K7.1 | −1.30 | 4.81E‐13 | ENSG00000236740 | Processed_transcript | 0.642 | 0.437–0.943 | 0.024 |
| 24 | RP11‐51M18.1 | 1.59 | 2.44E‐17 | ENSG00000253898 | lincRNA | 0.594 | 0.403–0.876 | 0.009 |
| 25 | RP11‐521B24.3 | 1.09 | 2.49E‐21 | ENSG00000251602 | Antisense | 1.740 | 1.181–2.564 | 0.005 |
| 26 | RP11‐526P5.2 | −1.16 | 3.68E‐12 | ENSG00000235281 | lincRNA | 0.655 | 0.446–0.963 | 0.031 |
| 27 | RP11‐71G12.1 | 1.23 | 1.1E‐10 | ENSG00000229961 | lincRNA | 0.653 | 0.445–0.960 | 0.030 |
| 28 | RP11‐768G7.2 | 1.26 | 3.51E‐31 | ENSG00000241213 | lincRNA | 1.694 | 1.150–2.495 | 0.008 |
| 29 | RP11‐89N17.4 | −1.55 | 4.54E‐41 | ENSG00000236494 | lincRNA | 0.573 | 0.389–0.844 | 0.006 |
| 30 | RP11‐726G1.1 | 1.04 | 2.23E‐19 | ENSG00000214776 | Processed_transcript | 1.497 | 1.019–2.200 | 0.040 |
| 31 | RP11‐69C17.1 | −1.97 | 1.75E‐37 | ENSG00000234962 | lincRNA | 0.674 | 0.459–0.989 | 0.044 |

CI, confidence interval; HR, hazard ratio; lincRNA, long intergenic non‐coding RNA.

Table 4

Multivariate analysis based on the integration of clinical variables and gene expression signatures in a Cox model

| Parameter | HR | 95% CI | P |
| --- | --- | --- | --- |
| Age (> 60) | 2.029 | 1.173–3.508 | 0.011* |
| Tumor grade (well vs. poorly) | 1.126 | 0.642–1.976 | 0.679 |
| Invasion of adjacent structure | 0.804 | 0.531–1.217 | 0.302 |
| Lymphatic metastasis | 1.589 | 0.836–3.023 | 0.158 |
| Adjuvant therapy | 0.408 | 0.192–0.868 | 0.020* |
| CASC2 | 0.841 | 0.468–1.510 | 0.561 |
| FLJ40288 | 0.667 | 0.365–1.219 | 0.188 |
| KB‐1183D5.11 | 0.707 | 0.386–1.292 | 0.259 |
| RP11‐357H14.20 | 2.235 | 1.237–4.038 | 0.008* |
| RP11‐438N16.1 | 0.804 | 0.468–1.384 | 0.432 |
| RP11‐129M6.1 | 0.971 | 0.545–1.730 | 0.920 |
| AC006296.1 | 0.389 | 0.117–1.293 | 0.123 |
| AC007880.1 | 2.347 | 0.749–7.358 | 0.143 |
| AC092168.4 | 1.313 | 0.654–2.636 | 0.444 |
| AC093850.2 | 1.011 | 0.545–1.877 | 0.972 |
| AF003626.1 | 0.742 | 0.386–1.425 | 0.370 |
| AP000344.3 | 1.573 | 0.769–3.218 | 0.215 |
| AP000473.6 | 0.792 | 0.457–1.374 | 0.407 |
| CTD‐2382E5.1 | 1.199 | 0.611–2.353 | 0.597 |
| FRMPD2P1 | 0.614 | 0.158–2.383 | 0.481 |
| LINC00028 | 0.784 | 0.429–1.436 | 0.431 |
| MAMDC2‐AS1 | 1.313 | 0.723–2.385 | 0.372 |
| RP11‐120J1.1 | 0.715 | 0.371–1.377 | 0.316 |
| RP11‐225N10.1 | 0.903 | 0.524–1.557 | 0.714 |
| RP11‐226F19.5 | 1.605 | 0.845–3.051 | 0.149 |
| RP11‐242F24.1 | 0.840 | 0.387–1.823 | 0.659 |
| RP1‐12803.4 | 1.055 | 0.540–2.061 | 0.875 |
| RP11‐411K7.1 | 1.021 | 0.550–1.895 | 0.947 |
| RP11‐51M18.1 | 0.827 | 0.479–1.428 | 0.496 |
| RP11‐521B24.3 | 1.167 | 0.595–2.289 | 0.653 |
| RP11‐526P5.2 | 0.733 | 0.390–1.375 | 0.333 |
| RP11‐71G12.1 | 0.819 | 0.467–1.436 | 0.485 |
| RP11‐768G7.2 | 2.215 | 1.258–3.903 | 0.006* |
| RP11‐89N17.4 | 0.598 | 0.300–1.193 | 0.144 |
| RP11‐726G1.1 | 1.698 | 0.990–2.914 | 0.055 |
| RP11‐69C17.1 | 1.646 | 0.766–3.535 | 0.201 |

*Indicates P < 0.05. CI, confidence interval; HR, hazard ratio.


Construction of a novel nomogram to predict survival in ESCC patients

To further assess the predictive ability of the novel Cox model, we built nomograms using the rms package (R statistical software, R Foundation). Figure 4a shows the prognostic nomogram integrating all of the significant independent factors for overall survival in the clinical cohort. The nomogram shows the contribution of each variable to the prediction of tumor‐related death at three or five years. The C‐index, reflecting the predictive ability of the nomogram, was 0.639 (95% CI 0.577–0.701). The calibration plot for the probability of survival at three or five years after surgery showed moderate agreement between the predictions made by the nomogram and actual observations (Fig 4b).
Figure 4

Nomograms of the two Cox models. (a,c) Two models are shown. An individual patient's value is located on each variable axis, and a line is drawn upward to determine the number of points received for each variable value. The sum of these numbers is located on the Total Points axis, and a line is drawn downward to the survival axes to determine the likelihood of three or five‐year survival. (b,d) The calibration curve for predicting patient survival at three or five years in the former and combined nomograms, respectively. Nomogram‐predicted probability of overall survival is plotted on the x‐axis; actual overall survival is plotted on the y‐axis. OS, overall survival; TNM, tumor node metastasis.

To develop a composite prognostic predictor, we combined the 2‐gene score with the clinical variables (age, adjuvant therapy, and TNM stage) in the overall series of ESCC patients (Fig 4c). The C‐index of this new nomogram was 0.699 (95% CI 0.640–0.758), which was significantly higher than that of the clinical nomogram (P < 0.05). The calibration plot for the probability of survival at three or five years showed greater agreement than that of the previous nomogram (Fig 4d). These results indicated that incorporating a 2‐gene score into the clinicopathologic variables improved the prognostic accuracy of survival in ESCC patients after surgery.
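The C‐index used to compare the two nomograms measures how often the model ranks a pair of patients in the correct order of survival. The Python sketch below gives a simplified version of Harrell's concordance index for right‐censored data. It is an illustration only: the authors used the rms package in R, pairs with tied survival times are simply skipped here, and the function name and toy data are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Simplified Harrell's C-index for right-censored survival data.

    times:       follow-up time for each patient
    events:      1 if death was observed, 0 if the patient was censored
    risk_scores: model output, where a higher score means higher risk

    A pair (i, j) is comparable when patient i died and did so strictly
    before patient j's follow-up time. The pair is concordant when the
    patient who died earlier has the higher risk score.
    ASSUMPTION: pairs with tied times are ignored for simplicity.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort: times in months, third patient censored.
toy_c = concordance_index([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.7, 0.4, 0.1])
```

A value of 0.5 indicates ranking no better than chance and 1.0 indicates perfect ranking, so the increase from 0.639 to 0.699 reflects a modest gain in discrimination.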

Discussion

Population‐based studies have shown that esophageal cancer is predominant in men aged ≥ 60 years, many of whom also have a history of heavy tobacco and alcohol use.27 China has a high incidence of ESCC, the most common histological subtype of esophageal cancer.5, 28 A mean male‐to‐female ratio of 3:1 in ESCC has been reported previously.4, 29, 30 Based on these epidemiological characteristics, we chose the GSE53625 dataset, whose patients are representative of ESCC. In this study, we investigated coding and non‐coding gene expression profiles in ESCC by re‐annotating the probe sets of the Agilent‐038314 CBC Homo sapiens lncRNA + mRNA microarray. Through differential expression, GO, and KEGG pathway enrichment analysis, we observed that epithelial cell differentiation, ECM‐receptor interaction, and cell cycle may play important roles in the development and progression of ESCC. A previous study showed that cell cycle regulators, such as CCND1, CCNE1, CDK6, or RB1, are frequently altered in ESCC via distinct mechanisms.31 Dysregulated pathways, which are of therapeutic interest in ESCC, include receptor tyrosine kinase signaling, chromatin remodeling, and embryonic pathways.32 Our observations are consistent with these results. LncRNA, a new star in the field of oncology, has also been widely investigated in ESCC. Li et al. found that linc‐POU3F3, which is highly expressed in ESCC samples, contributes to the development of ESCC by interacting with EZH2 to promote POU3F3 methylation.33 Zhang et al. also reported that lncRNA CCAT1, which shows significantly increased expression in ESCC, could serve as a scaffold for two distinct epigenetic modifications that facilitate cell growth and migration.34 These results indicate that lncRNA may affect epigenetic regulation. In this study, we identified 449 non‐coding genes that were closely implicated in the nucleosome assembly biological process, which may provide novel therapeutic targets for ESCC.
According to the Cox model of the clinical cohort, age and adjuvant therapy are two independent prognostic factors. Advanced age as a prognostic factor for surgery outcome remains controversial. Some studies have shown that the risk of mortality after esophagectomy is strongly related to patient age and performance status, with poorer long‐term survival among elderly patients.35 However, Alibakhshi et al. reported that esophagectomy outcomes in elderly patients were not significantly different from those in young patients.36 The effect of age may be related to comorbidities rather than age itself. The prognosis for ESCC patients with ≥ T2 or N+ after surgery alone is poor and the 10‐year survival rate in stage 1b after surgery is only 50%.37 Thus, adjuvant therapy is recommended, including neoadjuvant or perioperative chemotherapy, radiotherapy, or chemoradiotherapy for ≥ T2 esophageal cancer patients.38 Two Cox regression modeling approaches were used to predict outcomes after surgery, one based exclusively on clinical variables, and the other integrating prognostic gene variables with clinicopathologic characteristics. We identified a “2‐gene score” composed of two lncRNAs that were independent prognostic factors. The combination of the 2‐gene score with clinical and pathological features shows better specificity and sensitivity than the clinicopathologic parameters alone for outcome prediction. Nomograms have been developed to predict prognosis in some cancers, and have proven more accurate than conventional staging systems, such as TNM stage. However, few studies have integrated gene expression profiling and clinical variables to predict outcomes after surgery. The predictive accuracy of the nomogram combining the “2‐gene score” and clinical features was higher than that of the nomogram based exclusively on clinical variables. The calibration plot of the combined nomogram for the probability of survival at three or five years showed greater agreement than that of the clinical nomogram.
The 2‐gene score may more accurately reflect tumor biology than clinicopathologic parameters alone and may enhance the ability to predict outcomes in ESCC patients treated by surgery. Some limitations of this study should be taken into consideration. The heterogeneity of the tumor samples presents difficulties in detecting the expression of two lncRNAs (RP11‐357H14.20 and RP11‐768G7). The nomogram that was constructed using this one dataset should be validated in another cohort. In conclusion, the combined nomogram proposed in this study objectively and accurately predicted the prognosis of ESCC patients after surgery. Additional studies are required to determine whether it can be applied in a clinical setting.

Disclosure

No authors report any conflict of interest.

Appendix S1. Differentially expressed coding genes.
Appendix S2. Fold change of differentially expressed coding genes.
Appendix S3. Differentially expressed non‐coding genes.
Appendix S4. Fold change of differentially expressed non‐coding genes.
Appendix S5. Network of differentially expressed coding and non‐coding genes.
Appendix S6. Correlation of node in the network.
  38 in total

1.  The Global Burden of Cancer 2013.

Authors:  Christina Fitzmaurice; Daniel Dicker; Amanda Pain; Hannah Hamavid; Maziar Moradi-Lakeh; Michael F MacIntyre; Christine Allen; Gillian Hansen; Rachel Woodbrook; Charles Wolfe; Randah R Hamadeh; Ami Moore; Andrea Werdecker; Bradford D Gessner; Braden Te Ao; Brian McMahon; Chante Karimkhani; Chuanhua Yu; Graham S Cooke; David C Schwebel; David O Carpenter; David M Pereira; Denis Nash; Dhruv S Kazi; Diego De Leo; Dietrich Plass; Kingsley N Ukwaja; George D Thurston; Kim Yun Jin; Edgar P Simard; Edward Mills; Eun-Kee Park; Ferrán Catalá-López; Gabrielle deVeber; Carolyn Gotay; Gulfaraz Khan; H Dean Hosgood; Itamar S Santos; Janet L Leasher; Jasvinder Singh; James Leigh; Jost B Jonas; Jost Jonas; Juan Sanabria; Justin Beardsley; Kathryn H Jacobsen; Ken Takahashi; Richard C Franklin; Luca Ronfani; Marcella Montico; Luigi Naldi; Marcello Tonelli; Johanna Geleijnse; Max Petzold; Mark G Shrime; Mustafa Younis; Naohiro Yonemoto; Nicholas Breitborde; Paul Yip; Farshad Pourmalek; Paulo A Lotufo; Alireza Esteghamati; Graeme J Hankey; Raghib Ali; Raimundas Lunevicius; Reza Malekzadeh; Robert Dellavalle; Robert Weintraub; Robyn Lucas; Roderick Hay; David Rojas-Rueda; Ronny Westerman; Sadaf G Sepanlou; Sandra Nolte; Scott Patten; Scott Weichenthal; Semaw Ferede Abera; Seyed-Mohammad Fereshtehnejad; Ivy Shiue; Tim Driscoll; Tommi Vasankari; Ubai Alsharif; Vafa Rahimi-Movaghar; Vasiliy V Vlassov; W S Marcenes; Wubegzier Mekonnen; Yohannes Adama Melaku; Yuichiro Yano; Al Artaman; Ismael Campos; Jennifer MacLachlan; Ulrich Mueller; Daniel Kim; Matias Trillini; Babak Eshrati; Hywel C Williams; Kenji Shibuya; Rakhi Dandona; Kinnari Murthy; Benjamin Cowie; Azmeraw T Amare; Carl Abelardo Antonio; Carlos Castañeda-Orjuela; Coen H van Gool; Francesco Violante; In-Hwan Oh; Kedede Deribe; Kjetil Soreide; Luke Knibbs; Maia Kereselidze; Mark Green; Rosario Cardenas; Nobhojit Roy; Taavi Tillmann; Taavi Tillman; Yongmei Li; Hans Krueger; Lorenzo Monasta; Subhojit Dey; Sara Sheikhbahaei; 
Nima Hafezi-Nejad; G Anil Kumar; Chandrashekhar T Sreeramareddy; Lalit Dandona; Haidong Wang; Stein Emil Vollset; Ali Mokdad; Joshua A Salomon; Rafael Lozano; Theo Vos; Mohammad Forouzanfar; Alan Lopez; Christopher Murray; Mohsen Naghavi
Journal:  JAMA Oncol       Date:  2015-07       Impact factor: 31.777

2.  Increased levels of the long intergenic non-protein coding RNA POU3F3 promote DNA methylation in esophageal squamous cell carcinoma cells.

Authors:  Wei Li; Jian Zheng; Jieqiong Deng; Yonghe You; Hongchun Wu; Na Li; Jiachun Lu; Yifeng Zhou
Journal:  Gastroenterology       Date:  2014-03-12       Impact factor: 22.682

3.  Oesophageal cancer: ESMO Clinical Practice Guidelines for diagnosis, treatment and follow-up.

Authors:  F Lordick; C Mariette; K Haustermans; R Obermannová; D Arnold
Journal:  Ann Oncol       Date:  2016-09       Impact factor: 32.976

4.  Cancer of the esophagus and esophagogastric junction: data-driven staging for the seventh edition of the American Joint Committee on Cancer/International Union Against Cancer Cancer Staging Manuals.

Authors:  Thomas W Rice; Valerie W Rusch; Hemant Ishwaran; Eugene H Blackstone
Journal:  Cancer       Date:  2010-08-15       Impact factor: 6.860

5.  Development and Internal Validation of a Novel Model to Identify the Candidates for Extended Pelvic Lymph Node Dissection in Prostate Cancer.

Authors:  Giorgio Gandaglia; Nicola Fossati; Emanuele Zaffuto; Marco Bandini; Paolo Dell'Oglio; Carlo Andrea Bravi; Giuseppe Fallara; Francesco Pellegrino; Luigi Nocera; Pierre I Karakiewicz; Zhe Tian; Massimo Freschi; Rodolfo Montironi; Francesco Montorsi; Alberto Briganti
Journal:  Eur Urol       Date:  2017-04-12       Impact factor: 20.096

6.  Global incidence of oesophageal cancer by histological subtype in 2012.

Authors:  Melina Arnold; Isabelle Soerjomataram; Jacques Ferlay; David Forman
Journal:  Gut       Date:  2014-10-15       Impact factor: 23.059

7.  Hedgehog and epithelial-mesenchymal transition signaling in normal and malignant epithelial cells of the esophagus.

Authors:  Noriyuki Isohata; Kazuhiko Aoyagi; Tomoko Mabuchi; Hiroyuki Daiko; Masahide Fukaya; Hiroyuki Ohta; Kenji Ogawa; Teruhiko Yoshida; Hiroki Sasaki
Journal:  Int J Cancer       Date:  2009-09-01       Impact factor: 7.396

8.  Oesophageal cancer survival in Europe: a EUROCARE-4 study.

Authors:  A T Gavin; S Francisci; R Foschi; D W Donnelly; V Lemmens; H Brenner; L A Anderson
Journal:  Cancer Epidemiol       Date:  2012-08-19       Impact factor: 2.984

9.  Long non-coding RNAs function annotation: a global prediction method based on bi-colored networks.

Authors:  Xingli Guo; Lin Gao; Qi Liao; Hui Xiao; Xiaoke Ma; Xiaofei Yang; Haitao Luo; Guoguang Zhao; Dechao Bu; Fei Jiao; Qixiang Shao; RunSheng Chen; Yi Zhao
Journal:  Nucleic Acids Res       Date:  2012-11-05       Impact factor: 16.971

10.  Cancer survival in China, 2003-2005: a population-based study.

Authors:  Hongmei Zeng; Rongshou Zheng; Yuming Guo; Siwei Zhang; Xiaonong Zou; Ning Wang; Limei Zhang; Jingao Tang; Jianguo Chen; Kuangrong Wei; Suqin Huang; Jian Wang; Liang Yu; Deli Zhao; Guohui Song; Jianshun Chen; Yongzhou Shen; Xiaoping Yang; Xiaoping Gu; Feng Jin; Qilong Li; Yanhua Li; Hengming Ge; Fengdong Zhu; Jianmei Dong; Guoping Guo; Ming Wu; Lingbin Du; Xibin Sun; Yutong He; Michel P Coleman; Peter Baade; Wanqing Chen; Xue Qin Yu
Journal:  Int J Cancer       Date:  2014-10-03       Impact factor: 7.396
