Yi-Qi Jin, Dong-Liu Miao
Department of Intervention and Vascular Surgery, Affiliated Suzhou Hospital of Nanjing Medical University, Suzhou, Jiangsu, China.
Abstract
BACKGROUND: Epigenetic alterations have been shown to contribute to human carcinogenesis. The aim of this study was to perform an integrative analysis to develop an epigenetic signature that predicts overall survival (OS) in esophageal cancer.
METHODS: DNA methylation and messenger RNA expression data of esophageal cancer samples were downloaded from The Cancer Genome Atlas database and were integrated and analyzed using the R package MethylMix. Functional enrichment analysis of the methylation-related differentially expressed genes (DEGs) was performed. An epigenetic signature and a nomogram associated with OS in esophageal cancer were established with the multivariate Cox model.
RESULTS: A total of 71 methylation-related DEGs were identified. Kyoto Encyclopedia of Genes and Genomes pathway analysis revealed that these genes were involved in biological processes related to the initiation and progression of esophageal cancer. A two-gene (FAM24B and FAM200A) risk signature for OS with high accuracy was developed by multivariate Cox analysis. The signature was independent of clinicopathological variables and showed better predictive power than those variables. Moreover, we developed a novel prognostic nomogram based on the risk score and 3 clinicopathological factors.
CONCLUSIONS: Our study identified possible methylation-related DEGs and established an epigenetic signature, which may provide novel insights into the pathogenesis of esophageal cancer.
Esophageal cancer (EC) is a highly lethal disease and the eighth most common cancer worldwide.[1] Esophageal squamous cell carcinoma and adenocarcinoma are the 2 major histological types of EC. The World Health Organization reported an estimated 17 290 new cases of EC and 15 850 EC deaths in 2018 from global cancer statistics.[2] Despite significant advances in treatment, the prognosis of patients with EC remains unfavorable, even after tumor resection.[3,4] Therefore, a deeper understanding of the initiation and progression of EC, as well as the discovery of new molecular markers associated with it, could lead to the identification of new therapeutic strategies for this deadly disease.
Alterations in epigenetic modifications are closely related to tumor development.[5] Aberrant DNA methylation is the most widely studied epigenetic mechanism. Unlike genetic mutations, DNA methylation is reversible and is therefore considered a promising target for cancer research.[6] Transcriptional silencing caused by abnormal hypermethylation of tumor suppressor gene promoter regions is considered an important mechanism of tumorigenesis.[7] Generally, abnormal DNA methylation in cancer can be divided into 2 categories: focal areas of hypermethylation and widespread areas of hypomethylation. These epigenomic aberrations contribute to the pathogenesis of EC through different mechanisms.[8] Focal hypermethylation occurs preferentially at promoter cytosine-phosphate-guanine (CpG) islands and leads to gene inactivation in the absence of changes to the genetic sequence.[5] Compared with focal hypermethylation, much less is known about global hypomethylation, because most methylation studies are performed using candidate gene methods.
Because DNA methylation regulates gene expression, several methylation-regulated differentially expressed genes (DEGs) have been identified in EC.[9,10] However, there are still no quantitative signatures for the prognosis of EC. The aim of this study was to perform an integrative analysis of DNA methylation and messenger RNA (mRNA) expression profiles to identify epigenetic changes that may play a key role in the initiation and progression of EC and to develop an epigenetic signature to predict overall survival (OS) in EC.
Materials and Methods
Data Source
The DNA methylation data, RNA-seq data, and clinicopathological data of EC were downloaded from The Cancer Genome Atlas (TCGA) database (https://portal.gdc.cancer.gov/). DNA methylation data were generated with the Illumina Infinium HumanMethylation450 platform,[11] and mRNA expression was evaluated based on the level 3 RNA-seq data. A total of 198 samples (183 tumor samples and 15 normal samples) of DNA methylation, 169 samples (159 tumor samples and 10 normal samples) of mRNA expression, and clinicopathological data of 183 patients were obtained. Patients with uncertain tumor location (n = 1), tumor stage (n = 22), or tumor grade (n = 34) were excluded. After matching the clinical data with the methylation and mRNA expression data, data from 126 patients with EC were retained for further study. The clinicopathologic characteristics of the study cohort are shown in Table 1. The flowchart of the overall research strategy is shown in Figure 1.
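The matching step described above can be sketched as a set intersection over patient identifiers. This is a minimal illustration, not the authors' procedure: it assumes samples carry TCGA-style barcodes whose first three dash-separated fields identify the patient, and the barcodes and function names below are hypothetical.

```python
def patient_id(barcode: str) -> str:
    """Return the patient-level prefix (first three dash-separated fields) of a barcode."""
    return "-".join(barcode.split("-")[:3])


def matched_patients(methylation_samples, expression_samples, clinical_patients):
    """Keep only patients present in all three data sources."""
    meth = {patient_id(b) for b in methylation_samples}
    expr = {patient_id(b) for b in expression_samples}
    return sorted(meth & expr & set(clinical_patients))


# Hypothetical barcodes, for illustration only.
methylation = ["TCGA-AA-0001-01A-11D", "TCGA-AA-0002-01A-11D", "TCGA-AA-0003-11A-11D"]
expression = ["TCGA-AA-0001-01A-11R", "TCGA-AA-0003-01A-11R"]
clinical = ["TCGA-AA-0001", "TCGA-AA-0002"]

print(matched_patients(methylation, expression, clinical))
```

Only patients with methylation, expression, and clinical records survive the intersection, which is why the final cohort is smaller than any single data set.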
Table 1. Clinicopathologic Characteristics of Patients With Esophageal Cancer.
Variables                                Case, n (%)
Total                                    126
Age (M ± SD, years)                      60.456 ± 10.96
Gender
  Female                                 18 (14.3)
  Male                                   108 (85.7)
History of Barrett esophagus
  No                                     74 (58.7)
  Yes                                    19 (15.1)
  Unknown                                33 (26.2)
History of reflux disease
  No                                     74 (58.7)
  Yes                                    23 (18.3)
  Unknown                                29 (23.0)
Histological type
  Esophagus adenocarcinoma               43 (34.1)
  Esophagus squamous cell carcinoma      83 (65.9)
Esophageal tumor central location
  Proximal                               6 (4.8)
  Mid                                    42 (33.3)
  Distal                                 78 (61.9)
Tumor status
  Tumor free                             82 (65.1)
  With tumor                             44 (34.9)
Grade
  I                                      17 (13.5)
  II                                     70 (55.5)
  III                                    39 (31.0)
Stage
  I                                      9 (7.1)
  II                                     67 (53.2)
  III                                    44 (34.9)
  IV                                     6 (4.8)
Figure 1.
Flow chart indicating study design.
Differential Expression Analysis and Correlation Analysis
The “limma” package in R was used to identify DEGs and aberrantly methylated genes (differentially methylated genes [DMGs]). The thresholds were fold change (FC) > 1.5 and adjusted P value < .05 for DEGs, and FC > 1 and adjusted P value < .05 for DMGs. MethylMix is a program designed to identify methylation events related to gene expression.[12] To explore the association between DNA methylation and gene expression, the Pearson coefficient between methylation level (β value) and gene expression level was calculated with the MethylMix package in R to find significantly negatively correlated genes. A Pearson coefficient < −0.3 with P < .05 was set as the criterion for identifying methylation-related DEGs.
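The correlation criterion above can be illustrated with a short sketch. It computes the Pearson coefficient between β values and expression values for each gene and keeps genes below the −0.3 cutoff. This is not the MethylMix implementation; the gene names and values are hypothetical, and the accompanying P < .05 test is omitted here because it requires a t distribution that the standard library does not provide.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def negatively_correlated(genes, r_cutoff=-0.3):
    """Return {gene: r} for genes whose methylation/expression correlation is below the cutoff.

    `genes` maps a gene name to a (beta_values, expression_values) pair.
    """
    selected = {}
    for name, (beta, expression) in genes.items():
        r = pearson_r(beta, expression)
        if r < r_cutoff:
            selected[name] = r
    return selected


# Hypothetical per-sample values, for illustration only.
genes = {
    "GENE_A": ([0.1, 0.3, 0.5, 0.7, 0.9], [9.0, 7.5, 6.0, 4.0, 2.5]),
    "GENE_B": ([0.2, 0.4, 0.6, 0.8, 0.5], [3.0, 5.0, 4.0, 6.0, 4.5]),
}
selected = negatively_correlated(genes)
print(sorted(selected))
```

In this toy input, expression of GENE_A falls as methylation rises, so it passes the filter, whereas GENE_B is positively correlated and is dropped.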
Gene Ontology and Pathway Analysis
To better understand the function of the methylation-related DEGs, we performed Gene Ontology (GO) analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis using the “clusterProfiler” package in R.[13] A P value of <.05 was considered significant.
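Over-representation analysis of this kind is commonly based on a hypergeometric test: given a gene list of size n drawn from a background of N genes, of which K are annotated to a term, it asks how likely it is to see at least k annotated genes in the list by chance. The sketch below shows that calculation only; the counts in the usage example are hypothetical and do not come from the study.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """Upper-tail hypergeometric probability P(X >= k).

    k: genes in the list annotated to the term
    n: size of the gene list
    K: genes in the background annotated to the term
    N: size of the background
    """
    upper = min(n, K)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)


# Hypothetical counts: 5 of 71 genes fall in a 150-gene pathway, background of 20 000 genes.
p_value = hypergeom_enrichment_p(k=5, n=71, K=150, N=20000)
print(p_value < 0.05)
```

A small P value means the term contains more genes from the list than random sampling would be expected to produce.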
Development of Risk Assessment Signature Based on Gene Methylation Levels
Univariate and multivariate Cox regression analyses were performed to build a prognostic model based on the methylation β values of the genes. Models were selected according to the Akaike information criterion using the “survival” R package. We calculated the prognostic risk score as follows: $\text{Risk score} = \sum_{i=1}^{n} \text{coef}_i \times \beta_i$, where $\text{coef}_i$ is the multivariate Cox regression coefficient of gene $i$ and $\beta_i$ is its methylation β value. Based on the median risk score, all patients were classified into a high-risk group and a low-risk group. Overall survival curves of the 2 groups were generated using Kaplan-Meier analysis. The predictive value of the prognostic model was evaluated by time-dependent receiver operating characteristic (ROC) curve analysis using the “survivalROC” R package.[14]
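The median split and Kaplan-Meier steps above can be sketched as follows. This is a plain product-limit estimator written for illustration, not the R code used in the study. Whether a score exactly equal to the median counts as high or low risk is an assumption here (it is assigned to the low-risk group), and the follow-up times are hypothetical.

```python
from statistics import median


def split_by_median(scores):
    """Label each score 'high' if above the median, otherwise 'low' (tie handling is an assumption)."""
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]


def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times: follow-up time per patient
    events: 1 if death was observed, 0 if the patient was censored
    Returns [(time, survival probability)] at each time with at least one death.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up data (months), for illustration only.
curve = kaplan_meier(times=[5, 8, 12, 12, 20], events=[1, 0, 1, 1, 0])
groups = split_by_median([0.4, -1.2, 0.9, -0.3, 0.1])
print(curve)
print(groups)
```

In practice the estimator would be run once per risk group and the two curves compared, for example with a log-rank test.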
Construction of Prognostic Nomogram
To provide clinicians with a quantitative method to predict a patient’s prognosis, we established a genomic-clinical nomogram to predict the OS of each patient with EC individually. The discriminative ability of the nomogram was evaluated by the area under the curve (AUC) of the ROC curve.[15] Calibration curves were plotted to compare the nomogram-predicted survival with the actual survival.
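As a reminder of what the AUC measures, the sketch below computes it as the probability that a randomly chosen patient with the event has a higher predicted risk than a randomly chosen patient without it. This is the ordinary (non-time-dependent) AUC and ignores censoring, so it is a simplification of the time-dependent ROC analysis used in the study; the scores and labels are hypothetical.

```python
def auc(scores, labels):
    """Rank-based AUC: fraction of (event, non-event) pairs ordered correctly, ties counted as half."""
    positives = [s for s, label in zip(scores, labels) if label == 1]
    negatives = [s for s, label in zip(scores, labels) if label == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted risks and outcomes (1 = died within the horizon), for illustration only.
example_auc = auc(scores=[0.9, 0.8, 0.4, 0.3], labels=[1, 0, 1, 0])
print(example_auc)
```

An AUC of 0.5 corresponds to random ordering and 1.0 to perfect discrimination.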
Results
Identification of Methylation-Related DEGs
According to the criteria (FC > 1.5 and adjusted P value < .05), a total of 6261 DEGs (5119 upregulated genes and 1142 downregulated genes) were identified between EC and normal samples. We explored the association between the DNA methylation and gene expression to identify methylation-related DEGs (Pearson coefficient < −0.3 and P < .05). Pearson coefficient between methylation level (β value) and gene expression for each of the 6261 DEGs was calculated. Finally, 71 methylation-related DEGs were obtained (Figure 2). The distribution map of the top differential methylation is demonstrated in Figure 3, and the association between gene expression and DNA methylation of the top methylation-related DEGs is shown in Figure 4.
Figure 2.
Heat map of methylation-regulated genes in esophageal cancer. A, Methylation values (β values). The x-axis represents samples and y-axis represents differentially methylated genes between EC and normal samples. The color change from blue to red in the heat map illustrates the trend from low to high methylation. B, The expression of the methylation-regulated genes. The x-axis represents samples and y-axis represents differentially expressed genes between EC and normal samples. The color change from blue to red in the heat map illustrates the trend from low to high expression. EC indicates esophageal cancer.
Figure 3.
Summary of top hypermethylated and hypomethylated genes. The methylation degree when comparing patients with cancer to normal individuals in esophageal cancer. The horizontal black bar represents the distribution of methylation values in the normal samples. The histogram demonstrates the distribution of methylation in tumor samples (denoted as β values, where higher β values represent greater methylation).
Figure 4.
The correlation between gene expression and DNA methylation of top hypermethylated and hypomethylated genes. Average β values are presented on the x-axis, log2 FPKM gene expression values are presented on y-axis.
Functional Enrichment Analysis
The 71 methylation-related DEGs in patients with EC were annotated according to the GO database. These genes were enriched in 260 GO terms (P < .05). The top 10 enriched terms are shown in Figure 5A. According to the KEGG pathway analysis, 7 pathways were associated with these genes (Figure 5B, P < .05). The enriched pathways included carbon metabolism, the Janus kinase–signal transducer and activator of transcription (JAK-STAT) signaling pathway, herpes simplex virus 1 infection, the PPAR signaling pathway, proteoglycans in cancer, and peroxisome.
Figure 5.
Enrichment of top GO terms (A) and KEGG pathways (B) of methylation-regulated genes. GO indicates Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes.
Construction of Epigenetic Signature Based on Gene Methylation Levels
To construct a multigene signature for predicting OS in EC, univariate and multivariate Cox analyses were performed with the 71 genes. Two genes were significant prognostic factors for EC. An epigenetic signature was established based on the methylation β values of the 2 genes as follows: risk score = (−3.281 × methylation β value of FAM24B) + (2.207 × methylation β value of FAM200A). All patients were assigned to a low-risk group or a high-risk group based on the median risk score. The risk score distribution, survival status, and expression profile of the 2 prognostic genes are shown in Figure 6A. The Kaplan-Meier curves showed a significant difference in OS between the 2 groups in the training cohort (Figure 6B). The AUC of the signature was 0.738, indicating its ability to predict the survival risk of patients with EC (Figure 6C).
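The signature above is a weighted sum, so applying it to a patient is a one-line calculation. The sketch below uses the two coefficients reported in the text; the β values for the two example patients are hypothetical and chosen only to show the direction of each effect.

```python
# Coefficients of the two-gene signature as reported in the text.
COEFFICIENTS = {"FAM24B": -3.281, "FAM200A": 2.207}


def risk_score(beta_values):
    """Weighted sum of methylation beta values using the signature coefficients."""
    return sum(COEFFICIENTS[gene] * beta_values[gene] for gene in COEFFICIENTS)


# Hypothetical beta values, for illustration only.
patient_a = {"FAM24B": 0.80, "FAM200A": 0.20}
patient_b = {"FAM24B": 0.30, "FAM200A": 0.70}

score_a = risk_score(patient_a)
score_b = risk_score(patient_b)
print(round(score_a, 4), round(score_b, 4))
```

Because the FAM24B coefficient is negative and the FAM200A coefficient is positive, higher FAM24B methylation lowers the score and higher FAM200A methylation raises it, so the second hypothetical patient scores higher than the first.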
Figure 6.
Construction of the epigenetic signature in esophageal cancer. A, Risk score distribution, survival status of each patient, and expression heat map of the 2 genes. B, Kaplan-Meier estimates of the overall survival. C, Time-dependent ROC curve of the prognostic signature. ROC indicates receiver operating characteristic.
Univariate Cox analysis showed that tumor status, tumor stage, and risk score were significantly correlated with OS (P < .05). In the multivariate analysis, 4 variables (histological type, tumor status, tumor stage, and risk score) were confirmed as independent predictors of OS (Table 2). These variables were used to build a nomogram for 3- and 5-year OS (Figure 7). Adding up the points assigned to each variable and projecting the total onto the bottom scales gives the estimated 3- and 5-year OS probabilities. Time-dependent ROC analysis at 3 and 5 years showed that the nomogram predicted OS with higher sensitivity and specificity than the American Joint Committee on Cancer (AJCC) staging system: the 3- and 5-year AUC values were 0.835 and 0.878 for the nomogram, compared with 0.755 and 0.705 for AJCC stage (Figure 8A). The calibration curves for 3- and 5-year OS showed no obvious deviation from the reference line, indicating good agreement between predicted and observed survival (Figure 8B).
Table 2.
Univariate and Multivariate Survival Analyses for Screening Independent Prognostic Factors.
Figure 7.
Nomogram for predicting 3- and 5-year overall survival in esophageal cancer.
Figure 8.
A, Area under the curve values of ROC curves for 3-year and 5-year overall survival predicted by the nomogram and by AJCC stage. B, Calibration curves for the probability of overall survival at 3 and 5 years. ROC indicates receiver operating characteristic.
Table 2 abbreviations: EA, esophagus adenocarcinoma; ESCC, esophageal squamous-cell carcinoma; HR, hazard ratio; SD, standard deviation.
Discussion
In this study, we performed a comprehensive analysis to develop an epigenetic signature for stratifying survival risk in EC. First, we identified DMGs and DEGs between EC samples and normal samples. By integrating DNA methylation and mRNA expression data, we identified 71 methylation-related DEGs. Functional enrichment analysis showed that these genes were involved in biological processes related to the initiation and progression of EC. After univariate and multivariate Cox proportional hazards analyses, 2 of these genes with independent prognostic value were selected to construct an epigenetic signature. The proposed risk score was independent of clinicopathological variables and showed favorable prognostic ability. The nomogram based on the risk score and 3 clinicopathological factors then provides a visual method for predicting OS in patients with EC. These results indicate that the signature may serve as an independent prognostic factor.
Many oncogenes and tumor suppressor genes show irregular expression due to abnormal CpG island methylation in the regulatory region of DNA rather than sequence changes.[16] Because DNA is stable and readily amplified, DNA methylation assays can be translated from the laboratory to routine clinical practice. Furthermore, the methylation profile of gene promoters differs among cancer types, suggesting that detection of abnormal gene promoter methylation can be used as a potential molecular biomarker for cancers. In addition, epigenetic changes are expected to be therapeutic targets because they are reversible.[17] Therefore, detecting DNA methylation can provide new insights for assessing cancer risk and treatment. Accumulating evidence suggests that DNA methylation is involved in the initiation and progression of EC and is associated with clinical outcomes.
Promoter hypermethylation silences tumor suppressor genes, including coding genes (PTPN6,[18] RHCG,[19] and ZNF471[20]) and noncoding genes (lncRNA ZNF667-AS1 and ZNF667,[21] microRNA-128,[22] and microRNA-10b-3p[23]). Liu et al[18] showed that PTPN6 expression was significantly downregulated in EC cell lines and tissues. The expression of PTPN6 in EC cells treated with a DNA methyltransferase inhibitor was significantly upregulated, and frequent methylation of CpG sites in the P2 promoter was detected in esophageal squamous-cell carcinoma (ESCC) tissues and cell lines. The aberrant methylation of P2 showed significant tumor specificity and was associated with the expression level of PTPN6. Hypermethylation and low expression of PTPN6 were related to advanced clinicopathological features and poor prognosis in patients with ESCC. Overexpression of PTPN6 inhibited the proliferation and invasion of EC cells. Similarly, Dong et al[21] found that methylation frequencies of CpG sites within the proximal promoter of lncRNA ZNF667-AS1 and ZNF667 were significantly higher in ESCC tissues. The methylation status was linked to tumor stage, tumor grade, and prognosis. Multivariate Cox analysis revealed that hypermethylation was an independent indicator of poor prognosis. They further confirmed that promoter hypermethylation inhibited the expression of ZNF667-AS1 and ZNF667, both of which were significantly downregulated in ESCC tissue. Silencing of ZNF667-AS1 and ZNF667 promoted the viability, migration, and invasion of EC cells in vitro. Taken together, ZNF667-AS1 and ZNF667 may act as tumor suppressors in ESCC. CDKN2B[24] and TFF148[25] are hypermethylated during early stages of EC and might therefore serve as biomarkers for early diagnosis of EC.
Plasma samples of patients suspected of having ESCC might be collected and analyzed for these hypermethylation events.[26,27] DNA methylation can alter the ability of transcription factors to bind DNA and regulate gene expression.[21,28,29] Dong et al[21] revealed that hypermethylation within the proximal promoter of ZNF667-AS1 or ZNF667 impaired the binding of transcription factor E2F1 and the subsequent transcriptional activation and expression of these genes. Compared with DNA methyltransferase inhibitor treatment alone or E2F1 overexpression alone, the combination of DNA methyltransferase inhibitor and E2F1 overexpression significantly upregulated the expression levels of ZNF667-AS1 and ZNF667.
In this study, we identified a total of 71 methylation-regulated DEGs. Kyoto Encyclopedia of Genes and Genomes pathway analysis suggested that these genes were mainly enriched in the JAK-STAT signaling pathway, carbon metabolism, herpes simplex virus 1 infection, the PPAR signaling pathway, proteoglycans in cancer, and peroxisome. The JAK/STAT signaling pathway is a central signaling hub that can be activated by a plethora of cytokines, growth factors, and hormones[30] and is associated with cell proliferation, differentiation, and apoptosis.[31] It consists of 7 mammalian STAT family members that act as transcription factors and are activated by 4 different JAKs.[32] Dysregulation of the JAK/STAT signaling pathway has been implicated in the initiation and progression of various cancers, including EC.[33-35] Using unbiased RNA sequencing, Yu et al found that JAK-STAT pathway activity was significantly reduced in RNF168-depleted EC cells. Silencing of RNF168 reduced the expression of JAK-STAT target genes, such as IRF1, IRF9, and IFITM1. Immunoprecipitation showed that RNF168 associates with STAT1 in the nucleus, stabilizing the STAT1 protein and inhibiting its ubiquitination and degradation. Liu et al revealed that nimesulide inhibited the growth of EC cells by inactivating the JAK2/STAT3 pathway.
FAM24B and FAM200A, the 2 genes in our signature, have not previously been reported to relate to EC biology; thus, their functions and mechanisms in EC need further investigation.
Several limitations of this study need to be pointed out. First, the prognostic signature and nomogram were established based on data from the TCGA database, and the number of patients was limited, so a large validation cohort is needed to confirm the predictive accuracy of our model. Second, although the two-gene signature showed favorable predictive ability in EC, the mechanism behind it is not yet clear. Further functional experiments in vivo and in vitro are needed to investigate how aberrant methylation of the 2 genes affects downstream pathways.
Conclusions
The present study developed a novel epigenetic signature through an integrative analysis of multiomic data, including DNA methylation, transcriptome, and clinical outcome data of patients with EC from the TCGA database. Moreover, a nomogram combining the epigenetic signature and clinicopathological factors was constructed to visually predict the survival of patients with EC.