
A 10‑microRNA prognosis scoring system in esophageal squamous cell carcinoma constructed using bioinformatic methods.

Qingchao Sun1, Liang Zong1, Haiping Zhang1, Yanchao Deng1, Changming Zhang1, Liwei Zhang1.   

Abstract

MicroRNA (miR) signatures may aid the diagnosis and prediction of cancer; therefore, miRs associated with the prognosis of esophageal squamous cell carcinoma (ESCC) were screened. miR‑sequencing (seq) and mRNA‑seq data from early‑stage ESCC samples were downloaded from The Cancer Genome Atlas (TCGA) database, and samples from subjects with a >6‑month survival time were assessed with Cox regression analysis for prognosis‑associated miRs. A further two miR expression datasets of ESCC samples, GSE43732 and GSE13937, were downloaded from the Gene Expression Omnibus database. miRs common to the prognosis‑associated miRs and the GSE43732 and GSE13937 datasets were used to calculate a risk score for each sample, and the median risk score was used to stratify samples into low‑ and high‑risk groups. A prognostic scoring system of signature miRs was subsequently constructed and used for survival analysis of low‑ and high‑risk samples. Differentially‑expressed genes (DEGs) corresponding to all miRs were screened and functional annotation was performed. A total of 34 prognostic miRs were screened and a scoring system was created using 10 signature miRs (hsa‑miR‑140, ‑33b, ‑34b, ‑144, ‑486, ‑214, ‑129‑2, ‑374a and ‑412). Using this system, low‑risk samples were associated with longer survival than high‑risk samples in the TCGA and GSE43732 datasets. Age, alcohol and tobacco use, and radiotherapy were prognostic factors for samples with different risk scores and the same clinical features. There were 168 DEGs; the top 20 DEGs positively correlated with risk scores and the top 20 DEGs negatively correlated with risk scores were significantly enriched for six and 10 functional terms, respectively. 'Tight junction' and 'melanogenesis' were two significantly enriched pathways of DEGs. miR‑214, miR‑129‑2, miR‑374a and miR‑486 may predict ESCC patient survival, although further studies to validate this hypothesis are required.

Year:  2018        PMID: 29393486      PMCID: PMC5865988          DOI: 10.3892/mmr.2018.8550

Source DB:  PubMed          Journal:  Mol Med Rep        ISSN: 1791-2997            Impact factor:   2.952


Introduction

With respect to prognosis and mortality, esophageal squamous cell carcinoma (ESCC) is the 8th most common type of cancer and the 6th most common cancer-associated cause of premature mortality (1). Globally, ~450,000 people are affected by ESCC and this incidence is growing (2), with ~500,000 new cases diagnosed each year (3). The 5-year survival for patients with ESCC remains low (15–20%) (4). The cure rate of early stage ESCC is as high as 50% following surgical resection (5), although a number of patients with ESCC are not candidates for surgery due to comorbid conditions, including advanced age. In these cases, the 30-day mortality is 2–10% (6). Numerous studies have revealed that smoking and pre-diagnosis alcohol consumption are risk factors for ESCC, and the surgical technique, biological behavior, postoperative treatment and response to chemoradiotherapies contribute to improving prognosis (7,8). There are additional genetic alterations that contribute to the prognosis of ESCC, including somatic mutations, copy number variations and gene expression alterations (9). MicroRNAs (miRs) are useful diagnostic and prognostic indicators for human cancer (10), and miR-377 suppresses the initiation and progression of ESCC by inhibiting cluster of differentiation 133 and vascular endothelial growth factor (11). miR-1290 and miR-613 are prognostic factors for patients with ESCC (12,13), and high expression of miR-103/107 is associated with poor survival in patients with ESCC (14). Nevertheless, miRs may cooperate to drive the progression and prognosis of esophageal carcinoma. miR signatures may aid in the diagnosis and prognosis of cancer (15). Feber et al (16) assessed the association of miR expression with patient survival and lymph node metastasis by evaluating miR expression in 45 primary tumors. This previous study identified that miR profiles have prognostic value for staging patients with ESCC. 
The present study screened signature miRs for predicting the prognosis of ESCC using miR-sequencing (seq) and mRNA-seq datasets from The Cancer Genome Atlas (TCGA; gdc-portal.nci.nih.gov) and the Gene Expression Omnibus (GEO; www.ncbi.nlm.nih.gov) database. Subsequently, a prognostic scoring system was created to identify predictive miRs using sample risk scores. All cancer samples were divided into high- and low-risk categories and validated using the scoring system, and the differentially-expressed genes (DEGs) associated with miRs were functionally annotated.

Materials and methods

Microarray data

miR-seq and mRNA-seq data from early stage ESCC samples were downloaded from TCGA on March 18, 2017 and 89 samples with miR and mRNA expression data were obtained by matching barcodes. These were early stage (stage I and II) cancer samples. This dataset was used as a test dataset. A further two miR expression datasets of ESCC samples, GSE43732 and GSE13937, were downloaded from the GEO database. The GSE43732 dataset was based on the platform of GPL16543, and contained 53 early stage cancer samples. The GSE13937 dataset was based on the platform of GPL8835, and contained 31 early stage cancer samples. These two datasets were used as validation datasets. Clinical feature data for all downloaded datasets were also collected (Table I).
Table I.

Clinical features of cancer samples downloaded from TCGA and the Gene Expression Omnibus.

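| Clinical feature | TCGA (n=89) | GSE43732 (n=53) | GSE13937 (n=31) |
|---|---|---|---|
| Age, mean ± standard deviation | 63.02±12.44 | 59.21±9.26 | |
| Gender | | | |
|   Male | 62 | 43 | |
|   Female | 17 | 10 | |
| Pathologic_M | | | |
|   M0 | 79 | | |
|   M1 | 0 | | |
| Pathologic_N | | | |
|   N0 | 64 | 43 | |
|   N1 | 24 | 10 | |
| Pathologic_T | | | |
|   T0 | 1 | | |
|   T1 | 24 | 7 | |
|   T2 | 33 | 15 | |
|   T3 | 31 | 31 | |
| Alcohol | | | |
|   Yes | 66 | 32 | 23 |
|   No | 22 | 21 | 7 |
| Smoking | | | |
|   Yes | 17 | 34 | 23 |
|   No | 31 | 19 | 7 |
|   Reformed | 35 | | |
| New tumor | | | |
|   Yes | 27 | | |
|   No | 60 | | |
| Radiation therapy | | | |
|   Yes | 18 | | 14 |
|   No | 65 | | 17 |
| Mortality | | | |
|   Succumbed | 27 | 24 | 14 |
|   Survived | 67 | 29 | 17 |
| Overall survival time (months), mean ± standard deviation | 20.47±20.59 | 44.7±24.05 | 29.78±20.89 |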

TCGA, The Cancer Genome Atlas.

Prognostic miRs

The overall prognosis of patients with early stage ESCC is comparatively good, and samples with a censor time of <6 months are not representative for analyzing prognostic factors. Therefore, miR-seq samples from TCGA with a survival time of <6 months were removed to avoid introducing confounding factors, and the remaining 77 samples were assessed with Cox regression analysis using the survival package in R (17) to identify prognostic miRs (threshold of P<0.01 for the log rank test).
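The screening step described above can be sketched as two filters: one on follow-up time and one on the per-miR P-value. This is a minimal illustration, not the authors' R code. The sample records, the `keep_for_screening` and `select_prognostic` helpers and the entry named `hypothetical-miR` are assumptions made for the example. The two real P-values are taken from Table II. Treating a follow-up of exactly 6 months as retained is also an assumption, since the text only states that samples with <6 months were removed.

```python
def keep_for_screening(samples, min_months=6.0):
    """Drop samples whose follow-up is shorter than min_months."""
    return [s for s in samples if s["months"] >= min_months]


def select_prognostic(p_values, threshold=0.01):
    """Return (miR, P) pairs with P below the threshold, most significant first."""
    hits = [(name, p) for name, p in p_values.items() if p < threshold]
    return sorted(hits, key=lambda pair: pair[1])


# Hypothetical sample records: follow-up in months and vital status.
cohort = [
    {"barcode": "sample-A", "months": 3.0, "died": True},
    {"barcode": "sample-B", "months": 14.5, "died": False},
    {"barcode": "sample-C", "months": 40.2, "died": True},
]
kept = keep_for_screening(cohort)

# Two P-values from Table II plus one made-up, non-significant entry.
p_values = {
    "hsa-miR-188": 9.87e-3,
    "hsa-miR-129-2": 3.29e-5,
    "hypothetical-miR": 0.2,
}
ranked = select_prognostic(p_values)
```

In the study itself the P-values come from Cox regression in the R survival package; here they are simply supplied as inputs.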

Prognostic scoring system

Prognostic miRs were matched with miRs in the GSE43732 and GSE13937 datasets, and the common ones were collected. The selected miRs were ranked according to log rank P-values to construct a prognosis scoring system. Starting from the first three, miRs were added one at a time until the highest P-value representing the significance of the correlation between samples and overall survival time was obtained. When the P-value was greatest, the miRs were considered to be signature miRs, and the scoring system was created using these miRs. Risk scores are used to assess risk factors for large samples (18). Signature miRs were used to calculate risk scores for samples in the TCGA dataset using the following formula: $\text{Risk score} = \beta_{\text{gene}_1} \times \text{expr}_{\text{gene}_1} + \beta_{\text{gene}_2} \times \text{expr}_{\text{gene}_2} + \dots + \beta_{\text{gene}_n} \times \text{expr}_{\text{gene}_n}$, where $\beta_{\text{gene}}$ is the regression coefficient of the gene and $\text{expr}_{\text{gene}}$ is its expression level. The risk scores of validation samples (GSE43732 and GSE13937) were computed in the same way, and the median risk score was applied to stratify low- and high-risk samples. Subsequently, survival correlation coefficients between low- and high-risk samples in the TCGA and GEO datasets, and correlations among risk scores, were assessed. In addition, correlations between clinical features and sample prognosis were analyzed via Cox regression.
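The risk score formula and the median split can be illustrated with a short sketch. The coefficients and expression values below are invented for the example and are not the coefficients fitted in the study. The function names are hypothetical, and assigning a sample whose score equals the median to the low-risk group is an assumption the text does not specify.

```python
from statistics import median


def risk_score(coefficients, expression):
    """Weighted sum of expression levels; both dicts are keyed by miR name."""
    return sum(beta * expression[name] for name, beta in coefficients.items())


def stratify_by_median(scores):
    """Label each sample 'high' if its score exceeds the median, else 'low'."""
    cutoff = median(scores.values())
    return {sid: ("high" if s > cutoff else "low") for sid, s in scores.items()}


# Hypothetical regression coefficients (not the study's fitted values).
betas = {"hsa-miR-214": 0.5, "hsa-miR-486": -0.25}

# Hypothetical expression levels for four samples.
samples = {
    "S1": {"hsa-miR-214": 2.0, "hsa-miR-486": 4.0},
    "S2": {"hsa-miR-214": 4.0, "hsa-miR-486": 2.0},
    "S3": {"hsa-miR-214": 1.0, "hsa-miR-486": 6.0},
    "S4": {"hsa-miR-214": 6.0, "hsa-miR-486": 0.0},
}

scores = {sid: risk_score(betas, expr) for sid, expr in samples.items()}
groups = stratify_by_median(scores)
```

With these made-up numbers the scores are 0.0, 1.5, -1.0 and 3.0, the median is 0.75, and S2 and S4 fall in the high-risk group.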

Functional annotation of samples with different prognosis risks

The matched RNA-seq data was downloaded from TCGA according to the barcodes of the samples used in the prognostic miRNA analysis. The RNA-seq data was used to screen the DEGs between high- and low-risk samples using the limma package in R (bioconductor.org/packages/release/bioc/html/limma.html) (19). A false discovery rate (FDR) of <0.05 was set as the threshold. Correlation coefficients for gene expression and risk scores were computed, and positively and negatively-correlated genes were annotated with respect to significant functional terms, and Kyoto Encyclopedia of Genes and Genomes (KEGG; www.genome.jp/kegg) pathway terms, using DAVID (david.ncifcrf.gov) (20).
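Two computations in this step lend themselves to a small sketch: adjusting P-values to control the false discovery rate, and correlating a gene's expression with the risk scores. The code below uses the Benjamini-Hochberg procedure and the Pearson coefficient. Both choices are assumptions, since the text does not name the FDR method or the type of correlation, and the code is not a reimplementation of limma. All input values are made up.

```python
from math import sqrt


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical raw P-values for four genes.
raw_p = [0.01, 0.04, 0.03, 0.2]
fdr = benjamini_hochberg(raw_p)
significant = [i for i, q in enumerate(fdr) if q < 0.05]

# Hypothetical risk scores and one gene's expression across three samples.
r = pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

Genes whose adjusted value falls below 0.05 would pass the threshold stated above. The sign of the correlation then separates positively from negatively associated genes.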

Results

Cox regression analysis of the samples with a survival time of >6 months identified 34 prognostic miRs in the miR-seq dataset, 16 of which were also present in the GSE43732 and GSE13937 datasets (Table II).
Table II.

Common miRs between prognosis-associated miRs, and miRs in GSE43732 and GSE13937.

| miR | P-value |
|---|---|
| hsa-miR-129-2 | 3.29×10⁻⁵ |
| hsa-miR-34b | 1.86×10⁻⁴ |
| hsa-miR-374a | 1.92×10⁻⁴ |
| hsa-miR-412 | 1.99×10⁻⁴ |
| hsa-miR-140 | 4.66×10⁻⁴ |
| hsa-miR-214 | 5.15×10⁻⁴ |
| hsa-miR-144 | 1.57×10⁻³ |
| hsa-miR-376b | 1.59×10⁻³ |
| hsa-miR-486 | 1.67×10⁻³ |
| hsa-miR-33b | 3.99×10⁻³ |
| hsa-let-7f-1 | 6.22×10⁻³ |
| hsa-miR-494 | 6.24×10⁻³ |
| hsa-miR-33a | 6.37×10⁻³ |
| hsa-miR-432 | 6.73×10⁻³ |
| hsa-miR-219-1 | 7.88×10⁻³ |
| hsa-miR-188 | 9.87×10⁻³ |

miR, microRNA.

To create a prognostic scoring system, miRs common to the prognostic miRs and the GEO datasets were added one at a time, starting from the first three, until the highest P-value representing the significance of the association between samples and overall survival time was obtained. A prognostic scoring system was created using the 10 signature miRs with the greatest P-values, and low-risk samples had greater survival in the TCGA and GSE43732 datasets (Fig. 1A and B). Differences in the GSE13937 dataset were not notable (Fig. 1C). Regression analysis revealed that risk scores were correlated with prognosis (P=0.0141; Table III). Differences in the expression of the 10 signature miRs in samples stratified by clinical features were noted, and Table IV shows the risk factors that were prognostic for samples with different risk scores (P<0.05). Survival curves are presented in Figs. 2–5. Risk scores for samples, survival time and expression clustering heatmaps of the 10 signature miRs from the TCGA, GSE13937 and GSE43732 datasets are presented in Fig. 6.
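The comparison of survival between low- and high-risk groups can be sketched with a two-group log rank test. The version below is a plain-Python illustration under stated assumptions, not the routine from the R survival package used in the study. The function name, the input layout and the follow-up times are hypothetical. The P-value uses the chi-square distribution with one degree of freedom.

```python
from math import erfc, sqrt


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log rank test; returns (chi-square statistic, P-value).

    times_*  : follow-up times
    events_* : 1 if the event (death) was observed, 0 if censored
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)  # at risk, group A
        n_b = sum(1 for u, _, g in data if u >= t and g == 1)  # at risk, group B
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        d_b = sum(1 for u, e, g in data if u == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0.0:
        return 0.0, 1.0
    chi2 = observed_minus_expected ** 2 / variance
    # Survival function of the chi-square distribution with 1 degree of freedom.
    return chi2, erfc(sqrt(chi2 / 2))


# Hypothetical follow-up times in months; every event is observed.
high_risk = [1, 2, 3, 4]
low_risk = [10, 11, 12, 13]
chi2, p = logrank_test(high_risk, [1] * 4, low_risk, [1] * 4)
```

For these made-up groups the statistic is about 7.3 and the P-value is below 0.05, which would mark the two curves as distinguishable under the threshold used in the figures.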
Figure 1.

Survival curves for patients with early stage esophageal carcinoma stratified by low- and high-risk. Samples from (A) The Cancer Genome Atlas, and (B) GSE43732 and (C) GSE13937 datasets. **P<0.05.

Table III.

Cox regression results for the prognosis-associated clinical features.

| Clinical feature | P-value | Hazard ratio (confidence interval) |
|---|---|---|
| Age, >60 years vs. <60 years | 0.97 | 1.016 (0.438–2.356) |
| Sex, male vs. female | 0.615 | 1.325 (0.442–3.97) |
| Alcohol, yes vs. no | 0.916 | 0.943 (0.318–2.793) |
| Tobacco, yes vs. no vs. reformed | 0.56 | 0.872 (0.551–1.382) |
| New tumor, yes vs. no | 0.726 | 1.168 (0.491–2.778) |
| Radiation therapy, yes vs. no | 0.9302 | 0.951 (0.3113–2.907) |
| Risk score, high vs. low | 0.0141 | 1.21 (1.005–1.458) |

Table IV.

Prognostic factors in high- and low-risk samples under the same clinical features.

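| Clinical feature | P-value |
|---|---|
| Age | |
|   ≥60, n=39 | 0.0119 |
|   ≤60, n=38 | 0.1315 |
| Gender | |
|   Male, n=6 | 0.0731 |
|   Female, n=15 | 0.07537 |
| Alcohol | |
|   Yes, n=59 | 0.002 |
|   No, n=18 | 0.548 |
| Smoker | |
|   Yes, n=15 | 0.193 |
|   No, n=25 | 0.0253 |
|   Reformed, n=31 | 0.166 |
| New tumor | |
|   Yes, n=27 | 0.166 |
|   No, n=48 | 0.0175 |
| Radiation therapy | |
|   Yes, n=17 | 0.945 |
|   No, n=54 | 0.000642 |
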
Figure 2.

Survival curves of high- and low-risk samples of different ages. (A) Samples <60 years of age. High-risk samples are red and low-risk samples are black. (B) Samples > 60 years of age. High-risk samples are purple and low-risk samples are blue. (C) Combined survival curves of samples with age groups above and below the median age. Curves crossed with P>0.05 represent different samples which cannot be distinguished by risk score, while curves with P<0.05 represent samples that may be distinguished by risk score. **P<0.05.

Figure 5.

Survival curves of high- and low-risk samples with/without radiation therapy. (A) Samples with no radiation therapy. High-risk samples are red, and low-risk samples are black. (B) Samples with radiation therapy. High-risk samples are purple, and low-risk samples are blue. (C) Combined survival curves from those with/without radiation therapy. **P<0.05.

Figure 6.

Risk scores, survival and expression clustering heatmap of the 10 signature microRNAs of all early stage esophageal carcinoma samples. Samples from (A) The Cancer Genome Atlas, and (B) GSE13937 and (C) GSE43732 datasets.

In total, 168 DEGs were identified, and 58 were negatively-associated with risk scores, with 110 positively-associated with risk scores. The expression pattern of the top 20 DEGs positively- and negatively-associated with risk scores differed significantly between low and high-risk samples (Fig. 7A). The GO enrichment of the DEGs is presented in Fig. 7B. The top 20 positively-associated DEGs were significantly enriched in six KEGG pathways, including: hsa05217-Basal cell carcinoma, hsa04916-Melanogenesis, hsa04610-Complement and coagulation cascades, hsa04530-Tight junction, hsa04340-Hedgehog signaling pathway and hsa03320-PPAR signaling pathway (Fig. 7C).
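Pathway enrichment of a gene list is commonly scored with a hypergeometric tail probability, and the sketch below shows that calculation. It is an illustration only. DAVID reports a modified Fisher's exact P-value (the EASE score), so the numbers here would not reproduce its output exactly. The function name and the counts are hypothetical.

```python
from math import comb


def enrichment_p_value(population, annotated, selected, overlap):
    """P(X >= overlap) when `selected` genes are drawn from `population`
    genes, of which `annotated` belong to the pathway."""
    upper = min(annotated, selected)
    total = comb(population, selected)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 10 background genes, 5 in the pathway,
# 5 genes selected, all 5 of them in the pathway.
p_all = enrichment_p_value(population=10, annotated=5, selected=5, overlap=5)

# Asking for an overlap of at least 0 covers every outcome.
p_none = enrichment_p_value(population=10, annotated=5, selected=5, overlap=0)
```

In this toy case `p_all` equals 1/252 because only one of the 252 possible draws places all five selected genes in the pathway.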
Figure 7.

Expression pattern and functional annotation of the DEGs positively- and negatively-associated with risk scores. (A) Expression pattern of the top 20 DEGs positively- and negatively-associated with risk scores. The X-axis represents the samples in the TCGA dataset, in which risk scores increase from left to right. The Y-axis represents the DEG expression levels. (B) The GO enrichment of the DEGs. (C) KEGG pathway enrichment of the top 20 positively-associated DEGs.

Discussion

In order to screen miRs involved in the prognosis of ESCC, miR-seq and mRNA-seq data for early stage ESCC samples were downloaded from TCGA, with a further two miR expression datasets, GSE43732 and GSE13937, downloaded from the GEO database. miR-seq data samples with a survival time of >6 months were subjected to Cox regression analysis to assess prognostic value. Common prognostic miRs, and miRs in the GSE43732 and GSE13937 datasets, were used for risk score calculations, and a median risk score was used to stratify low- and high-risk samples. A prognostic scoring system of 10 signature miRs was made according to survival analysis between low- and high-risk samples. It was noted that low-risk samples had greater survival compared with high-risk samples in the TCGA and GSE43732 datasets. Age, alcohol and tobacco use, and radiotherapy were prognostic factors for samples with different risk scores. The present study identified 168 DEGs for all miRs, 110 of which were positively correlated with risk scores. The top 20 positively-correlated and top 20 negatively-correlated DEGs were significantly enriched in six and 10 functional terms, respectively. There were six significantly enriched KEGG pathways, including ‘tight junction’ and ‘melanogenesis’. Prognostic scoring is used to predict survival and disease recurrence for a number of types of cancer (21). Wang et al (17) established a 53-gene expression system to be used to predict overall survival for gastric cancer. Mao et al (22) created a 12-gene prognostic scoring system to guide adjuvant therapy for breast cancer. Yang et al (23) created a miR signature to stratify patients with Barrett's esophagus with different prognostic risks for targeted chemoprevention. A number of miRs in the prognostic system used in the present study have been previously implicated in ESCC or some other malignant tumors. 
miR-214, a miR that regulates cancer cell proliferation, migration and invasion by targeting phosphatase and tensin homolog in gastric cancer, has been reported to reduce cell survival via downregulation of Bcl2l2 in cervical cancer cells (24,25). The predictive value of miR-214 for prognosis and multidrug resistance has been implicated in ESCC (26). Its overexpression has been reported to enhance cisplatin sensitivity in ESCC by directly targeting survivin, and indirectly through CUG triplet repeat RNA binding protein 1 (27). miR-129-2 suppresses the proliferation and migration of ESCC via downregulation of SRY-related HMG box 4, and miR-129 is hypothesized to be a novel therapeutic target and biomarker in gastrointestinal cancer (28,29). miR-374a is a prognostic marker for patient survival in early-stage non-small cell lung cancer (30). miR-374a has also been implicated in cell proliferation, migration and invasion in gastric cancer by targeting SRC kinase signaling inhibitor 1 (31). miR-486-5p expression is frequently decreased in human cancer. Low or unaltered expression of miR-486-5p compared with neighboring normal tissues has been demonstrated to be associated with a poor prognosis, and high expression with a good prognosis, in gastric cancer (32). miR-486 was observed to be downregulated in ESCC tissues (33). In patients with ESCC, miR-486-3p was highly expressed following chemotherapy treatment (34). In conclusion, miR-214, miR-129-2, miR-374a and miR-486 may predict survival in patients with ESCC, although these data require validation with larger studies.
  34 in total

1.  Prediction value of miR-483 and miR-214 in prognosis and multidrug resistance of esophageal squamous cell carcinoma.

Authors:  Yi Zhou; Liu Hong
Journal:  Genet Test Mol Biomarkers       Date:  2013-05-13

2.  Prognostic value of combined and individual expression of microRNA-1290 and its target gene nuclear factor I/X in human esophageal squamous cell carcinoma.

Authors:  Rui Xie; Shang-Nong Wu; Cheng-Cheng Gao; Xiao-Zhong Yang; Hong-Gang Wang; Jia-Ling Zhang; Wei Yan; Tian-Heng Ma
Journal:  Cancer Biomark       Date:  2017-09-07       Impact factor: 4.388

3.  Cancer statistics, 2010.

Authors:  Ahmedin Jemal; Rebecca Siegel; Jiaquan Xu; Elizabeth Ward
Journal:  CA Cancer J Clin       Date:  2010-07-07       Impact factor: 508.702

Review 4.  Advances in the treatment of esophageal carcinoma.

Authors:  T W Rice; D J Adelstein; G Zuccaro; G W Falk; J R Goldblum
Journal:  Gastroenterologist       Date:  1997-12

5.  Changes in microRNA expression levels correlate with clinicopathological features and prognoses in endometrial serous adenocarcinomas.

Authors:  Eri Hiroki; Jun-Ichi Akahira; Fumihiko Suzuki; Satoru Nagase; Kiyoshi Ito; Takashi Suzuki; Hironobu Sasano; Nobuo Yaegashi
Journal:  Cancer Sci       Date:  2009-10-08       Impact factor: 6.716

6.  Tumor budding is a strong and reproducible prognostic marker in T3N0 colorectal cancer.

Authors:  Lai Mun Wang; David Kevans; Hugh Mulcahy; Jacintha O'Sullivan; David Fennelly; John Hyland; Diarmuid O'Donoghue; Kieran Sheahan
Journal:  Am J Surg Pathol       Date:  2009-01       Impact factor: 6.394

7.  Cancer incidence and mortality worldwide: sources, methods and major patterns in GLOBOCAN 2012.

Authors:  Jacques Ferlay; Isabelle Soerjomataram; Rajesh Dikshit; Sultan Eser; Colin Mathers; Marise Rebelo; Donald Maxwell Parkin; David Forman; Freddie Bray
Journal:  Int J Cancer       Date:  2014-10-09       Impact factor: 7.396

8.  MiR-214 regulate gastric cancer cell proliferation, migration and invasion by targeting PTEN.

Authors:  Ting-Song Yang; Xiao-Hu Yang; Xu-Dong Wang; Yi-Ling Wang; Bo Zhou; Zhen-Shun Song
Journal:  Cancer Cell Int       Date:  2013-07-08       Impact factor: 5.722

Review 9.  miR-129 as a novel therapeutic target and biomarker in gastrointestinal cancer.

Authors:  Andrew Fesler; Haiyan Zhai; Jingfang Ju
Journal:  Onco Targets Ther       Date:  2014-08-21       Impact factor: 4.147

10.  Expression and prognostic value of miR-486-5p in patients with gastric adenocarcinoma.

Authors:  Hui Chen; Chuanli Ren; Chongxu Han; Daxin Wang; Yong Chen; Deyuan Fu
Journal:  PLoS One       Date:  2015-03-20       Impact factor: 3.240
