Jin-Cheng Guo, Yang Wu, Yang Chen, Feng Pan, Zhi-Yong Wu, Jia-Sheng Zhang, Jian-Yi Wu, Xiu-E Xu, Jian-Mei Zhao, En-Min Li, Yi Zhao, Li-Yan Xu.
Abstract
BACKGROUND: Esophageal squamous cell carcinoma (ESCC) is the predominant subtype of esophageal carcinoma in China. This study aimed to develop a staging model to predict the outcomes of patients with ESCC.
Keywords: Esophageal squamous cell carcinoma; Long non-coding RNA; Overall survival; Protein-coding gene; Staging model; Transcriptome
Year: 2018 PMID: 29784063 PMCID: PMC5993132 DOI: 10.1186/s40880-018-0277-0
Source DB: PubMed Journal: Cancer Commun (Lond) ISSN: 2523-3548
Clinicopathological characteristics of patients with esophageal squamous cell carcinoma
| Variable | GEO datasetsᵃ: number of patients | GEO datasetsᵃ: 5-year OS rate (%) | GEO datasetsᵃ: P valueᵇ | Experimental set: number of patients | Experimental set: 5-year OS rate (%) | Experimental set: P valueᵇ |
|---|---|---|---|---|---|---|
| Total | 179 | | | 105 | | |
| Age (years) | | | | | | |
| ≤ 59 | 89 | 48.31 | 0.03 | 63 | 58.73 | 0.69 |
| > 59 | 90 | 33.33 | | 42 | 54.76 | |
| Sex | | | | | | |
| Female | 33 | 33.33 | 0.30 | 25 | 48.00 | 0.16 |
| Male | 146 | 42.47 | | 80 | 60.00 | |
| Tumor location | | | | | | |
| Upper thorax | 20 | 25.00 | 0.10 | 7 | 71.4 | 0.15 |
| Middle thorax | 97 | 40.21 | | 54 | 61.11 | |
| Lower thorax | 62 | 46.78 | | 44 | 50.00 | |
| Histological grade | | | | | | |
| G1 | 49 | 28.57 | 0.02 | 15 | 60.00 | 0.16 |
| G2 | 98 | 44.90 | | 73 | 64.29 | |
| G3 | 32 | 46.88 | | 17 | 47.06 | |
| Primary tumor | | | | | | |
| T1 | 12 | 41.67 | 0.04 | 21 | 71.42 | 0.03 |
| T2 | 27 | 40.74 | | 63 | 63.49 | |
| T3 | 110 | 44.55 | | 21 | 31.58 | |
| T4 | 30 | 26.67 | | | | |
| Regional lymph nodes | | | | | | |
| N0 | 83 | 56.62 | < 0.01 | 50 | 68.00 | < 0.01 |
| N1 | 62 | 27.41 | | 32 | 56.25 | |
| N2 | 22 | 27.27 | | 16 | 37.50 | |
| N3 | 12 | 25.00 | | 7 | 28.50 | |
| pTNM stage | | | | | | |
| I | 10 | 70.00 | 0.00 | 6 | 33.30 | 0.00 |
| II | 77 | 53.25 | | 48 | 70.80 | |
| III | 92 | 27.17 | | 51 | 47.05 | |
| Adjuvant therapy | | | | | | |
| Unknown | 45 | 64.44 | | 0 | | |
| No | 30 | 30.00 | 0.50 | 43 | 58.14 | 0.70 |
| Yes | 104 | 33.65 | | 62 | 56.45 | |
| Radiotherapy | Unknown | | | 18 | 48.83 | 0.11 |
| Chemotherapy | Unknown | | | 19 | 59.09 | |
| Radiotherapy + chemotherapy | Unknown | | | 25 | 48.00 | |
OS, overall survival
ᵃ Comprising the GSE53624 and GSE53622 datasets
ᵇ The log-rank test was used
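The P values in the table come from the log-rank test. As a minimal sketch of how a two-group log-rank statistic can be computed (not the authors' code; the function name `logrank_test` and the input layout of follow-up times plus 0/1 event flags are assumptions made for illustration):

```python
import math

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. Returns (chi_square, p_value), 1 degree of freedom.

    times_*  : follow-up times per patient
    events_* : 1 if death was observed, 0 if the patient was censored
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    observed_minus_expected = 0.0  # summed over event times, for group A
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)  # at risk in A
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)  # at risk in B
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        d_b = sum(1 for ti, e, g in data if ti == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            # Hypergeometric variance of the number of deaths in group A
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi_square = observed_minus_expected ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value
```

Variables with more than two levels (for example T1 to T4) would need the multi-group form of the test, which this sketch does not cover.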
Fig. 1 Schedule of the analyses used to develop the transcriptome molecular staging model and validate its predictive efficiency. PCG, protein-coding gene; lncRNA, long non-coding RNA
Fig. 2 The patients identified from the GSE53624 dataset (n = 119) are grouped into three risk stages. a Univariate Cox proportional hazards regression analysis of the expression profiling data of PCGs and lncRNAs. b Eigenvalues of the principal components show that most of the variance in the GSE53624 dataset is contained in the first three principal components. c Clustering of the patients with ESCC identified from the GSE53624 dataset according to the scores of the first three principal components using NbClust (Euclidean distance, complete linkage) indicates that the optimal number of clusters was three, which gave the largest index. d Principal component analysis of the GSE53624 dataset. Axes are principal components 1, 2, and 3. PCG, protein-coding gene; lncRNA, long non-coding RNA
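The clustering step in panel c can be illustrated with a small sketch of agglomerative clustering using Euclidean distance and complete linkage. This is not NbClust itself: the number of clusters `k` is passed in by hand rather than chosen by an index, and the function name `complete_linkage` and the toy points standing in for principal component scores are assumptions.

```python
import math

def complete_linkage(points, k):
    """Agglomerative clustering (Euclidean distance, complete linkage).

    Merges the two closest clusters until only k remain, where the distance
    between clusters is the largest pairwise distance between their members.
    Returns a list of clusters, each a sorted list of point indices.
    """
    clusters = [[i] for i in range(len(points))]

    def linkage(c1, c2):
        return max(math.dist(points[i], points[j]) for i in c1 for j in c2)

    while len(clusters) > k:
        best = None  # (distance, index_a, index_b) of the closest pair so far
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = linkage(clusters[a], clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Hypothetical scores on principal components 1 to 3 for six patients
pc_scores = [(0, 0, 0), (0.1, 0, 0), (5, 5, 5), (5.1, 5, 5), (10, 0, 0), (10, 0.1, 0)]
groups = complete_linkage(pc_scores, k=3)
```

The naive search over all cluster pairs is cubic in the number of patients, which is acceptable for a cohort of this size but not for large datasets.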
Fig. 3 Survival prediction power of PCG-lncRNA grouping versus TNM staging for patients identified from the GSE53624 dataset. a Kaplan–Meier analysis of patient survival when the PCG-lncRNA grouping is applied. b Kaplan–Meier analysis of patient survival when TNM staging is applied. c Comparison of the PCG-lncRNA grouping and the TNM staging systems using receiver operating characteristic (ROC) analysis. PCG, protein-coding gene; lncRNA, long non-coding RNA
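A Kaplan–Meier curve such as those in panels a and b, and a 5-year overall survival rate read from it, can be sketched as follows. The function names `kaplan_meier` and `survival_at` and the toy follow-up data (in months) are assumptions for illustration only.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate.

    times  : follow-up time per patient
    events : 1 if death was observed, 0 if censored
    Returns [(time, survival_probability)] at each distinct death time.
    """
    survival = 1.0
    curve = []
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve

def survival_at(curve, t):
    """Survival probability at time t (step function, 1.0 before the first death)."""
    s = 1.0
    for time, surv in curve:
        if time > t:
            break
        s = surv
    return s

# Hypothetical follow-up in months; 60 months corresponds to 5 years
months = [12, 24, 36, 48, 60, 72]
died = [1, 0, 1, 0, 1, 0]
five_year_os = survival_at(kaplan_meier(months, died), 60)
```

Censored patients leave the risk set without lowering the curve, which is why the estimate differs from a simple proportion of patients alive at 5 years.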
Fig. 4 The LSB staging model comprising SEMA3A, BEX2, and LINC01800, selected using classification and regression tree (CART) analysis. a SEMA3A, BEX2, and LINC01800 form the classification tree generated using CART analysis. The percentages represent the proportion of patients at each LSB stage in the training set. b Test error of the classification tree. c Multiclass ROC analysis performed in the training set, test set, and entire GSE53624 dataset
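A classification tree of the kind shown in panel a is grown by repeatedly choosing the expression threshold that best separates the classes. The sketch below shows one such split chosen by Gini impurity, a common CART criterion; whether the authors used Gini, and the thresholds they obtained, are not stated here. The function names and the toy expression values and stage labels are assumptions.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(values, labels):
    """Find the threshold on one feature that minimizes weighted Gini impurity.

    Returns (threshold, weighted_gini); samples with value <= threshold go left.
    threshold is None if no split improves on the unsplit node.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical expression values for one gene and hypothetical risk labels
expression = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
stage = ["low", "low", "low", "high", "high", "high"]
threshold, impurity = best_split(expression, stage)
```

A full tree would apply `best_split` to each gene at each node and recurse on the two child nodes until a stopping rule is met.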
Identified PCGs and lncRNAs and their associations with prognosis
| Ensembl ID | Gene symbol | Gene name | Chromosome location | Coefficientᵃ | P value | Gene expression level association with prognosis |
|---|---|---|---|---|---|---|
| ENSG00000075213 | SEMA3A | Semaphorin 3A | Chromosome 7: 83955777–84492724 (−) | 0.17 | 0.01 | High |
| ENSG00000133134 | BEX2 | Brain-expressed X-linked protein 2 | Chromosome X: 103309346–103311046 (−) | −0.22 | 0.01 | Low |
| ENSG00000234572 | LINC01800 | | Chromosome 2: 64846130–64863626 (−) | −0.20 | 0.00 | Low |
PCG, protein-coding gene; lncRNA, long non-coding RNA
ᵃ Derived from univariable Cox regression analysis of the GSE53624 dataset
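As a worked illustration, assuming the coefficients above are Cox regression log hazard ratios per unit of expression (the expression scale is not stated here), each converts to a hazard ratio by exponentiation:

```latex
\begin{align*}
\mathrm{HR}_{\text{SEMA3A}} &= e^{0.17} \approx 1.19 \\
\mathrm{HR}_{\text{BEX2}} &= e^{-0.22} \approx 0.80 \\
\mathrm{HR}_{\text{LINC01800}} &= e^{-0.20} \approx 0.82
\end{align*}
```

Under that reading, a hazard ratio above 1 for SEMA3A means higher expression goes with higher hazard, while values below 1 for BEX2 and LINC01800 mean lower expression goes with higher hazard. This would match the High/Low column if that column marks the expression level linked to worse prognosis.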
Fig. 5 Validation of the LSB staging model using the GSE53622 dataset (n = 60) (a, b) and the experimental dataset (n = 105) (c, d). Kaplan–Meier analysis, and comparison of the LSB staging model with TNM staging using ROC analysis
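Comparing two staging systems by ROC analysis comes down to comparing areas under the curve (AUC). The sketch below computes AUC for a simplified binary outcome through its rank interpretation: the probability that a randomly chosen positive case scores higher than a randomly chosen negative one. The binary simplification, the function name `auc`, and the toy stage scores are assumptions; the multiclass analysis in the figures is not reproduced here.

```python
def auc(scores, labels):
    """Area under the ROC curve for binary labels (1 = event, 0 = no event).

    Counts, over all positive/negative pairs, how often the positive case
    has the higher score; ties count as half.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))

# Hypothetical stage assigned to six patients (1 to 3) and whether each died
stage_score = [3, 3, 2, 1, 1, 2]
died = [1, 1, 1, 0, 0, 0]
model_auc = auc(stage_score, died)
```

Running the same function on the stages from a second system for the same patients gives the second AUC to compare against.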
Primer sequences used for real-time RT-PCR
| Gene | Forward (5′–3′) | Reverse (5′–3′) |
|---|---|---|
| SEMA3A | TGGTTCTGCATGTTCTCGCT | CTCTCTGCGACTTCGGACTG |
| BEX2 | TCGAGAATCGGGAGGAGGAGAC | TCCTGGTTGACATTTTCCACGAT |
| LINC01800 | CCACACTGGAGTGCAGCTAT | CCACCTGTCTGATGGTCTTCT |
| β-actin | AGCGAGCATCCCCCAAAGTT | GGGCACGAAGGCTCATCATT |
RT-PCR, reverse transcription polymerase chain reaction
Fig. 6 Coexpression network analysis and prediction of the function of SEMA3A, BEX2, and LINC01800. a Coexpression network of SEMA3A, BEX2, and LINC01800 with other genes in the GSE53624 and GSE53622 datasets (Pearson correlation coefficient > 0.5, P < 0.05). Genes shown in blue were coexpressed with two of the three identified genes in the LSB staging model, and genes shown in red with one. b Functional enrichment of the protein-coding genes that were coexpressed with SEMA3A, BEX2, and LINC01800, using ClueGO
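The coexpression filter in panel a can be sketched as a Pearson correlation against a target gene with a cutoff of 0.5. This sketch applies only the correlation cutoff and omits the P < 0.05 criterion; the function names and the placeholder genes `GENE_A` to `GENE_C` with their toy expression values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def coexpressed(target, expression, r_min=0.5):
    """Genes whose correlation with the target gene exceeds r_min.

    expression maps gene symbol -> expression values across the same samples.
    """
    reference = expression[target]
    return sorted(
        gene
        for gene, values in expression.items()
        if gene != target and pearson(reference, values) > r_min
    )

# Hypothetical expression values across five samples
expression = {
    "SEMA3A": [1, 2, 3, 4, 5],
    "GENE_A": [2, 4, 6, 8, 10],
    "GENE_B": [5, 4, 3, 2, 1],
    "GENE_C": [1, 3, 2, 5, 4],
}
partners = coexpressed("SEMA3A", expression)
```

Repeating the call for each of the three model genes and counting how often a partner appears would reproduce the one-gene versus two-gene distinction described in the caption.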