Bang-Hao Xu1, Jing-Hang Jiang2, Tao Luo3, Zhi-Jun Jiang3, Xin-Yu Liu3, Le-Qun Li3. 1. Department of Hepatobiliary Surgery, The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China. 2. Department of Hepatobiliary Surgery, Jing Men NO.2 People's Hospital, Jingmen, Hubei, China. 3. Department of Hepatobiliary Surgery, Guangxi Medical University Affiliated Tumor Hospital, Nanning, Guangxi, China.
Abstract
Reliable biomarkers are of great significance for the diagnosis and treatment of hepatocellular carcinoma (HCC). This study used The Cancer Genome Atlas (TCGA) database and bioinformatics analysis to identify potential prognostic epithelial-mesenchymal transition (EMT) related lncRNAs (ERLs). Differentially expressed long noncoding RNAs (lncRNAs) were obtained by analyzing the lncRNA data of 370 HCC samples in TCGA. Pearson correlation analysis was then carried out with EMT related genes (ERGs) from the Molecular Signatures Database. Prognostic ERLs were obtained by combining these results with univariate Cox regression analysis of the overall survival (OS) of HCC patients. The "step" function was then used to select the optimal combination for constructing a multivariate Cox regression model. The expression levels of the ERLs in HCC samples were verified by real-time quantitative polymerase chain reaction. Finally, we identified 5 prognostic ERLs (AC023157.3, AC099850.3, AL031985.3, AL365203.2, CYTOR). The model showed that this prognostic signature was a reliable independent predictor of risk (P value <.0001, hazard ratio [HR] = 2.400, 95% confidence interval [CI] = 1.667-3.454 for OS). In the time-dependent receiver operating characteristic analysis, the signature was a good predictor of HCC survival (areas under the curve at 1, 2, 3, and 5 years were 0.754, 0.720, 0.704, and 0.662, respectively). Analysis of the correlation between the signature and clinical characteristics showed that it is an independent factor that can predict the prognosis of HCC more accurately. By matching with the Molecular Signatures Database, we obtained 18 ERLs, and 5 prognostic ERLs were used to construct the HCC prognosis model and to perform the clinical feature correlation analysis. Bioinformatics analysis showed that these prognostic markers were involved in the regulation of EMT and in functions related to tumor occurrence and migration. The 5 prognostic ERLs identified in this study can be used as potential biomarkers to predict the prognosis of HCC.
Introduction
Liver cancer is one of the most common malignant tumors in the world. The number of deaths due to liver cancer ranks fourth in the world. Hepatocellular carcinoma (HCC) accounts for the majority of liver cancer cases, and patients in China account for more than half of the world's HCC cases. Alcoholism, aflatoxin B1, diabetes, hepatitis B virus (HBV) infection, nonalcoholic fatty liver disease (NAFLD), and obesity are the main risk factors for HCC. Although the management of HCC has improved in recent years, the overall prognosis is still poor, so it is urgent to develop new diagnosis and treatment strategies.

Noncoding RNAs (ncRNAs) are RNAs that do not participate in protein coding. NcRNAs containing more than 200 nucleotides are defined as long noncoding RNAs (lncRNAs). A large number of studies have shown that lncRNAs are involved in the pathophysiological processes of various diseases, and a growing body of research has confirmed this in HCC.

Epithelial–mesenchymal transition (EMT) is a physiological process in which epithelial cells transform into mesenchymal cells. This transformation enhances the invasion, metastasis, and anti-apoptosis abilities of cells and helps to promote the growth and metastasis of HCC cells. Many studies have reported that various lncRNAs regulate the occurrence and development of EMT by targeting EMT transcription factors. For example, lnc-GNAT1–1 can inhibit the expression of Snail to inhibit EMT, lncRNA-CC3 directly targets Snail to promote the EMT process, and lncRNA-ZFAS1 can induce EMT in cancer cells by competitively binding the corresponding miRNA within a competing endogenous RNA (ceRNA) network. At the same time, many studies have shown that abnormally expressed lncRNAs can affect the prognosis of HCC by participating in the EMT process and are valuable for predicting the prognosis of HCC. Although some EMT related lncRNAs (ERLs) have been reported in HCC, knowledge in this area is limited, and the relationship between many lncRNAs and EMT in HCC remains unclear. In this study, we used high-throughput HCC data from The Cancer Genome Atlas (TCGA) and bioinformatics analysis to identify ERLs with prognostic value for HCC.
Materials and methods
Data source and clinical samples
The publicly shared dataset of 370 patients with HCC was downloaded from the TCGA data portal (https://portal.gdc.cancer.gov/, accessed August 20, 2019), including gene expression data and the corresponding clinical information. Data retrieval and application complied with the TCGA publication guidelines and data access policies. In addition, HCC and para-cancerous tissues were collected from 6 patients, immediately frozen in liquid nitrogen, and stored at −80°C. Informed consent was obtained from the participants before this study. The ethics committee of the First Affiliated Hospital of Guangxi Medical University approved the study.
Epithelial–mesenchymal transition related lncRNAs screening
The lncRNA and mRNA expression profiles of the HCC cohort were extracted from the RNA sequencing dataset of TCGA. Meanwhile, a set of ERGs was retrieved from the Molecular Signatures Database v7.0 (Hallmark_epithelial_mesenchymal_transition; http://www.broadinstitute.org/gsea/msigdb/index.jsp), and the expression data of these ERGs were extracted from the corresponding mRNA data. The limma package on the R platform was used to screen for differentially expressed lncRNAs (DELs); lncRNAs with a P value <.05 and an absolute log fold change (|logFC|) >1 were considered DELs. LncRNAs do not encode proteins themselves but act through their targets and are co-expressed with the related protein-coding mRNAs. The co-expression relationship between DELs and ERGs was therefore evaluated by Pearson correlation analysis, and lncRNAs that met the thresholds |R| > 0.4 and P < .001 with ERGs were considered significant ERLs.
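The co-expression screen described above can be sketched as follows. This is a minimal illustration in Python rather than the R workflow used in the study; the function names (`pearson_r`, `screen_erls`) and the toy expression values are hypothetical, and the accompanying significance filter (P < .001) is omitted for brevity.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return sxy / math.sqrt(sxx * syy)


def screen_erls(lnc_expr, erg_expr, r_cutoff=0.4):
    """Return {lncRNA: [(ERG, r), ...]} for pairs with |r| > r_cutoff."""
    hits = {}
    for lnc, lnc_values in lnc_expr.items():
        partners = []
        for gene, gene_values in erg_expr.items():
            r = pearson_r(lnc_values, gene_values)
            if abs(r) > r_cutoff:
                partners.append((gene, r))
        if partners:
            hits[lnc] = partners
    return hits


# Toy expression profiles across five samples (hypothetical values).
lnc_expr = {
    "lncA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "lncB": [2.0, 1.0, 2.0, 1.0, 2.0],
    "lncC": [3.0, 3.0, 3.0, 3.0, 3.0],
}
erg_expr = {
    "geneX": [2.1, 3.9, 6.2, 8.0, 9.9],
    "geneY": [5.0, 4.0, 5.0, 4.0, 5.0],
}
erl_hits = screen_erls(lnc_expr, erg_expr)
```

In this toy input, `lncA` pairs with `geneX` and `lncB` with `geneY`, while the constant profile `lncC` is not retained. A real screen would also compute a P value for each correlation before accepting a pair.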
Construction of prognostic signature
Univariate Cox regression analysis between the expression levels of ERLs and the survival time of patients with HCC was performed with the “survival” package (version 2.44) in R (version 3.6.1) to describe the role of ERLs in HCC survival prediction. ERLs with a P value <.01, whose expression level was significantly related to the survival time of HCC patients, were identified as prognostic ERLs. The prognostic ERLs selected by the “step” function were fitted into a multivariate Cox regression analysis with survival time as the dependent variable. A risk score model of the prognostic signature was constructed as the linear combination of the expression levels of the ERLs, with the multivariate Cox regression coefficients (β) as the weights. The risk score of each patient with HCC was calculated by the following formula:

$$\text{risk score} = \sum_{i=1}^{n} \text{expression}_{\text{lncRNA}_i} \times \beta_{\text{lncRNA}_i}$$

The HCC cohort was divided into high- and low-risk groups based on this prognostic model by setting the median risk score as the cut-off. A time-dependent receiver operating characteristic (ROC) curve was generated with the “survivalROC” R package (version 1.0.3) to estimate the predictive accuracy of this prognostic signature.
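As an illustration of the risk score and the median split, the following Python sketch applies the coefficients listed in Table 2 to hypothetical expression values. The patient identifiers, the expression values, and the rule that a score exactly equal to the median falls in the low-risk group are assumptions made for this example; the handling of ties is not stated in the text.

```python
from statistics import median

# Cox regression coefficients as listed in Table 2.
COEFFICIENTS = {
    "AL365203.2": 0.211,
    "CYTOR": 0.187,
    "AC023157.3": 0.487,
    "AL031985.3": 0.609,
    "AC099850.3": 0.198,
}


def risk_score(expression, coefficients=COEFFICIENTS):
    """Linear combination of expression levels weighted by the coefficients."""
    return sum(beta * expression[name] for name, beta in coefficients.items())


def split_by_median(scores):
    """Label each patient 'high' or 'low' relative to the median score.

    Assumption: scores equal to the median are assigned to the low-risk group.
    """
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical patients with the same expression level for every lncRNA.
patients = {
    "P1": {name: 1.0 for name in COEFFICIENTS},
    "P2": {name: 2.0 for name in COEFFICIENTS},
    "P3": {name: 0.0 for name in COEFFICIENTS},
    "P4": {name: 3.0 for name in COEFFICIENTS},
}
scores = {pid: risk_score(expr) for pid, expr in patients.items()}
groups = split_by_median(scores)
```

With these toy inputs, patients `P2` and `P4` fall into the high-risk group and `P1` and `P3` into the low-risk group.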
Evaluation of ERL-based prognostic signature
To evaluate the predictive value of the ERL-based prognostic signature for HCC patients, joint effect analysis was performed to investigate the association between the prognostic signature and clinicopathologic characteristics in HCC. A prognostic nomogram for predicting survival rate was constructed from the risk score and the clinicopathologic characteristics. Principal component analysis was performed to determine the profile distribution patterns of the grouped cases.
Bioinformatics analysis
The co-expressed ERGs were analyzed for gene ontology (GO) terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways with the DAVID online website (DAVID v6.8, https://david.ncifcrf.gov/, accessed November 1, 2019), which is a widely used bioinformatics resource. GO and KEGG analyses reveal the enriched functions and pathways of the co-expressed ERGs. A P value <.05 was considered statistically significant in the GO and KEGG analyses. Gene set enrichment analysis (GSEA, version 4.0.1, http://www.broadinstitute.org/gsea/index.jsp) was performed to explore the distinct functional phenotypes between the high- and low-risk groups. The C2 (c2.cp.kegg.v7.0.symbols.gmt) and C5 (c5.all.v7.0.symbols.gmt) gene sets of the Molecular Signatures Database were used for GSEA. A nominal P value <.05 and a false discovery rate <0.05 were considered statistically significant in GSEA.
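The idea behind over-representation testing of GO terms and KEGG pathways can be sketched with a plain hypergeometric tail probability. This is an illustration only: DAVID reports a modified, more conservative Fisher exact statistic (the EASE score), and the counts below (background size, annotated genes, overlap) are hypothetical.

```python
from math import comb


def hypergeom_tail(k, n, K, N):
    """P(X >= k) for a gene list of size n drawn from N background genes,
    of which K are annotated to the term of interest."""
    upper = min(n, K)
    total = comb(N, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


# Hypothetical example: 5 of 29 listed genes carry a term that annotates
# 200 of 20000 background genes.
p_enriched = hypergeom_tail(k=5, n=29, K=200, N=20000)
```

A term would be called enriched when this probability falls below the chosen threshold (P < .05 in the analysis described above).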
Real-time quantitative PCR
According to the manufacturer's protocol, total RNA was extracted with RNAiso Plus reagent (Takara, Japan) and reverse-transcribed into complementary DNA (cDNA) using the PrimeScript™ RT reagent Kit with gDNA Eraser (Takara, Japan). The TB Green® Premix Ex Taq™ II Kit (Takara, Japan) was used for real-time PCR on an ABI7500 real-time PCR system (Applied Biosystems). The primer sequences are shown in Table 1. Each cDNA sample was run in triplicate.
Table 1
The primer sequences of the 5 ERLs.

| lncRNA | F primer (5’-3’) | R primer (5’-3’) |
|---|---|---|
| AC023157.3 | GTCTGTTGTTTGTATGCTGAGTTC | TTGTCTGACCCAAGTGTTCG |
| AC099850.3 | AATATGGAAACAGGAACAGGAC | GGAAATCTCAAAACCCAAAGG |
| AL031985.3 | ACACCTATTCAACTTCCCCATT | CCAAGGATTCCCCTAAACATC |
| AL365203.2 | TTGCCTCATTTCATGGTCTG | GCCCCTGTTTTGATTCCTAT |
| CYTOR | TGGGAGATGAAACAGGAAGC | CAGACAAATGGGAAACCGAC |
Statistical analysis
Kaplan–Meier survival analysis by log-rank test was used to compare the survival status of patients with HCC between the high- and low-risk groups. P value <.05 was considered significant. Survival curves, ROC curves, and heat maps were plotted by the R platform. Statistical analysis was performed using SPSS version 22.0 (IBM Corporation, Armonk, NY, USA).
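The Kaplan–Meier estimate underlying the survival curves and median survival times can be sketched as below. This is a minimal Python illustration with hypothetical follow-up data, not the software used in the study; the function names are assumptions, and the log-rank comparison between groups is not shown.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    events[i] is 1 if the patient died at times[i] and 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


def median_survival(curve):
    """First time at which survival drops to 0.5 or below, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached during follow-up


# Hypothetical follow-up times in days and event indicators.
times = [5, 8, 8, 12, 15, 20]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
mst = median_survival(curve)
```

When the curve never falls to 0.5, the median cannot be estimated, which is one plausible reading of the NA entries in the MST columns of the tables.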
Results
Comprehensive gene annotation was downloaded from Ensembl Genomes (http://ensemblgenomes.org/, accessed November 5, 2017). Of the 5268 lncRNAs obtained from the RNA sequencing dataset, 2994 met the criterion of a mean expression value greater than 1. The heat map and volcano map of the DELs are shown in Figure 1. These lncRNAs were subjected to Pearson correlation analysis with the ERG expression data, and lncRNAs with a correlation coefficient R greater than 0.4 were used for univariate regression analysis.
Figure 1
The distribution of differential expression of lncRNA in HCC and normal tissues. Note: (A) Scatter diagram of differential expression of lncRNA in HCC and normal tissues, (B) Heat Map of differential expression of lncRNA in HCC and normal tissues.
Construction of the ERL-based prognostic signature
A univariate regression analysis was performed to examine the association between ERLs and the OS of patients with HCC. ERLs with a P value <.001 were regarded as prognostic ERLs. A total of 18 ERLs were identified (see Supplemental Digital Content Table S1, which lists the ERLs identified by Pearson correlation analysis, and Supplemental Digital Content Table S2, which lists the univariate survival analysis results of the ERLs). After the optimal combination was selected with the “step” function, the following 5 prognostic ERLs were used to build the prognosis model: AC023157.3, AC099850.3, AL031985.3, AL365203.2, and CYTOR. The Kaplan–Meier curves and expression scatter diagrams of the 5 prognostic ERLs are shown in Figure 2. In univariate analysis, each of the 5 prognostic ERLs divided HCC patients into a high-risk group and a low-risk group according to its expression level, and survival analysis showed that the high-risk group had a worse prognosis.
Figure 2
Kaplan–Meier curves of the survival time of the 5 prognostic ERLs in HCC. Note: The 5 prognostic ERLs include (A) AC023157.3, (B) AC099850.3, (C) AL031985.3, (D) AL365203.2, and (E) CYTOR.
In the multivariate risk model, the median survival time of the high-risk group was significantly shorter than that of the low-risk group (1,005 vs 3,125 days), and the risk of death was significantly increased (P value <.0001, hazard ratio [HR] = 2.400, 95% confidence interval [CI] = 1.667–3.454 for OS, Fig. 3A, B, D). Time-dependent ROC analysis showed that the model had good predictive ability: the areas under the curve at 1, 2, 3, and 5 years were 0.754, 0.720, 0.704, and 0.662, respectively (Fig. 3F). The 5 ERLs of the prognostic signature identified from Cox regression analysis are shown in Table 2. The clinical and pathologic characteristics of the HCC patients and the prognostic analysis are shown in Table 3, and the expression of the 5 ERLs in the high- and low-risk groups is shown in Figure 4.
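As a quick consistency check on the reported hazard ratio, the standard error and test statistic can be recovered from the confidence interval. The sketch below assumes a Wald-type interval that is symmetric on the log scale and a normal approximation for the P value; neither assumption is stated in the text, and the function name is hypothetical.

```python
import math


def wald_summary(hr, ci_low, ci_high, z_crit=1.959964):
    """Recover the log-scale standard error, z statistic, and two-sided
    P value from a hazard ratio and its 95% confidence interval."""
    log_hr = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_hr / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return se, z, p


se, z, p = wald_summary(2.400, 1.667, 3.454)
```

Under these assumptions the reported interval implies a standard error of about 0.19 on the log scale and a z statistic of about 4.7, which agrees with the reported P value <.0001.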
Figure 3
The risk score model analysis of the 5-ERL prognostic signature. Note: (A) The rank of risk score, (B) survival outcome, and (C) expression heat map of the 5 prognostic ERLs between the high- and low-risk groups. (D) Kaplan–Meier curves for the high- and low-risk groups. (E) Time-dependent ROC analysis based on the risk score of patients with HCC.
Table 2
The 5 ERLs of prognostic signature identified from Cox regression analysis.

| lncRNA symbol | Ensembl ID | Hazard ratio∗ | P value∗ | Coefficient† |
|---|---|---|---|---|
| AL365203.2 | ENSG00000273038 | 1.654 | <.001 | 0.211 |
| CYTOR | ENSG00000222041 | 1.330 | <.001 | 0.187 |
| AC023157.3 | ENSG00000276900 | 1.844 | <.001 | 0.487 |
| AL031985.3 | ENSG00000260920 | 3.204 | <.001 | 0.609 |
| AC099850.3 | ENSG00000265303 | 1.523 | <.001 | 0.198 |
Table 3
Clinical and pathologic characteristics of HCC patients and prognostic analysis.

| Variables | Level | Count of events/total (n = 370) | MST (days) | HR (95% CI) | P value |
|---|---|---|---|---|---|
| Age (yr) | ≤65 | 70/232 | 2456 | 1 | .160 |
| | >65 | 56/138 | 1490 | 1.288 (0.904–1.834) | |
| Gender | Female | 48/121 | 1560 | 1 | .362 |
| | Male | 78/249 | 2486 | 0.845 (0.588–1.214) | |
| Serum AFP (ng/mL)∗ | ≤400 | 60/213 | 2456 | 1 | .852 |
| | >400 | 21/64 | 2486 | 1.049 (0.633–1.738) | |
| Child-Pugh grade† | A | 57/216 | 2542 | 1 | .077 |
| | B / C | 9/22 | 1005 | 1.872 (0.924–3.795) | |
| Ishak fibrosis score‡ | 0 | 29/74 | 2456 | 1 | .847 |
| | 1 / 2 | 7/31 | 1791 | 0.757 (0.325–1.762) | |
| | 3 / 4 | 6/28 | NA | 0.686 (0.281–1.675) | |
| | 5 | 2/9 | 1386 | 0.720 (0.170–3.056) | |
| | 6 | 17/69 | NA | 0.750 (0.408–1.380) | |
| Tumor stage§ | I | 41/171 | 2532 | 1 | <.001 |
| | II | 25/85 | 1852 | 1.436 (0.871–2.369) | |
| | III / IV | 47/90 | 770 | 2.751 (1.803–4.198) | |
| Histologic grade\|\| | G1 | 18/55 | 2116 | 1 | .786 |
| | G2 | 58/177 | 1685 | 1.148 (0.676–1.950) | |
| | G3 | 41/121 | 1622 | 1.180 (0.676–2.060) | |
| | G4 | 5/12 | NA | 1.825 (0.648–5.140) | |
| MVI¶ | No | 59/206 | 2131 | 1 | .185 |
| | Yes | 34/108 | 2486 | 1.331 (0.870–2.034) | |
| Radical resection# | R0 | 106/323 | 2116 | 1 | .003 |
| | R1 / R2 / RX | 17/40 | 837 | 2.137 (1.276–3.581) | |
| Risk index | Low | 46/185 | 3125 | 1 | <.001 |
| | High | 80/185 | 1005 | 2.400 (1.667–3.454) | |
Figure 4
Expression levels of the 5 prognostic ERLs in high- and low-risk groups. ∗P < .05.
Stratified and joint effects analysis
Stratified and joint effects analysis was performed to assess the predictive power of the 5 prognostic ERLs for HCC under different clinical conditions. Compared with the low-risk group, the median survival time of the high-risk group was reduced to varying degrees, and in most clinical conditions, especially tumor grade, this reduction was independent of a favorable clinical grading. For AFP, the high- and low-risk groups showed no difference in prognosis when the AFP value was greater than 400. In the analysis, the group with a high risk score and unfavorable clinical factors showed a shorter survival time and a higher risk of death. The joint effects analysis showed that the 5 prognostic ERLs had good predictive value for clinical outcomes. Combined with clinical phenotypes, these 5 markers showed more accurate predictive ability in HCC (Table 4, Figs. 5 and 6).
Table 4
Joint effects survival analysis of clinicopathologic characteristics and the ERLs signature risk score in HCC patients.

| Characteristic | Groups | Risk | Variables | Count of events/total (n = 370) | MST (days) | HR (95% CI) | P value |
|---|---|---|---|---|---|---|---|
| Age (yr) | A1 | Low risk | ≤65 | 28/114 | NA | 1 | |
| | A2 | Low risk | >65 | 18/71 | 2131 | 0.940 (0.514–1.720) | .841 |
| | A3 | High risk | ≤65 | 42/118 | 2456 | 1.900 (1.175–3.073) | .009 |
| | A4 | High risk | >65 | 38/67 | 711 | 3.064 (1.872–5.015) | <.001 |
| Gender | G1 | Low risk | Female | 18/55 | 2116 | 1 | |
| | G2 | Low risk | Male | 28/130 | NA | 0.674 (0.369–1.233) | .201 |
| | G3 | High risk | Female | 30/66 | 1135 | 1.791 (0.995–3.225) | .052 |
| | G4 | High risk | Male | 50/119 | 837 | 1.868 (1.083–3.222) | .025 |
| Serum AFP (ng/mL)∗ | S1 | Low risk | ≤400 | 29/120 | 3125 | 1 | |
| | S2 | Low risk | >400 | 5/26 | NA | 0.623 (0.236–1.644) | .339 |
| | S3 | High risk | ≤400 | 31/93 | 2456 | 1.818 (1.088–3.037) | .022 |
| | S4 | High risk | >400 | 16/38 | 931 | 2.229 (1.199–4.143) | .011 |
| Child-Pugh grade† | C1 | Low risk | A | 25/125 | 3125 | 1 | |
| | C2 | Low risk | B/C | 5/13 | 1624 | 2.170 (0.826–5.698) | .116 |
| | C3 | High risk | A | 32/93 | 1694 | 2.216 (1.308–3.753) | .003 |
| | C4 | High risk | B/C | 4/9 | 612 | 4.660 (1.551–14.002) | .006 |
| Ishak fibrosis score‡ | I1 | Low risk | 0 | 10/40 | 3125 | 1 | |
| | I2 | Low risk | 1/2/3/4/5/6 | 18/82 | NA | 1.014 (0.454–2.265) | .974 |
| | I3 | High risk | 0 | 19/34 | 931 | 2.710 (1.255–5.848) | .011 |
| | I4 | High risk | 1/2/3/4/5/6 | 14/55 | NA | 1.675 (0.711–3.942) | .238 |
| Tumor stage§ | T1 | Low risk | I | 20/108 | NA | 1 | |
| | T2 | Low risk | II | 6/32 | NA | 1.109 (0.445–2.764) | .824 |
| | T3 | Low risk | III/IV | 14/32 | 1210 | 2.775 (1.389–5.543) | .004 |
| | T4 | High risk | I | 21/63 | 2456 | 2.313 (1.251–4.277) | .008 |
| | T5 | High risk | II | 19/53 | 1149 | 2.833 (1.501–5.349) | .001 |
| | T6 | High risk | III/IV | 33/58 | 558 | 4.951 (2.821–8.688) | <.001 |
| Histologic grade\|\| | H1 | Low risk | G1 | 10/38 | 2116 | 1 | |
| | H2 | Low risk | G2 | 24/97 | 3125 | 0.989 (0.470–2.081) | .977 |
| | H3 | Low risk | G3/G4 | 11/48 | NA | 0.897 (0.380–2.116) | .804 |
| | H4 | High risk | G1 | 8/17 | 2532 | 2.193 (0.848–5.671) | .105 |
| | H5 | High risk | G2 | 34/80 | 1005 | 2.186 (1.073–4.452) | .031 |
| | H6 | High risk | G3/G4 | 35/85 | 899 | 2.327 (1.144–4.734) | .020 |
| MVI\|\| | M1 | Low risk | No | 27/119 | 3125 | 1 | |
| | M2 | Low risk | Yes | 11/47 | NA | 1.394 (0.683–2.843) | .361 |
| | M3 | High risk | No | 32/87 | 1372 | 2.590 (1.526–4.396) | <.001 |
| | M4 | High risk | Yes | 23/61 | 1490 | 2.418 (1.380–4.238) | .002 |
| Radical resection | R1 | Low risk | R0 | 41/171 | 3125 | 1 | |
| | R2 | Low risk | R1 + R2 + RX | 4/10 | 837 | 3.999 (1.379–11.596) | .011 |
| | R3 | High risk | R0 | 65/152 | 1005 | 2.426 (1.638–3.594) | <.001 |
| | R4 | High risk | R1 + R2 + RX | 13/30 | 837 | 3.416 (1.816–6.425) | <.001 |
Figure 5
Joint effect survival analysis of survival time stratified by risk score and clinicopathologic characteristics. Note: (A) Age, (B) Gender, (C) Serum AFP, (D) Child-Pugh grade, (E) Ishak fibrosis score, (F) Tumor stage, (G) Histologic grade, (H) MVI, and (I) Radical resection.
Figure 6
Prognostic nomogram for predicting the 1-, 3-, and 5-year survival rates with risk score and clinicopathologic characteristics.
Principal component analysis
Principal component analysis was used to examine whether the high- and low-risk groups showed different EMT status based on the 5 prognostic ERLs, the complete ERG data set, and the complete ERL data set. The results showed that the high- and low-risk groups were generally distributed in different directions, and that the two groups were clearly separated for the ERLs. This indicates that EMT status differs between the high- and low-risk groups for the ERG and ERL sets, and that the difference is greater for the 5 prognostic ERLs (Fig. 7).
Figure 7
PCA between high- and low-risk groups. Note: (A) the 198 ERG set, (B) the 206 ERL set, (C) the 5 ERLs’ prognostic signature.
Bioinformatics analysis of function and pathway
Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were used to explore the biological functions and signaling pathways involving the 5 prognostic ERLs. A total of 29 ERGs related to the 5 prognostic ERLs were analyzed by GO and KEGG. GO analysis showed that these ERGs were significantly enriched in extracellular matrix changes, cell migration, and cell adhesion (Fig. 8A, B, C). This was confirmed by KEGG analysis, which showed that the extracellular matrix receptor interaction pathway and the PI3K-Akt signaling pathway were significantly enriched (Fig. 8D). The GSEA results for the high- and low-risk groups were mainly related to the cell cycle, chromosome remodeling, and repair processes (Fig. 9). Bioinformatics analysis thus showed that these prognostic ERLs had significant effects on the biological functions and pathways of EMT, cell cycle, and chromosome changes in HCC.
Figure 8
Functional and pathway enrichment analyses of the co-expressed ERGs of the 5 prognostic ERLs. Note: (A) result of biological process (BP), (B) result of cellular component (CC), (C) result of molecular function (MF), (D) result of Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways.
Figure 9
GSEA of C2 and C5 gene sets between high- and low-risk groups. Note: GSEA results of (A–E) C2 gene set and (F–I) C5 gene set.
Clinical validation of the expression levels of the 5 lncRNAs
We analyzed 6 pairs of HCC tissues and adjacent control tissues to verify the expression levels of the 5 lncRNAs. The results showed that all 5 lncRNAs were expressed at relatively high levels in tumor tissue (Fig. 10). Our experimental results are consistent with the data analysis.
Figure 10
Verification of lncRNA levels of 5 prognostic ERLs in 6 pairs of HCC tissues and adjacent tissues. Note: ∗∗∗ represents P < .001.
Discussion
Epithelial cells are usually very tightly connected and form an important defensive barrier of the body. Mesenchymal cells are adjacent to epithelial cells, are loosely organized, and lack cell polarity. EMT is a physiological process in which epithelial cells transform into mesenchymal cells, which enhances the invasion, metastasis, and anti-apoptosis abilities of various cells. This physiological process has 2 sides: different types of EMT can promote wound healing, tissue regeneration, and fibrosis, and EMT is also of great significance in embryonic development. Tumor cells can enhance their migration and invasion ability through EMT, invade blood vessels and lymphatic vessels, and metastasize to distant sites through the circulatory system. Early studies on biomarkers of EMT in HCC showed that the expression of E-cadherin was downregulated, which was significantly related to intrahepatic metastasis and capsule invasion of cancer cells. EMT is also associated with microvascular invasion in HCC; overexpression of the EMT related transcription factor FOXC1 promotes microvascular invasion of HCC.

LncRNAs can activate or inhibit tumor related signaling pathways, disrupt the dynamic balance of cells, and participate in the regulation of tumor proliferation, apoptosis, invasion, and metastasis by binding with mRNAs and proteins. EMT-inducing transcription factors mainly include Twist, Snail, Slug, and Zeb. These factors can directly or indirectly participate in the metastasis of cancer cells through different signaling cascades, including the Akt, signal transducer and activator of transcription 3 (STAT3), MAPK, and Wnt pathways, and ERLs interact with these EMT related factors to participate in EMT regulation. In recent years, many lncRNAs have been confirmed to promote the proliferation, invasion, and metastasis of HCC cells through positive regulation of the EMT process. LncRNA-HULC has been shown to act as a competitive endogenous RNA in HCC, promoting the progression and metastasis of HCC by influencing the regulation of Zeb-1 expression by miR-200a-3p. LncRNA HOTAIR enhances EMT through the HOTAIR-miR-23b-3p-Zeb-1 pathway and promotes the invasion and migration of hepatoma cells. The expression level of lncRNA-CCAT2 was found to be positively correlated with lymph node metastasis and vascular invasion; it regulated the EMT process induced by Snail2 and promoted the progression of HCC. In addition to influencing EMT-inducing transcription factors, lncRNAs also act on some important signaling pathways in EMT. LncRNA-OGFRP1 can promote the EMT process by regulating the Akt and Wnt/β-catenin signaling pathways, which can enhance the proliferation of hepatoma cells. Overexpression of lncRNA-n335586 in HBV related hepatocarcinoma can significantly promote the migration and invasion of hepatocarcinoma cells, and it can affect the EMT process through the lncRNA-n335586/miR-924/CKMT1A axis. As mentioned above, lncRNAs play an important regulatory role in the EMT process of HCC and also have important value in the treatment and prognosis prediction of HCC patients. Therefore, it is necessary to identify prognostic ERLs.

In this study, we used HCC data from the TCGA database. TCGA is a large comprehensive database containing high-throughput sequencing data, clinical characteristics, and other data. By using univariate regression analysis and Pearson correlation analysis, we identified 18 ERLs, and we finally selected 5 prognostic ERLs with the “step” function to build the HCC prognosis model. The ERLs showed good predictive performance in the model, and ROC analysis showed that the model had good accuracy in predicting survival over time. Among the 5 prognostic ERLs, AC099850.3 has been reported in previous studies as an important node of the lncRNA-miRNA-mRNA ceRNA network of tongue squamous cell carcinoma, where it is significantly related to the overall survival rate. Abnormally high expression of CYTOR has been found in many kinds of tumor tissues. Relevant research shows that CYTOR is involved in the pathological process of many cancers, such as tongue squamous cell carcinoma, breast cancer, gallbladder cancer, kidney cancer, hepatocellular carcinoma, and colon cancer. In breast cancer and tongue squamous cell carcinoma, CYTOR was found to be related to the overall survival time of patients, and in breast cancer it was also related to tumor recurrence. Some studies also show that CYTOR participates in the regulation of EMT, which is consistent with our results. The remaining 3 ERLs included in the prognosis model in this study have not been reported in the literature, and their specific mechanisms in the EMT of HCC need further experimental exploration. In addition, we verified by real-time quantitative polymerase chain reaction that the 5 ERLs were highly expressed in HCC tissues.

The results of the joint effects analysis of the prognostic ERLs indicate that they are important independent prognostic factors for HCC. We established a nomogram of the prediction model, and the results showed that the risk score based on these prognostic ERLs played a leading role in the prediction of HCC prognosis. Compared with most other clinical features, the prognostic ERLs showed better and more accurate predictive power.

Bioinformatics analysis of the 5 prognostic ERLs revealed the biological functions and signaling pathways in which they are involved in HCC. We used 29 prognosis related ERGs for GO and KEGG analysis, and the results were consistent with the prediction. These genes were mainly enriched in cell migration, adhesion, and extracellular matrix changes, which also supports the view that these prognostic lncRNAs mainly affect the prognosis of HCC patients by regulating and participating in the EMT process of HCC. Many lncRNAs have been found to participate in different biological functions of EMT. CCAT2 can promote EMT of HCC by regulating vimentin, E-cadherin, and the transcription factor Snail2. LncRNA ROR can promote EMT of HCC through the hypoxia/miR-145/ZEB2 signaling axis. LncRNA ATB not only promotes the upregulation of the EMT transcription factors ZEB1 and ZEB2 but also binds IL-11 mRNA and increases its stability, inducing autocrine IL-11 and triggering STAT3 signaling, which is involved in the migration of HCC cells. LncRNA HOTTIP can enhance the invasion and metastasis of HCC by inhibiting the expression of miR-125b, and SPRY4-IT1 can activate the mitogen-activated protein kinase (MAPK) signaling pathway and thereby affect apoptosis, proliferation, and metastasis. Most of these lncRNAs are directly or indirectly involved in the EMT process of HCC, and the enriched functions of the ERGs in this study are very similar to theirs. It is worth mentioning that the PI3K-Akt signaling pathway stands out in the KEGG analysis. Research shows that the PI3K-Akt signaling pathway can affect epidermal growth factor receptor (EGFR) and hepatocyte growth factor receptor (HGFR/cMet) signaling, which is of great significance for the migration of HCC cells. The GSEA results for the high- and low-risk groups showed that cell cycle and chromosome change functions were enriched, which was also reflected in the GO and KEGG results. In short, the functional analysis of the ERLs shows the relationship between ERLs and EMT, cell cycle changes, cell migration, and other functions related to the prognosis of HCC. The specific mechanisms need further experimental exploration.

In this study, we selected the optimal ERL combination to construct the prognosis model through the “step” function and then performed a stratified analysis of the prognostic model and clinical characteristics, which paid more attention to the value of these molecular markers in predicting the prognosis of patients with HCC than previous ERL research in HCC. This study still has several limitations. First, the HCC data from TCGA cannot represent the whole HCC population, and a single data source may introduce bias in the genetic data. Second, part of the clinical information in the TCGA database is missing, which leads to the lack of some clinical features in the joint effects analysis in this study. Despite these shortcomings, this study identified some ERLs with prognostic value in HCC, which is valuable for studying the EMT process and predicting the prognosis of HCC.
Conclusion
In this study, we conducted an intensive analysis of HCC data from TCGA. By matching with the Molecular Signatures Database, we obtained 18 ERLs, and we then used 5 prognostic ERLs to construct the HCC prognosis model and to perform the clinical feature correlation analysis. The results show that these prognostic markers have reliable independent predictive value. Bioinformatics analysis showed that these prognostic markers were involved in the regulation of EMT and in functions related to tumor occurrence and migration, which affect the prognosis of HCC patients. The 5 prognostic ERLs identified in this study can be used as potential biomarkers for studying the EMT process, predicting prognosis, and guiding the clinical treatment of HCC patients.