
Definition of a novel vascular invasion-associated multi-gene signature for predicting survival in patients with hepatocellular carcinoma.

Bo Yi1, Caixi Tang1, Yin Tao1, Zhijian Zhao1.   

Abstract

The aim of the present study was to identify a vascular invasion-associated gene signature for predicting prognosis in patients with hepatocellular carcinoma (HCC). Using RNA-sequencing data of 292 HCC samples from The Cancer Genome Atlas (TCGA), the present study screened differentially expressed genes (DEGs) between patients with and without vascular invasion. Feature genes were selected from the DEGs by a support vector machine (SVM)-based recursive feature elimination (RFE-SVM) algorithm to build a classifier. A multi-gene signature was then selected from these feature genes by an L1 penalized (LASSO) Cox proportional hazards (PH) regression model to develop a prognostic scoring model. The TCGA set was defined as the training set and was divided by the gene signature into a high-risk group and a low-risk group. The involvement of the DEGs between the two risk groups in pathways was also investigated. A total of 175 DEGs were identified between patients with and without vascular invasion in the training set. A classification model of 42 genes performed well in differentiating patients with and without vascular invasion on the training set and the validation set. A 14-gene prognostic model was built that could divide the training set or the validation set into two risk groups with significantly different survival outcomes. A total of 762 DEGs between the two risk groups of the training set were revealed to be significantly associated with a number of signaling pathways. The present study provided a 42-gene classifier for predicting vascular invasion, and identified a vascular invasion-associated 14-gene signature for predicting prognosis in patients with HCC. Several genes and pathways in HCC development are characterized and may be potential therapeutic targets for this type of cancer. Copyright: © Yi et al.

Keywords:  gene signature; hepatocellular carcinoma; overall survival; prognosis; vascular invasion

Year:  2019        PMID: 31897125      PMCID: PMC6923904          DOI: 10.3892/ol.2019.11072

Source DB:  PubMed          Journal:  Oncol Lett        ISSN: 1792-1074            Impact factor:   2.967


Introduction

Hepatocellular carcinoma (HCC) is a major type of primary liver cancer (1). The mortality rate is increasing, and patients with the tumor present with a poor prognosis (2). An increasing number of studies have demonstrated that vascular invasion is an adverse prognostic factor in HCC (3–5). Furthermore, vascular invasion is an independent predictive factor of long-term survival in patients with early-stage HCC, and is significantly associated with intrahepatic metastasis (6). Hence, distinguishing patients with HCC who present with vascular invasion from those who do not is important for improving survival time. A risk classification model of micro-vascular invasion based on histopathological features has been introduced for predicting the prognosis of patients with HCC (7). Differentially expressed genes (DEGs) in HCC tissue samples in the presence or absence of vascular invasion have been studied in order to extract multi-gene signatures for detecting vascular invasion (8,9). High-throughput technologies allow for the development of a classification model, wherein vascular invasion information can be derived from molecular features. The Cancer Genome Atlas (TCGA) provides comprehensive maps of genomic alterations in various types of cancer (https://portal.gdc.cancer.gov/). A recent study derived a 16-miRNA-based classifier from the analysis of micro (mi)RNA and mRNA expression data derived from TCGA, which could effectively identify vascular invasion and predict overall survival (OS) (10). These studies indicated the feasibility of multi-gene signatures for the prediction of cancer prognosis. Nevertheless, further work is needed to generate more reliable and accurate prognostic models based on feature genes of vascular invasion.
The present study analyzed HCC RNA-sequencing data from TCGA in order to identify feature genes using a recursive feature elimination (RFE) method (11), thus constructing a support vector machine (SVM) classifier for separating patients with vascular invasion from patients without vascular invasion. Furthermore, an L1 penalized (LASSO) Cox proportional hazards (PH) regression model was used to determine prognostic genes from the identified feature genes of vascular invasion so as to develop a prognostic scoring model. The performance of the classifier and the prognostic model was tested on an independent set. In addition, a functional analysis was performed in order to provide further insights into the molecular mechanisms underlying HCC.

Materials and methods

Data resource

The present study obtained the RNA-sequencing data of 373 HCC samples, generated on the Illumina HiSeq 2000 RNA Sequencing platform, from TCGA portal (downloaded October 18, 2018). Among these samples, 292 had clinical information on vascular invasion as well as survival information, including survival time and survival status, and were therefore selected as the training set (TCGA set). Furthermore, the GSE10141 (12) dataset, based on the GPL5474 Human 6k Transcriptionally Informative Gene Panel platform, was downloaded from Gene Expression Omnibus (GEO; http://www.ncbi.nlm.nih.gov/geo/) at the National Center for Biotechnology Information (NCBI; http://www.ncbi.nlm.nih.gov/); it includes the microarray gene expression data of 80 HCC tissue samples with survival information. Of these, 62 HCC samples had vascular invasion information and were selected as the validation set. The present study performed uni- and multivariate Cox regression analyses in order to analyze the associations between clinical factors and OS in the training set using survival package v2.44-1.1 (13) of R language (http://bioconductor.org/packages/survivalr/). Clinical factors that were significant (log-rank P<0.05) were used to classify the training set.

Differential expression analysis

Data from the TCGA and GEO databases were normalized using R software (version 3.4.1; http://www.r-project.org/). Following data normalization, the present study performed a differential gene expression analysis between HCC samples with and without vascular invasion in the training set using the limma (14) package (version 3.34.7; http://bioconductor.org/packages/release/bioc/html/limma.html) of R. Genes with a false discovery rate (FDR) <0.05 and |log2 FC|>0.263 were selected and subsequently underwent two-way hierarchical clustering based on the centered Pearson correlation (15) algorithm using the pheatmap package (16) (version 1.0.8) of R. The results were presented in a heatmap.
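The DEG cut-off described above can be illustrated with a minimal Python sketch. It is not the limma workflow used in the study: it assumes that the FDR is obtained by Benjamini-Hochberg adjustment (the text does not name the adjustment method), and the function names `benjamini_hochberg` and `select_degs` are hypothetical. A threshold of |log2 FC|>0.263 corresponds to a fold change of roughly 1.2 in either direction.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Assumption: the FDR in the text is a BH-adjusted p-value.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the
    # adjusted values stay monotone in the rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_degs(genes, log2_fc, p_values, fdr_cut=0.05, lfc_cut=0.263):
    """Keep genes with FDR < fdr_cut and |log2 FC| > lfc_cut."""
    fdr = benjamini_hochberg(p_values)
    return [
        gene
        for gene, lfc, q in zip(genes, log2_fc, fdr)
        if q < fdr_cut and abs(lfc) > lfc_cut
    ]


# Toy input (hypothetical gene names and values, for illustration only).
toy_genes = ["g1", "g2", "g3", "g4"]
toy_lfc = [0.5, -0.4, 0.1, 0.3]
toy_p = [0.001, 0.01, 0.02, 0.5]
toy_degs = select_degs(toy_genes, toy_lfc, toy_p)
```

In the toy input, `g3` is dropped because its fold change is too small, and `g4` because its adjusted P-value is too large.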

Development of an SVM classifier

The present study initially performed a Cox regression analysis to investigate the associations between the identified DEGs and OS. From the significant DEGs with log-rank P<0.05, the present study then identified the optimal combination of feature genes using an RFE (17) algorithm in the caret (18) package (version 6.0–79; https://cran.r-project.org/web/packages/caret) of R language, which was then used to develop an SVM classifier using the SVM (19) function with a sigmoid kernel. In both the training set and the validation set, the robustness of the established SVM classifier was evaluated using the concordance index (C-index) (20), Brier score (21), log-rank P-value of Cox-PH regression, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) and area under the receiver operating characteristic curve (AUROC). The C-index and Brier score, two metrics for assessing accuracy, were calculated using the survcomp version 3.9 (22) package (http://www.bioconductor.org/packages/release/bioc/html/survcomp.html) of R language (version 3.4.1). The Kaplan-Meier estimate was applied to depict survival time using the survival package in R language. The log-rank P-value for the difference in OS time between the two groups was calculated. AUROC ranged from 0.5 to 1, with a higher value implying better performance. Sensitivity, specificity, PPV and NPV of ROC curves were computed using the pROC v1.15.3 (23) package of R language (https://cran.r-project.org/web/packages/pROC/index.html).
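To make the C-index concrete, the following is a minimal Python sketch of Harrell's concordance index. It is a simplified illustration rather than the survcomp implementation used in the study; the function name `concordance_index` and the toy data are hypothetical, and tied survival times are simply treated as non-comparable.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    A pair (i, j) is comparable when sample i has an observed event
    (events[i] == 1) strictly before the follow-up time of sample j.
    The pair is concordant when the sample failing first has the higher
    predicted risk; ties in predicted risk count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): the third sample is censored (event = 0).
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
perfect = concordance_index(toy_times, toy_events, [0.9, 0.7, 0.5, 0.1])
reversed_order = concordance_index(toy_times, toy_events, [0.1, 0.5, 0.7, 0.9])
```

A value of 1 means that every comparable pair is ranked correctly, 0.5 corresponds to random ranking and 0 to a perfectly inverted ranking.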

Development and validation of a prognostic scoring model

The present study further utilized the feature genes to fit a LASSO Cox-PH regression model (24) in order to determine the optimal panel of genes for prognosis using the penalized package (v0.9-51) of R language. Based on the Cox-PH regression coefficients and expression levels of the identified optimal genes, a prognostic scoring model was built using the following formula: Risk score = Σ (CoefDEGs × ExpDEGs), where CoefDEGs represents the Cox-PH regression coefficients of the DEGs and ExpDEGs represents the expression levels of the DEGs. A risk score was calculated for each sample in the training set. Samples in the training set were then split into a high-risk group and a low-risk group according to the median risk score (0.0663803). Kaplan-Meier survival curves were plotted for both risk groups using the survival package (version 2.41-1) of R language, and the OS of the two groups was compared by log-rank test. Similarly, samples in the validation set were divided into a high-risk group and a low-risk group using the median risk score of the validation set (0.132434) so as to test the prognostic ability of the prognostic scoring model in this set. The present study further validated the results by using SurvExpress, which is an online biomarker validation tool for cancer gene expression data (25). A total of four datasets, including GSE10143 (12), GSE10186 (26), TCGA-Liver-cancer and LIHC-TCGA-Liver HCC were included into SurvExpress.
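The log-rank comparison of two risk groups can be sketched in Python as follows. This is a simplified two-group log-rank test and not the R survival package routine used in the study; the function name `logrank_test` and the toy follow-up times are hypothetical. The P-value uses the chi-square distribution with one degree of freedom, whose upper tail equals erfc(sqrt(x/2)).

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, P-value)."""
    samples = [(t, e, "a") for t, e in zip(times_a, events_a)]
    samples += [(t, e, "b") for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in samples if e == 1})

    observed_a = 0.0  # observed events in group a
    expected_a = 0.0  # events expected in group a under the null hypothesis
    variance = 0.0    # hypergeometric variance summed over event times
    for t in event_times:
        n_a = sum(1 for s in samples if s[0] >= t and s[2] == "a")
        n_b = sum(1 for s in samples if s[0] >= t and s[2] == "b")
        d_a = sum(1 for s in samples if s[0] == t and s[1] == 1 and s[2] == "a")
        d_b = sum(1 for s in samples if s[0] == t and s[1] == 1 and s[2] == "b")
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0.0:
        raise ValueError("variance is zero; groups cannot be compared")
    chi2 = (observed_a - expected_a) ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Toy example (hypothetical): every event in group a precedes those in group b.
chi2_sep, p_sep = logrank_test([1, 2, 3, 4, 5], [1] * 5, [6, 7, 8, 9, 10], [1] * 5)
# Identical groups give a statistic of 0 and a P-value of 1.
chi2_same, p_same = logrank_test([1, 2, 3], [1, 1, 1], [1, 2, 3], [1, 1, 1])
```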

Stratified analysis

In both the high and low-risk groups of the training set, the present study investigated the associations between clinical factors and OS by performing a Cox regression analysis with the survival package in R language (version 2.41-1).

Functional analysis

The cases in the training set were divided into high- and low-risk groups according to the risk score of the gene signature. The present study then screened for DEGs between the two risk groups using a cut-off of FDR<0.05 and |log2FC|>0.263. The significant DEGs were selected for the pathway enrichment analysis using Gene Set Enrichment Analysis (27) (GSEA, version 3.0; http://software.broadinstitute.org/gsea/index.jsp). P<0.05 was considered to indicate a statistically significant result.

Results

Vascular invasion is an independent predictor of prognosis

The present study performed uni- and multivariate Cox regression analyses in order to analyze the associations between clinical factors and OS in the training set using the survival package in R language. As presented in Table I, vascular invasion and pathological M stage (28) were identified as independent predictors of prognosis in the univariate and multivariate analyses (P<0.05). However, there were only three samples at pathological M1 stage, which was too few to accurately assess the prognostic value of pathological M stage. Therefore, the present study classified all samples of the training set into two groups according to vascular invasion. Patients without vascular invasion (n=190) had a significantly longer survival time compared with patients with vascular invasion (n=102; P=8.609×10−3; Fig. 1).
Table I.

Uni- and multivariate Cox regression analysis of the training set.

| Clinical characteristics | TCGA (n=292) | Univariable Cox HR (95% CI) | Univariable Cox P-value | Multivariable Cox HR (95% CI) | Multivariable Cox P-value |
|---|---|---|---|---|---|
| Age, years, mean ± SD | 59.85±12.92 | 1.017 (0.999–1.035) | 0.051 | | |
| Sex (male/female) | 194/98 | 0.731 (0.474–1.127) | 0.154 | | |
| Pathological M (M0/M1/-) (28) | 220/3/69 | 4.91 (1.523–15.84) | 0.003 | 3.848 (1.089–13.588) | 0.036 |
| Pathological N (N0/N1/-) | 210/2/80 | 1.602 (0.221–2.610) | 0.638 | | |
| Pathological T (T1/T2/T3/T4/-) | 160/75/48/8/1 | 1.538 (1.23–1.923) | <0.001 | 0.607 (0.217–1.699) | 0.342 |
| Pathological stage (I/II/III/IV/-) | 150/72/48/4/18 | 1.473 (1.153–1.881) | 0.003 | 2.177 (0.797–5.944) | 0.129 |
| Histological grade (G1/G2/G3/G4/-) | 36/141/101/12/2 | 1.19 (0.889–1.593) | 0.243 | | |
| Virus infection (HBV/HCV/Mixed/-) | 50/10/35/197 | 1.167 (0.801–1.702) | 0.420 | | |
| Vascular invasion (yes/no) | 102/190 | 1.353 (1.087–2.098) | 0.009 | 1.678 (1.195–2.962) | 0.037 |
| Recurrence (yes/no/-) | 119/156/17 | 1.343 (0.843–2.141) | 0.213 | | |
| Status (dead/alive) | 87/205 | | | | |
| Overall survival time, months, mean ± SD | 26.52±24.43 | | | | |

TCGA, The Cancer Genome Atlas; SD, standard deviation; M, metastasis; N, node; T, tumor; HBV, hepatitis B virus; HCV, hepatitis C virus; HR, hazard ratio; CI, confidence interval; -, information unavailable.

Figure 1.

Kaplan-Meier curves of overall survival time of patients with or without vascular invasion of The Cancer Genome Atlas set. HR, hazard ratio.

DEGs were screened between patients with and without vascular invasion

Following the removal of genes with a median expression level of 0, a total of 13,812 genes were inputted into the limma package. Among them, 175 significant DEGs between patients with and without vascular invasion in the training set satisfied the cut-off threshold (FDR<0.05 and |log2FC|>0.263; Table SI), consisting of 62 (35.43%) downregulated genes and 113 (64.57%) upregulated genes in the HCC samples with vascular invasion (Fig. 2A-C).
Figure 2.

DEGs between patients with and without vascular invasion in the training set. (A) Volcano plot of 175 DEGs. Green spots represent DEGs; the red horizontal dashed line indicates FDR <0.05; the two red vertical dashed lines indicate |log2FC|>0.263. (B) Kernel density plot of log2 (FC) of 175 DEGs. (C) Heatmap for two-way hierarchical clustering of samples based on expression of DEGs. Red and green represent upregulated and downregulated genes, respectively. DEGs, differentially expressed genes; FDR, false discovery rate; FC, fold change.

SVM analysis

Of the aforementioned 175 DEGs, 51 were significantly associated with OS (log-rank P<0.05) in the Cox regression analysis (Table SII). For the purpose of obtaining the optimal feature genes for predicting vascular invasion in HCC, the present study utilized an SVM-RFE algorithm based on the 51 prognosis-associated genes. Maximal prediction accuracy (0.873) (Fig. 3A) and minimal root-mean-square error (0.1038) (Fig. 3B) were reached when using a 42-gene combination (Table II).
Figure 3.

The (A) Accuracy and (B) RMSE curves of the optimal gene combination screened by recursive feature elimination algorithm. The horizontal axis represents the number of gene variables, and the vertical axis represents cross-validation accuracy and RMSE, and the marked position is the number of genes corresponding to the optimal value. Performance of the 42-gene classifier on (C) the training set and (D) the validation set. Left images: Scatter plots presenting the prediction results by the 42-gene classifier. Black round spots represent samples from patients without vascular invasion; red triangles represent samples from patients with vascular invasion. Right images: Confusion matrix for the classification results. The X and Y axes represent the coordinates corresponding to the position in a two-dimensional plane generated by SVM. The top-left corner represents true positive rate (number), the top-right corner represents false negative rate (number), the left bottom represents false positive rate (number) and the right bottom represents true negative rate (number). SVM, support vector machine; RMSE, root-mean-square error.

Table II.

Combination of 42 genes.

| Gene | logFC | P-value | FDR |
|---|---|---|---|
| DNMT3L | −0.457857972 | 4.250×10−05 | 0.000344393 |
| WNT1 | −0.440653373 | 0.00229823 | 0.018624233 |
| AVPR2 | −0.337010196 | 0.000130506 | 0.001057586 |
| CRYAA | −0.327349605 | 5.220×10−05 | 0.000423239 |
| ADRA1A | −0.323457976 | 0.000132 | 0.001069691 |
| RERGL | −0.307031974 | 0.00027205 | 0.002204622 |
| HSD17B13 | −0.303883897 | 4.350×10−05 | 0.00035246 |
| CRHBP | −0.282544406 | 0.000378787 | 0.003069588 |
| GPR17 | −0.27487125 | 0.001557011 | 0.012617592 |
| AP1M2 | 0.265012128 | 0.002298097 | 0.018623151 |
| CCDC74B | 0.26607111 | 0.005538491 | 0.04488242 |
| EPHX4 | 0.273106635 | 0.001616394 | 0.013098814 |
| MYLK2 | 0.277944797 | 0.001898397 | 0.015384094 |
| S100P | 0.280211942 | 0.000796024 | 0.006450761 |
| SCIN | 0.286745667 | 0.001401359 | 0.011356228 |
| GULP1 | 0.293405465 | 0.002064432 | 0.016729591 |
| TMC5 | 0.304348871 | 0.001717824 | 0.013920779 |
| HOXD9 | 0.327961519 | 4.660×10−05 | 0.000377344 |
| DHDH | 0.331147822 | 0.001303337 | 0.01056189 |
| RUNDC3A | 0.344356975 | 0.001049184 | 0.0085023 |
| FXYD3 | 0.347111205 | 0.002610568 | 0.021155333 |
| FAM90A1 | 0.349492054 | 0.001789546 | 0.014501995 |
| POF1B | 0.353413663 | 0.00098377 | 0.007972208 |
| FAM163A | 0.357671188 | 0.001474436 | 0.01194843 |
| KCNN1 | 0.365217375 | 0.001202203 | 0.009742322 |
| TFAP2A | 0.365567331 | 6.750×10−05 | 0.000547399 |
| COL24A1 | 0.382367663 | 0.002049211 | 0.016606245 |
| DIRAS2 | 0.405965625 | 0.000995196 | 0.0080648 |
| FRMD1 | 0.411164402 | 0.004146525 | 0.033602313 |
| EPO | 0.413544952 | 0.000992878 | 0.008046009 |
| USH1C | 0.417142972 | 0.000668281 | 0.005415564 |
| CA9 | 0.422098465 | 0.001719337 | 0.013933041 |
| ART5 | 0.423955728 | 0.005437747 | 0.044066018 |
| MMP12 | 0.43064025 | 0.000852896 | 0.006911633 |
| TRIM54 | 0.438512907 | 0.001013864 | 0.008216081 |
| PPFIA4 | 0.467076549 | 5.000×10−05 | 0.000405366 |
| SLC35F3 | 0.503285506 | 0.002228769 | 0.018061337 |
| ELOVL3 | 0.524990121 | 0.000117912 | 0.00095553 |
| NPTX1 | 0.532157704 | 0.001637864 | 0.013272803 |
| ZNF695 | 0.601449278 | 0.000219446 | 0.001778327 |
| HOXD10 | 0.633083055 | 2.580×10−05 | 0.000209174 |
| PPP2R2C | 0.685073697 | 1.090×10−05 | 8.810×10−05 |

FC, fold change; FDR, false discovery rate.

The SVM classifier was built with the 42-gene combination and its performance was assessed in both the training set and the validation set. A scatter plot and confusion matrix for the training set or the validation set classified by the classifier are presented in Fig. 3C and D. Table III demonstrates that both sets generated high C-index scores (>0.75), low Brier scores (<0.1) and significant average log-rank P-values (2.97×10−08; 0.0264) in OS difference between the patients with and without vascular invasion (Fig. 4). AUROC of the two sets were 0.970 and 0.942, respectively (Table III; Fig. 4). The sensitivity, specificity, PPV and NPV values are presented in Table III. These results suggest that the SVM classifier was able to classify the samples effectively.
Table III.

Performances of the SVM classifier on the training and validation sets.

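| Datasets | C-index (overall survival) | Brier score (overall survival) | Log-rank P-value (overall survival) | AUROC (ROC curve) | Sensitivity (ROC curve) | Specificity (ROC curve) | PPV (ROC curve) | NPV (ROC curve) |
|---|---|---|---|---|---|---|---|---|
| Training set (TCGA, N=292) | 0.814 | 0.0394 | <0.0001 | 0.970 | 0.814 | 0.926 | 0.856 | 0.903 |
| Validation set (GSE10141, N=62) | 0.757 | 0.0884 | 0.0264 | 0.942 | 0.824 | 0.889 | 0.737 | 0.930 |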

SVM, support vector machine; TCGA, The Cancer Genome Atlas; ROC, receiver operating characteristic curve; AUROC, area under receiver operating characteristic curve; PPV, positive predictive value; NPV, negative predictive value.
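The relationship between a confusion matrix and the sensitivity, specificity, PPV and NPV reported in Table III can be sketched in Python as below. The counts are hypothetical: they were chosen for illustration so that the group sizes match the training set (102 patients with and 190 without vascular invasion) and the rounded metrics agree with the training-set row of Table III. They are not taken from the confusion matrix of the study, and the function name `classification_metrics` is likewise an assumption.

```python
def classification_metrics(tp, fn, fp, tn):
    """Compute standard binary classification metrics from raw counts.

    tp/fn: patients with vascular invasion predicted as positive/negative.
    fp/tn: patients without vascular invasion predicted as positive/negative.
    """
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Hypothetical counts (illustration only): 83 + 19 = 102 patients with
# vascular invasion and 14 + 176 = 190 patients without.
metrics = classification_metrics(tp=83, fn=19, fp=14, tn=176)
rounded = {name: round(value, 3) for name, value in metrics.items()}
```

Because PPV and NPV depend on the proportion of positive cases, they are not directly comparable between datasets with different vascular invasion rates, whereas sensitivity and specificity are.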

Figure 4.

Kaplan-Meier and receiver operating characteristic curves for (A) the training set and (B) the validation set classified by the 42-gene classifier. TCGA, The Cancer Genome Atlas; HR, hazard ratio; AUC, area under the curve.

Prognostic model based on a 14-gene signature

The present study also used the 42 feature genes to create a LASSO Cox-PH regression model. When the maximal value of cross-validation likelihood (−498.517) was achieved, the optimal lambda value was 13.049, and the optimal panel of 14 genes was obtained (Table IV), including Wnt family member 1 (WNT1), crystallin α A (CRYAA), RAS like estrogen regulated growth inhibitor like (RERGL), hydroxysteroid 17-Beta dehydrogenase 13 (HSD17B13), scinderin (SCIN), premature ovarian failure (POF)1B, erythropoietin (EPO), USH1 protein network component harmonin (USH1C), ADP-ribosyltransferase 5 (ART5), matrix metalloproteinase (MMP)12, tripartite motif containing 54 (TRIM54), solute carrier family 35 member F3 (SLC35F3), homeobox D (HOXD)10 and protein phosphatase 2 regulatory subunit Bgamma (PPP2R2C). The following results were obtained using the risk score formula:
Table IV.

Prognostic signature with 14 genes.

| Gene | Coefficient | Hazard ratio (95% CI) | P-value |
|---|---|---|---|
| WNT1 | −0.2500 | 0.602 (0.459–0.789) | 2.400×10−04 |
| CRYAA | −0.0002 | 0.108 (0.0092–0.493) | 4.963×10−02 |
| RERGL | −0.0263 | 0.463 (0.244–0.854) | 4.533×10−02 |
| HSD17B13 | −0.0153 | 0.586 (0.176–0.906) | 4.688×10−02 |
| SCIN | 0.0852 | 1.115 (1.086–1.267) | 4.939×10−02 |
| POF1B | 0.0756 | 1.085 (1.001–1.178) | 1.513×10−02 |
| EPO | 0.0616 | 1.068 (1.013–1.152) | 4.897×10−02 |
| USH1C | 0.0106 | 1.043 (1.001–1.071) | 4.897×10−02 |
| ART5 | 0.0134 | 1.047 (1.035–1.171) | 4.231×10−02 |
| MMP12 | 0.0236 | 1.051 (1.048–1.165) | 3.410×10−02 |
| TRIM54 | 0.0454 | 1.059 (1.028–1.164) | 2.392×10−02 |
| SLC35F3 | 0.0124 | 1.057 (1.029–1.203) | 3.974×10−02 |
| HOXD10 | 0.1010 | 1.448 (1.127–1.924) | 3.069×10−03 |
| PPP2R2C | 0.0047 | 1.004 (1.002–1.085) | 4.926×10−01 |

CI, confidence interval.

Risk score = (−0.2500) × ExpWNT1 + (−0.0002) × ExpCRYAA + (−0.0263) × ExpRERGL + (−0.0153) × ExpHSD17B13 + (0.0852) × ExpSCIN + (0.0756) × ExpPOF1B + (0.0616) × ExpEPO + (0.0106) × ExpUSH1C + (0.0134) × ExpART5 + (0.0236) × ExpMMP12 + (0.0454) × ExpTRIM54 + (0.0124) × ExpSLC35F3 + (0.1010) × ExpHOXD10 + (0.0047) × ExpPPP2R2C. Based on the median risk score, all samples of the training set were divided into a high-risk group (n=146) and a low-risk group (n=146). As presented in Fig. 5A, the OS time was significantly different between the two risk groups (P=1.062×10−08), with an AUC value of 0.959. The OS time was also significantly different between the high-risk group (n=40) and the low-risk group (n=40) in the validation set (P=0.0250), with an AUC value of 0.917 (Fig. 5B). These observations support the predictive robustness of the 14-gene signature.
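The risk score and the median split can be written as a short Python sketch. The coefficients are those listed in the formula above; the function names `risk_score` and `split_by_median`, the sample identifiers and the toy values are hypothetical. Assigning samples with a score strictly above the median to the high-risk group is an assumption, since the text does not state how samples exactly at the median were handled.

```python
from statistics import median

# Cox-PH regression coefficients of the 14-gene signature (from the formula above).
COEFFICIENTS = {
    "WNT1": -0.2500, "CRYAA": -0.0002, "RERGL": -0.0263, "HSD17B13": -0.0153,
    "SCIN": 0.0852, "POF1B": 0.0756, "EPO": 0.0616, "USH1C": 0.0106,
    "ART5": 0.0134, "MMP12": 0.0236, "TRIM54": 0.0454, "SLC35F3": 0.0124,
    "HOXD10": 0.1010, "PPP2R2C": 0.0047,
}


def risk_score(expression):
    """Sum of coefficient x expression level over the 14 signature genes.

    `expression` maps gene symbol to the expression level of one sample.
    """
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


def split_by_median(scores):
    """Label each sample 'high' or 'low' relative to the median score.

    Assumption: scores strictly above the median are labeled 'high'.
    """
    cutoff = median(scores.values())
    return {
        sample: ("high" if score > cutoff else "low")
        for sample, score in scores.items()
    }


# Toy usage (hypothetical values): with every gene at expression level 1.0,
# the score reduces to the sum of the coefficients.
unit_score = risk_score({gene: 1.0 for gene in COEFFICIENTS})
groups = split_by_median({"s1": 0.1, "s2": 0.2, "s3": 0.3, "s4": 0.4})
```

Because the cut-off is the median of each dataset, the training and validation sets use different thresholds, as reported in the Materials and methods section.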
Figure 5.

Kaplan-Meier and receiver operating characteristic curves for the training set (A) and the validation set (B) divided by the 14-gene signature. TCGA, The Cancer Genome Atlas; HR, hazard ratio; AUC, area under the curve.

For further validation, five HCC datasets available in SurvExpress, including GSE10143, GSE17856, GSE10186, TCGA-Liver-Cancer and LIHC-TCGA-Liver HCC, were examined. The 51 screened candidate genes were inputted, and the results revealed that the OS times were significantly different between the high-risk group and the low-risk group in GSE10143, GSE10186, TCGA-Liver-cancer and LIHC-TCGA-Liver HCC (Fig. 6). This result supported the reliability of the gene signature.
Figure 6.

Validation analysis of the gene signature by SurvExpress. (A) GSE10143, (B) GSE10186, (C) LIHC-TCGA-Liver hepatocellular carcinoma and (D) TCGA-liver cancer datasets. CI, confidence interval.

The present study further investigated the associations between the clinical factors and OS in the low-risk group and the high-risk group of the training set by performing Cox regression analyses. Vascular invasion was significantly associated with OS time in both risk groups (P=0.034 and P=1.50×10−05, respectively; Table V; Fig. 7).
Table V.

Results of Cox regression analysis for the high- and low-risk groups of The Cancer Genome Atlas set.

| Clinical characteristics | Low risk group HR (95% CI) | Low risk group P-value | High risk group HR (95% CI) | High risk group P-value |
|---|---|---|---|---|
| Age, years, mean ± SD | 1.018 (0.986–1.052) | 0.273 | 1.012 (0.992–1.032) | 0.257 |
| Sex (male/female) | 0.568 (0.246–1.308) | 0.178 | 1.129 (0.676–1.886) | 0.643 |
| Pathological M (M0/M1/-) (28) | 8.721 (1.090–69.77) | 0.138 | 3.227 (0.770–13.520) | 0.090 |
| Pathological N (N0/N1/-) | 3.01 (1.052–33.22) | 0.763 | 1.429 (0.195–10.490) | 0.724 |
| Pathological T (T1/T2/T3/T4/-) | 1.527 (0.955–2.443) | 0.072 | 1.240 (0.914–1.681) | 0.165 |
| Pathological stage (I/II/III/IV/-) | 1.346 (0.805–2.252) | 0.254 | 1.215 (0.885–1.668) | 0.228 |
| Histological grade (G1/G2/G3/G4/-) | 1.115 (0.630–1.971) | 0.709 | 0.916 (0.642–1.307) | 0.629 |
| Virus infection (HBV/HCV/Mixed/-) | 2.333 (1.962–5.655) | 0.038 | 0.932 (0.613–1.416) | 0.741 |
| Vascular invasion (yes/no) | 2.478 (1.044–5.885) | 0.034 | 3.446 (1.913–6.209) | <0.001 |
| Recurrence (yes/no/-) | 1.569 (0.670–3.672) | 0.296 | 0.924 (0.526–1.623) | 0.783 |

M, metastasis; T, tumor; N, node; HBV, hepatitis B virus; HCV, hepatitis C virus; HR, hazard ratio; CI, confidence interval.

Figure 7.

Kaplan-Meier and receiver operating characteristic curves for patients with and without vascular invasion in (A) the low-risk group and (B) the high-risk group of the training set.

Identification and pathway analysis of DEGs between the two risk groups in the training set

In the training set, 599 upregulated genes and 163 downregulated genes were identified in the high-risk group compared with the low-risk group. These genes were significantly involved in the pathways of ‘retinol metabolism’, ‘drug metabolism other enzymes’, ‘drug metabolism cytochrome P450’, ‘peroxisome proliferator-activated receptor (PPAR) signaling pathway’, ‘primary bile acid biosynthesis’, ‘steroid hormone biosynthesis’ and ‘histidine metabolism’ (Table VI).
Table VI.

Significant signaling pathways.

| Pathway | ES | NES | Normal P-value | FDR | Count | Gene |
|---|---|---|---|---|---|---|
| Retinol metabolism | −0.7987 | −2.3043 | 0 | 0 | 6 | CYP4A22, CYP26A1, CYP3A43, CYP2A7, CYP2A6, CYP2A13 |
| Drug metabolism other enzymes | −0.9022 | −2.2834 | 0 | 0 | 4 | CYP3A43, CYP2A7, CYP2A6, CYP2A13 |
| Drug metabolism cytochrome P450 | −0.9011 | −2.0480 | 0 | 0.0047 | 4 | CYP3A43, CYP2A7, CYP2A6, CYP2A13 |
| PPAR signaling pathway | −0.7106 | −1.9354 | 0.0026 | 0.0121 | 3 | CYP4A22, CYP8B1, ACADL |
| Primary bile acid biosynthesis | −0.9631 | −1.9038 | 0 | 0.0124 | 3 | CYP8B1, AKR1D1, CYP7A1 |
| Steroid hormone biosynthesis | −0.5989 | −1.8162 | 0.0084 | 0.0188 | 6 | AKR1D1, CYP7A1, HSD3B2, HSD3B1, CYP11A1, CYP3A43 |
| Histidine metabolism | −0.8709 | −1.6875 | 0.01 | 0.0468 | 3 | HDC, CNDP1, UROC1 |

ES, enrichment score; NES, normalized enrichment score; count of genes, the number of genes enriched in a pathway; FDR, false discovery rate.

Discussion

HCC is an aggressive malignancy characterized by high incidence rates of recurrence and metastasis (29). Vascular invasion is an unfavorable prognostic factor for patients with HCC (30). Therefore, unraveling the underlying molecular landscape of vascular invasion is of significance for the prognosis of HCC. In the present study, a total of 175 DEGs were identified between patients with and without vascular invasion. An SVM classifier consisting of 42 feature genes was built by implementing an RFE-SVM algorithm. In both the training and validation sets, the classifier had high C-index values, low Brier scores and significant log-rank P-values, indicating good performance in separating patients with vascular invasion from patients without vascular invasion. Furthermore, using a LASSO Cox-PH model, a 14-gene prognostic signature was obtained and, consequently, a prognostic scoring model was established. The 14-gene signature was able to predict those patients with HCC that would have a shorter survival time, as evidenced by the result that OS time was significantly different between the predicted high-risk patients and the predicted low-risk patients. The prognostic performance of the 14-gene signature was successfully confirmed in the validation set. The 14-gene prognostic combination included WNT1, CRYAA, RERGL, HSD17B13, SCIN, POF1B, EPO, USH1C, ART5, MMP12, TRIM54, SLC35F3, HOXD10 and PPP2R2C. Proto-oncogene protein Wnt-1, encoded by the WNT1 gene, has been demonstrated to be upregulated in HCC, acting as a direct target of miR-122 (31). RERGL is a member of the RAS superfamily of GTPases that participates in regulating several biological processes, such as cell proliferation, differentiation and apoptosis (32). The protein encoded by HSD17B13, namely 17β-HSD type 13, is downregulated in HCC (33). There is evidence to suggest that HSD17B13 suppresses HCC progression by delaying the G1/S phase transition of HCC cells (34).
Furthermore, HSD17B13 is a novel liver-specific protein associated with lipid droplets, and may be a promising biomarker of liver cancer (35). SCIN encodes scinderin, which is an actin-severing protein of the gelsolin superfamily. It acts as a regulator of HCC cell apoptosis and growth, and has been identified as a transcriptional target of the tumor suppressor factor breast cancer metastasis-suppressor 1 (36). It has long been established that the EPO/EPO-receptor plays an important role in angiogenesis and progression of HCC (37). EPO protein expression is positively correlated with vasculogenic mimicry in HCC, and has been identified as an independent predictor of prognosis in patients with HCC (38). Furthermore, EPO is upregulated in HCC and could promote HCC cell proliferation through translocation of its specific receptor induced by hypoxia (39). MMP12 belongs to the MMP family implicated in the degradation of the extracellular matrix. It is upregulated in HCC and is an independent predictive factor for OS in patients with HCC (40,41). TRIM54 is a member of the TRIM protein family. Several members of the TRIM family have been reported to be involved in biological processes, such as cell proliferation, differentiation and apoptosis, and may play a role in cancer initiation and progression (42). However, to the best of our knowledge, TRIM54 has not previously been reported in HCC. HOXD10, a member of the Abd-B homeobox family, exhibits decreased expression levels in HCC and serves as a tumor-suppressor gene by inhibiting extracellular signal-regulated kinase signaling (43). PPP2R2C encodes serine/threonine-protein phosphatase 2A 55 kDa regulatory subunit B γ isoform, and has been identified as upregulated in HCC (44). To the best of our knowledge, few studies have focused on the function of CRYAA, RERGL, POF1B, USH1C, TRIM54 and SLC35F3 in HCC.
The results of the present study indicate that the 14 vascular invasion-associated genes may be prognostic biomarkers of HCC. Another aim of the present study was identifying the potential roles of DEGs between the high- and low-risk groups of the training set. There were 762 DEGs between the two risk groups, which were significantly involved in a number of signaling pathways, such as ‘retinol metabolism’, ‘drug metabolism cytochrome P450’, and ‘PPAR signaling pathway’. The association between retinol metabolism and HCC has been demonstrated previously and a synthetic retinoid has been indicated to prevent HCC recurrence (45). Drug-metabolizing cytochrome P450 enzyme activities are severely disrupted in HCC (46). The PPAR signaling pathway plays a part in tumorigenesis and tumor progression via different metabolic pathways: Glycolysis/gluconeogenesis, lipid, glycerolipid and glycerophospholipid metabolism, protein synthesis and degradation and purine metabolism (47). These findings reveal the critical roles of these pathways in HCC. There are some limitations in the present study; though the 14-gene prognostic signature has been validated by an independent dataset, the expression levels of these 14 genes have not been confirmed by individual gene expression experiments. In summary, using TCGA data, the present study defined a classifier of 42 feature genes for classification of patients with HCC with and without vascular invasion, and identified a vascular invasion-associated 14-gene prognostic signature for HCC. Several genes and pathways have been revealed to be critical for HCC. These results further the current knowledge on the molecular mechanisms underlying HCC and may aid in the development of personalized treatment for patients with HCC. Large-scale studies are required in order to further validate the results of the present study.
  47 in total

Review 1.  Retinoid roles in blocking hepatocellular carcinoma.

Authors:  Yohei Shirakami; Hiroyasu Sakai; Masahito Shimizu
Journal:  Hepatobiliary Surg Nutr       Date:  2015-08       Impact factor: 7.293

2.  survcomp: an R/Bioconductor package for performance assessment and comparison of survival models.

Authors:  Markus S Schröder; Aedín C Culhane; John Quackenbush; Benjamin Haibe-Kains
Journal:  Bioinformatics       Date:  2011-09-07       Impact factor: 6.937

3.  Recursive feature elimination for biomarker discovery in resting-state functional connectivity.

Authors:  Hariharan Ravishankar; Radhika Madhavan; Rakesh Mullick; Teena Shetty; Luca Marinelli; Suresh E Joel
Journal:  Conf Proc IEEE Eng Med Biol Soc       Date:  2016-08

4.  Global trends and predictions in hepatocellular carcinoma mortality.

Authors:  Paola Bertuccio; Federica Turati; Greta Carioli; Teresa Rodriguez; Carlo La Vecchia; Matteo Malvezzi; Eva Negri
Journal:  J Hepatol       Date:  2017-03-21       Impact factor: 25.083

5.  Vascular invasion affects survival in early hepatocellular carcinoma.

Authors:  Chen-Hsi Hsieh; Chang-Kuo Wei; Wen-Yao Yin; Chun-Ming Chang; Shiang-Jiun Tsai; Li-Ying Wang; Wen-Yen Chiou; Moon-Sing Lee; Hon-Yi Lin; Shih-Kai Hung
Journal:  Mol Clin Oncol       Date:  2014-09-18

6.  Boosting the concordance index for survival data--a unified framework to derive and evaluate biomarker combinations.

Authors:  Andreas Mayr; Matthias Schmid
Journal:  PLoS One       Date:  2014-01-06       Impact factor: 3.240

7.  The Interplay between Metabolism, PPAR Signaling Pathway, and Cancer.

Authors:  Daniele Fanale; Valeria Amodeo; Stefano Caruso
Journal:  PPAR Res       Date:  2017-04-26       Impact factor: 4.964

8.  Erythropoietin promoted the proliferation of hepatocellular carcinoma through hypoxia induced translocation of its specific receptor.

Authors:  Shuo Miao; Su-Mei Wang; Xue Cheng; Yao-Feng Li; Qing-Song Zhang; Gang Li; Song-Qing He; Xiao-Ping Chen; Ping Wu
Journal:  Cancer Cell Int       Date:  2017-12-11       Impact factor: 5.722

9.  Dataset for the quantitative proteomics analysis of the primary hepatocellular carcinoma with single and multiple lesions.

Authors:  Xiaohua Xing; Yao Huang; Sen Wang; Minhui Chi; Yongyi Zeng; Lihong Chen; Ling Li; Jinhua Zeng; Minjie Lin; Xiao Han; Jingfeng Liu; Xiaolong Liu
Journal:  Data Brief       Date:  2015-09-08

10.  Integrative transcriptome analysis reveals common molecular subclasses of human hepatocellular carcinoma.

Authors:  Yujin Hoshida; Sebastian M B Nijman; Masahiro Kobayashi; Jennifer A Chan; Jean-Philippe Brunet; Derek Y Chiang; Augusto Villanueva; Philippa Newell; Kenji Ikeda; Masaji Hashimoto; Goro Watanabe; Stacey Gabriel; Scott L Friedman; Hiromitsu Kumada; Josep M Llovet; Todd R Golub
Journal:  Cancer Res       Date:  2009-09-01       Impact factor: 12.701
