BACKGROUND: Increasing evidence has indicated immune-related genes (IRGs) play a key role in the development of hepatocellular carcinoma (HCC). Whereas, there have been no investigations proposing a reliable prognostic signature in terms of IRGs. This study aimed to develop a robust signature based on IRGs in HCC. A total of 597 HCC patients from The Cancer Genome Atlas (TCGA) and International Cancer Genome Consortium (ICGC) databases were enrolled in this study. METHODS: The TCGA cohort was utilized for discovery, and the ICGC cohort was utilized for validation. Multiple algorithms were implemented to identify key prognostic IRGs and establish an immune-related risk signature. Bioinformatics analysis and R soft tools were utilized to annotate underlying biological functions. RESULTS: A total of 1416 differentially expressed mRNAs (DEMs) were screened, of which 90 were differentially expressed IRGs (DEIRGs). Using univariate Cox regression analysis, we identified 33 prognostically relevant DEIRGs. Using least absolute shrinkage and selection operator (LASSO) regression and multivariate Cox regression analysis, we extracted 8 optimal DEIRGs to construct a risk signature in the TCGA cohort, and the signature was verified in the ICGC cohort. We also built a nomogram to increase the accuracy of predicting HCC prognosis. By investigating the relationship of the risk score and 8 risk genes from our signature with clinical traits, we found that the aberrant expression of the immune-related risk genes is correlated with the development of HCC. Moreover, the high-risk group was higher than the low-risk group in terms of tumor mutation burden (TMB), immune cell infiltration, and the expression of immune checkpoints (programmed cell death protein 1 [PD-1], programmed cell death ligand 1 [PD-L1], and cytotoxic T-lymphocyte-related protein 4 [CTLA-4]), and functional enrichment analysis indicated the signature enriched an intensive immune phenotype. CONCLUSION: This study developed a robust immune-related risk signature and built a predictive nomogram that reliably predict overall survival in HCC, which may be helpful for clinical management and personalized immunotherapy decisions.
BACKGROUND: Increasing evidence has indicated immune-related genes (IRGs) play a key role in the development of hepatocellular carcinoma (HCC). Whereas, there have been no investigations proposing a reliable prognostic signature in terms of IRGs. This study aimed to develop a robust signature based on IRGs in HCC. A total of 597 HCC patients from The Cancer Genome Atlas (TCGA) and International Cancer Genome Consortium (ICGC) databases were enrolled in this study. METHODS: The TCGA cohort was utilized for discovery, and the ICGC cohort was utilized for validation. Multiple algorithms were implemented to identify key prognostic IRGs and establish an immune-related risk signature. Bioinformatics analysis and R soft tools were utilized to annotate underlying biological functions. RESULTS: A total of 1416 differentially expressed mRNAs (DEMs) were screened, of which 90 were differentially expressed IRGs (DEIRGs). Using univariate Cox regression analysis, we identified 33 prognostically relevant DEIRGs. Using least absolute shrinkage and selection operator (LASSO) regression and multivariate Cox regression analysis, we extracted 8 optimal DEIRGs to construct a risk signature in the TCGA cohort, and the signature was verified in the ICGC cohort. We also built a nomogram to increase the accuracy of predicting HCC prognosis. By investigating the relationship of the risk score and 8 risk genes from our signature with clinical traits, we found that the aberrant expression of the immune-related risk genes is correlated with the development of HCC. Moreover, the high-risk group was higher than the low-risk group in terms of tumor mutation burden (TMB), immune cell infiltration, and the expression of immune checkpoints (programmed cell death protein 1 [PD-1], programmed cell death ligand 1 [PD-L1], and cytotoxic T-lymphocyte-related protein 4 [CTLA-4]), and functional enrichment analysis indicated the signature enriched an intensive immune phenotype. CONCLUSION: This study developed a robust immune-related risk signature and built a predictive nomogram that reliably predict overall survival in HCC, which may be helpful for clinical management and personalized immunotherapy decisions.
Hepatocellular carcinoma (HCC) is the most common type of primary liver cancer and mostly results from viral infection and liver fibrosis.[ It is the sixth leading cause of malignancy worldwide and the third most common cause of cancer-related death.[ There are approximately 437,000 people in the world who are diagnosed with HCC each year, of which approximately 50% of cases occur in China.[ The main reason for the poor prognosis of HCC is the lack of early and effective biomarkers and therapeutic treatments. Thus, further exploring effective biomarkers and understanding the potential molecular mechanisms of tumorigenesis and progression in HCC may improve the overall survival and therapeutic effect of HCC patients.It has been confirmed that the derangement of the immune response in the tumor microenvironment (TME) actively participates in tumor aggressiveness and progression. The TME facilitates tumor cell survival, proliferation, and metastasis and enhances the ability to evade host immune responses.[ Abundant efforts have been conducted to understand the relationship between tumors and the immune response, and tumor immunotherapy has achieved remarkable success in advancing tumor treatment.[ The development of immune checkpoint inhibitors (including nivolumab and pembrolizumab) has boosted their application in HCC, but they are only effective in a subset of patients.[ Hence, it is necessary to identify a robust immune signature that predicts the sensitivity of patients to immunotherapy to realize individualized treatment and that may become a new target for immunotherapy in HCC.Prior studies have shed light on immune-related genes (IRGs) that not only play an important role in the process of immunotherapy but are also associated with the prognosis of patients with HCC. However, these studies focused on investigating the relationships of a single IRG with the prognosis of HCC. Few studies have explored the correlations between multiple IRGs and the prognosis of HCC. A recent report illustrated that multigene immune signatures might be a more effective tool for selecting candidates for treatment.[In the present study, we utilized IRG expression data based on The Cancer Genome Atlas (TCGA) and Immunology Database and Analysis Portal (ImmPort) to establish an immune-related risk signature consisting of 8 IRGs for HCC and validated it in the International Cancer Genome Consortium (ICGC) database. The signature was combined with clinical factors to further develop a predictive nomogram, enabling improved prognostic assessment and clinical management of HCC patients. To investigate the ability of our signature to predict the progression of HCC, we analyzed the relationships between the risk factors from our signature (the risk score and risk genes) and the clinical parameters. Moreover, to further explore the underlying immune-related molecular mechanisms and functions of the genes in the signature, we performed research on tumor mutation burden (TMB), immune cell infiltration, and functional enrichment in the HCC.
Material and methods
Sample datasets and IRGs
The RNA-Seq gene expression data and corresponding clinical data of HCC samples were collected from the TCGA database (https://portal.gdc.cancer.gov/). Patients were excluded if they lacked mRNA RNA sequencing data; did not have prognostic information; were not primary HCC; were missing clinical data; received neo-adjuvant therapy. The detail of baseline information was shown in Supplemental Table 1 (see Table, Supplemental Table 1, which exhibits the detail of baseline information in TCGA-LIHC and ICGC-LIHI-JP cohorts). The transcriptome profiling of RNA expression was measured in fragments per kilobase of exon model per million mapped reads or FPKM values. Log2 transformation was performed for the normalization of RNA expression profiles. After eliminating genes with FPKM values equal to 0 in >50% of the samples, 18947 protein-coding genes were used for further analysis. A total of 1534 immune-related genes (IRGs) were obtained from the ImmPort database (https://immport.niaid.nih.gov/) (see Table, Supplemental Table 2, which exhibits the 1534 IRGs and their immune category). These IRGs play multiple roles in immune pathways, including antigen processing and presentation pathways, the B cell receptor signaling pathway; cytokine, cytokine receptor, interleukin, and NK cell cytotoxicity pathways; transforming growth factor-β family member pathways; the T cell receptor signaling pathway; and TNF family member receptor pathways. The IRGs were prepared for the construction of an immune-related risk signature.
Differential expression analysis (DEA)
Differentially expressed mRNAs (DEMs) were identified by the R package “limma,” and P values were corrected using the Benjamini and Hochberg false discovery rate (FDR) method. The following cut-off thresholds were set to screen DEMs: |log2 fold change (FC)| > 1 and FDR < 0.05. The intersection of these DEMs and 1534 IRGs was used to obtain differentially expressed immune-related genes (DEIRGs).
Construction of the immune-related risk signature
A total of 365 HCC patients with survival data in the TCGA cohort were randomly divided into a training set (n = 255; 70%) and a testing set (n = 110; 30%). First, univariate Cox regression analysis of the DEIRGs was implemented to determine the correlation between the survival of HCC patients and the expression level of the DEIRGs in the training set. DEIRGs with P < .05 were considered prognostic IRGs and extracted for least absolute shrinkage and selection operator (LASSO) Cox regression analysis to further select key prognostic IRGs for overall survival in patients. The hazard ratio (HR) from the univariate Cox regression analysis was used to identify the protective (HR < 1) and risk genes (HR > 1). Next, multivariate Cox regression analysis was performed on the DEIRGs obtained from the LASSO Cox regression analysis. Then, an immune-related risk signature was constructed based on a linear combination of the expression value of the prognostic DEIRGs weighted by the regression coefficient (β) derived from the multivariate Cox regression analysis, as follows:where n is the number of key prognostic IRGs, exp is the expression value of the key prognostic IRGs i, and β is the regression coefficient of IRGs i. Using the median risk score in the training set as the cut-off value, HCC patients were classified into high- and low-risk groups in the training set, testing set, and entire set. Subsequently, we analyzed the expression differences of the key prognostic IRGs in the high- and low-risk groups in the entire set. The area under the curve (AUC) of the time-dependent receiver operating characteristic (ROC) curve was calculated to evaluate the prognostic ability of the immune-related risk signature for overall survival using the R package “timeROC.” The Kaplan–Meier method was used to calculate survival curves, and the log-rank test was performed to compare the survival differences in the high- and low-risk groups using the R package “survival.” Then, the prognostic ability of the immune-related risk signature was further investigated in the testing set and entire set.
External validation of the immune-related risk signature
To evaluate the predictive accuracy of the immune-related risk signature from the TCGA cohort, a total of 232 HCC patients with full clinical information from the LIHI-JP project of the ICGC cohort (https://icgc.org/) were further analyzed for validation. The criteria for patient screening were consistent with TCGA, and the detail of baseline was displayed in Supplemental Table 1 (see Table, Supplemental Table 1, which exhibits the detail of baseline information in TCGA-LIHC and ICGC-LIHI-JP cohorts). We processed the data from the ICGC cohort in the same way as the TCGA cohort. For each included patient, the risk score was calculated using the same formula. Subsequently, the ROC curve and the Kaplan–Meier survival curve were constructed to test the predictive value of the signature.All data used were from the TCGA and ICGC databases, which are both publicly available and open access, so additional approval was not required from the ethics committee.
Independent prognostic role of the immune-related risk signature
Next, we investigated whether the prognostic power of the immune-related risk signature was influenced by other clinicopathological factors, including sex, age, body mass index (BMI), alpha-fetoprotein (AFP), histological grade, BCLC (Barcelona Clinic Liver Cancer) stage, and vascular invasion. Univariate and multivariate Cox regression analyses were carried out to determine independent factors of prognosis in the entire set. P < .05 was considered statistically significant. Factors with P < .05 based on univariate and multivariate analyses were deemed independent prognostic factors.
Establishment of a new predictive nomogram
All independent prognostic factors were included to establish a new predictive nomogram via a stepwise Cox regression model to investigate the probability of the 1-, 2-, and 3-year overall survival of HCC patients. Kaplan–Meier analysis, the AUC of the time–ROC curve, Harrell concordance index (C-index), and a calibration curve were used to assess the performance of the nomogram. The C-index was calculated to evaluate the discrimination of the nomogram by a bootstrap method with 1000 resamples. The nomogram calibration curve was plotted to compare the predicted and observed overall survival. In addition, the decision curve analysis (DCA) was conducted to estimate clinical utility of the nomogram through quantifying net benefits against a range of threshold probabilities.
Mutation and immune analysis of the signature
The mutation data available for HCC patients were obtained from the TCGA database (https://portal.gdc.cancer.gov/). We calculated the total number of mutations for each sample and then compared the mutation burden of the high- and low-risk groups in the entire set. In addition, to determine whether our signature could influence the status of the tumor immune microenvironment in HCC patients, the tumor immune estimation resource (TIMER) algorithm (https://cistrome.shinyapps.io/timer/) was used to estimate the abundances of B cells, CD4+ T cells, CD8+ T cells, dendritic cells, neutrophils, and macrophages in the data from the TCGA database, and then the relationships between the risk score and immune cell infiltration in the entire set were investigated.
Functional enrichment analysis
To explore the underlying biological processes and pathways of the immune-related risk signature, we utilized the R package “clusterProfiler” to perform Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis. GO biological processes and KEGG pathways having both P < .05 and FDR < 0.05 were considered to be significant.
Results
DEMs and DEIRGs in patients with HCC
Figure 1 displays the flow diagram of our study. The analysis of mRNA expression profiles between HCC tissues and adjacent normal liver tissues identified 1416 DEMs (Fig. 2A and B). A total of 1016 mRNAs were significantly upregulated and 400 mRNAs were significantly downregulated in HCC tissues compared with normal liver tissues. The IRGs among the 1416 DEMs were identified, and a total of 90 DEIRGs, including 49 upregulated mRNAs and 41 downregulated mRNAs, were filtered for further analysis (Fig. 2C and D).
Figure 1
The workflow of this study. DEA = differential expression analysis; DEIRGs = differentially expressed IRGs; DEMs = differentially expressed mRNAs; IRGs = immune-associated genes.
Figure 2
Volcano plot and hierarchical clustering analysis of DEMs and DEIRGs in the TCGA dataset. (A) Volcano plot of DEMs. (B) Cluster heat map of the top 50 DEMs. (C) Volcano plot of DEIRGs. (D) Cluster heat map of the top 50 DEIRGs. DEIRGs = differentially expressed IRGs; DEMs = differentially expressed mRNAs; TCGA = The Cancer Genome Atlas.
The workflow of this study. DEA = differential expression analysis; DEIRGs = differentially expressed IRGs; DEMs = differentially expressed mRNAs; IRGs = immune-associated genes.Volcano plot and hierarchical clustering analysis of DEMs and DEIRGs in the TCGA dataset. (A) Volcano plot of DEMs. (B) Cluster heat map of the top 50 DEMs. (C) Volcano plot of DEIRGs. (D) Cluster heat map of the top 50 DEIRGs. DEIRGs = differentially expressed IRGs; DEMs = differentially expressed mRNAs; TCGA = The Cancer Genome Atlas.In the TCGA cohort, the training set was used to construct the immune-related risk signature, the testing set and entire set were used for validation. Using univariate Cox regression analysis, we identified 33 IRGs with prognostic prediction capacity from a total of 90 DEIRGs (see Figure, Supplemental Figure 1, which illustrates the 33 IRGs associated with overall survival). The 33 IRGs were enrolled in LASSO Cox regression analysis. After 100 rounds of 10-fold cross-validation, 8 optimal IRGs were identified when lambda took the minimum value of 0.059, and the 8 IRGs were used to establish an immune-related risk signature for HCC patients (Fig. 3A and B). The results of the univariate and multivariate Cox regression analyses of the 8 optimal IRGs are shown in Table 1. The formula of the risk score is as follows:
Figure 3
Identification of key prognostic IRGs and the expression levels of the key prognostic IRGs genes in the risk group. (A, B) Identification of key prognostic IRGs by LASSO regression analysis. IRGs = immune-associated genes; LASSO = least absolute shrinkage and selection operator.
Table 1
Univariate and multivariate analysis of the 8 optimal IRGs.
Univariate analysis
Multivariate analysis
Gene ID
HR (95% CI)
P-value
HR (95% CI)
P-value
APLN
1.298 (1.066, 1.582)
.01
1.178 (0.941, 1.476)
.154
CDK4
1.721 (1.360, 2.176)
5.92E–06
1.289 (0.967, 1.717)
.083
CXCL2
0.826 (0.724, 0.943)
.005
0.924 (0.790, 1.079)
.318
ESR1
0.514 (0.321, 0.821)
.005
0.790 (0.479, 1.302)
.355
IL1RN
0.739 (0.624, 0.875)
4.37E–04
0.819 (0.680, 0.987)
.036
PSMD2
2.334 (1.610, 3.383)
7.60E–06
1.180 (0.739, 1.886)
.488
SEMA3F
1.591 (1.101, 2.299)
.013
1.279 (0.877, 1.865)
.201
SPP1
1.131 (1.060, 1.208)
2.13E–04
1.078 (0.993, 1.170)
.073
CI = confidence interval; HR = hazard ratio.
Identification of key prognostic IRGs and the expression levels of the key prognostic IRGs genes in the risk group. (A, B) Identification of key prognostic IRGs by LASSO regression analysis. IRGs = immune-associated genes; LASSO = least absolute shrinkage and selection operator.Univariate and multivariate analysis of the 8 optimal IRGs.CI = confidence interval; HR = hazard ratio.The risk scores for patients were calculated, and patients were divided into high- and low-risk groups using the median risk score of the training set as the cut-off threshold in each set. As shown in Supplemental Figure 2 (see Figure, Supplemental Figure 2, which illustrates the expression levels of the key prognostic IRGs in the high- and low-risk groups), among the 8 genes of the immune-related risk signature, significantly elevated APLN, CDK4, PSMD2, SEMA3F, and SPP1 expression was found in high-risk group compared with low-risk group, while CXCL2, ESR1, and IL1RN expression was significantly decreased in high-risk group compared with low-risk group (Wilcoxon rank-sum test; P < .001). The ROC curves displayed that the AUCs of the immune-related risk signature at 1, 2, and 3 years were 0.813, 0.717, and 0.732 for the training set; 0.631, 0.690, and 0.755 for the testing set; and 0.764, 0.710, and 0.728 for the entire set, respectively (Fig. 4A–C left panel). The distribution of the risk scores, the survival status of the patients ranked according to the risk scores, and the expression values of the 8 genes are shown in Fig. 4A–C middle panel. The Kaplan–Meier survival curve combined with the log-rank test showed that patients in the high-risk group had a significantly shorter survival time compared with patients in the low-risk group (Fig. 4A–C right panel, KM Log-Rank P = 3.47e−5, .014, and 3.12e−6, respectively). Our results indicated a good performance of the immune-related risk signature for the survival prediction of HCC patients.
Figure 4
Time-dependent ROC analysis, risk score analysis, and Kaplan–Meier analysis for the immune-associated signature in HCC. (A) Time-dependent ROC analysis, risk score, heatmap of mRNA expression, and Kaplan–Meier curve of the signature in the training set of TCGA cohort. (B) Time-dependent ROC analysis, risk score, heatmap of mRNA expression, and Kaplan–Meier curve of the signature in the testing set of TCGA cohort. (C) Time-dependent ROC analysis, risk score, heatmap of mRNA expression, and Kaplan–Meier curve of the signature in the entire included set of TCGA cohort. (D) Time-dependent ROC analysis, risk score, heatmap of mRNA expression, and Kaplan–Meier curve of the signature in ICGC cohort. HCC = hepatocellular carcinoma; ICGC = International Cancer Genome Consortium; ROC = receiver operating characteristic curve; TCGA = The Cancer Genome Atlas.
Time-dependent ROC analysis, risk score analysis, and Kaplan–Meier analysis for the immune-associated signature in HCC. (A) Time-dependent ROC analysis, risk score, heatmap of mRNA expression, and Kaplan–Meier curve of the signature in the training set of TCGA cohort. (B) Time-dependent ROC analysis, risk score, heatmap of mRNA expression, and Kaplan–Meier curve of the signature in the testing set of TCGA cohort. (C) Time-dependent ROC analysis, risk score, heatmap of mRNA expression, and Kaplan–Meier curve of the signature in the entire included set of TCGA cohort. (D) Time-dependent ROC analysis, risk score, heatmap of mRNA expression, and Kaplan–Meier curve of the signature in ICGC cohort. HCC = hepatocellular carcinoma; ICGC = International Cancer Genome Consortium; ROC = receiver operating characteristic curve; TCGA = The Cancer Genome Atlas.To avoid bias arising from only investigating the TCGA cohort, we enrolled a total of 232 HCC samples in the ICGC database to validate the predictive value of the signature. We utilized the same formula to calculate the risk score for each patient and then classified the patients into high- and low-risk groups using the same cut-off threshold. As shown in Fig. 4D, the AUCs for 1, 2, and 3 years were 0.704, 0.758, and 0.787, respectively. Consistent with the results in the TCGA cohort, patients in the high-risk group had significantly poorer overall survival than those in the low-risk group (KM Log-Rank P = 1.22e−4). Overall, our signature was able to accurately predict the overall survival of HCC patients.A total of 238 HCC patients in the TCGA cohort with complete clinical information (including sex, age, BMI, AFP, histological grade, BCLC stage, and vascular invasion), were included for further analysis. Univariate and multivariate Cox regression analyses indicated that BCLC stage (HR = 2.135, P = .004), vascular invasion (HR = 2.142, P = .003), and risk score (HR = 1.612, P = 2.36e−7) calculated from the immune-related risk signature were independent prognostic factors for the overall survival of HCC patients (Fig. 5A and B). In addition, the prediction accuracy of the signature was compared with all clinical factors. As shown in Fig. 5C–E, the signature had the largest AUC at 1, 2, and 3 year, this suggested the predictive performance of the signature was superior to any other clinical factors. Subsequently, all patients were stratified into subgroups based on their clinicopathological factors, such as sex (female/male), age (<60/≥60), BMI (<25/≥25), AFP (<400/≥400), tumor grade (G1+2/G3+4), BCLC stage (I+II/III+IV), and vascular invasion (no/yes), and each subgroup was further divided into the low- and high-risk groups using the immune-related risk signature (Table 2). The results of stratification analysis showed that the high-risk patients in each stratum of all clinicopathological factors had significantly shorter survival times than the low-risk patients (KM Log-Rank P < .05) (see Figure, Supplemental Figure 3, which illustrates the Kaplan–Meier analysis of overall survival for patients in the 14 subgroups). Our findings suggest that the immune-related risk signature is a promising independent risk factor for the prognosis assessment of HCC.
Figure 5
Univariate and multivariate Cox regression analysis, the time-dependent ROC contrastive analysis between the signature and all clinical factors, and stratification analysis in the TCGA cohort. (A, B) Forrest plot of the univariate and multivariate Cox regression analysis in HCC. (C–E) The time-dependent ROC contrastive analysis between the signature and all clinical factors at 1, 2, and 3 year. HCC = hepatocellular carcinoma; ROC = receiver operating characteristic curve; TCGA = The Cancer Genome Atlas.
Table 2
Clinicopathological factors of HCC patients from TCGA cohort in the low- and high-risk groups.
Risk score
Factor
Low
High
P value
Sex
Male
81 (72.3%)
80 (63.5%)
.189
Female
31 (27.7%)
46 (36.5%)
Age
<60
54 (48.2%)
61 (48.4%)
1.000
≥60
58 (51.8%)
65 (51.6%)
BMI
<25
64 (57.1%)
60 (47.6%)
.181
≥25
48 (42.9%)
66 (52.4%)
AFP
<400
95 (84.8%)
87 (69.0%)
.007
≥400
17 (15.2%)
39 (31.0%)
Grade
G1+2
76 (67.9%)
57 (45.2%)
.001
G3+4
36 (32.1%)
69 (54.8%)
BCLC stage
I+II
96 (85.7%)
99 (78.6%)
.207
III+IV
16 (14.3%)
27 (21.4%)
Vascular invasion
Yes
80 (71.4%)
72 (57.1%)
.031
No
32 (28.6%)
54 (42.9%)
AFP = alpha-fetoprotein; BCLC = Barcelona Clinic Liver Cancer; BMI = body mass index; HCC = hepatocellular carcinoma; TCGA = The Cancer Genome Atlas.
Univariate and multivariate Cox regression analysis, the time-dependent ROC contrastive analysis between the signature and all clinical factors, and stratification analysis in the TCGA cohort. (A, B) Forrest plot of the univariate and multivariate Cox regression analysis in HCC. (C–E) The time-dependent ROC contrastive analysis between the signature and all clinical factors at 1, 2, and 3 year. HCC = hepatocellular carcinoma; ROC = receiver operating characteristic curve; TCGA = The Cancer Genome Atlas.Clinicopathological factors of HCC patients from TCGA cohort in the low- and high-risk groups.AFP = alpha-fetoprotein; BCLC = Barcelona Clinic Liver Cancer; BMI = body mass index; HCC = hepatocellular carcinoma; TCGA = The Cancer Genome Atlas.We then established a new nomogram to predict 1-, 2-, and 3-year overall survival in 238 HCC patients from the TCGA cohort using 3 independent prognostic factors, including BCLC stage, vascular invasion, and risk score (Fig. 6A). According to the total points of the nomogram, the patients were divided into high- and low-point groups using the median point value as the cut-off threshold. Survival curves for the high- and low-point groups were plotted by Kaplan–Meier analysis, and the results showed that those with higher points had significantly poorer overall survival (Fig. 6B, KM Log-Rank P = .001). The AUCs of the 1-, 2-, and 3-year overall survival predictions for the nomogram were 0.750, 0.693, and 0.733, respectively (Fig. 6C). The C-index for the nomogram was 0.689 (95% confidence interval: 0.629–0.749). The calibration curve showed that the nomogram performed well at predicting overall survival in HCC patients, but the nomogram might underestimate or overestimate mortality (Fig. 6D–F). Notably, decision curve analysis (DCA) demonstrated that the nomogram showed the best net benefit for 1 to 3 year OS (see Figure, Supplemental Figure 4, which illustrates The DCA curves of the nomogram for 1-, 2-, and 3-year overall survival in HCC, respectively). Relative to traditional TNM stage and the immune-related signature, the nomogram greatly improves the clinical net benefit of HCC patients. In addition, the costs of misdiagnosis using the nomogram was much lower than traditional TNM stage and the signature. Therefore, DCA confirmed more benefit was added to survival prediction with our nomogram, which might enhance clinical management for HCC patients.
Figure 6
Nomogram predicting overall survival for HCC patients. (A) A prognostic nomogram predicting 1-, 2-, and 3-year overall survival of HCC patients. (B) Shows the Kaplan–Meier survival curves of the nomogram. (C) Shows the time-dependent ROC for 1-, 2-, and 3-year overall survival predictions for the nomogram. (D–F) The calibration plot for internal validation of the nomogram at 1, 2, and 3 year. The y-axis represents actual survival, and the x-axis represents nomogram-predicted survival. HCC = hepatocellular carcinoma; ROC = receiver operating characteristic curve.
Nomogram predicting overall survival for HCC patients. (A) A prognostic nomogram predicting 1-, 2-, and 3-year overall survival of HCC patients. (B) Shows the Kaplan–Meier survival curves of the nomogram. (C) Shows the time-dependent ROC for 1-, 2-, and 3-year overall survival predictions for the nomogram. (D–F) The calibration plot for internal validation of the nomogram at 1, 2, and 3 year. The y-axis represents actual survival, and the x-axis represents nomogram-predicted survival. HCC = hepatocellular carcinoma; ROC = receiver operating characteristic curve.
Clinical correlation of the immune-related risk signature
To investigate the ability of our signature to predict the progression of HCC, we analyzed the relationships between the risk factors from our signature (the risk score and 8 risk genes) and clinical parameters (sex, age, BMI, AFP, histological grade, BCLC stage, and vascular invasion) in the entire set. The results indicated that the expression of CDK4 and SEMA3F was higher in female patients, and the expression of ESR1 was higher in male patients (Fig. 7A–C, Wilcoxon rank-sum test; P = .002, .013, and .047, respectively). In patients older than 60 years, the expression of CXCL2 and ESR1 was higher than in those younger than 60 years (Fig. 7D and E, P = .001 and .012, respectively). The expression of CXCL2 was higher in patients with BMI ≥ 25 than in those with BMI < 25 (Fig. 7F, Wilcoxon rank-sum test; P = 8.99e–4). Patients with AFP ≥ 400 had a higher expression of CDK4 and SEMA3F but a lower expression of CXCL2 and ESR1 than patients with AFP < 400 (Fig. 7G–J, Wilcoxon rank-sum test; P = 2.86e–4, 0.038, 1.16e–7, and 9.59e–18, respectively). As the values of certain factors increased (CDK4, PSMD2, and SPD1 levels and the risk score), the histological grade of HCC patients increased, and histological grade decreased in patients with increased CXCL2 and ESR1 expression (Fig. 7K–P, Wilcoxon rank-sum test; P = .002, .005, .033, 1.35e–4, .002, and 7.71e–9, respectively). And the higher the expression of SEMA3F, the higher the patient's BCLC stage (Fig. 7Q, Wilcoxon rank-sum test; P = .001). Patients with vascular invasion had higher SPP1 levels and risk scores and lower ESR1 expression than patients without vascular invasion (Fig. 7R–T, Wilcoxon rank-sum test; P = 7.24e–4, .012, and .006, respectively). These results indicated that the dysregulation of immune-related risk gene expression is correlated with the development of HCC.
Figure 7
Relationships of the risk factors in the signature with the clinical parameters of patients in the TCGA cohort. (A) CDK4 expression and sex. (B) SEMA3F expression and sex. (C) ESR1 expression and sex. (D) CXCL2 expression and age. (E) ESR1 expression and age. (F) CXCL2 expression and BMI. (G) CDK4 expression and AFP. (H) SEMA3F expression and AFP. (I) CXCL2 expression and AFP. (J) ESR1 expression and AFP. (K) CDK4 expression and histological grade. (L) PSMD2 expression and histological grade. (M) SPP1 expression and histological grade. (N) Risk score and histological grade. (O) CXCL2 expression and histological grade. (P) ESR1 expression and histological grade. (Q) SEMA3F expression and BCLC stage. (R) SPP1 expression and vascular invasion. (S) Risk score and vascular invasion. (T) ESR1 expression and vascular invasion. AFP = alpha-fetoprotein; BCLC Barcelona Clinic Liver Cancer; BMI = body mass index; TCGA = The Cancer Genome Atlas.
Relationships of the risk factors in the signature with the clinical parameters of patients in the TCGA cohort. (A) CDK4 expression and sex. (B) SEMA3F expression and sex. (C) ESR1 expression and sex. (D) CXCL2 expression and age. (E) ESR1 expression and age. (F) CXCL2 expression and BMI. (G) CDK4 expression and AFP. (H) SEMA3F expression and AFP. (I) CXCL2 expression and AFP. (J) ESR1 expression and AFP. (K) CDK4 expression and histological grade. (L) PSMD2 expression and histological grade. (M) SPP1 expression and histological grade. (N) Risk score and histological grade. (O) CXCL2 expression and histological grade. (P) ESR1 expression and histological grade. (Q) SEMA3F expression and BCLC stage. (R) SPP1 expression and vascular invasion. (S) Risk score and vascular invasion. (T) ESR1 expression and vascular invasion. AFP = alpha-fetoprotein; BCLC Barcelona Clinic Liver Cancer; BMI = body mass index; TCGA = The Cancer Genome Atlas.As shown in Fig. 8A, the TMB was greater in the high-risk group than in the low-risk group (Wilcoxon rank-sum test; P = .042). Previous studies have suggested that the TMB is significantly related to the clinical effectiveness of immunotherapy.[ The results demonstrated that our signature may stratify patients with different sensitivities to immunotherapy. To further identify more immune-related mechanisms of the signature, we analyzed the correlation between the risk score and immune checkpoints, including programmed cell death protein 1 (PD-1), programmed cell death ligand 1 (PD-L1), and cytotoxic T-lymphocyte-related protein 4 (CTLA-4). The Wilcoxon rank-sum test indicated that HCC patients in the high-risk group tended to have higher expression levels of PD-1 (P = .024), PD-L1 (P = .026), and CTLA-4 (P = 5.04e–4) (Fig. 8B–D). This finding suggests that high-risk HCC patients might better respond to immune checkpoint blockade (ICB) immunotherapy targeting PD-1 and CTLA-4. As shown in Supplemental Figure 5A (see Figure, Supplemental Figure 5A, which illustrates correlation between immune cells of all subtypes), we found that most of the immune cells were strongly correlated, especially B cells, CD8+ T cells, and dendritic cells, which intensively suggested that immune cells play an important role in the TME of HCC. The infiltration abundances of immune cells (B cells, CD4+ T cells, CD8+ T cells, neutrophils, macrophages, and dendritic cells) and the distribution of risk score for each patient were also explored. The abundances of immune cells in HCC tissues were significantly correlated with the risk score (Fig. 8E–J), and these immune cells were significantly highly infiltrated in the high-risk group in the TCGA LIHC cohorts (see Figure, Supplemental Figure 5B, which illustrates infiltration abundance of immune cells in the high- and low-risk groups), which indicated that our signature performed well in predicting the immune status of HCC.
Figure 8
Mutation and immune analysis of the signature in the TCGA cohort. (A) Comparison of mutation burden between the high- and low-risk groups. (B–D) Comparison of PD-1, PD-L1, and CTLA-4 expression between the high- and low-risk groups. (E–G) Analysis of the correlation between the risk score and immune cell infiltration. CTLA-4 = cytotoxic T-lymphocyte-related protein 4; PD-1 = programmed cell death protein 1; PD-L1 = programmed cell death ligand 1; TCGA = The Cancer Genome Atlas.
Mutation and immune analysis of the signature in the TCGA cohort. (A) Comparison of mutation burden between the high- and low-risk groups. (B–D) Comparison of PD-1, PD-L1, and CTLA-4 expression between the high- and low-risk groups. (E–G) Analysis of the correlation between the risk score and immune cell infiltration. CTLA-4 = cytotoxic T-lymphocyte-related protein 4; PD-1 = programmed cell death protein 1; PD-L1 = programmed cell death ligand 1; TCGA = The Cancer Genome Atlas.To investigate the mechanisms underlying the immune-related risk signature, we examined the expression correlation between the risk genes from the signature and all IRGs using 2-sided Pearson correlation coefficients in the entire set. As shown in Supplemental Table 3 (see Table, Supplemental Table 3, which exhibits the 37 IRGs significantly coexpressed with the risk genes in the signature), a total of 37 IRGs were expressed as significantly correlated with at least one of the risk genes (|Pearson correlation coefficient| > 0.5 and P < .01). Then, we performed functional enrichment analysis with the clusterProfiler package to explore the underlying biological processes and pathways of the 37 IRGs, revealing 444 enriched biological processes (see Table, Supplemental Table 4, which exhibits the 444 enriched biological processes) and 116 enriched KEGG pathways (see Table, Supplemental Table 5, which exhibits the 116 enriched KEGG pathways). The GO results of the top 10 immune-related biological processes are shown in Fig. 9A, including Fc receptor signaling pathway, the immune response-regulating cell surface receptor signaling pathway, antigen receptor-mediated signaling pathway, regulation of innate immune response, response to tumor necrosis factor, cellular response to interleukin-1, neutrophil mediated immunity, regulation of leukocyte cell–cell adhesion, regulation of T cell activation, and regulation of leukocyte activation. The top 10 immune-related KEGG pathways are shown in Fig. 9B, including the T cell receptor signaling pathway, B cell receptor signaling pathway, viral carcinogenesis, chemokine signaling pathway, natural killer cell-mediated cytotoxicity, PD-L1 expression and PD-1 checkpoint pathway in cancer, IL-17 signaling pathway, NF-kappa B signaling pathway, Th17 cell differentiation, and Th1 and Th2 cell differentiation. Therefore, our results indicated that the immune-related risk signature displays an intensive immune phenotype.
Figure 9
Functional enrichment analysis of the 37 IRGs that significantly coexpressed with the risk genes in the signature. (A) Top 10 immune-associated biological processes. (B) Top 10 immune-associated KEGG pathways. IRGs = immune-associated genes; KEGG = Kyoto Encyclopedia of Genes and Genomes.
Functional enrichment analysis of the 37 IRGs that significantly coexpressed with the risk genes in the signature. (A) Top 10 immune-associated biological processes. (B) Top 10 immune-associated KEGG pathways. IRGs = immune-associated genes; KEGG = Kyoto Encyclopedia of Genes and Genomes.
Discussion
The immune response in the tumor microenvironment (TME) plays a critical role in tumor initiation and progression.[ The immune system can kill cancer cells and suppress cancer growth by upregulating or downregulating IRGs at certain immune checkpoints.[ However, because tumors have the characteristics of immune evasion, certain tumor cells can imitate the IRG expression patterns of normal cells to elude the attack from the immune system.[ Therefore, the expression of IRGs may be a vital predictor of HCC progression and prognosis. In the present study, we utilized the TCGA and ICGC cohorts to establish and validate a robust immune-related risk signature consisting of 8 IRGs for HCC that can predict patients’ overall survival and was significantly related to clinicopathological factors. We also built a nomogram by combining the signature with clinicopathological factors to increase the accuracy in predicting HCC prognosis. By investigating the relationship of the risk score and 8 risk genes from our signature with clinical traits, we found that the aberrant expression of the immune-related risk genes was correlated with the development of HCC. More importantly, we found that the mutation status and the immune status were associated with this signature, and the signature enriched an intensive immune phenotype. These discoveries shed light on the value of our signature for HCC patient prognosis and on possible new immune targets for immunotherapy.First, we screened 1416 DEMs, among which we found 90 DEIRGs, including 49 upregulated IRGs and 41 downregulated IRGs. We then sequentially performed univariate, LASSO, and multivariate Cox regression analyses to construct an immune-related risk signature. The signature consisted of 8 IRGs significantly related to HCC patient overall survival (including APLN, CDK4, CXCL2, ESR1, IL1RN, PSMD2, SEMA3F, and SPP1). For the genes in our signature, 6 of them were chemokines, cytokines, or cytokine receptors that actively participate in the inflammatory process of tumor growth, progression, and metastasis. Chemokines and cytokines produced by tumor cells and other cellular components of the TME are able to facilitate the occurrence of tumor-associated or tumor-elicited inflammation, and this type of inflammation is a key driver of tumor progression and an essential component of the TME.[ Chemokines and cytokines act in autocrine and paracrine manners through receptors on tumor cells to activate STAT3, AP1, and NFκB families of potential oncogenic transcription factors. They are also involved in the progression of tumor growth through the induction of survival factors and proliferation.[ Thus, according to our signature, high-risk HCC patients may have increased inflammation in the tumor microenvironment, which promotes the progression of HCC and leads to the poor overall survival of patients. In addition, of the 8 IRGs that compose the signature, CDK4 was the most significant gene in the univariate Cox regression analysis. CDK4 is one of the core proteins of the cell cycle regulatory network, and changes in its expression activity directly affect cell cycle progression; moreover, its overexpression is closely related to tumor initiation, progression, and metastasis.[ A recent study indicated that the expression of CDK4 can affect the prognosis of HCC patients, and the higher its expression level is, the worse the prognosis.[ CDK4 and CDK6 inhibitors have been considered the most promising cell cycle therapeutics, and intense efforts are now being conducted to expand the field of this paradigm.[ PSMD2 is another gene in the signature that attracts our attention. The PSMD2 protein is a subunit of the 19S regulatory complex of the 26S proteasome and belongs to the ubiquitin–proteasome system (UPS).[ Prior studies demonstrated that knocking down PSMD2 can suppress tumor cell proliferation, and PSMD2 may serve as a potential therapeutic target for tumors.[ PSMD2 is able to modulate HCC cell proliferation by regulating cellular lipid metabolism. The knockdown of PSMD2 reduces the formation of cellular lipid droplets, and PSMD2 also regulates the expression level of genes involved in lipid synthesis through the p38-JNK and AKT signaling pathways.[ These results further suggest the importance of our signature in the HCC microenvironment.Next, we combined the signature with conventional clinical factors, including BCLC stage and vascular invasion, to develop a nomogram with good performance. This nomogram could be a valuable dIRGnostic and prognostic management tool for HCC. To explore the clinical utility of the signature, we evaluated the correlations of the risk factors in our signature with clinical factors. The risk score and 8 risk genes were significantly related to the progression of HCC. Our research identified the immune-related differences between different clinical subsets in HCC. Our findings also reflect that the expression differences of IRGs may lead to different states of the TME, which further affects the prognosis of different clinical subsets. A recent report suggested that the expression profile of IRGs in the TME of HCC can predict tumor progression and patient overall survival,[ which was consistent with our research. Thus, our signature displayed decent clinical applicability in predicting the development of HCC.We also examined whether our signature is related to TMB. A previous study showed that tumors with high TMB have the potential to produce more neoantigens, which in turn activate the immune system to recognize and clear tumor cells, suggesting that high TMB can enhance tumor sensitivity to ICB therapy.[ In our signature, the TMB was higher in the high-risk group than in the low-risk group, which not only indicated that our signature has good predictive power but can also stratify patients with different susceptibilities to immunotherapy. In addition, we found that as the risk score increased, the infiltration abundances of immune cells (B cells, CD4+ T cells, CD8+ T cells dendritic cells, neutrophils, and macrophages) in HCC tissues also increased. Macrophages were the most statistically significant and most positively correlated with the risk score. A prior report demonstrated that tumor-associated macrophages (TAMs) serve as the main immune cells in the TME, and they could improve the cell proliferation, invasion, and metastatic ability of HCC by secreting multiple cytokines and inducing the occurrence of epithelial–mesenchymal transition (EMT).[ Culturing TAMs together with HCC cells can significantly enhance the stem cell properties of HCC as well as EMT progression, which promotes the progression of HCC.[ We also found that patients in the high-risk group had more CD4+ and CD8+ T-cell infiltration than those in the low-risk group. Further research indicated that high-risk group patients tended to have higher expression levels of PD1, PD-L1, and CTLA-4. This finding suggests that despite the high infiltration abundance of T cells in the HCC microenvironment, they are unable to function normally due to inhibition by the PD-1- or CTLA-4-mediated suppression pathways, which leads to immune evasion of the tumor. It was also reported that tumors with pre-existing intratumor T cells inhibited by the PD-1/PD-L1 signaling pathway are most likely to benefit from ICB.[ Hence, our results revealed that the high-risk group of patients may be associated with a better response to ICB targeting PD-1 and CTLA-4, which enables the personalized treatment of HCC patients. Finally, we conducted functional enrichment analysis on the 37 IRGs that significantly coexpressed with the risk genes in the signature and obtained intensive immune-related phenotypes. This further proved that our signature was closely linked to the immune system.Our study was the first to systematically establish and validate a multigene immune-related risk signature with independent prognostic power for HCC patients. We utilized multiple algorithms (including univariate Cox, LASSO, and multivariate Cox regression) to identify key prognostic IRGs and construct signatures based on large cohorts from the TCGA and ICGC databases. More importantly, we analyzed TMB, immune cell infiltration, immune checkpoints, and immune-related functions, which reflected the immune response intensity in the HCC microenvironment. Our signature may be helpful for clinical management and personalized immunotherapy decisions. Nevertheless, our study had several limitations. First, we did not perform a prospective clinical trial to validate the signature. Moreover, further investigation with in vivo and in vitro experiments should be conducted to explore the molecular mechanisms of the 8 IRGs in our signature during the progression of HCC.In conclusion, our study developed a robust immune-related risk signature and built a predictive nomogram incorporating the signature and clinical factors to predict overall survival of HCC. The signature was not only closely related to the invasion, progression, and prognosis of HCC, but also has a strong correlation with the local immune status. The nomogram reliably predicted overall survival of HCC, which may facilitate clinical management and making of medical decisions.
Acknowledgments
The authors appreciated the authors of the data series and the authors of the databases used in this article. Their efforts provided great convenience for this research.
Authors: Andrew J Gentles; Aaron M Newman; Chih Long Liu; Scott V Bratman; Weiguo Feng; Dongkyoon Kim; Viswam S Nair; Yue Xu; Amanda Khuong; Chuong D Hoang; Maximilian Diehn; Robert B West; Sylvia K Plevritis; Ash A Alizadeh Journal: Nat Med Date: 2015-07-20 Impact factor: 53.440
Authors: Naiyer A Rizvi; Matthew D Hellmann; Alexandra Snyder; Pia Kvistborg; Vladimir Makarov; Jonathan J Havel; William Lee; Jianda Yuan; Phillip Wong; Teresa S Ho; Martin L Miller; Natasha Rekhtman; Andre L Moreira; Fawzia Ibrahim; Cameron Bruggeman; Billel Gasmi; Roberta Zappasodi; Yuka Maeda; Chris Sander; Edward B Garon; Taha Merghoub; Jedd D Wolchok; Ton N Schumacher; Timothy A Chan Journal: Science Date: 2015-03-12 Impact factor: 47.728