Cytochrome P450 family members are associated with fast-growing hepatocellular carcinoma and patient survival: An integrated analysis of gene expression profiles.

Zhao-Zhen Liu1, Li-Na Yan2, Chun-Nan Dong3, Ning Ma1, Mei-Na Yuan1, Jin Zhou1, Ping Gao1.   

Abstract

BACKGROUND/AIMS: The biological heterogeneity of hepatocellular carcinoma (HCC) makes prognosis difficult. Although many molecular tools based on microarray analysis have been developed to assist in patient stratification and outcome prediction, classification and prediction can still be improved because high-throughput microarrays contain a large amount of information. Moreover, gene expression patterns and their prognostic value for HCC have not been systematically investigated. To explore new molecular diagnostic and prognostic biomarkers, the present study systematically analyzed the gene expression profiles of HCCs and adjacent nontumor tissues.
MATERIALS AND METHODS: In this study, gene expression profiles were obtained by repurposing five Gene Expression Omnibus databases. Differentially expressed genes were identified by using robust rank aggregation method. Three datasets (GSE14520, GSE36376, and GSE54236) were used to validate the associations between cytochrome P450 (CYP) family genes and HCC. GSE14520 was used as the training set. GSE36376 and GSE54236 were considered as the testing sets.
RESULTS: From the training set, a four-CYP gene signature was constructed that discriminated between HCC and nontumor tissues with an area under the curve (AUC) of 0.991. The accuracy of this four-gene signature was validated in the two testing sets (AUCs of 0.973 and 0.852, respectively). Moreover, the signature distinguished fast-growing from slow-growing HCC (AUC = 0.898) with a high sensitivity of 95%. Finally, CYP2C8 was identified as an independent prognostic factor for recurrence-free survival (hazard ratio [HR] = 0.865; 95% confidence interval [CI], 0.754-0.992; P = 0.038) and overall survival (HR = 0.849; 95% CI, 0.716-0.995; P = 0.033).
CONCLUSIONS: Our results show for the first time that a four-CYP gene (CYP1A2, CYP2E1, CYP2A7, and PTGIS) signature is associated with fast-growing HCC and that CYP2C8 is associated with patient survival. These findings could help to identify HCC patients at high risk of rapid growth and recurrence.

Keywords:  Biomarker; fast-growing HCC; hepatocellular carcinoma; prognosis

Year:  2019        PMID: 30971588      PMCID: PMC6526731          DOI: 10.4103/sjg.SJG_290_18

Source DB:  PubMed          Journal:  Saudi J Gastroenterol        ISSN: 1319-3767            Impact factor:   2.485


INTRODUCTION

Liver cancer, predominantly hepatocellular carcinoma (HCC), is the fifth most common cancer in men and the ninth in women, and it is the second most common cause of cancer death worldwide.[1] Although much is known about both the cellular changes that lead to HCC and the etiological agents responsible for the majority of HCC cases (hepatitis B virus, hepatitis C virus, alcohol), the molecular pathogenesis of HCC is still not well understood.[2] Considerable efforts have been devoted to establishing staging systems for HCC that use clinical information and pathological classification to inform both survival and treatment options at diagnosis.[3,4,5,6] However, none of the proposed staging systems encompasses the biological and clinical heterogeneity exhibited by HCCs. One important reason is that these predictive algorithms treat HCCs as static rather than dynamic entities: they account for the size and number of neoplastic lesions at the time of presentation but not for growth behavior during follow-up, such as tumor doubling time (DT).[7] Classifying HCC patients into groups with homogeneous growth patterns should therefore at least improve the application of currently available treatment modalities and at best suggest new treatment strategies. Over the past 20 years, microarray technology has led to the identification of several molecular signatures in HCC. For example, a 164-gene signature has been reported to predict the clinical behavior of metastatic HCC patients.[8] Another study established a five-gene score to predict HCC survival after liver resection.[9] These signatures allow stratification of HCC into several clinically relevant subgroups. Nevertheless, classification and prediction can still be improved because high-throughput microarrays contain a large amount of biological information.
Therefore, it is necessary to analyze the expression profiles systematically and explore new molecular signatures. To identify new molecular diagnostic and prognostic biomarkers, we systematically analyzed the gene expression profiles of HCCs and adjacent nontumor tissues. We found that 15 cytochrome P450 (CYP) family members could distinguish HCCs from nontumor tissues. A four-CYP gene (CYP1A2, CYP2E1, CYP2A7, and PTGIS) signature is a useful tool for identifying HCCs and fast-growing HCCs with high sensitivity and specificity, and CYP2C8 is associated with patient survival in individuals at first diagnosis.

MATERIALS AND METHODS

Study design

Discovery stage: All the HCC Gene Expression Omnibus (GEO) datasets were collected. Then, a published robust rank aggregation (RRA) method was applied to identify the aberrant genes in HCC development. Training stage: GSE14520 was used as the training set. The diagnostic and prognostic values of aberrant genes were evaluated, and a diagnostic signature was constructed in the training set. Testing stage: GSE36376 and GSE54236 were used as the testing sets. The diagnostic and prognostic values of aberrant genes were further validated in the testing sets.

HCC patient datasets and patient samples

All the HCC datasets (generated on the Affymetrix Human Genome U133 Plus 2.0 Array) were collected from the publicly available GEO database (http://www.ncbi.nlm.nih.gov/geo/). The selection criteria were as follows: (1) all specimens were tissues; cells, serum, and plasma were not included; (2) each dataset had to contain paired HCC tumors and adjacent noncancerous tissues; (3) the sample size had to be greater than three pairs; and (4) where datasets overlapped, the one with the largest sample size was selected. According to these criteria, five datasets were finally included in this study (GSE62232, GSE55092, GSE17548, GSE33006, and GSE6764).[10,11,12,13,14] To validate the results from these gene expression profiles, three other datasets were included: Roessler's study, Lim's study, and Villa's study (GEO accession: GSE14520, GSE36376, and GSE54236).[7,15,16] HCC patients and tumor features are detailed in Table 1. In Villa's study, HCC patients were grouped into four quartiles according to increasing tumor DT: ≤53 days, 54–82 days, 83–110 days, and ≥111 days. The detailed procedure was as follows:[7] patients newly diagnosed with HCC at ultrasound (US) surveillance were eligible if they had a clinical condition that allowed a US-guided liver biopsy of a focal lesion, with the largest lesion biopsied in case of multifocality. To further confirm the HCC diagnosis, a CT scan was performed. To measure the growth of lesions, a second CT was performed 6 weeks later. During the 6-week interval, patients did not undergo any specific treatment. This interval is much shorter than the average time to treatment after HCC diagnosis.[17,18] Therefore, no ethical issues were raised. After the second CT, patients were treated according to international guidelines. Based on these CT measurements, tumor growth was classified as fast growth (first quartile) or slow growth (other quartiles).
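The quartile grouping described above can be written as a small lookup. This is a minimal sketch: the function names are hypothetical, the day bounds are the ones quoted from Villa's study, and the handling of non-integer DT values between the quoted bounds is an assumption.

```python
def dt_quartile(doubling_time_days: float) -> int:
    """Return the DT quartile (1-4) using the bounds quoted in the text."""
    if doubling_time_days <= 53:
        return 1
    if doubling_time_days <= 82:   # assumption: 53 < DT <= 82 maps to quartile 2
        return 2
    if doubling_time_days <= 110:
        return 3
    return 4


def growth_group(doubling_time_days: float) -> str:
    """First quartile is 'fast'; quartiles 2-4 are 'slow'."""
    return "fast" if dt_quartile(doubling_time_days) == 1 else "slow"
```

For example, a tumor with a DT of 53 days falls in the fast-growing group, whereas one with a DT of 54 days falls in the slow-growing group.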
Table 1

Clinical characteristics of patients enrolled in this study

| Variables | Roessler's study GSE14520 (n=242) | Lim's study GSE36376 (n=240) | Villa's study GSE54236 (n=81) | P* |
|---|---|---|---|---|
| Male, n (%) | 211 (87.2) | 199 (82.9) | 61 (75.3) | 0.04 |
| Median age (years) (range) | 50 (22-77) | 53 (45-61) | 67 (44-88) | NA |
| HBV infection, n (%) | 231 (95.5) | 183 (76.3) | 10 (12.3) | <0.0001 |
| Tumor characteristics | | | | |
| Tumor size, median (range) (cm) | 7.2 (1.3-17.5) | 3.7 (2.5-6.2) | 5.8 (3.1-7.4) | NA |
| Single nodular, n (%) | 190 (78.5) | 183 (76.3) | 69 (85.2) | 0.239 |
| Vascular invasion, n (%) | 88 (36.4) | 133 (55.4) | 9 (11.1) | <0.0001 |
| BCLC stage, 0/A/B/C, n | 20/152/24/29 | 0/139/91/10 | 0/56/14/10 | <0.0001 |
| Median follow-up (months) | 51.7 | NA | 25 | |

*Chi-square test, NA=Not available, BCLC=Barcelona Clinic Liver Cancer

Acquisition and analysis of gene expression profiles of HCC patients

The raw array data (.CEL files) of the five gene expression datasets were retrieved from the GEO database and uniformly preprocessed using the Robust Multichip Average algorithm for background correction, quantile normalization, and log2-transformation. The differentially expressed genes (DEGs) in each dataset were then screened on the basis of P ≤ 0.05 and fold change ≥2. Interlab reproducibility of such results is often problematic because of small sample sizes and other between-study differences (such as pathologic staging and surgical outcome). To overcome these limitations, a published RRA method was applied to identify the aberrant genes in HCC development.[19,20] Genes with an adjusted P value < 0.05 in the aggregated list were retained. The analysis was performed with the RobustRankAggreg package in R software (version 3.3.3). The DEGs were then classified into functional gene groups by using the DAVID functional classification method.[21] Unsupervised hierarchical clustering of both HCC patients and aberrant genes was performed with R software by using the Euclidean distance and complete linkage method.
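To illustrate the idea behind rank aggregation, the following Python sketch scores each gene by how unexpectedly high it ranks across several lists. It is a simplified illustration, not the RobustRankAggreg implementation used in the study: the function names and toy gene labels are hypothetical, ranks are normalized by list length, genes missing from a list are scored only on the lists that contain them, and the minimum order-statistic P value is Bonferroni-corrected by the number of lists.

```python
from math import comb


def order_stat_pvalue(x, k, n):
    """P(the k-th smallest of n independent Uniform(0,1) values is <= x)."""
    return sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(k, n + 1))


def rho_score(normalized_ranks):
    """Smallest order-statistic P value, corrected for the n order statistics tested."""
    r = sorted(normalized_ranks)
    n = len(r)
    best = min(order_stat_pvalue(x, k, n) for k, x in enumerate(r, start=1))
    return min(1.0, best * n)


def aggregate(ranked_lists):
    """ranked_lists: gene lists ordered best-first. Returns (gene, score) pairs, best first."""
    scores = {}
    for gene in set().union(*ranked_lists):
        # Normalized rank in (0, 1]; only lists containing the gene contribute.
        ranks = [(lst.index(gene) + 1) / len(lst) for lst in ranked_lists if gene in lst]
        scores[gene] = rho_score(ranks)
    return sorted(scores.items(), key=lambda kv: kv[1])


# Toy input with hypothetical gene labels (not study data).
toy_lists = [
    ["g1", "g2", "g3", "g4"],
    ["g1", "g3", "g2", "g4"],
    ["g2", "g1", "g4", "g3"],
]
ranking = aggregate(toy_lists)
```

In this toy input, g1 is ranked near the top of every list and therefore receives the smallest score; genes with inconsistent ranks receive scores close to 1.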

Statistical analysis

Continuous variables were analyzed by t-test or rank sum test as appropriate. Categorical variables were analyzed by Chi-square test. Binary logistic regression analysis was performed to identify variables independently associated with HCC. To construct a diagnostic model, the candidate genes were fitted in the multivariate logistic regression in the discovery dataset. Odds ratios (OR) and 95% confidence intervals (CI) were estimated by the logistic regression model. To visualize the capacity of the risk signature to discriminate between HCC and non-HCC, we summarized the data in a receiver operating characteristic curve.[22] The Cox proportional hazards method was used to identify risk factors for recurrence-free survival (RFS) and overall survival (OS). RFS was calculated from the date of tumor resection until the detection of tumor recurrence or the last observation. OS was defined as the length of time between surgery and death or the last follow-up. Variables with a P value < 0.05 in univariate analysis were included in the final multivariate model. These variables were then used to build a risk signature. Finally, HCC patients were assigned a risk score according to the risk signature and were divided into high- and low-risk groups using the median of the risk score as the cutoff value. The difference in RFS or OS between high- and low-risk groups was estimated by the Kaplan–Meier method, and statistical significance was assessed by two-sided log-rank test. Hazard ratios (HR) and 95% CI were estimated by the Cox proportional hazards regression model. Statistical analyses were performed with SPSS version 22.0 software (SPSS Inc., Chicago, IL, USA), GraphPad Prism 7 (GraphPad Software, La Jolla, CA), and MedCalc software version 12.2.1 (MedCalc, Mariakerke, Belgium).
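The median split and the Kaplan–Meier estimate described above can be sketched in plain Python. This is an illustration only: the study used SPSS, GraphPad Prism, and MedCalc, the function names are hypothetical, the example follow-up times are invented, and assigning scores above the median to the high-risk group is an assumption.

```python
from statistics import median


def split_by_median(scores):
    """Assign 'high' to scores above the median and 'low' otherwise."""
    cutoff = median(scores)
    return cutoff, ["high" if s > cutoff else "low" for s in scores]


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time with at least one event.

    events: 1 = event observed (recurrence or death), 0 = censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


# Invented follow-up times in months; the second subject is censored.
toy_curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
toy_cutoff, toy_groups = split_by_median([1, 2, 3, 4])
```

Computing one curve per risk group and comparing the two curves with a log-rank test corresponds to the analysis described in the text; the log-rank test itself is not implemented here.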

RESULTS

GEO datasets analysis and candidate gene selection

A total of five datasets were included in our study for comprehensive analysis (GSE62232, GSE55092, GSE17548, GSE33006, and GSE6764). After data processing, 2179 DEGs were found in GSE62232, and 2627, 2533, 1993, and 1158 DEGs were found in GSE55092, GSE17548, GSE33006, and GSE6764, respectively. The RRA method was then used to integrate the DEGs of the five datasets. Finally, 273 genes were identified as the most significantly differential genes. Detailed information is listed in Supplementary Table S1. These 273 genes were classified into functional gene groups by using the DAVID functional classification method. In total, 93 of the 273 genes were classified into 13 functional groups, of which 15 CYP family genes formed the largest cluster with the highest enrichment score of 8.47. We then tested whether these 15 CYP genes could classify HCC and predict the outcome of HCC.
Table S1

Significantly differentially expressed genes after integrated calculating through RRA method

| Upregulated gene | Score | Adjusted P | Downregulated gene | Score | Adjusted P |
|---|---|---|---|---|---|
| SPINK1 | 5.76E-19 | 3.15E-14 | FCN2 | 1.19E-17 | 6.52E-13 |
| AKR1B10 | 2.75E-17 | 1.50E-12 | CLEC1B | 3.03E-16 | 1.65E-11 |
| HMMR | 1.52E-15 | 8.30E-11 | SLC22A1 | 2.12E-15 | 1.16E-10 |
| ASPM | 5.14E-15 | 2.80E-10 | CXCL14 | 5.14E-15 | 2.80E-10 |
| NDC80 | 1.89E-14 | 1.03E-09 | FCN3 | 8.58E-15 | 4.69E-10 |
| ROBO1 | 2.09E-14 | 1.14E-09 | GLS2 | 1.89E-14 | 1.03E-09 |
| CAP2 | 7.35E-14 | 4.01E-09 | CYP39A1 | 2.31E-14 | 1.26E-09 |
| RACGAP1 | 1.64E-13 | 8.97E-09 | CYP1A2 | 4.86E-14 | 2.65E-09 |
| CCNB1 | 2.14E-13 | 1.17E-08 | CYP2B6 | 5.76E-14 | 3.14E-09 |
| TOP2A | 2.89E-13 | 1.58E-08 | CNDP1 | 7.35E-14 | 4.01E-09 |
| GPC3 | 3.89E-13 | 2.12E-08 | CLEC4M | 1.53E-13 | 8.38E-09 |
| PRC1 | 4.13E-13 | 2.26E-08 | C9 | 2.43E-13 | 1.32E-08 |
| RRM2 | 4.88E-13 | 2.66E-08 | CXCL12 | 4.13E-13 | 2.26E-08 |
| SPP1 | 1.19E-12 | 6.51E-08 | APOF | 7.04E-13 | 3.84E-08 |
| DNAJC6 | 1.42E-12 | 7.78E-08 | ESR1 | 7.78E-13 | 4.25E-08 |
| KIF20A | 1.55E-12 | 8.48E-08 | CRHBP | 1.19E-12 | 6.51E-08 |
| IGF2BP3 | 1.62E-12 | 8.84E-08 | TENM1 | 1.25E-12 | 6.81E-08 |
| PRR11 | 2.17E-12 | 1.18E-07 | GHR | 1.42E-12 | 7.78E-08 |
| MAP2 | 4.74E-12 | 2.59E-07 | DNASE1L3 | 1.62E-12 | 8.86E-08 |
| CDKN2C | 7.05E-12 | 3.85E-07 | ADRA1A | 1.84E-12 | 1.01E-07 |
| DLGAP5 | 8.25E-12 | 4.50E-07 | LPA | 1.92E-12 | 1.05E-07 |
| NUSAP1 | 1.18E-11 | 6.44E-07 | HGF | 2.44E-12 | 1.33E-07 |
| BIRC5 | 2.03E-11 | 1.11E-06 | HAO2 | 3.08E-12 | 1.68E-07 |
| KIF4A | 2.74E-11 | 1.50E-06 | IL1RAP | 3.08E-12 | 1.68E-07 |
| NCAPG | 3.24E-11 | 1.77E-06 | HAMP | 4.62E-12 | 2.52E-07 |
| CCNB2 | 3.81E-11 | 2.08E-06 | IGF1 | 6.20E-12 | 3.39E-07 |
| MAGEA1 | 4.26E-11 | 2.33E-06 | LIFR | 7.05E-12 | 3.85E-07 |
| KIF11 | 5.41E-11 | 2.95E-06 | CYP2C8 | 7.51E-12 | 4.10E-07 |
| AURKA | 6.52E-11 | 3.56E-06 | NAT2 | 8.25E-12 | 4.50E-07 |
| SULT1C2 | 7.39E-11 | 4.03E-06 | FBP1 | 1.08E-11 | 5.90E-07 |
| CCNA2 | 8.28E-11 | 4.52E-06 | LCAT | 1.18E-11 | 6.44E-07 |
| CDK1 | 1.14E-10 | 6.24E-06 | GBA3 | 1.64E-11 | 8.98E-07 |
| E2F8 | 1.76E-10 | 9.63E-06 | NNMT | 1.69E-11 | 9.23E-07 |
| CCNE2 | 2.01E-10 | 1.10E-05 | MARCO | 1.88E-11 | 1.03E-06 |
| BUB1 | 2.04E-10 | 1.12E-05 | SLCO1B3 | 1.89E-11 | 1.03E-06 |
| LCN2 | 2.42E-10 | 1.32E-05 | ALDOB | 2.36E-11 | 1.29E-06 |
| CENPA | 2.80E-10 | 1.53E-05 | RDH16 | 4.26E-11 | 2.33E-06 |
| C1orf112 | 3.06E-10 | 1.67E-05 | SPP2 | 4.65E-11 | 2.54E-06 |
| CDKN3 | 3.45E-10 | 1.88E-05 | CD5L | 6.93E-11 | 3.78E-06 |
| AKR1C3 | 3.87E-10 | 2.11E-05 | MT1M | 7.97E-11 | 4.35E-06 |
| PBK | 6.92E-10 | 3.78E-05 | PLAC8 | 1.16E-10 | 6.34E-06 |
| BUB1B | 7.55E-10 | 4.12E-05 | COLEC11 | 1.18E-10 | 6.45E-06 |
| ECT2 | 9.69E-10 | 5.29E-05 | CYP2C9 | 1.51E-10 | 8.26E-06 |
| EDIL3 | 1.01E-09 | 5.51E-05 | CYP2E1 | 1.65E-10 | 9.00E-06 |
| CDKN2A | 1.05E-09 | 5.71E-05 | SRD5A2 | 1.68E-10 | 9.15E-06 |
| DEPDC1 | 1.11E-09 | 6.05E-05 | GLYAT | 2.11E-10 | 1.15E-05 |
| KIAA0101 | 1.21E-09 | 6.63E-05 | HGFAC | 2.11E-10 | 1.15E-05 |
| TRIM16 | 1.53E-09 | 8.36E-05 | C8orf4 | 2.18E-10 | 1.19E-05 |
| PRKAA2 | 2.31E-09 | 0.000126011 | AFM | 2.32E-10 | 1.27E-05 |
| CDC20 | 2.54E-09 | 0.00013875 | ZG16 | 2.36E-10 | 1.29E-05 |
| EZH2 | 2.69E-09 | 0.00014687 | IGFBP3 | 2.76E-10 | 1.51E-05 |
| TTK | 3.15E-09 | 0.000172213 | ATF5 | 3.02E-10 | 1.65E-05 |
| LEF1 | 3.45E-09 | 0.000188269 | SOCS2 | 3.35E-10 | 1.83E-05 |
| ACSL4 | 4.21E-09 | 0.000230079 | NPY1R | 3.71E-10 | 2.03E-05 |
| CENPF | 5.10E-09 | 0.00027849 | KDM8 | 4.52E-10 | 2.47E-05 |
| DTL | 8.21E-09 | 0.000448485 | COLEC10 | 4.58E-10 | 2.50E-05 |
| MELK | 9.44E-09 | 0.000515558 | MRC1 | 5.11E-10 | 2.79E-05 |
| CDKN2B | 9.76E-09 | 0.000532784 | XDH | 5.61E-10 | 3.06E-05 |
| NEK2 | 1.10E-08 | 0.000603006 | STEAP4 | 6.15E-10 | 3.36E-05 |
| FGF13 | 1.10E-08 | 0.000603006 | MT1F | 6.62E-10 | 3.62E-05 |
| COL15A1 | 1.26E-08 | 0.000686421 | MBL2 | 6.99E-10 | 3.82E-05 |
| CLGN | 1.27E-08 | 0.000692295 | SLC7A2 | 7.17E-10 | 3.92E-05 |
| STXBP6 | 1.85E-08 | 0.00100832 | CYP2C18 | 7.64E-10 | 4.17E-05 |
| ITGA6 | 2.15E-08 | 0.001173191 | DCN | 8.22E-10 | 4.49E-05 |
| RRAGD | 2.20E-08 | 0.001199474 | STAB2 | 8.22E-10 | 4.49E-05 |
| FAM169A | 2.44E-08 | 0.001333329 | CIDEB | 8.43E-10 | 4.60E-05 |
| MAGEA6 | 2.44E-08 | 0.001333329 | CYP4A11 | 9.28E-10 | 5.07E-05 |
| PEG10 | 2.72E-08 | 0.001483161 | CYP2A6 | 9.30E-10 | 5.08E-05 |
| KIF14 | 2.76E-08 | 0.001509308 | RCAN1 | 1.30E-09 | 7.09E-05 |
| SLC7A11 | 2.81E-08 | 0.0015358 | SRPX | 1.32E-09 | 7.22E-05 |
| MAD2L1 | 3.01E-08 | 0.001645264 | ZGPAT | 1.42E-09 | 7.75E-05 |
| ENAH | 3.44E-08 | 0.001878771 | LY6E | 1.45E-09 | 7.92E-05 |
| TKT | 3.62E-08 | 0.001976497 | VNN1 | 1.48E-09 | 8.06E-05 |
| MAGEA3 | 4.38E-08 | 0.002392275 | MASP1 | 1.55E-09 | 8.46E-05 |
| PTTG1 | 4.66E-08 | 0.002544424 | CA2 | 1.69E-09 | 9.21E-05 |
| ZWINT | 5.33E-08 | 0.002913182 | KCNN2 | 1.76E-09 | 9.63E-05 |
| CENPU | 6.08E-08 | 0.003320583 | CXCL2 | 2.01E-09 | 0.000110033 |
| APOBEC3B | 1.17E-07 | 0.006411243 | KBTBD11 | 2.25E-09 | 0.000123013 |
| DLG5 | 1.80E-07 | 0.009808666 | KAZN | 2.34E-09 | 0.000128026 |
| TXNRD1 | 1.94E-07 | 0.010578678 | CETP | 3.07E-09 | 0.000167869 |
| EFCAB2 | 4.75E-07 | 0.025921521 | GYS2 | 3.15E-09 | 0.000172213 |
| CKAP2 | 4.79E-07 | 0.02614349 | MT1G | 3.25E-09 | 0.000177446 |
| SLC38A6 | 5.30E-07 | 0.028919541 | MT1H | 3.35E-09 | 0.000182797 |
| NRCAM | 5.90E-07 | 0.032238372 | GPM6A | 3.60E-09 | 0.000196524 |
| DHRS2 | 6.29E-07 | 0.034360482 | THBS1 | 5.29E-09 | 0.000289149 |
| TPX2 | 6.33E-07 | 0.034574782 | AKR1D1 | 5.81E-09 | 0.000317473 |
| FAT1 | 6.85E-07 | 0.037402125 | MT1E | 6.12E-09 | 0.000334159 |
| SMPX | 8.02E-07 | 0.043778089 | MT1X | 6.28E-09 | 0.000342744 |
| HOXA3 | 8.40E-07 | 0.045901344 | HABP2 | 6.83E-09 | 0.000373189 |
| | | | GREM2 | 7.28E-09 | 0.000397764 |
| | | | PLG | 7.64E-09 | 0.000417031 |
| | | | GSTZ1 | 7.76E-09 | 0.000423617 |
| | | | AGXT | 9.12E-09 | 0.000497916 |
| | | | MYO10 | 9.19E-09 | 0.000501698 |
| | | | ACSM3 | 9.33E-09 | 0.00050933 |
| | | | MOGAT2 | 9.47E-09 | 0.000517054 |
| | | | ECM1 | 9.61E-09 | 0.000524872 |
| | | | GNMT | 9.88E-09 | 0.000539501 |
| | | | ADH1A | 1.05E-08 | 0.000573787 |
| | | | ADAMTS13 | 1.18E-08 | 0.000644603 |
| | | | ANXA10 | 1.45E-08 | 0.000794344 |
| | | | TMEM45A | 1.58E-08 | 0.000861452 |
| | | | PDGFRA | 1.67E-08 | 0.000914505 |
| | | | TDO2 | 1.74E-08 | 0.000951209 |
| | | | ASS1 | 1.77E-08 | 0.000968444 |
| | | | FOS | 1.78E-08 | 0.000969969 |
| | | | SLC10A1 | 1.85E-08 | 0.00100832 |
| | | | BBOX1 | 1.99E-08 | 0.001088418 |
| | | | AZGP1 | 2.02E-08 | 0.00110373 |
| | | | FGFR2 | 2.15E-08 | 0.001173191 |
| | | | EPB41L4B | 2.32E-08 | 0.001269561 |
| | | | SH3YL1 | 2.49E-08 | 0.001357476 |
| | | | KMO | 2.91E-08 | 0.001589826 |
| | | | C7 | 3.17E-08 | 0.00173112 |
| | | | ANGPTL6 | 3.23E-08 | 0.001761951 |
| | | | ADH1C | 3.58E-08 | 0.001956276 |
| | | | PRG4 | 3.62E-08 | 0.001976497 |
| | | | CD1D | 3.73E-08 | 0.002036314 |
| | | | SLCO4C1 | 3.99E-08 | 0.002176958 |
| | | | HBB | 4.05E-08 | 0.002211793 |
| | | | FETUB | 4.18E-08 | 0.002282715 |
| | | | MT1HL1 | 4.45E-08 | 0.002429655 |
| | | | MCC | 4.80E-08 | 0.002623161 |
| | | | SHBG | 4.87E-08 | 0.002658396 |
| | | | MT2A | 5.49E-08 | 0.003000286 |
| | | | IGFALS | 5.68E-08 | 0.00310222 |
| | | | RBMS3 | 6.37E-08 | 0.003476853 |
| | | | SLC22A7 | 6.37E-08 | 0.003476853 |
| | | | PCK1 | 6.71E-08 | 0.00366583 |
| | | | CHST4 | 6.84E-08 | 0.003733479 |
| | | | HAO1 | 7.67E-08 | 0.004187743 |
| | | | OLFML3 | 9.02E-08 | 0.004927537 |
| | | | CFP | 1.14E-07 | 0.006235826 |
| | | | FAM134B | 1.16E-07 | 0.006333836 |
| | | | C6 | 1.20E-07 | 0.00656818 |
| | | | GRAMD1C | 1.47E-07 | 0.008021078 |
| | | | TFPI2 | 1.64E-07 | 0.008980903 |
| | | | TAT | 1.70E-07 | 0.009284924 |
| | | | TRPM8 | 1.74E-07 | 0.009491841 |
| | | | CPEB3 | 1.76E-07 | 0.009596585 |
| | | | BHMT | 1.86E-07 | 0.010133342 |
| | | | CYP2A7 | 1.98E-07 | 0.010806761 |
| | | | CYP26A1 | 2.06E-07 | 0.011273951 |
| | | | NDRG2 | 2.17E-07 | 0.011875047 |
| | | | SLC1A1 | 2.36E-07 | 0.012897255 |
| | | | KCND3 | 2.41E-07 | 0.01316171 |
| | | | ADH6 | 2.58E-07 | 0.014062978 |
| | | | HAL | 2.66E-07 | 0.014545375 |
| | | | ASPA | 3.05E-07 | 0.016661935 |
| | | | ANK3 | 3.05E-07 | 0.016661935 |
| | | | F9 | 3.08E-07 | 0.016821456 |
| | | | CYP3A4 | 3.17E-07 | 0.017306885 |
| | | | ADH1B | 3.58E-07 | 0.019532585 |
| | | | CYP2C19 | 3.61E-07 | 0.019712234 |
| | | | G6PC | 3.81E-07 | 0.020816234 |
| | | | FOSB | 3.99E-07 | 0.021771039 |
| | | | OAT | 4.02E-07 | 0.021965865 |
| | | | ASPN | 4.39E-07 | 0.023986679 |
| | | | ART4 | 4.47E-07 | 0.024406973 |
| | | | CYP4F2 | 4.63E-07 | 0.025264069 |
| | | | MASP2 | 4.71E-07 | 0.025700965 |
| | | | FOLH1B | 5.08E-07 | 0.027737347 |
| | | | ACADL | 5.12E-07 | 0.027970839 |
| | | | NAMPT | 5.38E-07 | 0.029402819 |
| | | | GADD45B | 5.56E-07 | 0.030387478 |
| | | | MFAP3L | 5.70E-07 | 0.031142006 |
| | | | ITGA9 | 5.94E-07 | 0.032430592 |
| | | | SLC19A3 | 6.28E-07 | 0.03430112 |
| | | | ABCA8 | 6.80E-07 | 0.037111871 |
| | | | FAM13A | 7.23E-07 | 0.039481728 |
| | | | PTGIS | 8.10E-07 | 0.044228221 |
| | | | EPB41L4A | 8.22E-07 | 0.044891898 |
| | | | SERPINA4 | 8.78E-07 | 0.047971072 |
| | | | KLKB1 | 8.85E-07 | 0.048322697 |
| | | | BCHE | 9.04E-07 | 0.049389145 |

Identification of CYP family genes associated with HCC

Three datasets were used to validate the association between CYP family genes and HCC: Roessler's study, Lim's study, and Villa's study (GEO accession: GSE14520, GSE36376, and GSE54236). A total of 242 patients were enrolled in Roessler's study, the largest dataset in our study, so we used it as the training set. The remaining two were used as the testing sets. Patients in Villa's study were mostly Caucasians, while patients in Roessler's study and Lim's study were mostly Asians. Table 1 summarizes the clinical characteristics of the patients in the training and testing sets. The three sets were heterogeneous in some characteristics, such as sex distribution, hepatitis B virus infection, vascular invasion, and Barcelona Clinic Liver Cancer (BCLC) stage. Such heterogeneity may help to ensure that molecular signatures have real-world applicability across heterogeneous patient populations. In addition, in Villa's study, HCC patients were grouped into a fast-growing group (n = 20) and a slow-growing group (n = 61) according to increasing tumor DT. Kaplan–Meier analysis showed a significantly lower survival rate for HCC cases in the fast-growing group than in the slow-growing group (P < 0.0001).[7] The expression of the 15 CYP genes in HCC tissues was examined in the training set, and all 15 genes were significantly decreased in HCC tissues compared with the matched nontumor tissues [Figure 1b]. Unsupervised hierarchical clustering of the 484 tissues according to the expression patterns of these 15 CYP genes showed two distinct clusters, which were highly correlated with HCC (P = 6.53E-9, Chi-square test; Figure 1a). Cluster I contained most of the nontumor tissues (n = 236; 97.5%), whereas cluster II contained the majority of tumor tissues (n = 241; 99.6%).
Figure 1

Deregulated cytochrome P450 (CYP) family genes in HCC tumor tissues in the training set. (a) The unsupervised hierarchical clustering heat map of 242 HCC samples and 242 matched adjacent nontumor livers; each row represents an individual tissue sample and each column represents the expression level of an individual CYP gene. (b) Relative expression of the 15 CYP genes in 242 HCC tumor tissues and 242 adjacent nontumor tissues. T = Tumor tissues, N = nontumor tissues

Construction of diagnostic signature from the training set and validating this signature in the testing sets

In univariate analysis, all 15 genes were confirmed to be significantly differentially expressed between HCC tissues and nontumor tissues. In multivariate analysis, 4 of the 15 genes reached statistical significance and were used to construct the diagnostic model [Table 2]. The model was as follows: logit (P) = 47.896 − 0.721 × CYP1A2 − 1.132 × CYP2E1 − 1.320 × CYP2A7 − 3.736 × PTGIS. The best cutoff point of this model was −0.6513; a logit value above −0.6513 suggested HCC. The area under the curve (AUC) for the established four-gene expression signature was 0.991 (95% CI, 0.977–0.997; Figure 2a), higher than the diagnostic value of any of the four genes alone (the AUCs of CYP1A2, CYP2E1, CYP2A7, and PTGIS were 0.973, 0.877, 0.931, and 0.874, respectively).
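The diagnostic model above can be evaluated with a few lines of Python. This is a minimal sketch: the intercept, coefficients, and cutoff are the values reported in the text, while the function names, the assumption that inputs are normalized log2 expression values on the same scale as the training data, the reading of the cutoff as a threshold on the logit, and the example expression values are all illustrative.

```python
from math import exp

INTERCEPT = 47.896
COEFFS = {"CYP1A2": -0.721, "CYP2E1": -1.132, "CYP2A7": -1.320, "PTGIS": -3.736}
CUTOFF = -0.6513  # assumed to apply to the logit value


def signature_logit(expression):
    """Linear predictor of the four-gene model for one sample."""
    return INTERCEPT + sum(coef * expression[gene] for gene, coef in COEFFS.items())


def predicted_probability(expression):
    """Inverse-logit transform of the linear predictor."""
    return 1.0 / (1.0 + exp(-signature_logit(expression)))


def suggests_hcc(expression):
    """True when the logit exceeds the reported cutoff."""
    return signature_logit(expression) > CUTOFF


# Hypothetical expression values, for illustration only.
low_expression = {"CYP1A2": 7.0, "CYP2E1": 7.0, "CYP2A7": 7.0, "PTGIS": 7.0}
high_expression = {"CYP1A2": 8.0, "CYP2E1": 8.0, "CYP2A7": 8.0, "PTGIS": 8.0}
```

Because every coefficient is negative, lower expression of the four genes raises the logit and pushes a sample toward the HCC call. Exponentiating each coefficient gives approximately the multivariate odds ratios in Table 2 (for example, exp(−0.721) ≈ 0.486 for CYP1A2).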
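Table 2

Univariate and multivariate logistic regression analysis in the training set

| Genes | Univariate OR* | Univariate 95% CI | Univariate P | Multivariate OR | Multivariate 95% CI | Multivariate P |
|---|---|---|---|---|---|---|
| CYP39A1 | 0.190 | (0.140-0.257) | <0.001 | | | |
| CYP1A2 | 0.239 | (0.181-0.316) | <0.001 | 0.486 | (0.318-0.744) | 0.001 |
| CYP2B6 | 0.115 | (0.079-0.168) | <0.001 | | | |
| CYP2C8 | 0.004 | (0.001-0.012) | <0.001 | | | |
| CYP2C9 | 0.036 | (0.018-0.069) | <0.001 | | | |
| CYP2E1 | 0.183 | (0.122-0.276) | <0.001 | 0.322 | (0.161-0.644) | 0.001 |
| CYP2C18 | 0.222 | (0.164-0.301) | <0.001 | | | |
| CYP4A11 | 0.058 | (0.033-0.101) | <0.001 | | | |
| CYP2A6 | 0.255 | (0.196-0.334) | <0.001 | | | |
| CYP2A7 | 0.186 | (0.139-0.250) | <0.001 | 0.267 | (0.151-0.472) | <0.001 |
| CYP26A1 | 0.174 | (0.126-0.240) | <0.001 | | | |
| CYP3A4 | 0.130 | (0.085-0.197) | <0.001 | | | |
| CYP2C19 | 0.153 | (0.111-0.212) | <0.001 | | | |
| CYP4F2 | 0.124 | (0.082-0.186) | <0.001 | | | |
| PTGIS | 0.007 | (0.003-0.017) | <0.001 | 0.024 | (0.005-0.111) | <0.001 |

OR=Odds ratio, CI=confidence intervals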

Figure 2

Receiver operating characteristic (ROC) curve analysis of the four-gene (CYP1A2, CYP2E1, CYP2A7, and PTGIS) signature in the training and testing sets. In order to compare the predictive value of the four-gene signature, we analyzed the ROC curve of the signature in different datasets. ROC plots for the four-gene panel discriminating HCC in the (a) training set, (b) Lim's dataset, (c) Villa's study; (d) ROC plots for the four-gene panel discriminating between fast-growing HCC and slow-growing HCC. AUC = Area under the curve

To confirm our findings, the diagnostic ability of the four-gene expression signature was validated in the two testing sets. In Lim's study, with the same cutoff point, the AUC of the four-gene signature was 0.973 (95% CI, 0.953–0.986; Figure 2b). In Villa's study, the four-gene signature distinguished HCC with an AUC of 0.852 (95% CI, 0.787–0.903; Figure 2c). Moreover, the signature distinguished fast-growing from slow-growing HCC (AUC = 0.898; 95% CI, 0.810–0.954; Figure 2d) with high sensitivity and specificity (95% and 85.25%, respectively).
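The sensitivity, specificity, and AUC figures above follow from simple counting, as the sketch below shows. The function names are hypothetical, and the counts used in the example (19 of 20 fast-growing and 52 of 61 slow-growing tumors classified correctly) are back-calculated from the reported percentages and group sizes rather than taken from the paper; the score lists are invented.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of positive cases that the test flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of negative cases that the test clears."""
    return true_neg / (true_neg + false_pos)


def auc(pos_scores, neg_scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))


# Inferred counts for fast-growing (n = 20) versus slow-growing (n = 61) HCC.
fast_vs_slow_sensitivity = sensitivity(true_pos=19, false_neg=1)
fast_vs_slow_specificity = specificity(true_neg=52, false_pos=9)

# Invented scores, only to exercise the AUC function.
toy_auc = auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.2])
```

Under these inferred counts, the sensitivity is 19/20 = 95% and the specificity is 52/61 ≈ 85.25%, matching the reported values.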

Performance of the CYP family genes in HCC outcomes

In the training set, univariate Cox proportional hazards regression was applied to each of the 15 genes. Seven of the 15 genes were significantly correlated with patients' RFS [Table 3], and 7 of the 15 genes were significantly correlated with patients' OS [Supplementary Table S2]. In multivariate analysis, only CYP2C8 remained significantly correlated with both RFS (HR = 0.809; 95% CI, 0.712–0.919) and OS (HR = 0.735; 95% CI, 0.634–0.853).
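Table 3

Cox regression analysis of recurrence-free survival in the training set

| Genes | Univariate HR* | Univariate 95% CI | Univariate P | Multivariate HR | Multivariate 95% CI | Multivariate P |
|---|---|---|---|---|---|---|
| CYP39A1 | 0.922 | (0.823-1.032) | 0.158 | | | |
| CYP1A2 | 0.909 | (0.815-1.014) | 0.087 | | | |
| CYP2B6 | 0.869 | (0.724-1.043) | 0.132 | | | |
| CYP2C8 | 0.809 | (0.712-0.919) | 0.001 | 0.809 | (0.712-0.919) | 0.001 |
| CYP2C9 | 0.945 | (0.852-1.048) | 0.281 | | | |
| CYP2E1 | 0.974 | (0.920-1.031) | 0.357 | | | |
| CYP2C18 | 0.990 | (0.889-1.103) | 0.857 | | | |
| CYP4A11 | 0.869 | (0.780-0.967) | 0.010 | | | |
| CYP2A6 | 0.899 | (0.834-0.970) | 0.006 | | | |
| CYP2A7 | 0.864 | (0.767-0.973) | 0.016 | | | |
| CYP26A1 | 0.744 | (0.554-1.000) | 0.050 | | | |
| CYP3A4 | 0.868 | (0.774-0.973) | 0.015 | | | |
| CYP2C19 | 1.023 | (0.850-1.231) | 0.808 | | | |
| CYP4F2 | 0.873 | (0.778-0.980) | 0.021 | | | |
| PTGIS | 0.994 | (0.608-1.625) | 0.981 | | | |

HR=Hazard ratio, CI=confidence intervals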

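Table S2

Cox regression analysis of overall survival in the training set

| Genes | Univariate HR | Univariate 95% CI | Univariate P | Multivariate HR | Multivariate 95% CI | Multivariate P |
|---|---|---|---|---|---|---|
| CYP39A1 | 0.944 | (0.826-1.080) | 0.403 | | | |
| CYP1A2 | 0.906 | (0.792-1.035) | 0.146 | | | |
| CYP2B6 | 0.890 | (0.714-1.108) | 0.295 | | | |
| CYP2C8 | 0.735 | (0.634-0.853) | <0.001 | 0.735 | (0.634-0.853) | <0.001 |
| CYP2C9 | 0.868 | (0.770-0.979) | 0.021 | | | |
| CYP2E1 | 0.948 | (0.888-1.013) | 0.116 | | | |
| CYP2C18 | 0.943 | (0.829-1.072) | 0.370 | | | |
| CYP4A11 | 0.776 | (0.683-0.882) | <0.001 | | | |
| CYP2A6 | 0.854 | (0.778-0.938) | 0.001 | | | |
| CYP2A7 | 0.800 | (0.690-0.926) | 0.003 | | | |
| CYP26A1 | 0.779 | (0.553-1.096) | 0.152 | | | |
| CYP3A4 | 0.797 | (0.692-0.918) | 0.002 | | | |
| CYP2C19 | 1.004 | (0.796-1.267) | 0.970 | | | |
| CYP4F2 | 0.832 | (0.724-0.956) | 0.010 | | | |
| PTGIS | 1.087 | (0.605-1.952) | 0.780 | | | |
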
Each patient in the training set was classified into different prognostic groups (the high- and low-risk groups) according to the median value of CYP2C8 (6.39). Patients in high-risk group had mean and median RFS periods of 32.841 ± 2.468 and 23 months, respectively, whereas the mean and median RFS periods for patients in low-risk group were 44.960 ± 2.298 and 59.5 months, respectively. Correspondingly, the Kaplan–Meier analysis demonstrated a significant difference in RFS between these two groups (P = 0.0009; Figure 3a). Meanwhile, patients in high-risk group had significantly shorter OS period than those in low-risk group (mean 41.239 ± 2.461 vs. 54.519 ± 1.947 months; median 51.6 vs. 64.7 months; P < 0.0001, log-rank test; Figure 3b). Besides, CYP2C8 was downregulated in BCLC stage B–C when compared with BCLC stage 0–A (P = 0.023).
Figure 3

Kaplan–Meier curve for recurrence-free survival (RFS) and overall survival (OS) in patients with HCC with high- or low risk according to the median value of CYP2C8. (a, b) RFS and OS in the training set, and (c) OS in Villa's set

Cox regression analysis identified CYP2C8 (HR = 0.865; 95% CI, 0.754–0.992; P = 0.038), Tumor Node Metastasis (TNM) stage (HR = 1.641; 95% CI, 1.195–2.254; P = 0.002), and BCLC stage (HR = 1.760; 95% CI, 1.312–2.360; P < 0.0001) as independent risk factors for RFS. CYP2C8 (HR = 0.849; 95% CI, 0.716–0.995; P = 0.033), TNM stage (HR = 1.723; 95% CI, 1.189–2.498; P = 0.004), and BCLC stage (HR = 1.582; 95% CI, 1.120–2.236; P = 0.009) were also independent risk factors for OS. Moreover, Villa's dataset was used to validate the prognostic efficiency of CYP2C8. Patients were classified into high- and low-risk groups with the same cutoff point. Patients in high-risk group had mean and median OS periods of 20.450 ± 2.553 and 19 months, respectively, whereas the mean and median OS periods for patients in low-risk group were 39.217 ± 2.440 and 47 months, respectively. Kaplan–Meier curve analysis of survival showed a significantly lower survival rate for patients in high-risk group (P = 0.004, Figure 3c).

DISCUSSION

Progression of HCC often leads to vascular invasion and intrahepatic metastasis, which correlate with recurrence after surgical treatment and poor prognosis. Surgical resection and liver transplantation are the only curative treatments for HCC, but few patients are eligible. Even among patients who undergo surgery, tumor recurrence occurs in more than 70% of cases within 5 years, and the 5-year survival rate is 60–70%.[23,24] In recent years, great efforts have been made to improve our understanding of the possible mechanisms of progression, metastasis, and recurrence at the protein, mRNA, and noncoding RNA levels, so that patients can benefit from adjuvant therapy.[25,26] More recently, many molecular tools have been developed to assist in patient stratification and prediction with microarray analysis.[8,9,15] Nevertheless, classification and prediction can still be improved because high-throughput microarrays contain a large amount of information. Moreover, gene expression patterns and their prognostic value for HCC have not been systematically investigated. In this study, we demonstrated that the expression of 15 CYP family genes was significantly decreased in HCC tissues compared with matched nontumor tissues. A four-gene diagnostic signature was constructed to distinguish HCC from noncancerous liver, and the results were robust. In addition, CYP2C8 was identified as an independent risk factor for survival.

To date, several studies have performed integrated analyses of gene expression profiles in HCC,[27,28,29,30,31] but their integration strategies differ. Shiraishi et al. performed integrated and comparative analyses of the whole genomes and transcriptomes of 22 HBV-related HCCs and their matched controls. The results showed that various types of genomic mutations triggered diverse transcriptional changes.[27] In another study, Wang et al. repurposed 7 GEO datasets, which included a total of 267 HCC samples and 67 control samples, and then reanalyzed the genes that differed between these 2 groups.[28] Zheng et al. used only one GEO dataset,[29] whereas Chen et al. used three GEO datasets and one miRNA dataset to obtain DEGs and miRNAs.[30] Zhou et al. chose four datasets because they considered these datasets to represent different racial populations.[31]

In our study, gene expression profiles were obtained by repurposing five GEO datasets. Although each of the five datasets used the Affymetrix Human Genome U133 Plus 2.0 Array to analyze gene expression patterns, the interlaboratory reproducibility of results is often problematic owing to small sample sizes and other factors (such as pathologic staging and surgical outcome). To overcome these limitations, the robust rank aggregation (RRA) approach was applied in this study. It was specifically designed to compare several ranked gene lists and identify commonly overlapping genes. The method assigns a P value to each element in the aggregated list, indicating how much better it is ranked than expected under a null model of random ordering. Finally, 273 genes were identified as the most significantly differentially expressed genes. This method has several strengths. Most importantly, it is based on a statistical model that naturally allows the significance of the results to be evaluated. In addition, RRA is easy to compute and robust; its use is not restricted to a certain subset of problems, nor does it require all data to be of top quality. The method can also handle the variable gene content of different microarray platforms.
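
The scoring idea behind RRA can be sketched in a few lines of Python. This is a simplified illustration of the rank-based score as described for the method,[20] not the code used in this study; the gene names and ranks are invented, and `beta_scores` and `rho_score` are hypothetical helper names. Each gene's ranks are normalized to the interval (0, 1]. Under the null model of random ordering these values behave like uniform random numbers, so the probability that the k-th smallest of n normalized ranks is at most the observed value is a binomial tail sum. The score is the smallest of these probabilities, multiplied by n as a Bonferroni-style correction.

```python
from math import comb

def beta_scores(norm_ranks):
    """For each k, P(k-th smallest of n uniform values <= observed k-th rank)."""
    r = sorted(norm_ranks)
    n = len(r)
    scores = []
    for k, x in enumerate(r, start=1):
        # Binomial tail: at least k of the n uniform values fall at or below x.
        tail = sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(k, n + 1))
        scores.append(tail)
    return scores

def rho_score(norm_ranks):
    """Minimum beta score, multiplied by n and capped at 1."""
    n = len(norm_ranks)
    return min(1.0, min(beta_scores(norm_ranks)) * n)

# Hypothetical ranks of three genes in five ranked lists of 100 genes each.
# None marks a list in which the gene is absent.
list_length = 100
ranks = {
    "GENE_A": [1, 3, 2, 5, 4],       # consistently near the top
    "GENE_B": [40, 75, 12, 90, 55],  # scattered, as expected by chance
    "GENE_C": [2, None, 1, None, 6], # near the top, but absent from two lists
}

for gene, gene_ranks in ranks.items():
    present = [r / list_length for r in gene_ranks if r is not None]
    print(gene, rho_score(present))
```

A gene that is absent from some lists is scored only on the lists that contain it, so it does not have to be dropped.
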
By defining the rank vector for each gene based only on the datasets in which it is present, we do not have to omit genes that are absent from some platforms.[20]

In the four-gene signature, the low CYP1A2 expression observed in HCC has also been reported in other studies.[32,33] CYP1A2 metabolizes 17β-estradiol to generate the potent antitumor agent 2-methoxyestradiol in HCC. Reduced CYP1A2 expression significantly disrupts this metabolic pathway, contributing to the progression and growth of HCC.[33] Previous studies also suggest functional relationships among the genes in the signature identified in this study. Fan et al. reported that CYP2E1 was expressed at low levels in 70% of tumor tissues compared with adjacent nontumor tissues, at both the mRNA and protein levels. Low CYP2E1 expression was significantly correlated with an aggressive tumor phenotype, including poor differentiation status, absence of tumor capsule, and younger patient age.[34] Moreover, HBx inhibits human CYP2E1 gene expression by downregulating HNF4α, which promotes human hepatoma cell growth.[35] However, CYP2A7 and PTGIS are poorly studied in HCC, and further research may clarify how these genes interact with HCC.

Lastly, using multivariate Cox proportional hazards regression analysis, our study demonstrated that only 1 (CYP2C8) of the 15 genes was significantly correlated with patients' RFS and OS. Even so, the multivariate analysis indicated that CYP2C8 is a significant survival-related risk factor independent of the well-known BCLC staging system. This implies that prognostic prediction for HCC could be improved by combining CYP2C8 with the existing staging system. However, the mechanisms of CYP2C8 in HCC remain unclear.

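
To make the direction of the reported effects concrete: in a Cox model the hazard ratio equals exp(β), so an HR below 1 corresponds to a lower hazard per unit increase of the covariate. The Python sketch below back-calculates the log-hazard coefficient and an approximate standard error from HRs and 95% CIs reported in the Results. It assumes that the intervals are symmetric Wald intervals on the log scale and that each covariate entered the model as a single numeric term; the paper states neither, and the helper name is hypothetical.

```python
from math import log

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval

def summarize_hr(hr, ci_low, ci_high):
    """Return (beta, approximate SE, percent change in hazard per unit)."""
    beta = log(hr)
    # Assumption: CI = exp(beta +/- Z_95 * SE), i.e. a Wald interval.
    se = (log(ci_high) - log(ci_low)) / (2 * Z_95)
    pct_change = (hr - 1.0) * 100.0
    return beta, se, pct_change

# HR and 95% CI values as reported in the Results.
reported = {
    "CYP2C8, RFS": (0.865, 0.754, 0.992),
    "CYP2C8, OS": (0.849, 0.716, 0.995),
    "BCLC stage, RFS": (1.760, 1.312, 2.360),
}

for name, (hr, lo, hi) in reported.items():
    beta, se, pct = summarize_hr(hr, lo, hi)
    print(f"{name}: beta = {beta:.3f}, SE ~ {se:.3f}, hazard change = {pct:+.1f}% per unit")
```

Under these assumptions, the negative coefficient for CYP2C8 indicates that higher expression is associated with a lower hazard, whereas a higher BCLC stage is associated with a higher hazard.
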
A major strength of our study is that the samples were derived from three large populations of different races, which supports the real-world applicability of the molecular signatures. Another advantage is that rigorous data-processing methods and statistical analysis made our results reliable.

Nonetheless, our study also has limitations. First, to measure the growth of lesions, a second CT scan was performed 6 weeks after the first. During the 6-week interval, patients did not undergo any specific treatment. This interval is much shorter than the average time to treatment after HCC diagnosis; therefore, no ethical issues were raised. However, this interval could not be compared with the average waiting time for treatment, as the latter usually varies for unintended reasons. Second, our work needs to be re-evaluated in a dedicated cohort of patients with proper follow-up or in case-control studies. The signatures should also be validated in samples measured by qRT-PCR; we therefore need to develop the signatures in tissues using the qRT-PCR method. In addition, we did not study the mechanisms of the screened CYP family genes; whether these genes affect the biological functions of HCC cells remains to be studied.

CONCLUSION

A four-gene signature was identified that can discriminate between HCC and nontumor tissues and identify a subgroup of patients with rapidly growing HCC. Moreover, CYP2C8 can be used as an independent prognostic risk factor. These results may not only help to identify HCC patients at high risk of rapid growth and recurrence but could also provide insight into the mechanisms of HCC progression, metastasis, and recurrence.

Financial support and sponsorship

There was no financial support or sponsorship.

Conflicts of interest

There are no conflicts of interest.
  35 in total

1.  Molecular pathogenesis of human hepatocellular carcinoma.

Authors:  Snorri S Thorgeirsson; Joe W Grisham
Journal:  Nat Genet       Date:  2002-08       Impact factor: 38.330

2.  Survival of patients with hepatocellular carcinoma in cirrhosis: a comparison of BCLC, CLIP and GRETCH staging systems.

Authors:  C Cammà; V Di Marco; G Cabibbo; F Latteri; L Sandonato; P Parisi; M Enea; M Attanasio; M Galia; N Alessi; A Licata; M A Latteri; A Craxì
Journal:  Aliment Pharmacol Ther       Date:  2008-03-27       Impact factor: 8.171

3.  Prognosis of hepatocellular carcinoma: comparison of 7 staging systems in an American cohort.

Authors:  Jorge A Marrero; Robert J Fontana; Ashley Barrat; Frederick Askari; Hari S Conjeevaram; Grace L Su; Anna S Lok
Journal:  Hepatology       Date:  2005-04       Impact factor: 17.425

4.  A unique metastasis gene signature enables prediction of tumor relapse in early-stage hepatocellular carcinoma patients.

Authors:  Stephanie Roessler; Hu-Liang Jia; Anuradha Budhu; Marshonna Forgues; Qing-Hai Ye; Ju-Seog Lee; Snorri S Thorgeirsson; Zhongtang Sun; Zhao-You Tang; Lun-Xiu Qin; Xin Wei Wang
Journal:  Cancer Res       Date:  2010-12-15       Impact factor: 12.701

Review 5.  Resection and liver transplantation for hepatocellular carcinoma.

Authors:  Josep M Llovet; Myron Schwartz; Vincenzo Mazzaferro
Journal:  Semin Liver Dis       Date:  2005       Impact factor: 6.115

6.  Predicting hepatitis B virus-positive metastatic hepatocellular carcinomas using gene expression profiling and supervised machine learning.

Authors:  Qing-Hai Ye; Lun-Xiu Qin; Marshonna Forgues; Ping He; Jin Woo Kim; Amy C Peng; Richard Simon; Yan Li; Ana I Robles; Yidong Chen; Zeng-Chen Ma; Zhi-Quan Wu; Sheng-Long Ye; Yin-Kun Liu; Zhao-You Tang; Xin Wei Wang
Journal:  Nat Med       Date:  2003-03-17       Impact factor: 53.440

7.  Decreased expression of cytochrome P450 2E1 is associated with poor prognosis of hepatocellular carcinoma.

Authors:  Jenny C Ho; Siu Tim Cheung; Ka Ling Leung; Irene O Ng; Sheung Tat Fan
Journal:  Int J Cancer       Date:  2004-09-10       Impact factor: 7.396

8.  Genome-wide molecular profiles of HCV-induced dysplasia and hepatocellular carcinoma.

Authors:  Elisa Wurmbach; Ying-bei Chen; Greg Khitrov; Weijia Zhang; Sasan Roayaie; Myron Schwartz; Isabel Fiel; Swan Thung; Vincenzo Mazzaferro; Jordi Bruix; Erwin Bottinger; Scott Friedman; Samuel Waxman; Josep M Llovet
Journal:  Hepatology       Date:  2007-04       Impact factor: 17.425

9.  Management of hepatocellular carcinoma: an update.

Authors:  Jordi Bruix; Morris Sherman
Journal:  Hepatology       Date:  2011-03       Impact factor: 17.425

10.  Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists.

Authors:  Da Wei Huang; Brad T Sherman; Richard A Lempicki
Journal:  Nucleic Acids Res       Date:  2008-11-25       Impact factor: 16.971

