
PAC-5 Gene Expression Signature for Predicting Prognosis of Patients with Pancreatic Adenocarcinoma.

Jieun Kim^1,2, Yong Hwa Jo^2,3, Miran Jang^2,3, Ngoc Ngo Yen Nguyen^1,2, Hyeong Rok Yun^1,2, Seok Hoon Ko^4, Yoonhwa Shin^1,2, Ju-Seog Lee^5, Insug Kang^1,2,3, Joohun Ha^1,2,3, Tae Gyu Choi^2,3, Sung Soo Kim^1,2,3.

Abstract

Pancreatic adenocarcinoma (PAC) is one of the most aggressive malignancies. Intratumoural molecular heterogeneity impedes improvement of the overall survival rate. The current pathological staging system is not sufficient to accurately predict prognostic outcomes. Thus, an accurate prognostic model to guide survival prediction and treatment decisions is needed. Cancer-specific genes were identified by differentially expressed gene analysis between normal pancreas and PAC tissues. A prognostic gene expression model was computed by LASSO regression analysis. The PAC-5 signature (LAMA3, E2F7, IFI44, SLC12A2, and LRIG1), which had significant prognostic value in the overall dataset independently of the pathological stage, was established. We provide evidence that the PAC-5 signature further refines the selection of PAC patients who might benefit from postoperative therapies. SLC12A2 and LRIG1 interacted with proteins implicated in resistance to EGFR kinase inhibitors. DNA methylation was significantly involved in the regulation of genes in the PAC-5 signature. The PAC-5 signature provides new possibilities for improving personalised therapeutic strategies. We suggest that the PAC-5 genes might be potential drug targets for PAC.


Keywords: adjuvant therapies; gene expression signature; pancreatic adenocarcinoma; prognostic prediction

Year: 2019 PMID: 31703415 PMCID: PMC6896100 DOI: 10.3390/cancers11111749

Source DB: PubMed Journal: Cancers (Basel) ISSN: 2072-6694 Impact factor: 6.639

1. Introduction

Pancreatic cancer is an intractable malignancy and the fourth-leading cause of cancer deaths in the United States, with 56,770 new cases and 45,750 deaths in 2019 [1]. It accounts for a small percentage of all cancer deaths (7.2%). However, it is one of the most fatal types of cancer, with a five-year survival rate of only 9%. The vast majority of pancreatic cancers (>85%) are adenocarcinomas arising in the exocrine glands of the pancreas. Most pancreatic adenocarcinoma (PAC) patients present at an advanced stage at diagnosis. Surgery is considered the most effective treatment and the only potentially curative intervention, but only 20% of patients are eligible for resection [2]. The American Joint Committee on Cancer (AJCC) staging system has been widely applied to provide guidelines for prognostic assessment and therapeutic decisions in PAC. It is based on three components: the size and/or local extent of the primary tumour (T), the involvement of regional lymph nodes (N), and metastasis (M). However, it cannot describe tumour behaviour comprehensively. Indeed, PAC patients with the same AJCC stage may have different clinical prognoses after receiving the same treatment [3]. Thus, a validated model is needed to complement the current pathological staging. In PAC patients, gemcitabine is still employed as the baseline agent for adjuvant chemotherapy [2]. Thereafter, FOLFIRINOX [4] or a combination of gemcitabine with albumin-bound paclitaxel (nab-paclitaxel) [5] has become first-line therapy. However, the majority of patients respond poorly to these chemotherapeutic agents, and therapeutic failure instead accelerates drug resistance and metastatic progression [6]. This phenomenon is underpinned by intratumoural molecular heterogeneity that arises at multiple stages during tumour progression [7].
Tumourigenesis of PAC involves mutual interactions of diverse factors, including gene mutations and microenvironmental conditions [8]. Furthermore, tumour heterogeneity is closely associated with therapeutic sensitivity [7,9]. It is therefore vital to understand the underlying mechanisms in order to increase treatment efficacy and improve patient outcomes. With the remarkable advances in bioinformatic technologies, prognostic gene expression signatures that reflect various clinicopathological and demographic factors have been developed extensively. To date, commercial gene signatures have been successfully established to predict prognosis and support therapeutic decisions in patients with various cancers, such as head and neck [10] and breast [11] cancer. In PAC, several previous studies have attempted to define tumour subtypes for prediction of prognosis [12,13,14,15] or therapeutic benefit [16]. However, clinical applications are not yet available. Thus, it is necessary to develop a molecular classifier that accurately predicts the prognosis of the individual patient through an understanding of tumour heterogeneity in PAC. Furthermore, a good molecular classifier should minimise the harm caused by overtreatment and thus allow therapeutic benefit to be delivered safely. Here, we established a novel molecular classifier, closely associated with tumour-specific gene expression, that accurately predicted the prognosis of PAC patients. The PAC-5 gene expression signature could benefit PAC patients by identifying those who are suitable for adjuvant therapies. Further, we attempted to provide possibilities for improving prognostic models of PAC heterogeneity via extensive analyses.

2. Results

2.1. Establishment of a Prognostic Gene Expression Signature

In order to generate a molecular classifier that separates PAC patients into low- and high-risk groups, the gene expression data were examined in relation to survival information. GSE71729 was used as the training dataset. A flow chart of the procedure used to generate the gene signature is provided in Figure 1. Initially, 2654 genes were obtained by filtering on gene set intensity. Then, differentially expressed gene (DEG) analysis between normal pancreas and tumour tissues was used to identify cancer-related genes, which yielded 1149 tumour-specific genes (Table S2). These genes were used for least absolute shrinkage and selection operator (LASSO) regression analysis with overall survival (OS) as the survival endpoint. As a result, we obtained a subset of prognostic genes: LAMA3, E2F7, IFI44, SLC12A2, LRIG1, DUOXA1, and RBM1. However, DUOXA1 expression did not act as an independent prognostic biomarker to stratify the patients into distinct risk-groups, and gene expression data for RBM1 were not available in the external validation datasets. These two genes were therefore excluded from the final gene expression signature related to OS. Thus, we established a prognostic model comprising five genes, termed the PAC-5 signature (LAMA3, E2F7, IFI44, SLC12A2, and LRIG1, Table S3). Based on the PAC-5 gene expression patterns, the low- and high-risk groups were accurately represented by two clusters (Figure 2A, upper panel). To confirm whether the PAC-5 genes were tumour-specifically expressed, the mRNA expression levels of the five genes were compared between normal subjects and the risk-groups of the PAC patients. The expression levels of LAMA3, E2F7, IFI44, and SLC12A2 were higher in PACs than in normal pancreas. However, LRIG1 mRNA was less expressed in PACs than in normal pancreatic tissues.
In the comparison between tumour tissues, mRNAs of LAMA3, E2F7, and IFI44 were more highly expressed in the high-risk group than in the low-risk group, whereas the expression levels of SLC12A2 and LRIG1 were lower in the high-risk group than in the low-risk group (Figure S1A–E). Prognostic index values were calculated based on the PAC-5 signature for all patients and normal subjects. The patients were classified into low- (n = 63) and high-risk (n = 73) groups by their prognostic indices (Figure 2A, lower histogram). Prognostic index values for the high-risk group were significantly higher than those for the two other groups (Figure S1F).
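The prognostic index described above can be sketched as a weighted sum of the five gene expression values. The following is a minimal illustration only: the weights are hypothetical placeholders (the actual coefficients are those of Table S3, which is not reproduced here), their signs merely follow the direction of expression reported for the high-risk group, and the median cutoff is an assumption because the threshold used to split the groups is not stated in this section.

```python
from statistics import median

# Hypothetical weights for illustration only; the real values are in Table S3.
# Positive weights: genes higher in the high-risk group; negative: lower.
WEIGHTS = {
    "LAMA3": 0.30,
    "E2F7": 0.25,
    "IFI44": 0.20,
    "SLC12A2": -0.15,
    "LRIG1": -0.35,
}


def prognostic_index(expression, weights=WEIGHTS):
    """Return the weighted sum of expression values over the signature genes."""
    return sum(weight * expression[gene] for gene, weight in weights.items())


def assign_risk_groups(samples, cutoff=None):
    """Label each sample 'high' or 'low' by its prognostic index.

    samples: dict mapping sample id -> dict of gene -> expression value.
    cutoff:  score threshold; defaults to the median score (an assumption).
    """
    scores = {sid: prognostic_index(expr) for sid, expr in samples.items()}
    if cutoff is None:
        cutoff = median(scores.values())
    return {sid: ("high" if score > cutoff else "low")
            for sid, score in scores.items()}
```

In practice the expression values would be normalised per dataset before scoring, so that one cutoff is meaningful across platforms.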

Figure 1

Schematic overview of the strategy used for prognostic model construction.

Figure 2

Establishment of the PAC-5 signature. (A) Survival and clinical information associated with the heatmap of the two risk-groups in the training dataset (upper panel). The colour key for the gene expression scores is given in the legend, with red indicating higher expression and blue lower expression. The patients were also clustered into two groups (classical and basal-like), based on the Moffitt classification. MoC, Moffitt classification. The prognostic index for each patient was calculated according to the weight of each gene (lower histogram). (B) Kaplan–Meier plots for OS of the two risk-groups in the training dataset. p-Values were computed by log-rank test.

Moffitt et al. (GSE71729 training dataset) [13] previously proposed two distinct subtypes for predicting the prognosis of PAC patients, of which the ‘basal-like subtype’ was associated with a poorer prognostic outcome than the ‘classical subtype’. These Moffitt classification subtypes were associated with gene expression patterns and histological cellularity in PACs. Interestingly, LAMA3 was also among the genes they used for subtype discrimination, and its over-expression was related to the ‘basal-like subtype’. Thus, to further evaluate the relationship between the PAC-5 signature and the Moffitt classification, we performed an association analysis using the χ2 test; the risk-groups defined by the PAC-5 signature were significantly associated with the Moffitt classification (p = 1.12 × 10⁻³, Table S4). To further verify the survival difference between the low- and high-risk groups in the training dataset, we used Kaplan–Meier survival curve analysis. The Kaplan–Meier plot indicated a significant prognostic difference between the low- and high-risk groups, with median OS of 27.4 and 13.7 months, respectively (p = 8.37 × 10⁻⁴, Figure 2B).
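For readers unfamiliar with how a median OS is read off a Kaplan–Meier curve, the product-limit estimator can be sketched in a few lines. This is a generic textbook illustration, not the software used for the analyses above; the function names and the input layout (follow-up times plus a 1/0 event indicator) are assumptions made for the example.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times:  follow-up time for each patient.
    events: 1 if death was observed at that time, 0 if censored.
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed
    return curve


def median_survival(curve):
    """First time at which estimated survival is 0.5 or below, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None
```

Applying `median_survival` to the curve of each risk-group gives the kind of per-group median OS quoted in the text.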

2.2. Survival Analysis and Clinical Relevance of PAC-5 Signature in the Validation Datasets

Next, to further estimate the robustness of the classifier, the PAC-5 signature was validated in the combined validation datasets (five microarray and three RNA-seq datasets). During leave-one-out cross-validation (LOOCV), the specificity and the sensitivity of the compound covariate predictor for correctly predicting risk were 0.839 and 0.905, respectively. The PAC-5 signature significantly classified patients into low- and high-risk groups, with median OS of 30.4 and 17.7 months in the combined validation datasets (p = 1.88 × 10⁻⁷, Figure 3A) and RFS of 17.5 and 15.5 months (p = 0.046, Figure 3B). Kaplan–Meier plots also showed significant prognostic differences in the microarray datasets and the RNA-seq datasets analysed separately (p = 4.87 × 10⁻³ and p = 6.94 × 10⁻⁷ for OS, respectively, Figure S2A,B).
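The LOOCV figures above are a sensitivity and a specificity computed from held-out predictions. The sketch below shows that bookkeeping with a deliberately simple stand-in classifier (a midpoint cutoff between the class means of a one-dimensional score); it is not the compound covariate predictor used in the study, and all function names are hypothetical.

```python
def sensitivity_specificity(truth, predicted, positive="high"):
    """Return (sensitivity, specificity), treating `positive` as the positive class."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return tp / (tp + fn), tn / (tn + fp)


def loocv_predictions(scores, labels, fit, predict):
    """Predict each sample with a model fitted on all the other samples."""
    predictions = []
    for i in range(len(scores)):
        train_x = scores[:i] + scores[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = fit(train_x, train_y)
        predictions.append(predict(model, scores[i]))
    return predictions


def fit_midpoint(train_x, train_y):
    """Stand-in model: cutoff halfway between the two class means."""
    high = [x for x, y in zip(train_x, train_y) if y == "high"]
    low = [x for x, y in zip(train_x, train_y) if y == "low"]
    return (sum(high) / len(high) + sum(low) / len(low)) / 2


def predict_midpoint(cutoff, x):
    """Classify a score against the fitted cutoff."""
    return "high" if x > cutoff else "low"
```

Because each sample is predicted by a model that never saw it, the resulting sensitivity and specificity are less optimistic than values computed on the training data itself.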

Figure 3

Kaplan–Meier survival analysis of the PAC-5 signature in validation datasets. (A,B) Kaplan–Meier survival plots for OS and RFS of two risk-groups in the validation datasets. The p-Values were computed by the log-rank test.

One external dataset, GSE62452, contained gene expression data for adjacent tissues paired with the corresponding tumour tissues (n = 61). To further confirm that the PAC-5 genes were tumour-specifically expressed, we compared the mRNA expression levels of the five genes between the adjacent tissues and the tumour tissues assigned to the two risk-groups. Similarly to the results in the training dataset, the expression levels of LAMA3, E2F7, IFI44, and SLC12A2 were higher in PACs than in the adjacent tissues. However, LRIG1 mRNA was less expressed in PACs than in the adjacent tissues. In the comparison between tumour tissues, mRNAs of LAMA3, E2F7, and IFI44 were more highly expressed in the high-risk group than in the low-risk group, whereas the expression levels of SLC12A2 and LRIG1 were lower in the high-risk group than in the low-risk group (Figure S3A–E). Prognostic index values for the high-risk group were also significantly higher than those for the two other groups (Figure S3F). Univariate Cox regression analysis revealed significant prognostic accuracy of the PAC-5 signature for survival time in the training dataset [hazard ratio (HR) 1.781, 95% confidence interval (CI) 1.105–2.873, p = 0.018]. Since the training dataset had no information on clinicopathological characteristics, the prognostic value of the PAC-5 signature could not be compared with that of other prognostic covariates there. Such a comparison was therefore made using univariate and multivariate Cox regression analyses in the combined validation datasets. In the univariate analysis, pathological grade, primary tumour size, lymph node metastasis, AJCC staging, and the PAC-5 signature were significantly associated with OS, compared to their referents, except for pathological grade 4.
The significant covariates and our PAC-5 signature were used in multivariate analysis, in which pathological grade (G2 and G3), primary tumour size (T2 and T3), lymph nodes metastasis, and the PAC-5 signature still presented significant prognostic values (Table 1).

Table 1

Univariate and multivariate Cox proportional hazard regression analyses of clinical variable in validation datasets.

Variables		OS
		Univariate			Multivariate
		HR	95% CI	p-Value	HR	95% CI	p-Value
Age
	≤65 (referent)	1
	>65	1.316	0.997–1.738	0.053
Gender
	Female (referent)	1
	Male	0.898	0.735–1.097	0.292
Family history
	No (referent)	1
	Yes	0.862	0.476–1.560	0.624
Race
	Black (referent)	1
	Other	1.246	0.544–2.856	0.614
Pancreatitis
	No (referent)	1
	Yes	1.033	0.491–2.173	0.931
Diabetes
	No (referent)	1
	Yes	0.817	0.542–1.230	0.332
Grade
	G1 (referent)	1			1
	G2	1.363	1.047–1.773	0.021	1.324	1.005–1.744	0.046
	G3	2.270	1.706–3.022	1.89 × 10⁻⁸	1.924	1.410–2.627	3.72 × 10⁻⁵
	G4	1.634	0.515–5.182	0.405	2.399	0.746–7.718	0.142
T
	T1 (referent)	1			1
	T2	2.010	1.018–3.970	0.044	2.309	1.015–5.524	0.046
	T3	2.434	1.295–4.574	0.006	2.778	1.199–6.438	0.017
	T4	2.966	1.168–7.534	0.022	5.418	0.643–45.632	0.120
N
	N0 (referent)	1			1
	N1	1.938	1.504–2.496	3.08 × 10⁻⁷	2.021	1.466–2.787	1.76 × 10⁻⁵
AJCC Staging
	1 (referent)	1			1
	2	1.611	1.140–2.277	0.007	0.676	0.342–1.338	0.261
	3	2.173	1.215–3.888	0.009	0.420	0.025–7.122	0.548
	4	2.735	1.346–5.555	0.005	0.590	0.130–2.675	0.494
PAC-5
	Low (referent)	1			1
	High	1.599	1.333–1.895	2.41 × 10⁻⁷	1.349	1.080–1.685	0.008

AJCC, American Joint Committee on Cancer; OS, overall survival; HR, hazard ratio; CI, Confidence Interval; T, primary tumour size; N, lymph node metastasis. The Wald test was used to estimate p-Values. All statistical tests were two-sided.
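As a consistency check on any row of Table 1, the Wald statistic can be recovered from a reported hazard ratio and its 95% CI, because the interval is symmetric on the log scale: β = ln(HR), SE ≈ (ln(upper) − ln(lower)) / (2 × 1.96), z = β / SE. The sketch below assumes the intervals were computed in this standard way; the function name is hypothetical.

```python
import math


def wald_from_hr_ci(hr, ci_low, ci_high, z_crit=1.959964):
    """Recover the Cox coefficient, its standard error, the Wald z statistic
    and the two-sided p-value from a hazard ratio and its 95% CI.

    Assumes the CI was built as exp(beta +/- z_crit * SE).
    """
    beta = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, z, p


# Multivariate PAC-5 row of Table 1: HR 1.349, 95% CI 1.080-1.685.
beta, se, z, p = wald_from_hr_ci(1.349, 1.080, 1.685)
print(f"beta={beta:.3f} SE={se:.3f} z={z:.2f} p={p:.4f}")
```

For the multivariate PAC-5 row, the recovered p-value should fall close to the reported 0.008; small deviations are expected because the table values are rounded.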

2.3. Validation of the PAC-5 Signature in Stage I and II PAC Patients

The AJCC staging system is the most widely accepted prognostic model for PAC. However, its prognostic value is limited, in that the survival rates of patients in stages IB, IIA, and IIB are nearly identical [3]. Thus, we investigated whether the PAC-5 signature could stratify patients with stage I or II tumours into the two risk-groups in the validation datasets. The combined validation datasets included patients with survival information in stage I (n = 66, 9%) and II (n = 621, 85.4%). Indeed, we observed that the AJCC staging system did not properly stratify patients with stage IA, IB, IIA and IIB for survival (Figure S4A). In particular, the Kaplan–Meier survival curves for stages IB and IIA were not significantly different (p = 0.297, Figure S4B). However, the PAC-5 signature significantly stratified the stage IB and stage IIA patients into low- and high-risk groups (p = 0.047 for stage IB and p = 0.043 for stage IIA, Figure 4A,B). Moreover, it stratified the patients with stage IIB into two distinct prognostic risk-groups (p = 8.61 × 10⁻⁵, Figure 4C). The patients with stage IA could not be classified into different risk-groups by the PAC-5 signature (p = 0.109, Figure S4C).
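The p-values in these stage-wise comparisons come from the log-rank test. A minimal two-group version is sketched below as a generic illustration, not the software used in the study: at each event time it compares the observed deaths in one group with the number expected if both groups shared the same hazard, then refers the resulting statistic to a χ2 distribution with one degree of freedom. The function name and input layout are assumptions.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. Returns (chi-square statistic, p-value).

    times_*:  follow-up times; events_*: 1 for death, 0 for censored.
    """
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            # Hypergeometric variance of the deaths in group A at time t.
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p
```

Running the test separately within each AJCC stage, with the risk-group as the grouping variable, mirrors the structure of the subgroup analysis described above.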

Figure 4

Kaplan–Meier survival analysis of PAC patients with stage IB, IIA, and IIB. (A–C) Kaplan–Meier survival analyses were performed to estimate the differences in OS between the low- and high-risk patients in stage IB, IIA and IIB. p-Values were computed by log-rank test.

2.4. Association of the PAC-5 Signature with Advantage of Adjuvant Therapies

2.4.1. Chemotherapy

Gemcitabine-based adjuvant chemotherapy is currently recommended as a standard therapy after surgery for PAC [2]. However, a substantial number of PAC patients respond poorly to the chemotherapeutic agents, which instead leads to drug resistance and metastatic progression [6]. The clinicopathological information on chemotherapeutic treatment was incomplete in the GSE79668 and TCGA RNA-seq datasets (Table S5). The clinical information of the TCGA dataset only indicated the patients who received adjuvant chemotherapy (n = 100). In the GSE79668 dataset, the chemotherapy information was recorded with the drug names: Yes (n = 17) and No (n = 6). We could thus not assess, within each risk-group defined by the PAC-5 signature, which patients benefited from chemotherapy. However, the PAC-5 signature significantly stratified the patients who received chemotherapeutic drugs (n = 117) into two risk-subgroups for OS (p = 0.022, Figure 5A), but not for RFS (p = 0.087, Figure 5B).

Figure 5

Kaplan–Meier survival analysis of PAC patients with chemotherapy. The patients were separated into risk-subgroups according to chemotherapy treatment. Kaplan–Meier analyses were used to evaluate the therapeutic advantage. (A) Kaplan–Meier plots for OS of two risk-subgroups. (B) Kaplan–Meier plots for RFS of two risk-subgroups. p-Values were computed by the log-rank test.

2.4.2. Radiotherapy

Adjuvant radiotherapy has frequently been used as an integral component of PAC treatment [17]. However, whether patients benefit from radiotherapy remains controversial [18]. Thus, in order to investigate the association of the PAC-5 signature with the response to adjuvant radiotherapy, we performed a subgroup analysis. Radiotherapy itself showed a therapeutic benefit for OS, but not for RFS, in the GSE79668 and TCGA RNA-seq datasets (p = 4.81 × 10⁻³ for OS and p = 0.631 for RFS, Figure S5A,B, respectively). When the PAC-5 signature was combined with the radiotherapy information, the high-risk patients who received adjuvant radiotherapy showed a benefit for OS compared to those who did not (p = 2.82 × 10⁻⁴, Figure 6A). In contrast, low-risk patients did not show a significant difference in radiotherapy effect (p = 0.832, Figure 6B). Neither the low- nor the high-risk group benefited from radiotherapy for RFS (Figure 6C,D).

Figure 6

Kaplan–Meier survival analysis of PAC patients with radiation therapy. The patients were separated into risk-subgroups according to radiotherapy. Kaplan–Meier analyses were used to evaluate the therapeutic advantage. (A,B) Kaplan–Meier plots for OS of two risk-subgroups. (C,D) Kaplan–Meier plots for RFS of two risk-subgroups. p-Values were computed by the log-rank test.

2.4.3. Targeted Molecular Therapy

Targeted molecular therapy has been proposed as a type of personalised medicine designed to treat cancer by inhibiting oncoproteins that drive signalling pathways in cancer [19]. In PAC treatment, erlotinib, a selective epidermal growth factor receptor (EGFR) tyrosine kinase inhibitor (TKI), is the only targeted therapeutic agent approved by the Food and Drug Administration (FDA) [20]. Although the administered drugs were incompletely named in the clinicopathological information of the TCGA dataset, the association of the PAC-5 signature with the response to targeted molecular therapy was examined in the TCGA RNA-seq validation dataset. The patients who received targeted molecular therapy had a therapeutic benefit for OS and RFS compared to the patients who did not (p = 8.53 × 10⁻⁴ for OS and p = 0.018 for RFS, respectively, Figure S5C,D). When the PAC-5 signature was combined with the targeted molecular therapy information, the high-risk patients showed a benefit in OS and RFS compared to high-risk patients without targeted molecular therapy (p = 1.10 × 10⁻⁸ for OS and p = 1.25 × 10⁻³ for RFS, respectively, Figure 7A,C). In contrast, the low-risk patients did not show a significant difference in treatment outcome (p = 0.172 for OS and p = 0.832 for RFS, respectively, Figure 7B,D).

Figure 7

Kaplan–Meier survival analysis of PAC patients with targeted molecular therapy. The patients were separated into risk-subgroups according to targeted molecular therapy. Kaplan–Meier analyses were used to evaluate the therapeutic advantage. (A,B) Kaplan–Meier plots for OS of two risk-subgroups. (C,D) Kaplan–Meier plots for RFS of two risk-subgroups. p-Values were computed by the log-rank test.

2.5. Associations of PAC-5 Signature with KRAS Status

The malignant behaviour of cancer cells is driven by mutations in oncogenes and tumour suppressor genes [21]. In PAC, KRAS is the most frequently mutated gene (in ~95% of cases) [22]. In the GSE79668 and TCGA RNA-seq datasets, KRAS mutations were observed in 82.4% of all pancreatic tumour cases. KRAS status itself was not significantly associated with prognostic outcome for either OS or RFS (Figure S6A,B). Whereas the PAC-5 signature did not classify patients with wild-type KRAS into low- and high-risk groups for OS (p = 0.218, Figure 8A), patients with mutant KRAS were significantly stratified into two distinct risk-groups (p = 9.19 × 10⁻⁴, Figure 8B). However, the PAC-5 signature did not stratify the patients for RFS regardless of KRAS status (Figure 8C,D). To further assess whether KRAS status influences the patients assigned to the two risk-subgroups by the PAC-5 signature, we stratified each risk-subgroup by KRAS status. However, neither risk-group was further separated by KRAS status (Figure S7A–D).

Figure 8

Kaplan–Meier survival analysis of PAC-5 gene signature with KRAS mutation. Kaplan–Meier survival analyses were used to estimate differences in OS and RFS between the low- and high-risk groups with KRAS status. (A,B) Kaplan–Meier plots for OS of two risk-subgroups. (C,D) Kaplan–Meier plots for RFS of two risk-subgroups. KRAS-WT, wild type KRAS; KRAS-MT, mutant KRAS. p-Values were computed by the log-rank test.

2.6. DNA Methylation Regulating Expression of the PAC-5 Genes

DNA methylation is a critical epigenetic mechanism of gene regulation in cancer [23]. To assess whether DNA methylation influenced PAC-5 gene expression, correlations between gene expression and DNA methylation status at CpG sites were analysed. The threshold for differential methylation of a CpG site was set as |Δβ| > 0.1, where Δβ = βTumour − βNormal is the difference between the tumour and adjacent normal tissue; sites with Δβ > 0.1 were defined as hypermethylated and sites with Δβ < −0.1 as hypomethylated. The associations between proximal gene expression and DNA methylation were determined by Pearson’s correlation coefficient (r). A correlation was defined as significant when two criteria (r > 0.4 and p < 0.05) were satisfied. Significant CpG sites were obtained for two genes, LAMA3 and LRIG1, with moderate r-Values (Figure 9A–C and Table S6). A DNA methylation heatmap for these genes is provided in Figure 9D. However, no notable CpG sites were found for regulation of the other genes.
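The two filtering steps described above (a Δβ threshold, then a Pearson correlation between methylation and expression) can be sketched as follows. This is an illustrative outline under the stated thresholds only; the function names are hypothetical, and the p-value computation for the correlation is omitted.

```python
import math


def classify_cpg(beta_tumour, beta_normal, threshold=0.1):
    """Label a CpG site from its tumour-minus-normal methylation difference."""
    delta = beta_tumour - beta_normal
    if delta > threshold:
        return "hypermethylated"
    if delta < -threshold:
        return "hypomethylated"
    return "unchanged"


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)
```

A site would be carried forward only if `classify_cpg` reports a change and the correlation between its β values and the expression of the proximal gene passes the r and p criteria given in the text.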

Figure 9

Methylation assessment of LAMA3 and LRIG1 genes in two risk-subgroups according to the PAC-5 signature. Pearson’s correlation was used to measure linear relationships between DNA methylation and gene expression levels. The r-Value indicates Pearson’s correlation coefficient, and the two-tailed p-Value indicates the significance of the correlation. (A–C) Correlations between DNA methylation and expression of the indicated genes. (D) Heatmap showing the trend of PAC-5 gene methylation according to the risk-subgroups. The colour key for standardised methylation β values is given in the legend, with red indicating hypermethylation and green indicating hypomethylation.

2.7. Identification of Protein–Protein Interaction Network Associated with the PAC-5 Signature

Finally, to investigate how the PAC-5 genes might contribute to PAC progression, we performed protein–protein interaction (PPI) analysis using the NetworkAnalyst tool [24]. Three PPI networks related to the PAC-5 genes were generated, with 53 nodes representing the proteins and 51 edges representing the interactions between them (Figure 10). To further annotate the functions of the proteins interacting with the PAC-5 genes, we carried out a KEGG pathway analysis. Importantly, two genes in the PAC-5 signature, LRIG1 and SLC12A2, potentially interacted with genes involved in EGFR-TKI resistance (Table S7).

Figure 10

Protein–protein interaction network analysis of the PAC-5 genes. The interaction map was generated using the STRING database with experimental evidence in NetworkAnalyst 3.0. The proteins of the PAC-5 signature are circled in red, and the proteins related to the KEGG pathway term for EGFR inhibitor resistance are circled in black.

3. Discussion

Pancreatic adenocarcinoma (PAC) is a highly heterogeneous disease with poor clinical outcomes. Prognostic prediction of treatment response and mortality after surgery is frequently limited by tumour molecular heterogeneity. Hence, the primary challenge is to develop a precise prognostic model that provides criteria for clinical treatment decisions. To address this issue, we established the PAC-5 gene expression signature via DEG profiling of normal pancreatic and PAC tissues in publicly available datasets to identify tumour-specific genes. We introduced novel genes, IFI44, SLC12A2 and LRIG1, which do not overlap with other prognostic gene signatures for PAC. The robustness of the PAC-5 signature was supported by the reproducibility of a significant association between the predicted outcome and patient prognosis in external validation datasets comprising by far the largest collection of gene expression profiles. Moreover, the PAC-5 signature could be a complementary prognostic adjunct to pathological staging and pave the way to personalised management strategies. Therapeutic subgroup analysis showed that the PAC-5 signature might predict which patients would benefit from adjuvant therapies such as chemotherapy, radiotherapy, and targeted molecular therapy. Furthermore, we found that the PAC-5 signature could offer a potential therapeutic benefit to patients with mutant KRAS. The five genes were found to be involved in tumourigenic signalling pathways, such as the MAPK, PI3K-AKT, and ERBB pathways. Finally, network analyses of the PAC-5 signature provided clues for further elucidation of PAC heterogeneity and potential therapeutic target genes. In the process of PAC-5 signature development, we initially subjected normal pancreas and tumour tissues to DEG analysis to find pancreatic cancer-specific genes. We subsequently identified seven genes (LAMA3, E2F7, IFI44, SLC12A2, LRIG1, DUOXA1, and RBM1) related to OS of PAC patients, using Cox proportional hazards analysis.
Finally, we established a prognostic model with five genes that classified patients into two distinct risk-subgroups. Among the five genes in the PAC-5 signature, expression levels of LAMA3, E2F7 and IFI44 were elevated in the high-risk group, whereas SLC12A2 and LRIG1 expression was relatively lower. Interestingly, SLC12A2 expression was higher in tumour tissues than in normal pancreas tissues; however, it was more highly expressed in the low-risk group than in the high-risk group. Further investigation is thus necessary to elucidate how SLC12A2 expression is regulated in PAC biology. We also observed similar results in the analysis of an external validation dataset that contained adjacent tissues paired with the PAC tissues. A supervised method was used to construct the gene signature, which was refined by LOOCV. Furthermore, a meta-analysis approach based on five microarray datasets (n = 474) and three RNA-seq datasets (n = 283) was applied to validate the prognostic significance of the gene signature in association with overall survival (median OS of 30.4 months in the low-risk group and 17.7 months in the high-risk group). The PAC-5 signature and several clinical parameters showed a significant association with survival in univariate analysis. Importantly, multivariate analysis demonstrated that the PAC-5 signature remained a significant independent variable associated with the prognosis of patients with PAC (Table 1). The AJCC staging system cannot accurately predict patient survival, in that the survival times of stage IB, IIA, and IIB patients show no significant differences [3]. This intra-stage variance is due to tumour heterogeneity, which results in different clinical prognoses after the same treatments [6]. In our subgroup analysis, the patients with stage IIB distinctively showed poor prognosis, which is not in accordance with previous studies.
By incorporating the PAC-5 signature, the patients with stage IB, IIA, and IIB were further stratified into significantly different low- and high-risk groups. These consistent results indicate that our gene signature could be a complementary prognostic adjunct to pathological staging and pave the way to personalised management strategies. Current treatment regimens and many clinical trials targeting specific molecular pathways have failed to improve therapeutic efficacy in PAC patients. Thus, the identification of patients who respond well to adjuvant therapy remains a major clinical concern. We demonstrated that the PAC-5 signature is closely associated with clinical outcomes of adjuvant therapies. Adjuvant chemotherapy is currently recommended as a standard therapy after resection for PAC [2]. However, it has modest clinical benefit and may not improve OS. The lack of a significant chemotherapeutic response in PAC results from the inherent drug resistance of tumour cells [6]. In our analysis, the clinicopathological information on chemotherapeutic treatment was incomplete. We could thus not assess, within each risk-group defined by the PAC-5 signature, which patients benefited from chemotherapy. Nonetheless, the PAC-5 signature further classified the patients who received adjuvant chemotherapy for OS. The potential role of radiotherapy in the management of resectable tumours in the adjuvant setting remains controversial [17]. Reports of its treatment efficacy in patients with PAC are conflicting [25]. In our study, subgroup analysis of patients with available data revealed that adjuvant radiotherapy was beneficial for improving OS in high-risk patients. Targeted molecular therapy is one of the primary modalities in cancer treatment and interferes with specific molecules needed for tumourigenesis [26]. Currently, no effective targeted molecular therapies have been found for PAC.
Because of a lack of information on adjuvant targeted molecular therapy, we could not comprehensively evaluate the ability of the PAC-5 signature to predict therapeutic outcomes. However, we found that targeted molecular therapy was associated with improved OS and RFS in the high-risk patients identified by the PAC-5 signature, but not in the low-risk group. Hence, the PAC-5 signature results suggest a potential advantage of adjuvant therapies for patients in the high-risk group, although we could not draw definite conclusions because of the small number of patients in these analyses and the incomplete information. In most cases of PAC, oncogenic KRAS mutations, which initially drive pancreatic neoplasia, are prevalent [22]. Given the substantial evidence that mutant KRAS is critical for PAC progression, it has been investigated extensively [27]. However, no effective targeted therapies for KRAS have been established for PAC. In our analysis, KRAS status itself was not associated with prognostic outcomes. We found that the PAC-5 signature in combination with KRAS status further stratified patients for OS, while the signature did not show prognostic differences for RFS. Thus, the PAC-5 signature might indicate a potential benefit from adjuvant targeted molecular therapy in patients with mutant KRAS, although we acknowledge that the small number of patients in these analyses does not allow a strong conclusion about its predictive power. The majority of genes in the PAC-5 signature (LAMA3, E2F7, SLC12A2, and LRIG1) have been reported to be associated with tumour progression in various types of cancer. LAMA3 is the alpha subunit of laminin-332, which is further composed of laminin subunit β3 (LAMB3) and laminin subunit γ2 (LAMC2). The tumourigenic roles of laminin-332 are well known in diverse cancers such as breast and colon cancers [28] and squamous cell carcinoma [29] as well as PAC [30].
E2F7, one of the E2F transcription factors, has critical roles in the regulation of cell cycle progression and the DNA-damage response [31,32]. In cancer biology, E2F7 is associated with poor survival in squamous cancers [33]. Loss of E2F7 confers resistance to poly-ADP-ribose polymerase (PARP) inhibitors in BRCA2-deficient breast cancer cells [34]. SLC12A2, a Na+, K+, 2Cl− cotransporter, plays a role in membrane blebbing via interactions with actin and the p38 mitogen-activated protein kinases (p38 MAPK) in malignant mesothelioma cells [35]. Pharmacological modulation of K+ transport increases sensitivity to apoptosis in a human malignant pleural mesothelioma cell line [36]. SLC12A2 expression is associated with glioblastoma cell invasion and aggressiveness [37]. LRIG1 participates in the aggressive progression of several tumours, in which its expression is frequently decreased [38,39,40]. More importantly, blocking the EGFR pathway with the EGFR antagonist erlotinib abrogated LRIG1 suppression-induced EMT and, subsequently, cell invasion, migration, and vasculogenic mimicry of melanoma cells under hypoxia [41]. IFI44 is one of the interferon-α-stimulated genes (ISGs) and is associated with infection by several viruses such as hepatitis C virus [42], rhinovirus [43] and human papillomavirus [44]. In addition, IFI44 inhibits cAMP-mediated signalling downstream of ERK via depletion of intracellular GTP, resulting in arrest of cell division in melanoma [45]. In breast cancer, reduced expression of IFI44 in lymphocytes exacerbates cancer-associated immune dysfunction [46]. However, the molecular functions of IFI44 in cancer cells remain to be explored. The PPI network analysis of the PAC-5 signature indicated a possible link to drug resistance to EGFR-TKIs. EGFR-TKIs generally bind the tyrosine kinase domain of EGFR, and thus inhibit its activity. 
For instance, erlotinib and/or gefitinib (small-molecule EGFR-TKIs) achieved significant treatment efficacy in patients with lung cancer or PAC [20,47]. Nevertheless, cancer cells gradually acquire resistance to these drugs, resulting in progression and relapse [48]. SLC12A2 and LRIG1 were shown to interact with proteins involved in EGFR kinase inhibitor resistance, such as EGFR, ERBB2, ERBB3, and c-MET. ERBBs are known to promote pancreatic cancer development [49]. Overexpression or mutation of ERBB2 is associated with resistance to EGFR-TKIs [50]. Overexpression of ERBB3 in poorly differentiated colorectal cancer cell lines led to significant resistance to gefitinib in vitro and in vivo [51]. Furthermore, ERBB3 phosphorylation is driven by EGFR and/or ERBB2, or through amplification of the proto-oncogene c-MET [52]. Several studies have shown that drug resistance to either TKIs or gemcitabine develops through hyperactivation of the c-MET/HGF signalling axis [53,54,55]. In addition, although LAMA3 and E2F7 did not exhibit direct interactions with proteins involved in resistance to EGFR-TKIs, these two proteins were also connected to many proteins related to tumour progression [56,57]. Accordingly, we suggest that the PAC-5 signature can be used as a biomarker panel to estimate not only the clinical effectiveness of EGFR-TKIs but also drug resistance. In this manner, the PAC-5 genes might be utilised as valuable targets for concurrent therapy in addition to their role as prognostic markers. The development of high-throughput technologies has enabled integrated analyses of genetic and epigenetic patterns to study the regulatory mechanisms of genes of interest. In our analysis of DNA methylation in gene regulation, we found that alterations in DNA methylation influence the expression of the PAC-5 genes. 
DNA methylation is critical in the early formation and progression of diseases, especially cancers, and hypermethylation of the promoter and/or CpG island (CGI) of a gene results in transcriptional silencing [58]. The LAMA3 locus was relatively hypermethylated at one transcriptional region in patients of the low-risk group, which was significantly associated with its mRNA expression. Consistent with our data, a previous study reported that LAMA3 promoter methylation frequency was inversely associated with increased tumour stage and tumour size in breast cancer [59]. In contrast, two different CpG regions of LRIG1 were hypermethylated in high-risk patients, which was significantly associated with its gene expression. At present, additional regulatory mechanisms governing the expression of the five genes in PAC biology remain to be uncovered. Prospective studies would provide new insight into cancer-specific gene regulation for understanding the molecular heterogeneity of PAC.

4. Methods

4.1. PAC Patient and Gene Expression Data

All clinical and gene expression data for PAC patients were obtained from the Gene Expression Omnibus database (http://www.ncbi.nlm.nih.gov/geo/), ArrayExpress (http://www.ebi.ac.uk/arrayexpress/), International Cancer Genome Consortium (ICGC, http://icgc.org) and The Cancer Genome Atlas (TCGA, http://www.cancer.gov/about-nci/organization/ccg/research/structural-genomics/tcga/). All of the used datasets contained clinical information of patients on survival event and time. In the case of the TCGA dataset, the gene expression data, methylation and clinical information were obtained from the University of California Santa Cruz (http://xena.ucsc.edu/). Tumour tissues in two RNA-seq datasets of TCGA and ICGC contained pancreatic neuroendocrine tumours and other types of carcinoma. To discriminate the PAC tissues, the histological subtypes provided in clinical information of TCGA and ICGC were reviewed according to guidelines [18,60]. The gene expression data were normalised using a robust multiarray averaging method (RMA) [61]. The 636 patients of five microarray and three RNA sequencing datasets were used in the analysis. GSE71729 (Agilent-014850 Whole Human Genome Microarray 4x44K G4112F) [13] dataset had gene expression data of patients from multiple cancers, in which 125 PAC patients and 46 normal pancreas subjects were included. The GSE71729 was used as the training dataset to establish a gene signature. The validation datasets were GSE17891 (n = 27, Affymetrix Human Genome U133 Plus 2.0 Array) [16], GSE57495 (n = 63, Rosetta/Merck Human RSTA Custom Affymetrix 2.0 microarray) [62], GSE62452 (n = 66, Affymetrix Human Gene 1.0 ST Array) [63], E-MEXP-2780 (n = 30, Affymetrix Human Genome U133 Plus 2.0) [64], E-MTAB-6134 (n = 288, Affymetrix Human Genome U219 Array) [60], GSE79668 (n = 51) [65], ICGC (n = 82) [14] and TCGA (n = 150) [15]. In the validation datasets, the majority of patients were assigned to stage IIB disease (490/727, 64.7%). 
Notably, 117/201 patients received chemotherapy (58.2%), 42/187 patients received radiotherapy (22.5%), and 102/139 patients received targeted molecular therapy (73.4%). A summary of the training and validation datasets is provided in Table S1.

4.2. Development of the Prognostic Gene Expression Signature

A prognostic gene signature was developed using the GSE71729 training dataset. First, the 19,749 genes were filtered to exclude those showing an absolute change of at least two-fold (log2 scale) in fewer than 20% of the patients. Next, differentially expressed gene (DEG) analysis [66] between normal pancreas (n = 46) and PAC tissues (n = 125) was performed to isolate tumour-specific genes. A stringent p-value (p < 0.001) and a false discovery rate (FDR) < 0.1, using a univariate permutation test (1000 permutations), were set as the cutoffs for the DEGs. LASSO regression [67] (p < 0.001) was then used to identify the OS-associated gene signature from the training dataset. After this step, genes that were not compatible with the external validation datasets or not significant in the individual survival curves were excluded. For predicting prognosis, genes from the survival signature were applied to survival risk prediction analysis. This method utilised the principal component from the training dataset and generated a prognostic index for each patient. The prognostic index (y) was computed by the formula $y = \sum_{i} w_i x_i$, where $w_i$ and $x_i$ were the weight and logged gene expression for the i-th gene, respectively. The weight values of the genes were as follows: $w_1$, 0.306929; $w_2$, 0.118701; $w_3$, 0.263742; $w_4$, −0.4137 and $w_5$, −0.190012. The patients were divided into two risk groups according to a median prognostic index value of −0.060239. Patients were assigned to the high-risk group if their prognostic index values were higher than the median value, whereas the low-risk group comprised patients with prognostic index values equal to or less than the median value. A dendrogram of the prognostic genes was generated using the heatmap function in R, with default settings for the clustering algorithm.
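The scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' BRB-ArrayTools pipeline: the weights and the cutoff are taken from the text in the order they are listed, the assignment of each weight to a particular gene is not assumed, and the patient identifiers and expression values are hypothetical.

```python
# Weights in the order listed in the text; the gene belonging to each weight
# is not assumed here, so the genes are simply indexed 1..5.
WEIGHTS = [0.306929, 0.118701, 0.263742, -0.4137, -0.190012]
CUTOFF = -0.060239  # median prognostic index reported for the training dataset


def prognostic_index(log_expr, weights=WEIGHTS):
    """Return y = sum_i w_i * x_i for one patient's logged expression values."""
    if len(log_expr) != len(weights):
        raise ValueError("expected one expression value per weighted gene")
    return sum(w * x for w, x in zip(weights, log_expr))


def risk_group(y, cutoff=CUTOFF):
    """High risk if the index is strictly above the cutoff, otherwise low risk."""
    return "high" if y > cutoff else "low"


# Hypothetical logged expression values for three hypothetical patients.
patients = {
    "P1": [1.2, 0.4, 0.9, -0.8, -1.1],
    "P2": [-0.7, -0.2, -0.5, 0.6, 0.9],
    "P3": [0.0, 0.0, 0.0, 0.0, 0.0],
}
scores = {pid: prognostic_index(x) for pid, x in patients.items()}
groups = {pid: risk_group(y) for pid, y in scores.items()}
```

Because the cutoff is negative, a patient whose five values are all zero scores 0 and falls in the high-risk group, while a patient whose index equals the cutoff exactly is assigned to the low-risk group, as the text specifies.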

4.3. Validation of the Prognostic Signature

The gene signature was validated in external datasets. Gene expression data from the different datasets were normalised by subtracting the median expression value across the samples. A compound covariate predictor was used as a class prediction algorithm to further refine this model and sub-stratify the predicted outcomes [68]. Robustness was estimated by the misclassification rate determined during leave-one-out cross-validation (LOOCV). Kaplan–Meier survival analyses were performed after classification of the patients into two risk groups, and Chi-square (χ2) and log-rank tests were used to evaluate the survival probability in the two predicted risk subgroups of patients. Univariate and multivariate Cox proportional hazard regression analyses were used to evaluate independent prognostic factors associated with survival, with the gene signature, tumour grade, and pathological characteristics as covariates.
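Two of the steps named above, median centring and the log-rank comparison of two risk groups, can be illustrated with a short standard-library sketch. It is an assumption-laden illustration rather than the software used in the study: the function names and the survival times are hypothetical, and the p-value uses the chi-square distribution with one degree of freedom.

```python
import math
from statistics import median


def median_centre(values):
    """Subtract the across-sample median from one gene's expression values."""
    m = median(values)
    return [v - m for v in values]


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value)."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n = sum(1 for ti, _, _ in data if ti >= t)               # at risk overall
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)  # at risk, group A
        d = sum(1 for ti, e, _ in data if ti == t and e)         # events at t
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    p = math.erfc(math.sqrt(chi2 / 2))  # chi-square upper tail, 1 degree of freedom
    return chi2, p


# Hypothetical survival times in months; 1 = death observed, 0 = censored.
high_t, high_e = [1, 2, 3, 4, 5], [1, 1, 1, 1, 1]
low_t, low_e = [10, 11, 12, 13, 14], [1, 1, 1, 1, 1]
chi2, p = logrank_test(high_t, high_e, low_t, low_e)
centred = median_centre([1.0, 2.0, 6.0])
```

Median centring puts each dataset on a comparable scale before a cutoff derived from the training data is applied; the log-rank statistic then tests whether the two predicted risk groups differ in survival.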

4.4. Network and Pathway Enrichment Analysis

NetworkAnalyst 3.0 (http://www.networkanalyst.ca/, accessed July 2019) is a web-based visual analytics platform for comprehensive profiling, meta-analysis and systems-level interpretation of gene expression data [24]. NetworkAnalyst 3.0 was used to generate protein–protein interaction (PPI) networks, and then to perform KEGG pathway enrichment analysis. The PPI network analysis was performed using the STRING database v11.0 (http://string-db.org/) [69] with experimental evidence. KEGG pathway enrichment analysis was conducted to annotate the pathways in which the genes of the expression signature were involved. An adjusted p < 0.05 was considered significant for all enrichment analyses.
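The text does not state which multiple-testing adjustment produced the adjusted p-values. As an illustration only, the sketch below assumes the Benjamini-Hochberg procedure, a common choice for enrichment analyses; the function name and the raw p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that monotonicity can be enforced.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw enrichment p-values for four pathways.
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
significant = [a < 0.05 for a in adj]
```

Under this assumption only the first hypothetical pathway would pass the adjusted p < 0.05 threshold, even though three of the raw p-values are below 0.05.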

4.5. DNA Methylation Analysis of Gene Regulation

DNA methylation profiling analysis for gene regulation was performed using TCGA DNA methylation data (Illumina Infinium HumanMethylation450 platform). The DNA methylation β value of a CpG site estimates the methylation level using the ratio of intensities between methylated and unmethylated alleles. The threshold for differential methylation of a CpG site between the tumour and adjacent normal tissue was set as |Δβ| > 0.1, where Δβ = βTumour − βNormal; Δβ > 0.1 was defined as hypermethylation, while Δβ < −0.1 was defined as hypomethylation. The association between the proximal gene expression and DNA methylation was measured by Pearson's correlation coefficient (r). The correlation values were interpreted as follows: 0.1 < |r| ≤ 0.4, weak correlation; 0.4 < |r| ≤ 0.7, moderate correlation; 0.7 < |r| ≤ 0.9, strong correlation.
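The thresholds in this subsection translate directly into code. The following sketch uses only the Python standard library; the function names, the "unchanged" and "unclassified" labels for values outside the stated bands, and the β and expression values are hypothetical additions for illustration.

```python
import math


def classify_delta_beta(beta_tumour, beta_normal, threshold=0.1):
    """Label a CpG site from delta-beta = beta_tumour - beta_normal."""
    delta = beta_tumour - beta_normal
    if delta > threshold:
        return "hypermethylated"
    if delta < -threshold:
        return "hypomethylated"
    return "unchanged"  # label assumed; the text defines only the two classes


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_strength(r):
    """Map |r| onto the weak, moderate and strong bands given in the text."""
    a = abs(r)
    if 0.1 < a <= 0.4:
        return "weak"
    if 0.4 < a <= 0.7:
        return "moderate"
    if 0.7 < a <= 0.9:
        return "strong"
    return "unclassified"  # outside the bands listed in the text


# Hypothetical beta values of one CpG site and expression of its proximal gene.
beta = [0.2, 0.4, 0.6, 0.8]
expression = [5.0, 3.0, 4.5, 3.5]
r = pearson_r(beta, expression)
label = correlation_strength(r)
site = classify_delta_beta(0.65, 0.40)
```

A negative r, as in this hypothetical example, would match the expected pattern in which higher methylation accompanies lower expression.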

4.6. Statistical Methods

Gene expression datasets were analysed using BRB-Array Tools Version 4.6 (http://brb.nci.nih.gov/BRB-ArrayTools/) [66]. All other statistical analyses were performed in the R language environment (http://www.r-project.org) and Statistical Package for Social Sciences (SPSS) software (version 25, SPSS Inc, Chicago, IL, USA). In all statistical analyses, a p-value of less than 0.05 was considered significant.

5. Conclusions

In this study, we developed a novel gene signature, the PAC-5 signature, from cancer-specific genes identified by DEG analysis between normal pancreas and PAC tissues. The gene signature accurately and robustly identifies individual PAC patients at high risk of mortality. The prognostic value of the PAC-5 signature was statistically significant in the overall datasets, independently of the pathological staging. Furthermore, we provided evidence that the PAC-5 signature might help to refine the selection of PAC patients who may benefit from adjuvant radiation or targeted molecular therapies. Hence, we propose that the five genes in our signature might be promising molecular targets for PAC treatment.

68 in total

1. Expression of laminin-5-gamma-2 chain in intraductal papillary-mucinous and invasive ductal tumors of the pancreas.

Authors: N Fukushima; M Sakamoto; S Hirohashi
Journal: Mod Pathol Date: 2001-05 Impact factor: 7.842

Review 2. Implementing personalized cancer care.

Authors: Richard L Schilsky
Journal: Nat Rev Clin Oncol Date: 2014-04-01 Impact factor: 66.675

Review 3. Controversies and challenges regarding the impact of radiation therapy on survival.

Authors: C Chargari; J-C Soria; E Deutsch
Journal: Ann Oncol Date: 2012-08-16 Impact factor: 32.976

4. Rhinovirus-induced modulation of gene expression in bronchial epithelial cells from subjects with asthma.

Authors: Y A Bochkov; K M Hanson; S Keles; R A Brockman-Schneider; N N Jarjour; J E Gern
Journal: Mucosal Immunol Date: 2009-08-26 Impact factor: 7.313

5. Acquisition of cisplatin-resistance in malignant mesothelioma cells abrogates Na+,K+,2Cl(-)-cotransport activity and cisplatin-induced early membrane blebbing.

Authors: Veronica Janson; Britta Andersson; Parviz Behnam-Motlagh; Karl Gunnar Engström; Roger Henriksson; Kjell Grankvist
Journal: Cell Physiol Biochem Date: 2008-07-25

6. Impaired interferon signaling is a common immune defect in human cancer.

Authors: Rebecca J Critchley-Thorne; Diana L Simons; Ning Yan; Andrea K Miyahira; Frederick M Dirbas; Denise L Johnson; Susan M Swetter; Robert W Carlson; George A Fisher; Albert Koong; Susan Holmes; Peter P Lee
Journal: Proc Natl Acad Sci U S A Date: 2009-05-18 Impact factor: 11.205

7. Targeting a tumor-specific laminin domain critical for human carcinogenesis.

Authors: Mark Tran; Patricia Rousselle; Pasi Nokelainen; Sruthi Tallapragada; Ngon T Nguyen; Edgar F Fincher; M Peter Marinkovich
Journal: Cancer Res Date: 2008-04-15 Impact factor: 12.701

8. Role of cell surface metalloprotease MT1-MMP in epithelial cell migration over laminin-5.

Authors: N Koshikawa; G Giannelli; V Cirulli; K Miyazaki; V Quaranta
Journal: J Cell Biol Date: 2000-02-07 Impact factor: 10.539

Review 9. Pancreatic Cancer Heterogeneity Can Be Explained Beyond the Genome.

Authors: Natalia Anahi Juiz; Juan Iovanna; Nelson Dusetti
Journal: Front Oncol Date: 2019-04-05 Impact factor: 6.244

10. The novel c-Met inhibitor cabozantinib overcomes gemcitabine resistance and stem cell signaling in pancreatic cancer.

Authors: C Hage; V Rausch; N Giese; T Giese; F Schönsiegel; S Labsch; C Nwaeburu; J Mattern; J Gladkich; I Herr
Journal: Cell Death Dis Date: 2013-05-09 Impact factor: 8.469
