Chenhao Zhou1, Yue Zhao2,3, Yirui Yin1, Zhiqiu Hu4, Manar Atyah1, Wanyong Chen1,4, Zhefeng Meng4, Huarong Mao4, Qiang Zhou1, Weiguo Tang4, Pengcheng Wang4, Zhanming Li4, Jialei Weng4, Christiane Bruns2, Marie Popp2, Felix Popp2, Qiongzhu Dong4,5, Ning Ren1,4.
Abstract
Pancreatic ductal adenocarcinoma (PDAC) is one of the most fatal malignancies worldwide, and its prognostic and diagnostic biomarkers are still being explored. The aim of this study was to establish a robust molecular signature that improves the prediction of PDAC prognosis. We examined 155 overlapping differentially expressed genes between tumor and non-tumor tissues from three Gene Expression Omnibus (GEO) datasets. A least absolute shrinkage and selection operator (LASSO) Cox regression model was used to select prognostic genes. We developed a 6-mRNA signature that distinguishes high-risk from low-risk PDAC patients with significant differences in overall survival (OS). We further validated the prognostic value of this signature in three independent cohorts (GEO batch, P < 0.0001; ICGC, P = 0.0036; Fudan, P = 0.029). The signature remained significant in patients stratified by histologic grade, TNM stage, tumor location, age and gender. Multivariate Cox regression analysis showed that the 6-mRNA signature can be an independent prognostic marker in each of the cohorts. Receiver operating characteristic (ROC) curve analysis also showed that our signature had better predictive performance for PDAC prognosis. Moreover, gene set enrichment analysis (GSEA) showed that several tumorigenesis- and metastasis-related pathways were associated with higher risk scores. In conclusion, the 6-mRNA signature could provide a valuable classification method to evaluate clinical prognosis and facilitate personalized treatment for PDAC patients. New therapeutic targets may be developed from functional analysis of the classifier genes and their related pathways. © The author(s).
Keywords: Pancreatic ductal adenocarcinoma; molecular signature; survival
Year: 2019 PMID: 31595147 PMCID: PMC6775308 DOI: 10.7150/ijbs.32899
Source DB: PubMed Journal: Int J Biol Sci ISSN: 1449-2288 Impact factor: 6.580
Patient and tumor clinicopathological characteristics of 528 pancreatic adenocarcinoma patients involved in the study.
| Characteristics | All (N=528) | TCGA (N=181) | GEO batch (N=251) | ICGC (N=96) |
|---|---|---|---|---|
| Age (years) | | | | |
| ≤60 | 108 (32.93%) | 61 (18.60%) | 18 (5.49%) | 29 (8.84%) |
| >60 | 220 (67.07%) | 120 (36.59%) | 33 (10.06%) | 67 (20.43%) |
| Gender | | | | |
| Female | 166 (46.37%) | 81 (22.63%) | 38 (10.61%) | 47 (13.13%) |
| Male | 192 (53.63%) | 100 (27.93%) | 43 (12.01%) | 49 (13.69%) |
| Tumor location | | | | |
| Head | 215 (80.83%) | 139 (52.26%) | — | 76 (28.57%) |
| Body | 20 (7.52%) | 15 (5.64%) | — | 5 (1.88%) |
| Tail | 31 (11.65%) | 16 (6.02%) | — | 15 (5.64%) |
| Histologic grade | | | | |
| G1 | 33 (9.82%) | 30 (8.93%) | 2 (0.60%) | 1 (0.30%) |
| G2 | 185 (55.06%) | 97 (28.87%) | 32 (9.52%) | 56 (16.67%) |
| G3 | 113 (33.63%) | 50 (14.88%) | 29 (8.63%) | 34 (10.12%) |
| G4 | 5 (1.49%) | 2 (0.60%) | 1 (0.30%) | 2 (0.60%) |
| TNM stage | | | | |
| I | 47 (11.66%) | 21 (5.21%) | 17 (4.22%) | 9 (2.23%) |
| II | 323 (80.15%) | 149 (36.97%) | 94 (23.33%) | 80 (19.85%) |
| III | 16 (3.97%) | 4 (0.99%) | 11 (2.73%) | 1 (0.25%) |
| IV | 17 (4.22%) | 5 (1.24%) | 6 (1.49%) | 6 (1.49%) |
TNM, tumor-nodes-metastasis; —: Without available data.
Figure 1. Differential expression analysis of genes in GEO datasets. (A) Identification of 155 common differentially expressed genes (DEGs) across three datasets (GSE15471, GSE28735 and GSE62452). Differently colored areas represent different datasets; the overlapping areas show the common DEGs. DEGs were identified with the classical t test, and statistically significant DEGs were defined using P < 0.001 and |log2 fold change| > 1 as the cut-off criteria. (B) Volcano plot of the DEGs in GSE15471. Among 1265 DEGs, 1128 were up-regulated and 137 were down-regulated. (C) Volcano plot of the DEGs in GSE28735. Among 350 DEGs, 131 were up-regulated and 219 were down-regulated. (D) Volcano plot of the DEGs in GSE62452. Among 229 DEGs, 157 were up-regulated and 72 were down-regulated.
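The selection rule in the Figure 1 caption (P < 0.001 and |log2 fold change| > 1 per dataset, then the overlap across datasets) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the record layout, the placeholder gene names and all numeric values in `toy` are assumptions made up for the example, and the t test itself is assumed to have been run upstream.

```python
from typing import Dict, Iterable, Set, Tuple

# One record per gene: (gene symbol, log2 fold change, t-test P value).
DegRow = Tuple[str, float, float]


def select_degs(rows: Iterable[DegRow],
                p_cutoff: float = 0.001,
                lfc_cutoff: float = 1.0) -> Set[str]:
    """Keep genes with P < p_cutoff and |log2 fold change| > lfc_cutoff."""
    return {gene for gene, lfc, p in rows
            if p < p_cutoff and abs(lfc) > lfc_cutoff}


def common_degs(datasets: Dict[str, Iterable[DegRow]]) -> Set[str]:
    """Intersect the per-dataset DEG sets (the overlap in a Venn diagram)."""
    selected = [select_degs(rows) for rows in datasets.values()]
    return set.intersection(*selected) if selected else set()


# Hypothetical toy input: gene names and statistics are invented.
toy = {
    "GSE15471": [("GENE_A", 1.8, 1e-6), ("GENE_B", 1.4, 2e-5), ("GENE_C", 0.4, 1e-8)],
    "GSE28735": [("GENE_A", 1.2, 3e-4), ("GENE_B", 1.1, 5e-4), ("GENE_D", -1.5, 1e-5)],
    "GSE62452": [("GENE_A", 1.3, 2e-4), ("GENE_B", 0.9, 1e-4), ("GENE_D", -1.2, 1e-4)],
}
shared = common_degs(toy)  # only GENE_A passes both cut-offs in all three
```

Both cut-offs are applied as strict inequalities, matching the wording of the caption; a gene sitting exactly on a threshold is excluded.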
mRNAs significantly associated with the overall survival in the test series patients (N=181)
| Gene symbol | Coefficient | Description |
|---|---|---|
| KYNU | 0.067744 | Kynureninase (L-Kynurenine Hydrolase) |
| MET | 0.332037 | MET Proto-Oncogene, Receptor Tyrosine Kinase |
| INPP4B | 0.012583 | Inositol Polyphosphate-4-Phosphatase Type II B |
| IGF2BP3 | 0.003424 | Insulin Like Growth Factor 2 mRNA Binding Protein 3 |
| ANKRD22 | 0.012010 | Ankyrin Repeat Domain 22 |
| TOP2A | 0.041295 | DNA Topoisomerase II Alpha |
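A signature of this kind is commonly turned into a per-patient risk score by summing each gene's expression weighted by its coefficient. The sketch below assumes that linear form, uses the coefficients from the table above and the cut-off of 0.1868 quoted in the Figure 2 caption, and treats a score strictly above the cut-off as high risk. The function names, the strict inequality and the expression scale (normalization is not specified here) are assumptions, not details confirmed by this record.

```python
from typing import Mapping

# Coefficients copied from the table of survival-associated mRNAs.
COEFFICIENTS = {
    "KYNU": 0.067744,
    "MET": 0.332037,
    "INPP4B": 0.012583,
    "IGF2BP3": 0.003424,
    "ANKRD22": 0.012010,
    "TOP2A": 0.041295,
}

# Cut-off quoted in the Figure 2 caption.
CUTOFF = 0.1868


def risk_score(expression: Mapping[str, float]) -> float:
    """Assumed linear score: sum of coefficient * expression over the six genes."""
    missing = set(COEFFICIENTS) - set(expression)
    if missing:
        raise KeyError(f"missing expression values for: {sorted(missing)}")
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


def risk_group(score: float, cutoff: float = CUTOFF) -> str:
    """Assumption: scores strictly above the cut-off are labeled high risk."""
    return "high" if score > cutoff else "low"


# Hypothetical patients with invented expression values.
uniform = {gene: 1.0 for gene in COEFFICIENTS}
silent = {gene: 0.0 for gene in COEFFICIENTS}
groups = [risk_group(risk_score(p)) for p in (uniform, silent)]
```

Because MET carries by far the largest coefficient, differences in MET expression dominate the score under this linear form.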
Figure 2. Kaplan-Meier estimates of overall survival (OS) or disease-free survival (DFS) using the six-mRNA signature. The Kaplan-Meier plots show the OS or DFS probabilities for the low-risk versus high-risk groups of patients, based on the best cut-off point (0.1868) of the risk score from the corresponding TCGA, GEO, ArrayExpress or ICGC datasets. (A) Kaplan-Meier curves for OS in TCGA test series patients (N = 181); (B) Kaplan-Meier curves for DFS in TCGA series patients (N = 181); (C) Kaplan-Meier curves for OS in GEO batch validation series patients (N = 251); (D) Kaplan-Meier curves for OS in ICGC validation series patients (N = 96). The tick marks on the Kaplan-Meier curves represent censored subjects. Differences between the two curves were assessed with the two-sided log-rank test. The number of patients at risk is listed below the survival curves.
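For readers unfamiliar with how such curves are built, the following is a minimal sketch of the Kaplan-Meier (product-limit) estimator in plain Python. It is not the authors' code; the function name and the toy follow-up times are invented, and the log-rank test used to compare the two groups is not implemented here.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float],
                 events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    events[i] is 1 if the event (e.g. death) was observed at times[i],
    and 0 if the subject was censored at that time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            removed += 1
            i += 1
        if deaths:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Both deaths and censored subjects leave the risk set after t.
        n_at_risk -= removed
    return curve


# Invented toy cohort: five subjects, two of them censored.
toy_times = [1, 2, 2, 3, 4]
toy_events = [1, 1, 0, 1, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
```

Censored subjects never lower the curve directly; they only shrink the number at risk for later event times, which is why the plots mark them with ticks rather than steps.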
Figure 3. Forest plot summary of analyses of overall survival (OS). Univariable and multivariable analyses of the six-mRNA risk score, age, gender, histological grade and TNM stage in the TCGA (A, B), GEO batch (C, D) and ICGC (E, F) datasets. The green squares on the transverse lines represent the hazard ratio (HR), and the red transverse lines represent the 95% CI. Risk score and age are continuous variables; gender, histological grade and TNM stage are categorical variables.
Figure 4. Kaplan-Meier survival analysis to evaluate the independence of the six-mRNA signature from histological grade, TNM stage and tumor location within the pancreas. The patients from the combined datasets were stratified into subgroups. The six-mRNA signature was applied separately to low histological grade patients (A), high histological grade patients (B), TNM stage II and III patients (C), and patients with tumors in the head (D), body (E) and tail (F) of the pancreas. The number of patients at risk is listed below the survival curves. The tick marks on the Kaplan-Meier curves represent censored subjects. Differences between the two curves were assessed with the two-sided log-rank test.
Figure 5. Receiver operating characteristic (ROC) analysis of the sensitivity and specificity of overall survival (OS) prediction by the six-mRNA risk score, age, histological grade, TNM stage and all factors combined, in patients from the combined datasets (N = 528). P values are from comparisons of the area under the ROC curve (AUROC) of all combined factors versus the six-mRNA risk score, age, histological grade and TNM stage, respectively. The six-mRNA risk score combined with the other factors predicted OS better than age (P < 0.0001), histological grade (P < 0.0001) and TNM stage (P < 0.0001) alone.
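The AUROC compared in Figure 5 can be read as the probability that a randomly chosen patient with the event has a higher predictor value than a randomly chosen patient without it. The sketch below computes it that way for a binary outcome. It is an illustration only: the function name and toy values are invented, the binary outcome coding is an assumption (this record does not say how OS was dichotomized or whether a time-dependent ROC was used), and the statistical comparison between two AUROCs is not implemented.

```python
from typing import Sequence


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    labels[i] is 1 for a positive case (event) and 0 for a negative case.
    Tied scores count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented toy data: four patients, two with the event.
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
toy_auc = auroc(toy_scores, toy_labels)  # 3 of 4 pairs ranked correctly
```

The pairwise loop is quadratic in the number of patients, which is acceptable for a cohort of a few hundred; a rank-based formula would be the usual choice for larger data.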
Figure 6. Gene set enrichment analysis (GSEA) delineates biological pathways and processes associated with the risk score. Cytoscape and Enrichment Map were used to visualize the GSEA results as a network of gene sets. Nodes represent enriched gene sets, which are grouped and annotated by their similarity to related gene sets. Node size is proportional to the total number of genes within each gene set. The proportion of genes shared between gene sets is represented by the thickness of the green line between nodes.