Yan Lu, Liang Wang, Pengyuan Liu, Ping Yang, Ming You.
Abstract
About 30% of stage I non-small cell lung cancer (NSCLC) patients who undergo resection will have a recurrence. Robust prognostic markers are required to better manage therapy options. The purpose of this study is to develop and validate a novel gene-expression signature that can predict tumor recurrence in stage I NSCLC patients. Cox proportional hazards regression analysis was performed to identify recurrence-related genes, and a partial Cox regression model was used to generate a gene signature of recurrence in the training dataset: 142 stage I lung adenocarcinomas without adjunctive therapy from the Director's Challenge Consortium. Four independent validation datasets, including GSE5843, GSE8894, and two other datasets provided by Mayo Clinic and Washington University, were used to assess prediction accuracy in two ways: by the correlation between the risk score estimated from gene expression and the observed recurrence-free survival time, and by the AUC of time-dependent ROC analysis. Pathway-based survival analyses were also performed. In the training dataset, 104 probesets correlated with recurrence; they are enriched in cell adhesion, apoptosis and regulation of cell proliferation. A 51-gene expression signature was identified that distinguishes patients likely to develop tumor recurrence (Dxy = -0.83, P<1e-16), and this signature was validated in four independent datasets with AUC >85%. Multiple pathways, including leukocyte transendothelial migration and cell adhesion, were highly correlated with recurrence-free survival. The gene signature is highly predictive of recurrence in stage I NSCLC patients, which has important prognostic and therapeutic implications for the future management of these patients.
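The abstract reports a rank correlation (Dxy) between the risk score and recurrence-free survival time. As a rough illustration of how such a statistic can be computed for right-censored data, the sketch below derives Somers' Dxy from a concordance index. It is a simplified pure-Python version, not the authors' implementation; the function name `somers_dxy` and the toy data are assumptions made for this example, and tied survival times are simply skipped.

```python
def somers_dxy(scores, times, events):
    """Somers' Dxy between a predictor and right-censored survival times.

    A pair (i, j) is usable when subject i had an observed event strictly
    before subject j's follow-up time. The pair counts as concordant when
    the lower score goes with the shorter time, so a good *risk* score
    (high score, early recurrence) produces a negative Dxy.
    """
    concordant = discordant = tied = 0
    n = len(scores)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                if scores[i] < scores[j]:
                    concordant += 1
                elif scores[i] > scores[j]:
                    discordant += 1
                else:
                    tied += 1
    usable = concordant + discordant + tied
    if usable == 0:
        raise ValueError("no usable pairs")
    c_index = (concordant + 0.5 * tied) / usable
    return 2.0 * (c_index - 0.5)


# Hypothetical toy data: risk score, follow-up in months, 1 = recurrence observed.
toy_scores = [2.0, 1.0, 0.5, -0.5, -1.0]
toy_times = [5, 12, 20, 40, 60]
toy_events = [1, 1, 0, 1, 0]
print(somers_dxy(toy_scores, toy_times, toy_events))
```

Under this convention, a value near -1 means that higher risk scores almost always go with earlier recurrence, which matches the negative sign of the reported Dxy.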
Year: 2012 PMID: 22292069 PMCID: PMC3264655 DOI: 10.1371/journal.pone.0030880
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Clinical summary of patients in the analyzed datasets.
| | Dataset 1 | Dataset 2 | Dataset 3 | Dataset 4 | Dataset 5 |
| Total number of samples | 142 | 46 | 138 | 54 | 36 |
| Mean age (range) | 65 (35–85) | 63 (36–78) | 61 (31–82) | 69 (32–89) | 66 (48–81) |
| Sex | |||||
| male | 80 | 33 | 104 | 9 | 20 |
| female | 62 | 13 | 34 | 45 | 16 |
| Mean follow-up (months) | |||||
| Total DFS | 57 | 39 | 35 | 48 | 38 |
| Not recurred | 69 | 63 | 54 | 55 | 51 |
| Recurred | 27 | 35 | 16 | 33 | 22 |
| Stage | |||||
| IA | 70 | 16 | — | 27 | 0 |
| IB | 72 | 30 | — | 27 | 36 |
| Histological type | |||||
| ADC | 142 | 46 | 62 | 49 | 14 |
| SCC | 0 | 0 | 76 | 1 | 18 |
| Others | 0 | 0 | 0 | 4 | 4 |
Genes related to tumor recurrence of stage I NSCLC.
| Genes | Function | HR | Genes | Function | HR |
| AU148154 | | 0.5792 | NM_018600 | | 1.5353 |
| B4GALT1 | Cell adhesion | 1.8344 | OCA2 | cell differentiation | 1.4181 |
| CGB | cell death | 1.3312 | PADI3 | terminal differentiation of the epidermis | 1.5470 |
| CHST12 | | 1.4697 | RPRM | negative regulation of progression through cell cycle | 1.4748 |
| CLEC11A | positive regulation of cell proliferation | 1.6334 | SH3YL1 | | 1.5522 |
| COL2A1 | negative regulation of apoptosis, Cell adhesion | 1.5701 | SLC27A2 | PPAR signaling pathway | 1.4456 |
| CYP2A6 | nicotine metabolism | 1.2751 | SLC35F5 | | 1.4836 |
| DENND1A | synaptic vesicle endocytosis | 1.4545 | SNAPC2 | transcription from RNA polymerase II promoter | 1.5725 |
| DIO1 | | 1.5142 | SPTBN2 | cell death | 1.6520 |
| DOCK6 | | 1.6545 | STRN3 | | 1.3969 |
| EPHB6 | Loss of expression in metastatic melanoma | 1.4146 | SUSD4 | | 1.4464 |
| FZD9 | G-protein coupled receptor protein signaling pathway | 1.2810 | TCF3 | transcription factor activity | 1.5250 |
| GLE1 | export mRNA from nucleus to cytoplasm | 1.4920 | TET3 | tet oncogene family member 3 | 1.6322 |
| GTF3C2 | transcription factor | 1.6350 | THBS1 | Cell adhesion, blood vessel development | 1.3397 |
| INF2 | Rho GTPase binding | 1.4114 | TRIM34 | | 1.4886 |
| KDM4B | transcriptional target of hypoxia-inducible factor | 1.7967 | TRIM46 | | 1.4355 |
| SIK3 | protein phosphorylation | 0.5875 | TRIP11 | transcription from RNA polymerase II promoter | 1.4917 |
| GREB1L | | 1.4917 | CELSR1 | Cell adhesion | 1.5144 |
| KLK5 | epidermis development | 1.4736 | UBE2D4 | ubiquitination | 1.4669 |
| KRT81 | keratin filament | 1.3167 | UBXN4 | response to unfolded protein | 1.4742 |
| LENEP | cell differentiation | 1.5902 | VKORC1 | oxidoreductase activity | 1.5498 |
| MYOG | cell differentiation | 1.6048 | ZBTB7B | cell differentiation | 1.5783 |
| NFKBIL1 | member of the I-kappa-B family | 1.5875 | ZNF365 | | 1.5436 |
| NLRP2 | cell death | 1.4080 | MUC5AC | induction of apoptosis, Cell adhesion | 1.4135 |
| NM_004876 | | | FGFR2 | cell growth | 1.5516 |
| FEZ2 | cell projection organization and biogenesis | 1.6395 | | | |
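To help read the HR column, the short sketch below converts a few hazard ratios copied from the table into log-hazard coefficients and labels their direction. It assumes, as is usual for Cox regression but not stated in the table itself, that each HR is expressed per unit increase in expression; the helper name `classify_hr` is hypothetical.

```python
import math

# Hazard ratios copied from the table above (a small subset for illustration).
hazard_ratios = {
    "AU148154": 0.5792,
    "SIK3": 0.5875,
    "B4GALT1": 1.8344,
    "KDM4B": 1.7967,
}


def classify_hr(hr):
    """Return (log hazard ratio, direction label) for a single hazard ratio."""
    beta = math.log(hr)  # Cox regression coefficient corresponding to this HR
    if beta > 0:
        label = "higher recurrence risk"
    elif beta < 0:
        label = "lower recurrence risk"
    else:
        label = "no association"
    return beta, label


for gene, hr in hazard_ratios.items():
    beta, label = classify_hr(hr)
    print(f"{gene}: HR={hr:.4f}, log(HR)={beta:+.3f}, higher expression -> {label}")
```

Under that assumption, only AU148154 and SIK3 among the listed genes have HR below 1, meaning higher expression is associated with lower recurrence risk.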
Figure 1. Survival analyses of the training set of 142 stage I adenocarcinomas.
(A) Kaplan-Meier survival curves for the two groups of patients with stage IA or IB disease. (B) Kaplan-Meier survival curves for the two groups of patients defined by having positive (high risk) or negative (low risk) risk scores of recurrence-free survival. The risk scores were estimated with 15 principal components based on the model using 51 recurrence-free survival-related genes. (C) The area under the curve (AUC) of time-dependent ROC analysis for survival models based on stage information or 51-gene expression data, respectively. Time is indicated in months on the x-axis; cumulative survival is indicated on the y-axis. Tick marks indicate patients whose data were censored at last follow-up.
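Panel (B) splits patients by the sign of the risk score and compares Kaplan-Meier curves. A minimal pure-Python sketch of that procedure is given here for orientation only. The function names are hypothetical, the treatment of a score of exactly zero (assigned to the low-risk group) is an assumption, and the derivation of the risk scores themselves (partial Cox regression with principal components) is not reproduced.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Gather everyone whose follow-up ends at time t (events and censored).
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_removed
    return curve


def split_by_risk(scores, times, events):
    """Group patients as high risk (score > 0) or low risk (score <= 0)."""
    groups = {"high": ([], []), "low": ([], [])}
    for score, t, e in zip(scores, times, events):
        key = "high" if score > 0 else "low"
        groups[key][0].append(t)
        groups[key][1].append(e)
    return groups


# Hypothetical toy data: risk score, months of follow-up, 1 = recurrence.
scores = [1.2, 0.4, 0.9, -0.3, -1.1, -0.7]
times = [6, 14, 22, 30, 48, 60]
events = [1, 1, 0, 1, 0, 0]
for name, (t, e) in split_by_risk(scores, times, events).items():
    print(name, kaplan_meier(t, e))
```

Censored patients lower the number at risk without producing a step in the curve, which is why they appear only as tick marks in the figure.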
Figure 2. Validation of the 51-gene signature in four independent datasets.
Kaplan-Meier survival analysis was performed in low-risk (full red line) and high-risk (dashed blue line) patient groups defined by the 51-gene classifier. The AUC for survival models based on stage (dashed red line) or the 51-gene classifier (full black line) was also compared. The testing dataset GSE8894 does not have stage information available, and all patients in the WUSTL dataset are stage IB, so the time-dependent ROC using stage information could not be calculated for these two datasets; the AUC was set at 0.5 instead. Tick marks indicate patients whose data were censored at last follow-up.
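The choice of 0.5 for datasets without usable stage information corresponds to a predictor that carries no ranking information. The sketch below illustrates this with a deliberately naive AUC at a fixed time horizon. It is not the time-dependent ROC estimator used in the study: patients censored before the horizon are simply dropped, and the function name `auc_at_time` and the toy data are assumptions.

```python
def auc_at_time(scores, times, events, t):
    """Naive AUC at horizon t for right-censored data.

    Cases: recurrence observed at or before t.
    Controls: follow-up extends beyond t without recurrence by t.
    Patients censored before t are ignored, which is a simplification;
    proper time-dependent ROC estimators account for them.
    """
    cases = [s for s, ti, e in zip(scores, times, events) if e and ti <= t]
    controls = [s for s, ti in zip(scores, times) if ti > t]
    if not cases or not controls:
        raise ValueError("need at least one case and one control at time t")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5  # ties count as half
    return wins / (len(cases) * len(controls))


# Hypothetical toy data: months of follow-up and recurrence indicator.
times = [5, 12, 20, 40, 60]
events = [1, 1, 0, 1, 0]
risk_scores = [2.0, 1.0, 0.5, -0.5, -1.0]   # informative predictor
constant_stage = [1, 1, 1, 1, 1]            # e.g. every patient stage IB

print(auc_at_time(risk_scores, times, events, t=36))
print(auc_at_time(constant_stage, times, events, t=36))
```

With the constant predictor every case-control pair is a tie, so the AUC is exactly 0.5 under the half-credit convention for ties.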
Top 30 significant prognostic KEGG pathways related to recurrence.
| KEGG pathway | Pathway annotation | Gene number | P value |
| hsa04670 | Leukocyte transendothelial migration | 116 | 1.01E-13 |
| hsa04141 | Protein processing in endoplasmic reticulum | 166 | 3.99E-13 |
| hsa04514 | Cell adhesion molecules (CAMs) | 113 | 7.23E-12 |
| hsa00230 | Purine metabolism | 161 | 1.44E-11 |
| hsa03013 | RNA transport | 151 | 2.77E-11 |
| hsa04630 | Jak-STAT signaling pathway | 155 | 3.14E-11 |
| hsa03040 | Spliceosome | 127 | 3.82E-11 |
| hsa04660 | T cell receptor signaling pathway | 108 | 6.73E-11 |
| hsa04722 | Neurotrophin signaling pathway | 127 | 1.11E-10 |
| hsa04144 | Endocytosis | 195 | 1.24E-10 |
| hsa04380 | Osteoclast differentiation | 128 | 1.29E-10 |
| hsa04730 | Long-term depression | 69 | 1.68E-10 |
| hsa04115 | p53 signaling pathway | 68 | 2.96E-10 |
| hsa00190 | Oxidative phosphorylation | 132 | 3.62E-10 |
| hsa04010 | MAPK signaling pathway | 267 | 4.65E-10 |
| hsa04910 | Insulin signaling pathway | 138 | 7.16E-10 |
| hsa04930 | Type II diabetes mellitus | 48 | 8.82E-10 |
| hsa04060 | Cytokine-cytokine receptor interaction | 264 | 1.10E-09 |
| hsa04530 | Tight junction | 132 | 2.48E-09 |
| hsa04666 | Fc gamma R-mediated phagocytosis | 92 | 3.75E-09 |
| hsa04310 | Wnt signaling pathway | 150 | 4.61E-09 |
| hsa04020 | Calcium signaling pathway | 177 | 4.84E-09 |
| hsa04150 | mTOR signaling pathway | 52 | 5.24E-09 |
| hsa03008 | Ribosome biogenesis in eukaryotes | 77 | 5.49E-09 |
| hsa03015 | mRNA surveillance pathway | 83 | 5.62E-09 |
| hsa03010 | Ribosome | 91 | 5.78E-09 |
| hsa04914 | Progesterone-mediated oocyte maturation | 86 | 7.37E-09 |
| hsa05322 | Systemic lupus erythematosus | 122 | 9.24E-09 |
| hsa04120 | Ubiquitin mediated proteolysis | 135 | 9.66E-09 |
| hsa04012 | ErbB signaling pathway | 87 | 1.08E-08 |