Jian-Yong Wang1, Ling-Ling Chen1, Xiong-Hui Zhou1.
Abstract
Identifying prognostic genes in cancer is essential not only for treating cancer patients but also for drug discovery. However, because of tumor heterogeneity, it remains a major challenge to select prognostic genes that can distinguish the risk of cancer patients across different data sets. As a result, selected genes whose expression levels are statistically related to prognostic risk may merely be passengers. In this paper, based on gene expression data and prognostic data of ovarian cancer patients, we used conditional mutual information to construct a gene dependency network in which nodes (genes) with a higher out-degree are more likely to be modulators of cancer prognosis. We then proposed the DirGenerank (Generank in directed network) algorithm, which considers both the gene dependency network and the genes' correlations with prognostic risk, to identify a gene signature that can predict the prognostic risk of ovarian cancer patients. Using the ovarian cancer data set from TCGA (The Cancer Genome Atlas) as the training data set, the 40 genes with the highest importance were selected as the prognostic signature. Survival analysis of patients divided by the prognostic signature in the testing data set and four independent data sets showed that the signature can significantly distinguish the prognostic risk of cancer patients. Enrichment analysis of the signature against curated cancer genes and the drugs selected by CMAP showed that genes in the signature may be drug targets for therapy. In summary, we have proposed a useful pipeline to identify prognostic genes of cancer patients.
Keywords: DirGenerank; biomarker; drug target; ovarian cancer; prognosis
Year: 2017 PMID: 28615526 PMCID: PMC5542276 DOI: 10.18632/oncotarget.18189
Source DB: PubMed Journal: Oncotarget ISSN: 1949-2553
Figure 1. Pipeline to identify prognostic genes
Figure 2. The dependency network in ovarian cancer
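
The abstract describes DirGenerank only at a high level: a GeneRank-style ranking that combines a directed gene dependency network with each gene's correlation to prognostic risk, and that favors genes with many outgoing edges. The exact update rule is not given in this record, so the following is only a minimal sketch under stated assumptions, not the authors' algorithm. It assumes an edge `(u, v)` means gene `v` depends on gene `u`, that score flows from `v` back to `u` divided by the in-degree of `v`, and that the damping factor `d = 0.5` is appropriate. The function name `directed_generank` and all parameters are hypothetical.

```python
def directed_generank(edges, prior, d=0.5, tol=1e-10, max_iter=1000):
    """GeneRank-style power iteration on a directed graph (illustrative only).

    edges: iterable of (u, v) pairs, assumed to mean "v depends on u".
    prior: dict gene -> prognostic relevance (e.g. |Cox coefficient|);
           every gene that appears in an edge must be a key of `prior`.
    d:     damping factor in [0, 1); d = 0 returns the normalized prior.
    """
    genes = list(prior)
    total = sum(abs(x) for x in prior.values())
    if total == 0:
        raise ValueError("prior must contain at least one non-zero value")
    ex = {g: abs(prior[g]) / total for g in genes}  # normalized prior

    targets = {g: [] for g in genes}  # outgoing neighbours of each gene
    in_deg = {g: 0 for g in genes}
    for u, v in edges:
        targets[u].append(v)
        in_deg[v] += 1

    r = dict(ex)
    for _ in range(max_iter):
        # Assumed update: a gene inherits a share of the score of every
        # gene that depends on it, so high out-degree genes rank higher.
        new = {
            u: (1 - d) * ex[u] + d * sum(r[v] / in_deg[v] for v in targets[u])
            for u in genes
        }
        delta = max(abs(new[g] - r[g]) for g in genes)
        r = new
        if delta < tol:
            break
    return r


# Toy network: A modulates B and C, B modulates C; equal priors.
scores = directed_generank([("A", "B"), ("A", "C"), ("B", "C")],
                           {"A": 1.0, "B": 1.0, "C": 1.0})
top = sorted(scores, key=scores.get, reverse=True)
```

With equal priors, the toy example ranks the genes by how much of the network depends on them, which mirrors the out-degree intuition stated in the abstract. Selecting a signature would then amount to taking the top-ranked genes, for example the top 40.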
The prognostic genes identified by our pipeline
| Gene Id | Cox coefficient | Stability | Gene symbol | Description |
|---|---|---|---|---|
| | −0.611 | 397 | HNF4G | hepatocyte nuclear factor 4, gamma |
| | −0.542 | 400 | FGF5 | fibroblast growth factor 5 |
| | −0.517 | 375 | NFYC | nuclear transcription factor Y, gamma |
| | −0.465 | 400 | ANGPTL3 | angiopoietin-like 3 |
| | −0.460 | 399 | SCMH1 | sex comb on midleg homolog 1 (Drosophila) |
| | −0.430 | 393 | HABP2 | hyaluronan binding protein 2 |
| | −0.422 | 358 | CSNK1D | casein kinase 1, delta |
| | −0.413 | 392 | FOXK2 | forkhead box K2 |
| | −0.332 | 400 | HOXA11 | homeobox A11 |
| | −0.281 | 240 | ETS1 | v-ets erythroblastosis virus E26 oncogene homolog 1 (avian) |
| | −0.264 | 386 | BICD1 | bicaudal D homolog 1 (Drosophila) |
| | −0.235 | 281 | AQP5 | aquaporin 5 |
| | −0.204 | 314 | B3GALNT1 | beta-1,3-N-acetylgalactosaminyltransferase 1 (globoside blood group) |
| | −0.203 | 399 | LAMC2 | laminin, gamma 2 |
| | −0.193 | 299 | IL6R | interleukin 6 receptor |
| | −0.185 | 346 | CCL17 | chemokine (C-C motif) ligand 17 |
| | −0.132 | 399 | CRABP1 | cellular retinoic acid binding protein 1 |
| | −0.130 | 381 | CAPN9 | calpain 9 |
| | −0.0935 | 0 | NR1I2 | nuclear receptor subfamily 1, group I, member 2 |
| | −0.0762 | 31 | CXCL13 | chemokine (C-X-C motif) ligand 13 |
| | −0.0669 | 0 | PIK3R3 | phosphoinositide-3-kinase, regulatory subunit 3 (gamma) |
| | 0.0142 | 0 | AVPR2 | arginine vasopressin receptor 2 |
| | 0.0616 | 0 | SPHK1 | sphingosine kinase 1 |
| | 0.10637585 | 391 | BNC1 | basonuclin 1 |
| | 0.161 | 97 | DLX2 | distal-less homeobox 2 |
| | 0.190 | 395 | LYVE1 | lymphatic vessel endothelial hyaluronan receptor 1 |
| | 0.193 | 319 | FABP7 | fatty acid binding protein 7, brain |
| | 0.209 | 391 | SNCA | synuclein, alpha (non A4 component of amyloid precursor) |
| | 0.236 | 399 | PTGER3 | prostaglandin E receptor 3 (subtype EP3) |
| | 0.253 | 398 | HSPB7 | heat shock 27kDa protein family, member 7 (cardiovascular) |
| | 0.262 | 320 | OTOR | otoraplin |
| | 0.296 | 399 | APOC2 | apolipoprotein C-II |
| | 0.310 | 396 | LILRA2 | leukocyte immunoglobulin-like receptor, subfamily A (with TM domain), member 2 |
| | 0.350 | 378 | HAS1 | hyaluronan synthase 1 |
| | 0.367 | 396 | CPNE1 | copine I |
| | 0.406 | 385 | APC | adenomatous polyposis coli |
| | 0.456 | 400 | TJP1 | tight junction protein 1 (zona occludens 1) |
| | 0.4778 | 356 | ENG | endoglin |
| | 0.510 | 397 | AP3S1 | adaptor-related protein complex 3, sigma 1 subunit |
| | 0.887 | 329 | IL3 | interleukin 3 (colony-stimulating factor, multiple) |
For each gene, the Cox coefficient is the average coefficient over the 400 Cox proportional hazards regressions fitted on the 400 resampled data sets. The stability of a gene is the number of those 400 regressions in which the gene is significant.
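
The two summary columns can be illustrated with a short sketch. It assumes the per-resample Cox coefficients and p-values for one gene are already available as lists, and that "significant" means a p-value below 0.05; that threshold is an assumption, since the note above does not state it. The function name `summarize_resamples` is hypothetical.

```python
def summarize_resamples(coefs, pvalues, alpha=0.05):
    """Summarize one gene across resampled Cox regressions.

    coefs:   Cox coefficients, one per resampled data set.
    pvalues: matching p-values, one per resampled data set.
    alpha:   significance threshold (assumed here, not given in the source).
    Returns (mean coefficient, stability), where stability counts the
    resamples in which the gene was significant.
    """
    if not coefs or len(coefs) != len(pvalues):
        raise ValueError("coefs and pvalues must be non-empty and equal in length")
    mean_coef = sum(coefs) / len(coefs)
    stability = sum(1 for p in pvalues if p < alpha)
    return mean_coef, stability


# Four resamples instead of 400, purely for illustration.
mean_coef, stability = summarize_resamples(
    [-0.6, -0.5, -0.7, -0.6], [0.01, 0.20, 0.03, 0.049]
)
```

Under this reading, a stability of 400 means the gene was significant in every resample, while a stability of 0 means it was never individually significant even though it was retained in the signature.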
Function annotation of the prognostic genes
| Pathways | p-value | FDR q-value |
|---|---|---|
| | 9.63E-06 | 1.79E-03 |
| | 8.64E-05 | 8.03E-03 |
| | 3.39E-04 | 2.10E-02 |
| | 6.14E-04 | 2.86E-02 |
| | 8.90E-04 | 2.95E-02 |
| | 9.53E-04 | 2.95E-02 |
| | 1.35E-03 | 3.50E-02 |
| | 1.72E-03 | 3.50E-02 |
| | 1.77E-03 | 3.50E-02 |
| | 2.02E-03 | 3.50E-02 |
| | 2.18E-03 | 3.50E-02 |
| | 2.46E-03 | 3.50E-02 |
| | 2.70E-03 | 3.50E-02 |
| | 2.70E-03 | 3.50E-02 |
| | 2.82E-03 | 3.50E-02 |
| | 3.26E-03 | 3.79E-02 |
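
The FDR q-values listed above appear consistent with a Benjamini-Hochberg adjustment over 186 tested pathways (for example, 9.63E-06 × 186 / 1 ≈ 1.79E-03). Neither the method nor the number of tests is stated in this record, so both are inferences from the numbers. The sketch below shows the adjustment under that assumption; `bh_qvalues` and its parameter `m` (the total number of tests, which may exceed the number of p-values supplied) are hypothetical names.

```python
def bh_qvalues(pvalues, m=None):
    """Benjamini-Hochberg q-values.

    pvalues: p-values to adjust (here, the smallest ones from a larger test set).
    m:       total number of tests performed; defaults to len(pvalues).
             When pvalues is only the top of a longer list, the result is
             exact only if the omitted tests do not lower the running minimum.
    """
    n = len(pvalues)
    m = n if m is None else m
    order = sorted(range(n), key=lambda i: pvalues[i])
    q = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        q[i] = running_min
    return q


# The 16 p-values from the table, adjusted as if 186 pathways were tested.
table_p = [9.63e-06, 8.64e-05, 3.39e-04, 6.14e-04, 8.90e-04, 9.53e-04,
           1.35e-03, 1.72e-03, 1.77e-03, 2.02e-03, 2.18e-03, 2.46e-03,
           2.70e-03, 2.70e-03, 2.82e-03, 3.26e-03]
table_q = bh_qvalues(table_p, m=186)
```

The backward running minimum is what produces the runs of identical q-values in the table, such as the repeated 3.50E-02.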
Figure 3. Survival analysis of the patients divided by the prognostic genes in four data sets
Figure 4. Survival analysis of patients divided by the prognostic genes in the merged data set
Figure 5. Survival analysis of the patients stratified by age and stage
(a) Survival analysis of the patients in the younger group. (b) Survival analysis of the patients in the older group. (c) Survival analysis of the patients in the low-stage group. (d) Survival analysis of the patients in the high-stage group.
Drugs selected by CMAP using prognostic genes
| Rank | CMAP name | p-value | Tag |
|---|---|---|---|
| | trichostatin A | 0 | true |
| | PHA-00745360 | 0.00018 | unclear |
| | mephenytoin | 0.00022 | true |
| | Gly-His-Lys | 0.00026 | unclear |
| | resveratrol | 0.0003 | true |
| | quinpirole | 0.00034 | unclear |
| | etiocholanolone | 0.00052 | unclear |
| | vorinostat | 0.00064 | true |
| | aciclovir | 0.00127 | unclear |
| | 0175029-0000 | 0.00177 | unclear |
| | GW-8510 | 0.00223 | unclear |
| | dantrolene | 0.00226 | false |
| | irinotecan | 0.00246 | true |
| | folic acid | 0.00328 | true |
| | midodrine | 0.0033 | false |
| | lobelanidine | 0.00442 | unclear |
| | alsterpaullone | 0.00503 | unclear |
| | tranylcypromine | 0.00611 | false |
| | isometheptene | 0.00629 | false |
| | Prestwick-857 | 0.00631 | unclear |
| | morantel | 0.00647 | unclear |
| | clebopride | 0.00712 | unclear |
| | levomepromazine | 0.00722 | unclear |
| | piribedil | 0.00859 | false |
| | pentamidine | 0.00897 | true |
| | Prestwick-691 | 0.00937 | unclear |
| | Prestwick-664 | 0.00985 | unclear |
| | vinblastine | 0.00988 | true |
Ovarian cancer data sets used in this work
| Data set | Number of samples | Usage | Site |
|---|---|---|---|
| | 300 | Training | |
| | 267 | Test | |
| | 260 | Independent test | |
| | 110 | Independent test | |
| | 185 | Independent test | |
| | 1287 | Independent test | |
It should be mentioned that the TCGA data set was downloaded from http://tcga-data.nci.nih.gov/tcga/; the data are now hosted at https://portal.gdc.cancer.gov/.