Jianghui Xiong, Juan Liu, Simon Rayner, Yinghui Li, Shanguang Chen.
Abstract
Cancer is a disease associated with the deregulation of multiple gene networks. Microarray data have permitted researchers to identify gene panel markers for the diagnosis or prognosis of cancer, but these are not sufficient to make specific mechanistic assertions about phenotype switches. We propose a strategy to identify putative mechanisms of cancer phenotypes using protein-protein interactions (PPI). We first extracted the logic status of a PPI via the relative expression of the corresponding gene pair. The joint association of a gene pair with a cancer phenotype was calculated by entropy minimization and assessed using a support vector machine. A typical predictor is "If Src high-expression, and Cav-1 low-expression, then cancer." We achieved 90% accuracy on test data, with a majority of predictions associated with the MAPK pathway, focal adhesion, apoptosis and cell cycle. Our results can aid in the development of phenotype discrimination biomarkers and the identification of putative therapeutic interference targets for drug development.
Keywords: biomarker; cancer; phenotype discrimination; protein-protein interaction
Year: 2010 PMID: 20458363 PMCID: PMC2865773 DOI: 10.4137/cin.s3899
Source DB: PubMed Journal: Cancer Inform ISSN: 1176-9351
Example of entropy calculation from protein-protein interaction data for a gene pair.
| State S | Gene 1 status | Gene 2 status | N0 | N1 | H |
| --- | --- | --- | --- | --- | --- |
| [0 0] | 0 | 0 | 28 | 27 | 0.9998 |
| [1 0] | 1 | 0 | 1 | 50 | 0.1396 |
| [0 1] | 0 | 1 | 30 | 16 | 0.9321 |
| [1 1] | 1 | 1 | 23 | 5 | 0.6769 |
N0, number of times that the state S appeared in normal samples. N1, the number of times it appeared in cancerous samples. H, Entropy.
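The entropy column can be checked with a short sketch. It assumes H is the binary Shannon entropy (in bits) of the normal/cancer split within each state. That assumption is not stated in this record, but it reproduces the listed values for states [0 0], [0 1] and [1 1] to four decimals. For [1 0] it gives roughly 0.139, close to but not identical with the listed 0.1396. The function and variable names are illustrative only.

```python
import math


def state_entropy(n0: int, n1: int) -> float:
    """Binary Shannon entropy (bits) of the normal/cancer split for one state.

    Assumption: H = -sum(p * log2(p)) over p = N0/(N0+N1) and p = N1/(N0+N1).
    """
    total = n0 + n1
    if total == 0:
        raise ValueError("state was never observed")
    h = 0.0
    for n in (n0, n1):
        if n > 0:  # a zero count contributes nothing to the entropy
            p = n / total
            h -= p * math.log2(p)
    return h


# (N0, N1) counts copied from the table above, keyed by state S.
counts = {"[0 0]": (28, 27), "[1 0]": (1, 50), "[0 1]": (30, 16), "[1 1]": (23, 5)}
entropies = {state: state_entropy(n0, n1) for state, (n0, n1) in counts.items()}

# Entropy minimization: the state with the lowest H separates the phenotypes best.
best_state = min(entropies, key=entropies.get)
```

Under this reading, a low H means the state occurs almost exclusively in one phenotype, which is why [1 0] (1 normal sample versus 50 cancerous samples) is the most discriminating state in this example.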
Protein interaction modules predicted to be the most discriminating markers of cancer phenotype.
| Rank | Gene A | Gene B | Status A | Status B | N0 | N1 | H | Prediction accuracy |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | CANX | FAM107A | 0 | 1 | 27 | 1 | 0.222285 | 0.94 |
| 2 | ABCB1 | CAV1 | 1 | 1 | 27 | 1 | 0.222285 | 1.00 |
| 3 | COL10A1 | P4HB | 0 | 0 | 27 | 1 | 0.222285 | 0.98 |
| 4 | PAICS | CHD3 | 0 | 0 | 26 | 1 | 0.228538 | 0.98 |
| 5 | CAV1 | SRC | 1 | 1 | 26 | 1 | 0.228538 | 0.89 |
| 6 | TNFRSF1B | SSR4 | 1 | 0 | 26 | 1 | 0.228538 | 0.91 |
| 7 | LMO2 | MAPRE3 | 1 | 0 | 26 | 1 | 0.228538 | 0.96 |
| 8 | SMAD3 | EPAS1 | 0 | 1 | 26 | 1 | 0.228538 | 0.94 |
| 9 | NOS3 | CAV1 | 0 | 1 | 26 | 1 | 0.228538 | 0.94 |
| 10 | COL10A1 | P4HB | 0 | 0 | 26 | 1 | 0.228538 | 0.96 |
| 11 | PDK1 | EPAS1 | 0 | 1 | 26 | 1 | 0.228538 | 0.96 |
| 12 | LMO4 | TCF21 | 0 | 1 | 26 | 1 | 0.228538 | 0.96 |
| 13 | SKIL | SASH1 | 1 | 1 | 26 | 1 | 0.228538 | 0.96 |
| 14 | PAFAH1B1 | NDEL1 | 0 | 0 | 1 | 26 | 0.228538 | 0.89 |
| 15 | CAV1 | SRC | 0 | 1 | 1 | 26 | 0.228538 | 0.89 |
| 16 | NOS3 | CAV1 | 0 | 0 | 1 | 26 | 0.228538 | 0.94 |
N0, number of times the state S appeared in normal samples. N1, number of times it appeared in cancerous samples. H, calculated entropy. Prediction accuracy was calculated by applying leave-one-out cross-validation to an SVM classifier (see Materials and Methods for more details).
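As an illustration of how a row of this table can be read as a predictor of the form quoted in the abstract, the sketch below turns rows into "If ..., then ..." rules. It rests on several assumptions that are not spelled out in this record: the columns are taken to be rank, the two genes, their two statuses, N0, N1, H and accuracy; status 1 is taken to mean high expression and 0 low expression; and the phenotype of a rule is taken to be whichever class has the larger count. The 0.3 cutoff mirrors the entropy threshold mentioned in the signature footnote. `Module` and `to_rule` are hypothetical names.

```python
from typing import NamedTuple


class Module(NamedTuple):
    gene_a: str
    gene_b: str
    status_a: int   # assumed coding: 1 = high expression, 0 = low expression
    status_b: int
    n0: int         # occurrences of the state in normal samples
    n1: int         # occurrences of the state in cancerous samples
    entropy: float


def to_rule(m: Module) -> str:
    """Render one module as a rule; the majority class is the predicted phenotype."""
    level = {0: "low", 1: "high"}
    phenotype = "cancer" if m.n1 > m.n0 else "normal"
    return (
        f"If {m.gene_a} {level[m.status_a]}-expression and "
        f"{m.gene_b} {level[m.status_b]}-expression, then {phenotype}."
    )


# Three rows copied from the table above (ranks 5, 15 and 16).
rows = [
    Module("CAV1", "SRC", 1, 1, 26, 1, 0.228538),
    Module("CAV1", "SRC", 0, 1, 1, 26, 0.228538),
    Module("NOS3", "CAV1", 0, 0, 1, 26, 0.228538),
]

ENTROPY_CUTOFF = 0.3  # assumed to match the "Entropy <0.3" selection criterion
rules = [to_rule(m) for m in rows if m.entropy < ENTROPY_CUTOFF]

for rule in rules:
    print(rule)
```

Under these assumptions, the rank 15 row yields the rule with CAV1 low and SRC high pointing to cancer, which agrees with the typical predictor given in the abstract.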
Association of gene signatures with diseases.*
| Signature | Disease | Gene count | Enrichment ratio | P value | Genes |
| --- | --- | --- | --- | --- | --- |
| Ensemble signatures (354 genes) | CANCER | 25 | 7.06% | 5.60E-06 | TP53, PTGS2, CDKN1B, ABCB1, SFN, CCND1, AR, TGFA, ESR1, CDKN1A, EGFR, IL6, VDR, CBFB, AGER, BCL2, FAS, ALDH2, ERBB2, CDK4, NME1, HRAS, MC1R, CTNNB1, IL8 |
| | LUNG CANCER | 4 | 1.13% | 0.036935 | TP53, PTGS2, CCND1, CDKN1A |
| Cancer-specific signatures (187 genes) | CANCER | 18 | 9.63% | 6.70E-05 | IL6, CTNNB1, ALDH2, CDKN1A, ABCB1, CBFB, BCL2, TP53, TGFA, HRAS, AGER, ERBB2, SFN, ESR1, NME1, EGFR, PTGS2, AR |
| | LUNG CANCER | 3 | 1.60% | 0.09121 | CDKN1A, TP53, PTGS2 |
The 354 genes involved in gene pairs with entropy < 0.3 were selected for further analysis using the DAVID tool (http://david.abcc.ncifcrf.gov), which considers the functional assignment of the genes according to the Gene Ontology Index. These genes were defined as the “ensemble signatures”, and the 187 genes that showed a “high-expression” status in cancer samples were defined as the “cancer-specific signatures”.
Enrichment ratio is the percentage of input genes that are annotated to the given term.
P values were calculated with the DAVID tool.
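The enrichment ratios in the table can be reproduced from the gene counts and the signature sizes. This is only a sketch of the percentage arithmetic described in the footnote. It does not reproduce DAVID's P value calculation, and the function name is hypothetical.

```python
def enrichment_ratio(annotated: int, input_genes: int) -> float:
    """Percentage of input genes annotated to a term, rounded to two decimals."""
    if input_genes <= 0:
        raise ValueError("input gene list must not be empty")
    return round(100.0 * annotated / input_genes, 2)


# Counts and signature sizes copied from the table above.
ensemble_cancer = enrichment_ratio(25, 354)        # ensemble signatures, CANCER
ensemble_lung = enrichment_ratio(4, 354)           # ensemble signatures, LUNG CANCER
specific_cancer = enrichment_ratio(18, 187)        # cancer-specific signatures, CANCER
specific_lung = enrichment_ratio(3, 187)           # cancer-specific signatures, LUNG CANCER
```

These four values correspond to the 7.06%, 1.13%, 9.63% and 1.60% entries listed for the two signature sets.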
Gene Ontology terms enriched in gene signatures.
| GO term | Ensemble signatures: gene count | Ensemble signatures: % | Ensemble signatures: P value | Cancer-specific signatures: gene count | Cancer-specific signatures: % | Cancer-specific signatures: P value |
| --- | --- | --- | --- | --- | --- | --- |
| Signal transduction | 128 | 36.16% | 7.10E-14 | 74 | 39.57% | 5.05E-11 |
| Cell cycle | 52 | 14.69% | 2.55E-14 | 29 | 15.51% | 4.92E-09 |
| Cell proliferation | 40 | 11.30% | 2.53E-11 | 22 | 11.76% | 5.46E-07 |
| Protein kinase cascade | 20 | 5.65% | 1.10E-05 | 15 | 8.02% | 3.09E-06 |
| Regulation of metabolism | 93 | 26.27% | 1.10E-07 | 50 | 26.74% | 4.31E-05 |
| Apoptosis | 34 | 9.60% | 1.36E-07 | 17 | 9.09% | 4.85E-04 |
| Mitotic cell cycle | 19 | 5.37% | 5.41E-07 | 10 | 5.35% | 5.18E-04 |
| Regulation of transcription | 77 | 21.75% | 6.31E-05 | 40 | 21.39% | 4.10E-03 |
P values were calculated with the DAVID tool.
KEGG pathways enriched in gene signatures.
| KEGG pathway | Gene count | % | P value | Genes |
| --- | --- | --- | --- | --- |
| MAPK SIGNALING PATHWAY | 17 | 9.09 | 1.07E-04 | TRAF6, IKBKG, TP53, GADD45B, AKT3, MAP3K1, MAP3K3, HRAS, CHUK, MAP3K14, NFKB2, EGFR, MAP3K7IP1, TNFRSF1A, IKBKB, PRKCG, IKBKE |
| FOCAL ADHESION | 14 | 7.49 | 3.62E-04 | CTNNB1, BCL2, SRC, AKT3, CAV2, HRAS, ERBB2, CAV1, FYN, EGFR, LAMB2, PRKCG, SHC1, VCL |
| APOPTOSIS | 12 | 6.42 | 1.98E-06 | BCL2L1, MAP3K14, CHUK, IKBKG, BCL2, TP53, NFKB2, AKT3, IKBKB, TNFRSF1A, IRAK1, TRADD |
| CELL CYCLE | 12 | 6.42 | 1.35E-05 | YWHAZ, CDK2, CDKN1A, MAD2L1, SFN, PCNA, TP53, SMAD3, GADD45B, CCNE1, CREBBP, MCM6 |
| ADHERENS JUNCTION | 11 | 5.88 | 3.61E-06 | TJP1, CTNNB1, ERBB2, INSR, FYN, SMAD3, SRC, EGFR, PARD3, CREBBP, VCL |
P values were calculated with the DAVID tool.
Status of protein interaction modules leading to a cancer phenotype switch.
| Protein A status | Protein B status | Phenotype | |
| --- | --- | --- | --- |
| High | Low | Cancer | |
| High | High | Normal | |