Shivashankar H Nagaraj, Antonio Reverter.
Abstract
BACKGROUND: Cancer has remarkable complexity at the molecular level, with multiple genes, proteins, pathways and regulatory interconnections being affected. We introduce a systems biology approach to study cancer that formally integrates the available genetic, transcriptomic, epigenetic and molecular knowledge on cancer biology and, as a proof of concept, we apply it to colorectal cancer.
Year: 2011 PMID: 21352556 PMCID: PMC3051904 DOI: 10.1186/1752-0509-5-35
Source DB: PubMed Journal: BMC Syst Biol ISSN: 1752-0509
Figure 1. The schema for the identification of novel genes associated with complex diseases. The expression profiles from the cancer data are analyzed to predict differentially expressed and condition-specific genes. The functional attributes over-represented in cancer are selected and representative datasets from public resources mined. The common cancer fingerprints from cancer-associated genes are processed through Boolean logic to develop a guilt-by-association classifier which, applied to non-cancer-associated genes, predicts novel candidate cancer-associated genes. Finally, novel candidate genes are further analyzed using network theory approaches.
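As a rough illustration of the guilt-by-association idea in this schema, the sketch below scores each Boolean attribute profile by the fraction of its genes that are already cancer-associated, then ranks the remaining genes by the score of their profile. This scoring rule, the function names and the toy gene data are assumptions made for illustration; the record does not specify the authors' exact classifier.

```python
from collections import Counter


def profile_scores(profiles, cancer_genes):
    """Map each Boolean profile to the fraction of its genes that are known cancer genes.

    Assumed scoring rule for illustration, not the published classifier.
    """
    total = Counter(profiles.values())
    cancer = Counter(p for gene, p in profiles.items() if gene in cancer_genes)
    return {p: cancer[p] / n for p, n in total.items()}


def rank_candidates(profiles, cancer_genes):
    """Rank non-cancer genes by the score of their profile (highest first)."""
    scores = profile_scores(profiles, cancer_genes)
    candidates = [
        (gene, scores[p]) for gene, p in profiles.items() if gene not in cancer_genes
    ]
    return sorted(candidates, key=lambda item: (-item[1], item[0]))


# Hypothetical toy data: gene -> binarized attribute profile (3 attributes).
profiles = {
    "geneA": "101", "geneB": "101", "geneC": "101",
    "geneD": "010", "geneE": "010",
    "geneF": "000",
}
cancer_genes = {"geneA", "geneB", "geneD"}

ranking = rank_candidates(profiles, cancer_genes)
for gene, score in ranking:
    print(gene, round(score, 3))
```

In this toy example, geneC shares its profile with two known cancer genes and therefore ranks first among the candidates.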
Overview of the genetic, epigenetic and molecular information used in this study
| Functional Attribute | Role in Cancer | Potential application | Examples | Data source | Reference |
|---|---|---|---|---|---|
| Cancer associated genes | Genes with at least 2 mutations causally implicated in cancer. Includes oncogenes and tumor suppressor genes | Potential drug targets and diagnostic or prognostic markers | Oncogenes: | NA | |
| Non-cancer associated genes | There is no previous report of any causal mutation. | If cancer association is established, these genes are potential drug targets and diagnostic or prognostic markers | | NCBI - Human Genome | NA |
| Kinases | More than 30% of cancer related genes are kinases, and the most common domain encoded by cancer genes is the protein kinase domain | Drug targets through inhibitors | | Human Kinome Consortium | [ |
| Excretory-secretory (ES) proteins | Malignant tumors secrete increased levels of ES proteins | Non-invasive diagnostic or prognostic markers for early detection | alpha-fetoprotein, | Secreted Protein Database (SPD) | [ |
| Transcription factors | Overactivity of TFs at different stages of cancer is well documented, and novel treatment strategies have been suggested for targeted inhibition of oncogenic TFs | Alternative therapeutic strategy, potential drug targets | | Genomatix | [ |
| DNA Methylation | Methylation patterns are altered in cancer cells, as shown by hypomethylation of oncogenes and hypermethylation of tumor suppressors, resulting in gene silencing or gene inactivation | CpG island methylation could be used as a biomarker of malignant cells | | Human Colon Methylome from [ | [ |
| Post-translational modifications | Key proteins driving oncogenesis can undergo PTM. Although phosphorylation is partially covered in the kinases section, other PTMs such as glycosylation and ubiquitination, reported to play a role in malignancies, are included as separate functional gene attributes. | | | HPRD | [ |
Figure 2. The classification of differentially expressed (DE) genes resulting from the expression data analysis. The top 15 DE genes in each of the three categories are tabulated with their expression values in normal, adenoma, carcinoma and inflammation.
Figure 3. Trends showing the distribution of genes across 13 binarized Boolean variables. Four classes of genes were used for the comparison: (i) all the genes in the human genome (21 892), (ii) cancer-associated genes (749), (iii) GBA-ranked candidate genes (1017) and (iv) top candidate genes (134, 13.2% of the GBA-ranked candidate genes). The PTM and SEC classes are enriched in cancer-associated genes as well as in the candidate gene categories.
Figure 4. Two-step computational validation approach to ascertain the inferential validity of the proposed GBA. (A) The ratio of the average Boolean score given to cancer genes over the average score given to the other genes. Candidate genes comprising the top 13.2% of genes guarantee a 2.71-fold over-representation of cancer genes. (B) Standard cross-validation in which the proportion of cancer-associated genes is compared against genes with extreme Boolean scores. Selecting the 50% most extreme genes captures 90% of all cancer genes.
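The two validation quantities described in this caption can be sketched as follows. The example assumes that each gene has a single numeric Boolean score and that "most extreme" means highest-scoring; both are assumptions, as are the function names and the toy scores, which are not data from the study.

```python
import math


def fold_enrichment(scores, cancer_genes):
    """Ratio of the mean score of cancer genes to the mean score of all other genes."""
    cancer = [s for gene, s in scores.items() if gene in cancer_genes]
    others = [s for gene, s in scores.items() if gene not in cancer_genes]
    return (sum(cancer) / len(cancer)) / (sum(others) / len(others))


def capture_rate(scores, cancer_genes, top_fraction):
    """Fraction of cancer genes found among the top-scoring fraction of all genes.

    Assumes "extreme" means highest score; ties are broken by gene name.
    """
    k = math.ceil(top_fraction * len(scores))
    ranked = sorted(scores, key=lambda gene: (-scores[gene], gene))
    top = set(ranked[:k])
    return len(top & cancer_genes) / len(cancer_genes)


# Hypothetical toy scores, not values from the paper.
scores = {"g1": 0.8, "g2": 0.4, "g3": 0.5, "g4": 0.1}
cancer_genes = {"g1", "g2"}

fold = fold_enrichment(scores, cancer_genes)        # mean 0.6 over mean 0.3
captured = capture_rate(scores, cancer_genes, 0.5)  # top 2 genes are g1 and g3
print(round(fold, 2), captured)
```

Under these assumptions, the 2.71-fold figure in panel A corresponds to `fold_enrichment`, and the statement that 50% of genes capture 90% of cancer genes corresponds to `capture_rate(..., 0.5)` returning 0.9.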
The top candidates identified by the GBA algorithm (genes with similar functional attributes are clustered together)
| Candidate Genes | Normal | Adenoma | Carcinoma | Inflammation | Condition Specificity | Colon tissue specificity | Secreted Proteins | Transcription Factors | Protein kinases | PTMs | DNA Methylation |
|---|---|---|---|---|---|---|---|---|---|---|---|
| | 11.01 | 5.66 | 7.52 | 8.05 | ✓ | ✓ | ✓ | ✓ | | | |
| | 6.35 | 9.2 | 10.28 | 10.48 | ✓ | ✓ | ✓ | | | | |
| | 6.51 | 5.88 | 7.71 | 7.12 | ✓ | ✓ | ✓ | ✓ | ✓ | | |
| | 10.14 | 4.76 | 6.87 | 8.21 | ✓ | ✓ | ✓ | ✓ | ✓ | | |
| | 5.71 | 10.87 | 10.8 | 12.17 | ✓ | ✓ | ✓ | ✓ | | | |
| | 8.66 | 7.36 | 8.43 | 9.04 | ✓ | | | | | | |
| | 4.18 | 3.39 | 4.61 | 3.89 | ✓ | ✓ | ✓ | | | | |
| | 9.11 | 6.15 | 6.76 | 8.26 | ✓ | ✓ | ✓ | ✓ | | | |
| | 8.31 | 7.22 | 8.69 | 8.78 | ✓ | ✓ | ✓ | | | | |
| | 2.22 | 4.8 | 3.53 | 2.55 | ✓ | ✓ | ✓ | | | | |
| | 8.62 | 8.75 | 8.96 | 8.29 | ✓ | ✓ | | | | | |
| | 8.58 | 9.97 | 8.63 | 8.12 | ✓ | ✓ | ✓ | ✓ | | | |
| | 5.16 | 4.4 | 5.47 | 5.56 | ✓ | ✓ | | | | | |
| | 5.02 | 3.36 | 4.42 | 4.71 | ✓ | ✓ | ✓ | | | | |
| | 6.93 | 8.76 | 9.01 | 7.84 | ✓ | | | | | | |
| | 10.54 | 6.27 | 8.04 | 7.08 | ✓ | ✓ | ✓ | | | | |
| | 10.42 | 6.24 | 7.69 | 9.55 | ✓ | ✓ | ✓ | ✓ | | | |
| | 4.95 | 10.34 | 10.1 | 11.19 | ✓ | ✓ | ✓ | ✓ | | | |
| | 8.9 | 8.11 | 9.28 | 10.21 | ✓ | ✓ | | | | | |
| | 11.76 | 8.76 | 9.57 | 9.81 | ✓ | ✓ | ✓ | | | | |
The properties of network connectivity:
| | Normal | Adenoma | Carcinoma | Inflammation |
|---|---|---|---|---|
| Normal | 5.18 | 2.28 | 3.31 | 4.25 |
| Adenoma | 1.20 | 4.63 | 8.26 | 5.25 |
| Carcinoma | 2.01 | 3.89 | 11.67 | 11.07 |
| Inflammation | 2.30 | 1.96 | 4.01 | 11.10 |
Clustering coefficients (%, on diagonals) and percent overlap computed from the ratio of common links divided by the total number of unique links for positive (above diagonal) and negative (below diagonal) links across each pair-wise network comparison.
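The off-diagonal overlap statistic defined here (common links divided by the total number of unique links) can be sketched as below. The edge representation and the toy networks are assumptions for illustration; in the study, positive and negative links would each be passed through such a calculation separately.

```python
def percent_overlap(edges_a, edges_b):
    """Percent of links shared by two networks, relative to all unique links in either.

    Links are treated as undirected, so ("x", "y") and ("y", "x") are the same link.
    """
    a = {frozenset(edge) for edge in edges_a}
    b = {frozenset(edge) for edge in edges_b}
    union = a | b
    if not union:
        return 0.0
    return 100.0 * len(a & b) / len(union)


# Hypothetical toy networks: one shared link out of four unique links.
network_a = [("x", "y"), ("y", "z"), ("x", "z")]
network_b = [("y", "x"), ("z", "w")]

overlap = percent_overlap(network_a, network_b)
print(overlap)
```

Here the two toy networks share one link out of four unique links, giving an overlap of 25%.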
Figure 5. The Always Conserved network visualized using the Cytoscape software at four levels of resolution: (A) connections involving at least one top candidate gene; (B) derived from A, where only genes with more than two connections are displayed; (C) derived from B, where only connections deemed significant across the four original networks (Adenoma, Carcinoma, Inflammation and Normal) are displayed; and (D) only those connections involving at least one top candidate gene in the four networks. The specific nature of edges, nodes and other features such as shape and color, along with the Cytoscape file, is provided on our website: http://www.livestockgenomics.csiro.au/courses/crc.html
The Boolean probabilistic truth table for the MEF2C gene
| No | Binarized Boolean profile | Probability values |
|---|---|---|
| 1 | 0000000000001 | 0.05094 |
| 2 | 0000000001000 | 0.23019 |
| 3 | 0000000001001 | 0.02453 |
| 4 | 0000000010000 | 0.10755 |
| 5 | 0000000010001 | 0.03396 |
| 6 | 0000000011000 | 0.07925 |
| 7 | 0000000011001 | 0.03019 |
| 8 | 0100000000000 | 0.01509 |
| 9 | 0100000000001 | 0.00377 |
| 10 | 0100000001000 | 0.00377 |
| 11 | 0100000001001 | 0.00189 |
| 12 | 0100000010000 | 0.00377 |
| 13 | 0100000010001 | 0.00189 |
| 14 | 0100000011000 | 0.00189 |