Abstract
Recently, computational approaches integrating copy number aberrations (CNAs) and gene expression (GE) have been extensively studied to identify cancer-related genes and pathways. In this work, we integrate these two data sets with protein-protein interaction (PPI) information to find cancer-related functional modules. To integrate CNA and GE data, we first built a gene-gene relationship network from a set of seed genes by enumerating all types of pairwise correlations, e.g. GE-GE, CNA-GE, and CNA-CNA, over multiple patients. Next, we propose a voting-based cancer module identification algorithm that combines topological and data-driven properties (the VToD algorithm), using the gene-gene relationship network as the source of data-driven information and the PPI data as the topological information. We applied the VToD algorithm to 266 glioblastoma multiforme (GBM) and 96 ovarian carcinoma (OVC) samples that have both expression and copy number measurements, and identified 22 GBM modules and 23 OVC modules. Among the 22 GBM modules, 15, 12, and 20 modules were significantly enriched with cancer-related KEGG pathways, BioCarta pathways, and GO terms, respectively. Among the 23 OVC modules, 19, 18, and 23 modules were significantly enriched with cancer-related KEGG pathways, BioCarta pathways, and GO terms, respectively. Similarly, we observed that 9 and 2 GBM modules and 15 and 18 OVC modules were enriched with cancer gene census (CGC) genes and specific cancer driver genes, respectively. Our proposed module-detection algorithm significantly outperformed other existing methods in terms of both functional and cancer gene set enrichments. Most of the cancer-related pathways found by our algorithm in both cancer data sets contained more than two types of gene-gene relationships, showing strong positive correlations between the number of different types of relationship and CGC enrichment p-values (0.64 for GBM and 0.49 for OVC).
This study suggests that the identified modules, which contain both expression changes and CNAs, can offer greater insight into cancer-related activities.
Year: 2013 PMID: 23940583 PMCID: PMC3734239 DOI: 10.1371/journal.pone.0070498
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. A schematic of our approach.
(A) Gene expression data and the paired CNA data are collected. (B) A gene-gene relationship network, GGR, is constructed using direct and indirect relationships of GE-GE, CNA-GE, and CNA-CNA. (C) A novel algorithm, VToD, finds overlapping modules by combining the GGR network and PPI information. (D) The identified modules are tested for functional and cancer gene set enrichments.
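Step (B) can be illustrated with a minimal sketch, assuming that a "relationship" means a Pearson correlation across patients whose absolute value passes a cutoff. The function names, the cutoff of 0.8, and the toy profiles are hypothetical, and only direct relationships are covered; the paper's actual correlation measure and its handling of indirect relationships are not specified in this section.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no correlation signal
    return sxy / sqrt(sxx * syy)


def build_ggr(ge, cna, cutoff=0.8):
    """Return {(gene_a, gene_b): set of relationship types}.

    ge and cna map gene -> per-patient values (same patient order).
    cutoff is a hypothetical threshold on |r|, not a value from the paper.
    """
    edges = {}
    genes = sorted(set(ge) & set(cna))
    for a, b in combinations(genes, 2):
        candidates = {
            "GE-GE": [(ge[a], ge[b])],
            # CNA-GE is directional, so both directions are checked
            "CNA-GE": [(cna[a], ge[b]), (cna[b], ge[a])],
            "CNA-CNA": [(cna[a], cna[b])],
        }
        kinds = {
            kind
            for kind, pairs in candidates.items()
            if any(abs(pearson(x, y)) >= cutoff for x, y in pairs)
        }
        if kinds:
            edges[(a, b)] = kinds
    return edges


# Toy data for four patients (made up for illustration only)
ge = {"G1": [1, 2, 3, 4], "G2": [2, 4, 6, 8], "G3": [5, 1, 4, 2]}
cna = {"G1": [0, 0, 1, 1], "G2": [0, 0, 1, 1], "G3": [1, 0, 0, 1]}
edges = build_ggr(ge, cna)
```

In this toy example only the pair G1 and G2 is connected, and it carries all three relationship types, which is the kind of multi-type edge the abstract associates with cancer-related pathways.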
Figure 2. Comparative functional enrichments of pre-modules generated using different vote thresholds.
(A) is for GBM and (B) is for OVC. Bars represent fractions of modules enriched with KEGG, BioCarta, GO biological process, cancer-related KEGG, cancer-related BioCarta, cancer-related GO biological process, and cancer gene census (CGC) for three different vote thresholds. Additionally, in each case, vote-values were computed using only topological properties, using only data-driven properties, and by combining them to compare their individual effects on performance. The numbers of genes (nGS) in each pre-module set are shown correspondingly.
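The comparison in this figure can be pictured with a small sketch, assuming each candidate gene has one topological score and one data-driven score for a given seed, and that a gene joins the seed's pre-module when its vote value reaches the threshold. The averaging rule, the score scale, and all names below are hypothetical stand-ins; this section does not give the actual VToD vote formula.

```python
def vote_value(topological, data_driven, mode="combined"):
    """Vote value under one of the three settings compared in the figure.

    The 'combined' rule (a plain average) is an assumption for
    illustration, not the formula used by VToD.
    """
    if mode == "topological":
        return topological
    if mode == "data":
        return data_driven
    if mode == "combined":
        return (topological + data_driven) / 2
    raise ValueError(f"unknown mode: {mode}")


def pre_module(scores, threshold, mode="combined"):
    """Genes whose vote value for the seed reaches the threshold."""
    return sorted(
        gene
        for gene, (topo, data) in scores.items()
        if vote_value(topo, data, mode) >= threshold
    )


# Hypothetical (topological, data-driven) scores for three genes
scores = {"g1": (0.9, 0.7), "g2": (0.8, 0.2), "g3": (0.1, 0.9)}

by_mode = {
    mode: pre_module(scores, 0.6, mode)
    for mode in ("topological", "data", "combined")
}
```

With these made-up scores, the topological-only and data-only settings each admit a gene that the combined setting rejects, which is the sort of difference the three bar groups in the figure are meant to expose.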
Summary of functional and cancer gene set enrichments for selected GBM modules (sorted by driver gene set enrichment).
| Module ID (size) | # of enriched pathways | % of gene-gene direct relations | # of CGC & GBM genes (p-value) | Enriched cancer genes in modules ‡ |
| 12 (10) | 31, 40, 51 & 34, 37, 57 | 26.67%, 6.67%, 6.67% | 4 (9.15…) & 3 (2.0…) | … |
| 2 (48) | 37, 49, 73 & 40, 48, 92 | 26.59%, 0.79%, 0.71% | 10 (1.05…) & 2 (1.02…) | …, DDX5, MDM2, MDM4, NPM1, DAXX, … |
| 17 (11) | 29, 54, 26 & 37, 52, 38 | 41.82%, 3.64%, 1.82% | 3 (4.98…) & 1 (5.61…) | JAK2, … |
| 8 (33) | 30, 39, 42 & 34, 37, 52 | 30.64%, 1.33%, 3.79% | 6 (4.95…) & 1 (1.32…) | …, MET, MYC |
| 1 (55) | 30, 49, 21 & 37, 37, 24 | 24.51%, 1.62%, 0.54% | 6 (6.91…) & 1 (1.8…) | APC, BRAF, PPP2R1A, RAF1, WT1 |
Footnotes: KEGG, BioCarta, GO term; cancer-related subset of KEGG, cancer-related subset of BioCarta, cancer-related subset of GO term; GE-GE, CNA-GE, CNA-CNA relationships.
Gene symbols in bold text are GBM-related genes; the remaining symbols are CGC genes.
Figure 3. Analysis of GBM Module 2.
(A) A network view of GBM Module 2 using only direct relationships, drawn with Cytoscape [70]. Genes are grouped by their overlap with BioCarta pathways, and the percentages of samples with CNAs and GE changes are shown. CGC genes are colored olive and GBM genes purple. Cytoband and Amp/Del (or Alteration-Expression Changes) information for CNA-CNA (or CNA-GE) pairs is shown in the inset table. (B) Pathway enrichment tests with KEGG and BioCarta pathways for this module. Blue bars indicate the enrichment p-values of pathways, and red bars indicate the p-values of the overlap between each pathway and the GBM driver genes. Black vertical bars mark the p-value threshold of 0.05, and the width of each horizontal bar depends on the p-value. (C) Red bars show the p-values of the overlap with CGC and GBM driver genes.
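An enrichment or overlap p-value of the kind plotted in panels (B) and (C) is commonly computed with a one-sided hypergeometric test. The sketch below assumes that test; this section does not state which test the authors used, and the function name and toy numbers are hypothetical. Only the 0.05 threshold comes from the caption.

```python
from math import comb


def hypergeom_enrichment_p(universe, set_size, module_size, overlap):
    """P(X >= overlap) for a hypergeometric draw.

    universe:    number of background genes
    set_size:    genes in the pathway or cancer gene set
    module_size: genes in the module (drawn without replacement)
    overlap:     genes shared by the module and the gene set
    """
    upper = min(set_size, module_size)
    total = comb(universe, module_size)
    # math.comb returns 0 when k > n, so impossible terms vanish
    tail = sum(
        comb(set_size, k) * comb(universe - set_size, module_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy numbers: 10 background genes, a 3-gene set, a 3-gene module,
# and all 3 module genes falling inside the set.
p_toy = hypergeom_enrichment_p(universe=10, set_size=3, module_size=3, overlap=3)
is_significant = p_toy < 0.05  # threshold taken from the figure caption
```

For the toy case there is exactly one way out of C(10, 3) = 120 to draw all three set members, so the p-value is 1/120 and falls below the 0.05 threshold.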
Summary of functional and cancer gene set enrichments for selected OVC modules (sorted by driver gene set enrichment).
| Module ID (size) | # of enriched pathways | % of gene-gene direct relations | # of CGC & OVC genes (p-value) | Enriched cancer genes in modules ‡ |
| 4 (182) | 41, 54, 102 & 45, 47, 108 | 0.67%, 3.05%, 1.05% | 16 (2.60…) & 11 (8.83…) | AKAP9, AKT1, …, MLLT4, PIM1, RAD1, SRGAP3, ZBTB16, … |
| 10 (51) | 48, 92, 82 & 50, 82, 88 | 0.23%, 2.59%, 9.96% | 11 (1.95…) & 7 (6.48…) | AKT1, CARD11, FOXO1, FOXO3, JUN, MAP2K4, … |
| 8 (37) | 37, 59, 41 & 39, 58, 49 | 0.3%, 4.2%, 7.36% | 7 (2.08…) & 6 (5.23…) | PTPN11, AKT1, HRAS, LIFR, … |
| 6 (253) | 41, 23, 128 & 38, 23, 122 | 4.27%, 4.27%, 8.07% | 26 (3.78…) & 9 (5.65…) | AKAP9, CBL, …, FOXO3, HSP90AB1, KLF6, MLLT4, …, PIM1, TSC1, ZBTB16, … |
| 1 (63) | 47, 95, 95 & 47, 86, 92 | 0.05%, 1.74%, 1.05% | 11 (1.78…) & 6 (8.77…) | …, CCND3, HRAS, JAK1, MAP2K4, … |
Footnotes: KEGG, BioCarta, GO term; cancer-related subset of KEGG, cancer-related subset of BioCarta, cancer-related subset of GO term; GE-GE, CNA-GE, CNA-CNA relationships.
Gene symbols in bold text are OVC-related genes; the remaining symbols are CGC genes.
Figure 4. Analysis of OVC Module 8, with a description similar to that of Figure 3.
(A) A network view of OVC Module 8 using only direct relationships. CGC genes are colored olive and OVC-related genes purple. (B) Pathway enrichment tests as in Figure 3(B); here, red bars indicate the p-values of the overlap between each pathway and the OVC-related genes. (C) Red bars show the p-values of the overlap with CGC and OVC-related genes.
Comparing VToD to other methods.
| Methods | Data sets used | Cancer types | # of modules | # of functionally enriched modules | # of enriched modules with subset of pathways or terms | # of distinct pathways or functional terms¶ | # of cancer gene enriched modules†,‡ |
| Hierarchical Clustering | GE, CNA, PPI | GBM | 216 | 14 (6.48%), 0, 13 (6.02%) | 4 (1.85%), 0, 4 (1.85%) | 51 | 5 (2.31%), 1 (0.46%) |
| … | GE, CNA, PPI | GBM | … | … | … | … | … |
| … | GE, CNA, PPI | OVC | … | … | … | … | … |
| Cerami et al. | Mutation, CNA, PPI | GBM | 10 | 1 (10%), 1 (10%), 3 (30%) | 2 (20%), 1 (10%), 2 (20%) | 68 | 2 (20%), 2 (20%) |
| MATISSE | GE, PPI | GBM | 34 | 14 (41.18%), 12 (35.29%), 12 (35.29%) | 4 (11.77%), 1 (2.9%), 1 (2.9%) | 129 | 3 (8.82%), 0 |
| MATISSE | GE, PPI | OVC | 15 | 9 (60%), 8 (53.33%), 4 (26.67%) | 3 (20%), 1 (6.7%), 0 | 78 | 7 (46.67%), 2 (13.33%) |
| … | GE, PPI | GBM | … | … | … | … | … |
| … | GE, PPI | OVC | … | … | … | … | … |
| ClusterONE | PPI only | – | 210 | 114 (54.29%), 74 (35.24%), 119 (56.67%) | 100 (47.62%), 72 (34.29%), 116 (55.24%) | 454 | 38 (18.09%), 7 (3.33%) |
Footnotes: KEGG, BioCarta, GO term; cancer-related subset of KEGG, cancer-related subset of BioCarta, cancer-related subset of GO term. Distinct enriched pathways or terms within all modules were found depending on key terminologies. Modules were counted as significantly enriched (p-value < 0.05) with CGC genes and with specific cancer-related genes (GBM-related genes and OVC-related genes).
The VToD algorithm.
[Algorithm listing: 37 numbered pseudocode steps for VToD. Only step 13 is recoverable: "Calculate a local rank and a global rank for …"]