Wei Zhang, Yifu Zeng, Lei Wang, Yue Liu, Yi-Nan Cheng.
Abstract
Identifying the molecular modules that drive cancer progression can greatly deepen the understanding of cancer mechanisms and provide useful information for targeted therapies. Most current methods for this problem rely primarily on mutual exclusivity and do not make full use of other module properties. In this paper, we propose MCLCluster to identify cancer driver modules. It uses somatic mutation data, Cancer Cell Fraction (CCF) data, a gene functional interaction network, and a protein-protein interaction (PPI) network to derive three module properties: mutual exclusivity, connectivity in the PPI network, and functional similarity of genes. We take three measures to ensure the effectiveness of the algorithm. First, we use CCF data to select stronger signals and more confident mutations. Second, we use the weighted gene functional interaction network to quantify gene functional similarity in the PPI network. Third, we apply a graph clustering method based on Markov clustering to extract candidate modules. MCLCluster is tested on two TCGA datasets (GBM and BRCA) and identifies several well-known oncogenic driver modules as well as some modules with functionally associated driver genes. We also compare it with Multi-Dendrix, FSME Cluster, and RME on simulated datasets with varying background noise and passenger rates; MCLCluster outperforms all of these methods.
Keywords: Markov clustering; connectivity; driver modules; functionally similarity; mutual exclusivity
Year: 2020 PMID: 32318558 PMCID: PMC7154174 DOI: 10.3389/fbioe.2020.00271
Source DB: PubMed Journal: Front Bioeng Biotechnol ISSN: 2296-4185
Figure 1. Overview of MCLCluster. (A) Integrate CCF data to select stronger signals and more confident mutations, and compute the mutual exclusivity of each gene pair. (B) Use the weighted gene functional interaction network to quantify gene functional similarity in the PPI network. (C) Compute the total similarity as the edge weight, then run Markov clustering to extract candidate modules.
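Step (C) can be illustrated with a minimal sketch of basic Markov clustering on a weighted similarity graph. This is not the authors' implementation: the function name `markov_cluster`, the expansion and inflation parameters, the self-loop handling, and the toy weight matrix are all assumptions chosen for illustration, and the source does not state how the total similarity is combined from its components.

```python
import numpy as np

def markov_cluster(weights, expansion=2, inflation=2.0, max_iter=100, tol=1e-6):
    """Basic Markov clustering (MCL) on a symmetric, non-negative weight matrix.

    Hypothetical helper for illustration; parameter defaults are assumptions.
    Returns a sorted list of clusters, each a tuple of node indices.
    """
    m = np.array(weights, dtype=float)
    # Add self-loops so every column has mass and the random walk is aperiodic.
    np.fill_diagonal(m, m.max() if m.max() > 0 else 1.0)
    m /= m.sum(axis=0, keepdims=True)  # make columns sum to 1
    for _ in range(max_iter):
        prev = m
        m = np.linalg.matrix_power(m, expansion)  # expansion: spread flow
        m = np.power(m, inflation)                # inflation: favor strong flow
        m /= m.sum(axis=0, keepdims=True)
        if np.abs(m - prev).max() < tol:
            break
    # Rows that retain mass are attractors; the columns they cover form a cluster.
    clusters = set()
    for row in m:
        members = tuple(int(i) for i in np.flatnonzero(row > 1e-6))
        if members:
            clusters.add(members)
    return sorted(clusters)

# Toy graph (made-up weights): two tightly linked triples joined by one weak edge.
toy = np.zeros((6, 6))
for group in ([0, 1, 2], [3, 4, 5]):
    for i in group:
        for j in group:
            if i != j:
                toy[i, j] = 1.0
toy[2, 3] = toy[3, 2] = 0.1

clusters = markov_cluster(toy)
print(clusters)
```

In this toy graph the weak bridging edge loses its flow under repeated inflation, so the two triples separate into distinct candidate modules.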
Results of GBM.
| 1 | CDKN2B CDK4 RB1 ERBB2 | 4 | 76% | 0 | 0.834 |
| 2 | TP53 MDM2 MDM4 | 3 | 82% | 0.001 | 0.766 |
| 3 | PTEN PIK3R1 NF1 EGFR | 4 | 78% | 0.001 | 0.741 |
Figure 2. The three driver modules identified in the GBM data and the interactions among the genes in each module. Node color shows each gene's role in the different signaling pathways of GBM.
Results of BRCA.
| 1 | PTEN PIK3CA PIK3R1 AKT1 | 4 | 72% | 0 | 0.824 |
| 2 | TRPS1 ZNF217 FBXO31 | 3 | 74% | 0 | 0.811 |
| 3 | TP53 CDH1 MYC | 3 | 80% | 0.001 | 0.721 |
| 4 | FBXO31 RB1 CCND1 | 3 | 70% | 0.001 | 0.714 |
Figure 3. The four driver modules identified in the BRCA data and the interactions among the genes in each module. Node color shows each gene's role in the different signaling pathways of BRCA.
Figure 4. F1 scores of MCLCluster, Multi-Dendrix, FSME Cluster, and RME on simulated data with one module, across different passenger rates. (A) Noise = 0.05. (B) Noise = 0.07. (C) Noise = 0.09. (D) Noise = 0.11.
Figure 5. F1 scores of MCLCluster, Multi-Dendrix, FSME Cluster, and RME on simulated data with multiple modules.
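The F1 comparison in Figures 4 and 5 can be illustrated with a short sketch. The text here does not give the exact definition used for module recovery, so the version below assumes a gene-level set comparison between a predicted module and a planted (true) module; the function name `module_f1` and the example gene sets are hypothetical.

```python
def module_f1(predicted, truth):
    """F1 score between a predicted gene set and a planted gene set.

    Assumes gene-level overlap is the unit of comparison (an assumption,
    not necessarily the definition used in the paper).
    """
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(truth)
    return 2 * precision * recall / (precision + recall)

# Made-up example: two of three predicted genes belong to the planted module.
score = module_f1({"TP53", "MDM2", "EGFR"}, {"TP53", "MDM2", "MDM4"})
print(round(score, 3))
```

Here precision and recall are both 2/3, so the F1 score is also 2/3.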