Andrea Gobbi, Francesco Iorio, Kevin J Dawson, David C Wedge, David Tamborero, Ludmil B Alexandrov, Nuria Lopez-Bigas, Mathew J Garnett, Giuseppe Jurman, Julio Saez-Rodriguez.
Abstract
MOTIVATION: Studying combinatorial patterns in cancer genomic datasets has recently emerged as a tool for identifying novel cancer driver networks. Approaches have been devised to quantify, for example, the tendency of a set of genes to be mutated in a 'mutually exclusive' manner. The significance of the proposed metrics is usually evaluated by computing P-values under appropriate null models. To this end, a Monte Carlo method (the switching-algorithm) is used to sample simulated datasets under a null model that preserves patient- and gene-wise mutation rates. In this method, a genomic dataset is represented as a bipartite network, to which Markov chain updates (switching-steps) are applied. These steps modify the network topology, and a minimal number of them must be executed to draw simulated datasets independently under the null model. This number has previously been deduced empirically to be a linear function of the total number of variants, which makes the process computationally expensive.
Year: 2014 PMID: 25161255 PMCID: PMC4147926 DOI: 10.1093/bioinformatics/btu474
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. BEM randomization through the switching-algorithm. A bipartite graph (B) is derived from the initial BEM by treating the BEM as the incidence matrix of the graph (A). A sequence of switching-steps (C and D) is then performed. In each step, two edges (a,b) and (c,d) are chosen at random (C); if the edges (a,d) and (c,b) do not already exist, they are added to the network and (a,b) and (c,d) are removed (D). A rewired version of the BEM is obtained as the incidence matrix of the network that results from a sufficiently long sequence of switching-steps (E).
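The switching-step described in the caption can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names (`bem_to_edges`, `edges_to_bem`, `switching_algorithm`), the list-of-lists encoding of the BEM, and the convention that a rejected swap still counts as one step are all assumptions made for the example. The number of steps is left as a parameter.

```python
import random


def bem_to_edges(bem):
    """Edge set {(row, col)} of a binary event matrix given as a list of 0/1 rows."""
    return {(i, j) for i, row in enumerate(bem) for j, v in enumerate(row) if v}


def edges_to_bem(edges, n_rows, n_cols):
    """Rebuild a 0/1 matrix from an edge set."""
    bem = [[0] * n_cols for _ in range(n_rows)]
    for i, j in edges:
        bem[i][j] = 1
    return bem


def switching_algorithm(bem, n_steps, seed=None):
    """Rewire a BEM with n_steps switching-steps, preserving row and column sums.

    Assumption: a rejected swap still consumes one step (the chain stays put).
    """
    rng = random.Random(seed)
    n_rows = len(bem)
    n_cols = len(bem[0]) if bem else 0
    edges = bem_to_edges(bem)
    edge_list = sorted(edges)  # indexable copy, sorted for reproducibility
    if len(edge_list) < 2:
        return edges_to_bem(edges, n_rows, n_cols)
    for _ in range(n_steps):
        # Pick two distinct edges (a, b) and (c, d) uniformly at random.
        i, j = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[i], edge_list[j]
        # Swap only if neither (a, d) nor (c, b) is already present.
        # This also rejects the degenerate cases a == c and b == d.
        if (a, d) in edges or (c, b) in edges:
            continue
        edges.difference_update({(a, b), (c, d)})
        edges.update({(a, d), (c, b)})
        edge_list[i], edge_list[j] = (a, d), (c, b)
    return edges_to_bem(edges, n_rows, n_cols)
```

Each accepted swap moves one event within row a and one within row c while exchanging them between columns b and d, so every row sum and column sum of the matrix is unchanged. That invariance is what preserves the patient- and gene-wise mutation rates in the null model.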
Performance comparisons in terms of execution time and residual bias across different algorithms and bounds
| (A) Execution time | |||||
| 53 min 20 s | 5 h 58 s | 43 days | 154 days | 5 h 21 min 29 s | |
| 6 h 21 min | 21 h 36 min^a | ||||
| 28 s | |||||
| 9 h | 47 days | 2 years 145 days | 8 years 114 days | ||
| 37 min 30 s | 7 h 37 min 55 s | 41 min 12 s^a | 22 h 53 min 20 s^a | ||
| (B) Residual average Jaccard similarity | |||||
| 0.006716 | 0.907788 | 0.006744 | 0.006762^a | 0.006921 | |
| 0.006744 | 0.299971 | 0.006723^a | 0.006879^a | ||
Note: ^a Estimated values.
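For readers unfamiliar with the measure in panel (B), the following is a small Python sketch of a Jaccard similarity between two BEMs. It assumes the similarity is taken between the sets of non-zero entries (the edges of the two bipartite networks); the record does not spell out the exact definition, and the function name `jaccard_similarity` is hypothetical.

```python
def jaccard_similarity(bem_x, bem_y):
    """Jaccard index between the edge sets of two 0/1 matrices of equal shape.

    Assumption: an 'edge' is any (row, col) position holding a 1.
    """
    edges_x = {(i, j) for i, row in enumerate(bem_x) for j, v in enumerate(row) if v}
    edges_y = {(i, j) for i, row in enumerate(bem_y) for j, v in enumerate(row) if v}
    union = edges_x | edges_y
    if not union:
        return 1.0  # two empty matrices are treated as identical
    return len(edges_x & edges_y) / len(union)
```

Under this reading, a value close to 1 means the rewired matrix still shares most of its entries with the original, and a value close to 0 means that few entries are shared.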
Fig. 2. ME P-value comparisons. ME P-values for 237 gene pairs, whose coverage is in the BEM derived from the colorectal cancer dataset. Positions on the two axes indicate P-values computed under two different null models, each simulated by generating 10 000 randomized versions of the original BEM through the switching-algorithm with a different number of switching-steps: our novel lower bound and the empirical one. The P-values are consistent overall, and a set of 11 gene pairs has a significant level of ME (at a false discovery rate ) under both null models.
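An empirical P-value of the kind compared in this figure can be sketched in Python as below. The ME metric itself is not defined in this record, so `exclusive_coverage` is a hypothetical stand-in statistic (the number of samples mutated in exactly one gene of a pair), and the layout with genes as rows and samples as columns is likewise an assumption. The add-one correction in `empirical_p_value` is a common convention, not necessarily the one used by the authors.

```python
def exclusive_coverage(bem, gene_1, gene_2):
    """Hypothetical ME statistic: samples (columns) mutated in exactly one of two genes (rows)."""
    return sum(1 for x, y in zip(bem[gene_1], bem[gene_2]) if x != y)


def empirical_p_value(observed, null_values):
    """Share of null statistics at least as large as the observed one.

    The +1 in numerator and denominator keeps the estimate strictly above zero.
    """
    hits = sum(1 for v in null_values if v >= observed)
    return (hits + 1) / (len(null_values) + 1)
```

In use, `null_values` would hold the statistic evaluated on each randomized BEM (10 000 of them in the figure), and `observed` the same statistic on the original BEM.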