Dimitri Perrin, Guido Zuccon.
Abstract
Biological networks are highly modular and contain a large number of clusters, which are often associated with a specific biological function or disease. Identifying these clusters, or modules, is therefore valuable, but it is not trivial. In this article we propose a recursive method based on the Louvain algorithm for community detection and the PageRank algorithm for authoritativeness weighting in networks. PageRank is used to initialise the weights of nodes in the biological network; the Louvain algorithm with the Newman-Girvan criterion for modularity is then applied to the network to identify modules. Any identified module with more than k nodes is further processed by recursively applying PageRank and Louvain, until no module contains more than k nodes (where k is a parameter of the method, no greater than 100). This method is evaluated on a heterogeneous set of six biological networks from the Disease Module Identification DREAM Challenge. Empirical findings suggest that the method is effective in identifying a large number of significant modules, although with substantial variability across restarts of the method.
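The recursive scheme described in the abstract can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: the graph is a dictionary of weighted undirected edges; PageRank scores are folded into the network by rescaling each edge with the mean score of its endpoints (the abstract does not say how node weights enter the clustering step); `local_moving` implements only the first, local-moving phase of Louvain under the Newman-Girvan modularity gain, not the full algorithm with aggregation; and the recursion stops early when a module cannot be split, to avoid looping forever. All function names are hypothetical.

```python
from collections import defaultdict


def pagerank(adj, damping=0.85, iters=100):
    """Power-iteration PageRank on a weighted undirected graph {u: {v: w}}."""
    n = len(adj)
    rank = {u: 1.0 / n for u in adj}
    strength = {u: sum(nbrs.values()) for u, nbrs in adj.items()}
    for _ in range(iters):
        # Rank held by nodes without edges is spread uniformly.
        dangling = sum(rank[u] for u in adj if strength[u] == 0)
        new = {u: (1 - damping + damping * dangling) / n for u in adj}
        for u, nbrs in adj.items():
            if strength[u] == 0:
                continue
            for v, w in nbrs.items():
                new[v] += damping * rank[u] * w / strength[u]
        rank = new
    return rank


def local_moving(adj):
    """Louvain's first phase only: greedy moves that raise modularity."""
    comm = {u: u for u in adj}                      # start from singletons
    deg = {u: sum(nbrs.values()) for u, nbrs in adj.items()}
    tot = dict(deg)                                 # total degree per community
    two_m = sum(deg.values())
    improved = two_m > 0
    while improved:
        improved = False
        for u in adj:
            old = comm[u]
            tot[old] -= deg[u]                      # take u out of its community
            links = defaultdict(float)              # weight from u to each community
            for v, w in adj[u].items():
                if v != u:
                    links[comm[v]] += w
            # Gain (up to a constant factor) of placing u in community c.
            best, best_gain = old, links[old] - tot[old] * deg[u] / two_m
            for c, w_in in links.items():
                gain = w_in - tot[c] * deg[u] / two_m
                if gain > best_gain + 1e-12:
                    best, best_gain = c, gain
            tot[best] += deg[u]
            comm[u] = best
            improved = improved or best != old
    groups = defaultdict(set)
    for u, c in comm.items():
        groups[c].add(u)
    return list(groups.values())


def subgraph(adj, nodes):
    return {u: {v: w for v, w in adj[u].items() if v in nodes} for u in nodes}


def recursive_modules(adj, k=100):
    """Split the graph until no module has more than k nodes (if splittable)."""
    pr, n = pagerank(adj), len(adj)
    # ASSUMPTION: node scores enter as a symmetric rescaling of edge weights.
    weighted = {u: {v: w * n * (pr[u] + pr[v]) / 2 for v, w in nbrs.items()}
                for u, nbrs in adj.items()}
    parts = local_moving(weighted)
    if len(parts) == 1:
        return parts                                # guard: cannot split further
    modules = []
    for part in parts:
        if len(part) > k:
            modules.extend(recursive_modules(subgraph(adj, part), k))
        else:
            modules.append(part)
    return modules


def clique(nodes):
    return [(u, v) for i, u in enumerate(nodes) for v in nodes[i + 1:]]


# Toy input: two 4-node cliques joined by a single bridge edge.
graph = defaultdict(dict)
for u, v in clique([0, 1, 2, 3]) + clique([4, 5, 6, 7]) + [(3, 4)]:
    graph[u][v] = graph[v][u] = 1.0

modules = recursive_modules(graph, k=4)
print(sorted(sorted(m) for m in modules))
```

On this toy graph the two cliques come back as separate modules. For networks of the size used in the challenge, a full Louvain implementation from an established graph library would be the more realistic choice; the sketch is only meant to make the control flow of the recursion concrete.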
Keywords: Community detection; DREAM challenge; Module identification; Network biology
Year: 2018 PMID: 30271588 PMCID: PMC6143918 DOI: 10.12688/f1000research.15845.1
Source DB: PubMed Journal: F1000Res ISSN: 2046-1402
Challenge networks.
| ID | Type | # nodes | # edges | Directed |
|---|---|---|---|---|
| 1 | PPI | 17,397 | 2,232,405 | No |
| 2 | PPI | 12,420 | 397,309 | No |
| 3 | Signalling | 5,254 | 21,826 | Yes |
| 4 | Co-expression | 12,588 | 1,000,000 | No |
| 5 | Cancer | 14,679 | 1,000,000 | No |
| 6 | Homology | 10,405 | 4,223,606 | No |
Figure 1. Conversion of a directed network into an undirected one.
Figure 2. Overall algorithm.
Figure 3. Results on each network as a function of the value for k.
White and red dots represent the median and mean values for each configuration, respectively. The blue line indicates our performance in the challenge leaderboard for that network, and the red line that of the best submission for that network.