Saharnaz Dilmaghani1, Matthias R Brust1, Carlos H C Ribeiro2, Emmanuel Kieffer3, Grégoire Danoy1,3, Pascal Bouvry1,3.
Abstract
Identifying protein complexes in protein-protein interaction (PPI) networks is often handled as a community detection problem, with algorithms generally relying exclusively on the network topology to find a solution. Advances in experimental techniques for PPI have motivated the creation of many Gene Ontology (GO) databases. Combining the functionality extracted from GO with the topological properties of the underlying PPI network yields a novel approach to identifying protein complexes. Additionally, most existing algorithms use global measures that operate on the entire network to identify communities. Using global metrics results in large communities that are often not correlated with the functionality of the proteins. Moreover, PPI network analysis shows that most biological functions possibly lie between local neighbours in PPI networks, which are not identifiable with global metrics. In this paper, we propose a local community detection algorithm, LCDA-GO, that uniquely exploits functional information from GO combined with the network topology. LCDA-GO identifies the community of each protein based on the topological and functional knowledge acquired solely from the local neighbour proteins within the PPI network. Experimental results using the Krogan dataset demonstrate that, in most cases, our algorithm outperforms state-of-the-art approaches in assessments based on Precision, Sensitivity, and particularly Composite Score. We also deployed LCDA, the local-topology-based precursor of LCDA-GO, to compare with a similar state-of-the-art approach that exclusively incorporates topological information of PPI networks for community detection. In addition to the high quality of the results, one main advantage of LCDA-GO is its low computational time complexity.
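The idea of choosing a protein's community from its first neighbours only, using both topology and functional annotations, can be illustrated with a minimal sketch. This is not the LCDA-GO procedure itself: the scoring rule (one vote per neighbour plus the Jaccard overlap of annotation sets), the function names, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict

def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def pick_local_community(node, neighbours, labels, annotations):
    """Choose a community label for `node` from its first neighbours only.

    Hypothetical rule: each labelled neighbour votes for its community with
    weight 1 (topology) plus the annotation overlap with `node` (function).
    """
    votes = defaultdict(float)
    for nb in neighbours[node]:
        if nb in labels:
            votes[labels[nb]] += 1.0 + jaccard(annotations[node], annotations[nb])
    if not votes:
        return node  # no labelled neighbour: the node starts its own community
    # Iterate labels in sorted order so ties are broken deterministically.
    return max(sorted(votes), key=votes.get)

# Toy data (made up): v has neighbours c and d in community "a", t in "x".
neighbours = {"v": ["c", "d", "t"]}
labels = {"c": "a", "d": "a", "t": "x"}
annotations = {"v": {"f1", "f2"}, "c": {"f1"}, "d": {"f3"}, "t": {"f1", "f2"}}

print(pick_local_community("v", neighbours, labels, annotations))
```

In this toy case community "a" collects 2.5 votes and "x" collects 2.0, so the topological majority outweighs the single functionally identical neighbour; a different weighting would change that balance.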
Year: 2022 PMID: 35085263 PMCID: PMC8794110 DOI: 10.1371/journal.pone.0260484
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. A snapshot of the community structures and local information that LCDA-GO operates on for node v.
The transparent area is an unknown zone that is not available during the operations. Thus, each node acts relying only on the knowledge of its first neighbours. In this example, c and d belong to community a, and t belongs to community x. The community label names the source node of the community; hence, a and x are the two communities surrounding v. The number attached to each node describes the hop-distance of the node from its community source node. In the implementation, we set hl of a source node to 1 instead of 0.
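The hop-distance convention in this caption (the source node carries 1 rather than 0) can be sketched as follows. The breadth-first traversal and the toy adjacency list are assumptions used only to show the labelling convention; this is not claimed to be how the algorithm itself computes the labels.

```python
from collections import deque

def hop_labels(adj, source):
    """Hop-distance of every reachable node from `source`, with the source labelled 1."""
    hl = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in hl:
                hl[w] = hl[u] + 1  # one hop further than the node that reached it
                queue.append(w)
    return hl

# Toy path a - c - d - v (hypothetical layout, loosely following the caption's node names).
adj = {"a": ["c"], "c": ["a", "d"], "d": ["c", "v"], "v": ["d"]}
print(hop_labels(adj, "a"))
```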
Notation used in LCDA-GO.
| Symbol | Description |
|---|---|
|  | A |
|  | Set of solutions consisting of the communities of |
|  | The current node |
| Γ( | Neighbours of node |
|  | Degree of node |
|  | Community label of |
|  | Hop-distance from the community source node |
| λ( | Community influence degree on node |
|  | Local community modularity |
Datasets of networks used for the experiments.
PPI Network
| Datasets | # nodes | # edges | avg. degree | # CC |  |
|---|---|---|---|---|---|
| Krogan [ | 2674 | 7079 | 5.29 | 62 | 2527 |
| PPI + MF | 1014 | 2135 | 4.21 | 7 | 995 |
| PPI + BP | 1154 | 2502 | 4.33 | 8 | 1130 |
| PPI + CC | 1160 | 2710 | 4.67 | 10 | 1130 |
| PPI + All | 1523 | 3708 | 4.86 | 9 | 1498 |
Gene Ontology (GO)
| Database | Proteins | # MF functions | # BP functions | # CC functions | All functions |
|---|---|---|---|---|---|
| Panther [ | 2358 | 8 | 11 | 3 | 22 |
Benchmark
| Database | Proteins | Complexes | # ∩ Krogan | # ∩ Panther |
|---|---|---|---|---|
| CYC2008 [ | 1920 | 408 | 970 | 813 |
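The average-degree column of the PPI Network rows can be cross-checked against the two columns before it. The sketch below assumes those two columns are node and edge counts of an undirected network, so that the average degree is 2·edges / nodes; the variable and function names are illustrative.

```python
import math

# (nodes, edges, reported average degree), copied from the PPI Network rows.
networks = {
    "Krogan": (2674, 7079, 5.29),
    "PPI + MF": (1014, 2135, 4.21),
    "PPI + BP": (1154, 2502, 4.33),
    "PPI + CC": (1160, 2710, 4.67),
    "PPI + All": (1523, 3708, 4.86),
}

def average_degree(n_nodes, n_edges):
    """Average degree of an undirected graph: every edge contributes to two nodes."""
    return 2 * n_edges / n_nodes

def truncate2(x):
    """Truncate (not round) a positive number to two decimals."""
    return math.floor(x * 100) / 100

for name, (n, m, reported) in networks.items():
    computed = average_degree(n, m)
    print(f"{name}: computed {computed:.4f}, reported {reported}")
```

Under this assumption the reported values agree with the computed ones truncated to two decimals.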
An overview of the communities produced by each algorithm, including our method, on the Saccharomyces cerevisiae Krogan interaction datasets.
PPI + MF
| Algorithms | MCODE | MCL | ClusterOne | LCDA | LCDA-GO |
|---|---|---|---|---|---|
| #communities | 37 | 244 | 209 | 65 | 383 |
|  | 4 | 160 | 142 | 69 | 167 |
|  | 2 | 112 | 117 | 36 | 154 |
PPI + BP
| Algorithms | MCODE | MCL | ClusterOne | LCDA | LCDA-GO |
|---|---|---|---|---|---|
| #communities | 38 | 256 | 236 | 71 | 416 |
|  | 3 | 192 | 170 | 76 | 202 |
|  | 3 | 149 | 146 | 51 | 196 |
PPI + CC
| Algorithms | MCODE | MCL | ClusterOne | LCDA | LCDA-GO |
|---|---|---|---|---|---|
| #communities | 51 | 277 | 237 | 71 | 425 |
|  | 6 | 196 | 180 | 80 | 210 |
|  | 5 | 158 | 153 | 54 | 211 |
PPI + All
| Algorithms | MCODE | MCL | ClusterOne | LCDA | LCDA-GO |
|---|---|---|---|---|---|
| #communities | 52 | 347 | 142 | 79 | 548 |
|  | 4 | 213 | 122 | 78 | 223 |
|  | 4 | 178 | 106 | 52 | 237 |
Performance comparison of the communities found by the algorithms that are based only on topology, on the Saccharomyces cerevisiae Krogan interaction datasets. θ is 0.1.
PPI + MF
| Algorithms | Precision | Recall | F-measure | Sn | PPV | Acc | Composite Score |
|---|---|---|---|---|---|---|---|
| MCODE | 0.05 | 0.01 | 0.02 | 0.02 |  | 0.11 | 0.19 |
| MCL | 0.45 |  |  | 0.26 | 0.60 |  | 1.11 |
| ClusterOne |  | 0.35 |  | 0.25 | 0.58 | 0.38 |  |
| LCDA |  | 0.16 | 0.26 |  | 0.33 | 0.31 | 1.16 |
PPI + BP
| Algorithms | Precision | Recall | F-measure | Sn | PPV | Acc | Composite Score |
|---|---|---|---|---|---|---|---|
| MCODE | 0.07 | 0.00 | 0.01 | 0.02 |  | 0.12 | 0.22 |
| MCL | 0.58 |  |  | 0.34 | 0.62 |  | 1.38 |
| ClusterOne | 0.61 | 0.41 | 0.49 | 0.31 | 0.63 | 0.44 | 1.37 |
| LCDA |  | 0.17 | 0.30 |  | 0.35 | 0.35 |  |
PPI + CC
| Algorithms | Precision | Recall | F-measure | Sn | PPV | Acc | Composite Score |
|---|---|---|---|---|---|---|---|
| MCODE | 0.10 | 0.01 | 0.02 | 0.03 |  | 0.15 | 0.28 |
| MCL | 0.57 |  |  | 0.34 | 0.65 |  | 1.39 |
| ClusterOne | 0.64 | 0.44 |  | 0.34 | 0.63 | 0.46 | 1.45 |
| LCDA |  | 0.20 | 0.31 |  | 0.34 | 0.36 |  |
PPI + All
| Algorithms | Precision | Recall | F-measure | Sn | PPV | Acc | Composite Score |
|---|---|---|---|---|---|---|---|
| MCODE | 0.08 | 0.01 | 0.02 | 0.03 |  | 0.15 | 0.26 |
| MCL | 0.51 |  |  | 0.39 | 0.63 |  | 1.40 |
| ClusterOne |  | 0.30 | 0.45 | 0.30 | 0.60 | 0.42 | 1.46 |
| LCDA | 0.66 | 0.20 | 0.30 |  | 0.31 | 0.37 |  |
Fig 2. Composite score including Precision, Sn, and Acc.
Performance of LCDA-GO on the Saccharomyces cerevisiae Krogan interaction datasets.
| Network | Precision | Recall | F-measure | Sn | PPV | Acc | Composite Score |
|---|---|---|---|---|---|---|---|
| PPI + MF | 0.40 | 0.41 | 0.41 | 0.19 | 0.62 | 0.35 | 0.94 |
| PPI + BP | 0.72 | 0.17 | 0.30 | 0.35 | 0.35 | 0.35 | 1.41 |
| PPI + CC | 0.50 | 0.51 | 0.51 | 0.27 | 0.64 | 0.41 | 1.17 |
| PPI + All | 0.43 | 0.55 | 0.48 | 0.28 | 0.65 | 0.43 | 1.15 |
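The relations between the metric columns can be checked numerically. The sketch below assumes the standard definitions (F-measure as the harmonic mean of Precision and Recall, Acc as the geometric mean of Sn and PPV) and, following the Fig 2 caption, a Composite Score that sums Precision, Sn, and Acc. It also assumes the seven columns appear in the order Precision, Recall, F-measure, Sn, PPV, Acc, Composite Score. Function names are illustrative.

```python
import math

def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

def acc(sn, ppv):
    """Geometric mean of sensitivity (Sn) and positive predictive value (PPV)."""
    return math.sqrt(sn * ppv)

def composite_score(precision, sn, acc_value):
    """Sum of Precision, Sn, and Acc (assumed from the Fig 2 caption)."""
    return precision + sn + acc_value

# PPI + MF row, in the assumed column order.
precision, recall, f_rep, sn, ppv, acc_rep, cs_rep = 0.40, 0.41, 0.41, 0.19, 0.62, 0.35, 0.94

print(f"F-measure: computed {f_measure(precision, recall):.3f}, reported {f_rep}")
print(f"Acc:       computed {acc(sn, ppv):.3f}, reported {acc_rep}")
print(f"Composite: computed {composite_score(precision, sn, acc_rep):.2f}, reported {cs_rep}")
```

Because the inputs are already rounded to two decimals, the computed values can differ from the reported ones in the last digit.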
Fig 3. Comparing the results of LCDA-GO with MTGO on the Krogan dataset.
Complexity and run time of algorithms incorporating GO on the Krogan network.
| Algorithm | Time (sec) | Complexity |
|---|---|---|
| LCDA-GO | 47.05 |  |
| MTGO | 54000 |  |
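To put the two run times on a common scale, the ratio and the conversion to hours can be computed directly from the values in the table; the variable names are illustrative.

```python
# Run times in seconds, copied from the table above.
lcda_go_seconds = 47.05
mtgo_seconds = 54000

ratio = mtgo_seconds / lcda_go_seconds
mtgo_hours = mtgo_seconds / 3600

print(f"MTGO / LCDA-GO run time ratio: about {ratio:.0f}x")
print(f"MTGO run time: {mtgo_hours:.1f} hours")
```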