Jianjun Cheng, Xinhong Yin, Qi Li, Haijuan Yang, Longjie Li, Mingwei Leng, Xiaoyun Chen.
Abstract
Community detection has attracted much attention in many fields in recent years, and a great many community-detection methods have been proposed. However, some of them are too time-consuming to be applied to large-scale networks. Others have lower time complexity, but most of these are non-deterministic: running the same method several times on the same network may yield different results, which greatly reduces their practical utility in real-world applications. To solve these problems, we propose a community-detection method that takes both the quality of the results and the efficiency of the detection procedure into account. Moreover, it is a deterministic method that extracts definite community structures from networks. The proposed method is inspired by voting behaviour in elections. We first simulate the voting procedure on the network: every vertex votes for the nominated candidates following the proposed voting principles, so that densely connected groups of vertices quickly reach a consensus on their candidates. At the end of this procedure, the candidates and their voters form a group of clusters. We then take the clusters as initial communities and efficiently agglomerate some of them into larger ones to obtain the resulting community structures. We conducted extensive experiments on artificial and real-world networks; the results show that the proposed method efficiently extracts high-quality community structures from networks and significantly outperforms the comparison algorithms.
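The abstract does not spell out the voting principles, so the following is only a toy sketch of what a deterministic voting stage could look like, not the authors' method. The rule used here is an assumption made for illustration: each vertex votes for the member of its closed neighbourhood with which it shares the most neighbours, breaking ties by higher degree and then by smaller vertex id. The function name `vote_clusters` and the integer vertex ids are likewise hypothetical. The agglomeration stage is not covered.

```python
def vote_clusters(adj):
    """Group vertices by the candidate their chain of votes ends at.

    adj: dict mapping each integer vertex id to the set of its neighbours
         (undirected graph, no self-loops).
    Returns a dict {candidate: sorted list of its voters}.
    The voting rule below is a hypothetical stand-in for the paper's principles.
    """
    # Closed neighbourhood: the vertex itself plus its neighbours.
    closed = {u: adj[u] | {u} for u in adj}

    def ballot(u):
        # Prefer the largest neighbourhood overlap, then the highest degree,
        # then the smallest id. All ties are broken, so the result is deterministic.
        return max(
            closed[u],
            key=lambda v: (len(closed[u] & closed[v]), len(adj[v]), -v),
        )

    vote = {u: ballot(u) for u in adj}

    clusters = {}
    for u in sorted(adj):
        # Follow votes until reaching a vertex that votes for itself.
        # A vote for another vertex always moves to a strictly larger
        # (degree, -id) pair, so the chain cannot loop.
        root = u
        while vote[root] != root:
            root = vote[root]
        clusters.setdefault(root, []).append(u)
    return clusters


# Two triangles joined by the edge 2-3.
example_adj = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
example_clusters = vote_clusters(example_adj)
```

On this small graph each triangle elects its bridge vertex as candidate, and repeated runs give the same clusters because no step depends on random choices or iteration order.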
Year: 2018 PMID: 29795231 PMCID: PMC5966462 DOI: 10.1038/s41598-018-26415-3
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
The statistical information of the networks involved in the experiments.
| Network | Vertices | Edges | Communities | Reference |
|---|---|---|---|---|
| LFR_1000 | 1000 | 15135 | 16 | — |
| LFR_5000 | 5000 | 47368 | 57 | — |
| Dolphin | 62 | 159 | 4 | |
| Risk map | 42 | 83 | 6 | |
| Scientists collaboration | 118 | 197 | 6 | |
| | 1133 | 5451 | — | |
| PGP | 10680 | 24316 | — | |
| DBLP | 317080 | 1049866 | — | |
| Amazon | 334863 | 925872 | — | |
Figure 1. The artificial networks. (a) The community structure extracted from the synthetic network containing 1000 vertices, generated by setting the key parameter μ = 0.5 in the LFR benchmark network generator software. (b) The result uncovered from the artificial network containing 5000 vertices, synthesised using the same software with μ = 0.6.
Figure 2. The dolphin social network. (a) The ground-truth community structure. (b) The community structure detected by our proposed method. Different vertex shapes and colours indicate different communities; intra-community edges are plotted as black lines and inter-community ones in grey. This illustration style also applies to the following figures.
Figure 3. The network corresponding to a map of the game Risk. (a) The ground-truth community structure. (b) The community structure extracted by the proposed method.
Experimental results on the first category of networks. The quality of the extracted community structures is measured in terms of modularity (Q) and normalised mutual information (NMI).
| network | measure | ground truth | Fast | LPA | LPAm | PPC | Attractor | IsoFdp | proposal |
|---|---|---|---|---|---|---|---|---|---|
| LFR_1000 | Q | 0.43 | 0.356 | 0.326 | 0.385 | 0.404 | 0.356 | 0.36 | |
| LFR_1000 | NMI | 1.00 | 0.671 | 0.752 | 0.89 | 0.924 | 0.902 | | 0.925 |
| LFR_5000 | Q | 0.38 | 0.275 | 0.122 | 0.149 | 0.271 | 0.197 | 0.308 | |
| LFR_5000 | NMI | 1.00 | 0.345 | 0.304 | 0.368 | 0.501 | 0.536 | 0.649 | |
| Dolphin | Q | 0.519 | 0.491 | 0.503 | 0.497 | 0.519 | 0.495 | 0.466 | |
| Dolphin | NMI | 1.00 | 0.733 | | 0.744 | 0.812 | 0.691 | 0.629 | 0.783 |
| Risk map | Q | 0.621 | 0.625 | 0.624 | 0.567 | 0.621 | 0.623 | 0.519 | |
| Risk map | NMI | 1.00 | 0.894 | 0.848 | 0.888 | 0.803 | 0.834 | 0.714 | |
| Scientists collaboration | Q | 0.739 | 0.749 | 0.681 | 0.587 | | 0.707 | 0.62 | 0.739 |
| Scientists collaboration | NMI | 1.00 | 0.867 | 0.799 | 0.704 | 0.877 | 0.857 | 0.775 | |
The largest values are typed in bold.
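The NMI rows compare each detected partition with the ground truth. A minimal sketch of one common normalisation, NMI = 2·I(A;B) / (H(A) + H(B)), is given below. Whether the experiments used exactly this normalisation is an assumption, and the function name `nmi` is illustrative.

```python
from collections import Counter
from math import log


def nmi(labels_a, labels_b):
    """Normalised mutual information between two partitions.

    labels_a, labels_b: sequences of community labels, one per vertex,
    listed in the same vertex order.
    Uses the normalisation 2 * I(A;B) / (H(A) + H(B)).
    """
    n = len(labels_a)
    if n != len(labels_b):
        raise ValueError("both labelings must cover the same vertices")

    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))

    # Shannon entropy of each partition.
    h_a = -sum(c / n * log(c / n) for c in count_a.values())
    h_b = -sum(c / n * log(c / n) for c in count_b.values())

    # Mutual information from the joint label counts.
    mi = sum(
        c / n * log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in joint.items()
    )

    if h_a + h_b == 0:
        # Both partitions put every vertex in a single community.
        return 1.0
    return 2 * mi / (h_a + h_b)


same_up_to_renaming = nmi([0, 0, 1, 1], [1, 1, 0, 0])
independent = nmi([0, 0, 1, 1], [0, 1, 0, 1])
```

The score is 1 when the two partitions are identical up to a renaming of the labels, which is why the ground-truth column reads 1.00, and it falls towards 0 as the partitions become statistically independent.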
Figure 4. The collaboration network of scientists working at the Santa Fe Institute (colour online). (a) The ground-truth community structure. (b) The community structure identified by the proposed method.
Experimental results on the second category of networks. The quality of the obtained results is measured using modularity (Q). The largest values are typed in bold.
| network | Fast | LPA | LPAm | PPC | Attractor | IsoFdp | proposal |
|---|---|---|---|---|---|---|---|
| | 0.507 | 0.283 | 0.366 | 0.546 | 0.48 | 0.531 | |
| PGP | 0.852 | 0.804 | — | 0.869 | 0.771 | 0.745 | |
| DBLP | 0.728 | 0.683 | — | 0.796 | 0.633 | — | |
| Amazon | 0.879 | 0.785 | — | 0.901 | 0.78 | — | |
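For reference, the modularity Q reported in these tables can be computed from an edge list and a community assignment. The sketch below uses the standard Newman-Girvan form for undirected, unweighted graphs, Q = Σ_c [ l_c / m − (d_c / 2m)² ], where l_c is the number of edges inside community c, d_c is the total degree of its vertices, and m is the number of edges. The function name and input layout are assumptions made for illustration.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman-Girvan modularity Q of a partition.

    edges: iterable of (u, v) pairs, each undirected edge listed once.
    community_of: dict mapping every vertex to its community label.
    """
    m = 0
    internal = defaultdict(int)    # edges with both endpoints in community c
    degree_sum = defaultdict(int)  # total degree of the vertices in community c

    for u, v in edges:
        m += 1
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1

    if m == 0:
        return 0.0
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Two triangles joined by the edge 2-3, split into the two triangles.
example_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_triangles = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
q_split = modularity(example_edges, two_triangles)
q_single = modularity(example_edges, {v: "all" for v in range(6)})
```

Placing every vertex in one community gives Q = 0, so positive values indicate more intra-community edges than expected from the degrees alone.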
Figure 5. The metric values obtained from the first category of networks by the proposal and comparison algorithms. (a) The bar chart of the modularity (Q). (b) The bar chart of NMI.
Figure 6. The bar chart of the modularity (Q) values obtained from the second category of networks.