Xuanmei Qin1, Weidi Dai2, Pengfei Jiao2, Wenjun Wang2, Ning Yuan3.
Abstract
Community structure is one of the fundamental characteristics of complex networks. Many methods have been proposed for community detection. However, most of these methods are designed for static networks and are not suitable for dynamic networks that evolve over time. Recently, the evolutionary clustering framework was proposed for clustering dynamic data, and it can also be used for community detection in dynamic networks. In this paper, a multi-similarity spectral clustering (MSSC) method is proposed as an improvement on the earlier evolutionary clustering method. To detect the community structure in dynamic networks, our method considers different similarity metrics of the networks. First, multiple similarity matrices are constructed for each snapshot of the dynamic network. Then, a dynamic co-training algorithm is proposed that bootstraps the clusterings obtained from the different similarity measures. Compared with a number of baseline models, the proposed MSSC method performs better on some widely used synthetic and real-world datasets whose ground-truth community structure changes over time.
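The first step described in the abstract, building several similarity matrices for one snapshot, can be illustrated with a minimal sketch. The two measures used here (direct adjacency and Jaccard similarity of closed neighborhoods) are assumptions chosen for illustration only; the abstract does not name the measures the authors use, and all function names are hypothetical.

```python
def closed_neighborhoods(nodes, edges):
    """Map each node to the set containing itself and its neighbors."""
    nbrs = {v: {v} for v in nodes}
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    return nbrs


def adjacency_similarity(nodes, edges):
    """Similarity 1.0 for directly connected node pairs, 0.0 otherwise."""
    index = {v: i for i, v in enumerate(nodes)}
    sim = [[0.0] * len(nodes) for _ in nodes]
    for u, v in edges:
        sim[index[u]][index[v]] = 1.0
        sim[index[v]][index[u]] = 1.0
    return sim


def jaccard_similarity(nodes, edges):
    """Jaccard overlap of closed neighborhoods (an assumed second measure)."""
    nbrs = closed_neighborhoods(nodes, edges)
    return [
        [len(nbrs[u] & nbrs[v]) / len(nbrs[u] | nbrs[v]) for v in nodes]
        for u in nodes
    ]


def similarity_matrices(nodes, snapshots):
    """One dict of similarity matrices per snapshot (each snapshot is an edge list)."""
    return [
        {
            "adjacency": adjacency_similarity(nodes, edges),
            "jaccard": jaccard_similarity(nodes, edges),
        }
        for edges in snapshots
    ]
```

Each snapshot then carries one matrix per measure, which is the input a co-training style procedure would need; how MSSC combines them is not covered by this sketch.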
Year: 2016 PMID: 27528179 PMCID: PMC4985760 DOI: 10.1038/srep31454
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
The performance in different GN-benchmark networks.
| Method | Cn | Metrics | Average degree = 16, z = 4 | Average degree = 16, z = 5 | Average degree = 16, z = 6 | Average degree = 20, z = 4 | Average degree = 20, z = 5 | Average degree = 20, z = 6 |
|---|---|---|---|---|---|---|---|---|
| PCQ-NA | 1 | NMI / SSE | 0.3562 ± 0.0455 / 4224.38 ± 260.79 | 0.0393 ± 0.0178 / 6069.06 ± 81.79 | 0.0253 ± 0.0072 / 6101.72 ± 70.14 | 0.9111 ± 0.0311 / 813.00 ± 340.81 | 0.5383 ± 0.1074 / 3071.78 ± 827.96 | 0.1129 ± 0.0629 / 5636.56 ± 336.75 |
| | 3 | NMI / SSE | 0.3333 ± 0.0437 / 4378.28 ± 289.09 | 0.0384 ± 0.0169 / 6076.68 ± 75.84 | 0.0252 ± 0.0076 / 6131.48 ± 84.28 | 0.9142 ± 0.0384 / 675.24 ± 263.41 | 0.5268 ± 0.1042 / 3133.74 ± 792.06 | 0.1055 ± 0.0552 / 5682.42 ± 298.19 |
| | 6 | NMI / SSE | 0.3165 ± 0.0597 / 4470.96 ± 366.32 | 0.0400 ± 0.0138 / 6074.34 ± 86.10 | 0.0253 ± 0.0056 / 6119.46 ± 81.34 | 0.9256 ± 0.0389 / 556.82 ± 342.32 | 0.5084 ± 0.1072 / 3288.68 ± 766.29 | 0.1059 ± 0.0577 / 5686.82 ± 291.96 |
| PCQ-NC | 1 | NMI / SSE | 0.4059 ± 0.1443 / 3815.34 ± 995.63 | 0.0404 ± 0.0204 / 6001.10 ± 107.43 | 0.0274 ± 0.0105 / 6078.12 ± 91.28 | 0.9034 ± 0.0307 / 898.88 ± 298.82 | 0.5589 ± 0.1128 / 2827.82 ± 838.13 | 0.1106 ± 0.0476 / 5615.80 ± 250.09 |
| | 3 | NMI / SSE | 0.3921 ± 0.1405 / 3898.54 ± 967.97 | 0.0394 ± 0.0194 / 5999.10 ± 106.08 | 0.0290 ± 0.0099 / 6053.18 ± 88.14 | 0.9288 ± 0.0346 / 503.94 ± 211.46 | 0.5349 ± 0.1262 / 3002.60 ± 943.07 | 0.1031 ± 0.0428 / 5672.86 ± 240.05 |
| | 6 | NMI / SSE | 0.3670 ± 0.1218 / 4021.20 ± 858.63 | 0.0394 ± 0.0177 / 6010.98 ± 88.34 | 0.0267 ± 0.0098 / 6071.34 ± 70.74 | 0.9137 ± 0.0316 / 646.28 ± 219.35 | 0.4958 ± 0.1084 / 3295.12 ± 783.13 | 0.0966 ± 0.0424 / 5710.98 ± 210.87 |
| PCM-NA | 1 | NMI / SSE | 0.3116 ± 0.0621 / 4539.50 ± 394.25 | 0.0412 ± 0.0158 / 6053.88 ± 104.21 | 0.0257 ± 0.0056 / 6121.96 ± 71.94 | 0.8999 ± 0.0387 / 810.52 ± 390.29 | 0.4974 ± 0.1133 / 3358.52 ± 784.04 | 0.1054 ± 0.0568 / 5680.16 ± 305.19 |
| | 3 | NMI / SSE | 0.3109 ± 0.0709 / 4545.74 ± 419.80 | 0.0395 ± 0.0164 / 6057.90 ± 93.30 | 0.0251 ± 0.0072 / 6126.38 ± 83.91 | 0.8877 ± 0.0344 / 948.72 ± 312.82 | 0.4980 ± 0.1169 / 3346.96 ± 797.57 | 0.1039 ± 0.0542 / 5682.32 ± 278.17 |
| | 6 | NMI / SSE | 0.3098 ± 0.0656 / 4550.38 ± 380.21 | 0.0403 ± 0.0144 / 6073.36 ± 81.51 | 0.0239 ± 0.0052 / 6126.62 ± 70.79 | 0.9294 ± 0.0323 / 467.30 ± 173.87 | 0.4945 ± 0.1149 / 3386.82 ± 824.11 | 0.1058 ± 0.0559 / 5683.04 ± 293.82 |
| PCM-NC | 1 | NMI / SSE | | 0.0392 ± 0.0191 / 6012.28 ± 114.99 | 0.0314 ± 0.0125 / 6049.72 ± 83.70 | 0.9004 ± 0.0484 / 889.90 ± 396.37 | | 0.1197 ± 0.0370 / 5578.26 ± 257.97 |
| | 3 | NMI / SSE | | 0.0408 ± 0.0197 / 5984.40 ± 107.92 | 0.0267 ± 0.0057 / 6046.64 ± 40.66 | 0.8737 ± 0.0461 / 860.80 ± 290.84 | | 0.0852 ± 0.0242 / 5789.26 ± 122.49 |
| | 6 | NMI / SSE | 0.2804 ± 0.0525 / 4627.50 ± 341.80 | 0.0402 ± 0.0188 / 6007.58 ± 77.13 | 0.0233 ± 0.0067 / 6080.16 ± 38.87 | 0.8951 ± 0.0400 / 701.52 ± 251.36 | 0.3785 ± 0.0931 / 4026.64 ± 570.82 | 0.0609 ± 0.0239 / 5915.96 ± 158.57 |
| StaticSpectral | 1 | NMI / SSE | 0.3741 ± 0.1315 / 3971.22 ± 924.64 | 0.0396 ± 0.0170 / 6012.54 ± 98.31 | 0.0284 ± 0.0107 / 6068.14 ± 108.58 | 0.9166 ± 0.0289 / 619.68 ± 274.70 | 0.4940 ± 0.1198 / 3305.72 ± 880.41 | 0.0992 ± 0.0456 / 5667.04 ± 268.22 |
| | 3 | NMI / SSE | 0.3732 ± 0.1316 / 3988.72 ± 933.37 | 0.0394 ± 0.0176 / 6005.98 ± 88.83 | 0.0263 ± 0.0104 / 6083.30 ± 106.78 | 0.9117 ± 0.0349 / 674.82 ± 338.08 | 0.5000 ± 0.1159 / 3260.94 ± 822.32 | 0.0985 ± 0.0454 / 5688.84 ± 235.99 |
| | 6 | NMI / SSE | | 0.0421 ± 0.0186 / 5997.50 ± 96.17 | 0.0264 ± 0.0112 / 6086.88 ± 106.90 | 0.9080 ± 0.0483 / 704.44 ± 421.43 | 0.4898 ± 0.1156 / 3351.84 ± 848.26 | 0.0995 ± 0.0451 / 5687.16 ± 233.97 |
| MSSC | 1 | NMI / SSE | 0.4684 ± 0.0597 / 3461.16 ± 471.45 | | | | 0.6462 ± 0.1311 / 2284.44 ± 969.86 | |
| | 3 | NMI / SSE | 0.4108 ± 0.0747 / 3840.66 ± 526.34 | | | | 0.5639 ± 0.1142 / 2836.34 ± 844.02 | |
| | 6 | NMI / SSE | 0.3727 ± 0.0702 / 4059.82 ± 471.45 | | | | | |
The parameter z is set to 4, 5 and 6, and the average degree of each node is 16 or 20. At each snapshot, Cn = 1, 3 or 6 randomly selected nodes change their cluster membership. The reported NMI and SSE values are averages over 10 snapshots.
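As a rough illustration of how an NMI score averaged over snapshots can be computed, the sketch below uses the normalization 2·I(X;Y) / (H(X) + H(Y)). That normalization is an assumption (other variants divide by the geometric mean or the maximum of the entropies), and the record does not state which one the authors use; the function names are hypothetical.

```python
from collections import Counter
from math import log


def nmi(labels_true, labels_pred):
    """Normalized mutual information between two labelings of the same nodes."""
    n = len(labels_true)
    if n == 0 or n != len(labels_pred):
        raise ValueError("labelings must be non-empty and of equal length")
    count_true = Counter(labels_true)
    count_pred = Counter(labels_pred)
    joint = Counter(zip(labels_true, labels_pred))

    # Entropies of the two partitions.
    h_true = -sum(c / n * log(c / n) for c in count_true.values())
    h_pred = -sum(c / n * log(c / n) for c in count_pred.values())
    # Mutual information from the joint label counts.
    mi = sum(
        c / n * log(c * n / (count_true[a] * count_pred[b]))
        for (a, b), c in joint.items()
    )
    if h_true + h_pred == 0.0:
        return 1.0  # both partitions put every node in one cluster
    return 2.0 * mi / (h_true + h_pred)


def mean_nmi(truth_by_snapshot, pred_by_snapshot):
    """Average the per-snapshot NMI scores."""
    scores = [nmi(t, p) for t, p in zip(truth_by_snapshot, pred_by_snapshot)]
    return sum(scores) / len(scores)
```

Cluster labels only need to be consistent within one labeling, so a prediction that recovers the true partition under different label names still scores 1.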
Figure 1. The performance of different methods in synthetic networks.
(a,b) Normalized mutual information and the sum of the squared errors of different methods at 10 snapshots in synthetic networks, where the parameter z is 5, the average degree of each node is 16 and at each snapshot, 3 nodes change their cluster membership. (c,d) Performance for a single contraction event with 1000 nodes over 10 snapshots; the nodes have a mean degree of 15, a maximum degree of 50, and a mixing parameter value of μ = 0, which controls the overlapping among communities. Notice that the x-axes show the snapshots.
The performance in different GN-benchmark networks #2.
| Method | z | SYN-FIX NMI | SYN-FIX SSE | SYN-VAR NMI | SYN-VAR SSE |
|---|---|---|---|---|---|
| PCQ-NA | 3 / 5 | 0.6028 ± 0.2033 / 0.6069 ± 0.2016 | 9640.60 ± 4779.74 / 9340.80 ± 4642.21 | 0.6132 ± 0.1862 / 0.6116 ± 0.1928 | 9259.42 ± 3910.06 / 9174.40 ± 4197.49 |
| PCQ-NC | 3 / 5 | | 9319.38 ± 4134.37 / 9466.48 ± 4214.53 | 0.6002 ± 0.1798 / 0.5984 ± 0.1922 | 9841.54 ± 3830.96 / 9656.26 ± 4299.85 |
| PCM-NA | 3 / 5 | 0.5963 ± 0.1996 / 0.5993 ± 0.1978 | 9460.58 ± 4666.48 / 9304.36 ± 4568.22 | 0.5926 ± 0.1872 / 0.5834 ± 0.1928 | 9720.78 ± 3991.91 / 9855.60 ± 4235.54 |
| PCM-NC | 3 / 5 | 0.5978 ± 0.1862 | 9916.08 ± 4193.82 / 9271.82 ± 4401.95 | 0.6070 ± 0.1854 / 0.6139 ± 0.1907 | 9536.20 ± 3861.53 / 9123.94 ± 4136.11 |
| StaticSpectral | 3 / 5 | 0.5835 ± 0.1954 / 0.5781 ± 0.2052 | 9668.48 ± 4259.43 / 9755.16 ± 4679.10 | 0.5863 ± 0.1896 / 0.5812 ± 0.1862 | 9482.16 ± 3893.93 / 9659.56 ± 3723.53 |
| MSSC | 3 / 5 | 0.5852 ± 0.2052 / 0.6091 ± 0.2094 | | | |
The performance for SYN-FIX and SYN-VAR with z = 3 and z = 5, respectively. For SYN-FIX, the number of communities is fixed. For SYN-VAR, a new community is created once at each timestamp between 2 ≤ t ≤ 5.
The performance for five dynamic networks.
| Method | Metrics | birthdeath | expand | contraction | mergesplit | switch |
|---|---|---|---|---|---|---|
| PCQ-NA | NMI / SSE | 0.8398 ± 0.0118 / 74170.32 ± 18518.48 | 0.8485 ± 0.0122 / 90065.30 ± 13012.11 | 0.8365 ± 0.0175 / 94808.78 ± 19132.44 | 0.8515 ± 0.0096 / 84288.78 ± 8577.71 | 0.8381 ± 0.0187 / 101091.10 ± 20912.82 |
| PCQ-NC | NMI / SSE | 0.8504 ± 0.0247 / 68511.50 ± 20307.64 | 0.8457 ± 0.0176 / 92582.98 ± 14792.11 | 0.8430 ± 0.0143 / 89217.64 ± 15869.03 | 0.8373 ± 0.0160 / 94506.46 ± 15785.94 | 0.8432 ± 0.0143 / 97056.32 ± 16190.39 |
| PCM-NA | NMI / SSE | 0.8356 ± 0.0129 / 77350.20 ± 21198.80 | 0.8368 ± 0.0175 / 99648.56 ± 16180.79 | 0.8316 ± 0.0175 / 97976.98 ± 17861.72 | 0.8374 ± 0.0188 / 90110.12 ± 14109.52 | 0.8446 ± 0.0117 / 93093.64 ± 13556.13 |
| PCM-NC | NMI / SSE | 0.8419 ± 0.0168 / 74547.64 ± 16656.38 | 0.8369 ± 0.0193 / 101543.20 ± 17771.21 | 0.8469 ± 0.0172 / 83593.96 ± 18388.87 | 0.8407 ± 0.0190 / 90853.56 ± 12056.83 | 0.8359 ± 0.0153 / 101398.38 ± 15773.68 |
| StaticSpectral | NMI / SSE | 0.8548 ± 0.0176 / 52299.48 ± 12337.69 | 0.8574 ± 0.0154 / 70764.06 ± 8715.67 | 0.8540 ± 0.0219 / 70587.36 ± 12970.81 | 0.8477 ± 0.0160 / 75613.28 ± 9825.15 | 0.8480 ± 0.0193 / 101398.38 ± 12010.70 |
| MSSC | NMI / SSE | | | | | |
Dynamic networks for five different event types: birth and death, expansion, contraction, merging and splitting, and node switching.
Figure 2. The performance on real-world datasets.
(a,b) The NEC blog dataset contains 407 blogs crawled over 15 consecutive months beginning in July 2005, where each month is a snapshot. (c,d) The network of e-mail contacts at the Department of Computer Science at KIT evolves over 48 consecutive months, where each snapshot covers six months.
The performance for the KIT E-mail Dataset.
| Method | Metrics | T = 48 | T = 24 | T = 16 | T = 12 | T = 8 |
|---|---|---|---|---|---|---|
| PCQ-NA | NMI / SSE | 0.7567 ± 0.0365 / 369.70 ± 58.83 | 0.7850 ± 0.0398 / 1259.69 ± 265.12 | 0.7760 ± 0.0396 / 1915.06 ± 463.29 | 0.7479 ± 0.0360 / 2766.32 ± 581.48 | 0.7362 ± 0.0416 / 3892.13 ± 1007.15 |
| PCQ-NC | NMI / SSE | 0.8120 ± 0.0401 / 253.33 ± 60.63 | 0.8105 ± 0.0284 / 924.83 ± 156.76 | 0.7933 ± 0.0287 / 1371.71 ± 202.50 | 0.7807 ± 0.0292 / 1862.48 ± 215.96 | 0.7736 ± 0.0215 / 2476.00 ± 224.26 |
| PCM-NA | NMI / SSE | 0.7466 ± 0.0433 / 382.39 ± 69.97 | 0.7773 ± 0.0386 / 1290.58 ± 258.89 | 0.7680 ± 0.0434 / 1949.60 ± 462.78 | 0.7378 ± 0.0398 / 2856.55 ± 628.81 | 0.7326 ± 0.0400 / 3954.75 ± 1018.30 |
| PCM-NC | NMI / SSE | 0.8290 ± 0.0345 | 0.8150 ± 0.0214 / 924.99 ± 109.74 | 0.8076 ± 0.0248 / 1296.94 ± 174.63 | 0.7796 ± 0.0317 / 1884.85 ± 230.36 | 0.7680 ± 0.0203 / 2686.10 ± 152.40 |
| StaticSpectral | NMI / SSE | 0.8018 ± 0.0398 / 265.45 ± 56.69 | 0.8059 ± 0.0249 / 933.73 ± 135.10 | 0.7871 ± 0.0284 / 1418.89 ± 226.12 | 0.7795 ± 0.0287 / 1848.40 ± 201.88 | 0.7674 ± 0.0212 / 2518.20 ± 224.53 |
| MSSC | NMI / SSE | | | | | |
The e-mail networks are built by taking 1, 2, 3, 4 and 6 months as one snapshot, which gives T = 48, 24, 16, 12 and 8 snapshots over the 48 months, respectively.
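The relation between snapshot width and T can be sketched as follows. This is a minimal illustration that assumes each contact is a `(month, sender, receiver)` tuple with months numbered from 0, and that a snapshot is the union of all contacts inside its window; the record does not describe how the authors aggregate contacts, and the function name is hypothetical.

```python
def group_into_snapshots(timed_edges, months_per_snapshot, total_months=48):
    """Split timestamped contacts into consecutive fixed-width snapshots."""
    if total_months % months_per_snapshot != 0:
        raise ValueError("snapshot width must divide the observation period")
    num_snapshots = total_months // months_per_snapshot
    snapshots = [set() for _ in range(num_snapshots)]
    for month, u, v in timed_edges:
        # Store each contact as an undirected edge with a canonical order.
        edge = (min(u, v), max(u, v))
        snapshots[month // months_per_snapshot].add(edge)
    return snapshots


# Snapshot counts for the widths used in the table above.
widths = (1, 2, 3, 4, 6)
counts = [48 // w for w in widths]  # [48, 24, 16, 12, 8]
```

Wider windows merge more contacts into each snapshot, so every snapshot is denser while fewer snapshots remain.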
Figure 3. The graphical illustration of the dynamic co-training method.
represents the similarity matrix at snapshot t represents the new similarity matrix after the dynamic co-training. denotes the discriminative eigenvector in the Laplacian matrix obtained from the 1, 2, 3 … sth except for the pth similarity measures.