Fanrong Meng, Xiaobin Rui, Zhixiao Wang, Yan Xing, Longbing Cao.
Abstract
Attributed networks consist of not only a network structure but also node attributes. Most existing community detection algorithms focus only on network structure and ignore node attributes, which are also important. Although some algorithms that use both node attributes and network structure have been proposed in recent years, they do not consider the complex hierarchical coupling relationships within and between attributes, nodes, and network structure. Such hierarchical couplings are driving factors in community formation. This paper introduces a novel coupled node similarity (CNS) that learns attribute and structure couplings and computes the similarity within and between nodes with categorical attributes in a network. CNS learns and integrates the frequency-based intra-attribute coupled similarity within an attribute, the co-occurrence-based inter-attribute coupled similarity between attributes, and the coupled attribute-to-structure similarity based on the homophily property. CNS is then used to generate edge weights and transform a plain graph into a weighted graph. Clustering algorithms applied to the weighted graph detect community structures that are topologically well-connected and semantically coherent. Extensive experiments on several data sets verify the effectiveness of CNS-based community detection algorithms by comparison with state-of-the-art node similarity measures, both those that involve node attribute information and hierarchical interactions and those that do not, and across various levels of network structure complexity.
Keywords: attributed networks; community detection; coupled node similarity
Year: 2018; PMID: 33265561; PMCID: PMC7512989; DOI: 10.3390/e20060471
Source DB: PubMed; Journal: Entropy (Basel); ISSN: 1099-4300; Impact factor: 2.524
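The abstract's step of using a node similarity to generate edge weights and turn a plain graph into a weighted graph can be sketched as follows. This is a minimal illustration, not the paper's method: `matching_fraction` is a hypothetical placeholder for any similarity in [0, 1] (it is not CNS), and the node names, attribute values, and the `weight_edges` helper are assumptions made for the example.

```python
from typing import Callable, Dict, Hashable, Iterable, Sequence, Tuple

Node = Hashable
Edge = Tuple[Node, Node]


def matching_fraction(a: Sequence[str], b: Sequence[str]) -> float:
    """Placeholder similarity: share of attributes on which two nodes agree.

    This is NOT the paper's CNS; it only stands in for any similarity in [0, 1].
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("attribute vectors must be non-empty and equally long")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def weight_edges(
    edges: Iterable[Edge],
    attrs: Dict[Node, Sequence[str]],
    similarity: Callable[[Sequence[str], Sequence[str]], float] = matching_fraction,
) -> Dict[Edge, float]:
    """Turn a plain edge list into a weighted one: w(u, v) = similarity(u, v)."""
    return {(u, v): similarity(attrs[u], attrs[v]) for u, v in edges}


# Hypothetical toy network with two categorical attributes per node.
attrs = {
    "a": ("law", "senior"),
    "b": ("law", "junior"),
    "c": ("tax", "junior"),
    "d": ("law", "senior"),
}
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("a", "d")]
weights = weight_edges(edges, attrs)
```

The resulting `weights` dictionary can be handed to any clustering algorithm that accepts edge weights; swapping in a different `similarity` callable changes the weighting without touching the rest of the pipeline.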
Figure 1. Structure and attribute couplings in a co-authoring network. (Note: The three edge symbols in the figure indicate, respectively, the linkage between two nodes, the simple attribute similarity between two nodes, and the complex coupling relationships between two nodes.)
Figure 2. The framework for learning CNS. (Note: One symbol indicates the intra-attribute coupled similarity, calculated using the interaction between attribute values within an attribute, and ⟷ refers to the inter-attribute coupled similarity, which involves the couplings between attributes. The coupled attribute similarity in the second level integrates both the intra-attribute coupled similarity and the inter-attribute coupled similarity. The coupled attribute-to-structure similarity in the second level captures the interactions between node attributes and network structure. In the last level, CNS integrates the coupled attribute similarity and the coupled attribute-to-structure similarity.)
Notation explanation.
| Notation | Description |
|---|---|
| M | The number of node attributes |
| | The set of all distinct values on the |
| | The value of the |
| K | The number of communities |
| | The communities of the network, |
| | The community to which node |
| | The neighbor set of node |
| | The adjacency relationship between nodes |
| | The weight between nodes |
| | The similarity between nodes |
| | The received label of node |
| | The sum of all edge weights in the network, |
| | The sum of edge weights which are connected to node |
| | The node set whose |
| | The weight parameter for the |
| | The inter-relative attribute coupled similarities between values |
| | The node set whose attribute value in the |
| | The intra-attribute coupled similarity between the attribute values |
| | The inter-attribute coupled similarity between the attribute values |
| | The coupled attribute similarity between the attribute values |
| | The coupled attribute-to-structure similarity between the attribute values |
| | The coupled attribute similarity between nodes |
| | The coupled node similarity between nodes |
The similarities.
| Similarity | Formula |
|---|---|
| Adjacency | |
| Cosine | |
| Jaccard | |
| SMC | |
| CAS | Equation ( |
| CNS | Equation ( |
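For orientation, the four baseline measures named in the table have well-known textbook forms, written out below. These are the standard definitions only, stated as an assumption: this section does not show which exact variants the paper uses, nor whether cosine and Jaccard are applied to neighbor sets or to attribute vectors. Here $a_{ij}$ is the adjacency indicator, $x$ and $y$ are vectors, $A$ and $B$ are sets, and $M$ is the number of node attributes.

```latex
\begin{align*}
s_{\mathrm{Adj}}(i,j) &= a_{ij} \in \{0,1\} \\
s_{\mathrm{Cos}}(x,y) &= \frac{x \cdot y}{\lVert x \rVert \, \lVert y \rVert} \\
s_{\mathrm{Jac}}(A,B) &= \frac{|A \cap B|}{|A \cup B|} \\
s_{\mathrm{SMC}}(x,y) &= \frac{1}{M} \sum_{m=1}^{M} \mathbb{1}[x_m = y_m]
\end{align*}
```

The simple matching coefficient (SMC) treats every attribute and every pair of differing values alike, which is the kind of flat comparison that a coupling-aware similarity is meant to refine.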
The information of real networks.
| ID | Name | Abbr. | Nodes | Edges | K | M |
|---|---|---|---|---|---|---|
| R1 | Lazega | Laz | 71 | 575 | 2 | 7 |
| R2 | Research | Res | 77 | 2228 | 3 | 4 |
| R3 | Consult | Con | 46 | 879 | 4 | 2 |

Nodes: the number of nodes; Edges: the number of edges; K: the number of communities; M: the number of node attributes.
The results of NMI(%) w.r.t. six similarities.
| Similarity | SLPA Laz | SLPA Res | SLPA Con | BGLL Laz | BGLL Res | BGLL Con | Kmedoids Laz | Kmedoids Res | Kmedoids Con |
|---|---|---|---|---|---|---|---|---|---|
| Adjacency | 14.04 | 65.68 | 58.15 | 31.47 | 70.92 | | 11.25 | 29.02 | 20.15 |
| Cosine | 25.81 | 80.22 | 64.94 | 36.65 | 75.66 | 49.60 | 22.61 | 58.64 | 37.79 |
| Jaccard | 26.16 | 78.48 | 64.66 | 39.13 | 75.62 | 49.60 | 39.76 | 62.88 | 30.80 |
| SMC | 27.67 | 92.05 | 70.46 | 39.04 | | | 28.46 | 34.75 | 47.74 |
| CAS | 26.11 | 87.84 | 67.23 | 36.18 | 86.64 | | 72.55 | 34.08 | 54.11 |
| CNS | | | | | | | | | |
| | 6.51 | 7.24 | 11.65 | 24.38 | 0.00 | 0.00 | 4.78 | 21.60 | 35.45 |
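As a reminder of what the NMI scores measure, the sketch below computes normalized mutual information between a ground-truth partition and a detected one. It is a minimal illustration under one assumption: it uses the arithmetic-mean normalization, 2·I(X;Y) / (H(X) + H(Y)), which is common in community detection; this section does not state which normalization the paper uses. The function name `nmi` and the toy label lists are hypothetical.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def nmi(true_labels: Sequence[Hashable], pred_labels: Sequence[Hashable]) -> float:
    """Normalized mutual information with arithmetic-mean normalization."""
    n = len(true_labels)
    if n == 0 or n != len(pred_labels):
        raise ValueError("label sequences must be non-empty and equally long")

    count_true = Counter(true_labels)
    count_pred = Counter(pred_labels)
    joint = Counter(zip(true_labels, pred_labels))

    # Mutual information I(X;Y) from the joint and marginal label counts.
    mi = sum(
        (c / n) * math.log(c * n / (count_true[t] * count_pred[p]))
        for (t, p), c in joint.items()
    )
    # Entropies H(X) and H(Y) of the two partitions.
    h_true = -sum((c / n) * math.log(c / n) for c in count_true.values())
    h_pred = -sum((c / n) * math.log(c / n) for c in count_pred.values())

    if h_true == 0.0 and h_pred == 0.0:
        return 1.0  # both partitions put every node in a single community
    return 2.0 * mi / (h_true + h_pred)


# Same grouping under different label names: perfect agreement.
same = nmi([0, 0, 1, 1], [1, 1, 0, 0])
# Detected communities cut across the true ones: no shared information.
independent = nmi([0, 0, 1, 1], [0, 1, 0, 1])
```

NMI is invariant to how communities are named, so it compares groupings rather than label values; the tables report it as a percentage.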
The results of F-Measure(%) w.r.t. six similarities.
| Similarity | SLPA Laz | SLPA Res | SLPA Con | BGLL Laz | BGLL Res | BGLL Con | Kmedoids Laz | Kmedoids Res | Kmedoids Con |
|---|---|---|---|---|---|---|---|---|---|
| Adjacency | 52.32 | 68.25 | 80.00 | 71.96 | 62.98 | | 53.54 | 40.36 | 63.36 |
| Cosine | 66.27 | 87.22 | 79.81 | 75.85 | 89.52 | 36.67 | 71.57 | 70.58 | 76.66 |
| Jaccard | 65.32 | 85.00 | 79.62 | 70.14 | 86.47 | 36.67 | 75.21 | 74.85 | 74.52 |
| SMC | 65.57 | 93.89 | 74.11 | 73.81 | | | 74.48 | 54.06 | 74.48 |
| CAS | 64.84 | 92.60 | 82.12 | 77.39 | 71.55 | | 91.60 | 52.40 | 75.78 |
| CNS | | | | | | | | | |
| | 1.67 | 5.37 | 10.56 | 3.90 | 0.00 | 0.00 | 2.05 | 5.28 | 12.14 |
The results of Accuracy(%) w.r.t. six similarities.
| Similarity | SLPA Laz | SLPA Res | SLPA Con | BGLL Laz | BGLL Res | BGLL Con | Kmedoids Laz | Kmedoids Res | Kmedoids Con |
|---|---|---|---|---|---|---|---|---|---|
| Adjacency | 61.32 | 73.95 | 72.60 | 60.87 | 66.22 | | 60.14 | 46.31 | 67.10 |
| Cosine | 63.16 | 93.47 | 68.35 | 63.77 | 86.49 | 27.50 | 72.62 | 71.55 | 77.05 |
| Jaccard | 59.64 | 80.35 | 68.10 | 55.07 | 78.38 | 27.50 | 76.14 | 75.97 | 75.03 |
| SMC | 68.22 | 89.89 | 74.95 | 59.42 | | | 75.20 | 56.92 | 76.60 |
| CAS | 70.26 | 91.65 | 71.78 | 66.67 | 83.78 | | 92.17 | 54.03 | 78.03 |
| CNS | | | | | | | | | |
| | 0.87 | 5.23 | 15.14 | 4.35 | 0.00 | 0.00 | 1.59 | 6.57 | 12.91 |
Figure 3. The results of SLPA w.r.t. different CNSs.
Figure 4. The results of BGLL w.r.t. different CNSs.
Figure 5. The results of Kmedoids w.r.t. different CNSs.
Figure 6. The results of BGLL w.r.t. different levels of network structure complexity.
Figure 7. The results of NMI (%) w.r.t. three community detection algorithms on attributed networks.