Min Shi, Bo Qu, Xiang Li, Cong Li.
Abstract
Previous network representation learning methods mainly focus on exploring the microscopic structure, i.e., the pairwise relationship or similarity between nodes. However, the mesoscopic structure, i.e., community structure, an essential property of real networks, has not been thoroughly studied in network representation learning. Here we propose a deep attributed network representation learning with community awareness (DANRL-CA) framework. Specifically, we design a neighborhood enhancement autoencoder module to capture the 2-step relations between node pairs. To explore multi-step relations, we construct a community-aware skip-gram module based on the encoder. We introduce two variants of DANRL-CA, namely DANRL-CA-AM and DANRL-CA-CSM, which incorporate community information and attribute semantics into node neighbors in different ways. We compare the two variants with state-of-the-art methods on four datasets for node classification and link prediction. In addition, we apply our models to a brain network. Their superior performance indicates the scalability and effectiveness of our method on various networks. Compared with DANRL-CA-AM, DANRL-CA-CSM can more flexibly coordinate the role of node attributes and community information in the process of network representation learning, and shows superiority in networks with sparse topological structure and node attributes.
Keywords: attributed networks; brain networks; community information; link prediction; node classification; representation learning
Year: 2022 PMID: 35711311 PMCID: PMC9196130 DOI: 10.3389/fphys.2022.910873
Source DB: PubMed Journal: Front Physiol ISSN: 1664-042X Impact factor: 4.755
FIGURE 1. The architecture of the proposed DANRL-CA framework.
Dataset statistics.
| Datasets | Nodes | Edges | Attributes | Labels |
|---|---|---|---|---|
| Citeseer | 3,312 | 4,714 | 3,703 | 6 |
| PubMed | 19,717 | 44,338 | 500 | 3 |
| Cora | 2,708 | 5,429 | 1,433 | 7 |
| Flickr | 7,575 | 239,738 | 12,047 | 9 |
| Fly-drosophila-medulla-1 | 1,781 | 9,016 | — | — |
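To put the sparsity of these networks in perspective, the sketch below computes average degree and edge density. It assumes that the first two numeric columns of the table above are node and edge counts and that each network is undirected and simple; neither assumption is stated in the table itself. The helper name `graph_sparsity` is hypothetical.

```python
def graph_sparsity(num_nodes: int, num_edges: int) -> tuple[float, float]:
    """Return (average degree, density) for an undirected simple graph."""
    avg_degree = 2 * num_edges / num_nodes
    density = 2 * num_edges / (num_nodes * (num_nodes - 1))
    return avg_degree, density

# Node and edge counts copied from the table above.
datasets = {
    "Citeseer": (3312, 4714),
    "PubMed": (19717, 44338),
    "Cora": (2708, 5429),
    "Flickr": (7575, 239738),
    "Fly-drosophila-medulla-1": (1781, 9016),
}

for name, (n, m) in datasets.items():
    avg_degree, density = graph_sparsity(n, m)
    print(f"{name}: average degree {avg_degree:.2f}, density {density:.5f}")
```

Under these assumptions, Flickr comes out far denser than the three citation networks.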
Detailed architecture information for datasets ( ).
| Datasets | Architecture |
|---|---|
| Citeseer | 3312-1000-500-128-500-1000-3312 |
| PubMed | 19717-1000-500-128-500-1000-19717 |
| Cora | 2708-1000-500-128-500-1000-2708 |
| Flickr | 7575-500-128-500-7575 (NC) |
| | 7575-1000-500-128-500-1000-7575 (LP) |
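Each entry appears to list the layer widths of an autoencoder from input to reconstruction, with the narrowest layer (128) as the bottleneck. A minimal sketch of reading that notation follows. The function name `parse_architecture` is hypothetical, and reading the `(NC)` and `(LP)` suffixes as node classification and link prediction is an assumption based on the two evaluation tasks named in the abstract; this is not the authors' code.

```python
import re

def parse_architecture(spec: str) -> dict:
    """Split a spec such as '7575-500-128-500-7575 (NC)' into its parts."""
    match = re.fullmatch(r"\s*([\d-]+)\s*(?:\((\w+)\))?\s*", spec)
    if match is None:
        raise ValueError(f"unrecognised architecture spec: {spec!r}")
    sizes = [int(s) for s in match.group(1).split("-")]
    bottleneck = sizes.index(min(sizes))  # position of the narrowest layer
    return {
        "task": match.group(2),              # 'NC', 'LP', or None if untagged
        "encoder": sizes[: bottleneck + 1],  # input layer down to bottleneck
        "decoder": sizes[bottleneck:],       # bottleneck back up to output
        "symmetric": sizes == sizes[::-1],   # mirrored around the bottleneck?
    }

arch = parse_architecture("7575-500-128-500-7575 (NC)")
print(arch["encoder"], arch["decoder"], arch["task"])
```

The input and output widths match the node counts of the corresponding datasets, which is why they change from row to row while the bottleneck stays at 128.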
Detailed architecture information for datasets ( ).
| Datasets | Architecture |
|---|---|
| Citeseer | 3312-2000-1000-500-128-500-1000-2000-3312 (NC) |
| | 3312-500-128-500-3312 (LP) |
| PubMed | 19717-1000-500-128-500-1000-19717 |
| Cora | 2708-2000-1000-500-128-500-1000-2000-2708 (NC) |
| | 2708-1000-500-128-500-1000-2708 (LP) |
| Flickr | 7575-500-128-500-7575 (NC) |
| | 7575-1000-500-128-500-1000-7575 (LP) |
Detailed architecture information for datasets ( ).
| Datasets | Architecture |
|---|---|
| Fly-drosophila-medulla-1 (LPA) | 1781-1000-500-128-500-1000-1781 (LP) |
| Fly-drosophila-medulla-1 (Infomap) | 1781-500-128-500-1781 (LP) |
| Fly-drosophila-medulla-1 (Multilevel) | 1781-500-128-500-1781 (LP) |
Node classification results on the Citeseer, PubMed, Cora, and Flickr datasets.
| Datasets | Citeseer | PubMed | Cora | Flickr |
|---|---|---|---|---|
| Evaluation | Micro-F1 / Macro-F1 | Micro-F1 / Macro-F1 | Micro-F1 / Macro-F1 | Micro-F1 / Macro-F1 |
| DeepWalk | 0.5665 / 0.5212 | 0.8109 / 0.7978 | 0.7900 / 0.7782 | 0.4940 / 0.4835 |
| node2vec | 0.6002 / 0.5465 | 0.8104 / 0.7968 | 0.8058 / 0.7942 | 0.5155 / 0.5062 |
| LINE | 0.5605 / 0.5256 | 0.8049 / 0.7926 | 0.7884 / 0.7767 | 0.5613 / 0.5576 |
| SDNE | 0.4161 / 0.3632 | 0.4258 / 0.2900 | 0.5813 / 0.5201 | 0.6043 / 0.5991 |
| M-NMF | 0.5337 / 0.4814 | 0.7175 / 0.6630 | 0.6416 / 0.6269 | 0.6028 / 0.5974 |
| ComVAE (Infomap) | 0.2189 / 0.1521 | 0.3944 / 0.2990 | 0.2527 / 0.1372 | 0.5167 / 0.5095 |
| ComVAE (LPA) | 0.2173 / 0.1580 | 0.3952 / 0.2997 | 0.2416 / 0.1337 | 0.5383 / 0.5299 |
| ANRL-WAN | | 0.8595 / 0.8584 | 0.8161 / 0.8030 | 0.6701 / 0.6584 |
| DANRL-CA-AM/Infomap | 0.7154 / 0.6658 | 0.8583 / 0.8551 | 0.8324 / 0.8204 | |
| DANRL-CA-AM/LPA | 0.7138 / 0.6710 | 0.8452 / 0.8421 | 0.8350 | 0.9128 / 0.9118 |
| DANRL-CA-AM/Multilevel | 0.7146 / 0.6739 | 0.8189 / 0.8125 | | 0.9002 / 0.8988 |
| DANRL-CA-CSM/Infomap | 0.7155 / 0.6631 | 0.8753 / 0.8740 | 0.8313 / 0.8173 | 0.9057 / 0.9042 |
| DANRL-CA-CSM/LPA | 0.7122 | 0.8774 / 0.8751 | 0.8313 / 0.8166 | 0.9056 / 0.9043 |
| DANRL-CA-CSM/Multilevel | | | 0.8336 / 0.8193 | 0.9078 / 0.9065 |
⋆ Red bold highlights the best performance; black bold marks the better result in the comparison between DANRL-CA-AM and DANRL-CA-CSM. Note that the red bold and black bold entries overlap.
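Micro-F1 and Macro-F1 differ in how they aggregate over classes: Micro-F1 pools true positives, false positives, and false negatives across all classes, whereas Macro-F1 averages the per-class F1 scores, so rare classes weigh as much as common ones. The sketch below illustrates this for single-label classification. The function names and the toy labels are hypothetical and are not data from the paper.

```python
from collections import Counter

def f1(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts; defined as 0.0 when all three counts are zero."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def micro_macro_f1(y_true, y_pred):
    """Return (Micro-F1, Macro-F1) for single-label predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # class p was predicted wrongly
            fn[t] += 1  # true class t was missed
    classes = sorted(set(y_true) | set(y_pred))
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in classes) / len(classes)
    return micro, macro

# Hypothetical toy labels: class 0 is frequent, class 2 is rare.
y_true = [0, 0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 0, 0, 1, 0, 1]
micro, macro = micro_macro_f1(y_true, y_pred)
print(f"Micro-F1 = {micro:.4f}, Macro-F1 = {macro:.4f}")
```

In this toy case the rare class is never predicted correctly, which pulls Macro-F1 well below Micro-F1; the same ordering (Macro-F1 at or below Micro-F1) holds in every fully reported cell of the table.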
Link prediction results on the Citeseer, PubMed, Cora, and Flickr datasets.
| Datasets | Citeseer | PubMed | Cora | Flickr |
|---|---|---|---|---|
| Evaluation | AUC | AUC | AUC | AUC |
| DeepWalk | 0.6020 | 0.7925 | 0.7209 | 0.7247 |
| node2vec | 0.5485 | 0.7977 | 0.7244 | 0.7341 |
| LINE | 0.5309 | 0.6213 | 0.6047 | 0.5262 |
| SDNE | 0.6093 | 0.7562 | 0.6326 | 0.9023 |
| M-NMF | 0.6249 | 0.7944 | 0.7884 | 0.8725 |
| ComVAE (Infomap) | 0.5729 | 0.5531 | 0.5703 | 0.7635 |
| ComVAE (LPA) | 0.5654 | 0.5518 | 0.5727 | 0.7539 |
| ANRL-WAN | | 0.8035 | 0.9181 | 0.7800 |
| DANRL-CA-AM/Infomap | 0.9562 | 0.8700 | 0.9246 | |
| DANRL-CA-AM/LPA | 0.9550 | 0.8981 | 0.9314 | 0.9377 |
| DANRL-CA-AM/Multilevel | 0.9531 | 0.8414 | 0.9244 | 0.9382 |
| DANRL-CA-CSM/Infomap | 0.9565 | | 0.9276 | 0.9378 |
| DANRL-CA-CSM/LPA | 0.9528 | 0.9564 | | 0.9375 |
| DANRL-CA-CSM/Multilevel | | 0.9506 | 0.9300 | 0.9374 |
⋆ Red bold highlights the best performance; black bold marks the better result in the comparison between DANRL-CA-AM and DANRL-CA-CSM. Note that the red bold and black bold entries overlap.
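AUC can be read as the probability that a randomly chosen true edge receives a higher score than a randomly chosen non-edge, with ties counted as one half. The sketch below computes it by direct pairwise comparison. The scores are made-up illustrative values, and how the paper scores node pairs or samples non-edges is not specified in this section.

```python
def auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical similarity scores for held-out edges and sampled non-edges.
edge_scores = [0.9, 0.8, 0.4]
non_edge_scores = [0.7, 0.4, 0.1]
print(f"AUC = {auc(edge_scores, non_edge_scores):.4f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect separation of edges from non-edges. The pairwise loop is quadratic and only meant for illustration; a rank-based computation gives the same value on larger samples.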
Link prediction results on Fly-drosophila-medulla-1 dataset.
| Datasets | Fly-Drosophila-Medulla-1 |
|---|---|
| Evaluation | AUC |
| DeepWalk | 0.6589 |
| node2vec | 0.6004 |
| LINE | 0.6073 |
| SDNE | 0.7961 |
| M-NMF | — |
| ComVAE (Infomap) | 0.6885 |
| ComVAE (LPA) | 0.6429 |
| ANRL-WAN | — |
| DANRL-CA/Infomap | |
| DANRL-CA/LPA | 0.7943 |
| DANRL-CA/Multilevel | 0.8024 |
| DANRL-CA/NoCommunityInformation | 0.8594 |
⋆ We use red bold to highlight the best performance.
Detailed architecture information for datasets ( ).
| Datasets | Architecture |
|---|---|
| Citeseer | 3312-1000-500-128-500-1000-3312 |
| PubMed | 19717-500-128-500-19717 (NC) |
| | 19717-1000-500-128-500-1000-19717 (LP) |
| Cora | 2708-1000-500-128-500-1000-2708 |
| Flickr | 7575-500-128-500-7575 (NC) |
| | 7575-1000-500-128-500-1000-7575 (LP) |
Detailed architecture information for datasets ( ).
| Datasets | Architecture |
|---|---|
| Citeseer | 3312-1000-500-128-500-1000-3312 (NC) |
| | 3312-500-128-500-3312 (LP) |
| PubMed | 19717-256-128-256-19717 (NC) |
| | 19717-500-128-500-19717 (LP) |
| Cora | 2708-1000-500-128-500-1000-2708 (NC) |
| | 2708-500-128-500-2708 (LP) |
| Flickr | 7575-500-128-500-7575 (NC) |
| | 7575-1000-500-128-500-1000-7575 (LP) |
Detailed architecture information for datasets ( ).
| Datasets | Architecture |
|---|---|
| Citeseer | 3312-1000-500-128-500-1000-3312 |
| PubMed | 19717-1000-500-128-500-1000-19717 |
| Cora | 2708-2000-1000-500-128-500-1000-2000-2708 |
| Flickr | 7575-256-128-256-7575 (NC) |
| | 7575-1000-500-128-500-1000-7575 (LP) |
Detailed architecture information for datasets ( ).
| Datasets | Architecture |
|---|---|
| Citeseer | 3312-1000-500-128-500-1000-3312 (NC) |
| | 3312-256-128-256-3312 (LP) |
| PubMed | 19717-1000-500-128-500-1000-19717 |
| Cora | 2708-1000-500-128-500-1000-2708 (NC) |
| | 2708-256-128-256-2708 (LP) |
| Flickr | 7575-1000-500-128-500-1000-7575 |