Shaokai Wang, Xutao Li, Yunming Ye, Shanshan Feng, Raymond Y K Lau, Xiaohui Huang, Xiaolin Du.
Abstract
Presently, many users are involved in multiple social networks. Identifying the same user in different networks, also known as anchor link prediction, has become an important problem that serves numerous applications, e.g., cross-network recommendation and user profiling. Previous studies mainly use hand-crafted structure features, which, if not carefully designed, may fail to reflect the intrinsic structure regularities. Moreover, most of these methods neglect the attribute information of social networks. In this paper, we propose a novel semi-supervised network-embedding model to address the problem. In the model, each node of the multiple networks is represented by a vector for anchor link prediction; the vector is learnt from the topology structure and attributes as input, with the observed anchor links as semi-supervised information. Experimental results on real-world datasets demonstrate the superiority of the proposed model over state-of-the-art techniques.
Keywords: anchor link prediction; attributed network; network embedding
Year: 2019 PMID: 33266969 PMCID: PMC7514735 DOI: 10.3390/e21030254
Source DB: PubMed Journal: Entropy (Basel) ISSN: 1099-4300 Impact factor: 2.524
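The abstract describes representing each node by a vector and using those vectors for anchor link prediction. As a minimal sketch of the prediction step only, assuming embeddings have already been learnt and that candidates are ranked by cosine similarity (the abstract does not state the scoring function), the toy example below ranks nodes of a second network for a node of the first. The names `cosine`, `rank_candidates`, `emb_a`, and `emb_b` are hypothetical and the vectors are made up for illustration; this is not the APAN model itself.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_candidates(source_vec, target_embeddings, k):
    """Return the k target-network nodes most similar to source_vec, best first."""
    scored = sorted(
        target_embeddings.items(),
        key=lambda item: cosine(source_vec, item[1]),
        reverse=True,
    )
    return [node for node, _ in scored[:k]]


# Toy, hand-written embeddings (assumed values, not from the paper).
emb_a = {"a1": [1.0, 0.1], "a2": [0.0, 1.0]}                       # network A
emb_b = {"b1": [0.1, 0.9], "b2": [0.9, 0.2], "b3": [-1.0, 0.0]}    # network B

for node, vec in emb_a.items():
    print(node, rank_candidates(vec, emb_b, k=2))
```

In this sketch the top-ranked candidate for each node would be taken as its predicted anchor link; a longer candidate list supports ranking-based evaluation at several cutoffs.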
Figure 1. The anchor link prediction problem.
Figure 2. APAN network architecture.
Statistics of the datasets.
| Datasets | #Nodes | #Edges |
|---|---|---|
| Flickr | 4935 | 15,884 |
| Lastfm | 4496 | 10,628 |
| Douban Online | 3906 | 16,328 |
| Douban Offline | 1118 | 3022 |
Performance of different models on real datasets (%); boldface indicates the best performance.
| Dataset | Evaluation Metric | APAN | APAN-N | PALE | ULink | FINAL |
|---|---|---|---|---|---|---|
| Flickr-Lastfm | Hit-p@1 | 9.51 ± 2.82 | | 5.53 ± 1.57 | 4.10 ± 1.22 | 2.43 ± 0.90 |
| Flickr-Lastfm | Hit-p@5 | 15.35 ± 2.58 | | 9.91 ± 2.73 | 7.79 ± 2.06 | 7.10 ± 0.60 |
| Flickr-Lastfm | Hit-p@10 | | 18.93 ± 2.37 | 13.65 ± 3.00 | 12.07 ± 1.92 | 12.18 ± 1.76 |
| Flickr-Lastfm | Hit-p@15 | | 21.75 ± 2.41 | 16.70 ± 2.81 | 15.55 ± 3.24 | 15.95 ± 2.45 |
| Flickr-Lastfm | Hit-p@20 | | 24.48 ± 2.41 | 19.57 ± 2.51 | 18.20 ± 2.91 | 19.28 ± 3.04 |
| Flickr-Lastfm | Hit-p@25 | | 26.92 ± 2.23 | 22.42 ± 2.20 | 21.42 ± 2.88 | 22.51 ± 3.28 |
| Flickr-Lastfm | Hit-p@30 | | 29.36 ± 2.04 | 25.08 ± 2.07 | 24.93 ± 1.81 | 25.65 ± 3.63 |
| Douban online-offline | Hit-p@1 | 46.53 ± 2.80 | | 39.98 ± 3.30 | 24.95 ± 3.05 | 25.93 ± 2.17 |
| Douban online-offline | Hit-p@5 | 66.31 ± 3.23 | | 61.66 ± 2.04 | 43.93 ± 4.03 | 43.90 ± 1.20 |
| Douban online-offline | Hit-p@10 | | 76.38 ± 0.95 | 71.72 ± 2.06 | 50.95 ± 2.74 | 55.19 ± 2.27 |
| Douban online-offline | Hit-p@15 | | 81.54 ± 0.98 | 77.49 ± 2.24 | 54.62 ± 1.85 | 62.44 ± 2.81 |
| Douban online-offline | Hit-p@20 | | 83.43 ± 1.00 | 81.25 ± 2.24 | 59.94 ± 2.36 | 67.86 ± 2.77 |
| Douban online-offline | Hit-p@25 | | 85.55 ± 1.12 | 83.99 ± 2.17 | 66.11 ± 3.22 | 72.08 ± 2.64 |
| Douban online-offline | Hit-p@30 | | 87.09 ± 1.03 | 85.98 ± 2.14 | 69.92 ± 4.39 | 75.49 ± 2.34 |
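The table reports Hit-p@k, but this record does not define the metric. A minimal sketch follows, assuming the definition commonly used in user-linkage evaluation: a test user whose true match appears at rank r within the top k candidates scores (k - (r - 1)) / k, a user whose match is outside the top k scores 0, and the scores are averaged over all test users. The function name `hit_precision_at_k` and the toy data are hypothetical, and the paper's exact formula may differ.

```python
def hit_precision_at_k(ranked_lists, true_matches, k):
    """Average hit-precision over test users (assumed definition, see lead-in).

    ranked_lists: dict mapping each test user to its candidates, best first.
    true_matches: dict mapping each test user to its true counterpart.
    """
    if not ranked_lists:
        raise ValueError("ranked_lists must contain at least one test user")
    total = 0.0
    for user, candidates in ranked_lists.items():
        top_k = candidates[:k]
        truth = true_matches[user]
        if truth in top_k:
            rank = top_k.index(truth) + 1       # 1-based rank of the true match
            total += (k - (rank - 1)) / k       # rank 1 scores 1.0, rank k scores 1/k
    return total / len(ranked_lists)


# Toy data (made up for illustration).
ranked = {
    "u1": ["x", "y", "z"],   # true match at rank 1
    "u2": ["p", "q", "r"],   # true match at rank 2
    "u3": ["m", "n", "o"],   # true match not retrieved
}
truth = {"u1": "x", "u2": "q", "u3": "w"}

score = hit_precision_at_k(ranked, truth, k=3)
print(f"Hit-p@3 = {100 * score:.2f}%")
```

Multiplying the averaged score by 100 gives a percentage on the same scale as the table. Under this definition the score cannot decrease as k grows, which matches the rising values down each column.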
Figure 3. The average Hit-precision@k performance on real datasets. (a) Flickr-Lastfm networks; (b) Douban online-offline networks.
Figure 4. Hit-precision@k performance on real datasets. (a) Flickr-Lastfm networks; (b) Douban online-offline networks.
Figure 5. Comparison of APAN-N and PALE in different iterations on Flickr-Lastfm networks.
Figure 6. The tuning of on Flickr-Lastfm networks.
Figure 7. Tuning the dimension of the embedding vectors on Flickr-Lastfm networks.