Zhiwei Cao, Yichao Zhang, Jihong Guan, Shuigeng Zhou.
Abstract
Incomplete or partial observations of network structures pose a serious challenge to theoretical and engineering studies of real networks. To remedy the missing links in real datasets, topology-based link prediction has been introduced into the study of various networks. Due to the complexity of network structures, the accuracy and robustness of most link prediction algorithms are not satisfactory. In this paper, we propose a quantum-inspired ant colony optimization algorithm that integrates ant colony optimization and quantum computing to predict links in networks. Extensive experiments on both synthetic and real networks show that the accuracy and robustness of the new algorithm are competitive with most state-of-the-art algorithms. This result suggests that applying intelligent optimization to link prediction is promising for boosting its accuracy and robustness.
Year: 2018 PMID: 30190540 PMCID: PMC6127200 DOI: 10.1038/s41598-018-31254-3
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. The precision evaluation of six algorithms with 90% of the links used as the training set on the nPSO networks with 8 communities. Synthetic networks are generated by the nonuniform PSO model with parameters γ = 3 (power-law degree distribution exponent), m = [10, 12, 14] (half of average degree), T = [0.1, 0.3, 0.5] (temperature, inversely related to the clustering coefficient), N = [100, 500, 1000] (network size) and 8 communities. For each combination of parameters, 100 networks are generated, and the plots report the mean precision and standard error over the random iterations. Note that for SBM only 10 networks are considered due to its high time complexity. HD denotes the hyperbolic distance between the nodes in the original network.
Figure 2. The precision evaluation of three algorithms with 90% of the links used as the training set on the Watts-Strogatz networks. Synthetic networks are generated by the Watts-Strogatz model with parameters N = [100, 500, 1000] (network size), m = [10, 12, 14] (half of average degree) and β = [0.001, 0.01, 0.1] (rewiring probability). For each combination of parameters, 100 networks are generated, and the plots report the mean precision and standard error over the random iterations.
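For reference, the Watts-Strogatz generation protocol above can be sketched in a few lines of pure Python. This is a minimal illustration, not the authors' implementation; note that the lattice degree is k = 2m, since m is defined as half the average degree.

```python
import random

def watts_strogatz(n, k, beta, seed=0):
    """Watts-Strogatz small-world graph: a ring lattice on n nodes where
    each node is joined to its k nearest neighbours (k even), and each
    lattice edge is rewired with probability beta."""
    rng = random.Random(seed)
    edges = set()
    for u in range(n):                       # build the ring lattice
        for j in range(1, k // 2 + 1):
            edges.add(frozenset((u, (u + j) % n)))
    for u, v in sorted(tuple(sorted(e)) for e in edges):
        if rng.random() < beta:              # rewire this edge?
            free = [w for w in range(n)
                    if w != u and frozenset((u, w)) not in edges]
            if free:                         # keep the edge count constant
                edges.remove(frozenset((u, v)))
                edges.add(frozenset((u, rng.choice(free))))
    return edges

# One instance per parameter combination, e.g. N = 100, m = 10, beta = 0.01:
g = watts_strogatz(100, 2 * 10, 0.01)
```

Rewiring preserves the edge count (n·k/2 links), so precision values across β settings are compared on networks of identical density.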
The precision evaluation of five algorithms with 90% of the links used as the training set on the small-size real networks.
| Network |  |  |  |  |  |
|---|---|---|---|---|---|
| mouse neural | 0.02 | — | 0.11 | 0.10 | 0.01 |
| karate | 0.17 | 0.23 | 0.20 | — | 0.27 |
| dolphins | 0.13 | 0.18 | 0.14 | 0.16 | — |
| macaque neural | — | 0.64 | 0.56 | 0.68 | 0.55 |
| polbooks | 0.17 | 0.17 | 0.17 | 0.15 | — |
| ACM2009 contacts | 0.26 | — | 0.27 | 0.25 | 0.26 |
| football | 0.31 | 0.30 | — | 0.34 | 0.25 |
| physicians innovation | 0.07 | — | 0.07 | 0.06 | 0.08 |
| FWFW | — | 0.30 | 0.08 | 0.18 | 0.14 |
| manufacturing email | — | 0.41 | 0.42 | 0.47 | 0.39 |
| littlerock foodweb | — | 0.44 | 0.15 | 0.73 | 0.17 |
| jazz | — | 0.48 | 0.56 | 0.47 | 0.45 |
| residence hall friends | — | 0.21 | 0.24 | 0.18 | 0.24 |
| haggle contacts | — | — | 0.57 | — | 0.57 |
| worm nervoussys | — | 0.13 | 0.12 | 0.15 | 0.11 |
| netsci | 0.41 | 0.37 | — | 0.13 | 0.33 |
| infectious contacts | — | 0.30 | 0.34 | 0.30 | 0.33 |
| flightmap | — | 0.59 | 0.54 | 0.64 | 0.56 |
| — | — | 0.15 | — | 0.09 | — |
| polblog | — | 0.20 | 0.17 | 0.19 | 0.17 |
| mean precision | — | 0.31 | 0.29 | 0.31 | 0.27 |
| mean ranking | — | 2.78 | 3.20 | 3.28 | 3.60 |
For each network, the table reports the mean precision over the random iterations; the last rows give the mean precision over all datasets and the mean ranking of the algorithms over all networks. SBM is run for 10 iterations due to its high computational cost; the other algorithms are run for 100 iterations. For each network, the best algorithm (or algorithms) is highlighted in bold. The networks are sorted by N in ascending order. The algorithms are ordered from left to right according to the mean ranking (the ranking scores for each network are given in Table SII in the SI).
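As a concrete illustration of this evaluation protocol, here is a minimal pure-Python sketch: 10% of the links are hidden, every non-observed node pair is scored, and precision is the fraction of the top-L ranked pairs that are truly hidden links, with L equal to the number of hidden links. A simple common-neighbours count stands in for the link predictors evaluated in the table; it is not the paper's quantum-inspired algorithm.

```python
import random
from itertools import combinations

def precision_at_L(edges, frac_train=0.9, seed=0):
    """Hide (1 - frac_train) of the links, rank all non-observed pairs by
    a common-neighbours score, and return the fraction of the top-L
    ranked pairs (L = number of hidden links) that are hidden links."""
    rng = random.Random(seed)
    edges = [tuple(sorted(e)) for e in edges]
    rng.shuffle(edges)
    cut = int(frac_train * len(edges))
    train, probe = set(edges[:cut]), set(edges[cut:])
    nodes = sorted({u for e in edges for u in e})
    adj = {u: set() for u in nodes}
    for u, v in train:
        adj[u].add(v)
        adj[v].add(u)
    # Score every non-observed pair; common neighbours is the stand-in.
    scores = {(u, v): len(adj[u] & adj[v])
              for u, v in combinations(nodes, 2) if (u, v) not in train}
    top = sorted(scores, key=scores.get, reverse=True)[:len(probe)]
    return sum(1 for pair in top if pair in probe) / max(len(probe), 1)

# Toy example: on a 10-node clique every hidden link is recovered.
p = precision_at_L(list(combinations(range(10), 2)))
```

Averaging this quantity over repeated random train/probe splits yields the mean precision and standard error reported in the tables.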
The precision evaluation of three algorithms with 90% of the links used as the training set on the large-size real networks.
| Network |  |  |  |
|---|---|---|---|
| yeast | 0.25 | 0.26 | — |
| odlis | — | 0.11 | 0.08 |
| router | 0.11 | — | 0.30 |
| advogato | — | 0.15 | 0.15 |
| wikipedia | 0.14 | 0.11 | — |
| oregon | — | — | 0.07 |
| P2P | 0.03 | — | 0.03 |
| arxiv astroph | 0.53 | 0.58 | — |
| thesaurus | — | — | 0.07 |
| arxiv hepth | 0.22 | 0.21 | — |
| ARK201012 | — | 0.14 | 0.11 |
| — | 0.10 | 0.10 | — |
| mean precision | 0.17 | 0.18 | — |
| mean ranking | — | 2.00 | 2.13 |
For each network, the table reports the mean precision over the random iterations; the last rows give the mean precision over all datasets and the mean ranking of the algorithms over all networks. All algorithms are run for 10 iterations. For each network, the best algorithm (or algorithms) is highlighted in bold. The networks are sorted by N in ascending order. The algorithms are ordered from left to right according to the mean ranking (the ranking scores for each network are given in Table SIV in the SI).
The precision evaluation of three algorithms over time on the AS Internet networks.
| Algorithm | Mean precision | Mean ranking |
|---|---|---|
| — | — | 1 |
| CH | 0.13 | 2 |
| SPM | 0.09 | 3 |

CH precision matrix (rows: source time i; columns: target time):

| i \ target | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|
| 1 | 0.11 | 0.12 | 0.13 | 0.14 | 0.14 |
| 2 |  | 0.12 | 0.13 | 0.14 | 0.14 |
| 3 |  |  | 0.12 | 0.13 | 0.14 |
| 4 |  |  |  | 0.12 | 0.13 |
| 5 |  |  |  |  | 0.12 |

SPM precision matrix (rows: source time i; columns: target time):

| i \ target | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|
| 1 | 0.08 | 0.09 | 0.09 | 0.10 | 0.11 |
| 2 |  | 0.07 | 0.08 | 0.09 | 0.10 |
| 3 |  |  | 0.08 | 0.09 | 0.10 |
| 4 |  |  |  | 0.08 | 0.09 |
| 5 |  |  |  |  | 0.09 |
From September 2009 to December 2010, six AS Internet network snapshots are considered at time steps of 3 months. For every snapshot at time i = [1, 5], the non-observed links are assigned likelihood scores by the algorithms, and the link-prediction performance is evaluated with respect to every future time point j = [i + 1, 6]. For a pair of time points (i, j), the non-observed links at time i are ranked by likelihood score in descending order, and the precision is computed as the percentage of links that appear at time j among the top-r links, where r is the total number of non-observed links at time i that appear at time j. Non-observed links at time i involving nodes that disappear at time j are not considered in the ranking. For each algorithm, a 5-dimensional upper triangular matrix is shown, where entry (i, j) denotes the precision of the link prediction from time i to time j + 1. On the right side, the algorithms are ranked by the mean precision computed over all the time combinations. For each comparison, the best algorithm is highlighted in bold.
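The temporal protocol described above can be sketched as follows. This is a minimal sketch; `scores` is a hypothetical mapping from candidate links (as sorted node tuples) to the likelihood scores an algorithm would assign at time i, and snapshots are sets of sorted edge tuples.

```python
def temporal_precision(snap_i, snap_j, scores):
    """Precision from time i to time j: rank the non-observed links at
    time i by likelihood score and count how many of the top-r appear
    at time j, where r is the number of such links that do appear.
    Edges and score keys are sorted (u, v) tuples."""
    nodes_j = {u for e in snap_j for u in e}
    new_links = {e for e in snap_j if e not in snap_i}
    # Drop candidate pairs touching nodes that disappear by time j.
    candidates = {e: s for e, s in scores.items()
                  if e not in snap_i and set(e) <= nodes_j}
    r = sum(1 for e in candidates if e in new_links)
    top = sorted(candidates, key=candidates.get, reverse=True)[:r]
    return sum(1 for e in top if e in new_links) / max(r, 1)
```

Applying this to every pair (i, j) of snapshots fills one upper triangular matrix per algorithm, as in the table above.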