Abstract
The link-prediction problem is an open issue in data mining and knowledge discovery that attracts researchers from disparate scientific communities. A wealth of methods has been proposed to deal with this problem. Most of these approaches apply to unweighted networks, and only a few take link weights into consideration. In this paper, we present a model for undirected, weighted networks based on the mutual information of local network structures, where link weights are used to further enhance the distinguishability of candidate links. Empirical experiments are conducted on four weighted networks, and the results show that the proposed method provides more accurate predictions than both traditional unweighted indices and typical weighted indices. Furthermore, we discuss in depth the effects of weak ties in link prediction and the potential to predict link weights. This work may shed light on the design of algorithms for link prediction in weighted networks.
Year: 2016 PMID: 26849659 PMCID: PMC4744029 DOI: 10.1371/journal.pone.0148265
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Comparison of WMI-based methods with other typical unweighted indices measured by Precision (top-100) on four networks.
Each value is obtained by averaging over 100 independent runs, each with a random division into training set and probe set. The abbreviations WMI-WCN*, WMI-WAA* and WMI-WRA* represent the highest Precision values obtained by Eqs (19)–(21), respectively. The optimal values of α are presented in Table 2. The best performance in each network is marked in bold.
| Indices∖Nets | USAir | Celegans | Baywet | Bible |
|---|---|---|---|---|
| CN | 0.606 | 0.14 | 0.092 | 0.447 |
| LNB-CN | 0.621 | 0.14 | 0.11 | 0.539 |
| WMI-WCN | 0.498 | 0.177 | 0.099 | 0.398 |
| WMI-WCN* | 0.65 | | 0.162 | 0.55 |
| AA | 0.625 | 0.14 | 0.093 | 0.571 |
| LNB-AA | 0.641 | 0.14 | 0.109 | 0.747 |
| WMI-WAA | 0.549 | 0.173 | 0.103 | 0.466 |
| WMI-WAA* | 0.667 | 0.196 | 0.164 | 0.706 |
| RA | 0.645 | 0.133 | 0.093 | 0.872 |
| LNB-RA | 0.643 | 0.133 | 0.107 | 0.916 |
| WMI-WRA | 0.59 | 0.159 | 0.192 | 0.912 |
| WMI-WRA* | 0.654 | 0.165 | | |
| DC-CN* | | 0.143 | 0.094 | 0.876 |
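The evaluation protocol described in the caption above (random division into training and probe sets, Precision over the top-100 predictions, averaged over 100 runs) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the 10% probe fraction, and the `score_fn` callback (which maps training edges to a dictionary of candidate-pair scores) are assumptions introduced here.

```python
import random


def split_edges(edges, probe_fraction=0.1, rng=None):
    """Randomly divide an edge list into (training set, probe set).

    probe_fraction=0.1 is an assumed default, not a value from the caption.
    """
    if rng is None:
        rng = random.Random()
    shuffled = list(edges)
    rng.shuffle(shuffled)
    n_probe = max(1, int(round(probe_fraction * len(shuffled))))
    return shuffled[n_probe:], shuffled[:n_probe]


def precision_at_l(scores, probe_edges, top_l=100):
    """Fraction of the top_l highest-scored candidate pairs found in the probe set."""
    probe = {frozenset(edge) for edge in probe_edges}  # undirected: order-free
    ranked = sorted(scores, key=scores.get, reverse=True)[:top_l]
    hits = sum(1 for pair in ranked if frozenset(pair) in probe)
    return hits / top_l


def mean_precision(edges, score_fn, runs=100, top_l=100, seed=0):
    """Average Precision over independent random divisions.

    score_fn is a hypothetical callback: training edges -> {(u, v): score}.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        train, probe = split_edges(edges, rng=rng)
        total += precision_at_l(score_fn(train), probe, top_l)
    return total / runs
```

Any similarity index can be plugged in through `score_fn`, so the same loop would serve for both the unweighted and the weighted indices in the table.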
Optimal values of parameter α subject to the highest Precision values in four networks.
| Indices∖Nets | USAir | Celegans | Baywet | Bible |
|---|---|---|---|---|
| WCN* | -0.41 | 1.41 | 0.18 | 0 |
| WMI-WCN* | -0.4 | 1.71 | 0.21 | -4.16 |
| WAA* | -0.40 | 1.44 | 0.25 | -2.34 |
| WMI-WAA* | -0.41 | 1.95 | 0.32 | -0.82 |
| WRA* | -0.24 | 1.56 | 0.98 | 0.4 |
| WMI-WRA* | -0.1 | 1.76 | 1.82 | 0.68 |
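To make the role of α concrete, the sketch below scores candidate links with an α-weighted common-neighbor index and picks the α that maximizes a supplied evaluation function. The formula used, a sum over common neighbors z of w(x,z)^α + w(z,y)^α, is one common form from the weighted link-prediction literature and is an assumption here; this section does not reproduce the paper's own equations, and the WMI-based variants are not shown. All function names are hypothetical.

```python
from itertools import combinations


def build_adjacency(weighted_edges):
    """Map each node to {neighbor: weight} for an undirected weighted edge list."""
    adj = {}
    for u, v, w in weighted_edges:
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    return adj


def wcn_scores(weighted_edges, alpha):
    """Score each unconnected pair (x, y) that shares at least one neighbor.

    Assumed form: sum over common neighbors z of w(x,z)**alpha + w(z,y)**alpha.
    Weights must be positive so that negative alpha is well defined.
    """
    adj = build_adjacency(weighted_edges)
    scores = {}
    for x, y in combinations(sorted(adj), 2):
        if y in adj[x]:
            continue  # already linked, so not a candidate
        common = adj[x].keys() & adj[y].keys()
        if common:
            scores[(x, y)] = sum(
                adj[x][z] ** alpha + adj[y][z] ** alpha for z in common
            )
    return scores


def best_alpha(alphas, evaluate):
    """Return the (alpha, value) pair that maximizes evaluate(alpha) on a grid."""
    return max(((a, evaluate(a)) for a in alphas), key=lambda t: t[1])
```

Under this assumed form, α = 0 makes every term equal to 2, so the ranking coincides with unweighted CN. That agrees with the tables: WCN* attains its optimum at α = 0 on Bible, where its Precision (0.447) equals that of CN. Negative α rewards weak ties, and positive α rewards strong ones.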
Fig 1. The performance of WMI-based methods and three other weighted indices with different α on the four real-world networks.
Comparison of WMI-based methods with other typical weighted indices measured by Precision (top-100) on four networks.
Each value is obtained by averaging over 100 independent runs, each with a random division into training set and probe set. The abbreviations WCN*, WAA*, WRA*, WMI-WCN*, WMI-WAA* and WMI-WRA* represent the highest Precision values shown in Fig 1 (see Table 2 for the corresponding α values). The best performance in each network is marked in bold.
| Indices∖Nets | USAir | Celegans | Baywet | Bible |
|---|---|---|---|---|
| WCN | 0.462 | 0.167 | 0.046 | 0.347 |
| rWCN | 0.115 | 0.133 | 0.059 | 0.429 |
| WMI-WCN | 0.498 | 0.177 | 0.099 | 0.398 |
| WCN* | 0.637 | 0.189 | 0.141 | 0.447 |
| WMI-WCN* | 0.65 | | 0.162 | 0.55 |
| WAA | 0.533 | 0.178 | 0.053 | 0.359 |
| rWAA | 0.030 | 0.136 | 0.067 | 0.669 |
| WMI-WAA | 0.549 | 0.173 | 0.103 | 0.466 |
| WAA* | 0.655 | 0.197 | 0.153 | 0.594 |
| WMI-WAA* | **0.667** | 0.196 | 0.164 | 0.706 |
| WRA | 0.578 | 0.163 | 0.191 | 0.838 |
| rWRA | 0.134 | 0.128 | 0.072 | 0.817 |
| WMI-WRA | 0.59 | 0.159 | 0.192 | 0.912 |
| WRA* | 0.647 | 0.167 | 0.191 | 0.887 |
| WMI-WRA* | 0.654 | 0.165 | | |
Comparison of the computing efficiency of sixteen methods on four real-world networks.
Each value is the average running time in seconds over 100 independent runs.
| Indices∖Nets | USAir | Celegans | Baywet | Bible |
|---|---|---|---|---|
| CN | 0.0134 | 0.0142 | 0.00484 | 0.239 |
| WCN | 0.0288 | 0.0297 | 0.0106 | 0.502 |
| rWCN | 0.043 | 0.042 | 0.017 | 0.75 |
| LNB-CN | 0.0606 | 0.06 | 0.023 | 1.08 |
| WMI-WCN | 0.0897 | 0.0902 | 0.0389 | 1.53 |
| AA | 0.108 | 0.106 | 0.0431 | 1.92 |
| WAA | 0.124 | 0.121 | 0.0528 | 2.21 |
| rWAA | 0.142 | 0.139 | 0.0573 | 2.59 |
| LNB-AA | 0.161 | 0.156 | 0.0628 | 2.91 |
| WMI-WAA | 0.191 | 0.185 | 0.0748 | 3.38 |
| RA | 0.207 | 0.201 | 0.0822 | 3.76 |
| WRA | 0.229 | 0.22 | 0.0894 | 4.15 |
| rWRA | 0.246 | 0.24 | 0.0925 | 4.53 |
| LNB-RA | 0.265 | 0.259 | 0.102 | 4.85 |
| WMI-WRA | 0.295 | 0.288 | 0.115 | 5.31 |
| DC-CN | 0.313 | 0.304 | 0.121 | 5.69 |
Estimated optimal values of parameter α subject to the highest Precision values validated by the validation sets in four networks, respectively.
The original network is divided into three parts: training set, validation set and probe set, in proportions of 80%, 10% and 10%, respectively. For each network, RMSD is the root-mean-square deviation between the Precision values obtained with the estimated α values and those obtained with the optimal α values in Table 2.
| Indices∖Nets | USAir | Celegans | Baywet | Bible |
|---|---|---|---|---|
| WCN* | -0.07 | 1.26 | 0.18 | 0 |
| WMI-WCN* | -0.21 | 1.54 | 0.23 | -1.82 |
| WAA* | 0 | 1.44 | 0.27 | -0.79 |
| WMI-WAA* | -0.05 | 1.18 | 0.37 | -0.99 |
| WRA* | 0.41 | 1.61 | 2.75 | 0.48 |
| WMI-WRA* | 0.47 | 1.66 | 3.34 | 0.6 |
| RMSD | 0.025 | 0.006 | 0.002 | 0.008 |
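The validation procedure in the caption above can be sketched as a three-way split plus an RMSD helper. This is a minimal sketch under stated assumptions: the function names are hypothetical, and the caption does not spell out which values the RMSD aggregates; the sketch assumes it compares, within one network, the Precision values obtained with the estimated α against those obtained with the optimal α, one pair per index.

```python
import math
import random


def three_way_split(edges, fractions=(0.8, 0.1, 0.1), seed=None):
    """Randomly divide edges into (training, validation, probe) sets.

    The default fractions follow the 80%/10%/10% proportions in the caption.
    """
    rng = random.Random(seed)
    shuffled = list(edges)
    rng.shuffle(shuffled)
    n_train = int(round(fractions[0] * len(shuffled)))
    n_valid = int(round(fractions[1] * len(shuffled)))
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    probe = shuffled[n_train + n_valid:]  # remainder goes to the probe set
    return train, valid, probe


def rmsd(values_a, values_b):
    """Root-mean-square deviation between two equal-length sequences."""
    if len(values_a) != len(values_b) or not values_a:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = sum((a - b) ** 2 for a, b in zip(values_a, values_b))
    return math.sqrt(squared / len(values_a))
```

In use, α would be chosen by maximizing Precision on the validation set with only the training set visible, and the chosen α would then be scored once against the probe set.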