Shibao Li, Junwei Huang, Zhigang Zhang, Jianhang Liu, Tingpei Huang, Haihua Chen.
Abstract
Link prediction aims to predict the existence of unknown links from network information. However, most similarity-based algorithms use only the current common-neighbor information and cannot achieve sufficiently high prediction accuracy in evolving networks. This paper therefore first defines future common neighbors: nodes that can become common neighbors in the future. To analyse whether future common neighbors contribute to current link prediction, we propose the similarity-based future common neighbors (SFCN) model for link prediction, which accurately locates all the future common neighbors in addition to the current common neighbors in a network and effectively measures their contributions. We also design and run three MATLAB simulation experiments. The first experiment, which adjusts two parameter weights in the SFCN model, reveals that future common neighbors contribute more than current common neighbors in complex networks. The other two experiments, which compare the SFCN model with eight algorithms in five networks, demonstrate that the SFCN model has higher accuracy and more robust performance.
Year: 2018 PMID: 30451945 PMCID: PMC6242980 DOI: 10.1038/s41598-018-35423-2
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
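The similarity-based baselines that the abstract refers to score a node pair from its current common neighbors. The sketch below uses the standard textbook definitions of six of the indices named in the result tables (CN, Salton, RA, HPI, HDI, LHN); it is not the SFCN model itself, whose formula this record does not give. The function names and the toy edge list are illustrative assumptions.

```python
from math import sqrt


def build_adjacency(edges):
    """Map each node to the set of its neighbors in an undirected network."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def similarity(adjacency, x, y, index="CN"):
    """Score the node pair (x, y) with one neighborhood-based index."""
    common = adjacency[x] & adjacency[y]
    kx, ky = len(adjacency[x]), len(adjacency[y])
    if index == "CN":   # number of common neighbors
        return float(len(common))
    if index == "RA":   # each common neighbor z contributes 1 / degree(z)
        return float(sum(1.0 / len(adjacency[z]) for z in common))
    if kx == 0 or ky == 0:
        return 0.0
    # The remaining indices normalize the common-neighbor count by the degrees.
    denominators = {
        "Salton": sqrt(kx * ky),
        "HPI": min(kx, ky),
        "HDI": max(kx, ky),
        "LHN": kx * ky,
    }
    return len(common) / denominators[index]


# Toy network (hypothetical): a and b share the neighbors c and d.
edges = [("a", "c"), ("b", "c"), ("a", "d"), ("b", "d"), ("d", "e")]
adjacency = build_adjacency(edges)
for name in ("CN", "Salton", "RA", "HPI", "HDI", "LHN"):
    print(name, similarity(adjacency, "a", "b", name))
```

All six scores depend only on neighbors that x and y already share, which is the limitation the abstract attributes to similarity-based algorithms in evolving networks.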
Figure 1. Three types of future common neighbors.
Figure 2. The process of identifying the future common neighbors from the chaotic network. The yellow nodes are the future common neighbors between x and y.
Details of networks.
| Networks | \|V\| | \|E\| | | | | | |
|---|---|---|---|---|---|---|---|
| CE | 297 | 2148 | 2.46 | 14.4646 | 1.8008 | 0.3079 | −0.163 |
| FWFB | 128 | 2075 | 1.78 | 32.4219 | 1.2370 | 0.3346 | −0.112 |
| NS | 379 | 914 | 4.93 | 4.82 | 1.66 | 0.798 | −0.082 |
| PB | 1222 | 16714 | 2.74 | 27.3552 | 2.9707 | 0.3600 | −0.221 |
| Yeast | 2375 | 11693 | 5.09 | 9.8467 | 3.4756 | 0.3883 | 0.454 |
|V| and |E| are the number of nodes and links, respectively.
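As a consistency check on the table above, the fourth numeric column equals 2|E|/|V| in every row, which is the average degree of an undirected network. The column headers were lost in extraction, so reading this column as average degree is an inference from the numbers, not something the record states. A minimal sketch of the check, with hypothetical names:

```python
# (|V|, |E|, fourth numeric column), copied from the table above.
NETWORKS = {
    "CE": (297, 2148, 14.4646),
    "FWFB": (128, 2075, 32.4219),
    "NS": (379, 914, 4.82),
    "PB": (1222, 16714, 27.3552),
    "Yeast": (2375, 11693, 9.8467),
}


def average_degree(num_nodes, num_links):
    """Average degree of an undirected network: each link adds 2 to the degree sum."""
    return 2 * num_links / num_nodes


def matches_table(tolerance=5e-3):
    """True if 2|E|/|V| agrees with the listed value for every network."""
    return all(
        abs(average_degree(nodes, links) - listed) < tolerance
        for nodes, links, listed in NETWORKS.values()
    )


print(matches_table())
```

The tolerance is loose only because the NS row is listed to two decimal places.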
Figure 3. AUC sensitivity analysis of the SFCN model in the FWFB network. The X-axis is the α value, taken from 0 to 15 at intervals of 3. The Y-axis is the β value, taken from 0 to 2 at intervals of 0.4.
Prediction accuracy, measured by AUC, of the classic indices and the corresponding SFCN-based algorithms in five real networks.
| AUC | CE | FWFB | NS | PB | Yeast |
|---|---|---|---|---|---|
| CN | 0.846 | 0.616 | 0.989 | 0.925 | 0.917 |
| SFCN-CN-RA | 0.876 | 0.659 | 0.989 | 0.939 | 0.971 |
| Salton | 0.802 | 0.532 | 0.984 | 0.880 | 0.914 |
| SFCN-Salton-HDI | 0.851 | 0.793 | 0.984 | 0.937 | 0.974 |
| RA | 0.871 | 0.598 | 0.977 | 0.927 | 0.924 |
| SFCN-RA-HDI | 0.872 | 0.794 | 0.991 | 0.938 | 0.975 |
| LP | 0.861 | 0.633 | 0.980 | 0.938 | 0.971 |
| SFCN-LP-RA | 0.872 | 0.666 | 0.990 | 0.942 | 0.978 |
| HPI | 0.804 | 0.528 | 0.979 | 0.855 | 0.912 |
| SFCN-HPI-HPI | 0.811 | 0.762 | 0.985 | 0.901 | 0.972 |
| SFCN-HPI-RA | 0.865 | 0.790 | 0.989 | 0.945 | 0.974 |
| HDI | 0.775 | 0.527 | 0.980 | 0.873 | 0.914 |
| SFCN-HDI-HDI | 0.849 | 0.782 | 0.985 | 0.935 | 0.976 |
| SFCN-HDI-RA | 0.889 | 0.795 | 0.990 | 0.947 | 0.977 |
| LNBRA | 0.863 | 0.659 | 0.980 | 0.928 | 0.920 |
| SFCN-LNBRA-RA | 0.883 | 0.796 | 0.993 | 0.949 | 0.977 |
| SFCN-LNBRA-HDI | 0.881 | 0.809 | 0.992 | 0.942 | 0.975 |
| SFCN-LNBRA-LHN | 0.878 | 0.843 | 0.989 | 0.941 | 0.975 |
| LHN | 0.725 | 0.390 | 0.974 | 0.766 | 0.906 |
| SFCN-LHN-LHN | 0.810 | 0.891 | 0.983 | 0.891 | 0.974 |
| SFCN-LHN-RA | 0.876 | 0.797 | 0.992 | 0.947 | 0.977 |
| SFCN-LHN-LP | 0.806 | 0.704 | 0.974 | 0.928 | 0.961 |
| SFCN-LHN-HDI | 0.839 | 0.798 | 0.984 | 0.936 | 0.976 |
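In link-prediction studies, AUC is commonly estimated by repeatedly comparing the score of a randomly chosen probe (held-out) link with the score of a randomly chosen nonexistent link: AUC = (n' + 0.5 n'') / n, where n' counts the comparisons won by the probe link and n'' counts the ties. This record does not spell out the sampling procedure used, so the sketch below assumes that common definition. The function name, the default of 10,000 comparisons, and the fixed seed are illustrative choices.

```python
import random


def auc(probe_scores, nonexistent_scores, n_comparisons=10_000, seed=0):
    """Estimate AUC by sampling (probe link, nonexistent link) score pairs."""
    rng = random.Random(seed)
    higher = 0  # comparisons where the probe link scores higher (n')
    equal = 0   # comparisons where both scores are equal (n'')
    for _ in range(n_comparisons):
        probe = rng.choice(probe_scores)
        nonexistent = rng.choice(nonexistent_scores)
        if probe > nonexistent:
            higher += 1
        elif probe == nonexistent:
            equal += 1
    return (higher + 0.5 * equal) / n_comparisons


# Hypothetical scores: probe links mostly, but not always, outrank nonexistent ones.
print(auc([3.0, 2.0, 0.0], [1.0, 0.0, 0.0]))
```

Under this definition a score that cannot separate the two groups gives about 0.5, so a value below 0.5 (for example LHN on FWFB in the table above) means the index ranks nonexistent links above held-out ones more often than not.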
Prediction accuracy, measured by precision (top-100), of the classic indices and the corresponding SFCN-based algorithms in five real networks.
| Precision | CE | FWFB | NS | PB | Yeast |
|---|---|---|---|---|---|
| CN | 0.198 | 0.094 | 0.396 | 0.460 | 0.678 |
| SFCN-CN-RA | 0.202 | 0.108 | 0.404 | 0.464 | 0.766 |
| Salton | 0.012 | 0.008 | 0.290 | 0.000 | 0.024 |
| SFCN-Salton-HDI | 0.012 | 0.010 | 0.260 | 0.000 | 0.032 |
| RA | 0.124 | 0.094 | 0.564 | 0.256 | 0.520 |
| SFCN-RA-HDI | 0.130 | 0.098 | 0.566 | 0.278 | 0.440 |
| LP | 0.124 | 0.112 | 0.312 | 0.400 | 0.654 |
| SFCN-LP-RA | 0.140 | 0.112 | 0.324 | 0.410 | 0.696 |
| HPI | 0.026 | 0.068 | 0.556 | 0.224 | 0.860 |
| SFCN-HPI-HPI | 0.036 | 0.388 | 0.192 | 0.224 | 0.868 |
| SFCN-HPI-RA | 0.116 | 0.382 | 0.162 | 0.566 | 0.896 |
| HDI | 0.032 | 0.008 | 0.310 | 0.002 | 0.030 |
| SFCN-HDI-HDI | 0.086 | 0.360 | 0.320 | 0.516 | 0.900 |
| SFCN-HDI-RA | 0.106 | 0.360 | 0.294 | 0.590 | 0.882 |
| LNBRA | 0.131 | 0.162 | 0.544 | 0.252 | 0.586 |
| SFCN-LNBRA-RA | 0.136 | 0.166 | 0.554 | 0.250 | 0.580 |
| SFCN-LNBRA-HDI | 0.130 | 0.164 | 0.564 | 0.278 | 0.602 |
| SFCN-LNBRA-LHN | 0.132 | 0.154 | 0.580 | 0.260 | 0.586 |
| LHN | 0.000 | 0.014 | 0.138 | 0.000 | 0.010 |
| SFCN-LHN-LHN | 0.000 | 0.026 | 0.138 | 0.000 | 0.014 |
| SFCN-LHN-RA | 0.000 | 0.014 | 0.138 | 0.000 | 0.012 |
| SFCN-LHN-LP | 0.000 | 0.020 | 0.138 | 0.000 | 0.012 |
| SFCN-LHN-HDI | 0.000 | 0.018 | 0.140 | 0.000 | 0.010 |
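Top-L precision is conventionally defined as L_r / L: rank all unobserved node pairs by score, take the top L, and count how many (L_r) are in the probe set. The sketch below assumes that convention with L = 100 as in the table; the function name and the toy scores are hypothetical, and the record does not say how ties in the ranking were handled.

```python
def precision_at_l(scores, probe_links, top_l=100):
    """Fraction of the top_l highest-scored candidate pairs that are probe links.

    scores: dict mapping a candidate node pair to its similarity score.
    probe_links: set of held-out pairs, keyed the same way as scores.
    """
    # sorted() is stable, so tied pairs keep their insertion order.
    ranked = sorted(scores, key=scores.get, reverse=True)[:top_l]
    hits = sum(1 for pair in ranked if pair in probe_links)
    return hits / top_l


# Toy example with L = 2: one of the two top-ranked pairs is a probe link.
scores = {("a", "b"): 0.9, ("a", "c"): 0.8, ("b", "c"): 0.1, ("c", "d"): 0.05}
probe_links = {("a", "b"), ("c", "d")}
print(precision_at_l(scores, probe_links, top_l=2))
```

Because only the head of the ranking counts, an index can reach a high AUC and still score near zero here, as several rows of the two tables show.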
Figure 4. The AUC of different algorithms for different ratios of training set to probe set in real networks. The X-axis is the ratio, and the Y-axis lists the algorithms.
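Varying the ratio of training set to probe set means re-partitioning the observed links before each evaluation. A minimal sketch of such a split, assuming a uniform random partition of the edge list; the record does not describe the authors' procedure, and the function name, default fraction, and seed are hypothetical.

```python
import random


def split_links(edges, train_fraction=0.9, seed=0):
    """Randomly partition the observed links into a training set and a probe set."""
    rng = random.Random(seed)
    shuffled = list(edges)  # copy so the caller's list is not reordered
    rng.shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


# Toy path network with 10 links, split 9:1.
edges = [(i, i + 1) for i in range(10)]
training, probe = split_links(edges, train_fraction=0.9)
print(len(training), len(probe))
```

This sketch does not check that the training network stays connected, which some evaluation protocols require.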