Ruizhi Zhang1, Qiaozi Wang2, Qiming Yang1, Wei Wei3,4.
Abstract
Temporal link prediction is an important task in network science with a wide range of practical applications. Revealing the evolutionary mechanism of a network is essential for link prediction, and two challenges remain: how to use the historical information of temporal links effectively, and how to extract the high-order patterns of network structure efficiently. To address these issues, we propose a novel temporal link prediction model with an adjusted sigmoid function and 2-simplex structure (TLPSS). The adjusted sigmoid decay mode accounts for the active, decay, and stable states of edges, which fits the life cycle of information. In addition, we introduce a latent matrix sequence built from the simplex high-order structure; it improves link prediction and remains feasible in sparse networks. By combining the life cycle of information with the simplex high-order structure, TLPSS keeps temporal and structural information consistent in dynamic networks. Experimental results on six real-world datasets demonstrate the effectiveness of TLPSS: the proposed model improves link prediction performance by an average of 15% compared with the baseline methods.
Year: 2022 PMID: 36198758 PMCID: PMC9534913 DOI: 10.1038/s41598-022-21168-6
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
Figure 1. Schematic diagram of temporal link prediction.
Main notations.
| Notation | Description |
|---|---|
| | Graph snapshot at timestamp |
| | Adjacency matrix at timestamp |
| | Latent matrix at timestamp |
| | Degree vector at timestamp |
| | Node set |
| | Edge set at timestamp |
| | Latent edge, |
| p, q | Hyperparameters in adjusted sigmoid function |
Figure 2. The sigmoid function and the adjusted sigmoid function (ASF). The upper-left panel shows the original sigmoid function. Comparing the upper-right and lower-left panels shows that a larger parameter q retains more information. Comparing the upper-right and lower-right panels shows that the parameter p determines the active period of link information.
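
The roles of p and q described in this caption can be illustrated with a small sketch. The exact ASF formula is not given in this record, so the functional form below is an assumption chosen only to reproduce the described behavior: a link keeps close to full weight while it is active, decays around age p, and settles at a residual weight q. The names `adjusted_sigmoid_decay` and `age` are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def adjusted_sigmoid_decay(age: float, p: float, q: float) -> float:
    """Weight of a link last observed `age` time units ago.

    ASSUMED form (not taken from the paper):
        weight = q + (1 - q) * sigmoid(p - age)
    p: approximate length of the active period (midpoint of the decay)
    q: residual weight kept in the stable state, with 0 <= q <= 1
    """
    return q + (1.0 - q) * sigmoid(p - age)


# Weights for a link observed 0, 5, 10 and 50 steps ago, with p = 5, q = 0.3.
weights = [adjusted_sigmoid_decay(a, p=5.0, q=0.3) for a in (0, 5, 10, 50)]
```

Under this assumed form, increasing q raises the floor that old links decay toward, and increasing p delays the point at which the decay sets in, which matches the qualitative behavior the caption describes.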
Figure 3. Examples of different types of simplex structures in networks.
Figure 4. Schematic diagram of the topology around the target link. In this figure, the node pair x and y is to be predicted, are their common neighbors, and they form 2-simplices . h is the hidden neighbor of endpoint x, and is the latent edge. The simplicial complex J can be decomposed into two 2-simplices and . Symmetrically, and are hidden neighbors of node y.
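
As a rough illustration of the structures in this caption, the sketch below treats a 2-simplex as a triangle of three mutually connected nodes, enumerates the triangles in one snapshot, and lists the common neighbors of a target pair. It does not reproduce the paper's latent-edge construction, whose symbols are missing from this record; the helper names and the toy edge list are hypothetical.

```python
from itertools import combinations


def build_adjacency(edges):
    """Undirected adjacency sets from an iterable of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def two_simplices(adj):
    """All triangles in the graph, each as a sorted 3-tuple of nodes."""
    triangles = set()
    for u, nbrs in adj.items():
        for v, w in combinations(sorted(nbrs), 2):
            if w in adj[v]:
                triangles.add(tuple(sorted((u, v, w))))
    return triangles


def common_neighbors(adj, x, y):
    """Nodes adjacent to both x and y."""
    return adj.get(x, set()) & adj.get(y, set())


# Toy snapshot: a and b are common neighbors of x and y; h neighbors x only.
edges = [("x", "a"), ("y", "a"), ("x", "b"), ("y", "b"), ("x", "h"), ("h", "a")]
adj = build_adjacency(edges)
triangles = two_simplices(adj)       # only {a, h, x} is closed so far
shared = common_neighbors(adj, "x", "y")
```

In this toy graph, each node in `shared` would close a triangle with x and y if the target link (x, y) appeared.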
Figure 5. Diagram of the proposed model TLPSS. The model consists of three steps: pre-processing, graph construction, and prediction. In the first step, the data are processed and decayed by the ASF. Next, the adjacency matrix sequence and the latent matrix sequence are obtained from the network snapshots. Finally, TLPSS couples the temporal and structural information to predict links.
Network datasets statistics.
| Dataset | Nodes | Edges | Avg. degree | Start date | End date | Total duration | Snapshots (number/unit) |
|---|---|---|---|---|---|---|---|
| Contact | 273 | 28227 | 206.78 | 1970/1/1 | 1970/1/4 | 70h | 70/h |
| DBLP | 1169 | 10667 | 18.24 | 1986/1/1 | 1996/1/1 | 10y | 10/y |
| Digg | 3159 | 17661 | 11.18 | 2008/11/3 | 2008/11/11 | 8d | 192/h |
| Enron | 883 | 31092 | 70.42 | 2000/2/15 | 2000/6/14 | 4m | 17/w |
| | 3877 | 30480 | 15.72 | 2007/11/30 | 2008/8/26 | 9m | 270/d |
| Prosper | 2561 | 46540 | 36.34 | 2006/10/10 | 2006/12/11 | 2m | 60/d |
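
The snapshot counts in this table come from cutting each timestamped edge stream into fixed-width windows (hours, days, weeks, or years). A minimal sketch of that binning step is shown below; the function name, the `(u, v, t)` event format, and the choice to drop events before `start` are assumptions for illustration, not details taken from the paper.

```python
from collections import defaultdict


def split_into_snapshots(events, start, width):
    """Group timestamped edges into consecutive snapshots.

    events: iterable of (u, v, t) with numeric timestamps t
    start:  timestamp at which the first window opens
    width:  window length, in the same unit as t
    Snapshot k covers the half-open interval
    [start + k * width, start + (k + 1) * width).
    Returns a list of edge sets; edges are stored as (min, max) pairs
    so that (u, v) and (v, u) count as the same undirected edge.
    """
    buckets = defaultdict(set)
    for u, v, t in events:
        if t < start:
            continue  # assumption: events before the start are ignored
        k = int((t - start) // width)
        buckets[k].add((min(u, v), max(u, v)))
    n = max(buckets) + 1 if buckets else 0
    return [buckets.get(k, set()) for k in range(n)]


# Hourly snapshots from timestamps given in seconds.
events = [("a", "b", 0), ("b", "c", 3599), ("a", "c", 3600), ("b", "a", 7300)]
snapshots = split_into_snapshots(events, start=0, width=3600)
```

Windows with no events come back as empty sets, so the list index always equals the window number.
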
Baseline link prediction methods for temporal networks.
| Baseline Methods | Description | Definition |
|---|---|---|
| Common neighbors (CN) | Uses the number of common neighbors as an indicator of how likely a link is to form between two nodes. | |
| Jaccard Index (JA) | Also evaluates the probability of a link from the number of common neighbors; it is the normalized version of CN. | |
| Preferential Attachment (PA) | The probability that the target link forms is proportional to the product of the degrees of its two endpoints; it is a hub-promoted method. | |
| Resource Allocation (RA) | Common neighbors serve as a medium for resource transfer, and the weight of each common neighbor is inversely proportional to its degree. | |
| Cannistraci-Alanis-Ravasi (CAR) | Uses the links between common neighbors along with the common-neighbor information, where LCL’(x,y) is the total weight of links between common neighbors. | |
| Clustering Coefficient-based Index (CCLP) | Uses the clustering coefficient of common neighbors to reflect the density of triangles within a local network environment. | |
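
For orientation, the first four baselines can be sketched in their standard static, unweighted textbook forms. The definitions used in the paper are missing from this record and may be weighted or time-decayed variants, so the code below is an assumption-laden illustration rather than the authors' formulas; the function names and the toy graph are hypothetical.

```python
def cn_score(adj, x, y):
    """Common neighbors: number of nodes adjacent to both x and y."""
    return len(adj.get(x, set()) & adj.get(y, set()))


def jaccard_score(adj, x, y):
    """Jaccard index: common neighbors divided by the size of the union."""
    nx, ny = adj.get(x, set()), adj.get(y, set())
    union = nx | ny
    return len(nx & ny) / len(union) if union else 0.0


def pa_score(adj, x, y):
    """Preferential attachment: product of the two endpoint degrees."""
    return len(adj.get(x, set())) * len(adj.get(y, set()))


def ra_score(adj, x, y):
    """Resource allocation: sum of 1/degree over the common neighbors."""
    shared = adj.get(x, set()) & adj.get(y, set())
    return sum(1.0 / len(adj[z]) for z in shared)


# Toy undirected graph as adjacency sets; (x, y) is the candidate link.
adj = {
    "x": {"a", "b"},
    "y": {"a", "b", "c"},
    "a": {"x", "y"},
    "b": {"x", "y"},
    "c": {"y"},
}
scores = {
    "CN": cn_score(adj, "x", "y"),
    "JA": jaccard_score(adj, "x", "y"),
    "PA": pa_score(adj, "x", "y"),
    "RA": ra_score(adj, "x", "y"),
}
```

All four scores depend only on the neighborhoods of x and y, which is why they are cheap to compute on each snapshot.
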
Comparison of the AUC value between TLPSS and baseline methods.
| AUC | CN | JA | PA | RA | CAR | CCLP | TLPSS |
|---|---|---|---|---|---|---|---|
| Contact | 0.9525 | 0.8611 | 0.9020 | 0.9324 | 0.9334 | 0.8495 | |
| DBLP | 0.8627 | 0.8591 | 0.5201 | 0.8718 | 0.7748 | 0.8105 | |
| Digg | 0.6426 | 0.6445 | 0.5089 | 0.6472 | 0.6525 | 0.6119 | |
| Enron | 0.8872 | 0.8730 | 0.4681 | 0.8852 | 0.8745 | 0.8277 | |
| | 0.7505 | 0.7518 | 0.4731 | 0.7520 | 0.5798 | 0.7453 | |
| Prosper | 0.4018 | 0.4102 | 0.4639 | 0.3907 | 0.4993 | 0.4036 | |
Significant values are in bold.
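
For readers unfamiliar with AUC in link prediction, a common formulation compares the score of each true future link against the score of each non-link and counts how often the true link ranks higher, with ties counted as half. The sketch below computes this exhaustively over all pairs; whether the paper uses exhaustive or sampled comparisons is not stated in this record, so treat this as an assumed variant. The function name and example scores are hypothetical.

```python
def link_prediction_auc(positive_scores, negative_scores):
    """AUC as the probability that a true link outscores a non-link.

    positive_scores: scores assigned to links that do appear
    negative_scores: scores assigned to node pairs that stay unlinked
    Ties contribute 0.5, so a constant scorer gets exactly 0.5.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0
    ties = 0
    for sp in positive_scores:
        for sn in negative_scores:
            if sp > sn:
                wins += 1
            elif sp == sn:
                ties += 1
    total = len(positive_scores) * len(negative_scores)
    return (wins + 0.5 * ties) / total


# Two true links and three non-links with made-up scores.
auc = link_prediction_auc([0.9, 0.4], [0.4, 0.1, 0.1])
```

Under this definition a value of 0.5 corresponds to random ranking, so entries below 0.5 in the table above indicate a method that ranks true links lower than non-links on that dataset.
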
Comparison of the Precision value between TLPSS and baseline methods.
| Precision | CN | JA | PA | RA | CAR | CCLP | TLPSS |
|---|---|---|---|---|---|---|---|
| Contact | 0.3700 | 0.6600 | 0.9700 | 0.9600 | 0.9600 | ||
| DBLP | 0.2100 | 0.1700 | 0.0000 | 0.0210 | 0.1400 | 0.0800 | 0.3600 |
| Digg | 0.0670 | 0.0530 | 0.0000 | 0.0240 | 0.0830 | 0.0070 | |
| Enron | 0.6500 | 0.6500 | 0.0000 | 0.2400 | 0.6100 | 0.1900 | |
| | 0.0080 | 0.0090 | 0.0000 | 0.0030 | 0.0050 | 0.0080 | |
| Prosper | 0.0004 | 0.0008 | 0.0024 | 0.0004 | 0.0024 | 0.0028 | |
Significant values are in bold.
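
Precision in link prediction is usually computed over the top-L ranked candidate pairs: the fraction of those L pairs that turn out to be real links. The sketch below assumes that definition; the value of L used in the paper is not given in this record, and the function name and example data are hypothetical.

```python
def precision_at_l(scores, true_links, l):
    """Fraction of the l highest-scored candidate pairs that are true links.

    scores:     dict mapping a candidate pair to its predicted score
    true_links: set of pairs that actually appear in the target snapshot
    l:          number of top-ranked candidates to inspect (assumed choice)
    Ties are broken by the insertion order of `scores`.
    """
    if l <= 0:
        raise ValueError("l must be positive")
    ranked = sorted(scores, key=scores.get, reverse=True)[:l]
    hits = sum(1 for pair in ranked if pair in true_links)
    return hits / l


# Four candidate pairs, two of which really appear later.
scores = {("a", "b"): 0.9, ("a", "c"): 0.8, ("b", "d"): 0.3, ("c", "d"): 0.1}
true_links = {("a", "b"), ("c", "d")}
top2 = precision_at_l(scores, true_links, l=2)
```

Because only the head of the ranking counts, a method can have a reasonable AUC and still score 0.0000 on precision, as several baselines do in the table above.
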
Figure 6. Performance comparison for varying parameter p in different dynamic networks. All methods use the same temporal information decayed by the ASF. TLPSS outperforms the baseline methods.
Figure 7. Performance comparison for varying parameter q in different dynamic networks. The AUC of the TLPSS model shows a rising stage followed by a stable stage. The sharp rise illustrates the effectiveness of latent edges composed of 2-simplex structures.