Junming Huang, Chao Li, Wen-Qiang Wang, Hua-Wei Shen, Guojie Li, Xue-Qi Cheng.
Abstract
For the study of information propagation, one fundamental problem is uncovering universal laws governing the dynamics of information propagation. This problem, from the microscopic perspective, is formulated as estimating the propagation probability that a piece of information propagates from one individual to another. Such a propagation probability generally depends on two major classes of factors: the intrinsic attractiveness of information and the interactions between individuals. Despite the fact that the temporal effect of attractiveness is widely studied, temporal laws underlying individual interactions remain unclear, causing inaccurate prediction of information propagation on evolving social networks. In this report, we empirically study the dynamics of information propagation, using the dataset from a population-scale social media website. We discover a temporal scaling in information propagation: the probability that a message propagates between two individuals decays with the latency since their latest interaction, obeying a power-law rule. Leveraging the scaling law, we further propose a temporal model to estimate future propagation probabilities between individuals, reducing the error rate of information propagation prediction from 6.7% to 2.6% and improving viral marketing with 9.7% incremental customers.
Year: 2014 PMID: 24939414 PMCID: PMC4061555 DOI: 10.1038/srep05334
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Characterizing propagation probabilities.
(a,b) Time stamps of positive examples (retweeting behaviors) on two random edges. Each vertical line represents a retweeting behavior occurring at the time stamp marked on the horizontal axis. (c,d) Positive (retweeting) and negative (neglecting) examples on those two edges. Vertical lines in the upper half represent positive examples, while those in the lower half represent negative ones. There is an obvious tendency for positive examples to concentrate in the left zone, i.e., most retweeting behaviors occur with short latency. The tendency is stronger in (c) than in (d). (e) Distribution of the latency of retweeting behaviors over all edges. (f) Ratio of positive examples to all examples on all edges with respect to the associated latency, demonstrating the power-law dependence of the propagation probability on the latency.
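The power-law relationship in panel (f) can be illustrated with a minimal sketch. It assumes the form p(t) = c * t^(-alpha), which this record does not state explicitly, and fits it by least squares in log-log space. The function name `fit_power_law` and the synthetic data are hypothetical and are not taken from the paper or its dataset.

```python
import math

def fit_power_law(latencies, ratios):
    """Fit ratio = c * latency**(-alpha) by least squares on log-log values.

    Assumed functional form; all inputs must be strictly positive.
    Returns the pair (c, alpha).
    """
    xs = [math.log(t) for t in latencies]
    ys = [math.log(r) for r in ratios]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # slope of the log-log line
    alpha = -slope                         # decay exponent
    c = math.exp(mean_y - slope * mean_x)  # prefactor from the intercept
    return c, alpha

# Synthetic, noise-free data generated from c = 0.5, alpha = 1.2
# (illustrative values only, not results from the paper).
latencies = [1, 2, 4, 8, 16]
ratios = [0.5 * t ** -1.2 for t in latencies]
c, alpha = fit_power_law(latencies, ratios)
```

On real data, the ratio in each latency bin would be the number of positive examples divided by the number of all examples in that bin, and empty bins would need to be dropped before taking logarithms.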
Figure 2. Model evaluation.
(a) AUC of the Decay model and baselines. AUC measures the area under the ROC curve, and is thus equivalent to the probability that a trained model correctly distinguishes a randomly selected positive example from a randomly selected negative example. (b) Perplexity of the Decay model and baselines when predicting retweeting behaviors, against the training set ratio. A lower perplexity indicates better prediction accuracy, meaning that a testing example surprises the trained model less. (c) Receiver Operating Characteristic (ROC) curves with a training set of 90% of the examples. (d) Influence spreads of an initial seed set selected based on propagation probabilities predicted by the Decay model and baselines.
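The two metrics in panels (a) and (b) can be sketched as follows. The AUC function uses the pairwise interpretation given in the caption, counting ties as half a win. The perplexity function assumes the common definition, the exponential of the negative mean log-likelihood, which may differ from the exact formula used in the paper. The function names and sample scores are hypothetical.

```python
import math

def pairwise_auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs in which the positive example
    receives the higher score; a tie counts as half a win."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

def perplexity(probs, labels, eps=1e-12):
    """exp(-mean log-likelihood) for binary labels (assumed definition).

    probs[i] is the predicted probability that example i is positive;
    eps keeps the logarithm finite for predictions of exactly 0 or 1.
    """
    total = 0.0
    for p, y in zip(probs, labels):
        likelihood = p if y == 1 else 1.0 - p
        total += math.log(max(likelihood, eps))
    return math.exp(-total / len(probs))

# Hypothetical scores: 5 of the 6 positive-negative pairs are ranked correctly.
auc = pairwise_auc([0.9, 0.8, 0.4], [0.5, 0.3])
# A model that always predicts 0.5 has perplexity 2 under this definition.
ppl = perplexity([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0])
```

The pairwise loop is quadratic in the number of examples; for large test sets a rank-based computation would give the same value more efficiently.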