Oscar Fontanelli, Demian Hernández, Ricardo Mansilla.
Abstract
In this work we introduce a simple mathematical model, based on master equations, that describes the time evolution of hashtag popularity on the Twitter social network. Specifically, we model the total number of times a given hashtag appears on users' timelines as a function of time. The model considers two kinds of components: factors internal to the network (the degree distribution) and external factors, such as the external popularity of the hashtag. From the master equation, we obtain explicit solutions for the mean and variance and construct confidence regions. We propose a gamma kernel function to model hashtag popularity, which is quite simple and yields reasonable results. We validate the plausibility of the model by comparing it with actual Twitter data obtained through the public API. Our findings confirm that relatively simple semi-deterministic models can capture the essentials of this very complex phenomenon in a wide variety of cases. The model differs from existing models in its focus on the time evolution of the total number of times a particular hashtag has been seen by Twitter users and in its consideration of both internal and external components.
Keywords: Hashtag propagation; Master equations; Social network modeling
Year: 2022 PMID: 35126767 PMCID: PMC8807957 DOI: 10.1007/s13278-022-00861-4
Source DB: PubMed Journal: Soc Netw Anal Min
Schematic view of some relevant models from the last ten years for tweet, retweet, trend and hashtag prediction on Twitter
| References | Methodology | Main purpose |
|---|---|---|
| Ma ( | Supervised Machine Learning | Prediction of hashtag popularity |
| Kupavski ( | Supervised Machine Learning | Retweet prediction |
| Kong ( | Supervised Machine Learning | Hashtag bursting prediction |
| Doong ( | Supervised Machine Learning | Prediction of hashtag popularity |
| Firdaus ( | Supervised Machine Learning | Retweet prediction |
| Zhang ( | Deep Learning and Neural networks | Retweet prediction |
| Yu (2020) | Deep Learning | Prediction of peak time for hashtag popularity |
| Pervin ( | Statistical analysis | Analysis of hashtag co-occurrence |
| Pancer and Poole ( | Hierarchical regression analysis | Prediction of links and retweets |
| Zhang ( | Non-linear auto-regression models | Trend prediction |
| Xiong ( | Epidemiological models | Modeling of information propagation |
| Jin et al. ( | Epidemiological models | Modeling of information cascades |
| Skaza and Blais ( | Epidemiological models | Modeling of hashtag dynamics |
| Kawamoto ( | Stochastic models and random processes | Modeling the dynamics of retweet activities |
| Ko et al. ( | Mathematical modeling | Modeling the propensity to tweet and retweet |
| Mollgaard ( | Stochastic and mathematical modeling | Modeling of tweet rates |
| Zhao ( | Mathematical and statistical modeling | Prediction of tweet popularity |
| Rizoiu et al. ( | Stochastic and mathematical modeling | Prediction of popularity for tweeted videos |
| Bao et al. ( | Mathematical modeling and supervised ML | Prediction of tweet popularity and retweets |
Fig. 1 Time activity analysis for #MasterChef. In panel (A), we see the procedure for obtaining the empirical w(t) and fitting the theoretical one. In panel (B), we compare the observed X(t) with the model's predictions
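The theoretical w(t) in panel (A) is presumably the gamma kernel that the abstract proposes. A minimal sketch of how such a kernel could be estimated from activity times follows. It assumes the standard gamma density parameterised by `shape` and `scale`, and it uses a method-of-moments estimate rather than the authors' fitting procedure, which is not stated here. The function names and the sample `times` are hypothetical.

```python
import math


def gamma_kernel(t, shape, scale):
    """Gamma density, used here as an assumed form for the popularity w(t)."""
    if t <= 0:
        return 0.0
    # Work in log space to avoid overflow for large shape values.
    log_w = (
        (shape - 1) * math.log(t)
        - t / scale
        - math.lgamma(shape)
        - shape * math.log(scale)
    )
    return math.exp(log_w)


def fit_gamma_moments(times):
    """Method-of-moments estimates (shape, scale) from positive activity times.

    This is a stand-in technique, not the fitting procedure of the paper.
    """
    n = len(times)
    mean = sum(times) / n
    var = sum((x - mean) ** 2 for x in times) / n
    if var <= 0:
        raise ValueError("need at least two distinct activity times")
    return mean * mean / var, var / mean


# Hypothetical activity times (e.g. hours since the hashtag first appeared).
times = [1.0, 2.0, 2.0, 3.0, 3.0, 3.0, 4.0, 4.0, 5.0, 7.0]
shape, scale = fit_gamma_moments(times)

# For shape > 1 the gamma density peaks at (shape - 1) * scale.
peak = (shape - 1) * scale if shape > 1 else 0.0
print(f"shape={shape:.3f} scale={scale:.3f} peak at t={peak:.3f}")
```

Method of moments matches the sample mean and variance exactly, which keeps the sketch dependency-free; a least-squares or maximum-likelihood fit would be the natural alternative when the empirical w(t) is binned.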
Fig. 2 Observed time evolution of X(t) (red) and model predictions for a variety of hashtags that were trending topics between May and June 2021
Fig. 3 In the first stage, we took a training set with 30% of the observations to fit the parameters; in the second stage, we compared the model predictions against a test set with the remaining 70% of the data. Precision refers to the fraction of times the predicted confidence region contained the actual observation
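The precision measure described in this caption can be sketched as a coverage fraction. The example below assumes a chronological 30/70 split (the caption does not say how observations were assigned) and uses made-up observations and confidence bounds; `split_train_test` and `coverage_precision` are hypothetical helper names.

```python
def split_train_test(series, train_fraction=0.3):
    """Split a time-ordered series into a leading train part and a trailing test part."""
    cut = round(len(series) * train_fraction)
    return series[:cut], series[cut:]


def coverage_precision(observed, lower, upper):
    """Fraction of observations that fall inside [lower, upper] at the same index."""
    if not (len(observed) == len(lower) == len(upper)):
        raise ValueError("observed, lower and upper must have the same length")
    if not observed:
        raise ValueError("need at least one observation")
    hits = sum(1 for x, lo, hi in zip(observed, lower, upper) if lo <= x <= hi)
    return hits / len(observed)


# Hypothetical cumulative counts X(t) at ten successive time points.
x_observed = [10, 25, 45, 70, 90, 105, 118, 126, 131, 134]
train, test = split_train_test(x_observed)

# Hypothetical confidence-region bounds for the seven test points;
# in practice these would come from the fitted model's mean and variance.
lower = [60, 85, 100, 120, 122, 125, 128]
upper = [80, 100, 115, 130, 135, 138, 140]

precision = coverage_precision(test, lower, upper)
print(f"train size={len(train)} test size={len(test)} precision={precision:.3f}")
```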
Fig. 4 Three examples where our model does not entirely capture the popularity evolution of the hashtag, because of either a very narrow popularity function (#GreysAnatomy), a popularity function with two local maxima (#LetsGoPens) or a followers distribution with very large variance (#RealMadrid)