
From agent-based models to the macroscopic description of fake-news spread: the role of competence in data-driven applications.

Abstract

Fake news spreading, with the aim of manipulating individuals' perceptions of facts, is now recognized as a major problem in many democratic societies. Yet, to date, little has been understood about how fake news spreads on social networks, what the influence of the education level of individuals is, when fake news is effective in influencing public opinion, and what interventions might be successful in mitigating its effect. In this paper, starting from the kinetic multi-agent model with competence recently introduced by the first two authors, we propose to derive reduced-order models through the notion of social closure in the mean-field approximation, which has its roots in the classical hydrodynamic closure of kinetic theory. This approach yields simplified models in which the competence and learning of the agents maintain their role in the dynamics and, at the same time, the structure of such models is more suitable to be interfaced with data-driven applications. Examples of different Twitter-based test cases are described and discussed.


Keywords: Agent-based models; Competence; Data uncertainty; Fake news spreading; Kinetic models; Learning dynamics; Social closure

Year: 2022 PMID: 36213149 PMCID: PMC9527739 DOI: 10.1007/s42985-022-00194-z

Source DB: PubMed Journal: SN Partial Differ Equ Appl ISSN: 2662-2963

Introduction

Since the 2016 U.S. presidential election, and more recently the COVID-19 infodemic, fake news on social networks, intended to manipulate users’ perceptions of events, has been recognized as a fundamental problem in open societies. As fake news proliferates, disinformation threatens democracy and efficient governance. In particular, there is empirical evidence that fake news spreads significantly “faster, deeper, and more widely” than real news [37]. The same study also highlights that the phenomenon is not due to automated dissemination by bots but to the actions of human beings who share the news without being able to identify misinformation. It is therefore of fundamental importance to construct mathematical models that can describe such scenarios and that have a structure simple enough to be interfaced with available data, for example from social networks, while still embedding the specific features related to the ability of individuals to detect false information. In recent years, compartmental models inspired by epidemiology have been used fruitfully to study the spreading of rumors and hoaxes. For instance, following the pioneering work of Daley and Kendall [11], in [23] SIR-type models are used in conjunction with dynamical trust rates that account for the different spreading rates in a network. Those traditional models were elaborated in [7], where the authors also consider the impact of online groups in feeding the growth of a rumor once it has started. Alongside these approaches there are more data-driven works. In this field, Twitter has been gaining consensus as a powerful source of useful and structured information.
A recent example in this direction can be found in [27], which focuses on fake news dissemination on the platform using a two-phase model, where fake news initially spreads as a novel news story and, after a correction time, is paired with a competing narrative that describes the news as fake. Twitter data in conjunction with epidemiological models have already been used to study the spread of rumors and fake news by several authors [10, 16, 17, 26], where SIS and SEIZ compartmental models were employed to fit data on the evolution of different news items. Mounting experimental evidence highlights the strong link between digital media literacy and the ability to reliably assess the quality of online information. This connection was identified early by communication scientists [20] and later confirmed by experimental studies, see e.g. [22, 24]. In [19], starting from an agent-based model for the dissemination of fake news in the presence of competence, novel mathematical models were derived and discussed using the tools of kinetic theory in the limit of a large number of agents. Previously, kinetic models that include the role of competence or knowledge had been proposed in [5, 29, 31]. The behavior of a social system composed of a large number of interacting agents has been studied in the case of opinion formation [3, 9, 14, 15, 34] and, more recently, epidemiological dynamics [1, 2, 12]. We refer to [28] for an introduction to the subject. The compartmental structure of the model for fake-news spreading in the presence of competence introduced in [19] is composed of four groups of individuals: the susceptible (S) agents, who are unaware of the fake news; the exposed (E) agents, who know the news but have not yet decided whether to spread it; the infectious (I) agents, who actively spread it; and finally the skeptical or removed (R) agents, who are aware of the news but choose not to spread it.
On a population divided among such categories, there is also a social structure based on an additional time-evolving variable that measures the competence level of the agents. Although the model has shown the capacity to correctly describe the role of competence in the dynamics of fake news, its mathematical structure based on kinetic partial differential equations is generally too complex to be interfaced with the available data. To address this problem, in the present work we exploit the knowledge of the equilibrium states of the corresponding mean-field model to derive reduced-order macroscopic models based on ordinary differential equations in which the role of competence nevertheless remains present. The new social models, thanks to their simpler structure, are more suitable for data-driven applications. We emphasize that the methodology adopted here is quite general and, in principle, points the way to introducing additional social characteristics of individuals into mathematical models that remain tractable in terms of structural complexity. The rest of the manuscript is organized as follows. In Sect. 2, we recall the basic concepts of the kinetic model describing the spread of fake news in the presence of competence. Next, in Sect. 3, using the local equilibrium states of the competence, we derive reduced-order models that depend on the specific shape of the interaction function. Section 4 presents a series of numerical experiments in which we first validate the model and then consider data-driven applications based on Twitter. The last section reports some final considerations.

Kinetic models, competence and fake news spreading

In this section we present a model for the spreading of fake news in a society characterized by heterogeneous competence of agents. Our starting point is the compartmental kinetic approach recently proposed in [19]. We suppose that the system of agents can be divided into the following epidemiologically relevant states: susceptible (S) agents are those who are unaware of the fake news; exposed (E) agents have encountered the fake news but have yet to spread it; infectious (I) agents are the actual spreaders; and, finally, removed (R) agents are not actively engaged in the spread of misinformation. In the following we indicate with $\mathcal{C} = \{S, E, I, R\}$ the set of epidemiological compartments. Aiming to incorporate the effects of personal competence on the fake news dynamics, we stick to a simple mathematical setting where the state of the individuals in each compartment, at any time $t \ge 0$, is characterized by the sole competence level $x \in \mathbb{R}_+$. Hence, we denote by $f_S(x,t)$, $f_E(x,t)$, $f_I(x,t)$ and $f_R(x,t)$ the distribution of competence at time $t$ of susceptible, exposed, infectious and removed individuals, respectively. We neglect natality and mortality dynamics, since we consider a short time scale over which nobody enters or leaves the population during the spreading of the fake news. This assumption can be justified by the average lifespan of fake news. Therefore, we can fix the total distribution of competence of the society, $f(x,t) = \sum_{J \in \mathcal{C}} f_J(x,t)$, to be a probability density for all $t \ge 0$, that is, $\int_{\mathbb{R}_+} f(x,t)\,dx = 1$. Consequently, the quantities $J(t) = \int_{\mathbb{R}_+} f_J(x,t)\,dx$, $J \in \mathcal{C}$, denote the fractions of the population that are susceptible, exposed, infected, or recovered, respectively, at time $t$. We also denote with $m_{r,J}(t)$ the moment of order $r > 0$ of the distribution $f_J$, $J \in \mathcal{C}$. Unambiguously, we will indicate with $m_J(t)$, $J \in \mathcal{C}$, the mean values corresponding to $r = 1$.

Competence and learning in multi-agent systems

Drawing inspiration from seminal models for multi-agent systems in the presence of personal competence [29, 31], we introduce a binary interaction term expressing two different processes:

(i) learning processes, in which less competent agents can learn from more competent ones;

(ii) the dependence of the competence evolution on the social background in which individuals grow.

The dynamics described at point (i) can be sketched by the following process: if two agents belonging to compartments H and J, and characterized by competence levels $x$ and $x_*$, meet, their post-interaction competences are given by the binary rule (1). In (1), a first set of coefficients quantifies the amount of competence lost by individuals of compartment H through the natural process of forgetfulness, and a second set of coefficients models the competence gained through the interaction with members of the class J. A possible choice for the latter is a constant multiplied by the characteristic function of the competences above a given threshold, where the threshold is the minimum level of competence required for agents to increase their own skills through interactions. The rule (1) also contains two centered, independent and identically distributed random variables. We suppose that the process defined in (ii) takes place on a different time scale from that of the interactions between agents. In particular, unlike [19], we assume that the time scale of online interactions for competence formation is faster than that of the interactions with the social background. To this end, we will consider advection terms that will be defined in the next section.

Remark 1

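It is reasonable to assume that both the gain and the loss of competence from the interaction with other agents in (1) are bounded below by zero, so that post-interaction competences remain non-negative. We therefore suppose that the coefficients and the random variables in (1) satisfy suitable bounds; the random variables may, for example, be uniformly distributed on a bounded interval.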
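The binary learning rule and the bound discussed in Remark 1 can be illustrated with a small sketch. Since rule (1) is not reproduced in this text, the code assumes a common form from the competence literature: a forgetting term, a gain term that is active only above a competence threshold, and a centered uniform noise. The function name `binary_interaction` and the parameters `lam`, `lam_c`, `x_bar` and `noise_halfwidth` are hypothetical names chosen for illustration, and the rule itself should be read as an assumption rather than as the authors' exact formula.

```python
import random


def binary_interaction(x, x_star, lam, lam_c, x_bar, noise_halfwidth, rng):
    """Return the post-interaction competences (x', x_star') of two agents.

    Assumed rule (not taken verbatim from the paper):
        x' = (1 - lam) * x + gain(x) * x_star + kappa * x
    where gain(x) = lam_c if x >= x_bar else 0, and kappa is a centered
    random variable, uniform on [-noise_halfwidth, noise_halfwidth].
    """
    # Keeping the noise within [-(1 - lam), 1 - lam] guarantees that
    # (1 - lam + kappa) >= 0, so competences cannot become negative.
    if not 0.0 <= noise_halfwidth <= 1.0 - lam:
        raise ValueError("noise_halfwidth must lie in [0, 1 - lam]")

    def update(own, other):
        gain = lam_c if own >= x_bar else 0.0  # learning needs a minimum competence
        kappa = rng.uniform(-noise_halfwidth, noise_halfwidth)
        return (1.0 - lam) * own + gain * other + kappa * own

    return update(x, x_star), update(x_star, x)
```

With `noise_halfwidth=0.0` the update is deterministic, which makes it easy to check the forgetting and gain terms by hand before switching the noise on.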

Fake news spreading in presence of a social feature

Following [19] we choose to describe the dissemination of fake news through a population of agents via a kinetic compartmental model. In this setting, the spreading dynamics alone can be described by the system of ODEs (2). Borrowing from the consolidated epidemiological tradition, we will refer to it as the SEIR model. System (2) describes the evolution of the mass fractions of the population belonging to each compartment for each time $t \ge 0$. The parameters appearing in system (2) are presented in Table 1, and a schematic representation of system (2) is given in Fig. 1. The last equation of system (2) expresses the fact that, as specified at the beginning of the section, the total mass of the population is preserved.
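As a rough illustration of the compartmental dynamics, the sketch below integrates an SEIR model with reinfection using a classical fourth-order Runge-Kutta step. The right-hand side is an assumption: it is one form consistent with the parameter meanings in Table 1 (contact rate β, decision rate δ, probability η of not spreading, removal rate γ, probability α of remembering), not a verbatim copy of system (2). The function names and the numerical values in the usage note are illustrative only.

```python
def seir_rhs(state, beta, delta, eta, gamma, alpha):
    """Right-hand side of an SEIR model with reinfection (assumed form)."""
    s, e, i, r = state
    ds = -beta * s * i + (1.0 - alpha) * gamma * i  # spreaders who forget return to S
    de = beta * s * i - delta * e                   # new exposures minus decisions taken
    di = delta * (1.0 - eta) * e - gamma * i        # exposed agents who decide to spread
    dr = delta * eta * e + alpha * gamma * i        # skeptics plus spreaders who remember
    return (ds, de, di, dr)


def rk4_step(state, dt, **params):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, h):
        return tuple(b + h * k for b, k in zip(base, slope))

    k1 = seir_rhs(state, **params)
    k2 = seir_rhs(shifted(state, k1, 0.5 * dt), **params)
    k3 = seir_rhs(shifted(state, k2, 0.5 * dt), **params)
    k4 = seir_rhs(shifted(state, k3, dt), **params)
    return tuple(
        y + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for y, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(state, dt, n_steps, **params):
    """Return the list of states (S, E, I, R) at times 0, dt, ..., n_steps * dt."""
    trajectory = [state]
    for _ in range(n_steps):
        state = rk4_step(state, dt, **params)
        trajectory.append(state)
    return trajectory
```

A call such as `simulate((0.98, 0.0, 0.02, 0.0), 0.1, 500, beta=0.5, delta=0.3, eta=0.5, gamma=0.1, alpha=0.9)` uses made-up parameter values. Because the four right-hand sides sum to zero, the total mass S + E + I + R stays equal to one along the trajectory, which mirrors the mass conservation stated in the text.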

Table 1

Parameters definition in the SEIR model (3)

Parameter	Definition
β	Contact rate between susceptible and infected individuals
1/δ	Average decision time on whether or not to spread fake news
η	Probability of deciding not to spread fake news
1/γ	Average duration of a fake news
α	Probability of remembering fake news

Fig. 1

Dissemination dynamics for the SEIR model (2)

The combination of the learning mechanisms presented in the previous subsection with the spreading of fake news is described by the kinetic model (3), which includes a parameter describing the intensity of the interactions. In (3) the functional (4) is the local incidence rate and $\kappa(x, x_*)$ is a nonnegative contact function measuring the impact of competence on the spreading of fake news. This function is decreasing with respect to the competences of the susceptible and infected agents. In the following we will investigate the macroscopic effects of the following two choices of $\kappa$:

(A) Strong competence-based contact function: $\kappa(x, x_*) = \beta/(x\, x_*)$, with $\beta > 0$.

(B) Weak competence-based contact function: $\kappa(x, x_*) = \beta e^{-x - x_*}$, with $\beta > 0$.

The two functions are both decreasing but differ strongly for small competences. Indeed, since (A) is unbounded for small competences, it favors the spreading of fake news among less competent agents compared with (B), whose contact function is bounded on $\mathbb{R}_+$. We further remark that individuals have the highest rates of contact with people belonging to the same social class, and thus with a similar level of competence.

Remark 2

In (3) the parameters $\delta$, $\eta$, $\gamma$ and $\alpha$ are, in general, considered to depend on the competence level $x$. This reflects the fact that competence plays a role in the dissemination of fake news. Furthermore, the interaction operators in (3) describe the binary collisions (1) and determine the thermalization of the competence distribution of each compartment, while the advection terms in (3) model the influence of the social background on the competence dynamics. It is worth observing that the evolution of the mass fractions J(t) obeys the classical SEIR model with reinfection (2) when the contact function and the parameters are chosen to be constant, that is, independent of $x$. This corresponds to considering the spreading of fake news as independent of the competence level of the agents. In more detail, we consider the interaction operators to be integral operators that modify the competence distribution through repeated interactions of type (1) among individuals. These operators are defined in weak form in (5), through their action on a test function, where the brackets $\langle \cdot \rangle$ indicate the expectation with respect to the random variables appearing in (1). In the model (3) the function $\gamma(x)$ determines the duration of the fake news and can be strongly influenced by the competence level of the spreader. Furthermore, the function $\delta(x)$ is related to the average time that an agent spends before spreading a piece of fake news, since people with high competence invest more time in checking the reliability of information, and $\eta(x)$ characterizes individuals’ decision to spread fake news. The function $\alpha(x)$ describes the probability of remembering fake news and can be thought of as less influenced by the competence variable. Table 1 summarizes all the introduced parameters.

Asymptotic states of the learning process

We focus now on the learning dynamics introduced in model (3) whose evolution is given by the nonlinear operators , , defined in (5). We concentrate in particular on the analysis of asymptotic states of the learning dynamics undergoing elementary interactions (1). We are therefore interested in the asymptotic distribution of the Boltzmann-type modelIt is easily observed that if the mass is conserved in (6) corresponding to the conservation of the total number of agents. If in (6) we obtain the evolution of the average competence in each compartment that is not conserved in timeand the total competence is conservedSince the steady state solution of (6) is difficult to obtain, we can formally derive a simplified Fokker–Planck model in which the study of the asymptotic properties is much easier. To this end, we introduce the following quasi-invariant scaling of the relevant parameter of the binary scheme (1) given bywith . It is worth to mention that the introduced scaling is inspired by the so-called grazing collision limit of the Boltzmann equation, see [6, 36]. In the context of multi-agent systems this scaling has been introduced in [8, 33]. In the introduced regime of parameters the interactions become quasi-invariant, in the sense that the post-interaction competences are such that and are small for . Hence, assuming , we can perform the following Taylor expansionwith . Hence, in the time scale we havewhere we exploited the fact that and we have defined the sum of reminder termsWe may observe that, assuming , then we may write , where we introduced the centered random variable with unitary variance and such that . Therefore, and, under the scaling (7), we get . 
Hence, under the above assumption, proceeding as in [8] we can prove that for Therefore in the new time scale, for and under the quasi-invariant scaling (7), we can show that the solution of model (6) converges toIntegrating back by parts we have obtainedwith , coupled with the following boundary conditionsandfor all . Assuming then independent by for all the steady states , are solution ofwhere is a conserved quantity as we already observed. Hence, we obtain that the large time distribution is an inverse GammawhereNow, we highlight that for we have , which means thatIn view of we conclude that under the introduced assumptions
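To make the inverse Gamma equilibrium concrete, the following sketch evaluates an inverse Gamma density and picks its scale so that the distribution has a prescribed mean. The shape `mu` and scale `k` are generic parameters here; how they depend on the coefficients of the binary interaction is not reproduced in this text, so that mapping is deliberately left out. The helper names are hypothetical.

```python
import math


def inv_gamma_pdf(x, mu, k):
    """Inverse Gamma density with shape mu > 0 and scale k > 0, evaluated at x."""
    if x <= 0.0:
        return 0.0
    # Work in log space to avoid overflow for very small x.
    log_pdf = mu * math.log(k) - math.lgamma(mu) - (mu + 1.0) * math.log(x) - k / x
    return math.exp(log_pdf)


def scale_from_mean(mean, mu):
    """Scale k such that the inverse Gamma mean k / (mu - 1) equals `mean`."""
    if mu <= 1.0:
        raise ValueError("the mean is finite only for mu > 1")
    return (mu - 1.0) * mean


def midpoint_integral(func, a, b, n):
    """Composite midpoint rule for the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (j + 0.5) * h) for j in range(n))
```

For example, with `mu = 3.0` and `k = scale_from_mean(1.0, 3.0)`, integrating `inv_gamma_pdf` and `x * inv_gamma_pdf` over a long interval such as [0, 200] should give values close to 1 for both the mass and the mean. The polynomial tail of the density is the reason a long interval is needed.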

Reduced order models for fake news spread with competence

Once we have characterized the equilibrium distribution of the interaction operators, we can study the complete system (3). The aim of this section is to define observable macroscopic equations for the introduced kinetic model. Integrating both sides of (3) with respect to $x$ and recalling that the introduced operators preserve mass and momentum, we obtain a system for the evolution of the mass fractions J(t), given in (10), and a corresponding system for the momentum, given in (11). The resulting system is not closed, since the evolution of the mass fractions J(t) and of the momentum depends on the evolution of the full distribution functions. The closure can be obtained formally by resorting to a limit procedure. Indeed, assuming that the time scale of competence formation is much shorter than that of fake news spreading, the learning process of the agents is fast with respect to the evolution of the spreading of fake news. Therefore, the distribution function of each compartment quickly reaches the inverse Gamma equilibrium with mass fraction J(t) and the corresponding local mean value. In the following we obtain two different sets of macroscopic equations, depending on the contact function considered.

Social closure with a strong competence-based contact function

We consider the case (A) introduced in Sect. 2.2 corresponding to a strong competence-based contact function defined by , . We havewhereTherefore, in the limit we can plug in (13) which becomesthanks to the properties of the inverse Gamma distribution, leading toNext, looking at (11), recalling that under the hypothesis that for , the knowledge exchange operator also preserves momentum, we have the following system of equationswhich, using the fact thatimpliesthat is, we obtained a closed system of eight ordinary differential equations (15), (17).
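The effect of the closure on the strong contact function can be illustrated with a short worked computation. It assumes the contact function $\kappa(x,x_*)=\beta/(x\,x_*)$ listed in Table 2 and a local equilibrium of inverse Gamma type. The symbols $f_J^\infty$, $m_J$, $\mu$ and $k_J$ are notation chosen here for illustration, and the computation uses only the standard fact that an inverse Gamma variable with shape $\mu$ and scale $k$ has $\mathbb{E}[1/X]=\mu/k$. It is a sketch under these assumptions, not a transcription of the omitted equations.

```latex
% Assumed notation: f_J^\infty is the local equilibrium of compartment J,
% with mass J(t), mean m_J(t), shape \mu > 1 and scale k_J = (\mu - 1) m_J(t).
\begin{align*}
f_J^\infty(x,t) &= J(t)\,\frac{k_J^{\mu}}{\Gamma(\mu)}\,
  \frac{e^{-k_J/x}}{x^{1+\mu}},
  \qquad k_J = (\mu-1)\,m_J(t),\\[4pt]
\int_{\mathbb{R}_+} \frac{1}{x}\,f_J^\infty(x,t)\,dx
  &= J(t)\,\frac{\mu}{k_J}
   = \frac{\mu}{\mu-1}\,\frac{J(t)}{m_J(t)},\\[4pt]
\int_{\mathbb{R}_+}\!\int_{\mathbb{R}_+} \frac{\beta}{x\,x_*}\,
  f_S^\infty(x,t)\,f_I^\infty(x_*,t)\,dx_*\,dx
  &= \beta\left(\frac{\mu}{\mu-1}\right)^{2}
     \frac{S(t)\,I(t)}{m_S(t)\,m_I(t)}.
\end{align*}
```

Under these assumptions the integrated incidence depends only on the mass fractions and the mean competences, which is what allows a closed system of ordinary differential equations. A lower mean competence among susceptible or infectious agents increases the effective contact rate.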

Social closure with a weak competence-based contact function

If, instead, we consider the case B) of Sect. 2.2, corresponding to the weak competence-based contact function defined by , it is possible to writeAs discussed in Sect. 3.1, in the limit we may plug the asymptotic distribution of the Fokker–Planck model (9) in (18) to obtainwhere stands for the modified Bessel function of the second kind of order a evaluated at x. Hence, if we consider system (10) under the assumption of weak competence-based contact function we obtainwhich becomesThe next equation will help to close the systemwhich is a straightforward consequence of the following property of the modified Bessel functions of the second kindAgain under the assumptions that for , integrating with respect to x Eq. (11), with the aid of Eq. (21), we getwhich, using again the fact thatleads to
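The appearance of the modified Bessel function in the weak case can be traced to a standard integral identity. The computation below assumes the contact function $\kappa(x,x_*)=\beta e^{-x-x_*}$ listed in Table 2 and an inverse Gamma equilibrium with shape $\mu$ and scale $k_J$. The symbols $f_J^\infty$, $\mu$ and $k_J$ are again notation chosen here, so this is an illustrative sketch rather than the omitted formula itself.

```latex
% Standard identity for k > 0 (K_\mu is the modified Bessel function of the
% second kind), followed by its application to an inverse Gamma equilibrium.
\begin{align*}
\int_0^{\infty} x^{-\mu-1}\,e^{-k/x - x}\,dx
  &= 2\,k^{-\mu/2}\,K_{\mu}\!\left(2\sqrt{k}\right),\\[4pt]
\int_{\mathbb{R}_+} e^{-x}\,f_J^\infty(x,t)\,dx
  &= J(t)\,\frac{k_J^{\mu}}{\Gamma(\mu)}
     \int_0^{\infty} x^{-\mu-1}\,e^{-k_J/x - x}\,dx
   = J(t)\,\frac{2\,k_J^{\mu/2}}{\Gamma(\mu)}\,
     K_{\mu}\!\left(2\sqrt{k_J}\right).
\end{align*}
```

Because the contact function factorizes as $\beta e^{-x} e^{-x_*}$, the integrated incidence is, under these assumptions, $\beta$ times the product of two such factors, one for the susceptible and one for the infectious compartment.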

Examples and applications

In this section we numerically validate the modeling framework proposed in (3) with local incidence rate (4) in the settings (A) and (B). We stress that these forms of the contact function generate different macroscopic models, defined in (15), (17) and (20), (23), respectively. Having established the consistency of the approach, we then exploit the macroscopic sets of equations for calibration purposes, based on a freely available repository for the spreading of hashtags linked to known fake news. The proposed data-oriented approach is fundamental to observing experimentally how the choice of contact function affects the estimated impact of competence on fake news dynamics. From the numerical point of view, we use an implicit structure-preserving method for the Fokker–Planck operator (9), based on the schemes presented in [32]. The advantage of these methods lies in an arbitrarily accurate description of the steady-state distribution of the Fokker–Planck model of interest. Similar approaches have been investigated in different contexts in [12, 13, 30].

Test 1: Validation of the social closure

In this first test we compare the evolution of mass fractions J(t) and means , , obtained from direct integration of , solution to (3), with respect to the competence , with the macroscopic models (15), (17) and (20)–(23) for several regimes of . We start by outlining the procedure by which we solve the system of kinetic equations (3) with Fokker–Planck interaction operators. Since is assumed to be small, we adopt a time splitting procedure. In particular, upon introducing a time discretization , constant, we proceed as follows. I. Fokker–Planck solver. At time , we determine the distributions for all solution towhere is the Fokker–Planck operator defined in Sect. 2.3 whose form, in the hypothesis , is given byIn this step we take advantage of an implicit structure preserving (SP) scheme for Fokker–Planck equations [32] and describes with arbitrary accuracy the steady state of the model. In Fig. 2 we report for several the numerical solution of the Fokker–Planck model in the time interval [0, T], , obtained from a discretization of the domain [0, 4] with grid points and with . We may observe that the scheme is capable to approximate the inverse Gamma analytical equilibrium . We also report the evolution of the numerical error computed as in the time frame [0, 2] from which we can observe how for sufficiently small values of we correctly approximate the given equilibrium distribution.

Fig. 2

Test 1. Left: numerical distribution obtained with SP implicit scheme at time for several , , , . Right: evolution of the error . In all the tests we considered a discretization of the domain [0, 4] obtained with grid points and

II. Advection-Reaction step. Hence, we consider the distribution obtained in the interaction step as an input for the advection-reaction dynamics for In particular, we adopted a second-order Lax-Wendroff scheme coupled with an explicit time integration. In the tests of this subsection, unless otherwise specified, we prescribe as initial datum the distribution (24), with the initial mass fractions (25). We consider the choice of parameters , and for (4.1). The fake news dynamics is regulated by the following choice of parameters , , , and . For contact rates (A) and (B) we compared the evolution of mass fractions and mean values obtained from the integration of (3) with the ones derived in Sect. 3. We consider the time interval [0, T], , a uniform time discretization with and , . In particular, Fig. 3 refers to the case and Fig. 4 to the case . In both cases we may observe that for small values of the obtained macroscopic models are accurate in describing the trends of observable quantities of the kinetic model. The macroscopic systems of coupled ODEs have been solved through an RK4 numerical scheme with .

Fig. 3

Fig. 4

Test 1. Evolution of mass fractions obtained from direct integration of the kinetic model (3) in the case , for , together with the evolution of mass fractions of the macroscopic model (15)–(17). In both cases we considered , , , . The kinetic model has been solved through the scheme I–II over the domain [0, 4], discretization obtained with grid points and . The initial distribution of the kinetic model has been defined in (24)–(25)

Test 1. Evolution of mass fractions (left) and mean values (right) obtained from direct integration of the kinetic model (3) in the case , for , together with the evolution of mass fractions of the macroscopic model (15)–(17). In both cases we considered , , , . The kinetic model has been solved through the scheme I–II over the domain [0, 4], discretization obtained with grid points and . The initial distribution of the kinetic model has been defined in (24)–(25)

Test 2: A data driven application to Twitter

In this test we focus on the spreading of fake news by considering available Twitter data from the repository TweetSets (see footnote 1). In detail, we analyzed the evolution from March to November 2020 of the hashtag #facemask, related to the COVID-19 pandemic, and of the hashtags #hurricaneflorence#fakenews, both associated with Hurricane Florence of September 2018, which caused catastrophic damage in the USA, particularly in the states of North Carolina and South Carolina. In the following we will assume that the competence variable is strongly related to the education level of a country. The data for the initial distribution of education were extrapolated from the available Italian data of the 2011 ISTAT census, and have been considered as representative of a prototypical Western country [21]. As underlined in [21], the cumulative distribution of education exhibits a power-law type of tail. For this reason, as an approximation of the competence distribution we considered an inverse Gamma of the form (26), with parameters obtained by data fitting. More precisely, we measure the education level on the scale [0, 6], where 6 represents the education of people with a PhD (see Fig. 5).

Fig. 5

Test 2. Competence distribution and its inverse Gamma approximation f(x) (26) corresponding to , and leading to a mean competence background of . Data refers to 2011 Italian census and are used as representative of a prototypical Western country. On the x-axis we indicated with (1) lower secondary education, (2) upper secondary education, (3) undergraduate, (4) master, (5) second level master, (6) doctorate

Test 2A: Fitting the model to data

Once we have obtained the initial competence distribution together with the mean background competence, we can estimate the parameters of the models defined in (15)–(17) and (20)–(23). Several approaches have been proposed in the literature, see e.g. [25]. It is worth mentioning that several uncertainties are present in data linked to news monitoring. For example, the total population size is generally unknown, and the total number of Twitter accounts represents only an upper bound on the number of real active users. The approach adopted in [16], and subsequently in [17, 26], is to treat this quantity as a parameter to be determined in the minimization process along with the parameters of the models. To reduce the number of parameters to optimize, we follow a different path. In particular, as an initial guess of the total population size, since the datasets used for the fitting were based on U.S. hashtags, we considered that each fake-news spreader has on average 453 followers (see footnote 2). Hence, on average, we may expect the total number of susceptible individuals to be given by the total number of tweets multiplied by the average number of followers. To account both for the number of bots on Twitter, as found in [35] (and references therein), and for users whose activity may not be assiduous enough to matter during the lifespan of the considered fake news, the initial guess was further reduced by a factor of 4. We compare the number of active spreaders obtained from the data with I(t), the number of infectious agents given by the macroscopic differential model, through a cost functional defined over the time frame (in hours) in which we solve the minimization problem for the parameters $\alpha$, $\beta$, $\delta$ and $\gamma$, whereas $\eta$ was kept fixed and equal to 0.5.
Since data for the evolution of the compartments S, E and R are not at our disposal, and neither are the initial mean values for any of the compartments, we solved the ODE model on a time interval that begins at a suitable unknown time preceding the starting point of the spreading process, starting from single exposed, infectious and recovered individuals. The idea is to simulate an initial situation from which the spread of fake news can develop. Furthermore, we considered initial mean values equal to half of the mean of the background distribution of competence. In Fig. 6 we compare the evolution of the number of tweets regarding the hashtag #facemask, from 3 March 2020 to 22 November 2020, and the hashtag #florence#fakenews, from 11 September 2018 to 4 October 2018, with the evolution of the model (15), (17). The obtained parameters are reported in Table 2.
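The initial guess for the population size and the comparison between model and data can be sketched in a few lines. The figures 453 (average followers) and 4 (reduction factor) come from the text; the sum-of-squares form of the cost is an assumption, since the cost functional itself is not reproduced here, and both function names are hypothetical.

```python
def initial_population_guess(n_tweets, avg_followers=453.0, reduction_factor=4.0):
    """Initial guess of the susceptible population size.

    Total tweets times the average number of followers, reduced by a factor
    that accounts for bots and for users who are not active enough to matter.
    """
    return n_tweets * avg_followers / reduction_factor


def least_squares_cost(model_infectious, observed_spreaders):
    """Sum of squared differences between model I(t) and observed spreaders.

    Assumed form: the paper's cost functional may use a different norm
    or normalization.
    """
    if len(model_infectious) != len(observed_spreaders):
        raise ValueError("model and data must be sampled at the same times")
    return sum((m - d) ** 2 for m, d in zip(model_infectious, observed_spreaders))
```

In a fitting loop, `least_squares_cost` would be evaluated on the infectious compartment produced by the reduced model for each candidate parameter set and handed to a minimizer of choice.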

Fig. 6

Test 2A. Top row: optimization results for #facemask. Bottom row: optimization results for #florence#fakenews. From right to left: approximation of raw data on the number of tweets, cumulative distribution and evolution of the mean competence for each compartment

Table 2

Test 2A. Estimated parameters for the entire datasets for the hashtags #facemask (second and third column) and #florence#fakenews (fourth and fifth column)

Parameter	#facemask		#florence#fakenews
	κ(x,x∗)=β/(x x∗)	κ(x,x∗)=βe^(−x−x∗)	κ(x,x∗)=β/(x x∗)	κ(x,x∗)=βe^(−x−x∗)
α	0.9995	0.9993	1.0000	0.9999
β	0.0122	0.2937	0.0901	0.9999
δ	0.0237	0.0336	1.0000	0.1930
γ	0.0046	0.0079	0.2127	0.9999

In both cases, we may observe that the evolution of the mean competence level differs across the four compartments and, in particular, that low competence levels are associated with exposed and infectious agents, i.e., the active spreaders. The outcome reflects the intuitive idea that disinformation could be driven, in the first place, by the inability to recognize a piece of information as purposely false. To better take into account the impact of a competence-based contact rate function, we also computed the associated basic reproduction number using the parameters estimated previously for both datasets, reported in Table 2. Following [4, 19], and omitting the details for brevity, we consider a generalized version of the classical reproduction number, where again we leveraged the structure-preserving scheme proposed in [32] to perform the calculations (see Fig. 7).
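To make the role of the competence-based contact rate more concrete, the two contact functions listed in the header of Table 2 can be evaluated directly. The following is only a minimal sketch: the function names and the sample competence values are illustrative assumptions, and the β values are the ones reported in Table 2 for #facemask. It does not reproduce the reproduction number of Fig. 7.

```python
import math


def kappa_inverse(x, x_star, beta):
    """Contact rate kappa(x, x*) = beta / (x * x*), for positive competences."""
    if x <= 0 or x_star <= 0:
        raise ValueError("competence values must be positive")
    return beta / (x * x_star)


def kappa_exponential(x, x_star, beta):
    """Contact rate kappa(x, x*) = beta * exp(-x - x*)."""
    return beta * math.exp(-x - x_star)


# beta values reported in Table 2 for #facemask (one per contact function)
BETA_INVERSE = 0.0122
BETA_EXPONENTIAL = 0.2937

# Sample competence levels (illustrative assumption, not taken from the paper)
for x in (0.5, 1.0, 2.0, 4.0):
    inv = kappa_inverse(x, x, BETA_INVERSE)
    exp_ = kappa_exponential(x, x, BETA_EXPONENTIAL)
    print(f"x = x* = {x:3.1f}  beta/(x x*) = {inv:.5f}  beta e^(-x-x*) = {exp_:.5f}")
```

Both functions decrease as the competence of either agent grows, which is consistent with the observation that low competence levels are associated with the active spreaders.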

Fig. 7

Test 2A. Evolution of the reproduction number in the first 24 h of datasets #facemask (left) and #florence#fakenews (right) for the parameters estimated in Table 2 relative to the introduced contact functions


Test 2B: Forecasting under data uncertainties

To analyze the impact of uncertainties in data and parameters we consider a three-dimensional random vector with independent components, so that its joint distribution is the product of the distributions of the single components. Taking parametric uncertainties into account, we consider uncertain versions of the estimated model parameters, see (29). As a result, the macroscopic quantities describing the evolution of the compartments are affected by the introduced uncertainties, which increase their dimensionality. To handle these uncertainties efficiently, we adopt a stochastic collocation approach based on stochastic Galerkin methods; we refer the interested reader to [38] for an introduction and to [2, 39] for applications in compartmental modelling of epidemic dynamics. This class of methods makes it possible to accurately quantify the propagation of stochasticity in a parametric differential model when information on the distribution of the uncertainties is available. We remark that fast convergence properties hold under suitable regularity assumptions on the problem’s solution. In detail, we construct a three-dimensional sample of the random vector, obtained in a collocation setting through the nodes of Gauss–Legendre polynomials.

In Fig. 8 we display the dynamics of the considered fake news with respect to the available data. In detail, for #florence#fakenews we consider the period from September 11th 2018 to September 21st 2018. We consider two successive prediction horizons: a 1-day horizon, where the parameters of the model are calibrated on data until September 19th, and a 2-day horizon, where the calibration is based only on data until September 18th. For #facemask we considered the period from March 3rd to May 17th. Also in this case we consider two successive prediction horizons: a 1-week horizon, where the parameters of the model are calibrated on data until May 3rd, and a 2-week horizon, where the calibration is based on data until May 10th.
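The collocation idea with Gauss–Legendre nodes can be illustrated in one dimension. The sketch below is not the scheme used for model (15), (17): it assumes a single uncertain parameter that is uniformly distributed on an interval, and it replaces the compartmental model with a hypothetical exponential decay so that the result can be checked against a closed form. The names `collocation_moments` and `toy_decay` are introduced here for illustration only.

```python
import numpy as np


def collocation_moments(model, mean, half_width, n_nodes=5):
    """Mean and variance of model(theta), theta ~ Uniform(mean - hw, mean + hw).

    The model is evaluated only at the Gauss-Legendre nodes mapped to the
    parameter interval; the moments are weighted sums of those evaluations.
    """
    nodes, weights = np.polynomial.legendre.leggauss(n_nodes)  # nodes on [-1, 1]
    thetas = mean + half_width * nodes
    values = np.array([model(theta) for theta in thetas])
    probs = weights / 2.0  # the uniform density on [-1, 1] is 1/2
    expected = np.sum(probs * values, axis=0)
    variance = np.sum(probs * (values - expected) ** 2, axis=0)
    return expected, variance


def toy_decay(gamma, i0=100.0, t=10.0):
    """Hypothetical stand-in for the model output: I(t) = i0 * exp(-gamma * t)."""
    return i0 * np.exp(-gamma * t)


expected, variance = collocation_moments(toy_decay, mean=0.2, half_width=0.05)
std = np.sqrt(variance)
print(f"expected value: {expected:.4f}, standard deviation: {std:.4f}")
```

With several independent uncertain parameters, the same construction would be repeated on a tensor grid of nodes, one set per parameter; shaded bands such as those in a forecast plot can then be drawn from the mean and the spread computed at each time.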

Fig. 8

Test 2B. Top row: comparison between 14 days (May 3rd–May 14th, 2020) and 7 days (May 3rd–May 10th, 2020) predictions of #facemask based on the model (15), (17) with uncertain parameter (29). Bottom row: comparison between 24 h (September 18th–September 19th, 2018) and 48 h (September 18th–September 20th, 2018) predictions of #florence#fakenews based on the model (15), (17) with uncertain parameter (29)

We highlight in dashed black and magenta the expected value of the predicted number of tweets. Together with the expected trends we plot the confidence intervals (CI) with respect to the three random parameters. The blue, green and red shaded bands each show the variability due to one of the three random parameters.

Test 3: Competence background in misinformation

In this test we perform a retrospective analysis to study how the background could influence the dissemination of fake news as a result of a different learning process. We recall that the background modifies, through a learning dynamic, the effectiveness of the level of knowledge in identifying fake news. As a consequence, high values of the background correspond to a high level of effectiveness of the competence, while low values make it difficult to identify fake news. Indirectly, the background acts as a control term which limits the spread of misinformation. This can also be interpreted as a process of education specific to the identification of fake news, which makes it possible to limit the so-called knowledge neglect phenomenon [18]. We consider the two datasets for the hashtags #facemask and #florence#fakenews with the estimated parameters reported in Table 2, and we increase the value of the competence level attained by the background while keeping the parameters fixed during the dynamics defined by (15), (17) and (20), (23). We performed the test with both choices of a strong and a weak competence-based contact function; the results are summarized in Fig. 9. In all cases, increasing the competence of the background reduces the spread of fake news, leading to a decrease in the cumulative number of tweets of infectious agents proportional to the increase in the background competence level. The decrease in the overall misinformation is evident for both examples considered, #facemask and #florence#fakenews.

Fig. 9

Test 3. Total number of infectious agents for the hashtags #facemask (left) and #florence#fakenews (right) as a function of the competence background. In both cases, we employed the parameters reported in Table 2

Concluding remarks

Despite the digital transformation of governments and the modernization of public administration, a global decline in democracy is occurring. The spread of fake news created for the purpose of polarizing society in certain directions poses a risk to democratic institutions. The role of individuals’ knowledge and the ability to use it in identifying false information is deemed of paramount importance. In this paper, starting from a model for the description of fake-news dissemination in the presence of heterogeneous agents with different levels of competence, we have used the tools of kinetic theory to derive reduced-order models that retain the effects of competence in the dynamics and that, thanks to their simplified structure, can be interfaced with data. The starting model is inspired by epidemiology, as in much of the literature related to fake news, so it is based on a compartmental structure. The introduction of competence makes it possible to analyze complex phenomena of great relevance in contemporary society, such as the effectiveness of control actions taken to limit the spread of fake news and the role of knowledge neglect in misinformation. The methodology adopted in this article is fully general and depends closely on the equilibrium state of the social variable and on the social interaction function at the basis of fake-news spreading. As a consequence, additional social variables that play a key role in the spread of misinformation may be embedded in the dynamics using similar arguments. Having a model that can be interfaced with the available data allowed us to present some preliminary examples of applications to the case of fake-news spreading on Twitter.

14 in total

1. EPIDEMICS AND RUMOURS.

Authors: D J DALEY; D G KENDALL
Journal: Nature Date: 1964-12-12 Impact factor: 49.962

2. Knowledge does not protect against illusory truth.

Authors: Lisa K Fazio; Nadia M Brashier; B Keith Payne; Elizabeth J Marsh
Journal: J Exp Psychol Gen Date: 2015-08-24

3. Opinion modeling on social media and marketing aspects.

Authors: Giuseppe Toscani; Andrea Tosin; Mattia Zanella
Journal: Phys Rev E Date: 2018-08 Impact factor: 2.529

4. Wealth distribution and collective knowledge: a Boltzmann approach.

Authors: L Pareschi; G Toscani
Journal: Philos Trans A Math Phys Eng Sci Date: 2014-11-13 Impact factor: 4.226

5. The spread of true and false news online.

Authors: Soroush Vosoughi; Deb Roy; Sinan Aral
Journal: Science Date: 2018-03-09 Impact factor: 47.728

Review 6. The reproductive number of COVID-19 is higher compared to SARS coronavirus.

Authors: Ying Liu; Albert A Gayle; Annelies Wilder-Smith; Joacim Rocklöv
Journal: J Travel Med Date: 2020-03-13 Impact factor: 8.490

7. Optimal control of epidemic spreading in the presence of social heterogeneity.

Authors: G Dimarco; G Toscani; M Zanella
Journal: Philos Trans A Math Phys Eng Sci Date: 2022-04-11 Impact factor: 4.226

8. Control with uncertain data of socially structured compartmental epidemic models.

Authors: Giacomo Albi; Lorenzo Pareschi; Mattia Zanella
Journal: J Math Biol Date: 2021-05-23 Impact factor: 2.259

9. A digital media literacy intervention increases discernment between mainstream and false news in the United States and India.

Authors: Andrew M Guess; Michael Lerner; Benjamin Lyons; Jacob M Montgomery; Brendan Nyhan; Jason Reifler; Neelanjan Sircar
Journal: Proc Natl Acad Sci U S A Date: 2020-06-22 Impact factor: 11.205

10. Kinetic models for epidemic dynamics with social heterogeneity.

Authors: G Dimarco; B Perthame; G Toscani; M Zanella
Journal: J Math Biol Date: 2021-06-26 Impact factor: 2.164