
Understanding the spread of COVID-19 misinformation on social media: The effects of topics and a political leader's nudge.

Xiangyu Wang¹, Min Zhang¹, Weiguo Fan², Kang Zhao².

Abstract

The spread of misinformation on social media has become a major societal issue in recent years. In this work, we used the ongoing COVID-19 pandemic as a case study to systematically investigate factors associated with the spread of multi-topic misinformation related to one event on social media, based on the heuristic-systematic model. Among factors related to systematic processing of information, we discovered that the topics of a misinformation story matter, with conspiracy theories being the most likely to be retweeted. As for factors related to heuristic processing of information, such as when citizens look up to their leaders during such a crisis, our results demonstrated that behaviors of a political leader, former US President Donald J. Trump, may have nudged people's sharing of COVID-19 misinformation. Outcomes of this study help social media platforms and users better understand and prevent the spread of misinformation on social media.


Year: 2021 PMID: 34901312 PMCID: PMC8653058 DOI: 10.1002/asi.24576

Source DB: PubMed Journal: J Assoc Inf Sci Technol ISSN: 2330-1635 Impact factor: 3.275

INTRODUCTION

Social media has changed the way people receive and share information. According to the Pew Research Center, the influence of social media has outpaced traditional news outlets with 68% of US adults using social media as their primary sources of news (Elisa, 2018; Elisa & Katerina Eva, 2018). The ubiquity of social media has made them a major channel for political, business, philanthropic, or health campaigns (Courtney, 2013; Gainous & Wagner, 2013). In particular, the value of social media has been highlighted during emergencies or crises such as mass shootings, wildfires, and terrorist attacks (Vosoughi et al., 2018). Due to the popularity of social media, political leaders have strategically used them for campaigning, advocacy, and fundraising (Hemsley, 2019; Kreis, 2017; Petrova et al., 2020). Some political leaders become influential on social media, with millions of followers who actively seek the up‐to‐date information these leaders share (Parmelee & Bichard, 2011). Thus, they have strong power to disproportionately impact the spread of information and influence their followers' behaviors (Parmelee & Bichard, 2011), and even to divert their attention from mainstream media (Lewandowsky et al., 2020). At the same time, social media have also become a hotbed for misinformation, that is, information that is false or inaccurate (Scheufele & Krause, 2019; Wang et al., 2019) and sometimes referred to as fake news (Cummings & Kong, 2019; Lazer et al., 2018). Misinformation has been circulated widely on social media (Allcott & Gentzkow, 2017). Such widespread misinformation on social media has had a great impact on various aspects of our society, including public elections, financial markets, environmental protection, violent uprisings, and so on (Kim et al., 2019; Lazer et al., 2018). 
Therefore, it becomes urgent for us to understand the dynamics of misinformation on social media, so that we can better promote accurate information, deter the spread of misinformation, and mitigate its negative effects on our society. Since early 2020, the world has been shocked by the COVID‐19 pandemic and witnessed the surge of related misinformation on social media. In panic, people rushed online to find and share information about COVID‐19, especially when their physical mobility became limited due to lockdown measures. During March and April 2020, COVID‐19‐related searches on Google exceeded those for other news and weather by a large margin (GoogleTrends, 2020) and the number of tweets about COVID‐19 reached more than 500 million during the same period. Unfortunately, a large amount of misinformation emerged as well, with about 50% of Americans having encountered fabricated information about COVID‐19 online (Amy & Baxter, 2020) and many falling victim to such misinformation (Whelan et al., 2019). However, when people look up to their political leaders during the crisis, some leaders have been criticized for fueling the spread of misinformation about COVID‐19 on social media. One such example is former US President Donald J. Trump, an avid Twitter user with more than 88 million followers. After being accused of sharing misinformation about the 2020 US presidential election, a large portion of his tweets about the election were labeled as misleading by Twitter in November 2020. Although his account was later suspended by Twitter for incitement of violence, it remains a question if and how he was involved in spreading misinformation about COVID‐19. This study analyzed large‐scale data on how COVID‐19 stories from misinformation sources spread on Twitter, one of the most popular social media platforms. 
Based on the heuristic‐systematic model (HSM) and the nudge theory, our regression model not only examined factors associated with the content and sources of these stories, but also revealed Trump's subtle yet significant role in the process. The contributions of this work are twofold. First, our study represents the first one to systematically investigate the spread of multi‐topic misinformation about COVID‐19 on social media. We hope our results can provide a clear picture of how such a diverse group of misinformation spreads on social media. Second, we showed that certain political leaders' behaviors on social media could have an effect, albeit indirectly, on the spread of misinformation. Findings of our study can help social media operators better identify misinformation posts that are likely to spread. Our outcomes also suggest that both operators and users of social media should be alert to political leaders who are implicitly promoting the spread of misinformation. The remainder of this article is organized as follows: we review related work in Section 2. This is followed by descriptions of our datasets and models in Section 3. After demonstrating and discussing results in Section 4, this article concludes with implications of our findings and future research directions.

BACKGROUND

The detection and diffusion of misinformation

Previous studies on misinformation have focused on two tasks: misinformation detection (Kumar & Geethakumari, 2014; Shu et al., 2017; Zhang et al., 2018) and misinformation diffusion (Allcott & Gentzkow, 2017; Fung et al., 2016; Garrett, 2019; Grinberg et al., 2019; Mosleh et al., 2020; Wood, 2018). Misinformation detection is usually a binary classification task (misinformation/true) or a multi‐class classification task (true/mostly true/half true/mostly false/false) (Sharma et al., 2019). Most existing studies have leveraged the content or context of information (Wu et al., 2019; Zhou & Zafarani, 2018). (a) Content‐based methods mainly include knowledge‐based and style‐based detections. Knowledge‐based detections evaluate the reliability of content, such as evaluating whether the knowledge from text content is false via manual fact‐checking (e.g., expert‐based, or crowd‐sourced fact‐checking) (Grinberg et al., 2019) or automatic fact‐checking through Natural Language Processing. Style‐based detections focus on writing styles. For instance, misinformation usually includes shocking images, skeptical headlines, and exaggerated language to attract people's attention (Baptista & Gradim, 2020). (b) Context‐based detections mainly investigate information propagation patterns and the credibility of its sources. Propagation‐based approach focuses on how misinformation spreads. Epidemic models, cascade models, and generative models have been adopted to characterize misinformation patterns and reduce the process of spreading (Bak‐Coleman et al., 2021; Cinelli et al., 2020; Tambuscio et al., 2015). Researchers have revealed that misinformation spreads faster, deeper, and broader than trusted information (Del Vicario et al., 2016; Vosoughi et al., 2018). Credibility‐based detections examine the credibility of news sources by assessing those who create and spread the news (e.g., social media accounts) (Shu et al., 2017). 
The spread of misinformation can be attributed to the lack of “third‐party filtering, fact‐checking, or editorial judgement” on the internet (Ali & Zain‐ul‐abdin, 2021; Allcott & Gentzkow, 2017; Ennals et al., 2010), including on social media. People believe in and share misinformation for individual and social reasons (Nahon & Hemsley, 2013; Sharma et al., 2019). At the individual level, many people lack the ability to recognize misinformation, especially citizens with lower levels of education (Allen et al., 2020; Bessi et al., 2015; Georgiou et al., 2020; Grinberg et al., 2019; Guess et al., 2019). In contrast, people who have more education and spend more time on media are more likely to discern misinformation (Allcott & Gentzkow, 2017). At the social level, social networking and interactions enabled by social media platforms make it easier for a piece of information to “go viral” (Nahon & Hemsley, 2013; Galuba et al., 2010). During the diffusion process on social media, a group of “influencers” can have the potential power to influence a disproportionately large group of people (Weimann, 1994). Many studies have attempted to identify such influential users on social media and quantify their influence via retweets, follower count, network centralities, or sentiment dynamics (Cha et al., 2010; Dubois & Gaffney, 2014; Kwak et al., 2010; Zhao et al., 2014). Nevertheless, for the spread of political information, having more followers does not necessarily mean a higher level of influence (Hemsley, 2019). The COVID‐19 pandemic became a major event, for which misinformation was prevalent. One reason is that people are more likely to share misinformation during crises (e.g., natural disasters, wars, terrorist attacks, virus outbreaks) (Mosleh et al., 2020; Van Prooijen & Douglas, 2017; Vosoughi et al., 2018). 
Other reasons include lower trust in science (Plohl & Musil, 2020; Roozenbeek et al., 2020), a lack of analytical thinking (Roozenbeek et al., 2020), and the emergence of purposely designed misinformation articles (Cinelli et al., 2020; Ordun et al., 2020; Sharma et al., 2020). Existing studies of misinformation about COVID‐19 focused on descriptive analysis, such as the sentiment, topics, and geographical distribution (Li et al., 2020; Sharma et al., 2020; Singh et al., 2020). Roozenbeek et al. (2020) characterized individual predictors of people's susceptibility to COVID‐19 misinformation using survey data, but the sample was limited in scale. Therefore, this study represents the first effort to investigate the spread of misinformation about COVID‐19 with a large‐scale dataset. However, the literature on misinformation spread has mainly studied misinformation about a much more specific topic, such as candidates of political elections (Allcott & Gentzkow, 2017; Garrett, 2019; Grinberg et al., 2019; Hemsley, 2019) or alternative treatments of diseases (Fung et al., 2016). In contrast, misinformation about COVID‐19 covers a broader spectrum of topics, including politics (e.g., lockdown measures), public health (e.g., 5G causing the virus), medicine (e.g., the use of hydroxychloroquine to treat COVID‐19), vaccines (e.g., conspiracy stories against Bill Gates and Anthony Fauci), and so on (Singh et al., 2020). The spread of such multi‐topic misinformation about COVID‐19 has posed threats to people and our society in several ways. First, it undermines the credibility of scientific news and individual capacity (Hopf et al., 2019). Second, it has created confusion and misguided people's behaviors in the fight against the virus (e.g., discouraging people from adopting precautions or wearing masks) (Roozenbeek et al., 2020; Tasnim et al., 2020). Third, it also leads to hatred, discrimination, and social unrest (Yusof et al., 2020). 
However, it remains unknown which COVID‐19 topics are more likely to spread on social media. In addition, key influencers, including political leaders (Hahl et al., 2018), contributed heavily to the spread of misinformation (Wood, 2018; Zollo & Quattrociocchi, 2018). One such example is former US President Donald J. Trump, who has been criticized for using his tweeting “deflection” strategy to frame himself as the only reliable source of truth and erode the public trust on mainstream media outlets throughout the 2016 U.S. Presidential Election (Ross & Rivers, 2018). Despite their importance in sharing information with the public and people's trust in them to provide accurate information during a crisis, political leaders' effects on the spread of misinformation have not been studied with empirical data.

The HSM for misinformation spreading

Sharing information (e.g., retweeting) is a key feature provided by social media sites such as Twitter (Cha et al., 2010; Suh et al., 2010). People need to process information before sharing it (Engelmann et al., 2019). Therefore, we based our analysis of sharing misinformation on the HSM, which states that people process information using two modes—heuristic processing and systematic processing (Chaiken, 1980; Chen & Chaiken, 1999). The systematic processing mode mainly focuses on the content of a message and involves considerable cognitive effort to scrutinize, comprehend, and evaluate the validity of information (i.e., fully processing the content of the message). In contrast, heuristic processing requires less cognitive processing—people make quick decisions to adopt information based on mental shortcuts or rules of thumb that have been stored in their memory and can be reprocessed in a given situation, such as the readability and familiarity of information content (e.g., the ease to get and comprehend information), the credibility of information sources (e.g., relying on statements from experts and leaders they trust), or the crowd behaviors (e.g., supporting decisions endorsed by a large number of people) (Bargh, 1989; Bohner et al., 1995; Chaiken & Maheswaran, 1994; Chen et al., 1996; Chen & Chaiken, 1999; Lin et al., 2016; Wathen & Burkell, 2002). Beyond processing information, empirical studies using the HSM found that both systematic processing and heuristic processing influence individual's sharing of information (Xiao et al., 2018). Specifically in the context of information sharing in social media, systematic processing assesses a tweet's content itself, while heuristic processing has been operationalized to focus on a tweet's sources, others' endorsement of the tweet, and its communication styles (Engelmann et al., 2019; Firdaus et al., 2018; Liu et al., 2012). 
Another phenomenon related to heuristic processing is the “nudge”: an indirect yet deliberate move to alter people's choices or make people more subject to heuristic processing (Saghai, 2013; Schmidt, 2017; Thaler & Sunstein, 2009). Even though nudges usually occur in subtle and often unnoticeable ways, they have been effective in promoting individual behaviors such as those in taxation (Holz et al., 2020), education (Benhassine et al., 2015), healthcare (Blumenthal‐Barby & Burroughs, 2012), and marketing (Tan et al., 2018). Specifically, nudges can influence people's behaviors through two types of heuristics: familiarity heuristics and consensus heuristics. Familiarity heuristics indicate that individuals are more likely to make familiar decisions based on previous experiences (Tversky & Kahneman, 1974). For example, social media users are more likely to endorse information after repeated exposure to it (Ali & Zain‐ul‐abdin, 2021; Garcia‐Marques & Mackie, 2001). In fact, even a single prior encounter can increase the believability of information (Pennycook et al., 2018). In contrast, consensus heuristics suggest that an individual's behavior is an extension of other people's behaviors (Chaiken, 1987; Kelley, 1967). For instance, people tend to conform to the judgment of a large number of people even if the judgment is incorrect (Bond & Smith, 1996). While directly sharing a piece of misinformation on social media would certainly help its spread, we would like to investigate whether influential individuals, such as political leaders, are also effective in nudging people's sharing of misinformation with indirect approaches. For COVID‐19 in particular, it is difficult for citizens to evaluate the validity of information during the pandemic, because of limited knowledge about this new virus, information overload, or anxiety (Li et al., 2020; Rathore & Farooq, 2020; Van Bavel et al., 2020). 
Such situations can increase people's reliance on the heuristic mode (Eastin, 2001; Lang, 2000; Ratneshwar & Chaiken, 1991; Zuckerman & Chaiken, 1998) when they process and share information related to COVID‐19. As a result, many people turn to their political leaders during such a crisis and “rally around the flag” (Baekgaard et al., 2020; Hetherington & Nelson, 2003; Mueller, 1970), because they are confident in their leaders' ability to handle challenges (Zhu et al., 2012). Studies suggested that some political leaders' tweets about COVID‐19 quickly became viral (Rufai & Bunce, 2020) and have influenced people's judgment of the pandemic (Van Bavel et al., 2020). Therefore, based on large‐scale data of COVID‐19 misinformation and the behaviors of Donald J. Trump on Twitter, this study attempts to address two specific research questions with the HSM framework. What are the effects of different topics of COVID‐19 misinformation on the spread of such stories on social media? What roles did Donald J. Trump play in the spread of COVID‐19 misinformation on Twitter?

METHODS

Datasets

This study fused datasets from different sources collected between February 24, 2020, when the US CDC first responded to the COVID‐19 pandemic publicly, and July 14, 2020. First, we started with a list of 895 misinformation websites for COVID‐19, including 237 sites identified by NewsGuard and 672 sites by Allcott et al. (2019). Note that 14 websites appear on both lists. From these websites, we extracted 895 news stories related to COVID‐19. Among them, we kept a pool of 855 unique focal news stories that were written in English for further analysis. Then we retrieved tweets about COVID‐19 through Twitter's Search API. Among a total of 39,026,205 tweets with keywords about the pandemic, 59,270 original tweets contain URLs pointing to 619 out of the 855 focal news stories from misinformation websites. These tweets are referred to as “focal tweets” in the remainder of this article. For these focal tweets, we further collected their content and retweet counts until the end of our data collection period, as well as their authors' information. In addition, we also retrieved all the 1,097 tweets published by former President Donald J. Trump's official Twitter account during the same period. It is worth noting that Trump did not include any of the focal news stories in his tweets. That motivated us to further investigate his roles in the spread of these stories on Twitter. As a comparison, we also retrieved 568 tweets published by a mainstream news outlet, National Public Radio (NPR) News, during the same period. We chose NPR News because it is considered as a source of news by several third parties that rate media bias.
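The matching step described above, keeping original tweets whose URLs point to a focal story, could be sketched as follows. This is only an illustration: the tweet dictionaries, the field names `is_retweet` and `urls`, and the URL normalization rule are hypothetical, since the article does not describe its matching procedure at this level of detail.

```python
from urllib.parse import urlsplit


def normalize(url):
    """Reduce a URL to host + path so trivial variants compare equal.

    Assumption: scheme, a leading "www.", query strings, letter case, and a
    trailing slash are ignored. A real pipeline may need stricter rules.
    """
    parts = urlsplit(url.strip().lower())
    host = parts.netloc
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def focal_tweets(tweets, focal_story_urls):
    """Return original (non-retweet) tweets that link to any focal story."""
    focal = {normalize(u) for u in focal_story_urls}
    return [
        t for t in tweets
        if not t["is_retweet"]
        and any(normalize(u) in focal for u in t["urls"])
    ]


# The site list is a set union: 237 + 672 sites with 14 shared gives 895.
assert 237 + 672 - 14 == 895
```

Filtering on original tweets mirrors the unit of analysis used later, where retweets are counted as the outcome rather than treated as separate observations.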

Variables

The unit of analysis in our study is each individual focal tweet (i.e., original tweets that include URLs to COVID‐19 stories from misinformation websites). Our model includes a dependent variable, independent variables, and control variables. Table 2 lists all these variables in our model. The dependent variable—NumRetweets—is the number of retweets of focal tweets. A highly retweeted focal tweet would mean a higher level of spread of COVID‐19 misinformation on Twitter.

TABLE 2

Variable descriptions and summary statistics (N = 59,270)

Variable name	Descriptions	Mean	SD	(Min, max)
NumRetweets	The number of retweets of the focal tweet	10.95	223.02	(0, 33,400)
Seniority	The number of days a user has been on Twitter	2,410.36	1,343.85	(6, 5,097)
TweetPerDay	How active a creator has been on Twitter	34.90	55.85	(0, 714.91)
MaxSimNPRTweets	The highest similarity score between tweets of NPR News (posted in the 24 hrs before) and the focal tweet	0.71	0.37	(0, 1)
DomainRanking	The popularity of the misinformation site	19,523.85	83,183.19	(286, 2,046,618)
NumCOVID‐19tweets	The number of tweets with COVID‐19 related keywords posted in the 24 hrs before the focal tweet	483,926.50	103,037.40	(100,255, 662,555)
Topic2	A news story's probability on pandemic in the world	0.02	0.10	(2.6e−05, 1)
Topic3	A news story's probability on pandemic in the United States	0.19	0.34	(2.6e−05, 1)
Topic4	A news story's probability on medical response to pandemic	0.30	0.40	(3.3e−05, 1)
Topic5	A news story's probability on conspiracy	0.17	0.33	(2.5e−05, 1)
MaxSimTrumpTweets	The highest similarity score between Trump's tweets (posted in the 24 hrs before) and the focal tweet	0.73	0.31	(−0.083, 1)

Independent variables capture two aspects of COVID‐19 misinformation spread. The first aspect focuses on one important area of the systematic processing in HSM: the topics of misinformation stories. We applied Latent Dirichlet Allocation (LDA), a generative probabilistic topic modeling method for content analysis (Blei et al., 2003), to extract latent topics from misinformation stories. Inputs to the LDA model include the title and body text of the 855 misinformation stories about COVID‐19. Because most of them have keywords such as “coronavirus,” “virus,” and “Covid,” we removed such keywords from the corpus. To find the number of topics (K) for this corpus, we varied K from 2 to 10 and chose its value using corpus‐based coherence scores: Coherence value (CV) and UCI. Topic coherence is a measure of topic quality; UCI and CV are measured by pairwise semantic similarity of word co‐occurrence frequencies in a sliding time window over a reference corpus (e.g., Wikipedia). The difference is that UCI is based on all word pairs of the top‐ranked topic words (Newman et al., 2010), while CV considers the context of topic top words by segmenting them into subsets (Röder et al., 2015). Outputs of the LDA model include two types of probability distributions. The first distribution associates each word with a topic. In other words, a topic is represented by a distribution over words, and such a distribution helps us understand what each topic is about. The second distribution represents the overall content of a tweet as a distribution over latent topics. Thus, the second distribution serves as a good indicator for a tweet's text content, and has been used as a key indicator for systematic processing of information based on HSM (Son et al., 2020). The second aspect is about heuristic processing and measures Trump's behaviors on Twitter. 
Because Trump did not directly share any of the stories in our pool of COVID‐19 misinformation, we tried to capture how his “nudge” may have been part of people's heuristic processing of information, where heuristics are assumed to be stored in memory and can be retrieved by individuals when in a relevant situation (Engelmann et al., 2019). For instance, what Trump talked about before a misinformation story may have been stored in some Twitter users' memories, and became heuristic cues for them to process and share the misinformation story later. Specifically, for each focal tweet, we measured how similar it was to Trump's earlier tweets. Tweets were represented with embedding vectors (size = 200) based on pre‐trained Global Vectors (GloVe) for Twitter (Pennington et al., 2014). We then calculated the cosine similarity between the embedding of a focal tweet and the embedding of each of Trump's tweets published in the 24 hr before the focal tweet. We used 24 hr because the life cycle of most information on social media ends within 1 day of the original post (Bakshy et al., 2012; Kwak et al., 2010). Because Trump's tweets covered a variety of topics, some of which are not related to COVID‐19 (e.g., elections), we picked the maximum similarity (MaxSimTrumpTweets) to represent the similarity between Trump's earlier tweets and each focal misinformation tweet. In addition to the maximum similarity between Trump's tweets and a focal tweet, we also calculated the number of retweets of this most similar Trump tweet (NumSimTrumpReTweets). We controlled for other factors related to heuristic processing: (a) the number of days a creator has been on Twitter (Seniority) (Son et al., 2020); (b) how active the creator has been, measured by the number of tweets per day (TweetPerDay) (Son et al., 2020; Zhang et al., 2014), noting that we did not include the number of followers and the number of friends in our model because they are highly correlated with TweetPerDay (Pearson correlation >0.6); (c) the popularity of the misinformation site where the story came from (DomainRanking), based on Alexa rankings, where a lower rank number indicates a site that attracts more traffic; (d) the overall attention people were paying to the pandemic (Fu & Sim, 2011), measured by the number of tweets with COVID‐19‐related keywords during the 24 hrs before a focal tweet was posted (NumCOVID‐19tweets); (e) the maximum similarity between the focal tweet and tweets from NPR News published in the 24 hrs before the focal tweet (MaxSimNPRTweets).
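The maximum-similarity measure described above (used for both MaxSimTrumpTweets and MaxSimNPRTweets) can be sketched as follows. The sketch assumes each tweet has already been turned into a single embedding vector; how word-level GloVe vectors were aggregated into a tweet vector is not stated in the article. The function names and the choice to return 0.0 when no reference tweet falls in the window are assumptions made for illustration.

```python
import math
from datetime import datetime, timedelta


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def max_similarity(focal_vec, focal_time, reference, window=timedelta(hours=24)):
    """Highest cosine similarity between a focal tweet and reference tweets
    posted within `window` before it.

    `reference` is an iterable of (timestamp, vector) pairs, e.g. one
    account's tweets. Returning 0.0 for an empty window is an assumption.
    """
    sims = [
        cosine(focal_vec, vec)
        for ts, vec in reference
        if focal_time - window <= ts < focal_time
    ]
    return max(sims) if sims else 0.0
```

Taking the maximum rather than the mean keeps unrelated reference tweets (for example, tweets about elections) from diluting the score, which matches the rationale given in the text.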

Model setup

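We used negative binomial (NB) regression to model the effects of various factors related to the spread of misinformation. The NB distribution can account for overdispersion of the dependent variable, and thus has been widely used to model the levels of user engagement on social media (e.g., the number of likes, comments, and shares) (Bakhshi et al., 2014; Hu et al., 2020). Our NB regression model can be represented as follows:

$$\Pr(Y = y_i \mid \mu_i, \alpha) = \frac{\Gamma(y_i + \alpha^{-1})}{\Gamma(y_i + 1)\,\Gamma(\alpha^{-1})} \left(\frac{1}{1 + \alpha\mu_i}\right)^{\alpha^{-1}} \left(\frac{\alpha\mu_i}{1 + \alpha\mu_i}\right)^{y_i} \qquad (1)$$

where $\alpha$ is the estimate of the dispersion parameter; $\mu_i$ is the fitted mean of the count response (i.e., NumRetweets), and $\mu_i = \exp(x_i^{\top}\beta)$, with $x_i$ the vector of predictors for focal tweet $i$ and $\beta$ the vector of coefficients. We formulate the likelihood function following Hilbe (2011) and use the log‐likelihood function to estimate the coefficients:

$$\mathcal{L}(\beta, \alpha) = \sum_{i=1}^{N} \left\{ y_i \ln\!\left(\frac{\alpha\mu_i}{1 + \alpha\mu_i}\right) - \alpha^{-1} \ln(1 + \alpha\mu_i) + \ln\Gamma(y_i + \alpha^{-1}) - \ln\Gamma(y_i + 1) - \ln\Gamma(\alpha^{-1}) \right\}$$

where $y_i$ is the dependent variable NumRetweets and $\Gamma(\cdot)$ is the gamma function with $\alpha^{-1}$ as the scale parameter.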
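To make the estimation target concrete, the sketch below evaluates an NB log‐likelihood in the standard NB2 parameterization (mean μ, dispersion α, variance μ + αμ²) with a log link. This is assumed to be the form intended here because the article follows Hilbe (2011), but the function names and data layout are illustrative only; in practice a statistics package would maximize this quantity over the coefficients and α.

```python
import math


def nb2_logpmf(y, mu, alpha):
    """Log-probability of count y under NB2 with mean mu and dispersion alpha > 0."""
    inv = 1.0 / alpha
    return (
        math.lgamma(y + inv) - math.lgamma(y + 1) - math.lgamma(inv)
        + inv * math.log(1.0 / (1.0 + alpha * mu))
        + y * math.log(alpha * mu / (1.0 + alpha * mu))
    )


def nb2_loglik(ys, xs, beta, alpha):
    """Sum of NB2 log-probabilities with a log link: mu_i = exp(x_i . beta).

    ys: observed counts; xs: rows of predictors (include a leading 1.0 for
    the intercept); beta: coefficients in the same order as each row.
    """
    total = 0.0
    for y, x in zip(ys, xs):
        mu = math.exp(sum(b * xi for b, xi in zip(beta, x)))
        total += nb2_logpmf(y, mu, alpha)
    return total
```

As α approaches zero the NB2 log‐probability approaches the Poisson one, which is why a large estimated α signals that a Poisson model would fit poorly.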

RESULTS AND DISCUSSION

We set the number of topics K = 5, which is the elbow of two curves shown in Figure 1. Table 1 identifies the top 30 most representative words of each of the five topics, along with our interpretations of these topics.
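For readers unfamiliar with coherence scores, the sketch below computes a UCI‐style coherence, the mean pointwise mutual information over all pairs of a topic's top words. It is a simplified illustration rather than the procedure used in the study: co‐occurrence is counted per document instead of with sliding windows over an external reference corpus, and the smoothing constant `eps` is an assumed value.

```python
import math
from itertools import combinations


def uci_coherence(top_words, documents, eps=1e-12):
    """Mean PMI over all pairs of top topic words.

    documents: list of token lists. Probabilities are document frequencies
    (a simplification of the sliding-window counts normally used).
    """
    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)

    def prob(*words):
        hits = sum(1 for s in doc_sets if all(w in s for w in words))
        return hits / n_docs

    scores = []
    for a, b in combinations(top_words, 2):
        p_a, p_b = prob(a), prob(b)
        if p_a == 0.0 or p_b == 0.0:
            continue  # a word unseen in the corpus contributes no pair score
        scores.append(math.log((prob(a, b) + eps) / (p_a * p_b)))
    return sum(scores) / len(scores) if scores else 0.0
```

Computing such a score for each fitted model with K from 2 to 10 gives a curve like the ones summarized in Figure 1, from which an elbow can be read off.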

FIGURE 1

Coherence scores for varying numbers of topics in Latent Dirichlet Allocation models

TABLE 1

Details of the five topics from the Latent Dirichlet Allocation model

Topic	Interpretations	Top keywords	Titles of example stories
1	Politics about the pandemic	People, state, say, China, Trump, new, health, case, time, president, Dr, patient, mask, hospital, pandemic, world, get, country, report, also, medicine, day, death, even, spread, public, go, infect, need	“An Obama Holdover in an Obscure Government Arm Helped Cause the Country's Coronavirus Crisis.”
2	Pandemic in the world	Say, case, state, report, people, health, government, positive, ministry, test, country, India, also, official, number, March, new, day, patient, spread, Friday, death, hospital, Delhi, outbreak, total, include, take, confirm, year	“Iran's Mass Graves for Coronavirus Victims Are Large Enough to Be Seen From Space”
3	Pandemic in the United States	Case, say, new, death, people, number, report, China, hospital, health, city, test, state, York, March, week, day, home, time, rate, country, also, official, outbreak, confirm, accord, posit, total, data, online	“New York Health Commissioner Tells People Not to Follow White House Coronavirus Guidance”
4	Medical response to pandemic	Patient, people, case, say, infect, new, study, death, test, disease, drug, health, China, day, research, use, rate, time, also, Dr, number, hydroxychloroquine, report, week, country, spread, hospital, online, state	“California biotech claims it's discovered an antibody that can block ‘100%’ of coronavirus”
5	Conspiracy	China, Wuhan, Chinese, say, report, people, time, government, world, outbreak, also, health, lab, official, research, case, new, disease, state, human, nation, country, medium, day, first, spread, hospital, infect, patient, pandemic	“U.S. government gave $3.7million grant to Wuhan lab that experimented on coronavirus source bats.”

We excluded the variable NumSimTrumpReTweets because its high Pearson correlation (0.855) with MaxSimTrumpTweets could lead to multicollinearity in our model. Similarly, including all five topics' probabilities, which sum to one, would also cause multicollinearity problems. Thus, we removed Topic 1 from our model because it has the highest correlations (up to −0.42) with other topics and the spread of political misinformation has been well studied before. Table 2 lists descriptive statistics of variables in our model. Pearson correlations among variables are reported in Table 3. We also checked variance inflation factors (VIFs); all VIF scores are lower than 2, suggesting low multicollinearity.
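The link between a pairwise correlation and a VIF can be illustrated with a small sketch. With exactly two predictors, the VIF of either one equals 1 / (1 − r²), where r is their Pearson correlation; in a model with more predictors the VIF can only be as large or larger. The helper names below are illustrative, and the calculation is a back‐of‐the‐envelope check rather than the VIF computation used in the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def vif_two_predictors(r):
    """VIF of either predictor in a two-predictor model with correlation r."""
    return 1.0 / (1.0 - r * r)


# Correlation reported between MaxSimTrumpTweets and NumSimTrumpReTweets.
print(round(vif_two_predictors(0.855), 2))  # about 3.72, above the threshold of 2
```

Under this simplification, keeping both Trump‐related variables would push the VIF well above 2, which is consistent with the decision to drop one of them.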

TABLE 3

Pearson correlation coefficients among variables (N = 59,270)

Variables		2	3	4	5	6	7	8	9	10	11
1	NumRetweets	0.02	0.02	0.00	0.01	−0.01	−0.01	0.01	0.00	0.00	0.01
2	Seniority	1.00	−0.19	0.04	−0.01	0.01	0.00	0.00	0.00	0.00	−0.04
3	log(TweetPerDay)		1.00	−0.19	0.00	0.00	0.00	0.02	−0.07	−0.03	−0.01
4	log(DomainRanking)			1.00	−0.03	−0.02	−0.02	−0.24	0.21	0.01	−0.02
5	log(NumCOVID‐19tweets)				1.00	0.15	−0.27	0.06	0.06	−0.10	0.09
6	MaxSimNPRTweets					1.00	0.01	−0.07	−0.02	0.04	0.06
7	Topic2						1.00	−0.04	−0.12	−0.01	−0.10
8	Topic3							1.00	−0.33	−0.21	0.09
9	Topic4								1.00	−0.33	0.02
10	Topic5									1.00	−0.15
11	MaxSimTrumpTweets										1.00

Table 4 summarizes the estimation results of the NB regression model defined in Equation (1). Model 1, which serves as the baseline, only includes control variables. Model 2 adds the independent variables for story topics, and Model 3 further incorporates Trump's behaviors on Twitter. The Akaike Information Criterion improves (decreases) as we add more variables to the models. Note that α is the dispersion parameter of NB regression; if α equals zero, the model reduces to a simple Poisson model. A higher α represents greater spread of the dependent variable, meaning the data are overdispersed and are better estimated by an NB model than a Poisson model.
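The two model‐comparison ideas in this paragraph can be checked with a few lines of arithmetic. The sketch below uses the Akaike Information Criterion values reported in Table 4 and the NB2 variance function μ + αμ² (assumed to be the parameterization in use, as it reduces to the Poisson variance μ when α = 0). The function names are illustrative.

```python
# AIC values as reported in Table 4 (lower is better).
TABLE4_AIC = {"Model 1": 179397.4, "Model 2": 179188.8, "Model 3": 178997.9}


def aic_deltas(aic_by_model):
    """Difference between each model's AIC and the best (lowest) AIC."""
    best = min(aic_by_model.values())
    return {name: value - best for name, value in aic_by_model.items()}


def nb2_variance(mu, alpha):
    """Variance of an NB2 count with mean mu; equals mu (Poisson) when alpha is 0."""
    return mu + alpha * mu ** 2


for name, delta in aic_deltas(TABLE4_AIC).items():
    print(name, round(delta, 1))
```

Model 3 has the lowest value, with Models 2 and 1 roughly 191 and 400 points higher. With α above 14, the implied variance far exceeds the mean for any non‐trivial μ, which is the overdispersion the paragraph refers to.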

TABLE 4

Results of negative binomial models

	Model 1	Model 2	Model 3
Control variables
Seniority	0.5955*** (0.0161)	0.6117*** (0.0160)	0.6154*** (0.0160)
log(TweetPerDay)	0.7397*** (0.0164)	0.8105*** (0.0164)	0.8124*** (0.0164)
log(DomainRanking)	−0.1001*** (0.0161)	−0.0503** (0.0167)	−0.0634*** (0.0167)
log(NumCOVID‐19tweets)	0.2277*** (0.0160)	0.2226*** (0.0167)	0.2128*** (0.0167)
MaxSimNPRTweets	−0.2398*** (0.0159)	−0.2335*** (0.0160)	−0.2492*** (0.0160)
Independent variables
Topic2		−0.1365*** (0.0169)	−0.1052*** (0.0169)
Topic3		0.0556** (0.0182)	0.0327 (0.0182)
Topic4		0.0123 (0.0189)	0.0048 (0.0188)
Topic5		0.2171*** (0.0179)	0.2225*** (0.0180)
MaxSimTrumpTweets			0.2002*** (0.0161)
Constant	2.0185*** (0.0158)	1.9913*** (0.0157)	1.9674*** (0.0157)
α	14.5550	14.4427	14.33938
Akaike Information Criterion	179,397.4	179,188.8	178,997.9

Note: SEs are in parentheses.

*** p < .001.

** p < .01.

Results from our models reveal three main findings. First, topics of misinformation stories do matter for their spread on Twitter. Compared to political misinformation about COVID‐19 (Topic 1), conspiracies (Topic 5) are more likely to be retweeted, followed by pandemic stories in the United States (Topic 3). Meanwhile, misinformation about the pandemic in the world (Topic 2) is less popular. Second, Trump's behaviors on Twitter were related to the spread of COVID‐19 misinformation. The positive and significant effect of MaxSimTrumpTweets suggests that misinformation that is more similar to Trump's own tweets tended to get more attention on Twitter. As shown in Model 3, an increase of 1 SD in the maximum similarity between a focal tweet and Trump's tweets is associated with 22% more retweets of the focal tweet. Third, control variables have the expected effects on misinformation spread: misinformation shared by more senior (i.e., higher Seniority) and more active (i.e., higher TweetPerDay) users received more retweets; when users were more active in talking about COVID‐19 in general (i.e., higher NumCOVID‐19tweets), misinformation also got more attention. The ranking of the website a story came from has a significant and negative effect on misinformation spread, suggesting that a story from a more popular misinformation site (i.e., one with a lower rank number) was more likely to spread. MaxSimNPRTweets is negatively and significantly associated with the spread of misinformation. In other words, when a misinformation tweet is more similar to what NPR News posted, it is less likely to spread, which is the opposite of the pattern for Trump's tweets. These findings are supported by HSM. On the one hand, as a key component of the systematic processing of information, latent topics reflect what a piece of misinformation is about and are important predictors for its adoption and spread. 
On the other hand, heuristic factors, such as the status of the user who shared a piece of misinformation online and the popularity of the misinformation source, can influence how the misinformation spreads. The role of Trump's tweets in the spread of COVID‐19 misinformation is also interesting. From the HSM perspective, Trump's status as a political leader can potentially influence people's behaviors of adopting and sharing information via the heuristic processing route. The fact that Trump did not share or retweet any of the stories in our pool of focal misinformation stories seems to suggest that he was not involved in the spread of these stories. Nevertheless, upon further investigations, our analyses revealed the subtle effect from Trump. Specifically, if a misinformation tweet was more similar to one of Trump's recent tweets, that tweet was more likely to spread. Although our regression model cannot establish the causal relationship between what Trump did and the spread of COVID‐19 misinformation, one possible explanation is that Trump's behaviors “nudged” Twitter users to adopt and share misinformation stories, even though he did not directly share these stories.
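For readers less familiar with count models, statements such as "an increase of 1 SD is associated with 22% more retweets" can be read through the usual interpretation of a negative binomial coefficient. The sketch below assumes the standard log link, under which a coefficient converts to a percent change in the expected count as 100 × (exp(coefficient) − 1). The function name and the coefficient value 0.2 are illustrative assumptions, not values taken from the models reported here.

```python
import math


def percent_change_in_expected_count(coef: float, delta: float = 1.0) -> float:
    """Percent change in the expected count for a `delta`-unit increase
    in a predictor, assuming a log-link count model (e.g., negative binomial).

    With a log link, E[y | x] = exp(intercept + coef * x), so raising x by
    `delta` multiplies the expected count by exp(coef * delta).
    """
    return (math.exp(coef * delta) - 1.0) * 100.0


# Illustrative coefficient for a standardized predictor (1 unit = 1 SD).
# This value is hypothetical and is not read from the paper's tables.
illustrative_coef = 0.2
print(round(percent_change_in_expected_count(illustrative_coef), 1))  # about 22.1
```

For small coefficients the percent change is close to 100 times the coefficient itself, which is why the two numbers are often similar; for larger coefficients the exponential form should be used.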

CONCLUSIONS

Based on the HSM, this study systematically investigated factors associated with the spread of misinformation about COVID‐19 on social media. On the systematic processing side, this work is the first to examine how the topics of stories affect the spread of misinformation that covers multiple topics for one event. For example, we found that misinformation about conspiracies related to COVID‐19 is the most likely to spread. On the heuristic processing side, our analysis not only confirmed the effects of the misinformation source but also revealed subtle effects from a political leader who was very active on social media. We hope our findings on COVID‐19 misinformation, based on the framework of HSM, can help us better understand the dynamics of misinformation in general.

Outcomes of this study have implications for managing misinformation on social media. For example, while misinformation on any topic can be detrimental to our society, social media platforms may want to pay special attention to stories about conspiracy theories because such misinformation is more likely to spread than misinformation on other COVID‐19 topics. In addition, when misinformation becomes prevalent during crises, both social media platforms and users should be alert to the behaviors of influential users, especially political leaders, because they can have a great impact on the spread of misinformation. Even when they are not directly sharing misinformation, their behaviors may still implicitly influence or nudge people's adoption and sharing of misinformation.

Limitations and future work

This research is not without limitations. First, we measured the spread of misinformation based only on the number of retweets. While retweet count is an important measure of a tweet's impact, future work could explore other measures such as the depth of spread (i.e., how far a tweet spreads in a social network) and the reach of a tweet (e.g., a retweet by a user with many followers could reach more users). Second, as mentioned earlier in this article, our explanatory models reveal only correlations between the independent variables and the spread of misinformation. Causal inference could further help to design better interventions to prevent the spread of misinformation. Third, the study focused on the spread of multi‐topic misinformation about one major event. It would be interesting to compare the results with other events that also feature misinformation on various topics. In fact, topics of misinformation about COVID‐19 have also evolved over time (e.g., those about vaccines emerged in 2021). Fourth, although many people turn to their leaders during a crisis, Donald Trump is a special and controversial political leader. More studies are needed to see whether our findings about Trump's role in the spread of COVID‐19 misinformation apply to other political leaders. Finally, this article examines only the effect of tweets published before a misinformation tweet. It would also be interesting to investigate whether political leaders' tweets posted after a misinformation tweet affect its spread.
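The alternative spread measures mentioned above, depth and reach, can be made concrete with a small sketch. It assumes a retweet cascade is available as a list of (parent user, child user) pairs and that follower counts are known per user. Both data structures, the function names, and the toy data are hypothetical and are not part of this study's dataset. Summing follower counts also treats audiences as non-overlapping, so the result is only an upper bound on reach.

```python
from collections import defaultdict, deque


def cascade_depth(root, edges):
    """Longest chain of retweets starting from `root`.

    `edges` is an iterable of (parent_user, child_user) pairs, meaning the
    child saw the tweet via the parent and retweeted it (hypothetical input).
    """
    children = defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)

    depth = 0
    seen = {root}
    queue = deque([(root, 0)])  # breadth-first walk over the cascade
    while queue:
        user, level = queue.popleft()
        depth = max(depth, level)
        for child in children.get(user, ()):
            if child not in seen:  # guard against duplicate or cyclic edges
                seen.add(child)
                queue.append((child, level + 1))
    return depth


def potential_reach(users, followers):
    """Upper bound on audience size: sum of follower counts of the distinct
    users who posted or retweeted (ignores overlap between audiences)."""
    return sum(followers.get(user, 0) for user in set(users))


# Toy cascade: a is the original poster; b and c retweet a; d retweets via c.
toy_edges = [("a", "b"), ("a", "c"), ("c", "d")]
toy_followers = {"a": 100, "b": 10, "c": 1000, "d": 5}

print(cascade_depth("a", toy_edges))                          # 2
print(potential_reach(["a", "b", "c", "d"], toy_followers))   # 1115
```

Under these assumptions, two tweets with the same retweet count can differ sharply in depth and reach, which is the distinction the limitation points to.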

34 in total

1. Judgment under Uncertainty: Heuristics and Biases.

Authors: A Tversky; D Kahneman
Journal: Science Date: 1974-09-27 Impact factor: 47.728

2. Fake news on Twitter during the 2016 U.S. presidential election.

Authors: Nir Grinberg; Kenneth Joseph; Lisa Friedland; Briony Swire-Thompson; David Lazer
Journal: Science Date: 2019-01-25 Impact factor: 47.728

3. Science audiences, misinformation, and fake news.

Authors: Dietram A Scheufele; Nicole M Krause
Journal: Proc Natl Acad Sci U S A Date: 2019-01-14 Impact factor: 11.205

4. Salvaging the concept of nudge.

Authors: Yashar Saghai
Journal: J Med Ethics Date: 2013-02-20 Impact factor: 2.903

5. The science of fake news.

Authors: David M J Lazer; Matthew A Baum; Yochai Benkler; Adam J Berinsky; Kelly M Greenhill; Filippo Menczer; Miriam J Metzger; Brendan Nyhan; Gordon Pennycook; David Rothschild; Michael Schudson; Steven A Sloman; Cass R Sunstein; Emily A Thorson; Duncan J Watts; Jonathan L Zittrain
Journal: Science Date: 2018-03-08 Impact factor: 47.728

Review 6. Using social and behavioural science to support COVID-19 pandemic response.

Authors: Jay J Van Bavel; Katherine Baicker; Paulo S Boggio; Valerio Capraro; Aleksandra Cichocka; Mina Cikara; Molly J Crockett; Alia J Crum; Karen M Douglas; James N Druckman; John Drury; Oeindrila Dube; Naomi Ellemers; Eli J Finkel; James H Fowler; Michele Gelfand; Shihui Han; S Alexander Haslam; Jolanda Jetten; Shinobu Kitayama; Dean Mobbs; Lucy E Napper; Dominic J Packer; Gordon Pennycook; Ellen Peters; Richard E Petty; David G Rand; Stephen D Reicher; Simone Schnall; Azim Shariff; Linda J Skitka; Sandra Susan Smith; Cass R Sunstein; Nassim Tabri; Joshua A Tucker; Sander van der Linden; Paul van Lange; Kim A Weeden; Michael J A Wohl; Jamil Zaki; Sean R Zion; Robb Willer
Journal: Nat Hum Behav Date: 2020-04-30

7. Modeling compliance with COVID-19 prevention guidelines: the critical role of trust in science.

Authors: Nejc Plohl; Bojan Musil
Journal: Psychol Health Med Date: 2020-06-01 Impact factor: 2.423

8. Propagating and Debunking Conspiracy Theories on Twitter During the 2015-2016 Zika Virus Outbreak.

Authors: Michael J Wood
Journal: Cyberpsychol Behav Soc Netw Date: 2018-07-18

9. World leaders' usage of Twitter in response to the COVID-19 pandemic: a content analysis.

Authors: Sohaib R Rufai; Catey Bunce
Journal: J Public Health (Oxf) Date: 2020-08-18 Impact factor: 2.341

10. Constructing and Communicating COVID-19 Stigma on Twitter: A Content Analysis of Tweets during the Early Stage of the COVID-19 Outbreak.

Authors: Yachao Li; Sylvia Twersky; Kelsey Ignace; Mei Zhao; Radhika Purandare; Breeda Bennett-Jones; Scott R Weaver
Journal: Int J Environ Res Public Health Date: 2020-09-19 Impact factor: 3.390
