Text Analysis of Evolving Emotions and Sentiments in COVID-19 Twitter Communication.

Abstract

Scientists and regular citizens alike search for ways to manage the widespread effects of the COVID-19 pandemic. While scientists are busy in their labs, other citizens often turn to online sources to report their experiences and concerns and to seek and share knowledge of the virus. The text generated by those users on online social media platforms can provide valuable insights into users' evolving opinions and attitudes. The objective of this research is to analyze the text of such user disclosures to study human communication during a pandemic in four primary ways. First, we analyze Twitter tweet information, generated throughout the pandemic, to understand users' communications concerning COVID-19 and how those communications have evolved during the pandemic. Second, we analyze linguistic sentiment concepts (analytic, authentic, clout, and tone concepts) in different Twitter settings (sentiment in tweets with pictures or no pictures and tweets versus retweets). Third, we investigate the relationship between Twitter tweets and additional forms of internet activity, namely, Google searches and Wikipedia page views. Finally, we create and use a dictionary of specific COVID-19-related concepts (e.g., symptom of lost taste) to assess how the use of those concepts in tweets is related to the spread of information and the resulting influence of Twitter users. The analysis showed a surprising lack of emotion in the initial phases of the pandemic as people were information seeking. As time progressed, there were more expressions of sentiment, including anger. Further, tweets with and without pictures and/or video had statistically significant differences in text sentiment characteristics. Similarly, there were differences between the sentiment in tweets and in retweets. We also found that Google and Wikipedia searches were predictive of sentiment in the tweets. Finally, a variable representing a dictionary of COVID-related concepts was statistically significant when related to users' Twitter influence score and number of retweets, illustrating the general impact of COVID-19 on Twitter and human communication. Overall, the results provide insights into human communication as well as models of human internet and social media use. These findings could be useful for the management of global challenges beyond, or different from, a pandemic.

Keywords: COVID-19; Coronavirus; Coronavirus dictionary; Google; Human communication; Linguistic Inquiry and Word Count (LIWC); Pandemic; Scherer’s ontology; Sentiment analysis; Social media use; Text analysis; Twitter; Wikipedia

Year: 2022 PMID: 35915743 PMCID: PMC9330938 DOI: 10.1007/s12559-022-10025-3

Source DB: PubMed Journal: Cognit Comput ISSN: 1866-9956 Impact factor: 4.890

Introduction

The COVID-19 pandemic has affected, and continues to affect, lives worldwide in an unprecedented way. At the same time, the amount of information that has been generated during the pandemic is unprecedented. Social media users have created large amounts of publicly available communications that capture their views, opinions, concerns, thoughts, and knowledge about the pandemic. Our research investigates the text content of some of that social media, focusing on Twitter posts (tweets), but also on Google and Wikipedia searches, to study human communications during the pandemic. We use text analysis to extract and classify opinions and study how internet search data are predictive of Twitter tweet sentiment. Using both text analysis tools and manual assessment, Twitter tweets are analyzed for their content and expressions of sentiment and psychological content. We study how characteristics of the social media (e.g., pictures or no pictures) lead to different text concepts within Twitter messages. We investigate the relationship between Google and Wikipedia searches and the sentiment of Twitter messages, creating a model relating search and social media. We also examine how a dictionary of key COVID concepts discussed in Twitter tweets is related to the extent to which social media messages get retweeted and add to the sender’s reputation in the context of the social media. Throughout, we use text analysis because it provides insight into the social media text provided by Twitter users [1, 2]. Accordingly, we use social media data, in the form of Twitter tweets and Google and Wikipedia searches, all of which were collected during the pandemic.

Our specific objectives are to: analyze the sentiment and emotional content in Twitter tweet texts using both computer-based and manual methods; study the differences in text concepts identified in different types of Twitter tweets and retweets; develop a model of “search and tweet” and examine the ability of Google and Wikipedia searches to predict the sentiment in Twitter tweets; investigate the impact of the use of COVID-19 specific concepts (words) on human communication through Twitter; and identify general implications, beyond the COVID-19 pandemic, of the findings identified in this paper.

This research makes several contributions. First, the analysis shows that the text sentiment and emotional content of Twitter messages with and without video and pictures are statistically significantly different. Second, the text sentiment and emotional content in COVID-19-based tweets and retweets were statistically significantly different. Third, Google searches and Wikipedia page views are predictive of the percentage of positive and negative sentiment tweets. Fourth, a variable representing a dictionary of words capturing COVID-19 concepts was statistically significant in models of Twitter influence and retweets, indicating the impact on human communication. In addition, we propose, as the basis for needed further analysis, a possible COVID-19 health management life cycle and further types of analysis that might prove useful with respect to different types of sentiment (e.g., neutrality, ambivalence).

The remainder of this paper proceeds as follows. The “Data and Methodology” section provides an overview of the data and the methods used. The “Twitter Text Analysis: Tweets vs. Retweets and Pictures vs. No Pictures” section discusses text analysis in Twitter tweets and investigates different levels of emotions in different types of tweets. The “Behavioral Links Between Internet Search and Twitter Sentiment” section examines the behavioral link between internet search, using Google and Wikipedia, and Twitter sentiment. The “Impact of COVID-19 Vocabulary Use of Twitter Reputation and Retweets” section considers the impact of COVID-19-based communications on Twitter influence and Twitter retweets. The “Manual Analysis of Ad Hoc Tweets” section manually analyzes the tweets using a sentiment ontology. The “COVID-19 Management Life Cycle, Emerging Pandemic Issues, and Computational Extensions” section presents the notion of a COVID-19 life cycle and discusses additional approaches and extensions, such as Word2Vec, ensemble methods, and notions of ambivalence. Finally, the “Summary, Contributions, and Conclusion” section summarizes the paper, discusses the implications of our findings and related research, and proposes further extensions.

Data and Methodology

In response to the pandemic, people sought to gain information about COVID-19 from personal interaction and sharing of stories and content. They shared their concerns, questions, opinions, and knowledge on social media [3]. Accordingly, we use such user provided content as the data for our research, especially Twitter tweets.

The Emotional Content of Twitter Tweets’ Text

Twitter tweets are often recognized as capturing the sentiment and emotional content of the crowd [4]. With millions of tweets generated daily, Twitter generally is perceived as a useful platform for research. Researchers have investigated many related issues using tweets, including tracking disease propagation, anticipating election results, and predicting sports outcomes [5]. Further recognizing the value of Twitter data, Banda et al. [6] created a large database of tweets related to COVID-19 and made it available to researchers, to provide substantial opportunities for investigation [7]. Other research has also used Twitter data for analyzing issues related to users’ attitudes towards COVID-19 [8-10]. Since people express their opinions and ideas in this user-generated content, this online text can be mined to extract the corresponding sentiment in the textual disclosures [11]. There are different approaches to text analysis [12, 13]. Text analysis efforts related to COVID-19 vaccinations have applied a wide range of AI-enabled social media analysis to large data sets to accommodate the unstructured nature of the data [14]. As an example, Leibowitz et al. [15] used Linguistic Inquiry and Word Count (LIWC) to investigate the text of Twitter tweets generated by emergency medicine Twitter users and found that approximately 34% of the tweets were positive and 31% were negative; 76.5% focused on the present.

Review of COVID Data Sets and a Data Timeline

This research focuses on studying the early part of the pandemic. We, therefore, track some of the key events from early in the pandemic. Figure 1 shows a timeline of significant COVID-19-related events as the pandemic was identified and progressed during its initial stages when there were predictions that it might end by August 2020.

Fig. 1

Progression of coronavirus into COVID-19 pandemic

During the early phases of the COVID-19 pandemic, many people turned to Twitter as a platform to both post and retrieve information on the virus. Twitter tweets collected throughout this timeframe became the source of data for this research. We collected sets of data from Twitter, divided into roughly two periods: as the pandemic was identified and progressed to countries shutting down; and as countries started to reopen and cases continued to rise. These periods were, approximately, before early to mid-June 2020 and after mid-June 2020.

LIWC

The LIWC text analysis program [16, 17] was applied to the tweets. LIWC is perhaps the leading software for capturing information regarding psychological concepts from text. LIWC uses a psychology-based bag of words approach to analyze text. Tausczik and Pennebaker [18] provide a history of LIWC and the bag of words approach, which is derived from Freud and others and has a long history in psychology. Different concepts are represented within LIWC, such as “positive emotion” and “negative emotion,” but also related concepts such as “anger” and “power.” For each concept, a dictionary of words is included in LIWC. The software is then used to identify the relative frequency of occurrence of these words in a set of text (e.g., Twitter tweets), thus providing a scientific approach to text analysis. Representative concepts examined in this paper are summarized in Table 1.

Table 1

Summary of selected concepts and categories from LIWC

Concept/category	Description
Affiliation	Characterized by words such as ally, friend, and social
Analytic (summary)	A high number reflects formal, logical, and hierarchical thinking; lower numbers reflect more informal, personal, here-and-now, and narrative thinking
Anger	Includes words such as hate, kill, and annoyed
Authenticity (summary)	Higher numbers are associated with a more honest, personal, and disclosing text; lower numbers suggest a more guarded, distanced form of discourse
Clout (summary)	A high number suggests that the author is speaking from the perspective of high expertise and is confident; low clout numbers suggest a more tentative, humble, even anxious style
Cognitive	Includes words such as cause, know, and ought
Health	Words include clinic, flu, and pill
Negative emotion	Includes words such as hurt, ugly, and nasty
Positive emotion	Characterized by words such as love, nice, and sweet
Power	Includes words such as superior and bully
Social	Includes words such as mate, talk, and they
Tone (summary)	A high number is associated with a more positive, upbeat style; a low number reveals greater anxiety, sadness, or hostility. A number around 50 suggests either a lack of emotionality or different levels of ambivalence
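
The dictionary-based relative-frequency counting that LIWC performs for its word categories can be sketched roughly as follows. This is a minimal illustration and not LIWC itself: the three toy word lists reuse only the example words from Table 1 (the real LIWC dictionaries are proprietary and far larger), and the names `TOY_CATEGORIES`, `tokenize`, and `category_percentages`, as well as the simple tokenizer, are assumptions made for this sketch.

```python
import re
from collections import Counter

# Toy category dictionaries (assumption: illustrative only). The words are
# the examples listed in Table 1, not the actual LIWC dictionaries.
TOY_CATEGORIES = {
    "anger": {"hate", "kill", "annoyed"},
    "posemo": {"love", "nice", "sweet"},
    "health": {"clinic", "flu", "pill"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def category_percentages(text, categories=TOY_CATEGORIES):
    """Return, per category, the percentage of tokens found in its word list."""
    tokens = tokenize(text)
    if not tokens:
        return {name: 0.0 for name in categories}
    counts = Counter()
    for token in tokens:
        for name, words in categories.items():
            if token in words:
                counts[name] += 1
    return {name: 100.0 * counts[name] / len(tokens) for name in categories}


# Hypothetical example text, not taken from the study's data.
print(category_percentages("I hate the flu but I love my clinic"))
```

Expressing each category as a percentage of all tokens keeps short and long tweets comparable, which is the sense in which category scores are relative frequencies.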

LIWC has two different types of measures: “summary variables” and “categories of words.” Categories of words, for example, “anger” or “power,” are evaluated through comparative analysis, typically within the existing sample, based on their relative occurrence. However, four concepts (analytic, clout, authenticity, and tone) have been established as “summary variables,” which are “standardized composites, based on previous research” [19]. The four summary variable concepts capture the frequency of word occurrences from other categories. LIWC’s summary variables are not analyzed based on the number of occurrences, as are the category variables; instead, the number of occurrences is related to an empirical distribution that ranges from 0 to 100, measuring the percentile. Those summary variables allow us to make statements about these measures, independent of additional comparisons, such as in-sample comparisons. Although we primarily focus on the summary variables, we also chose the categories of affiliation, anger, health, negative emotion, positive emotion, power, and social for various reasons: health, because the coronavirus is an issue associated with health; anger, because we expected an angry reaction to the pandemic; and ranges of positive and negative emotion, because we expected that tweets about coronavirus would be emotional. We expected affiliation to be an important distinguishing variable as people connect with each other in a friendship type of gesture. Similarly, we expected that the tweets provide a social outlet for the tweeter. Finally, we included power because the process of tweeting is likely to provide the tweeter with a certain extent of perceived power over a situation.

Our Approach and Use of Data

Users communicate using Twitter, allowing us to capture and analyze conversations in text format. We can use dictionaries to capture concepts, such as sentiment, emotions, or COVID-19, or use other types of words. By focusing on Twitter, we study human communication and behavior during the pandemic. Figure 2 provides an overview of the analysis conducted in this research.1

Fig. 2

Analysis of Twitter posts related to COVID-19

The data sets included: Twitter posts from early March to mid-June 2020 (2200 posts); a set of 900 posts collected in mid-July; and approximately 22,000 online posts collected daily from July to August 2020. These posts were collected by scraping tweets using tools available on the internet, by manual collection, and by using https://birdiq.net/. The first data set was the most specific to COVID-19 and was collected by reading and searching tweets based on keywords, such as symptom, infection, fever, fatigue, sick, COVID, and coronavirus. It also included mentions of family members or friends who might have had contact with the virus. The aim of this targeted collection was to identify the sentiment and emotions of people who were specifically dealing with COVID-19 in a concrete way. The second data set was intended to be used to obtain an overall sense of the attitude towards the virus over time. These tweets were collected automatically in July 2020 and cleaned.2 They served as input to a text analysis tool. Daily tweets were collected from mid-July until August. Additionally, we collected data from Google searches and Wikipedia page views to study their predictive relationship with Twitter tweets.3 A third data set employs additional data drawn from Hussain et al. [14].

Twitter Text Analysis: Tweets vs. Retweets and Pictures vs. No Pictures

In this section, we analyze a random sample of 20,000 tweets gathered in July 2020, chosen based on whether they contained the term coronavirus, in order to study different forms of communication through tweets. We removed the tweets that were not in English, resulting in 14,352 tweets. That set of tweets was analyzed using two comparisons: tweets vs. retweets, and tweets with picture or video content vs. tweets without such content. Of the 14,352 tweets, there were 1627 original tweets, 11,667 retweets, and the rest were replies. We did not analyze replies. Of the 14,352 tweets, 12,021 did not include any picture or video; 2331 did include pictures or videos. We analyzed both the two sets of tweets and the entire dataset using LIWC.

Text Analysis of the Entire Sample of Tweets: Analytic, Clout, Authenticity, and Tone

Applying LIWC to the entire sample, we found that the tweets averaged an “analytic” score of 74.02, which generally suggests more logical thinking. In addition, the tweets averaged a “clout” score of 66.60, which suggests more confidence than average. However, the tweets averaged an “authenticity” score of only 23.97, which is in the lower quarter of such scores. Finally, the “tone” averaged only 35.46, which reveals anxiety, sadness, or hostility. These results suggest that, on average, tweets about the coronavirus were relatively analytic, and came from a position of some clout. However, the tweets were not authentic sounding, suggesting guarded positions, and, generally, were more negative than positive.

Text Analysis of Tweets Using Pictures and Video Versus No Pictures and No Videos

Tweets can appear as text only or supplemented with pictures and videos. An important consideration is the extent of the impact of the pictures and videos on the text messages. Does the text differ if the tweeter includes pictures or videos? Do people express different text emotions when they supplement their text with videos or pictures? If they do, what does that mean? Do tweeters expect the pictures and video to tell the story? This section investigates some of these issues within the context of the coronavirus pandemic, while raising questions for future research. We conducted a text analysis of the differences between the two groups of tweets: those that did not have pictures or videos; versus those that did. We used a two-sample t-test with unequal variances to test the differences between the two populations, for each of our variables. The results are summarized in Table 2. It is interesting to note that, for each text variable, there is a statistically significant difference between the average values for each of the categories, except health.

Table 2

Analysis of tweets with and without pictures and video

Category	No pict/video	Pict/video	t-value	p-value
Analytic	71.70	86.01	−31.28	< 0.0001
Clout	66.39	67.66	−2.88	0.0040
Authentic	25.74	14.84	21.97	< 0.0001
Tone	34.93	38.19	−4.17	< 0.0001
Health	1.47	1.39	1.60	0.1090
Affiliation	1.87	1.47	7.16	< 0.0001
Power	3.31	2.07	18.61	< 0.0001
Social	9.67	7.75	17.37	< 0.0001
Positive emotion	2.07	1.95	2.03	0.0420
Negative emotion	2.21	1.65	9.88	< 0.0001
Anger	0.78	0.60	5.10	< 0.0001
Cognitive	8.78	5.25	33.68	< 0.0001

The “no picture and no video” tweets had statistically significantly more positive emotion, negative emotion, and affiliation. Those tweets also had comparatively more social context words and were presented with greater power. Finally, the text in the no picture and no video messages was also statistically significantly more “authentic”; however, the authenticity rating was still in approximately the lower 25%, suggesting a guarded presentation. On the other hand, the messages with pictures and videos had statistically significantly greater analytic vocabulary, clout, and tone. This analysis suggests different sentiment and emotional content in the two sets of tweets. It does not establish whether: the use of pictures and video leads to changes in the sentiment and emotion in the text; the type of problem that leads to using a picture or video leads to a different type of text; or people who communicate using video or pictures use different text than those who do not. Alternatively, it may be some combination of the three. Regardless, these are general issues in human communication and behavior that are subjects for future research.
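
The two-sample t-test with unequal variances (often called Welch's t-test) behind Table 2 can be illustrated with a short sketch. The function name `welch_t` and the sample scores below are assumptions for illustration only; the scores are made up and are not the study's data. The sketch computes the t statistic and the Welch-Satterthwaite degrees of freedom; obtaining a p-value additionally requires the t distribution, for example via `scipy.stats.ttest_ind(a, b, equal_var=False)`.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Sample variance of each group divided by its group size.
    se_a = variance(sample_a) / n_a
    se_b = variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / math.sqrt(se_a + se_b)
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t_stat, df


# Made-up "analytic" scores (hypothetical, NOT the study's data).
no_media = [70.1, 68.4, 75.0, 72.3, 69.9, 71.2]
with_media = [85.2, 88.0, 84.1, 86.7]

t_stat, df = welch_t(no_media, with_media)
print(f"t = {t_stat:.2f}, df = {df:.1f}")
```

With the first argument taken as the group without media, a negative t statistic indicates a lower mean for that group, which matches the sign convention of the t-values in Table 2.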

Text Analysis of Retweets Versus Tweets

We analyzed how the text of retweets differs from that of original tweets, with the results summarized in Table 3. There was not a statistically significant difference for the variables of affiliation and anger, whereas, for each of the other variables, there was a statistically significant difference. The retweets had greater values for clout, health, affiliation, power, social, negative emotion, and cognitive, whereas the original tweets had larger values for the categories of analytic, authenticity, tone, and positive emotion. Thus, the retweets were more negative in tone, suggested cognitive aspects to the tweet, focused more on health, and approached the tweet from a position of power.

Table 3

Analysis of tweets versus retweets

Category	Retweet	Tweet	t-value	p-value
Analytic	74.66	76.98	−3.22	0.001
Clout	68.17	59.53	13.23	< 0.0001
Authentic	23.46	25.35	−2.37	0.0180
Tone	34.36	42.05	−8.08	< 0.0001
Health	1.51	1.24	4.04	< 0.0001
Affiliation	1.81	1.77	0.59	0.5560
Power	3.25	2.45	8.66	< 0.0001
Social	9.87	6.32	22.54	< 0.0001
Positive emotion	1.93	2.46	−5.78	< 0.0001
Negative emotion	2.12	1.85	3.22	0.0010
Anger	0.74	0.76	−0.29	0.7720
Cognitive	8.19	7.07	6.65	< 0.0001

Thus, the results show a statistically significant difference between the text of retweeted tweets and original tweets. What is not clear is whether the sentiment and emotions are related to the likelihood of a tweet being retweeted. This is a topic for future research, perhaps using behavioral research. Further, although we find a difference in these pandemic-based tweets, whether the same relationship will hold for non-COVID-19 tweets requires future investigation.

General Progression of Sentiment and Emotions Expression

Figure 3 summarizes the findings from the analysis of the Twitter tweets during the early phases of COVID-19. It provides a general timeline of the results from these three data sets, starting with the early manually collected and analyzed tweets and progressing to the LIWC analysis of the automated collection of tweets and retweets.4

Fig. 3

Timeline of sentiment and emotions expression

Behavioral Links Between Internet Search and Twitter Sentiment

In a recent research paper, Hussain et al. [14] analyze the sentiment of Twitter tweets during the COVID pandemic.5 For their analysis, the researchers collected 10% of the Twitter tweets in a database of COVID pandemic tweets [20]. As part of their analysis, they determined the relative percentages of COVID-related tweets that had positive, negative, or neutral sentiment over a 37-week period during 2020. For the same period, we gathered both Google searches (from the USA6) and all Wikipedia page views, since Wikipedia page views are not available country by country. We then used those searches and page views to predict the percentages of positive, negative, and neutral sentiment tweets. Our analysis is based on the behavioral model that people would search for information (for example, using Google or Wikipedia) and, after gathering their information, potentially issue a Twitter tweet. In that model, shown in Fig. 4, we would expect the numbers of Google searches and Wikipedia page views to be predictive of Twitter tweets.

Fig. 4

General behavioral model of users’ search for online content and reaction

We used a set of variables to capture the relative percent of tweets that had “positive” sentiment (Pos Sent), “negative” sentiment (Neg Sent), and “neutral” sentiment (Neutral Sent). Table 4 reports the correlations between those percentages and the numbers of Google searches and Wikipedia page views (for coronavirus and COVID-19). As seen in Table 4, the numbers of Google searches and Wikipedia page views are statistically significantly related to the percentages of tweets with positive and with negative sentiment. The numbers of searches and page views are negatively related to the percent of positive sentiment tweets, and positively related to the percent of negative sentiment tweets. The percentage of neutral tweets is not statistically significantly related to Google searches or Wikipedia page views.

Table 4

Correlations and p values for models of sentiment

Correlations	Google	Wiki-Coronavirus	Wiki-COVID-19	Pos Sent	Neg Sent
Wiki-Coronavirus	0.9431
Wiki-COVID-19	0.9585	0.9455
Pos Sent	−0.5367	−0.5582	−0.5446
Neg Sent	0.5906	0.6367	0.6084	−0.8805
Neutral	0.2129	0.1571	0.1836	−0.5824	0.1511
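
Each entry in a correlation table such as Table 4 relates two weekly series. The sketch below computes a pairwise correlation under the assumption that Pearson's coefficient is meant (the text does not name the coefficient). The function name `pearson_r` and the five-week series are hypothetical and are not the study's data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical weekly values (NOT the study's data): a search-volume index
# and the percentage of tweets with positive sentiment.
search_index = [100, 80, 60, 40, 20]
pos_sent_pct = [20.0, 24.0, 25.0, 30.0, 31.0]

print(f"r = {pearson_r(search_index, pos_sent_pct):.4f}")
```

A negative coefficient in this toy example mirrors the direction reported for positive sentiment: weeks with more searches have a lower share of positive tweets.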

Next, we investigate lagged (1 week) variables to test the ability to forecast the relative percentages of positive and negative sentiment tweets. The relative number of Google searches from Google Trends is the lagged variable “Google-1.” We capture the two different sets of lagged Wikipedia page views as “Wiki-Coronavirus-1” (coronavirus) and “Wiki-COVID-19–1” (COVID-19), based on two different sets of pages (coronavirus and COVID-19). The correlation results for the lagged variables are summarized in Table 5, and the regression models in Table 6.

Table 5

Correlation and p values for models of sentiment—predictive

Correlation	Google-1	Wiki-Coronavirus-1	Wiki-COVID-19–1	Pos Sent	Neg Sent
Wiki-Coronavirus-1	0.9042
Wiki-COVID-19–1	0.9544	0.9449
Pos Sent	−0.5113	−0.5513	−0.5244
Neg Sent	0.5180	0.6215	0.5479	−0.8806
Neutral	0.2355	0.1792	0.2301	−0.5824	0.1511

Table 6

Predictive regression models of sentiment based on Google and Wikipedia searches

Dependent variable	Pos Sent	Pos Sent	Pos Sent	Neg Sent	Neg Sent	Neg Sent
R²	0.261	0.275	0.304	0.268	0.300	0.386
Google-1	−0.115			0.092
p value	0.0012			0.001
Wiki-COVID-19–1		−6.94E−05			5.71E−05
p value		0.0009			0.0004
Wiki-Coronavirus-1			−3.57E−06			3.16E−05
p value			0.0004			< 0.0001

As can be seen from Tables 4 and 5, the Google search variables and the Wikipedia page view variables are highly correlated with one another. Unfortunately, that correlation makes using more than one of these variables in the same regression equation difficult because of multicollinearity. Accordingly, we used only one of the variables in each regression equation, as reported in Table 6. This analysis showed that both Google searches and Wikipedia page views of “COVID-19” and “coronavirus” (each lagged one period) were statistically significantly predictive of the percent of Twitter messages with positive sentiment and with negative sentiment, but not neutral sentiment. The percent of Twitter messages associated with positive sentiment was negatively correlated with both Google searches and Wikipedia page views, whereas the percent of messages associated with negative sentiment was positively correlated with both the numbers of Google searches and Wikipedia page views. As a result, it appears that more Google searches and Wikipedia page views, ultimately, are related to more negative sentiment Twitter messages. This is an interesting behavioral finding that should be examined in other settings to determine if the same relationships hold. This is important because it provides a basis for a potential behavioral link between searches for information (Google searches and Wikipedia page views) and statements or positions issued through social media (Twitter).
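
A one-predictor regression on a one-week-lagged search variable, in the spirit of the models in Table 6, can be sketched as follows. This is a sketch under stated assumptions: the helper names `lag_pairs` and `simple_ols` are hypothetical, ordinary least squares with a single predictor is assumed, and the weekly numbers are invented for illustration rather than taken from the study.

```python
def lag_pairs(predictor, response, lag=1):
    """Pair predictor[t - lag] with response[t] for every valid week t."""
    if lag < 1 or lag >= len(predictor) or len(predictor) != len(response):
        raise ValueError("lag must be >= 1 and series must have equal length")
    return predictor[:-lag], response[lag:]


def simple_ols(xs, ys):
    """Fit y = intercept + slope * x; return (intercept, slope, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return intercept, slope, 1.0 - ss_res / ss_tot


# Invented weekly series (NOT the study's data): a search index and the
# percentage of tweets with negative sentiment.
google_index = [90, 100, 70, 60, 40, 35]
neg_sent_pct = [30.0, 33.0, 34.0, 29.0, 28.0, 25.0]

lagged_x, y = lag_pairs(google_index, neg_sent_pct, lag=1)
intercept, slope, r2 = simple_ols(lagged_x, y)
print(f"intercept={intercept:.3f} slope={slope:.4f} R^2={r2:.3f}")
```

Fitting one predictor per model, as here, sidesteps the multicollinearity that arises when several highly correlated search variables enter the same equation.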

Impact of COVID-19 Vocabulary Use of Twitter Reputation and Retweets

This section investigates, as an alternative view, the impact of COVID-19 on Twitter use: that is, whether the use of COVID-19 terms in Twitter tweets is statistically significantly related to an “influence measure” of Twitter users and whether the use of those COVID-19 words is related to retweets. Doing so allows us to study the effects of the pandemic on human communication by tracking the relationship between occurrences of words regarding COVID-19 and both social media influence and reuse of messages. Unfortunately, there has been limited research associated with studying the impact of COVID-19 on these issues.7 We, therefore, study the impact using text mining, supported by the generation and use of a dictionary of COVID-19 words. We then investigate the relationship between the dictionary words and Twitter “influence” scores, and between that dictionary of words and two measures of retweeting. This allows us to gain insights into the impact of COVID-19-based words on communications.

Creating a Coronavirus Dictionary

To assess whether a tweet used COVID-19 concepts, we first generated a dictionary of COVID-19 terms, broadly based on words related to symptoms and controlling the spread of the disease. As a result, we focus on subcategories of detecting the coronavirus (currently have or had in the past), preventing the coronavirus, curing (e.g., a vaccine), symptoms (e.g., fever), and different names for the virus. There are potentially many different terms that can be used to measure the extent to which text contains information about the coronavirus pandemic. We chose our terms based on the following process. First, we obtained a list of words and phrases that occurred in a list of coronavirus tweets and ranked them by the number of occurrences. Second, words not related to the disease were discarded; e.g., stop words were removed. Third, we reviewed the remaining frequently occurring words to identify the subcategory to which they belonged. Finally, additional words and phrases from the authors were added. Our focus was on developing a set of words aimed at isolating text related to the coronavirus. Table 7 shows the coronavirus dictionary words.

Table 7

Coronavirus dictionary words derived from Twitter

Temperature
Antibody test
Vaccine
Mask
Social distance
Lost sense of taste
Lost sense of smell
Loss of taste
Loss of smell
Fever
Corona
COVID-19
COVID
Coronavirus
Pandemic
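
The first two steps of the term-selection process (ranking words by number of occurrences and discarding stop words) can be sketched as below. The stop-word list, the tokenizer, the function name `rank_terms`, and the sample tweets are all assumptions made for illustration; the later steps (assigning subcategories and adding author-supplied phrases) are manual and are not modeled.

```python
import re
from collections import Counter

# Small illustrative stop-word list (assumption; the list actually used in
# the study is not given in the text).
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "i", "my", "is", "in",
              "have", "for", "it"}


def rank_terms(tweets, stop_words=STOP_WORDS, top_n=10):
    """Count non-stop-word tokens across tweets; return the most frequent."""
    counts = Counter()
    for tweet in tweets:
        # Keep hyphenated tokens such as "covid-19" together.
        for token in re.findall(r"[a-z0-9][a-z0-9\-]*", tweet.lower()):
            if token not in stop_words:
                counts[token] += 1
    return counts.most_common(top_n)


# Hypothetical tweets, not taken from the study's data.
sample_tweets = [
    "I have a fever and lost my sense of taste",
    "Wear a mask to stop the coronavirus",
    "Fever again, is it COVID-19? Mask on.",
]
for term, count in rank_terms(sample_tweets, top_n=5):
    print(term, count)
```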

We aggregated all of the occurrences of these dictionary words under the category “coronavirus.” As with LIWC, we use a bag of words approach, counting the number of words from the dictionary occurring in the tweets, to measure the effects of COVID-19 words on the Twitter communications. As a verification of the importance of the dictionary words in coronavirus text, we conducted a joint, quoted Google search to determine the co-occurrence of each word with “coronavirus,” as reported in Table 8.

Table 8

Joint number of occurrences with “coronavirus”

Dictionary word	Joint Google Occurrences
Temperature	197,000,000
Antibody test	10,000,000
Vaccine	195,000,000
Mask	680,000,000
Social distance	380,000,000
Lost sense of taste	41,600
Lost sense of smell	136,000
Loss of taste	2,000,000
Loss of smell	2,410,000
Fever	179,000,000
Corona	688,000,000
COVID-19	2,740,000,000
COVID	2,880,000,000
Coronavirus	2,790,000,000
Pandemic	701,000,000

The symptom words (e.g., lost sense of taste) had the lowest co-occurrence in our Google search. However, the symptoms of lost taste and smell in Table 8 seem particularly specific to the coronavirus. Despite their potentially low occurrence rate, we judged that they would allow us to isolate and characterize coronavirus discussions.
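
Counting occurrences of the Table 7 dictionary terms in a tweet, aggregated under a single “coronavirus” category, might look like the following sketch. The term list is taken from Table 7; everything else is an assumption: the function name `coronavirus_count`, whole-word matching, and the choice to try longer terms first so that “COVID-19” is counted once rather than also as “COVID.” How overlapping terms were handled in the study is not stated.

```python
import re

# Dictionary terms from Table 7, lowercased.
CORONAVIRUS_TERMS = [
    "temperature", "antibody test", "vaccine", "mask", "social distance",
    "lost sense of taste", "lost sense of smell", "loss of taste",
    "loss of smell", "fever", "corona", "covid-19", "covid", "coronavirus",
    "pandemic",
]

# Assumption: whole-word matches only, longer terms tried first so that
# overlapping terms ("covid-19" vs. "covid") are not double counted.
_PATTERN = re.compile(
    r"\b(?:"
    + "|".join(re.escape(term)
               for term in sorted(CORONAVIRUS_TERMS, key=len, reverse=True))
    + r")\b"
)


def coronavirus_count(tweet):
    """Number of dictionary-term occurrences in one tweet."""
    return len(_PATTERN.findall(tweet.lower()))


# Hypothetical tweet, not taken from the study's data.
print(coronavirus_count("Lost sense of taste and a fever, is it COVID-19?"))
```

Exact matching of this kind misses inflected forms such as “masks”; whether to add stemming or plural forms is a design choice this sketch leaves open.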

Linguistic Inquiry and Word Count for Control Variables

We used LIWC to provide the control variables for our analysis of influence and retweeting. LIWC was used to capture both structural and semantic information about tweets in order to study the impact of our coronavirus dictionary. LIWC “structural” variables are the number of words in the text (WC) and items related to the style or difficulty of the text, such as the number of words with six or more letters (Sixltr) and the number of words per sentence (WPS). We chose those control variables for several reasons. Word count provides a measure of the actual length of the message. Both WPS and Sixltr capture the “level” at which the tweets are written. More words per sentence and more words of six or more letters likely indicate a more “educated” tweeter. Alternatively, WPS and Sixltr together provide measures of “readability” or “ease of readability.” These three variables are measured based on the number of occurrences. Table 9 summarizes the structural variables used in this research.

Table 9

LIWC structural variables used

Word group	Description (Pennebaker, Booth, Boyd, and Francis 2015)
WC	Word count
WPS	Words per sentence
Sixltr	Percent of six (or more) letter words

Although LIWC provides several semantic word sets, we focus on the “summary words” that provide measures of the occurrences of some concepts in the text. They can provide control variables over which to normalize the impact of the text content on our dependent variables, in order to test the specific effects of our COVID dictionary.
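
As a rough illustration of the three structural variables in Table 9, the sketch below computes WC, WPS, and Sixltr for a single text. It is not LIWC's implementation: the tokenizer, the sentence splitter, and the function name `structural_variables` are assumptions. The six-letter threshold follows the wording of Table 9 and is exposed as a parameter, since the exact cut-off LIWC applies should be checked against its manual.

```python
import re

def structural_variables(text, min_len=6):
    """Approximate WC, WPS and Sixltr for one text.

    Assumptions: words are runs of letters, digits, apostrophes or hyphens;
    sentences end at '.', '!' or '?'; Sixltr is the percentage of words
    with at least min_len characters.
    """
    words = re.findall(r"[A-Za-z0-9'-]+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    wc = len(words)
    wps = wc / len(sentences) if sentences else 0.0
    sixltr = 100.0 * sum(len(w) >= min_len for w in words) / wc if wc else 0.0
    return {"WC": wc, "WPS": wps, "Sixltr": sixltr}

result = structural_variables("Please take this virus seriously. Be safe.")
```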

Dependent Variables: Influence Score, Retweeted Status User Listed Count and “Is a Retweet”

We chose three dependent variables: the Twitter influence score, the retweeted status user listed count, and whether the particular tweet was retweeted (“Is a Retweet”), using a dataset generated from https://birdiq.net/twitter-search. Twitter influence scores capture substantial information about the use of Twitter [21]. Canals [22] indicates that the influence score is a joint function of the number of followers, the number following, and the number of posts in the Twitter account and is heavily historical. The “retweeted status user listed count” and whether a tweet is retweeted (“Is a Retweet”) provide measures of the users’ current interest in the information.

Data

We collected a random sample of 900 Twitter tweets from https://birdiq.net/twitter-search during July 2020 using seed words of COVID and coronavirus. Of those 900, we eliminated the ones that were not written in English, bringing the final number of tweets used to 770. It was important to eliminate non-English tweets because our analysis used an English dictionary and depends on being able to count the number of English words in each category. Approximately 85% of the collected tweets were in English and 15% were not; the non-English tweets were largely Spanish and French.

Empirical Analysis: Correlation and Regression Analysis

We used both correlation analysis and regression analysis to analyze our data. Since “Is a Retweet” is a nominal variable, we investigated it using logistic regression. In the regression analysis, we used the variance inflation factors (VIFs) to determine the extent of multicollinearity among our independent variables. Our largest VIF score did not exceed 1.3 and, thus, was well below the standard of 4 in the literature [23], suggesting very limited multicollinearity.
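
For readers unfamiliar with the VIF, the sketch below shows one common way to compute it: regress each independent variable on the others and take 1 / (1 − R²). It is a plain-Python illustration on made-up data, not the statistical package used in the study; the helper names `_solve`, `r_squared`, and `vifs` are hypothetical.

```python
def _solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def r_squared(y, columns):
    """R-square of an ordinary least squares fit of y on columns plus an intercept."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[p] * row[q] for row in design) for q in range(k)]
           for p in range(k)]
    xty = [sum(row[p] * y[i] for i, row in enumerate(design)) for p in range(k)]
    beta = _solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in design]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def vifs(columns):
    """VIF of each column: 1 / (1 - R^2) from regressing it on the others."""
    out = []
    for j, col in enumerate(columns):
        others = columns[:j] + columns[j + 1:]
        out.append(1.0 / (1.0 - r_squared(col, others)))
    return out

# Made-up predictors, for illustration only.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
example_vifs = vifs([x1, x2])
```

Read the other way round, a VIF of 1.3 corresponds to an R² of about 0.23 when that variable is regressed on the remaining predictors.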

Empirical Findings

Our analysis took four different approaches. First, we conducted a correlation analysis between each of our continuous variables. Second, we investigated the Twitter influence score in two steps: first, with the structural control variables and the summary text content variables; and second, with the control variables, the summary text variables, and the coronavirus variable. This allowed us to assess the direct impact of the words in our dictionary. Third, we conducted a regression analysis of the continuous variable, retweeted status user listed count, with all our variables. Fourth, we performed a logistic regression on the nominal variable “Is Retweet,” with our control and dictionary variable.

Correlation Analysis

This section uses correlation analysis to investigate the relationships between our variables with particular emphasis on influence score. The correlations are summarized in Table 10 and the p values in Table 11.

Table 10

Correlation analysis

	Influence score	WC	WPS	Sixltr	Analytic	Clout	Authentic	Tone
Influence score
WC	− 0.1298
WPS	0.1094	0.2596
Sixltr	0.0472	− 0.2532	− 0.0655
Analytic	− 0.0319	− 0.1999	0.1339	0.3223
Clout	0.0866	− 0.0763	− 0.0591	0.086	0.0143
Authentic	0.0045	0.1088	− 0.0001	− 0.2195	− 0.1871	− 0.4209
Tone	0.1178	0.0483	− 0.0727	0.022	− 0.1543	− 0.0682	0.1774
Coronavirus	− 0.1976	− 0.0705	0.0372	− 0.1446	0.1376	− 0.2085	0.0007	− 0.2153

Table 11

p-Values for correlation analysis

	Influence score	WC	WPS	Sixltr	Analytic	Clout	Authentic	Tone
Influence score
WC	0.0005
WPS	0.0033	< 0.0001
Sixltr	0.2061	< 0.0001	0.0695
Analytic	0.3932	< 0.0001	0.0002	< 0.0001
Clout	0.0202	0.0344	0.1010	0.017	0.6919
Authentic	0.9035	0.0025	0.9976	< 0.0001	< 0.0001	< 0.0001
Tone	0.0015	0.1805	0.0438	0.5429	< 0.0001	0.0584	< 0.0001
Coronavirus	< 0.0001	0.0506	0.303	< 0.0001	0.0001	< 0.0001	0.9854	< 0.0001

As can be seen from Table 11, the coronavirus variable is statistically significantly correlated with the influence score. In addition, two of the structural control variables, WC and WPS, are statistically significantly correlated with the influence score. Finally, two of the summary text variables, clout and tone, are also statistically significantly correlated with influence score. For those statistically significantly correlated variables, the signs on word count and coronavirus were negative, whereas the signs on the other three were positive.
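
The link between a correlation in Table 10 and its p value in Table 11 can be illustrated with the usual t statistic for a Pearson correlation. The sketch assumes a sample size of 720 (the number of observations reported for the regressions; the sample size behind the correlations is not stated) and replaces the t distribution with a normal approximation, so the results only approximate the tabulated values. The function name `correlation_p_value` is hypothetical.

```python
import math

def correlation_p_value(r, n):
    """Return (t, p) for a Pearson correlation r from n observations.

    t = r * sqrt((n - 2) / (1 - r^2)); the two-sided p value uses a normal
    approximation to the t distribution, which is reasonable for large n.
    """
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    p = math.erfc(abs(t) / math.sqrt(2.0))
    return t, p

# Influence score vs. WC and vs. clout, correlations taken from Table 10.
t_wc, p_wc = correlation_p_value(-0.1298, 720)
t_clout, p_clout = correlation_p_value(0.0866, 720)
```

Under these assumptions, both p values fall close to the corresponding entries in Table 11.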

Regression Analysis of Tweeter Influence Score

Table 12 shows that a model including the structural variables and the summary text variables generates an R-square value of 0.075. Table 13 summarizes the model variables. Two of the structural control variables, word count and words per sentence, and the summary text variables analytic, clout, and tone were statistically significant. We therefore conclude that, for this set of tweets, the influence score is statistically significantly related to the analytic, clout, and tone text variables. Each of the VIFs (variance inflation factors) is less than 4, suggesting minimal multicollinearity (Hair et al. [24] and others).

Table 12

Regression model fit for control variables and summary text variables, without coronavirus dictionary

Fit measure	Value
R-square	0.075
R-square adj	0.066
Root mean square error	32,992.3
Mean of response	17,153.2
Observations (or sum wgts)	720

Table 13

Regression model of influence score for control variables without coronavirus dictionary

Term	Estimate	Std error	t ratio	Prob > |t|	VIF
Intercept	12,011.93	9077.604	1.32	0.1862
WC	− 812.178	164.3373	− 4.94	< 0.0001	1.219
WPS	933.0234	190.5672	4.9	< 0.0001	1.154
Sixltr	114.8679	133.8729	0.86	0.3912	1.190
Analytic	− 124.673	58.85601	− 2.12	0.0345	1.248
Clout	157.134	61.67107	2.55	0.011	1.209
Authentic	55.9025	53.50231	1.04	0.2964	1.259
Tone	138.03	39.85452	3.46	0.0006	1.057

Finally, in Tables 14 and 15 we add our new dictionary variable on the coronavirus. The R-square increases to 0.101, a statistically significant increase. In addition, the same control variables (WC and WPS) and one of the summary text variables (tone) have coefficients that remain statistically significant. The coefficient on our coronavirus dictionary is also statistically significant and negative, as in the correlation matrix.

Table 14

Regression model fit for control, summary text, and coronavirus variables

Fit measure	Value
R-square	0.101
R-square adj	0.091
Root mean square error	32,550.9
Mean of response	17,153.2
Observations (or sum wgts)	720

Table 15

Regression model of influence score for control, summary text, and with coronavirus variable

Term	Estimate	Std error	t ratio	Prob > |t|	VIF
Intercept	23,737.56	9324.126	2.55	0.0111
WC	− 852.054	162.3785	− 5.25	< 0.0001	1.223
WPS	913.777	188.0659	4.86	< 0.0001	1.155
Sixltr	− 9.09987	134.898	− 0.07	0.9462	1.242
Analytic	− 74.8425	59.10535	− 1.27	0.2058	1.293
Clout	93.69627	62.44295	1.5	0.1339	1.274
Authentic	33.09257	53.0271	0.62	0.5328	1.270
Tone	99.97911	40.21196	2.49	0.0131	1.105
Coronavirus	− 2046.01	452.5503	− 4.52	< 0.0001	1.198
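
The increase in R-square from 0.075 (Table 12) to 0.101 (Table 14) can be checked with the standard incremental F statistic for nested regression models. This is an illustrative calculation from the rounded values in the tables, not output from the study; the function name `incremental_f` is hypothetical. When a single variable is added, the F statistic should be close to the square of that variable's t ratio.

```python
def incremental_f(r2_reduced, r2_full, n_obs, n_predictors_full, n_added=1):
    """F statistic for the gain in R-square when n_added predictors are added."""
    df_resid = n_obs - n_predictors_full - 1
    return ((r2_full - r2_reduced) / n_added) / ((1.0 - r2_full) / df_resid)

# Values from Tables 12 and 14: 720 observations, 8 predictors in the full model.
f_stat = incremental_f(0.075, 0.101, n_obs=720, n_predictors_full=8)

# t ratio of the coronavirus coefficient, recomputed from Table 15.
t_ratio = -2046.01 / 452.5503
```

The small gap between `f_stat` and `t_ratio ** 2` is what one would expect from the rounding of the reported R-square values.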


Analysis of “Retweeted Status User Listed Count”

We use the same model as in the previous analysis of influence score to study one aspect of the impact of retweeting: the retweeted status user listed count (RSULC) of those re-tweeters. The measures of fit for the equation are in Table 16 and the regression model is in Table 17.

Table 16

Regression model fit for retweeted status user listed count model

Fit measure	Value
R-square	0.104
R-square adj	0.090
Root mean square error	28,595.8
Mean of response	9815.567
Observations (or sum wgts)	552

Table 17

Regression model of retweeted status user listed count model for control, summary text, and coronavirus variables

Term	Estimate	Std error	t ratio	Prob > |t|	VIF
Intercept	− 58,178.2	13,570.33	− 4.29	< 0.0001
WC	1243.338	420.5091	2.96	0.0032	1.790
WPS	866.508	220.8383	3.92	< 0.0001	1.148
Sixltr	784.2514	168.1955	4.66	< 0.0001	1.861
Analytic	− 1.83134	59.21576	− 0.03	0.9753	1.284
Clout	112.8247	67.23197	1.68	0.0939	1.350
Authentic	− 77.6443	55.63054	− 1.4	0.1634	1.386
Tone	− 30.8501	39.99352	− 0.77	0.4408	1.103
Coronavirus	1098.97	474.1582	2.32	0.0208	1.154

These results suggest that the vocabulary in the coronavirus dictionary is positively related to the RSULC. Further, each of the structural variables had p values that were statistically significant.

Analysis of “Is Retweet”—Logistic Regression

In this section, we discuss the analysis of the dependent variable “Is Retweet,” which takes on two values: true and false. Because it is a nominal variable, we use logistic regression to analyze the data. The measures of fit are summarized in Table 18 and the coefficients in Table 19, for the model of “Is Retweet” with each of the control variables, the summary text variables, and the coronavirus variable. The p values on the coefficients of the structural control variables WC and WPS were statistically significant, as they were in the estimation of influence score. However, whereas tone was statistically significant in the estimate of influence score, in the case of “Is Retweet” the p values on the coefficients for clout and authentic were statistically significant. Finally, the coronavirus variable was also statistically significant in estimating “Is Retweet,” as it was in the model of influence score. The results in Table 19 indicate that, among the statistically significant coefficients, only those on word count and the coronavirus variable were positive, whereas those on words per sentence, clout, and authentic were negative.

Table 18

Logistic regression measures of fit for “Is Retweet” model

Measure	Training
Entropy R-square	0.3123
Generalized R-square	0.4344
Mean—Log p	0.374
RMSE	0.3324
Mean absolute deviation	0.2285

Table 19

Logistic regression coefficients for structural, summary text, and coronavirus variables estimate of “Is Retweet” model

Term	Estimate	Std error	Chi-square	Prob > chi-sq
Intercept	− 1.93167	0.775761	6.2	0.0128
WC	0.178136	0.018811	89.68	< 0.0001
WPS	− 0.0317	0.015999	3.93	0.0475
Sixltr	− 0.02377	0.012997	3.34	0.0675
Analytic	− 0.00398	0.005134	0.6	0.4377
Clout	− 0.03581	0.005836	37.65	< 0.0001
Authentic	− 0.02966	0.005704	27.04	< 0.0001
Tone	− 0.00627	0.003718	2.85	0.0916
Coronavirus	0.234212	0.039816	34.6	< 0.0001
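
To make the logistic coefficients in Table 19 easier to read, the sketch below converts a few of them to odds ratios and recomputes the Wald chi-square as the squared ratio of estimate to standard error. Estimates and standard errors are copied from Table 19; the function name `summarize` is hypothetical. One assumption is made explicit: the odds ratios are interpreted as applying to the outcome level that the software modeled, which the text does not state.

```python
import math

# (estimate, standard error) copied from Table 19.
COEFFICIENTS = {
    "WC": (0.178136, 0.018811),
    "WPS": (-0.0317, 0.015999),
    "Clout": (-0.03581, 0.005836),
    "Authentic": (-0.02966, 0.005704),
    "Coronavirus": (0.234212, 0.039816),
}

def summarize(estimate, std_error):
    """Odds ratio per one-unit increase, and the Wald chi-square statistic."""
    return {
        "odds_ratio": math.exp(estimate),
        "wald_chi_square": (estimate / std_error) ** 2,
    }

summary = {term: summarize(*values) for term, values in COEFFICIENTS.items()}
```

The recomputed chi-square values agree with the Chi-square column of Table 19 up to rounding, and an odds ratio above 1 corresponds to a positive coefficient.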


Summary

This section compares the results across the three dependent variables. The p values for word count and words per sentence are statistically significant throughout this section. In the regression model of the influence score, the coefficient on the word count has a negative sign. However, for the logistic regression model of the retweeting dependent variable, the sign is positive. Words per sentence is statistically significant in each of the three models. The coefficient of the text variable tone is positive and statistically significant in the regression model of influence score. However, the coefficients on clout and authentic were negative and statistically significant in the model of retweets. Finally, the p values for our coronavirus dictionary are statistically significant for all three dependent variables. The statistically significant results on our coronavirus dictionary suggest that the coronavirus pandemic has created a vocabulary of its own, including the previously unknown term COVID-19, and that this vocabulary influences human communication as captured in Twitter. Why does our dictionary have a negative sign in the estimation of the influence score and a positive sign in the estimation of the retweet measures? We conclude that this is an indication of change in the information being diffused, and that such information needs a “new dictionary” to identify it. Influential Twitter users have an established set of followers, posts, and topics. As a result, there is likely to be a consistency in their tweets. However, new and important topics that emerge, such as the coronavirus pandemic, can attract great interest and result in retweets. Our results suggest that those creating the coronavirus tweets are a different set of users, diffusing a different set of information than more established Twitter users normally would. These results should be important for information technology research.
The coronavirus dictionary allowed us to track the changes in vocabulary use in Twitter communications. This comparison between the impact on influence score and retweets allows us to monitor these changes. Special emerging technology dictionaries could be used with a range of technologies to capture and measure the information diffusion associated with such technologies.

Manual Analysis of Ad Hoc Tweets

In addition to the computer-based analysis, we manually analyzed the posts at a finer granularity, by adopting the work of Scherer [25], who identifies 36 ontological categories that deal with affect: admiration/awe, amusement, anger, anxiety, being touched, boredom, compassion, contempt, contentment, desperation, disappointment, disgust, dissatisfaction, envy, fear, feeling of affection/love, gratitude, guilt, happiness, hatred, hope, humility, interest/enthusiasm, irritation, jealousy, joy, longing, lust, pleasure/enjoyment, pride, relaxation/serenity, relief, sadness, shame, surprise, and tension/stress. These terms help to identify sentiment in natural language [26, 27]. We read each of the tweets and classified them as factual or emotional, based on Scherer’s [25] categories. Approximately 55% of the posts were factual, simply referring to a fact (without emotion or sentiment) intended to be true at the time of posting. Other tweets were factual with some type of emotion and included direct reporting of patient experiences (5%). The remaining tweets reflected only a few emotions (anger, anxiety, desperation, disgust, hope, and surprise).
An example of a tweet classified as factual (which could be falsified later) was: “The spread of #COVID19 by an asymptomatic or someone who is not showing any symptoms appears to be less likely, said #WHO (@WHO) in the recently published summary of transmission of COVID-19 including symptomatic, pre-symptomatic and asymptomatic patients.” An example of the emotion fear was: “A patient with symptoms of a heart attack refused treatment after reading on Facebook that she would die if she went to hospital during the COVID-19 crisis.” Another tweet (again subject to later falsification), but intended to be factual, was: “@CDCgov issued some very useful current best estimates:—About 1/3 of COVID-19 infections are asymptomatic.—40% of transmission is occurring before people feel sick.—Time from exposure to symptom onset: ~ 6 days on average.” Using an online tool, https://birdiq.net/Twitter-search, we manually collected tweets based on keywords, such as COVID-19, CDC, and WHO. These tweets (over 2000) were reviewed to identify sentiments and insights that would not be possible to extract using an automated tool, again, to gain insight into user behaviors. We strived to show the value of manual mining, recognizing that this type of analysis is not feasible on a large scale. Table 20 provides examples of tweets and classifies them based on their actual, or assumed, intended (potential or real) significance.

Table 20

Data set of ad hoc tweets

Keyword	Tweet	Significance
COVID-19	CNN: The US has recorded more than 3.3 million coronavirus cases since the pandemic began, meaning nearly 1 out of every 100 Americans has tested positive for COVID-19, according to Johns Hopkins University. More than 135,000 Americans have died	Factual
	As Ontario prepares to reopen indoor bars and restaurants, here’s a story about how anyone who visited a bar in Montreal since July 1 is being told to get tested for #COVID19 bc of outbreaks linked to several establishments. Via @mtlgazette (Montreal Gazette) 13 July 2020	Procedures for managing crisis
	The last old drug we repurposed for #COVID_19 had 6425 patients—#RECOVERY trial for #Dexamethasone. A 30 patient study is not enough to prove efficacy even for an old drug. And as for “doctors seeing outstanding results”, the plural of anecdote is not data—Clinical Trials 101	Treatment efforts
	A group of Durban-based businessmen, who started a non-profit organisation to locally produce ventilators to meet the need of COVID-19 patients, has received regulatory approval for their locally made product. @CowansView South Africa	Treatment innovation
	#AndhraPradesh #COVID_19 Special buses to collect samples for COVID-19 tests. A unique and great initiative indeed	Testing innovation
	I tested positive for COVID-19 prior to my teams departure … Please take this virus seriously. Be safe. Make up! #whynot	Influencer (celebrity)
	@MailOnline Antibodies from LLAMAS could be developed as a treatment for COVID-19 patients	Treatment innovation
	CNN Dr. Anthony Fauci, the nation’s top infectious disease expert, believes the country is on track to find treatments that will help prevent the progression of COVID-19 disease, particularly for people who are the most likely to get extremely sick	Treatment innovation
	ABS-CBN News Channel Malacañang claims hospitals have enough beds for #COVID19 patients	Healthcare
	@politico Visitors to New York from states where COVID-19 infections are on the rise could face a $2,000 fine if they fail to provide information about where they plan to quarantine for two weeks, Gov. Andrew Cuomo said Monday	Regulation
	Christiane Amanpour @camanpour .@edyong209: “A country that, 7 months into a pandemic, still cannot ensure that its healthcare workers have enough gowns and gloves and protective equipment is not going to be able to distribute a vaccine in an efficient way. It simply isn’t.”	Healthcare
WHO	Tedros Adhanom Ghebreyesus Solidarity is the key ingredient to fight #COVID19. I am glad that the European nations: France, Germany, Luxembourg, Switzerland & Austria are leading by example. Together, we can stop the virus from spreading, save lives & build back better! #BastilleDay	Awareness and advice
	To stay safe, they need training & personal protective equipment. We have online courses at http://OpenWHO.org in many languages We work with private sector, partners to send supplies	Awareness and advice
	#COVID19 pandemic could tip over 130 million more people into chronic hunger by the end of 2020, adding to persistent hunger & #malnutrition—new & WHO #SOFI2020 report highlights challenges to achieving 0 hunger by 2030	Awareness and advice
CDC	@DrTomFrieden We’re seeing unprecedented attacks on science, on public health, on CDC. If there was that much focus attacking the virus that causes COVID instead, we’d all be safer	Awareness and advice
CDC	@HHSGov 23 h Wearing a face covering and staying six feet apart doesn’t just protect you, it protects those around you. Learn more about doing your part during #COVID19: https://bit.ly/304qbdX #COVIDStopsWithMe	Awareness and advice

From this sample, the most likely keyword, COVID-19, revealed a variety of expected tweets on the spread of the virus, its seriousness based on experiences and testimonials, testing, innovative approaches to testing and treatment, and other topics. The tweets shown from the WHO and CDC relate to advice and awareness.

Searches for Nuggets

The notion of the wisdom of the crowds implies that sometimes the crowd is able to perform better than individuals [28]. We investigated whether the content, as provided by the user community of Twitter (the crowd), could provide insights that might be helpful to the general public, or perhaps even medical professionals. The types of insights we were looking for required a human to identify what might be useful content and extract ideas at the tweet level of analysis [29]. Therefore, we reviewed approximately 2000 tweets posted throughout the pandemic. We attempted to identify nuggets; that is, pieces of information with the potential to have real value or use, beyond just a post. Examples of potentially influential tweets are given below. The first is a best practices suggestion.

Tweet (factual/sharing of best practices): #itvnews Many German patients were given oximeters in the community back in April. Other places have also recommended this. https://www.thailandmedical.news/news/COVID-19-tips-oximeters,-a-potential-home-tool-to-monitor-progress-of-COVID-19-symptoms-from-mild-to-moderate-and-to-detect-COVID-19-pneumonia-early

The following tweet shows passing on blood type information from a legitimate news source. Such information could be useful for someone assessing their own risk (e.g., for potential usefulness).

Tweet (blood types): This study finds COVID patients with type A blood are at much higher risk of developing life-endangering symptoms, patients with type O blood experience a “protective effect” https://www.nytimes.com/2020/06/03/health/coronavirus-blood-type-genetics.html

However, a later study by Harvard showed that people who were symptomatic and had blood types of B + or AB + were more likely to have a positive COVID-19 test than people who were symptomatic with blood type O. The following tweet is factual. Knowing how long the illness might last would be useful to anyone concerned with whether they are experiencing a typical duration.

Swiss TV news (factual): Half of patients (500/1000) contacted by a COVID 19 follow-up service report symptoms after 6 weeks https://www.rts.ch/play/tv/19h30/video/le-virus-recule-et-le-nombre-de-gueris-du-COVID-sont-tres-nombreux--mais-cette-nouvelle-maladie-laisse-parfois-des-traces-?id=11370777

The following tweet reports on a medical study and would be useful for anyone concerned with how seriously the virus might infect them.

Factual: Low levels of the prognostic biomarker suPAR are predictive of mild outcome in patients with symptoms of COVID-19 - a prospective cohort study. Authors: jesper eugen-olsen, Izzet Altintas, Jens Tingleff, Marius St... http://medrxiv.org/cgi/content/short/2020.05.27.20114678

The following two tweets show associations between patient characteristics and occurrence of the disease. These posts are interesting in the sense that the associations being made are non-intuitive. However, they serve as examples of the types of tweets that might trigger self-reporting of whether a person falls into one of these categories, which, in turn, could lead to further investigation to uphold or falsify the conclusions from the reports.

Factual (implications true or falsified later): In one report, dermatologists evaluated 88 COVID-19 patients in an Italian hospital and found 1 in 5 had some sort of skin symptom, mostly red rashes over the trunk. https://inq.news/COVID-toes

Factual (implications true or falsified later): Most #coronavirus patients had no hair https://www.hulldailymail.co.uk/news/uk-world-news/bald-men-could-risk-more-4194866

The tweets below could be important because they provide information on the virus itself, as well as a potential treatment, but are not scientifically proven.

COVID-19 maybe mutating but it’s for the good. Doctors in Italy have claimed that the symptoms of COVID-19 and their intensity is less than what they experienced with the first wave of patients. This suggests that COVID-19 gets weaker as it spreads. https://elemental.medium.com/could-the-coronavirus-be-weakening-as-it-spreads-928f2ad33f89

A new drug, #famotidine, available over-the-counter for relieving #heartburn, has shown promising results in treating the symptoms of #COVID19 https://www.firstpost.com/health/heartburn-drug-famotidine-may-reduce-symptoms-of-non-hospitalised-COVID-19-patients-suggests-case-series-8452421.html

As these tweets illustrate, they provide useful, or potentially useful, information when so much is unknown about this global crisis. Human judgment is needed to assess the validity of the claims in the tweets, with scientific study clearly required for some of them. However, the potential value of the information contained within a tweet could not easily be obtained by software.

Twitter Use

People generally turned to Twitter as a platform to make sense of the pandemic. The tweets showed that people also wanted to provide useful information for others, sharing their opinions and knowledge. There were many compassion posts triggered by personal situations.

Example (desperation/disgust): My father, 62 yr suffering from high fever (103-104) from 9 days with no other apparent symptoms. He tested negative on COVID 19. He has history of CABG in 2006. Our family doctor advised to get him admitted. No hospital is accepting patient with fever. Pls help #caremongers

Example (factual/tension/stress): My friend is a nurse & finally broke her silence. She said she’s seeing COVID-19 patients leaving the hospital after COVID with kidney damage. Others will suffer with COPD like symptoms for the rest of their lives. It’s very scary.

Example (disgust): MILD. There’s a huge amount at stake in term mild – for gov actors, health service planners, clinicians, patients, carers...In the days when my own ‘mild’ #COVID19 symptoms have been manageable (Day 52 now), I reflected on mild COVID-19 for @somatosphere https://t.co/rQ9wFdcSQ7?amp=1

This mining revealed a great many posts with different perspectives. Many posts were intended to provide useful information. However, some of the posts that reported information considered to be factual (e.g., do not need to wear masks) had the potential to later be proven false. Ideally, the mining for nuggets could produce insights for the management of the virus. For example, some posts reported successful convalescent plasma treatments, leading to requests for plasma donations from recovered patients. Other tweets reported some members of a family getting the virus while others who lived at the same location did not. Such reports might be of interest to researchers trying to find commonalities in these cases. The Appendix contains tweets mined from an additional data set. They reveal a combination of medical innovations (attempted or actual), health information, sentiment and personal reports, opinions, and creative comments. The tweets reveal a need to contribute to an ongoing crisis by providing medical information, contributing to the global conversation on COVID-19, or seeking help.

Themes—Summary

The use of Twitter as a critical social media tool in times of major communication needs was obvious, with Twitter text providing valuable insights into users’ opinions and attitudes. The same held true as other world events unfolded; for example, the Arab Spring and Japan’s earthquakes [6]. For COVID-19, the sentiment analysis revealed a change over time as the pandemic progressed. The most notable trend was that tweet content progressed from providing, and seeking, factual information to expressing emotions, including anger. Prior research found that Twitter, along with other social media, could be used as a predictor of COVID-19 cases and other threats to community health [30]. It is likely there will be continued use of these platforms. The development of large databases of tweets or other user-generated content should, thus, continue to provide substantial research opportunities to investigate COVID-19 or other issues related to global challenges of such magnitude [7]. Content themes emerged. The tweets emphasized information on testing, treating, reporting of well-known figures who tested positive, warnings about the severity of the disease, and other health-related information. Additional themes related to politics, reported scientific breakthroughs (some of which were later shown to be false), economics, reopening of schools and businesses, and others. We attempted to understand the content of the tweets using sentiment analysis. Many tweets were factual; others showed predictable sentiments of anger, desperation, and hope. Of interest was how Twitter might be used to identify information nuggets, in the traditional sense of a valuable idea. This involved manual inspection and mining. One nugget was a relatively early suggestion that a hospital in India collect the blood of patients who had recovered from COVID-19. Later, the identification of blood type was scientifically investigated as an indicator of the risk of experiencing the disease.
However, there does not appear to be a way for a computer program to connect these two, demonstrating the limitations of tools to extract inherent information in text data [12]. In the same way, there is much intentional or non-intentional sharing of misinformation, often referred to as fake news [31, 32]. A computer program that can deal with sentiment well might be able to identify tweets with specific content and others with opposite or contradictory content. We did not, for example, investigate tweets that suggested the COVID-19 pandemic should not be taken seriously. Instead, we considered reasons why people elected to share content. Representative examples are provided in Table 21.

Table 21

Sharing of Twitter content

Representative tweet	Reasons and sentiment
Sophie Grégoire Trudeau has donated her blood to an expansive Canadian study of whether or not antibody-carrying plasma from people who have recovered from COVID-19 can help patients still trying to overcome the illness, iPolitics has learned #cdnpoli https://ipolitics.ca/2020/05/30/sophie-gregoire-trudeau-donates-blood-for-COVID-19-convalescent-plasma-study/	Reason for sharing: intrigued by actions of notable people Sentiment: admiration
Diagnostic Tip: Get tested for COVID if you have ANY symptom from this list	Reasons for sharing: being helpful
A patient with symptoms of a heart attack refused treatment after reading on Facebook that she would die if she went to hospital during the COVID-19 crisis	Reason for sharing: warning Sentiment: absurd, overreaction
For many COVID-19 patients, symptoms can linger for weeks after the virus clears their system and full recovery can take longer still. Some are finding themselves unable to shake sickness and fatigue and get back to work. From The New York Times	Reason for sharing: factual information helpful to others
Each COVID hospital in all cities needs to have a list of COVID recovered patients with their blood groups so that they can be reached out for to donate for convalescent plasma therapy	Reason for sharing: express actionable opinion
@SenatorWicker More than 11,000 children test positive for coronavirus in Florida As the Florida Department of Education mandates that public K-12 schools must open in August, thousands of children in Florida are continuing to test positive for COVID-19	Reason for sharing: informative and warning Sentiment: frustration
Top Read: Many patients with serious symptoms have delayed care due to fear of COVID-19	Reason for sharing: information and warning
The treatment was administered to 73 COVID-19 patients in UAE, with moderate to severe symptoms. All those patients have responded well to the therapy. The team of researchers, however, insisted that despite initial success, further data should be gathered	Reason for sharing: informative Sentiment: factual (hopeful)
RT @MollyYaLa: florida is now the epicenter for COVID-19 not of the US but THE WHOLE WORLD… https://t.co/l5L5fZlB8P…	Reason for sharing: factual, awareness
It took 95 days for us to reach 1,000,000 cases of COVID-19. It took 43 days for us to reach 2 million cases. It took 28 days for us to reach 3 million cases. The virus is accelerating. Even someone with 1/100th of a brain should see that	Reason: awareness Sentiment: scare
schools can’t even control lice, but think they can control COVID-19 lol	Analogy

Many other investigations are possible. For example, could the impact of international protests be factored into the sentiment analysis? Is it possible to identify a “tipping” point where people realized the importance of being vigilant (wearing masks, etc.) based on posts reported by infected people relaying the seriousness of the disease to others? Twitter has been used for social debates and expressions of public opinion (e.g., [33]). No doubt, it will continue to be used in this manner for topics of large, public impact. However, with millions of tweets being generated each day, our study has involved a limited number of texts, restricted to those written in English. It would be useful to expand the categories of sentiment we use as well as to determine whether there were any age group or gender differences in the negative tweets. Finally, finding the true nuggets will, no doubt, require a huge, semi-automated approach, but doing so might help to identify insights that could lead to the development of better sentiment analysis tools. What we learn from Twitter as a platform is the potential to reach a large audience and provide much information, informative or otherwise. Of course, there are many issues, but it is not possible to verify them without scientific experimentation and reporting of actual numbers. For example, at one point, based on data from Italy and the UK, a website reported men as having approximately twice the number of deaths as women.9 Overall, not all insights can be obtained using existing sentiment analysis tools, but a limited amount of insight can be obtained from manual mining.

COVID-19 Health Management Life Cycle, Emerging Pandemic Issues, and Computational Extensions10

This paper has used data collected early in the COVID-19 life cycle. At the time of data collection, it was unimaginable that multiple COVID variants would emerge. Nor was it foreseeable that, after 2 years, the end of the pandemic would still not be in sight. However, these realizations suggest that the COVID-19 pandemic has a sustained life cycle with many events that also could be investigated. Because that life cycle has many implications, we examine the basic notion of a COVID-19 life cycle and some of its implications. In addition, we examine some computing extensions for using bags of words in order to address issues beyond the psychological concerns addressed in this paper. We examine potentially using Word2Vec and other approaches in future models and generate a list of business-based COVID-19 words. We also investigate the potential opportunities for the application of symbolic and sub-symbolic AI for sentiment analysis of COVID-19 as well as other emerging trends.

COVID-19 Health Management Life Cycle and Related Problems

As the COVID-19 pandemic continues, with new variants such as “omicron” emerging, and no solid end in sight, it is clear there is a life cycle to the pandemic (e.g., [34]) that affects healthcare planning, management and resource allocation. The emergent COVID-19 health management life cycle has many things in common with technology life cycles, such as the maturity curve, the hype curve, the adoption curve and others (e.g., [35]). Unfortunately, what is not clear are the specific concerns, markers, or events within that pandemic life cycle. Some of those emerging activities within the life cycle appear to include issues such as managing new outbreaks of COVID, integrating health management efforts across multiple countries, freeing up and allocating resources, and other concerns, whose difficulties and solutions likely have not been established completely because the disease, literally, has been emerging, diffusing, and evolving. A health management life cycle model could be useful for identifying problems associated with each stage in the life cycle, as the disease works through its life cycle. The beginning of one such view of a life cycle is provided in Fig. 5 and includes potential life cycle stages on the horizontal axis and potential problems associated with the stages on the vertical axis. As a companion to this approach, it is easy to imagine a COVID-19 version of the hype cycle that traces the COVID-19 technologies (vaccines, antivirals, infusions, etc.) over time and through stages such as the “Peak of Inflated Expectations,” the “Trough of Disillusionment,” and the “Slope of Enlightenment” [36].

Fig. 5

Potential Health Management model of a COVID-19 life cycle and related problems

This life cycle model could be helpful in text analysis by providing insights in a number of directions. For example, this research is concerned with psychological issues of communication in social media, suggesting the importance of a text mining approach centered in psychology and helping us choose the tool, LIWC. Across the life cycle, there are likely different psychological problems that potentially might be identified from analysis of text communications. However, analysts may be concerned with other stages and other problems in the life cycle, requiring a different context than a psychological one. In those settings, analysts may need to generate different bags of words to gather meaning from different contexts about different problems, such as outbreak management or integrating efforts across countries.

Generating Bags of Words in Alternative Contexts for Alternative Problems

A number of approaches could be used to generate bags of words for different contexts in the COVID life cycle. Word2Vec [37] provides two approaches that allow the generation of words that are similar to a seed word: “Continuous Bag of Words” (CBOW) and “Continuous Skip Gram Model” (Skip Gram). Word2Vec identifies words that are similar to a seed word or words in the text from which they are gathered. For example, Mikolov et al. [37, p. 5] note that the approach helps find related words, such as how “… France is to Paris as Germany is to Berlin ….” We generated a set of 39 words drawn from a business corpus, presented in Table 22, using both CBOW and Skip Gram.
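To make the difference between the two Word2Vec variants concrete, the following minimal sketch builds the training pairs each one would see for a toy sentence. It is an illustration only: the function names, the window size, and the example sentence are assumptions, and no neural model is trained here. CBOW predicts a target word from its surrounding context, whereas Skip Gram predicts each context word from the center word.

```python
from typing import List, Tuple


def cbow_pairs(tokens: List[str], window: int = 2) -> List[Tuple[List[str], str]]:
    """Return (context words, target word) pairs, the framing used by CBOW."""
    pairs = []
    for i, target in enumerate(tokens):
        # Context = up to `window` words on each side of the target.
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        if context:
            pairs.append((context, target))
    return pairs


def skipgram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Return (center word, context word) pairs, the framing used by Skip Gram."""
    pairs = []
    for i, center in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


# Hypothetical toy sentence, not taken from the business corpus in the paper.
tokens = "lockdown deepens economic downturn".split()
print(cbow_pairs(tokens, window=1))
print(skipgram_pairs(tokens, window=1))
```

In practice, a library implementation would be used instead of hand-built pairs; for instance, gensim's `Word2Vec` class switches between the two variants with its `sg` parameter.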

Table 22

List of words using Word2Vec

BREXIT	Impacts	Slowdown
Contagious	Lingering	Spread
COVID	Lockdown	Subside
Crises	Lockdowns	Subsides
Downturn	Outbreak	Surges
Economy	Outbreaks	Trajectory
Emergencies	Pandemics	Unfolded
Endures	Persists	Unprecedented
Epicenters	Quarantines	Vaccinations
Epidemic	Recedes	Virus
Exploded	Recession	Warming
Fears	Resurgences	Worsened

An analysis of those words reveals the business concerns captured in the corpus, e.g., “downturn,” “slowdown,” “economy,” and “recession.” In addition, the list includes other related risks to business, such as “BREXIT” and “fears.” Further, some of these words, although related to COVID-19, are not uniquely associated with the pandemic, such as “economy” and “abates.” Poria et al. [38] and Araque et al. [39] suggest that text analysis should employ ensemble approaches. Interestingly, there are two different approaches within Word2Vec, so its use inherently provides the perspective of an ensemble of methods. In addition, other approaches, such as GloVe [19], can be included with the two approaches within Word2Vec to broaden the ensemble of methods. Each of these algorithms could be used to generate sets of words with different seed words. However, using multiple methods generates redundancies and words that may not be directly related to the seed word(s) in the sense with which the analysis is concerned. As a result, it is important to include a “human-in-the-loop” in the ensemble approach when generating a bag of words about a concept. Finally, Cambria et al. [40] investigated the application of symbolic and sub-symbolic AI for sentiment analysis. Their approach to capturing meaning integrates top-down and bottom-up computing, employing both sub-symbolic computational approaches and symbolic logic and semantic network approaches. In so doing, they built a new version of SenticNet. Future research can focus on integrating this approach to build better word sets that match the domain-specific needs of particular locations in the COVID-19 life cycle, generating the word sets for the problems as needed.
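The ensemble idea with a human reviewer can be sketched as a simple pooling step followed by a review step. This is a hypothetical illustration, not the procedure used in the paper: the function names, the assignment of words to methods, and the reviewer's rejections are all assumptions made for the example.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set, Tuple


def pool_candidates(by_method: Dict[str, Iterable[str]]) -> List[Tuple[str, int]]:
    """Merge candidate words from several methods and count how many methods proposed each."""
    votes: Counter = Counter()
    for words in by_method.values():
        # Lowercase and de-duplicate so each method votes at most once per word.
        votes.update({w.lower() for w in words})
    # Most widely supported words first, ties broken alphabetically.
    return sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))


def apply_review(pooled: List[Tuple[str, int]], rejected: Set[str]) -> List[str]:
    """Keep only the words a human reviewer did not reject for this context."""
    return [word for word, _ in pooled if word not in rejected]


# Illustrative inputs: which method produced which word is invented here.
candidates = {
    "cbow": ["Downturn", "Recession", "BREXIT"],
    "skipgram": ["recession", "slowdown", "warming"],
    "glove": ["recession", "downturn", "fears"],
}
pooled = pool_candidates(candidates)
# Hypothetical reviewer decision: these words are off-topic for the concept at hand.
bag = apply_review(pooled, rejected={"warming", "brexit"})
print(pooled)
print(bag)
```

Counting how many methods propose a word removes the redundancy that multiple methods create, and the explicit rejection set keeps the final judgment about relevance with the human analyst.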

Ambivalence

Recently, researchers have begun to explore additional approaches to measuring neutral sentiment. Although there are approaches to capturing neutral sentiment, as discussed in the “Impact of COVID-19 Vocabulary Use of Twitter Reputation and Retweets” section, and in the Python Natural Language Toolkit (NLTK), Wang et al. [43] recently developed a more fine-grained approach to measuring ambivalence. This is important because, although much COVID-related activity is emotionally charged, resulting in demonstrations world-wide, some issues apparently garner ambivalence.11 For example, Peng and Chen [42] investigate emotional ambivalence and luxury good consumption during the COVID pandemic. However, as noted by Craig et al. [41], capturing ambivalence can relate to the specific issues being considered and the way in which questions are worded, further emphasizing the importance of specific words.
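To illustrate how ambivalence differs from neutrality, the sketch below computes a simple ambivalence score from separate positive and negative sentiment scores. It is not the method of Wang et al. [43], which is not described here. Instead, it assumes the similarity-intensity index from the attitude literature (often called the Griffin index), and the function name and input scale are hypothetical.

```python
def griffin_ambivalence(pos: float, neg: float) -> float:
    """Similarity-intensity index: high when positive and negative scores are both strong and similar."""
    intensity = (pos + neg) / 2      # average strength of the two reactions
    dissimilarity = abs(pos - neg)   # how one-sided the reaction is
    return intensity - dissimilarity


# Hypothetical scores on an arbitrary 0-10 scale.
print(griffin_ambivalence(4, 4))  # equally positive and negative: ambivalent
print(griffin_ambivalence(8, 0))  # purely positive: not ambivalent
print(griffin_ambivalence(0, 0))  # no sentiment at all: neutral, not ambivalent
```

The point of the example is that a text with no sentiment and a text with strong mixed sentiment receive different scores, which a single positive-minus-negative scale cannot distinguish.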

Summary, Contributions, and Conclusion

As of 6th May 2020, there were almost 4 million known, confirmed cases of COVID-19 worldwide. By mid-July, that number had more than tripled. It reached 30 million cases by September, with close to one million deaths.12 By December 2021, over 250 million cases and 5 million deaths had been documented. By May 2022, those numbers had grown to over 530 million cases and over 6 million deaths. Many global efforts are still being taken to combat the virus. As ordinary people seek to understand the virus, and learn how to protect themselves, they frequently turn to online platforms, such as Twitter, which is often regarded as a good resource from which to analyze opinions from user-generated content [44].

Contributions—Human Communication

This paper makes several contributions to knowledge about human communication using social media, couched in the use of Twitter within the COVID-19 pandemic, which lead to interesting questions for future research. First, we found that the text sentiment and emotions of Twitter messages with and without video and pictures are statistically significantly different. Although it is not clear why, the findings suggest important differences. Is it a general characteristic of human communication that using pictures results in different text sentiment than if pictures are not used? Second, the sentiment and emotions of the retweets and the original tweets in the pandemic were statistically significantly different. Future research should investigate the extent to which this finding can be generalized. Is there something in the sentiment and emotions of a tweet that makes it likely to be retweeted? Is it a general characteristic of human communication to use this particular type or amount of sentiment? Third, Google searches and Wikipedia page views are predictive of the percentage of positive and negative sentiment tweets, suggesting that humans perform internet searches and then communicate the results via social media. Future research should investigate the extent to which this phenomenon occurs in non-pandemic settings and can be considered a general model of human behavior. Fourth, a variable representing a dictionary of words capturing COVID concepts was statistically significant in models of Twitter influence and retweets. It appears that communication of new topics pursued by new users results in retweets, in contrast to tweets from those with large influence scores derived from established pools of followers and topics. This finding can be used to support future research on new user groups or on technology use during a major event (e.g., [45]).

Relationship to Previous Research with LIWC

We have not discovered similar research against which to benchmark the findings in our analyses. However, very recently, other researchers have used LIWC for various types of related research into COVID-19-based concerns. Silva et al. [46], for example, used LIWC and Twitter tweets to investigate issues associated with misinformation. Barnes [47] used LIWC in analysis of “terror management theory.” Safa et al. [48], similarly, analyzed the detection of symptoms of depression in Twitter tweets. Ebeling et al. [49] used LIWC to investigate the impact of political polarization during the COVID-19 pandemic. Mosleh et al. [50] used LIWC to analyze correlations with behavior. These efforts support our use of LIWC, although they are limited in their application to the issues examined in this paper.

Conclusion

The COVID-19 pandemic continues to be a topic of much global interest for both health and economic reasons, as new variants evolve. This research has analyzed text data from Twitter to gain an understanding of human communication based on user-supplied content during the pandemic. Twitter tweets were analyzed manually and using a text analysis tool. The results show changes in user participation over time from information seeking to expressions of anger or other emotions. Users retweeted different content with clout and were most concerned with health. Tweets that include pictures and movies have different text than those that do not. The percentage of positive versus negative sentiments found in COVID-19 tweets could be predicted by Google searches and Wikipedia page views. This research can also be considered as an analysis of human communication where new concepts are discussed using text and images, which provide a firm foundation from which to analyze the implications of events or situations that have wide-spread consequences, such as a pandemic or a natural disaster.

Table 23

Ad hoc Twitter posts

Tweet	Type
Seasonal Approach (winter fuel) & COVID-19. Investigating Ketone Bodies as Immunometabolic Countermeasures against Respiratory Viral Infections https://t.co/AYr2XgGM88	Medical innovation
RT @KellyannePolls: First COVID-19 vaccine tested in US poised for final testing https://t.co/GzeHXgTdtC	Medical innovation
RT @Mariah__Driver: It’s been 82 days since I tested positive for COVID-19, and I’m still experiencing symptoms	Personal report
RT @magaxxoo: Another medical professional who AGREES with the New England Journal of Medicine findings that masks are INEFFECTIVE at protecting	Health (controversy); eventually incorrect
RT @urlocalchlo: if you could sacrifice one genre of music to end COVID-19, what would it be? and why country music?	Creative comments
RT @bopinion: But a lower average death count — say, 500 a day — is still tragic	Compassion [25]
RT @AfrDiasporaNews: The news just dropped about Houston being selected for COVID-19 vaccine trials. They claimed we were selected due to o…	Health
RT @untoldmaga: The World Health Organization has taken a complete U turn concerning COVID-19 see the video….. https://t.co/jHX95…	Health
RT @RepJamesComer: This is simple: China lied, the WHO complied, and Americans died	Creative comments; sentiment disgust [25]
RT @propublica: After months of asserting pregnant women were not at high risk for the coronavirus, the CDC recently released a study with…	Health
RT @sparkledocawake: I am a physician I no longer trust the CDC I no longer trust the FDA I no longer trust the WHO I no longer trust the…	Creative comments; sentiment disgust [25]
RT @bakoff333: CDC acknowledges mixing up coronavirus testing data I’m sure it was an accident.. I would go as far to say it’s criminal…	Disgust [25]
RT @YalePediatrics: Estimates suggest that 25 to 45% of people are asymptomatic #COVID19 carriers. “Our best estimate right now is that for…	Health
@Malcolm_fleX48 Also, look at CDC’s own numbers of COVID deaths for the week of 6–27 to 7–4…. Seventy one TOTAL for US lowest since this whole thing started	Health
RT @flo2changz: 2/ Upon arrival at Taiwan’s airport, I purchased a prepaid SIM card and provided the number to the CDC. They will use this…	Health
RT @CNBC: CDC says U.S. could get coronavirus under control in one to two months if everyone wears a mask https://t.co/mYQfMbg0F2	Health
RT @DrEricDing: Worrying: 16 h of airborne aerosol infectivity for the coronavirus that causes #COVID19 in a study published in a CDC…	Health
@judgealexferrer BTW, @judgealexferrer I also had COVID in Jan, long before Prez stopped calling it a hoax and CDC admitted it was in country. The May case was the same symptoms, just much milder	Personal report
@Mike_Pence @VP You are reckless! I will follow CDC guidelines over you and this administration!! Shame on you	Sentiment: anger [25]
@thePotSta @kylamb8 Considering how often (always) the CDC has been wrong on this, are you sure that’s a mountain you want to die on? #FactsOverFear #NoNewNormal #MuzzleUpAZ #MuzzleUpWA	Sentiment: disgust
@MrRealism @YoramBlue There’s some evidence now that the severity of a COVID 19 infection is dependent on blood type. Most blood types can survive it. Hang in there and don’t give up. Keep us posted on how you’re doing	Health, sentiment: hope [25]
Type O blood types were known to be resistant to COVID months ago. NOW, they want to pursue that link after everyone is wearing masks that compliment their outfits and selling on beautiful new displays all over the country. Horse crap!!’	Creative comment; sentiment: disgust [25]
RT @JJDJ1187: Your blood type could play a role in how sick you get from COVID-19, how this could be a game-changer in fight to stop it	Health; sentiment: hope [25]
@HermioneIsHere None of that tells me that knowing my blood type helps me. They all got COVID. Knowing their blood type or not didn’t help them	Creative comment; sentiment: contempt [25]
The MEDICAL PEOPLE are making COVID-19 all the more worrisome with AGE and BLOOD TYPE declaring those of AGES OVER 55 and BLOOD TYPES other than “O” will have little HOPE. DO YOU KNOW WHAT YOU ARE DOING	Sentiment: disgust [25]
I wonder if the blood type and COVID link is holding any truth. My sister and 2 nephews got it. And it only lasted less than a week and they were pretty ok. And my other nephew who lives in the same house NEVER GOT IT. Yet I know people who have died of it. Scary stuff	Sentiment: tension/stress [25]
@d_mos77 Wtf COVID is so selective! Less likely to die if you’re an atheist, blood type 0 and born on a monday	Sarcastic comments
RT @albrtenrqz: My mother is currently battling against COVID-19. We are in need of blood type O + and donor must have recovered from COVI…	Personal report; sentiment: desperation [25]
RT @NYTScience: New studies show that people with Type A blood are not at greater risk of getting sick with COVID-19, as previous studies h…	Health; contradiction
The amount of convalescent plasma orders I’ve done for COVID patients today alone is crazy. And we don’t even have the inventory to keep up. Especially blood type B and AB. Initial data available from studies using COVID-19 convalescent plasma for the treatment of individuals with severe or life-threatening disease indicate that a single dose of 200 mL showed benefit for some patients, leading to improvement	Medical innovation
@drdavidsamadi Meanwhile in Canada, the “authorities” are following the World Health Organization’s “recommendation” with no regards to science and ethic, whatsoever	Sentiment: disgust [25]

Table 24

Additional Twitter posts

Tweet	Type
Type O blood types were known to be resistant to COVID months ago. NOW, they want to pursue that link after everyone is wearing masks that compliment their outfits and selling on beautiful new displays all over the country…	Health
RT @1000Frolly: COVID-19 NEWS; Sweden is now approaching Herd Immunity, as it’s death rate nears zero. There should be no “second wave” for…	Health
RT @SilvertsClothes: COVID-19 has changed the way we live, work, and interact with one another	Factual
RT @aceprtglz: I miss life without COVID	Sentiment
RT @vickitle: my COVID positive pt received plasma from a donor that had antibodies and her O2 sats went up almost immediately after …	Health
My cousin died at 6:30 this morning from COVID-19. Please wear a mask	Sentiment: sad
RT @HollandJeffreyR: There are many ways that we can learn to be more thoughtful, grateful, and spiritual..	Sentiment: gratitude
RT @JustinTrudeau: ATTENTION CANADIANS: a new mobile app that will help limit the spread of COVID-19 is now available! The COVID Alert App…	Factual
RT @celtics: Don’t forget your masks! Wearing a mask is one of the most effective ways to slow the spread of COVID-19. Do your part …	Health
RT @maddieevelasco: Why you shouldn’t eat at restaurants during COVID-19: A thread by a host	Factual
RT @Suntimes: Don’t expect to get rid of the masks anytime soon: Illinois is unlikely to return to normalcy until some time in 2021	Prediction
im still alive just people. 2 of my coworkers tested positive for COVID so i have been working more and trying to get things done in RL.i am not sick	Factual
RT @toddeherman: I keep posting the literal, mathematically derived, inarguable facts about the COVID Flu	Factual
so far everything “good” that has happened to me this year has gotten ruined due to COVID. every single aspect of my wedding, my bachelorette, now my honeymoon. I have never felt more defeated	Sentiment: defeat
RT @eahcalanait: My sister and I lost our only family earlier this week to COVID-19. We’ve started a go fund raiser to help us financially…	Sentiment: sad

10 in total

1. The science of fake news.

Authors: David M J Lazer; Matthew A Baum; Yochai Benkler; Adam J Berinsky; Kelly M Greenhill; Filippo Menczer; Miriam J Metzger; Brendan Nyhan; Gordon Pennycook; David Rothschild; Michael Schudson; Steven A Sloman; Cass R Sunstein; Emily A Thorson; Duncan J Watts; Jonathan L Zittrain
Journal: Science Date: 2018-03-08 Impact factor: 47.728

2. Automatic detection of depression symptoms in twitter using multimodal analysis.

Authors: Ramin Safa; Peyman Bayat; Leila Moghtader
Journal: J Supercomput Date: 2021-09-09 Impact factor: 2.557

3. Emergency Medicine Influencers' Twitter Use During the COVID-19 Pandemic: A Mixed-methods Analysis.

Authors: Maren K Leibowitz; Michael R Scudder; Meghan McCabe; Jennifer L Chan; Matthew R Klein; N Seth Trueger; Danielle M McCarthy
Journal: West J Emerg Med Date: 2021-03-22

4. Cognitive reflection correlates with behavior on Twitter.

Authors: Mohsen Mosleh; Gordon Pennycook; Antonio A Arechar; David G Rand
Journal: Nat Commun Date: 2021-02-10 Impact factor: 14.919

5. Using Tweets to Understand How COVID-19-Related Health Beliefs Are Affected in the Age of Social Media: Twitter Data Analysis Study.

Authors: Hanyin Wang; Yikuan Li; Meghan Hutch; Andrew Naidech; Yuan Luo
Journal: J Med Internet Res Date: 2021-02-22 Impact factor: 7.076

6. Artificial intelligence-enabled analysis of UK and US public attitudes on Facebook and Twitter towards COVID-19 vaccinations.

Authors: Amir Hussain; Ahsen Tahir; Zain Hussain; Zakariya Sheikh; Mandar Gogate; Kia Dashtipour; Azhar Ali; Aziz Sheikh
Journal: J Med Internet Res Date: 2021-01-31 Impact factor: 5.428

7. An analysis of COVID-19 vaccine sentiments and opinions on Twitter.

Authors: Samira Yousefinaghani; Rozita Dara; Samira Mubareka; Andrew Papadopoulos; Shayan Sharif
Journal: Int J Infect Dis Date: 2021-05-27 Impact factor: 3.623

8. Understanding terror states of online users in the context of COVID-19: An application of Terror Management Theory.

Authors: Stuart J Barnes
Journal: Comput Human Behav Date: 2021-07-24

9. Mobile phone data for informing public health actions across the COVID-19 pandemic life cycle.

Authors: Nuria Oliver; Bruno Lepri; Harald Sterly; Renaud Lambiotte; Sébastien Deletaille; Marco De Nadai; Emmanuel Letouzé; Albert Ali Salah; Richard Benjamins; Ciro Cattuto; Vittoria Colizza; Nicolas de Cordes; Samuel P Fraiberger; Till Koebe; Sune Lehmann; Juan Murillo; Alex Pentland; Phuong N Pham; Frédéric Pivetta; Jari Saramäki; Samuel V Scarpino; Michele Tizzoni; Stefaan Verhulst; Patrick Vinck
Journal: Sci Adv Date: 2020-06-05 Impact factor: 14.136
