
Microblog data analysis of emotional reactions to COVID-19 in China.

Yuchang Jin¹, Aoxue Yan¹, Tengwei Sun¹, Peixuan Zheng², Junxiu An³.

Abstract

To explore the emotional attitudes of microblog users across the different COVID-19 stages in China, this study used data mining and machine-learning methods to crawl 112,537 COVID-19-related Sina microblogs and conduct sentiment and group difference analyses. It was found that: (1) the microblog users' emotions shifted from negative to positive from the second COVID-19 pandemic phase; (2) there were no significant differences in the microblog users' emotions in the different regions; (3) males were more optimistic than females in the early stages of the pandemic; however, females were more optimistic than males in the last three stages; and (4) females posted more microblogs and expressed more sadness and fear while males expressed more anger and disgust. This research captured online information in real time, with the results providing a reference for future research into public opinion and emotional reactions to crises.


Keywords: Basic emotions; COVID-19; Data mining; Sentiment analysis technology; Sina microblogs


Year: 2022 PMID: 35998408 PMCID: PMC9245366 DOI: 10.1016/j.jpsychores.2022.110976

Source DB: PubMed Journal: J Psychosom Res ISSN: 0022-3999 Impact factor: 4.620

Introduction

The COVID-19 outbreak in China was a public health emergency, with its high infectiousness and uncertainty threatening people's physical and mental health. As the most important social platform for news and developments in China, the Sina microblog released and updated the latest information about the pandemic in real time. Because people paid a great deal of attention to pandemic information, they were also exposed to a large amount of negative information, which resulted in negative emotions and some mental health problems [21]. Previous studies found that female emotional reactions to crises were more profound and lasting as their sensitivity to stress was greater and they were more vulnerable to negative emotions [29]. Recent studies have also found that the anxiety levels of female college students during COVID-19 were significantly higher than those of male college students [10]. However, as most previous studies have focused on the emotional changes in specific groups, there have been few macro-perspective studies on the emotional changes in internet groups during the pandemic. Therefore, to provide scientific and effective support for public mental health in the post-pandemic prevention and control period, this study examined the changes in group emotions on internet platforms by tracing the psychological changes during COVID-19.

Sharing daily emotions and states on social media has become part of everyday life for microblog users. As one of the most representative online social media in China, the Sina microblog had 523 million active users in 2020. As the massive online information generated by the microblog can reflect the psychology of online audiences, big data research methods can be used for online text analyses. Pan et al. [41] crawled pandemic news on the People's Daily using data mining technology and divided the event into three stages to study the microblog users' emotional tendencies. Zhang et al. [61,62] used a big data research method to study spatial dimensions and found obvious differences between administrative regions: the more serious the pandemic became, the higher the participation and the more negative the microblog users' emotional states. Some scholars classified the COVID-19-related Sina microblogs into seven main themes using an LDA topic model and a random forest algorithm and proposed strategies for early public opinion [22]. However, the specific reasons for the spatiotemporal distributions and dynamic changes of COVID-19 microblogs have not been further analyzed from a psychological perspective. Therefore, using a combination of data mining and sentiment analysis technology, this paper analyzed the COVID-19 microblog public opinions in the five stages suggested by the Chinese State Council [26] and offers some psychological explanations.

COVID-19 and social media

Social media use has developed rapidly in recent years, with increasingly more public health departments and individuals using social media platforms to exchange and share information during public health emergencies. During crises, social media has become an important channel for risk communication [20,25]. The public constantly posted information about the pandemic on social media platforms, such as Twitter, microblogs, Facebook, and post bar, to express their attitudes and feelings on matters such as medical care and public policies. This social media content, therefore, can provide important information about public emotions and concerns. Social media research to assess public emotions and reactions has been conducted on infectious diseases, such as H7N9 [8,19,54], Ebola [25,45], the Zika virus [16,44], the Middle East respiratory syndrome associated coronavirus [17], and dengue fever [65]. The Sina microblog, which is similar to Twitter, is one of the most popular social media platforms in China. As of the fourth quarter of 2018, the number of monthly active users was 462 million, with about 200 million people using Sina microblogs every day [66]. This study analyzed the trends in public emotional change related to COVID-19 and the most popular keywords from December 2019 to May 2020 and developed visual cluster maps of the most popular topics. Accessing timely public responses provides a deeper and more comprehensive understanding of people's emotional attitudes and concerns, which can help governments and health departments take appropriate measures to prevent and control crises such as the pandemic [13,17].

Social emotion formation processes

Rivera [42] used three concepts to study social emotions: emotional atmosphere, which was the common emotional response from the whole society when faced with a certain problem, such as despair after the failure of war; emotional climate, which referred to the stable emotional characteristics shared by social members, such as fear in totalitarian countries; and emotional culture, which referred to the emotional experience or expression rules and standards developed in the society over a long time and the influences of social culture and values on emotion, such as the negative emotional expressions found in some collectivist cultures. Later research on social emotions took a more comprehensive view. For example, Bar-Tal [4] focused on collective emotional orientations, that is, the tendency of a whole society to express similar emotions because of the comprehensive effects of many factors. Barsade [3] proposed an emotional infection group model, claiming that when a person joins a group, they first pay attention to and take on the emotional states of the other group members, with the valence, emotional arousal levels, and individual differences being the main influencing factors. Studies have found that compared with positive emotions, negative emotions are more likely to cause emotional infection. For example, Rui et al. [43] studied the propagation and diffusion speed of happiness, sadness, anger, and disgust on microblogs, and found that anger affected others more than happiness and was more widely spread. While anger and laziness are both negative emotions, anger has been found more likely to affect others than laziness. Individual differences, such as gender, age, emotional susceptibility, and living environments can also influence social emotions and impact the emotional infection [43]. Based on this discussion, the following hypotheses are proposed.

Hypothesis 1: Over time, the internet users' moods change significantly in reaction to the COVID-19 pandemic evolution.

Hypothesis 2: The male and female microblog users express different emotions when discussing the COVID-19 pandemic.

Hypothesis 3: The microblog users from different areas have different emotions about the pandemic, with users in areas in which the COVID-19 pandemic is serious being more negative.

The phased dynamic analysis of microblog data allows for accurate sentiment and keyword analyses, which can give authorities information about public opinion development trends, allowing them to pacify these emotions and implement situation controls. These data also provide valuable information for future research on public emergency emotional and psychological development processes.

Research methods

The Python programming language, data mining, and text sentiment analysis technology were applied to analyze user emotion characteristics and trends. The data mining involved microblog data acquisition, data cleaning, data denoising, and data storage; the text analysis involved keyword visualization; and the sentiment analysis used the machine learning models of the Baidu AI API to analyze sentiment polarity and classify the emotions. Ekman's “big six” basic emotions theory was used to divide the emotions into six main types: anger, disgust, fear, happiness, sadness, and surprise [11]. This study was approved by the Human Ethics Committee of Sichuan Normal University. The specific research steps were as follows and are shown in Fig. 1:

Fig. 1

Research flow chart.

Data acquisition

Data were obtained using web crawler technology written in Python. First, Python's requests HTTP library was installed, after which the developer tools in the Google Chrome browser were used to analyze the URLs and retrieve the interface addresses of the pages to crawl. Then, the requests.get() function was used to collect the data. Based on the five stages outlined by the Chinese State Council [26], “novel coronavirus” was used as the keyword to search and collect 112,537 related microblogs from December 27, 2019, to May 30, 2020, with gender, age, number of followers, VIP status, region, and other information collected for each post.

Data cleaning and storage

Python's pandas library was used to efficiently and accurately process the microblog text data. The data cleaning step was employed to sort out news texts that had been forwarded from other people's microblogs, identify official platform and commercial marketing accounts, and remove stop words that were irrelevant to the emotional text analysis. After cleaning to remove the noise, the 112,537 microblogs were stored in CSV format. Examples of the microblog contents and related information are given in Table 1.
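The cleaning steps described above can be sketched with the standard library alone. This is an illustrative outline, not the authors' pandas pipeline: the stop-word list, the repost pattern, and the field names are all hypothetical, and whitespace tokenization stands in for Chinese word segmentation.

```python
import csv
import io
import re

# Hypothetical stop-word list; the paper does not publish the one it used.
STOP_WORDS = {"the", "a", "of", "and"}
# Hypothetical marker for forwarded posts.
REPOST_PREFIX = re.compile(r"^\s*(repost|forward)\b", re.IGNORECASE)


def clean_posts(rows):
    """Drop empty, forwarded, and duplicate posts, then strip stop words."""
    seen = set()
    cleaned = []
    for row in rows:
        text = row["content"].strip()
        if not text or REPOST_PREFIX.match(text):
            continue  # empty text or forwarded post
        if text in seen:
            continue  # exact duplicate of an earlier post
        seen.add(text)
        tokens = [t for t in text.split() if t.lower() not in STOP_WORDS]
        cleaned.append({**row, "content": " ".join(tokens)})
    return cleaned


def to_csv(rows, fieldnames):
    """Serialize the cleaned rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Keeping the filter and the serialization in separate functions makes it easy to swap the whitespace tokenizer for a real segmenter without touching the CSV output.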

Table 1

The format of microblog data related to “novel coronavirus”.

Posters' nickname	Sex	Region	Microblog content	Posting time
Niushen	Male	Beijing	I'm at such a risk and anxious when novel coronavirus...	20/1/30 12:10
Gengua	Female	Mianyang	Medical incidents are rampant....	20/1/30 12:11
Boshu biology	Male	Jiangsu	#Novel coronavirus diagnosis#1342 cases were confirmed...	20/1/30 12:11
...	...	...	...	...


Word cloud visualization

A word cloud [26], also known as a text cloud, is a text data visualization method that filters large amounts of text information and visually highlights the frequent “keywords” to identify the main text themes. The WordCloud library in Python was used, with the word cloud frame in this research set as the map of China. Before the word clouds were drawn, “Jieba” word segmentation was applied and all stop words were removed to reveal the pure word text information and keywords.

Sentiment analysis

Sentiment analysis uses dictionary matching or machine learning technology to identify and analyze the emotional information posters are expressing in their subjective texts [12]. Sentiment analysis has become one of the most active natural language processing (NLP) research fields in recent years. The Python natural language processing library “snownlp” provides its own functions for analyzing such information [31]. However, as snownlp's built-in positive and negative training texts were taken from e-commerce platform evaluations, its accuracy is not guaranteed when it is applied to microblog text content. Therefore, the sentiment analysis API provided by the Baidu AI open platform was employed in this paper to analyze the emotional polarity, with machine learning then applied to establish an Ernie model and analyze the six specific emotional categories.

Sentiment polarity analysis

The sentiment analysis API on the Baidu AI open platform was first invoked using the API key and access token, after which a sentiment analysis was conducted line by line on the CSV file. The tool uses a Bi-LSTM model to judge the emotional tendencies based on semantics. The results were divided into three types: positive, neutral, and negative. The positive and negative emotion intensities ranged over [0,1], with larger values indicating stronger emotional intensity, and the emotional flag value was 0 for negative, 1 for neutral, and 2 for positive.

Multiclass sentiment analysis based on machine learning

Machine learning sentiment analysis learns text features from a small number of labeled samples, which can save time and money and greatly improve the classification accuracy. This study used the TF-IDF feature extraction technique to train three machine learning models (Bert, Albert, and Ernie) and then used the model with the highest accuracy, Ernie, to conduct the COVID-19 information sentiment classifications. The multi-classification emotion model was developed as follows:

Prepare training and test sets

The data set used in this paper was from the 2013 International Conference on Natural Language Processing and Chinese Computing (NLP&CC), which has 14,307 emotion-labeled texts covering anger, sadness, happiness, fear, disgust, and surprise. Forty percent of the total data set was selected as the training data set to construct the model, with the remaining 60% being used as the test data set to test the generalization ability and accuracy of the model.
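A 40/60 split of this kind can be sketched as follows. The shuffling, the fixed seed, and the function name are assumptions made for illustration; the paper does not say how the training portion was drawn.

```python
import random


def split_dataset(samples, train_fraction=0.4, seed=42):
    """Shuffle a copy of the samples and cut it into train and test parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Toy labeled texts standing in for the 14,307 conference samples.
samples = [(f"text {i}", "happiness") for i in range(10)]
train, test = split_dataset(samples)
```

With 14,307 samples and the same truncation rule, the training part would hold 5,722 texts and the test part 8,585.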

Constructing the word vectors

In English, a text word vector expresses meaning at the level of words; however, in Chinese, the meaning is best expressed in phrases. Therefore, for the Chinese language sentiment analysis, the data set needed to be segmented [51], for which word embedding coding methods were used to construct the word vectors. As word embedding is able to measure the similarities between words [60], it has been widely used. As part of the implementation, Python's gensim library was imported and its Word2Vec model was used to train the Chinese word vectors.

TF-IDF feature extraction

As the Bert model is unable to learn from original text, the TF-IDF feature extraction was applied to train the classifier model. TF-IDF is a statistical measurement algorithm [27] that determines the importance of a word in the document set.
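To make the weighting concrete, here is a small sketch of one common TF-IDF variant, with term frequency as count divided by document length and inverse document frequency as the natural log of N over document frequency. The paper does not state which variant or library it used, so the formula choice and the toy documents are assumptions.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {token: weight} dict per tokenized document."""
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each token once per document
    weights = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            tok: (cnt / total) * math.log(n_docs / doc_freq[tok])
            for tok, cnt in counts.items()
        })
    return weights


# Toy pre-segmented posts (hypothetical).
docs = [["virus", "fear", "virus"], ["virus", "mask"], ["virus", "hope", "mask"]]
scores = tf_idf(docs)
```

A token that appears in every document, such as "virus" here, receives a weight of zero, which is how the measure discounts words that do not distinguish one post from another.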

Ernie modeling

Three models (Bert, Albert, and Ernie) were trained for this research. Bert is a pre-training language model proposed in 2018 [23] that can train language representations on task-independent data sets, and then apply the learned representations to task-related language. The Albert model reduces space complexity using weight sharing and matrix decomposition and can obtain a much smaller model than the Bert model using several optimization strategies [30]. Ernie is an enhanced model [36] based on the Bert model and knowledge masking strategies that was optimized by Baidu in 2019 and can effectively overcome the disadvantages caused by directly breaking up the relationships between words in the hidden codes in the Bert model, thereby enabling the model to learn the representation method for complete concepts. The main parameters for the Ernie model are shown in Table 2:

Table 2

Parameters of Ernie model.

Parameters	Values
initializer_range	0.02
Learning_rate	0.00002
vocab_size	18,000
Epoch	20
hidden_dropout_prob	0.1
attention_probs_dropout_prob	0.1


Model results

This study trained three models based on Bert, Albert, and Ernie, and used each model to test the emotion classification performance on the test set data. The overall performance of each model was assessed by the precision (Pr), the recall (Re), and the weighted F1 score. As shown in Table 3, the precision (Pr) and the recall (Re) of the Ernie model trained for this event were higher than those of the Bert and Albert models. Therefore, the Ernie model was selected as the emotion analysis algorithm for the microblogs on the “COVID-19” event.

Table 3

Training results of various models.

Method	Pr (%)	Re (%)	F1 (%)
Ernie	83.9	83.9	83.9
Albert	80.9	79.16	68.72
Bert	83.5	83.5	83.5

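For readers unfamiliar with the three measures in Table 3, the sketch below computes support-weighted precision, recall, and F1 from true and predicted labels. It is a generic illustration on invented labels; the paper does not describe its evaluation code, and the function name is hypothetical.

```python
from collections import Counter


def weighted_scores(y_true, y_pred):
    """Support-weighted precision, recall, and F1 over all true classes."""
    support = Counter(y_true)
    n = len(y_true)
    precision = recall = f1 = 0.0
    for label in sorted(support):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        predicted = sum(1 for p in y_pred if p == label)
        pr = tp / predicted if predicted else 0.0
        rc = tp / support[label]
        f = 2 * pr * rc / (pr + rc) if pr + rc else 0.0
        weight = support[label] / n  # share of samples in this class
        precision += weight * pr
        recall += weight * rc
        f1 += weight * f
    return precision, recall, f1


# Invented labels for illustration only.
y_true = ["anger", "anger", "fear", "fear"]
y_pred = ["anger", "fear", "fear", "fear"]
pr, rc, f1 = weighted_scores(y_true, y_pred)
```

One property worth knowing when reading such tables: support-weighted recall always equals overall accuracy, because the class weights cancel the per-class denominators.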

Results

Difference analysis of the five pandemic stages

Fink's four-stage theory [1] divides the life cycle of a public opinion crisis into four stages: an incubation period, an outbreak period, a stable period, and an extinction period. The State Council's account of China's action against COVID-19, published in June 2020, divided the COVID-19 process into five stages: quickly dealing with the outbreak; curbing the spread of the pandemic; decreasing the number of local cases to single digits; achieving decisive results in the defense of Wuhan and Hubei; and normalizing the national pandemic prevention and control. By analyzing the microblog user keyword clouds in these different stages over time, the differences in the microblog users' public emotions and cognition across the stages were dynamically analyzed.

Stage 1: Rapidly respond to outbreaks (December 27, 2019, to January 19, 2020)

In this stage, there were only a small number of microblogs related to the novel coronavirus. On December 31, the Wuhan Municipal Health Commission of Hubei Province issued the first official notice admitting that Wuhan city had an unexplained viral pneumonia, at which time the number of related microblogs increased significantly. As shown in Fig. 2(a), the analysis of the microblog keyword cloud map found that the main microblog keywords were pneumonia, SARS, avian influenza, Wuhan, illness, controllable, seafood city, viral, influenza, and swine fever.

Fig. 2

Cloud maps of keywords in each stage of epidemic situation.

Stage 2: Curb the spread of the pandemic (from January 20, 2020, to February 20, 2020)

In the second stage, the number of newly confirmed cases increased rapidly as prevention and control measures were implemented. To further stop the spread of the virus, China closed the outbound routes from Wuhan. With the confirmation of human-to-human transmission, the worry and panic in the microblogs increased gradually, with most people being worried about the adverse impact on virus transmission of the flow of people during the Spring Festival; consequently, there were fewer positive emotions expressed by the microblog users. The word cloud map shown in Fig. 2(b) reveals that the main keywords in this stage were Wuhan, influenza, pneumonia, coronavirus, mask, viral, infection, new, and Spring Festival.

Stage 3: Number of new local cases gradually drops to single digits (from February 21, 2020, to March 17, 2020)

By this stage, the rapid rise in the pandemic situation in Hubei Province and Wuhan city had been curbed and the pandemic situation in the whole country was stable except for Hubei Province. The daily new cases were kept within single digits in mid-March, indicating that the pandemic prevention and control measures had achieved important results. Therefore, the Chinese government made the major decision to resume work.

Stage 4: Pandemic situation in Wuhan and Hubei stabilizes (from March 18, 2020, to April 28, 2020)

By the fourth stage, the COVID-19 outbreak had been controlled in Wuhan, the outbound routes from Wuhan had reopened, and the infected COVID-19 patients in Wuhan had all left hospital. While the pandemic situation abroad was spreading rapidly, the pandemic spread in China was sporadic, with imported cases being the main transmission cause. The Chinese government then focused on a prevention and control strategy related to the “external prevention of imported cases and the internal prevention of a rebound”, which consolidated the effectiveness of the domestic pandemic prevention efforts.

Stage 5: National pandemic prevention and control was normalized (since April 29, 2020)

In this stage, there were only sporadic pandemic cases in China, the imported cases were basically under control, the pandemic situation was gradually developing in a good direction, and the Chinese government was implementing regular epidemic prevention and control measures. Based on the pandemic stage classifications in the pandemic related studies, the last three stages were stable pandemic periods [8,41,61,62]. The overall keyword cloud is shown in Fig. 2(c), with the main keywords being return to work, diagnosis, United States, Trump, mask, and prevention and control.
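The stage boundaries given in the five headings above can be turned into a small lookup that assigns each post to a stage by its date. This is a sketch: the function name is hypothetical, and treating May 30, 2020 (the end of data collection) as the end of stage 5 is an assumption for this data set only.

```python
from bisect import bisect_right
from datetime import date

# First day of each stage, taken from the stage headings.
STAGE_STARTS = [
    date(2019, 12, 27),  # Stage 1
    date(2020, 1, 20),   # Stage 2
    date(2020, 2, 21),   # Stage 3
    date(2020, 3, 18),   # Stage 4
    date(2020, 4, 29),   # Stage 5
]
COLLECTION_END = date(2020, 5, 30)  # assumed cut-off: last day of collection


def stage_of(day):
    """Return the stage number (1 to 5) for a date, or None if out of range."""
    if day < STAGE_STARTS[0] or day > COLLECTION_END:
        return None
    return bisect_right(STAGE_STARTS, day)
```

For example, the sample posts in Table 1, dated January 30, 2020, fall in stage 2.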

Dynamic change analysis of the microblog user emotions

Python's Matplotlib library was employed to draw a line chart tracing the dynamic changes in microblog user emotions from December 27, 2019, to May 30, 2020, with time on the x-axis and the emotion proportions on the y-axis. The proportions of the microblog users' bipolar emotions are shown in Fig. 3 and the proportions of the six emotions are shown in Fig. 4.

Fig. 3

Proportion of bipolar emotions: time evolution.

Fig. 4

Proportion of multiple emotions: time evolution diagram.

In the first stage of the pandemic outbreak (before January 19, 2020), there was a significant decrease in positive emotions and an increase in the intensity of negative emotions, with the sadness, surprise, and fear proportions rising significantly. In the second stage (January 20, 2020, to February 20, 2020), the microblog users' moods continued to improve, with an evident rise in positive moods, which may have been related to the Spring Festival (January 24 was New Year's Eve). The fluctuations in the emotions at this stage were related to official statements and measures. For example, the outbound routes from Wuhan were temporarily closed from 10:00 on January 23, and from January 24, 346 national medical teams, 42,600 medical staff, 965 public health personnel, and the army were mobilized from all over China to help Wuhan control the situation. On February 8, the Chinese Health Commission announced that “COVID-19” was the official abbreviation of the pandemic and gave information about the infection situations in various regions of the country. People paying close attention to the pandemic gained a sense of security after learning of the national measures and expressed hope of overcoming the pandemic. These positive emotions then spread widely in social groups, reaching a peak on February 8. This stage could be seen to be consistent with group social emotional infection theory [3], which claims that the emotional congruence of group members is the result of emotional infection, that is, one's emotions can affect others they interact with. Some previous studies [28] have used the online text sentiment analysis program OpinionFinder to analyze the subjective well-being being expressed on Twitter, and found that homogeneous attraction was an important factor affecting people's social connections on the internet. Therefore, the theory of emotional infection provides a more reasonable explanation for the development of relatively consistent emotional experiences within groups.

Since the pandemic situation was controlled, in the third stage (from February 21, 2020), there were more positive microblog user emotions expressed than negative emotions. However, the disgust proportion increased by 7.6 percentage points compared to the second stage. Comparing the first two stages with the last three stages, it was found that the keywords “overseas”, “the United States”, “global pandemic situation”, and “Trump” appeared more often in the last three stages. Therefore, it was inferred that the increase in the Chinese microblog users' negative emotions in the last three stages was due to the continuous growth in imported cases from overseas as well as disgust with the comments and behavior of the U.S. government. Specifically:

(1) The mood changes in the first stage were turbulent, with the proportion of positive posts being the lowest, which bottomed out on January 19. Combined with the news reports, this low coincided with experts confirming human-to-human transmission of COVID-19 on January 19; as the Chinese Spring Festival holiday period began shortly afterwards, the positive mood peaked around January 25. In the fourth stage, the pandemic was rapidly spreading overseas and becoming increasingly serious, which resulted in a decrease in the happiness intensity.

(2) Surprise and sadness increased during the outbreak period, mainly because the public were surprised and saddened by the new virus and the unexpected outbreak; however, these emotions then began to decline.

(3) The fear intensity increased unsteadily in the initial outbreak stage, and after the experts announced on January 19 that the virus could be easily transmitted between people, the fear intensity reached a peak. As the pandemic situation was controlled in the second stage, the fear intensity began to decrease.

(4) The feelings of disgust were unstable in the early stages, but in the second stage, they gradually decreased to a low level; however, in the third stage, the feelings of disgust first grew steadily and then became stable. The keywords at this time were “United States”, “Trump”, and “overseas”, which indicated that in the later stages of the pandemic, the disgust felt by the Chinese microblog users was focused mainly on foreign countries.

(5) In the early stages, the feelings of anger changed rapidly every day and reached a peak on January 25, which was possibly because Wuhan had been closed on January 23. As the pandemic situation came under control after January 25, the anger intensity gradually decreased and stabilized at a low level.

However, it is worth noting that the time evolution in other countries may differ to some extent from that observed for China in this paper.

Difference of microblog user gender and expression tendencies

The gender differences in posting about the COVID-19 event were analyzed across the five stages. As shown in Table 4, there were differences in the number of posts between the males and the females (52,383 males accounting for 46.54%, 60,154 females accounting for 53.46%), with females posting more microblogs than males in all stages.

Table 4

Analysis of gender * expression tendency (number of bloggers).

Stage	Male	Female	Number
Stage1	5243	6300	11,543
Stage2	14,439	15,283	29,722
Stage3	8974	9304	18,278
Stage4	14,416	18,082	32,498
Stage5	9311	11,186	20,496


Difference of microblog user gender and emotional tendencies

The differences in the number of microblogs expressing different emotional tendencies were analyzed, from which it was found that there were 41,940 negative emotion microblogs (37.3%), mainly fear and disgust, and 70,597 positive emotion microblogs (62.7%), mainly happiness. The numbers and proportions of the microblogs expressing each emotion are shown in Table 5. Except for the first stage, most microblog users tended to have positive views of the event.

Table 5

Different emotions of microblogs.

Stage	Anger	Sad	Happy	Disgust	Surprise	Fear	Positive value	Negative value	FLAG value
Stage 1 (N = 11,543)	1222	812	4338	863	2343	1965	0.4769	0.5502	0.912
	10.60%	7%	37.60%	7.50%	21%	17%	N = 5263	N = 6280	
Stage 2 (N = 29,722)	3763	1841	16,128	1127	3110	3754	0.6548	0.3747	1.304
	12.70%	6.20%	54.20%	3.80%	10.50%	12.60%	N = 19,375	N = 10,348	
Stage 3 (N = 18,278)	2145	814	8139	2092	2776	2312	0.651	0.3851	1.298
	11.70%	4.50%	44.50%	11.40%	15.20%	12.60%	N = 11,867	N = 6411	
Stage 4 (N = 32,498)	3518	1794	14,028	4871	4927	3360	0.6487	0.3856	1.292
	10.80%	5.50%	43.20%	15%	15.20%	10.30%	N = 20,988	N = 11,510	
Stage 5 (N = 20,496)	2220	1076	8774	3606	3120	1701	0.6476	0.3857	1.279
	10.80%	5.20%	42.80%	17.60%	15.20%	8.30%	N = 13,104	N = 7393	
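As a worked check on how the FLAG values in Table 5 relate to the post counts, the sketch below averages the per-post flags (0 for negative, 1 for neutral, 2 for positive) using the positive and negative counts listed for each stage. That the FLAG value is such an average, and that no post was labeled neutral, are inferences from the table rather than statements made in the paper.

```python
# (positive posts, negative posts) per stage, copied from Table 5.
STAGE_COUNTS = {
    1: (5263, 6280),
    2: (19375, 10348),
    3: (11867, 6411),
    4: (20988, 11510),
    5: (13104, 7393),
}
# FLAG values reported in Table 5.
REPORTED_FLAG = {1: 0.912, 2: 1.304, 3: 1.298, 4: 1.292, 5: 1.279}


def mean_flag(n_positive, n_negative, n_neutral=0):
    """Average flag when negative = 0, neutral = 1, positive = 2."""
    total = n_positive + n_negative + n_neutral
    return (2 * n_positive + n_neutral) / total


for stage, (pos, neg) in STAGE_COUNTS.items():
    print(stage, round(mean_flag(pos, neg), 4), REPORTED_FLAG[stage])
```

Under these assumptions the computed averages agree with the reported FLAG values to within 0.001 in every stage, so a value above 1 can be read as a majority of positive posts.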

SPSS 22.0 was used to conduct an independent-samples t-test to examine the differences between the microblog emotional tendencies by gender. In the first stage, both males and females expressed mostly negative emotions; however, the negative emotion intensities in the males were lower than in the females (P < 0.05). A greater number of microblogs expressed positive emotions in the second stage, with the males tending to be more positive than the females. As shown in Table 6, the female negative emotions were more significant, whereas the males had a greater number of positive emotions. In the third stage, while both the males and females showed a greater number of positive emotions, the female emotional attitudes were more positive.

Table 6

Difference analysis of gender * different emotional tendencies (2 categories).

	Positive value				Negative value
	Male	Female	t value	p	Male	Female	t value	p
Stage1	0.488	0.467	2.601	0.127	0.548	0.552	−0.624	0.024*
Stage2	0.670	0.640	6.062	***	0.376	0.372	0.595	0.827
Stage3	0.646	0.665	−3.408	0.049*	0.488	0.412	6.602	***
Stage4	0.629	0.675	−0.980	***	0.417	0.344	14.249	***
Stage5	0.623	0.675	−8.317	***	0.418	0.347	11.738	***

Note: *p < 0.05 、**p < 0.01、***p < 0.001.
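The t values in Table 6 come from SPSS. For orientation, the sketch below computes the equal-variance (pooled) form of the independent-samples t statistic on invented intensity scores. Which form of the test the authors reported is not stated, so the pooled choice, the function name, and the data are all assumptions; the p value is omitted because it needs the t distribution, which the standard library does not provide.

```python
import math
from statistics import mean, variance


def independent_t(sample_a, sample_b):
    """Pooled-variance t statistic and degrees of freedom for two samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    pooled_var = (
        (n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)
    ) / (n_a + n_b - 2)
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / std_err, n_a + n_b - 2


# Invented positive-intensity scores, for illustration only.
male_scores = [0.1, 0.2, 0.3]
female_scores = [0.2, 0.3, 0.4]
t_value, dof = independent_t(male_scores, female_scores)
```

A negative t value here means the first group's mean is lower than the second's, which matches the sign convention of the male-minus-female comparisons in the table.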

SPSS was used to draw a cross table (2 × 6) for gender and emotion (six categories). As shown in Table 7, the chi-square test and post hoc comparisons were conducted separately for each of the five stages. In each stage of the epidemic, significant differences were found between the emotions of male and female users. The pairwise comparisons found that, from the first stage to the fifth stage, the anger and disgust of male users were significantly higher than those of female users.

Table 7

Difference analysis of gender * different emotional tendencies (6 categories).

	Stage 1				Stage 2				Stage 3				Stage 4				Stage 5			
	Male	Female	χ²	p	Male	Female	χ²	p	Male	Female	χ²	p	Male	Female	χ²	p	Male	Female	χ²	p
Angry	622_a	600_b	310.80	***	1812_a	1951_a	455.32	***	1346_a	799_a	307.40	***	2077_a	1411_b	886.21	***	1302_a	918_b	670.2	***
Sad	230_a	582_b			633_a	1208_b			395_a	419_b			785_a	1009_b			414_a	662_b		
Happy	1950_a	2388_a			8013_a	8115_b			4669_a	3470_b			7055_a	6973_b			4171_a	4603_b		
Disgust	423_a	440_b			596_a	531_b			1548_a	544_b			3488_a	1383_b			2500_a	1106_b		
Surprise	1323_a	1020_b			1884_a	1226_b			1886_a	890_b			3098_a	1829_b			1898_a	1222_b		
Fear	695_a	1270_b			1501_a	2253_b			1460_a	852_a			1913_a	1447_a			901_a	800_a		

Note: *p < 0.05 、**p < 0.01、***p < 0.001.

Each subscript letter indicates a subset of group categories whose column proportions do not differ significantly from each other at the .05 level.
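The chi-square statistic for a gender by emotion cross table can be reproduced directly from the counts. The sketch below applies the standard Pearson formula to the Stage 1 counts from Table 7. It covers only the overall test, not the post hoc column-proportion comparisons behind the subscript letters, and the function name is hypothetical.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a 2-D table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return stat, dof


# Stage 1 counts from Table 7: angry, sad, happy, disgust, surprise, fear.
stage1 = [
    [622, 230, 1950, 423, 1323, 695],   # male
    [600, 582, 2388, 440, 1020, 1270],  # female
]
stat, dof = chi_square(stage1)
```

With five degrees of freedom, the statistic for these counts comes out at about 310.8, in line with the Stage 1 test value listed in Table 7.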

In the first and second stages, sadness and fear were significantly higher in females than in males. From the third stage onward, sadness remained significantly higher in females, whereas there was no significant difference in fear between females and males.

Difference of microblog user regions and emotional tendencies

Of the 112,537 microblogs collected in this study, 4447 were from Wuhan. Taking the source region of the data as the grouping variable, t-tests were conducted on the emotional tendency and flag values. As shown in Table 8, the flag value of the emotional microblog text from Wuhan was slightly lower than that from outside Wuhan, the negative emotion intensity value was slightly higher in Wuhan, and the positive emotion intensity value was slightly lower in Wuhan; however, the differences were not significant (p > 0.05). Therefore, there were no significant emotional attitude differences between the groups in the Wuhan area and the groups in the non-Wuhan area at any stage.

Table 8

Difference analysis of regions * emotional tendency.

	Region	N	Flag value	t
Stage 1	Wuhan City	1165	0.89	−0.264 (p = 0.880)
Stage 1	Non-Wuhan	10,378	0.91	−0.264 (p = 0.880)
Stage 2	Wuhan City	1282	1.28	−0.881 (p = 0.881)
Stage 2	Non-Wuhan	28,441	1.30	−0.881 (p = 0.881)
Stage 3,4,5	Wuhan City	2000	1.28	−0.320 (p = 0.505)
Stage 3,4,5	Non-Wuhan	50,996	1.29	−0.320 (p = 0.505)

Note: *p < 0.05, **p < 0.01, ***p < 0.001.


Discussion

COVID-19 microblog user emotional psychology

In the first pandemic stage, there were significantly more negative emotional microblogs than positive emotional microblogs, consistent with Pan et al. [41] and Su et al. [46], who found that when the COVID-19 outbreak was announced, the public expressed more negative emotions, such as anxiety, depression, panic, and hypochondriasis. As COVID-19 is a major public health event that can endanger personal safety, if the government does not release timely and transparent news, internet public opinion can polarize and have negative effects. However, expressing negative emotions can also prepare people for dangerous situations in advance and significantly improve group survival probabilities [9]. Although the public mood was still relatively negative in the early pandemic stage, the public was more positive and concerned about the pandemic prevention measures in the later stages. Socioemotional selectivity theory [6] could be used to explain the emotional changes at the start of the pandemic. According to this theory, time perceptions influence people's choice and pursuit of social goals, help them balance long- and short-term goals, and allow them time to adapt to a newly changed environment. When time is perceived as abundant and open-ended, people's goals are mostly preparatory, such as seeking novel experiences that enrich their knowledge. When time is perceived as limited, emotional goals become people's basic pursuits, and goal orientations tend to emphasize feeling states as people seek to manage their emotional states to achieve happiness. During the pandemic, people's perceptions of time changed from being free and abundant to being limited and urgent. As the microblog users became more sensitive to the pandemic policy information being released by the government, achieving emotional and health goals became a basic group pursuit. 
When the reported infections were increasing, the negative emotions had an upward trend and the positive emotions decreased; however, when the state's strong pandemic prevention and control policies were announced, positive emotions increased significantly and negative emotions decreased.

Dynamic changes in the microblog users' emotional psychology under COVID-19

This study reviewed the five COVID-19 stages to track the dynamic changes in the microblog users' emotions through keyword cloud maps and emotional line graphs. Hypothesis H1 was verified, that is, the mood of the internet users changed significantly as the COVID-19 situation evolved, which was consistent with the results of Yang et al. [57], who found that microblogs became more positive as the news became more positive. In addition, Liu et al. [34] analyzed Weibo data from the first five months after the initial COVID-19 outbreak and found that people's happiness and optimism increased and fear and sadness decreased over time. In the first outbreak stage (from December 27, 2019, to January 19, 2020), the microblog users focused on pneumonia, avian flu, and SARS as COVID-19 was still unknown, which meant that the public was curious and searching for relevant information to find out the truth. The communication distortion effect [55] describes how people tend to perceive network risk information from informal platforms, with the threat intensity being amplified or weakened through public understanding and transformation. However, when the risk is recognized by official agencies, there may be a chain reaction in which false and exaggerated information proliferates and public dissatisfaction, anxiety, resentment, and a sense of crisis are vented through certain channels, resulting in group polarization [39] on social media platforms; that is, the group attitudes developed through group discussions tend to be more extreme than the original individual attitudes. Users' discussions of pandemic-related events can lead to group polarization as empathy and personal emotions are continuously amplified through the network information loop, leading to stronger emotions. In the second stage, the positive microblog user emotions increased significantly and the negative emotions decreased (Fig. 3), with the word cloud keywords being focused on influenza, pneumonia, coronavirus, masks, viral, infection, new, and Spring Festival. As early as 1978, Maderthaner et al. proposed the theory of psychological immunity, which stated that frequent contact with potentially threatening objects or events could lead to habituation and familiarity, which reduces people's risk assessment and perception [37] and results in more stable, less negative emotions. In the last three stages, as the pandemic situation came under control, the keywords were focused on encouragement (“jiayou”, literally “refueling”), hope, prevention and control, and safety. People began to isolate themselves at home spontaneously and actively cheer for the medical staff. After February 10, 2020, the domestic pandemic prevention and control efforts continued to improve, and work and production resumed. However, the international pandemic crisis was becoming more serious, with new cases being imported from abroad. Therefore, the pandemic prevention focus began to shift from “internal control” to “external prevention”, with the microblog users showing concern about the potential impact of these imported cases on the existing pandemic prevention achievements and the possibility of a recurrence of the pandemic, which meant that the keywords included “overseas import”, “strict control” and concerns about overseas returnees. At this point, the public was no longer focused on the event itself, but rather on other people's experiences and the brave behavior of health workers, with many people spontaneously demonstrating social support. Wills [52] first proposed the social support theory, which emphasized the importance of family, friends, and society providing support and assistance in stabilizing social psychology. 
At the end of the maturation period, as the pandemic was gradually controlled, the positive microblog user emotions increased significantly, with the microblogs encouraging their families and those in the more severely affected areas.

Netizen gender and emotional difference analysis

This study verified hypothesis H2, that is, significant gender differences were found in netizen sentiment. Females were found to be more emotional and more negative than males in the earlier stages and more positive in the last three stages. Males, by contrast, tended to be more positive in the early stages and more negative in the later stages. In each stage of the pandemic, the proportions of anger, disgust, and surprise in males were significantly higher than in females, and the proportions of sadness and fear in females were higher than in males, with these differences being significant. In the first two stages, the male and female emotional differences may have been due to their different concerns. The word cloud maps for the males and females indicated that the males were more concerned about the process and principle of “COVID-19”, whereas the females displayed greater anxiety and tension. Grossman and Wood [18] conducted a comparative experiment on five different emotions (fear, pleasure, sadness, anger, and love), and found that females experienced emotions more frequently and deeply than males, which was surmised to be because of gender role stereotypes. Fivush et al. [15] studied 21 young children and their parents and concluded that the gender emotional differences were due to children's observations, learning, and imitation, since mothers tended to discuss the emotional aspects of an event and use more emotional words than fathers, and both parents used more emotional expressions when discussing sad events with their daughters than with their sons. These gender differences in emotional expression can have long-term influences [50], which was also reflected in the statistical results of this study. It has also been found that females are more inclined to express support and sympathy for people or things from an emotional perspective, whereas males are more inclined to ask about and analyze problems from a practical perspective [40,47,63]. 
An analysis of the differences between male and female users when expressing neutral emotion and Ekman's six basic emotions (joy, sadness, anger, fear, disgust, and surprise) found that females were more likely to express joy and sadness than males, and males were more likely to express anger and surprise than females [5]. Ashby Plant et al. [2] pointed out that females expressed fear more strongly than males but expressed anger less strongly than males. Many recent studies have found that COVID-19 has widened inequalities, with a greater negative impact on vulnerable groups such as females. Females are more likely to face unemployment, lower work output, or greater income reductions than males [33,38], and because of the need for home isolation, females were expected to take on more housework and family care responsibilities [49]. As a result of these factors, females were more likely to express negative emotions than males at the beginning of the COVID-19 pandemic.

Difference analysis on the emotional tendencies in user regions

The regional difference analysis found that there were few differences between the Wuhan users and the non-Wuhan users; therefore, H3 was not verified. This result was, however, consistent with the “psychological typhoon eye” effect proposed by Li Shu in their study of the May 12 earthquake. The “psychological typhoon eye effect” uses the “typhoon eye” analogy to describe psychological responses to disasters, that is, the closer the time period is to the high-risk stage, the calmer the individual [32,37,53]. Many pandemic studies have highlighted this effect [33], which was also observed during the SARS pandemic in Hong Kong, when it was found that the anxiety level of the residents in the pandemic areas was lower than that of the residents in the non-pandemic areas. Some recent studies have also confirmed the “psychological typhoon eye effect” in analyses of the safety concerns and risk perceptions of people living in five pandemic risk areas (very low risk, low risk, medium risk, medium-high risk, and high risk), where it was found that the closer people were to the high-risk locations, the calmer their psychology, and the further away people were from the high-risk locations, the greater the feelings of panic [56]. Festinger's theory of cognitive dissonance could also be used to explain this phenomenon [14]. When the people in Wuhan were facing the pandemic, their cognitive element 1 “in Wuhan” and cognitive element 2 “Wuhan is not safe” were possibly in psychological conflict; therefore, the Wuhan citizens experienced cognitive dissonance and may have reduced it by thinking “Wuhan is safe”. Understanding this phenomenon could also help governments formulate emergency strategies to more quickly reduce public tension.

Conclusion

After careful analysis and discussion, the following conclusions were drawn. Microblog user emotions moved from negative and fearful to positive from the second stage of the COVID-19 pandemic. There were no significant differences in the microblog user emotions between areas with different local COVID-19 severity. In the first two COVID-19 stages, males were more emotionally positive than females, but in the latter three stages, the females expressed more optimistic emotions and fewer negative emotions. Over the whole examined period, females published more microblogs and expressed more sadness and fear, and males expressed more anger and disgust.

Summary and prospects

Based on Ekman's six basic emotions classification theory, other psychological theories, and machine learning technologies, the Chinese internet users' dynamic emotional changes in response to the COVID-19 pandemic were examined to reveal the trends in public opinion. The results provided insight into the public emotional and psychological changes experienced during crises, which could assist authorities in better addressing such crises. The interdisciplinary approach, which combined data mining and sentiment analysis, provided a more in-depth view of public psychology during crises. Considering the long-term nature of COVID-19, future studies could extend the time span of the data collected and increase the amount of data studied.

Limitations

However, this study had several limitations. First, as only the Sina microblog social media platform was consulted, the data sources were relatively narrow. Further, because most internet users in China are young people, with Sina microblog users being primarily 18–41 years old [59], not all age ranges were represented, that is, there was selection bias. This study also excluded other popular social media data sources such as WeChat, Douyin, and Douban and was limited to the information released by users on the Sina microblog platform. Subsequent related research could examine a wider range of media platform data to more comprehensively extract public opinion information.

Declaration of Competing Interest

The authors declare no conflict of interest.
