Runbin Xie, Samuel Kai Wah Chu, Dickson Kak Wah Chiu, Yangshu Wang.
Abstract
Understanding public responses to crises, including disease outbreaks, is important. Surveys have traditionally played an essential role in collecting public opinion; with the growing popularity of social media, mining social media data has become another widely used tool in opinion mining research. To understand the public response to COVID-19 on Weibo, this research collects 719,570 Weibo posts through a web crawler and analyzes them with text mining techniques, including Latent Dirichlet Allocation (LDA) topic modeling and sentiment analysis. The results show that, in response to the COVID-19 outbreak, people learn about COVID-19, show their support for frontline warriors, encourage each other spiritually, and, alongside taking preventive measures, express concerns about the restoration of the economy and everyday life, among other themes. Analysis of sentiments and semantic networks further reveals that country media, influential individuals, and "self-media" together contribute to the spread of positive-sentiment information.
Keywords: COVID-19; LDA; Weibo; sentiment analysis; web crawling
Year: 2022 PMID: 35402850 PMCID: PMC8975181 DOI: 10.2478/dim-2020-0023
Source DB: PubMed Journal: Data Inf Manag ISSN: 2543-9251
Popular Themes in Survey-Based Research of COVID-19 Outbreak
| Themes | Sample research |
|---|---|
| Knowledge, attitudes & practices (KAP) | |
| Psychological stress | |
| Information seeking | |
| Misinformation (fake news) | |
| Sensitive individuals, including front-line hospital staff and recovered patients | |
| Attitudes towards government actions on disease control | |
Research on Disease Outbreaks on Social Media
| Author | Disease outbreak | Social media | Methods | Findings |
|---|---|---|---|---|
| | Measles | Twitter and others | Thematic analysis | People on Twitter cared about disease transmission, preventive actions, and vaccination; governments needed to promote vaccination acceptability. |
| | MERS-CoV & H7N9 | Weibo | Statistical analysis of the number of Weibo posts | Weibo users reacted to the disease outbreak significantly, and people paid more attention to the H7N9 outbreak. |
| | Dengue | | Analysis of the number of posts and spatial information | Spatially and temporally, there was a correlation between the number of posts and disease development trends. |
| | H1N1 | | Manual and automated coding | Several sentiments, including confusion, humor, and risk, were discovered, among which humor was the most popular. |
| | Influenza | Twitter | Modeling | A prediction model built on Twitter data could be used for influenza outbreak alerts. |
| | H7N9 | Weibo | Analysis of the number of Weibo posts and the number of new confirmed cases | There was a positive correlation between discussion and disease outbreak level, and Weibo served as a good medium to promote public health communication. |
| | COVID-19 | Weibo | Machine learning algorithms | Weibo posts were classified into seven categories of situational information. Useful text features should be helpful in building an emergency response system. |
Figure 1. Methodological framework of the proposed research
Selected Keywords for Data Collection
| Keyword | Translation |
|---|---|
| ?? | Virus |
| ?? | (Confirmed/suspicious) case |
| ?? | Pneumonia |
| ?? | COVID |
| ???? | Coronavirus |
| ?? | Disease outbreak |
Figure 2. A screenshot of some collected raw data
Figure 3. Word cloud of top 500 terms in the cleaned dataset
Top 50 Words in the Cleaned Dataset
| No. | Term | Translation | Frequency | No. | Term | Translation | Frequency | No. | Term | Translation | Frequency |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ?? | Disease outbreak | 153,855 | 18 | ?? | Country | 23,856 | 35 | ?? | Import | 15,642 |
| 2 | ?? | Pneumonia | 126,294 | 19 | ?? | Quarantine | 23,311 | 36 | ?? | Vaccine | 15,560 |
| 3 | ?? | COVID | 98,978 | 20 | ?? | Court | 23,168 | 37 | ?? | Treatment | 15,506 |
| 4 | ?? | Case | 97,597 | 21 | ?? | Work | 22,905 | 38 | ?? | Genuine | 15,478 |
| 5 | ?? | Virus | 71,299 | 22 | ?? | Hope | 22,836 | 39 | ?? | Staff | 14,722 |
| 6 | ?? | Novel | 66,986 | 23 | ?? | Death | 21,772 | 40 | ?? | Situation | 14,243 |
| 7 | ?? | Infected | 65,884 | 24 | ?? | Hospital | 21,209 | 41 | ?? | Hubei | 13,338 |
| 8 | ???? | Coronavirus | 60,636 | 25 | ?? | Test | 21,195 | 42 | ?? | Beijing | 13,018 |
| 9 | ?? | America | 52,716 | 26 | ?? | Add oil | 20,350 | 43 | ?? | Health | 13,005 |
| 10 | ?? | China | 50,935 | 27 | ?? | Globe | 19,263 | 44 | ?? | Report | 12,769 |
| 11 | ?? | Wuhan | 45,959 | 28 | ?? | Time | 19,247 | 45 | ?? | Period | 12,223 |
| 12 | ?? | Infected | 38,556 | 29 | ?? | Accumulate | 18,367 | 46 | ?? | Abroad | 12,162 |
| 13 | ?? | Disease control | 33,447 | 30 | ?? | Fight | 17,996 | 47 | ?? | Discharged | 11,678 |
| 14 | ?? | Video | 32,489 | 31 | ?? | Folk | 17,381 | 48 | ?? | Doctor | 1,111 |
| 15 | ?? | Patients | 31,036 | 32 | ?? | News | 17,000 | 49 | ?? | World | 11,558 |
| 16 | ?? | New case | 28,674 | 33 | ?? | Discover | 16,799 | 50 | ??? | Yuhua district | 11,391 |
| 17 | ?? | Face mask | 23,942 | 34 | ?? | Nation | 15,650 | | | | |
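A frequency ranking like the one in the table above can be produced by counting tokens across the cleaned posts. The following is a minimal sketch, not the authors' pipeline: it assumes the posts have already been segmented into word lists, and the function name `top_terms`, the stopword set, and the toy English tokens are all hypothetical.

```python
from collections import Counter


def top_terms(tokenized_posts, stopwords=frozenset(), n=50):
    """Return the n most frequent terms across already-tokenized posts."""
    counts = Counter(
        token
        for post in tokenized_posts
        for token in post
        if token not in stopwords
    )
    # most_common sorts by count, highest first
    return counts.most_common(n)


# Hypothetical toy input; real input would be segmented Weibo posts.
posts = [
    ["outbreak", "pneumonia", "the"],
    ["outbreak", "mask"],
    ["outbreak", "pneumonia", "hope"],
]
ranking = top_terms(posts, stopwords={"the"}, n=2)
print(ranking)
```

For Chinese text, a word segmentation step would have to run before counting; the source does not say which segmenter was used.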
Figure 4. Perplexity scores of models under different settings of number of topics
Figure 5. Coherence scores of models under different settings of number of topics
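For reference, the perplexity of a topic model on a held-out set of $M$ documents is conventionally defined as below, where $\mathbf{w}_d$ is the word sequence of document $d$ and $N_d$ its length. This is the standard textbook definition; the source does not state which implementation or exact variant produced the scores in Figure 4, so the symbols here are assumptions.

```latex
\[
\operatorname{perplexity}(D_{\text{test}})
  = \exp\!\left( - \frac{\sum_{d=1}^{M} \log p(\mathbf{w}_d)}{\sum_{d=1}^{M} N_d} \right)
\]
```

Lower perplexity indicates a better fit to held-out text, whereas for coherence (Figure 5) higher scores are usually read as more interpretable topics, so the two curves are typically considered together when choosing the number of topics.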
Figure 6. LDA model visualization
Top 30 Most Salient Terms of Each Topic and Topic Coding Results
| Topic ID | Topic Label | 10 representative words selected from the top 30 most salient words |
|---|---|---|
| 1 | Fight the virus together | |
| 2 | Knowledge | ?? (research), ?? (expert), ?? (reason), ?? (popular science), |
| 3 | Assistance | ?? (work), ?? (fight), ?? (come back), ?? (nation), |
| 4 | Economics | ?? (globe), ?? (economics), ?? (influence), |
| 5 | Global pandemic | ?? (Iran), ?? (case), ?? (UK), ?? (disease outbreak), ?? (new case), ?? (Japan), ??? (Italy), ?? (accumulate), ?? (urgent), ?? (abroad) |
| 6 | Prevention | ?? (disease outbreak), ?? (face mask), ?? (promotion), ?? (prevention), ?? (do well in), ?? (prevent), ?? (fight the virus), ?? (risk), ?? (health), ?? (measure) |
| 7 | Treatment | ?? (quarantine), ?? (discharged), ??? (no symptoms), ???? (medical observation), ???? (close contact), ?? (cure), ?? (fever), ?? (severely ill), ?? (confirmed infection), ?? (treatment) |
| 8 | Stay at home | ?? (period), ?? (life), ?? (this time), ?? (happy), ?? (go back home), ?? (at home), ?? (like), ?? (thing), ?? (friend), ?? (home) |
| 9 | Law | ?? (court), ?? (judge), ?? (folk), ?? (breach the law), ?? (case), ?? (defendant), ?? (proof), ?? (audio recording), ?? (according to the law), ?? (truth) |
| 10 | Study | ?? (school), ?? (children), ?? (start school), ?? (student), ?? (start school), ?? (university), ?? (major), ?? (parents), ?? (homework), ?? (college entrance examination) |
| 11 | Celebrity and charity | ??? (Xukun Cai), ?? (Zi Yang), ?? (charity), ?? (super topic), ?? (fan), ?? (dawn), ?? (charity), ?? (Zhan Xiao), ?? (protect), ?? (defeat) |
| 12 | People | ?? (live stream), ?? (son), ?? (younger brother), ?? (grandmother), ?? (vocation), ?? (husband), ?? (idol), ?? (value), ?? (boost popularity), ?? (meet) |
Figure 7. Number of positive and negative posts over time
Top 10 Frequent Terms Extracted from Posts at Each Peak
| No. | Peak 1 top term | Translation | Freq. | Peak 2 top term | Translation | Freq. | Peak 3 top term | Translation | Freq. |
|---|---|---|---|---|---|---|---|---|---|
| 1 | ?? | Hope | 470 | ?? | Protect | 767 | ?? | Disease outbreak | 533 |
| 2 | ?? | Wuhan | 388 | ?? | Disease outbreak | 374 | ?? | Case | 395 |
| 3 | ?? | Disease outbreak | 320 | ?? | Hope | 364 | ?? | COVID-19 | 287 |
| 4 | ?? | Pneumonia | 260 | ?? | Defeat | 351 | ?? | Coronavirus | 263 |
| 5 | ?? | Add oil | 235 | ?? | This | 350 | ?? | Pneumonia | 193 |
| 6 | ?? | Novel | 204 | ?? | Fight the virus | 344 | ?? | Confirmed infection | 189 |
| 7 | ?? | Safe and sound | 179 | ??? | Fight the virus | 313 | ?? | China | 185 |
| 8 | ???? | Coronavirus | 173 | ?? | Information | 306 | ?? | Vaccine | 173 |
| 9 | ?? | Virus | 157 | ??? | Good kids | 288 | ?? | Accumulate | 148 |
| 10 | ?? | 1 year | 154 | ???? | Eliminate the false and retain the true | 263 | ?? | Hope | 141 |
Figure 8. Visualization of semantic network of positive sentiment
Figure 9. Visualization of semantic network of negative sentiment
Quartiles of Normalized Reposting Frequencies of Each Semantic Network
| Sentiment | Min | Q1 | Median | Q3 | Max |
|---|---|---|---|---|---|
| Positive | 0.000 | 0.002 | 0.013 | 0.109 | 1.000 |
| Negative | 0.000 | 0.000 | 0.000 | 0.002 | 1.000 |
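A summary of this shape can be computed by rescaling the reposting counts of each network and taking quartiles. The sketch below is illustrative only: the source does not say how the frequencies were normalized, so min-max scaling is an assumption (chosen because it yields a minimum of 0 and a maximum of 1, as in the table), and the function names and sample counts are hypothetical.

```python
from statistics import quantiles


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1] (assumed normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def five_number_summary(values):
    """Return min, Q1, median, Q3, and max of a list of numbers."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
    }


# Hypothetical reposting counts for the nodes of one semantic network.
repost_counts = [0, 2, 4, 6, 8]
summary = five_number_summary(min_max_normalize(repost_counts))
print(summary)
```

Quartile conventions differ between tools, so values computed this way may not match a table produced with a different interpolation method.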