Lizhou Fan1,2, Huizi Yu2,3, Zhanyuan Yin4,3.
Abstract
As the COVID-19 pandemic has unfolded, Hate Speech on social media about China and Chinese people has encouraged social stigmatization. For historical and humanistic purposes, this history-in-the-making needs to be archived and analyzed. Using the query "china+and+coronavirus" to scrape from the Twitter API, we have obtained 3,457,402 key tweets about China relating to COVID-19. In this archive, in which about 40% of the tweets are from the U.S., we identify 25,467 Hate Speech occurrences and analyze them according to lexicon-based emotions and demographics using machine learning and network methods. The results indicate substantial associations between the amount of Hate Speech and both expressed sentiments and state demographic factors. Sentiments of surprise and fear associated with poverty and unemployment rates are prominent. This digital archive and the related analyses are therefore not simply historical: they play vital roles in raising public awareness and mitigating future crises. Consequently, we regard our research as a pilot study in methods of analysis that might be used by other researchers in various fields. 83rd Annual Meeting of the Association for Information Science & Technology, October 25‐29, 2020. Author(s) retain copyright, but ASIS&T receives an exclusive publication license.
Keywords: COVID‐19; Twitter; coronavirus; hate speech; pandemic
Year: 2020 PMID: 33173820 PMCID: PMC7645876 DOI: 10.1002/pra2.313
Source DB: PubMed Journal: Proc Assoc Inf Sci Technol
FIGURE 1 Top 205 hashtags, including inappropriate naming of COVID‐19. Note: An interactive version of the word cloud is available at: https://voyant-tools.org/?corpus=b60fb002d2b470b44fc7b0c3133cf9df&visible=205&view=Cirrus
Selected highest-occurring hashtags with frequency rank
| Rank | Term | Count | Rank | Term | Count |
|---|---|---|---|---|---|
| 1 | wuhan | 24080 | 9 | taiwan | 6009 |
| 2 | italy | 13536 | 10 | chinesevirus | 5191 |
| 6 | hongkong | 6643 | 12 | wuhanvirus | 4557 |
| 7 | usa | 6554 | 15 | hubei | 4208 |
| 8 | iran | 6212 | 16 | russia | 3895 |
FIGURE A1 Top five hashtags per day and their numbers of occurrences
FIGURE 2 Number of new cases in the U.S. and number of tweets per day in the U.S.
FIGURE 3 Number of new cases per day in the U.S. (blue bars) and sentiment and Hate Speech trends
Confusion Matrix of the Test Set
| | Actual Positive | Actual Negative |
|---|---|---|
| Predicted Positive | 691877 | 0 |
| Predicted Negative | 610 | 4418 |
Confusion Matrix Measurements
| Measurements | Value |
|---|---|
| Accuracy | 99.91% |
| Sensitivity | 99.91% |
| Specificity | 100.00% |
| Precision | 100.00% |
| NPV | 87.87% |
Note: NPV is Negative Predictive Value.
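The five measurements can be recomputed from the four counts in the confusion matrix. The following is a minimal sketch, assuming the first row of counts (691877, 0) holds the predicted-positive cases and the second row (610, 4418) the predicted-negative cases; the function name `confusion_metrics` is hypothetical and not taken from the paper.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Return standard confusion-matrix measurements as fractions in [0, 1]."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,      # all correct / all cases
        "sensitivity": tp / (tp + fn),      # true positive rate
        "specificity": tn / (tn + fp),      # true negative rate
        "precision": tp / (tp + fp),        # positive predictive value
        "npv": tn / (tn + fn),              # negative predictive value
    }


# Counts from the test-set confusion matrix (row assignment is an assumption).
metrics = confusion_metrics(tp=691877, fp=0, fn=610, tn=4418)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Under this reading of the rows, the rounded results agree with the reported values, which supports interpreting the rows as predicted labels and the columns as actual labels.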
FIGURE A2 Decision Tree Classifier generated by classifying Hate Speech using training data
Importance of Each Emotional Feature in Picking Hate Speech
| Emotion | Feature Importance |
|---|---|
| Surprise | 28.28% |
| Fear | 22.78% |
| Anticipation | 13.43% |
| Anger | 10.99% |
| Disgust | 7.80% |
| Trust | 7.77% |
| Sadness | 6.83% |
| Joy | 2.11% |
FIGURE 4 Important Decision Nodes
Basic Rules of Emotion to Distinguish Hate Speech
| Emotion | Low (0 ∼ 0.5) | Medium (0.5 ∼ 1.5) | High (1.5 ∼) |
|---|---|---|---|
| Surprise | ✓ | ||
| Fear | ✓ | ✓ | |
| Anticipation | ✓ | ||
| Anger | ✓ | ||
| Disgust | ✓ | ✓ | |
| Trust | ✓ | ✓ | |
| Sadness | ✓ | ||
| Joy | ✓ | ✓ |
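The three score bands in the table header can be expressed as a small binning function. This is only a sketch: the function name `emotion_level` is hypothetical, and the treatment of the boundary values 0.5 and 1.5 (assigned here to the higher band) is an assumption, since the table does not say which band includes them.

```python
def emotion_level(score):
    """Map an emotion score to the Low / Medium / High bands of the table.

    Assumption: bands are half-open, i.e. [0, 0.5), [0.5, 1.5), [1.5, inf).
    """
    if score < 0:
        raise ValueError("emotion scores are expected to be non-negative")
    if score < 0.5:
        return "Low"
    if score < 1.5:
        return "Medium"
    return "High"


# Hypothetical scores for a single tweet, for illustration only.
example_scores = {"surprise": 0.2, "fear": 1.0, "anger": 2.3}
levels = {emotion: emotion_level(s) for emotion, s in example_scores.items()}
print(levels)
```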
FIGURE 6 Network Construction and Result
Modularity of the Network by Different Grouping Criteria
| Factor | Political Party | Poverty Rate | Asian Percentage | Unemployment Rate |
|---|---|---|---|---|
| Modularity | −0.0624 | 0.0244 | −0.0013 | 0.0399 |
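For readers unfamiliar with modularity, the sketch below computes the standard Newman modularity of a fixed grouping on an undirected, unweighted graph without self-loops. It is an illustration under those assumptions, not the paper's implementation: the source does not state how edges were weighted or which software was used, and the function name `modularity` and the toy graph are hypothetical.

```python
from collections import defaultdict


def modularity(edges, group_of):
    """Newman modularity Q = sum_g [ e_g / m - (d_g / (2m))^2 ].

    edges    : list of (u, v) pairs of an undirected, unweighted graph
    group_of : dict mapping each node to its group label
    e_g is the number of edges inside group g, d_g the total degree of g,
    and m the total number of edges.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("modularity is undefined for a graph with no edges")
    intra = defaultdict(int)       # edges with both endpoints in the group
    degree_sum = defaultdict(int)  # sum of node degrees per group
    for u, v in edges:
        degree_sum[group_of[u]] += 1
        degree_sum[group_of[v]] += 1
        if group_of[u] == group_of[v]:
            intra[group_of[u]] += 1
    return sum(intra[g] / m - (degree_sum[g] / (2 * m)) ** 2 for g in degree_sum)


# Toy graph: two triangles joined by a single edge.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"),
             ("D", "E"), ("E", "F"), ("D", "F"),
             ("C", "D")]
by_triangle = {"A": 1, "B": 1, "C": 1, "D": 2, "E": 2, "F": 2}
q = modularity(toy_edges, by_triangle)
print(round(q, 4))
```

A positive value means more edges fall within groups than expected by chance given the node degrees, and a negative value means fewer. Values close to zero, like those in the table, indicate that the grouping explains little of the network's structure.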
FIGURE 7 Centrality Measurements of Network
FIGURE 5 Comparison between the number of occurrences of Hate Speech and confirmed cases: California, Texas, Florida, and New York, all states with high numbers of confirmed cases, are also the locations associated with large numbers of Hate Speech occurrences
Examples of Hate Speech on Twitter
| Tweet ID | Date | Tweet |
|---|---|---|
| 122348490476099**** | 2/1/2020 | After SARS, Bird Flu, Swine flu and all that, how could something like coronavirus be allowed to spread for so long that the first person to get it is elusive? China do not f**k around, one c**t starts coughing the whole province is in masks in minutes. |
| 123718116467216**** | 3/10/2020 | What do we call this China Coronavirus then what it is and you Idiot's call us Racist I put it where it should belong at the feet of the China Government we did not get this from another Country it came from China now we are all in it's path so keep on being ignorant about this |
| 123795213637523**** | 3/12/2020 | If they diagnosed all these people with the coronavirus in China 2 months ago then why TF DID THEY EVEN LET THEM LEAVE CHINA, WHAT IDIOT LET THESE PEOPLE GO AROUND THE WORLD AND SPREAD IT |
| 123815218206889**** | 3/12/2020 | Well, yes, the novel coronavirus COVID 19 did, in fact, originate from Wuhan, China probably from filthy Commie Ch*nk s open air meat markets the Wuhan Virus is a foreign virus. Call it the ChineseVirus, perhaps. And, out of spite, Americans should boycott Chinese stuff. |
| 123815202117115**** | 3/12/2020 | Down ∼£5 k on my investments because some mad c**t eat a bat in China and started the coronavirus. Mad ting |
| 124076900418213**** | 3/19/2020 | Anyone who thinks China the pestilence nation is a leader in this is an idiot China covered this up for at least 2 months and let it fester. If anything they declared biological war on the planet. FoxNews |
| 124255771854143**** | 3/24/2020 | Anyone who believes Russia's numbers is an idiot. And I do not believe China's numbers either. Bullshit COVID19 coronavirus |
Note: Hate Speech occurrences are marked red with parts masked for the sake of the scholarly audience; the last four digits of Tweet IDs are masked for privacy reasons.
Descriptive Statistics of State Level Data
| Category | Variable | Mean | 95% CI Lower | 95% CI Upper |
|---|---|---|---|---|
| State | Total Tweets (keyword-containing) | 13466.80 | 7995.17 | 18938.43 |
| | Hate Speech Percentage | 0.00594 | 0.00520 | 0.00653 |
| Emotion | Anger Score | 0.01905 | 0.01882 | 0.01929 |
| | Anticipation Score | 0.02252 | 0.02234 | 0.02269 |
| | Disgust Score | 0.01635 | 0.01611 | 0.01660 |
| | Fear Score | 0.02708 | 0.02670 | 0.02745 |
| | Joy Score | 0.00882 | 0.00872 | 0.00892 |
| | Sad Score | 0.02432 | 0.02393 | 0.02470 |
| | Surprise Score | 0.01710 | 0.01691 | 0.01729 |
| | Trust Score | 0.02763 | 0.02739 | 0.02786 |
| | Positive Score | 0.03213 | 0.03193 | 0.03232 |
| | Negative Score | 0.05219 | 0.05158 | 0.05280 |
| Demographic | Asian Percentage | 0.04195 | 0.02661 | 0.05729 |
| | Unemployment Rate | 0.03562 | 0.03318 | 0.03806 |
| | Poverty Rate | 0.12864 | 0.12085 | 0.13643 |

| Political Party | Percentage |
|---|---|
| Democratic | 0.38 |
| Republican | 0.62 |
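A mean with a symmetric 95% confidence interval, like the Total Tweets row (13466.80 with bounds 5471.63 below and above), can be computed as sketched below. The source does not state how the intervals were obtained, so the normal approximation with z = 1.96 is an assumption, and the function name `mean_ci95` and the sample values are hypothetical.

```python
import math
import statistics


def mean_ci95(values, z=1.96):
    """Return (mean, lower, upper) using a normal-approximation 95% CI.

    Assumption: interval = mean +/- z * sample_std / sqrt(n); a t-based
    interval would be slightly wider for small samples.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = statistics.fmean(values)
    half_width = z * statistics.stdev(values) / math.sqrt(n)
    return mean, mean - half_width, mean + half_width


# Hypothetical per-state values, for illustration only.
sample = [1.0, 2.0, 3.0, 4.0, 5.0]
mean, lower, upper = mean_ci95(sample)
print(f"mean={mean:.4f}, 95% CI=({lower:.4f}, {upper:.4f})")
```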