Raghad Alshalan, Hend Al-Khalifa, Duaa Alsaeed, Heyam Al-Baity, Shahad Alshalan.
Abstract
BACKGROUND: The massive scale of social media platforms requires automatic solutions for detecting hate speech, which reduce the need for manual content analysis. Most previous literature has cast hate speech detection as a supervised text classification task using classical machine learning methods or, more recently, deep learning methods. However, work investigating this problem in Arabic cyberspace is still limited compared to the published work on English text.
Keywords: CNN; COVID-19; NMF; Twitter; convolutional neural network; coronavirus; deep learning; hate speech; non-negative matrix factorization; pandemic; public health; social media; social network analysis
Year: 2020 PMID: 33207310 PMCID: PMC7725497 DOI: 10.2196/22609
Source DB: PubMed Journal: J Med Internet Res ISSN: 1438-8871 Impact factor: 5.428
Figure 1. Methodology workflow. CNN: convolutional neural network; NMF: nonnegative matrix factorization.
Total numbers of hate tweets and non–hate tweets in the Arab region during the period of study (N=547,554).
| Type of tweet | Number of tweets, n (%) |
| Non–hate tweets | 535,811 (97.8) |
| Hate tweets | 11,743 (2.1) |
| Hate level^a | |
| Low | 8385 (71.4) |
| Average | 3018 (25.7) |
| High | 340 (2.9) |
^a Based on scores assigned by the convolutional neural network model (low: 0.50-0.67; average: 0.68-0.85; high: 0.86-1.00).
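The score bands in the footnote above can be written as a small lookup. The following is a minimal sketch, not the authors' code: it assumes the model outputs a score between 0 and 1, that scores are rounded to two decimals before banding, and that anything below 0.50 counts as non-hate. The function name `hate_level` is hypothetical.

```python
def hate_level(score: float) -> str:
    """Map a classifier score in [0, 1] to the bands quoted in the footnote.

    Assumption: scores are rounded to two decimals first, so the published
    bands (0.50-0.67, 0.68-0.85, 0.86-1.00) leave no gaps between them.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1]")
    s = round(score, 2)
    if s < 0.50:
        return "non-hate"  # assumed label for scores below the lowest band
    if s <= 0.67:
        return "low"
    if s <= 0.85:
        return "average"
    return "high"


for example in (0.55, 0.70, 0.90):
    print(example, hate_level(example))
```

Rounding before comparing keeps the three bands contiguous; a different treatment of in-between values would be equally defensible.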
Figure 2. Number of hate tweets (red) and total tweets (black) in each Arab country, where a darker color depicts a higher number of hate tweets in that country (MA: Morocco, MR: Mauritania, DZ: Algeria, TN: Tunisia, LY: Libya, EG: Egypt, JO: Jordan, LB: Lebanon, SY: Syria, IQ: Iraq, SA: Saudi Arabia, YE: Yemen, KW: Kuwait, QA: Qatar, AE: United Arab Emirates, OM: Oman).
Statistics of COVID-19–related hate tweets posted in Arab countries.
| Variable | Saudi Arabia | Kuwait | Egypt | UAE^a | Lebanon | Yemen | Jordan | Oman | Iraq | Mauritania | Other Arab countries^b |
| Population (million) | 34 | 4 | 102 | 10 | 7 | 30 | 10 | 5 | 40 | 5 | 187 |
| Posted tweets (N=535,811), n (%) | 165,536 (24.2) | 102,945 (15.0) | 58,629 (8.6) | 38,409 (5.6) | 34,646 (5.1) | 27,059 (4.0) | 20,293 (3.0) | 19,344 (2.8) | 18,188 (2.7) | 9976 (1.5) | 52,529 (7.1) |
| Hate tweets (n=11,743), n (%) | 4321 (2.6) | 1526 (1.5) | 995 (1.7) | 314 (0.8) | 892 (2.6) | 947 (3.5) | 226 (1.1) | 130 (0.7) | 802 (4.4) | 188 (1.9) | 1402 (2.7) |
| Hate level^c | | | | | | | | | | | |
| Low (n=8385), n (%) | 2813 (65.1) | 1153 (75.6) | 747 (75.0) | 245 (78.0) | 700 (78.5) | 640 (67.5) | 160 (70.8) | 104 (80.0) | 624 (77.8) | 153 (81.4) | 1046 (74.6) |
| Average (n=3018), n (%) | 1308 (30.3) | 347 (22.7) | 224 (22.5) | 65 (20.7) | 184 (20.6) | 277 (29.2) | 62 (27.4) | 21 (16.6) | 168 (20.9) | 33 (17.6) | 329 (23.5) |
| High (n=340), n (%) | 200 (4.6) | 26 (1.7) | 24 (2.4) | 4 (1.3) | 8 (0.9) | 30 (3.2) | 4 (1.8) | 5 (3.8) | 10 (1.2) | 2 (1.1) | 27 (1.9) |
| Average hate level score | 0.643 | 0.613 | 0.617 | 0.606 | 0.605 | 0.634 | 0.623 | 0.602 | 0.610 | 0.602 | 0.623 |
| COVID-19 cases (n=78,037) | 22,753 | 4024 | 5537 | 12,481 | 725 | 6 | 453 | 2348 | 2003 | 7 | 27,700 |
| COVID-19 deaths (n=1593) | 162 | 26 | 392 | 105 | 24 | 0 | 8 | 11 | 92 | 1 | 772 |
^a UAE: United Arab Emirates.
^b Other Arab countries: this column combines the results of the remaining Arab countries (Palestine, Algeria, Libya, Bahrain, Morocco, Qatar, Sudan, Syria, Tunisia, Comoros, and Somalia).
^c Based on scores assigned by the convolutional neural network model (low: 0.50-0.67; average: 0.68-0.85; high: 0.86-1.00).
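The percentages in the "Hate tweets" row appear to be per-country rates, that is, hate tweets divided by the tweets posted from the same country. A minimal sketch of that arithmetic follows, using counts copied from five columns of the table; the helper name `hate_rate` and the rounding to one decimal place are assumptions made for illustration.

```python
# (posted tweets, hate tweets) copied from the table above for five countries.
counts = {
    "Saudi Arabia": (165_536, 4_321),
    "Kuwait": (102_945, 1_526),
    "Egypt": (58_629, 995),
    "Yemen": (27_059, 947),
    "Iraq": (18_188, 802),
}


def hate_rate(posted: int, hate: int) -> float:
    """Hate tweets as a percentage of posted tweets, rounded to one decimal."""
    if posted <= 0:
        raise ValueError("posted must be positive")
    return round(100 * hate / posted, 1)


rates = {country: hate_rate(*pair) for country, pair in counts.items()}

# List countries from the highest to the lowest rate.
for country, rate in sorted(rates.items(), key=lambda kv: -kv[1]):
    print(f"{country}: {rate}%")
```

Ranking by rate rather than by raw count changes the picture: Saudi Arabia has the most hate tweets in absolute terms, while Iraq and Yemen have higher rates.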
Figure 3. Number of COVID-19–related hate tweets per country with the average hate level scores in brackets (low: 0.50-0.67; average: 0.68-0.85; high: 0.86-1.00). UAE: United Arab Emirates.
Statistics of COVID-19–related hate tweets per time period (N=547,554).
| Variable | January 27-February 29, 2020 | March 1-30, 2020 | April 1-30, 2020 |
| Total tweets (N=547,554), n (%) | 118,991 (21.7) | 253,806 (46.4) | 174,757 (31.9) |
| Hate tweets (n=11,743), n (%) | 3014 (25.7) | 6095 (51.9) | 2634 (22.4) |
| Hate level^a, n (%) | | | |
| Low | 2198 (72.9) | 4300 (70.5) | 1887 (71.6) |
| Average | 741 (24.6) | 1617 (26.5) | 660 (25.1) |
| High | 75 (2.5) | 178 (2.9) | 87 (3.3) |
| Average hate level score | 0.622 | 0.628 | 0.625 |
| COVID-19 cases | 133 | 7447 | 70,092 |
| COVID-19 deaths | 0 | 202 | 1350 |
^a Based on scores assigned by the convolutional neural network model (low: 0.50-0.67; average: 0.68-0.85; high: 0.86-1.00).
Figure 4. Numbers of hate tweets and numbers of COVID-19 cases and deaths per time period with the average hate level scores in brackets (low: 0.50-0.67; average: 0.68-0.85; high: 0.86-1.00).
Identified topics in hate tweets and examples of the most common words in each topic.
| Topic number | Main theme | Examples of the top unigrams and bigrams | Distribution |
| 1 | China and COVID-19 | Cursed, China, curse China, Curses on, life, new | 4.3% |
| 2 | Iran as a source of COVID-19 | Export, Gulf, country, Terrorism, disease, spread | 5.07% |
| 3 | Saudi citizens visiting Iran | Saudi Arabia, Saudi, Bahrain, citizen, travel, passport | 9.44% |
| 4 | Chinese eating habits and COVID-19 | Dog, animal, bat, eating, meat, snake | 5.72% |
| 5 | China and Uighur | China, Muslim, Uyghur, Chinese, pig, punishment | 7.38% |
| 6 | Iran regime | Regime, Mullahs, Iran, Iranian People, Tehran, outbreak | 5.60% |
| 7 | General political tweets, conspiracies, COVID-19 as an exaggerated threat | Country, people, disease, party, Iraq, Egypt | 62.45% |
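The topics above were obtained with nonnegative matrix factorization (NMF) over unigrams and bigrams. To illustrate the idea only, the sketch below factors a tiny document-term count matrix V into nonnegative factors W and H using the classic multiplicative update rules. Everything here is an assumption made for illustration: the toy English documents, the function names, the raw-count weighting, and the choice of update rule are not taken from the paper, and a library implementation such as scikit-learn's `NMF` would normally be used instead.

```python
import random
from collections import Counter


def ngrams(tokens, n_max=2):
    """Return every unigram and bigram of a token list as strings."""
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]


def term_matrix(docs):
    """Build a documents-by-terms count matrix and its sorted vocabulary."""
    counts = [Counter(ngrams(d.lower().split())) for d in docs]
    vocab = sorted(set().union(*counts))
    return [[float(c[t]) for t in vocab] for c in counts], vocab


def transpose(a):
    return [list(row) for row in zip(*a)]


def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def frobenius_error(v, w, h):
    """Frobenius norm of V - WH, the quantity the updates drive down."""
    wh = matmul(w, h)
    return sum((x - y) ** 2
               for rv, rw in zip(v, wh) for x, y in zip(rv, rw)) ** 0.5


def nmf(v, k, iters=200, seed=0, eps=1e-9):
    """Factor V (docs x terms) into W (docs x k) and H (k x terms), all >= 0."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), elementwise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        # W <- W * (V H^T) / (W H H^T), elementwise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return w, h


def top_terms(h, vocab, n=3):
    """For each topic (row of H), list the n highest-weighted terms."""
    return [[vocab[j] for j in sorted(range(len(vocab)), key=lambda j: -row[j])[:n]]
            for row in h]


# Hypothetical toy corpus standing in for preprocessed tweets.
docs = [
    "virus spread country",
    "virus spread disease",
    "eating bat meat",
    "eating bat animal",
]
v, vocab = term_matrix(docs)
w, h = nmf(v, k=2)
for i, terms in enumerate(top_terms(h, vocab), start=1):
    print(f"Topic {i}: {', '.join(terms)}")
```

Each row of H weights the vocabulary for one topic, which is where lists of top unigrams and bigrams come from, and each row of W gives a document's loading on every topic. Because the random start can lead to different local solutions, the topic order and top terms may vary with the seed.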