Maximilian Wich, Tobias Eder, Hala Al Kuwatly, Georg Groh.
Abstract
Recently, numerous datasets have been produced as research activities in the field of automatic detection of abusive language or hate speech have increased. A problem with this diversity is that the datasets often differ in, among other things, context, platform, sampling process, collection strategy, and labeling schema. There have been surveys on these datasets, but they compare the datasets only superficially. Therefore, we developed a bias and comparison framework for abusive language datasets for their in-depth analysis and to provide a comparison of five English and six Arabic datasets. We make this framework available to researchers and data scientists who work with such datasets so that they are aware of the datasets' properties and can take them into account in their work.
Keywords: Abusive language detection; Arabic; Bias; English; Hate speech detection
Year: 2021 PMID: 34790954 PMCID: PMC8288848 DOI: 10.1007/s43681-021-00081-0
Source DB: PubMed Journal: AI Ethics ISSN: 2730-5953
Bias framework for abusive language datasets
| Perspective | Method | Problem |
|---|---|---|
| 1. Meta | (a) Class distribution and availability | Degradation |
| | (b) Time distribution | Temporal bias |
| | (c) Pareto analysis of authors | Author bias |
| 2. Semantic | (a) LSI-based intra-dataset class similarity | Similarity/dissimilarity of classes |
| | (b) Word-embedding-based intra- and inter-dataset class similarity | Similarity/dissimilarity of classes |
| | (c) Cross-dataset topic model | Topic bias |
| | (d) PMI-based word ranking per class | Topic bias |
| 3. Annotation | (a) Distribution of inter-rater reliability | Annotator bias |
| 4. Classification | (a) Cross-dataset performance | Generalizability |
| | (b) Explainable classification models | Generalizability |
Fig. 1 Overview of the framework’s methods and the required data
Selected abusive language datasets (class names in bold are the abusive categories)
| Lang. | Name | Source | Size | Labels |
|---|---|---|---|---|
| English | Waseem | Twitter | 16,907 | None, sexism, racism |
| English | Davidson | Twitter | 24,783 | Offensive, hate, neither |
| English | Founta | Twitter | 99,996 | Normal, abusive, hateful, spam |
| English | Zampieri | Twitter | 14,100 | Hierarchical labels: (1) not offensive, offensive; (2) if offensive: targeted insult, untargeted insult; (3) if targeted: individual target, group target, other |
| English | Vidgen | Twitter | 20,000 | Hostility, criticism, counter speech, discussion of East Asian prejudice, neutral |
| Arabic | Alsafari | Twitter | 5341 | 3-class: clean, offensive, hate; 6-class: clean, offensive, religious hate, gender hate, nationality hate, ethnicity hate |
| Arabic | Alshalan | Twitter | 8958 | Hate, non-hate |
| Arabic | Albadi | Twitter | 6136 | Hierarchical labels: (1) neutral, religious hate; (2) if religious hate: Muslims, Jews, Christians, Atheists, Sunnis, Shia, other |
| Arabic | Chowdhury | Twitter, Facebook, YouTube | 4000 | Hierarchical labels: (1) non-offensive, offensive; (2) if offensive: vulgar, hate, only offensive |
| Arabic | Mubarak | Twitter | 9996 | Hierarchical labels: (1) non-offensive, offensive; (2) if offensive: hate speech, not hate speech |
| Arabic | Mulki | Twitter | 5846 | Normal, abusive, hate |
Fig. 2 Class distribution and platform availability of English datasets (available means that the online resource, e.g. tweet, is still accessible)
Fig. 3 Temporal distribution of the tweets from English datasets with tweet IDs
Fig. 4 Pareto analysis showing how many tweets (incl. classes) were created by the top authors of each dataset
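The Pareto analysis of authors can be sketched as follows: rank authors by tweet count and compute the cumulative share of the corpus contributed by the top authors. This is a minimal illustration with toy data, not the paper's exact procedure; the function name and input format are our own.

```python
from collections import Counter

def pareto_author_shares(author_ids):
    """Rank authors by tweet count and return the cumulative share of
    tweets contributed by the top authors (the Pareto curve)."""
    counts = sorted(Counter(author_ids).values(), reverse=True)
    total = sum(counts)
    shares, running = [], 0
    for c in counts:
        running += c
        shares.append(running / total)
    return shares

# Toy example: one prolific author ("a") and three occasional ones.
shares = pareto_author_shares(["a", "a", "a", "a", "b", "c", "d"])
print(shares)
```

A steep curve (a few authors producing most tweets, as in the toy example, where the top author alone contributes 4/7 of the tweets) signals author bias in the sampling.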
Fig. 5 LSI-based similarity of classes within English datasets (the higher the score, the more similar the two classes are)
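The LSI-based class similarity can be approximated with a TF-IDF matrix reduced by truncated SVD (i.e. latent semantic indexing), with class centroids compared by cosine similarity. The record does not specify the paper's exact setup (dimensions, preprocessing), so this is a sketch under those assumptions:

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def lsi_class_similarity(class_docs, n_components=2):
    """TF-IDF the documents, reduce with truncated SVD (LSI), average
    per class, and compare class centroids by cosine similarity."""
    names = sorted(class_docs)
    docs = [d for c in names for d in class_docs[c]]
    labels = [c for c in names for _ in class_docs[c]]
    X = TruncatedSVD(n_components).fit_transform(TfidfVectorizer().fit_transform(docs))
    centroids = np.stack([X[[i for i, l in enumerate(labels) if l == c]].mean(axis=0)
                          for c in names])
    return names, cosine_similarity(centroids)

# Toy corpus with two classes of two documents each.
names, sim = lsi_class_similarity({
    "hate": ["islam bad", "islam prophet bad"],
    "none": ["nice weather today", "weather nice"],
})
print(names, sim.shape)
```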
Fig. 6 FASTTEXT sentence embedding vectors averaged for each class of English datasets and visualized with PCA (the closer the points, the more similar the classes)
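The embedding-based visualization boils down to: average word vectors per class, then project the class means to 2-D with PCA. The sketch below uses a generic `word_vec` lookup and random vectors in place of a real FASTTEXT model (assumed here, not bundled), and PCA via SVD:

```python
import numpy as np

def class_embeddings_2d(class_sentences, word_vec):
    """Average word vectors over all tokens of each class, then project
    the class means to 2-D with PCA (computed via SVD on the centered
    means). `word_vec` is any word -> vector lookup, e.g. a loaded
    FASTTEXT model."""
    names = sorted(class_sentences)
    means = np.stack([
        np.mean([word_vec(w) for s in class_sentences[c] for w in s], axis=0)
        for c in names
    ])
    centered = means - means.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return names, centered @ vt[:2].T

# Toy example with a random 50-dimensional "embedding" table.
rng = np.random.default_rng(0)
vocab = {w: rng.normal(size=50) for w in ["a", "b", "c", "d"]}
names, coords = class_embeddings_2d(
    {"hate": [["a", "b"]], "normal": [["c", "d"]]},
    lambda w: vocab[w],
)
print(coords.shape)  # (2, 2): one 2-D point per class
```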
Words with highest PMI for each class of the selected abusive English datasets
| Dataset - class | Words with highest PMI |
|---|---|
| Waseem - sexism | sexist, women, kat, girls, like, call, female, men, think, woman |
| Waseem - racism | islam, muslims, muslim, mohammed, religion, jews, prophet, isis, quran, like |
| Davidson - hate | bitch, faggot, like, ass, nigga, white, fuck, nigger, trash, fucking |
| Davidson - offensive | bitch, bitches, hoes, like, pussy, hoe, ass, got, fuck, get |
| Founta - abusive | fucking, fucked, like, ass, bitch, fuck, get, bad, shit, know |
| Founta - hateful | hate, niggas, fucking, nigga, like, people, idiot, get, amp, ass |
| Zampieri - OFF | liberals, like, control, gun, people, shit, antifa, get, conservatives, one |
| Vidgen - hostility | china, world, chinese, virus, people, ccp, us, wuhan, spread, rt |
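The PMI-based word ranking behind the table above scores a word by how much more frequent it is inside a class than in the corpus overall, PMI(w, c) = log(p(w | c) / p(w)). A minimal sketch (the `docs_by_class` token-list schema is our own, not the paper's):

```python
import math
from collections import Counter

def pmi_ranking(docs_by_class, target_class, top_k=10):
    """Rank words by PMI with a class label:
    PMI(w, c) = log( p(w | c) / p(w) ).
    docs_by_class maps class name -> list of token lists."""
    class_counts = Counter(w for doc in docs_by_class[target_class] for w in doc)
    all_counts = Counter(w for docs in docs_by_class.values()
                         for doc in docs for w in doc)
    n_class, n_all = sum(class_counts.values()), sum(all_counts.values())
    pmi = {w: math.log((c / n_class) / (all_counts[w] / n_all))
           for w, c in class_counts.items()}
    return [w for w, _ in sorted(pmi.items(), key=lambda kv: kv[1],
                                 reverse=True)[:top_k]]

# Toy corpus: words exclusive to the "hate" class get the highest PMI.
docs = {
    "hate":   [["islam", "like"], ["islam", "prophet"]],
    "normal": [["like", "weather"], ["weather", "nice"]],
}
print(pmi_ranking(docs, "hate", top_k=2))
```

Words shared across classes (like "like" above) score near zero, which is why generic words still surface in real rankings when they are simply frequent inside the class.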
Fig. 7 Topic model on the abusive classes of the English dataset selection
Fig. 8 Annotators’ inter-rater reliability scores and the overall inter-rater reliability score (black line) of the Vidgen dataset
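A simple proxy for per-annotator reliability is each annotator's raw agreement with the per-item majority label. The record does not state which coefficient the paper uses, so this is only an illustrative stand-in; the input schema is hypothetical:

```python
from collections import Counter

def per_annotator_agreement(annotations):
    """annotations: list of dicts, one per item, mapping annotator -> label.
    Returns each annotator's fraction of labels agreeing with the
    per-item majority vote."""
    hits, totals = Counter(), Counter()
    for item in annotations:
        majority = Counter(item.values()).most_common(1)[0][0]
        for annotator, label in item.items():
            totals[annotator] += 1
            hits[annotator] += (label == majority)
    return {a: hits[a] / totals[a] for a in totals}

# Toy example: ann3 disagrees with the majority on the first item.
scores = per_annotator_agreement([
    {"ann1": "hate", "ann2": "hate", "ann3": "none"},
    {"ann1": "none", "ann2": "none", "ann3": "none"},
])
print(scores)  # ann1 and ann2 score 1.0, ann3 scores 0.5
```

Annotators whose score sits far below the overall level (the black line in Fig. 8) are candidates for annotator bias.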
Fig. 9 Cross-dataset classification performance (macro F1 scores)
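The cross-dataset evaluation trains a classifier on one dataset and scores it on another; a large drop relative to in-dataset performance indicates poor generalizability. The record does not name the paper's classifiers, so a simple TF-IDF plus logistic regression model stands in here:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.pipeline import make_pipeline

def cross_dataset_f1(train_texts, train_labels, test_texts, test_labels):
    """Train on one dataset, report macro F1 on another."""
    model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    model.fit(train_texts, train_labels)
    return f1_score(test_labels, model.predict(test_texts), average="macro")

# Toy example: "train" and "test" stand in for two different datasets.
score = cross_dataset_f1(
    ["you are awful", "have a nice day", "awful awful person", "nice weather"],
    ["abusive", "normal", "abusive", "normal"],
    ["awful person", "nice day"],
    ["abusive", "normal"],
)
print(round(score, 2))
```

Macro averaging weights all classes equally, which matters for these datasets because the abusive classes are typically the minority.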
Fig. 10 SHAP explanations of an abusive tweet that is misclassified by two of the five English classification models
Fig. 11 Class distribution and platform availability of Arabic datasets (available means that the online resource, e.g. tweet, is still accessible)
Fig. 12 Temporal distribution of the tweets from Arabic datasets with tweet IDs
Fig. 13 Pareto analysis showing how many tweets (incl. classes) from Arabic datasets were created by the top authors of each dataset
Fig. 14 LSI-based similarity of classes within Arabic datasets (the higher the score, the more similar the two classes are)
Fig. 15 FASTTEXT sentence embedding vectors averaged for each class of Arabic datasets and visualized with PCA (the closer the points, the more similar the classes are)
Words with highest PMI for each class of the selected abusive Arabic datasets
Fig. 16 Topic model on tweets from abusive classes of Arabic datasets
Fig. 17 Cross-dataset classification performance (macro F1 scores) of Arabic datasets
Fig. 18 SHAP explanations of an abusive tweet that is misclassified by two of the six Arabic classification models