Emojis as anchors to detect Arabic offensive language and hate speech

Published online by Cambridge University Press:  10 August 2023

Hamdy Mubarak*
Affiliation:
Qatar Computing Research Institute, Hamad Bin Khalifa University, Doha, Qatar
Sabit Hassan
Affiliation:
School of Computing and Information, University of Pittsburgh, Pittsburgh, PA, USA
Shammur Absar Chowdhury
Affiliation:
Qatar Computing Research Institute, Hamad Bin Khalifa University, Doha, Qatar
*Corresponding author: Hamdy Mubarak; Email: hmubarak@hbku.edu.qa

Abstract

We introduce a generic, language-independent method for collecting a large percentage of offensive and hate tweets regardless of their topics or genres. The method harnesses the extralinguistic information embedded in emojis, using them as anchors to retrieve offensive tweets. We apply the method to Arabic tweets and compare the results with English tweets, analyzing key cultural differences. We observe consistent usage of these emojis to represent offensiveness across different timespans on Twitter. We manually annotate and publicly release the largest Arabic dataset for offensive, fine-grained hate speech, vulgar, and violent content. Furthermore, we benchmark the dataset for detecting offensiveness and hate speech using different transformer architectures and perform an in-depth linguistic analysis. To assess generalization capability, we evaluate our models on two external datasets: a Twitter dataset collected using a completely different method, and a multi-platform dataset containing comments from Twitter, YouTube, and Facebook. Competitive results on these datasets suggest that the data collected using our method capture universal characteristics of offensive language. Our findings also highlight the common words used in offensive communications, common targets of hate speech, and specific patterns in violence tweets, and they pinpoint common classification errors that can be attributed to limitations of NLP models. We observe that even state-of-the-art transformer models may fail to take into account culture, background, and context, or to understand nuances present in real-world data such as sarcasm.

Information

Type
Article
This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial licence (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original article is properly cited. The written permission of Cambridge University Press must be obtained prior to any commercial use.
Copyright
© The Author(s), 2023. Published by Cambridge University Press

Table 1. Available Arabic datasets along with the annotation labels, data source and collection method, and percentage of offensive content.

Figure 1. Emojis used in Wiegand and Ruppenhofer (2021).

Figure 2. Categories of common offensive Emojis.

Table 2. Statistics and examples from the annotated corpus (total of 12,698 tweets)

Figure 3. Categories of common offensive Emojis. Emojis in the same row are sorted based on percentage of offensive tweets (in descending order).

Figure 4. Percentage of offensive tweets for emojis and their corresponding words in samples from March 2021.

Figure 5. Wrong usage of middle finger emoji in Arabic tweets.

Figure 6. Emojis co-occur with dialectal and morphological variations.

Figure 7. Offensive words (including vulgar words).

Table 3. Common targets in hate speech tweets

Figure 8. Percentage of hate speech for different religious groups.

Table 4. Common patterns in violence tweets

Figure 9. Total percentage of offensive tweets in a sample of 50 tweets for each emoji in Bengali tweets.

Figure 10. Examples of (non-)offensive tweets in Bengali collected using emojis.

Figure 11. English examples of pig and shoe emojis (topics of common usages and offensive contexts are shown).

Table 5. Distribution of offensive and hate speech data

Table 6. Macro-averaged (P)recision, (R)ecall, and F1 for offensive language classification. Best results are highlighted in bold.

Table 7. Macro-averaged (P)recision, (R)ecall, and F1 for hate speech classification. Best results are highlighted in bold.

Table 8. Common error types in offense detection. Classes: FP (false-positives) and FN (false-negatives)

Figure 12. Examples of explainability errors. Green-colored words contribute to classification as offensive, while red-colored words contribute to classification as a non-offensive class. Rough translations are provided.

Table 9. Performance comparison of QARiB fine-tuned on our dataset (EMOJI-OFF), the SemEval dataset, and the combination of the two datasets.

Table 10. Performance on multi-platform data. MPOLD-TW/YT/FB refer to the Twitter, YouTube, and Facebook portions of the MPOLD dataset, respectively. The MPOLD-TW/YT/FB columns contain numbers reported in Chowdhury et al. (2020b). The EMOJI-OFF column represents QARiB fine-tuned on our data. All numbers are macro-averaged F1.