
The problem of varying annotations to identify abusive language in social media content

Published online by Cambridge University Press:  29 March 2023

Nina Seemann*, Yeong Su Lee, Julian Höllig and Michaela Geierhos
Affiliation: Research Institute CODE, University of the Bundeswehr Munich, Neubiberg, Germany
*Corresponding author. E-mail: nina.seemann@unibw.de

Abstract

With the increase of user-generated content on social media, the detection of abusive language has become crucial and is therefore reflected in several shared tasks performed in recent years. The development of automatic detection systems is desirable, and the classification of abusive social media content can be addressed with the help of machine learning. The basis for the successful development of machine learning models is the availability of consistently labeled training data. However, the diversity of terms and definitions of abusive language is a major barrier. In this work, we analyze a total of nine datasets (five English and four German) designed for detecting abusive online content. We provide a detailed description of each dataset: the task for which it was created, how the data were collected, and its annotation guidelines. Our analysis shows that there is no standard definition of abusive language, which often leads to inconsistent annotations. As a consequence, it is difficult to draw cross-domain conclusions, share datasets, or reuse models for other abusive language tasks on social media. Furthermore, our manual inspection of a random sample from each dataset revealed controversial examples. We highlight challenges in data annotation by discussing these examples and present common problems in the annotation process, such as contradictory annotations and missing context information. Finally, to complement our theoretical work, we conduct generalization experiments on three German datasets.

Information

Type
Article
Creative Commons
CC BY-NC-SA
This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike licence (http://creativecommons.org/licenses/by-nc-sa/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the same Creative Commons licence is used to distribute the re-used or adapted article and the original article is properly cited. The written permission of Cambridge University Press must be obtained prior to any commercial use.
Copyright
© The Author(s), 2023. Published by Cambridge University Press
Fig. 1. Relations and boundaries between hate speech and related concepts according to Poletto et al. (2021).

Table 1. Overview of the datasets used and their corresponding sizes. The percentage of abusive content is calculated by considering all types of abuse (e.g., hate, offense, aggression).

Table 2. Overview of the systems that achieved the best results on each dataset.

Table 3. Top five hashtags in HatEval.

Table 4. Fifteen most frequent terms (after preprocessing) in HateBaseTwitter.

Table 5. Classes of controversial annotations in each dataset.

Table 6. Results of the intra-dataset experiments that serve as our baseline (top) and the results of the generalization experiments (bottom). All values shown are macro-averaged F1 scores.
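To illustrate the generalization experiments referenced in the abstract and in Table 6, the sketch below shows how a cross-dataset evaluation of this kind might be set up and scored with macro-averaged F1: train a classifier on one abusive-language dataset and test it on another. The file names, column names, and the TF-IDF plus linear SVM pipeline are illustrative assumptions, not the authors' actual models or data layout.

```python
# Minimal sketch of a cross-dataset generalization experiment:
# train on one abusive-language dataset, evaluate on another,
# and report the macro-averaged F1 score (cf. Table 6).
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import f1_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical CSV files with "text" and "label" columns
# (e.g., OTHER vs. OFFENSE); stand-ins for two of the analyzed datasets.
train_df = pd.read_csv("dataset_a.csv")  # dataset used for training
test_df = pd.read_csv("dataset_b.csv")   # dataset from a different source

# Simple TF-IDF + linear SVM pipeline as an assumed baseline classifier.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
model.fit(train_df["text"], train_df["label"])

# An intra-dataset baseline would instead split dataset_a into train/test;
# here we measure how well the model transfers across datasets.
predictions = model.predict(test_df["text"])
print("Macro-F1:", f1_score(test_df["label"], predictions, average="macro"))
```

Macro-averaging computes the F1 score per class and takes their unweighted mean, so minority classes such as abusive content weigh as much as the majority class, which matters for the imbalanced datasets summarized in Table 1.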