
Hate speech on Twitter: Profiling users and interaction analysis in the Spanish language

Published online by Cambridge University Press:  30 April 2026

Irene Ramiro-López
Affiliation:
Department of Computer Science, Autonomous University of Madrid, Spain
Lara Quijano-Sánchez*
Affiliation:
Department of Computer Science, Autonomous University of Madrid, Spain; IBiDat – Big Data Institute, Carlos III University of Madrid, Spain
Federico Liberatore
Affiliation:
IBiDat – Big Data Institute, Carlos III University of Madrid, Spain; School of Computer Science and Informatics, Cardiff University, UK
Corresponding author: Lara Quijano-Sánchez; Email: laraquij@inst.uc3m.es

Abstract

Twitter is one of the most widely used social networks globally but is also notorious for harbouring significant levels of hate speech. This paradox has garnered the attention of companies and governments concerned about how digital hate can fragment communities and incite real-world violence. While extensive research has focused on detecting hate speech, there is a lack of comprehensive analysis of the actors involved, their characteristics, and their interactions within the online environment. Common assumptions categorise users into merely two groups—haters and non-haters—overlooking the existence of other groups that may more accurately represent the dynamics of hate dissemination. Additionally, existing user classification models often rely on large volumes of tweet data, a limitation given the restricted access to the Twitter API. Social networks are also frequently visualised using graphs where edges represent only superficial relationships. This study addresses these gaps through five research questions. We propose formal user clustering methods and develop a classifier that uses exclusively profile attributes—information more readily obtainable from the network. We also introduce a more nuanced definition of interaction for graph edges, based on ideological support or opposition between users. To conduct our analysis, we have extracted two complementary datasets: (i) a keyword-based corpus of 3.3M Spanish tweets containing hate-associated terms, of which 1.6M unique tweets were retained after filtering and (ii) a user-based corpus comprising timelines of $\approx$3,000 users linked to hate speech, totalling over 3M tweets. Our results reveal the existence of three primary user classes—haters, upstanders, and neutrals—in contrast to the conventional binary classification. We demonstrate that profile attributes are reliable indicators for automatically classifying users and find significant statistical differences between these classes.
Finally, we develop a graph visualisation tool to assist authorities in analysing interactions among different user types, providing a useful exploratory tool to support the analysis of online hate and inform potential mitigation strategies.
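The three-class user grouping summarised in the abstract can be illustrated with a minimal sketch. This is an assumption-laden toy example, not the authors' pipeline: the two features (per-user shares of hate and counter-speech tweets) and all distribution parameters are invented for illustration, and k-means is used only because the paper's Figure 9 reports clusters identified by k-means with $k=3$.

```python
import numpy as np
from sklearn.cluster import KMeans

# Illustrative only: synthetic per-user tweet-label proportions.
# Column 0: share of a user's tweets labelled as hate speech.
# Column 1: share labelled as counter-speech (defending the targets).
rng = np.random.default_rng(42)
haters = np.column_stack([rng.beta(5, 2, 60), rng.beta(1, 20, 60)])
upstanders = np.column_stack([rng.beta(1, 20, 60), rng.beta(5, 2, 60)])
neutrals = np.column_stack([rng.beta(1, 20, 60), rng.beta(1, 20, 60)])
X = np.vstack([haters, upstanders, neutrals])

# Three clusters, mirroring the haters/upstanders/neutrals split.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(np.bincount(km.labels_))
```

On well-separated synthetic groups like these, the three recovered clusters align closely with the generating classes; on real data the boundaries are far noisier, which is why the paper supplements clustering with density analysis and a profile-based classifier.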

Information

Type
Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial licence (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original article is properly cited. The written permission of Cambridge University Press or the rights holder(s) must be obtained prior to any commercial use.
Copyright
© The Author(s), 2026. Published by Cambridge University Press

Table 1. Summary of hate speech classification studies


Table 2. Overview of real-time monitoring studies


Table 3. Summary of works focused on user profiling based on different attributes


Table 4. Overview of studies focused on user profiling in the context of hate speech


Table 5. Overview of user graph studies


Figure 1. User profiling methodology.


Table 6. Categories designed for tweet classification


Figure 2. Simplified process for implicit feature extraction.


Figure 3. Insight extraction for the specific domain of hate speech.


Figure 4. Example of the process for obtaining the argumentative relation between two replies and the original message they respond to. All the texts are fictitious.


Figure 5. Algorithm for generating each graph.


Table 7. Number of tweets extracted by category and how many of these have received responses


Table 8. Number of tweets extracted, grouped by keyword set, and classified by hate type


Table 9. Overview of the two datasets used in this study


Table 10. Comparison of Spanish hate speech classifiers on the validation subset. Notes: Values are averaged over the manually validated subset of tweets. Both models show very similar classification performance and nearly identical distributions of predicted labels, with no systematic skew across protected attributes


Table 11. Fairness check of hate speech classifiers across protected attributes. Notes: The $\Delta$ column shows the absolute difference in F1-scores between models. Differences remain below 0.02, suggesting consistent behaviour across groups and no systematic bias propagation. This table demonstrates explicitly that both models behave stably and equitably across gender, ideology, and age


Figure 6. Density analysis of $X_{\text{upstander}}$.


Figure 7. Histogram of the proportions of hater tweets accompanied by the density function $0.55*\text{Beta}(\alpha =2.7,\beta =5)+0.45*\text{Beta}(\alpha =1.05,\beta =21)$.
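The mixture density quoted in this caption can be checked numerically. The sketch below assumes SciPy's standard Beta($a$, $b$) parameterisation on $[0, 1]$ (an assumption about notation, not stated in the caption) and verifies that the weights 0.55 and 0.45 yield a proper probability density:

```python
from scipy.integrate import quad
from scipy.stats import beta

# Mixture density from the caption:
# 0.55 * Beta(a=2.7, b=5) + 0.45 * Beta(a=1.05, b=21)
def mixture_pdf(x):
    return 0.55 * beta.pdf(x, 2.7, 5) + 0.45 * beta.pdf(x, 1.05, 21)

# A valid density must integrate to 1 over the unit interval.
area, _ = quad(mixture_pdf, 0, 1)
print(round(area, 6))
```

The second component, Beta(1.05, 21), concentrates almost all of its mass near zero, which is consistent with the caption's histogram of hater-tweet proportions: most users post few or no hate tweets, with a heavier-tailed first component covering the more active haters.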


Figure 8. Density estimation level curves for different distributions.


Figure 9. Clusters identified by k-means on the transformed data, with $k=3$.


Figure 10. Confusion matrix of the profile-based classifier using 10-fold cross-validation. User labels correspond to the cluster-derived classes (Haters, Upstanders, and Neutrals). Each cell shows both the number of users and the percentage within the true class row.


Figure 11. Feature importance of profile attributes identified by the model (ordered by the normalised F-score).


Figure 12. Box plot of the five most important characteristics of the model by author type. Outliers are not graphed for clarity.


Figure 13. Pie charts of topics’ description by user type.


Figure 14. Pie charts of sentiments’ description by user type.


Figure 15. Proportion of the average number of responses received by each author based on tweet type.


Figure 16. Graph tool colour coding.


Table 12. Percentage of node and edge types obtained by hate category


Figure 17. Examples of relationships between hater and upstander users, along with the conversations from which these graphs were obtained. The names are anonymised for privacy.


Figure 18. Graph representing a conversation stemming from a hate tweet targeting the Roma community.


Table A1. Features extracted from general profile data. Those marked with an asterisk are implicit


Table A2. Features extracted from users’ profile descriptions. Those marked with an asterisk are implicit


Figure B1. Scatter plot of proportions grouped into 3 clusters.


Table C1. Comparison of classification models. The best results for each metric are highlighted in bold