Ideological Scaling of Social Media Users: A Dynamic Lexicon Approach

Published online by Cambridge University Press: 28 August 2018

Mickael Temporão*
Affiliation:
Département de science politique, Université Laval, Québec, QC G1V 0A6, Canada. Email: mickael.temporao.1@ulaval.ca
Corentin Vande Kerckhove
Affiliation:
Large Graphs and Networks Group, Université catholique de Louvain, Louvain-la-Neuve, B-1348, Belgium
Clifton van der Linden
Affiliation:
Department of Political Science, University of Toronto, Toronto, ON M5S 3G3, Canada
Yannick Dufresne
Affiliation:
Département de science politique, Université Laval, Québec, QC G1V 0A6, Canada
Julien M. Hendrickx
Affiliation:
Large Graphs and Networks Group, Université catholique de Louvain, Louvain-la-Neuve, B-1348, Belgium

Abstract

Words matter in politics. The rhetoric that political elites employ structures civic discourse. The emergence of social media platforms as a medium of politics has enabled ordinary citizens to express their ideological inclinations by adopting the lexicon of political elites. This avails to researchers a rich new source of data in the study of political ideology. However, existing ideological text-scaling methods fail to produce meaningful inferences when applied to the short, informal style of textual content that is characteristic of social media platforms such as Twitter. This paper introduces the first viable approach to the estimation of individual-level ideological positions derived from social media content. This method allows us to position social media users—be they political elites, parties, or citizens—along a shared ideological dimension. We validate the proposed method by demonstrating correlation with existing measures of ideology across various political contexts and multiple languages. We further demonstrate the ability of ideological estimates to capture derivative signal by predicting out-of-sample, individual-level voting intentions. We posit that social media data can, when properly modeled, better capture derivative signal than discrete scales used in more traditional survey instruments.

Information

Type: Articles
Copyright © The Author(s) 2018. Published by Cambridge University Press on behalf of the Society for Political Methodology.

Figure 1. Political candidates: comparison of estimated positions for the reference method (network scaling approach) and the dynamic lexicon approach. The x-axis shows the standardized position of political candidates on the unidimensional latent space derived from network data (Barberá 2015). The y-axis shows the standardized position of political candidates on the unidimensional latent space derived from the dynamic lexicon approach. Pearson correlations ($\rho$) are all statistically significant ($p < 0.05$).

Figure 2. Citizens: assessing linear combinations of textual and network ideologies. The x-axis displays the parameter $\lambda$ (see Equation 3). The textual ideology corresponds to $\lambda = 0$ and the network ideology to $\lambda = 1$. The y-axis displays the Pearson correlation ($p < 0.05$) between the linear combination and the reference ideologies. The dashed line marks the correlation for $\lambda = 1$.
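
The sweep described in this caption can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: Equation 3 is not reproduced on this page, so the convex form `lam * network + (1 - lam) * text` is assumed (it matches the two endpoints given in the caption), both inputs are assumed to be standardized before mixing, and the function names and the five toy values are hypothetical.

```python
import math


def standardize(xs):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = sum(xs) / len(xs)
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))
    return [(x - mean) / sd for x in xs]


def pearson(xs, ys):
    """Pearson correlation as the mean product of standardized values."""
    zx, zy = standardize(xs), standardize(ys)
    return sum(a * b for a, b in zip(zx, zy)) / len(xs)


def combine(text, network, lam):
    """Assumed convex mix: lam = 0 gives text only, lam = 1 gives network only."""
    return [lam * n + (1 - lam) * t for t, n in zip(text, network)]


def sweep(text, network, reference, steps=10):
    """Return (lam, correlation with the reference) for lam = 0, 1/steps, ..., 1."""
    text_z, network_z = standardize(text), standardize(network)
    return [
        (k / steps, pearson(combine(text_z, network_z, k / steps), reference))
        for k in range(steps + 1)
    ]


# Hypothetical toy positions for five users (made up, not data from the paper).
text = [-1.2, -0.4, 0.1, 0.9, 1.5]
network = [-0.8, -1.0, 0.3, 0.5, 1.4]
reference = [-1.0, -0.7, 0.0, 0.8, 1.3]

curve = sweep(text, network, reference)
for lam, rho in curve:
    print(f"lambda={lam:.1f}  rho={rho:.3f}")
```

Because the Pearson correlation is invariant to rescaling, the first and last points of `curve` equal the correlations of the text-only and network-only estimates with the reference; the last point is the level that the dashed line marks in the figure.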

Table 1. Citizens: assessment of the dynamic lexicon approach. Results are reported for a bigram dictionary. The values are the Pearson correlations ($p < 0.05$) between the ideologies extracted from Twitter data (text and network) and the reference ideologies based on policy positions. The values in parentheses are the Pearson correlations for the subset of politically interested citizens.

Figure 3. Venn diagram illustrating the complementarity of the ideology estimates in predicting citizens’ voting intentions. The value inside each set is the prediction efficiency, measured by the area under the curve (AUC).

Figure 4. Party-level comparison of the prediction efficiencies of citizens’ Twitter and survey data. The values are the areas under the precision-recall curve for each party.

Supplementary material: File

Temporão et al. supplementary material
