Constructing Vec-tionaries to Extract Message Features from Texts: A Case Study of Moral Content

Published online by Cambridge University Press:  11 April 2025

Zening Duan
Affiliation:
School of Journalism and Mass Communication, University of Wisconsin-Madison, WI, USA
Anqi Shao
Affiliation:
Department of Life Sciences Communication, University of Wisconsin-Madison, WI, USA
Yicheng Hu
Affiliation:
Department of Chemical and Biological Engineering, University of Wisconsin-Madison, WI, USA
Heysung Lee
Affiliation:
School of Journalism and Mass Communication, University of Wisconsin-Madison, WI, USA
Xining Liao
Affiliation:
School of Journalism and Mass Communication, University of Wisconsin-Madison, WI, USA
Yoo Ji Suh
Affiliation:
School of Journalism and Mass Communication, University of Wisconsin-Madison, WI, USA
Jisoo Kim
Affiliation:
School of Journalism and Mass Communication, University of Wisconsin-Madison, WI, USA
Kai-Cheng Yang
Affiliation:
Network Science Institute, Northeastern University, Boston, MA, USA
Kaiping Chen
Affiliation:
Department of Life Sciences Communication, University of Wisconsin-Madison, WI, USA
Sijia Yang*
Affiliation:
School of Journalism and Mass Communication, University of Wisconsin-Madison, WI, USA
* Corresponding author: Sijia Yang; Email: sijia.yang@alumni.upenn.edu

Abstract

Researchers often study message features such as moral content in texts like party manifestos and social media posts, yet quantifying these features remains a challenge. Conventional human coding struggles with scalability and intercoder reliability. Dictionary-based methods are cost-effective and computationally efficient, but they often lack contextual sensitivity and are limited by the vocabularies developed for their original applications. In this paper, we present an approach to construct “vec-tionaries” that boost validated dictionaries with word embeddings through nonlinear optimization. By harnessing the semantic relationships encoded by embeddings, vec-tionaries improve the measurement of message features from text, especially short texts, by extending the applicability of the original vocabularies to other contexts. Importantly, a vec-tionary can produce additional metrics that capture the valence and ambivalence of a message feature, beyond its strength in texts. Using moral content in tweets as a case study, we illustrate the steps to construct the moral foundations vec-tionary, showcasing its ability to process texts missed by conventional dictionaries and to produce measurements better aligned with crowdsourced human assessments. Furthermore, the additional metrics from the vec-tionary revealed unique insights that helped predict downstream outcomes such as message retransmission.
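The general idea in the abstract can be sketched as follows. This is a minimal illustration only, not the authors' implementation. The toy two-dimensional embeddings, the `tanh` objective, the helper names `fit_axis` and `score_text`, and the metric definitions (strength as mean absolute projection, valence as mean signed projection, ambivalence as variance of projections) are all assumptions made for this example. The abstract states only that validated dictionaries are combined with word embeddings through nonlinear optimization and that strength, valence, and ambivalence metrics are produced.

```python
import numpy as np

# Hypothetical toy embeddings; a real application would load pretrained vectors.
EMBEDDINGS = {
    "care": np.array([1.0, 0.2]), "protect": np.array([0.9, 0.1]),
    "harm": np.array([-1.0, 0.1]), "hurt": np.array([-0.9, 0.3]),
    "nurse": np.array([0.8, 0.4]), "table": np.array([0.0, 1.0]),
}
# Hypothetical seed dictionary: +1 for virtue words, -1 for vice words.
SEEDS = {"care": 1.0, "protect": 1.0, "harm": -1.0, "hurt": -1.0}


def fit_axis(embeddings, seeds, steps=200, lr=0.1):
    """Find a unit axis by projected gradient ascent on an assumed objective."""
    words = list(seeds)
    labels = np.array([seeds[w] for w in words])
    vecs = np.array([embeddings[w] for w in words])
    unit = vecs / np.linalg.norm(vecs, axis=1, keepdims=True)
    # Start from the label-weighted mean direction of the seed words.
    axis = (labels[:, None] * unit).sum(axis=0)
    axis /= np.linalg.norm(axis)
    for _ in range(steps):
        # Assumed objective: sum_i tanh(label_i * cos(word_i, axis)).
        margins = labels * (unit @ axis)
        grad = ((1.0 - np.tanh(margins) ** 2) * labels) @ unit
        axis = axis + lr * grad
        axis /= np.linalg.norm(axis)  # project back onto the unit sphere
    return axis


def score_text(tokens, embeddings, axis):
    """Score a tokenized text; returns None if no token has an embedding."""
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    if not vecs:
        return None
    vecs = np.array(vecs)
    proj = (vecs @ axis) / np.linalg.norm(vecs, axis=1)  # cosine with the axis
    return {
        "strength": float(np.mean(np.abs(proj))),  # assumed definition
        "valence": float(np.mean(proj)),           # assumed definition
        "ambivalence": float(np.var(proj)),        # assumed definition
    }


care_axis = fit_axis(EMBEDDINGS, SEEDS)
nurse_score = score_text(["nurse"], EMBEDDINGS, care_axis)
mixed_score = score_text(["care", "harm"], EMBEDDINGS, care_axis)
```

In this sketch, "nurse" is not in the seed dictionary but still receives a score because its embedding lies near the seed words. This is the kind of vocabulary extension the abstract describes. A text mixing "care" and "harm" has a valence near zero but a high ambivalence under the assumed definitions.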

Information

Type
Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial licence (https://creativecommons.org/licenses/by-nc/4.0), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original article is properly cited. The written permission of Cambridge University Press must be obtained prior to any commercial use.
Copyright
© The Author(s), 2025. Published by Cambridge University Press on behalf of The Society for Political Methodology
Figure 1 The model pipeline of moral foundations vec-tionary.

Figure 2 Difference in RBO similarity scores by moral foundation.

Table 1 Performance comparison for the Care/Harm moral foundation while varying weight and depth values.

Figure 3 Performance gain of the moral foundations vec-tionary: Care/Harm.

Figure 4 Performance gain of the moral foundations vec-tionary: Loyalty/Betrayal.

Figure 5 Performance gain of the moral foundations vec-tionary: Authority/Subversion.

Figure 6 Model specification and comparison.

Table 2 Model outputs (only showing Poisson regression) in evaluating vec-tionary’s predictive capabilities.

Supplementary material: File

Duan et al. supplementary material

File 6.5 MB