
Labeling social media posts: does showing coders multimodal content produce better human annotation, and a better machine classifier?

Published online by Cambridge University Press: 02 July 2025

Haohan Chen*
Affiliations: Center for Social Media and Politics, New York University, New York, NY, USA; Department of Politics and Public Administration, The University of Hong Kong, Hong Kong, Hong Kong
James Bisbee
Affiliations: Center for Social Media and Politics, New York University, New York, NY, USA; Department of Political Science, Vanderbilt University, Nashville, TN, USA
Joshua A. Tucker
Affiliations: Center for Social Media and Politics, New York University, New York, NY, USA; Wilf Family Department of Politics, New York University, New York, NY, USA
Jonathan Nagler
Affiliations: Center for Social Media and Politics, New York University, New York, NY, USA; Wilf Family Department of Politics, New York University, New York, NY, USA
* Corresponding author: Haohan Chen; Email: haohan@hku.hk

Abstract

The increasing multimodality (e.g., images, videos, links) of social media data presents both opportunities and challenges, yet text-as-data methods continue to dominate as modes of classification because multimodal social media data are costly to collect and label. Researchers facing a budget constraint must therefore decide whether to collect and label only the textual content of social media posts or their full multimodal content. In this article, we develop five measures and an experimental framework to assist with these decisions. The five performance metrics capture the costs and benefits of multimodal labeling: average time per post, average time per valid response, valid response rate, intercoder agreement, and the classifier's predictive power. To estimate these measures, we introduce an experimental framework that evaluates coders' performance under text-only and multimodal labeling conditions. We illustrate the method with a tweet labeling experiment.
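The first four measures can be computed directly from raw coding logs. Below is a minimal sketch (not the authors' code) assuming a pandas DataFrame with hypothetical columns — condition, seconds, valid, tweet_id, label — recording each coder's response; none of these column names come from the paper, and the fifth measure (the classifier's predictive power) would be estimated separately.

import pandas as pd

def labeling_measures(df: pd.DataFrame) -> dict:
    """Per-arm costs and benefits: time per post, time per valid response,
    valid response rate, and intercoder agreement."""
    out = {}
    for arm, g in df.groupby("condition"):  # "text" vs. "multimodal"
        valid = g[g["valid"]]  # substantive (non-"unsure") responses only
        out[arm] = {
            # Average time a coder spends on a post, valid or not
            "avg_time_per_post": g["seconds"].mean(),
            # Total coding time divided by the number of valid responses
            "avg_time_per_valid": g["seconds"].sum() / max(len(valid), 1),
            # Share of responses that are valid
            "valid_response_rate": g["valid"].mean(),
            # Share of posts whose valid coders all assigned the same label
            "intercoder_agreement": (
                valid.groupby("tweet_id")["label"].nunique() == 1
            ).mean(),
        }
    return out

Comparing the two dictionaries this returns (multimodal minus text-only) yields the cost-benefit deltas the framework is built around; the fifth delta would come from, e.g., cross-validated accuracy of a classifier trained on each arm's labels.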

Information

Type
Research Note
Creative Commons
Creative Commons License: CC BY-NC-ND
This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives licence (http://creativecommons.org/licenses/by-nc-nd/4.0), which permits non-commercial re-use, distribution, and reproduction in any medium, provided that no alterations are made and the original article is properly cited. The written permission of Cambridge University Press must be obtained prior to any commercial use and/or adaptation of the article.
Copyright
© The Author(s), 2025. Published by Cambridge University Press on behalf of EPS Academic Ltd.

Table 1. Measures for the cost and benefit of multimodal labeling compared to text-only labeling


Figure 1. Interface of our tweet labeling infrastructure.


Figure 2. Multimodal labeling cost 19% more time per Twitter post and 14% more time per valid response, increased the valid response rate by 4%, and increased intercoder agreement by 9%. It decreased the classifier's predictive power by 3%. Horizontal bars indicate 95% confidence intervals based on cluster-robust bootstrapped standard errors for $\Delta T$, $\Delta T_v$, and $\Delta R$. Inference for $\Delta I$ is based on bootstrapping two coders per tweet ID, while inference for $\Delta P$ is based on 100 cross-validated calculations.
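For readers who want to apply this style of inference to their own coding data, the following is a hedged sketch of a cluster bootstrap for the time-per-post gap $\Delta T$ — a percentile variant, not necessarily the authors' exact procedure. Coders are treated as clusters and resampled with replacement within each arm; the column names coder_id, condition, and seconds are illustrative assumptions.

import numpy as np
import pandas as pd

def cluster_bootstrap_diff(df: pd.DataFrame, n_boot: int = 1000, seed: int = 0):
    """95% percentile interval for the multimodal-vs-text gap in mean
    seconds per post, resampling coders (the clusters) with replacement."""
    rng = np.random.default_rng(seed)
    diffs = np.empty(n_boot)
    for b in range(n_boot):
        arm_means = {}
        for arm in ("text", "multimodal"):
            coders = df.loc[df["condition"] == arm, "coder_id"].unique()
            # Resample whole coders so within-coder correlation is preserved
            sampled = rng.choice(coders, size=len(coders), replace=True)
            boot = pd.concat(
                [df[(df["coder_id"] == c) & (df["condition"] == arm)]
                 for c in sampled]
            )
            arm_means[arm] = boot["seconds"].mean()
        diffs[b] = arm_means["multimodal"] - arm_means["text"]
    return np.percentile(diffs, [2.5, 97.5])

The same resampling loop generalizes to $\Delta T_v$ and $\Delta R$ by swapping in the corresponding per-arm statistic.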

Supplementary material
Chen et al. supplementary material (File, 3.6 MB)