
Improving Content Analysis: Tools for Working with Undergraduate Research Assistants

Published online by Cambridge University Press:  02 October 2023

Benjamin Goehring*
Affiliation:
University of Michigan, USA

Abstract

Undergraduate research assistants (URAs) perform important roles in many political scientists’ research projects. They serve as coauthors, survey respondents, and data collectors. Despite these roles, there is relatively little discussion about how best to train and manage URAs who are working on a common task: content coding. Drawing on insights from psychology, text analysis, and business management, as well as my own experience in managing a team of nine URAs, this article argues that supervisors should train URAs by pushing them to engage with their own mistakes. Via a series of simulation exercises, I also argue that supervisors—especially supervisors of small teams—should be concerned about the effects of errant post-training coding on data quality. Therefore, I contend that supervisors should utilize computational tools to monitor URA reliability in real time. I provide researchers with a new R package, ura, and a web-based application to implement these suggestions.

Information

Type: Comment and Controversy
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2023. Published by Cambridge University Press on behalf of American Political Science Association

Figure 1 Training Task Diagram

This figure displays the training steps that Lowande and I asked our URAs to complete. For every unilateral action, the URA first searched the ProQuest database for articles matching criteria we defined (shown in blue). The URAs then reviewed each possible match returned by the search to determine whether it met our definition of "relevant coverage" (shown in yellow). After working through all of the unilateral actions in the practice set, the URAs checked their results against an answer key (shown in red). If the URAs discovered that they had erred, they returned to the article and codebook either to describe why they erred or to argue why their original coding decision was correct (shown in green).


Table 1 URA Justification Examples


Figure 2 The Effects of One Poorly Performing URA on IRR

This figure shows how one poorly performing URA affects the inter-rater reliability (IRR) of a dataset. For a task conducted by a given number of URAs, each facet shows how the (im)precision of one URA affects the Krippendorff's alpha of the final dataset. The blue dashed line at 0.8 represents the conventional reliability threshold for Krippendorff's alpha. The data for each facet were generated by randomly assigning each URA 100 actions from a set of 200 actions (with replacement). Therefore, some but not all of the actions were coded by more than one URA and were suitable for IRR testing. The URAs always agreed on actions coded by more than one URA, except for URA i, who agreed with the others with some probability (shown on the horizontal axis). If the probability of URA i agreeing with the other URAs was 1, then all of the URAs assigned the same coding to each action. If the probability was 0, then URA i assigned the opposite coding of the other URAs. The online appendix includes the simulation code and additional plots that vary the number of actions assigned to each URA. As those plots show, varying the number of actions assigned to each URA and the number of actions sampled does not affect the findings.
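The simulation described in this caption can be sketched as follows. This is a minimal reconstruction from the caption alone, not the author's appendix code: the function names are my own, codings are assumed binary, and each URA's assignment is drawn per URA with `random.sample` (so the same action can land with multiple URAs, which is how I read "with replacement" here). The Krippendorff's alpha computation uses the standard coincidence-matrix formulation for nominal data.

```python
import random
from collections import defaultdict
from itertools import permutations

def krippendorff_alpha_nominal(codings):
    """Krippendorff's alpha for nominal data.

    `codings` maps each unit (action) to the list of values assigned by
    the coders who saw it; units coded by fewer than two coders are ignored.
    """
    o = defaultdict(float)              # coincidence matrix o[(c, k)]
    for values in codings.values():
        m = len(values)
        if m < 2:
            continue
        for c, k in permutations(values, 2):   # ordered pairs within the unit
            o[(c, k)] += 1 / (m - 1)
    n_c = defaultdict(float)            # marginal totals per value
    for (c, _k), w in o.items():
        n_c[c] += w
    n = sum(n_c.values())               # number of pairable values
    d_o = sum(w for (c, k), w in o.items() if c != k) / n
    d_e = sum(n_c[c] * n_c[k] for c in n_c for k in n_c if c != k) / (n * (n - 1))
    return 1 - d_o / d_e

def simulate_alpha(n_uras, p_agree, n_actions=200, per_ura=100, seed=0):
    """One draw of the Figure 2-style simulation: every URA codes the
    'true' value, except URA i (index 0), who agrees with probability
    p_agree and otherwise flips the coding."""
    rng = random.Random(seed)
    truth = [rng.randint(0, 1) for _ in range(n_actions)]
    codings = defaultdict(list)
    for ura in range(n_uras):
        for action in rng.sample(range(n_actions), per_ura):
            value = truth[action]
            if ura == 0 and rng.random() > p_agree:   # URA i errs
                value = 1 - value
            codings[action].append(value)
    return krippendorff_alpha_nominal(codings)
```

When `p_agree` is 1, every multiply-coded action is coded identically, so observed disagreement is zero and alpha is exactly 1; as `p_agree` falls, alpha drops below the 0.8 convention, and it drops faster the smaller the team, since URA i's pairs make up a larger share of all comparisons.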


Table 2 Percent Agreement, by Coder
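A per-coder percent-agreement statistic like the one a table such as Table 2 reports can be computed from long-format coding data. The sketch below is my own illustration, not the `ura` package's implementation: for each coder, it takes the share of that coder's pairwise comparisons (on units coded by at least two coders) in which the other coder assigned the same value.

```python
from collections import defaultdict

def percent_agreement_by_coder(rows):
    """rows: iterable of (coder, unit, value) triples.
    Returns {coder: share of pairwise comparisons that agree},
    counting only units coded by two or more coders."""
    by_unit = defaultdict(list)
    for coder, unit, value in rows:
        by_unit[unit].append((coder, value))
    agree, total = defaultdict(int), defaultdict(int)
    for entries in by_unit.values():
        for i, (c1, v1) in enumerate(entries):
            for c2, v2 in entries[i + 1:]:     # each unordered coder pair once
                total[c1] += 1
                total[c2] += 1
                if v1 == v2:
                    agree[c1] += 1
                    agree[c2] += 1
    return {c: agree[c] / total[c] for c in total}
```

For example, with coders A and B agreeing on unit 1, and A disagreeing with both B and C on unit 2 while B and C agree, A's percent agreement is 1/3, B's is 2/3, and C's is 1/2. A coder whose agreement rate sits well below the others' flags exactly the kind of errant post-training coding the article argues supervisors should monitor in real time.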

Supplementary material

Goehring Dataset (link)
Goehring supplementary material (PDF, 405.3 KB)