
How to train your stochastic parrot: large language models for political texts

Published online by Cambridge University Press: 14 January 2025

Joseph T. Ornstein*
Affiliation:
Department of Political Science, University of Georgia, Athens, GA, USA
Elise N. Blasingame
Affiliation:
Department of Political Science, University of Georgia, Athens, GA, USA
Jake S. Truscott
Affiliation:
Department of Political Science, University of Florida, Gainesville, FL, USA
*
Corresponding author: Joseph T. Ornstein; Email: jornstein@uga.edu

Abstract

We demonstrate how few-shot prompts to large language models (LLMs) can be effectively applied to a wide range of text-as-data tasks in political science—including sentiment analysis, document scaling, and topic modeling. In a series of pre-registered analyses, this approach outperforms conventional supervised learning methods without the need for extensive data pre-processing or large sets of labeled training data. Performance is comparable to expert and crowd-coding methods at a fraction of the cost. We propose a set of best practices for adapting these models to social science measurement tasks, and develop an open-source software package for researchers.
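The few-shot approach described in the abstract can be illustrated with a minimal sketch: labeled examples are prepended to the text to be classified, and the model's continuation is read as the label. The label set, example tweets, and helper function below are hypothetical illustrations, not the paper's actual prompts (those appear in Figure 1 and the replication materials).

```python
def build_fewshot_prompt(examples, text):
    """Assemble a few-shot classification prompt: an instruction,
    labeled example tweets, then the unlabeled tweet to classify."""
    lines = [
        "Classify the sentiment of each tweet as Positive, Negative, or Neutral.",
        "",
    ]
    for tweet, label in examples:
        lines.append(f"Tweet: {tweet}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    # The prompt ends mid-pattern so the model completes the final label.
    lines.append(f"Tweet: {text}")
    lines.append("Sentiment:")
    return "\n".join(lines)

# Hypothetical labeled examples for the few-shot context.
examples = [
    ("What a fantastic debate performance!", "Positive"),
    ("Another broken campaign promise.", "Negative"),
]
prompt = build_fewshot_prompt(examples, "The committee hearing starts at 10am.")
# The assembled string would then be sent to an LLM completion endpoint,
# and the model's next-token continuation taken as the predicted label.
print(prompt)
```

Because the prompt supplies only a handful of labeled examples, this sidesteps the large labeled training sets and pre-processing pipelines that conventional supervised classifiers require.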

Information

Type
Original Article
Creative Commons
Creative Commons License - CC BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press on behalf of EPS Academic Ltd.
Figure 1. LLM prompt for sentiment classification task.

Figure 2. Classification performance on Twitter sentiment task, comparing the few-shot LLM approach (GPT-3 and GPT-4), RoBERTa fine-tuned for Twitter sentiment classification (TweetNLP), dictionary-based sentiment analysis, and a supervised learning method (Naive Bayes).

Table 1. Sample of tweets where sentiment is ambiguous absent political context

Figure 3. LLM prompt for political ad tone task.

Figure 4. Comparing crowd-coded, GPT-3, and GPT-4 estimates to expert-coded political ad tone (Carlson and Montgomery, 2017).

Figure 5. LLM prompt for ideology scaling task.

Figure 6. Performance of crowd-coded (top panel) and GPT-3 (bottom panel) ideology estimates, compared to expert scores.

Figure 7. LLM prompt for topic modeling application.

Figure 8. Share of speeches mentioning a virtue (and its synonyms) by political party. Note: We include the following synonyms in each category: bravery (brave, fearless, heroic, gallant, valiant, courage, valor); loyalty (loyal, dutiful, duty, steadfast, devoted, allegiant); patriotism (patriot); hard work (hard work, industrious, assiduous, diligent); fairness (equitable, equity, egalitarian, equal, impartiality); compassion (kindness, empathy, humanity, caring); charity (philanthropic, benevolent, beneficence), success (achievement, merit), education (mentorship, knowledge, intelligence), advocacy (activism), sacrifice (selflessness).

Supplementary material

Ornstein et al. supplementary material (File, 5.2 MB)