Political DEBATE: Efficient Zero-Shot and Few-Shot Classifiers for Political Text

Published online by Cambridge University Press:  15 December 2025

Michael Burnham*
Affiliation:
Center for the Study of Democratic Politics, Princeton University, USA
Kayla Kahn
Affiliation:
Department of Political Science, The Pennsylvania State University, USA
Ryan Yang Wang
Affiliation:
Manship School of Mass Communication, Louisiana State University, USA
Rachel X. Peng
Affiliation:
Manship School of Mass Communication, Louisiana State University, USA
*Corresponding author: Michael Burnham; Email: mlb6496@tamu.edu

Abstract

Social scientists have quickly adopted large language models (LLMs) for their ability to annotate documents without supervised training, an ability known as zero-shot classification. However, due to their computational demands, cost, and often proprietary nature, these models are frequently at odds with open science standards. This article introduces the Political Domain Enhanced BERT-based Algorithm for Textual Entailment (DEBATE) language models: foundation models for zero-shot, few-shot, and supervised classification of political documents. As zero-shot classifiers, the models are designed for common, well-defined tasks, such as topic and opinion classification. In this context, the DEBATE models are not only as good as state-of-the-art LLMs at zero-shot classification but also orders of magnitude more efficient and completely open source. We further demonstrate that the models are effective few-shot learners: with a simple random sample of 10–25 documents, they can outperform both supervised classifiers trained on hundreds or thousands of documents and state-of-the-art generative models. Additionally, we release PolNLI, the dataset used to train these models: a corpus of more than 200,000 political documents with highly accurate labels across more than 800 classification tasks.

Information

Type
Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press on behalf of The Society for Political Methodology

Table 1 Dataset statistics across different tasks.

Table 2 Example hypotheses and documents for each task.

Figure 1 Zero-shot performance for all test set documents.

Figure 2 Zero-shot performance by task.

Figure 3 Distribution of F1 across all datasets for zero-shot classification.

Figure 4 Classification of COVID-19 Tweets.

Figure 5 Comparison of regression estimates (with p-values) between the DEBATE model and Block Jr. et al. (2022). Note: In addition to the above variables, the model includes a vector of demographic and geographic controls.

Figure 6 The DEBATE models offer a massive efficiency advantage over generative language models.

Supplementary material: File

Burnham et al. supplementary material

File 4.4 MB
Supplementary material: Link

Burnham et al. Dataset

Link