Published online by Cambridge University Press: 29 April 2019
Political scientists can rely on a long tradition of applying unsupervised measurement models to estimate ideology and preferences from texts. In practice, however, the hope that the quantity of interest is the dominant source of variation in the data is often not realized. In this paper, I argue that in the messy world of speeches we have to rely on supervised approaches that include information on party affiliation in order to produce meaningful estimates of polarization. To substantiate this argument, I introduce a validation framework that may be used to comparatively assess supervised and unsupervised methods, and estimate polarization on the basis of 6.2 million records of parliamentary speeches from the UK House of Commons over the period 1811–2015. Beyond introducing several important adjustments to existing estimation approaches, the paper’s methodological contribution therefore consists of outlining the challenges of applying unsupervised estimation techniques to speech data, and arguing in detail why we should instead rely on supervised methods to measure polarization.
Author’s note: I would like to thank Andrew Peterson, Max Goplerud, David Doyle, Radoslaw Zubek, Simon Hug, and Kamil Marcinkiewicz for helpful suggestions and comments on earlier drafts. I am also grateful to discussants and audiences at the American Political Science Association meeting (2017) and the ECPR Standing Group on Parliaments (2017), who provided helpful feedback. My manuscript benefited greatly from detailed comments from three anonymous referees, and from the editor at Political Analysis. The usual disclaimer applies. Replication materials are available on the Political Analysis Dataverse (Goet 2018). Supplementary materials for this article are available on the Political Analysis website.
Contributing Editor: R. Michael Alvarez