
Measuring Polarization with Text Analysis: Evidence from the UK House of Commons, 1811–2015

Published online by Cambridge University Press:  29 April 2019

Niels D. Goet*
Data Scientist, Inspera AS, Oslo, Norway.


Political scientists can rely on a long tradition of applying unsupervised measurement models to estimate ideology and preferences from texts. However, in practice the hope that the dominant source of variation in their data is the quantity of interest is often not realized. In this paper, I argue that in the messy world of speeches we have to rely on supervised approaches that include information on party affiliation in order to produce meaningful estimates of polarization. To substantiate this argument, I introduce a validation framework that may be used to comparatively assess supervised and unsupervised methods, and estimate polarization on the basis of 6.2 million records of parliamentary speeches from the UK House of Commons over the period 1811–2015. Beyond introducing several important adjustments to existing estimation approaches, the paper’s methodological contribution therefore consists of outlining the challenges of applying unsupervised estimation techniques to speech data, and arguing in detail why we should instead rely on supervised methods to measure polarization.

Copyright © The Author(s) 2019. Published by Cambridge University Press on behalf of the Society for Political Methodology. 



Author’s note: I would like to thank Andrew Peterson, Max Goplerud, David Doyle, Radoslaw Zubek, Simon Hug, and Kamil Marcinkiewicz for helpful suggestions and comments on earlier drafts. I am also grateful to discussants and audiences at the American Political Science Association meeting (2017) and the ECPR Standing Group on Parliaments (2017), who provided helpful feedback. My manuscript benefited greatly from detailed comments from three anonymous referees, and from the editor at Political Analysis. The usual disclaimer applies. Replication materials are available on the Political Analysis Dataverse (Goet 2018). Supplementary materials for this article are available on the Political Analysis website.

Contributing Editor: R. Michael Alvarez


Adcock, R., and Collier, D. 2001. “Measurement Validity: A Shared Standard for Qualitative and Quantitative Research.” American Political Science Review 95(3):529–546.
Benedetto, G., and Hix, S. 2007. “The Rejected, the Ejected, and the Dejected: Explaining Government Rebels in the 2001–2005 British House of Commons.” Comparative Political Studies 40(7):755–781.
Binder, S. A. 1996. “The Partisan Basis of Procedural Choice: Allocating Parliamentary Rights in the House, 1789–1990.” The American Political Science Review 90(1):8–20.
Bonica, A. 2014. “Mapping the Ideological Marketplace.” American Journal of Political Science 58(2):367–386.
Bottou, L. 2004. “Stochastic Learning.” In Advanced Lectures on Machine Learning, edited by Bousquet, O., von Luxburg, U., and Rätsch, G., 146–168. Berlin and Heidelberg: Springer.
Carrubba, C. J., Gabel, M., and Hug, S. 2008. “Legislative Voting Behavior, Seen and Unseen: A Theory of Roll-call Vote Selection.” Legislative Studies Quarterly 33(4):543–572.
Carrubba, C. J. et al. 2006. “Off the Record: Unrecorded Legislative Votes, Selection Bias and Roll-call Vote Analysis.” British Journal of Political Science 36(4):691–704.
Cox, G. W. 1987. The Efficient Secret: The Cabinet and the Development of Political Parties. Cambridge: Cambridge University Press.
Diermeier, D., and Vlaicu, R. 2011. “Parties, Coalitions, and the Internal Organization of Legislatures.” American Political Science Review 105(2):359–380.
Eggers, A. C., and Spirling, A. 2014. “Electoral Security as a Determinant of Legislator Activity, 1832–1918: New Data and Methods for Analyzing British Political Development.” Legislative Studies Quarterly 39(4):593–620.
Gentzkow, M., Shapiro, J. M., and Taddy, M. 2016. “Measuring Polarization in High-dimensional Data: Method and Application to Congressional Speech.” Working Paper.
Goet, N. D. 2018. “Replication Data for: Measuring Polarisation with Text Analysis - Evidence from the UK House of Commons, 1811–2015.” Harvard Dataverse, V1.
2013. “Ideology Analysis of Members of Congress.”
Greenacre, M. 2016. Correspondence Analysis in Practice. 3rd edn. Boca Raton, FL: Chapman & Hall/CRC Press.
Grimmer, J. 2013. Representational Style in Congress: What Legislators Say and Why it Matters. Cambridge: Cambridge University Press.
Grimmer, J., and Stewart, B. M. 2013. “Text as Data: The Promise and Pitfalls of Automatic Content Analysis Methods for Political Texts.” Political Analysis 21(3):267–297.
Herzog, A., and Benoit, K. 2015. “The Most Unkindest Cuts: Speaker Selection and Expressed Government Dissent During Economic Crisis.” The Journal of Politics 77(4):1157–1175.
Hix, S., and Noury, A. 2010. “Scaling the Commons: Using MPs’ Left–right Self-placement and Voting Divisions to Map the British Parliament, 1997–2005.” Paper prepared for presentation at the annual meeting of the American Political Science Association in Washington, DC, September 2–5, 2010.
Hug, S. 2010. “Selection Effects in Roll Call Votes.” British Journal of Political Science 40(1):225–235.
Kam, C. 2009. Party Discipline and Parliamentary Politics. Cambridge: Cambridge University Press.
Lauderdale, B. E., and Herzog, A. 2016. “Measuring Political Positions from Legislative Speech.” Political Analysis 24(3):374–394.
Laver, M., Benoit, K., and Garry, J. 2003. “Extracting Policy Positions from Political Texts Using Words as Data.” American Political Science Review 97(2):311–331.
Laver, M., and Budge, I. 1992. “Measuring Policy Distances and Modelling Coalition Formation.” In Party Policy and Government Coalitions, edited by Laver, M. and Budge, I., 15–40. Basingstoke: Macmillan.
Lowe, W. E. M. 2013. “There’s (Basically) Only One Way To Do It: Some Unifying Theory for Text Scaling Models.” Paper prepared for the American Political Science Association Meeting, Chicago, September 2013.
Lowe, W. E. M., and Benoit, K. 2013. “Validating Estimates of Latent Traits from Textual Data Using Human Judgment as a Benchmark.” Political Analysis 21(3):298–313.
Manning, C. D., Raghavan, P., and Schütze, H. 2008. Introduction to Information Retrieval. Cambridge: Cambridge University Press.
Maron, M. E., and Kuhns, J. L. 1960. “On Relevance, Probabilistic Indexing and Information Retrieval.” Journal of the ACM 7(3):216–244.
McLean, I., and Bustani, C. 1999. “Irish Potatoes and British Politics: Interests, Ideology, Heresthetic and the Repeal of the Corn Laws.” Political Studies 47(5):817–836.
Monroe, B. L., Colaresi, M. P., and Quinn, K. M. 2008. “Fightin’ Words: Lexical Feature Selection and Evaluation for Identifying the Content of Political Conflict.” Political Analysis 16(4):372–403.
Norris, P., and Lovenduski, J. 1995. Political Recruitment: Gender, Race and Class in the British Parliament. Cambridge: Cambridge University Press.
Peterson, A., and Spirling, A. 2018. “Classification Accuracy as a Substantive Quantity of Interest: Measuring Polarization in Westminster Systems.” Political Analysis 26(1):120–128.
Proksch, S.-O., and Slapin, J. B. 2014. “Words as Data: Content Analysis in Legislative Studies.” In The Oxford Handbook of Legislative Studies, edited by Martin, S., Saalfeld, T., and Strøm, K., 126–144. Oxford: Oxford University Press.
Proksch, S.-O., and Slapin, J. B. 2015. The Politics of Parliamentary Debate: Parties, Rebels and Representation. Cambridge: Cambridge University Press.
Pugh, M. 1982. The Making of Modern British Politics, 1867–1939. Oxford: Basil Blackwell Publisher Limited.
Pugh, M. 1999. State & Society: A Social and Political History of Britain 1870–1999. 2nd edn. New York: Oxford University Press.
Quinn, K. M. et al. 2010. “How to Analyze Political Attention with Minimal Assumptions and Costs.” American Journal of Political Science 54(1):209–228.
Schwarz, D., Traber, D., and Benoit, K. 2017. “Estimating Intra-party Preferences: Comparing Speeches to Votes.” Political Science Research and Methods 5(2):379–396.
Slapin, J. B., and Proksch, S.-O. 2008. “A Scaling Model for Estimating Time-series Party Positions from Texts.” American Journal of Political Science 52(3):705–722.
Spirling, A. 2014. “British Political Development: A Research Agenda.” Legislative Studies Quarterly 39(4):435–437.
Spirling, A., and McLean, I. 2007. “UK OC OK? Interpreting Optimal Classification Scores for the U.K. House of Commons.” Political Analysis 15(1):85–96.
Vandoren, P. M. 1990. “Can We Learn the Causes of Congressional Decisions from Roll-call Data?” Legislative Studies Quarterly 15(3):311–340.
Volkens, A. et al. 2016. The Manifesto Data Collection. Manifesto Project (MRG/CMP/MARPOR). Version 2016a. With Werner, Annika. Berlin: Wissenschaftszentrum Berlin für Sozialforschung.