
Measuring Polarization with Text Analysis: Evidence from the UK House of Commons, 1811–2015

  • Niels D. Goet


Political scientists can rely on a long tradition of applying unsupervised measurement models to estimate ideology and preferences from texts. However, in practice the hope that the dominant source of variation in their data is the quantity of interest is often not realized. In this paper, I argue that in the messy world of speeches we have to rely on supervised approaches that include information on party affiliation in order to produce meaningful estimates of polarization. To substantiate this argument, I introduce a validation framework that may be used to comparatively assess supervised and unsupervised methods, and estimate polarization on the basis of 6.2 million records of parliamentary speeches from the UK House of Commons over the period 1811–2015. Beyond introducing several important adjustments to existing estimation approaches, the paper’s methodological contribution therefore consists of outlining the challenges of applying unsupervised estimation techniques to speech data, and arguing in detail why we should instead rely on supervised methods to measure polarization.



Author’s note: I would like to thank Andrew Peterson, Max Goplerud, David Doyle, Radoslaw Zubek, Simon Hug, and Kamil Marcinkiewicz for helpful suggestions and comments on earlier drafts. I am also grateful to discussants and audiences at the American Political Science Association meeting (2017) and the ECPR Standing Group on Parliaments (2017), who provided helpful feedback. My manuscript benefited greatly from detailed comments from three anonymous referees, and from the editor at Political Analysis. The usual disclaimer applies. Replication materials are available on the Political Analysis Dataverse (Goet 2018). Supplementary materials for this article are available on the Political Analysis website.

Contributing Editor: R. Michael Alvarez



Adcock, R., and Collier, D. 2001. “Measurement Validity: A Shared Standard for Qualitative and Quantitative Research.” American Political Science Review 95(3):529–546.
Benedetto, G., and Hix, S. 2007. “The Rejected, the Ejected, and the Dejected: Explaining Government Rebels in the 2001–2005 British House of Commons.” Comparative Political Studies 40(7):755–781.
Binder, S. A. 1996. “The Partisan Basis of Procedural Choice: Allocating Parliamentary Rights in the House, 1789–1990.” The American Political Science Review 90(1):8–20.
Bonica, A. 2014. “Mapping the Ideological Marketplace.” American Journal of Political Science 58(2):367–386.
Bottou, L. 2004. “Stochastic Learning.” In Advanced Lectures on Machine Learning, edited by Bousquet, O., von Luxburg, U., and Rätsch, G., 146–168. Berlin and Heidelberg: Springer.
Carrubba, C. J., Gabel, M., and Hug, S. 2008. “Legislative Voting Behavior, Seen and Unseen: A Theory of Roll-call Vote Selection.” Legislative Studies Quarterly 33(4):543–572.
Carrubba, C. J. et al. 2006. “Off the Record: Unrecorded Legislative Votes, Selection Bias and Roll-call Vote Analysis.” British Journal of Political Science 36(4):691–704.
Cox, G. W. 1987. The Efficient Secret: The Cabinet and the Development of Political Parties. Cambridge: Cambridge University Press.
Diermeier, D., and Vlaicu, R. 2011. “Parties, Coalitions, and the Internal Organization of Legislatures.” American Political Science Review 105(2):359–380.
Eggers, A. C., and Spirling, A. 2014. “Electoral Security as a Determinant of Legislator Activity, 1832–1918: New Data and Methods for Analyzing British Political Development.” Legislative Studies Quarterly 39(4):593–620.
Gentzkow, M., Shapiro, J. M., and Taddy, M. 2016. “Measuring Polarization in High-dimensional Data: Method and Application to Congressional Speech.” Working Paper.
Goet, N. D. 2018. “Replication Data for: Measuring Polarisation with Text Analysis - Evidence from the UK House of Commons, 1811–2015.” Harvard Dataverse, V1. 2013. “Ideology Analysis of Members of Congress.”
Greenacre, M. 2016. Correspondence Analysis in Practice. 3rd edn. Boca Raton, FL: Chapman & Hall/CRC Press.
Grimmer, J. 2013. Representational Style in Congress: What Legislators Say and Why it Matters. Cambridge: Cambridge University Press.
Grimmer, J., and Stewart, B. M. 2013. “Text as Data: The Promise and Pitfalls of Automatic Content Analysis Methods for Political Texts.” Political Analysis 21(3):267–297.
Herzog, A., and Benoit, K. 2015. “The Most Unkindest Cuts: Speaker Selection and Expressed Government Dissent During Economic Crisis.” The Journal of Politics 77(4):1157–1175.
Hix, S., and Noury, A. 2010. “Scaling the Commons: Using MPs’ Left–right Self-placement and Voting Divisions to Map the British Parliament, 1997–2005.” Paper prepared for presentation at the annual meeting of the American Political Science Association in Washington, DC, September 2–5, 2010.
Hug, S. 2010. “Selection Effects in Roll Call Votes.” British Journal of Political Science 40(1):225–235.
Kam, C. 2009. Party Discipline and Parliamentary Politics. Cambridge: Cambridge University Press.
Lauderdale, B. E., and Herzog, A. 2016. “Measuring Political Positions from Legislative Speech.” Political Analysis 24(3):374–394.
Laver, M., Benoit, K., and Garry, J. 2003. “Extracting Policy Positions from Political Texts Using Words as Data.” American Political Science Review 97(2):311–331.
Laver, M., and Budge, I. 1992. “Measuring Policy Distances and Modelling Coalition Formation.” In Party Policy and Government Coalitions, edited by Laver, M. and Budge, I., 15–40. Basingstoke: Macmillan.
Lowe, W. E. M. 2013. “There’s (Basically) Only One Way To Do It: Some Unifying Theory for Text Scaling Models.” Paper prepared for the American Political Science Association Meeting, Chicago, September 2013.
Lowe, W. E. M., and Benoit, K. 2013. “Validating Estimates of Latent Traits from Textual Data Using Human Judgment as a Benchmark.” Political Analysis 21(3):298–313.
Manning, C. D., Raghavan, P., and Schütze, H. 2008. Introduction to Information Retrieval. Cambridge: Cambridge University Press.
Maron, M. E., and Kuhns, J. L. 1960. “On Relevance, Probabilistic Indexing and Information Retrieval.” Journal of the ACM 7(3):216–244.
McLean, I., and Bustani, C. 1999. “Irish Potatoes and British Politics: Interests, Ideology, Heresthetic and the Repeal of the Corn Laws.” Political Studies 47(5):817–836.
Monroe, B. L., Colaresi, M. P., and Quinn, K. M. 2008. “Fightin’ Words: Lexical Feature Selection and Evaluation for Identifying the Content of Political Conflict.” Political Analysis 16(4):372–403.
Norris, P., and Lovenduski, J. 1995. Political Recruitment: Gender, Race and Class in the British Parliament. Cambridge: Cambridge University Press.
Peterson, A., and Spirling, A. 2018. “Classification Accuracy as a Substantive Quantity of Interest: Measuring Polarization in Westminster Systems.” Political Analysis 26(1):120–128.
Proksch, S.-O., and Slapin, J. B. 2014. “Words as Data: Content Analysis in Legislative Studies.” In The Oxford Handbook of Legislative Studies, edited by Martin, S., Saalfeld, T., and Strøm, K., 126–144. Oxford: Oxford University Press.
Proksch, S.-O., and Slapin, J. B. 2015. The Politics of Parliamentary Debate: Parties, Rebels and Representation. Cambridge: Cambridge University Press.
Pugh, M. 1982. The Making of Modern British Politics, 1867–1939. Oxford: Basil Blackwell Publisher Limited.
Pugh, M. 1999. State & Society: A Social and Political History of Britain 1870–1999. 2nd edn. New York: Oxford University Press.
Quinn, K. M. et al. 2010. “How to Analyze Political Attention with Minimal Assumptions and Costs.” American Journal of Political Science 54(1):209–228.
Schwarz, D., Traber, D., and Benoit, K. 2017. “Estimating Intra-party Preferences: Comparing Speeches to Votes.” Political Science Research and Methods 5(2):379–396.
Slapin, J. B., and Proksch, S.-O. 2008. “A Scaling Model for Estimating Time-series Party Positions from Texts.” American Journal of Political Science 52(3):705–722.
Spirling, A. 2014. “British Political Development: A Research Agenda.” Legislative Studies Quarterly 39(4):435–437.
Spirling, A., and McLean, I. 2007. “UK OC OK? Interpreting Optimal Classification Scores for the U.K. House of Commons.” Political Analysis 15(1):85–96.
Vandoren, P. M. 1990. “Can We Learn the Causes of Congressional Decisions from Roll-call Data?” Legislative Studies Quarterly 15(3):311–340.
Volkens, A. et al. 2016. The Manifesto Data Collection. Manifesto Project (MRG/CMP/MARPOR). Version 2016a. With Werner, Annika. Berlin: Wissenschaftszentrum Berlin für Sozialforschung.