Legislative speech records from the 101st to 108th Congresses of the US Senate are analysed to study political ideologies. A widely-used text classification algorithm – Support Vector Machines (SVM) – allows the extraction of terms that are most indicative of conservative and liberal positions in legislative speeches and the prediction of senators’ ideological positions, with a 92 per cent level of accuracy. Feature analysis identifies the terms associated with conservative and liberal ideologies. The results demonstrate that cultural references appear more important than economic references in distinguishing conservative from liberal congressional speeches, calling into question the common economic interpretation of ideological differences in the US Congress.
1 Poole, Keith T., ‘Changing Minds? Not in Congress’, Public Choice, 131 (2007), 435–451.
2 Converse, Philip E., ‘The Nature of Belief Systems in Mass Publics’, in David E. Apter, ed., Ideology and Discontent (New York: The Free Press, 1964), pp. 206–261.
3 Converse, ‘The Nature of Belief Systems in Mass Publics’, p. 207.
4 Poole, Keith T. and Rosenthal, Howard, ‘Patterns of Congressional Voting’, American Journal of Political Science, 35 (1991), 228–278; Poole, Keith T. and Rosenthal, Howard, Congress: A Political-Economic History of Roll Call Voting (New York: Oxford University Press, 1997); McCarty, Nolan, Poole, Keith T. and Rosenthal, Howard, Income Redistribution and the Realignment of American Politics (Washington, D.C.: American Enterprise Institute, 1997); McCarty, Nolan, Poole, Keith T. and Rosenthal, Howard, Polarized America: The Dance of Ideology and Unequal Riches (Boston, Mass.: MIT Press, 2006); Poole, Keith T. and Rosenthal, Howard, Ideology and Congress (New Brunswick, N.J.: Transaction Publishers, 2007).
5 Poole and Rosenthal, Congress.
6 Poole, Keith T., Spatial Models of Parliamentary Voting (New York: Cambridge University Press, 2005).
7 Initially, this finding met with widespread disbelief. See Poole and Rosenthal, Congress, p. 8. However, the low dimensionality of legislative voting has been confirmed by other scholars using different estimation methodologies, such as Bayesian procedures (Clinton, Joshua, Jackman, Simon and Rivers, Doug, ‘The Statistical Analysis of Roll Call Data’, American Political Science Review, 98 (2004), 355–370) or factor analysis (Heckman, James J. and Snyder, James M. Jr, ‘Linear Probability Models of the Demand for Attributes with an Empirical Application to Estimating the Preferences of Legislators’, RAND Journal of Economics, 28 (1997), S142–S189) for estimating ideal points.
8 Institutional features such as gate-keeping powers of committees (Shepsle, Kenneth A. and Weingast, Barry R., ‘Structure-induced Equilibrium and Legislative Choice’, Public Choice, 37 (1981), 503–519), pre-floor legislative activities (such as co-sponsorship), strategic voting (Talbert, Jeffery C. and Potoski, Matthew, ‘Setting the Legislative Agenda: The Dimensional Structure of Bill Cosponsoring and Floor Voting’, Journal of Politics, 64 (2002), 864–891) or institutional constraints such as the presidential veto (Roberts, Jason M., ‘The Statistical Analysis of Roll Call Data: A Cautionary Tale’, Legislative Studies Quarterly, 22 (2007), 341–360; Clinton, Joshua D., ‘Lawmaking and Roll Calls’, Journal of Politics, 69 (2007), 457–469) can all affect the measurement of ideal points and reduce the dimensionality of legislative voting in Congress. It is also possible that exogenous factors, such as electoral incentives, could help explain why parties aim to present a coherent legislative agenda, and avoid intra-party voting divisions. Indeed, Snyder and Ting (Snyder, James M. and Ting, Michael M., ‘An Informational Rationale for Political Parties’, American Journal of Political Science, 46 (2002), 90–110; Snyder, James M. and Ting, Michael M., ‘Party Labels, Roll Calls, and Elections’, Political Analysis, 11 (2003), 419–444) and Woon and Pope (Woon, Jonathan and Pope, Jeremy C., ‘Made in Congress? Testing the Electoral Implications of Party Ideological Brand Names’, Journal of Politics, 70 (2008), 823–836) argue that parties can use their aggregate roll-call record to produce a coherent ideological brand name in order to communicate with the electorate. In this context, the observed unidimensionality in legislative voting would be facilitated by electoral incentives, rather than by institutional rules or agenda control.
9 On Westminster-style parliamentary systems, see Spirling and McLean (Spirling, Arthur and McLean, Iain, ‘UK OC OK? Interpreting Optimal Classification Scores for the U.K. House of Commons’, Political Analysis, 15 (2006), 85–86). On the US Congress case, see Clinton, ‘Lawmaking and Roll Calls’, and Roberts, ‘The Statistical Analysis of Roll Call Data’.
10 See, for example, the NPAT candidate survey of Ansolabehere, Stephen, Snyder, James M. Jr and Stewart, Charles III, ‘The Effects of Party and Preferences on Congressional Roll-Call Voting’, Legislative Studies Quarterly, 26 (2001), 533–572, which looks at the correlation between first factor nominate and first factor NPAT scores; or the Poole and Rosenthal study of nominate scores and interest group ratings (Poole and Rosenthal, Congress; Poole and Rosenthal, Ideology and Congress).
11 One such example is Schonhardt-Bailey, Cheryl, ‘The Congressional Debate on Partial-Birth Abortion: Constitutional Gravitas and Moral Passion’, British Journal of Political Science, 38 (2008), 383–410. In her study of the US Senate debates on partial-birth abortion, Schonhardt-Bailey identifies two dimensions of conflict, where the first dimension represents an emotive conflict over the abortion procedure, while the second dimension is related to the constitutionality of the bill. Schonhardt-Bailey argues that legislative voting correlates with this second dimension.
12 Budge, Ian, Klingemann, Hans-Dieter, Volkens, Andrea, Bara, Judith and Tanenbaum, Eric, Mapping Policy Preferences: Estimates for Parties, Electors, and Governments 1945–1998 (Oxford: Oxford University Press, 2001); Baumgartner, Frank R. and Jones, Bryan D., Agendas and Instability in American Politics (Chicago: University of Chicago Press, 1993); Baumgartner, Frank R. and Jones, Bryan D., eds, Policy Dynamics (Chicago: University of Chicago Press, 2002); Baumgartner, Frank R. and Jones, Bryan D., The Politics of Attention: How Government Prioritizes Problems (Chicago: University of Chicago Press, 2005).
13 For examples, see Laver, Michael and Benoit, Kenneth, ‘Locating TDs in Policy Spaces: Wordscoring Dáil Speeches’, Irish Political Studies, 17 (2002), 59–73; Laver, Michael, Benoit, Kenneth and Garry, John, ‘Extracting Policy Positions from Political Texts Using Words as Data’, American Political Science Review, 97 (2003), 311–337; Benoit, Kenneth and Laver, Michael, ‘Estimating Irish Party Positions Using Computer Wordscoring: The 2002 Elections’, Irish Political Studies, 18 (2003), 97–107; Benoit, Kenneth and Laver, Michael, ‘Mapping the Irish Policy Space: Voter and Party Spaces in Preferential Elections’, Economic and Social Review, 36 (2005), 83–108; Monroe, Burt L. and Maeda, Ko, ‘Rhetorical Ideal Point Estimation: Mapping Legislative Speech’ (presented at the Society for Political Methodology, Palo Alto: Stanford University, 2004); Simon, Adam F. and Xenos, Michael, ‘Dimensional Reduction of Word-frequency Data as a Substitute for Intersubjective Content Analysis’, Political Analysis, 12 (2004), 63–75; Slapin, Jonathan B. and Proksch, Sven O., ‘A Scaling Model for Estimating Time-Series Party Positions from Texts’, American Journal of Political Science, 52 (2008), 705–722; Quinn, Kevin M., Monroe, Burt L., Colaresi, Michael, Crespin, Michael H. and Radev, Dragomir R., ‘How to Analyze Political Attention with Minimal Assumptions and Costs’, American Journal of Political Science, 54 (2010), 209–228. For a recent review, see Monroe, Burt and Schrodt, Philipp A., ‘Introduction to the Special Issue: The Analysis of Political Text’, Political Analysis, 16 (2008), 351–355; and also Cousins, Ken and McIntosh, Wayne, ‘More than Typewriters, More than Adding Machines: Integrating Information Technology into Political Research’, Quality and Quantity, 39 (2005), 591–614; and Yano, Tae, Cohen, William W. and Smith, Noah A., ‘Predicting Response to Political Blog Posts with Topic Models’, Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics conference (NAACL) (2009), 477–485.
14 Examples include Laver, Benoit and Garry, ‘Extracting Policy Positions from Political Texts Using Words as Data’. See also Benoit and Laver, ‘Estimating Irish Party Positions Using Computer Wordscoring’; Benoit and Laver, ‘Mapping the Irish Policy Space’; Laver and Benoit, ‘Locating TDs in Policy Spaces’; Benoit, Kenneth, Laver, Michael, Arnold, Christine, Pennings, Paul and Hosli, Madeleine O., ‘Measuring National Delegate Positions at the Convention on the Future of Europe Using Computerized Wordscoring’, European Union Politics, 6 (2005), 291–313. For a critical view, see Budge, Ian and Pennings, Paul, ‘Do They Work? Validating Computerized Word Frequency Estimates against Policy Series’, Electoral Studies, 26 (2007), 121–129.
15 Monroe and Maeda, ‘Rhetorical Ideal Point Estimation’; Slapin and Proksch, ‘A Scaling Model for Estimating Time-Series Party Positions from Texts’.
16 Purpura, Stephen and Hillard, Dustin, ‘Automated Classification of Congressional Legislation’, Proceedings of the 2006 International Conference on Digital Government Research (2006), 219–225, retrieved 28 May 2007, from the ACM Digital Library; Pang, Bo, Lee, Lillian and Vaithyanathan, Shivakumar, ‘Thumbs up? Sentiment Classification Using Machine Learning Techniques’, Proceedings of the ACL-02 Conference on Empirical Methods in Natural Language Processing (2002), 79–86, retrieved 28 May 2007, from the ACM Digital Library. Thomas, Matt, Pang, Bo and Lee, Lillian, ‘Get out the Vote: Determining Support or Opposition from Congressional Floor-debate Transcripts’, Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing (2006), 327–335, retrieved from the ACL Digital Archive, predicted speakers’ opinions about a specific bill (support or opposition) based on their speeches. Their classifier was trained on 2,740 speech segments in 38 bill debates and achieved an accuracy of 66 per cent in predicting the opinions expressed in 860 speech segments from ten different legislative debates.
17 We used the Poole and Rosenthal dw-nominate scores available at http://voteview.com/dwnomin.htm.
18 We will discuss differences between the House and the Senate below. We will also suggest how the approach can be utilized when studying other legislatures.
19 The dw-nominate scores for the same senators can be different across congresses. As a result, when we prepare the senatorial speeches as training and testing documents (each document is called an ‘example’ in machine learning terms), a senator could be assigned to the extreme category in one congress but moved to the moderate category in another. Therefore, we treat the same senators in different congresses as different training/testing examples.
20 Forty-five of these fifty ‘extreme’ senators had already served in the 107th Congress.
21 This issue was investigated by Poole (‘Changing Minds? Not in Congress’) in the context of voting behaviour. Poole found strong support for individual ideological consistency in members of Congress over time.
22 Ninety-one senators in the 108th Congress served in previous congresses. Forty-four of the fifty extreme senators in the 108th Congress were rated as extreme in previous congresses.
23 The performance of classification algorithms is tested using common benchmark datasets. The Reuters-21578 news collection, the OHSUMED Medline abstract collection, and the 20 Usenet newsgroups collection are the most widely used benchmark datasets. The Reuters-21578 collection is available at http://kdd.ics.uci.edu/databases/reuters21578/reuters21578.html. The OHSUMED collection is available at http://trec.nist.gov/data/t9_filtering.html. The 20 newsgroups collection is available at http://kdd.ics.uci.edu/databases/20newsgroups.html.
24 Dumais, Susan, Platt, John, Heckerman, David and Sahami, Mehran, ‘Inductive Learning Algorithms and Representations for Text Categorization’, Proceedings of the 7th International Conference on Information and Knowledge Management (1998), 148–155, retrieved 28 May 2007, from the ACM Digital Library; Guyon, Isabelle, Weston, Jason, Barnhill, Stephen and Vapnik, Vladimir, ‘Gene Selection for Cancer Classification Using Support Vector Machines’, Machine Learning, 46 (2002), 389–422; Forman, George, ‘An Extensive Empirical Study of Feature Selection Metrics for Text Categorization’, Journal of Machine Learning Research, 3 (2003), 1289–1305; Joachims, Thorsten, ‘Text Categorization with Support Vector Machines: Learning with Many Relevant Features’, 10th European Conference on Machine Learning, Vol. 1398 of Lecture Notes in Computer Science (Berlin: Springer Verlag, 1998), pp. 137–142; Mladenic, Dunja, Brank, Janez, Grobelnik, Marko and Milic-Frayling, Natasa, ‘Feature Selection Using Linear Classifier Weights: Interaction with Classification Models’, Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '04), (Sheffield: 25–29 July 2004), pp. 234–41; Yang, Yiming and Liu, Xin, ‘A Re-evaluation of Text Categorization Methods’, Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (1999), 42–49, retrieved 28 May 2007, from the ACM Digital Library; Sebastiani, Fabrizio, ‘Machine Learning in Automated Text Categorization’, ACM Computing Surveys, 34 (2002), 1–47. We also compared our SVM algorithm to naïve Bayes, another popular classification method. Our experimental results show that SVM is slightly superior to naïve Bayes for ideological position classification.
25 Pang, Lee and Vaithyanathan, ‘Thumbs up?’
26 Details on the way in which these vectors were derived from the documents are discussed in the next section.
27 Vapnik, Vladimir, Estimation of Dependences Based on Empirical Data (New York: Springer-Verlag, 1982); Cortes, Corinna and Vapnik, Vladimir, ‘Support-vector Networks’, Machine Learning, 20 (1995), 273–297; Vapnik, Vladimir, The Nature of Statistical Learning Theory (New York: Springer-Verlag, 1999).
28 There are several efficient implementations of the SVM algorithm, such as LIBSVM and SVMlight (Thorsten Joachims, ‘SVMlight: Support Vector Machine (Version 6.01)’, (2004)). We used the SVMlight package with its default setting in this study. See Chang, Chih C. and Lin, Chih J., ‘LIBSVM: A Library for Support Vector Machines’ (2001), Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
29 Lewis, David D., ‘An Evaluation of Phrasal and Clustered Representations on a Text Categorization Task’, Proceedings of the 15th Annual International Conference on Research and Development of Information Retrieval (1992), pp. 37–50, retrieved 28 May 2007, from the ACM Digital Library; Cohen, William W. and Singer, Yoram, ‘Context-sensitive Learning Methods for Text Categorization’, ACM Transactions on Information Systems, 17 (1999), 141–173; Scott, Sam and Matwin, Stan, ‘Feature Engineering for Text Classification’, Proceedings of the 16th International Conference on Machine Learning (San Francisco: Morgan Kaufmann, 1999), pp. 379–388; Moschitti, Alessandro and Basili, Roberto, ‘Complex Linguistic Features for Text Classification: A Comprehensive Study’, European Conference on Information Retrieval, Vol. 2997 of Lecture Notes in Computer Science (Berlin: Springer Verlag, 2004), pp. 181–196.
30 Dave, Kushal, Lawrence, Steve and Pennock, David M., ‘Mining the Peanut Gallery: Opinion Extraction and Semantic Classification of Product Reviews’, Proceedings of the 12th International Conference on World Wide Web (2003), 519–522, retrieved 28 May 2007, from the ACM Digital Library; Pang, Lee and Vaithyanathan, ‘Thumbs up?’
31 Finn, Aidan and Kushmerick, Nicholas, ‘Learning to Classify Documents according to Genre’, Journal of American Society of Information Science and Technology, 57 (2006), 1506–1518. For example, some typical adjectives in movie reviews (like hilarious and boring) are unlikely to occur in restaurant reviews, although some opinion descriptors (like terrific and bad) are universal.
32 Miss was not included because all single female Senators (e.g. Susan Collins and Barbara Mikulski) were saluted as ‘Ms’.
33 Yu, Bei, Kaufmann, Stefan and Diermeier, Daniel, ‘Classifying Party Affiliation from Political Speech’, Journal of Information Technology & Politics, 5 (2008), 33–48.
34 Porter, M. F., ‘An Algorithm for Suffix Stripping’, Program, 14 (1980), 130–137.
35 We used the MorphAdorner tagger to tag the parts of speech. Since the tagger has its own tokenizer, the generated word forms in this case are slightly different from the results of the simple tokenizer.
36 This is a standard approach in classification tasks; see, e.g., Tom Mitchell, Machine Learning (Toronto: McGraw Hill, 1997). An alternative approach consists in setting aside a sizeable portion of the data as a ‘held-out’ set which is ignored during training and only used for testing. This approach is sound for datasets with large numbers of labelled examples. However, for small datasets such as ours, it is problematic since the arbitrary training/test split may accidentally lead to two datasets that are unlikely to have been produced by the same source.
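The contrast drawn in this note between a single held-out split and repeated train/test rotation can be sketched as follows. This is a minimal illustration of generic k-fold cross-validation, not the authors' exact procedure: the function name `k_fold_splits`, the number of folds and the random seed are all assumptions made for the example.

```python
import random


def k_fold_splits(n_examples, k, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation.

    Hypothetical helper for illustration only; the fold count and the
    shuffling seed are assumptions, not settings taken from the study.
    """
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at position i,
    # so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        # Every example outside the current fold is used for training.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Ten labelled examples rotated through five folds: each example is
# tested exactly once and trained on in the other four rounds.
splits = list(k_fold_splits(n_examples=10, k=5))
```

Because every example serves as a test case exactly once, no single arbitrary split determines the reported accuracy, which is the concern the note raises for small datasets.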
37 The accuracy was even higher (94 per cent) when adjectives were used as feature sets. Since there are only fifty test examples, a 2 per cent accuracy improvement corresponds to one more correctly predicted example. Therefore, we do not think the accuracy difference is significant.
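The arithmetic behind this note can be checked directly. The sketch below assumes that the 92 per cent figure reported in the abstract refers to the same fifty test examples; the helper name `correct_count` is hypothetical.

```python
TEST_EXAMPLES = 50  # size of the test set stated in the note


def correct_count(accuracy_percent, n_examples=TEST_EXAMPLES):
    """Convert a whole-number accuracy percentage into a count of correct predictions."""
    # Integer arithmetic avoids floating-point rounding in the conversion.
    return accuracy_percent * n_examples // 100


baseline_correct = correct_count(92)    # assumed baseline, taken from the abstract
adjective_correct = correct_count(94)   # adjectives-only feature set
difference = adjective_correct - baseline_correct
```

With fifty examples each correct prediction is worth two percentage points, so the two feature sets differ by a single senator.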
38 Note, however, that the out-of-sample set is small due to lack of turnover among members of the Senate.
39 These polarities are arbitrary. See the methodology section for technical details.
40 This is related to the literature on framing. For a recent review, see Druckman, Jamie and Chong, Dennis, ‘Framing Theory’, Annual Review of Political Science, 10 (2007), 103–126.
41 We reproduce in Table 4 the most liberal and conservative words as they appear in our ranking, from first to the twentieth in rank order. However, for the purposes of this discussion, we selected words ranked in the top fifty to illustrate commonality.
42 For example, Senator Colman in the 106th Senate mentioned ‘grievous injury’ before he expressed his objection to this amendment to the partial-birth ban act.
43 To compare the two chambers directly, it is necessary to use a common space score for both the House and the Senate. See, for example, Royce Carroll, Jeff Lewis, James Lo, Nolan McCarty, Keith Poole and Howard Rosenthal, ‘ “Common Space” (Joint House and Senate) dw-nominate Scores with Bootstrapped Standard Errors’ (2009).
44 The kappa coefficient is often used to measure inter-rater agreement in annotation. We followed the kappa computation procedure described at http://faculty.vassar.edu/lowry/kappa.html.
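For readers unfamiliar with the statistic, the following is a minimal sketch of the standard unweighted Cohen's kappa for two raters. It is offered as an illustration of the general formula, not as a reproduction of the procedure at the URL above; the function name `cohen_kappa` and the toy labels are assumptions.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa for two equal-length label sequences."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: share of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the raters' marginal label proportions.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Toy example with hypothetical labels: the raters agree on 3 of 4 items,
# while chance agreement from the marginals is 0.5.
rater_1 = ["L", "L", "C", "C"]
rater_2 = ["L", "C", "C", "C"]
kappa = cohen_kappa(rater_1, rater_2)
```

Kappa rescales raw agreement so that 0 corresponds to chance-level agreement and 1 to perfect agreement.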
45 Yu, Bei, Kaufmann, Stefan and Diermeier, Daniel, ‘Classifying Party Affiliation from Political Speech’, Journal of Information Technology and Politics, 5 (2008), 33–48. The lower accuracy is a consequence of a smaller dataset.
46 Yu, Bei, Kaufmann, Stefan and Diermeier, Daniel, ‘Exploring the Characteristics of Opinion Expressions for Political Opinion Classification’, Proceedings of the 9th Annual International Conference on Digital Government Research (dg.o 2008) (Montreal, May 2008), pp. 82–89.
47 We thank an anonymous referee for pointing out this possibility.
48 Høyland, Bjørn and Godbout, Jean-François, ‘Predicting Party Group Affiliation from European Parliament Debates’ (paper presented at the European Consortium for Political Research Meeting of the Standing Group on the European Union (Riga: Latvia, 2008)).
49 Poole, Spatial Models of Parliamentary Voting.
50 Except during the Era of Good Feelings (1817–25) and the period surrounding the Civil War (1853–76); Poole and Rosenthal, Congress; Poole and Rosenthal, Ideology and Congress.
51 See, for example, Lakoff, George, Moral Politics: How Liberals and Conservatives Think (Chicago: The University of Chicago Press, 2002).
52 The large dot in the equation refers to the operation of the inner product of two vectors.
53 Leopold, Edda and Kindermann, Jörg, ‘Text Categorization with Support Vector Machines: How to Represent Texts in Input Space?’ Machine Learning, 46 (2002), 423–444.
54 The abbreviation sv stands for an arbitrary support vector. In the SVMlight software package, the first support vector (according to its order in the input data) was used to compute b.
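The role of the inner product and of the support vector used to fix b can be illustrated with a small sketch. The equation itself is not reproduced in these notes, so the code assumes the common textbook form of a linear SVM, f(x) = w · x + b with w = Σ αᵢ yᵢ xᵢ, and a hard-margin toy problem; the sign convention for b, the function names and the data are all assumptions and may differ from those used in SVMlight.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def weight_vector(alphas, labels, vectors):
    """w = sum_i alpha_i * y_i * x_i (assumed textbook form of the linear SVM)."""
    dim = len(vectors[0])
    w = [0.0] * dim
    for alpha, y, x in zip(alphas, labels, vectors):
        for d in range(dim):
            w[d] += alpha * y * x[d]
    return w


def bias_from_support_vector(w, x_sv, y_sv):
    """For a support vector on the margin, y_sv * (w . x_sv + b) = 1.

    With y_sv in {-1, +1} this gives b = y_sv - w . x_sv.
    """
    return y_sv - dot(w, x_sv)


def decision_value(w, b, x):
    """Signed score; its sign gives the predicted class."""
    return dot(w, x) + b


# Hypothetical two-point problem in which both points are support vectors.
vectors = [(2.0, 0.0), (0.0, 0.0)]
labels = [1, -1]
alphas = [0.5, 0.5]  # satisfy sum_i alpha_i * y_i = 0

w = weight_vector(alphas, labels, vectors)
# Use the first support vector in input order, as the note describes.
b = bias_from_support_vector(w, vectors[0], labels[0])
```

In this toy problem either support vector yields the same b, which is why the choice of sv can be arbitrary when the support vectors lie exactly on the margin.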
55 Joachims, ‘SVMlight’.
56 Chang and Lin, ‘Library for Support Vector Machines’.
* Department of Managerial Economics and Decision Sciences (MEDS) and Ford Motor Company Center for Global Citizenship, Kellogg School of Management and Northwestern Institute on Complex Systems (NICO), Northwestern University (email: email@example.com); Department of Political Science, University of Montreal; School of Information Studies, Syracuse University; and Department of Linguistics, Northwestern University, respectively. The authors wish to thank seminar participants at the annual meetings of the American Political Science Association and the Midwest Political Science Association, as well as the members of the Institutions, Organizations and Growth research group at the Canadian Institute for Advanced Research (CIFAR) for their helpful comments. Financial support from the Ford Motor Company Center for Global Citizenship, Kellogg School of Management, Northwestern University, and CIFAR is gratefully acknowledged.