Understanding Wordscores

Published online by Cambridge University Press:  04 January 2017

Will Lowe*
Affiliation:
Methods and Data Institute, School of Politics and International Relations, University of Nottingham, Nottingham, NG7 2RD, UK

Abstract

Wordscores is a widely used procedure for inferring policy positions, or scores, for new documents on the basis of scores for words derived from documents with known scores. It is computationally straightforward and requires no distributional assumptions, but it has unresolved practical and theoretical problems. In applications, estimated document scores are on the wrong scale, and the theoretical development does not specify a statistical model, so it is unclear what assumptions the method makes about political text and how to tell whether they fit particular text analysis applications. The first part of the paper demonstrates that badly scaled document score estimates reflect deeper problems with the method. The second part shows how to understand Wordscores as an approximation to correspondence analysis, which itself approximates a statistical ideal point model for words. Problems with the method are identified with the conditions under which these layers of approximation fail to ensure consistent and unbiased estimation of the parameters of the ideal point model.
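
The procedure the abstract describes can be sketched in a few lines. This is a minimal illustration of the standard form of Wordscores, not code from the paper: each word's score is the average of the known document scores, weighted by the probability of each reference document given the word, and a document's score is the frequency-weighted mean of its word scores. The function names, the whitespace tokenization, and the toy texts are assumptions made for the example.

```python
from collections import Counter


def word_scores(ref_texts, ref_scores):
    """Score each word as the mean reference score, weighted by P(document | word)."""
    # Relative frequency of every word within each reference document.
    rel_freqs = []
    for text in ref_texts:
        counts = Counter(text.split())
        total = sum(counts.values())
        rel_freqs.append({w: c / total for w, c in counts.items()})

    vocab = set().union(*rel_freqs)
    scores = {}
    for w in vocab:
        # Normalizing the relative frequencies across documents gives P(document | word).
        norm = sum(f.get(w, 0.0) for f in rel_freqs)
        scores[w] = sum(f.get(w, 0.0) / norm * a
                        for f, a in zip(rel_freqs, ref_scores))
    return scores


def document_score(text, scores):
    """Score a document as the frequency-weighted mean of its scored words."""
    # Words that never occur in the reference documents are ignored.
    counts = Counter(w for w in text.split() if w in scores)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("document shares no words with the reference documents")
    return sum(c / total * scores[w] for w, c in counts.items())


# Toy reference documents (hypothetical) with known scores -1 and +1.
refs = ["tax tax cut the", "spend spend welfare the"]
known = [-1.0, 1.0]
ws = word_scores(refs, known)

# Re-scoring the reference documents themselves does not recover -1 and +1.
print(document_score(refs[0], ws))  # -0.75
print(document_score(refs[1], ws))  # 0.75
```

In this toy case the shared word "the" receives a score of 0, so the two reference documents come back as -0.75 and 0.75 rather than -1 and 1. That compression toward the middle is a small instance of the badly scaled document scores the abstract refers to.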

Type
Special Issue: The Statistical Analysis of Political Text
Copyright
Copyright © The Author 2008. Published by Oxford University Press on behalf of the Society for Political Methodology 

Footnotes

Author's note: I would like to thank Ken Benoit, Mik Laver, Cees van der Eijk, and Wijbrandt van Schuur for useful comments and discussion. The remaining errors are my own.
