Validating Estimates of Latent Traits from Textual Data Using Human Judgment as a Benchmark

  • Will Lowe and Kenneth Benoit

Automated and statistical methods for estimating latent political traits and classes from textual data hold great promise, because virtually every political act involves the production of text. Statistical models of natural language features, however, are heavily laden with unrealistic assumptions about the process that generates these data, including the stochastic process of text generation, the functional link between political variables and observed text, and the nature of the variables (and dimensions) on which observed text should be conditioned. While acknowledging statistical models of latent traits to be “wrong,” political scientists nonetheless treat their results as sufficiently valid to be useful. In this article, we address the issue of substantive validity in the face of potential model failure, in the context of unsupervised scaling methods of latent traits. We critically examine one popular parametric measurement model of latent traits for text and then compare its results to systematic human judgments of the texts as a benchmark for validity.

Authors' note: Replication materials for this article are available from the Political Analysis dataverse. Supplementary materials for this article are available on the Political Analysis Web site.

Political Analysis
  • ISSN: 1047-1987
  • EISSN: 1476-4989
  • URL: /core/journals/political-analysis
Supplementary materials

Lowe and Benoit supplementary material: PDF (316 KB)

