Crowd-sourced Text Analysis: Reproducible and Agile Production of Political Data

Published online by Cambridge University Press: 04 August 2016

KENNETH BENOIT*, London School of Economics and Trinity College
DREW CONWAY*, New York University
BENJAMIN E. LAUDERDALE*, London School of Economics and Political Science
MICHAEL LAVER*, New York University
SLAVA MIKHAYLOV*, University College London
* Kenneth Benoit is Professor, London School of Economics and Trinity College, Dublin (kbenoit@lse.ac.uk).
Drew Conway, New York University.
Benjamin E. Lauderdale is Associate Professor, London School of Economics.
Michael Laver is Professor, New York University.
Slava Mikhaylov is Senior Lecturer, University College London.

Abstract

Empirical social science often relies on data that are not observed in the field, but are transformed into quantitative variables by expert researchers who analyze and interpret qualitative raw sources. While generally considered the most valid way to produce data, this expert-driven process is inherently difficult to replicate or to assess on grounds of reliability. Using crowd-sourcing to distribute text for reading and interpretation by massive numbers of nonexperts, we generate results comparable to those using experts to read and interpret the same texts, but do so far more quickly and flexibly. Crucially, the data we collect can be reproduced and extended transparently, making crowd-sourced datasets intrinsically reproducible. This focuses researchers’ attention on the fundamental scientific objective of specifying reliable and replicable methods for collecting the data needed, rather than on the content of any particular dataset. We also show that our approach works straightforwardly with different types of political text, written in different languages. While findings reported here concern text analysis, they have far-reaching implications for expert-generated data in the social sciences.

Type: Research Article
Copyright © American Political Science Association 2016

Supplementary material: PDF

BENOIT supplementary material
