Two Wrongs Make a Right: Addressing Underreporting in Binary Data from Multiple Sources

  • Scott J. Cook (a1), Betsabe Blas (a2), Raymond J. Carroll (a3) (a4) and Samiran Sinha (a3)
Abstract

Media-based event data (i.e., data compiled from reporting by media outlets) are widely used in political science research. However, events of interest (e.g., strikes, protests, conflict) are often underreported by these primary and secondary sources, producing incomplete data that risk inconsistency and bias in subsequent analysis. While general strategies exist to help ameliorate this bias, these methods do not make full use of the information often available to researchers. Specifically, much of the event data used in the social sciences is drawn from multiple, overlapping news sources (e.g., Agence France-Presse, Reuters). Therefore, we propose a novel maximum likelihood estimator that corrects for misclassification in data arising from multiple sources. In the most general formulation of our estimator, researchers can specify separate sets of predictors for the true-event model and for each of the misclassification models characterizing whether a source fails to report on an event. As such, researchers are able to test theories on both the causes of an event of interest and the reporting on it. Simulations show that our technique regularly outperforms current strategies that neglect misclassification, the unique features of the data-generating process, or both. We also illustrate the utility of this method with a model of repression using the Social Conflict in Africa Database.
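To make the idea of a likelihood that combines several underreporting sources concrete, the following is a minimal sketch and not the authors' estimator. It assumes a logit link for the true-event model, a constant miss probability per source (the abstract allows predictors here), no false positives, and sources that miss events independently given that an event occurred. All names (`observation_likelihood`, `miss_probs`, `beta`) are hypothetical.

```python
import math


def logistic(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def observation_likelihood(x, reports, beta, miss_probs):
    """Likelihood of one unit's vector of source reports (illustrative only).

    x          -- predictors for the true-event model
    reports    -- 0/1 report from each source
    beta       -- coefficients of the assumed logit true-event model
    miss_probs -- assumed constant probability that each source misses a real event
    """
    # Probability that the event truly occurred (logit link is an assumption).
    pi = logistic(sum(b * xi for b, xi in zip(beta, x)))

    # Probability of this report pattern given that the event occurred,
    # assuming sources miss events independently of one another.
    pattern = 1.0
    for y, a in zip(reports, miss_probs):
        pattern *= (1.0 - a) if y == 1 else a

    if any(y == 1 for y in reports):
        # With no false positives, any report implies a true event.
        return pi * pattern
    # All sources silent: either no event, or an event every source missed.
    return (1.0 - pi) + pi * pattern


def log_likelihood(X, Y, beta, miss_probs):
    """Sum of log-likelihood contributions over all units."""
    return sum(
        math.log(observation_likelihood(x, y, beta, miss_probs))
        for x, y in zip(X, Y)
    )
```

Under these assumptions, a unit with no reports contributes the probability of "no event" plus the probability of "an event that every source missed", which is what distinguishes this likelihood from a standard logit on the pooled reports. Maximizing `log_likelihood` over `beta` and `miss_probs` with a numerical optimizer would give estimates for this simplified model only.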

Corresponding author
* Email: sjcook@tamu.edu
Footnotes

Authors’ note: For their helpful comments and suggestions, thanks to Kenneth Benoit, Graeme Blair, Chad Hazlett, Florian Hollenbach, Idean Salehyan, Nils Weidmann, the reviewers, and the editor(s). Replication materials are available online at Cook et al. (2016). All inquiries should be sent to the corresponding author at sjcook@tamu.edu.

Contributing Editor: R. Michael Alvarez

Linked references

Christopher H. Achen. 2002. Toward a new political methodology: Microfoundations and ART. Annual Review of Political Science 5(1):423–450.

Raymond J. Carroll, David Ruppert, Leonard A. Stefanski, and Ciprian M. Crainiceanu. 2006. Measurement error in nonlinear models: A modern perspective. Boca Raton, FL: CRC Press.

Christian Davenport. 2007. State repression and political order. Annual Review of Political Science 10:1–23.

Christian Davenport, and Patrick Ball. 2002. Views to a kill: Exploring the implications of source selection in the case of Guatemalan state terror, 1977–1995. Journal of Conflict Resolution 46(3):427–450.

Jennifer Earl, Andrew Martin, John D. McCarthy, and Sarah A. Soule. 2004. The use of newspaper data in the study of collective action. Annual Review of Sociology 30:65–80.

Jerry A. Hausman, Jason Abrevaya, and Fiona M. Scott-Morton. 1998. Misclassification of the dependent variable in a discrete-response setting. Journal of Econometrics 87(2):239–269.

Cullen S. Hendrix, and Idean Salehyan. 2015. No news is good news: Mark and recapture for event data when reporting probabilities are less than one. International Interactions 41(2):392–406.

Simon Hug. 2003. Selection bias in comparative research: The case of incomplete data sets. Political Analysis 11(3):255–274.

Simon Hug. 2009. The effect of misclassifications in probit models: Monte Carlo simulations and applications. Political Analysis 18(1):78–102.

Kosuke Imai, and Teppei Yamamoto. 2010. Causal inference with differential measurement error: Nonparametric identification and sensitivity analysis. American Journal of Political Science 54(2):543–560.

Steven C. Poe, and C. Neal Tate. 1994. Repression of human rights to personal integrity in the 1980s: A global analysis. American Political Science Review 88(4):853–872.

Idean Salehyan, Cullen S. Hendrix, Jesse Hamner, Christina Case, Christopher Linebarger, Emily Stull, and Jennifer Williams. 2012. Social conflict in Africa: A new database. International Interactions 38(4):503–511.

Philip A. Schrodt, and Deborah J. Gerner. 1994. Validity assessment of a machine-coded event data set for the Middle East, 1982–92. American Journal of Political Science 38(3):825–854.

Peter F. Trumbore, and Byungwon Woo. 2014. Smugglers' blues: Examining why countries become narcotics transit states using the new international narcotics production and transit (INAPT) data set. International Interactions 40(5):763–787.

Nils B. Weidmann. 2014. On the accuracy of media-based conflict event data. Journal of Conflict Resolution 59(6):1129–1149.

Political Analysis
  • ISSN: 1047-1987
  • EISSN: 1476-4989
  • URL: /core/journals/political-analysis
Supplementary Materials

Cook supplementary material 1 (217 KB)
