Text as Data: The Promise and Pitfalls of Automatic Content Analysis Methods for Political Texts

  • Justin Grimmer and Brandon M. Stewart
Abstract

Politics and political conflict often occur in the written and spoken word. Scholars have long recognized this, but the massive costs of analyzing even moderately sized collections of texts have hindered their use in political science research. Here lies the promise of automated text analysis: it substantially reduces the costs of analyzing large collections of text. We provide a guide to this exciting new area of research and show how, in many instances, the methods have already obtained part of their promise. But there are pitfalls to using automated methods—they are no substitute for careful thought and close reading and require extensive and problem-specific validation. We survey a wide range of new methods, provide guidance on how to validate the output of the models, and clarify misconceptions and errors in the literature. To conclude, we argue that for automated text methods to become a standard tool for political scientists, methodologists must contribute new methods and new methods of validation.

Corresponding author
e-mail: jgrimmer@stanford.edu (corresponding author)
Footnotes

Authors' note: For helpful comments and discussions, we thank participants in Stanford University's Text as Data class, Mike Alvarez, Dan Hopkins, Gary King, Kevin Quinn, Molly Roberts, Mike Tomz, Hanna Wallach, Yuri Zhurkov, and Frances Zlotnick. Replication data are available on the Political Analysis Dataverse at http://hdl.handle.net/1902.1/18517. Supplementary materials for this article are available on the Political Analysis Web site.

Political Analysis
  • ISSN: 1047-1987
  • EISSN: 1476-4989
  • URL: /core/journals/political-analysis