
Reading Between the Lines: Prediction of Political Violence Using Newspaper Text


This article provides a new methodology to predict armed conflict by using newspaper text. Through machine learning, vast quantities of newspaper text are reduced to interpretable topics. These topics are then used in panel regressions to predict the onset of conflict. We propose the use of the within-country variation of these topics to predict the timing of conflict. This allows us to avoid the tendency of predicting conflict only in countries where it occurred before. We show that the within-country variation of topics is a good predictor of conflict and becomes particularly useful when risk arises in previously peaceful countries. Two aspects seem to be responsible for these features. Topics provide depth because they consist of long, changing lists of terms, which lets them capture the changing context of conflict. At the same time, topics provide width because they are summaries of the full text, including stabilizing factors.
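
The within-country idea described above can be illustrated with a minimal sketch, assuming topic shares have already been estimated for each country and year. The function name `within_country_deviation`, the input format, and all numbers are hypothetical and are not taken from the article. The sketch shows only the demeaning step that a country fixed effect performs in a linear panel regression, not the authors' estimation procedure.

```python
from collections import defaultdict


def within_country_deviation(rows):
    """Subtract each country's own mean topic share from its yearly values.

    rows: list of (country, year, share) tuples (hypothetical input format).
    Returns a dict mapping (country, year) to the demeaned share.
    """
    by_country = defaultdict(list)
    for country, _year, share in rows:
        by_country[country].append(share)
    # Country-specific mean of the topic share over all observed years.
    means = {c: sum(v) / len(v) for c, v in by_country.items()}
    return {(c, y): s - means[c] for c, y, s in rows}


# Hypothetical shares of a conflict-related topic in news about two countries:
# country A has a high level throughout, country B a low one.
rows = [
    ("A", 2000, 0.30), ("A", 2001, 0.30), ("A", 2002, 0.36),
    ("B", 2000, 0.02), ("B", 2001, 0.02), ("B", 2002, 0.08),
]
dev = within_country_deviation(rows)
for key in sorted(dev):
    print(key, round(dev[key], 3))
```

In this made-up example, country B has a much lower topic level than country A, yet both show the same rise relative to their own history in 2002. A predictor built on such deviations therefore treats the two changes alike and does not simply flag the country with the higher level.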

Corresponding author
Hannes Mueller is a tenured scientist at IAE (CSIC), Barcelona GSE Institut d’Analisi Economica, CSIC Campus UAB, 08193 Bellaterra, Spain.
Christopher Rauh is an Assistant Professor at University of Montreal, Département de Sciences Économiques, Université de Montréal, C.P.6128 succ. Centre-Ville, Montréal H3C 3J7, Canada.

We thank Tim Besley, Melissa Dell, Vincenzo Galasso, Hector Galindo, Matt Gentzkow, Stephen Hansen, Ethan Kapstein, Daniel Ohayon, Akash Raja, Bernhard Reinsberg, Anand Shrivastava, Ron Smith, Jack Willis, Stephane Wolton, and the participants of the workshops and conferences ENCoRe Barcelona, Political Economy Cambridge (internal), EPCS Freiburg, ESOC in Washington, Barcelona GSE Calvo-Armengol, NBER SI Economics of National Security, Conflict at IGIER, and the seminars PSPE at LSE, BBE at WZB, and Macro Lunch Cambridge for valuable feedback. We are grateful to Alex Angelini, Lavinia Piemontese, and Bruno Conte Leite for excellent research assistance. We thank the Barcelona GSE under the Severo Ochoa Programme for financial assistance. All errors are ours.

American Political Science Review
  • ISSN: 0003-0554
  • EISSN: 1537-5943
  • URL: /core/journals/american-political-science-review
Supplementary materials

Mueller and Rauh Dataset

Mueller and Rauh supplementary material
Online Appendix, PDF (890 KB)

