What Can We Learn from Predictive Modeling?

Skyler J. Cranmer and Bruce A. Desmarais
Abstract

The large majority of inferences drawn in empirical political research follow from model-based associations (e.g., regression). Here, we articulate the benefits of predictive modeling as a complement to this approach. Predictive models aim to specify a probabilistic model that provides a good fit to testing data that were not used to estimate the model’s parameters. Our goals are threefold. First, we review the central benefits of this under-utilized approach from a perspective uncommon in the existing literature: we focus on how predictive modeling can be used to complement and augment standard associational analyses. Second, we advance the state of the literature by laying out a simple set of benchmark predictive criteria. Third, we illustrate our approach through a detailed application to the prediction of interstate conflict.
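The idea of judging a model by its fit to data that were not used in estimation can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the authors' procedure or their benchmark criteria: the data are simulated, the model is a one-covariate logistic regression fit by plain gradient ascent, and every name (`simulate`, `fit_logistic`, `mean_log_loss`) is hypothetical. It compares held-out log loss for the fitted model against a base-rate-only reference.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def simulate(n, rng):
    """Simulated data (an assumption for illustration): one covariate, binary outcome."""
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [1 if rng.random() < sigmoid(-1.0 + 2.0 * x) else 0 for x in xs]
    return xs, ys


def fit_logistic(xs, ys, steps=500, lr=1.0):
    """Estimate intercept and slope by gradient ascent on the mean log-likelihood."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            resid = y - sigmoid(b0 + b1 * x)
            g0 += resid
            g1 += resid * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1


def mean_log_loss(probs, ys):
    """Average negative log-likelihood of binary outcomes (lower is better)."""
    eps = 1e-12
    total = 0.0
    for p, y in zip(probs, ys):
        p = min(max(p, eps), 1.0 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(ys)


rng = random.Random(0)
x_train, y_train = simulate(500, rng)  # used to estimate parameters
x_test, y_test = simulate(500, rng)    # never seen during estimation

b0, b1 = fit_logistic(x_train, y_train)
model_probs = [sigmoid(b0 + b1 * x) for x in x_test]

# Reference predictor: the training-set base rate, ignoring the covariate.
base_rate = sum(y_train) / len(y_train)
baseline_probs = [base_rate] * len(y_test)

model_loss = mean_log_loss(model_probs, y_test)
baseline_loss = mean_log_loss(baseline_probs, y_test)
print(f"held-out log loss, model:    {model_loss:.3f}")
print(f"held-out log loss, baseline: {baseline_loss:.3f}")
```

A lower held-out log loss for the fitted model than for the base-rate reference indicates that the covariate carries predictive information beyond the marginal outcome rate. Both numbers are computed only on the testing half.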

Corresponding author
* Email: cranmer.12@osu.edu
Footnotes

Authors’ note: Many thanks to Alison Craig for research assistance. Sincere thanks also to Matt Blackwell and Michael Neblo for helpful comments on an earlier draft. The authors are grateful for the support of the National Science Foundation (SES-1558661, SES-1619644, SES-1637089, CISE-1320219, SES-1357622, SES-1514750, and SES-1461493) and the Alexander von Humboldt Foundation. Replication data are posted to the Political Analysis Dataverse (Cranmer and Desmarais 2016a).

Contributing Editor: Jonathan Katz

Political Analysis
  • ISSN: 1047-1987
  • EISSN: 1476-4989
  • URL: /core/journals/political-analysis