Improving Predictions using Ensemble Bayesian Model Averaging

  • Jacob M. Montgomery (a1), Florian M. Hollenbach (a2) and Michael D. Ward (a2)
Abstract

We present ensemble Bayesian model averaging (EBMA) and illustrate its ability to aid scholars in the social sciences to make more accurate forecasts of future events. In essence, EBMA improves prediction by pooling information from multiple forecast models to generate ensemble predictions similar to a weighted average of component forecasts. The weight assigned to each forecast is calibrated via its performance in some validation period. The aim is not to choose some “best” model, but rather to incorporate the insights and knowledge implicit in various forecasting efforts via statistical postprocessing. After presenting the method, we show that EBMA increases the accuracy of out-of-sample forecasts relative to component models in three applied examples: predicting the occurrence of insurgencies around the Pacific Rim, forecasting vote shares in U.S. presidential elections, and predicting the votes of U.S. Supreme Court Justices.
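The weighting idea in the abstract can be sketched in a few lines of code. The example below is a minimal illustration, not the authors' implementation: it assumes binary outcomes, treats each component forecast as a fixed Bernoulli probability strictly between 0 and 1, and estimates mixture weights with a plain EM loop over a validation period. The function names (`fit_weights`, `ensemble_forecast`) and the toy data are hypothetical, and the bias-correction and calibration steps that a full EBMA procedure may involve are omitted.

```python
def component_likelihood(p, y):
    """Bernoulli likelihood of outcome y (0 or 1) under forecast probability p."""
    return p if y == 1 else 1.0 - p


def fit_weights(forecasts, outcomes, iterations=200):
    """Estimate mixture weights by EM on a validation period.

    forecasts: list of K lists, each holding T predicted probabilities
               (assumed strictly between 0 and 1).
    outcomes:  list of T observed 0/1 outcomes.
    Returns a list of K non-negative weights that sum to 1.
    """
    k = len(forecasts)
    t = len(outcomes)
    weights = [1.0 / k] * k  # start from equal weights
    for _ in range(iterations):
        totals = [0.0] * k
        for i in range(t):
            # E-step: share of observation i attributed to each model
            joint = [
                weights[m] * component_likelihood(forecasts[m][i], outcomes[i])
                for m in range(k)
            ]
            norm = sum(joint)
            for m in range(k):
                totals[m] += joint[m] / norm
        # M-step: new weight is the average attributed share
        weights = [s / t for s in totals]
    return weights


def ensemble_forecast(weights, new_forecasts):
    """Weighted average of the component forecasts for one new case."""
    return sum(w * p for w, p in zip(weights, new_forecasts))


# Toy validation period (hypothetical data): one sharp model, one vague model.
outcomes = [1, 0, 1, 1, 0, 0, 1, 0]
sharp = [0.9, 0.1, 0.8, 0.7, 0.2, 0.3, 0.9, 0.1]
vague = [0.5, 0.5, 0.4, 0.6, 0.5, 0.6, 0.5, 0.4]

weights = fit_weights([sharp, vague], outcomes)
prediction = ensemble_forecast(weights, [0.8, 0.3])
```

In this toy setup the model that tracked the validation outcomes more closely receives the larger weight, so the ensemble prediction for a new case leans toward that model's forecast. No model is discarded outright by construction, although a dominated model's weight can shrink toward zero.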

Corresponding author
e-mail: michael.d.ward@duke.edu (corresponding author)
Footnotes

Authors' note: For generously sharing their data and models with us, we thank Alan Abramowitz, James Campbell, Robert Erikson, Ray Fair, Douglas Hibbs, Michael Lewis-Beck, Andrew D. Martin, Kevin Quinn, Stephen Shellman, Charles Tien, and Christopher Wlezien. We especially want to thank Adrian Raftery and Brendan Nyhan for their encouragement and feedback as this project evolved. The editor and the reviewers of Political Analysis provided especially salient suggestions that substantially improved our research.

Political Analysis
  • ISSN: 1047-1987
  • EISSN: 1476-4989