
Retrospective Causal Inference with Machine Learning Ensembles: An Application to Anti-recidivism Policies in Colombia

  • Cyrus Samii, Laura Paler and Sarah Zukerman Daly


We present new methods to estimate causal effects retrospectively from micro data with the assistance of a machine learning ensemble. This approach overcomes two important limitations in conventional methods like regression modeling or matching: (i) ambiguity about the pertinent retrospective counterfactuals and (ii) potential misspecification, overfitting, and otherwise bias-prone or inefficient use of a large identifying covariate set in the estimation of causal effects. Our method targets the analysis toward a well-defined “retrospective intervention effect” based on hypothetical population interventions and applies a machine learning ensemble that allows data to guide us, in a controlled fashion, on how to use a large identifying covariate set. We illustrate with an analysis of policy options for reducing ex-combatant recidivism in Colombia.
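To make the idea of a hypothetical population intervention concrete, the following is a minimal sketch and not the authors' estimator. It assumes simulated data, two toy cell-mean learners, a convex ensemble weight chosen by cross-validated squared error, and a simple plug-in (outcome-regression) contrast: the mean predicted outcome if everyone were assigned one exposure level, minus the observed mean outcome. All names (make_data, fit_cell_means, ensemble_weight, intervention_effect) and the data-generating values are hypothetical.

```python
import random

def make_data(n=400, seed=0):
    """Simulate hypothetical micro data: covariate x, binary exposure a, binary outcome y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        a = 1 if rng.random() < (0.7 if x else 0.3) else 0  # exposure depends on x
        p = 0.6 - 0.2 * a - 0.2 * x                         # assumed outcome risk
        y = 1 if rng.random() < p else 0
        rows.append((x, a, y))
    return rows

def fit_cell_means(train, key):
    """Return a predictor of E[y] built from cell means; cells are defined by key(x, a)."""
    sums, counts = {}, {}
    for x, a, y in train:
        k = key(x, a)
        sums[k] = sums.get(k, 0.0) + y
        counts[k] = counts.get(k, 0) + 1
    overall = sum(y for _, _, y in train) / len(train)

    def predict(x, a):
        k = key(x, a)
        # Fall back to the overall mean for cells unseen in training.
        return sums[k] / counts[k] if k in counts else overall

    return predict

# Two toy candidate learners standing in for a richer library of algorithms.
LEARNERS = [
    lambda train: fit_cell_means(train, lambda x, a: a),       # ignores the covariate
    lambda train: fit_cell_means(train, lambda x, a: (x, a)),  # fully stratified
]

def cv_predictions(rows, folds=5):
    """Out-of-fold predictions from each learner (fold = row index modulo folds)."""
    preds = [[0.0] * len(rows) for _ in LEARNERS]
    for f in range(folds):
        train = [r for i, r in enumerate(rows) if i % folds != f]
        for j, fit in enumerate(LEARNERS):
            model = fit(train)
            for i, (x, a, _) in enumerate(rows):
                if i % folds == f:
                    preds[j][i] = model(x, a)
    return preds

def ensemble_weight(rows, grid=21):
    """Grid-search the convex weight on learner 0 that minimizes CV squared error."""
    p0, p1 = cv_predictions(rows)

    def loss(w):
        return sum((y - (w * p0[i] + (1 - w) * p1[i])) ** 2
                   for i, (_, _, y) in enumerate(rows))

    return min((k / (grid - 1) for k in range(grid)), key=loss)

def intervention_effect(rows, a_star):
    """Plug-in contrast: mean prediction with a set to a_star for all, minus observed mean."""
    w = ensemble_weight(rows)
    m0, m1 = (fit(rows) for fit in LEARNERS)
    counterfactual = sum(w * m0(x, a_star) + (1 - w) * m1(x, a_star)
                         for x, _, _ in rows) / len(rows)
    observed = sum(y for _, _, y in rows) / len(rows)
    return counterfactual - observed

rows = make_data()
effect = intervention_effect(rows, a_star=1)
print(round(effect, 3))
```

The contrast compares the population as it would look under the intervention with the population as actually observed, so it is not a treated-versus-control comparison. The cross-validated weight is one simple way to let the data decide, in a controlled fashion, how much to rely on each learner.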



Authors’ note: Authors are listed in reverse alphabetical order and are equal contributors to the project. All replication materials are available at the Political Analysis Dataverse (article url: ). We thank Carolina Serrano for excellent research assistance in Colombia and the team at Fundación Ideas para la Paz for their collaboration in the data collection. We also thank the Organization of American States, Misión de Apoyo al Proceso de Paz and the Agencia Colombiana para la Reintegración for their collaboration. Daly acknowledges funding from the Swedish Foreign Ministry, the Smith Richardson Foundation, and the Carroll L. Wilson Award. For helpful discussions, the authors thank Michael Alvarez, two anonymous Political Analysis reviewers, Deniz Aksoy, Peter Aronow, Neal Beck, Matthew Blackwell, Drew Dimmery, Ryan Jablonski, Michael Peress, Fredrik Savje, Maya Sen, Teppei Yamamoto, Rodrigo Zarazaga, and seminar participants at the American Political Science Association annual meetings, European Political Science Association annual meetings, Empirical Studies of Conflict working group, Massachusetts Institute of Technology, Midwest Political Science Association annual meetings, New York University, and the University of Rochester. Supplementary materials for this article are available on the Political Analysis website.



Political Analysis
  • ISSN: 1047-1987
  • EISSN: 1476-4989