
The Statistics of Causal Inference: A View from Political Methodology

Published online by Cambridge University Press: 04 January 2017

Luke Keele
Affiliation: Department of Political Science, 211 Pond Lab, Penn State University, University Park, PA 19130
e-mail: ljk20@psu.edu (corresponding author)

Abstract

Many areas of political science focus on causal questions. Evidence from statistical analyses is often used to make the case for causal relationships. While statistical analyses can help establish causal relationships, they can also provide strong evidence of causality where none exists. In this essay, I provide an overview of the statistics of causal inference. Instead of focusing on specific statistical methods, such as matching, I focus on the assumptions needed to give statistical estimates a causal interpretation. Such assumptions are often referred to as identification assumptions, and they are critical to any statistical analysis of causal effects. I outline a wide range of identification assumptions and highlight the design-based approach to causal inference. I conclude with an overview of statistical methods that are frequently used for causal inference.
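To fix ideas, here is a minimal sketch of one canonical identification assumption in potential-outcomes notation (a standard textbook example of selection on observables, not a formula drawn from the essay itself). If treatment assignment is ignorable given observed covariates, the average treatment effect is identified from observable quantities:

\[
\{Y(1),\, Y(0)\} \perp\!\!\!\perp T \mid X
\quad\Longrightarrow\quad
\tau \equiv E[Y(1) - Y(0)] = E\big[\,E[Y \mid T = 1, X] - E[Y \mid T = 0, X]\,\big],
\]

provided there is overlap, \(0 < \Pr(T = 1 \mid X) < 1\). The conditional-independence statement on the left cannot be tested with the data at hand; it must be defended by the research design, which is why such assumptions, rather than the choice of estimator, carry the causal interpretation.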

Type
Articles
Copyright
Copyright © The Author 2015. Published by Oxford University Press on behalf of the Society for Political Methodology 


Footnotes

Author's note: For comments I thank the editors and the four anonymous reviewers. I also thank Rocío Titiunik, Jasjeet Sekhon, Paul Rosenbaum, and Dylan Small for many insightful conversations about these topics over the years. In the online Supplementary Materials, I provide further information about software tools to implement many of the methodologies discussed in this essay. Supplementary materials for this article are available on the Political Analysis Web site.
