
Transparency challenges in policy evaluation with causal machine learning: improving usability and accountability

Published online by Cambridge University Press: 18 October 2024

Patrick Rehill*
Affiliation:
Centre for Social Research and Methods, Australian National University, Canberra, ACT, Australia
Nicholas Biddle
Affiliation:
School of Politics & International Relations, Australian National University, Canberra, ACT, Australia
*Corresponding author: Patrick Rehill; Email: patrick.rehill@anu.edu.au

Abstract

Causal machine learning tools are beginning to see use in real-world policy evaluation tasks to flexibly estimate treatment effects. One issue with these methods is that the machine learning models used are generally black boxes, that is, there is no globally interpretable way to understand how a model makes estimates. This is a clear problem for governments that want to evaluate policy, as it is difficult to understand whether such models are functioning in ways that are fair, based on the correct interpretation of evidence, and transparent enough to allow for accountability if things go wrong. However, there has been little discussion of transparency problems in the causal machine learning literature and of how these might be overcome. This article explores why transparency issues are a problem for causal machine learning in public policy evaluation applications and considers ways these problems might be addressed through explainable AI tools and by simplifying models in line with interpretable AI principles. It then applies these ideas to a case study that uses a causal forest model to estimate conditional average treatment effects in a returns-to-education study. It shows that existing tools for understanding black-box predictive models are not well suited to causal machine learning, and that simplifying the model to make it interpretable leads to an unacceptable increase in error, at least in this application. It concludes that new tools are needed to properly understand causal machine learning models and the algorithms that fit them.
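As a rough illustration of the pipeline the abstract describes, the sketch below fits a causal forest to synthetic data to estimate conditional average treatment effects (CATEs) and then computes SHAP values for those estimates, mirroring the explainable-AI step behind the article's SHAP and waterfall plots. This is a minimal sketch only: the article's case study is not reproduced here, the data and variable names are invented, and the econml and shap packages are one possible toolchain (causal forest work often uses the grf package in R instead).

```python
# Minimal sketch, not the article's code: fit a causal forest on synthetic
# data and explain its CATE estimates with SHAP. Assumes the `econml` and
# `shap` packages are installed; the article's own case study may use a
# different implementation such as the grf package in R.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from econml.dml import CausalForestDML

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 5))                 # covariates (names here are invented)
T = rng.binomial(1, 0.5, size=n)            # binary treatment, e.g. completing a degree
tau = 0.5 + 0.8 * (X[:, 0] > 0)             # true heterogeneous effect (known only
                                            # because the data are synthetic)
Y = tau * T + X[:, 1] + rng.normal(size=n)  # outcome, e.g. log earnings

cf = CausalForestDML(
    model_y=RandomForestRegressor(n_estimators=100),
    model_t=RandomForestClassifier(n_estimators=100),
    discrete_treatment=True,
    random_state=0,
)
cf.fit(Y, T, X=X)

cate = cf.effect(X)                         # individual-level CATE estimates
print(f"mean CATE: {cate.mean():.2f}")      # should be close to 0.9 in this toy setup

# Explainable-AI step: SHAP attributes each CATE estimate to the covariates,
# analogous to aggregated SHAP and waterfall plots. Returns a nested dict of
# shap.Explanation objects keyed by outcome and treatment.
shap_values = cf.shap_values(X[:100])
```

Note that the SHAP values explain the fitted model's estimates, not the true drivers of effect heterogeneity; that gap between explaining a model and explaining an effect is part of the usability problem the article examines.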

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2024. Published by Cambridge University Press
Table 1. Variable importance for the causal forest

Figure 1. Aggregated SHAP plot explaining the HTE estimate across the distribution.

Figure 2. Individual-level waterfall plots.

Figure 3. Rashomon curve for the effect of heterogeneity-estimating model size, showing absolute loss as a proportion of the original estimates for a variety of model sizes. Note: The y-axis is cut off at 5 for clarity; a small proportion of points lie above this line, though they are still incorporated into the mean.

Table 2. Best linear projection of doubly robust scores onto selected covariates

Table 3. Average-level results of refutation tests

Figure 4. Effect of refutation tests on estimated treatment effects (treatment effects should be near zero; conditional averages are averages of doubly robust scores, not the individual estimates shown as points).

Figure 5. Propensity score densities.
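The refutation tests behind Table 3 and Figure 4 follow a placebo logic: once the treatment labels are randomly permuted, any real treatment-outcome relationship is broken, so a trustworthy estimator should report effects near zero. A minimal sketch of that check, continuing the illustrative synthetic setup above (again, the names and data are assumptions, not the authors' code):

```python
# Minimal sketch of a placebo refutation test, continuing the synthetic setup
# above (reuses rng, X, Y, T, CausalForestDML, and the scikit-learn forests).
# Permuting the treatment breaks any real treatment-outcome link, so the
# refitted model should estimate effects near zero.
T_placebo = rng.permutation(T)

cf_placebo = CausalForestDML(
    model_y=RandomForestRegressor(n_estimators=100),
    model_t=RandomForestClassifier(n_estimators=100),
    discrete_treatment=True,
    random_state=0,
)
cf_placebo.fit(Y, T_placebo, X=X)

# Averaging effect() output is a simplification: per the Figure 4 caption,
# the article aggregates doubly robust scores rather than raw CATE estimates.
placebo_cate = cf_placebo.effect(X)
print(f"placebo mean CATE (should be near zero): {placebo_cate.mean():.2f}")
```

A placebo average far from zero would point to bias in the estimation pipeline rather than to genuine treatment effect heterogeneity.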
