Bayesian adaptive trials for social policy

Published online by Cambridge University Press: 05 March 2025

Sally Cripps*
Affiliation:
Human Technology Institute, University of Technology Sydney, Ultimo, NSW, Australia; School of Mathematical and Physical Science, University of Technology Sydney, Ultimo, NSW, Australia
Anna Lopatnikova
Affiliation:
Human Technology Institute, University of Technology Sydney, Ultimo, NSW, Australia; Discipline of Business Analytics, The University of Sydney, Darlington, NSW, Australia
Hadi Mohasel Afshar
Affiliation:
Human Technology Institute, University of Technology Sydney, Ultimo, NSW, Australia; School of Mathematical and Physical Science, University of Technology Sydney, Ultimo, NSW, Australia
Ben Gales
Affiliation:
Paul Ramsay Foundation, Darlinghurst, NSW, Australia
Roman Marchant
Affiliation:
Human Technology Institute, University of Technology Sydney, Ultimo, NSW, Australia
Gilad Francis
Affiliation:
Human Technology Institute, University of Technology Sydney, Ultimo, NSW, Australia
Catarina Moreira
Affiliation:
Human Technology Institute, University of Technology Sydney, Ultimo, NSW, Australia
Alex Fischer
Affiliation:
Australian National University, Canberra, ACT, Australia
*Corresponding author: Sally Cripps; Email: sally.cripps@uts.edu.au

Abstract

This article proposes Bayesian adaptive trials (BATs) as both an efficient method to conduct trials and a unifying framework for the evaluation of social policy interventions, addressing the limitations inherent in traditional methods, such as randomized controlled trials. Recognizing the crucial need for evidence-based approaches in public policy, the proposed approach aims to lower barriers to the adoption of evidence-based methods and to align evaluation processes more closely with the dynamic nature of policy cycles. BATs, grounded in decision theory, offer a dynamic, “learning as we go” approach, enabling the integration of diverse information types and facilitating a continuous, iterative process of policy evaluation. BATs’ adaptive nature is particularly advantageous in policy settings, allowing for more timely and context-sensitive decisions. Moreover, BATs’ ability to value potential future information sources positions them as an optimal strategy for sequential data acquisition during policy implementation. While acknowledging the assumptions and models intrinsic to BATs, such as prior distributions and likelihood functions, this article argues that these are advantageous for decision-makers in social policy, effectively merging the best features of various methodologies.

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Open Practices
Open materials
Copyright
© The Author(s), 2025. Published by Cambridge University Press

Figure 1. Bayesian adaptive trials in a decision-theoretic framework.

Figure 2. Formal presentation of the Bayesian adaptive trials process as explained in Figure 1. The quantity $ U\left({a}_t|{\mathcal{D}}_{1:t-1}\right) $ is the expected utility of executing action $ {a}_t\in \mathcal{A} $, given data $ {\mathcal{D}}_{1:t-1} $, where the expectation is with respect to the joint distribution $ p\left(\boldsymbol{\theta}, {y}_t|{a}_t,{\mathcal{D}}_{1:t-1}\right) $.
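
Written out, the expected utility described in the Figure 2 caption takes the following form. This is a sketch under stated assumptions: $ u $ is a hypothetical symbol for the utility of a single outcome, $ {\mathcal{D}}_{1:t-1} $ is assumed notation for the data gathered before step $ t $, and the selection rule in the second line is the usual decision-theoretic reading rather than a formula quoted from the article.

```latex
\begin{align}
U\left(a_t \mid \mathcal{D}_{1:t-1}\right)
  &= \iint u\left(\boldsymbol{\theta}, y_t, a_t\right)\,
     p\left(\boldsymbol{\theta}, y_t \mid a_t, \mathcal{D}_{1:t-1}\right)\,
     d\boldsymbol{\theta}\, dy_t, \\
a_t^{*}
  &= \arg\max_{a_t \in \mathcal{A}} U\left(a_t \mid \mathcal{D}_{1:t-1}\right).
\end{align}
```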

Figure 3. Ten samples are drawn from the model defined by Equations (3) and (4), where $ m=6 $, $ b=1 $, and $ {\sigma}_{\varepsilon}=0.1 $. The true performance, $ \theta \left(\mathfrak{a}\right) $, is represented by the dashed line.

Figure 4. Posterior Gaussian process, $ \mathcal{GP}\left(\theta; {\mu}_{{\mathcal{D}}_{1:12}},{\Omega}_{{\mathcal{D}}_{1:12}}\right) $ for the fixed design trial, fitting the students’ learning gain, $ \theta $, versus external tutoring hours ($ \mathfrak{a} $) where the observations, $ {\mathcal{D}}_{1:12} $, are depicted by filled circles. The red dashed line shows the true learning gain, $ \theta \left(\mathfrak{a}\right) $, and the green line and shaded 95% credible interval represent $ {\mu}_{{\mathcal{D}}_{1:12}}\left(\mathfrak{a}\right)\pm 2\sqrt{\Omega_{{\mathcal{D}}_{1:12}}\left(\mathfrak{a},\mathfrak{a}\right)} $. Yellow line: The optimal tutoring hours where a combination of features (A) and (B) (formalized by (8)) maximizes utility.
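
The band described in the Figure 4 caption, posterior mean plus or minus two posterior standard deviations, can be sketched numerically as follows. This is a minimal illustration, not the article's implementation: the squared-exponential kernel, its hyperparameters, the zero prior mean, and the placeholder observations are all assumptions made here, and the noise level of 0.1 is borrowed from the Figure 3 caption purely as an example value.

```python
import numpy as np


def rbf_kernel(x1, x2, length_scale=2.0, variance=1.0):
    """Squared-exponential kernel (an assumed choice, not taken from the article)."""
    diff = x1[:, None] - x2[None, :]
    return variance * np.exp(-0.5 * (diff / length_scale) ** 2)


def gp_posterior(x_obs, y_obs, x_grid, noise_sd=0.1):
    """Posterior mean vector and covariance matrix of a zero-mean GP on x_grid."""
    k_obs = rbf_kernel(x_obs, x_obs) + noise_sd**2 * np.eye(len(x_obs))
    k_cross = rbf_kernel(x_grid, x_obs)
    k_grid = rbf_kernel(x_grid, x_grid)
    mean = k_cross @ np.linalg.solve(k_obs, y_obs)
    cov = k_grid - k_cross @ np.linalg.solve(k_obs, k_cross.T)
    return mean, cov


# Placeholder observations: 12 evenly spaced tutoring-hour values on [0, 12].
# The response below is synthetic and only stands in for real trial data.
x_obs = np.linspace(0.0, 12.0, 12)
y_obs = np.sin(x_obs / 2.0)
x_grid = np.linspace(0.0, 12.0, 49)

mean, cov = gp_posterior(x_obs, y_obs, x_grid)

# Pointwise posterior standard deviation; clip guards against tiny negative
# values on the diagonal caused by floating-point round-off.
sd = np.sqrt(np.clip(np.diag(cov), 0.0, None))
lower, upper = mean - 2.0 * sd, mean + 2.0 * sd
```

Plotting `mean` with the region between `lower` and `upper` shaded gives a band of the kind the caption describes; the factor of 2 is the caption's approximation to a 95% interval.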

Figure 5. Bayesian adaptive trial (BAT) applied to model (4). Red dashed line: The true performance, $ \theta \left(\mathfrak{a}\right) $, to be approximated (see Equation (3)). Circles represent the data points, $ {\mathcal{D}}_{1:t-1} $; the blue line represents the GP’s mean, $ {\mu}_{{\mathcal{D}}_{1:t-1}} $; the shaded blue region represents the 95% credible interval around the GP’s mean; the dashed black line represents the slope of the GP’s mean (up to a proportionality constant), $ \frac{\partial {\mu}_{{\mathcal{D}}_{1:t-1}}}{\partial \mathfrak{a}} $; the dotted line represents the standard deviation of the GP (up to a proportionality constant), $ {\sigma}_{{\mathcal{D}}_{1:t-1}} $; and the solid black line represents the utility function (6) with $ {\lambda}_1=30 $ and $ {\lambda}_2=10 $. The yellow line represents the tutoring hours where a combination of features (A) and (B) (formalized by (8)) is maximized. The results are plotted after adaptively collecting 3 data points (panel $ a $), 6 data points (panel $ b $), 9 data points (panel $ c $), and 12 data points (panel $ d $).

Figure 6. Fixed design (left column) versus Bayesian adaptive trial (BAT) (right column). The red dashed line represents the true performance, $ \theta \left(\mathfrak{a}\right) $, to be approximated. Circles represent the data points, $ {\mathcal{D}}_{1:t-1} $; the blue and green lines represent the GP’s mean, $ {\mu}_{{\mathcal{D}}_{1:t-1}} $; the shaded blue/green regions represent 95% credible intervals around the GP’s mean; and the yellow vertical line represents the tutoring hours at which the linear combination of the performance and effect of tutoring is maximized (see (8)). First row: Six data points collected from the interval [0,12] (a subset of which are in the interval [5,7]) (a) by fixed design and (b) by BAT. Second row: Twelve data points collected from the interval [0,12] (c) by fixed design and (d) by BAT.
