
From evidence to delivery: an Implementation-Science blueprint for behavioural policy

Published online by Cambridge University Press:  08 October 2025

Giuseppe Alessandro Veltri*
Affiliation:
Center for Behavioural and Implementation Science, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
Sociology and Social Research, Università di Trento, Trento, Italy

Abstract

Rossi’s notorious ‘Iron Law of Evaluation’ – that the expected net impact of any large-scale social programme is zero – reminds us that expectations about policy interventions rarely survive real-world delivery. Behavioural Public Policy (BPP) faces many implementation challenges. Implementation Science (IS), which studies how evidence-based practices are adopted, delivered and sustained, offers BPP a toolkit for overcoming the knowledge–action gap. We show how IS frameworks such as CFIR (Consolidated Framework for Implementation Research) and RE-AIM (Reach, Effectiveness, Adoption, Implementation, Maintenance) diagnose contextual barriers – leadership, workflow fit, resources – and supply metrics of fidelity, adoption, cost and sustainment. Next, we outline three hybrid trial types from IS that co-test policy impact and implementation: Type 1 emphasises behavioural effects while sampling implementation data; Type 2 balances both; Type 3 optimises implementation while tracking outcomes. Cluster-randomised and stepped-wedge roll-outs create feedback loops that enable mid-course adaptation and speed scale-up. Cases illustrate how spotting delivery slippage early averts costly failure and how early IS integration can turn isolated behavioural wins into scalable, system-wide transformations that genuinely endure. We situate these recommendations within the literature on scalability and the ‘voltage effect’, clarifying how common drops from pilot to scale can be anticipated, diagnosed and mitigated using IS outcomes and process data.

Information

Type
Perspective
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press.

Introduction

Almost forty years ago, the distinguished sociologist Peter H. Rossi (1987) reflected on the disappointing outcomes of policy evaluations from the 1970s and 1980s, formulating what he termed the ‘Iron Law of Evaluation’: ‘the expected value of any net impact assessment of any large-scale social program is zero’. Rossi’s pessimistic assessment likely stems from inherent difficulties in consistently implementing social programs effectively, challenges equally relevant to contemporary Behavioural Public Policy (BPP) interventions. Indeed, despite significant advances and compelling evidence generated by behavioural science, the field remains confronted with persistent challenges in translating promising pilot results into sustained and widespread practice (Maier et al., 2022; Chater and Loewenstein, 2023; Hallsworth, 2023; Varazzani and Hubble, 2025; Banerjee and John, 2025). Many pressing social, health and economic issues depend not merely on the existence of effective behavioural interventions but critically on the willingness and capacity of individuals, communities and organisations to consistently adopt beneficial behaviours. Nevertheless, the ‘last mile’ challenge, wherein effective innovations struggle to establish themselves in real-world settings, continues to limit the broader impact and sustainability of behavioural interventions (Fixsen et al., 2005; Proctor et al., 2011).

Implementation Science (IS) explicitly addresses this critical gap by focusing on how evidence-based practices can be systematically embedded into everyday operations. IS emerged precisely from recognising that interventions, even those demonstrably effective under controlled conditions, frequently falter when confronted with organisational inertia, limited resources, competing demands or inadequate stakeholder engagement (Damschroder et al., 2009; Proctor et al., 2009). By identifying and systematically addressing these adoption-influencing factors, IS provides structured frameworks to guide interventions from initial concept through sustainable, large-scale application. While contemporary BPP increasingly attends to organisational and contextual constraints, the distinctive contribution of IS lies in offering purpose-built frameworks (e.g., Consolidated Framework for Implementation Research [CFIR]; Reach, Effectiveness, Adoption, Implementation, Maintenance [RE-AIM]) and validated measures that render that attention systematic, comparable across sites, and decision-relevant; in this sense, IS places equal emphasis on structural and contextual factors – such as leadership commitment, workflow alignment, policy support and training – that are critical for ensuring practices endure beyond pilot phases (Glasgow et al., 1999; Damschroder et al., 2009).

Recognising that Rossi’s ‘Iron Law of Evaluation’ applies equally to behavioural interventions underscores the urgent need for a strategic integration between BPP and IS. Despite their evident complementarity, systematic integration between these fields remains relatively sparse. Many behavioural interventions still depend largely on small-scale demonstrations or localised experimentation, implicitly expecting initial successes to organically expand, an approach that needs to be revised (Almaatouq et al., 2024; List, 2024). This perspective article argues for the intentional incorporation of IS strategies from the earliest stages of BPP intervention design – particularly through hybrid trial methodologies – to significantly enhance the likelihood of behavioural interventions achieving widespread and sustained adoption. By bridging these fields, IS can serve as a critical blueprint for uptake, empowering BPP interventions to overcome Rossi’s pessimistic prediction and realise greater scalability and longevity within complex institutional landscapes.

A practical implication is to foreground why effects attenuate at scale – the ‘voltage effect’ – and link candidate drivers of attenuation to concrete IS metrics. Dilution of non-negotiable components maps to fidelity indicators; diminished demand at scale maps to reach and adoption; and operational drift over time maps to maintenance. Designing studies with these links pre-specified enables earlier diagnosis and course correction (Glasgow et al., 1999; Proctor et al., 2009; List, 2024).
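To make this pre-specification concrete, the sketch below (in Python, with illustrative driver names, metrics and decision rules that are not drawn from this article) shows one way a team might encode the mapping from voltage-loss drivers to IS metrics before roll-out.

```python
# Minimal sketch (illustrative names, not from the article): pre-specifying the
# mapping from candidate voltage-loss drivers to the IS metrics that would
# detect them, so the monitoring plan is fixed before roll-out.
from dataclasses import dataclass

@dataclass
class VoltageRisk:
    driver: str            # hypothesised reason effects may attenuate at scale
    is_metric: str         # IS outcome that would detect it (RE-AIM / Proctor)
    data_source: str       # where the metric is observed
    decision_rule: str     # pre-specified trigger for course correction

MONITORING_PLAN = [
    VoltageRisk("dilution of non-negotiable components", "fidelity",
                "site audit checklist", "fidelity < 80% triggers refresher training"),
    VoltageRisk("diminished demand at scale", "reach / adoption",
                "administrative enrolment data", "adoption < 60% of eligible sites triggers stakeholder review"),
    VoltageRisk("operational drift over time", "maintenance",
                "quarterly sustainment survey", "retention drop > 10 points triggers leadership check-in"),
]

for risk in MONITORING_PLAN:
    print(f"{risk.driver:45s} -> {risk.is_metric}")
```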

Why IS matters for BPP

BPP interventions, whether they involve rearranging choice architectures or offering social norm feedback, are grounded in the notion that human behaviour is malleable and strongly shaped by environment and context (Michie et al., 2011, 2014). Yet in practice, these contexts are not simply neutral backdrops; they are often highly structured settings (hospitals, municipal governments, workplaces) where entrenched cultures, resource constraints or political pressures can stifle innovation. IS supplies the tools to identify these systemic forces early in the design process. For instance, the CFIR details constructs like leadership engagement, available resources, organisational culture, and external policies, each of which can make or break a promising intervention (Damschroder et al., 2009). By systematically assessing these factors at the outset, a BPP project can anticipate roadblocks and structure interventions that mesh with real-world constraints (Table 1).

Table 1. Key dimensions of Implementation Science and their relevance to BPP

IS also emphasises the concept of fidelity, defined as how faithfully an intervention is delivered according to its core components. Even if a BPP solution is theoretically sound, local adaptations or partial implementation can dilute essential elements, leading to weaker outcomes or null results. By measuring fidelity – along with other implementation-related outcomes such as adoption, feasibility, acceptability and sustainability – IS offers an empirical feedback loop. This loop ensures that any observed effects (positive or negative) can be tied to how well the intervention was delivered, rather than leaving open the question of whether the original protocol was substantially altered during real-world deployment.
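As a minimal illustration of such a feedback loop, the sketch below assumes a simple site-level checklist of hypothetical core components and flags sites whose fidelity score falls below an illustrative threshold.

```python
# Minimal sketch, assuming a simple site-level fidelity checklist (component names
# and threshold are illustrative): score how many core components were delivered
# as specified and flag sites where delivery slippage may explain weak outcomes.
CORE_COMPONENTS = ["default_enrolment_configured", "reminder_sent_on_schedule",
                   "staff_script_used", "opt_out_path_unchanged"]

def fidelity_score(delivery_log: dict[str, bool]) -> float:
    """Share of core components delivered as specified (0.0-1.0)."""
    delivered = sum(delivery_log.get(c, False) for c in CORE_COMPONENTS)
    return delivered / len(CORE_COMPONENTS)

site_logs = {
    "site_A": {"default_enrolment_configured": True, "reminder_sent_on_schedule": True,
               "staff_script_used": True, "opt_out_path_unchanged": True},
    "site_B": {"default_enrolment_configured": True, "reminder_sent_on_schedule": False,
               "staff_script_used": False, "opt_out_path_unchanged": True},
}

for site, log in site_logs.items():
    score = fidelity_score(log)
    flag = "REVIEW" if score < 0.75 else "ok"   # illustrative threshold
    print(f"{site}: fidelity = {score:.2f} [{flag}]")
```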

Moreover, IS underscores the significance of multi-stakeholder engagement. BPP innovations can benefit from the bottom-up perspectives of those who must implement the intervention daily (e.g., frontline staff, administrators and community leaders). If these individuals are sidelined during design and deployment, they may be less motivated to support the initiative, risking a superficial or reluctant adoption. IS frameworks like RE-AIM highlight the value of early and ongoing engagement with all relevant parties, ensuring that each stakeholder’s readiness, resources, and concerns are integrated into policy rollout (Glasgow et al., 1999). To be clear, many BPP teams already co-design with practitioners and communities; the value-add of IS lies in codifying these activities into replicable strategies and measures so that engagement is not episodic but built into the study design, data collection plan and decision rules.

Terminology and scope. We align terminology with canonical sources: ‘classic nudges’ are changes to choice architecture that alter behaviour predictably without restricting options or significantly changing incentives (Thaler and Sunstein, 2008); ‘nudge plus’ couples a nudge with a prompt for deliberation or reflection (Banerjee and John, 2024); and ‘boosts’ aim to build decision competences (e.g., risk literacy) that persist and can generalise (Hertwig et al., 2025). This shared vocabulary helps map intervention classes to IS supports and outcomes.

IS can – and should – be calibrated to the depth of reflection each behavioural technique demands. For classic nudges (Thaler and Sunstein, 2008), which rely on subtle choice architecture changes (e.g., defaults and salience cues), IS prioritises low-friction fidelity checklists, rapid audit-and-feedback cycles and the tracking of contextual moderators (leadership buy-in, workflow fit) that might silently erode the nudge’s potency at scale. Where designers add an element of deliberation – so-called nudge plus (Banerjee and John, 2024), in which recipients are prompted to pause and reflect – IS highlights additional implementation ingredients: staff training to facilitate reflection, materials that scaffold self-explanation, and pragmatic measures of dose received (did participants actually engage in the reflective step?). Finally, boosts aim to build enduring decision competences (risk literacy, future cost visualisation) rather than steer one-off choices (Hertwig et al., 2025). Here, IS shifts emphasis toward capacity-building strategies (e.g., train-the-trainer models and refresher sessions), longitudinal sustainment metrics, and equity audits that verify whether competence gains are maintained across heterogeneous groups. Across all three classes, IS offers common scaffolds – CFIR diagnostics for context mapping, RE-AIM indicators for reach and maintenance – but applies them with differing intensity: lightweight and rapid for nudges, integrated with reflective content for nudge plus, and resource-intensive, educational, and longitudinal for boosts. Embedding these calibrated IS supports from the outset increases the likelihood that each intervention type not only works in principle but also survives the messy realities of routine delivery. Importantly, even ‘light-touch’ nudges often require substantial enabling infrastructure – IT changes to defaults, legal review, staff training and data governance – to achieve and sustain effect, so their simplicity at the user interface should not be conflated with ease of organisational delivery (cf. CFIR constructs of resources, leadership engagement and workflow fit; Damschroder et al., 2009).

A practical tension concerns fidelity versus adaptation. We distinguish core components – non-negotiable elements tied to the causal theory – from an adaptable periphery that can be tailored to local context without eroding mechanism integrity (Carroll et al., 2007). Using adaptation logs and reporting frameworks such as FRAME (Stirman et al., 2013), teams can document what was modified, why, and with what consequences. In BPP, this means specifying which elements of a default change are fixed (e.g., enrolment logic) and which can vary (message channel, timing), and then measuring both fidelity and adaptations to explain outcomes.

Figure 2 summarises these choices in a practical decision tree that links intervention class and delivery footprint to IS intensity and a matching hybrid design.

Figure 2. Decision tree for calibrating Implementation Science (IS) supports by intervention type and delivery footprint. The branches map intervention class (classic nudges, nudge plus, boosts) and delivery considerations (e.g., IT/legal/workflow footprint; reflective component; heterogeneity, time horizon, equity risk) to a recommended IS intensity and a suitable hybrid trial type.
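The decision logic can be expressed compactly; the sketch below is an illustrative rendering of the branches described above, with category labels and thresholds that approximate, rather than reproduce, the figure.

```python
# Minimal sketch of decision logic of the kind shown in Figure 2 (labels and
# branch conditions are illustrative, not the article's exact figure): map an
# intervention class and its delivery footprint to an IS intensity and a
# candidate hybrid trial type.
def calibrate_is_support(intervention: str, delivery_footprint: str,
                         equity_or_longterm_risk: bool = False) -> tuple[str, str]:
    """Return (IS intensity, suggested hybrid trial type)."""
    if intervention == "classic nudge":
        if delivery_footprint == "light":      # minimal IT/legal/workflow changes
            return ("lightweight fidelity checks, rapid audit-and-feedback", "Type 1")
        return ("moderate: context diagnostics plus fidelity tracking", "Type 2")
    if intervention == "nudge plus":           # adds a reflective step to delivery
        return ("moderate: staff training, dose-received measures, fidelity", "Type 2")
    if intervention == "boost":                # builds enduring competences
        intensity = "intensive: capacity-building, sustainment and equity audits"
        return (intensity, "Type 3" if equity_or_longterm_risk else "Type 2")
    raise ValueError(f"unknown intervention class: {intervention}")

print(calibrate_is_support("classic nudge", "heavy"))
print(calibrate_is_support("boost", "heavy", equity_or_longterm_risk=True))
```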

Hybrid trial methodologies as a convergence strategy

One especially powerful strategy for integrating IS and BPP is the use of ‘hybrid’ trial designs (Curran et al., 2012; see Figure 1). Unlike traditional research approaches that first conduct a standalone effectiveness trial and then launch a separate implementation trial, hybrid designs evaluate both the impact of an intervention and the processes underpinning its delivery in one unified study. By gathering data on both the primary policy outcomes of interest (e.g., increases in vaccination uptake or reductions in energy consumption) and key implementation factors (e.g., fidelity, adoption, cost, organisational readiness and stakeholder engagement), hybrid trials can identify in real time whether a disappointing result arises from limitations in the policy itself or from suboptimal implementation strategies. Hybrid trials are commonly categorised into three main types (see Table 2 and Figure 2), each striking a different balance between testing effectiveness and examining implementation (Curran et al., 2012). Type 1 prioritises assessing the policy’s effectiveness while collecting preliminary data on key implementation outcomes. Type 2 gives roughly equal weight to the intervention’s effectiveness and the implementation strategies, allowing researchers to draw robust conclusions about how well the intervention works in real-world settings and why. Type 3 flips the emphasis toward implementation, rigorously investigating how best to embed the intervention in practice while still tracking its impact on the desired outcomes. In BPP, Type 2 designs are well-suited to experimentally compare allowable adaptations (e.g., alternative message channels or training intensity) while tracking fidelity, cost and equity, thereby turning the fidelity–adaptation tension into a testable design feature rather than an afterthought.
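As a toy illustration of what such a Type 2 comparison yields (the figures below are invented, not trial data), the sketch tracks the implementation outcomes named in Figure 1 for two strategies and derives a simple cost-per-sustained-household comparison.

```python
# Minimal sketch (numbers invented for illustration): side-by-side tracking of the
# implementation outcomes named in Figure 1 for two delivery strategies, plus a
# simple cost-per-sustained-household comparison of the kind a Type 2 design supports.
strategies = {
    "A_auto_switch_plus_training": {"fidelity": 0.92, "households_enrolled": 4800,
                                    "cost_total": 36_000, "retained_12m": 0.81},
    "B_auto_switch_plus_messaging": {"fidelity": 0.95, "households_enrolled": 5600,
                                     "cost_total": 52_000, "retained_12m": 0.68},
}

for name, m in strategies.items():
    sustained = m["households_enrolled"] * m["retained_12m"]   # households still enrolled at 12 months
    print(f"{name}: fidelity={m['fidelity']:.2f}, "
          f"cost per sustained household = {m['cost_total'] / sustained:.2f}")
```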

Figure 1. Conceptual depiction of how hybrid trial designs integrate effectiveness and implementation research. Illustrative Type 2 pathway: a municipal energy default intervention compares two implementation strategies – (A) IT auto-switch plus staff training versus (B) IT auto-switch plus targeted public messaging – while concurrently tracking fidelity (default configuration delivered as specified), adoption (households enrolled), cost (IT time, staff hours, media spend), and maintenance (retention at 6–12 months).

Table 2. Key characteristics of hybrid trial types and their potential applications in Behavioural Public Policy

In practice, hybrid trials employ a range of research designs and data-collection methods to capture the nuance of both policy effects and implementation quality. For example, cluster-randomised (Hemming and Taljaard, 2023) or stepped-wedge designs (Barker et al., 2016) are often used when interventions are rolled out in multiple sites or over multiple time points, enabling researchers to compare different implementation strategies in real-world settings. These approaches are particularly advantageous for BPP because they account for local variations in leadership support, resource availability, staff training, and incentives – factors that can dramatically alter the success or failure of an intervention. Data collection typically involves standardised behavioural metrics (e.g., administrative data on service uptake or consumption), fidelity checklists (documenting whether and how various components of the intervention are delivered), cost analyses (capturing direct and indirect expenses associated with implementation), and stakeholder feedback (gauging buy-in and perceived feasibility). To manage this diversity of data, hybrid trials increasingly utilise multilevel models and related techniques to parse how context, fidelity, organisational culture, and leadership engagement mediate or moderate policy outcomes.
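A minimal analysis sketch, assuming a stepped-wedge-style dataset with simulated values (variable names and effect sizes are illustrative, not from any cited trial), shows how a multilevel model can test whether site-level fidelity moderates the treatment effect.

```python
# Minimal sketch, not the article's analysis: a multilevel model relating a
# behavioural outcome to treatment exposure and a site-level fidelity score in a
# simulated stepped-wedge-style dataset. Real trials would use administrative data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
sites, periods = 12, 6
rows = []
for s in range(sites):
    crossover = rng.integers(1, periods)      # period at which the site adopts the intervention
    site_effect = rng.normal(0, 0.5)          # unobserved site-level heterogeneity
    fidelity = rng.uniform(0.6, 1.0)          # site-level delivery quality
    for t in range(periods):
        treated = int(t >= crossover)
        y = 0.2 * t + 1.0 * treated * fidelity + site_effect + rng.normal(0, 1)
        rows.append({"site": s, "period": t, "treated": treated,
                     "fidelity": fidelity, "outcome": y})
df = pd.DataFrame(rows)

# Random intercept per site; fixed effects for the secular trend, treatment and the
# treatment-by-fidelity interaction (does delivery quality moderate the effect?).
model = smf.mixedlm("outcome ~ period + treated + treated:fidelity",
                    df, groups=df["site"]).fit()
print(model.summary())
```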

Another advantage of hybrid methodologies is their adaptive, iterative capacity. Because outcomes and implementation processes are tracked concurrently, teams can quickly detect and diagnose issues such as poor alignment with local workflows, insufficient staff training, or a mismatch in incentive structures. Rather than waiting until the trial concludes, they can make mid-course corrections – refining training protocols, adjusting incentives, or addressing leadership engagement challenges as soon as they surface. This helps ensure that any observed effect (or lack thereof) is not merely a function of inadequate implementation. Continuous stakeholder involvement – from front-line staff to community representatives – helps refine the intervention to better match the evolving needs of the target population. In this way, hybrid trials can accelerate the transition from pilot projects to broader policy rollouts, avoiding scenarios where a policy that succeeded in a controlled demonstration fails to maintain its effectiveness in day-to-day operation.

Overall, by weaving together direct measures of policy impact with a rigorous exploration of the contexts and processes that shape real-world delivery, hybrid trials maximise both internal and external validity. They not only clarify ‘what works’ in terms of behaviour change, but also illuminate the ‘why’ and ‘how’ behind successful (or unsuccessful) implementation. This holistic perspective underscores the importance of systematically integrating IS and BPP in a single methodological framework – one that is well-suited to dynamic policy arenas where flexibility, scalability, and contextual fit are paramount.

The ASPIRE (Beidas et al., 2021) and 3HP Options trials (Kadota et al., 2020) provide examples of Type 3 hybrid studies that leverage behavioural science to design and test implementation strategies for established clinical interventions. The ASPIRE trial focused on implementing the S.A.F.E. Firearm safety programme in paediatric primary care by comparing a behavioural science-informed nudge strategy (an Electronic Health Record prompt) against the same nudge combined with practice facilitation, with clinician fidelity as the primary outcome. The 3HP Options trial aimed to optimise delivery of a 12-week tuberculosis preventive therapy for people living with HIV in Uganda. It compared facilitated Directly Observed Therapy (DOT), facilitated Self-Administered Therapy (SAT), and Informed Patient Choice, using COM-B and the Behaviour Change Wheel to design strategies that address capability, opportunity, and motivation. These cases highlight distinct applications of behavioural science – ASPIRE’s targeted nudge versus 3HP’s comprehensive, theory-driven approach – both within a Type 3 framework prioritising implementation effectiveness.

Finally, hybrid trials complement rather than displace theory-rich approaches such as realist evaluation and theory-based evaluation. Realist logic models (context–mechanism–outcome configurations) help specify where and for whom mechanisms should fire (Pawson and Tilley, 1997), while theory-based evaluation clarifies testable causal pathways linking components to outcomes (Weiss, 1997). Embedding their propositions within hybrid designs strengthens interpretation of both effectiveness and implementation findings.

Key challenges and implications

Despite the apparent synergy, merging IS with BPP is not without complications. One challenge lies in reconciling different terminologies and disciplinary norms. BPP practitioners may use terms like ‘choice architecture’ or ‘boosts’, while IS researchers focus on ‘outer setting’, ‘implementation strategy’ or ‘fidelity measures’. Clarity and alignment can be cultivated through early project discussions and shared training opportunities, ensuring that each side understands the other’s lexicon and goals. Empirically, efforts that surface and align assumptions early – via structured context assessments and strategy specification – tend to avoid later drift and rework (Damschroder et al., 2009; Powell et al., 2015).

Resource constraints also loom large. Hybrid trials demand thorough planning, mixed-methods data collection, and often longer timelines to capture sustainability. BPP projects, particularly those within government settings, may face tighter budget cycles or political pressures for quick results. To mitigate these pressures, it is advisable to concentrate on a handful of critical implementation metrics – like fidelity, acceptability and cost – rather than attempting to measure every dimension of IS theory in a single project. Using digital platforms for data collection and focusing on feasible, pragmatic outcome measures can also lessen participant burden. Evidence from service-delivery settings shows that pragmatic fidelity tools, coupled with light-touch audit-and-feedback, can improve uptake while containing measurement burden (Carroll et al., 2007; Powell et al., 2015).

Another issue is the inherently context-specific nature of BPP. A social norm campaign that works in one region may fail elsewhere due to cultural variations, differing levels of trust in public institutions, or other structural disparities. IS frameworks offer systematic ways to document contextual differences, such as leadership involvement or resource availability, but BPP practitioners need to remain open to adaptation and iteration. Hybrid trials, by design, facilitate ongoing evaluation and provide evidence on which modifications are beneficial without drifting from core components of the intervention. Synthesis work on the diffusion and scale-up of complex interventions underscores that local history, professional norms, and inter-organisational networks often shape outcomes as much as formal design features (Greenhalgh et al., 2004).

Finally, long-term sustainability demands embedding changes into policy or organisational structures. While some BPP approaches – like nudges – are low-cost and minimally invasive, they may quickly lose impact if supporting systems (training, incentives, feedback loops) are not in place. IS underscores the role of institutional champions, policy alignment, and ongoing accountability to anchor new practices. Policymakers can be encouraged to adopt formal mechanisms – such as integration into routine funding lines or performance measures – that make the intervention a standard part of how agencies and organisations operate. In addition, implementation rarely unfolds in a political vacuum. Power dynamics, shifting agendas, and policy windows shape feasibility and timing. Simple power-mapping, the cultivation of champions, and sensitivity to agenda-setting processes can accelerate adoption and sustainment (Kingdon, 2014; Cairney, 2016).

Conclusions

Although the central aim here is to show how IS can strengthen BPP, the synergy can flow in the opposite direction as well (Hodson et al., 2024). Incorporating behaviourally informed principles into IS practice can help address common challenges like stakeholder resistance, cognitive biases within leadership teams, or inertia in bureaucratic systems. For example, defaults and friction can be used to simplify new workflows for implementers and remove rarely used but error-prone options; timely prompts, commitment devices, and salient feedback can support leaders’ regular review of implementation dashboards and follow-through on agreed actions (Thaler and Sunstein, 2008; Hodson et al., 2024). Moving forward, more BPP programmes could be designed with IS principles baked in from the start: identifying feasible fidelity measures, clarifying roles and responsibilities, and building adaptive protocols that allow for site-specific tailoring. Policy teams might also consider stepped-wedge trial designs that roll out an intervention incrementally across jurisdictions, tracking both outcomes and implementation processes at each step to refine the approach. Such designs, already familiar in IS, would enhance the evidence base for BPP interventions by revealing how organisational and community variables moderate a policy’s effectiveness. Ultimately, by merging IS’s systemic lens with BPP’s nuanced understanding of behaviour, policymakers will be better equipped to mount evidence-informed interventions that do not merely spark short-term improvements but also create lasting transformations. Hybrid trials, iterative feedback loops, fidelity checks and a stronger emphasis on multi-level stakeholder engagement can make the difference between an intervention that flourishes beyond its initial grant period and one that quietly fades. Embracing these combined methods and perspectives is the next logical step for advancing the science and practice of BPP. Framing studies explicitly around likely voltage-loss mechanisms, and mapping them to IS outcomes, can further increase the odds that effects persist at scale (List, 2024).

Acknowledgements

The author is grateful to Nick Sevdalis, Ioannis Bakoulis and Ivo Vlaev for the insightful discussions about the interplay between behavioural and implementation sciences.

References

Almaatouq, A., Griffiths, T. L., Suchow, J. W., Whiting, M. E., Evans, J. and Watts, D. J. (2024), ‘Beyond playing 20 questions with nature: integrative experiment design in the social and behavioral sciences’, Behavioral and Brain Sciences, 47: e33.
Banerjee, S. and John, P. (2024), ‘Nudge plus: incorporating reflection into behavioral public policy’, Behavioural Public Policy, 8(1): 69–84.
Banerjee, S. and John, P. (2025), ‘Behavioral public policy: past, present, & future’, Policy and Society. https://dx.doi.org/10.1093/polsoc/puaf012
Barker, D., McElduff, P., D’Este, C. and Campbell, M. (2016), ‘Stepped wedge cluster randomised trials: a review of the statistical methodology used and available’, BMC Medical Research Methodology, 16: 1–19.
Beidas, R. S., Ahmedani, B. K., Linn, K. A., Marcus, S. C., Johnson, C., Maye, M. et al. (2021), ‘Study protocol for a type III hybrid effectiveness–implementation trial of strategies to implement firearm safety promotion as a universal suicide prevention strategy in pediatric primary care’, Implementation Science, 16: 1–16.
Cairney, P. (2016), The Politics of Evidence-Based Policy Making, London: Palgrave Macmillan.
Carroll, C., Patterson, M., Wood, S., Booth, A., Rick, J. and Balain, S. (2007), ‘A conceptual framework for implementation fidelity’, Implementation Science, 2: 40.
Chater, N. and Loewenstein, G. (2023), ‘The i-frame and the s-frame: how focusing on individual-level solutions has led behavioral public policy astray’, Behavioral and Brain Sciences, 46: e147.
Curran, G. M., Bauer, M., Mittman, B., Pyne, J. M. and Stetler, C. (2012), ‘Effectiveness–implementation hybrid designs: combining elements of clinical effectiveness and implementation research to enhance public health impact’, Medical Care, 50(3): 217–226.
Damschroder, L. J., Aron, D. C., Keith, R. E., Kirsh, S. R., Alexander, J. A. and Lowery, J. C. (2009), ‘Fostering implementation of health services research findings into practice: a consolidated framework for advancing implementation science’, Implementation Science, 4: 50.
Fixsen, D. L., Naoom, S. F., Blase, K. A., Friedman, R. M. and Wallace, R. (2005), Implementation Research: A Synthesis of the Literature, Tampa, FL: University of South Florida, National Implementation Research Network.
Glasgow, R. E., Vogt, T. M. and Boles, S. M. (1999), ‘Evaluating the public health impact of health promotion interventions: the RE-AIM framework’, American Journal of Public Health, 89(9): 1322–1327.
Greenhalgh, T., Robert, G., Macfarlane, F., Bate, P. and Kyriakidou, O. (2004), ‘Diffusion of innovations in service organizations: systematic review and recommendations’, The Milbank Quarterly, 82(4): 581–629.
Hallsworth, M. (2023), ‘A manifesto for applying behavioural science’, Nature Human Behaviour, 7(3): 310–322.
Hemming, K. and Taljaard, M. (2023), ‘Key considerations for designing, conducting and analysing a cluster randomized trial’, International Journal of Epidemiology, 52(5): 1648–1658.
Hertwig, R., Michie, S., West, R. and Reicher, S. (2025), ‘Moving from nudging to boosting: empowering behaviour change to address global challenges’, Behavioural Public Policy, 1–12.
Hodson, N., Powell, B. J., Nilsen, P. and Beidas, R. S. (2024), ‘How can a behavioral economics lens contribute to implementation science?’, Implementation Science, 19(1): 33.
Kadota, J. L., Musinguzi, A., Nabunje, J., Welishe, F., Ssemata, J. L., Bishop, O. et al. (2020), ‘Protocol for the 3HP Options trial: a hybrid Type 3 implementation–effectiveness randomized trial of delivery strategies for short-course tuberculosis preventive therapy among people living with HIV in Uganda’, Implementation Science, 15: 1–12.
Kingdon, J. W. (2014), Agendas, Alternatives, and Public Policies, 2nd edn, updated, Harlow: Pearson Education Limited.
List, J. A. (2024), ‘Optimally generate policy-based evidence before scaling’, Nature, 626(7999): 491–499.
Maier, M., Bartoš, F., Stanley, T., Shanks, D. R., Harris, A. J. L. and Wagenmakers, E.-J. (2022), ‘No evidence for nudging after adjusting for publication bias’, Proceedings of the National Academy of Sciences, 119(31): e2200300119.
Michie, S., Atkins, L. and West, R. (2014), The Behaviour Change Wheel: A Guide to Designing Interventions, London: Silverback Publishing.
Michie, S., van Stralen, M. M. and West, R. (2011), ‘The Behaviour Change Wheel: a new method for characterising and designing behaviour change interventions’, Implementation Science, 6: 42.
Pawson, R. and Tilley, N. (1997), Realistic Evaluation, London: Sage.
Powell, B. J., Waltz, T. J., Chinman, M. J., Damschroder, L. J., Smith, J. L., Matthieu, M. M. et al. (2015), ‘A refined compilation of implementation strategies: results from the Expert Recommendations for Implementing Change (ERIC) project’, Implementation Science, 10: 21.
Proctor, E. K., Landsverk, J., Aarons, G., Chambers, D., Glisson, C. and Mittman, B. (2011), ‘Implementation research in mental health services: an emerging science with conceptual, methodological, and training challenges’, Administration and Policy in Mental Health and Mental Health Services Research, 38(1): 24–34.
Proctor, E., Silmere, H., Raghavan, R., Hovmand, P., Aarons, G., Bunger, A. et al. (2009), ‘Outcomes for implementation research: conceptual distinctions, measurement challenges, and research agenda’, Administration and Policy in Mental Health and Mental Health Services Research, 36(2): 65–76.
Rossi, P. (1987), ‘The iron law of evaluation and other metallic rules’, Research in Social Problems and Public Policy, 4(1): 3–20.
Stirman, S. W., Miller, C. J., Toder, K. and Calloway, A. (2013), ‘The FRAME: a framework for reporting adaptations and modifications to evidence-based interventions’, Implementation Science, 8: 65.
Thaler, R. H. and Sunstein, C. R. (2008), Nudge: Improving Decisions about Health, Wealth, and Happiness, New Haven, CT: Yale University Press.
Varazzani, C. and Hubble, C. (2025), ‘Four sins in behavioural public policy’, Behavioural Public Policy, 9: 345–353. https://doi.org/10.1017/bpp.2024.22
Weiss, C. H. (1997), ‘Theory-based evaluation: past, present, and future’, New Directions for Evaluation, 76: 41–55.