Rossi’s notorious ‘Iron Law of Evaluation’, that the expected net impact of any large-scale social programme is zero, reminds us that expectations about policy interventions rarely survive real-world delivery. Behavioural Public Policy (BPP) is no exception: it faces persistent implementation challenges. Implementation Science (IS), which studies how evidence-based practices are adopted, delivered and sustained, offers BPP a toolkit for bridging the knowledge–action gap. We show how IS frameworks such as CFIR (Consolidated Framework for Implementation Research) and RE-AIM (Reach, Effectiveness, Adoption, Implementation, Maintenance) diagnose contextual barriers (leadership, workflow fit, resources) and supply metrics of fidelity, adoption, cost and sustainment. Next, we outline the three hybrid trial types from IS that co-test policy impact and implementation: Type 1 emphasises behavioural effects while gathering implementation data; Type 2 gives both equal weight; Type 3 optimises implementation while tracking behavioural outcomes. Cluster-randomised and stepped-wedge roll-outs create feedback loops that enable mid-course adaptation and accelerate scale-up. Case studies illustrate how spotting delivery slippage early averts costly failure, and how integrating IS from the outset can turn isolated behavioural wins into scalable, system-wide transformations that endure. We situate these recommendations within the literature on scalability and the ‘voltage effect’, clarifying how the drop in impact commonly seen from pilot to scale can be anticipated, diagnosed and mitigated using IS outcome and process data.