Provenance Guided Rollback Suggestions

Published online by Cambridge University Press:  25 April 2025

DAVID ZHAO
Affiliation:
University of Sydney, Glebe, New South Wales, Australia (e-mail: d-z@outlook.com)
PAVLE SUBOTIĆ
Affiliation:
Microsoft, Redmond, WA, USA (e-mail: pavlesubotic@microsoft.com)
MUKUND RAGHOTHAMAN
Affiliation:
University of Southern California, Los Angeles, CA, USA (e-mail: raghotha@usc.edu)
BERNHARD SCHOLZ
Affiliation:
University of Sydney, Glebe, New South Wales, Australia (e-mail: bernhard.scholz@sydney.edu.au)

Abstract

Advances in incremental Datalog evaluation strategies have made Datalog popular for use cases with constantly evolving inputs, such as static analysis in continuous integration and deployment pipelines. As a result, new logic programming debugging techniques are needed to support these emerging use cases.

This paper introduces an incremental debugging technique for Datalog, which determines the failing changes for a rollback in an incremental setup. Our debugging technique leverages a novel incremental provenance method. We have implemented our technique using an incremental version of the Soufflé Datalog engine and evaluated its effectiveness on the DaCapo Java program benchmarks analyzed by the Doop static analysis library. Compared to state-of-the-art techniques, we can localize faults and suggest rollbacks with an overall speedup of over 26.9$\times$ while providing higher quality results.

Information

Type
Rapid Communication
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press

Fig. 1. A scenario where an incremental update results in faults in the output.

Fig. 2. Program analysis Datalog setup.

Fig. 3. Fault localization and repair system.

Fig. 4. The proof tree for alias(userSession,sec). (+) denotes tuples that are inserted as a result of the incremental update; red denotes tuples that were not affected by the incremental update.

Fig. 5. A fault localization is a subset of input changes such that the faults are still reproduced.

Algorithm 1 Localize-Faults($P$, $E_2$, $\Delta E_{1 \to 2}$, $F$): Given a diff $\Delta E_{1 \to 2}$ and a set of fault tuples $F$, returns $\delta E \subseteq \Delta E_{1 \to 2}$ such that $E_1 \uplus \delta E$ produces all $t \in F$
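
The specification in this caption can be checked by brute force on a toy program. The sketch below is only an illustration of the contract "$E_1 \uplus \delta E$ still produces every $t \in F$"; it is not the provenance-guided algorithm of the paper. It assumes a two-rule reachability program, a diff containing insertions only, and hypothetical names (`evaluate`, `localize_faults`) chosen for this example.

```python
def evaluate(edges):
    """Naive fixpoint for the toy program:
         path(x, y) :- edge(x, y).
         path(x, z) :- path(x, y), edge(y, z).
    """
    path = set(edges)
    while True:
        new = {(x, z) for (x, y) in path for (y2, z) in edges if y == y2} - path
        if not new:
            return path
        path |= new


def localize_faults(e1, diff, faults):
    """Greedily shrink `diff` (insertions only, an assumption of this sketch)
    while E1 together with the remaining changes still derives every fault."""
    delta = set(diff)
    for change in sorted(diff):
        candidate = delta - {change}
        if faults <= evaluate(e1 | candidate):
            delta = candidate  # this change is not needed to reproduce the faults
    return delta


e1 = {("a", "b")}                              # input before the update
diff = {("b", "c"), ("c", "d"), ("x", "y")}    # inserted tuples
faults = {("a", "d")}                          # unwanted output tuple
print(localize_faults(e1, diff, faults))
```

Each candidate is re-evaluated from scratch here, which is exactly the cost that an incremental, provenance-based approach is meant to avoid.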

Fig. 6. An input debugging suggestion is a subset of input changes such that the remainder of the input changes no longer produce the faults.

Algorithm 2 Rollback-Repair($P$, $E_2$, $\Delta E_{1 \to 2}$, $F$): Given a diff $\Delta E_{1 \to 2}$ and a set of fault tuples $F$, return a subset $\delta E \subseteq \Delta E_{1 \to 2}$ such that $E_1 \uplus (\Delta E_{1 \to 2} \setminus \delta E)$ does not produce $t_r$
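
To make the rollback contract concrete, the following sketch searches exhaustively for a smallest set of changes to roll back so that the remaining changes no longer yield a fault. It is a hedged illustration only: it treats the output as graph reachability, assumes the diff contains insertions only, reads the condition as "no tuple in $F$ is produced", and uses hypothetical names (`reachable`, `rollback_repair`). It does not reproduce the paper's provenance-based procedure.

```python
from itertools import combinations


def reachable(edges, src, dst):
    """True if dst can be reached from src by following one or more edges."""
    seen, stack = set(), [src]
    while stack:
        node = stack.pop()
        for (u, v) in edges:
            if u == node and v not in seen:
                seen.add(v)
                stack.append(v)
    return dst in seen


def rollback_repair(e1, diff, faults):
    """Return a smallest subset of `diff` whose removal leaves no fault
    derivable from E1 plus the remaining changes (exhaustive search)."""
    changes = sorted(diff)
    for size in range(len(changes) + 1):
        for rolled_back in combinations(changes, size):
            remaining = e1 | (diff - set(rolled_back))
            if not any(reachable(remaining, s, t) for (s, t) in faults):
                return set(rolled_back)
    return None  # a fault is derivable from E1 alone, so no rollback helps


e1 = {("a", "b")}                              # input before the update
diff = {("b", "c"), ("c", "d"), ("x", "y")}    # inserted tuples
faults = {("a", "d")}                          # unwanted output tuple
print(rollback_repair(e1, diff, faults))
```

The search is exponential in the size of the diff, so it is only suitable for small examples like this one.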

Algorithm 3 Full-Rollback-Repair($P$, $E_1$, $\Delta E_{1 \to 2}$, $(I_+, I_-)$): Given a diff $\Delta E_{1 \to 2}$ and an intended output $(I_+, I_-)$, compute a subset $\delta E \subseteq \Delta E_{1 \to 2}$ such that $\Delta E_{1 \to 2} \setminus \delta E$ satisfies the intended output

Table 1. Repair size and runtime of our technique compared to delta debugging