Capturing (Optimal) Relaxed Plans with Stable and Supported Models of Logic Programs

We establish a novel relation between delete-free planning, an important task for the AI Planning community also known as relaxed planning, and logic programming. We show that given a planning problem, all subsets of actions that could be ordered to produce relaxed plans for the problem can be bijectively captured with stable models of a logic program describing the corresponding relaxed planning problem. We also consider the supported model semantics of logic programs, and introduce one causal and one diagnostic encoding of the relaxed planning problem as logic programs, both capturing relaxed plans with their supported models. Our experimental results show that these new encodings can provide major performance gains when computing optimal relaxed plans, with our diagnostic encoding outperforming state-of-the-art approaches to relaxed planning regardless of the given time limit when measured on a wide collection of STRIPS planning benchmarks.

1 Introduction
AI Planning, an active research area of Artificial Intelligence, is the task of finding a sequence of actions, called a plan, that when applied to a given initial state transforms it to a state that satisfies all members of a given set of goal conditions. According to the STRIPS formulation of AI Planning, states and goal conditions are represented by sets of atomic propositions, and each action can have separate sets of atomic propositions as its preconditions, positive effects (also called add effects), and negative effects (also called delete effects). Delete-free planning problems are those for which actions have no negative effects. A given planning problem can be relaxed into a delete-free problem, the optimal solution of which provides a lower bound on the optimal cost of the original problem. This lower bound, denoted by h⁺, can be used as a heuristic in an A*-like search scheme to find an optimal solution for the original problem. Computing h⁺ is, however, NP-equivalent (Bylander 1994). Also, h⁺ is hard to approximate (Betz and Helmert 2009).
Optimally solving relaxed planning problems in an efficient way is important for multiple reasons. There have been many admissible heuristic functions that approximate h⁺ in polynomial time by computing lower bounds. Examples are the h^max heuristic (Bonet and Geffner 2001), the LMcut heuristic (Helmert and Domshlak 2009), the set-additive heuristic (Keyder and Geffner 2008), and cost-sharing approximations of h^max (Mirkis and Domshlak 2007). The informativeness of these heuristic functions cannot be evaluated unless we can compute the exact value of h⁺. Using such a measure for informativeness could lead to devising more informative heuristic functions. Moreover, efficient solving of relaxed planning problems is in itself of importance, because there exist planning tasks of interest for the AI community whose actions are all delete-free. Examples of such tasks are the minimal seed-set problem (Gefen and Brafman 2011), and the problem of determining join orders in relational database query plan generation (Robinson et al. 2014). Another reason for the importance of efficient optimal relaxed planning is the fact that optimal plans for non-relaxed planning problems can always be produced by iteratively solving and reformulating relaxed planning tasks (Haslum 2012). By repeatedly finding optimal plans for newly produced relaxed problems, while reformulating the non-relaxed problem in each iteration, one can reach a point where the optimal plan found for the last relaxed problem is actually an optimal plan for the original problem.
Several approaches to solving relaxed planning problems have previously been introduced. The approaches include Boolean satisfiability (SAT) based encodings (Rankooh and Rintanen 2022b), integer programming based models (Imai and Fukunaga 2015; Rankooh and Rintanen 2022a), and a minimum-cost hitting set based method introduced by Haslum et al. (2012). In this work we take a new approach based on the stable and supported models of logic programs (Gelfond and Lifschitz 1988; Marek and Subrahmanian 1992). Such models provide the semantical basis for answer set programming (ASP); see, e.g., (Brewka et al. 2011) for an overview. The ASP paradigm offers general-purpose modeling languages for knowledge representation and reasoning.
A typical encoding of a search problem in ASP aims at a one-to-one correspondence between answer sets and the solutions of the problem. This is in perfect harmony with AI planning where sequences of actions (plans) form solutions to problems at hand. Indeed, many AI planning problems have been encoded as logic programs (Son and Balduccini 2018) and AI planning also played a role in the early development of the ASP paradigm (Lifschitz 1999). Both stable and supported models implement a form of minimality, i.e., atomic propositions are false by default. This is highly useful in the context of AI planning since state predicates are falsified in this sense and the encodings of planning problems can concentrate on specifying which state predicates become true or remain true inertially. This tends to lead to more compact encodings compared to those based on pure SAT and, furthermore, enables memory savings if native answer set solvers are used for actual computations. The difference between stable and supported models is also interesting in this setting, since ASP solvers may compute answer sets based on either semantics. Stable models are also supported models but not vice versa in general. The gap between the two semantics vanishes if a logic program is suitably instrumented, e.g., in terms of acyclicity constraints (Bomanson et al. 2016). These observations open up new avenues when it comes to encoding planning problems as logic programs as well as choosing an approach for computing plans as answer sets.
In this work, we establish a new relation between relaxed planning and logic programs. We give an encoding of relaxed planning problems in ASP. We show that all subsets of actions that could be ordered to produce relaxed plans can be bijectively captured with stable models of the produced logic program. This enables the previously uninvestigated usage of off-the-shelf answer set solvers for computing the value of h⁺. While the supported model semantics of logic programs cannot be directly employed for this purpose, we show how, by guaranteeing acyclicity in an underlying graph of the logic program, one may deploy supported models to harvest (optimal) relaxed plans of the planning problem. The logic program produced in this way inherits the causal nature of our stable model based encoding, in the sense that the direction of explanations provided by the rules is from causes/preconditions to effects. By reversing this direction, we provide a diagnostic encoding, which, while still using the supported model semantics of logic programs, is shown by our empirical study to be more efficient than our causal encoding. Our experimental results show that when given small time limits these new encodings can significantly outperform the previous approaches to relaxed planning when measured on STRIPS planning benchmarks. Moreover, regardless of the used time limit, our diagnostic supported model based encoding enables CLASP (Gebser et al. 2015) to solve more problems compared to the integer programming solver based state-of-the-art method.
Logic programming has recently been employed for computing heuristics for lifted planning tasks. Corrêa et al. (2021; 2022) employed Datalog programs to calculate h^add (Bonet and Geffner 2001) and h^FF (Hoffmann and Nebel 2001), respectively, for lifted planning tasks. However, the objective of our work differs from theirs. While both h^add and h^FF are non-admissible estimations of h⁺ and can be computed in polynomial time for ground instances, we aim to compute h⁺ itself. Furthermore, this work focuses on ground planning tasks. Although the generalization of our current approach to lifted planning is relatively simple, we leave it for future research.
The rest of this article is organized as follows. In Section 2, we recall basic concepts and definitions of planning problems, relaxed planning, logic programs, and their stable and supported model semantics. Then, in Section 3, we show how relaxed plans can be captured with stable models of an encoding of relaxed planning problems into logic programs. In Section 4, we first show how a logic program can be augmented with a dynamically varying digraph whose acyclicity guarantees a shift in the semantics from stable models to supported models. We then recall how vertex elimination can be used to check whether a given digraph is acyclic. Based on the supported model semantics and the vertex elimination method, we explain our causal and diagnostic encodings of relaxed planning problems. We present practical evidence in Section 5 based on an experimental evaluation of the resulting encodings for answer set and supported model optimization. This analysis is based on 2212 problem instances from 84 STRIPS planning problem sets. Finally, we conclude the paper in Section 6.

Preliminaries
Since we intend to establish a connection between AI Planning and Answer Set Programming, we provide necessary formal definitions with respect to both of these paradigms.

AI Planning and relaxed plans
A STRIPS planning problem is a 5-tuple Π = ⟨X, I, A, G, cost⟩ where X is a finite set of Boolean state variables, also called atomic propositions. The initial state I and the set of goal conditions G are subsets of X. The finite set A is the set of actions. Each member a⃗ of A is a triple ⟨pre(a⃗), add(a⃗), del(a⃗)⟩, where pre(a⃗), add(a⃗), and del(a⃗) are sets of atomic propositions denoting the set of preconditions, positive effects, and negative effects of a⃗, which are the atomic propositions that a⃗ requires, adds, and deletes, respectively. The cost function cost maps each member of A to a non-negative integer. We use the vector sign to distinguish actions from the corresponding atoms that represent them in logic programs.
States are represented as subsets of X. The successor s′ = exec_a⃗(s) of a state s with respect to action a⃗ ∈ A is defined if pre(a⃗) ⊆ s, where the definition is s′ = (s \ del(a⃗)) ∪ add(a⃗). An action sequence a⃗_1, ..., a⃗_n is executable (in state s) if exec_{a⃗_1,...,a⃗_n}(s) = exec_{a⃗_n}(... exec_{a⃗_2}(exec_{a⃗_1}(s)) ...) is defined. A plan for Π is a sequence π of actions from A such that G ⊆ exec_π(I). The cost of a plan π = a⃗_1, ..., a⃗_n for Π is defined by Σ_{i=1,...,n} cost(a⃗_i). An optimal plan for Π is a plan with minimal cost.
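The definitions above can be illustrated with a minimal Python sketch. The class name `Action`, the function names, and the two-action toy problem are assumptions made for illustration only; they do not come from the paper or from any planner.

```python
from typing import FrozenSet, NamedTuple, Optional, Sequence


class Action(NamedTuple):
    name: str
    pre: FrozenSet[str]
    add: FrozenSet[str]
    delete: FrozenSet[str]  # negative effects ("del" is a Python keyword)
    cost: int


def exec_action(state: FrozenSet[str], action: Action) -> Optional[FrozenSet[str]]:
    """Successor state, or None when pre(a) is not a subset of the state."""
    if not action.pre <= state:
        return None
    return (state - action.delete) | action.add


def exec_sequence(state: FrozenSet[str], actions: Sequence[Action]) -> Optional[FrozenSet[str]]:
    """Apply the actions in order; None when the sequence is not executable."""
    current: Optional[FrozenSet[str]] = state
    for action in actions:
        current = exec_action(current, action)
        if current is None:
            return None
    return current


def is_plan(init: FrozenSet[str], goal: FrozenSet[str], actions: Sequence[Action]) -> bool:
    """A plan is an executable sequence whose final state contains the goal."""
    final = exec_sequence(init, actions)
    return final is not None and goal <= final


def plan_cost(actions: Sequence[Action]) -> int:
    return sum(action.cost for action in actions)


# Hypothetical toy problem, used only to exercise the definitions.
PICK = Action("pick", frozenset({"hand_empty"}), frozenset({"holding"}),
              frozenset({"hand_empty"}), 1)
PLACE = Action("place", frozenset({"holding"}), frozenset({"placed", "hand_empty"}),
               frozenset({"holding"}), 2)
INIT = frozenset({"hand_empty"})
GOAL = frozenset({"placed"})
```

In this toy problem the sequence `[PICK, PLACE]` is a plan, whereas `[PLACE]` alone is not executable in `INIT`.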
For a given STRIPS planning problem Π = ⟨X, I, A, G, cost⟩, the delete relaxation (Bonet and Geffner 2001) is defined as Π⁺ = ⟨X, I, A⁺, G, cost⟩, where A⁺ is defined from A by replacing the set of negative effects of each member of A with the empty set. Without loss of generality, we can define Π⁺ = ⟨X, ∅, A⁺, G, cost⟩, with an additional requirement that all members of I have been removed from G, and also from the preconditions and effects of members of A⁺. We use this latter definition of relaxation in the rest of the paper.
A plan for Π⁺ is called a relaxed plan for the original problem Π. The minimal cost of plans of Π⁺ is denoted by h⁺(Π). If there is no relaxed plan for Π, we set h⁺(Π) to ∞.
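As a hedged illustration of these definitions, the sketch below builds the delete relaxation with an empty initial state and computes h⁺ by exhaustive enumeration of action subsets. This brute-force search is exponential and is not the method studied in this paper; the dictionary layout, function names, and the example problem are assumptions.

```python
from itertools import combinations
from math import inf


def delete_relaxation(init, actions, goal):
    """Return (A+, G) for the variant with empty initial state.

    `actions` maps a name to (pre, add, delete, cost); delete effects are
    dropped and members of `init` are removed from pre, add, and the goal.
    """
    init = frozenset(init)
    relaxed = {
        name: (frozenset(pre) - init, frozenset(add) - init, cost)
        for name, (pre, add, _delete, cost) in actions.items()
    }
    return relaxed, frozenset(goal) - init


def reachable(relaxed, chosen):
    """Atoms derivable from the empty state using only the chosen actions."""
    state = set()
    changed = True
    while changed:
        changed = False
        for name in chosen:
            pre, add, _cost = relaxed[name]
            if pre <= state and not add <= state:
                state |= add
                changed = True
    return state


def h_plus(init, actions, goal):
    """Minimal cost of a relaxed plan, or inf when none exists (brute force)."""
    relaxed, relaxed_goal = delete_relaxation(init, actions, goal)
    names = sorted(relaxed)
    best = inf
    for size in range(len(names) + 1):
        for chosen in combinations(names, size):
            if relaxed_goal <= reachable(relaxed, chosen):
                best = min(best, sum(relaxed[name][2] for name in chosen))
    return best


# Hypothetical example: either a then b (cost 1 + 2) or c alone (cost 4).
EXAMPLE_ACTIONS = {
    "a": ({"i"}, {"p"}, {"i"}, 1),
    "b": ({"p"}, {"g"}, set(), 2),
    "c": ({"i"}, {"g"}, set(), 4),
}
```

Enumerating subsets rather than sequences suffices here because, without delete effects, no action needs to be applied more than once.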

Answer set programming
In this work, we consider logic programs that consist of rules of the forms:
a ← b_1, ..., b_n, not c_1, ..., not c_m. (1)
{a} ← b_1, ..., b_n, not c_1, ..., not c_m. (2)
The symbols a, b_1, ..., b_n with n ≥ 0, and c_1, ..., c_m with m ≥ 0 occurring in the rules are (propositional) atoms and "not" denotes negation by default. Rules of the forms (1) and (2) are known as normal and choice rules, respectively (Simons et al. 2002). Intuitively, each rule r gives a reason to derive its head head(r) = a if the conditions in its body body(r) are met, i.e., the atoms b_1, ..., b_n can be derived and the atoms c_1, ..., c_m cannot be derived by other rules. For a choice rule r of form (2), the derivation of head(r) is optional, enabling an exception to head(r) being false by default. We write body⁺(r) and body⁻(r) for the sets of atoms b_1, ..., b_n (resp. c_1, ..., c_m) occurring positively (resp. negatively) in body(r). We say that r is a positive rule if body⁻(r) is empty.
The signature of a logic program P is the set of atoms At(P) = ⋃_{r∈P} ({head(r)} ∪ body⁺(r) ∪ body⁻(r)) that occur in P. The positive dependency graph of P is DG⁺(P) = ⟨At(P), ⪰⟩ where a ⪰ b holds for a, b ∈ At(P) if head(r) = a and b ∈ body⁺(r) for some rule r ∈ P. If a ⪰ b, we say that a depends on b, and also denote this by ⟨a, b⟩ ∈ DG⁺(P).
An interpretation I ⊆ At(P) determines which atoms a ∈ At(P) are true (a ∈ I) and which are false (a ∉ I). Then I satisfies a rule r ∈ P of form (1) if a ∈ I whenever body⁺(r) ⊆ I and body⁻(r) ∩ I = ∅, whereas a choice rule of form (2) is satisfied by every interpretation. The interpretation I is a (classical) model of P, denoted I ⊨ P, if I satisfies every rule of P. A program consisting of positive normal rules only has a unique least model, denoted by LM(·). Given an interpretation I, the reduct r^I of r with respect to I is obtained by partially evaluating the negative conditions of r. For a normal rule (1), r^I = ∅ if c_i ∈ I for some 1 ≤ i ≤ m and r^I = {a ← b_1, ..., b_n} otherwise. For a choice rule (2), the latter case additionally requires that a ∈ I. Finally, for an entire logic program P, the reduct is P^I = ⋃_{r∈P} r^I, and I is a stable model of P iff I = LM(P^I). For the purposes of this work, it is also useful to distinguish the supporting rules of P with respect to I, denoted by SR_P(I), which are the normal rules whose bodies are satisfied, and the choice rules whose bodies and heads are satisfied. Then, a model I ⊨ P is supported (by P) when I = {head(r) | r ∈ SR_P(I)}. Each stable model of P is supported, but supported models are not necessarily stable, such as I = {a} for P = {a ← a.}.
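The two semantics can be compared on small programs with a brute-force checker. The sketch below is only an illustration of the definitions of reduct, least model, and supporting rules; the tuple-based rule representation and all function names are assumptions, and real ASP solvers work very differently.

```python
from itertools import combinations


def rule(head, pos=(), neg=(), choice=False):
    """A rule as (head, positive body, negative body, is_choice)."""
    return (head, frozenset(pos), frozenset(neg), choice)


def atoms_of(program):
    out = set()
    for head, pos, neg, _choice in program:
        out |= {head} | pos | neg
    return out


def is_model(program, interp):
    """Normal rules must be satisfied; choice rules are always satisfied."""
    for head, pos, neg, choice in program:
        if not choice and pos <= interp and not (neg & interp) and head not in interp:
            return False
    return True


def reduct(program, interp):
    """Positive rules (head, pos) left after evaluating negation w.r.t. interp."""
    out = []
    for head, pos, neg, choice in program:
        if neg & interp:
            continue
        if choice and head not in interp:
            continue
        out.append((head, pos))
    return out


def least_model(positive_rules):
    lm = set()
    changed = True
    while changed:
        changed = False
        for head, pos in positive_rules:
            if pos <= lm and head not in lm:
                lm.add(head)
                changed = True
    return lm


def is_stable(program, interp):
    return least_model(reduct(program, interp)) == set(interp)


def is_supported(program, interp):
    if not is_model(program, interp):
        return False
    heads = {
        head
        for head, pos, neg, choice in program
        if pos <= interp and not (neg & interp) and (not choice or head in interp)
    }
    return heads == set(interp)


def models(program, check):
    """All interpretations over At(P) passing `check`, smallest first."""
    universe = sorted(atoms_of(program))
    return [
        set(c)
        for k in range(len(universe) + 1)
        for c in combinations(universe, k)
        if check(program, set(c))
    ]


# The program P = {a <- a.} from the text.
LOOP = [rule("a", pos=["a"])]
```

For `LOOP`, the checker reports `{a}` as supported but not stable, matching the closing remark of the paragraph above.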

Relaxed plans captured with stable models of logic programs
Typically, modeling planning problems as answer set programs is done by assuming a number of time steps for the output plan, which is also mirrored in the structure of the produced logic program (Son et al. 2006). Here, however, we show that, as far as finding relaxed plans is concerned, one can encode the planning problem in such a way that there is no need for a multi-step structure.
Let Π = ⟨X, I, A, G, cost⟩ be a STRIPS planning problem, Π⁺ = ⟨X, ∅, A⁺, G, cost⟩ be the delete relaxation of Π, and P be a logic program consisting of rules of the form (1) g ← not g for every g ∈ G; (2) {a} ← q_1, ..., q_n for every a⃗ ∈ A with pre(a⃗) = {q_1, ..., q_n}; (3) p ← a for every a⃗ ∈ A and p ∈ add(a⃗). Intuitively, the first rule guarantees all goal atoms to be true in a model. The second rule explains the necessary conditions for the execution of an action a⃗. The third rule enforces the positive effects in case a⃗ has been chosen to be in the model.
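The three rule schemes can be generated mechanically. The following sketch prints them in a clingo-style concrete syntax (`:-` for ←); that syntax, the function name, and the assumption that atom and action names are distinct lowercase identifiers are choices made here for illustration and are not prescribed by the text.

```python
def encode_stable(actions, goal):
    """Rules of P for a delete-free problem with empty initial state.

    `actions` maps an action name to (pre, add); names double as action atoms.
    """
    rules = []
    for g in sorted(goal):
        rules.append(f"{g} :- not {g}.")            # scheme (1)
    for name in sorted(actions):
        pre, add = actions[name]
        body = ", ".join(sorted(pre))
        head = f"{{{name}}}"
        rules.append(f"{head} :- {body}." if body else f"{head}.")  # scheme (2)
        for p in sorted(add):
            rules.append(f"{p} :- {name}.")         # scheme (3)
    return rules


# Hypothetical two-action chain: a adds p, b needs p and adds g.
CHAIN = {
    "a": (frozenset(), frozenset({"p"})),
    "b": (frozenset({"p"}), frozenset({"g"})),
}
```

Note that the program has no time-step argument at all: its size is linear in the size of the ground problem.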
We show that more relaxed semantics of models could not play the same role. Example 1 shows that neither the classical models nor the supported models of P are generally suitable for capturing the relaxed plans of Π correctly.
It is easy to check that M = {a, b, p, q} is both a classical and a supported model for P. However, P has no stable model, due to circularities involved in the encoding. ■
We now formally show that P captures the relaxed plans of Π as its stable models.

Theorem 1
There is a bijection f(A′) = ⋃_{a⃗∈A′} (add(a⃗) ∪ {a}) between all subsets A′ of A⁺ which can be ordered to produce a relaxed plan for Π, and all stable models of P.

Proof
We first show that f is well-defined, i.e., if π = a⃗_1, ..., a⃗_m is a permutation of members of A′ such that π is a relaxed plan for Π, then M = f(A′) is a stable model of P. For every g ∈ G, g must be added by some action in π. Thus, the reduct P^M consists of rules of the form (1) a ← q_1, ..., q_n for every a⃗ ∈ π and pre(a⃗) = {q_1, ..., q_n}, and (2) p ← a for every a⃗ ∈ A and p ∈ add(a⃗).
Clearly, M is a model for P^M. By bounded induction on the lengths of prefixes of π, we show that M is a subset of any model for P^M. As we explained above, the initial state of the relaxed problem is (safely) assumed to be an empty set. Therefore, a⃗_1 cannot have any precondition. Thus, P^M includes a rule of the form (a_1.), and add(a⃗_1) ∪ {a_1} is a subset of any model for P^M. Assume that for 1 ≤ j < m, ⋃_{i=1,...,j} (add(a⃗_i) ∪ {a_i}) is a subset of any model for P^M. Since a⃗_{j+1} is executable in exec_{a⃗_1,...,a⃗_j}(∅), pre(a⃗_{j+1}) is a subset of ⋃_{i=1,...,j} add(a⃗_i). Because P^M includes the two types of rules explained above for a⃗_{j+1}, we conclude that ⋃_{i=1,...,j+1} (add(a⃗_i) ∪ {a_i}) is a subset of any model for P^M.
Clearly, f is injective. We now show that f is also surjective, i.e., if M is a stable model of P, then there exists A′ ⊆ A⁺ such that M = f(A′), and A′ can be permuted to produce a relaxed plan for Π. Let A′ = {a⃗ | a ∈ M}. We have G ⊆ M because for every g ∈ G, P includes the rule g ← not g. The reduct P^M consists of rules of the form (1) a ← q_1, ..., q_n for every a⃗ ∈ A′ and pre(a⃗) = {q_1, ..., q_n} and (2) p ← a for every a⃗ ∈ A and p ∈ add(a⃗). If p is added by some action a⃗ ∈ A′, then clearly we must have p ∈ M. On the other hand, for every p ∈ X, if p ∈ M, then p is added by some action a⃗ ∈ A′, since otherwise M \ {p} would also be a model for P^M, contradicting that M is the least model for P^M. We conclude that M = f(A′) and if A′ can be ordered to produce a sequence of actions executable in I, then that sequence is also a relaxed plan for Π.
For the sake of contradiction, assume that A′ cannot be ordered to produce a sequence of actions executable in I. Let A′′ be a (possibly empty) proper subset of A′ such that its members (if any) can be ordered to produce a sequence of actions executable in I, and furthermore, let A′′ be maximal in the sense that there is no subset of A′ with such a property that is also a proper superset of A′′. Let M′ = f(A′′), which is a proper subset of M. For every a⃗ ∈ A′′, we have a ∈ M′ and add(a⃗) ⊆ M′, so M′ trivially satisfies a ← q_1, ..., q_n and every rule p ← a. On the other hand, for every a⃗ ∈ A′ \ A′′, the maximality of A′′ implies that at least one precondition of a⃗ is not in M′, and therefore, a ← q_1, ..., q_n is vacuously satisfied. We conclude that M′ is a model for P^M, contradicting that M is the least model for P^M.
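The notion of a subset of actions that "can be ordered to produce a relaxed plan", together with the mapping f of Theorem 1, can be sketched as follows. The greedy fixpoint ordering, the function names, and the example data are illustrative assumptions rather than part of the proof.

```python
def order_actions(chosen, goal):
    """Return an executable ordering of ALL chosen actions that reaches the
    goal from the empty state, or None if no such ordering exists.

    `chosen` maps an action name to (pre, add) of a delete-free action.
    """
    state = set()
    order = []
    remaining = dict(chosen)
    progress = True
    while remaining and progress:
        progress = False
        for name in sorted(remaining):
            pre, add = remaining[name]
            if pre <= state:
                state |= add
                order.append(name)
                del remaining[name]
                progress = True
    if remaining or not set(goal) <= state:
        return None
    return order


def f(chosen):
    """The mapping of Theorem 1: union of add(a) and {a} over chosen actions."""
    model = set()
    for name, (_pre, add) in chosen.items():
        model |= set(add) | {name}
    return model


# Hypothetical data: a chain that works and a pair that needs each other.
CHAIN = {
    "a": (frozenset(), frozenset({"p"})),
    "b": (frozenset({"p"}), frozenset({"g"})),
}
CIRCULAR = {
    "a": (frozenset({"q"}), frozenset({"p"})),
    "b": (frozenset({"p"}), frozenset({"q"})),
}
```

Because applying a delete-free action never disables another one, the greedy loop finds an ordering whenever one exists; the circular pair corresponds to the stuck situation considered in the final paragraph of the proof.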
Theorem 1 shows that if P is augmented with an optimization constraint requiring minimization over the summation of the costs of actions in the answer sets, the cost of an optimal stable model of P is equal to h⁺(Π).
The program P can be seen as a causal encoding of relaxed plans of Π. That is because the direction of explaining the logic of relaxed plan computation in P is from preconditions to actions, and from actions to effects. In other words, the direction is from causes to effects. Alternatively, a diagnostic encoding would explain the logic of relaxed plan computation from effects to actions, and from actions to preconditions. In the next section, we show how this latter paradigm could be used for computing relaxed plans.
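One way to state such an optimization constraint is a `#minimize` statement in clingo-style syntax; the text does not fix a concrete syntax, so the statement format and the helper names below are assumptions used for illustration.

```python
def minimize_statement(costs):
    """Build '#minimize { c,a : a; ... }.' from a name -> cost mapping."""
    terms = "; ".join(f"{cost},{name} : {name}" for name, cost in sorted(costs.items()))
    return f"#minimize {{ {terms} }}."


def model_cost(model, costs):
    """Sum of the costs of the action atoms that are true in the model."""
    return sum(cost for name, cost in costs.items() if name in model)
```

With costs `{"a": 1, "b": 2}`, the model `{a, b, p, g}` has cost 3, which is the value an optimizing solver would report for that answer set.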

Relaxed plans captured with supported models of logic programs
In this section, we recall the instrumentation of logic programs with acyclicity constraints, which allows capturing the stable models of a given logic program P with the supported models of another program Tr_ACYC(P) which are acyclic with respect to an underlying graph (Bomanson et al. 2016). We provide an adaptation of this method based on the structure of program P explained above. We then review the so-called vertex elimination method, used previously for cycle prevention in the produced models of SAT formulas (Rankooh and Rintanen 2022c; Rankooh and Janhunen 2022). We next show how vertex elimination could also be used to translate Tr_ACYC(P) to a new program P_c such that the supported models of P_c represent acyclic supported models of Tr_ACYC(P), and thus, stable models of P and relaxed plans of Π. Based on the structure of P_c, we introduce another logic program P_d which describes the relaxed plans diagnostically. We prove that the supported models of P_d represent those of P_c, thereby capturing the stable models of P and relaxed plans of Π.

Instrumentation of logic programs with acyclicity constraint
We adopt the acyclicity translation Tr_ACYC(P) of a logic program P (Bomanson et al. 2016) that deploys special dependency atoms dep(x, y) to express the activation of the respective arc ⟨x, y⟩ ∈ DG⁺(P) in the acyclicity constraint. For the sake of the compactness of the output program, instead of using the exact method, we customize the translation method considering the structure of the program P explained above. In particular, we circumvent the introduction of dependency atoms for actions, by establishing dependencies only between atoms of the original planning problem. This way, the underlying graphs for which acyclicity must be guaranteed become considerably smaller than DG⁺(P).
The idea is to instrument P explained in the previous section with additional rules that capture well-support for atoms p ∈ X. For each pair ⟨p, q⟩, if there exists a⃗ ∈ A such that p ∈ add(a⃗) and q ∈ pre(a⃗), the potential dependency of p on q is expressed using a choice rule {dep(p, q)} ← q. Also, atoms ws(a_1, p), ..., ws(a_k, p), for actions {a⃗_1, ..., a⃗_k} that add p, enforce the well-support for p in terms of k rules p ← ws(a_i, p) for i = 1, ..., k. For an atom p ∈ X, the rule (3) below captures the option that the well-support for p is provided by some action a⃗ such that pre(a⃗) = {q_1, ..., q_n} and p ∈ add(a⃗).
{ws(a, p)} ← dep(p, q_1), ..., dep(p, q_n). (3)
Also, the rule a ← ws(a, p) captures the atom a in the supported models, in the case that the action a⃗ has been used to provide well-support for p. As in program P, we need a rule g ← not g for every g ∈ G to guarantee that every goal atom has been produced. For Tr_ACYC(P) obtained in this way, the distinction between stable and supported models
{dep(p, q)} ← q. {dep(q, p)} ← p. {ws(a, q)} ← dep(q, p).
It can easily be checked that M = {a, b, p, q, ws(a, q), ws(b, p), dep(p, q), dep(q, p)} is the only supported model for Tr_ACYC(P). However, this model is not acyclic, as it contains both dep(p, q) and dep(q, p). ■
Similarly to the stable model based encoding, Tr_ACYC(P) is a causal encoding, expressing the inference in the direction from preconditions to actions, and from actions to effects. However, there are additional concepts in this encoding, namely dependencies and well-support. In fact, in Tr_ACYC(P), preconditions are assumed to cause dependencies, which in turn cause well-support and effects. Here, well-support atoms ws(a, p) take the causal role that action atoms a have in P. The action atoms are only included in Tr_ACYC(P) to represent their cost in the minimization constraint. The rules in Example 2 establish the inference direction from preconditions to dependencies (the first row), from dependencies to well-support (the second row), and from well-support to effects (the third and the fourth rows). The final rule captures the goal condition (as before).
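The customized translation described in this subsection can be sketched as a rule generator. The clingo-style syntax, the function name, and the two-action circular problem below (including its goal) are assumptions made for illustration; the acyclicity requirement itself is not expressed by these rules and has to be enforced separately.

```python
def encode_tr_acyc(actions, goal):
    """Rules of the customized Tr_ACYC(P), without the acyclicity check.

    `actions` maps an action name to (pre, add).
    """
    rules = set()
    for name, (pre, add) in actions.items():
        for p in add:
            for q in pre:
                # A precondition may activate the dependency of p on q.
                rules.add(f"{{dep({p},{q})}} :- {q}.")
            # Rule (3): well-support of p by this action needs all dependencies.
            body = ", ".join(f"dep({p},{q})" for q in sorted(pre))
            head = f"{{ws({name},{p})}}"
            rules.add(f"{head} :- {body}." if body else f"{head}.")
            rules.add(f"{p} :- ws({name},{p}).")      # well-support yields effect
            rules.add(f"{name} :- ws({name},{p}).")   # action atom carries the cost
    for g in goal:
        rules.add(f"{g} :- not {g}.")
    return sorted(rules)


# Hypothetical circular problem: a needs p and adds q, b needs q and adds p.
CIRCULAR = {
    "a": (frozenset({"p"}), frozenset({"q"})),
    "b": (frozenset({"q"}), frozenset({"p"})),
}
```

For the circular problem, any supported model must contain both dep(p,q) and dep(q,p), which is exactly the kind of cyclic model that the acyclicity requirement rules out.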

Vertex elimination graphs
The concept of vertex elimination graphs has been recently shown effective for guaranteeing acyclicity in constraint programs with underlying graphs.The concept of vertex elimination for digraphs was originally introduced by Rose and Tarjan (1975).
Given a digraph G = ⟨V, E⟩, an ordering of V is a bijection α : {1, ..., n} → V, where n = |V|. For a vertex v, the fill-in of v, denoted by F(v), is the set of arcs from the in-neighbors of v to the out-neighbors of v, formally defined by
F(v) = {⟨x, y⟩ | ⟨x, v⟩ ∈ E, ⟨v, y⟩ ∈ E, x ≠ y}. (4)
The v-elimination graph of G is produced by removing v from G, and adding the fill-in of v to the resulting graph. Formally,
G(v) = ⟨V \ {v}, (E ∪ F(v)) \ {⟨x, y⟩ | x = v or y = v}⟩.
Given a digraph G and an ordering α of its vertices, the elimination process of G according to α is the sequence G = G_0, G_1, ..., G_{n−1}, where G_i is the α(i)-elimination graph of G_{i−1} for i = 1, ..., n − 1. The fill-in of the digraph G according to α, denoted by F_α(G), is the set of all arcs added to G in the vertex elimination process. Formally, F_α(G) is defined by (5), where F_{i−1}(α(i)) is the fill-in of α(i) in G_{i−1}:
F_α(G) = ⋃_{i=1,...,n−1} F_{i−1}(α(i)). (5)
The vertex elimination graph of G according to α, denoted by G*_α, is the union of all graphs produced in the elimination process of G according to α:
G*_α = ⟨V, E ∪ F_α(G)⟩.
For any digraph G, the number of arcs of the vertex elimination graph depends on the ordering function α. It has been shown that the problem of finding the optimal ordering function, the one resulting in the smallest number of arcs in the vertex elimination graph, is NP-complete (Rose and Tarjan 1975). Nevertheless, there are effective heuristics for finding empirically useful orderings. Examples are the minimum fill-in and minimum degree heuristics, which accordingly choose a vertex for removal at each step during the elimination process. One important property of vertex elimination is that if the original graph G has a directed cycle, then G*_α will have a cycle of length 2, regardless of the ordering α.
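The elimination process can be sketched in a few lines of Python. The code below is an illustrative implementation under stated assumptions: arcs are pairs of vertex names without self-loops, the ordering is chosen on the fly by a minimum degree rule with alphabetical tie-breaking, and all function names are hypothetical.

```python
def eliminate(arcs, v):
    """Return (arcs of the v-elimination graph, fill-in of v)."""
    preds = {x for x, y in arcs if y == v and x != v}
    succs = {y for x, y in arcs if x == v and y != v}
    fill = {(x, y) for x in preds for y in succs if x != y}
    rest = {(x, y) for x, y in arcs if x != v and y != v}
    return rest | fill, fill


def vertex_elimination_graph(vertices, arcs):
    """Arcs of the vertex elimination graph and the ordering that was used."""
    current = set(arcs)
    star = set(arcs)
    remaining = set(vertices)
    order = []
    while remaining:
        # Minimum degree: fewest incident arcs in the current graph.
        v = min(sorted(remaining),
                key=lambda u: sum(1 for x, y in current if u in (x, y)))
        current, fill = eliminate(current, v)
        star |= fill
        remaining.remove(v)
        order.append(v)
    return star, order


def has_two_cycle(arcs):
    return any((y, x) in arcs for x, y in arcs if x != y)
```

On the directed triangle a → b → c → a, eliminating a adds the arc ⟨c, b⟩, which together with ⟨b, c⟩ forms a cycle of length 2; on the path a → b → c no fill-in arc is added at all.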

The causal encoding based on supported models
Consider Tr_ACYC(P) explained above. Let G be the graph of all dependencies of Tr_ACYC(P). Formally, G = ⟨X, E⟩, where E = {⟨p, q⟩ | dep(p, q) ∈ At(Tr_ACYC(P))}. Also, for each supported model M of Tr_ACYC(P), let G_M be the graph of all dependencies in M, i.e., G_M = ⟨X, E_M⟩, where E_M = {⟨p, q⟩ | dep(p, q) ∈ M}. Assume that α is an ordering of the members of X and G = G_0, G_1, ..., G_{n−1} is the elimination process of G according to α. Let G*_α = ⟨X, E*⟩ and G*_{M,α} = ⟨X, E*_M⟩ be the vertex elimination graphs of G and G_M according to α, respectively.
We produce the causal supported model semantics based encoding of Π as logic program P c by adding the following rules to Tr ACYC (P).For every ⟨p, q⟩ Also, for every p and q such that ⟨p, q⟩ ∈ G * α and ⟨q, p⟩ ∈ G * α , we add Intuitively, for any vertex ordering α, and any supported model M of Tr ACYC (P), the rule (7) extends M by atoms representing the arcs in G * M,α , the vertex elimination graph of G M according to α, while the rule (8) guarantees that G * M,α has no cycle of length 2.
Theorem 2
Let A′ be a subset of A⁺. There exists a permutation π of members of A′ such that π is a relaxed plan for Π iff P_c has a supported model M such that A′ = {a⃗ ∈ A⁺ | a ∈ M}.

Proof
Assume first that π = a⃗_1, ..., a⃗_m is a permutation of the members of A′ that is a relaxed plan for Π. Then, according to Theorem 1, ⋃_{i=1,...,m} add(a⃗_i) ∪ {a_1, ..., a_m} is a stable model of P. By Proposition 1, Tr_ACYC(P) has an acyclic supported model N. Let G_N = ⟨X, E_N⟩, where E_N = {⟨p, q⟩ | dep(p, q) ∈ N}, and let G*_{N,α} = ⟨X, E*_N⟩ be the vertex elimination graph of G_N according to α. Since G_N is acyclic, X can be ordered by topological sorting according to G_N. Now, if the vertex elimination process adds the arc ⟨p, q⟩, then p must be ordered before q by the topological sorting. Therefore, G*_{N,α} is also acyclic. It should now be easy to check that M = N ∪ {dep(p, q) | ⟨p, q⟩ ∈ E*_N} is a supported model of P_c.
Conversely, let M be a supported model of P_c such that A′ = {a⃗ ∈ A⁺ | a ∈ M}, and let G_M = ⟨X, E_M⟩, where E_M = {⟨p, q⟩ | dep(p, q) ∈ M}. Assume that k > 1 is the smallest number for which there exists a cycle of length k in G_M. Then there are atoms dep(p_1, p_2), ..., dep(p_{k−1}, p_k), dep(p_k, p_1) in M. According to the rule (8), k cannot be equal to 2. Let i = argmin_{1≤j≤k} α⁻¹(p_j). Then p_i is the vertex in the mentioned cycle that is eliminated before all other vertices in the cycle according to α. According to the rule (7), dep(p_{i−1}, p_{i+1}) ∈ M (with indices considered modulo k), and therefore G_M has a cycle of length k − 1, a contradiction. Hence G_M is acyclic. Let N = M ∩ At(Tr_ACYC(P)). A straightforward investigation shows that N is a supported model of Tr_ACYC(P). By Proposition 1, N′ = N ∩ At(P) is a stable model of P. Since A′ = {a⃗ ∈ A⁺ | a ∈ N′}, by Theorem 1, there exists a permutation π of members of A′ such that π is a relaxed plan for Π.

The diagnostic encoding based on supported models
One major approach to solving problems in the AI Planning field is to perform backward search, also known as regression, in the search space (Ghallab et al. 2004). In this approach, actions are assumed to act in reverse, i.e., producing their preconditions given they have some effects relevant to the current search node. The main drawback of this approach is that it can easily produce dead-end states, which are not reachable from the initial state. The notion of reversibility of actions has been shown to be quite effective for detecting dead-end states. However, determining the reversibility of actions is itself challenging, and might even need a logic program (Faber et al. 2022) of its own. Nevertheless, the problem of detecting the dead-ends is an easy one in the case of relaxed planning, and can be done in polynomial time as a preprocessing method (Hoffmann and Nebel 2001). Therefore, this backward approach has promise to be efficient for relaxed planning.
Inferring causes from effects can be understood as diagnostic inference (Russell and Norvig 2020). In our causal encoding, we expressed the inference direction from preconditions to dependencies, from dependencies to well-supports, and from well-supports to effects. We can alternatively reverse all these directions to produce a diagnostic encoding.
In our diagnostic encoding P_d, we assume that all atoms could possibly be in the model by using the rule {p} for every p ∈ X. However, if p is in the model, then it must have well-support by at least one action. We establish this by adding {ws(a, p)} ← p for every a⃗ ∈ A such that p ∈ add(a⃗), and also f ← p, not ws(a_1, p), ..., not ws(a_m, p), not f for p ∈ X and all actions a⃗_1, ..., a⃗_m that could add p. The first rule provides the possibility of well-support atoms being in a supported model, while the second rule requires at least one of the well-support atoms to be in the model. To represent the inference from well-supports to dependencies, we add dep(p, q) ← ws(a, p) for a⃗ ∈ A, q ∈ pre(a⃗), and p ∈ add(a⃗). Finally, to establish the inference direction from dependencies to preconditions, we add q ← dep(p, q). As in P_c, all rules in the forms of (7) and (8) must be included to enforce acyclicity in the supported model. Moreover, we add a ← ws(a, p) for a⃗ ∈ A and p ∈ add(a⃗), to enable an action atom a to represent its cost in the minimization constraint, and also g ← not g for every g ∈ G to guarantee that goal atoms are included in the model.
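The rule schemes of the diagnostic encoding listed above can be generated as follows. This is a sketch under stated assumptions: the output uses a clingo-style syntax, the auxiliary atom `f` is assumed not to clash with any proposition name, and the acyclicity rules of forms (7) and (8) are deliberately left out because they are not spelled out in this section.

```python
def encode_diagnostic(atoms, actions, goal):
    """Rules of P_d except the acyclicity rules (7) and (8).

    `actions` maps an action name to (pre, add).
    """
    rules = []
    adders = {
        p: sorted(name for name, (_pre, add) in actions.items() if p in add)
        for p in atoms
    }
    for p in sorted(atoms):
        rules.append(f"{{{p}}}.")                          # p may be true
        for name in adders[p]:
            rules.append(f"{{ws({name},{p})}} :- {p}.")    # possible well-support
        body = [p] + [f"not ws({name},{p})" for name in adders[p]] + ["not f"]
        rules.append(f"f :- {', '.join(body)}.")           # at least one is needed
    for name in sorted(actions):
        pre, add = actions[name]
        for p in sorted(add):
            for q in sorted(pre):
                rules.append(f"dep({p},{q}) :- ws({name},{p}).")  # ws to dependency
                rules.append(f"{q} :- dep({p},{q}).")             # to precondition
            rules.append(f"{name} :- ws({name},{p}).")     # action atom for cost
    for g in sorted(goal):
        rules.append(f"{g} :- not {g}.")
    return list(dict.fromkeys(rules))  # drop duplicates, keep order


# Hypothetical two-action chain: a adds p, b needs p and adds g.
CHAIN = {
    "a": (frozenset(), frozenset({"p"})),
    "b": (frozenset({"p"}), frozenset({"g"})),
}
```

Compared with the causal generator, every arrow is reversed: effects point to well-supports, well-supports to dependencies, and dependencies to preconditions.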
It is quite easy to check that if P_d has a supported model M, then M is also a supported model of P_c. On the other hand, it can be shown in a straightforward manner that if N is a supported model of P_c, then N \ L is a supported model of P_d, where L is the set of atoms dep(p, q) for which there is no action a⃗ such that ws(a, p) ∈ N and q ∈ pre(a⃗). Thus, we have the following result:
Theorem 2 and Theorem 3 can be used to establish Corollary 1.

Corollary 1
Let A′ be any subset of A⁺. There exists a permutation π of members of A′ such that π is a relaxed plan for Π iff P_d has a supported model M such that A′ = {a⃗ ∈ A⁺ | a ∈ M}.

Empirical results
We have implemented our encoding methods inside the HSP* planner (Haslum 2015). The implementation is available under the ASPTOOLS collection. All experiments have been run on a cluster of Linux machines with Intel Xeon 2.40 GHz CPUs, using a timeout of 1800 seconds per problem, and a memory limit of 8 GB. For our supported model based encodings, where vertex elimination is used, we determine the elimination order with the minimum degree heuristic, i.e., we eliminate a vertex with the minimal total number of incoming and outgoing arcs in the graph produced after the elimination of previously eliminated vertices.
Our three implemented encodings are (1) our stable model based encoding P; (2) our causal supported model based encoding P_c; and (3) our diagnostic supported model based encoding P_d. As the solver we use CLASP 3.3.5, which is capable of optimizing over both stable and supported models. The CLASP solver searches for stable models by default. We enable the search for supported models only for our P_c and P_d encodings. As the optimization strategy we use the unsatisfiable core (USC) based search, which our preliminary experiments showed to significantly outperform the branch-and-bound (BB) strategy for the mentioned encodings. Although CLASP offers a variety of search strategies, we only use the default one. Therefore, the solver parameters have not been tuned to produce the best performance for our new methods. Henceforth, we refer to the method obtained by combining CLASP with our P, P_c, and P_d encodings simply by the name of the corresponding encoding.
To evaluate the efficiency of our methods, we have compared them, based on the total time of encoding and solving, with IP, the integer programming based encoding by Rankooh and Rintanen (2022a), which uses IBM ILOG CPLEX Optimization Studio 20.1 as the optimizer. Regardless of the given time limit, IP has been shown to outperform previously introduced methods for optimal  (2012). Since IP has also been implemented inside the HSP* planner (Haslum 2015), all competing methods share the same code for reading the input problem, grounding, and preprocessing.
As benchmark problem sets, we use the STRIPS planning problem sets found in the planning repository (https://github.com/AI-Planning/classical-domains). From the IPC domains, domains from both the optimal and the so-called satisficing tracks have been considered. In total, 2212 problem instances from 84 problem sets are used for comparison. Note that this is exactly the benchmark set used by Rankooh and Rintanen (2022a) for comparing IP with previously introduced methods.
The cumulative numbers of problems solved by all methods are presented in Figure 1. Out of the 2212 problems under evaluation, the cost of an optimal relaxed plan was computed within 1800 seconds for 1980, 1982, 1894, and 1567 problems by IP, P d, P c, and P, respectively. As can be seen in Figure 1, our supported model based encodings significantly outperform the stable model based one, with the diagnostic encoding performing visibly faster than the causal one. Also, even though the number of problems solved within 1800 seconds by our diagnostic encoding is not much higher than that of IP, P d solves problems considerably faster than IP. In fact, regardless of the time limit, P d solves more problems than any other solver. In particular, P d solves 1091 problems in less than 0.1 seconds, more than double the 516 problems solved by IP within the same time limit.
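A cumulative curve of this kind can be derived from per-instance runtimes as in the following Python sketch. The function name `solved_within` and the convention of using `None` for unsolved instances are assumptions made for illustration. Nothing here reproduces the actual measurements.

```python
from bisect import bisect_right
from typing import List, Optional, Sequence


def solved_within(runtimes: Sequence[Optional[float]],
                  limits: Sequence[float]) -> List[int]:
    """For each time limit, count the instances solved within that limit.

    `runtimes` holds one total time (encoding plus solving, in seconds) per
    instance, or None if the instance was not solved at all.
    """
    finished = sorted(t for t in runtimes if t is not None)
    # bisect_right counts every runtime that is <= the limit.
    return [bisect_right(finished, limit) for limit in limits]
```

Evaluating the function on a grid of time limits for each method gives one curve per method, so the methods can be compared at any limit rather than only at the final timeout.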

Conclusions and future research
In this work, we study the previously uninvestigated application of ASP solvers to optimal relaxed planning. Three different encodings of relaxed planning problems into logic programs are provided, one based on the stable model semantics, and two based on the supported model semantics of logic programs. According to our empirical results, all our encodings enable CLASP to outperform the state-of-the-art method if the time limit is small. Moreover, our diagnostic supported model based method outperforms the state-of-the-art solver on the studied benchmark problems regardless of the used time limit.
One direction to extend the current work is to study the impact of our new encodings and ASP solvers when employed for computing heuristic values inside state-of-the-art planners. Since our best encoding enables CLASP to solve almost half of the studied benchmark problems in less than one tenth of a second, a direct usage of h + computed by CLASP seems promising. Also, the usage of USC as the optimization strategy allows for computing lower bounds for h + within any given time limit. It would be interesting to study the informativeness of such lower bounds in comparison to other commonly used heuristics such as LM-cut, another lower bound of h +, when given the same amount of time for computation.
), denoted I |= r, if the satisfaction of the body, denoted I |= body(r), implies that head(r) ∈ I, i.e., I |= head(r). For a choice rule r of form (2), I |= r unconditionally. Moreover, the interpretation I is a (classical) model of P if I |= r holds for every r ∈ P. Each positive normal program P has a unique least model LM(P) obtained as the intersection ⋂{I ⊆ At(P) | I |= P}.
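The least model of a positive normal program can be computed without enumerating all models, by repeatedly adding the heads of rules whose bodies are already satisfied until nothing changes. The Python sketch below illustrates this standard fixpoint construction. The rule representation (a head atom paired with a list of positive body atoms) and the names `least_model` and `is_model` are assumptions made here for illustration.

```python
from typing import Iterable, List, Sequence, Set, Tuple

# Hypothetical representation of a positive normal rule: (head, positive body atoms).
Rule = Tuple[str, Sequence[str]]


def is_model(interp: Set[str], rules: Iterable[Rule]) -> bool:
    """I |= P: whenever a rule body is satisfied, its head is in I."""
    return all(head in interp or not set(body) <= interp
               for head, body in rules)


def least_model(rules: List[Rule]) -> Set[str]:
    """Compute LM(P) by iterating rule application from the empty interpretation."""
    model: Set[str] = set()
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            if head not in model and set(body) <= model:
                model.add(head)
                changed = True
    return model
```

On a small program with a fact a, a rule deriving b from a, and two rules deriving c and d from each other, the result contains only a and b: the mutually dependent atoms c and d are never derived, although the interpretation containing all four atoms is also a model.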

Example 1
Consider a planning problem Π = ⟨X, I, A, G, cost⟩, where X = {p, q}, I = ∅, G = {p}, A = {⃗ a, ⃗ b}, pre(⃗ a) = add(⃗ b) = {p}, add(⃗ a) = pre(⃗ b) = {q}, and the cost function cost is arbitrary. This problem has no relaxed plan, as ⃗ a and ⃗ b are codependent. The logic program P explained above consists of the following rules:

disappears if we insist on acyclic models I for which the digraph induced by the set of arcs {⟨a, b⟩ | dep(a, b) ∈ I} is acyclic. We deploy the following result:

Proposition 1 (Bomanson et al. (2016))
If M is a stable model of P, then Tr ACYC (P) has an acyclic supported model N such that M = N ∩ At(P). If N is an acyclic supported model of Tr ACYC (P), then M = N ∩ At(P) is a stable model of P.

Example 2
Consider Π to be the planning problem of Example 1. The program Tr ACYC (P) consists of the following rules:

Fig. 1. Cumulative numbers of problems solved by the competing methods