An Application of a Runtime Epistemic Probabilistic Event Calculus to Decision-making in e-Health Systems

We present and discuss a runtime architecture that integrates sensorial data and classifiers with a logic-based decision-making system in the context of an e-Health system for the rehabilitation of children with neuromotor disorders. In this application, children perform a rehabilitation task in the form of games. The main aim of the system is to derive a set of parameters the child's current level of cognitive and behavioral performance (e.g., engagement, attention, task accuracy) from the available sensors and classifiers (e.g., eye trackers, motion sensors, emotion recognition techniques) and take decisions accordingly. These decisions are typically aimed at improving the child's performance by triggering appropriate re-engagement stimuli when their attention is low, by changing the game or making it more difficult when the child is losing interest in the task as it is too easy. Alongside state-of-the-art techniques for emotion recognition and head pose estimation, we use a runtime variant of a probabilistic and epistemic logic programming dialect of the Event Calculus, known as the Epistemic Probabilistic Event Calculus. In particular, the probabilistic component of this symbolic framework allows for a natural interface with the machine learning techniques. We overview the architecture and its components, and show some of its characteristics through a discussion of a running example and experiments. Under consideration for publication in Theory and Practice of Logic Programming (TPLP).


Introduction
In this paper, we present and discuss an architecture for integrating and reasoning about sensors in the context of the AVATEA project (Advanced Virtual Adaptive Technologies e-hEAlth).The final goal of the project is to design and implement an integrated system to support the rehabilitation process of children with Development Coordination Disorders (DCDs).The system needs to integrate and control several components, including an adjustable seat, various types of sensors, and an interactive visual interface to perform rehabilitation exercises in the form of games (such games are sometimes called exergames, presented by Vernadakis et al. (2015)).One of the main goals of the project is to automatize the therapeutic task, ideally without making it less effective.To this end, we employ a logic-based AI engine that collects data from the environment and decides what strategy may be implemented to make the video game as challenging as possible while keeping the child engaged and attentive.For example, the engine may decide to emit a sound if the visual attention of the user is detected to be low, or increase the video game's difficulty level if the sensors seem to indicate that the child is bored as s/he is performing the task effortlessly.
Although this architecture has been specifically designed for the rehabilitation of children with neuro-motor disorders, it can be more generally applied to any task requiring a system to take runtime decisions according to a stream of sensorial data.It mainly consists of four modules communicating with each other (see Figure 1).Layers of sensors and classifiers (e.g., a webcam paired with head-pose and emotion recognition algorithms) collect and process information about the current state of the environment and the child undergoing the rehabilitation process.This information is fed to a probabilistic logic programming system, a modified version of the Epistemic Probabilistic Event Calculus (EPEC for short, first introduced by D' Asaro et al. (2017) as a non-epistemic framework under the name PEC and extended in (D'Asaro et al. 2020) to the epistemic case).To guarantee good performance at runtime, in this work, we present a novel implementation of EPEC, dubbed PEC-RUNTIME, which implements a form of progression (Lin and Reiter 1997).PEC-RUNTIME processes the sensor data and takes decisions according to a predefined strategy.Finally, the gaming platform actuates these decisions and communicates the new state of the game to the layer of sensors.
EPEC (and consequently PEC-RUNTIME) is particularly well-suited for this task as its probabilistic nature facilitates the communication with probabilistic machine learning-based classifiers.As it is a symbolic framework, it can also be used to provide accurate humanunderstandable reports of the user activity at the end of each therapeutic session.Typical feedback includes information about attention levels of the child (e.g., "The user was highly engaged for 36 seconds during the session"), justification for the decisions taken (e.g., "Switched to higher difficulty level as the user has been carrying out the exercise correctly for the last 10 seconds"), as well as a complete report of what happened throughout the therapeutic session (e.g., "The user was not visible at 15:43:23", "Visual stimulus presented at 15:43:24", etc.) and a series of graphs for accurate tracking of user activity.
This paper extends (D'Asaro et al. 2019) and is organized as follows.In Section 2, we overview the logical part of our architecture, EPEC, and show some of its features through the discussion of a toy example.In Section 3, we discuss how we adapted the original ASP-based inference mechanism of EPEC in order to work at runtime.In Section 4, we briefly describe the layers of sensors and classifiers.In Section 5, we demonstrate our approach in our use-case and provide a systematic empirical evaluation.In Section 6 we discuss related work, in particular probabilistic  Note that a single sensor may produce data that is then fed into two or more classifiers (e.g. in the case of the webcam).Timestamped output from the classifiers is fed into PEC-RUNTIME, the logical core of the architecture.PEC-RUNTIME processes this information together with some domain independent axioms and outputs its decision to the environment (the game, in this case).
logic programming languages for reasoning about actions and gamification techniques.Some final remarks about directions for further work conclude the paper in Section 7.

Overview of the language
In this section, we overview a subset of EPEC1 that is relevant to our application, and show how it can be used to represent a domain.Similarly to other logical languages for reasoning about actions, EPEC models a domain as a collection of fluents and actions.Moreover, EPEC represents time explicitly through instants.Fluents, actions and instants constitute the principal sorts of EPEC.Actions are further sub-divided into agent actions (under the control of the agent being modeled) and environmental actions (performed by the environment).Fluents can take value in a set of values.Domain-specific theories can be designed using EPEC propositions.
As an illustration, we use the following toy domain (inspired by the AVATEA use-case) as our running example (but you can see e.g.(Acciaro et al. 2021) for other use-cases): Scenario 2.1 A child undergoes a simple attention test in which her level of visual attention is measured.The task consists in tracking a moving object on a screen.Therapists agree that when the child is not looking at the object, there is 90% chance this is due to lack of attention, and that when the child is looking at the object there is 100% chance s/he is paying attention to the task.A webcam is used to track her gaze.Readings from the webcam are only partly reliable due to hardware limitations, and are produced every second.These readings are then piped into a specialized classifier which decides whether the child is looking at the object on the screen or not.The classifier also outputs a confidence level for its classification.When the child is not looking at the screen, it may be useful to play a sound in order to re-engage the child.Two seconds after the start of the experiment, the child has been detected to be tracking the object on the screen with confidence levels 0.25 and 0.13.What is the level of attention of the child at instant 2? Should the sound be played at time 2?".
We let the set of instants in our timeline be {0, 1, 2}, whose elements can be naturally interpreted as seconds since the start of the experiment.

Syntax overview
The only fluent in our example scenario is Attention.It is boolean, i.e., it may take values and In the example, we consider the agent to be an automated system that is responsible for triggering a re-engagement strategy when necessary.On the other hand, the child is considered part of the environment.Therefore, the actions are PlaySound and TrackObject, where PlaySound is the agent action of playing a sound and TrackObject is the environmental action which corresponds to the child tracking the object with her eyes.The effects of not tracking the object are defined by the following proposition: Looking at the object is never considered to be due to chance, therefore we also include the following proposition: For what regards action PlaySound under the control of the agent, we want to formalize the conditional plan stating that this action must be performed at instant 2 if the child is believed to be distracted, i.e., if the probability of Attention has fallen below a predefined threshold, say 0.1.This can be written in EPEC as: We consider the child to be initially paying full attention to the task.This translates to: initially-one-of {({Attention}, 1)} (5) Finally, the action TrackObject occurs twice at instants 0 and 1 with confidence 0.25 and 0.13 respectively.This translates to the following propositions: TrackObject occurs-at 0 with-prob 0.25 (6) TrackObject occurs-at 1 with-prob 0.13 (7) In EPEC, Proposition (1) is known as a v-proposition (v for "value"), Propositions (2) and (3) are known as c-propositions (c for "causes"), Proposition (4) is known as a p-proposition (p for "performs"), Proposition (5) is known as an i-proposition (i for "initially"), Propositions (6) and ( 7) are known as o-propositions (o for occurs).Propositions (4), ( 6) and ( 7) are known as the narrative part of the domain.
The set of propositions {(1), . . ., (7)} constitutes what in EPEC is called a domain description and is usually denoted by D (with appropriate subscripts and/or supercripts).The narrative part of a domain is denoted by narr(D).A domain description is given meaning using a bespoke semantics.

Semantics overview
The semantics of EPEC enumerates all the possible evolutions (worlds in the terminology of EPEC) of the environment being modeled, starting from the initial state.It then applies a series of standard reasoning about actions principles (e.g.persistence of fluents, closed world assumption for actions) to filter out those evolutions of the environment that do not make intuitive sense with respect to the given domain.The remaining worlds, dubbed well-behaved worlds to indicate that they are meaningful with respect to the given domain description, are assigned a probability value.Some well-behaved worlds for our example domain can be depicted as follows: World W 1 represents a possible evolution of the environment starting from a state in which the child is initially (at instant 0) fully attentive and tracking the object with her eyes.At instant 1, the child is still paying attention to the task, but he/she is no longer tracking the object.At instant 2, the child is distracted and again not tracking the object.The sound is never played.EPEC's semantics assigns a probability of 0.19575 to this world, as this results from the product of 0.25 (due to proposition (6) and TrackObject occurring at 0 in W 1 ), 1 − 0.13 (due to proposition (7) and TrackObject not occurring at 1 in W 1 ) and 0.9 (due to proposition 2 and since not tracking the object at instant 1 caused Attention not to hold at instant 2).The intuitive meaning of world W 2 and its probability can be figured similarly.In W 2 the child is always attentive but never tracking the object, the sound is never played, and its probability is 0.006525.
There are 8 well-behaved worlds with respect to our example domain, and they are such that their probabilities sum up to 1. Formal results about the semantics of EPEC guarantee that this is always the case for any domain description and the corresponding set of well-behaved worlds.
We now want to address the question "What is the level of attention of the child at instant 2?".EPEC first needs to query the domain description about the probability of fluent Attention at instant 2. In EPEC, queries are expressed in the form of an i-formula (i.e., a formula with instants attached).In our case, the query has the form Q 1 = [Attention]@2, but more complicated queries are also possible, e.g.Q 2 = ([¬Attention]@1 ∨ [¬TrackObject]@1) ∧ [Attention]@2.We say that i-formula Q 1 has instant 2 (as this is the only instant appearing in the formula) and that the iformula Q 2 has instants > 0 (as all the instants appearing in the formula are strictly greater than 0).Answering query [Attention]@2 amounts to summing the probabilities of those well-behaved worlds in which the i-formula is true, i.e., such that Attention is true at instant 2. This results in 0.158275, and can also be written D ||= [Attention]@2 holds-with-prob 0.158275 Note that this value falls outside the interval [0, 0.1] in the precondition of Proposition (4).
Therefore, EPEC deduces that the answer to our last question in Scenario 2.1 ("Should the sound be played at time 2?") is no.Then, Proposition (4) does not fire and the action PlaySound, under the control of the agent, is not played in any of the well-behaved worlds.

Simulating epistemic actions via c-propositions and o-propositions
The reader may have noticed that our target application domain mainly deals with sensors.In its original formulation (D'Asaro et al. 2020), epistemic modeling is reserved to an additional, type of propositions, called s-propositions (s for "senses"), of the following form: for an action S, a fluent F and some matrix M representing the accuracy of A when sensing F.
Given their role within EPEC, we would need s-propositions to model sensors in our domain.However, s-propositions add a layer of complexity that in this work we would like to avoid due to the necessity of taking decisions at runtime.In this section, we informally show a possible translation procedure s-propositions into pairs of a c-proposition and an o-proposition.This aims to show that one can "improperly" use c-propositions and o-propositions (that are typically used for effectors) as a means to also model epistemic aspects of a domain.Whether a sound and complete translation is available remains however a subject for future work.
• The test came up negative at instant 0 and positive at instant 1 (as in the case of ( 16)).This is likely to happen with probability 0.184, and in this case the agent will believe that the probability of [Flu]@0 is 0.608.
• The test came up positive at instant 0 and negative at instant 1 (as in the case of ( 17)).This is likely to happen with probability 0.184, and in this case the agent will believe that the probability of [Flu]@0 is 0.608.
• The test came up positive twice (as inthe case of ( 18)).This is likely to happen with probability 0.496, and in this case the agent will believe that the probability of [Flu]@0 is 0.903.
This syntax shows the epistemic core of EPEC, which allows one to reasoning about past, present and future: for example, note that in ( 14) we reason about the perspective of an agent at instant 2 who is reasoning about whether it had Flu at instant 0 in the past.This powerful syntax comes at expenses of computational efficiency, and therefore in this work we adopt a number of simplifications due to the fact that we only need to reason about the present.The first of such simplifications is that we drop s-propositions and simulate them via o-propositions and c-propositions, in a way that we now aim to make clearer.
Assume that the agent modeled in the domain description above tests positive twice.Then, we may note that substituting the s-proposition (11) and the two p-propositions ( 12) and ( 13) with Test occurs-at 0 with-prob 0.412 (20) Test occurs-at 1 with-prob 0.452 (21) yields which matches with the result in ( 18), and therefore we may say that simulates it.Although a formal characterization of the translation from s-proposition to c-propositions and o-propositions is beyond the scope of this work, note that an s-proposition for a boolean fluent F of the form is simulated either by a c-proposition and an o-proposition if S is performed at I and produces a positive result, or by a c-proposition and an o-proposition if S is performed at I and produces a negative result, where p is the probability such that D ||= [F]@I with-prob p in the domain D under consideration.This informally shows that it is possible to meaningfully interpret c-propositions and o-propositions as epistemic actions in place of s-propositions, and justifies their employment to perform sensing.

A runtime adaptation of EPEC
The subset of EPEC introduced in Section 2 is sufficient to handle domains that require limited epistemic functionalities.For this special class of domains, one can use the non-epistemic framework to approximate the behavior of EPEC and efficiently compute probabilities of queries at runtime, at the expense of the ability to reason about the past.These assumptions are not restrictive for our use-case, as we only need to reason about the present state of the world and possibly react at runtime.In this section, we provide details about an implementation, dubbed PEC-RUNTIME, that exploits these restrictions to provide a fast reasoning system that overcomes the limitations of other implementations of EPEC in terms of computation time.PEC-RUNTIME is publicly available2 .PEC-RUNTIME is based on PEC-ASP, an Answer Set Programming implementation of the non-epistemic fragment of EPEC that was presented in (D'Asaro et al. 2017) and based on ASP grounder and solver clingo (Gebser et al. 2014).PEC-ASP and PEC-RUNTIME share the same syntax and domain-independent part of the implementation.In addition, PEC-RUNTIME exploits clingo's integration with Python to control the grounding and solving process.
One of the main problems of PEC-ASP is that it is an exact inference mechanism, which needs to enumerate all the well-behaved worlds that satisfy a given query.The number of worlds typically grows exponentially with the size of the narrative.As an illustration, consider the simple domain consisting of one fluent F, one action A, set of instants {0, 1, . . ., N − 1} for some N > 0, the i-proposition initially-one-of {(F, 1)} ( 28) and N o-propositions A performed-at I with-prob 0.5 (29) one for each I in the set of instants.This simple domain description has well-behaved worlds of the following form: where ±A is either A or its negation ¬A.Clearly, there are 2 N such worlds.Given that F is always true in all of them, PEC-ASP needs to enumerate all 2 N well-behaved worlds even to answer a simple query such as [F]@N − 1.A strategy to tackle this problem is to sample a number M 2 N of well-behaved worlds and approximate the probability of the query.This is the approach adopted by PEC-ANGLICAN, a probabilistic programming implementation of PEC3 .However, this technique also suffers from scalability and precision issues when dealing with large narratives as we describe in Section 5.1, Experiment (a), and is mostly suited for offline tasks.
The approach we adopt here relies on a form of progression (Lin and Reiter 1997).Given a domain description D and an instant I, knowledge about what happened before instant I can be appropriately recompiled and mapped to a new domain description D ≥I that does not include any narrative knowledge about instants < I but agrees with the original domain D on all queries about instants ≥ I.This requires exhaustively querying D about the state of the environment at I, i.e., if F 1 , . . ., F n are all the (boolean) fluents in the language, one must perform queries where ±F i is either F i or its negation ¬F i .These queries must be then recompiled into an appropriate i-proposition.
We elaborate this procedure using our Scenario 2.1.Consider the domain description D and instant 1.Since this domain has only one fluent, exhaustively querying D at 1 means finding the probabilities P 1 and P 2 such that D ||= [Attention]@1 holds-with-prob P 1 (30) These values are P 1 = 0.325 and P 2 = 1 − P 1 = 0.675 (as it can be calculated e.g. by using PEC-ASP) and can be recompiled into the following i-proposition: initially-one-of {({Attention}, 0.325), ({¬Attention}, 0.675)} (32) The domain description D ≥1 consists of propositions (32), (1), ( 2), (3), ( 4) and ( 7).Since the idea is that of discarding all knowledge about instants before 1, D ≥1 has corresponding domain language with set of instants {1, 2} (i.e., instant 0 was removed from the language).Domain descriptions D and D ≥I agree on all queries with instants ≥ I: for example, it is possible to show that M D ≥1 ([Attention]@2) = 0.158275, and we have already calculated it is also the case that M D ([Attention]@2) = 0.158275.Since domain description D ≥1 has fewer o-propositions than D, it also has fewer well-behaved worlds (namely 5).In turn this intuitively means that, if a query only contains instants ≥ 1, it is computationally more convenient to query D ≥1 rather than

D.
Repeating the process on D and instant 2 produces the i-proposition initially-one-of {({Attention}, 0.158275), ({¬Attention}, 0.841725)} ( 33) and the domain description D ≥2 consisting of propositions ( 33), ( 1), ( 2), ( 3) and ( 4).This new domain description has only 2 well-behaved worlds.Clearly, M D ≥2 ([Attention]@2) = 0.158275 as this is encoded directly in the i-proposition, and this also equals M D ([Attention]@2) and M D ≥1 ([Attention]@2) as we have already calculated.Note that the domain description D ≥2 trivializes the task of deciding whether Proposition (4) fires or not.In fact, it suffices to check whether its epistemic precondition is satisfied or not by the i-proposition in D ≥I .In this case, the epistemic precondition requires Attention to be in the interval [0, 0.1], which is not the case as it can be immediately seen by looking at the i-proposition (33) and considering that 0.158275 / ∈ [0, 0.1].As we show below, we can turn this intuition into an efficient procedure for updating a domain description as new events are received and reason about this smaller domain, instead of augmenting it with new propositions and reason about the full augmented domain.The pseudo code of our PEC-RUNTIME procedure is as follows:

See Appendix B 4 Sensors and Classifiers
In this section, we briefly overview the layers of sensors and classifiers we currently use to detect user activity that is relevant to the rehabilitation task.
At the present stage, our main source of information is a 2D-webcam placed in front of the children performing the rehabilitation task.The camera stream is mainly responsible for the evaluation of the current pose of head and shoulders of the child, and to estimate his/her emotional response to the interaction.Its output is processed by three modules: Head Pose Recognition Module: We use the library Head Pose Estimation4 to detect the facial landmarks (see Figure 2).A box containing the face and its surrounding area is selected, resized, and normalized in order to use it as input for the facial-landmarks detector, which is based on a CNN architecture.The facial-landmarks detection step is carried out by a custom Shoulders Alignment Module: We use PoseNet (Papandreou et al. 2018) to extract the skeleton joints coordinates from the camera frames.Then we compare these coordinates to the horizontal plane.We use this estimation to determine whether the child is assuming a correct posture while performing the rehabilitation.
Emotion Recognition Module: This module estimates the emotional response of the child, and outputs a valence (Rescigno et al. 2020).This value represents how good -or how bad -the emotion associated with the event or the situation was, and ranges from −1 to +1.To this end we use the AlexNet model (Krizhevsky et al. 2012) trained on the Affectnet Dataset (Mollahosseini et al. 2017).
In future extensions of this work we intend to combine data gathered with additional sensory sources.Specifically, we will employ pillows equipped with pressure sensors, mounted on the seat-back of the chair, to complement the evaluation of the Shoulders Alignment Module and better evaluate whether the child is sitting correctly.We also plan on complementing the Head Pose and Emotion Recognition Modules using an Eye Tracker and EEG respectively.

Examples and discussion
In this section we demonstrate our approach by means of examples and experimental results.Note that Experiment (a) was run on a Mid-2010 Apple MacBook Core 2 Duo 2.4 GHz with 8 GB of RAM, while Experiments (b), (c), (d), (e) and (f) were run on an Apple Macbook Pro 2020 M1 with 16 GB of RAM.  1.Time (in seconds) to execute the query [¬F]@I in the example discussed in Section 5.1.The numbers in bracket in the case of PEC-ANGLICAN refer to the number of sampled wellbehaved worlds used to approximate the result of the query.For every implementation, reported times include grounding and processing of the domain description.

Scalability
Experiment (a).To demonstrate that PEC-RUNTIME scales favourably compared with PEC-ASP and PEC-ANGLICAN we tested the three implementations on a simple domain description in which an action repeatedly occurs and causes the probability of a fluent to slowly decay: where the set of instants is {0, 1, . . ., 15}.The probability of ¬F as calculated by the three frameworks is shown in Figure 3. Table 1 summarizes the performances of PEC-RUNTIME, PEC-ASP and PEC-ANGLICAN.
As it can be seen from these results, PEC-RUNTIME outperforms both PEC-ASP and PEC-ANGLICAN on all queries.In terms of performance, PEC-ANGLICAN is close to PEC-RUNTIME when only 100 well-behaved worlds are sampled.However, sampling 100 worlds causes precision issues as it is clear from Figure 3.

Experiment (b).
In this experiment we consider the effect of varying the number of actions in the domain description.To isolate the effect of actions from that of other side-effects we consider where I = {1, . . ., 15} and we consider different values of n ranging from 0 to 15.We evaluate the computation time for query [F]@I at all instants.Results are plotted in Figure 4 and confirm the intuition that, at an instant I, computation time scales exponentially with the number of action occurrences at that instant.

Experiment (c).
To empirically evaluate the effect of the number of fluents in the domain, we performed a similar experiment with domain descriptions of the following forms: Experiment (e).In this experiment we empirically evaluate the effect of initial conditions, and show that computation time scales linearly with their number.To this aim, we consider the following domain description: where I = {1, . . ., 15} and n ranges from 1 to 2 10 , i.e., the maximum number of possible initial conditions with 10 fluents.Note that n is also the number of well-behaved worlds considered by PEC-RUNTIME at each progression step.Results are shown in Figure 6 and confirm the intuitive hypothesis that computational time scales linearly with the number of initial conditions/wellbehaved worlds.
Experiment (f).This experiment aims to study how PEC-RUNTIME behaves in a more realistic domain where every sensor has a fixed probability p of producing a reading at every instant.We consider the following domain descriptions:

Effect of Noise
One of the characteristics of EPEC (and its dialects) that makes it highly suitable as an interface with machine learning algorithms is that it supports events annotated with probabilities.If a classifier provides a confidence score for its output, it is possible to feed such score into EPEC in the form of a probability.This may be useful to account for possible artifacts and flickering of a given classifier.For instance, consider a classifier C that alternatively produces readings true and false for some characteristic F. This may be represented in EPEC as the following stream of events: Ctrue occurs-at 0 with-prob P true Cfalse occurs-at 1 with-prob P f alse Ctrue occurs-at 2 with-prob P true Cfalse occurs-at 3 with-prob P f alse . . .
Ctrue occurs-at 18 with-prob P true Cfalse occurs-at 19 with-prob P f alse where Ctrue (resp.Cfalse) is an event that causes the value of fluent F to be (resp.⊥), i.e.: Ctrue causes-one-of {({F}, 1)} Cfalse causes-one-of {({¬F}, 1)} and the truth value of f is initially unknown, i.e.: initially-one-of {({F}, 0.5), ({¬F}, 0.5)} We analyse the behavior of EPEC in different scenarios (corresponding to different values for P true and P f alse ): Scenario (a).The classifier C produces true with high confidence and false with low confidence, e.g.P true = 0.99 and P f alse = 0.01.This domain description entails "[F]@20 holds-with-prob P" where P ≈ 0.9899, which makes intuitive sense as the false events have low confidence and are discarded as background noise.
Scenario (b).The classifier C produces true and false with a high degree of uncertainty.However, it assigns true a slightly higher degree of confidence, e.g.P true = 0.501 and P f alse = 0.499.This domain description entails "[F]@20 holds-with-prob P" where P ≈ 0.653246, which reflects that such a sequence of events does not carry as much information as in Scenario (a), due to the the small difference between P true and P f alse .
Scenario (c).The classifier C produces both true and false with low confidence, with P true P f alse .For example, let P true = 0.49 and P f alse = 0.01.This domain description entails "[F]@20 holds-with-prob P" where P ≈ 0.979286, which again makes intuitive sense as false readings are again discarded as background noise, with the repeated true readings making the probability of F steadily grow throughout the timeline.
These scenarios show that EPEC is sensibly more accurate than any logical framework that is not capable of handling probabilities in the presence of noisy sensors.In fact, these frameworks would have to set a threshold, and only accept events with probabilities higher than that threshold.In our example, setting a threshold e.g. of 0.5 would lead to significant errors especially in Scenarios (b) and (c).In Scenario (b), all Cfalse events would be discarded, implying that F is detected to hold true at instant 20 with certainty (compare this to the probability 0.653246 assigned to the same query by EPEC).In Scenario (c), all events would be discarded and it would not be possible neither that F holds at 20 nor that it does not (compare again to the probability 0.979286 assigned by EPEC).

An example from AVATEA
As discussed in Section 4, a layer of classifiers directly injects events into PEC-RUNTIME, which merges them together and decides what action to take according to a predefined conditional plan.We show how PEC-RUNTIME accounts for these classifiers in the following example.
We use the Head Pose Recognition module to decide whether the child is looking at the screen or not, and therefore this module can trigger an EyesNotFollowingTarget event (and possibly an associated probability).Similarly, the Shoulders Alignment Module triggers a BadPosture event if it detects that the child's shoulders are not correctly aligned.Finally, the Emotion Recognition Module triggers a LowValence event when it thinks the child is experiencing negative valence.On the basis of these events, we aim to evaluate the probabilities of TaskCorrect, i.e., that the child is performing the therapeutic task correctly, and Engagement, i.e., that the child and engaged to the task.Consider the following narrative of events: EyesNotFollowingTarget occurs-at 0 with-prob 1 which states that a sound must be played by the system whenever the Attention of the child is low, and that the difficulty level must be lowered if the child does not manage to perform it correctly.PEC-RUNTIME works out the probabilities in Figure 8, and triggers decisions when they fall below the given threshold.In particular, PlaySound is triggered at instant 4, and LowerDifficultyLevel is triggered at instants 2, 3 and 4. It is worth noting here that thresholds may induce the so-called rubberband effect, where abrupt probability changes (due e.g. to noise) may lead to decisions being taken too frequently.Unfortunately, EPEC (and therefore also PEC-RUNTIME) does not allow to control this effect, therefore we implemented a bespoke semaphore in the underlying Python script that only allows decision to be taken every 20 seconds.
These decisions are forwarded to the game.At the end of each therapeutic session, this narrative is saved to a file so that it can be used to provide feedback to the therapists and re-consulted if needed.In this scenario, PEC-RUNTIME provides feedback such as "The child performed the task correctly for a total of 10 instants out of 13 (76.9%).The game difficulty was lowered at instants 2, 3 and 4 as s/he was unable to perform the exercise.His attention was lowest at instant 4.".A graph similar to that in Figure 8 is also shown to the therapists, so that they can have a quick overview of the session.

Related work
To our knowledge, our architecture is one of the first that merges gamification strategies for rehabilitation and probabilistic logic programming techniques.In the following, we briefly survey these two fields, and motivate our design choices.

Gamification
Gamification strategies consist in using game-like elements (e.g., points, rewards, performance graphs, etc.) in serious contexts such as ours.These have proven to be extremely successful to engage young children in diagnostic and therapeutic exercises, even before the advent of digital gaming.While games to test cognitive capabilities (Belpaeme et al. 2012) do not form a sharply defined class, games designed to test and improve motor skills are usually referred to as exergames.The effects of exergames have been found to be generally positive (Vernadakis et al. 2015).Gamification systems are usually effective in engaging young users in playful activities, while adapting the current challenge according to level of user competence.Sessions are typically logged in order to provide detailed feedback to therapists.On the cognitive side, these adaptive systems have been designed to evaluate subjective well-being (Wu et al. 2019) and phonological acquisition (Origlia et al. 2017) among others.Adaptive exergames have been used e.g. in the context of children with spinal impairments (Mulcahey et al. 2008), and to test gross motor skills (Huang et al. 2018).
(Deep) Machine Learning techniques are not advantageous in the case of exergames if they are used alone, as they must be trained on big amounts of data even when expert knowledge can be (relatively) easily extracted from experts and encoded in symbolic form (e.g., using logic programming), with the further advantage that symbolic knowledge can be used to automatically produce explanations and reports.As a centralized reasoning system, we use probabilistic logic programming techniques instead.These systems are a convenient option as they allow both for the representation of formal constraints needed to implement a clinically effective exercise, and for the statistical modeling of intrinsically noisy data sources.

Probabilistic Reasoning about Actions
Recently, logic-based techniques have been successfully applied to several fields of Artificial Intelligence, including among others event recognition from security cameras (Skarlatidis et al. 2015;Skarlatidis et al. 2015), robot location estimation (Belle and Levesque 2018), understanding of tenses (Van Lambalgen and Hamm 2008), natural language processing (Nadkarni et al. 2011), probabilistic diagnosis (Lee and Wang 2018) and intention recogntion (Acciaro et al. 2021).Given the importance of Machine Learning and Probability Theory in AI, these frameworks and languages have gradually started employing probabilistic semantics (Sato 1995) to incorporate and deal with uncertainty.This has given birth to the field of Probabilistic Logic Programming (Riguzzi 2018), and we are particularly concerned with Probabilistic Reasoning about Action frameworks as they can deal with agents interacting with some (partially known) environment.For example (Bacchus et al. 1999;Belle and Levesque 2018), that are based on the Situation Calculus ontology (Reiter 2001), can model imperfect sensors and effectors.The Situation Calculus' branching structure makes these frameworks mostly suitable for planning under partial states of information.A recent extension of language C + ( Giunchiglia et al. 2004), dubbed pBC +, is presented in (Lee and Wang 2018).The authors show how pBC + can be implemented in LP MLN , a probabilistic extension of ASP that can be readily implemented using tools such as LPMLN2ASP (Lee et al. 2017).Unlike our work, which focuses on runtime reasoning, pBC + is mostly suited for probabilistic diagnosis and parameter learning.The two languages MLN-EC (Skarlatidis et al. 2015) and ProbEC (Skarlatidis et al. 2015) extend the semantics of the Event Calculus (Kowalski and Sergot 1986;Miller and Shanahan 2002) ontology, using Markov Logic Networks (Richardson and Domingos 2006) and ProbLog (De Raedt et al. 2007) to perform event recognition from security cameras.In their proposed case study, the logical part of the architecture receives time-stamped events as inputs and processes them in order to detect complex long-term activities (e.g., infer that two people are fighting from the fact that they have been close to each other and moving abruptly during the last few seconds).Given their semi-probabilistic nature, these frameworks are able to handle uncertainty in the input events (ProbEC) or in the causal rules linking events and fluents (MLN-EC) but they do not deal with any form of epistemic knowledge.On the other hand, EPEC (D'Asaro et al. 2020) has the advantage of being based on the Event Calculus (making it particularly well suited for Event Recognition tasks) and being able to deal with epistemic aspects.Its semantics abstracts from any specific programming language and therefore it lends itself to task-specific optimizations, such as the runtime adaptation presented in this work.

Conclusion and Future Work
To summarize, the contribution of this paper is two-fold: • It presents an architecture, currently being actively developed and tested, for the rehabilitation of children with Development Coordination Disorders.The suggested approach mixes machine learning, logic-based and gamification techniques.
• It introduces an novel implementation of a state-of-the-art probabilistic logic programming framework (EPEC) that can work at runtime.
The logical framework plays the role of an interface between the exergame and the machinelearning techniques, and is mainly used for two reasons: (i) guide the exergame according to a predefined strategy that is hard-encoded in the domain description, and (ii) provide humanunderstandable feedback after each therapeutic session, that can be used to log user performance over time.
Nonetheless, there is much room for extending the scope and functionalities of our application.As we collect data about the users of our system, we can compare the decisions taken by the system to those that therapists and psychologists would take, and adapt our domain descriptions accordingly.In the future, we would like to do this semi-automatically by learning the probabilities in our domain descriptions from data-streams annotated by experts.
It is also possible to expand the feedback functionalities of our system.At the moment, feedback includes a series of graphs about the level of engagement of the patient and limited textual reports about his/her performance level (see Section 5.3).However, a full history of what happened throughout each therapeutic session is recorded in an EPEC-readable narrative.Implementing a device to automatically translate these narratives into natural language would greatly enhance the transparency of our system, and constitutes a future step of this work.In fact, "explanations" coming from our system are limited to plots such as the one depicted in Figure 8, that shows what the engagement level of the child is throughout the therapeutic session, and reasons behind decisions are explained in terms of trespassed thresholds.Transforming them into natural language would greatly enhance intelligibility of our system.

Fig. 1 .
Fig.1.The architecture of the AVATEA system.Several classifiers are applied to a stream of data from different sensors.Note that a single sensor may produce data that is then fed into two or more classifiers (e.g. in the case of the webcam).Timestamped output from the classifiers is fed into PEC-RUNTIME, the logical core of the architecture.PEC-RUNTIME processes this information together with some domain independent axioms and outputs its decision to the environment (the game, in this case).

Fig. 2 .
Fig. 2. Facial landmarks detected by the Head Pose Recognition module.
Fig. Probability of F as a function of time for the example discussed in Section 5.1, Experiment (a).

F
n takes-values { , ⊥} initially-one-of {({F 1 , . . ., F n }, 1)} where I = {1, . . ., 15} and n ranges from 1 to 15. Results are shown in Figure 4, and confirm the intuition that computational time scales exponentially as a function of the number of fluents in the domain.Experiment (d).This experiment combines (b) and (c) in that it shows how computation time scales with both fluents and actions.The considered domain descriptions are: F 1 takes-values { , ⊥} . . .F n takes-values { , ⊥} initially-one-of {({F 1 , . . ., F n }, 1)} ∀I, A 1 performed-at I with-prob 0.5 . . .∀I, A m performed-at I with-prob 0.5 where I = {1, . . ., 15}, the number of fluents varies from 1 to 15 and the number of actions varies from 0 to 15. Result are shown in Figure 5 and extend those of Figure 4.

Fig. 5 .
Fig. 5. Contour plot showing time (in seconds) to query the domain in Section 5.1, Experiment (d), expressed as a function of the number of actions and fluents.

Fig. 7 .
Fig. 7. Time (in seconds) to query the example discussed in Section 5.1, Experiment (f).The results show averages over 30 runs.

Fig. 8 .
Fig. 8. Engagement and TaskCorrect as a function of time in the example from Section 5.3