Behavioural equivalences for continuous-time Markov processes

Bisimulation is a concept that captures behavioural equivalence of states in a variety of types of transition systems. It has been widely studied in a discrete-time setting. The core of this work is to generalize the discrete-time picture to continuous time by providing a notion of behavioural equivalence for continuous-time Markov processes. In [6], we proposed two equivalent definitions of bisimulation for continuous-time stochastic processes where the evolution is a flow through time: the first one as an equivalence relation, the second one as a cospan of morphisms. In [7], we developed the theory further: we introduced different concepts that correspond to different behavioural equivalences and compared them to bisimulation. In particular, we studied the relation between bisimulation and symmetry groups of the dynamics. We also provided a game interpretation for two of the behavioural equivalences. The present work unifies the cited conference presentations and gives detailed proofs.


Introduction
Bisimulation [19,23,26] is a fundamental concept in the theory of transition systems capturing a strong notion of behavioural equivalence. The extension to probabilistic systems is due to Larsen and Skou [18]; henceforth we will simply say "bisimulation" instead of "probabilistic bisimulation". Bisimulation has been studied for discrete-time systems where transitions happen as steps, both on discrete [18] and continuous state spaces [3,11,22]. In all these types of systems a crucial ingredient of the definition of bisimulation is the ability to talk about the next step. Thus, the general format of the definition of bisimulation is that one has some property that must hold "now" in the states being compared (this is an initiation condition), and then one says that the relation is preserved in the next step (this is an induction condition).
There is a vast range of systems that involve continuous-time evolution: deterministic systems governed by differential equations and stochastic systems governed by "noisy" differential equations called stochastic differential equations. These have been extensively studied for over a century since the pioneering work of Einstein [15] on Brownian motion. This work proposes a notion of bisimulation for stochastic systems with true continuous-time evolution, building on previous work on discrete-time systems. Some work had previously been done on what are called continuous-time systems [12], but even in what are called continuous-time Markov chains there is a discrete notion of time step; it is only the real-valued duration associated with each state that makes such systems continuous-time. They are often called "jump processes" in the mathematical literature (see, for example, [24,28]), a phrase that better captures the true nature of such processes.
We focused on a class of systems called Feller-Dynkin processes for which a good mathematical theory exists. These systems are Markov processes defined on continuous state spaces and with continuous-time evolution. Such systems encompass Brownian motion and its many variants.
A first thought might be to take the well-known theory of discrete-time processes, such as random walks, and to develop a theory of continuous time by taking some kind of limit of step sizes going to zero. However, it is not enough to take limits of what happens in discrete time. Strange phenomena occur when shifting from discrete steps to continuous time. It is necessary to talk about trajectories instead.
We explored several notions of behavioural equivalence for such continuous-time processes. The strongest notion is one that captures the symmetries of the system: a group of symmetries is a set of bijections that leave the dynamics of the system unchanged. Bisimulation as a discrete-time notion has two conditions: an initiation one and a (co)induction one. Depending on whether we extend the initiation or the induction condition using trajectories, we get two notions of behavioural equivalence in continuous time: temporal equivalence and bisimulation respectively. Temporal equivalence can be summed up as trace equivalence with some additional step-like constraints. A bisimulation is a temporal equivalence, but it is still an open question whether these notions are equivalent. We have shown that trace equivalence is a strictly weaker notion than temporal equivalence (and hence than the other notions).
It is possible to view discrete-time systems as continuous-time systems by artificially adding a timer and having transitions occur every unit of time. This has allowed us to compare our notions to the previous definition of bisimulation for discrete-time systems. This shows that temporal equivalence indeed generalizes bisimulation to continuous-time systems.
Finally, we give two game interpretations, one for bisimulation and one for temporal equivalence. They closely mirror the one provided in [16]. The game for bisimulation also emphasizes the importance of trajectories for the study of behavioural equivalences in continuous time.
The relations between these different behavioural equivalences can be summarized as follows: the equivalence generated by a group of symmetries is a bisimulation, a bisimulation is a temporal equivalence, and temporally equivalent states are trace equivalent.
This paper is organized as follows. In section 2, we quickly go over mathematical prerequisites and bisimulation in discrete time. In section 3, we go over the definition of Feller-Dynkin processes and an example of such a process: Brownian motion. We also provide some tools that will be useful later on when studying examples: hitting times. In section 4, we define the various behavioural equivalences that we have come up with in this work and we explain the relations between them. We then compare them to discrete-time bisimulation. Those continuous-time notions of behavioural equivalence are illustrated in section 5 with various examples. In section 6, we go over a more categorical approach to behavioural equivalences. Finally, we provide a game interpretation for bisimulation and temporal equivalence in section 7.

Mathematical background
We assume the reader to be familiar with the basic notions of topology (topology, continuous function, compact, locally compact, compactification, σ-compact, Hausdorff), Polish spaces, Banach spaces, and measure theory (σ-algebra, measurable functions, measure, probability space). For a detailed introduction, we refer the reader to [14,22,2].

Discrete-time systems
Markov processes are stochastic processes where what "happens next" depends only on the current state and not the full history of how that state was reached. They are used in many fields as statistical models of real-world problems and have therefore been extensively studied.
In some cases, the environment or a user may influence how things will evolve, for instance by pressing a key on a keyboard. This is modeled by using a countable set $Act$ of actions that determine at each step which stochastic process is used.
Definition 1. A labelled Markov process (LMP) is a tuple $(S, \Sigma, (\tau_a)_{a \in Act})$, where $(S, \Sigma)$ is a measurable space called the state space and for all $a \in Act$, $\tau_a : S \times \Sigma \to [0, 1]$ is a Markov kernel, i.e.
• for all states $x$ in $S$, $\tau_a(x, \cdot)$ is a subprobability measure, and
• for all $C$ in $\Sigma$, $\tau_a(\cdot, C)$ is $\Sigma$-measurable.
We will also assume that an LMP is equipped with a measurable function $\mathrm{obs} : S \to 2^{AP}$, where $AP$ denotes a set of atomic propositions and $2^{AP}$ is equipped with the discrete σ-algebra. The function $\mathrm{obs}$ is useful to isolate areas of the state space. For instance, if the state space is a set of temperatures, there could be atomic propositions "freezing" and "heat wave" isolating respectively the subsets of the state space $(-273, 0)$ and $(30, +\infty)$.
Definition 2. An equivalence $R$ on $S$ is a bisimulation if for all states $x, y \in S$ such that $x \mathrel{R} y$, the following two conditions are satisfied:
• $\mathrm{obs}(x) = \mathrm{obs}(y)$,
• for every measurable and $R$-closed set $C$ (meaning that for all $z \mathrel{R} z'$, $z \in C$ if and only if $z' \in C$) and every action $a$, $\tau_a(x, C) = \tau_a(y, C)$.
Two states are bisimilar if there exists a bisimulation that relates them.
There is a greatest bisimulation, which corresponds to the equivalence "are bisimilar". This greatest bisimulation is called "bisimilarity".
Example 3. The random walk is a standard example. The most basic version takes place on the set of integers $\mathbb{Z}$; the intuition is that at each step, the process goes either to the left or to the right in an unbiased manner. More formally, the corresponding Markov kernel is the following for all integers $n, m$: $\tau(n, \{m\}) = 1/2$ if $|n - m| = 1$ and $\tau(n, \{m\}) = 0$ otherwise. The state space $\mathbb{Z}$ can be equipped with the function $\mathrm{obs}(x) = 0$ if $x = 0$ and $\mathrm{obs}(x) = 1$ if $x \neq 0$.
In this case, two states $n$ and $m$ are bisimilar if and only if $|n| = |m|$.
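The claim of Example 3 can be checked mechanically on a finite window of the integers. The following is a minimal sketch, not the authors' construction: the helper names `tau`, `obs` and `r_closed` are hypothetical, the kernel is assumed to put mass 1/2 on each neighbour, and only finitely many R-closed sets (sets closed under the relation "same absolute value") are tested.

```python
from fractions import Fraction

def tau(n, C):
    """Assumed kernel of the unbiased random walk on Z: mass 1/2 on each neighbour of n."""
    return sum(Fraction(1, 2) for m in (n - 1, n + 1) if m in C)

def obs(x):
    """Observation function of Example 3: distinguishes 0 from every other state."""
    return 0 if x == 0 else 1

def r_closed(bound, magnitudes):
    """A set closed under the relation |n| = |m|, truncated to [-bound, bound]."""
    return {k for k in range(-bound, bound + 1) if abs(k) in magnitudes}

# A few R-closed test sets (hypothetical choices, for illustration only).
test_sets = [r_closed(20, mags) for mags in ({0}, {1, 2}, {3, 5, 8}, set(range(0, 21, 2)))]

# Both bisimulation conditions for the pairs (n, -n).
checks = [
    obs(n) == obs(-n) and all(tau(n, C) == tau(-n, C) for C in test_sets)
    for n in range(0, 10)
]
print(all(checks))

# States with different absolute values are separated by the R-closed set {0}.
print(tau(1, {0}), tau(2, {0}))
```

Exact rational arithmetic is used so that the equalities are tested without rounding. The check is only a sanity test on a window of states, not a proof of bisimilarity.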
In [11], a more categorical approach to bisimulation is also provided. It is quite easy to see that for a state $s$ and a zigzag $f$, the states $s$ and $f(s)$ are bisimilar. In [11], the authors compared bisimulation to spans of zigzags and showed that the two notions coincide if the state space is analytic.
It was shown that the notions induced respectively by spans of zigzags and by cospans of zigzags were equivalent on analytic Borel spaces [13] and on separable universally measurable spaces (isomorphic to a universally measurable subset of a separable completely metrizable space) [21].

Continuous-Time Systems
Before looking into similarities of behaviours of continuous-time Markov processes, one needs to have a definition of what a continuous-time Markov process is. We decided that the best trade-off for this study was to restrict to Feller-Dynkin processes. Such processes have some regularity on their trajectories which gives us tools to handle them. Moreover, this family of processes is general enough to encompass most systems that might be of interest.

Definition of Feller-Dynkin Processes
We will quickly review the theory of continuous-time processes on continuous state spaces; much of this material is adapted from [24] and we use their notation. Another useful source is [4].
Definition 5. A filtration on a measurable space $(X, \Sigma)$ is a time-indexed, nondecreasing family $(\mathcal{F}_t)_{t \geq 0}$ of sub-σ-algebras of $\Sigma$, i.e. $\mathcal{F}_s \subseteq \mathcal{F}_t \subseteq \Sigma$ for all $0 \leq s \leq t$. This concept is used to capture the idea that at time $t$ what is "known" or "observed" about the process is encoded in the sub-σ-algebra $\mathcal{F}_t$.
Definition 6. A stochastic process is a collection of random variables $(Y_t)_{0 \leq t < \infty}$ on a measurable space $(\Omega, \Sigma_\Omega)$ that take values in a second measurable space $(X, \Sigma_X)$ called the state space. More explicitly, we have a time-indexed family of random variables $Y_t : (\Omega, \Sigma_\Omega) \to (X, \Sigma_X)$.
Define the filtration $(\mathcal{G}_t)_{t \geq 0}$ as follows: for every $t \geq 0$, the σ-algebra $\mathcal{G}_t$ is the one generated by all the random variables $\{Y_s \mid s \leq t\}$, i.e. $\mathcal{G}_t = \sigma(\{Y_s \mid s \leq t\})$. This filtration is also referred to as the natural filtration associated to the stochastic process $(Y_t)_{t \geq 0}$. A stochastic process is always adapted to its natural filtration.
Let $E$ be a locally compact, Hausdorff space with countable base. These standard topological hypotheses allow us to perform the one-point compactification of the set $E$ by adding the absorbing state $\partial$. We write $E_\partial$ for the corresponding set: $E_\partial = E \cup \{\partial\}$. We also equip the set $E$ with its Borel σ-algebra $\mathcal{B}(E)$, which we will denote $\mathcal{E}$. The previous topological hypotheses also imply that $E$ is σ-compact and Polish.
Definition 7. A semigroup of operators on any Banach space $X$ is a family of linear continuous (bounded) operators $P_t : X \to X$ indexed by $t \in \mathbb{R}_{\geq 0}$ such that $\forall s, t \geq 0$, $P_s \circ P_t = P_{s+t}$ and $P_0 = I$ (the identity).
The first equation above is called the semigroup property. The operators in a semigroup are continuous operators. Moreover, there is a useful continuity property with respect to time $t$ of the semigroup as a whole.
Definition 8. For $X$ a Banach space, we say that a semigroup $P_t : X \to X$ is strongly continuous if for all $x \in X$, $\lim_{t \downarrow 0} \|P_t x - x\| = 0$.
Example 9. Let us give a simple example on the simple Banach space $\mathbb{R} \to \mathbb{R}$. For instance, $P_t f = f + t$ is a strongly continuous semigroup modelling some deterministic drift at a constant rate. A counterexample would be $P_t f = 5t^2 + f$ (modelling an example of a free fall): this family of operators does not satisfy $P_s \circ P_t = P_{s+t}$.
What the semigroup property expresses is that we do not need to understand the past (what happens before time $t$) in order to compute the future (what happens after some additional time $s$, so at time $t + s$) as long as we know the present (at time $t$). However, it is simple to have such a semigroup describe a free fall as long as we are working on $\mathbb{R}^2 \to \mathbb{R}$, as physics teaches us. The first coordinate represents the position and the second the velocity. In this case the semigroup property is satisfied.
We say that a continuous real-valued function $f$ on $E$ "vanishes at infinity" if for every $\varepsilon > 0$ there is a compact subset $K \subseteq E$ such that for every $x \in E \setminus K$, $|f(x)| \leq \varepsilon$. To give an intuition, if $E$ is the real line, this means that $\lim_{x \to \pm\infty} f(x) = 0$. The space $C_0(E)$ of continuous real-valued functions that vanish at infinity is a Banach space with the sup norm.
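The free-fall discussion can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than the paper's own formula: it assumes a constant acceleration of 10 (so that the displacement from rest is $5t^2$), and it works with the underlying flow on states instead of with operators on functions; the names `flow` and `position_only` are hypothetical. An operator semigroup would then be obtained by composing functions with this flow.

```python
import math

G = 10.0  # assumed constant acceleration, giving a displacement of 5 t^2 from rest

def flow(t, state):
    """Free-fall flow on (position, velocity) pairs after time t."""
    x, v = state
    return (x + v * t + 0.5 * G * t * t, v + G * t)

def position_only(t, x):
    """The counterexample: tracking the position alone, x -> x + 5 t^2."""
    return x + 0.5 * G * t * t

s, t = 0.7, 1.9
start = (2.0, -3.0)

# With the velocity recorded in the state, two successive steps equal one long step.
two_steps = flow(s, flow(t, start))
one_step = flow(s + t, start)
semigroup_holds = all(math.isclose(a, b) for a, b in zip(two_steps, one_step))

# Without the velocity, the present position does not determine the future.
naive_holds = math.isclose(
    position_only(s, position_only(t, 2.0)),
    position_only(s + t, 2.0),
)

print(semigroup_holds, naive_holds)
```

The design point is the one made in the text: the semigroup property fails when the state omits information (here the velocity) that the future depends on, and is recovered by enlarging the state space.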
Definition 10. A Feller-Dynkin (FD) semigroup is a strongly continuous semigroup $(\hat{P}_t)_{t \geq 0}$ of linear operators on $C_0(E)$ satisfying the additional condition: for all $t \geq 0$ and all $f \in C_0(E)$, if $0 \leq f \leq 1$, then $0 \leq \hat{P}_t f \leq 1$.
Under these conditions, strong continuity is equivalent to the apparently weaker condition (see Lemma III.6.7 in [24] for the proof): for all $f \in C_0(E)$ and all $x \in E$, $\lim_{t \downarrow 0} (\hat{P}_t f)(x) = f(x)$.
The following important proposition relates these FD-semigroups with Markov kernels, which allows one to see the connection with more familiar probabilistic transition systems. We provide a detailed proof of the following proposition in Appendix A.
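Proposition 11. Given an FD-semigroup $(\hat{P}_t)_{t \geq 0}$ on $C_0(E)$, it is possible to define a unique family of sub-Markov kernels $(P_t)_{t \geq 0} : E \times \mathcal{E} \to [0, 1]$ such that for all $t \geq 0$, $f \in C_0(E)$ and $x \in E$, $\hat{P}_t f(x) = \int f(y) \, P_t(x, dy)$.
A very important ingredient in the theory is the space of trajectories of an FD-process (FD-semigroup) as a probability space. This space does not appear explicitly in the study of labelled Markov processes but one does see it in the study of continuous-time Markov chains and jump processes.
where
• given any probability measure $\mu$ on $E_\partial$, by the Daniell-Kolmogorov theorem, there exists a unique probability measure $\mathbb{P}^\mu$ on $(\Omega, \mathcal{G})$ such that for all $n \in \mathbb{N}$, $0 \leq t_1 \leq t_2 \leq \dots \leq t_n$ and $x_0, \dots, x_n \in E_\partial$, $\mathbb{P}^\mu(X_0 \in dx_0, X_{t_1} \in dx_1, \dots, X_{t_n} \in dx_n) = \mu(dx_0) \, P^{+\partial}_{t_1}(x_0, dx_1) \cdots P^{+\partial}_{t_n - t_{n-1}}(x_{n-1}, dx_n)$, where $P^{+\partial}_t$ is the Markov kernel obtained by extending $P_t$ to $E_\partial$ by $P^{+\partial}_t(x, \{\partial\}) = 1 - P_t(x, E)$ and $P^{+\partial}_t(\partial, \{\partial\}) = 1$. We set $\mathbb{P}^x = \mathbb{P}^{\delta_x}$.
The distribution $\mathbb{P}^x$ is a measure on the space of trajectories for a system started at the point $x$.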

Remark 14. Let us give an intuition for this definition based on discrete-time Markov chains.
There are several ways to describe a Markov chain. One is to define a Markov kernel on the state space. By nesting Markov kernels, we can then compute the probability of jumping from a certain state to a certain set in a certain number of steps. However, one may also directly define the step-indexed family of random variables describing where the process is at each step.
Here the Markov kernels correspond to the family of subkernels $(P_t)_{t \geq 0}$ and the step-indexed family of random variables corresponds to the time-indexed family $(X_t)_{t \geq 0}$.
a cadlag stands for the French "continu à droite, limite à gauche".
b The σ-algebra $\mathcal{G}$ is the same as the one induced by the Skorohod metric, see Theorem 16.6 of [1].
c The $dx_i$ in this equation should be understood as infinitesimal volumes. This notation is standard in probability theory and should be understood by integrating it over measurable state sets $C_i$.
We have constructed here one process which we called the canonical FD-process. Note that there could be other tuples satisfying the same conditions except that the space $\Omega$ would not be the space of trajectories.
In order to bring the FD-processes more in line with the kind of transition systems that have hitherto been studied in the computer science literature, we introduce a countable set of atomic propositions $AP$ and such an FD-process is equipped with a measurable function $\mathrm{obs} : E \to 2^{AP}$. This function is extended to a function $\mathrm{obs} : E_\partial \to 2^{AP} \cup \{\partial\}$ by setting $\mathrm{obs}(\partial) = \partial$.
We further require that the function $\mathrm{obs}$ satisfies the following hypothesis: each atomic proposition $A$ partitions the state space into two subsets, one where $A$ is satisfied ($\mathrm{obs}^{-1}(S_A)$ where $S_A = \{a : AP \to \{0, 1\} \mid a(A) = 1\}$) and another one where it is not satisfied ($\mathrm{obs}^{-1}(S_A^c)$ where $S_A^c = \{a : AP \to \{0, 1\} \mid a(A) = 0\}$). We assume that one of these subsets is open (and the other is closed). Note that it can be that for one atomic proposition $A_0$, the set where it is satisfied is open and for a different atomic proposition $A_1$, the set where it is satisfied is closed. This hypothesis will be useful in section 6.

A classic example of FD-process: Brownian motion
Brownian motion is a stochastic process first introduced to describe the irregular motion of a particle being buffeted by invisible molecules. Now its range of applicability extends far beyond its initial application [17]: it is used in finance, in physics, in biology or as a way to model noise, to mention only a few domains.
We refer the reader to [17] for the complete definition. A standard one-dimensional Brownian motion is a Markov process on the real line $\mathbb{R}$ that is invariant under reflection around any real value and under translation by any real value.
The following fundamental formula is useful in computations: if the process is at $x$ at time 0, then at time $t$ the probability that it is in the (measurable) set $D$ is given by $P_t(x, D) = \int_{y \in D} \frac{1}{\sqrt{2\pi t}} \exp\left(-\frac{(x - y)^2}{2t}\right) dy$.
The associated FD-semigroup is the following: for $f \in C_0(\mathbb{R})$ and $x \in \mathbb{R}$, $\hat{P}_t f(x) = \int_{y \in \mathbb{R}} \frac{f(y)}{\sqrt{2\pi t}} \exp\left(-\frac{(x - y)^2}{2t}\right) dy$.
Remark 15. A useful intuition that one can have about Brownian motion is that it is the limit of the random walk when the time between jumps and the distance between states are taken to 0.
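The Gaussian transition probability and the two invariances mentioned above (translation and reflection) can be illustrated for interval targets. This is a minimal sketch assuming the standard Gaussian kernel of variance $t$; the function names `std_normal_cdf` and `brownian_kernel` are hypothetical, and only sets $D$ of the form $[a, b]$ are handled.

```python
import math

def std_normal_cdf(z):
    """Cumulative distribution function of the standard normal law."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def brownian_kernel(t, x, a, b):
    """Probability that standard Brownian motion started at x lies in [a, b] at time t > 0."""
    scale = math.sqrt(t)
    return std_normal_cdf((b - x) / scale) - std_normal_cdf((a - x) / scale)

# Probability of being in [1, 3] at time 2 when starting from 0.5.
p = brownian_kernel(2.0, 0.5, 1.0, 3.0)

# Translating both the start and the target by 4 leaves the probability unchanged.
p_translated = brownian_kernel(2.0, 0.5 + 4.0, 1.0 + 4.0, 3.0 + 4.0)

# Reflecting both the start and the target around 0 leaves it unchanged as well.
p_reflected = brownian_kernel(2.0, -0.5, -3.0, -1.0)

print(p, p_translated, p_reflected)
```

These invariances are what later make translations and reflections natural candidates for symmetries of Brownian motion and its variants.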
There are many variants of Brownian motion that we will use and describe in detail in further examples. For instance, a drift may be added. Another classic variant is when "walls" are added: when the process hits a wall, it can either vanish (boundary with absorption) or bounce back (boundary with reflection). These correspond to the variants of random walk that we discussed in Example 3.

Hitting times and stopping times
We will use the notions of hitting times and stopping times later. We will follow the definitions of Karatzas and Shreve in [17].
Throughout this section, we assume that $(X_t)_{t \geq 0}$ is a stochastic process on $(\Omega, \mathcal{F})$ such that $(X_t)_{t \geq 0}$ takes values in a state space $(E, \mathcal{B}(E))$, has right-continuous paths (for every time $t$, $\omega(t) = \lim_{s \to t, s > t} \omega(s)$) and is adapted to a filtration $(\mathcal{F}_t)_{t \geq 0}$.
Definition 16. Given a measurable set $C \in \mathcal{B}(E)$, the hitting time is the random time $T_C(\omega) = \inf\{t \geq 0 \mid X_t(\omega) \in C\}$. Intuitively, $T_C(\omega)$ corresponds to the first time when the trajectory $\omega$ "touches" the set $C$. The verb "touches" in the previous sentence is important: this time is obtained as an infimum, not a minimum.
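On a sampled trajectory, the hitting time of Definition 16 can only be approximated. The sketch below is illustrative: `hitting_time` is a hypothetical helper, the infimum over all times is replaced by a minimum over a finite time grid (an assumption that loses exactly the infimum/minimum subtlety noted above), and the usual convention that the infimum of the empty set is infinite is adopted.

```python
import math

def hitting_time(times, path, in_target):
    """First sampled time at which the path lies in the target set, or math.inf if never."""
    for t, x in zip(times, path):
        if in_target(x):
            return t
    return math.inf

# A hypothetical sampled trajectory on a grid of step 0.5.
times = [0.0, 0.5, 1.0, 1.5, 2.0]
path = [0.2, 0.6, 1.1, 0.9, 1.4]

# Hitting time of the closed set [1, +infinity).
t_closed = hitting_time(times, path, lambda x: x >= 1.0)

# A set that the sampled trajectory never enters.
t_never = hitting_time(times, path, lambda x: x >= 5.0)

print(t_closed, t_never)
```

Note that deciding whether the returned value is at most $t$ only requires the samples up to time $t$, which is the intuition behind stopping times.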

Definition 17. A random time $T$ is a stopping time of the filtration $(\mathcal{F}_t)_{t \geq 0}$ if for every time $t$, the event $\{T \leq t\}$ belongs to the σ-field $\mathcal{F}_t$. A random time $T$ is an optional time of the filtration $(\mathcal{F}_t)_{t \geq 0}$ if for every time $t$, $\{T < t\} \in \mathcal{F}_t$.
Recall that the intuition behind the filtration $(\mathcal{F}_t)_{t \geq 0}$ is that the information about the process up to time $t$ is stored in the σ-algebra $\mathcal{F}_t$. So intuitively, the random time $T$ is a stopping time (resp. an optional time) if and only if there is enough information gathered at time $t$ to decide whether $T \leq t$ (resp. $T < t$). We provide examples and counterexamples to this definition in the remainder of this section.
A stopping time is also an optional time but the converse may not be true. The next lemma may provide some additional intuition and some examples behind those notions.
If the set C is closed and the paths of the process X are continuous, then T C is a stopping time.
Let us give an example of a random time that is neither a stopping time nor an optional time.
Given a continuous trajectory $\omega : [0, +\infty) \to \mathbb{R}$, we define $T_{\max}(\omega)$ as the time at which $\omega$ reaches its maximum value. In order to know if $T_{\max}(\omega) < t$, one would need to know the maximum value attained by the trajectory $\omega$. More generally, any such random time that requires knowledge of the future behaviour of a trajectory is neither a stopping time nor an optional time.
Remark 19. For Brownian motion, we write $T_x$ instead of $T_{\{x\}}$ for $x \in \mathbb{R}$.

Naive definition
A naive extension of the discrete-time definition of bisimulation, where the induction condition would only require that steps of time $t$ (for every $t \geq 0$) preserve the relation, does not work in continuous time.
Indeed, consider Brownian motion on the real line with a single atomic proposition such that $\mathrm{obs}(x) = 0$ if and only if $x = 0$. The equivalence $R$ defined by $x \mathrel{R} y$ if and only if $x = y = 0$ or $xy \neq 0$ would satisfy the hypotheses of this naive extension of bisimulation. This would mean that any two non-zero states, regardless of their distance to 0, would be considered equivalent. This does not nicely extend to continuous time the intuition that we should get from the random walk (see Example 3).
As it turns out, the problem lies with considering single time-steps, since for every state $z \neq 0$ and every time $t \geq 0$, $P_t(z, \{0\}) = 0$. Had we considered instead probabilities of reaching the state 0 over an interval of time ($\mathbb{P}^z(T_0 < t)$ for instance), we would not have had the same equivalence.
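The contrast between single time-steps and intervals of time can be made concrete for standard Brownian motion. The sketch below relies on the classical reflection principle, which gives $\mathbb{P}^z(T_0 < t) = 2\,(1 - \Phi(|z|/\sqrt{t}))$ for $z \neq 0$ and $t > 0$, where $\Phi$ is the standard normal distribution function; this closed form is brought in here for illustration and is not taken from the text, and the name `hit_zero_before` is hypothetical.

```python
import math

def hit_zero_before(t, z):
    """Probability that standard Brownian motion started at z != 0 hits 0 before time t > 0.

    Uses the reflection principle: 2 * (1 - Phi(|z| / sqrt(t))) = erfc(|z| / sqrt(2 t)).
    """
    return math.erfc(abs(z) / math.sqrt(2.0 * t))

# The one-step kernel cannot tell 1 and 2 apart: both give probability 0 to the set {0}.
# The hitting probabilities over the time interval [0, 1) do tell them apart.
near = hit_zero_before(1.0, 1.0)
far = hit_zero_before(1.0, 2.0)

print(near, far)
```

Since the two values differ, the states 1 and 2 are no longer identified once events about whole trajectories are taken into account, in line with the random walk intuition that only states at the same distance from 0 should be equivalent.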
This is why we deal with trajectories instead of steps.

Closedness for sets of trajectories
Throughout the remainder of section 4, we fix an FD-process as in section 3.1.
Since the conditions for being a behavioural equivalence have to be stated for trajectories, we have to extend the notion of $R$-closedness of a set of states to a set of trajectories. This is what is done in the following definition.
• it is measurable (cf. Definition 13, where we defined the σ-algebra on the space of trajectories of a canonical FD-process),
• for all trajectories $\omega$ and $\omega'$ such that obs
Recall that $2^{AP}$ is equipped with the discrete σ-algebra. The σ-algebra on $(2^{AP})^{\mathbb{N}}$ is generated by the family of sets $(S_{k,A})_{k \in \mathbb{N}, A \subseteq 2^{AP}}$ where
Remark 22. The last condition with the map $\Pi$ may look unnatural; it was originally introduced as a technical trick to complete the correspondence with discrete time. However, it can be motivated as follows: the function $\mathrm{obs}$ determines what an external observer may view of the state space and hence the position of the process. The intuition behind the map $\Pi$ is hence "given a trajectory $\omega$, what does an outside user see of the state of the system at fixed times (every unit of time)?".
That last condition deals with the question "how often should someone be allowed to observe the system?" and answers that an outside observer should at least be allowed to probe the process once every unit of time.

Bisimulation
There are two conditions (initiation and induction conditions) that can be modified to account for trajectories. Depending on which one is adapted, we get either temporal equivalence or bisimulation. In [6], we introduced only the notion of "bisimulation" where the induction condition is adapted to trajectories.
Definition 23. A bisimulation is an equivalence relation $R$ on $E$ such that for all $x, y \in E$, if $x \mathrel{R} y$, then
(initiation) $\mathrm{obs}(x) = \mathrm{obs}(y)$, and
(induction) for all measurable time-$R$-closed sets $B$, $\mathbb{P}^x(B) = \mathbb{P}^y(B)$.
Remark 24. Even though condition (induction) is called an induction, it really is a coinduction condition.
Proposition 25. There is a greatest bisimulation called bisimilarity. It is the union of all bisimulations.

Temporal Equivalence
We later came up with other notions of behavioural equivalence, and in [7] we proposed three other notions. It is quite likely that there are other interesting notions to be studied in future work.
Definition 26. A temporal equivalence is an equivalence relation $R$ on $E$ such that for all $x, y \in E$, if $x \mathrel{R} y$, then
(initiation) for all time-obs-closed sets $B$, $\mathbb{P}^x(B) = \mathbb{P}^y(B)$, and
(induction) for all measurable $R$-closed sets $C$ and for all times $t$, $P_t(x, C) = P_t(y, C)$.
Proposition 27. There is a greatest temporal equivalence, which is the union of all temporal equivalences.
Definition 28. Two states that are related by a temporal equivalence are called temporally equivalent.

Trace equivalence
Another well-known concept is that of trace equivalence, which corresponds to the initiation condition of temporal equivalence. Temporal equivalence can thus be viewed as trace equivalence which additionally accounts for step-like branching.
Definition 29. Two states $x$ and $y$ are trace equivalent if and only if for all time-obs-closed sets $B$, $\mathbb{P}^x(B) = \mathbb{P}^y(B)$.
We will later see in detail how it relates to the standard notion of trace equivalence for discrete-time processes (see section 4.8).

Relation between these equivalences
We now show that, among these three notions, trace equivalence is the weakest and bisimulation the strongest, as hinted above.
Theorem 30. A bisimulation is also a temporal equivalence. If two states are temporally equivalent, then they are trace equivalent.
Proof. Let $R$ be a bisimulation and consider two states $x$ and $y$ such that $x \mathrel{R} y$.
Consider a time-obs-closed set $B$. Then it is also time-$R$-closed, since $R$-related states have the same observations by the initiation condition. Using the induction condition of bisimulation, we get that $\mathbb{P}^x(B) = \mathbb{P}^y(B)$.
Consider a measurable $R$-closed set $C$ and a time $t$. The set $X_t^{-1}(C) = \{\omega \mid \omega(t) \in C\}$ is measurable and time-$R$-closed. We can then apply the induction condition and we get $P_t(x, C) = \mathbb{P}^x(X_t^{-1}(C)) = \mathbb{P}^y(X_t^{-1}(C)) = P_t(y, C)$. This concludes the proof that $R$ is a temporal equivalence.
The second part of the theorem follows directly from the initiation condition of a temporal equivalence: this is precisely trace equivalence.
We provide in section 5.1.2 an example where the greatest temporal equivalence is strictly greater than trace equivalence. It is still an open question whether bisimulation and temporal equivalence are the same notion or whether they coincide only for a class of processes. We refer the reader to section 5 for examples of these equivalences.

Symmetries of systems
An interesting notion that appears when looking at examples is that of symmetries of a system. It was the third notion introduced in [7].
Given a function h : A group of symmetries is a set of bijections on the state space that respect the dynamics of the FD-process.
Definition 32. Given a group of symmetries $H$ on $E$, we denote by $R_H$ the equivalence defined on $E$ as follows: $x \mathrel{R_H} y$ if and only if there exists $h \in H$ such that $h(x) = y$. The fact that $H$ is a group guarantees that $R_H$ is an equivalence.
One of the requirements for being a group of symmetries is to be closed under inverse and composition. This condition is useful for getting an equivalence on the state space; however, it is usually easier (when possible) to view a group of symmetries as generated by a set of homeomorphisms.
Lemma 33. Consider a set of homeomorphisms $H_{gen}$ on the state space $E$ and define $H$ as the closure under inverse and composition of the set $H_{gen}$. Assume that the set $H_{gen}$ satisfies the following conditions:
• for all $f \in H_{gen}$, $\mathrm{obs} \circ f = \mathrm{obs}$, and
• for all measurable sets $B$ such that for all $f \in H_{gen}$, $f_*(B) = B$, for all $x \in E_\partial$ and for all $g \in H_{gen}$, $\mathbb{P}^x(B) = \mathbb{P}^{g(x)}(B)$.
Then the set $H$ is a group of symmetries.
Theorem 34. Given a group of symmetries $H$, the equivalence $R_H$ is a bisimulation.
The proof of Theorem 34 requires two additional lemmas. We will omit the proof of the first one since it is straightforward. To prove the converse implication, consider $\omega \in B$ and define $\omega'$ as $\omega' = (h^{-1})_*(\omega)$. Note that for all times $t$, h(ω

Comparison to discrete time
Our work aims at extending the notion of bisimulation from discrete time to continuous time. It is possible to view a discrete-time process as a continuous-time one as explained below. It is important to check that our notions encompass the pre-existing notion of bisimulation in discrete time.
It is common in discrete time to consider several actions. Everything that was presented for continuous time can easily be adapted to accommodate several actions. However, we will not mention actions in this section for the sake of readability.
Consider an LMP $(S, \Sigma, \tau, \mathrm{obs})$ where $\Sigma$ is the Borel σ-algebra generated by a topology on $S$. We also assume that this LMP has at most finitely many atomic propositions. We can always view it as an FD-process where transitions happen at every unit of time. Since the corresponding continuous-time process has to satisfy the Markov property, it cannot keep track of how long it has spent in a state of the LMP. This is why we need to include a "timer" into states. A state in the corresponding FD-process is thus a pair of a state in $S$ and a time recording how long it has been since the last transition. That time is in $[0, 1)$; the bound is open on the right so that trajectories are cadlag.
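Formally, the state space of the FD-process is $(E, \mathcal{E})$ where the space is defined as $E = S \times [0, 1)$ and is equipped with the product topology and the corresponding Borel σ-algebra $\mathcal{E}$. The corresponding kernel is defined for all $x \in S$, $C \in \mathcal{E}$, $t \geq 0$ and $s \in [0, 1)$ as $P_t((x, s), C) = \tau_{\lfloor t + s \rfloor}(x, C')$, where $C' = \{z \in S \mid (z, t + s - \lfloor t + s \rfloor) \in C\}$ and $\tau_n$ denotes the $n$-step kernel obtained by composing $\tau$ with itself $n$ times. We also define $\mathrm{obs}(x, s) = \mathrm{obs}(x)$.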
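The timer construction can be sketched on a finite example. The code below is an illustration under explicit assumptions, not the paper's definition: it assumes that, starting from a pair (x, s) with timer s in [0, 1), after time t the timer reads the fractional part of t + s and exactly floor(t + s) discrete transitions have occurred. The two-state kernel `TAU` and the helpers `n_step` and `timed_kernel` are hypothetical.

```python
import math
from fractions import Fraction

# Hypothetical two-state LMP without actions: one-step kernel as nested dictionaries.
TAU = {
    "a": {"a": Fraction(1, 2), "b": Fraction(1, 2)},
    "b": {"b": Fraction(1)},
}

def n_step(x, n):
    """Distribution over states after n discrete steps from x, by nesting the kernel."""
    dist = {x: Fraction(1)}
    for _ in range(n):
        nxt = {}
        for state, p in dist.items():
            for target, q in TAU[state].items():
                nxt[target] = nxt.get(target, Fraction(0)) + p * q
        dist = nxt
    return dist

def timed_kernel(t, x, s):
    """Distribution of the (state, timer) pair after time t, started at (x, s), 0 <= s < 1."""
    jumps = math.floor(t + s)   # number of unit-time transitions that have occurred
    timer = (t + s) - jumps     # time elapsed since the last transition
    return {(y, timer): p for y, p in n_step(x, jumps).items()}

# No transition yet: only the timer advances.
print(timed_kernel(Fraction(1, 2), "a", Fraction(1, 4)))

# Two transitions have occurred and the timer has wrapped around twice.
print(timed_kernel(Fraction(3, 2), "a", Fraction(3, 4)))
```

Times are represented as exact fractions so that the floor and the fractional part are computed without rounding. The timer component evolves deterministically, which is why the resulting process still satisfies the Markov property.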
This gives us a way to view an LMP as a continuous-time process and thus to compare the definition of bisimulation on the original LMP to the behavioural equivalences for the corresponding continuous-time process.In order to make clear when we talk about discrete time bisimulation and in order to avoid confusion for the reader, we will write "DT-bisimulation" for bisimulation in discrete time as was defined in section 2.2.
Lemma 38. If two states $x$ and $y$ of an LMP $(S, \Sigma, \tau, \mathrm{obs})$ are DT-bisimilar, then for all $n \geq 1$, for all measurable $R$-closed (where $R$ is the greatest DT-bisimulation) sets $A_1, \dots, A_n$, $x_1 \in A_1$ ...
Proof. Denote by $\pi_R : S \to S/R$ the corresponding quotient map. We equip the quotient space $S/R$ with the largest σ-algebra which makes the map $\pi_R$ measurable. We can also define the function
Note that the choice of $x$ (within an $R$-class) does not change the right term since $R$ is a DT-bisimulation and $\pi_R^{-1}(A)$ is an $R$-closed set: you can replace $x$ by $x'$ as long as $x \mathrel{R} x'$. A sequence of changes of variables yields: ...
It is possible to define the notion of trajectories in the LMP and that of trace equivalence just as we did in the case of FD-processes. A trajectory is a function $\omega : \mathbb{N} \to S \cup \{\partial\}$ such that if
Similarly to what was done in section 3.1, for a state $x$ we can define the probability distribution $\mathbb{P}^x$ on the set of trajectories that extends the finite-time distributions using the Daniell-Kolmogorov theorem. This allows us to define trace equivalence: two states $x$ and $y$ are trace equivalent if for all time-obs-closed sets $B$, $\mathbb{P}^x(B) = \mathbb{P}^y(B)$.
An important remark has to be made here. Traditionally, trace equivalence for discrete time is much simpler than our definition of trace equivalence: two states $x$ and $y$ are DT-trace equivalent if for all measurable sets $C_0, \dots, C_n$ such that for every $i$, for all $z, z'$ such that obs(z
Lemma 38 shows that any two DT-bisimilar states are DT-trace equivalent. Our definition of trace equivalence is stronger than this: it requires that $\mathbb{P}^x$ and $\mathbb{P}^y$ agree for all time-obs-closed sets. The sets $\{X_0 \in C_0, \dots, X_n \in C_n\}$ are only examples of such time-obs-closed sets.
Lemma 39. In an LMP with finitely many atomic propositions, any two states $x$ and $y$ that are DT-bisimilar are trace equivalent.
Proof. We need to prove that for any time-obs-closed set $B \subseteq \mathbb{N} \to S \cup \{\partial\}$, $\mathbb{P}^x(B) = \mathbb{P}^y(B)$.
Recall the definition of $\Pi$ from the definition of time-obs-closed (see Definition 21): for a trajectory )) always holds. For the converse direction, consider $\omega \in \Pi^{-1}(\Pi(B))$, i.e. there exists $\omega' \in B$ such that $\Pi(\omega) = \Pi(\omega')$. This means that for every $n$, obs(ω and using the fact that $B$ is time-obs-closed, we obtain that $\omega \in B$, which proves the desired equality.
Consider a finite cylinder on $\mathbb{N} \to 2^{AP}$, i.e. a set of the form: where $n$ is an integer and for every $i$, $A_i$ is a subset of $2^{AP}$. Since there are only finitely many atomic propositions, the set $2^{AP}$ is finite and the sets $A_i$ are measurable. Lemma 38 shows that for any finite cylinder $C$ on $\mathbb{N} \to 2^{AP}$, $\mathbb{P}^x(\Pi^{-1}(C)) = \mathbb{P}^y(\Pi^{-1}(C))$. The finite cylinders form a generating π-system of the σ-algebra on $\mathbb{N} \to 2^{AP}$, which means that for every measurable subset $A$ of $\mathbb{N} \to 2^{AP}$, $\mathbb{P}^x(\Pi^{-1}(A)) = \mathbb{P}^y(\Pi^{-1}(A))$. We can now look at our set $B$.
The first and third equations hold since $B = \Pi^{-1}(\Pi(B))$ and the second equation holds since $\Pi(B)$ is measurable and for every measurable subset
Remark 40. It may look convoluted to use the map $\Pi$ and move to the space $\mathbb{N} \to 2^{AP}$ instead of $\mathbb{N} \to S \cup \{\partial\}$. However, this is a necessity due to the fact that we do not know what the σ-algebra of time-obs-closed sets looks like. We know that it contains the σ-algebra generated by the sets ), but this inclusion is not an equality.
Proposition 41. If the equivalence $R$ is a DT-bisimulation, then the relation $R'$ defined as [...]
Since $(x, s) \, R' \, (y, s)$, we also have that $x \, R \, y$. By lemma 38, we have that $\tau_{\lfloor t+s \rfloor}(x, C') = \tau_{\lfloor t+s \rfloor}(y, C')$. This allows us to conclude that $P_t((x, s), C) = P_t((y, s), C)$.
The initiation condition (trace equivalence) is a direct consequence of lemma 39.
Proposition 42. If the equivalence $R$ is a temporal equivalence, then the relation $R'$ defined as the transitive closure of the relation [...]
Proof. The relation $R'$ is indeed an equivalence.
Let us consider $x \, R' \, y$, i.e. there exist $(x_i)_{i=1,...,n}$, $(t_i)_{i=1,...,n-1}$ and $(t'_i)_{i=1,...,n-1}$ such that $x = x_1$, $y = x_n$ and for every $1 \leq i \leq n-1$, [...] The fact that $\mathrm{obs}(x) = \mathrm{obs}(y)$ is a direct consequence of the initiation condition of a temporal equivalence.
These results can be summed up in the following theorem relating temporal equivalence and DT-bisimulation.
Theorem 43. Two states $x$ and $y$ (in the LMP) are DT-bisimilar if and only if for all $t \in [0, 1)$, the states $(x, t)$ and $(y, t)$ (in the Feller-Dynkin process) are temporally equivalent.
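The correspondence in Theorem 43 rests on viewing a discrete-time process as a continuous-time one whose states carry a clock $s \in [0, 1)$. The following is a minimal sketch of that embedding for a finite chain, assuming the kernel has the form $P_t((x, s), \cdot) = \tau_{\lfloor t+s \rfloor}(x, \cdot)$ with new clock value $t + s - \lfloor t+s \rfloor$. The two-state matrix `TAU` and the helper names are hypothetical and chosen only for illustration.

```python
import math
from fractions import Fraction

# Hypothetical 2-state chain: TAU[x][z] is the one-step probability x -> z.
TAU = [[Fraction(1, 3), Fraction(2, 3)],
       [Fraction(1, 4), Fraction(3, 4)]]


def n_step(n):
    """Return the n-step transition matrix tau_n (tau_0 is the identity)."""
    size = len(TAU)
    result = [[Fraction(int(i == j)) for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = [[sum(result[i][k] * TAU[k][j] for k in range(size))
                   for j in range(size)] for i in range(size)]
    return result


def kernel(t, state):
    """Distribution after time t of the embedded process started at (x, s)."""
    x, s = state
    steps = math.floor(t + s)   # number of discrete jumps performed
    phase = t + s - steps       # new clock value, again in [0, 1)
    row = n_step(steps)[x]
    return {(z, phase): p for z, p in enumerate(row) if p > 0}


def compose(t, u, state):
    """Run the kernel for time t, then for time u (Chapman-Kolmogorov)."""
    out = {}
    for mid, p in kernel(t, state).items():
        for end, q in kernel(u, mid).items():
            out[end] = out.get(end, 0) + p * q
    return out
```

Under these assumptions the family is a semigroup: running for time `t` and then `u` gives the same distribution as running for `t + u`, which can be checked exactly with rational arithmetic.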

Examples
Examples are important for building intuition. The examples we provide fall into three categories: (mostly) deterministic drifts, Brownian motion (and its variants), and Poisson processes.
Most of these examples follow the same type of proof: first computing the trace equivalence, then providing a group of symmetries and showing that the equivalence generated by this group of symmetries corresponds to the trace equivalence.
We will only detail a few of these examples. All computations can be found in [8].
For the groups of symmetries of the deterministic drift, we define for every $x \in \mathbb{R}$ and every $k \in \mathbb{Z}$, $t_k(x) = x + k$, and for $x, y \in \mathbb{R}_{>0}$ the function $f_{x,y} : \mathbb{R} \to \mathbb{R}$ by [...] These functions are essentially identity functions with a rescaled portion in the middle to connect positive reals.

Fork
The following example is of particular interest because it shows that temporal equivalence and trace equivalence are not equivalent notions. It emphasizes the importance of the induction conditions in the definitions of temporal equivalence and of bisimulation. It is an extension of a standard example in discrete time (see section 4.1, p.86 of [20]) to our continuous-time setting. The process is a deterministic drift at constant speed with a single probabilistic fork (the process then goes to either branch with equal probability). We compare the case where the fork is at the start with the case where the fork happens later. There are two atomic propositions $P$ and $Q$ which make it possible to tell the ends of each fork apart.
One can find in [8] a detailed description of the process and of the proofs. We will focus here on intuition.
A state is a pair $(x, b)$ where $x$ represents the time to be reached from the "origins" $x_0$ and $y_0$ and $b$ represents the different "branches" of the state space. The state space with its two atomic propositions $P$ and $Q$ is the following:
[Figure: state space of the fork process, with its branches and the atomic propositions $P$ and $Q$]
This process is a deterministic drift at constant speed towards the right. When the process encounters a fork (in $x_0$ or in $y_1$), it randomly picks between the upper and lower branch and then continues as a deterministic drift to the right.
It is easy to see that two states satisfying condition 2 or 3 are trace equivalent.
In order to show that no other non-equal states can be trace equivalent, note that, by lemma 44, it is enough to show that $(t, 2)$, $(t, 3)$ and $(t, 4)$ are not trace equivalent. It is enough to consider the set of trajectories seeing the atomic proposition $P$ after letting time go for 100: [...] This set is time-obs-closed, but we have that $P^{(t,2)}(B_P) = 1$, $P^{(t,3)}(B_P) = 0$ and $P^{(t,4)}(B_P) = 1/2$.
Proposition 46. The states $x_0$ and $y_0$ are not temporally equivalent (and hence they are not bisimilar either).
Using lemma 44, this shows that the states $x_1$, $x_2$ and $y_1$ can only be temporally equivalent to themselves.
Third, we can consider the set $\{x_1\}$. We have just shown that for any temporal equivalence $R$, the set [...] This bisimulation is generated by the group of symmetries $\{f, g, \mathrm{id}\}$ where $f$ permutes branches 2 and 5 and $g$ similarly permutes branches 3 and 6.
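The gap between trace equivalence and temporal equivalence in this example can be illustrated on a discrete abstraction of the fork. The sketch below is not the continuous-time process itself: it is a hypothetical finite chain (state names such as `x_up` and `y_P` are invented here) in which `x0` forks immediately and `y0` forks one step later. It computes exact distributions over observation sequences.

```python
from collections import defaultdict
from fractions import Fraction

HALF = Fraction(1, 2)
ONE = Fraction(1)

# Hypothetical abstraction: state -> list of (successor, probability).
TRANSITIONS = {
    "x0": [("x_up", HALF), ("x_down", HALF)],   # early fork
    "x_up": [("x_P", ONE)],
    "x_down": [("x_Q", ONE)],
    "x_P": [("x_P", ONE)],
    "x_Q": [("x_Q", ONE)],
    "y0": [("y1", ONE)],                        # late fork
    "y1": [("y_P", HALF), ("y_Q", HALF)],
    "y_P": [("y_P", ONE)],
    "y_Q": [("y_Q", ONE)],
}

# Observation at each state: "" when neither atomic proposition holds.
OBS = {"x0": "", "x_up": "", "x_down": "", "x_P": "P", "x_Q": "Q",
       "y0": "", "y1": "", "y_P": "P", "y_Q": "Q"}


def trace_distribution(start, steps):
    """Exact distribution over observation sequences of length steps + 1."""
    paths = {(start,): ONE}
    for _ in range(steps):
        extended = defaultdict(Fraction)
        for path, prob in paths.items():
            for successor, p in TRANSITIONS[path[-1]]:
                extended[path + (successor,)] += prob * p
        paths = extended
    traces = defaultdict(Fraction)
    for path, prob in paths.items():
        traces[tuple(OBS[s] for s in path)] += prob
    return dict(traces)
```

In this abstraction `x0` and `y0` induce the same distribution over observation sequences, yet after one step `x_up` sees $P$ with probability 1 while `y1` sees it with probability 1/2. This is the kind of discrepancy that the induction condition detects and the initiation condition alone does not.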

Examples based on Brownian motion
All these examples are variants of Brownian motion with either a single atomic proposition or none (the first two cases of absorbing wall).
Using the additional constraint that $(t_k)_*(B) = B$, we obtain the desired result. For $s_k$, note that $s_k = t_k \circ s$. Brownian motion is invariant under symmetry, which means that $P^x(B) = P^{s(x)}(s_*(B)) = P^{s(x)}(B)$ and therefore [...]
Proposition 48. Two states $x$ and $y$ are trace equivalent if and only if $x - \lfloor x \rfloor = y - \lfloor y \rfloor$ or $x - \lfloor x \rfloor = \lceil y \rceil - y$.
Proof. The equivalence generated by the group of symmetries {id, [...] And therefore any two states $x$ and $y$ such that $x - \lfloor x \rfloor = y - \lfloor y \rfloor$ or $x - \lfloor x \rfloor = \lceil y \rceil - y$ are trace equivalent.
Let us show that no other states are trace equivalent. First note that if $x \in \mathbb{Z}$ and $y \notin \mathbb{Z}$, they cannot be trace equivalent. Let us now consider two trace equivalent states $x$ and $y$ that are not in $\mathbb{Z}$. Consider the sets $B_t = \{\omega \mid \exists s \in [0, t), \ \omega(s) \in \mathbb{Z}\}$. These sets are time-obs-closed:
• If $\omega \in B_t$ (i.e. there exists a time $s < t$ such that $\omega(s) \in \mathbb{Z}$) and $\omega'$ is such that for every time $u$, $\mathrm{obs}(\omega(u)) = \mathrm{obs}(\omega'(u))$, then in particular $\omega'(s) \in \mathbb{Z}$ and $\omega' \in B_t$.
• The sets $B_t$ can also be expressed as $B_t = \bigcup_{n \in \mathbb{Z}} T_n^{-1}([0, t))$, where $T_n$ is the hitting time of $\{n\}$. From Lemma 18, we get that $T_n^{-1}([0, t))$ is measurable and hence that the sets $B_t$ are measurable.
• We can now check the final condition: Π(B t ) = {0, 1} N which is measurable.
Let us compute $P^z(B_t)$ for any $z \in \mathbb{R}$. Since the distribution of Brownian motion is invariant under translation with respect to its starting point, we have that for every $z \in \mathbb{R}$ and $t > 0$,
$$P^z(B_t) = P^{z - \lfloor z \rfloor}(T_0 \wedge T_1 < t).$$
Therefore, it is sufficient to study the distribution of $T_0 \wedge T_1$ under $P^c$ for every $c \in [0, 1)$. We claim that, for any pair $c, c' \in [0, 1)$, $T_0 \wedge T_1$ has identical distribution under $P^c$ and $P^{c'}$ if and only if $|c - \frac{1}{2}| = |c' - \frac{1}{2}|$, i.e., either $c = c'$ or $c = 1 - c'$. To see this, we consider the Laplace transform of $T_0 \wedge T_1$ under $P^c$ and use the formula 1.3.0.1 of [5] to write
$$E^c\left[e^{-\lambda (T_0 \wedge T_1)}\right] = \frac{\cosh\left((c - \frac{1}{2})\sqrt{2\lambda}\right)}{\cosh\left(\frac{1}{2}\sqrt{2\lambda}\right)}$$
for every $\lambda > 0$.
Then, $T_0 \wedge T_1$ has identical distribution under $P^c$ and $P^{c'}$ if and only if $E^c[e^{-\lambda (T_0 \wedge T_1)}] = E^{c'}[e^{-\lambda (T_0 \wedge T_1)}]$ for every $\lambda > 0$, which is equivalent to $|c - \frac{1}{2}| = |c' - \frac{1}{2}|$.
Thus, we have proven that for every pair $x, y \in \mathbb{R}$, $P^x(B_t) = P^y(B_t)$ for every $t > 0$ if and only if $x - \lfloor x \rfloor = y - \lfloor y \rfloor$ or $x - \lfloor x \rfloor = \lceil y \rceil - y$.
The equivalence generated by this group of symmetries is trace equivalence, which shows that trace equivalence, the greatest bisimulation and the greatest temporal equivalence are identical.
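The Laplace-transform argument can be checked numerically. The sketch below assumes the standard closed form for the exit time of a standard Brownian motion from $(0, 1)$ started at $c$, namely $\cosh((c - 1/2)\sqrt{2\lambda}) / \cosh(\sqrt{2\lambda}/2)$; the function name `laplace_exit_time` is hypothetical.

```python
import math


def laplace_exit_time(c, lam):
    """E^c[exp(-lam * (T_0 ^ T_1))] for standard Brownian motion, c in [0, 1]."""
    root = math.sqrt(2.0 * lam)
    return math.cosh((c - 0.5) * root) / math.cosh(0.5 * root)


def same_exit_law(c1, c2, lams=(0.1, 1.0, 5.0, 25.0)):
    """Compare the two transforms on a few sample values of lam."""
    return all(math.isclose(laplace_exit_time(c1, lam), laplace_exit_time(c2, lam))
               for lam in lams)
```

Since $\cosh$ is even and strictly increasing on the non-negative reals, the transform depends on $c$ only through $|c - 1/2|$: `same_exit_law(0.2, 0.8)` holds while `same_exit_law(0.2, 0.3)` does not. Agreement on finitely many sample values of `lam` is of course only a sanity check, not a proof.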

Brownian motion with drift
Let us consider a Brownian process with an additional drift: where for every x ∈ R and every k ∈ Z, t k (x) = x + k.
With zero distinguished: Let us consider the case when there is a single atomic proposition and obs(x) = 1 if and only if x = 0.
Proposition 49.Two states are trace equivalent if and only if they are equal.
Proof.The only thing to show is that two different states cannot be trace equivalent.
Define the set [...] Similarly to what was done in the proof of proposition 48, this set is time-obs-closed.
Therefore, it is sufficient to study the distribution of $T_0$ under $P^z$ for every $z \in \mathbb{R}$. We claim that, for any pair $x, y$, $T_0$ has identical distribution under $P^x$ and $P^y$ if and only if $x = y$. To see this, we consider the Laplace transform of $T_0$ under $P^z$ for an arbitrary state $z \in \mathbb{R}$ and use the formula 2.2.0.1 of [5] to write [...] Then, $T_0$ has identical distribution under $P^x$ and $P^y$ if and only if $E^x[e^{-\lambda T_0}] = E^y[e^{-\lambda T_0}]$ for every $\lambda > 0$, which is equivalent to [...] which ultimately means that $x = y$.
This shows that there is a single group of symmetries possible: {id} and that no two different states can be bisimilar or temporally equivalent.
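As a worked illustration of why the Laplace transform of $T_0$ separates states, here is a sketch under stated assumptions. It assumes, hypothetically, that the process started at $x$ is $X_t = x + W_t + \mu t$ with a constant drift $\mu \neq 0$ (the symbol $\mu$ and this parametrization are assumptions, not notation taken from the text), and it uses the standard closed form for the hitting time of 0 by a Brownian motion with constant drift.

```latex
% Assumption: X_t = x + W_t + \mu t with constant drift \mu \neq 0.
\[
  E^x\!\left[e^{-\lambda T_0}\right]
  = \exp\!\left(-\mu x - |x| \sqrt{2\lambda + \mu^2}\right),
  \qquad \lambda > 0 .
\]
% If the transforms at x and y agree for every \lambda > 0, then
\[
  \mu x + |x| \sqrt{2\lambda + \mu^2}
  = \mu y + |y| \sqrt{2\lambda + \mu^2}
  \qquad \text{for every } \lambda > 0 .
\]
% Subtracting the identity at two values \lambda_1 \neq \lambda_2 gives
\[
  \left(|x| - |y|\right)
  \left(\sqrt{2\lambda_1 + \mu^2} - \sqrt{2\lambda_2 + \mu^2}\right) = 0 ,
\]
% so |x| = |y|, and then \mu x = \mu y, hence x = y because \mu \neq 0.
```

Under these assumptions, the drift term $\mu x$ is what breaks the reflection symmetry $x \mapsto -x$ that a driftless Brownian motion would have.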

Brownian motion with absorbing wall
Another common variation of Brownian motion is to add boundaries and to consider that the process stops moving, or dies, once it has hit a boundary. The boundary is called an "absorbing wall".
The standard Brownian motion on the real line is denoted $W_t$ and $\Omega$ is its space of trajectories. Assume the boundaries are at 0 and $z$. We introduce two hitting times $T_0$ and $T_z$ (which are stopping times thanks to Lemma 18). We can now define the stopping time $T = T_0 \wedge T_z$. Finally, we obtain the Brownian motion with absorbing walls at 0 and $z$: for any $\omega$ in $\Omega$, [...] Note that here $\Omega$ is the space of trajectories for the original Brownian motion and serves as the source of randomness for the new process.
Let us clarify why we have chosen this to be the state space; essentially the reason is to ensure that the trajectories are cadlag. The boundaries are not included in the state space, i.e. the process with absorbing walls at 0 and $z$ has $(0, z)$ as its state space. This is forced by the requirement that trajectories are cadlag. To see this, consider what would happen if the wall was included in the state space. We would have to modify the behaviour of the process at 0 and $z$. But what would these modified trajectories look like? They would look like a trajectory of a standard Brownian motion: continuous until it reaches either state 0 or $z$ (assume it happens at time $s$), and then the trajectory would jump to the state $\partial$, i.e. the trajectory would be continuous on $[0, s]$ and then perform a jump and be continuous (constant even) on $(s, +\infty)$. This is not a cadlag trajectory because it is continuous on the "wrong side".

Poisson process
This is an example that we considered in [7]. A Poisson process models a continuous-time process in which a discrete variable is incremented. A standard example is the arrival of customers in a queue. Let us fix some notation: a Poisson process is a non-decreasing process $(N_t)_{t \geq 0}$ taking values in the set of natural numbers $\mathbb{N}$, a discrete space. We define the set $\Omega$ of trajectories as usual on the state space. The probability distribution on the set $\Omega$ is defined as [...] We are going to study two cases. In the first case, we are able to test whether an even or odd number of customers have arrived. In the second case, we are able to test whether the total number of customers has reached a critical value.

Testing evenness of number of customers:
There is a single atomic proposition on the state space: obs(k) = 1 if and only if k is even.
Proposition 50.Two states x and y are bisimilar (resp.temporally equivalent, trace equivalent) if and only if x ≡ y mod 2.
Proof.Let us write R for the corresponding equivalence.
First, it is indeed a bisimulation. Consider $y = x + 2n$ where $n \in \mathbb{N}$ (note that $\mathrm{obs}(x) = \mathrm{obs}(y)$) and $B$ a measurable, time-$R$-closed set.
For a measurable set $B'$, $P^x(B') = P^{x+2n}(B' + 2n)$, where $B' + 2n = \{t \mapsto \omega(t) + 2n \mid \omega \in B'\}$. In particular $P^x(B) = P^y(B + 2n)$. Since $B$ is time-$R$-closed, $B + 2n \subset B$, so $P^x(B) \leq P^y(B)$.
This concludes the proof that $R$ is a bisimulation. Now, notice that $x \, R \, y$ if and only if $\mathrm{obs}(x) = \mathrm{obs}(y)$. Since this equivalence is weaker than trace equivalence, which is itself weaker than bisimulation, we have that $R$ is the trace equivalence and the greatest bisimulation (and similarly the greatest temporal equivalence).
Remark 51. This situation may look a lot like the deterministic or Brownian drift with parity as the atomic proposition. However, there is one key difference here: by only allowing non-negative numbers, we are preventing translations from being bijections on the state space, since those translations are not surjective. Therefore the set of translations by an even number cannot be a group of symmetries.
Proving that there is no greater group of symmetries than {id} is not as trivial as it may look.
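The parity case can be sanity-checked on one particular time-obs-closed set, the event that the count is even at a fixed time $t$. The sketch below assumes a homogeneous Poisson process whose increments over $[0, t]$ follow a Poisson law with mean `rate * t`; the parameter `rate`, the truncation `terms` and the function name are assumptions made for illustration only.

```python
import math


def prob_even_at_time(x, t, rate=1.0, terms=200):
    """P^x(N_t is even): sum the Poisson(rate * t) pmf over increments k
    such that x + k is even (series truncated after `terms` terms)."""
    mean = rate * t
    total = 0.0
    pmf = math.exp(-mean)          # probability of k = 0 arrivals
    for k in range(terms):
        if (x + k) % 2 == 0:
            total += pmf
        pmf *= mean / (k + 1)      # pmf of k + 1 from pmf of k
    return total
```

Starting from an even state, the value agrees with the closed form $(1 + e^{-2 \cdot \mathrm{rate} \cdot t})/2$, and it depends on $x$ only through its parity. This checks a single set $B$, whereas the proposition quantifies over all measurable time-$R$-closed sets.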
Testing for a critical value: Fix $m \in \mathbb{N}_{\geq 0}$. We define the function obs by $\mathrm{obs}(x) = 1$ if and only if $x \geq m$.
Proposition 52. Two states $x \neq y$ are bisimilar (resp. temporally equivalent, trace equivalent) if and only if $x, y \geq m$.
Let us show that it is a bisimulation. Consider $x \, R \, y$ and assume $x \neq y$. This means that $x, y \geq m$ and hence $\mathrm{obs}(x) = \mathrm{obs}(y)$.
Now, also consider a measurable time-$R$-closed set $B$. Define $B' = B \cap M \cap \{\omega \mid \omega(0) \geq m\}$ where $M$ is the set of non-decreasing functions. Similarly to the previous example, $M$ is measurable.
Note that the process can only realize non-decreasing trajectories, therefore $P^y(B) = P^y(B')$ (and similarly for $x$).
We have that $B' = \{\omega \mid \omega(0) \geq m\} \cap M$. This means that for every $z \geq m$, $P^z(B') =$ [...] And in particular, this shows that $P^x(B) = P^y(B) = 1$.
To prove that it is the greatest bisimulation, we show that it corresponds to trace equivalence. This proof can be adapted to show that $x, y \geq m$ are trace equivalent.
Clearly if $x < m$ and $y \geq m$, then $x$ and $y$ cannot be trace equivalent since the set {ω | obs(ω(0 [...] Consider the case when $x \neq y$ are both less than $m$. Consider a time $t > 0$ and define $B_t = \{\omega \mid \omega(t) \geq m\}$. This set is time-obs-closed: the first two conditions are straightforward and [...] This allows us to conclude that if $x \neq y$, $P^x(B_t) \neq P^y(B_t)$.
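The separating sets $B_t = \{\omega \mid \omega(t) \geq m\}$ can be evaluated explicitly. The following sketch again assumes Poisson increments with mean `rate * t` (the parameter `rate` and the function name are hypothetical): starting from $x < m$, the process is above $m$ at time $t$ exactly when at least $m - x$ arrivals have occurred.

```python
import math


def prob_reached_threshold(x, m, t, rate=1.0):
    """P^x(N_t >= m), i.e. at least m - x arrivals in [0, t]."""
    if x >= m:
        return 1.0                 # trajectories are non-decreasing
    mean = rate * t
    below = sum(math.exp(-mean) * mean ** k / math.factorial(k)
                for k in range(m - x))
    return 1.0 - below
```

For states at or above $m$ the probability is 1 whatever the state, while below $m$ it is strictly increasing in $x$, so two distinct states below $m$ are told apart by $B_t$.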
There are several observations that can be made. First, in all examples, the greatest temporal equivalence and the greatest bisimulation matched. We do not know if this is the case for all FD-processes or only for a subfamily of FD-processes. In particular, all the processes studied here are quite well-behaved and simple: Brownian motion has continuous trajectories (and not merely cadlag ones), for instance. Another interesting observation is that the group of symmetries corresponds to the intuition one has when handling a problem. Finally, notice how hitting times played a central role in these proofs.

FD-cospans: a Categorical Approach to Bisimulation
In this section we explore a more categorical way of looking at bisimulation.We extend the notion of zigzag from discrete time.The core idea is that an equivalence can be viewed as a span or cospan of morphisms.This work was published in [6].
The concept of bisimulation that we have discussed so far is defined between states of a process. One often wants to compare different processes with different state spaces. For this one needs to use functions that relate the state spaces of different processes. One does want to preserve the relational character of bisimulation. In the coalgebra literature one uses spans of so-called "zigzag" morphisms. In previous work [10] on (discrete-time) Markov processes, people have considered cospans as this leads to a smoother theory. Intuitively, the difference is whether one thinks of an equivalence relation as a set of ordered pairs or as a collection of equivalence classes. Spans and cospans of zigzags give rise to equivalent notions for analytic spaces, but it has been shown that this is not the case in general [27].
The intuition behind FD-homomorphisms is that they are quotients by bisimulations. However, topological properties are not always preserved by quotients. For instance, the quotient of a Hausdorff or locally compact space need not be Hausdorff or locally compact. This section is where the additional hypothesis on obs (end of section 3.1) is going to be vital: each atomic proposition $A$ partitions the state space into two subsets, one where $A$ is satisfied and another one where it is not satisfied. We assume that one of these subsets is open (and the other is closed).

Feller-Dynkin homomorphism
The definition of bisimulation can easily be adapted to states in different Markov processes by constructing the disjoint union of the Markov processes in the following manner. First, one constructs the disjoint union of the two state spaces as topological spaces. It is then possible to extend the semigroups on the two state spaces to a semigroup on the disjoint union of the state spaces. The construction of the time-indexed family of Markov kernels, of the space of trajectories and of the space-indexed family of probabilities on the space of trajectories is the same as the one described in 3.1.
In that context, a bisimulation is defined in the natural way: If x R y where x ∈ E i and y ∈ E j (i and j can be either 1 or 2 depending on which state space x and y are on), then: (initiation) obs i (x) = obs j (y), and (induction) for all measurable time-R-closed sets B, P x (B ∩ Ω i ) = P y (B ∩ Ω j ).This condition can also be stated as follows.For all sets B i ∈ G i and B j ∈ G j , P x i (B i ) = P y j (B j ) if the sets B i and B j satisfy the following conditions: for all trajectories ω in B i ∪ B j , To go from the first statement of the induction condition to the other, write B i = B ∩ Ω i and B j = B ∩ Ω j .To go from the second statement to the first statement is slightly trickier to write down since B may contain trajectories on both E 1 and E 2 (switching state space).However, those trajectories have probability zero.
Note that R ∩ (E j × E j ) is a bisimulation on (E j , E j , (P j t ), (P x j )).To proceed with our cospan idea we need a functional version of bisimulation; we call these Feller-Dynkin homomorphisms.
Definition 53. A continuous open surjective function $f : E \to E'$ is called a Feller-Dynkin homomorphism (FD-homomorphism) if it satisfies the following conditions:
• $\mathrm{obs} = \mathrm{obs}' \circ f$,
• for all $x \in E$ and for all measurable sets $B' \subset \Omega'$, $P'^{f(x)}(B') = P^x(B)$ where $B := \{\omega \in \Omega \mid f \circ \omega \in B'\}$.
Note that if $f$ and $g$ are FD-homomorphisms, then so is $g \circ f$. This notion of FD-homomorphisms is designed to capture some aspects of bisimulation, which is demonstrated in the following proposition.
Proposition 54.Given an FD-homomorphism f , the equivalence relation R defined on E E as Proof.Consider x and y such that x R y.We are going to assume that f (x) = f (y) and we will be treating the case xR f (x) at the same time.
Second, let us check the induction condition (induction).Write Ω and Ĝ for the set of trajectories and its σ -algebra on E E .Consider a measurable time-R-closed set B ∈ Ĝ .
Define the two sets As f is an FD-homomorphism, we have that P f (x) (B ) = P x (B).• Consider ω ∈ B ∩ Ω.The trajectory f • ω is well-defined and is in Ω since f is continuous.
Similarly to what was done for the first inclusion, we get that f • ω ∈ B since ω ∈ B and B is R-closed.This proves that ω ∈ B We get that P f (x) ( B ∩ Ω ) = P x ( B ∩ Ω).Since f (x) = f (y), we also get that P x ( B ∩ Ω) = P y ( B ∩ Ω).
A nice consequence of this result is that the equivalence relation $R$ defined on [...]
Proposition 55. Given an FD-process on a state space $E$ and a group of symmetries $H$ for that FD-process, there exists an FD-process on a state space $E'$ and an FD-homomorphism [...]
Let us show that $R \subset E \times E$ is closed. Consider a sequence $(x_n, y_n)$ in $R$ that converges to $(x, y)$. In particular, for every $n \in \mathbb{N}$, $\pi(x_n) = \pi(y_n)$. Furthermore, $\lim_n \pi(x_n) = \pi(x)$ and $\lim_n \pi(y_n) = \pi(y)$ since $\pi$ is continuous. By uniqueness of the limit, $\pi(x) = \pi(y)$ and hence $(x, y) \in R$.
Since the quotient map is open and the equivalence $R$ is closed, the space $E/R$ is Hausdorff.
Since the map $\pi$ is open and the space $E$ is locally compact, the space $E/R$ is also locally compact.
Let us now clarify what the FD-process is on that state space. First, we define for $x \in E/R$, $\mathrm{obs}'(x) = \mathrm{obs}(y)$ where $\pi(y) = x$. This is indeed well-defined: whenever $y \, R \, z$ for $y, z \in E$, there exists $h \in H$ such that $h(y) = z$ and hence $\mathrm{obs}(y) = \mathrm{obs}(h(y)) = \mathrm{obs}(z)$. Note that the function $\mathrm{obs}' : E/R \to 2^{AP}$ is measurable (the proof is similar to the proof of Lemma 71). Furthermore $\mathrm{obs} = \mathrm{obs}' \circ \pi$ by definition.
Let us denote Ω the set of trajectories on the one-point compactification of E/R: E ∂ .This defines a time-indexed family of random variables (X t ) t≥0 by X t (ω) = ω(t) for every trajectory ω ∈ Ω and every time t ≥ 0. The σ -algebra G on the set of trajectories Ω is defined as usual for FD-processes: We define the following family of probabilities on the set There are two things to show in order to check that Q x is well-defined: • The set B is measurable The proof is done by induction on the structure of B . - . And since π is continuous, we know that π −1 (C ) ∈ E and hence B ∈ G . - -If B = i∈N A i where for every i ∈ N, A i ∈ G and And since A i ∈ G for every i ∈ N, we get that B ∈ G .
• If y, z are in E and such that π(y) = π(z), then P y (B) = P z (B).Indeed, having π(y) = π(z) means that y R z where R is the bisimulation generated by the group of symmetries H .It is therefore enough to show that the set B is time-R-closed.Consider two trajectories ω and ω such that at every time t ≥ 0, ω(t) R ω (t), i.e. π(ω(t)) = π(ω (t)).This means that π • ω = π • ω and hence by definition of B, ω ∈ B if and only if ω ∈ B.
By definition of $Q$, we know that for all $x \in E$ and for all measurable sets $B' \subset \Omega'$, $Q^{\pi(x)}(B') = P^x(B)$ where [...] We have already shown that the map $\pi$ is surjective, continuous and open and that $\mathrm{obs} = \mathrm{obs}' \circ \pi$, which shows that $\pi$ is an FD-homomorphism.
However, we still have to check that this family (Q y ) y∈E/R corresponds to an FD-process.That family of probabilities (Q x ) x∈E/R on Ω yields a Markov kernel on E/R: which in turns yields an FD-semigroup: In order to see that ( P t ) t≥0 forms an FD-semigroup, it is convenient to note that for any z such that x = π(z) This concludes the proof.
Example 56.Here is an example with one atomic proposition.We consider the standard Brownian motion and three of its variations.
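(1) First define $M_{\mathrm{standard}}$ to be the standard Brownian motion on the real line with $\mathrm{obs}_{\mathrm{standard}}(x) = 1$ if and only if $x \in \mathbb{Z}$.
(2) We write $M_{\mathrm{refl},1}$ for the reflected Brownian motion on $[0, 1]$ with $\mathrm{obs}_{\mathrm{refl},1}(x) = 1$ if and only if $x = 0$ or $1$.
(3) We write $M_{\mathrm{refl},1/2}$ for the reflected Brownian motion on $[0, 1/2]$ with $\mathrm{obs}_{\mathrm{refl},1/2}(x) = 1$ if and only if $x = 0$.
(4) Finally, we write $M_{\mathrm{circ}}$ for the standard Brownian motion on the circle of radius $\frac{1}{2\pi}$ and of perimeter 1. We will identify points on the circle with the angle with respect to the vertical. The atomic proposition is given by $\mathrm{obs}_{\mathrm{circ}}(x) = 1$ if and only if $x = 0$.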
Let us detail how the Markov kernel is obtained for $M_{\mathrm{refl},1}$. Write $(P_t)_{t \geq 0}$ for the Markov kernel for the standard Brownian motion on $\mathbb{R}$. We define the Markov kernel $(Q_t)_{t \geq 0}$ for $M_{\mathrm{refl},1}$ for every $x \in [0, 1]$ and $C$ measurable subset of $[0, 1]$: [...] is the symmetry of the original set $C$ around 1/2. The intuition for that variation is that we "fold" the real line at each integer. The atomic proposition therefore holds at both extremities since they correspond to the integers of the real line once folded.
We will only provide intuition for the remaining two variations. First, the reflected variation on $[0, 1/2]$ is obtained from the standard Brownian motion by folding the real line at each integer and at each half-integer (i.e. $k + 1/2$ where $k \in \mathbb{Z}$). The atomic proposition therefore only holds at 0. Indeed all integers of the original real line get folded into 0.
Second, $M_{\mathrm{circ}}$ is obtained from the standard Brownian motion by "wrapping" the real line around a circle of perimeter 1. All the integers of the initial real line are thus mapped to a single point on the circle (which we decide to be 0).
We can define some natural mappings between these processes:
[Diagram: morphisms from $M_{\mathrm{standard}}$ to $M_{\mathrm{refl},1}$ and to $M_{\mathrm{circ}}$, and from $M_{\mathrm{refl},1}$ and $M_{\mathrm{circ}}$ to the reflected Brownian motion on $[0, 1/2]$]
where the upper two morphisms correspond to the intuition provided before: [...] Now the remaining two morphisms intuitively make sense. How do we obtain the reflected Brownian motion on $[0, 1/2]$ from a reflected Brownian motion on $[0, 1]$? We "fold" the interval $[0, 1]$ at its middle point 1/2: [...] In order to obtain the reflected variation on $[0, 1/2]$ from the circle, intuitively we "flatten" the circle by identifying the two halves that meet at 0 (the point with the atomic proposition): [...] All these morphisms correspond to the construction of one variation of Brownian motion from another one. It is therefore easy to deduce that all these morphisms are FD-homomorphisms.
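The folding and wrapping intuitions can be made concrete on the state spaces. The sketch below is an assumption about what the four maps look like, reconstructed from the verbal descriptions only: the function names are hypothetical, and points of the circle are represented by their position in $[0, 1)$ along the perimeter rather than by an angle. It only checks the behaviour of the maps on states and on observations, not the condition on probabilities of sets of trajectories.

```python
import math
from fractions import Fraction


def wrap_to_circle(x):
    """R -> circle of perimeter 1: wrap the line, position in [0, 1)."""
    return x - math.floor(x)


def fold_to_unit(x):
    """R -> [0, 1]: fold the real line at every integer."""
    r = x - 2 * math.floor(x / 2)      # position within a period of length 2
    return r if r <= 1 else 2 - r


def fold_unit_to_half(x):
    """[0, 1] -> [0, 1/2]: fold the interval at its middle point."""
    return min(x, 1 - x)


def flatten_circle(p):
    """Circle -> [0, 1/2]: identify the two halves that meet at 0."""
    return min(p, 1 - p)


def obs_standard(x):
    return int(x == math.floor(x))     # 1 exactly on the integers


def obs_refl_unit(x):
    return int(x in (0, 1))


def obs_refl_half(x):
    return int(x == 0)


def obs_circ(p):
    return int(p == 0)
```

With exact rational inputs, both paths from the real line to $[0, 1/2]$ agree (each computes the distance to the nearest integer), and every map preserves the atomic proposition, which is the first requirement for an FD-homomorphism.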

Definition of cospans
It is time to retrieve the relational aspect of bisimulation in this functorial framework.For discrete time, the initial definition used spans of zigzags [11].This is equivalent to considering cospans of zigzags in analytic spaces, but this is not necessarily the case for non-analytic spaces [27].
In order to show that FD-cospans behave like an equivalence, we need to show that they are, in a suitable sense, transitive. Namely, if we have two cospans [...] ← S″, can we construct an FD-cospan relating S and S″? The following theorem shows that we can.
Theorem 58.The category with FD-processes as objects and FD-homomorphisms as morphisms has pushouts.
The proof is already quite long and the proofs of some sublemmas can be found in Appendix B.
There are two inclusions i 1 : Define the equivalence relation ∼ on E 1 E 3 as the smallest equivalence such that for all x 1 ∈ E 1 and We equip this set with the largest topology that makes π ∼ continuous (where the topology on E 1 E 3 is the topology inherited from the inclusions).This means that a is open.This corresponds to the pushout in Top.It is worth noting that this topological space (E 1 E 3 )/ ∼ is bijective to the space E 2 / ≈ where ≈ is defined on E 2 as the smallest equivalence such that if g(z) = g(z ) or h(z) = h(z ), then z ≈ z (see Lemma 68).We will write π It is a quotient map and thus surjective.
Lemma 69 shows that the space E 4 equipped with its topology and Borel-algebra is locally compact and Hausdorff.
Note that both maps $\varphi_1$ and $\varphi_3$ are surjective, continuous and open (see Lemma 70).
We define $\mathrm{obs}_4$ as follows: $\mathrm{obs}_4(x_4) = \mathrm{obs}_2(x_2)$ where $x_4 = \pi_{\approx}(x_2)$. This is well-defined since if $z \approx z'$, then $\mathrm{obs}_2(z) = \mathrm{obs}_2(z')$. Indeed, that would mean that there is a sequence $z = z_0, z_1, ..., z_n = z'$ where $z_i$ and $z_{i+1}$ have the same image by either $g$ or $h$. Since $g$ and $h$ are FD-homomorphisms, that means that $\mathrm{obs}_2(z_i) = \mathrm{obs}_2(z_{i+1})$. It is equivalent to defining $\mathrm{obs}_4$ as: [...]
• or $x = i_2(x_2)$ where $x_2 \in E_2$. But then, we know that there exists [...]
There are two remaining conditions to check for $f$ to be indeed an FD-homomorphism: that $\mathrm{obs}_1 \circ f = \mathrm{obs}$ and for [...] These two conditions directly follow from the fact that $\pi$ itself satisfies these conditions.
We thus have the following correspondence between bisimulation, groups of symmetries and FD-cospans:
Theorem 60. If there exists a group of symmetries $H$ such that two states are related by that group of symmetries, then there exists an FD-cospan $(f, g)$ such that $f(x) = g(y)$.
If there exists an FD-cospan $(f, g)$ such that $f(x) = g(y)$, then the two states $x, y$ are bisimilar.

Game Interpretation
The following games are adaptations from [16,9] to our setting of continuous-time processes and were published in [7].We will omit the proofs in this section but they can be found in [8].However, it is especially interesting to note that the game interpretation of bisimulation emphasizes once again the role of trajectories in that concept whereas the game interpretation of temporal equivalence resembles that in discrete time very closely.

Game interpretation of bisimulation
Definition 61. Two trajectories $\omega$ and $\omega'$ are time-bisimilar if for all times $t \geq 0$, $\omega(t)$ and $\omega'(t)$ are bisimilar.
Lemma 62. Two states $x$ and $y$ are bisimilar if and only if the trajectories $\omega_x$ and $\omega_y$ are time-bisimilar, where for a given state $z$, $\omega_z$ is the constant trajectory defined by $\omega_z(t) = z$ for all times $t \geq 0$.
We define the following game. Duplicator's plays are pairs of trajectories that he claims are time-bisimilar. Spoiler is trying to prove him wrong.
• Given two trajectories $\omega$ and $\omega'$, Spoiler chooses $t \geq 0$ and $\emptyset \neq B \in \mathcal{G}$ such that $P^{\omega(t)}(B) \neq P^{\omega'(t)}(B)$.
• Duplicator answers by choosing $\omega_0 \in B$ and $\omega_1 \notin B$ such that $\mathrm{obs} \circ \omega_0 = \mathrm{obs} \circ \omega_1$, and the game continues from $(\omega_0, \omega_1)$.
A player who cannot make a move at any point loses. Duplicator wins if the game goes on forever. The only way for Spoiler to win is to choose a time-obs-closed set.
Theorem 63. Two trajectories $\omega$ and $\omega'$ are time-bisimilar if and only if Duplicator has a winning strategy from $(\omega, \omega')$.
Corollary 64. Two states $x$ and $y$ are bisimilar if and only if Duplicator has a winning strategy from $(\omega_x, \omega_y)$.
While the proof is left as an exercise in [24], let us explicitly write it down:
Proof. For every $x$ in $E$, write $V_x$ for the functional $V_x(f) = Vf(x)$. This functional is bounded and linear, which enables us to use Theorem 66: there exists a signed measure $\mu_x$ on $E$ of finite total variation such that [...] We claim that $V : (x, B) \mapsto \mu_x(B)$ is a sub-Markov kernel, i.e.
• For all $x$ in $E$, $V(x, -)$ is a subprobability measure on $(E, \mathcal{E})$.
• For all $B$ in $\mathcal{E}$, $V(-, B)$ is $\mathcal{E}$-measurable.
The first condition directly follows from the definition of $V$: $V(x, -) = \mu_x$, which is a measure, and furthermore $V(x, E) = \mu_x(E) = V_x(1)$ (where 1 is the constant function with value 1 on the whole space $E$). Using the hypothesis that $V$ is Markov, we get that $V1 \leq 1$. We have to be more careful in order to prove that $V(x, B) \geq 0$ for every measurable set $B$: this is a consequence of the regularity of the measure $\mu_x$ (see Proposition 11 of section 21.4 of [25]). This shows that $\mu_x$ is a subprobability measure on $(E, \mathcal{E})$.
Now, we have to prove that for every $B \in \mathcal{E}$, $V(-, B)$ is measurable. Recall that the set $E$ is σ-compact: there exist countably many compact sets $K_k$ such that $E = \bigcup_{k \in \mathbb{N}} K_k$. For $n \in \mathbb{N}$, define [...] For every $n \in \mathbb{N}$, there exists a sequence of functions $(f^n_j)_{j \in \mathbb{N}} \subset C_0(E)$ that converges pointwise to $1_{B_n}$, i.e. for every $x \in E$, $\mu_x(B_n) = \lim_{j \to +\infty} V f^n_j(x)$. Since the operator $V$ maps $C_0(E)$ to $b\mathcal{E}$, the maps $V f^n_j$ are measurable, which means that $V(-, B_n) : x \mapsto \mu_x(B_n)$ is measurable.

Appendix B. Details of the proof of theorem 58
Lemma 68.The two quotient topological spaces (E 1 E 3 )/ ∼ and E 2 / ≈ are bijective.
Hence by the universal property of the quotient, there exists a unique continuous map ψ such that the following diagram commutes: Second, let us construct a map ψ : (E 1 E 3 )/ ∼→ E 2 / ≈.We are also going to use the universal property of the quotient.First, we define a map where z ∈ h −1 (i 1 (x 1 )) This is well-defined since: • all the sets h −1 (i 1 (x 1 )) and g −1 (i 3 (x 3 )) are not empty (by surjectivity of g and h) • if z, z ∈ h −1 (i 1 (x 1 )), then h(z) = h(z ) = i 1 (x 1 ) and hence z ≈ z .This means that π ≈ (z) = π ≈ (z ) (and similarly for the other case) We will write f 1 = h and f 3 = g Now consider x ∼ x ∈ E 1 E 3 .We want to show that k (x) = k (x ).The fact that x ∼ x means that there exists y 0 , ..., y n such that y j ∈ E α( j) (where α( j) = 1 or 3) and i α(0) (y 0 ) = x and i α(n) (y n ) = x and a sequence z 0 , ..., z n−1 in E 2 such that for every j, f α( j) (z j ) = y j and y j+1 = f α( j+1) (z j+1 ).
Whether each downwards map is a $g$ or an $h$ depends only on which of the sets $E_1$ or $E_3$ contains $y_j$. Note that these maps should really be [...] but the notation is already heavy enough. Now for such a $j$, we have that $y_{j+1} = f_{\alpha(j+1)}(z_j) = f_{\alpha(j+1)}(z_{j+1})$ and hence $k'(i_{\alpha(j)}(y_j)) = \pi_{\approx}(z_j) = \pi_{\approx}(z_{j+1})$ (note that for $j = 0$ and $n$, only one of those equalities makes sense). This proves that $k'(x) = k'(i_{\alpha(0)}(y_0)) = k'(i_{\alpha(n)}(y_n)) = k'(x')$. Hence by the universal property of the quotient, there exists a unique continuous map $\psi'$ such that the following diagram commutes: [...] Finally, using the uniqueness of those maps, we get that $\psi' \circ \psi = \mathrm{id}$ and $\psi \circ \psi' = \mathrm{id}$, which proves that the two quotient spaces are bijective.
Lemma 69.The space E 4 is locally compact and Hausdorff.
Proof.For this proof, we will write π instead of π ≈ .Let us show that ≈⊂ E 2 × E 2 is closed.Consider a sequence (x n , y n ) in ≈ that converges to (x,y).
Since the quotient map is open and the equivalence ≈ is closed, the space E 4 is Hausdorff.
Since the map $\pi$ is open and the space $E_2$ is locally compact, the space $E_4$ is also locally compact.
Lemma 70. The maps $\varphi_1$ and $\varphi_3$ are surjective, continuous and open.
Proof. We will show the results for $\varphi_1 = i_1 \circ h : E_1 \to E_4$. First note that as the composition of two continuous maps, it is also a continuous map.
Second, consider $x_4 \in E_4$. Recall that $\pi_{\approx}$ is defined as [...] We know that it is a quotient map and thus surjective, i.e. there exists $x_2$ such that $\pi_{\approx}(x_2) = x_4$, and thus $\varphi_1(h(x_2)) = x_4$. Hence $\varphi_1$ is surjective.
Lemma 71. The map $\mathrm{obs}_4$ is measurable.
Proof. For this proof, we will write $\pi$ instead of $\pi_\approx$.

It is enough to show that $\mathrm{obs}_4^{-1}(A)$ is measurable for $A$ a singleton in $2^{AP}$. Now note that $\mathrm{obs}_4^{-1}(A) = \pi(\mathrm{obs}_2^{-1}(A))$. Furthermore, $\mathrm{obs}_2^{-1}(A) = \bigcap_{i \in AP} B_i$ where $B_i$ is either open or closed and corresponds to whether there is a 1 or a 0 at the $i$-th position in the singleton $A$. Note that since $\mathrm{obs}_2$ is stable under $\approx$, we have that for an arbitrary set $C$, $\pi^{-1}\pi(\mathrm{obs}_2^{-1}(C)) = \mathrm{obs}_2^{-1}(C)$ and in particular $\pi^{-1}\pi(B_i) = B_i$.

Definition 20. Given an equivalence $R$ on $E$ extended to $E_\partial$ by setting $\partial \mathrel{R} \partial$, a set $B$ of trajectories is time-$R$-closed if for all trajectories $\omega$ and $\omega'$ such that $\omega(t) \mathrel{R} \omega'(t)$ for every time $t \ge 0$, $\omega \in B$ if and only if $\omega' \in B$.

Definition 21. A set $B$ of trajectories is called time-obs-closed if the following three conditions are satisfied:

Lemma 35. Consider a time-$R_H$-closed set $B$. Then for every $f \in H$, $f_*(B) \subseteq B$.

Lemma 36. Given a group of symmetries $H$, if a set $B$ is time-$R_H$-closed, then for every $h \in H$, $h_*(B) = B$.

Proof. First, using Lemma 35, $h_*(B) \subseteq B$.
y} is a temporal equivalence.

Proof. Consider $(x, s) \mathrel{R'} (y, s)$, $t \ge 0$ and a measurable and $R'$-closed set $C$. By definition of $P_t$, $P_t((x, s), C) = \tau_{\lfloor t+s \rfloor}(x, C')$ where $C' = \{z \mid (z, s') \in C\}$ with $s' = t + s - \lfloor t + s \rfloor$ (and similarly for $y$). The set $C'$ is $R$-closed. Indeed, consider two states $z \in C'$ and $z' \in S$ such that $z \mathrel{R} z'$. These conditions imply that $(z, s') \in C$ and $(z, s') \mathrel{R'} (z', s')$. Since the set $C$ is $R'$-closed, $(z', s') \in C$ and hence, by definition of the set $C'$, $z' \in C'$.

5.1 Basic examples

5.1.1 Deterministic Drift

Consider a deterministic drift on the real line $\mathbb{R}$ with constant speed $v \in \mathbb{R}_{>0}$ and a single atomic proposition. We study two cases: the first one with 0 as the only distinguished point and the second one with all the integers distinguished from the other points. In both cases, trace equivalence and the greatest bisimulation coincide.

atomic proposition distinguishes | trace equivalence / bisimulation: $x \mathrel{R} y$ if and only if | group of symmetries
zero | $x = y$ or $x, y > 0$ | generated by $\{f_{x,y} \mid x, y > 0\}$
integers | $x - \lfloor x \rfloor = y - \lfloor y \rfloor$ | $\{t_k \mid k \in \mathbb{Z}\}$, generated by $\{t_1\}$

b) | - | $x = y$ or $x = b - y$ | $\{\mathrm{id}, s_b\}$
0 and $2b > 0$ | $(0, 2b)$ | $b$ | $x = y$ or $x = 2b - y$ | $\{\mathrm{id}, s_{2b}\}$
0 and $4b > 0$ | $(0, 4b)$ | $b$ | $x = y$ | $\{\mathrm{id}\}$

where $s_k(x) = k - x$ for every $x \in \mathbb{R}$ and $k = b$ or $2b$.
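For the deterministic drift with 0 as the only distinguished point, the stated relation ($x = y$ or $x, y > 0$) can be compared numerically with trace equivalence. The following is a minimal sketch under the assumption that the trace of a state $x$ is determined by the time at which $x + vt$ reaches 0; the function names, the speed `v = 2` and the sample points are hypothetical choices made for illustration only.

```python
from fractions import Fraction


def hitting_time(x, v):
    """Time t >= 0 at which x + v*t equals 0, or None if it never does (v > 0)."""
    return -Fraction(x) / v if x <= 0 else None


def same_trace(x, y, v):
    """Assumed notion of trace equivalence: the distinguished point is seen at the same time."""
    return hitting_time(x, v) == hitting_time(y, v)


def table_relation(x, y):
    """The relation stated for the case where zero is distinguished."""
    return x == y or (x > 0 and y > 0)


v = 2  # hypothetical drift speed
points = [Fraction(n, 2) for n in range(-6, 7)]
agree = all(same_trace(x, y, v) == table_relation(x, y)
            for x in points for y in points)
print(agree)
```

Exact rationals are used instead of floats so that the equality tests on hitting times are not affected by rounding.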
Let us show that $B = B \cap \Omega$.
• Consider $\omega \in B$, i.e. $f \circ \omega \in B \cap \Omega$. By definition of the set $B$, $\omega \in \Omega$. Furthermore, $f \circ \omega \in B$. By definition of $R$, we have that for all $t \ge 0$, $\omega(t) \mathrel{R} (f \circ \omega)(t)$. Since the set $B$ is time-$R$-closed and $f \circ \omega \in B$, we have that $\omega \in B$, which proves the first inclusion.

Recall that $f$ is continuous if and only if for every open set $U$ in $E'$, $f^{-1}(U)$ is open in $E$, and that $f$ is an open map if and only if for every open set $U$ in $E$, $f(U)$ is open in $E'$.

$\pi : E \to E'$ such that given two states $x, y \in E$, $\pi(x) = \pi(y)$ if and only if $x \mathrel{R} y$ (where $R$ is the bisimulation generated by the group of symmetries $H$).

Proof. Define the set $E' = E/R$ and the quotient map $\pi : E \to E/R$. We can equip the set $E/R$ with the largest topology that makes the map $\pi$ continuous. Let us show that the map $\pi$ is open: for any open set $U$ in $E$,
$$\pi^{-1}\pi(U) = \{x \mid \exists y \in U,\ \pi(x) = \pi(y)\} = \{x \mid \exists y \in U,\ x \mathrel{R} y\} = \{x \mid \exists y \in U,\ \exists h \in H,\ x = h(y)\} = \bigcup_{h \in H} h(U).$$
Every symmetry $h \in H$ is an open map since it is a homeomorphism, hence the sets $h(U)$ are all open. This means that the set $\pi^{-1}\pi(U)$ is open and hence, by definition of the topology on $E/R$, the set $\pi(U)$ is open, i.e. the map $\pi$ is open.
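The set identity $\pi^{-1}\pi(U) = \bigcup_{h \in H} h(U)$ used in this proof can be checked exhaustively on a small finite stand-in. The sketch below is only an illustration: the set `E`, the two-element group `H = {id, s}` with $s(x) = -x$, and all function names are hypothetical, and the topology plays no role here.

```python
from itertools import combinations

# Hypothetical finite stand-in: E = {-3, ..., 3} and H = {id, s} with s(x) = -x.
E = set(range(-3, 4))
H = [lambda x: x, lambda x: -x]


def pi(x):
    """Quotient map: send x to its orbit under H (H is a group, so orbits are the classes)."""
    return frozenset(h(x) for h in H)


def saturation(U):
    """pi^{-1}(pi(U)): all points of E whose orbit meets U."""
    image = {pi(y) for y in U}
    return {x for x in E if pi(x) in image}


def union_of_images(U):
    """The union of h(U) over all h in H."""
    return {h(y) for h in H for y in U}


# Check the identity on every subset U of E.
subsets = [set(c) for r in range(len(E) + 1) for c in combinations(sorted(E), r)]
identity_holds = all(saturation(U) == union_of_images(U) for U in subsets)
print(identity_holds)
```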
trajectory $\omega$ on $E_\partial$ to be a càdlàg function $[0, \infty) \to E_\partial$ such that if either $\omega(t-) := \lim_{s < t,\, s \to t} \omega(s) = \partial$ or $\omega(t) = \partial$ then $\forall u \ge t$, $\omega(u) = \partial$. We can extend $\omega$ to a map from $[0, \infty]$ to $E_\partial$ by setting $\omega(\infty) = \partial$. The canonical FD-process associated to the FD-semigroup $(P_t)_{t \ge 0}$ is

Remark 37. Given a group of symmetries $H$, a set $C$ of states is $R_H$-closed if and only if for every $h \in H$, $h(C) = C$. It is tempting to look for a similarly nice characterization of time-$R_H$-closed sets; however, in the case of trajectories, this is much more complicated. Lemma 36 showed that if a set $B$ is time-$R_H$-closed, then for every $h \in H$, $h_*(B) = B$ (which we used in the proofs of later examples), but this is no longer an equivalence.
and therefore $\omega \in h_*(B)$.

Proof of Theorem 34. Consider two equivalent states $x \mathrel{R_H} y$, i.e. there exists $h \in H$ such that $h(x) = y$. First, $\mathrm{obs}(x) = \mathrm{obs} \circ h(x) = \mathrm{obs}(y)$. Second, let us consider a measurable, time-$R_H$-closed set $B$. Using Lemma 36, we know that for every $f \in H$, $f_*(B) = B$, and hence $P^x(B) = P^{h(x)}(B) = P^y(B)$, which concludes the proof.
For every $k \in \mathbb{Z}$, the functions $s_k$ and $t_k$ are indeed homeomorphisms and $\mathrm{obs} \circ t_k = \mathrm{obs}$ for every $k \in \mathbb{Z}$ (and similarly for $s_k$).
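The invariance $\mathrm{obs} \circ t_k = \mathrm{obs}$ and $\mathrm{obs} \circ s_k = \mathrm{obs}$ can be spot-checked on rational sample points. This is a minimal sketch assuming $t_k(x) = x + k$, $s_k(x) = k - x$ and $\mathrm{obs}(x) = 1$ exactly when $x$ is an integer; the sample grid and the range of $k$ are arbitrary choices. It checks invertibility and invariance on samples only, not continuity.

```python
from fractions import Fraction


def obs(x):
    """1 if x is an integer, 0 otherwise (single atomic proposition)."""
    return 1 if Fraction(x).denominator == 1 else 0


def t(k):
    """Translation by the integer k."""
    return lambda x: x + k


def s(k):
    """Reflection x -> k - x."""
    return lambda x: k - x


samples = [Fraction(n, 4) for n in range(-12, 13)]
ks = range(-3, 4)

# obs is unchanged by every t_k and s_k on the sample points.
invariant = all(obs(f(x)) == obs(x)
                for k in ks for f in (t(k), s(k)) for x in samples)

# t_k is inverted by t_{-k}, and s_k is its own inverse.
invertible = all(t(-k)(t(k)(x)) == x and s(k)(s(k)(x)) == x
                 for k in ks for x in samples)
print(invariant, invertible)
```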
integers | $x - \lfloor x \rfloor = y - \lfloor y \rfloor$ or $\lceil y \rceil - y$ | $\{\mathrm{id}, t_k, s_k \mid k \in \mathbb{Z}\}$, generated by $\{s, t_1\}$
interval $[-1, 1]$ | $|x| = |y|$ | $\{s, \mathrm{id}\}$

where for every $x \in \mathbb{R}$, $s(x) = -x$, $s_k(x) = k - x$ and $t_k(x) = x + k$ for every $k \in \mathbb{Z}$.

With all integers distinguished: Let us consider the case when there is a single atomic proposition and $\mathrm{obs}(x) = 1$ if and only if $x \in \mathbb{Z}$.

Proposition 47. The set $\{\mathrm{id}, t_k, s_k \mid k \in \mathbb{Z}\}$ is a group of symmetries.

Proof. Consider a state $x$ and a measurable set $B$ such that $(s_k)_*(B) = B$ and $(t_k)_*(B) = B$ for every $k \in \mathbb{Z}$. Brownian motion is invariant under translation which means that

where $W_t$ is the standard Brownian motion and $v > 0$).