Displacement convexity for the entropy in semidiscrete nonlinear Fokker-Planck equations

The displacement $\lambda$-convexity of a nonstandard entropy with respect to a nonlocal transportation metric in finite state spaces is shown using a gradient flow approach. The constant $\lambda$ is computed explicitly in terms of a priori estimates of the solution to a finite-difference approximation of a nonlinear Fokker-Planck equation. The key idea is to employ a new mean function, which defines the Onsager operator in the gradient flow formulation.


Introduction
Displacement convexity, which was introduced by McCann [17], describes the geodesic convexity of functionals on the space of probability measures endowed with a transportation metric. Geodesic convexity has important consequences for the existence and uniqueness of gradient flows in the space of probability measures [1,5,19]. It may also provide quantitative contraction estimates between solutions of the gradient flows [4] and exponential decay estimates [1]. Displacement $\lambda$-convexity of the entropy is equivalent to a lower bound on the Ricci curvature $\mathrm{Ric}_M$ of the Riemannian manifold $M$, i.e. $\mathrm{Ric}_M \ge \lambda$ [14,20]. Furthermore, it leads to inequalities in convex geometry and probability theory, such as the Brunn-Minkowski, Talagrand, and log-Sobolev inequalities [22].
We are interested in the extent to which the concept of displacement convexity can be extended to discrete settings, such as numerical discretization schemes of gradient flows. As one step in this direction, we show in this paper that a certain entropy functional, related to the finite-difference approximation of nonlinear Fokker-Planck equations, is displacement convex. Before making this statement more precise, let us review the state of the art.
The study of discrete gradient flows and related topics is very recent. First results were concerned with Ricci curvature bounds in discrete settings [2]. Markov processes and Fokker-Planck equations on finite graphs were investigated by Chow et al. in [6]. Maas [15] and Mielke [18] introduced nonlocal transportation distances on probability spaces such that continuous-time Markov chains can be formulated as gradient flows of the entropy, and they explored geodesic convexity properties of the functionals. The concept of displacement convexity was used by Gozlan et al. [10] to derive HWI and log-Sobolev inequalities on (complete) graphs. Talagrand's inequality was studied in discrete spaces by Sammer and Tetali [21].
Only a few results on convexity properties of functionals for discretizations of partial differential equations can be found in the literature. Exponential decay rates for time-continuous Markov chains were derived by Caputo et al. [3]. This result implies the displacement convexity of the Shannon entropy for discretizations of one-dimensional linear Fokker-Planck equations, as first investigated by Mielke [18] (also see the presentation in [12, Section 5.2]). While the proof of Caputo et al. [3] is based on the Bochner-Bakry-Emery method, Mielke [18] employed a gradient flow approach together with matrix estimates. The nonlocal transportation metric, needed for the definition of displacement convexity, is induced by the logarithmic mean, which has some remarkable properties (proved in [18] and summarized in Lemma 5 below). The approach of [3] (and [9]) was extended to general convex entropy densities $f(s)$ in [13] using the mean function

$$\Lambda^f(s,t) = \frac{s-t}{f'(s)-f'(t)}, \quad s \neq t, \tag{1}$$

which becomes the logarithmic mean for $f(s) = s(\log s - 1)$.
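The relation between a general entropy density and the logarithmic mean can be checked numerically. The following is a minimal sketch, assuming the quotient form $\Lambda^f(s,t) = (s-t)/(f'(s)-f'(t))$ with the diagonal value $1/f''(s)$; the function names `log_mean` and `mean_from_entropy` are hypothetical and chosen for illustration only.

```python
import math

def log_mean(s, t):
    """Logarithmic mean of s, t > 0; on the diagonal it equals s."""
    if math.isclose(s, t, rel_tol=1e-12):
        return s
    return (s - t) / (math.log(s) - math.log(t))

def mean_from_entropy(f_prime, f_second, s, t):
    """Assumed form Lambda^f(s, t) = (s - t) / (f'(s) - f'(t)),
    extended by 1 / f''(s) on the diagonal s = t."""
    if math.isclose(s, t, rel_tol=1e-12):
        return 1.0 / f_second(s)
    return (s - t) / (f_prime(s) - f_prime(t))

# Entropy density f(s) = s (log s - 1): f'(s) = log s, f''(s) = 1/s.
def shannon_prime(s):
    return math.log(s)

def shannon_second(s):
    return 1.0 / s

s, t = 2.0, 0.5
lam = mean_from_entropy(shannon_prime, shannon_second, s, t)
lam_diag = mean_from_entropy(shannon_prime, shannon_second, s, s)
```

For this choice of $f$, `lam` coincides with `log_mean(s, t)`, and the value lies between the geometric and the arithmetic mean of $s$ and $t$, as expected for the logarithmic mean.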
Concerning nonlinear equations, we are aware of only two results. Erbar and Maas [8] showed that a discrete one-dimensional porous-medium equation is a gradient flow of the Rényi entropy function $f(s) = s^\alpha$ with respect to a suitable nonlocal transportation metric induced by an associated mean function. However, the Rényi entropy fails to be convex along geodesics with respect to this transportation metric [8]. A weaker notion than geodesic convexity (called convex entropy decay), which is strongly related to the Bakry-Emery method, was introduced by Maas and Matthes [16] to prove exponential decay rates for finite-volume discretizations of the quantum drift-diffusion equation. Its gradient flow formulation is based on the Fisher information and the logarithmic mean.
In this paper, we propose a new mean function by composing the logarithmic mean with a nonlinear function (coming from the diffusivity), which is suitable for finite-difference discretizations of the nonlinear Fokker-Planck equation (2), supplemented with no-flux boundary conditions and an initial condition. Here, $\phi : [0,\infty) \to [0,\infty)$ is a continuous function and $V(x)$ is a confinement potential. An example is $\phi(\rho) = \rho^\alpha$ with $\alpha > 0$ and $V(x) = \gamma|x|^2/2$ with $\gamma \ge 0$. A computation shows that the entropy $F_c$ is nonincreasing along (smooth) solutions to (2). The displacement convexity for equations related to (2) was analyzed in [5]. Our aim is to show that a discrete version of the entropy $F_c$ is displacement convex along semidiscrete solutions associated to (2).
For the discretization of (2), let $n \in \mathbb{N}$, $h = 1/n > 0$, and $x_i = ih$, $i = 0,\ldots,n$. Let $\rho_i(t)$ approximate the solution $\rho(x_i,t)$ and $w_i$ approximate the function $w(x_i) = e^{-V(x_i)}$. After rewriting (2) in a suitable form, a corresponding finite-difference scheme reads as (3), where $h > 0$ is the grid size and $\kappa_i\Lambda_i$ is an approximation of $\phi(\rho)$ in $[x_i, x_{i+1}]$. Our idea is to employ the modified logarithmic mean (4) and to set, as in [18], $\kappa_i = \sqrt{w_i w_{i+1}}$. Since $\Lambda_i$ approximates $u_i = \phi(\rho_i)/w_i$, it follows that $\kappa_i\Lambda_i$ approximates $\sqrt{w_{i+1}/w_i}\,\phi(\rho_i)$. Observe that with this choice, the numerical scheme reduces to a scheme that approximates (2) written in the form $\partial_t\rho = \partial_x(w\,\partial_x(\phi(\rho)/w))$.
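The reduction of the scheme rests on the defining property of the logarithmic mean. The following worked identity is a sketch under two assumptions that are not confirmed by the text above: that $\Lambda_i$ is the logarithmic mean of $u_i$ and $u_{i+1}$, and that the discrete fluxes are written with differences of $\log u_i$.

```latex
% Assumption: \Lambda_i is the logarithmic mean of u_i and u_{i+1}.
\Lambda_i = \frac{u_{i+1} - u_i}{\log u_{i+1} - \log u_i}
\quad\Longrightarrow\quad
\kappa_i \Lambda_i \bigl(\log u_{i+1} - \log u_i\bigr) = \kappa_i \bigl(u_{i+1} - u_i\bigr).
% Continuous counterpart, with u = \phi(\rho)/w:
\phi(\rho)\, \partial_x \log\frac{\phi(\rho)}{w}
  = w\, u\, \partial_x \log u
  = w\, \partial_x \frac{\phi(\rho)}{w}.
```

Under these assumptions, a flux of the form $\kappa_i\Lambda_i(\log u_{i+1} - \log u_i)$ becomes linear in $u$, which is consistent with the form $\partial_t\rho = \partial_x(w\,\partial_x(\phi(\rho)/w))$.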
The main result of the paper is as follows. If $\phi$ is invertible and $\phi' \circ \phi^{-1}$ is nonincreasing (an example is $\phi(s) = s^\alpha$ with $0 < \alpha < 1$), then the discrete entropy (5) is displacement $\lambda_h$-convex with respect to the nonlocal transportation metric induced by (4), where the constant $\lambda_h$ is specified in Theorem 3. If the minimum of $\phi'(\rho_i)$ is positive and the maximum of $|\nabla_h \phi'(\rho_i)|$ is sufficiently small, then $\lambda_h$ is positive. Such bounds in terms of the initial data can be shown at least for the case $V = 0$; see Corollary 1. We expect that exponential convergence to the steady state holds for sufficiently small $h > 0$ (and $V = 0$), but we are unable to prove it. Our result is consistent with the one in [18]: If $\phi(s) = s$ is linear (and $V = 0$), $\lambda_h \to \gamma$ as $h \to 0$, and the constant is asymptotically sharp.
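The power-law example can be verified by a short computation. The following sketch uses only $\phi(s) = s^\alpha$ on $(0,\infty)$ and shows that the monotonicity condition on $\phi' \circ \phi^{-1}$ holds exactly when $\alpha \le 1$.

```latex
\phi(s) = s^{\alpha}, \qquad
\phi^{-1}(r) = r^{1/\alpha}, \qquad
\phi'(s) = \alpha s^{\alpha - 1},
\\[1ex]
(\phi' \circ \phi^{-1})(r) = \alpha\, r^{(\alpha-1)/\alpha}, \qquad
\frac{d}{dr}\,(\phi' \circ \phi^{-1})(r) = (\alpha - 1)\, r^{-1/\alpha} \le 0
\;\Longleftrightarrow\; \alpha \le 1 .
```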
The paper is organized as follows. In Section 2, we introduce the mathematical setting and give the definition of displacement $\lambda$-convexity. We show that displacement $\lambda$-convexity follows if a certain matrix is positive semidefinite, slightly generalizing Proposition 2.1 in [18]. As a warm-up, we consider in Section 3 the semidiscrete heat equation and prove that the entropy (17) is displacement convex. This result is a reformulation of Theorem 5 in [13], but our proof is very simple. Section 4 is concerned with the proof of displacement $\lambda$-convexity of (5) and contains our main result. Some properties of mean functions are recalled in Appendix A, and a priori estimates of solutions to (3) with $V = 0$ are proved in Appendix B.

Displacement convexity
In this section, we specify our setting and give the definition of displacement convexity. Let $n \in \mathbb{N}$ and introduce the finite state space $X_n$. This space can be identified with the space of probability measures on an $(n+1)$-point set. We define the inner product $\langle\rho, \rho'\rangle$ and consider a matrix $Q = (Q_{ij}) \in \mathbb{R}^{(n+1)\times(n+1)}$ of transition rates: the value $Q_{ij}$ is the rate of a particle moving from state $j$ to $i$. We assume that there exists a unique vector $w \in X_n$ such that the detailed balance condition $Q_{ij} w_j = Q_{ji} w_i$ for all $i, j$ is satisfied. Summing this condition for fixed $i$ over $j = 0,\ldots,n$, we see that $Qw = 0$. Note that in Markov chain theory, the detailed balance condition is usually formulated for the transposed matrix $Q^\top$.
Our aim is to show convexity properties of the entropy along solutions $t \mapsto \rho(t)$ to ODE systems of the type

$$\partial_t \rho = Q\phi(\rho), \tag{6}$$

where $\phi$ is some smooth function. This equation can be formulated as a gradient flow. Indeed, given a (smooth) function $f : [0,\infty) \to \mathbb{R}$, we define the entropy $F$ in (7) and the Onsager operator $K : X_n \to \mathbb{R}^{(n+1)\times(n+1)}$ in (8), where $e_i = (\delta_{i0},\ldots,\delta_{in})^\top \in \mathbb{R}^{n+1}$ is the $i$th unit vector and "$\otimes$" is the tensor product. By detailed balance and $Q_{ij} w_j \ge 0$ for $i \neq j$, it follows that $K(\rho)$ is symmetric and positive semidefinite. With these definitions, we can formulate (6) as a gradient system in the sense that it can be rewritten in the form (9).

The space $X_n$ is endowed with the nonlocal transportation distance $W$ defined in (10). It is well known that the function $W$ is a pseudo-metric on $X_n$ [15, Theorem 1.1] and the pair $(X_n, W)$ defines a geodesic space [8, Prop. 2.3], i.e., for all $\rho_0, \rho_1 \in X_n$, there exists at least one curve $\rho : [0,1] \to X_n$, $t \mapsto \rho(t)$, such that $\rho(0) = \rho_0$, $\rho(1) = \rho_1$, and $W(\rho(s), \rho(t)) = |s-t|\,W(\rho_0, \rho_1)$ for all $s, t \in [0,1]$. Such a curve is called a constant speed geodesic between $\rho_0$ and $\rho_1$. If the pair $(\rho, \psi) \in E(\rho_0, \rho_1)$ attains the infimum in (10), then it satisfies the geodesic equations (11) [8, Prop. 2.5].

Definition 1 (Displacement convexity). Let $\lambda \in \mathbb{R}$. A functional $E$ on $X_n$ is called displacement $\lambda$-convex with respect to $W$ if it is $\lambda$-convex along the constant speed geodesics of $(X_n, W)$.

We show that displacement $\lambda$-convexity of $F$ is guaranteed if a certain matrix is positive semidefinite. This result (slightly) generalizes Proposition 2.1 in [18].
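For orientation, geodesic $\lambda$-convexity is commonly expressed by the following inequality. This is the standard formulation in metric spaces and is given here as an assumption about the form of Definition 1, which may be stated differently in detail (for instance, through the second derivative of $t \mapsto E(\rho(t))$).

```latex
% Standard notion of \lambda-convexity along a constant speed geodesic
% \rho : [0,1] \to X_n joining \rho_0 and \rho_1 (assumed form of Definition 1):
E(\rho(t)) \le (1-t)\, E(\rho_0) + t\, E(\rho_1)
  - \frac{\lambda}{2}\, t (1-t)\, W(\rho_0, \rho_1)^2
\qquad \text{for all } t \in [0,1].
```

For $\lambda = 0$ this is ordinary convexity of $t \mapsto E(\rho(t))$, and for smooth $t \mapsto E(\rho(t))$ it corresponds to the lower bound $\frac{d^2}{dt^2} E(\rho(t)) \ge \lambda\, W(\rho_0,\rho_1)^2$.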

Semidiscrete heat equation
As a warm-up, we consider the semidiscrete heat equation

$$\partial_t \rho_i = h^{-2}(\rho_{i+1} - 2\rho_i + \rho_{i-1}), \quad i = 0,\ldots,n,\ t > 0, \tag{16}$$

where $n \in \mathbb{N}$ and $h = 1/n > 0$. The no-flux boundary conditions are realized by setting $\rho_{-1} = \rho_0$ and $\rho_{n+1} = \rho_n$. We write $\rho = (\rho_0,\ldots,\rho_n)$. Equation (16) can be written as (6) by setting $\phi(s) = s$ and taking $Q$ to be the matrix of the discrete Laplacian on the right-hand side of (16). By slightly abusing the notation, we set $w_i = 1$ for $i = 0,\ldots,n$ and note that for a function $f : [0,\infty) \to \mathbb{R}$, the corresponding entropy given in (7) reduces to (17). Then, for the respective Onsager operator given in (8) with the mean function $\Lambda^f$, we claim that the entropy $F$ is displacement convex, under suitable conditions on $f$.

Theorem 2. Let $f$ be such that $\Lambda^f$, defined in (1), is concave in both variables. Then the entropy (17) is displacement convex with respect to the metric (10) induced by $\Lambda^f$.

Concavity of $\Lambda^f$ for specific choices of $f$ is addressed in Lemma 7, thus fulfilling the assumption of the theorem.

Proof. We formulate (16) as a gradient system, using a difference matrix $G$ and the diagonal matrix $L(\rho) = \mathrm{diag}(\Lambda^f(\rho_i, \rho_{i+1}))$. Setting $K(\rho) = G^\top L(\rho) G$, we can write (16) in the form (9), where we identify $DF(\rho)$ with $f'(\rho)$. Thus, by Proposition 1, it is sufficient to show that the matrix $M(\rho)$, defined in (13), is positive semidefinite. In fact, because of the special structure of $K(\rho)$, we can simplify this condition: it is sufficient to show that a reduced matrix $M$ is positive semidefinite. We show this claim by verifying that $M$ is diagonally dominant. To this end, we observe that $M$ is a symmetric tridiagonal matrix whose entries are given by coefficients $a_i$ and $b_i$. The matrix $M$ is diagonally dominant if the conditions (18) hold. The first two conditions in (18) follow from (19). Thus, it remains to prove (19), i.e. the nonnegativity of $a_i + b_{i-1} + b_i$. Since $\Lambda^f$ is assumed to be concave, we may apply Lemma 6, which shows that this expression is nonnegative, and hence, $M$ is positive semidefinite.
For nonlinear functions $\phi$ and nonconstant steady states $(w_i)$, the proof of nonnegativity of $a_i + b_{i-1} + b_i$ is, unfortunately, not as simple as above, and we need more properties of the mean function. It turns out that the logarithmic mean satisfies these properties. Such a situation is considered in the next section.
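The gradient structure of the semidiscrete heat equation in this section can be illustrated numerically for $f(s) = s(\log s - 1)$. The sketch below assumes that $G$ is the forward-difference matrix scaled by $1/h$ and that the gradient system has the form $\partial_t\rho = -K(\rho)DF(\rho)$ with $DF(\rho) = \log\rho$; the normalization of $G$ and all function names are assumptions made for illustration.

```python
import numpy as np

def log_mean(s, t):
    """Elementwise logarithmic mean; equals s where s and t coincide."""
    s, t = np.asarray(s, float), np.asarray(t, float)
    same = np.isclose(s, t, rtol=1e-12, atol=0.0)
    denom = np.where(same, 1.0, np.log(s) - np.log(t))
    return np.where(same, s, (s - t) / denom)

def difference_matrix(n, h):
    """n x (n+1) forward-difference matrix, scaled by 1/h (assumed)."""
    G = np.zeros((n, n + 1))
    idx = np.arange(n)
    G[idx, idx] = -1.0 / h
    G[idx, idx + 1] = 1.0 / h
    return G

def onsager(rho, G):
    """K(rho) = G^T diag(Lambda(rho_i, rho_{i+1})) G."""
    L = np.diag(log_mean(rho[:-1], rho[1:]))
    return G.T @ L @ G

def heat_rhs(rho, h):
    """Discrete Laplacian with ghost values rho_{-1}=rho_0, rho_{n+1}=rho_n."""
    padded = np.concatenate(([rho[0]], rho, [rho[-1]]))
    return (padded[2:] - 2.0 * padded[1:-1] + padded[:-2]) / h**2

n = 8
h = 1.0 / n
rng = np.random.default_rng(0)
rho = rng.uniform(0.5, 1.5, n + 1)
rho /= rho.sum()                       # probability vector on n+1 states

G = difference_matrix(n, h)
K = onsager(rho, G)
gradient_flow_rhs = -K @ np.log(rho)   # -K(rho) DF(rho), DF(rho) = log(rho)
eigenvalues = np.linalg.eigvalsh(K)
```

In this sketch, `gradient_flow_rhs` agrees with `heat_rhs(rho, h)`, the matrix `K` is symmetric with nonnegative eigenvalues, and the right-hand side sums to zero, which reflects conservation of mass.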

Semidiscrete nonlinear Fokker-Planck equations
We discretize the nonlinear Fokker-Planck equation $\partial_t\rho = \partial_x(w\,\partial_x(\phi(\rho)/w))$, where $w(x) = e^{-V(x)}$. We choose the quadratic potential $V(x) = \gamma|x|^2/2$ with $\gamma > 0$, but other choices are possible. Let $n \in \mathbb{N}$, $h = 1/n > 0$, and $x_i = ih$. Approximating $\rho(x_i,t)$ by $\rho_i(t)$ and $w(x_i)$ by $w_i$, and setting $u_i = \phi(\rho_i)/w_i$, the numerical scheme reads as (20), where $\kappa_i = \sqrt{w_i w_{i+1}}$ approximates $w(x_{i+1/2})$. The no-flux boundary conditions are realized by $u_{-1} = u_0$ and $u_{n+1} = u_n$. Setting $Q = G^\top \mathrm{diag}(\kappa_i)\, G\, \mathrm{diag}(w_i^{-1})$ and, slightly abusing the notation, $\rho = (\rho_0,\ldots,\rho_n)$, we see that the scheme can be formulated as $\partial_t\rho = Q\phi(\rho)$, and thus, the framework of Section 2 applies. Hence, (20) can be written as a gradient system whose Onsager operator is built from the logarithmic mean $\Lambda$. This system can be written as in (9) by choosing $f(s) = s(\log s - 1)$, and therefore, the entropy $F$ follows from (7). Thus, $DF(\rho) = \log u$ and, for the nonlocal transportation metric $W$ defined in (10), we have the following result.
Theorem 3. Let $\phi$ be invertible, $\phi' \circ \phi^{-1}$ be nonincreasing, and $\gamma > 0$. Then the entropy $F$ is displacement $\lambda_h$-convex with respect to $W$, where $\lambda_h$ depends on $\gamma$, $h$, $\min_{i=0,\ldots,n} \phi'(\rho_i)$, and $\max_{i=0,\ldots,n} |\nabla_h \phi'(\rho_i)|$.

From numerical analysis, we expect that $\min_{i=0,\ldots,n} \phi'(\rho_i)$ and $\max_{i=0,\ldots,n} |\nabla_h \phi'(\rho_i)|$ can be bounded independently of $h$, in terms of discrete norms of $\rho(0)$ only. In Appendix B, we provide such estimates for the case $V = 0$. These estimates show that $\lambda_h$ is positive if $\max_i |\nabla_h \rho_i(0)|$ is sufficiently small. The function $\phi(s) = s^\alpha$ satisfies the assumptions of the theorem if $0 < \alpha \le 1$. In the linear case $\phi(s) = s$, we recover essentially the result of [18].

Proof. As in Section 3, the claim is reduced to a matrix $M$ that is symmetric and tridiagonal. We show that $M - \lambda_h L(\rho)$ is diagonally dominant for a suitable value of $\lambda_h$. For this, the relevant expression (21) is split into several terms, among them $I_1$, which are estimated term by term. Property (ii) of Lemma 5 yields estimates whose first terms on the right-hand sides cancel with some terms in $I_1$. Property (iv) of Lemma 5 provides a further estimate, and the remaining term is treated using the identity $\kappa_i \alpha_{i+1} = \kappa_{i+1} \beta_i$. These computations are inserted into (21), and property (iii) of Lemma 5 is employed in the last term. The idea is then to replace $\kappa_{i\pm1}$ in $\beta_{i-1}$ and $\alpha_{i+1}$ by an expression involving only $\kappa_i$, using the definitions of $\alpha_i$ and $\beta_i$.
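The scheme of this section can be explored numerically. The following is a minimal sketch under several assumptions that are not taken from the text: the scheme is used in the reduced flux form $\partial_t\rho_i = h^{-2}[\kappa_i(u_{i+1}-u_i) - \kappa_{i-1}(u_i-u_{i-1})]$ with vanishing boundary fluxes, the nonlinearity is $\phi(s) = s^\alpha$ with $\alpha = 1/2$, time is discretized by explicit Euler steps, and the entropy is normalized as $F(\rho) = \sum_i [\alpha\rho_i(\log\rho_i - 1) - \rho_i\log w_i]$ so that $\partial F/\partial\rho_i = \log u_i$. The function names and the initial datum are hypothetical.

```python
import numpy as np

def entropy(rho, w, alpha):
    """Assumed discrete entropy with dF/drho_i = log(rho_i**alpha / w_i)."""
    return np.sum(alpha * rho * (np.log(rho) - 1.0) - rho * np.log(w))

def simulate(n=20, alpha=0.5, gamma=1.0, dt=2e-4, steps=500):
    """Explicit Euler steps for the assumed reduced scheme; returns the
    total mass and the entropy recorded before every step."""
    h = 1.0 / n
    x = h * np.arange(n + 1)
    w = np.exp(-0.5 * gamma * x**2)           # w_i = exp(-V(x_i))
    kappa = np.sqrt(w[:-1] * w[1:])           # kappa_i = sqrt(w_i w_{i+1})
    rho = 1.0 + 0.3 * np.cos(2.0 * np.pi * x) # hypothetical positive datum
    masses, entropies = [], []
    for _ in range(steps + 1):
        masses.append(rho.sum())
        entropies.append(entropy(rho, w, alpha))
        u = rho**alpha / w                    # u_i = phi(rho_i) / w_i
        flux = kappa * (u[1:] - u[:-1])       # interior fluxes
        # No-flux boundaries: the outermost fluxes are set to zero.
        rhs = (np.append(flux, 0.0) - np.insert(flux, 0, 0.0)) / h**2
        rho = rho + dt * rhs
    return np.array(masses), np.array(entropies)

masses, entropies = simulate()
```

With the small time step chosen here, the total mass stays constant and the recorded entropy values do not increase, in line with the gradient flow structure. The time step restriction is a property of the explicit Euler stepping used in this sketch, not of the semidiscrete scheme.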
If the potential vanishes, we can define $w_i = 1$ for all $i = 0,\ldots,n$. Then the entropy is displacement convex with respect to $W$. The following remark, based on an idea of [8], shows that this result may not hold for other entropies.
Remark 4. Erbar and Maas [8] considered the diffusion equation in the form $\partial_t\rho = \partial_x(\rho\,\partial_x U'(\rho))$, where $U$ satisfies $sU''(s) = \phi'(s)$. The corresponding numerical scheme is formulated with $U'(\rho) = (U'(\rho_0),\ldots,U'(\rho_n))$ and the operator $L(\rho)$, which is again defined by $L(\rho) = \mathrm{diag}(\Lambda(\rho_i, \rho_{i+1}))$, but with the mean function (24). The associated entropy is $F(\rho) = \sum_{i=0}^n U(\rho_i)$, and one considers geodesic curves $\rho$ on $X_n$ with respect to the nonlocal transportation metric $W$ induced by (24).

Adding both inequalities gives the conclusion.
For the proof of (29), we compute the relevant sum, make the change of variables $i \mapsto i - 1$ in the first sum, and rearrange the terms. The resulting estimate holds for any $j = 0,\ldots,n-1$ and $t > 0$, and taking the maximum over $j = 0,\ldots,n-1$ shows (29).