Coarse graining of a Fokker-Planck equation with excluded volume effects preserving the gradient-flow structure

The propagation of gradient flow structures from microscopic to macroscopic models is a topic of high current interest. In this paper we discuss this propagation in a model for the diffusion of particles interacting via hard-core exclusion or short-range repulsive potentials. We formulate the microscopic model as a high-dimensional gradient flow in the Wasserstein metric for an appropriate free-energy functional. Then we use the JKO approach to identify the asymptotics of the metric and the free-energy functional beyond the lowest order for single particle densities in the limit of small particle volumes by matched asymptotic expansions. While we use a propagation of chaos assumption at far distances, we consider correlations at small distance in the expansion. In this way we obtain a clear picture of the emergence of a macroscopic gradient structure incorporating corrections in the free energy functional due to the volume exclusion.


Introduction
An interesting feature of many partial differential equations (PDEs) describing dissipative mechanisms in particle systems is that they can be seen as gradient flows (or steepest descents) of an associated free-energy functional. This is the case of the linear Fokker-Planck equation [19], which describes the evolution of the probability of one or many Brownian independent particles, and many other nonlinear Fokker-Planck equations including nonlinear diffusions and McKean-Vlasov like equations [3,18,24,27,30]. For example, if we consider N Brownian particles moving under an external potential V (x), their evolution can be described by the following stochastic differential equation (SDE): The set of N particles can equivalently be described by a Fokker-Planck PDE for its joint probability density P ( x, t), where x = (x 1 , . . . , x N ): where ∇ x and ∇ x · respectively stand for the gradient and divergence operators with respect to the N -particle position vector x and V N ( x) = N i=1 V (x i ). The Fokker-Planck equation (1.2) can be seen as a gradient flow with respect to the Wasserstein metric and the free energy (1. 3) The connections between (1.1), (1.2) and (1.3) are well understood in the case of noninteracting particles, where essentially the macroscopic limit of a set of N particles coincides with the case of a single Brownian particle [19], leading to the linear Fokker-Planck equation In this paper we are interested in the connections between these objects when we consider interacting Brownian particles.
Having interactions makes the coarse-graining procedure, going from N particles to one, highly nontrivial. In particular, the Fokker-Planck equation for the one-particle density p(x, t) = P ( x, t)δ(x − x 1 )d x becomes in general nonlinear due to particle interactions, and its relation to the N -particle probability density becomes much more complicated due to correlations between particles. The particular assumptions on the interaction potential are crucial in order to derive the evolution of the one-particle probability density. When dealing with long range interactions, the mean-field limit assumption leads to nonlinear McKean-Vlasov equations, see [8,17,29] and the references therein. Other scaling limits are possible [7,15,23] and they will be discussed in Section 3 below.
In this work, we consider a set of N pairwise interacting particles in a bounded domain Ω ⊂ R d , with an interaction potential u of short range ǫ ≪ |Ω|: for 1 i N . Using an upscaling technique based on matched asymptotic expansions in the limit of low-volume fraction, one obtains [13] a nonlinear correction term in the development for small excluded volume fraction ǫ, leading to the nonlinear Fokker-Planck equation: where Here p ǫ is used to indicate the asymptotic approximation of p for small ǫ. Thus, we see that interactions introduce a nonlinear diffusion term into the macroscopic Fokker-Planck (1.4). Interestingly though, (1.6) preserves the gradient-flow structure of the original microscopic Fokker-Planck (1.2), with the following free-energy see for instance [24,18,30]. The same applies when, instead of soft interactions, one considers hard core interactions between particles (hard spheres of diameter ǫ). In that case, interactions do not appear in (1.5) but as boundary conditions on a perforated domain, i.e., X i (t) − X j (t) = ǫ and the coefficient is α u = V d (1), the volume of the unit ball in R d . We discuss this further in subsection 3.3.
The aim of this work is to study what happens to the gradient flow structure and freeenergy of a particle-based model when coarse-graining. In the example above it has been shown by matched asymptotic expansions directly on the PDE level [13] that the macroscopic model preserves the structure, and that interactions appeared as a quadratic term in the free-energy. But, can we recover this information from the variational viewpoint during the coarse graining procedure? Understanding this point would be potentially useful in the description of generalised models (1.1) for non-identical particles, for which the macroscopic model is a cross-diffusion PDE system [12] without a gradient flow structure, at least not in general and in the standard sense [11].
The idea of coarse-graining at the level of the variational Fokker-Planck equation or the free-energy is not a new one, see for example, [1,2]. This also embeds into a more abstract setting of evolutionary convergence of gradient flows, see [4,21,22,26,28]. All these papers are working on the lowest-order limit, while we seek to derive a first-order expansion in terms of the small volume of particles (note that the lowest order in our case is a linear Fokker-Planck equation for single particles that can be obtained easily). Here we propose to work with the variational scheme, also called the JKO scheme, leading to the Fokker-Planck equation [19] and use the method of matched asymptotics at that level to obtain a macroscopic variational scheme, thus ensuring the preservation of the gradient-flow structure. In this paper we focus on the simple example with identical interacting particles, for which we have a gradient flow structure at the macroscopic level as discussed above.
The paper is organised as follows. We first introduce the variational formulation of the steepest descent at the macroscopic and microscopic levels together with their optimality conditions. We discuss some aspects such as uniqueness of the variational formulations and their linearizations. Section 3 is the core of this work devoted to the strategy of matched asymptotic expansions at the level of the optimality conditions for the variational schemes. This is all done in the case of soft particles while a final subsection deals with the hard-sphere case. Finally, Section 4 illustrates these results with numerical experiments emphasizing the free energy comparisons between microscopic and macroscopic simulations. An appendix includes some of the computations for the hard-sphere case.
In this section we define the microscopic problem in detail and present the corresponding macroscopic problem. We then write both problems in variational form. For generality, we expose the problem for soft particles interacting via a repulsive potential (in subsection 3.3 we discuss how the hard-core particles case can be seen as a particular limit of soft spheres).

Models for soft spheres
We work in the dimensionless problem, with the length scales rescaled so that the characteristic macroscopic length and volume are of order one. We consider a system of N identical particles diffusing in a bounded domain Ω ⊂ R d with no-flux boundary conditions on ∂Ω evolving according to (1.5). Particles are initially positions are random and identically distributed, and are under the action of an external potential V . We assume particles interact with each other via a pairwise potential u that is isotropic (that is, u only depends on the distance between two particles). We assume that u is a decreasing function on [0, ∞) and vanishing at infinity sufficiently fast. In particular, we suppose that u = O(r −(d+δ) ) for some δ > 0 as r → ∞. We denote the range of interactions of the potential u by ǫ.
Remark 2.1 (Range of the interactions) We say that an interaction is long range when ǫ ≫ l, where l is the typical distance between particles, so that one particle interacts with a large number of particles surrounding it. In contrast, when ǫ ≪ l then on average one particle only interacts with one neighbouring particle at a given time, and the interaction is said to be short range. Using that |Ω| ∼ 1, we have that l ∼ N −1/d and: In this work we study the case when u is short ranged and ǫ ≪ 1. The joint probability density P ( x, t), x = (x 1 , . . . , x N ), of N particles evolving according to (1.5) satisfies the following problem where n is the outward normal vector on ∂Ω N , as before, and U N is the total interaction potential of the system The initial condition is P ( x, 0) = P 0 ( x) with P 0 invariant to permutations of the particle labels. The corresponding free-energy is given by (2.2) In contrast with the hard-spheres case discussed in the introduction, the microscopic Fokker-Planck (2.1) is defined in a domain with no holes, and the interactions between particles appear as a drift term in the equation.
Depending on the strength of the interaction potential u, one expects different limit equations [7]. When the interactions are long range (ǫ ≫ l), then one particle interacts on average with an order N particles. Using the mean-field approximation P ( x, t) = N i=1 p(x i , t) results in the following macroscopic model: for x, y ∈ Ω. The mean-field model (2.3) is valid when u is long range, but is not suitable for short-range interactions. In particular, if u is singular at the origin the convolution term in (2.3) may not even be defined. In these cases, we can use matched asymptotic expansions instead to derive the macroscopic model [13].
We consider a short-range repulsive interaction potential u with range ǫ ≪ 1 as described in Remark 2.1. As discussed in the Introduction, using the method of matched asymptotic expansions results in the following nonlinear Fokker-Planck equation (valid up to order ǫ d ) [13] The associated free-energy to (2.4) is the same as for hard spheres (1.7) replacing α by We want to study the correspondence between the microscopic model (2.1)-(2.2) and the macroscopic model (2.4) using the free energy and gradient flow description. In other words, is the gradient flow of (2.5) the macroscopic counterpart of the gradient flow of (2.2) in the limit N → ∞? In order to gain some insight into this question we will consider the minimizing movement scheme, a time-discrete variational approximation of gradient flows [19]. We then perform an asymptotic expansion in an outer region, where it is natural to assume the asymptotic independence of two particles, and in an inner region at the scale of particle sizes, where the size exclusion leads to significant dependence. In order to argue meaningfully with the first-order asymptotics, we also need to understand the linearized problem and its uniqueness. These building blocks will be discussed in the next subsections.

Minimizing Movement Scheme
It is well-known that, under certain conditions of the drift, the stationary solutions of the Fokker-Planck equations (2.1) and (2.4) satisfy a variational principle. Namely, they minimise their associated free-energy functionals E and E ǫ over the associated class of probability densities, P 2 (Ω N ) and P 2 (Ω) respectively. We will use an extra connection between the free-energy functional and the Fokker-Planck equation, namely that the solutions of the Fokker-Planck equation follow, at each instant in time, the direction of steepest descent of the associated free energy functional. In fact, the authors in [19] showed that the linear Fokker-Planck equation can be obtained as the limit of a variational scheme. LetP k be the approximated N −particle probability density at time t = k∆t. GivenP k−1 , then we defineP k as any solution of the variational problem inf P k ∈P2(Ω N ) where W 2 the Wasserstein metric. In order to find the Euler-Lagrange optimality conditions of (2.6), we make use of the Benamou-Brénier formulation of W 2 [6] to rewrite (2.6) as where the infimum is taken among all the pairs (P, U ) and 'final position' P k such that P k ∈ P 2 (Ω N ), P : [0, ∆t] → P 2 (Ω N ) with This can be understood as an optimal control problem: we need to find the best end point P k so that E(P k ) is the smallest, but also the optimal path with flux P U to get there. We note that in the literature is also common to find this problem defined in the time interval [0, 1]; in this case the dependency on ∆t appears as a 1/∆t coefficient outside the integral in (2.7 a). Benamou and Brénier pointed out that the variational scheme (2.7) can be seen as a convex minimization problem with linear constraints. The constraint, or the continuity equation (2.7 b), will be eliminated by introducing a Lagrange multiplier Φ in the next subsection.

Weak formulation and compatibility conditions
Following the procedure in [9], the variational problem (2.7) is equivalent to which leads to the classical optimality conditions [27,30] Here Φ can be interpreted as a Lagrange multiplier for conditions (2.7). This yields the following variational problem: given the initial distributionP k−1 at time t = (k − 1)∆t and an free-energy E, determine the pair (P, Φ) andP k ( x) = P ( x, ∆t) such that We can repeat the procedure to obtain the weak formulation and optimality conditions for the macroscopic Fokker-Planck equation (2.4). We arrive at in Ω × {∆t}, (2.10 e) where the final time condition is φ ǫ (∆t) = − logp ǫ k + V + α u (N − 1)ǫ dpǫ k . For ease of notation, from now on we will eliminate the bars on P k−1 and P k from the optimality conditions, and analogously for the macroscopic densities.

Uniqueness of Solutions of the Optimality Conditions
In the following we study the system of nonlinear compatibility conditions in a unified way such that it comprises the microscopic as well as the macroscopic problem. We thus consider a domain D ⊂ R M of arbitrary dimension M ; D can be the domain in R dN for the microscopic model (perforated in the case of hard spheres), or D = Ω with M = d in the macroscopic model. We look for solutions (p, φ) of Here, F is a strictly convex functional in the classical sense as (1.7) and (2.2), which we assume to be differentiable for the sake of simplicity, but an analogous proof based on subgradients can be carried out in general.
In order to verify the uniqueness of a solution we can follow the formal proof of Lasry and Lions [20] for mean-field games, which have the same structure as (2.11). The key idea is to take the difference of the equations for two solutions (p i , φ i ), i = 1, 2, that is, then multiply the first equation with φ 1 − φ 2 , the second with with p 1 − p 2 and integrate with respect to space and time. This, together with integration by parts, yields For nonnegative p 1 and p 2 , the right-hand side is obviously nonnegative, while the lefthand side is negative due to the strict convexity of F if p 1 (·, ∆t) = p 2 (·, ∆t). Thus we conclude p 1 (·, ∆t) = p 2 (·, ∆t) and ∇φ 1 = ∇φ 2 on the support of p 1 + p 2 . The uniqueness of the transport equation (2.11 a) thus implies p 1 ≡ p 2 .

The Linearized Compatibility Conditions
In order to justify an asymptotic expansion it is a key issue to understand the linearized problem and its well-posedness. In the following we will provide an analysis based on the Ladyzhenskaya-Babuska-Brezzi theory [10] for linear saddle-point problems under suitable conditions, which can be carried out in a dimension-independent way, hence being applicable for the high-dimensional microscopic as well as for the macroscopic problem. As before, we consider a general domain D ⊂ R M .
Given a known pair (q, ϕ), around which we linearize the optimality conditions (2.11), we obtain the following linear problem for (h, f ): for a nonnegative constant C.
Here we prove the existence and uniqueness of the linearized system (2.12). In order to verify existence and uniqueness of a weak solution of (2.12) we introduce a weak formulation, which actually corresponds to a second order approximation of the original variational problem. Throughout the following arguments we assume that q and ϕ are sufficiently smooth and q is strictly positive in D × [0, ∆t]. Consequently we introduce a variable corresponding to the flux, that is, a vector field g = q∇f . Then consider the variational problem of minimizing subject to ∂h ∂s + ∇ · (g + h∇ϕ) = 0. (2.14) Introducing f as a Lagrange parameter, we obtain the saddle-point problem Clearly an appropriate space for g is L 2 ((0, ∆t) × D) M due to the first term in the functional. Since we expect ∇f = g q , an obvious choice is f ∈ L 2 (0, ∆t; H 1 (D)). Hence, the constraint on h and g is to be interpreted in the dual space L 2 (0, ∆t; H 1 (D) * ), which makes sense also since ∇ · g ∈ L 2 (0, ∆t; H 1 (D) * ).
It remains to define an appropriate space for h, which will be based on the method of characteristics. First of all, let r denote the push-forward of h(·, 0), that is, the unique solution of ∂r ∂s + ∇ · (r∇ϕ) = 0, with initial value h(·, 0). Then we look for a distributional solution f ∈ r + W , where Note that for given ∂h ∂s + ∇ · (h∇ϕ) ∈ L 2 (0, ∆t; H 1 (D) * ) one can reconstruct h with zero initial value by appropriate integration along characteristics and obtains also a distributional trace at s = ∆t.
In the following we thus look for a solution using the general theory of saddle-point problems in Hilbert spaces from [10].

Theorem 2.2 There exists a unique solution
of the variational problem (2.13) subject to (2.14), respectively a weak solution of (2.12).
Proof Following [10] we need to verify an inf-sup condition for the constraint and the coercivity of the quadratic functional on the kernel of the constraint. The inf-sup condition follows immediately by estimating the supremum over all g, h by the value at g = −λ∇f with λ > 0 sufficiently large, and h constant in Finally the Poincare inequality implies that the right-hand side can be estimated from below by a multiple of the norm of f in L 2 (0, T ; H 1 (D)). For coercivity we restrict ourselves to 1 2 ∆t 0 Ω g 2 q dxds, which is clearly coercive with respect to g in L 2 ((0, ∆t) × D). However, for g ∈ L 2 ((0, ∆t) × Ω) and (g, h) in the kernel of the constraint we immediately have , which implies also coercivity.

Derivation of the macroscopic variational Fokker-Planck equation
In this section, we show that the macroscopic compatibility conditions (2.10) can be derived from the corresponding microscopic problem (2.9). We begin by the simple case of noninteracting particles, and then consider the case of interacting particles. We show the derivation for soft spheres, and conclude the section presenting the key differences in deriving the macroscopic variational problem for hard spheres.
In order to reduce the dimensionality of the problem (2.9), we consider the marginal densities for n = 1, 2, . . . N − 1, and a decomposition of the microscopic flow of the form [14] Φ In (3.2), each new term in the series describes higher-order interactions. For example, for non-interacting particles ϕ n = 0 for all n 2, whereas for pairwise interacting particles ϕ n = 0 for all n 3. Integrating (2.9 a) over dx 2 . . . dx N using the boundary conditions (2.9 e) gives ∂p ∂s where p is the one-particle marginal density (p ≡ P 1 ), that is,

Non-interacting particles
We begin by the simplest case of non-interacting particles, so that the interaction potential in (1.5) is u ≡ 0. Using that particles are initially independent and identically distributed, we can write Using (3.5) and (2.9 b)-(2.9 e), the problem for Φ reads Using the decomposition (3.2) with ϕ n = 0 for n 2, we arrive at As seen in Section 2.4, we know that this solution is unique. Finally, inserting (3.7) into (3.3) gives So we have obtained all the macroscopic compatibility conditions (2.10) as required (with u = 0). What we have shown is that if (p, ϕ) verify (3.8) and (3.9), then (P, Φ) given by (3.5) and (3.7) satisfy (2.10) with u = 0.

Soft-sphere particles
For non-interacting particles, we have seen that the microscopic density P is the product of N one-particle densities p, while the microscopic flow Φ is the sum of N one-particle flows φ. For pairwise interacting particles, we neglect the interactions at all orders higher than two in (3.2). Then the equation for p (3.3) reads where we have used that P is invariant with respect to permutations of the particle labels and (3.1).
In order to obtain a closed equation in terms of macroscopic quantities from (3.10), we consider the equation satisfied by the two-particle density P 2 (x 1 , x 2 , t). This is obtained by integrating the microscopic Euler-Lagrange equation (2.9 a) over x 3 , . . . , x N , applying the divergence theorem with (2.9 c) and relabelling particles (as they are all identical): and Similarly, inserting (3.2) into (2.9 b) we arrive at where F is a differential operator of first order in its arguments.
For low-concentrated systems with short-range interactions, three-particle (and higher) interactions are negligible compared to two-particle interactions: when two particles are close to each other, the probability of a third particle being nearby is negligible. Mathematically, this means that the two-particle probability density P 2 (x 1 , x 2 , t) is governed by the dynamics of particles 1 and 2 only, independently of the remaining N −2 particles. In other words, the terms G 2 in (3.11) and F in (3.13) are negligible, and the problem for (P 2 , Φ 2 ) reduces to in Ω 2 × (0, ∆t), (3.14 b) 14 c) where P n,k denotes the nth marginal of P k .

Matched asymptotic expansions
Since the potential u is short-ranged, we rewrite it as u(r) =ũ(r/ǫ) with ǫ ≪ 1. We seek a solution to (3.14) using the method of matched asymptotic expansions. We suppose that when two particles are far apart ( x 1 −x 2 ≫ ǫ), their Brownian motions are independent (so we can use the results for point particles in §3.1), whereas when they are close to each other ( x 1 − x 2 ∼ ǫ) they are correlated due to interactions. We designate these two regions of the configuration space Ω 2 the outer region and inner region, respectively.

Outer region
In the outer region we define By independence, we have that for some functions q(x, s) and ϕ(x, s). That is, at leading order in the outer region, we consider the forms we found for non-interacting particles, namely that the density is a product of densities in x i and the flow is a sum of flows in x i .
In the outer region, x 1 − x 2 ≫ ǫ and hence the interaction term in (3.14 e) will be small. Specifically, given that u = O(r −(d+δ) ) for some δ > 0 as r → ∞, the outer problem up to O(ǫ d ) does not see the interaction term. 1 In particular, this implies that the outer density q and the outer flow ϕ satisfy the equations for independent particles (3.9) and (3.8 a), respectively, found in the previous subsection. That is,

e)
At order ǫ, (3.14) gives . If the initial positions of the particles X i (0) are identically and independently distributed, then the initial joint density P (0, x) is separable and thus P (k) out (0, x) = 0 for all k 1 (all the contribution goes at the leading order, which we have assumed is separable). Therefore P

Inner region
In the inner region, we set x 1 =x 1 , x 2 =x 1 + ǫx, and defineP (x 1 ,x, s) = P 2 (x 1 , x 2 , s) andΦ(x 1 ,x, s) = Φ 2 (x 1 , x 2 , s). With this rescaling, (3.14) becomes . To this we need to add matching conditions onP andΦ so that they tend to the outer solution as x → ∞. Expanding the outer solution (3.15) and taking into account the solution of (3.17) in inner variables givesP
In sum, we find that the inner region solution is, to O(ǫ), where q k and ϕ satisfy the outer problem (3.16). We note that the inner region equations determine completely the flow up to O(ǫ), but that we only obtain conditions for the density at the final time (s = ∆t) and at infinity ( x ∼ ∞). But since the problem is stationary up to O(ǫ) (s appears only as a parameter in (3.19) and (3.22)), we can replace q k (x 1 ) by q(s,x 1 ) in (3.24 a) and writẽ for any functions B i (i = 0, 1) such that B i = 0 at s = ∆t and as x ∼ ∞, so we do not obtain a unique solution for the density in the inner region. However, the subsequent analysis shows that the value of the integral in (3.10) and the integrated compatibility conditions are invariant to B i and thus it what follows we can simply set them to zero:

Integrated equations
We now go back to (3.10) and use the inner and outer solutions in order to obtain the optimisation conditions for the macroscopic problem (2.10). First, we need the following result.
Lemma 1 (Relationship between p and q) The one-particle density p and the outer density q are related by

26)
and Ω q(x, s) dx ∼ 1 + ǫ d a, where a is an order one constant. Therefore p = q + O(ǫ d ).
Proof To keep the notation simple, in the following we omit the time variable s as an argument of the densities. We begin by evaluating the integral in (3.4) by splitting the integration volume Ω for x 2 into the inner and outer regions and using the inner and outer solutions for P 2 (x 1 , x 2 , s), respectively. Even though there is no sharp boundary between the inner and outer regions, it is convenient to introduce an intermediate radius δ, with ǫ ≪ δ ≪ 1, which divides the regions. Then the inner region is Ω in (x 1 ) = {x 2 ∈ Ω : x 2 − x 1 < δ} and the outer region is the complimentary set Ω out (x 1 ) = Ω\Ω in (x 1 ). Then The outer integral is, using (3.15), where V d (1) denotes the volume of the unit ball in R d . The inner integral is, using the leading-order of (3.24 c), Combining the two integrals and choosing δ such that δ d+1 = ǫ k with d < k < d + 1, we obtain using that δ d V d (1) = ǫ d V d (δ/ǫ). Since 1 − e −ũ(x) decays at infinity, we can extend the domain of integration to the entire R d introducing only higher-order errors. Therefore, as required, where α u is given in (3.26). To obtain the asymptotic value of the mass of q, we integrate the equation above to impose the normalisation condition on p: Rearranging, we find that as required. The constant is a = 1 2 α u Ω q 2 (x) dx.
Now we turn to the main contribution of this work.
Proposition 1 Consider a short range repulsive interaction potential u satisfying u(x) = u(x/ǫ) with ǫ ≪ 1, and N = 2. Then the one-particle marginal density and flow (p, φ) of the optimality conditions (2.9) satisfies the following equations up to order ǫ d : which are consistent with (2.10) for N = 2.
Proof We consider now the integrated equation (3.10) with N = 2 and Φ 2 defined in (3.12). We introduce a macroscopic mobility m(p) and a macroscopic flow φ such that The integral I can be evaluated splitting it into inner and outer parts as in Lemma 1.

Using (3.15), the outer component is
whereas the inner-region integral reads, using (3.24), Combining the two integrals as in Lemma 1 and using (3.25), we obtain with k > d as before. Therefore, from (3.29) we have that We use this expression to determine the mobility m(p) in a general way. Using the condition (3.16 e) on ϕ and (3.25) we have that, Expanding the mobility and the potential in powers of ǫ d , m = m (0) + ǫ d m (1) + · · · and φ = φ (0) + ǫ d φ (1) + · · · , this implies that, at leading order Since the mobility should be the same for any external potential V , this requires m (0) (p) to be linear in p. We take it m (0) (p) = p without loss of generality (otherwise the constant would go into φ (0) ). This implies that ∇ x1 At the next order we have that, expanding the logarithm, Substituting for m (0) and ∇ x1 φ (0) and following the same reasoning as before leads to m (1) = cp for some constant c (in fact, this applies to all m (i) for i 0). Since in the right-hand side of (3.31) there is no potential appearing, an obvious choice for the correction of the mobility is c ≡ 0, leading to ∇ x1 φ (1) = −α u ∇ x1 p. Therefore,

General number of particles N
The result stated in Proposition 1 formally extends to any number of particles N under the assumption of N ǫ d ≪ 1, that is, that the total volume of interaction is small compared to the macroscopic volume Ω. The final condition reads There are two points to check. The first is justifying that when N ǫ d ≪ 1, the twoparticle optimality conditions for (P 2 , Φ 2 ) reduce to (3.14). The second is in the domain decomposition to evaluate the integral I, where instead of one inner region there are now (N − 1).
Regarding the first point, we already discussed at the beginning of Subsection 3.2 that the additional terms in (3.11) and (3.13) with respect to (3.14) correspond to three or more particle interactions and then are negligible under the assumption N ǫ d ≪ 1. This is because in configuration space they have volume of order N 2 ǫ 2d , and therefore asymptotically they will appear at a higher order.
Regarding the second point, for a general N the domain decomposition to evaluate (3.27) now has N − 1 inner region integrals, one for each pair x 1 forms with another particle. Using that all particles are identical, the integration variable in the N − 1 inner region integrals can be relabelled to x 2 . This implies that the terms with α u in (3.25) and (3.32) are multiplied by N − 1.
This completes the derivation of our main result.

Remark 3.1 (Different scaling limits)
Here we discuss the regime of interactions that we consider in the context of previous works in the literature. Oelschläger [23] considers the model of interacting particles evolving according to (1.5) with V = 0 in the limit of N → ∞. Bodnar and Velazquez [7] also consider the same model for d = 1. The interaction is rescaled in the following way as a function of N : where β ∈ [0, 1] is a parameter that models the strength and range of the interaction. The potential then becomes where we have introduced l = N −1/d (this corresponds to the average distance between uniformly distributed particles in a domain of unit volume). Therefore the range of the interaction is given by l β . In this form, it is easier to extract different cases: • β = 0: here u scales like 1/N and its range is order one. This corresponds to weakly interacting particles (mean field limit), where each particle interacts on average with all the other N particles and the potential is long range. In the limit of N → ∞ one obtains an integro-differential equation of the form (2.3).
• β ∈ (0, β * ): in this case each particle interacts with an order N 1−β neighbours but each interaction is stronger (of order N β−1 ). This limit is termed "moderately interacting particles" in [23]. In contrast to the mean-field case, as N → ∞ the interactions get more and more local and the limit dynamics satisfy a nonlinear diffusion equation of the form (2.4) but with a nonlinear coefficient of the form u 1 (x)dx (see (28) in [7] or (15) in [13]). In [23] they have β * = d/(d + 2) and in [7] they get β * = 1 for d = 1 for particles are near the equilibrium (distributed according to the Gibbs measure).
• β β * : in contrast to the other two cases, the random fluctuations in the interaction term N j>i u(x i − x j ) does not vanish as N → ∞. This corresponds to strongly interacting particles or the hydrodynamic limit. In the case of β = 1, the strength of u is independent of the number of particles, but the range is short so that on average only interact with on neighbour.
What the three limits above have in common is the order of the total excluded volume η. This is given by N times the volume of the range of the potential, multiplied by the strength of the repulsion (this can be thought of as α u ). For an interaction potential of the form (3.33), we have η = N N β−1 (l β ) d = N β (N −1/d ) βd = 1, independent of β. In contrast to [7,23], here we do not take N → ∞, the volume fraction tends to a nonzero small constant, in which we do the asymptotic expansion. In particular, the strength of the potential is order one, its range is ǫ, and the total excluded volume is N ǫ d = η. The assumption η ≪ 1 implies that a particle at a given time only interacts with another particle on average, and that we can use the N = 2 case to obtain the leading-order correction term. This means that we are in a regime in between the moderately and strongly interacting particles, in that we arrive at a nonlinear diffusion equation for the population density as for β < β * but the strength of the interaction is order one.

Hard-sphere particles
So far we have assumed that particles interact via a soft short-range repulsive potential u(r) to obtain the equations governing the one-particle marginal in variational form. In this subsection we outline the case of a hard spheres, that is, spherical particles of diameter ǫ interacting via the hard-sphere potential u(r) = +∞ for r < ǫ, u(r) = 0 otherwise. The microscopic optimal problem (3.14) for N = 2 reads in Ω 2 ǫ with boundary conditions P ∇ xi Φ · n i = 0, and initial/final conditions Most of the steps in the derivation are analogous to the soft spheres case we have just presented. Hence here we only outline the key differences arising when considering hard spheres and (see appendix A for more details on the hard spheres case): • The microscopic model (3.34) is defined in a perforated domain, namely Ω 2 ǫ}. • The interaction between particles appears in the microscopic problem as the boundary condition (3.34 c), rather than in the final time condition (3.34 f). But in the macroscopic model the hard spheres interaction appears in the same way as the soft shortrange interaction, namely by substituting in the hard-sphere potential into (3.26) to find α u = V d (1) (2 for d = 1, π for d = 2 and 4π/3 for d = 3). • The initial condition P ( x, 0) cannot be separable, since initial configurations must live in the configuration space, X(0) ∈ Ω 2 ǫ . However, the correction due to overlaps in the initial condition P ( x, 0) scales with the excluded volume, and therefore it only shows at higher order.

Numerical examples
In this section we present several numerical examples of the macroscopic equation to illustrate the effect the nonlinear diffusion has on the behaviour of solutions and their convergence to the steady states. Throughout this section we use no-flux boundary conditions on ∂Ω, where Ω = [−1/2, 1/2] d . We will also compare the predictions of (4.1) with stochastic simulations of the corresponding microscopic model (soft or hard spheres depending on the choice of α u ).  without external potential (that is, V = 0). The corresponding unit-mass steady state is p ǫ,∞ = 1, which is the unique minimiser of the free-energy In the case without interactions (ǫ = 0), Eq. (4.1) is simply the diffusion equation, whose solutions approach the steady state with an exponential convergence rate. However, it is not clear what effect has the nonlinearity in the convergence rate. We solve (4.1) numerically using the finite-volume method presented in [16], with the initial condition p ǫ (x, 0) = χ [0.2,0.4] and N = 100 hard rods of length ǫ = 0.0015. We also solve the linear case, setting ǫ = 0 in (4.1). The decay of the free-energy (4.2) for these two cases in shown in Figure 4.1 as ∆E(t) = E ǫ (p ǫ (x, t)) − E ǫ (p ǫ,∞ (x)). We observe the expected exponential convergence with rate r 0 = 2λ 1 = 2π 2 in the linear case. In the nonlinear case, we also observe an exponential convergence with rate r ǫ > r 0 .
In order to approximate the increased rate r ǫ , we look at the linearised version of equation (4.1) around the equilibrium p ǫ,∞ = 1, which corresponds to the linear diffusion equation but with diffusion coefficient D eff = 1+α(N −1)ǫ d , equal to 1.3 for our choice of parameters. It can easily be shown that in this case the rate isr ǫ = D eff r 0 . We find that the free-energy decay of the linearised equation agrees with this prediction, as well as with the free-energy decay of the nonlinear equation (see Figure 4.1), indicating that the solution is already very close to the equilibrium and well approximated by the linearised equation. as well as with the prediction.
In the next examples we compare the behaviour of the solutions of (4.1) for different interaction types and external potentials, with the corresponding microscopic particlelevel model. For the particle-level simulations, we use the open-source C++ library Aboria [25]. The overdamped Langevin equation (1.1) is integrated using the Euler-Maruyama method and a constant timestep ∆t.
In order to compare the models at the density level, we perform R independent realisations and output the positions of all N R particles at a set of output time points. A histogram of the positions is calculated and then scaled to produce a discretised density function (p i (t) ≈ p(x i , t) in 1d, p ij (t) ≈ p(x i , y j , t) in 2d, . . . ) that can be compared with the solution to (4.1). The macroscopic free-energy (4.2) should approximate well the microscopic free-energy. For hard spheres this is Without the P log P term, this would be straightforward using a Monte Carlo integration; however the entropic term make things more complicated. One approach would be to compute an estimateP ( x, t) of the joint probability density P (using for example a Kernel Density Estimate) and then obtaining an estimate of E either by a so-called resubstitution estimate (Monte Carlo integration using the same samples used to obtain P ) or a splitting data estimate (generating new samples for the Monte Carlo integration) where X k i is the position of the ith particles in the kth sample at the time of the freeenergy estimate. A cheaper alternative approach is to use the discretised density function p i and compute the approximation to the free-energy using a discretised version of (4.2). The direct estimation of the microscopic free-energy (4.3) using (4.4) is out of the scope of this paper, and we are going to use the second approach.
Example 4.2 (Hard-core interacting particles with quadratic external potential) In this example, we consider a two-dimensional system (d = 2) with N = 1000 hard-core disks of diameter ǫ = 0.01, and a quadratic external potential in the horizontal direction, V (x) = 5x 2 (see Figure 2(a)). In two dimensions, the hard-core potential has α = π, and for our choice of parameters the coefficient of the nonlinear term in (4.1) is α(N − 1)ǫ 2 = 0.314. We choose initial data constant in the vertical direction so that the evolution of (4.1) is purely in the x-direction. Specifically, we take initial data p(x, 0) = χ [0.1,0.3] (x). In Figure 2(b) we plot the early-time evolution, and in Figure  2(c) the steady-state solutions. As expected, we find an increased speed of convergence to equilibrium in the case of interactions (see Figure 2(d)). We observe good agreement between the PDE solutions and the stochastic simulations. PSfrag replacements PSfrag replacements PSfrag replacements PSfrag replacements "volcano-shaped" external potential in the horizontal direction (see Figure 3(a)). In two dimensions, the Yukawa potential has α u = 3.926 (using (3.26)), and for our choice of parameters the coefficient of the nonlinear term in (4.1) is α u (N − 1)ǫ 2 = 0.392. We choose initial data to be a sum of two gaussians along the horizontal direction and constant in the vertical direction so that the evolution of (4.1) is purely in the x-direction. We observe that the density moves very quickly from the asymmetric initial condition to the centre of the domain, where the minimum of the potential V is (see Figure 3(b)). This effect can also be observed in the evolution of the relative entropy, which shows a steep change until around t = 0.01 and then relaxes to the long-time convergence (see Figure  3(d)). Again we observe a marked difference in the speed of converge to the equilibrium solution between the interacting or point particles simulations, and that the difference in slopes is well-captured by our PDE solutions (see Figure 3(d)).
a radially-symmetric "volcano-shaped" external potential in the horizontal direction (see Figure 4(a)). In two dimensions, this interaction potential has α u = 5.568 (using (3.26)), and for our choice of parameters the coefficient of the nonlinear term in (4.1) is α u (N − 1)ǫ 2 = 0.556. This time we choose a radial initial condition whose amplitude depends on the angular variable (see Figure 4(b)) so that the evolution is in two dimensions and has two distinct timescales. In particular, we observe a very fast evolution until about t = 0.01 by when the mass is almost centred in the middle of the domain ( Figure  4(c)). After that time the evolution is a lot slower as observed by the change in slope in relative entropy (Figure 4(d)).

Conclusion
In this paper we have obtained a framework to derive higher-order expansions of gradient flows for many particle systems, which automatically preserves the gradient flow structure  and thus gives a first answer to the questions raised in [11], due to the observation that higher-order expansions only at the level of the PDEs can lead to a loss of the gradient flow structure. It obviously motivates further research, e.g. about the cross-diffusion system from [12], where this loss of a gradient structure happens this way [11] and we expect to restore the correct gradient structure by our approach. Another interesting issue is to study the N -particle problem and its limit in further detail, without assuming that we can directly work on the two-particle level. In this respect it is a relevant question to find the BBGKY hierarchy equivalent for the (P, Φ) problem in the sense of [14] and to provide a justification of truncation in the limit of low volume fraction.

A.1 Matched asymptotic expansions
We proceed to solve (3.34) using matched asymptotic expansions.
In the outer region we obtain the same solution as for soft spheres, as expected since we are outside the interaction region: P out (s, x 1 , x 2 ) ∼ q(s, x 1 )q(s, x 2 ), Φ out (s, x 1 , x 2 ) ∼ ϕ(s, x 1 ) + ϕ(s, x 2 ), (A 1) where q and ϕ satisfy (3.16). This is valid up to O(ǫ d ) by the same argument as in the soft spheres case and the remark that the initial density is chosen separable up to that order.

A.2 Integrated equations
The procedure is analogous to the soft spheres case, except that now the domain of integration for x 2 depends on the position of the first particle x 1 . This will result in some surface integrals. Fixing the first particle at x 1 , we integrate (3.34) over the region available to the second particle, namely Ω ǫ (x 1 ) := Ω\B ǫ (x 1 ). We ignore any intersections that the ball B ǫ (x 1 ) may have with ∂Ω (for some positions x 1 close to the boundaries), since this gives a higher-order correction. This gives where p(s, x 1 ) = Ωǫ(x1) P (s, x 1 , x 2 ) dx 2 .
Here dS x2 denotes the surface element with respect to variables x 2 . In the last term of (A 7) we have used the divergence theorem and the no-flux boundary condition (3.34 c).
We again compute I breaking it into inner and outer regions: The outer part is, using the outer expansion (A 1) where V d (1) denotes the volume of the unit ball in R d . The inner region integral, using the inner solution (A 6), becomes Combining the two integrals we obtain using that α u = V d (1) for hard spheres. Similarly, we can use Lemma 1 to find p ∼ q(x 1 ) Ω q(x 2 ) dx 2 − α u ǫ d q(x 1 ) , which implies that m(p) = p and φ = ϕ(q) up to O(ǫ d+1 ) as expected.