1 Introduction
The problem of predicting the evolution of a nonlinear dynamical system is ubiquitous in science and engineering. In many such applications, the fact that thousands or millions degrees of freedom (DoF) must be solved for at once makes direct numerical solution infeasible, and so many methods are constrained to apply to smaller (and hence less scientifically interesting) problems. One of the most relevant example of this kind is represented by Molecular Dynamics (MD) simulation [Reference Leimkuhler and Matthews40]. In an MD simulation, a set of atoms which represents a particular molecular system is considered, and their positions and momenta are calculated, assuming that they evolve according to the Newton’s equations of motion [Reference Rapaport46] with prescribed interaction potentials and an appropriate thermostat.
Fully atomistic MD simulations regularly performed nowadays reach the order of a few million atoms ( $\approx\!10^6$ ) [Reference Zhao, Perilla, Yufenyuy, Meng, Chen, Ning, Ahn, Gronenborn, Schulten and Aiken55], a number which remains well below the size of a macroscopic system with Avogadro’s number of atoms ( $\approx\!10^{23}$ ). Although simulations with around $10^4$ atoms are usually large enough to extract properties of interest in systems like crystals and small macromolecules (such as polymers and peptides), in order to understand the dynamics of proteins, one must consider the environment in which they evolve. From the point of view of simulation, the problem for this kind of systems is not only the number of DoF required to represent the system, but also the relaxation times of these biomacromolecules. These can be so large that the cost of performing any atomistic simulation becomes quickly prohibitive, since billions of time steps are still required to extract meaningful information. This dual space–time problem continues to render accurate simulations of this type out of our reach.
For these reasons, in recent years, a series of techniques to derive simpler coarsegrained (CG) model have been developed. The main idea behind CG models is to reduce the number of variables in the system while maintaining accuracy. In place of the detailed evolution of the atoms, we consider a smaller number of macroscopic variables, functions of the underlying atomistic structure [Reference Voth51, Reference Papoian44, Reference Di Pasquale, Marchisio and Carbone19, Reference Di Pasquale and Carbone17]. Although theoretical results concerning static equilibrium statistics for coarsegrained systems are relatively welldeveloped [Reference Lelievre, Rousset and Stoltz41], theoretical work on dynamical CG models is still evolving, particularly focusing on applications of the Mori–Zwanzig (MZ) projection operator formalism. In [Reference Hijón, Español, VandenEijnden and DelgadoBuscalioni27], a systematic derivation of CG models based on MZ techniques was proposed. The authors of the present work contributed by putting various aspects of MZCG models on a more rigorous footing [Reference Di Pasquale, Hudson and Icardi18]. The idea behind the application of the MZ framework to coarsegraining is to follow the dynamics of the CG DoF by explicitly integrating over the noncoarsegrained DoF. The resulting equations represent a formidable mathematical problem, and several works were already published reporting the analysis of different aspect of the MZ equations [Reference Givon, Kupferman and Hald21, Reference Chorin and Hald7, Reference Chorin, Kast and Kupferman10]. The importance of the MZ equations is not limited to CG problems alone, and similar approaches have been proposed to describe many different systems, ranging from the relativistic heavyion collision, where Fick’s law breaks down [Reference Koide36, Reference Koide37], to the dynamics of supercooled liquids, glasses and other amorphous solids [Reference Cui, Gebbia, Romanini, Rudić, FernandezPerea, Bermejo, Tamarit and Zaccone12, Reference Cui, Gebbia, Tamarit and Zaccone13, Reference Cui, Milkus and Zaccone14, Reference Cui, Yang, Qiao, Jiang, Dai, Wang and Zaccone15, Reference Götze22, Reference Handle, Rovigatti and Sciortino24, Reference Voigtmann, Puertas and Fuchs50, Reference Zaccarelli, Foffi, Sciortino, Tartaglia and Dawson53, Reference Zaccone54] and the study of heat conduction [Reference Chu and Li11].
A significant challenge with handling the MZ equations arises due to the need to approximate the memory kernel $\boldsymbol{\mathcal{M}}$ , which varies in both time and space. Different methods were proposed to calculate this term as the use of Krylovsubspace method [Reference Chen, Li and Liu6] or datadriven approximations [Reference Lei, Baker and Li38]. A route often followed in the literature is to assume a functional form for either the whole memory kernel or just the fluctuating force, and to ensure that both are coupled through the Fluctuation–Dissipation Theorem. In Berkowitz et al. [Reference Berkowitz, Morgan and McCammon2], the starting point is to assume that the fluctuating force is a periodic function which can thus be developed in Fourier series. Taking the other perspective, functional forms for the memory kernel that have been considered include Gaussexponential process [Reference Berkowitz, Morgan and McCammon2, Reference Berne and Harp3, Reference Wang and Uhlenbeck52], the GaussGauss process [Reference Berkowitz, Morgan and McCammon2], or combination of exponentially decaying kernels [Reference Ayaz, Tepper, Brünig, Kappler, Daldrop and Netz1, Reference Berkowitz, Morgan and McCammon2]. In all these cited examples, however, the choice of ansatz implies a strong hypothesis: the memory kernel does not depend on the positions (i.e., it is spatial homogeneous). However, as we will show in the next sections, for most CG systems this hypothesis does not seem justified. Even in a onedimensional chain interacting with a simple Lennard–Jones potential, which is the example we present here, we observe that the memory kernel has a strong spatial dependence. The spatial dependence for this term was also observed in [Reference Chu and Li11]. Although it is not clear yet how to quantify the importance of this spatial dependence, in particular for more complicated systems in higher dimensions, the spatial homogeneity of the friction should not be taken for granted without verification.
1.1 The Mori–Zwanzig equations
The MZCG equations are a set of integrodifferential equations which take the form of Generalised Langevin Equations under appropriate assumptions. We assume that, at the microscopic level, the considered system is well described by classical mechanics and is composed by $N^{FG}$ particles. To derive a reduced description of this system, we consider a smaller set of function of phase variables $\boldsymbol\zeta_{}$ and their time evolution
Here, the operator $\mathfrak{L}\equiv iL \,\,\unicode{x2254} \left\lbrace {\ldots,\mathcal{H}}\right\rbrace$ is the Liouville operator where $\left\lbrace {\ldots,\mathcal{H}}\right\rbrace$ is Poisson bracket with the Hamiltonian $\mathcal{H}$ , and $i=\sqrt{1}$ .
The Mori–Zwanzig formalism allows us to replace equation (1.1) for the functions of the phase variables $\boldsymbol\zeta_{}(\textbf{r}(t),\textbf{p}(t))$ by an equation of motion which depends on the microstate $(\textbf{r}(t),\textbf{p}(t))$ only through the CG DoF themselves, and where, as shown by Di Pasquale et al. [Reference Di Pasquale, Hudson and Icardi18], the complexity of the computation of equation (1.1) is shifted to the computation of the socalled orthogonal dynamics.
Generally speaking, a CG model can be obtained by grouping DoF of the finegrained (FG) system into $N^{CG}$ beads. In the case of the Mori–Zwanzig projection this is done by projecting the equations for the functions of the phase variables $\boldsymbol\zeta_{}(\textbf{r}(t),\textbf{p}(t))$ onto the space of functions of the CG variables, which may be regarded as a subspace of the FG observable space. Naturally, the CG variables must be continuous and differentiable in the FG variables. The equations satisfied by these beads can be written as (for full derivations see [Reference Di Pasquale, Hudson and Icardi18, Reference Hijón, Español, VandenEijnden and DelgadoBuscalioni27])
where:
$\boldsymbol Z_{}=\left(\boldsymbol Z_{\textbf{R}},\boldsymbol Z_{\textbf{P}}\right)=\left(\textbf{R}_1,\ldots,\textbf{R}_N^{CG},\textbf{P}_1,\ldots,\textbf{P}_N^{CG}\right)$ are the CG DoF;
$\textbf{M}=\textrm{diag}(M_1,\dots,M_{N^{CG}})$ is a diagonal matrix of the masses of the beads;
$V^{\textrm{eff}}$ is the effective potential (for an explicit definition we refer to [Reference Di Pasquale, Hudson and Icardi18]);
$\mathcal{P}$ and $\mathcal{Q}$ are the MZ projection operator and its orthogonal counterpart (i.e., $\mathcal{Q}=\mathcal{I}\mathcal{P}$ where $\mathcal{I}$ is the identity operator), respectively.
The projection operator $\mathcal{P}$ applied to a generic function $F(\textbf{z})$ acting on the space phase $X=\mathbb{R}^{6N^{FG}}$ is defined by
Here, $\textbf{z} = (\textbf{r}(t),\textbf{p}(t))$ are the FG DoF, $\boldsymbol\zeta_{}(\textbf{z})$ is a smaller set of phase function, and we have introduced a set of fixed values $\textbf{a}$ such that for a given set $\textbf{z}$ we have $\boldsymbol\zeta_{} = \textbf{a}$ . The normalisation factor (also called the structure function) represents the total volume of the surface $S(\boldsymbol\zeta_{})$ in phase space, which is defined by $\boldsymbol\zeta_{}(\textbf{z})=\textbf{a}$ :
where $\mathbb{P}(\mbox{d}{\textbf{z}'}) = \rho^{eq}(\textbf{z}')\, \mbox{d}{\textbf{z}}'$ is the Gibbs measure on the FG phase space X. Loosely speaking, the projection operator $\mathcal{P}_{\textbf{a}}$ ‘averages’ the function $F(\textbf{z})$ over the surface $S(\textbf{a})$ in case the observables $\boldsymbol\zeta_{}$ are equal to $\textbf{a}$ for a given set $\textbf{z}$ . The fact that $\mathcal{P}$ is a projection operator can be easily verified by showing that it satisfies the necessary and sufficient conditions for a projection operator, i.e., it is Hermitian and $\mathcal{P}^2 = \mathcal{P}$ [Reference Zwanzig57]. Since the observed value of the phase function $F(\textbf{z})$ is the expected value $\mathbb{E}[F]$ with respect to the Gibbs measure, we can interpret the phase function as a random variable. Therefore, the projection operator $\mathcal{P}$ has a nice interpretation as a conditional expectation over the CG DoF:
Finally, the term ${\boldsymbol{\mathcal{F}}}_{\boldsymbol Z_{\textbf{P}}}(t,\cdot)$ is often called fluctuating force and can be written (for the Kth bead) as
Here, we used a dot in the second argument of ${\boldsymbol{\mathcal{F}}}_K$ to stress that this term is a function of all of the variables in the atomistic system. Computing ${\boldsymbol{\mathcal{F}}}_K$ requires the calculation of the evolution of the orthogonal variables in the null space of $\mathcal{P}$ . In general, it is given by solving an auxiliary set of equations called orthogonal dynamics equations [Reference Givon, Kupferman and Hald21], which (with an appropriate CG mapping) can be determined by means of constrained dynamics (see [Reference Di Pasquale, Hudson and Icardi18]). Its presence in the MZ equations equation (1.2) is usually considered as a random noise for the CG DoF [Reference Chorin, Hald and Kupferman8]. Making this assumption avoids the problem of its direct calculation, and so allows us to reduce the complexity of the problem. However, ensuring that statistics of ${\boldsymbol{\mathcal{F}}}_K$ are captured accurately in a stochastic model is important to preserve the accuracy of the CG system. This represents the main problem we address in this work.
The integral term in equation (1.2) is known as the ‘memory’, since it requires information coming from the elapsed time. In a sense, in order to know the next position occupied by the system, we need to ‘remember’ what happened since the initial time $t=0$ . The memory term can be rewritten (for the Kth bead) as [Reference Di Pasquale, Hudson and Icardi18, Reference Hijón, Español, VandenEijnden and DelgadoBuscalioni27]
where we introduced the KJth components of the memory kernel $ \boldsymbol{\mathcal{M}}\left({\boldsymbol Z_{}(ts),s}\right)$ :
Here, $\otimes$ represents a tensor product and $\beta^{1}=k_B T$ , with $k_B$ the Boltzmann constant, and ${\boldsymbol{\mathcal{F}}}_K(0,\cdot) = \mathcal{Q}\mathfrak{L}\,\textbf{P}_K\,$ (see equation (1.6)). Di Pasquale et al. have shown that (with an appropriate CG mapping) the fluctuating force ${\boldsymbol{\mathcal{F}}}_K$ is given by the difference of the total force acting on atoms within a bead and the mean force given by the effective potential that describe the bead interaction (see [Reference Di Pasquale, Hudson and Icardi18]). Finally, in the last passage, we used the Zwanzig projection $\mathcal{P}$ , and the fact that it is equivalent to take the conditional expectation given the CG (set of) variables $\boldsymbol Z_{}$ , interpreted as random variables, with respect to the equilibrium distribution [Reference Chorin, Hald and Kupferman8, Reference Di Pasquale, Hudson and Icardi18].
For a system in thermal equilibrium, the memory kernel is related to the autocovariance of the fluctuating forces (ACF) through the Fluctuation–dissipation theorem [Reference Farias, Ramos and da Silva20, Reference Hänggi and Jung25, Reference Jung, Hanke and Schmid30]:
where, as above, expectations are taken with respect to an appropriate ensemble. This description is usually called ‘colored noise’ fluctuation, as opposed to the ‘white noise’ represented by the Markovian behaviour (see Section 2.1) [Reference Hänggi and Jung25].
Different approaches have been proposed in the literature to deal with the memory term. In [Reference Kauzlarić, Español, Greiner and Succi31], an analysis of the memory term obtained from the Mori projector operator [Reference Mori43, Reference Kawasaki32], a special case of the more general MZ projector, was presented. A more detailed analysis is reported in [Reference Zhu, Dominy and Venturi56] where some boundary analysis is proposed and the hierarchical approximation, first introduced by [Reference Stinis47]), is further developed. Other approaches proposed in the literature to calculate the memory term, include the use of the Galerkin decomposition [Reference Darve, Solomon and Kia16], or the assumption that the evolution of the CG DoF can be modelled using an autoregressive stochastic process [Reference Kneller and Hinsen35].
The two most used approximations for the calculation of the memory term are related to the behaviour of the integrand of the memory integral with respect to time. If the integrand quickly decays to zero, it can be assumed that all the information is concentrated in a very short time (i.e. we don’t need the whole history of the system from $t=0$ ). The memory term becomes local in time and we talk about the Markovian Approximation [Reference Chorin and Hald7, Reference Hijón, Español, VandenEijnden and DelgadoBuscalioni27]. If, on the other hand, the decay of the integrand of the memory term is very slow compared to the characteristic time of the dynamics of the system, then we can assume that the integrand is constant in the interval [0,t] and we obtain the socalled tmodel [Reference Chorin, Hald and Kupferman9, Reference Chorin and Hald7, Reference Hald and Stinis23, Reference Price and Stinis45].
In this work, we will focus on the shorttime approximation by describing a model which can be interpreted as giving a Markovian reduction of the MZ equation. This paper is structured as follows: we will start with the discussion of the memory term from the point of view of the Markovian Approximation as usually is considered in the literature, and then we present our new description for it. We will show how it can be effectively approximated and we will give a precise definition of all the hypothesis used in the derivation. We will then apply the framework derived in a simple example composed by a onedimensional linear chain where atoms interact with a nonlinear potential. We will show how the relevant quantities described here can be explicitly calculated and we draw some conclusions.
2 The behaviour of the MZ CG system
2.1 Memory approximation
The physical theory of Brownian motion assumes that thermal fluctuations which occur due to the collision of the small, light particles with the larger Brownian particle happen at a time scale which is much shorter than the one describing the evolution of the larger particle. The consequence of this assumption is that the force acting on the Brownian particle at each time is completely uncorrelated with the history of its evolution. Mathematically speaking, this corresponds to the idea that the random force is a stochastic process which obeys the Markov property. We note that in the theory of Brownian motion, there is a further important assumption, which is that the behaviour of the heat bath is spatially uniform. This assumption is clearly justified for a particle within a fluid which is at rest macroscopically. However, the spatial uniformity may not be generally valid, since many coarsegrained DoF are tightly bound within a molecule or bulk solid, and so are in close contact with their particular environment, rather than a homogeneous random one.
Moreover, for coarsegraining of molecular systems, it is common to make an analogous timescale separation assumption about the action of the neglected DoF on the coarsegrained DoF, i.e. that the discarded DoF fluctuate at a faster timescale and so act on the CG DoF randomly with no correlations in time.
Setting aside the justification for the use of timescale separation and spatial homogeneity, under these two assumptions the fluctuating forces ${\boldsymbol{\mathcal{F}}}$ are usually modelled as a white noise process, with an ACF which is assumed to take the form:
with ${\boldsymbol{\mathcal{F}}}$ being the fluctuating forces which act in the system at full resolution. Since the fluctuating forces are assumed to be stationary as a random process (in fact they are defined as averages over equilibrium ensembles, see [Reference Berne and Pecora4]), the terms $\Phi_{KJ}$ do not depend on the particular time origin [Reference Lei, Caswell and Karniadakis39, Reference Kinjo and Hyodo33], i.e. there is no time dependence in the definition of $\Phi_{KJ}$ given in (2.1). One of the problems with this description is the socalled plateau problem, first noticed by Kirkwood and Buff [Reference Kirkwood and Buff34]. The plateau problem stems from the fact that the integral definition of $\Phi_{KJ}$ is zero if the upper boundary is $+\infty$ , because of the decay of the negative part of the correlation function at infinity [Reference Helfand26]. For this reason, one approach has been to cut the domain of integration for the last integral in equation (2.1) at a certain time $\tau$ which should be large enough to include the relevant information in the ACF, but small compared to the time scale of the evolution of the CG DoF. By cutting the integral at a finite time, we are assuming that it remains constant for $t > \tau$ (i.e. the integral reaches a plateau). The problem comes from the fact that the intermediate time $\tau$ is not well defined within the theory and there is no method to estimate it a priori.
In the following, we will show that our derivation overcomes this issue by proposing an asymptotic methodology which allows us to use data to fit a functional form for the memory, and thereby derive an explicit form for the Markovian approximation directly from data.
2.2 Memory ansatz
Our analysis starts in a similar fashion by making an ansatz on the form of the memory kernel $ \boldsymbol{\mathcal{M}}(\boldsymbol Z_{},t)$ , albeit a less restrictive one. We assume that the memory kernel for the generic pair of beads K and J takes the general form
where $\tau_{KJ}$ is a characteristic time, $\alpha$ is a constant, and $f_{KJ}(\boldsymbol Z_{},t)$ is a sufficiently smooth function on the interval $0\leq t < \infty$ . In general, each entry of the memory kernel is specific to each pair of beads. However, in the following, in order to simplify the notation, we will write
i.e. we will consider the case where all the entries in the memory kernel are equal. The general case can then be easily obtained by repeating the same arguments for all the possible KJ pairs. It is interesting to note that a similar ansatz was considered when using a MZderived Generalised Langevin Equation to study the dynamical behaviour of amorphous solids [Reference Cui, Gebbia, Romanini, Rudić, FernandezPerea, Bermejo, Tamarit and Zaccone12, Reference Cui, Gebbia, Tamarit and Zaccone13, Reference Cui, Milkus and Zaccone14, Reference Cui, Yang, Qiao, Jiang, Dai, Wang and Zaccone15, Reference Zaccone54].
While equation (2.3) represents a somewhat strict ansatz for the form of the memory kernel, it is consistent with similar quantities calculated in other systems [Reference Li, Lee, Darve and Karniadakis42], and moreover, our approach includes a possibly nonuniform dependence on the spatial position, in contrast to assumptions made in other applications of CG theory discussed above. Moreover, given the connection between the memory kernel and the ACF of equation (1.9), both $\tau$ and $\alpha$ can be extracted from the ACF, for example by fitting to numerical data. In particular, $\tau$ represents the time decay of the ACF and we will show in Section 3.3 its explicit calculation in a test system. We will also make the natural physical assumptions that $\boldsymbol{\mathcal{M}}(\boldsymbol Z_{},t)$ is maximal when $t=0$ , and that the parameters satisfy $\tau>0$ and $\alpha > 0$ .
Under the assumptions that $\alpha>0$ and $\tau>0$ , the exponential factor makes it possible to expand the memory kernel as an asymptotic series from which the behaviour close to the origin can be explicitly obtained. Here, we argue that this asymptotic series approach leads naturally to a form of the Markovian approximation commonly considered in CG theory. We note that the case where $\alpha = 0$ corresponds to algebraic decay of the memory kernel, and will require a different approach which we leave for future work.
In the following, we consider only the first integral of the memory term (i.e., equation (1.7)) since $\boldsymbol{\mathcal{M}}$ does not depend on the momenta for an appropriate CG mapping, see [Reference Di Pasquale, Hudson and Icardi18]. Furthermore, we make the memory term dimensionless by multiplying it by suitable constants:
where:
M is the mean effective mass of the coarsegrained DoF;
$L_c$ is a characteristic length whose specific definition is not important for what is discussed here (and can be taken as the average bead size, for instance);
$\tau_P$ is a characteristic time, which can be interpreted as the decay of the autocovariance function of the momenta (ACM) of the beads. This quantity can be interpreted also as the characteristic time scale of the macroscopic system.
We indicate with $\widetilde{\cdot}$ the dimensionless quantities and we note that we can always rewrite the different terms in equation (2.4) as function of the dimensionless time $\tilde{t} = t/\tau_P$ , by writing $\boldsymbol Z_{}(ts)=\boldsymbol Z_{}(\tau_P(\tilde{t}\tilde{s}))=\widetilde{\boldsymbol Z_{}}(\tilde{t}\tilde{s})$ , where we absorb $\tau_P$ in the definition of the function. We will also simplify the notation in equation (2.4) by defining the dimensionless quantity $\lambda=\left({\frac{\tau_P}{\tau}}\right)^\alpha$ . We want to highlight here the first result of the formulation presented here. The concept of the timescale separation can be precisely and quantitatively defined by measuring it with the parameter $\lambda$ . From the definition of $\lambda$ , we can distinguish two cases:

(i) $\lambda \gg 1$ , then $\tau_P \gg \tau$ , that is to say that the macroscopic time scale is much larger than the microscopic one. The Markovian approximation is expected to hold.

(ii) $\lambda \approx 1$ , then $\tau_P \approx \tau$ and there is no separation between the macroscopic and the microscopic timescale. The Markovian approximation is expected to fail and the full memory term must be considered.
In the next section, we will present a result which will make more clear the qualitative explanation given here in terms of validity or not of the Markovian approximation. Then, the problem will become the explicit calculation of the two quantities, $\tau$ and $\tau_P$ , which will be presented in the following sections. In order to simplify our notation, we will drop tildes from now on, and assume all quantities are dimensionless.
2.3 Rigorous approximation result
Under the assumption that the ansatz equation (2.3) holds, we now derive an error estimate for an expansion of the memory integral which we can use to define proxy dynamics, providing a theoretical guarantee of accuracy in the case where the timescale separation is significant.
Proposition 1. Suppose that the memory kernel $\mathcal{M}$ takes the form
where $\lambda>0$ , $\alpha>0$ , and the function
is of class $\textrm{C}^{k,\gamma}$ in both t and s, for some nonnegative integer k and Hölder exponent $\gamma\in(0,1]$ . Then the absolute error committed by replacing the memory
by a kterm approximation is bounded by
where
$\Gamma$ is the Gamma function, and $C_n$ are appropriate positive constants which depend upon the regularity of the trajectories.
A full proof of this proposition is given in Appendix A, and we focus on the physical interpretation of the result here. The error bound for the memory matrix is composed of two terms. The first of these is
and it can be shown that $\psi$ decays exponentially in time for fixed $\lambda$ (see Appendix B). Physically, this decay represent transient behaviour as the system relaxes towards the equilibrium. We believe this transient behaviour is closely related to the plateau problem which has been observed in earlier works referred to above.
The second term
does not decay with increasing time t. However, we can study its behaviour with respect to $\lambda$ . In particular, it follows that $\zeta(\lambda)$ decays to zero as $\lambda\to\infty$ because $\frac{k+\gamma+1}{\alpha} > 0$ , then
As already observed (see Section 2.2), $\lambda$ represents the scaleseparation between the underlying atomistic model and the coarsegrained one, so as we will show below, the above result provides both a form for a Markovian approximation of the dynamics, and a theoretical guarantee backing up the commonly held belief that a Markovian approximation is more accurate when there is a large time scale separation between atomistic and CG models.
In particular, as a corollary of Proposition 1, we note that if the integrand function g(t,s) in the memory integral is Lipschitz, so that $k=0$ and $\gamma=1$ , then we have
We remark that the assumption that g(t,s) is Lipschitz is particularly mild. It is guaranteed to hold if trajectories of the system are twice continuously differentiable in time, and this will in turn be the case when the potential energy function is twice continuously differentiable in space. As such, the assumption is therefore satisfied for almost all Hamiltonian systems of interest.
We note that the function $G_0(t)$ is
and hence we are able to approximate equation (2.4), where we have dropped the tildes, as
or in other words, we can define a spatially varying friction cooefficient matrix $\boldsymbol{\chi}$ to replace the memory integral.
In this case, we can use this expression to show that the relative error of the approximation is bounded by
As we already observed for the absolute error bound, the former term exhibits decays to zero as $t\to\infty$ , regardless of the value of $\lambda$ , while the latter decays as $\lambda\to\infty$ for fixed t.
In the next section, we will show how to compute $\chi$ in practice and discuss the error committed in replacing the memory term with its approximation shown in equation (2.8).
3 A numerical example
In this section, we demonstrate how one might apply the mathematical result of Proposition 1. To do so, we use a numerical implementation of a onedimensional periodic chain made of $N^{FG} = 30$ atoms of two different species, which differ in mass and bond stiffness, grouped in $N^{CG} = 10$ CG beads.
3.1 Model
We assume the atoms are arranged in a repeating pattern, and a coarsegrained model is obtained by combining the single repeated units into beads. The CG variables are given by the centre of mass of each bead and by the corresponding momenta, and the repeating pattern consists of two atoms of mass $M_2$ and a single atom of mass $M_1$ arranged in the following configuration: $(M_2M_1M_2)$ . We considered two test cases: $M_1=M_2=1$ and $M_1=1$ , $M_2=100$ .
The bond stiffness between the two different species of atoms is described by means of an interatomic potential, which is a simple 126 Lennard–Jones potential:
Here, $i \in \left\lbrace {1,2,\dots,N^{FG}}\right\rbrace$ and we choose $\sigma_{i,i+1}=2^{1/6}$ so that the minimum value of the potential is obtained for $r^*_{i,i+1}=1$ . The value of the interaction strength, $\epsilon_{i,i+1}$ , is set to 1 and 10 for $M_1  M_1$ and $M_1  M_2$ pairs, respectively. In the following, all parameters, such as temperature and time step, are expressed in terms of Lennard–Jones reduced units [Reference Tuckerman48].
The simulations to obtain the fluctuating force (and therefore to determine the memory kernel) are implemented in the canonical (NVT) ensemble at fixed $k_B T = 1$ and friction parameter $\gamma=1$ . Furthermore, we assume periodic boundary conditions and nearest neighbour interactions.
The Hamiltonian of the atomistic system is given by
where $\textbf{M}$ is the mass matrix (that is, a diagonal matrix whose entries are the masses of each atom) and $\textbf{r}$ and $\textbf{p}$ are the positions and momenta of the atoms, respectively. The final term takes into account that periodic boundary conditions are in place, i.e., that the two ends of the linear chain are connected to each other.
The results shown below were obtained through simulation of two different dynamics: (i) the FineGrained Dynamics (FGD) when solving the equations of motion with the Hamiltonian defined in equation (3.1),
where, $\textbf{z}=\left({ \textbf{z}_{ \textbf{r} },\textbf{z}_{ \textbf{p} } }\right)=( \textbf{r}_1,\ldots,\textbf{r}_{N^{FG}} , \textbf{p}_1,\ldots,\textbf{p}_{N^{FG}} )$ is a state of the system characterised by the instantaneous positions and momenta of the $N^{FG}$ particles that compose it; (ii) the Orthogonal Dynamics (OD) when solving the system of equations
for $k = 1,2,\dots,N^{FG}$ , where I is the index such that $k \in S_I$ and $S_I$ is the set of particles of the FG system included within the bead of index I. We note that both FGD and OD have roughly the same numerical performances, and that for the simple system investigated here the speedup achieved by the coarsegraining procedure is $\approx\!5$ .
3.2 Numerical sampling of the memory kernel
All the simulations reported below were performed using the code HybridZwanzJulia [Reference Icardi, Hudson, Di Pasquale, Spinaci and Rovigatti29] written in Julia v1.4.1 [Reference Bezanson, Edelman, Karpinski and Shah5]. This code allows a relatively simple recalibration of the test system through Julia ‘types’, which are initialised by the user to fix all the parameters of the simulation, including the repeating pattern of the beads, the number of beads, the mass and stiffness of the interatomic potentials. Here, we give an idea of the sampling implemented in the code to compute the fluctuating force, given in equation (1.6), and therefore the memory term as given in equation (2.8).
As mentioned in the introduction, the idea is to sample using OD, since it is possible to determine the fluctuating force through constrained dynamics. First, the initial conditions for the simulation are fixed, so that the momenta are distributed according to the Maxwell–Boltzmann distribution and the positions lie in a neighbourhood of the equilibrium positions, determined by the equilibrium distance of the interatomic potential. Then the trajectory of the FGD is computed and samples of the positions and momenta are stored. These samples are used to determine the CG positions and momenta and are used as initial condition for computing the trajectories of the OD. Along these trajectories, time averages of the first and second moment of the effective force between beads are stored. The first moment gives the value of the mean force acting between adjacent beads given the position of the center of mass, while the second moment is used to compute the fluctuating force. Since the sampling steps of the OD can run at the same time without having to communicate between the processes, these procedures can be run in parallel.
In order to improve the time required to sample the fluctuating force, we use the translational symmetry of this simple system, the fact that the beads are identical and the assumption that the effective potential can be approximated as a sum of twobody interactions based on the distance to the nearestneighbour beads only. We thus compute the interactions between any combination of beads and assume that the interactions computed are valid for any equivalent pair in the system. In this way, we reduce the dimensionality of the space to be sampled from the total number of CG variables to simply one, i.e., the interbead distance, which simplifies the exploration of the entire constrained phase space.
If we consider bead I, we can assume that the forces that act on it can be decomposed as the independent ‘stress’ contributions $\sigma_{I+1,I}$ and $\sigma_{I,I1}$ , which are the forces exerted on bead I by its two neighbours, $I1$ and $I+1$ . Under these conditions, we write the conditional expectation in the equation (2.8) as
where $\Delta\sigma$ is the fluctuating part of the ‘stress’ contribution, $\textbf{R}$ is the interbead distance and we assume that the cross terms are negligible, i.e.,
In other words, we assume that the fluctuating stresses acting between adjacent pairs of beads are not correlated, since we have made the assumption that pair interactions are sufficient to describe the behaviour of the system.
Under this assumption, we obtain
Therefore, we compute a smooth approximation of the variance of the fluctuating force between beads which allows us to define
and the spatially varying friction coefficient matrix $\boldsymbol{\chi}$ (see equation (2.8)) as
where $I,J \in \left\lbrace {1,2,\dots,N^{CG}}\right\rbrace$ . We note here that the memory term we consider depends on CG coordinates. The importance of this feature in this kind of models was shown in [Reference Hudson and Li28].
3.3 Fitting and results
In this section, we show the numerical results for the autocorrelation function (ACF) of the fluctuating forces and of the bead momenta.
Applying our ansatz (see equation (2.3)), we fit the decaying of the ACF of the fluctuating forces obtained from a finegrained simulation using the function
The best fit values for $\alpha_F$ and $\tau$ , obtained with the XMGRACE software [Reference Turner49], are reported in Table 1 for the two systems investigated here. In order to check the consistency of the results, we also fit the initial decay of the ACF of the momenta and of the fluctuating forces using
For comparison, the data and the accompanying fitted curves are shown in Figure 1.
For the fluctuating force, for both systems ( $M_2 = 1$ and $M_2 = 100$ ), the initial behaviour can be described by a quasiGaussian behaviour with a characteristic decay time that does not depend on the exponent used, followed by damped oscillations that we do not take into account in our analysis. By contrast, the bead momenta ACF of both systems exhibit a nearlyperfect Gaussian decay. This distinction is likely due to the quadratic nature of the kinetic energy term in the Hamiltonian.
We can now use equation (3.2) to obtain the elements of $\boldsymbol{\chi}$ , which explicitly depends on the interbead distance $\textbf{R}$ and are used to build our approximation of the memory term equation (2.8). The results for the two systems investigated here are reported in Figure 2. The numerical data shows that, while the fluctuating forces ${\boldsymbol{\mathcal{F}}}(0,\cdot)$ (shown in the inset of Figure 2) depends only weakly on the atomic masses, the resulting friction term features a much stronger dependence on $M_2$ through the value of $\tau$ . This formulation demonstrates the possibility of using the result of Proposition 1 to provide a CG approximation of the full system with appropriate theoretical guarantees on the error committed in replacing the memory integral with a simpler friction matrix.
3.4 Discussion
We can now use the numerical results reported in Table 1 to estimate the the error committed in replacing the full memory term with its approximated value based on the ansatz reported in equation (2.3). In particular, this error is controlled by the two terms $\psi(t,\lambda)$ and $\zeta(\lambda)$ (see equations (2.5), (2.6)). Using the results obtained for equations (B2), (2.7), we can write for the two terms of the boundary of the relative error in equation (2.9):
In order to check the magnitude of the leading term in equation (3.5a), we need the time t, which represents the total simulated time of our simulation. However, for the sake of giving an estimate of the order of magnitude of $\psi^\prime(\tilde{t},\lambda)$ , we can simply use a time of the order of time scales considered in a standard CG simulation. For this reason, it seems reasonable to assume for it the typical time scale of the bead dynamics given by $\tau_P$ . Using the numbers reported in Table 1, the leading term in equation (3.5a) is $\mbox{exp}\left[{\left({\frac{\tau_P}{\tau}}\right)^\alpha}\right] \approx 10^{21}$ for the system with $M_2 = 1$ , and $\mbox{exp}\left[{\left({\frac{\tau_P}{\tau}}\right)^\alpha}\right] \approx 10^{8}$ for the system with $M_2 = 100$ . Although the specific values of these numbers (and even their orders of magnitude) depend on the functional form used to fit the simulation data, they are in all cases always much smaller than unity, demonstrating that the portion of the relative error bounded by $\psi^\prime(t)$ is extremely small. Moreover, the simulated time is usually well beyond $\tau_P$ , that is to say that in a standard simulation we can expect an even lower value of $\psi^\prime(\tilde{t},\lambda)$ .
The term $\psi^\prime(t)$ is roughly related to the replacement of the integration boundary in equation (2.4) from (0,t) to $(0,\infty)$ . It is therefore expected that the error due to this replacement decreases with the increase of t and, given the argument of the integral, it decreases with exponential behaviour.
Let us now consider the portion of the relative error bounded by $\zeta^\prime(\lambda)$ , equation (3.5b). With the same values reported in Table 1 we obtain $\frac{\tau}{\tau_P} = 0.225$ for the system with $M_2 = 1$ and $\frac{\tau}{\tau_P} = 0.29$ for the system with $M_2 = 100$ . The term $\zeta^\prime(\lambda)$ represents the relative error committed by replacing the memory term by its Taylor expansion stopped at a some k. In this case, we considered $k=0$ , i.e. we replaced the whole memory term with the leading term of its Taylor expansion. We see that, in this case, the bounding value for the percentage error is $\approx$ 23% for the system with $M_2 = 1$ and $\approx$ 30% for the system with $M_2 = 100$ , meaning that the results of the simulations should be carefully checked as it is not guaranteed that the error committed would be negligible.
The analysis just showed puts on a more rigorous ground the qualitative statement that the characteristic time of the decay of the momenta ACF, being of the same order of magnitude of the decay time of the fluctuatingforce ACF, makes resorting to equation (3.2) questionable. Moreover, our results show that including more terms in the approximation (see Section 2.3) should improve such model. Indeed, since in equation (2.7) $\zeta(\lambda)$ approaches zero faster upon the increasing of k, this error bound can be reduced, making the results of such simulations more robust. We plan to test this approach in future work.
4 Conclusions
In this work, we discussed the properties and various approximations of the dissipative memory term in the Mori–Zwanzig equations applied to CG simulations in MD. A common approach to approximate this term is to assume apriori a particular behaviour (e.g., Markovian) for the dynamics of the system and therefore simplify the terms involved in the memory accordingly. In this work, instead, starting from the Green–Kubo definition of the memory term involving the autocovariance of the fluctuating forces, we assume a functional form for this latter quantity. This is chosen such that an explicit calculation of the memory is possible, and it can be therefore compared and calibrated with finegrained data.
Starting from this ansatz, we proved a rigorous results on the boundary of the error committed by replacing the full memory term with its approximation. We also included an asymptotic analysis on the terms bounding the error and an explicit definition of the parameters which control the approximations used in the derivation.
Finally, we tested this approach on a simple system represented by a onedimensional chain interacting with a Lennard–Jones type potential in two different cases, with different masses of the beads. We explicitly calculated the friction term for both of them checking the consistency of the Markovian approximation, using the asymptotic analysis presented in Section 2.3.
Our results show that this approach is feasible in a onedimensional case, and demonstrates that the friction exhibits strong spatial dependence, suggesting that further investigation is required for more complex, multidimensional systems.
Moreover, our rigorous analysis clearly gives what is the range of error expected, showing that even for such a simple system like the one we considered the Markovian approximation should be checked carefully, as the error committed by such approximation can be important.
This work represents a step towards a systematic datadriven analysis of the memory term, and future work will focus on the analysis of the longtime memory behaviour of the system and the application to more realistic models.
Acknowledgements
During this project, the work of T. Hudson was supported by the Leverhulme Trust through Early Career Fellowship ECF2016526. The authors also wish to acknowledge the comments of anonymous referees, which helped to significantly improve this manuscript.
Conflicts of interest
The authors declare that there are no conflicts of interest relating to this work.
A. Proof of Proposition 1
We begin by using Taylor’s theorem (with remainder in Lagrange form) to expand g in s about $s=0$ up to order $k1$ , writing
for some $\theta\in[0,1]$ . Using this expansion and splitting the domain of integration, we have
Employing the triangle inequality, the regularity assumption on g, and the fact that $\theta\in[0,1]$ , we obtain
where $G_{k,\gamma}(t)$ is the Hölder $\gamma$ seminorm of a bounded function g on the set [0,t]. We note that, for the evolution over a given time interval [0,T], we can replace the functions $G_n(t)$ and $G_{k,\gamma}(t)$ by appropriate constants $C_n$ and $C_{k+1}$ by taking a supremum over the time interval.
To conclude, we note that by making the change of variable, we can express the integrals over the interval $(0,\infty)$ in terms of the Gamma function:
Using this result, we conclude that the statement holds.
B. Proof of equation (2.5)
Let us start with the definition of $\psi(t)$
Let us now estimate the last integral in the previous expression. We start by making the change of variable $v = s^\alpha$ :
There are now two cases: either (i) $\frac{n+1}{\alpha}1\leq 0$ , or (ii) $\frac{n+1}{\alpha}1> 0$ .
In case (i), we can estimate the integral as being
This is exponentially small as long as $\lambda t^\alpha\gg 1$ .
In case (ii), we can integrate by parts to obtain
Integrating by parts more times as necessary, we ultimately obtain a similar result to that obtained in case (i), i.e. that this error term is exponentially small as long as $\lambda t^\alpha\gg 1$ .
In summary, we have shown that