Diffuse-interface approximation and weak-strong uniqueness of anisotropic mean curvature flow

The purpose of this paper is to derive anisotropic mean curvature flow as the limit of the anisotropic Allen-Cahn equation. We rely on distributional solution concepts for both the diffuse and sharp interface models, and prove convergence using relative entropy methods, which have recently proven to be a powerful tool in interface evolution problems. With the same relative entropy, we prove a weak-strong uniqueness result, which relies on the construction of gradient flow calibrations for our anisotropic energy functionals.


Introduction
We consider anisotropic mean curvature flow, a geometric evolution equation used to model microstructure in complex materials. The prototypical application is in multi-phase grain growth for polycrystals. As noted in [28], isotropic models, such as mean curvature flow, fail to capture phenomenological features such as the dendritic growth of phases (see also [15]). In chemical kinetics, phase separation experiences anisotropy due to the underlying lattice orientation of the solid host material [5]. Similarly, many materials display anisotropic surface tensions that are not even smooth with respect to the interface orientation (see also [3,53,54]): here, one can consider the household setting of salt (NaCl) and air. At the same time, the study of interface evolutions poses serious numerical and mathematical challenges, and a large amount of insight has been gained by modeling such systems in terms of phase-field models, where one replaces interfaces by continuous order parameters (see, e.g., [15]).
In this paper, we prove convergence of solutions of the anisotropic scalar Allen-Cahn equation, a phase-field model, to anisotropic mean curvature flow using variational methods. This may be considered a first step toward proving convergence in the physically relevant vectorial setting. Our approach sheds light on anisotropic mean curvature flow in general, and further enables us to prove a weak-strong uniqueness result for the interface evolution.
Anisotropic mean curvature flow prescribes the evolution of an oriented hypersurface with the surface velocity determined by a weighted mean curvature. Fixing a surface tension $\sigma : \mathbb{R}^d \to \mathbb{R}_{\geq 0}$ and a mobility $\mu : \mathbb{R}^d \to \mathbb{R}_{\geq 0}$ (where one can think of extending from the sphere by one-homogeneity), we say that a time-parametrized collection of sets $\{A(t)\}_{t \in [0,T]}$ evolves by anisotropic mean curvature flow if
$\frac{1}{\mu(\nu)} V = -H_\sigma$ on $\Gamma(t) := \partial A(t)$, (1.1)
where $\nu$ is the outer unit normal of $A(t)$, $V$ is the surface normal velocity, and $H_\sigma := \operatorname{div}_{\Gamma(t)}(D\sigma(\nu))$ is the anisotropic mean curvature with respect to $\sigma$. We note that $D\sigma(\nu)$ is typically referred to as the Cahn-Hoffman vector field, a generalization of the surface normal. In the case that $\sigma = \mu = |\cdot|$ are given by the Euclidean norm, one recovers the usual (isotropic) mean curvature flow $V = -H$.
Following the approach of Luckhaus and Sturzenhecker [47], one way to encode the motion (1.1) is through the characteristic functions $\chi(t) := \chi_{A(t)}$. Here the $A(t)$ are naturally given by sets of finite perimeter instead of smooth open sets, allowing for distributional (or $BV$) weak solutions to (1.1). In the isotropic setting, solutions were derived via a minimizing movements scheme (see also [1]) for the perimeter functional. This was a natural approach, as mean curvature flow can formally be viewed as the gradient flow of the perimeter functional with an appropriate metric [51,36].
To carry this analogy to our setting, the curvature flow (1.1) seeks to minimize an anisotropic interfacial energy
$E(\chi) := c_0 \int_\Omega \sigma(\nu)\,|\nabla\chi|$. (1.2)
Here, $\nu = -\frac{\nabla\chi}{|\nabla\chi|}$ is the measure-theoretic outer unit normal, and $c_0$ is a positive constant quantifying surface energy. Formally speaking (see Subsection 2.2), anisotropic mean curvature flow is a gradient flow of the anisotropic perimeter $E$ with respect to the weighted $L^2$-surface metric
$(V, W)_\Gamma := c_0 \int_\Gamma \frac{1}{\mu(\nu)}\, V W \, d\mathcal{H}^{d-1}$. (1.3)
To construct solutions of interface evolutions, such as (1.1), one can approximate via diffuse-interface models. The idea is that the sharp interface $\Gamma(t)$, which captures a jump discontinuity of $u = \chi$, is replaced by a diffuse interface $u_\varepsilon$ forming a continuous transition between values close to 1 and values close to 0. Diffuse-interface or phase-field models are often used in practice and especially for numerics, where tracking of the interface is reduced to a reaction-diffusion equation. Herein, we consider a phase-field approximation of the curvature flow (1.1) given by the anisotropic Allen-Cahn equation in $\Omega \times (0, T)$, $u_\varepsilon(\cdot, 0) = u_{\varepsilon,0}$ in $\Omega$, (1.4) where $\Omega$ is the $d$-dimensional torus, the function $W : \mathbb{R} \to \mathbb{R}_{\geq 0}$ is a double-well potential with its wells at 0 and 1, and the anisotropic surface tension and mobility are encoded by the functions
$f : \mathbb{R}^d \to \mathbb{R}_{\geq 0}$, $f(p) = \sigma^2(p)$ (1.5)
and
$g : \mathbb{R}^d \to \mathbb{R}_{> 0}$, $g(p) = \frac{\sigma(p) + 1}{\mu(p) + 1}$. (1.6)
For the isotropic case $\sigma = \mu = |\cdot|$, we simply obtain $f = |\cdot|^2$ and $g \equiv 1$.
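The definitions (1.5) and (1.6) can be evaluated directly. The following is a minimal sketch, not taken from the paper: the elliptic surface tension `sigma_elliptic`, its weights, and the sample points are hypothetical choices made only for illustration. It checks that the isotropic choice gives $f = |\cdot|^2$ and $g \equiv 1$, that $g(0) = 1$ (no singularity at the origin), and that $g$ approaches the ratio $\sigma/\mu$ far from the origin.

```python
import math

def sigma_elliptic(p, a=(2.0, 1.0)):
    """Hypothetical surface tension in d = 2: sigma(p) = sqrt(a1*p1^2 + a2*p2^2)."""
    return math.sqrt(a[0] * p[0] ** 2 + a[1] * p[1] ** 2)

def euclidean(p):
    """Isotropic choice sigma = mu = |.|."""
    return math.hypot(p[0], p[1])

def f(p, sigma):
    """f(p) = sigma(p)^2, cf. (1.5)."""
    return sigma(p) ** 2

def g(p, sigma, mu):
    """g(p) = (sigma(p) + 1) / (mu(p) + 1), cf. (1.6)."""
    return (sigma(p) + 1.0) / (mu(p) + 1.0)

points = [(0.0, 0.0), (1.0, 0.0), (0.3, -0.4), (-2.0, 5.0)]

# Isotropic case: f is the squared Euclidean norm and g is identically 1.
iso_f = [f(p, euclidean) for p in points]
iso_g = [g(p, euclidean, euclidean) for p in points]

# Anisotropic surface tension with isotropic mobility.
aniso_g = [g(p, sigma_elliptic, euclidean) for p in points]

# Far from the origin, g is close to sigma/mu (here sqrt(2) in direction e_1).
far_g = g((1.0e8, 0.0), sigma_elliptic, euclidean)
```

In this sketch the regularization only matters near $p = 0$, which is consistent with the role of the added $+1$ terms described in the text.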
In contrast to the approach taken in [21], we have introduced a regularization (1.6) of the mobility µ allowing us to take advantage of the gradient flow structure for the anisotropic Allen-Cahn equation. We define the anisotropic Cahn-Hilliard energy E ε : and introduce a weighted L 2 -metric given by for a given point u ∈ dom(E ε ). Note that we omit the dependence on ε in the notation (·, ·) u for convenience. A formal calculation shows that the Allen-Cahn equation (1.4) is equivalent to where ∇ uε(t) is the gradient on L 2 (Ω), (·, ·) uε(t) . The above equation encapsulates (1.4) as a gradient flow, and this structure will be exploited for the construction of solutions to (1.4) in Section 3 as well as for the sharp-interface limit in Section 4.
A first indication that (1.4) approximates (1.1) is the Γ-convergence of the associated energies (see [10,19]). With $c_0 := \int_0^1 \sqrt{W(s)}\,ds$, (1.9) it was shown by Bouchitté [9] that $E_\varepsilon \xrightarrow{\Gamma} E$ as $\varepsilon \searrow 0$ with respect to the strong $L^1$-topology on the underlying space. Likewise, in the spirit of Luckhaus and Modica [46], Cicalese et al. [18] verified an anisotropic Gibbs-Thomson relation for the energies, connecting the first variation of (1.7) to the anisotropic curvature of the limiting minimal surface.
Early results on anisotropic mean curvature flow in the special case µ = σ are due to Chen, Giga, and Goto [17], who proved the existence and uniqueness of viscosity solutions (up to fattening) for smooth surface tensions σ. Almgren, Taylor, and Wang [1] introduced a time discretization in the form of a minimizing movements scheme including crystalline surface tensions, yielding the so-called flat-flow solutions for anisotropic mean curvature flow. They also proved a short-time existence result for strong solutions if the surface tension σ is smooth. Bellettini and Paolini [7] provide a thorough introduction of the anisotropic mean curvature flow equation for a Finsler metric σ, i.e., with the surface tension possibly depending on the position in space, and argue formally that the time discretization, the level-set equation proposed in [17], and the anisotropic Allen-Cahn equation lead to solutions to anisotropic mean curvature flow.
Allowing for sufficiently regular convex surface tensions σ and arbitrary mobilities µ, Elliott and Schätzle [21] proved that, in the sharp-interface limit, solutions to the anisotropic Allen-Cahn equation converge to anisotropic mean curvature flow in the sense of the viscosity formulation. Unlike in the present work, Elliott and Schätzle used a discontinuous, non-regularized version of g and, therefore, resorted to viscosity solutions of the phase-field equation. In [14], Chambolle and Novaga prove consistency of the minimizing movements scheme [1] and the MBO thresholding scheme for the energy (1.2) with anisotropic mean curvature flow using viscosity solutions. Similarly, using viscosity solutions and distributional solutions with an energy convergence hypothesis, Chambolle et al. [11] prove convergence of the minimizing movements scheme [1] for a translationally dependent energy to inhomogeneous anisotropic mean curvature flow.
In the case of non-smooth (or crystalline) surface tensions, the Cahn-Hoffman vector field is effectively described in terms of a differential inclusion $\nu_\sigma \in \partial\sigma(\nu)$, and selection of the appropriate curvature can make the problem nonlocal. Recently, a variety of work has been invested in understanding crystalline curvature flow. Giga and Giga were among the first to develop a robust solution concept in the planar setting [29]. For crystalline surface tensions, existence and uniqueness were proven in dimension d = 3 by Giga and Požár in [32], and the result was ultimately extended to arbitrary dimension in [33] by the same authors. Chambolle, Morini, and Ponsiglione [13] introduced a novel definition of supersolutions, subsolutions, and weak solutions to anisotropic mean curvature flow that is also based on level set techniques and is particularly useful for crystalline surface tensions. They presented an existence and uniqueness result up to fattening and a comparison principle. The same authors together with Novaga [12] extended this result from the special case µ = σ to arbitrary mobilities µ.
Many of the above results are qualitative, but in the spirit of Chen [16], given the power of viscosity methods, quantitative rates of convergence have been derived and we refer the interested reader to [6,31] and references therein.
In contrast to the above approaches, we will apply relative entropy methods to identify the limit of (1.4). A related approach regarding a localized energy excess was introduced by the first author with Otto in [41], where they proved convergence of the MBO thresholding scheme to multi-phase mean curvature flow. A non-trivial modification of this idea, based on controlling the tilt excess of the sharp or diffuse interface with regard to a smooth approximation, has been used to prove convergence of the vectorial Allen-Cahn equation to multi-phase mean curvature flow [42] and to derive an associated rate of convergence [25], along with optimal quantitative convergence rates for the Allen-Cahn equation to mean curvature flow in the two-phase setting [24]. A key feature of viscosity type solutions is the associated comparison principle, which automatically provides uniqueness of solutions up to the issue of fattening. However, in the multi-phase case, fundamentally different tools are needed to address uniqueness: the relative entropy method [23] has been used to prove weak-strong uniqueness results for a variety of geometric evolution equations including planar multi-phase flows (see, e.g., [40,36,35,22]).
In this paper we prove convergence of the anisotropic Allen-Cahn equation (1.4) to anisotropic mean curvature flow (1.1) for arbitrary Lipschitz mobilities µ and $C^2$ surface tensions σ under an energy convergence hypothesis, as is often assumed in applications [41,43,47,38]. This result provides a complete proof for the result first announced in [39]. Further, we prove weak-strong uniqueness of the associated distributional solution concept for anisotropic mean curvature flow: If a smooth solution (which we will endow with the structure of a calibrated evolution) and a BV solution share the same initial data, it follows that both solutions coincide for all times in their common interval of existence. We summarize these results here, and refer to Theorems 4.1 and 5.2 for precise details.
Theorem 1.1. Let µ be a Lipschitz mobility, σ a $C^2$ uniformly convex surface tension, and Ω the d-dimensional torus. Then the following holds:
• Any sequence of weak solutions $u_\varepsilon$ of the anisotropic Allen-Cahn equation (1.4) with well-prepared initial conditions has a subsequence converging to some limit $u = \chi$ with $\chi : \Omega \times (0, T) \to \{0, 1\}$ as $\varepsilon \to 0$. Under an energy convergence hypothesis, u is a weak solution of anisotropic mean curvature flow.
• Let σ and µ be smooth. If $\{A(t)\}_{t \in [0,T]}$ is a strong solution of anisotropic mean curvature flow (1.1) and $\chi : \Omega \times (0, T) \to \{0, 1\}$ is a distributional solution of anisotropic mean curvature flow with the same initial condition, i.e., $\chi(\cdot, 0) = \chi_{A(0)}$, then the two solutions coincide, i.e., $\chi(\cdot, t) = \chi_{A(t)}$ almost everywhere in Ω for almost every $t \in (0, T)$.
We remark that convergence of the anisotropic Allen-Cahn equation is well-studied, but as far as the authors are aware, this has exclusively been done from the perspective of viscosity solutions. Our result approaches this from the distributional setting and may ultimately be amenable to tackling the multi-phase setting that is most relevant to physical applications. Further, our uniqueness result for distributional solutions shows that it may be possible to obtain quantitative convergence rates in the spirit of Fischer et al. [24] for the anisotropic Allen-Cahn equation.
The structure of the paper is as follows: In Section 2, we introduce the admissible class of anisotropies and, based on the closely connected notions of anisotropic surface energy and anisotropic mean curvature, derive the anisotropic mean curvature flow equation as a formal gradient flow. Here, we further introduce our notion of a distributional solution to anisotropic mean curvature flow. As a preparation for the convergence result, Section 3 discusses the anisotropic Allen-Cahn equation (1.4) and establishes the existence (via a time discretization) and regularity of weak solutions. Section 4 is devoted to the proof of the conditional convergence theorem. Finally, Section 5 covers the weak-strong uniqueness theorem.

Notation
Throughout the paper let d ≥ 2 be the ambient dimension. We consider the equations with periodic boundary conditions, i.e., for the domain we will always choose the flat torus The i-th unit vector will be denoted by e i , and the identity matrix in dimension d will be written as I d ∈ R d×d . For the scalar products of vectors a, b ∈ R d and of matrices The symbol ∇ is reserved for derivatives with respect to the space variable x ∈ Ω. For a vector field X : Derivatives with respect to variables p ∈ R d will be denoted by D = D p . In contrast to ∇, Dσ(ν) will denote the column vector.
For a set A ⊂ R d , χ A is the characteristic function taking the value 1 on A and 0 in the complement.

Anisotropies and anisotropic mean curvature
In this section, we develop the necessary mathematical preliminaries for the rest of the paper. We introduce the notion of admissible anisotropies in Subsection 2.1, and provide a calculation clarifying the formal view of anisotropic mean curvature flow as a gradient flow in Subsection 2.2. Finally, we show in the smooth setting how the anisotropic curvature can be reinterpreted via integration by parts, allowing us to introduce a distributional solution for anisotropic mean curvature flow in Subsection 2.3.
(iii) and (iv) follow from the positive 1-homogeneity and continuity resp. smoothness of σ.
(v) is a result of (iv) and the fundamental theorem of calculus, cf. [30, Section 1.7.2].
A useful trick with regard to the Euclidean metric $|\cdot|$ is to control quadratic errors for unit vectors via the inequality $|p - p'|^2 \leq 2(1 - p \cdot p')$ for $|p| = 1$ and $|p'| \leq 1$, with equality if and only if $|p'| = 1$. In order to introduce a tilt excess functional for anisotropic mean curvature flow, we are interested in an anisotropic counterpart to the above inequality. The suitable anisotropic inequality bounds squared distances $|p - p'|^2$ by a term of the form $\sigma(p) - |p'|\, D\sigma(p') \cdot p$. However, since the mapping $p' \mapsto |p'|\, D\sigma(p')$ is, in general, not continuously differentiable at $p' = 0$, we will use a truncated version instead. To this end, we fix a cutoff function $\psi \in C^\infty([0, \infty))$ satisfying The following lemma contains two estimates featuring the truncated version.
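Before turning to the lemma, the Euclidean estimate for a unit vector $p$ and a vector $p'$ with $|p'| \leq 1$ can be verified by expanding the square. The display below is a sketch of that standard computation, under the assumption that the Euclidean inequality meant above is $|p - p'|^2 \leq 2(1 - p \cdot p')$; the only inequality used is $|p'|^2 \leq 1$, which also explains the equality case $|p'| = 1$.

```latex
\begin{align*}
|p - p'|^2
  &= |p|^2 - 2\, p \cdot p' + |p'|^2
   = 1 - 2\, p \cdot p' + |p'|^2 \\
  &\le 2 - 2\, p \cdot p'
   = 2\,(1 - p \cdot p'),
\end{align*}
% with equality if and only if |p'|^2 = 1.
```

Formally replacing $1 - p \cdot p'$ by $\sigma(p) - |p'|\, D\sigma(p') \cdot p$ gives the shape of the anisotropic counterpart described in the text, since both expressions agree for $\sigma = |\cdot|$ and $|p| = 1$.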

Lemma 2.4.
Let σ be an admissible surface tension. There exist constants c σ , C σ > 0 depending only on σ such that for all p, p ′ ∈ R d such that |p| = 1 and |p ′ | ≤ 1, and for all p, p ′ ∈ R d such that |p| = 1 and |p ′ | ≤ 1.
Proof. Variants of (i) and (ii) were provided by Dziuk [ For inequality (i), we consider two cases with respect to p ′ : First, if Dσ(p ′ ) · p < σ(p) 2 or p ′ = 0, we use the estimate |p − p ′ | 2 ≤ 4 to obtain The uniform convexity assumption (S4) can equivalently be stated as follows (see [30,Remark 1.7.5]): There exists a constant σ > 0 such that for all p * = 0. We use a second-order Taylor expansion around p ′ |p ′ | and write the remainder in terms of an intermediate point 1]. Together with the convexity property (2.6), it follows that (2.7) Furthermore, the assumption Dσ(p ′ ) · p ≥ σ(p) 2 allows us to compute Finally, a combination of (2.7) and (2.8) together with an application of Young's inequality yields which completes the proof of (i) in the second case.
For inequality (ii), we distinguish two cases again. First, in the easier case |p − p ′ | ≥ 1, we can estimate and, therefore, for all t ∈ [0, 1]. As in the proof of inequality (i) above, we introduce a second-order Taylor expansion around p ′ |p ′ | with an intermediate point 1]. Using this expansion, Lemma 2.3(ii), (vi), and Young's inequality, we obtain from which (ii) follows in the case |p − p ′ | < 1.
The following fact on the duality of σ and σ • will be used at a later point. Then where the supremum is taken over

Anisotropic mean curvature and surface energy
In the remainder of this section, we assume that (σ, µ) is an admissible pair of anisotropies according to Definition 2.1. Similarly to [7], we introduce the σ-mean curvature of a hypersurface Γ as the surface divergence of the Cahn-Hoffman vector of the outer unit normal ν, i.e., $H_\sigma = \operatorname{div}_\Gamma(D\sigma(\nu))$. For the notion of the Cahn-Hoffman vector, see [30, Section 1.3]. The surface divergence is given by $\operatorname{div}_\Gamma X := \operatorname{div} X - \nu \cdot \nabla X\, \nu$. Observe that, for every $C^1$-extension of the normal ν to the whole space and every $x \in \Gamma$, we have
$H_\sigma = \operatorname{div}(D\sigma(\nu)) - \nu \cdot D^2\sigma(\nu)\, \nabla\nu\, \nu = \operatorname{div}(D\sigma(\nu))$,
where the last step uses the symmetry of $D^2\sigma$ and the fact that $D^2\sigma(p)\,p = \frac{d}{ds}\big|_{s=0} D\sigma(e^s p) = \frac{d}{ds}\big|_{s=0} D\sigma(p) = 0$ for all $p \in \mathbb{R}^d \setminus \{0\}$ by the positive 0-homogeneity of $D\sigma$, Lemma 2.3(iv).
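The homogeneity identities used here can be checked on a concrete example. The following sketch is an illustration only: it assumes a hypothetical elliptic surface tension $\sigma(p) = \sqrt{a_1 p_1^2 + a_2 p_2^2}$ in dimension $d = 2$ with hand-computed derivatives, and verifies Euler's identity $D\sigma(p) \cdot p = \sigma(p)$, the 0-homogeneity $D\sigma(\lambda p) = D\sigma(p)$ for $\lambda > 0$, and the relation $D^2\sigma(p)\,p = 0$.

```python
import math

A = (2.0, 1.0)  # hypothetical diagonal weights of the elliptic norm

def sigma(p):
    """sigma(p) = sqrt(a1*p1^2 + a2*p2^2), positively 1-homogeneous."""
    return math.sqrt(A[0] * p[0] ** 2 + A[1] * p[1] ** 2)

def d_sigma(p):
    """Gradient D sigma(p) = A p / sigma(p) (the Cahn-Hoffman vector at p)."""
    s = sigma(p)
    return (A[0] * p[0] / s, A[1] * p[1] / s)

def d2_sigma_times(p, q):
    """Hessian of sigma at p applied to q: A q / sigma - (A p)(A p . q) / sigma^3."""
    s = sigma(p)
    ap = (A[0] * p[0], A[1] * p[1])
    aq = (A[0] * q[0], A[1] * q[1])
    ap_dot_q = ap[0] * q[0] + ap[1] * q[1]
    return tuple(aq[i] / s - ap[i] * ap_dot_q / s ** 3 for i in range(2))

p = (0.6, -0.8)

# Euler's identity for 1-homogeneous functions: D sigma(p) . p - sigma(p) = 0.
grad = d_sigma(p)
euler_gap = grad[0] * p[0] + grad[1] * p[1] - sigma(p)

# 0-homogeneity of the gradient: D sigma(3p) = D sigma(p).
scaled_grad = d_sigma((3.0 * p[0], 3.0 * p[1]))

# Consequence used in the text: D^2 sigma(p) p = 0.
hess_p = d2_sigma_times(p, p)
```

The same check applies to any smooth positively 1-homogeneous $\sigma$ once its derivatives are available; the elliptic norm is chosen here only because its derivatives have a closed form.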
Furthermore, we define the anisotropic surface energy E as in (1.2). The relation between the anisotropic mean curvature and anisotropic surface energy becomes clear from the following theorem, which deals with the first variation and direction of steepest descent of the functional E: Theorem 2.6. Let A ⊂ R d be a bounded open set with C 2 -boundary, and let ν denote the outer unit normal on ∂A. Given a compactly supported vector field B ∈ C 1 c (R d ) d , we define a one-parameter family of diffeomorphisms  This theorem helps us to justify the gradient flow structure for anisotropic mean curvature flow as introduced earlier via (1.2) and (1.3): Considering that tangential components of the velocity B do not contribute to the variation of A, let us restrict ourselves to velocities of the form B = −λ sgn(H σ )Dσ(ν), where λ ∈ C(∂A; R ≥0 ). The normal component of such a vector field B is V = B · ν = −λ sgn(H σ )σ(ν), and the metric term becomes which is in accordance with (1.3). The velocity of steepest descent which was given in Theorem 2.6 satisfies , which (formally) verifies the gradient flow structure of (1.1).

Distributional solutions to anisotropic mean curvature flow
We introduce a distributional formulation for (1.1) that was proposed in [39]. The idea behind this formulation is to encode the σ-mean curvature by a (d × d)-matrix via an integration by parts as follows: (2.14) To see this, we extend the normal ν to a vector field $\nu \in C^1(\mathbb{R}^d)^d$, e.g., as a truncation of the normalized gradient $\frac{\nabla \operatorname{sdist}}{|\nabla \operatorname{sdist}|}$ of the signed distance function.
There exists an open neighborhood U of ∂A where $\nu \neq 0$, and in this neighborhood we can decompose the vector field B as The restriction of $B_1$ to ∂A is tangential since, by Lemma 2.3(v), we have $B_1 \cdot \nu = 0$. By the divergence theorem on the closed surface ∂A, we obtain (2.15) In this computation, the third equality follows by adding zero and using that $\nu \cdot \nabla\nu\, B_1 = B_1 \cdot \nabla\big(\tfrac{1}{2}|\nu|^2\big) = 0$. For the fourth equality observe that $(\nabla B_1)^T \nu + (\nabla\nu)^T B_1 = \nabla(B_1 \cdot \nu) = 0$ since $B_1$ is tangential. Finally, the fifth equality holds true because the second fundamental form is symmetric for all $x \in \partial A$ (see [30, Section 1.3]).
On the other hand, we have From (2.15) and (2.16) we obtain (2.14).
Using the integration by parts (2.14), we arrive at a distributional formulation for anisotropic mean curvature which resembles the isotropic version due to Luckhaus and Sturzenhecker [47]. Following [39], our definition for BV solutions to anisotropic mean curvature flow also includes an optimal energy dissipation inequality, which alludes to the gradient flow structure of the problem.
which is the normal velocity in the sense that for all ζ ∈ C 1 (Ω×[0, T ]) and T ′ ∈ (0, T ]: and (iii) the function χ satisfies the optimal energy dissipation inequality A prime example of anisotropic mean curvature flow is the following self-similar solution, which generalizes the evolution of shrinking spheres by (isotropic) mean curvature flow. This example of motion by anisotropic mean curvature flow is derived in [30, Section 1.7.2]. Example 2.9 (The Wulff shape). Let σ be an admissible surface tension, and suppose that µ = σ. The Wulff shape associated with σ is the set In Theorem 4.1, we will prove convergence of the anisotropic Allen-Cahn equation (1.4) to a distributional solution of anisotropic mean curvature flow in the sense of Definition 2.8, but first we must introduce an appropriate notion of solution for (1.4).

The anisotropic Allen-Cahn equation
The goal of this section is to construct solutions to the anisotropic Allen-Cahn equation (1.4) and to establish spatial $H^2$-regularity for these solutions.
From now on, our assumptions on the double-well potential W are While (W1) and (W2) are common assumptions on double-well potentials, (W3) is motivated by the desirable property of exponential convergence of solutions to (1.4) to the wells far away from the diffuse interface and is not necessary for the main results of existence and sharp-interface limit. This assumption was used in a similar way by Sternberg [52]. A reference which makes use of the exponential convergence is [24], where Fischer, Simon, and the first author derive a convergence rate for the sharp-interface limit in the isotropic case.
Assumption (W4) is needed in the proof of existence of solutions to guarantee the convergence of the approximate energies. Assumption (W5) is dispensable if one only considers solutions u ε to (1.4) that satisfy 0 ≤ u ε ≤ 1 almost everywhere.
Up to a linear scaling, a possible choice for W is the standard double-well potential $W(s) = \frac{9}{16}\big(1 - s^2\big)^2$. Here, the prefactor $\frac{9}{16}$ is chosen to fix the value of the constant $c_0$.
Remark 3.1. Let X be a Hilbert space and λ > 0. A function $F : X \to [0, \infty]$ is called λ-convex if the following equivalent conditions are satisfied: (ii) for all $x, y \in X$ and $\mu \in (0, 1)$, we have
Furthermore, we will always assume that (σ, µ) is an admissible pair of anisotropies in accordance with Definition 2.1. The information on the surface tension σ and mobility µ is contained in the functions f, g as defined in (1.5) and (1.6), respectively. We remark that there is some freedom with regard to the choice of g: Since $\frac{\sigma(p)}{\mu(p)}$ is not defined for p = 0, adding +1 in the numerator as well as the denominator (cf. (1.6)) is a means of avoiding a singularity at 0. Our existence and conditional convergence statements rely merely on the following properties of g: Letting g be as in (1.6), the following holds:
The following weak solution concept for the anisotropic Allen-Cahn equation combines an integration by parts in space (ii) with an optimal energy dissipation identity (iii), which hints at the gradient flow structure of (1.4). Prescribing the initial data as in

Definition 3.3 (Solutions to the anisotropic Allen-Cahn equation). Let ε > 0 and
is a solution to the anisotropic Allen-Cahn equation with initial data u ε,0 if We comment on the typical behavior of the anisotropic Allen-Cahn equation (1.4).
Remark 3.4. The reaction term $-\frac{1}{\varepsilon^2} W'(u_\varepsilon)$ forces the solution $u_\varepsilon$ towards the wells of W, i.e., towards the set {0, 1}. The interplay of the reaction term with the anisotropic diffusion term $-\operatorname{div}(Df(-\nabla u_\varepsilon))$ results in a transition layer of width $O(\varepsilon)$ whose shape also depends on the direction of the approximate outer normal $-\frac{\nabla u_\varepsilon}{|\nabla u_\varepsilon|}$. More precisely, the typical width of the transition layer is $\varepsilon\,\sigma\big(-\frac{\nabla u_\varepsilon}{|\nabla u_\varepsilon|}\big)$. To see this, one can choose the 1-dimensional stationary ansatz with $\nu \in S^{d-1}$ being a fixed unit-length vector. Together with a monotonicity assumption on the transition, this admits the solution where Θ is the unique solution to With this in mind, we note that the weighted $L^2$-metric (1.8) depends on σ and µ, whereas anisotropic mean curvature flow is a gradient flow with respect to the metric (1.3), which depends only on µ. This can be explained by the idea that, in the anisotropic Allen-Cahn equation, the metric (1.8) has to compensate for the typical width of the transition layer varying with the orientation of the normal.
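For the quartic double-well $W(s) = \frac{9}{16}(1 - s^2)^2$ mentioned in this section (wells at $\pm 1$, i.e., before the linear rescaling to wells at 0 and 1), two quantities can be checked numerically. The sketch below rests on two assumptions that are not confirmed by the text as extracted: that the surface constant is the integral of $\sqrt{W}$ between the wells, and that λ-convexity of $W$ means convexity of $s \mapsto W(s) + \frac{\lambda}{2} s^2$. The Simpson rule helper `simpson` is an illustrative utility, not part of the paper.

```python
import math

def W(s):
    """Quartic double-well with wells at -1 and 1."""
    return 9.0 / 16.0 * (1.0 - s * s) ** 2

def W_second(s):
    """Second derivative W''(s) = (9/16) * (12 s^2 - 4)."""
    return 9.0 / 16.0 * (12.0 * s * s - 4.0)

def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * func(a + k * h)
    return total * h / 3.0

# Assumed normalization: integral of sqrt(W) between the two wells.
c0 = simpson(lambda s: math.sqrt(W(s)), -1.0, 1.0)

# Smallest curvature of W on a grid containing s = 0; W + (lam/2) s^2 is
# convex exactly when lam >= -min W''.
min_W2 = min(W_second(-1.5 + 3.0 * k / 3000) for k in range(3001))
lam = -min_W2
```

Under these assumptions the prefactor $\frac{9}{16}$ yields the value 1 for the integral, and the quartic potential is λ-convex with $\lambda = \frac{9}{4}$.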

Existence of weak solutions
To prove this existence theorem, we exploit the gradient-flow structure of the equation and construct solutions via a minimizing movements scheme. To this end, we consider an approximation of the PDE (1.4) that replaces the time derivative by difference quotients: Given a time-step size h > 0, let us look for functions $\{u_h^n\}_{n \in \mathbb{N}_0}$ that solve In fact, equation (3.9) is the strong Euler-Lagrange equation of the variational problem $u_h^n \in \arg\min$ where we recall the energy (1.7) and the inner product (1.8) for which we use the notation in which the dependence on ε is suppressed again. Equation (3.9) does not precisely resemble an implicit Euler scheme as it features the explicit term $g(-\nabla u_h^{n-1})$ rather than the implicit term $g(-\nabla u_h^n)$. Instead, (3.9) can be viewed as an implicit-explicit splitting discretization scheme of (1.4). This is mirrored in the minimization problem (3.11) by the occurrence of the metric $(\cdot, \cdot)_{u_h^{n-1}}$, which is taken with respect to the constant point $u_h^{n-1}$ and therefore does not depend on u. Thanks to this choice, one can immediately prove that (3.11) admits a unique solution as soon as h is small enough, using the direct method in the calculus of variations:
Lemma 3.6. Let $E_\varepsilon$ be defined as in (1.7) and let assumptions (W1)-(W5) hold. It follows that: (Ω) and µ ∈ [0, 1]. We will use the equivalent formulation (ii) in Remark 3.1, so that we have to show that for all $u, v \in L^2(\Omega)$ and $\mu \in (0, 1)$. We can assume without loss of generality that $u, v \in \operatorname{dom}(E_\varepsilon)$. Exploiting the convexity of f as well as the λ-convexity of W (W4), we obtain
(ii) This follows from (i) and quantifying the strong convexity of the metric term: Given $u, v \in L^2(\Omega)$ and $\mu \in (0, 1)$, we compute λ .
(iii) This will follow from the lower semicontinuity of the individual terms.
To see that the anisotropic Dirichlet energy is lower semicontinuous, we can assume without restriction that $v_k \in H^1(\Omega)$ for all $k \in \mathbb{N}$, so that it remains to prove that Without loss of generality, $v_k$ converges to v weakly in $H^1(\Omega)$, and lower semicontinuity follows from the convexity of f.
The lower semicontinuity of the potential term Ω W (u) dx and the metric term (iv) The functional to be minimized is proper. Let {v k } k∈N be a minimizing sequence, i.e., (3.14) By Rellich's theorem, there exists a subsequence converging in L 2 (Ω) to a limit function v ∈ L 2 (Ω). The lower semicontinuity (iii) then shows that v ∈ arg min The minimizer v is unique due to the strong convexity (ii).
For the L ∞ -bound we define a truncated version of u n h by and a general fact about Sobolev spaces that v ∈ H 1 (Ω) and Then, using (3.15), the positive definiteness of σ, the monotonicity assumption (W5), and the pointwise inequality By the uniqueness of minimizers, this implies that v = u n h and, therefore, In order to pass to the limit h ց 0, we define one affine and two piecewise constant interpolations: where u 0 h ∈ dom(E) and u n h are defined inductively via (3.11) for n ∈ N. Before we turn to the limiting function u, let us prove the following inequality, which is a consequence of the λ 2ε -convexity of E ε and the minimizing property of the time steps u n h (3.11): Proof. It suffices to prove the corresponding inequality involving the time steps u n h , n ∈ N: Clearly, one can write the interpolations u h , u h , and u h on the left-hand side of (3.17) in terms of the time steps, choosing n = n(t) ∈ N such that t ∈ [(n(t) − 1)h, n(t)h). We also observe that the piecewise affine interpolation u h ∈ C [0, T ]; L 2 (Ω) has a weak time derivative ∂ t u h given by for almost every t ∈ (0, T ). Therefore, (3.17) follows from (3.18) by reformulating the interpolation in terms of the time steps and using v = w(t) for fixed t ∈ (0, T ). To prove (3.18), let v ∈ L 2 (Ω) and δ ∈ (0, 1). Again, we use the equivalent characterization of λ 2ε -convexity in Remark 3.1(ii). For computational ease, first note, that Subtracting (1 − δ)|u n − u n−1 | 2 from both sides of the above equality, we may directly compute where the third step is an application of (3.11), with δv + (1 − δ)u n h acting as a contender for the minimization problem. Taking the limit δ ց 0 finally yields (3.18).
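The mechanism behind the minimizing movements construction can be illustrated in a scalar toy setting. The sketch below is not the scheme (3.9)/(3.11) of the paper: it assumes the state space is $\mathbb{R}$, uses the quartic double-well $W(s) = \frac{9}{16}(1 - s^2)^2$ in place of the energy $E_\varepsilon$, and replaces the weighted metric by the plain squared distance. The names `mm_step` and `run`, the step size, and the starting point are hypothetical. Each step minimizes $u \mapsto W(u) + \frac{1}{2h}(u - u^{n-1})^2$, which is strictly convex for $h < \frac{4}{9}$ because $W'' \geq -\frac{9}{4}$.

```python
def W(s):
    """Toy energy: quartic double-well with wells at -1 and 1."""
    return 9.0 / 16.0 * (1.0 - s * s) ** 2

def W_prime(s):
    """Derivative W'(s) = -(9/4) * s * (1 - s^2)."""
    return -9.0 / 4.0 * s * (1.0 - s * s)

def mm_step(u_prev, h, lo=-2.0, hi=2.0, iters=80):
    """One minimizing-movements step: argmin_u W(u) + (u - u_prev)^2 / (2h).

    For h < 4/9 the objective is strictly convex, so its derivative
    F(u) = W'(u) + (u - u_prev)/h is increasing with a single root,
    which is located by bisection on [lo, hi] (requires lo <= u_prev <= hi).
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if W_prime(mid) + (mid - u_prev) / h > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def run(u0=0.3, h=0.1, steps=300):
    """Iterate the scheme and return the discrete trajectory u^0, u^1, ..."""
    traj = [u0]
    for _ in range(steps):
        traj.append(mm_step(traj[-1], h))
    return traj

traj = run()
energies = [W(u) for u in traj]
```

Comparing the minimizer $u^n$ with the competitor $u^{n-1}$ gives $W(u^n) + \frac{1}{2h}(u^n - u^{n-1})^2 \leq W(u^{n-1})$ at every step, the discrete analogue of an energy dissipation inequality; in this toy run the iterates increase monotonically towards the well at 1.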
An application of the Arzelà-Ascoli theorem, as performed in the lemma below, shows that a subsequence of {u h } h converges strongly, and the derivatives in time and space will turn out to converge weakly in L 2 . However, due to the nonlinear term g(−∇u) in the anisotropic Allen-Cahn equation (1.4), weak convergence of the gradients is not sufficient to show that the limit function solves the equation. We will therefore use a convexity argument to also prove strong convergence of the gradients.
Then there exist a sequence $h \searrow 0$, denoted without relabeling, and a limit function $u \in H^1(\Omega \times (0, T))$ such that Proof. (i) The first step is to prove an $H^1$-bound for the minimizers $u_h^n$, $n \in \mathbb{N}_0$: where the last inequality uses Lemma 3.6(iv) and (3.11) repeatedly.
For any t ∈ [0, T ], the function u h (t) is a convex combination of two minimizers u n−1 h , u n h of (3.11). Applying the triangle inequality yields It follows from Rellich's compact embedding theorem that any sequence {u h k (t)} k∈N with t ∈ [0, T ] and lim k→∞ h k = 0 has a subsequence converging in L 2 (Ω).
To prove equicontinuity in time, which is the second requirement for the Arzelà-Ascoli theorem, we will (as is standard with minimizing movement approximations) show the stronger statement that $u_h \in C([0, T]; L^2(\Omega))$ are $\frac{1}{2}$-Hölder continuous and the Hölder constants are uniformly bounded as $h \searrow 0$. Indeed, for $n \in \mathbb{N}$, we use the definition of the metric $(\cdot, \cdot)_{u_h^{n-1}}$ and the minimization property (3.11) to show that Inequality (3.22) allows us to bound the $L^2$-norms of the weak time derivatives $\partial_t u_h$ as The fundamental theorem of calculus for vector-valued functions together with the Cauchy-Schwarz inequality now yields for all $s, t \in [0, T]$. This is the one-dimensional case of Morrey's inequality.
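The step from an $L^2$-bound on the time derivative to $\frac{1}{2}$-Hölder continuity is the following standard computation, sketched here for the piecewise affine interpolation $u_h$ and times $0 \leq s \leq t \leq T$. It uses only the fundamental theorem of calculus in $L^2(\Omega)$ and the Cauchy-Schwarz inequality in time; the precise constant in the paper's estimate is not reproduced.

```latex
\begin{align*}
\| u_h(t) - u_h(s) \|_{L^2(\Omega)}
  &= \Big\| \int_s^t \partial_t u_h(\tau) \, d\tau \Big\|_{L^2(\Omega)}
   \le \int_s^t \| \partial_t u_h(\tau) \|_{L^2(\Omega)} \, d\tau \\
  &\le |t - s|^{1/2}
     \Big( \int_0^T \| \partial_t u_h(\tau) \|_{L^2(\Omega)}^2 \, d\tau \Big)^{1/2}.
\end{align*}
```

A bound on $\int_0^T \| \partial_t u_h \|_{L^2(\Omega)}^2 \, d\tau$ that is uniform in $h$ therefore gives a uniform Hölder constant.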
By (3.24) and the assumption on the initial data, the Hölder constants are bounded as h ց 0. Families of functions with bounded Hölder constants are equicontinuous. These equicontinuity and precompactness statements allow us to apply the Arzelà-Ascoli theorem, which states that there exists a subsequence h ց 0 and a function . Furthermore, the Hölder continuity carries over to the limit function, i.e., . This pointwise convergence together with the triangle inequality yields In a last step, we argue briefly that the piecewise constant interpolations u h and u h converge uniformly.
(iii) We will prove the desired convergence result for the piecewise constant interpolation $u_h$ first. Integrating (3.20) from 0 to T shows that the gradients $\{\nabla u_h\}$ are bounded in $L^2(\Omega \times (0, T))^d$ as $h \searrow 0$. Just like in (ii), it then follows from the strong convergence $u_h \to u$ in $L^2(\Omega \times (0, T))$ that the limit function u has a weak gradient $\nabla u \in L^2(\Omega \times (0, T))^d$, and that $\nabla u_h \rightharpoonup \nabla u$ as $h \searrow 0$ in $L^2(\Omega \times (0, T))^d$.
The next step is to upgrade this weak convergence statement for the gradients to strong convergence in L 2 . The key ingredient for the argument will be the energy convergence Let us first argue how (3.26) implies strong convergence of the gradients. Since the Cahn-Hilliard energy E ε is made up of a Dirichlet energy and a nonconvex term involving W , we will use the convergence result (3.26) for the Cahn-Hilliard energies and a liminf inequality for the nonconvex part to derive a limsup inequality for the Dirichlet energy, which will allow us to prove the strong convergence statement.
Similarly to the proof of Lemma 3.6(iii), an application of Fatou's lemma yields One can now test (3.26) with a constant test function and combine this convergence statement with (3.27), which yields lim sup The function f is strongly convex, i.e., there exists a constant c > 0 such that where the last step also uses the fact that ∇u h ⇀ ∇u in L 2 (Ω × (0, T )) d . This computation implies that ∇u h → ∇u strongly in L 2 (Ω × (0, T )).
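The passage from a limsup inequality to strong convergence can be sketched as follows. We assume here, as a normalization of our own choosing, that strong convexity is expressed as $f(q) \ge f(p) + Df(p)\cdot(q-p) + \frac{c}{2}|q-p|^2$, that $Df(-\nabla u) \in L^2(\Omega\times(0,T))^d$ (which holds, for instance, if $Df$ has linear growth), and that the limsup inequality for the anisotropic Dirichlet energy has already been established.

```latex
% Sketch (requires amsmath). Apply strong convexity with
% q = -\nabla u_h and p = -\nabla u, so that q - p = -(\nabla u_h - \nabla u).
\begin{align*}
\frac{c}{2}\int_0^T\!\!\int_\Omega |\nabla u_h - \nabla u|^2\,dx\,dt
  \le{}& \int_0^T\!\!\int_\Omega \big( f(-\nabla u_h) - f(-\nabla u) \big)\,dx\,dt \\
  &+ \int_0^T\!\!\int_\Omega Df(-\nabla u)\cdot(\nabla u_h - \nabla u)\,dx\,dt .
\end{align*}
% First term:  its lim sup as h -> 0 is <= 0 (assumed limsup inequality).
% Second term: tends to 0 because \nabla u_h converges weakly to \nabla u in L^2
%              and Df(-\nabla u) is a fixed L^2 function.
```

Taking the limit superior on both sides gives $\nabla u_h \to \nabla u$ strongly in $L^2$; the value of $c$ plays no role beyond being positive.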
In order to prove (3.26), it suffices to consider nonnegative test functions ζ ∈ L ∞ (0, T ; R ≥0 ). We will prove a liminf inequality and a limsup inequality separately.
On the one hand, by Fatou's lemma, the lower semicontinuity of E ε , and (i), we obtain On the other hand, choosing w(x, t) := u(x, t) in (3.17) and integrating against ζ yields In the limit h ց 0, the second integral on the right-hand side vanishes since u h → u in L 2 (Ω × (0, T )). Thus, We have shown that ∇u h → ∇u in L 2 (Ω × (0, T )) as h ց 0. As for ∇u h , one computes where the last integral vanishes as h ց 0 due to the continuity of translation.
We are now ready to pass to the limit h ց 0 in (3.17), which will eventually allow us to prove Theorem 3.5. This is the content of the following lemma: Lemma 3.9. Let w ∈ L 2 (Ω×(0, T )) and ζ ∈ L ∞ (0, T ) such that ζ ≥ 0 almost everywhere. Then In other words, for every w ∈ L 2 (Ω × (0, T )), the inequality 4ε Ω |w − u| 2 dx holds true for almost every t ∈ (0, T ).
Proof. First, by the weak convergence (3.26), it follows that for any function ζ ∈ L ∞ (0, T ) such that ζ ≥ 0. Second, for the term involving the time derivative, we know from Lemma 3.8(ii) that ∂ t u h ⇀ ∂ t u in L 2 (Ω × (0, T )). Thus, to derive the convergence of the integrals, it suffices By Lemma 3.8(iii), we know that ∇u h , ∇u h → ∇u in L 2 (Ω × (0, T )). Since g is a continuous function, every subsequence of g(−∇u h ) as h ց 0 has a further subsequence that converges pointwise almost everywhere to g(−∇u). By the uniform boundedness of g and the dominated convergence theorem, it follows that in L 2 (Ω) for almost all t ∈ (0, T ).
It remains to show the L 2 -convergence on the product space Ω × (0, T ). Using the generalized dominated convergence theorem with 0, T )). In total, we obtain Third, it follows from Lemma 3.8(i) that Integrating (3.17) against ζ, taking the limit h ց 0, and plugging in (3.32)-(3.34) proves the desired inequality.
With the help of Lemma 3.9, it can be seen that −∂ t u(t) lies in the u(t)-subdifferential ∂ u(t) E ε [u(t)] for almost every t ∈ (0, T ). Thus, it seems plausible that the limiting trajectory u(t) is a gradient flow for E ε .
Proof of Theorem 3.5. Choosing u 0 h = u 0 , one can construct the functions u h , u h , and u h by Lemma 3.6 and (3.16). Invoking Lemma 3.8 yields a subsequence h ց 0 as well as a function u ∈ H 1 (Ω × (0, T ) 1 2 , and such that the three convergence results in Lemma 3.8(i)-(iii) hold true. It remains to be shown that u is a weak solution to the anisotropic Allen-Cahn equation with initial data u 0 in the sense of Definition 3.3.
As for the initial condition, it follows from the uniform convergence in Lemma 3.8(i) and the choice $u_h^0 = u_0$ that where all limits are in $L^2(\Omega)$. For the optimal energy dissipation relation (3.4), let $T' \in (0,T]$ be a fixed time horizon. We define an extension of $u$: where we have replaced $u$ by $\tilde{u}$ on $\Omega\times(0,T')$ several times and used the definition of $\tilde{u}$ in the last line.
Let us now take a ց 0 in the inequality above. As a consequence of the lower semicontinuity of E ε and the L 2 -continuity of the map t → u(t), we have Furthermore, it follows from a general fact about Sobolev functions that (see [34,Lemma 7.23 and proof of Lemma 7.24]). In particular, the integral dxdt is bounded as a ց 0.
Therefore, the limiting inequality of (3.35) as a ց 0 reads (3.36) which is one inequality in the optimal energy dissipation relation (3.4).
For the distributional formulation of the PDE (3.3) let ϕ ∈ C 1 (Ω × [0, T ]) and s > 0. By plugging in ζ ≡ 1 and w = u + sϕ into (3.31), we obtain Dividing by s and taking s ց 0 leads to where the limit and the integrals can be interchanged in the last step because u is essentially bounded. We divide by ε to obtain one inequality in (3.3). The converse inequality follows by taking s < 0, s ր 0.

Regularity of weak solutions
The remainder of this section is devoted to proving a regularity result which states that solutions to the anisotropic Allen-Cahn equation have weak second derivatives in space. We apply a difference quotient method for elliptic regularity. The idea for this proof is taken from [26]. The fact that $f$ is, in general, not twice continuously differentiable on $\mathbb{R}^d$ due to a singularity at 0 will not pose a problem when proving the existence of second derivatives, but it prevents us from deriving an additional PDE for the second derivatives. It suffices to prove the following

Claim. Let $u \in H^1(\Omega)$, and suppose that $u$ is a weak solution to $-\operatorname{div}(Df(-\nabla u)) = h$ for some function $h \in L^2(\Omega)$. More precisely, we assume that $\int_\Omega Df(-\nabla u)\cdot\nabla w\,dx = \int_\Omega h w\,dx$ for all $w \in H^1(\Omega)$. Then $u$ has a weak second derivative in space $D^2 u \in L^2(\Omega)^{d\times d}$ and $\|D^2 u\|_{L^2(\Omega)} \le C\|h\|_{L^2(\Omega)}$.

Theorem 3.10 follows from this claim by choosing $h = 2g(-\nabla u)\partial_t u + \frac{1}{\varepsilon^2}W'(u)$ and slicing in time. Then we have $h(\cdot,t) \in L^2(\Omega)$ for almost every $t \in (0,T)$ since $\partial_t u \in L^2(\Omega\times(0,T))$, $u \in L^\infty(\Omega\times(0,T))$, and $W \in C^1(\mathbb{R})$. The fact that (3.3) holds true for test functions $w \in L^2(0,T;H^1(\Omega))$ instead of just $C^1(\Omega\times[0,T])$ can be shown with a density argument.
Proof of the claim. Given a function $w$ as in the claim, we define the difference quotient In order to deal with the right-hand side, we will use the following identity for $i = 1, 2, \dots, d$ and for almost every $x \in \Omega$: where the shorthand notation in the last line is to be read as If the line segment between $\nabla u(x - se_l)$ and $\nabla u(x)$ does not contain the origin, then $f$ is twice continuously differentiable in a neighborhood of the line segment, so that (3.39) follows immediately from the fundamental theorem of calculus. On the other hand, if $0 \in \operatorname{conv}(\{\nabla u(x - se_l), \nabla u(x)\})$, then (3.39) can be deduced by decomposing the line segment into two parts and applying the fundamental theorem of calculus on each part.
By the strong convexity of f , there exists a constant c > 0 such that Using the definition of A s ij , we obtain for almost every x ∈ Ω.
One can now choose $w = D^{-s}_l u$ in (3.38) and use (3.39) with (3.40) as is standard (see, e.g., [34, Lemma 7.23]). As $\nabla D^{-s}_l u \rightharpoonup \partial_l \nabla u$ as $s \to 0$, this concludes the claim.
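For orientation, the difference quotient argument can be sketched in the simplest situation. The conventions below are our own and may differ from (3.38)-(3.40) in sign and indexing: we write $D^s_l v(x) := \frac{1}{s}(v(x+se_l)-v(x))$, assume that the segment between the two gradients avoids the origin, assume strong convexity in the form $D^2f(p)\eta\cdot\eta \ge c|\eta|^2$ for $p \neq 0$, and use that $\Omega$ is the flat torus.

```latex
% Sketch (requires amsmath). Notation (ours): p_0 := -\nabla u(x), p_1 := -\nabla u(x + s e_l).
\begin{align*}
Df(p_1) - Df(p_0)
  &= \int_0^1 \frac{d}{d\tau}\, Df\big(p_0 + \tau(p_1 - p_0)\big)\,d\tau
   = A^s(x)\,(p_1 - p_0), \\
A^s(x) &:= \int_0^1 D^2 f\big(p_0 + \tau(p_1 - p_0)\big)\,d\tau,
\qquad A^s(x)\,\eta\cdot\eta \ge c\,|\eta|^2 \quad \text{for all } \eta\in\mathbb{R}^d .
\end{align*}
% Dividing by s and using (p_1 - p_0)/s = -\nabla D^s_l u(x):
\[
D^s_l\big[Df(-\nabla u)\big] = -A^s\,\nabla D^s_l u .
\]
% Test the weak equation with w = D^{-s}_l v, where v := D^s_l u, and use the
% discrete integration by parts \int F\, D^{-s}_l G = -\int (D^s_l F)\, G on the torus:
\begin{align*}
c\,\|\nabla v\|_{L^2(\Omega)}^2
  \le \int_\Omega A^s \nabla v\cdot\nabla v\,dx
  = \int_\Omega h\, D^{-s}_l v\,dx
  \le \|h\|_{L^2(\Omega)}\,\|\nabla v\|_{L^2(\Omega)} .
\end{align*}
```

This yields $\|\nabla D^s_l u\|_{L^2(\Omega)} \le \frac{1}{c}\|h\|_{L^2(\Omega)}$ uniformly in $s$, which is the bound needed to extract a weak limit of the difference quotients.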

Convergence of the anisotropic Allen-Cahn equation to anisotropic mean curvature flow
The goal of this section is to prove a conditional convergence result for the sharp-interface limit of the anisotropic Allen-Cahn equation. In order to conclude that the limit is a BV solution to (1.1) in the sense of Definition 2.8, a critical assumption is the convergence of the time-integrated energies. A statement of the result and a sketch of the proof are included in the notes [39]. Here, we complete the proof and correct a mistake with respect to the phase-field equation (1.4): In [39], the function $g$ was defined as $g(p) = \frac{|p|}{\mu(p)}$, thereby missing the effect of the anisotropic surface tension $\sigma$ on $g$.
as ε ց 0, and that • there is a uniform L ∞ -bound on the initial conditions, Then there exists a subsequence ε ց 0 as well as a function

(4.3)
then u is a distributional solution to anisotropic mean curvature flow with initial condition u(0) = u 0 in the sense of Definition 2.8.
Similarly to [47], where a minimizing movements construction for isotropic mean curvature is performed, the intuitive meaning of the assumption of energy convergence (4.3) is that no surface area is lost in the limit.
We will first argue by compactness that a limiting function u exists. Assumption (4.3) then allows us to prove an equipartition of energy between the anisotropic Dirichlet energy and the nonconvex potential energy as ε ց 0. In addition, this assumption guarantees the existence of a normal velocity V in the sense of (2.18). We introduce a relative entropy functional which serves as a tilt excess. Using this tilt excess to bound the occurring error terms, we will derive the weak formulation (2.19) from the distributional formulation (3.3) and the optimal energy dissipation inequality (2.20) from (3.4).

Compactness
To prove the compactness statement for the solutions {u ε } ε>0 to the anisotropic Allen-Cahn equation from Theorem 3.5, we will first show a W 1,1 -bound for the compositions of a suitable continuous function with u ε . This argument was performed, for example, by Fonseca and Tartar [27] in the isotropic case (see also [44]). However, some simplifications are possible since we assume an L ∞ -bound on the functions {u ε } instead of prescribing growth conditions on the double-well potential W .

Lemma 4.2. Let u ε be solutions of the anisotropic Allen-Cahn equation as in Theorem 3.5. Under the assumptions of Theorem 4.1 except (4.3), there exist a subsequence ε ց 0 and a function
(4.4)

The first step of the proof is to show that $\limsup_{\varepsilon\searrow0} \|\phi\circ u_\varepsilon\|_{W^{1,1}(\Omega\times(0,T))} < \infty$. For this step, we observe that, by assumption (4.2), there exists a finite constant $R > 0$ such that $u_\varepsilon(x,t) \in [-R,R]$ for almost every $(x,t) \in \Omega\times(0,T)$ and for all sufficiently small $\varepsilon > 0$. Using the continuity of $\phi$, we first estimate for $\varepsilon > 0$ small enough. Furthermore, the chain rule for Sobolev functions is applicable to the compositions $\phi\circ u_\varepsilon$ even though $\phi$ need not be globally Lipschitz continuous: By virtue of its definition as a primitive function, $\phi$ is continuously differentiable. In particular, $\phi'|_{[-R,R]}$ is bounded, and we can assume without restriction that $\phi'$ is bounded. Thus, one can estimate for $\varepsilon > 0$ sufficiently small. Lastly, a similar argument for small $\varepsilon$ yields where the fourth inequality uses the optimal energy dissipation identity (3.4) in the first and in the second factor. It is known from assumption (4.1) that where the second estimate makes use of the fact that $t \mapsto E_\varepsilon[u_\varepsilon(t)]$ is nonincreasing by (3.4). Thus, the estimates (4.6)-(4.8) suffice to prove (4.5).
The following argument for the Hölder continuity $v \in C^{0,\frac{1}{2}}([0,T];L^1(\Omega))$ is adapted from [35, Lemma 2], where Hensel and the first author deal with the Hölder continuity in the isotropic case: Let $0 \le s \le t \le T$. If we proceed as in (4.8), but only integrate from $s$ to $t$, we find There exists a null set $N \subset (0,T)$ such that, for all $s, t \in (0,T)\setminus N$, we can pass to the limit $\varepsilon\searrow0$ to obtain This allows us to redefine $v$ on the null set $\Omega\times N$ so that $v \in C^{0,\frac{1}{2}}([0,T];L^1(\Omega))$. We identify $\phi\circ u_\varepsilon$ with their Hölder continuous representatives due to (4.12). If a sequence of uniformly Hölder continuous functions converges pointwise almost everywhere, it follows that the sequence converges pointwise everywhere. In particular, we can upgrade the $L^1(\Omega)$-convergence for a.e. $t \in (0,T)$ in (4.11) to $L^1(\Omega)$-convergence for all $t \in [0,T]$.
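The time-localized estimate can be sketched under explicit assumptions, all of which are our reading of the setup rather than statements taken from the text: $\phi' = \sqrt{W}$, the energy density is $\frac{\varepsilon}{2}f(-\nabla u) + \frac{1}{2\varepsilon}W(u)$, and the dissipation controls $\int_0^T\!\int_\Omega \varepsilon|\partial_t u_\varepsilon|^2\,dx\,dt \le C_{\mathrm{diss}}$ for a hypothetical constant $C_{\mathrm{diss}}$ (this requires a lower bound on $g$).

```latex
% Sketch (requires amsmath), for 0 \le s \le t \le T.
\begin{align*}
\int_\Omega \big|\phi\circ u_\varepsilon(t) - \phi\circ u_\varepsilon(s)\big|\,dx
  &\le \int_s^t\!\!\int_\Omega \sqrt{W(u_\varepsilon)}\,|\partial_t u_\varepsilon|\,dx\,d\tau
   && \text{(chain rule, } \phi' = \sqrt{W}\text{)} \\
  &\le \Big(\int_s^t\!\!\int_\Omega \tfrac{1}{\varepsilon}W(u_\varepsilon)\,dx\,d\tau\Big)^{1/2}
       \Big(\int_s^t\!\!\int_\Omega \varepsilon|\partial_t u_\varepsilon|^2\,dx\,d\tau\Big)^{1/2}
   && \text{(Cauchy-Schwarz)} \\
  &\le \Big(2\,(t-s)\,\sup_{\tau\in[0,T]} E_\varepsilon[u_\varepsilon(\tau)]\Big)^{1/2}\, C_{\mathrm{diss}}^{1/2}.
\end{align*}
```

The factor $(t-s)^{1/2}$ comes entirely from the potential term, whose time integral is bounded by the length of the interval times the (nonincreasing) energy.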
Using the continuity $\phi\circ u_\varepsilon \in C^{0,\frac{1}{2}}([0,T];L^1(\Omega))$ and the lower semicontinuity of the variation [2, Remark 3.5], we conclude from (4.9) that lim sup i.e., we can also upgrade the essential boundedness of the BV-norm to uniform boundedness in time. From this, one finds that $\phi\circ u_\varepsilon(t) \overset{*}{\rightharpoonup} v(t)$ in $BV(\Omega)$ for all $t \in [0,T]$, and that $v \in L^\infty(0,T;BV(\Omega))$. (This statement includes the measurability of the measure-valued map $t \mapsto \nabla v(t)$ in the sense of [2, Definition 2.25], which follows from the fact that we have $\sup_{t\in[0,T]} \int_\Omega |\nabla v(t)| < \infty$ and $v(t) \to v(t_0)$ in $L^1(\Omega)$ as $t \to t_0$, and therefore $\nabla v(t) \overset{*}{\rightharpoonup} \nabla v(t_0)$ in $\mathcal{M}(\Omega)$ as $t \to t_0$.)

Let us turn to the convergence result for the solutions $u_\varepsilon$. As the function $\sqrt{W}$ is strictly positive except at two isolated points, the primitive function $\phi$ is strictly increasing. Therefore, there exists a continuous inverse $\phi^{-1}$, and we obtain from (4.10) that as $\varepsilon\searrow0$ for almost all $(x,t) \in \Omega\times(0,T)$.
, it follows that the same holds true for $u = \frac{1}{c_0}v$.

Equipartition of energy
The following theorem states that, asymptotically as $\varepsilon\searrow0$, the anisotropic Dirichlet energy $\frac{1}{2}\int_0^T\!\int_\Omega \varepsilon f(-\nabla u_\varepsilon)\,dx\,dt$ and the nonconvex term $\frac{1}{2}\int_0^T\!\int_\Omega \frac{1}{\varepsilon}W(u_\varepsilon)\,dx\,dt$ contribute equally to the Cahn-Hilliard energy. The equipartition of energy also holds true in a localized form. The argument relies crucially on the energy convergence assumption (4.3) and uses a trick introduced by Modica-Mortola for Gamma-convergence of the energies [49] (see also [8]). Equipartition results of this kind have been used to prove conditional convergence in the static or dynamic case since [46,48].
Statements (i) and (ii) can be viewed primarily as preparatory results for the equipartition statements (iii)-(v). (4.3), the following convergence statements hold true in the limit ε ց 0: Proof. (i) It suffices to show that
If ζ ∈ C 1 (Ω × [0, T ]) and 0 ≤ ζ ≤ 1, we can use Young's inequality and the chain rule for Sobolev functions to estimate By Lemma 2.5, we have that We apply (4.18) and the L 1 -convergence φ•u ε → c 0 χ as ε ց 0 to the above estimate, which, recalling Lemma 2.3, yields where the supremum is taken over all To prove the estimate from above, we observe that (4.19) also applies to the function 1 − ζ instead of ζ, and use the energy convergence assumption (4.3). Indeed, The combination of the two inequalities (4.19) and (4.20) yields the first claim.
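The Young-inequality step in (i) is the anisotropic version of the Modica-Mortola trick. A minimal sketch, assuming that the energy density is $\frac{\varepsilon}{2}f(-\nabla u) + \frac{1}{2\varepsilon}W(u)$ with $f = \sigma^2$, that $\sigma$ is positively 1-homogeneous, and that $\phi' = \sqrt{W}$ (the normalization of $\phi$ is our assumption):

```latex
% Sketch (requires amsmath). Young's inequality ab \le a^2/2 + b^2/2 with
% a = \sqrt{\varepsilon}\,\sigma(-\nabla u_\varepsilon), b = \sqrt{W(u_\varepsilon)}/\sqrt{\varepsilon}.
\begin{align*}
\frac{\varepsilon}{2}\,\sigma(-\nabla u_\varepsilon)^2 + \frac{1}{2\varepsilon}\,W(u_\varepsilon)
  &\ge \sigma(-\nabla u_\varepsilon)\,\sqrt{W(u_\varepsilon)} \\
  &= \sigma\big(-\sqrt{W(u_\varepsilon)}\,\nabla u_\varepsilon\big)
   && \text{(positive 1-homogeneity, } \sqrt{W}\ge 0\text{)} \\
  &= \sigma\big(-\nabla(\phi\circ u_\varepsilon)\big)
   && \text{(chain rule, } \phi' = \sqrt{W}\text{)} .
\end{align*}
```

Multiplying by a test function $0 \le \zeta \le 1$ and integrating bounds the localized energy from below by the anisotropic variation of $\phi\circ u_\varepsilon$, to which lower semicontinuity can then be applied.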
(ii) In the same manner as in (i), it suffices to show that The lim inf inequality follows from the chain of inequalities in (4.19).
For the lim sup inequality we can use Young's inequality and (i) to compute lim sup

(iii) By taking $\zeta \equiv 1$ as test functions for the weak-* limits in (i) and (ii), we see that This observation allows us to compute

(iv) Young's inequality yields for all $\delta > 0$. Taking the limit $\varepsilon\searrow0$ and using (iii) as well as (4.21), we obtain lim sup Since $\delta > 0$ is arbitrary, it follows that

(v) The two convergence results follow from adding and subtracting (i) and (iv), respectively.
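The reason energy convergence forces equipartition is that the gap in Young's inequality is a perfect square. The following sketch uses the same assumed normalization as before (energy density $\frac{\varepsilon}{2}\sigma^2 + \frac{1}{2\varepsilon}W$); the abbreviations $a_\varepsilon$, $b_\varepsilon$ are introduced here for illustration only.

```latex
% Sketch (requires amsmath). Abbreviations (ours):
% a_\varepsilon := \sqrt{\varepsilon}\,\sigma(-\nabla u_\varepsilon), \quad
% b_\varepsilon := \sqrt{W(u_\varepsilon)}/\sqrt{\varepsilon}.
\begin{align*}
\tfrac{1}{2}a_\varepsilon^2 + \tfrac{1}{2}b_\varepsilon^2 - a_\varepsilon b_\varepsilon
  &= \tfrac{1}{2}\,(a_\varepsilon - b_\varepsilon)^2 \;\ge\; 0 .
\end{align*}
% If the integrals of (a^2 + b^2)/2 and of a b converge to the same limit, then
% a_\varepsilon - b_\varepsilon \to 0 in L^2, and by the Cauchy-Schwarz inequality
\begin{align*}
\int_0^T\!\!\int_\Omega \big|a_\varepsilon^2 - b_\varepsilon^2\big|\,dx\,dt
  \le \|a_\varepsilon - b_\varepsilon\|_{L^2}\,\|a_\varepsilon + b_\varepsilon\|_{L^2} \to 0,
\end{align*}
% since a_\varepsilon + b_\varepsilon stays bounded in L^2 by the energy bound.
```

Here $a_\varepsilon^2 - b_\varepsilon^2 = \varepsilon f(-\nabla u_\varepsilon) - \frac{1}{\varepsilon}W(u_\varepsilon)$, so the last line is an $L^1$ form of the equipartition statement.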

Construction of the normal velocity
This construction follows the argument in the first step of [42, Proposition 2.10], where it was carried out by Simon and the first author for the (multiphase) isotropic case. The idea is to introduce the velocity as a Radon-Nikodým density $V := \frac{\partial_t\chi}{|\nabla\chi|}$.
For simplicity of notation, we denote the energy density of the anisotropic Cahn-Hilliard energy by The first step of the proof is to show that the Radon-Nikodým theorem is applicable: Given a smooth test function ζ ∈ C ∞ c (Ω×(0, T )), the definition of distributional derivatives and an application of the Cauchy-Schwarz inequality yield It follows from (4.13) that Thus, by taking the limit inferior on both sides of (4.22) and recalling that φ • u ε → c 0 χ in L 1 (Ω × (0, T )) as ε ց 0, we find where the second inequality follows from the optimal energy dissipation inequality (3.4).
It is desirable to reformulate (4.24) in terms of open sets instead of test functions: Given an open set A ⊆ Ω × (0, T ), let us maximize the left-hand side of (4.24) over all ζ ∈ C ∞ c (A) with ζ ≤ 1. This provides us with the inequality Making use of the outer regularity of Radon measures, we see that for all Borel sets E ⊆ Ω × (0, T ) such that ∇χ (E) = 0, i.e., ∂ t χ ≪ ∇χ . By the Radon-Nikodým theorem, there exists a ∇χ -measurable function V such that ∂ t χ = V ∇χ on the open set Ω × (0, T ). In order to prove the square integrability (2.17) of V , we go back to (4.24). Let us first fix a finite number M > 0 and find a sequence {ζ k } k∈N of smooth test functions such that ζ k → V χ {|V |≤M } in L 2 ( ∇χ ) and |ζ k | ≤ M for all k ∈ N. Then it follows by dominated convergence that V ζ k → V 2 χ {|V |≤M } in L 1 ( ∇χ ) and, therefore, Plugging in ζ k and taking the limit k → ∞ in (4.24), we now obtain By rearranging this inequality and taking M → ∞ with the help of the monotone convergence theorem, we find the desired integrability statement, namely The final step is to prove the identity (2.18) (see also [35,Lemma 7]). Given a test function ζ ∈ C 1 (Ω × [0, T ]) and an intermediate time T ′ ∈ (0, T ], we introduce a cutoff η α in time such that and set ζ α (x, t) := ζ(x, t)η α (t). Then, in particular, we have ζ α ∈ C 1 c (Ω × (0, T )) for sufficiently small α > 0. The definition of V as a Radon-Nikodým density yields In the limit α ց 0, the left-hand side integral and the first right-hand side integral converge due to the dominated convergence theorem. For the last integral, we observe ; L 1 (Ω) by Lemma 4.2, and ζ is uniformly continuous, so that the map t → Ω χ(x, t)ζ(x, t)dx is continuous on [0, T ]. We can therefore compute the limits of all three integrals and obtain which proves (2.18).
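The two measure-theoretic steps above follow a common pattern, which can be sketched under a simplifying assumption: we suppose that the bound on test functions takes the form below with a hypothetical constant $C_0$. The precise form of (4.24) may differ, in which case the rearrangement changes accordingly.

```latex
% Sketch (requires amsmath). Assumed bound, for all \zeta \in C_c^\infty(\Omega\times(0,T)):
\[
\Big| \int \zeta \, d(\partial_t\chi) \Big|
  \le C_0 \Big( \int \zeta^2 \, d|\nabla\chi| \Big)^{1/2} .
\]
% Step 1 (absolute continuity): for A open, take the supremum over |\zeta| \le 1, supp \zeta \subset A:
\[
|\partial_t\chi|(A) \le C_0\, \big(|\nabla\chi|(A)\big)^{1/2} ,
\]
% and outer regularity extends this to Borel sets, so |\nabla\chi|(E) = 0 implies |\partial_t\chi|(E) = 0.
% Step 2 (square integrability): with \partial_t\chi = V|\nabla\chi| and \zeta_k \to V\chi_{\{|V|\le M\}},
\begin{align*}
\int V^2 \chi_{\{|V|\le M\}}\, d|\nabla\chi|
  \le C_0 \Big( \int V^2 \chi_{\{|V|\le M\}}\, d|\nabla\chi| \Big)^{1/2}
\;\Longrightarrow\;
\int V^2 \chi_{\{|V|\le M\}}\, d|\nabla\chi| \le C_0^2 .
\end{align*}
```

Since the final bound does not depend on $M$, monotone convergence gives $V \in L^2(|\nabla\chi|)$.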

Relative entropies
A key ingredient to derive the sharp interface limit are the following notions of relative entropies.
Definition 4.5. (i) Let $u = \chi \in BV(\Omega)$ take values in $\{0,1\}$ almost everywhere. As in the definition of the anisotropic surface energy, let $\nu := -\frac{\nabla\chi}{|\nabla\chi|}$ be the measure-theoretic outer unit normal. The relative entropy of $u$ with respect to a vector field $\xi \in C(\Omega)^d$ is where $\psi \in C^\infty([0,\infty))$ is a cutoff function such that the three properties in (2.2) hold true.
(ii) Let ε > 0 and u ε ∈ H 1 (Ω) ∩ L ∞ (Ω). The ε-relative entropy of u ε with respect to a vector field ξ ∈ C(Ω) d is with ∂ * denoting the reduced boundary of a set of finite perimeter.
(ii) One can also show that |∇χ(t)|⊗L 1 (0, T ) = |∇χ| as Radon measures on Ω×(0, T ). Thus, integrating the relative entropy over time yields If ξ is chosen to be a smooth approximation of the normal, then these relative entropies serve as tilt excesses and can be used to control quadratic error terms: Let (4.30) The link between the relative entropy E and the phase-field version E ε is the following convergence statement:

Proof. A direct computation yields
where the first term converges due to Theorem 4.3(ii). The convergence of the second term follows from the observation that , which is a consequence of the bound in (4.7) and the L 1 -convergence φ • u ε → c 0 χ.
While we have only used the first inequality in Lemma 2.4 so far, the second inequality helps us to show that the tilt excess can be made arbitrarily small by approximating the normal ν with suitable vector fields ξ.

Lemma 4.8. For every δ > 0, there exists a smooth vector field
Proof. The outer unit normal $\nu$ is $|\nabla\chi|$-measurable, and the estimate shows that $\nu \in L^2(|\nabla\chi|)$. Since $|\nabla\chi|$ is a Radon measure on $\Omega\times[0,T]$, there exists an approximating sequence Clearly, $\nu$ takes values in the closed convex set $B_1 = \{p \in \mathbb{R}^d : |p| \le 1\}$ almost everywhere with respect to $|\nabla\chi|$, so we can choose $\xi_n$ in a way that ensures that $|\xi_n| \le 1$ on $\Omega\times[0,T]$ for all $n \in \mathbb{N}$. By Lemma 2.4(ii) and the Cauchy-Schwarz inequality, we obtain as $n \to \infty$.
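The two-sided role of the relative entropy as a tilt excess is easiest to see in the isotropic model case $\sigma(p) = |p|$, where (as an illustrative assumption, ignoring the cutoff $\psi$) the relative entropy density reduces to $1 - \xi\cdot\nu$. The anisotropic inequalities of Lemma 2.4 play the role of the two elementary estimates below.

```latex
% Sketch (requires amsmath), isotropic model case: |\nu| = 1 and |\xi| \le 1.
\begin{align*}
|\nu - \xi|^2 &= 1 - 2\,\xi\cdot\nu + |\xi|^2 \;\le\; 2\,(1 - \xi\cdot\nu)
  && \text{(coercivity: the entropy controls the tilt)} \\
1 - \xi\cdot\nu &= \nu\cdot(\nu - \xi) \;\le\; |\nu - \xi|
  && \text{(the tilt controls the entropy)} .
\end{align*}
% Integrating the second estimate against |\nabla\chi| and using Cauchy-Schwarz:
\[
\int (1 - \xi\cdot\nu)\, d|\nabla\chi|
  \le \big(|\nabla\chi|(\Omega\times[0,T])\big)^{1/2}
      \Big( \int |\nu - \xi|^2 \, d|\nabla\chi| \Big)^{1/2} .
\]
```

In particular, any sequence $\xi_n \to \nu$ in $L^2(|\nabla\chi|)$ with $|\xi_n| \le 1$ makes the isotropic relative entropy arbitrarily small.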

Convergence of the curvature term
The following two subsections are dedicated to proving the distributional law of anisotropic mean curvature flow (2.19) for the limit function u. The strategy is to use B · ε∇u ε as a test function in the distributional formulation of the anisotropic Allen-Cahn equation (3.3), where B ∈ C 1 (Ω × [0, T ]) d , and then pass to the limit as ε ց 0 on both sides separately.
Here and in the following subsections, C denotes a generic positive constant that may depend on the pair of anisotropies (σ, µ), the double-well potential W , the time horizon T , and the cutoff function ψ. The constant C is not necessarily the same on every occurrence.
This theorem was first proved by Cicalese, Nagase, and Pisante in [18,Theorem 3.3]. However, we proceed with the alternative strategy of proof proposed in [39,Proposition 4.5].
We define the energy-stress tensor T ε by With this definition, the statement of Theorem 4.9 can be rewritten as a weak-* convergence claim for the energy-stress tensor: Indeed, an integration by parts on the flat torus Ω yields We observe that the right-hand side is exactly the term appearing in (4.32). Thus, in order to prove Theorem 4.9, it suffices to show that T ε * ⇀ c 0 (σ(ν)I d − ν ⊗ Dσ(ν)) ∇χ as R d×d -valued Radon measures on Ω × [0, T ].
As for the first summand of the energy-stress tensor T ε , it follows immediately from the equipartition of energy statement in Theorem 4.3(i) that The following lemma covers the convergence of the second summand of T ε , thereby completing the proof of Theorem 4.9.
For shorter notation, we introduce the approximate outer unit normal ν ε via Applying the product rule to f = σ 2 and exploiting the positive 0-homogeneity of Dσ, one can rewrite the left-hand side of (4.35) as where one can now conveniently insert the vector field ξ at the cost of two errors controlled by the tilt excess E : Let us add zero twice and compute As a consequence of the smoothness and homogeneity of σ and since ψ ≡ 0 in a neighborhood of zero, the map p → ψ(|p|)Dσ(p) is Lipschitz continuous, i.e., |ψ(|q|)Dσ(q) − ψ(|p|)Dσ(p)| ≤ C|q − p| for all p, q ∈ R d . (4.39) The first right-hand side integral of (4.38) can now be estimated as follows: and with the help of the energy convergence assumption (4.3) and Lemma 4.11 below, we obtain lim sup 1 2 T 0 E u(t) ξ(t) dt 1 2 . (4.40) Similarly recalling (4.30), the last integral on the right-hand side of (4.38) can be controlled by the tilt excess via 1 2 . (4.41) As for the remaining terms in (4.38), we have to show that To prove this weak-* convergence statement, we rewrite The first summand converges strongly to 0 in L 1 (Ω × (0, T )): This follows by the Cauchy-Schwarz inequality if we recall that √ ε∇u ε is bounded in L 2 (Ω × (0, T )) as ε ց 0 and εf (−∇u ε ) − 1 ε W (u ε ) → 0 in L 2 (Ω × (0, T )) by Theorem 4.3(iii). It has been shown in (4.6)-(4.8) that φ • u ε is bounded in W 1,1 as ε ց 0. In particular, the total variation |∇u ε | is uniformly bounded as ε ց 0.
In total, we have This concludes the proof of Lemma 4.10 since, appealing to Lemma 4.8, the time-integrated relative entropy can be made arbitrarily small.
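The rewriting at the start of this proof rests on elementary identities for a positively 1-homogeneous surface tension. As a reminder, assuming $\sigma$ is differentiable away from the origin, $f = \sigma^2$, and $\nu_\varepsilon = -\frac{\nabla u_\varepsilon}{|\nabla u_\varepsilon|}$ wherever $\nabla u_\varepsilon \neq 0$:

```latex
% Sketch (requires amsmath). For p \ne 0 and \lambda > 0:
\begin{align*}
Df(p) &= 2\,\sigma(p)\,D\sigma(p)
  && \text{(product rule for } f = \sigma^2\text{)} \\
D\sigma(\lambda p) &= D\sigma(p)
  && \text{(differentiate } \sigma(\lambda p) = \lambda\sigma(p) \text{ in } p\text{)} \\
D\sigma(p)\cdot p &= \sigma(p)
  && \text{(differentiate the same identity in } \lambda \text{ at } \lambda = 1\text{)} .
\end{align*}
% Applied to p = -\nabla u_\varepsilon = |\nabla u_\varepsilon|\,\nu_\varepsilon:
\[
Df(-\nabla u_\varepsilon)
  = 2\,\sigma(-\nabla u_\varepsilon)\,D\sigma(\nu_\varepsilon)
  = 2\,|\nabla u_\varepsilon|\,\sigma(\nu_\varepsilon)\,D\sigma(\nu_\varepsilon).
\]
```

The last line is what allows $D\sigma(\nu_\varepsilon)$ to be compared with $\psi(|\xi|)D\sigma(\xi)$ through the Lipschitz estimate (4.39).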
Proof. We apply Lemma 2.4(i), the estimate |ν ε − ξ| ≤ 2, and the Cauchy-Schwarz inequality in order to control the occurring integrals by the anisotropic Cahn-Hilliard energy and the ε-relative entropy. Precisely, One can now take the limit superior on both sides and note that, by assumption, we have lim sup εց0 T 0 E ε [u ε (t)]dt < ∞. Thus, applying Lemma 4.7 in the first term and Theorem 4.3(iii) in the second term yields lim sup

Convergence of the velocity term
The analogue of Theorem 4.9 for the left-hand side terms is Similarly to the argument for the curvature term, we replace the approximate outer normal $\nu_\varepsilon$ by a vector field $\xi \in C(\Omega\times[0,T])^d$ such that $|\xi| \le 1$ on $\Omega\times[0,T]$. The resulting error terms can be controlled by the relative entropy. An additional error term arises as we replace the term $g(-\nabla u_\varepsilon)$ by its 'asymptotic' version $\frac{\sigma(\nu_\varepsilon)}{\mu(\nu_\varepsilon)}$: Expanding both sides of (4.44), we obtain where we have used the identity $-\sigma(\nu_\varepsilon)\nabla u_\varepsilon = \sigma(\nu_\varepsilon)|\nabla u_\varepsilon|\nu_\varepsilon = \sigma(-\nabla u_\varepsilon)\nu_\varepsilon$. To see that the third line on the right-hand side converges to zero as $\varepsilon\searrow0$, it suffices to show that as $\varepsilon\searrow0$ for all $\zeta \in C^1(\Omega\times[0,T])$. A density argument then yields the same statement for all $\zeta \in C(\Omega\times[0,T])$, so that one can choose $\zeta = B\cdot\frac{\psi(|\xi|)}{\mu(\xi)}\xi$. In order to prove (4.46), we integrate by parts and write The first integral converges to zero as $\varepsilon\searrow0$ due to the equipartition of energy, Theorem 4.3(iii), and the optimal dissipation. For the other three integrals, we apply the convergence statements for the subsequence $\varepsilon\searrow0$ given in Lemma 4.2, which leads to The distributional criterion (2.18) for the normal velocity yields (4.46), and so it follows that the third line on the right-hand side of (4.45) converges to zero. It remains to estimate the "error" terms in (4.45). For the first integral on the right-hand side, one uses the asymptotic description of $g$ provided by Lemma 3.2(iii): For every $\delta > 0$, there exists a positive constant $R$ such that $\left|g(p) - \frac{\sigma(p)}{\mu(p)}\right| < \delta$ whenever $|p| \ge R$. Therefore, we obtain from which it follows that lim sup Since $\delta > 0$ is arbitrary, we have shown that the first integral on the right-hand side of (4.45) converges to zero as $\varepsilon\searrow0$.
As for the second integral on the right-hand side, we use that the map p → ψ(|p|) It follows with the help of Lemma 4.11 that Again making use of the Lipschitz continuity and Lemma 2.4(i), the last integral on the right-hand side of (4.45) can be estimated as 1 2 . (4.50) Plugging in the results for each right-hand side integral into (4.45) leads to lim sup The proof of Theorem 4.12 is complete once we take into account that, by Lemma 4.8, the time-integrated relative entropy can be made arbitrarily small for suitable choices of vector fields ξ.

Optimal energy dissipation inequality
The strategy to prove this lemma is to take ε ց 0 in the optimal energy dissipation identity (3.4) for the anisotropic Allen-Cahn equation. By the well-preparedness of the initial data (4.1), we have lim εց0 E ε [u ε,0 ] = E 0 . Furthermore, the Γ-convergence E ε Γ −→ E on L 1 (Ω) (see [9,Theorem 3.5] We remark that Bouchitté [9] defines dom(E ε ) = Lip(Ω), so that the liminf inequality for our definition in (1.7) does not immediately follow. However, the liminf inequality is the easier part of the Γ-convergence statement and can be shown by combining the Modica-Mortola argument in Theorem 4.3(i) above with the truncation of W that can be found in [44, Proof of Theorem 1.6].
Therefore, the proof of the lemma reduces to proving the following inequality: We apply Young's inequality in the form a 2 ≥ 2ab − b 2 and add zero multiple times in order to replace g with its asymptotic version and ν ε with ξ, similarly to the approach in (4.45). In this case, we obtain so that, by appealing to Lemma 4.11, we obtain lim sup For the fourth line of the right-hand side of (4.52), a very similar computation yields lim sup (4.56) Finally, it follows from (4.46) and the equipartition of energy, Theorem 4.3(v), that (4.57) It is now desirable to let ξ → ν in L 2 ( ∇χ ) so that T 0 E u(t) ξ(t) dt → 0. This is possible by Lemma 4.8 and Lemma 2.4(i). By the dominated convergence theorem, we also have ψ(|ξ|) σ(ξ) µ(ξ) → σ(ν) µ(ν) in L 2 ( ∇χ ). Under this convergence, it follows from (4.52) and the computations (4.53)-(4.57) that In a last step we let ζ → V σ(ν) in L 2 ( ∇χ ), which yields the lower bound as stated in the claim.
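The use of Young's inequality in the form $a^2 \ge 2ab - b^2$ is a duality argument: a quadratic term is bounded from below by a family of linear functionals, each of which is stable under weak convergence. A generic sketch for a finite measure $\mu$ and $a \in L^2(\mu)$ (the symbols $a$, $b$, $\mu$ here are placeholders and not the paper's anisotropy or velocity):

```latex
% Sketch (requires amsmath). Pointwise, (a - b)^2 \ge 0 gives a^2 \ge 2ab - b^2,
% with equality if and only if a = b. Integrating:
\begin{align*}
\frac{1}{2}\int a^2 \, d\mu
  &\ge \int a\,b \, d\mu - \frac{1}{2}\int b^2 \, d\mu
  \qquad \text{for every } b \in L^2(\mu), \\
\frac{1}{2}\int a^2 \, d\mu
  &= \sup_{b \in L^2(\mu)} \Big( \int a\,b \, d\mu - \frac{1}{2}\int b^2 \, d\mu \Big)
  \qquad \text{(attained at } b = a\text{)} .
\end{align*}
```

This is why the final step lets the test function $\zeta$ approach the optimal choice in $L^2(|\nabla\chi|)$: the lower bound then recovers the full dissipation term.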

Proof of Theorem 4.1
We simply wrap up the proof of the theorem.
Proof. The compactness follows from Lemma 4.2. Under the energy convergence assumption, the distributional formulation for the time derivative (2.18) follows from Lemma 4.4. The optimal energy dissipation relation (2.20) follows from Lemma 4.13. To obtain the distributional formulation for the curvature in (2.19), we use $B\cdot\varepsilon\nabla u_\varepsilon$ with $B \in C^1(\Omega\times[0,T])^d$ as a test function in the distributional formulation of the anisotropic Allen-Cahn equation (3.3). By Theorem 3.10, we have $B\cdot\varepsilon\nabla u_\varepsilon \in L^2(0,T;H^1(\Omega))$, so these functions are admissible test functions in (3.3). To pass to the limit $\varepsilon\searrow0$ on the left-hand side, one applies Theorem 4.12; for the right-hand side, one applies Theorem 4.9.

Weak-strong uniqueness for anisotropic mean curvature flow
The goal of this section is to prove that, as long as a strong solution to anisotropic mean curvature flow (1.1) exists, any BV solution with the same initial data coincides with the strong solution. One needs to require additional regularity for (σ, µ) to make sure that strong solutions will be sufficiently smooth, and we will also rely on the higher regularity of σ in the proof of the weak-strong uniqueness statement. Thus, we assume that σ, µ ∈ C ∞ R d \ {0} . Further, without loss of generality, we let c 0 = 1. Following Hensel and Moser [37, Definition 10], we define strong solutions to (1.1) as follows: The setup in [37, Definition 10 and Remark 15] also suggests that the C ∞ -regularity for (σ, µ), Φ, and ∂A (0) can be relaxed.
We are now ready to formulate the central theorem of this section: The proof for this theorem is modeled after [35,Sections 2.2,4], where Hensel and the first author derive an analogous result for multiphase isotropic mean curvature flow. The key step in this argument is to find a gradient flow calibration (see below) for the smooth evolution {A (t)} t∈[0,T ] . While the existence of a gradient flow calibration is nontrivial in the multiphase case (see [37,Theorem 4] for a gradient flow calibration for multiphase mean curvature flow in d = 2 with constant contact angle), such a calibration can always be constructed explicitly for a smooth two-phase evolution.

Gradient flow calibrations
In the remainder of this section, $C < \infty$ and $c > 0$ denote positive constants ('large' and 'small', respectively) that may depend on the pair of anisotropies $(\sigma,\mu)$, on the time horizon $T$, and on the smooth evolution $\{\mathcal{A}(t)\}_{t\in[0,T]}$. These constants need not be the same on every occurrence.
satisfying • the approximate evolution equations • the compatibility condition (5.4) • and with ν ∂A (t) denoting the outer unit normal of the set A (t), we have the coercivity conditions Intuitively, ξ is an extension of the outer unit normal ν ∂A (t) (with an additional coercivity property), whereas B extends the normal velocity vector and ϑ is comparable to a signed distance function. The compatibility condition (5.4) encodes the motion by anisotropic mean curvature. The space C 2 1 (Ω × [0, T ]) is defined as Let us collect the key inequalities for gradient flow calibrations that will be used in the proof of Theorem 5.2. .
Proof. (i) The lower bound is precisely the coercivity condition (5.6).
(iv) This is the statement of Lemma 2.4(i).
(v) This is another elementary estimate: With the help of Lemma 2.3(ii), (vi), one computes

(vi) follows from the previous two estimates since

Proof. There exists a constant $\delta > 0$ such that, in the neighborhood the signed distance function $\operatorname{sdist} : U \to (-\delta,\delta)$ and the orthogonal projection $p : U \to \bigcup_{t\in[0,T]}(\partial\mathcal{A}(t)\times\{t\})$ with respect to $\partial\mathcal{A}(t)$ are well-defined and regular (see [34, Lemmas 14.16, 14.17]). We use the sign convention that $\operatorname{sdist}(x,t) < 0$ for $x \in \mathcal{A}(t)\cap B_\delta(\partial\mathcal{A}(t))$.

A Grönwall-type stability estimate
We wish to prove Theorem 5.2 by deriving a Grönwall-type estimate for a suitable quantity. A straightforward way to measure the difference between the calibrated evolution $\mathcal{A}(t)$ and the weak solution $A(t)$ (with $\chi_{A(t)}$ satisfying Definition 2.8) at a given time $t \in [0,T]$ is the bulk error $\mathcal{B}[\chi(t)\,|\,\vartheta(t)] := \int_{\mathcal{A}(t)\triangle A(t)} |\vartheta(x,t)|\,dx$.
Since $|\vartheta(\cdot,t)|$ is strictly positive outside $\partial\mathcal{A}(t)$, and hence almost everywhere in $\mathbb{R}^d$, this bulk error vanishes if and only if $|\mathcal{A}(t)\triangle A(t)| = 0$. However, the available stability estimate for the bulk error is not strong enough to apply Grönwall's lemma immediately: (ii) For every $\delta > 0$, there exists a constant $C(\delta) > 0$, which also depends on the calibrated evolution, such that for all $T' \in [0,T]$.
Proof. See [35,Sections 4.2,4.3], where the argument is carried out for isotropic mean curvature flow. The same argument applies in our anisotropic setting since the definition of the bulk error functional has remained unchanged.
To compensate for the additional error term on the right-hand side of (5.9), we make use of another stability estimate for the relative entropy: A combination of the two stability estimates yields In particular, it follows by Grönwall's lemma that $\mathcal{B}[\chi(T')\,|\,\vartheta(T')] = 0$ for all $T' \in [0,T]$ if $\mathcal{B}[\chi(0)\,|\,\vartheta(0)] + \mathcal{E}[\chi(0)\,|\,\xi(0)] = 0$, which completes the proof of Theorem 5.2.
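For completeness, the Grönwall step can be spelled out. The sketch below assumes that the combined stability estimate has the integral form shown in the first line, with $\Phi$ denoting the sum of bulk error and relative entropy (our abbreviation) and $C$ a constant as above; it uses only that $\Phi$ is nonnegative and integrable on $[0,T]$.

```latex
% Sketch (requires amsmath). Assumed form of the combined stability estimate:
\[
\Phi(T') \le \Phi(0) + C \int_0^{T'} \Phi(t)\,dt \qquad \text{for all } T' \in [0,T].
\]
% Set G(t) := \Phi(0) + C \int_0^t \Phi(\tau)\,d\tau. Then G is absolutely continuous and
\begin{align*}
G'(t) = C\,\Phi(t) \le C\,G(t)
  \;\Longrightarrow\;
  \frac{d}{dt}\Big( e^{-Ct} G(t) \Big) \le 0
  \;\Longrightarrow\;
  \Phi(T') \le G(T') \le \Phi(0)\, e^{C T'} .
\end{align*}
```

If the initial error $\Phi(0)$ vanishes, then $\Phi \equiv 0$ on $[0,T]$, and in particular the bulk error vanishes for all times.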

Stability of the relative entropy
The goal of this subsection is to prove Lemma 5.7.
Furthermore, an application of the chain rule shows that (ν ⊗ (B · ∇)ξ) : M (ξ) = ν · (B · ∇)(F (ξ)), and we will from now on use this slightly shorter formulation for the third right-hand side integral of (5.15). By using (5.15) and plugging in the computations for the three right-hand side integrals as well as the weak formulation of anisotropic mean curvature flow in (2.19), we obtain ∇B : (σ(ν)I d − ν ⊗ Dσ(ν)) dH d−1 dt It is now advisable to expand the first right-hand side integral and to add zero in order to isolate two more errors controlled by the time-integrated relative entropy: The first two integrals are controlled by T ′ 0 E χ(t) ξ(t) dt; in the case of the second integral this follows from the local Lipschitz continuity of F .
As a result of an integration by parts and the symmetry relation $\operatorname{div}(\operatorname{div}(a\otimes b)) = \operatorname{div}(\operatorname{div}(b\otimes a))$, one can then see that Plugging (5.20) into (5.19) yields (5.14). In a last step, plugging (5.12), (5.13), and (5.14) into (5.11) yields This concludes the proof of Lemma 5.7.
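The symmetry relation used above is a one-line index computation. A sketch for vector fields $a, b$ of class $C^2$, with summation over repeated indices (for less regular fields the same identity holds in the sense of distributions):

```latex
% Sketch (requires amsmath). (a \otimes b)_{ij} = a_i b_j.
\begin{align*}
\operatorname{div}\big(\operatorname{div}(a\otimes b)\big)
  &= \partial_i \partial_j (a_i b_j) \\
  &= \partial_j \partial_i (a_j b_i)
   && \text{(relabel } i \leftrightarrow j\text{)} \\
  &= \partial_i \partial_j (b_i a_j)
   && \text{(second derivatives commute)} \\
  &= \operatorname{div}\big(\operatorname{div}(b\otimes a)\big).
\end{align*}
```

Because both indices are contracted, the identity does not depend on which index the inner divergence acts on.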