Partial regularity for minimizers of discontinuous quasiconvex integrals with general growth

We prove partial H\"older continuity for minimizers of quasiconvex functionals \[ \mathcal{F}({\bf u}) := \int_{\Omega} f(x,{\bf u},D{\bf u})\,\mathrm{d}x, \] where $f$ satisfies a uniform VMO condition with respect to the $x$-variable and is continuous with respect to ${\bf u}$. The growth condition with respect to the gradient variable is assumed to be a general one.


Introduction
In this paper we study the partial regularity of minimizers of the integral functional \[ \mathcal{F}(u) := \int_{\Omega} f(x,u,Du)\,\mathrm{d}x, \tag{1.1} \] where $\Omega \subseteq \mathbb{R}^n$ is an open bounded set and $u \colon \Omega \to \mathbb{R}^N$, with $n, N \ge 2$; that is, we consider vectorial minimizers of $\mathcal{F}$. The assumptions we impose on $f = f(x,u,P)$ are quite general, in that they permit "general growth conditions" with respect to the gradient variable. This allows us to treat in a unified way the degenerate (when $p > 2$) and singular (when $p < 2$) behaviour. We assume a weak VMO condition with respect to $x$, uniformly in $(u, P)$, and continuity with respect to $u$. Our main result, Theorem 1.1, proves that a minimizer of (1.1) is locally Hölder continuous for any Hölder exponent $0 < \alpha < 1$; that is, if $u$ is a minimizer of (1.1), then $u \in C^{0,\alpha}_{\mathrm{loc}}(\Omega_0, \mathbb{R}^N)$, where $\Omega_0 \subset \Omega$ is an open set of full measure specified in the statement of Theorem 1.1 later in this section.

Literature Review
We begin by explaining how the study of the functional (1.1) fits into the broader research on regularity theory. Before proceeding further, we point out that Mingione [35] has provided a comprehensive account of the various areas of study within regularity theory for integral functionals and PDEs; it is an excellent reference for readers seeking a broad overview of the field.
As already mentioned, we allow $f$ to satisfy a VMO-type condition with respect to $x$. More precisely, the partial map $x \mapsto f(x,u,P)/\varphi(|P|)$ satisfies a uniform VMO condition; here $\varphi$ is an N-function (see condition (F4) later in this section for the precise formulation). As a consequence, we allow a certain controlled discontinuous behavior with respect to the spatial variable in the integrand of (1.1). We prove partial Hölder continuity for the local minimizers. The first paper to consider low order regularity (for variational integrals) was the one by Foss & Mingione [23], who assumed continuity with respect to $x$ and $u$. Thereafter, Kristensen & Mingione [29] proved Hölder continuity for convex integral functionals with continuous coefficients, for a fixed Hölder exponent depending on the dimension and the growth exponent. Stronger assumptions, such as Dini-type conditions [20], lead to partial $C^1$-regularity. It is worth mentioning the uniform porosity of the singular set for Lipschitzian minimizers of quasiconvex functionals [30]. The space of functions with vanishing mean oscillation (VMO) was introduced by Sarason in the realm of harmonic analysis; see [37]. It has had several applications in connection with Hardy spaces, Riesz transforms, and nonlinear commutators; see [39], [27] and the references therein. In the early 1990s, Chiarenza, Frasca and Longo [8] studied non-divergence form equations with VMO coefficients by means of singular integral operators; see also [18], [19].
The study of functionals with VMO-type coefficients has been broadened considerably over the past couple of decades; see [38], [9]. Recently, Bögelein, Duzaar, Habermann, and Scheven [5] considered a functional of the form (1.1) under the assumption that $(x, u, P) \mapsto f(x, u, P)$ satisfies a type of VMO assumption in $x$, uniformly with respect to $u$ and $P$; they further considered an analogous elliptic system of the form $\nabla \cdot a(x, u, Du) = 0$, in which, again, the coefficient $a$ was assumed to satisfy a VMO-type condition with respect to its spatial coordinate. Moreover, the integral functional they studied was assumed to be quasiconvex. However, unlike our study, they assumed that the growth of $f$ with respect to $P$ was standard $p$-growth, $p \ge 2$.
Similarly, Bögelein [4] studied quasiconvex integral functionals in the vectorial case, but the assumed growth of the integrand with respect to the gradient was standard $p$-growth. It was also assumed that the map $x \mapsto f(x, u, P)/(1 + |P|)^p$ was VMO, uniformly with respect to $u$ and $P$. Bögelein, Duzaar, Habermann, and Scheven [6] made similar assumptions when considering a system of PDEs involving the symmetric part of the gradient $Du$, wherein the coefficients on the symmetric part are VMO.
Goodrich [26] then further generalized, in part, the results of [5] by considering (1.1) in the case where x → f (x, u, P) was VMO, uniformly with respect to u and P, and, furthermore, in which f was only asymptotically convex.
Next, the study of problems with general growth conditions was initiated by Marcellini in a series of papers [32,33,34], and the literature is now very rich; see, e.g., [15,16,17,7,10,40]. In particular, Marcellini & Papi proved the Lipschitz bound for a solution of an elliptic system with general growth of Uhlenbeck type. In view of comparison estimates, it is worth mentioning the paper [15], where $C^{1,\alpha}$ regularity is proven via an excess decay estimate. Very recently, De Filippis & Mingione relaxed the hypotheses by also considering growth of exponential type (no $\Delta_2$-condition) [11].
So, we see that many papers in recent years have treated either VMO-type coefficient problems or general growth problems. To our knowledge, the combination of these two generalities has not been considered before in the form treated in this paper. Thus, the results of this paper significantly generalize many of the previously mentioned papers.

Strategy of the proof
We briefly explain the strategy of the proof of the main result. A major difficulty, compared with the proofs by Bögelein or Duzaar et al. in the $p$-setting, is that we cannot rely on homogeneity of the function $\varphi$. In particular, an analog of the Campanato excess defined there, which plays a key role in the iteration process, cannot be easily handled in the Orlicz setting.
Our strategy is to carefully identify the two quantities that play the leading role in both the non-degenerate and the degenerate cases. The first leading quantity is the excess functional $\Phi(x_0,\varrho)$ (see (3.14)). In the non-degenerate case, where $\Phi(x_0,\varrho)$ is controlled by $\varphi(|(Du)_{x_0,\varrho}|)$, we linearize the problem via the A-harmonic approximation [17]. This procedure, exploiting assumptions (F4)-(F5) and a freezing technique (with respect to the variables $x$ and $u$) based on the Ekeland variational principle, provides a comparison map which is an almost minimizer of the frozen functional and whose gradient is $L^1$-close to that of the original minimizer (see Lemma 3.8). Such a comparison map is shown to be approximately A-harmonic, and this property is inherited by the minimizer itself via the comparison estimate. This allows us to prove an excess-decay estimate, which, in turn, permits the iteration of the rescaled excess $\Phi(x_0,\varrho)/\varphi(|(Du)_{x_0,\varrho}|)$ and of a "Morrey-type" excess $\Theta(x_0,\varrho)$ at each scale. Namely, there exists $\vartheta \in (0,1)$ such that, if the boundedness conditions on these two quantities hold at the initial radius $\varrho$, then $\Phi(x_0,\vartheta^m\varrho)/\varphi(|(Du)_{x_0,\vartheta^m\varrho}|) \le \varepsilon_*$ and $\Theta(x_0,\vartheta^m\varrho) \le \delta_*$ hold for every $m = 0, 1, \dots$. Therefore, $\Theta(x_0,\varrho)$ is the adequate excess playing the role of $\Psi_\alpha$ in our setting. In the degenerate case, where $\Phi(x_0,\varrho)$ is large compared to $\varphi(|(Du)_{x_0,\varrho}|)$ (in a sense quantified by some $\kappa < 1$), we perform a different linearization procedure: assumption (F7), coupled with a freezing argument analogous to the previous one, now provides the almost $\varphi$-harmonicity of the minimizer via the application of the $\varphi$-harmonic approximation [16] to the comparison map. The corresponding excess improvement implies that if the excess is small at radius $\varrho$, it is also small at some smaller radius $\theta\varrho$, for $\theta < 1$. The key point in this iteration process is that the boundedness of both the excess $\Phi$ and the Morrey excess $\Theta$ at some scale $\vartheta\theta^{k_0}\varrho$ (the "switching radius") under assumption (1.2) is satisfied exactly when the degenerate bound (1.3) fails, and therefore we can proceed with the iteration in the non-degenerate regime.
Notice that, while $|(Du)_{x_0,\varrho}|$ might blow up in the iteration, since we cannot expect $C^1$-regularity, the Morrey excess $\Theta(x_0,\theta^k\varrho)$ stays bounded, exactly as it should for a $C^{0,\alpha}$-regularity result. In addition, if at level $k_0$ the regime is non-degenerate, the behavior stays non-degenerate at any subsequent level $k \ge k_0$, and the iteration can proceed. The smallness of $\Theta$ at any level ensures Hölder continuity of $u$ at $x_0$, provided the excess functionals $\Phi$ and $\Theta$ are small at some initial radius $\varrho$ (actually, this holds in a neighborhood of $x_0$, since these smallness conditions are open). Finally, it is proven that such a smallness condition on the excesses is indeed satisfied on the complement of the set $\Sigma_1 \cup \Sigma_2$ of Theorem 1.1.

Assumptions and statement of the main result
We list here the main assumptions on the integral functional that we are going to study throughout the paper. We assume that ϕ : We may assume, without loss of generality, that 1 < µ 1 < 2 < µ 2 .
For the precise notation and definitions, as well as the additional assumptions we will require on ϕ, we refer to Section 2.
Our main regularity result can be stated as follows. Note that the definition of V appearing in Σ 1 can be found in (2.3).

Some basic facts on N -functions
We recall here some elementary definitions and basic results about Orlicz functions. The following definitions and results can be found, e.g., in [28,31,3,1]. A real-valued function $\varphi \colon \mathbb{R}^+_0 \to \mathbb{R}^+_0$ is said to be an N-function if it is convex and satisfies the following conditions: $\varphi(0) = 0$, $\varphi$ admits the derivative $\varphi'$, and this derivative is right continuous, non-decreasing, and satisfies $\varphi'(0) = 0$, $\varphi'(t) > 0$ for $t > 0$, and $\lim_{t\to\infty} \varphi'(t) = \infty$.
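As a quick sanity check of this definition, the following worked example verifies the conditions for a model power function. It is only an illustration: the choice $\varphi(t) = t^p/p$ is an assumption made here for concreteness and is not singled out at this point of the paper.

```latex
% Illustrative model case (our assumption): \varphi(t) = t^p / p with p > 1.
\[
  \varphi(t) = \frac{t^{p}}{p}, \qquad p > 1, \quad t \ge 0,
  \qquad\text{so that}\qquad \varphi(0) = 0 .
\]
% The derivative is continuous (hence right continuous), non-decreasing,
% vanishes at 0, is positive for t > 0, and is unbounded:
\[
  \varphi'(t) = t^{p-1}, \qquad \varphi'(0) = 0, \qquad
  \varphi'(t) > 0 \ \text{ for } t > 0, \qquad
  \lim_{t\to\infty} \varphi'(t) = \infty .
\]
% Convexity follows because \varphi' is non-decreasing.
% Non-example: \varphi(t) = t is convex with \varphi(0) = 0, but
% \varphi'(0) = 1 \neq 0, so it is not an N-function.
```

The excluded case $p = 1$ shows why the conditions on $\varphi'$ matter: linear growth fails both $\varphi'(0) = 0$ and $\lim_{t\to\infty}\varphi'(t) = \infty$.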
Proposition 2.1: Let $\varphi$ be an N-function complying with ($\varphi$1) and ($\varphi$2). Then uniformly in $t > 0$. The constants in (2.1) are called the characteristics of $\varphi$; (ii) it holds that are increasing and decreasing, respectively; (iv) as for the functions $\varphi$ and $\varphi'$ applied to multiples of given arguments, the following inequalities hold for every $t \ge 0$:
In particular, from (iv) it follows that both $\varphi$ and $\varphi^*$ satisfy the $\Delta_2$-condition with constants $\Delta_2(\varphi)$ and $\Delta_2(\varphi^*)$ determined by $\mu_1$ and $\mu_2$. We will denote by $\Delta_2(\varphi, \varphi^*)$ constants depending on $\Delta_2(\varphi)$ and $\Delta_2(\varphi^*)$. Moreover, for $t > 0$ we have We recall also that the following inequalities hold for the inverse function $\varphi^{-1}$: for every $t \ge 0$ with $0 < a \le 1$. The same result holds also for $a \ge 1$ by exchanging the roles of $\mu_1$ and $\mu_2$.
For given $\varphi$ we define the associated N-function $\psi$ by Notice that if $\varphi$ satisfies assumption (2.1), then also $\varphi^*$, $\psi$, and $\psi^*$ satisfy this assumption.
Define $V \colon \mathbb{R}^{N\times n} \to \mathbb{R}^{N\times n}$ in the following way: It is easy to check that $|V(Q)|^2 \sim \varphi(|Q|)$.
Another important set of tools are the shifted N-functions $\{\varphi_a\}_{a\ge 0}$ (see [12]). We define for We have the following relations: The families $\{\varphi_a\}_{a\ge 0}$ and $\{(\varphi_a)^*\}_{a\ge 0}$ satisfy the $\Delta_2$-condition uniformly in $a \ge 0$. The connection between $V$ and $\varphi_a$ (see [12]) is the following: The following lemma (see [14, Corollary 26]) deals with the change of shift for N-functions.
Lemma 2.2: Let ϕ be an N -function with ∆ 2 (ϕ), ∆ 2 (ϕ * ) < ∞. Then for any η > 0 there exists c η > 0, depending only on η and ∆ 2 (ϕ), such that for all a, b ∈ R d and t ≥ 0 We define the function V a : where ϕ a is the shifted N -function of ϕ. Since ϕ 0 = ϕ, we retrieve in (2.8) the function V for a = 0. With the following lemma, we list some properties of functions V a which will be useful in the sequel.

Lemma 2.3:
Let a ≥ 0 and V a be as above. Then for any P, Q ∈ R N ×n a Young-type inequality holds: where the constant c depends only on ∆ 2 (ϕ).
In view of the previous considerations, the same proposition holds true for the shifted functions, uniformly in a ≥ 0.
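To make the objects of this section concrete, here is a worked computation in the model case $\varphi(t) = t^p/p$. It is only a sketch under stated assumptions: the displayed definitions of $V$ and $\varphi_a$ are not reproduced in the text above, so the computation assumes the definitions that are standard in the literature cited as [12], namely $V(Q) = \sqrt{\varphi'(|Q|)/|Q|}\,Q$ and $\varphi_a'(t) = \varphi'(a+t)\,t/(a+t)$.

```latex
% Model case (illustrative assumption): \varphi(t) = t^p / p with p > 1.
% Assumed standard definitions (not restated in the text above):
%   V(Q) = \sqrt{\varphi'(|Q|)/|Q|}\, Q,
%   \varphi_a'(t) = \varphi'(a+t)\, t/(a+t).
\begin{align*}
  \varphi'(t) &= t^{p-1},
  & \varphi(2t) &= 2^{p}\,\varphi(t) \quad (\Delta_2\text{-condition}), \\
  V(Q) &= |Q|^{\frac{p-2}{2}}\, Q,
  & |V(Q)|^{2} &= |Q|^{p} = p\,\varphi(|Q|) \sim \varphi(|Q|), \\
  \varphi_a'(t) &= (a+t)^{p-2}\, t,
  & \varphi_0'(t) &= t^{p-1} = \varphi'(t).
\end{align*}
```

In this model case the shift $a = 0$ returns $\varphi$ itself, while for $a > 0$ and $t$ much smaller than $a$ the shifted function behaves quadratically, $\varphi_a(t) \approx a^{p-2} t^2/2$, which is the feature exploited in linearization arguments.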
From assumption (F2) we can easily infer an upper bound for f (x, u, P)−f (x, u, Q), uniformly in x ∈ Ω and u ∈ R N , for every P, Q ∈ R N ×n ; namely, (2.10) The following estimate is a consequence of (F2) and Lemma 2.4 (see [17, eq. (2.14)]): for every P, Q ∈ R N ×n .
The following version of Sobolev-Poincaré inequality can be found in [12,Lemma 7].

Some useful lemmas
The following lemma, useful in order to re-absorb certain terms, is a variant of the classical [25, Lemma 6.1] (see [17,Lemma 3.1]).
Lemma 2.6: Let ψ be an N -function with ψ ∈ ∆ 2 , let ̺ > 0 and h ∈ L ψ (B ̺ (x 0 )). Let g : [r, ̺] → R be nonnegative and bounded such that for all r ≤ s < t ≤ ̺ The following lemma is useful to derive reverse Hölder estimates. It is a variant of the results by Gehring [24] and Giaquinta-Modica [25, Theorem 6.6].

A-harmonic and ϕ-harmonic functions
Let A be a bilinear form on R N ×n . We say that A is strongly elliptic in the sense of Legendre-Hadamard if for all ξ ∈ R N , ζ ∈ R n it holds that It is well known from the classical theory (see, e.g. [25,Chapter 10]) that w is smooth in the interior of B ̺ (x 0 ), and it satisfies the estimate (2.14) Let ϕ be an Orlicz function. We say that a map w ∈ [16]) if and only if More precisely, Dw and V(Dw) are Hölder continuous due to the following decay estimate, see [15].
for all t > 0 and s ∈ R with |s| < 1 2 t.
Then there exist a constant c ≥ 1 and an exponent γ 0 ∈ (0, 1) depending only on n, N and the characteristics of ϕ, such that the following statement holds true: This result can be viewed as the Orlicz version of the milestone theorem of Uhlenbeck [41] for differential forms solving a p-harmonic system, see also [2].
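The two notions of harmonicity can be illustrated by their simplest model cases. The sketch below assumes the usual weak formulations, since the defining displays are not reproduced in the text above; both examples are our choices for illustration.

```latex
% Assumed standard definitions (the displays are missing in the text above):
%   A-harmonic:       \int_B A(Dw, D\eta)\,dx = 0
%                     for all \eta \in C_c^\infty(B;\mathbb{R}^N),
%   \varphi-harmonic: \int_B \frac{\varphi'(|Dw|)}{|Dw|}\,\langle Dw, D\eta\rangle\,dx = 0
%                     for all such \eta.
%
% Example 1 (illustrative): A(P,Q) = \langle P, Q\rangle, the Euclidean inner
% product on R^{N x n}. For rank-one matrices,
\[
  A(\xi\otimes\zeta,\ \xi\otimes\zeta) = |\xi\otimes\zeta|^{2} = |\xi|^{2}\,|\zeta|^{2},
  \qquad \xi\in\mathbb{R}^{N},\ \zeta\in\mathbb{R}^{n},
\]
% so the Legendre-Hadamard condition holds with ellipticity constant 1, and the
% A-harmonic maps are exactly the weak solutions of \Delta w = 0 (componentwise).
%
% Example 2 (illustrative): \varphi(t) = t^p/p, so that \varphi'(t)/t = t^{p-2}.
\[
  \int_B |Dw|^{p-2}\,\langle Dw, D\eta\rangle \,\mathrm{d}x = 0
  \quad\Longleftrightarrow\quad
  -\operatorname{div}\bigl(|Dw|^{p-2} Dw\bigr) = 0 \ \text{ weakly in } B,
\]
% which is the p-harmonic system.
```

The second example is the case covered by Uhlenbeck's theorem mentioned above, which is why the Orlicz decay estimate is viewed as its generalization.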

Harmonic type approximation results
We recall here two different harmonic type approximation results. The first one is the A-harmonic approximation: given a Sobolev function $u$ on a ball $B$, we want to find an A-harmonic function $w$ which is "close" to the function $u$. It will be the A-harmonic function with the same boundary values as $u$; i.e., a Sobolev function $w$ which satisfies in the sense of distributions. Setting $z := w - u$, then (2.15) is equivalent to finding a Sobolev function $z$ which satisfies in the sense of distributions.
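The passage from $w$ to $z = w - u$ is a one-line consequence of bilinearity. The derivation below is a sketch that assumes (2.15) is the usual Dirichlet problem in weak form; the resulting equation for $z$ should agree with (2.16) up to the sign convention used there.

```latex
% Assumption: (2.15) is the following Dirichlet problem (weak form).
\[
  \int_B A(Dw, D\eta)\,\mathrm{d}x = 0
  \quad \text{for all } \eta \in C_c^\infty(B;\mathbb{R}^N),
  \qquad w = u \ \text{ on } \partial B .
\]
% Substitute w = u + z and use the bilinearity of A:
\[
  0 = \int_B A(Du + Dz, D\eta)\,\mathrm{d}x
    = \int_B A(Du, D\eta)\,\mathrm{d}x + \int_B A(Dz, D\eta)\,\mathrm{d}x .
\]
% Hence z solves a problem with zero boundary values and a right-hand side
% determined by u:
\[
  \int_B A(Dz, D\eta)\,\mathrm{d}x = -\int_B A(Du, D\eta)\,\mathrm{d}x
  \quad \text{for all } \eta \in C_c^\infty(B;\mathbb{R}^N),
  \qquad z = 0 \ \text{ on } \partial B .
\]
```

Working with $z$ is convenient because the closeness of $w$ to $u$ becomes a smallness statement for $z$ in a space with zero boundary values.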
The following A-harmonic approximation result in the setting of Orlicz spaces has been proved in [17, Theorem 14].
Theorem 2.9: Let $B \subset\subset \Omega$ be a ball with radius $r_B$ and let $\tilde B \subset \Omega$ denote either $B$ or $2B$. Let $A$ be a strongly elliptic (in the sense of Legendre-Hadamard) bilinear form on $\mathbb{R}^{N\times n}$. Let $\psi$ be an N-function with $\Delta_2(\psi, \psi^*) < \infty$ and let $s > 1$. Then for every $\varepsilon > 0$, there exists $\delta > 0$ depending only on $n$, $N$, $\kappa_A$, $|A|$, $\Delta_2(\psi, \psi^*)$ and $s$ such that the following holds. Let
Remark 2.10: We will exploit the previous approximation result in a slightly modified version. Indeed, following [7, Lemma 2.7], under the additional assumption for some exponent $s > 1$ and for a constant $\mu > 0$, and (2.17) replaced by it can be seen with minor changes in the proof that the unique solution $z \in W^{1,\psi}_0(B, \mathbb{R}^N)$ of (2.16) satisfies
Now, moving on to $\varphi$-harmonic functions, the following $\varphi$-harmonic approximation lemma ([16, Lemma 1.1]) is the extension to general convex functions of the $p$-harmonic approximation lemma [21], [22, Lemma 1], and allows one to approximate "almost $\varphi$-harmonic" functions by $\varphi$-harmonic functions.
Lemma 2.11: Let $\varphi$ satisfy assumption (2.1). For every $\varepsilon > 0$ and $\theta \in (0, 1)$ there exists $\delta > 0$ which only depends on $\varepsilon$, $\theta$, and the characteristics of $\varphi$ such that the following holds. Let $B \subset \mathbb{R}^n$ be a ball and let $\tilde B$ denote either $B$ or $2B$.
where V is as in (2.3).

Caccioppoli inequalities and higher integrability results
As usual, the first step in proving a regularity theorem for the minimizers of integral functionals is to establish suitable Caccioppoli-type inequalities.
First, we state a "zero order" Caccioppoli inequality. The proof is an adaptation to the $\varphi$-setting of [4, Lemma 3.1]; we therefore omit the details (see also [7,Theorem 2.4

Lemma 3.2:
There exist an exponent s 0 = s 0 (n, N, ϕ, L, ν) > 1 and a constant c depending only on n, N, ϕ, L, ν such that, if u ∈ W 1,ϕ (Ω; R N ) is a minimizer of the functional (1.1), complying with (F1)-(F2), then the following holds: for every s ∈ (1, s 0 ], for any x 0 ∈ Ω, any radius Another useful tool will be the following global higher integrability result on balls for minimizers of (1.1), which has been proven in the Orlicz setting for more general integrands in [10,Lemma 4.3].
We have the following Caccioppoli inequality of second type for local minimizers of (1.1), involving affine functions.

Lemma 3.4:
There exists a constant $c = c(n, N, \Delta_2(\varphi), \nu, L) > 0$ such that, if $u \in W^{1,\varphi}(\Omega; \mathbb{R}^N)$ is a minimizer of the functional (1.1) under the assumptions (F1)-(F7), and $\ell \colon \mathbb{R}^n \to \mathbb{R}^N$ is an affine function, say $\ell(x) := u_0 + Q(x - x_0)$ for some $u_0 \in \mathbb{R}^N$ and $Q \in \mathbb{R}^{N\times n}$, then for any ball for every $s \in (1, s_0]$, where $s_0$ is that of Lemma 3.2.
Proof: We follow the argument of [4, Lemma 3.5] for functionals with $p$-growth, just mentioning how to obtain the analogous main estimates therein. We assume, without loss of generality, that $x_0 = 0$. For radii $\varrho/2 \le r < \tau < t \le 3\varrho/4$ with $\tau := (r+t)/2$ we consider a cut-off function $\eta \in C_0^\infty(B_\tau; [0, 1])$ such that $\eta \equiv 1$ on $B_r$ and $|D\eta| \le \frac{4}{t-r}$ on $B_\tau$. Correspondingly, we define the functions $\xi := \eta(u - \ell) \in W^{1,\varphi}(B_\tau; \mathbb{R}^N)$ and $\psi := (1 - \eta)(u - \ell) \in W^{1,\varphi}(B_\tau; \mathbb{R}^N)$. Note that $\xi + \psi = u - \ell$, so that $\ell + \xi = u - \psi$. From the quasiconvexity assumption (F3), (2.4) and simple manipulations we obtain an estimate whose right-hand side splits into the terms $J_1, \dots, J_7$. Now, we proceed to estimate each term separately. From the minimizing property of $u$ we infer that $J_4 \le 0$, and by assumptions (F5) and (F4) we obtain the estimates respectively. Again by exploiting property (F5), the monotonicity of $\omega$ and $\varphi$, and the fact that we can estimate $J_5$ as whence, taking into account that by virtue of (2.5), and recalling that $\omega \le 1$, we get As for $J_6$, a computation analogous to the estimate of $J_5$, based on (3.2) and the VMO assumption (F4), gives The terms $J_1$ and $J_7$ can be combined together as From the Cauchy-Schwarz inequality, (2.11) and the fact that $D\psi = 0$ on $B_r$ we infer We can estimate $J_1'$ analogously, by recalling that $Du - (1 - \theta)D\psi = Q + D\xi + \theta D\psi$, $D\psi = 0$ on $B_r$ and applying the triangle inequality for $\varphi'_{|Q|}$, (2.11) and Young's inequality (2.9).
In this way we get Recalling the definitions of $\xi$ and $\psi$, by a simple computation we find that so that combining with the previous estimates we get Since $\xi = u - \ell$ on $B_r$ and $\tau \le \varrho$, from (3.1) and the estimates for $J_1, \dots, J_7$ we obtain Now, in a standard way we "fill the hole", thus obtaining where $\sigma := \frac{c}{c+1} < 1$. In order to bound the latter term further, we exploit the higher integrability result of Lemma 3.2. Thus, with fixed $s \in (1, s_0]$, as a consequence of Hölder's inequality, the concavity of $\omega$, the bounds $\omega \le 1$ and $v_0 \le 2L$, and Jensen's inequality, we obtain where $c = c(n, N, \Delta_2(\varphi), \nu, L)$. This estimate, combined with (3.3), gives Now, since the previous estimate holds for arbitrary radii $r, t$ such that $\varrho/2 \le r < t \le 3\varrho/4$, the constant $c$ depends only on $n, N, \Delta_2(\varphi), \nu, L$, and $\sigma < 1$, as a consequence of Lemma 2.6 applied with $\beta := n(s - 1)$ we obtain In view of Lemma 3.1 applied with $\varrho$ in place of $t - s$ and from (3.2) we get which, combined with (3.4) and using the fact that $\omega \le 1$ as well as $V(\varrho) \le 2L$, gives where $c = c(n, N, \Delta_2(\varphi), \nu, L)$. The Caccioppoli inequality then follows by taking means on both sides of the latter inequality.
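The "fill the hole" step used in the proof above can be summarized schematically as follows. This is a generic sketch of the technique: the symbols $g$ and $R$ are placeholder names introduced here for the nonnegative integrand and the remainder terms, and are not notation from the paper.

```latex
% Schematic version of hole filling. Here g >= 0 and R are placeholders
% (hypothetical names) for the integrand and the remainder terms of the proof,
% and c > 0 is the constant of the estimate on the annulus B_t \setminus B_r.
\begin{align*}
  \int_{B_r} g \,\mathrm{d}x
    &\le c \int_{B_t\setminus B_r} g \,\mathrm{d}x + R
    && \text{(estimate on the annulus)} \\
  (c+1)\int_{B_r} g \,\mathrm{d}x
    &\le c \int_{B_t} g \,\mathrm{d}x + R
    && \text{(add } c\textstyle\int_{B_r} g\,\mathrm{d}x \text{ to both sides)} \\
  \int_{B_r} g \,\mathrm{d}x
    &\le \sigma \int_{B_t} g \,\mathrm{d}x + \frac{R}{c+1},
    \qquad \sigma := \frac{c}{c+1} < 1 .
\end{align*}
```

The gain is the factor $\sigma < 1$ in front of the integral over the larger ball, which is exactly what an iteration lemma such as Lemma 2.6 needs in order to absorb that term.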
We can apply Lemma 3.4 to affine functions $\ell_{x_0,r}(x) := (u)_{x_0,\varrho} + Q(x - x_0)$ for some $Q \in \mathbb{R}^{N\times n}$, and the resulting Caccioppoli inequality can be compared with that of [7, Theorem 3.1]. We notice that, apart from an extra VMO term due to assumption (F4), the dependence of the integrand $f$ also on $u$ implies that the remainder term inside $\omega$ is, in general, non-monotone in the radius $\varrho$. Indeed, it can be estimated from above by the Morrey-type excess, which fails to be monotone for small $\varrho$ (Lemma 3.5(i)). This does not allow, in general, for an application of Gehring's lemma in order to infer a higher integrability result: for this purpose, a suitable "smallness" regime (3.9) has to be imposed (Lemma 3.5(ii)).

Comparison maps via Ekeland's variational principle
The proof of the main results will require suitable comparison functions, which will be constructed with a freezing argument in the variables (x, u) based on Ekeland's variational principle. We recall below a version of this classical tool, whose proof can be found, e.g., in [25,Theorem 5.6].
Lemma 3.7 (Ekeland's principle): Let $(X, d)$ be a complete metric space, and assume that $F \colon X \to [0, \infty]$ is not identically $\infty$ and is lower semicontinuous with respect to the metric topology on $X$. If for some $u \in X$ and some $\kappa > 0$, there holds
Although a similar analysis in the Orlicz setting, for integrands $f = f(x, \xi)$, has been performed in [7, Theorem 3.3], we will follow a quite different argument, which refers to the case of $p$-growth as in [4, Lemma 3.7]. We will also specify the appropriate complete metric space $X$, which is not explicitly mentioned in [7, Theorem 3.3].
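For orientation, the textbook formulation of Ekeland's principle can be written as below. This is the standard statement with unit distance bound; it is assumed, not verified, to coincide verbatim with the version in [25, Theorem 5.6], whose displayed hypothesis and conclusion are not reproduced in Lemma 3.7 above.

```latex
% Standard textbook formulation (assumed). X is a complete metric space and
% F : X -> [0, \infty] is lower semicontinuous and not identically \infty.
\[
  F(u) \le \inf_{X} F + \kappa
  \quad\Longrightarrow\quad
  \exists\, v \in X:\quad
  d(u,v) \le 1, \qquad
  F(v) \le F(u), \qquad
  F(v) \le F(w) + \kappa\, d(v,w) \ \ \text{for all } w \in X .
\]
```

In words, an almost minimizer $u$ can be replaced by a nearby point $v$ that is an exact minimizer of the perturbed functional $w \mapsto F(w) + \kappa\, d(v, w)$; this is the mechanism that produces the comparison map and its Euler-Lagrange variational inequality.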
for some constant $c_* = c_*(n, N, \Delta_2(\varphi), \nu, L)$. Moreover, $v$ fulfills the following Euler-Lagrange variational inequality:
Proof: We may assume, without loss of generality, that $x_0 = 0$ and, correspondingly, we use the shorthand $K(\varrho)$ for $K(0, \varrho)$. As a first remark, we recall that from Lemma 3.1 with $r = \frac{3}{4}\varrho$ we have where $c = c(\Delta_2(\varphi), L, \nu)$. We then denote by $\tilde v \in X$ a minimizer of the functional (3.21), whose existence is ensured by the direct method under the assumptions (F1)-(F2). From the minimality of $\tilde v$, assumption (F1) and (2.10) we get (3.25) By the sublinearity of $\varphi$, the Poincaré inequality (Theorem 2.5), Jensen's inequality and (3.24) this gives where $c = c(\varphi, n, L, \nu)$. Moreover, as a consequence of the higher integrability results of Lemmas 3.2 and 3.3, together with (3.25) and (3.24), we infer the higher integrability result where $c = c(n, N, \varphi, \nu, L)$ and $s = s(n, N, \varphi, \nu, L) \in (1, s_0]$. Now we prove that $u$ is an almost minimizer of the functional $G$. Indeed, from the minimality of $u$ and assumptions (F4), (F3) we get Then, by using Jensen's inequality, the concavity and sublinearity of $\omega$, (3.26) and (3.27), from the previous estimate we obtain where $c = c(n, N, \Delta_2(\varphi), \nu, L)$. Arguing similarly, we can estimate where the constant $c$ has the same dependencies as before. Adding (3.28) and (3.29) term by term and taking into account the minimality of $\tilde v$, we infer for a constant $c_* = c_*(n, N, \Delta_2(\varphi), \nu, L)$. Finally, Ekeland's variational principle (Lemma 3.7) with the choice $\kappa = c_* K(\varrho)$ provides the existence of a function $v \in X$ with the desired property of minimality for the functional $G$ and such that $d(u, v) \le 1$, which corresponds to (3.22). The inequality (3.23) follows from the validity of the associated Euler-Lagrange variational inequality for $v$ in a standard way.

Approximate A-harmonicity and ϕ-harmonicity
In this section, we provide two different linearization strategies for the minimization problem, along the lines of [4, Section 3.2], where an analogous analysis has been performed for functionals with p-growth. On the one hand, with Lemma 3.9 we will show that the minimizer u of F is an almost A-harmonic function for a suitable elliptic bilinear form A. On the other hand, this u turns out to be an almost ϕ-harmonic function (see Lemma 3.10). These results will allow us to apply the A-harmonic approximation lemma, respectively the ϕ-harmonic approximation lemma. The proof will require, in both cases, the comparison maps obtained with Lemma 3.8. We start by proving the approximate A-harmonicity of a minimizer to (1.1). To this aim, only assumptions (F1)-(F6) are required on f .
Let L x0,̺ be the affine function associated to u as in (3.13), which complies with L x0,̺ (x 0 ) = (u) x0,̺ and DL x0,̺ = (Du) x0,̺ =: Q x0,̺ . We set We point out that A defined above is a bilinear form on R N ×n , satisfying the ellipticity assumption (2.13) by virtue of (F2) and (F3).  If, in addition, f complies also with (F7), we can show that each local minimizer of the functional F (u) (eq. (1.1)) is almost ϕ-harmonic.

Excess decay estimates: the non-degenerate regime
We start by establishing excess improvement estimates in the non-degenerate regime characterized by (3.33) below, i.e. the fact that Φ(x 0 , ̺) ≤ cϕ(|(Du) x0,̺ |). The strategy of the proof is to exploit Lemma 3.9 to approximate the given minimizer by A-harmonic functions, for which suitable decay estimates are available from Theorem 2.9.
Proof: The proof follows the argument of [7,Lemma 4.2]. We emphasize that Corollary 3.6 is crucial in order to obtain the estimate  which comes into play in applying the A-harmonic approximation theorem in the modified version of Remark 2.10.
Lemma 3.12: Let ϑ ∈ (0, 1), and assume that where c µ2 is the constant of the change of shift formula (2.7) with η = 1 2 µ 2 +1 . Then it holds that Proof: As a consequence of (2.7) for η = 1 2 µ 2 +1 and with (3.36) we get whence, passing to ϕ −1 and taking into account (2.2), we obtain whence (3.37) follows by re-absorbing the first term of the right-hand side into the left.
We argue by induction on $m$. Since (3.39) is trivially true for $m = 0$ by assumption (3.38), our aim is to show that if (3.39) holds for some $m \ge 0$, then the corresponding inequalities hold with $m + 1$ in place of $m$. Setting in order to prove the second inequalities in (3.39) it will suffice to show that We have, with (3.39) at step $m$, the shift-change formula (2.7) with $\eta = \frac{1}{2}$ and (3.42), the estimate Then, by virtue of Lemma 3.11 and Lemma 3.12 applied with radius $\vartheta^m\varrho$ in place of $\varrho$, and recalling the choice of $\vartheta$ (3.41), we get Finally, since the iteration starting from $m = 0$ of the estimate and this estimate a fortiori holds if we consider $E(B_{\vartheta^m\varrho}(y))$ for $y \in B_{\varrho/2}$ in place of $E(B_{\vartheta^m\varrho})$, we deduce the Morrey-type estimate for all $y \in B_{\varrho/2}$ and $r \le \varrho/2$, which is equivalent to (3.40). The proof is now concluded.

Excess decay estimate: the degenerate regime
In this section, with Lemma 3.14 we will establish an excess improvement estimate for the degenerate case which is characterized by the fact that Φ(x 0 , ̺) is "large" compared to ϕ(|(Du) x0,̺ |).
At this point, we can argue as in the case S = N 0 for the proof of (3.58), whence (3.55) follows thus concluding the proof.
By the absolute continuity of the integral, we can find an open neighborhood U x0 of x 0 such that Φ(x,̺) < ε # and Θ(x,̺) < δ * for every x ∈ U x0 . We can apply Lemma 3.15 at each point of U x0 , proving that u ∈ C 0,α (U x0 , R N ) for every α ∈ (0, 1). Thus, x 0 ∈ Ω 0 and the proof is concluded.