Equilibrium measures of the natural extension of $\boldsymbol{\beta}$-shifts

Abstract. We give a necessary and sufficient condition on $\beta$ of the natural extension of a $\beta$-shift, so that any equilibrium measure for a function of bounded total oscillations is a weak Gibbs measure.


Introduction
We study equilibrium measures of the natural extension of β-shifts. This is an interesting class of dynamical systems which has been studied in ergodic theory and number theory since the fundamental papers [Re, Pa]. We want to determine whether an equilibrium measure for a continuous function ϕ is a weak Gibbs measure. In [PS3] we developed a method based on a decoupling property [PS3, Definition 2.3], which is a slightly weaker condition than condition (D) in [Ru, §4.1]. The results of [PS3] are valid only for β such that the β-shift has the specification property and ϕ is of bounded total oscillations. This set of β is the set $C_3$ in [Sc]. Schmeling proved that $C_3$ has Lebesgue measure 0, but Hausdorff dimension 1. For the more restricted class of functions satisfying the Bowen condition [Bo], one has a stronger result: under expansiveness and specification, Haydn and Ruelle [HR] proved the equivalence of equilibrium measures and Gibbs measures (in the sense of Bowen [Bo] and Capocaccia [Ca]).
However, using basic ideas of [PS3,CTY], it is possible to obtain a necessary and sufficient condition on β such that for any function of bounded total oscillations all equilibrium measures are weak Gibbs measures.
To formulate our main result, Theorem 2.12, precisely, we first need to recall some basic properties of β-shifts. This is done in §2.1. In §2.2 we consider the class of functions of bounded total oscillations following [PS3, §3], and in §2.3 we consider the pressure, establishing two basic estimates for the proof of Theorem 2.12. Our main result is stated in §2.4 and proved in §3. We also briefly discuss large deviations for empirical measures in §2.4.
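2. Setting and main result

2.1. Beta-shift. Let $\beta > 1$ be fixed. The case $\beta \in \mathbb{N}$ is special and corresponds to the full shift. From now on we assume that $\beta \notin \mathbb{N}$. For $t \in \mathbb{R}$, let $\lceil t \rceil := \min\{i \in \mathbb{Z} : i \ge t\}$. We define $b := \lceil \beta \rceil$. Consider the β-expansion of 1, which is given by the algorithm
$$r_0 := 1, \qquad c_{i+1} := \lceil \beta r_i \rceil - 1, \qquad r_{i+1} := \beta r_i - c_{i+1}, \qquad i \in \mathbb{Z}_+,$$
which ensures that $r_i > 0$ for all $i \in \mathbb{Z}_+$. It follows that $c_1 = \lceil \beta \rceil - 1 > 0$ and $c^\beta := (c_1, c_2, \ldots)$ cannot end with zeros only. For sequences $(a_1, a_2, \ldots)$ and $(b_1, b_2, \ldots)$ the lexicographical order is defined by $(a_1, a_2, \ldots) \prec (b_1, b_2, \ldots)$ if and only if there exists $k$ such that $a_j = b_j$ for all $j < k$ and $a_k < b_k$. Let $A := \{0, 1, \ldots, b-1\}$. The β-shift is
$$X^\beta := \{x \in A^{\mathbb{N}} : T^k x \preceq c^\beta \text{ for all } k \in \mathbb{Z}_+\},$$
where $T$ is the left shift operator. In particular, $T^k c^\beta \preceq c^\beta$ for all $k \in \mathbb{Z}_+$, so that $X^\beta$ is a shift-invariant closed subset of $A^{\mathbb{N}}$ (with product topology). The language of the shift $X^\beta$ is denoted by $L^\beta$ and the set of the words of length $n$ by $L^\beta_n$. In this paper the empty word is always denoted by $\epsilon$, $L^\beta_0 = \{\epsilon\}$, while $\varepsilon$ is always a positive real number. A word is written $w_1 \cdots w_n$ or simply $w$. The length of a word $w$ is written $|w|$.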
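The expansion of 1 can be computed digit by digit. The following is a minimal sketch, assuming the recursion $r_0 = 1$, $c_{i+1} = \lceil \beta r_i \rceil - 1$, $r_{i+1} = \beta r_i - c_{i+1}$ (the usual quasi-greedy algorithm, consistent with $r_i > 0$ and $c_1 = \lceil \beta \rceil - 1$). The function name `beta_expansion_of_one` and the choice $\beta = 5/2$ are illustrative assumptions; a rational $\beta$ is used so that the arithmetic is exact.

```python
from fractions import Fraction
from math import ceil


def beta_expansion_of_one(beta, n):
    """Return (digits, remainders) with digits = [c_1, ..., c_n] and
    remainders = [r_0, ..., r_n], using the assumed recursion
    c_{i+1} = ceil(beta * r_i) - 1 and r_{i+1} = beta * r_i - c_{i+1}."""
    r = Fraction(1)
    digits, remainders = [], [r]
    for _ in range(n):
        c = ceil(beta * r) - 1   # next digit; exact because r is a Fraction
        r = beta * r - c         # next remainder, always in (0, 1]
        digits.append(c)
        remainders.append(r)
    return digits, remainders


# Illustrative (hypothetical) parameter: beta = 5/2, so b = 3 and A = {0, 1, 2}.
beta = Fraction(5, 2)
digits, remainders = beta_expansion_of_one(beta, 10)
print(digits)
```

After $n$ steps the identity $1 = \sum_{i=1}^{n} c_i \beta^{-i} + r_n \beta^{-n}$ holds exactly, and every shifted block of the computed digits is lexicographically at most the corresponding initial block, as required by $T^k c^\beta \preceq c^\beta$.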
The shift space $X^\beta$ can be described by a labeled graph $G^\beta = (V, E^\beta)$ where $V := \{q_j : j \in \mathbb{Z}_+\}$. The root of the graph is the vertex $q_0$. There is an edge $q_0 \to q_0$, labeled by $k$, for each $k = 0, \ldots, b-2$, and there is an edge $q_{j-1} \to q_j$ labeled by $c_j$ for each $j \in \mathbb{N}$. Moreover, if the label $c_j$ of $q_{j-1} \to q_j$ is different from 0, then there are $c_j$ edges $q_{j-1} \to q_0$ labeled by $0, \ldots, c_j - 1$. Each word $w_1 \cdots w_n \in L^\beta$ can always be presented by a path of length $n$ in $G^\beta$ starting at vertex $q_0$. For a word $w \in L^\beta$ we define $q(w)$ as the end vertex of the path starting at $q_0$ and presenting $w$. One can concatenate two words $w$ and $w'$ if and only if there is a path $\eta$ in $G^\beta$ presenting $w$ and a path $\eta'$ in $G^\beta$ presenting $w'$, so that $\eta$ ends at vertex $q$ and $\eta'$ starts at vertex $q$. In particular, one can concatenate $w$ with any word of $L^\beta$ if $q(w) = q_0$. There is a unique labeled path presenting the infinite sequence $c^\beta$, which is the path $(q_0, q_1, q_2, \ldots)$. Let $P^\beta$ be the set of the prefixes of the sequence $c^\beta$, including the empty word $\epsilon$. Let $c_1 \cdots c_n \in P^\beta$ and suppose that $c_{n+1} = c_{n+2} = \cdots = c_{n+m} = 0$, $c_{n+m+1} \neq 0$. The word $c_1 \cdots c_n$ is presented by a path starting at $q_0$ and ending at $q_n$. Since there is only one outgoing edge from each of the vertices $q_n, \ldots, q_{n+m-1}$, the only words $w$ with prefix $c_1 \cdots c_n$ are the words
$$c_1 \cdots c_n,\quad c_1 \cdots c_n 0,\quad c_1 \cdots c_n 00,\quad \ldots,\quad c_1 \cdots c_n \underbrace{0 \cdots 0}_{m},\quad c_1 \cdots c_n \underbrace{0 \cdots 0}_{m}\, w',\quad c_1 \cdots c_k,$$
where $w'$ is any word of $L^\beta$ with first letter $0 \le w'_1 \le c_{n+m+1} - 1$, and $c_1 \cdots c_k \in P^\beta$ with $k > n + m$. For $u \in P^\beta$ we set
$$z^\beta(u) := p \quad \text{if } u = c_1 \cdots c_\ell,\ c_{\ell+1} = \cdots = c_{\ell+p} = 0 \text{ and } c_{\ell+p+1} > 0. \tag{1}$$
$z^\beta(u)$ is a measure of the obstruction to going from vertex $q(u)$ to vertex $q_0$. We set […]

For each prefix $u = c_1 \cdots c_n$ of $c^\beta$ we define a new word $\hat u$ as follows. Let $c_j$ be the last letter in $c_1 \cdots c_n$ which is different from 0. We set […] The word $\hat u := \hat c_1 \cdots \hat c_n$ differs from $u = c_1 \cdots c_n$ by a single letter and $q(\hat u) = q_0$. For any word $w \in L^\beta$ there is a unique decomposition of $w$ into
$$w = v\, s(w), \quad \text{where } s(w) \text{ is the largest suffix of } w \text{ belonging to } P^\beta.$$
We extend the definition (1) to any word $w$ by setting
$$z^\beta(w) := z^\beta(s(w)) \tag{4}$$
and extend the transformation $u \mapsto \hat u$ to any word by setting
$$\hat w := v\, \widehat{s(w)}. \tag{5}$$
By convention we set $\hat\epsilon = \epsilon$. The words $\hat w$ can be freely concatenated since $q(\hat w) = q_0$ (see Lemma 2.2).
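The graph description can be made concrete in a few lines. The sketch below is illustrative only: vertex $q_j$ is represented by the integer `j`, the digit list `C` is hardcoded as the first eleven digits of $c^\beta$ for the hypothetical choice $\beta = 5/2$, and the helper names `end_vertex`, `z_beta` and `hat` are not from the paper. The function `hat` assumes that the last non-zero letter is lowered by one; any smaller letter would also lead back to $q_0$ by the edge rules.

```python
# First digits c_1, ..., c_11 of c^beta for beta = 5/2 (illustrative assumption).
C = [2, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1]


def end_vertex(word, c=C):
    """Index j of the end vertex q(w) = q_j of the path from q_0 presenting
    `word`, or None if no such path exists (word not in the language)."""
    state = 0
    for a in word:
        if state >= len(c):
            raise ValueError("more digits of c^beta are needed")
        nxt = c[state]          # label c_{state+1} of the edge q_state -> q_{state+1}
        if a == nxt:
            state += 1          # follow the path of c^beta
        elif a < nxt:
            state = 0           # edge back to the root q_0
        else:
            return None         # no outgoing edge with this label
    return state


def z_beta(ell, c=C):
    """z^beta(c_1...c_ell): number of zeros following c_ell before the next
    positive digit (raises IndexError if the digit list is too short)."""
    p = 0
    while c[ell + p] == 0:
        p += 1
    return p


def hat(prefix):
    """Assumed form of u -> u-hat: lower the last non-zero letter by one."""
    u = list(prefix)
    for j in range(len(u) - 1, -1, -1):
        if u[j] != 0:
            u[j] -= 1
            break
    return u


print(end_vertex([2, 1, 0]), z_beta(6), hat(C[:6]))
```

For every prefix $u$ of the hardcoded digits, `end_vertex(hat(u))` returns `0`, in line with $q(\hat u) = q_0$, and `hat(u)` differs from `u` in exactly one position.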
LEMMA 2.1. Let $a = a_1 \cdots a_k$ and $b = b_1 \cdots b_\ell$ be two prefixes of $c^\beta$. If $ab \in L^\beta$, then $ab$ is a prefix of $c^\beta$.

Proof. By hypothesis […]

LEMMA 2.2. Let $p_1 := z^\beta(c_1)$, where $c_1$ is the first character of $c^\beta$. Then the mapping on $L^\beta$, $w \mapsto \hat w$, is at most $(p_1 + 2)$-to-one, and $s(\hat w) = \epsilon$.
[…] (6)

On the other hand, if $|\hat u| > p_1 + 1$, then the first character of $\hat u$ is $c_1$. Let $\hat w = \hat w'$. Let $s(w')$ be the largest suffix of $w'$ among the first $|w'| + 1$ elements of the list $P^\beta$. We write $w' = v'u'$ with $u' = s(w')$. Let $w = vu$, $u = s(w)$, such that $\hat w = \hat w'$. We have $|u| \le |u'|$, otherwise $\hat w = \hat w'$ would imply that $s(w')$ is not maximal. In particular, if $s(w') = \epsilon$, then $s(w) = \epsilon$ and $w = w'$.
Suppose that $\hat w = \hat w'$ and $p_1 + 2 \le |u| < |u'|$. Then the first character of $\hat u$ is $c_1$ and also the first character of $\hat u'$ is $c_1$ since $|u| \ge p_1 + 2$. By hypothesis $|u| < |u'|$. This implies that $v = v'a$ with $a$ a prefix of $\hat u'$. Hence the first letter of $a$ is the first letter of $\hat u'$, which is $c_1$, and the letter following $a$ is the first letter of $\hat u$, which is $c_1$. We have $\hat u' = a \hat u$ and $u' = s(w') \in P^\beta$. By definition of the map $w \mapsto \hat w$ (see (5) and (3)) we conclude that $a$ is a prefix of $c^\beta$. By Lemma 2.1, $au$ is a prefix of $c^\beta$, contradicting the maximality of $u$. Therefore $|u| = |u'|$, and in this case the mapping $w \mapsto \hat w$ is two-to-one. In the remaining cases $|s(w)| \le p_1 + 1$. Therefore the mapping $w \mapsto \hat w$ is at most $(p_1 + 2)$-to-one.
It is henceforth simply called the β-shift.
We can always extend $w$ to the left by 0, that is, there exists $y \in \Sigma^\beta$ with $y_j = 0$ for $j < k$ and $y_j = x_j$ for $j = k, \ldots, k + m - 1$. We can also extend $w$ to the right by 0. If $q(w) = q_0$, this is clear. If $w = vu$, $s(w) = u \neq \epsilon$, then $u = c_1 \cdots c_p$ for some $p \ge 1$. When $c_{p+1} \neq 0$, we may change $c_{p+1}$ into 0. When $c_{p+1} = \cdots = c_{p+r} = 0$, but $c_{p+r+1} \neq 0$, we may change $c_{p+r+1}$ into 0. Hence there exists $y \in \Sigma^\beta$ with $y_j = x_j$ for all $j = k, \ldots, k + m - 1$, and $y_j = 0$ for all $j < k$ and $j \ge k + m$.

Functions of bounded total oscillations.
We recall the definition of a function of bounded total oscillations. For details we refer to [PS3, §3]. Let $\bar f \in C(A^{\mathbb{Z}})$ be a continuous function defined on the full shift $A^{\mathbb{Z}}$. On $A^{\mathbb{Z}}$ we define for each $i \in \mathbb{Z}$,
$$\delta_i(\bar f) := \sup\{|\bar f(x) - \bar f(y)| : x_j = y_j \text{ for all } j \neq i\}, \qquad \|\bar f\|_\delta := \sum_{i \in \mathbb{Z}} \delta_i(\bar f).$$
A function has bounded total oscillations if $\|\bar f\|_\delta < \infty$. On a subshift $X \subset A^{\mathbb{Z}}$, $\delta_i(f)$ may not make sense. If $f \in C(X)$ is a continuous function on $X$ and has a continuous extension […] An extension $\bar f$ of $f$ exists [PS3, Proposition 3.2]. For $f \in C(X)$ we define […] Examples of functions of bounded total oscillations are given in [PS3]. The set of functions of bounded total oscillations is a Banach space $B(\Sigma^\beta)$ with the norm […] [PS3, Proposition 3.1].

We prove two basic estimates for functions of bounded total oscillations. For convenience, from now on we write $\bar f$ for a continuous extension of $f$ to $A^{\mathbb{Z}}$. The arguments do not require that $\bar f$ satisfies $\|\bar f\|_\delta = \|f\|_\delta$ but just that $f \approx \bar f$ and $\|\bar f\|_\delta < \infty$. Fundamental to many of the arguments is the following lemma.
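For a function depending on finitely many coordinates the oscillations can be computed by brute force. The sketch below assumes the usual definition $\delta_i(f) = \sup\{|f(x) - f(y)| : x_j = y_j \text{ for all } j \neq i\}$ and $\|f\|_\delta = \sum_i \delta_i(f)$. The test function, its weights, the alphabet $\{0,1,2\}$ and the helper names are hypothetical choices made for illustration. The second helper checks the coordinate-by-coordinate bound $|f(x) - f(y)| \le \sum_{j : x_j \neq y_j} \delta_j(f)$ on all pairs of configurations.

```python
from itertools import product


def oscillations(f, alphabet, coords):
    """delta_i(f) for each i in coords, for a function f that depends only on
    the coordinates listed in coords (configurations are dicts coord -> letter)."""
    deltas = {}
    for i in coords:
        best = 0.0
        for letters in product(alphabet, repeat=len(coords)):
            conf = dict(zip(coords, letters))
            for a in alphabet:
                other = dict(conf)
                other[i] = a            # change coordinate i only
                best = max(best, abs(f(conf) - f(other)))
        deltas[i] = best
    return deltas


def telescoping_bound_holds(f, alphabet, coords, deltas):
    """Check |f(x) - f(y)| <= sum of delta_j over coordinates where x, y differ."""
    configs = [dict(zip(coords, t)) for t in product(alphabet, repeat=len(coords))]
    for x in configs:
        for y in configs:
            bound = sum(deltas[j] for j in coords if x[j] != y[j])
            if abs(f(x) - f(y)) > bound + 1e-12:
                return False
    return True


# Hypothetical example: a weighted sum of three coordinates.
weights = {-1: 0.25, 0: 1.0, 1: 0.5}


def f(x):
    return sum(weights[i] * x[i] for i in weights)


deltas = oscillations(f, alphabet=(0, 1, 2), coords=(-1, 0, 1))
total = sum(deltas.values())    # the norm ||f||_delta of this example
print(deltas, total)
```

For this linear example each $\delta_i(f)$ equals $2|a_i|$, where $a_i$ is the weight of coordinate $i$ and 2 is the largest gap between two letters.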
LEMMA 2.4. Let $x, y \in X$ and $\Lambda := \{j : x_j \neq y_j\}$. Then
$$|f(x) - f(y)| \le \sum_{j \in \Lambda} \delta_j(f).$$

Proof. Since $\Lambda$ is at most countable, we can list the elements of $\Lambda$, so that $\Lambda = \{j_1, j_2, \ldots\}$. We define a sequence of elements of $A^{\mathbb{Z}}$ as follows. Let $z^{j_0} := x$. For $j \in \Lambda$, set […] The lemma follows from the identity […]

LEMMA 2.5. Let $f$ be a function of bounded total oscillations on $X$. Given $\varepsilon > 0$, there exists $N_\varepsilon$ such that for $m \ge N_\varepsilon$,
$$\sup\Big\{\sum_{1 \le i \le m} |f(T^i x) - f(T^i y)| : x, y,\ x_k = y_k \text{ for all } k \in \{1, \ldots, m\}\Big\} \le m\varepsilon \tag{7}$$
and
$$\sup\Big\{\sum_{j \notin \{1, \ldots, m\}} |f(T^j x) - f(T^j y)| : x, y,\ x_k = y_k \text{ for all } k \notin \{1, \ldots, m\}\Big\} \le m\varepsilon. \tag{8}$$

Proof. Let $\varepsilon > 0$ be given. There exists $r_\varepsilon$ so that $\sum_{k : |k| > r_\varepsilon} \delta_k(f) \le \varepsilon/2$. If $m > 2r_\varepsilon$, then for $x_{[1,m]} = y_{[1,m]}$ the sum over $[1, m]$ of $|T^i f(x) - T^i f(y)|$ can be written as a sum over […] For $i$ in the middle interval and $j \notin [1, m]$ we have $|i - j| > r_\varepsilon$, so that by Lemma 2.4, […] The proof of the second statement is similar.
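The mechanism behind the first estimate can be checked numerically. If $x$ and $y$ agree on $[1, m]$, then $T^i x$ and $T^i y$ can differ only at coordinates $j - i$ with $j \notin [1, m]$, so the coordinate-by-coordinate bound gives $\sum_{i=1}^m |f(T^i x) - f(T^i y)| \le \sum_{i=1}^m \sum_{j \notin [1,m]} \delta_{j-i}(f)$. The sketch below evaluates this double sum for the hypothetical oscillation profile $\delta_k = 2^{-|k|}$ (an assumption made purely for illustration, as are the function names and the truncation `cutoff`) and finds the first $m$ for which it drops below $m\varepsilon$.

```python
def delta(k):
    """Hypothetical oscillation profile delta_k(f) = 2^{-|k|} (summable)."""
    return 2.0 ** (-abs(k))


def window_bound(m, cutoff=60):
    """Sum over i in [1, m] and j outside [1, m] of delta(j - i), with j
    truncated to [1 - cutoff, m + cutoff]; the neglected tail is below 2^-58."""
    total = 0.0
    for i in range(1, m + 1):
        for j in range(1 - cutoff, m + cutoff + 1):
            if j < 1 or j > m:
                total += delta(j - i)
    return total


def smallest_N(eps, m_max=1000):
    """First m <= m_max with window_bound(m) <= m * eps, or None."""
    for m in range(1, m_max + 1):
        if window_bound(m) <= m * eps:
            return m
    return None


for m in (1, 5, 20, 50):
    print(m, window_bound(m), window_bound(m) / m)
```

For this profile the double sum equals $4(1 - 2^{-m})$, so it stays bounded while $m\varepsilon$ grows linearly; only the coordinates within distance $r_\varepsilon$ of the boundary of $[1, m]$ contribute significantly, which is the content of the splitting into three intervals in the proof.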
2.3. Equilibrium measure and pressure. In our setting a shift-invariant (Borel) probability measure $\nu$ is an equilibrium measure for a continuous function $\phi$ if and only if $\nu$ is a tangent functional to the pressure $p$ at $\phi$ (see [Wa, Theorems 8.2 and 9.5]).

Definition 2.6. An invariant probability measure $\nu$ is a tangent functional to the pressure $p$ at $\phi$ if
$$p(\phi + f) \ge p(\phi) + \int f \, d\nu \quad \text{for all continuous functions } f.$$
The set of tangent functionals to the pressure at $\phi$ is denoted $\partial p(\phi)$.
For each $n \in \mathbb{N}$ we choose a set $E_n \subset \Sigma^\beta$ with the following properties: […]

The result in (11) is independent of the choice of the sets $E_n$. From now on we choose $E_n$ so that if $x \in E_n$, then $x_j = 0$ for all $|j| > n$.

By our choice of $E_n$ we can extend (4) and (5) to the infinite sequence $x^-_k$, since $x_j = 0$ for $j < -n$. Let $s(x^-_k)$ be the largest suffix in $P^\beta$ of $x^-_k$, and […] We have […] (13)

Lemmas 2.7 and 2.8 give basic estimates used in the proof of Theorem 2.12.
LEMMA 2.7. Let $[k, \ell] \subset [-n, n]$ and $\|\phi\|_\delta < \infty$. Then […]

Proof. The first inequality follows from […] Hence the map $f$ is at most $(z^\beta(c_1) + 2)$-to-one ($v$ is fixed, hence $\hat v$ is also fixed). The sequences $x$ and $x' = f(x)$ differ at most at two coordinates, so that by Lemma 2.4, […] This map is at most $|A|^{z^\beta(v)+1}$-to-one. We have […] The two configurations built from $x^-_k$, $v$ and $x^+_\ell$ differ at most at $i = \ell + 1, \ldots, \ell + z^\beta(v) + 1$, so that […] (15) The two configurations built from $x^-_k$, $v$ and $x^+_\ell$ differ at one coordinate, so that […]

LEMMA 2.9. Let $w \in L^\beta_m$ and […] $\in \Sigma^\beta$, […] The pressure $p(\phi)$ is equal to […]

Proof. The two configurations differ at most at one coordinate, so that the sum over $j = 1, \ldots, m$ of the differences of $\phi \circ T^j$ at these two configurations is bounded by $\|\phi\|_\delta$.
(1) If ν is an equilibrium measure for ϕ and if […]

If ν is a weak Gibbs measure, then the empirical measures satisfy a large-deviations principle [PS2]. Large deviations for (one-sided) β-shifts, for any β > 1, and equilibrium measures have been proved by Climenhaga, Thompson and Yamamoto [CTY] for the class of functions satisfying the Bowen condition. From the estimates of Lemma 3.2 and [PS1, Proposition 4.3 and Theorem 3.1], the result of [CTY] is also valid for all equilibrium measures for functions ϕ of bounded total oscillations. This is important since the Bowen condition implies uniqueness of the equilibrium measure for β-shifts, while this is not necessarily the case for functions of bounded total oscillations.

Proof of Theorem 2.12
Let ϕ be a function of bounded total oscillations on $\Sigma^\beta$. In §3.1 we prove upper and lower bounds for $\nu([y_0 \cdots y_{m-1}])$ for any equilibrium measure ν of ϕ. There is no restriction on β > 1. In §3.2 we prove Theorem 2.12.
3.1. Upper and lower bounds. We first assume that there is a unique tangent functional ν to the pressure at ϕ. The result is then extended to any ϕ of bounded total oscillations using a theorem of Mazur and a theorem of Lanford and Robinson (see, for example, [Ru,Appendix A.3.7]).
When there is a unique tangent functional to the pressure at ϕ we can estimate $\nu([y_0 \cdots y_{m-1}])$ using a classical result about differentiability of a convex function, here the pressure, which is a pointwise limit of convex functions [Ro, Theorem 25.7]. Let $u \in L^\beta_m$ be fixed and set […]