Encoding subshifts through sliding block codes

We prove a generalization of Krieger's embedding theorem, in the spirit of zero-error information theory. Specifically, given a mixing shift of finite type $X$, a mixing sofic shift $Y$, and a surjective sliding block code $\pi: X \to Y$, we give necessary and sufficient conditions for a subshift $Z$ of topological entropy strictly lower than that of $Y$ to admit an embedding $\psi: Z \to X$ such that $\pi \circ \psi$ is injective.


Introduction
In a foundational paper [10] in information theory, Shannon introduced a model of a noisy communication channel, in which the input and output are modeled by stationary probability measures on a space of sequences of symbols. Shannon gave conditions under which the input can be recovered from the output, at least with an acceptable rate of error or ambiguity, in the case of a Bernoulli source, and this work has since been extended to more general sources [5].
This paper is motivated by the particular question of when one can ensure zero error, not just almost surely as in information theory but in fact deterministically. A deterministic channel can be modeled by a sliding block code, i.e. a continuous, shift-commuting map on a subshift, on which a stationary process could be supported. In this model, we can apply techniques of symbolic dynamics to investigate the effects of deterministic noise [9], also called distortion [10], which we can interpret as a failure of injectivity of the sliding block code representing the channel, even in the absence of random errors.
The main result of this paper, Theorem 1.1, determines the extent to which the non-injectivity of a sliding block code on a mixing shift of finite type (SFT) can be avoided by restricting to a subshift of the domain. Interpreting the sliding block code as a channel with deterministic noise, Theorem 1.1 characterizes the sources with entropy strictly lower than that of the output which can be transmitted without error or ambiguity.
Theorem 1.1. Let X be a mixing SFT, Y a mixing sofic shift, and π : X → Y a factor code. Let Z be a subshift with topological entropy strictly less than that of Y. Then there exists a subshift Z′ of X conjugate to Z such that $\pi|_{Z'}$ is injective, if and only if for every n ≥ 1, the number of periodic points of least period n in Z is at most the number of periodic points of least period n in Y with a π-preimage of equal least period.
Theorem 1.1 is a generalization of the following theorem of Krieger in the case of unequal entropy; in particular, Theorem 1.1 reduces to Theorem 1.2 in the case that Y = X and π is the identity.
Theorem 1.2 (Theorem 2 in [6]). Let Y be a mixing shift of finite type and Z a subshift. Then there is a subshift Z′ ⊆ Y conjugate to Z if and only if Z and Y are conjugate or the (topological) entropy of Z is less than that of Y and, for every n ≥ 1, the number of periodic points of least period n in Z is at most the corresponding number in Y.
We note that with X, Y, Z, π as in the statement of Theorem 1.1, clearly there exists a subshift Z′ of X conjugate to Z such that $\pi|_{Z'}$ is injective if and only if there exists a sliding block code ψ : Z → X such that π ∘ ψ is injective, in which case Z′ = ψ(Z) ⊂ X. To verify the "only if" statement in Theorem 1.1, suppose that there is a subshift Z′ of X conjugate to Z such that $\pi|_{Z'}$ is injective. Let y ∈ π(Z′) be periodic. Let $x = (\pi|_{Z'})^{-1}(y)$ be the unique preimage of y in Z′. Then the orbit of x is in bijection with the orbit of y; otherwise, π would fail to be injective on the orbit of x, which is contained in Z′. In particular, x has finite orbit, so x is periodic, moreover with per(x) = per(y). Thus, every periodic point in π(Z′) ⊂ Y has a periodic preimage in Z′ ⊂ X of equal least period, which shows the necessity of the stated condition.
Both Theorem 1.1 and Theorem 1.2 give conditions for the existence of an embedding in terms of entropy and a periodic point condition. The following corollary, which we prove in Section 5, shows that the periodic point condition can be removed in exchange for a small loss of injectivity.
Corollary 1.3. Let X be a mixing SFT, Y a mixing sofic shift, and π : X → Y a factor code. Let Z be a subshift with topological entropy strictly less than that of Y. Then there exist a subshift Z′, a finite-to-one factor code χ : Z′ → Z, and a sliding block code ψ : Z′ → X such that π ∘ ψ is injective. Moreover, if Z is mixing sofic with positive entropy (i.e. not a single fixed point), then Z′ can be taken to be a mixing SFT and χ can be taken to be almost invertible.
The code χ is in fact injective except on points in Z′ whose images in Z are backward-asymptotic to one of finitely many periodic points in Z. See Lemma 2.2 and Remark 2.3. From Corollary 1.3, we can immediately conclude the following, with h denoting the topological entropy of a subshift.
Corollary 1.4. Let X be a mixing SFT, Y a mixing sofic shift, and π : X → Y a factor code. For any ε > 0, there exists a mixing SFT Z ⊂ X with h(Z) > h(Y) − ε such that $\pi|_Z$ is injective.
The proof of Theorem 1.1 adapts the strategy used to prove Theorem 1.2 in [6,7] and related results in [1]. The outline of the proof is as follows. We use a marker set, as in the proof of Theorem 1.2, to break points in Z into moderate blocks and long periodic blocks, separated by marker coordinates. We code these separately using certain "data blocks" in Y, some of moderate length and some long and periodic, where the long periodic data blocks come from periodic points with π-preimages of equal least period in X. A block in Z between marker coordinates is coded to a data block in Y which is shorter by an additive constant, so that there are gaps between the data blocks, filled with repetitions of a "blank" symbol. We then lift the data blocks from Y to data blocks from X, then replace the blanks with a "stamp" block from X to form a valid point in X. The stamp block is chosen to ensure that once the point in X is coded into Y by π, the locations of the stamp, and thus of the marker coordinates, can be recognized. These manipulations of markers, blanks, and stamps are presented in detail in Section 3, while the quantitative arguments required to construct the data blocks and stamps are given in Section 4.
The statement of Theorem 1.2 is false for X merely mixing sofic, and to date there is no known characterization of the subshifts that embed into a given mixing sofic shift, though some sufficient conditions are known [1,11]. Theorem 1.1 sheds some light on this problem, without resolving it. Salo-Törmä have answered [4] the following related question: let Y be a mixing sofic shift and Z ⊂ Y a mixing SFT. For which such Y, Z do there exist a mixing SFT extension π : X → Y and a (mixing SFT) Z′ ⊂ X such that $\pi|_{Z'} : Z' \to Z$ is a conjugacy? However, it is unclear how the conditions given in that answer compare to those in Theorem 1.1, or to the results given in [11]. As a final related question, when Y is an SFT and Z is conjugate to Y, the existence of an SFT Z′ ⊂ X conjugate to Z such that $\pi|_{Z'} : Z' \to Y$ is a conjugacy, i.e. is surjective as well as injective, has been studied in [3], continuing work from [9].
Conventions, definitions, and background on symbolic dynamics

Subshifts and sliding block codes
Let A be a finite set with the discrete topology, which we will call an alphabet. The set $A^{\mathbb{Z}}$ of bi-infinite sequences over A, equipped with the product topology, is called the full shift over A, so called because the shift action $\sigma : A^{\mathbb{Z}} \to A^{\mathbb{Z}}$, given by $(\sigma x)_i = x_{i+1}$, is a homeomorphism. A closed, shift-invariant subset of the full shift is called a subshift. The topology on $A^{\mathbb{Z}}$ is generated by cylinders, which are sets of the form $[w]_i = \{x \in A^{\mathbb{Z}} \mid x_{[i,i+n)} = w\}$, where $w \in A^n$ is a block or word of length $n \in \mathbb{N}$, and $i \in \mathbb{Z}$. Note that by shift-invariance, for any subshift $X \subset A^{\mathbb{Z}}$ and any block $w \in A^*$, we have $X \cap [w]_i \neq \emptyset$ for some $i \in \mathbb{Z}$ if and only if $X \cap [w]_i \neq \emptyset$ for all $i \in \mathbb{Z}$.
A subshift $X \subset A^{\mathbb{Z}}$ is characterized by the set B(X) of blocks $w \in A^*$ such that $X \cap [w] \neq \emptyset$, called the language of X. When the intended subshift X is clear, we write $[w]_i$ for $X \cap [w]_i$. We write $B_n(X) = B(X) \cap A^n$. We can equivalently characterize a subshift by a set of forbidden words $F \subset A^*$, writing $X_F := A^{\mathbb{Z}} \setminus \bigcup_{w \in F} \bigcup_{i \in \mathbb{Z}} [w]_i$. Note that in general $F \subsetneq A^* \setminus B(X_F)$. For a given subshift $X \subset A^{\mathbb{Z}}$, there may be several different sets of forbidden words $F \subset A^*$ such that $X = X_F$. A shift of finite type (SFT) is a subshift X such that $X = X_F$ for some finite set F. A k-step SFT over A is an SFT of the form $X = X_F$ for some set $F \subset A^{k+1}$.
It is a theorem (the Curtis-Hedlund-Lyndon theorem) that, for subshifts X, Y, a function φ : X → Y is continuous and shift-equivariant if and only if it is a sliding block code, which means that there exist m, n ≥ 0 and $\Phi : B_{m+n+1}(X) \to B_1(Y)$ such that for every x ∈ X and every $i \in \mathbb{Z}$, $\varphi(x)_i = \Phi(x_{[i-m,i+n]})$. We say that φ is a k-block code if m + n + 1 = k. A factor code is a surjective sliding block code, and for a sliding block code φ defined on a subshift X, we say that the image φ(X) is a factor of X, and that X, or more properly φ : X → φ(X), is an extension of φ(X). A sofic shift (from the Hebrew סופי, "sofi", meaning "finite") is any factor of a shift of finite type. An injective sliding block code is called an embedding, and a bijective sliding block code is called a conjugacy. The properties of being sofic and of finite type are both invariant under conjugacy.
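As a concrete illustration of the definition above, the following minimal sketch applies a block map Φ with memory m and anticipation n to a periodic point, represented by one period. The function name and the XOR example on the full 2-shift are assumptions chosen for illustration and do not come from the paper.

```python
def sliding_block_code(word, block_map, memory, anticipation):
    """Apply a block map to the periodic point word^∞.

    Indices wrap around modulo len(word), so the result is one period
    of the image point.
    """
    n = len(word)
    return tuple(
        block_map[tuple(word[(i + j) % n] for j in range(-memory, anticipation + 1))]
        for i in range(n)
    )

# Hypothetical example: the XOR map on the full 2-shift (m = 0, n = 1).
xor_map = {(a, b): a ^ b for a in (0, 1) for b in (0, 1)}

image_of_zeros = sliding_block_code((0,), xor_map, 0, 1)
image_of_ones = sliding_block_code((1,), xor_map, 0, 1)
image_of_01 = sliding_block_code((0, 1), xor_map, 0, 1)
```

In this toy example both fixed points have the same image, so the code is not injective, and the point of least period 2 is sent to a fixed point, so least periods can drop under a sliding block code.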
A subshift X is said to be irreducible if for all u, w ∈ B(X), there exists v ∈ B(X) such that uvw ∈ B(X), and strongly irreducible with gap g ≥ 1 if, for any u, w, we can always take $v \in B_g(X)$. Any factor of an irreducible (resp. strongly irreducible) subshift is irreducible (resp. strongly irreducible). A periodic point in a subshift X is a point x ∈ X with $x = \sigma^n x$ for some n ≥ 1; we say that x has period n. The least period per(x) of a periodic point x is the least n such that $\sigma^n x = x$. Note that $|\{\sigma^n x \mid n \in \mathbb{Z}\}| = \mathrm{per}(x)$. We write P(X) for the set of periodic points in a subshift X, $Q_n(X)$ for the set of periodic points of least period n, and $q_n(X) = |Q_n(X)|$. The number of periodic points of a given least period is a conjugacy invariant.
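For a 1-step SFT given as the vertex shift of a 0/1 matrix A, the number of points fixed by $\sigma^n$ is tr($A^n$), and $q_n$ follows by Möbius inversion over the divisors of n. The sketch below is only an illustration of that standard computation; the helper names and the golden mean example are assumptions, not part of the paper.

```python
def mat_mul(a, b):
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def trace_of_power(a, n):
    """tr(A^n): the number of points x in the vertex shift with sigma^n x = x."""
    p = a
    for _ in range(n - 1):
        p = mat_mul(p, a)
    return sum(p[i][i] for i in range(len(a)))

def mobius(n):
    """Mobius function, by trial division."""
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0  # a squared prime divides the original n
            result = -result
        d += 1
    if n > 1:
        result = -result
    return result

def least_period_count(a, n):
    """q_n for the vertex shift of A: sum over d | n of mobius(n/d) * tr(A^d)."""
    return sum(mobius(n // d) * trace_of_power(a, d)
               for d in range(1, n + 1) if n % d == 0)

# Golden mean shift (no two consecutive 1s), states indexed by symbol.
golden_mean = [[1, 1], [1, 0]]
counts = [least_period_count(golden_mean, n) for n in range(1, 7)]
```

Here `counts` lists $q_1, \dots, q_6$ for the golden mean shift; each $q_n$ is divisible by n because points of least period n come in orbits of size n.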
It is a theorem that periodic points are dense in any irreducible shift of finite type. The period per(X) of an irreducible shift of finite type X is the gcd of the periods of the periodic points of X. An irreducible SFT with period 1 is said to be aperiodic. An irreducible SFT is strongly irreducible if and only if it is aperiodic, if and only if it has periodic points of all sufficiently high periods. For irreducible sofic shifts, strong irreducibility is equivalent to having periodic points of all sufficiently high periods, which clearly implies that the periods have gcd 1, but the reverse implication fails. For example, consider the odd shift over {0, 1}, in which the block $10^n1$ is permitted only for odd n. This is an irreducible sofic shift which contains the fixed point $0^\infty$, so the periods of periodic points trivially have gcd 1. However, the odd shift has no other periodic points of odd period. We follow the convention of the literature in referring to strongly irreducible sofic shifts (in particular SFTs) as mixing sofic shifts (mixing SFTs), because they are also characterized by a topological mixing property, but we will not use that property explicitly, so we do not define it here.
The following definition is new, and we use it extensively.
Definition 2.1. Let X and Y be subshifts and let π : X → Y be a factor code. We write $R_n(\pi)$ for the set of periodic points y ∈ Y with per(y) = n such that y = π(x) for some periodic point x ∈ X with per(x) = per(y). We write $r_n(\pi) = |R_n(\pi)|$.
For a subshift X, the (topological) entropy of X is the value $h(X) = \inf_{n \ge 1} \frac{1}{n} \log |B_n(X)|$; in fact, the limit $\lim_{n \to \infty} \frac{1}{n} \log |B_n(X)|$ exists and is equal to h(X). For a mixing sofic shift (in particular, a mixing SFT) X, we also have $h(X) = \lim_{n \to \infty} \frac{1}{n} \log q_n(X)$. Entropy is non-increasing under factor codes and is thus a conjugacy invariant, though certainly not a complete invariant. For any irreducible sofic shift X, and any proper subshift V ⊂ X, we have h(V) < h(X). In Section 4, we use the following lemma of Marcus, which allows us to approximate a sofic shift from the inside by SFTs in terms of entropy.
Lemma 2.1 (Proposition 3 in [8]). Let Y be a sofic shift. For every ε > 0, there exists an irreducible SFT
For any subshift X and any k ≥ 1, we can form the kth higher block shift $X^{[k]}$ with alphabet $B_k(X)$, where if and only if for each i, j we have $a_{i,j} = a_{i+1,j-1}$, so that and $a_1 a_2 \ldots a_{k+\ell} \in B(X)$. Observe that X and $X^{[k]}$ are conjugate for any subshift X and any k ≥ 1. Moreover, if X is an m-step SFT and k ≤ m − 1, then $X^{[k]}$ is an (m − k)-step SFT. In particular, every SFT is conjugate to a 1-step SFT, and every sliding block code on an SFT can be written as a composition of a conjugacy and a 1-block code. We will therefore frequently assume WLOG that a given sliding block code on an SFT is a 1-block code on a 1-step SFT.
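The entropy formula above can be checked numerically on a small example. The following sketch, under the assumption that the SFT is the vertex shift of an irreducible 0/1 matrix (so that every path in the graph is a block of the language), counts $|B_n(X)|$ and compares $\frac{1}{n}\log|B_n(X)|$ with the logarithm of the Perron eigenvalue for the golden mean shift. The function name and the example are illustrative only.

```python
import math

def count_blocks(a, n):
    """|B_n(X)| for the vertex shift of an irreducible 0/1 matrix A,
    computed as the sum of the entries of A^(n-1)."""
    size = len(a)
    row = [1] * size  # number of allowed blocks of length 1 ending at each state
    for _ in range(n - 1):
        row = [sum(row[i] * a[i][j] for i in range(size)) for j in range(size)]
    return sum(row)

golden_mean = [[1, 1], [1, 0]]
h_exact = math.log((1 + math.sqrt(5)) / 2)  # log of the Perron eigenvalue
estimates = [math.log(count_blocks(golden_mean, n)) / n for n in (1, 10, 100)]
```

Consistent with the infimum formula, each estimate is an upper bound for `h_exact`, and the estimates approach it as n grows.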
For a sliding block code on an irreducible shift of finite type, either every fiber is a finite set (indeed, of bounded cardinality), in which case the code is said to be finite-to-one and the entropy of the image is equal to that of the domain, or almost every fiber is uncountable, and the entropy of the image is strictly less than that of the domain. In the finite-to-one case, the minimum fiber cardinality is generic and is known as the degree. In particular, a code (on an irreducible SFT) with degree 1 is said to be almost invertible. It is a theorem that every irreducible (resp. mixing) sofic shift is an almost invertible factor of an irreducible (resp. mixing) SFT. We use the following construction of almost invertible codes, known as the "blowing-up lemma", in the proof of Corollary 1.3 in Section 5.
Lemma 2.2 (Lemma 10.3.2, [7]). Let Z be a mixing SFT and let z ∈ Z be a periodic point with least period p. Let M ≥ 1. Then there exist a mixing SFT Z′ and an almost invertible factor code χ : Z′ → Z such that the preimage of the orbit of z under χ is a single orbit of length Mp.
Remark 2.3. Note that in [7], the extension χ in Lemma 2.2 is only stated to be finite-to-one, but the existence of periodic points having unique preimage already implies almost invertibility. Indeed, the construction in [7], based on work in [1], in fact shows that χ is injective except on the points that are backward-asymptotic to points in the preimage of the orbit of z, where we say that two points z, z′ are backward-asymptotic if there exists $n \in \mathbb{Z}$ such that $z_i = z'_i$ for all $i \le n$.

Markers and Markov approximations
We now recall the constructions with markers and long periodic blocks that are at the heart of the proof of Theorem 1.1. For an alphabet A, we say that a block w = w
Lemma 2.5 (Lemma 2 in [6]). Let Z be a subshift. For any N ≥ 1, there exists a subset F ⊂ Z, which can be taken to be a finite union of cylinders, such that:
For any subshift X and any n ≥ 1, we can form the nth Markov approximation $X_n$, which is the SFT defined by allowing precisely the blocks of length n which appear in X. Clearly $X_{n+1} \subset X_n$. It is an exercise to show that for any ε > 0 and any N ≥ 1, there exists N′ ≥ N such that $h(X_{N'}) < h(X) + \varepsilon$ and $q_n(X_{N'}) = q_n(X)$ for all n ≤ N. In Lemma 2.6, we use the Markov approximation, together with higher block shifts, to show that in the proof of Theorem 1.1, we can assume WLOG that Z is a 1-step SFT, which allows us to apply Lemma 2.4.
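To make the Markov approximation concrete, the sketch below computes $B_n$ for a sofic shift presented by a labeled graph and tests whether a periodic point lies in the nth Markov approximation. The even shift (blocks $10^k1$ allowed only for even k) and its two-state presentation are a standard example chosen here as an assumption; the function names are hypothetical.

```python
def language(n, edges, states):
    """B_n of the sofic shift presented by a labeled graph in which every
    state has an incoming and an outgoing edge: labels of all paths of length n."""
    frontier = {(s, ()) for s in states}
    for _ in range(n):
        frontier = {(t, w + (label,))
                    for (s, w) in frontier
                    for (src, label, t) in edges if src == s}
    return {w for (_, w) in frontier}

def in_markov_approximation(word, n, blocks):
    """Whether the periodic point word^∞ lies in X_n, i.e. whether every
    window of length n in word^∞ belongs to `blocks` = B_n(X)."""
    p = len(word)
    return all(tuple(word[(i + j) % p] for j in range(n)) in blocks
               for i in range(p))

# Even shift: state A reads 1 and stays, or reads 0 and moves to B; B reads 0 back to A.
even_edges = [("A", 1, "A"), ("A", 0, "B"), ("B", 0, "A")]
point = (1, 0, 0, 0)  # the periodic point (1000)^∞, which contains the block 10001
in_x3 = in_markov_approximation(point, 3, language(3, even_edges, "AB"))
in_x5 = in_markov_approximation(point, 5, language(5, even_edges, "AB"))
```

In this example the point $(1000)^\infty$ is not in the even shift, yet it survives in $X_3$ and is excluded only from $X_5$ onward, which is why the approximation order N′ has to be taken large relative to the periods one wants to control.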
We remark that there are versions of Lemma 2.5 which obviate the need for Lemma 2.4. However, for our purposes in this paper, embedding Z into an SFT has the additional benefit that the rate of convergence of $\frac{1}{n} \log q_n(Z)$ to h(Z) can be easily estimated when Z is an SFT (see e.g. [7], pp. 349-351), which gives a procedure for deciding whether a given X, Y, π, Z satisfy the periodic point condition in Theorem 1.1, assuming that h(Z) < h(Y) (namely, compute N ≥ 1 such that for all n ≥ N, $q_n(Z) < r_n(\pi)$, then check all n ≤ N to determine whether $q_n(Z) \le r_n(\pi)$).
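The finite part of this procedure (checking $q_n(Z) \le r_n(\pi)$ for n up to a given bound) can be sketched by brute force. The code below is an illustration only: it assumes X is a 1-step SFT given by its allowed 2-blocks, takes π to be a 2-block map applied cyclically to periodic points, and uses the XOR map on the full 2-shift as a hypothetical channel. It does not compute the bound N, which in the text comes from the asymptotic estimates.

```python
from itertools import product

def is_primitive(w):
    """True if the periodic point w^∞ has least period exactly len(w)."""
    return all(w[k:] + w[:k] != w for k in range(1, len(w)))

def least_period_points(n, alphabet, allowed):
    """One period w of each point w^∞ of least period n in the 1-step SFT
    whose allowed 2-blocks are `allowed`."""
    return [w for w in product(alphabet, repeat=n)
            if is_primitive(w)
            and all((w[i], w[(i + 1) % n]) in allowed for i in range(n))]

def r_n(n, alphabet, allowed, block_map):
    """Number of image points of least period n that have a preimage of
    least period n, for a 2-block map applied cyclically."""
    images = {tuple(block_map[(w[i], w[(i + 1) % n])] for i in range(n))
              for w in least_period_points(n, alphabet, allowed)}
    return sum(1 for y in images if is_primitive(y))

def condition_holds(q, r, bound):
    """Check q(n) <= r(n) for every 1 <= n <= bound."""
    return all(q(n) <= r(n) for n in range(1, bound + 1))

bits = (0, 1)
full = {(a, b) for a in bits for b in bits}            # X: full 2-shift
xor_map = {(a, b): a ^ b for a in bits for b in bits}  # hypothetical channel
golden = {(0, 0), (0, 1), (1, 0)}                      # Z: golden mean shift

r = lambda n: r_n(n, bits, full, xor_map)
q_golden = lambda n: len(least_period_points(n, bits, golden))
q_fixed = lambda n: 1 if n == 1 else 0                 # Z: a single fixed point

r_values = [r(n) for n in range(1, 5)]
```

For this toy channel no point of least period 2 in the image has a preimage of least period 2, so the golden mean shift already fails the condition at n = 2, while a single fixed point passes the check up to the tested bound.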
Lemma 2.6. Let X be a mixing SFT, Y a mixing sofic shift, and π : X → Y a factor code. Let Z be a subshift with h(Z) < h(Y) and $q_n(Z) \le r_n(\pi)$ for all n ≥ 1. Then there exists a 1-step SFT Z′ such that Z embeds into Z′, h(Z′) < h(Y), and $q_n(Z') \le r_n(\pi)$ for all n ≥ 1.
We defer the proof of Lemma 2.6 to Section 5.

Coding
In Section 3.1, we introduce two coding constructions, namely subshifts with blanks adjoined, Definition 3.1, and stamps, Definition 3.2, then use them to create one side of an interface between Z on the one hand and π : X → Y on the other. In Section 3.2, we use markers in Z to construct the other side of this interface. In Section 3.3, we use stamps to give a construction of SFTs analogous to S-gap shifts. We use this construction in Section 4.2 to construct the shifts that are used in Section 3.1 and Section 3.2.

Blanks and stamps
As outlined in Section 1, the proof of Theorem 1.1 involves coding Z into X via certain intermediate subshifts which consist of long "data" blocks separated by blanks. We now define this construction precisely. The purpose of the Blanks construction is to provide an interface between the channel π : X → Y and the subshift Z to be embedded. One side of this interface, namely the embedding of a Blanks subshift into X, is specified in Proposition 3.4. The construction involves particular blocks, which we call stamps, that can be unambiguously recognized in the following sense:
Remark 3.1. In Definition 3.2, continuing with the notation there, we do not explicitly require $u_1 v_1 \mu v_2 u_2$ to be legal in X. Doing so would neither affect the results nor simplify the proofs. In all of the examples we consider, such blocks will in fact be legal in X.
Proposition 3.2. Let Y be a strongly irreducible subshift with gap g and W ⊂ Y a proper subshift. For every k ≥ g and every sufficiently large n, there exists a (Y, W, k) stamp of length n.
We defer the proof of Proposition 3.2 to Section 4.1, but before applying stamps in Proposition 3.4, we prove a lemma that expresses how stamps are actually used in our constructions.
Proof. By the hypotheses on µ, $\gamma^\pm$, and w, and Definition 3.2, µ appears exactly once in each subblock $\mu\gamma^- w$, $w\gamma^+\mu$. An appearance of µ other than at the positions explicitly indicated must therefore overlap both of these subblocks. Since |w| ≥ |µ|, µ must therefore be a subblock of w, contradicting the hypothesis that w ∈ B(W) and µ ∈ B(Y) \ B(W).
We now give one of the main coding constructions (Proposition 3.4), embedding a subshift with blanks adjoined, and with blocks from a subshift V ⊂ X, into X via a sliding block code γ, such that π ∘ γ is injective. The large amount of data in the statement is representative of the complexity of the construction and the modular nature of the proof.
Proposition 3.4. Let X be a mixing SFT with gap g, let Y be a mixing sofic shift, and let π : X → Y be a 1-block factor code.
Let V ⊂ X, W = π(V) ⊂ Y be proper subshifts. Let * be a symbol not appearing in the alphabets of where either $w_i \in M$ or $w_i = x_T$ for some x ∈ R and T an interval with 2N + 1 where either $w_i \in M$ or $w_i = y_T$ for some y ∈ R and T an interval with 2N + 1 ≤ |T| ≤ ∞. For η of this form, using Lemma 2.4, we can use the injections κ, λ to reconstruct a unique ξ ∈ V[*] such that π[*](ξ) = η.
We now suppose that each block in M has length at least N and that we have a (Y, W, g) stamp µ ∈ B(Y) \ B(W) such that |µ| ≤ N. Under these assumptions, we construct a sliding block code γ : V[*] → X and show that π ∘ γ is injective. Fix a π-preimage $\tilde\mu$ of µ, and let ℓ = |µ| + 2g. Using the hypothesis that X is a mixing 1-step SFT, define maps $\gamma^\pm : B_1(V) \to B_g(X)$ such that, for $a, b \in B_1(V)$, we have $\tilde\mu\gamma^-(a)a,\ b\gamma^+(b)\tilde\mu \in B(X)$. We then have a sliding block code γ : V[*] → X, given by replacing each block $b *^\ell a$ by $b\gamma^+(b)\tilde\mu\gamma^-(a)a$, and leaving the non-blank symbols unchanged. Let where $a_i$, $b_i$ are, respectively, the initial and terminal symbols of $v_i$. In turn, we have Moreover, by Lemma 3.3 and the lower bound on lengths of blocks in M, it follows that µ appears in π ∘ γ(ξ) only where $\tilde\mu$ appears at the same position in γ(ξ). By replacing, in π ∘ γ(ξ), each appearance of µ, and the blocks of length g to the left and right of µ, with $*^\ell$, we obtain the point , from which ξ can be recovered since π[*] is a conjugacy.
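The replacement of blanks by stamps, and its inversion, can be illustrated on finite strings. The following toy sketch makes strong simplifying assumptions that are not from the paper: X = Y is the full 2-shift, π is the identity, the data blocks come from the golden mean shift (no 11), the gap is g = 1, both connectors are a constant symbol, and the stamp is the block 011110, which contains a run of four 1s that cannot arise in data or across a connector.

```python
STAMP = "011110"  # toy stamp: the run 1111 cannot occur in data or at a connector
GAP = 1           # toy gap g
BLANKS = "*" * (len(STAMP) + 2 * GAP)  # the block *^l with l = |stamp| + 2g

def encode(xi, connector="0"):
    """Toy analogue of gamma: replace each *^l by connector + stamp + connector."""
    return xi.replace(BLANKS, connector * GAP + STAMP + connector * GAP)

def decode(x):
    """Locate each stamp and put blanks back over it and its two connectors."""
    out = list(x)
    start = x.find(STAMP)
    while start != -1:
        lo, hi = start - GAP, start + len(STAMP) + GAP
        out[lo:hi] = BLANKS
        start = x.find(STAMP, start + 1)
    return "".join(out)

# A finite piece of a point with blanks adjoined: golden-mean data blocks
# separated by runs of blanks.
xi = "0100101" + BLANKS + "1001000" + BLANKS + "0010100"
encoded = encode(xi)
recovered = decode(encoded)
```

The round trip works because the stamp occurs in the encoded string only where it was inserted, which is the role played by Lemma 3.3 in the actual construction; with a nontrivial π one would search for µ in the image rather than in the encoded point itself.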

Blanks and markers
We now prove a lemma that encapsulates the use of marker constructions in our proof of Theorem 1.1.
Lemma 3.5. Let Z, W be subshifts with Z a 1-step SFT. Let N, ℓ ≥ 1 be such that $q_n(Z) \le q_n(W)$
Remark 3.6. The lower bound on the length of blocks in M is not in fact needed for Lemma 3.5, but it is needed in order to apply Lemma 3.5 in conjunction with Proposition 3.4 in the proof of Theorem 1.1 below.
Proof. Let F be a marker set for Z with parameter N. For z ∈ Z, let $A(z) = \{i \in \mathbb{Z} \mid \sigma^i z \in F\}$. Enumerate each A(z) as $\{a_j(z)\}_{j \in J(z)}$ where the index set J(z) may be the empty set, or a finite set, or the integers, or the positive or negative natural numbers, and where $a_j(z) < a_{j+1}(z)$ for each j. We refer to the elements of A(z) as marker coordinates for z. Say that T is a marker interval for z if: $T = [a_j(z), a_{j+1}(z))$ where $a_j(z)$, $a_{j+1}(z)$ are both defined; or We construct an embedding of Z into Blanks(M, Q, N, *, ℓ) by constructing a function Φ that maps a block occurring between marker coordinates to a data block padded with $*^\ell$. Let $c_n$ : Define φ : Z → W by declaring that $\varphi(z)_T = \Phi(z_T)$ whenever T is a marker interval for z. We need to show that φ is an embedding. Certainly φ is shift-commuting, since, if T is a marker interval for z, then T − 1 is a marker interval for σz, so Thus indeed φ(σz) = σφ(z). Moreover, φ is injective because the appearances of β in φ(z) allow us to reconstruct the marker coordinates, and then the injectivity of Φ allows us to reconstruct $z_T$ for each marker interval T for z.
We need to show finally that φ is continuous, i.e. that for z ∈ Z, $\varphi(z)_0$ depends only on $z_{[-L,L]}$ for some finite L independent of z. To see this, let
The remainder of the proof of Theorem 1.1 relies on the following proposition, the proof of which is taken up in Section 4.
Proposition 3.7. Let X be a mixing SFT with gap g, Y a mixing sofic shift, and π : X → Y a factor code. Let Z be a subshift with h(Z) < h(Y) and $q_n(Z) \le r_n(\pi)$ for every n ≥ 1. Then there exist:
Proof of Theorem 1.1. By Lemma 2.6, assume WLOG that Z is a 1-step SFT. Let N, ℓ, V ⊂ X, W = π(V) ⊂ Y, and µ be as in Proposition 3.7. Let $M \subset \bigcup_{n=N}^{2N-1} B_n(W)$ be as in Lemma 3.5, and let R ⊂ Each of the orbits in R is, by the definition of $R_n$, necessarily the image of an orbit with equal cardinality in V. Here, R takes the role that Q plays in Lemma 3.5, but in Lemma 3.5, there was no channel π, and thus no preimage requirement, hence the change in notation. By Lemma 3.5, let φ : Z → Blanks(M, R, N, *, ℓ) be an embedding.

Stamps and SFTs
In this subsection, we prove Lemma 3.9, which, in conjunction with Lemma 2.1, allows us, in Proposition 4.4, to construct a mixing SFT V ⊂ X such that the image π(V) ⊂ Y is a proper subshift of Y but has entropy at least h(Y) − ε for a given ε > 0. It may be possible to give a more efficient construction of such a V, but we have not found one. We first prove Lemma 3.8, which is related to the characterization of SFTs among S-gap shifts (Theorem 3.3 in [2]).
Lemma 3.8. Let X be a mixing SFT with gap g and let $V_0 \subset X$ be an SFT. Let k ≥ g and let $\mu \in B(X) \setminus B(V_0)$ be an $(X, V_0, k)$ stamp. Let N ≥ |µ| and let $V_1 \subset X$ be the closure of the set of points of the form
Assume without loss of generality that X is a 1-step SFT. We first perform a small recoding for convenience, specifically to make it easier to recognize stamps, by replacing X by a conjugate shift $\hat X$. For each x ∈ X, define $\hat x$ as follows: if $x_{[j,j+|\mu|)} = \mu$ for some $j \in \mathbb{Z}$, then for each $i \in [j-k, j+|\mu|+k)$, let $a = x_i$ and let $\hat x_i = \hat a$, where for symbols a, b in the alphabet of X, we have $\hat a = \hat b$ if and only if a = b, and the set of symbols with hats is disjoint from the alphabet of X. If there is no $j \in (i - (|\mu| + k), i + k]$ with $x_{[j,j+|\mu|)} = \mu$, then let $\hat x_i = x_i$. Clearly the map $x \mapsto \hat x$ is a sliding block code, and it is just as clearly injective, since we recover x from $\hat x$ by dropping hats. Therefore $\hat X = \{\hat x \mid x \in X\}$ is a mixing SFT, conjugate to X.
Denote by $\hat V_1 \subset \hat X$ the image of $V_1$ under the map $x \mapsto \hat x$. Let ℓ = |µ| + 2k. Since µ is an $(X, V_0, k)$ stamp, and N ≥ |µ|, blocks of the form $\gamma^+_i \mu \gamma^-_{i+1}$ do not overlap in any point in $V_1$ by Lemma 3.3, so symbols with hats occur in $\hat V_1$ in blocks of length exactly ℓ. The blocks of symbols with hats are separated by blocks from $V_0$. Since $\hat V_1$ is the image of $V_1$ under a conjugacy $X \to \hat X$, $V_1$ is an SFT if and only if $\hat V_1$ is an SFT.
Let m ≥ N be such that $\hat X$ and $V_0$ are m-step SFTs. We claim that if $\hat x \in \hat X$ is such that $\hat x_{[i,i+m]} \in B_{m+1}(\hat V_1)$ for all $i \in \mathbb{Z}$, then $\hat x \in \hat V_1$, which means precisely that $\hat V_1$ is an m-step SFT. To prove this claim, let $F \subset B_{m+1}(\hat X)$ be the set of blocks of length m + 1 which contain at least one of the following: a block of length greater than ℓ in which all symbols have hats; a block without hats that is not in $B(V_0)$; or a block of symbols without hats, of length less than N, bounded on both sides by symbols with hats. Note that F is disjoint from $B_{m+1}(\hat V_1)$. Suppose that $\hat x_{[i,i+m]} \notin F$ for all $i \in \mathbb{Z}$. Then any block of symbols with hats in $\hat x$ has length exactly ℓ, and is thus of the form $\gamma^+ \mu \gamma^-$, where $\gamma^\pm \in B_g(X)$ (with hats added). Furthermore, the blocks separating the blocks with hats must have length at least N and must be in $B(V_0)$, since every subblock of length m + 1 is in $B(V_0)$ and $V_0$ is an m-step SFT. Thus indeed $\hat x \in \hat V_1$, so $\hat V_1$ is indeed an SFT.
To see that ). We do so as follows. Extend $u^-$ on the right to form a block $v^- \in B(V_1)$, which begins with $u^-$ and ends with $\gamma^+_{-1}\mu\gamma^-_0$ where γ Similarly, extend $u^+$ on the left to form a block $v^+ \in B(V_1)$ which ends with $u^+$ and begins with $\gamma^+_0\mu\gamma^-_1$, where $\gamma^+_0$, γ and are joined together in a way that creates no violations of the restrictions defining $V_1$. Letting $u_0$ be the block appearing between $u^-$, $u^+$, such that where m is as above, and and X is a 1-step SFT. Moreover, $\mathrm{per}(x_i)$ divides $\ell + |u_i|$, and gcd(ℓ
As advertised, we now use Lemma 3.8 to prove the following lemma, which is applied in the proof of Proposition 4.4, which in turn is the main input to the proof of Proposition 3.7.
Lemma 3.9. Let X be a mixing SFT, Y a mixing sofic shift, and π : X → Y a factor code. Let $W_0 \subsetneq Y$ be an SFT. Then there exists a mixing SFT
Note that $V_0$ is an SFT since $W_0$ is an SFT. Let g be the mixing gap of X. Let $y \in Y \setminus W_0$ be a periodic point with least period k ≥ g. Such a y certainly exists because periodic points are dense in Y and $W_0$ is a proper subshift. Let k′ be such that y Then every ℓ-block in y is forbidden in $W_0$. In particular, for any $x \in \pi^{-1}(\{y\})$ and any $i \in \mathbb{Z}$, we have $x_{[i,i+\ell)} \notin B(V_0)$. By Proposition 3.2, let µ be an $(X, V_0, g)$ stamp. Let $V_1$ consist of the closure of the set of points of the form
By Lemma 3.8, $V_1$ is indeed a mixing SFT. Note that every point in $V_1$ contains ℓ-blocks permitted in $V_0$, so $V_1$ is disjoint from $\pi^{-1}(\{y\})$, and therefore $\pi(V_1) \subsetneq Y$.

Counting
In this section, we prove Proposition 3.2 and Proposition 3.7, which state the existence and properties respectively of the stamps and the shifts V ⊂ X, W ⊂ Y used in Section 3. Section 4.1 contains two results required for the proof of Proposition 3.2, one (Lemma 4.1) showing that most blocks in a subshift with positive entropy have little self-overlap, and the other (Lemma 4.2) showing that one can assume, at the cost of a small loss of entropy, that a given sufficiently long block appears syndetically in a mixing sofic shift. Section 4.2 then gives a crucial asymptotic result on the number of periodic points in Y with a preimage of equal least period in X, and applies the results from Section 3.3 to construct the shifts V and W.

Self-overlap and stamps
We begin by showing that most blocks have very little self-overlap, which we use both to construct stamps and to determine the asymptotic number of periodic points in Y with a π-preimage of equal least period.
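The scarcity of blocks with large self-overlap can be observed by brute force on the full 2-shift. The sketch below assumes the reading that a block of length n overlaps itself by k, for 0 < k < n, when its prefix of length k equals its suffix of length k; this reading and the function names are assumptions made for illustration.

```python
from itertools import product

def max_self_overlap(w):
    """Largest k < len(w) such that the length-k prefix of w equals its
    length-k suffix, or 0 if there is no such k."""
    n = len(w)
    for k in range(n - 1, 0, -1):
        if w[:k] == w[n - k:]:
            return k
    return 0

def fraction_with_large_overlap(n, alpha, alphabet=(0, 1)):
    """Fraction of blocks of length n over `alphabet` that overlap
    themselves by more than alpha * n."""
    words = list(product(alphabet, repeat=n))
    bad = sum(1 for w in words if max_self_overlap(w) > alpha * n)
    return bad / len(words)

lengths = (4, 8, 12)
fractions = [fraction_with_large_overlap(n, 0.5) for n in lengths]
```

A block that overlaps itself by more than n/2 is determined by a prefix of length less than n/2, so on the full 2-shift the fraction of such blocks is below $2^{-n/2}$; the exponential decay seen in `fractions` is the phenomenon that Lemma 4.1 quantifies for general subshifts of positive entropy.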
) and s = exp(h(Y) + ε). Note that $s^\alpha < r < s$ and that $r^n \le |B_n(Y)|$ for every n. Let $N_0$ be large enough that for all $n \ge N_0$, we have Then the number of blocks in X of length n with self-overlap of more than αn is at most . Then, for n ≥ N, the number of blocks in Y of length n with no self-overlap by more than αn is at least where we can take which is positive by the choice of N.
We now control the entropy loss incurred by requiring a given long block to appear syndetically.
Therefore, by manipulation of logarithms and the fact that h(Y) = inf ℓ≥1 for large enough m, where the final inequality follows from the choices of β and N. We conclude that h(S) = lim inf m→∞
It is clearly enough to prove the result for $u_1$, $u_2$ sufficiently long, since we can then pass to subwords of $u_1$, $u_2$. By Lemma 4.2, let β ∈ (0, 1), m sufficiently large, and $\theta \in B_{\beta m}(Y) \setminus B(W)$ be such that the subshift S ⊂ Y defined by requiring at least one appearance of θ in any block of length m has h(S) > 0. Let α ∈ (0, 1) be arbitrary, and let n > (m + k)/(1 − α) be large enough that, by Lemma 4.1, there exists $\mu \in B_n(S)$ such that µ has no self-overlap by more than αn, in particular by more than n − (m + k).
Let $u_1 \in B_{k_1}(U)$, $u_2 \in B_{k_2}(U)$ with $k_1, k_2 \ge m$ and let $v_1, v_2 \in B_k(Y)$. Then µ cannot appear in $u_1 v_1 \mu v_2 u_2$ except at the position explicitly indicated. Indeed, µ cannot appear at a position shifted by at most m + k, since otherwise µ would overlap itself by too much, and it cannot appear at a position shifted by more than m + k, as it would then overlap with $u_1$ or $u_2$ in a block of length at least m, contradicting the fact that µ ∈ B(S), and thus has θ as a subword.

Entropy and periodic points
We first show that at least a positive fraction of periodic points in Y of sufficient least period have a preimage of equal least period, and in particular that their growth is exponential with rate h(Y).
Proof. Let g be the mixing gap of X. By Lemma 4.1, let c > 0 and N > 3g be such that, for all n ≥ N, the number of blocks in Y of length n − g with no self-overlap by more than n/3 is at least c exp(nh(Y)), where we may take $c = \frac{1}{2} \exp(-gh(X))$. For each block $v \in B_{n-g}(Y)$, there exists a periodic point x ∈ X with $\pi(x)_{[0,n-g)} = v$ such that per(x) divides n. Thus π(x) is also periodic with least period dividing n. Moreover, if v has no self-overlap by more than n/3, then in fact per(π(x)) = n. Therefore $r_n(\pi) \ge c \exp(nh(Y))$, so lim inf n→∞
preimage under $\chi^{(j)}$. Let $\eta^{(1)} = \chi^{(1)}$ and $\eta^{(j+1)} = \eta^{(j)} \circ \chi^{(j+1)}$. Let $Z' = Z^{(C')}$ and $\eta = \eta^{(C')} : Z' \to Z$. Certainly η is almost invertible, so h(Z′) = h( Z) < h(Y). We claim that $q_n(Z') \le r_n(\pi)$ for all n ≥ 1. Indeed, for each j, if $\mathrm{per}(z_j) = k$, then we have $q_k(Z^{(j)}) = q_k(Z^{(j-1)}) - k$, $q_{Mk}(Z^{(j)}) = q_{Mk}(Z^{(j-1)}) + Mk$, and $q_n(Z^{(j)}) = q_n(Z^{(j-1)})$ for all n ∉ {k, Mk}. Therefore $q_k(Z') = r_k(\pi)$, and $q_{Mk}(Z')$ = $q_{Mk}$( Z) + M max{0, $q_k$( Z) − $r_k(\pi)$} ≤ $q_{Mk}$( Z) + CM ≤ $r_{Mk}(\pi)$ where the last inequality follows from the choice of M. Therefore X, Y, π, Z′ satisfy the hypotheses of Theorem 1.1, so there exists a sliding block code ψ : Z′ → X such that π ∘ ψ is injective. This concludes the proof in the case that Z is mixing sofic.
We now handle the general case, where Z is an arbitrary subshift with h(Z) < h(Y). By Lemma 5.2, let V be a mixing SFT containing Z with h(V) < h(Y). By the mixing sofic case, let V′ be a mixing SFT such that X, Y, π, V′ satisfy the hypotheses of Theorem 1.1, and let χ : V′ → V be an almost invertible factor code. Let $Z' = \chi^{-1}(Z)$. Then $\chi|_{Z'}$ is still finite-to-one, which concludes the proof.

Lemma 2.4 (Lemma 2.3 in [1]). Let Z be a subshift, let N ≥ 1, and $a, b \in \mathbb{Z}$ with b − a ≥ 2N. Let z ∈ Z. If for every i ∈ [a + N, b − N] there exists p ≤ N − 1 such that $z_{[i-N,i+N]}$ is p-periodic, then there is at most one periodic point ζ ∈ Z with per(ζ) ≤ N − 1 and $\zeta_{[a,b]} = z_{[a,b]}$. If Z is a 1-step SFT, then such a ζ exists.
we can determine whether there are marker coordinates for z in [−2N, 0) and/or [0, 2N]. If each of these intervals contains a marker coordinate, then $\varphi(z)_0$ is determined by $z_T$ where T ⊂ [−2N, 2N] is the unique marker interval for z containing 0. If at least one of [−2N, 0), [0, 2N] has no marker coordinates, then 0 is in a long marker interval for z. If there is a marker coordinate in (−ℓ, 0], then $\varphi(z)_0 = *$. Otherwise, by Lemma 2.4, $\varphi(z)_0$ is determined by any subblock $z_{[a,b]}$ where a < 0 ≤ b, b − a ≥ 2N, and [a, b] contains no marker coordinate for z. This concludes the proof that φ is continuous.

Proposition 4.3. Let X be a mixing SFT, Y a mixing sofic shift, and π : X → Y a factor code. Then $\lim_{n \to \infty} \frac{1}{n} \log r_n(\pi) = h(Y)$.