1 Introduction
We begin with an introduction to automorphism groups and the topic of distortion in §1.1, as this is the motivation and context for our main results listed in §1.2.
The proofs of the main results are based on rather different ideas, namely conveyor belts, ‘ducking’, dynamics of Turing machines, permutation groups, and also some ideas from computer science, namely reversible computation and logical gates. Some background for these ideas is given in §1.3.
1.1 Automorphism groups and distortion
A recent trend in symbolic dynamics is the study of automorphism groups of subshifts. Typical activities include the study of restrictions that dynamical properties of the subshift put on these groups, and in turn constructing complicated automorphism groups or subgroups thereof.
The former activity has been most successful in the low-complexity setting, see [Reference Pavlov and Schmieding43] for a recent account of the state of the art. For example, minimal subshifts with upper entropy dimension less than $1/2$ have amenable automorphism groups [Reference Cyr and Kra18], and (as discussed in more depth below) zero-entropy subshifts do not admit elements with exponential distortion [Reference Cyr, Franks, Kra and Petite17].
The latter activity has been most successful on sofic shifts. In particular, a lot is known about the finitely generated subgroups of automorphism groups of full shifts: see [Reference Salo47] for a listing of properties that have been exhibited. For instance, let us mention that these groups G, while not finitely generated, contain finitely generated ‘f.g.-universal subgroups’, namely ones that contain isomorphic copies of all finitely generated subgroups of G. The class of subgroups of automorphism groups is also quite robust, being closed under graph products [Reference Salo46, Reference Salo47]. A classical reference for the study of automorphism groups of transitive SFTs (an important subclass of sofic shifts) is [Reference Boyle, Lind and Rudolph10].
In this paper, we study the group-theoretic notion of distortion, introduced by Gromov [Reference Gromov26], in the context of automorphism groups of subshifts. If G is a finitely generated group, we say $g \in G$ is a distortion element, or distorted, if $g$ is of infinite order and the word norm $\lvert g^n\rvert $ grows sublinearly (with respect to some, or equivalently any, finite generating set). For groups that are not finitely generated, we say that an element is distorted if it is distorted in some finitely generated subgroup. While distortion elements are usually allowed to have finite order, in this paper, we focus on distorted elements of infinite order.
Two basic examples of groups with distortion elements are the Heisenberg group with presentation $\langle a, b \mid [[a,b], a], [[a,b], b] \rangle $ , where the element $[a,b]$ has quadratic distortion, meaning we can represent an element of the form $[a,b]^{\Omega (n^2)}$ by composing $n$ generators; and the Baumslag–Solitar group $\mathrm {BS}(1,2)$ with presentation $\langle a, b \mid a^b = a^2 \rangle $ , where $a$ is easily seen to be exponentially distorted, meaning the word norm of $a^n$ grows logarithmically.
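To make the exponential distortion in $\mathrm {BS}(1,2)$ concrete, the following Python sketch may help. It assumes the standard faithful realisation of $\mathrm {BS}(1,2)$ by affine maps of $\mathbb {Q}$ , with $a(x) = x + 1$ and $b(x) = x/2$ ; this realisation and all helper names are illustrative assumptions, not part of the constructions of this paper. With the convention $a^b = b^{-1} \circ a \circ b$ , it checks that $a^b = a^2$ and that $a^{2^k}$ is represented by the word $b^{-k} a b^k$ of length $2k+1$ .

```python
from fractions import Fraction

# An affine map x -> m*x + c of the rationals is stored as the pair (m, c).
A = (Fraction(1), Fraction(1))      # a(x) = x + 1
B = (Fraction(1, 2), Fraction(0))   # b(x) = x / 2
IDENTITY = (Fraction(1), Fraction(0))


def compose(f, g):
    """Return f o g, i.e. the map x -> f(g(x))."""
    (m1, c1), (m2, c2) = f, g
    return (m1 * m2, m1 * c2 + c1)


def inverse(f):
    """Inverse of x -> m*x + c is x -> (x - c) / m."""
    m, c = f
    return (1 / m, -c / m)


def power(f, n):
    """n-th power of f under composition, for any integer n."""
    base = f if n >= 0 else inverse(f)
    result = IDENTITY
    for _ in range(abs(n)):
        result = compose(result, base)
    return result


def short_word_for_power_of_a(k):
    """Evaluate the word b^{-k} a b^k, which has 2k + 1 letters."""
    return compose(power(B, -k), compose(A, power(B, k)))


# The defining relation a^b = b^{-1} a b = a^2 holds in this realisation.
assert compose(inverse(B), compose(A, B)) == power(A, 2)

for k in range(6):
    # a^(2^k) is the translation by 2^k, yet 2k + 1 generators suffice.
    assert short_word_for_power_of_a(k) == power(A, 2 ** k)
    print(f"a^{2 ** k} has word norm at most {2 * k + 1}")
```

The loop only certifies upper bounds on the word norm; the matching logarithmic lower bound is the counting argument, which the code does not attempt to verify.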
The previous examples show that distortion elements can appear in nilpotent and metabelian linear groups. It is known that they cannot appear in biautomatic groups [Reference Gersten and Short25], certain types of mapping class groups [Reference Farb, Lubotzky and Minsky21], and the outer automorphism group of the free group [Reference Alibegović1]. See [Reference Calegari and Freedman12, Reference Cantat and de Cornulier13, Reference Franks and Handel23, Reference Gal and Kedra24, Reference Guelman and Liousse27, Reference Le Roux and Mann36, Reference Navas40, Reference Pengitore44] for other distortion-related works.
Getting back to automorphism groups, it is an open problem (that we solve in the present paper) whether the automorphism group of any subshift can contain a distortion element [Reference Cyr, Franks, Kra and Petite17]. It is not known whether the Heisenberg group [Reference Kim and Roush33] or the Baumslag–Solitar group $\mathrm {BS}(1,2)$ embed in $\mathrm {Aut}(A^{\mathbb {Z}})$ , or indeed in the automorphism group of any subshift, and these problems stay open. (It is also open whether the additive group of dyadic rationals $\mathbb {Z}[\tfrac 12] \leq \mathrm {BS}(1,2)$ embeds in $\mathrm {Aut}(A^{\mathbb {Z}})$ [Reference Boyle, Lind and Rudolph10].)
In addition to being an interesting group-theoretic notion, the quest for distortion elements in automorphism groups of subshifts is motivated by several purely symbolic dynamical considerations. First, [Reference Cyr and Kra18, Theorem 1.2] shows that finitely generated torsion-free subgroups of the automorphism group of a subshift of polynomial complexity are virtually nilpotent. See [Reference Donoso, Durand, Maass and Petite20, Theorem 5.5] for a similar conclusion for inverse limits of bounded step nilsystems. If we could rule out distortion in such examples, we could conclude virtual abelianness.
Second, it is known that the Baumslag–Solitar group, more generally any group with an exponentially distorted element, does not embed in the automorphism group of a zero-entropy subshift [Reference Cyr, Franks, Kra and Petite17]. More precisely, it was observed there that the Morse–Hedlund theorem allows one to translate a distortion element into a lower bound on the complexity of a subshift. This is notable, as this is the only known restriction for automorphism groups of general zero-entropy subshifts. Thus, distortion looks like a natural candidate for restrictions on automorphism groups of general subshifts (as far as the authors know, no restrictions are known on countable subgroups of automorphism groups of general subshifts).
Third, distortion is tied to an intrinsic notion in automorphism group theory, namely the growth of the radius (also known as range) of the automorphism, when seen as a cellular automaton. Namely, distortion in the group sense implies sublinear growth of the radius [Reference Cyr, Franks and Kra16]. It is not immediately obvious that even sublinear radius growth is possible (indeed this was left open in [Reference Cyr, Franks and Kra16]), but several examples of sublinear radius growth have been constructed. The most relevant for us is the observation from [Reference Guillon, Salo, Dennunzio, Formenti, Manzoni and Porreca28] that one can even obtain sublinear radius growth in the automorphism group of a full shift: the so-called SMART machine, when simulated by an automorphism, gives rise to such growth.
While distortion elements have not previously been exhibited in automorphism groups of subshifts, some facts are known about their dynamics (mostly related to the notion of radius). Links to expansive directions and Lyapunov exponents are shown in [Reference Cyr, Franks and Kra16]. A related result is shown in [Reference Bitar, Donoso, Maass and Adamatzky6], namely distortion elements of automorphism groups of general expansive systems cannot themselves be expansive. Links to the dimension group action and inertness are discussed in [Reference Cyr, Franks and Kra16, Reference Schmieding50].
1.2 Results
The main result of the present paper is that the automorphism group of some full shift (thus any full shift by standard embedding theorems [Reference Kim and Roush33]) contains a distortion element with ‘quasi-exponential’ distortion, in the sense that the distortion function grows like $\exp (\sqrt [4]{\Omega (n)})$ . It is more convenient to work directly with word norms than with the distortion function, so we take this approach in the paper. Note that for well-behaved functions, the word norm growth is just the inverse of the distortion function.
Theorem A. For any nontrivial alphabet A, the group $\mathrm {Aut}(A^{\mathbb {Z}})$ has an element $g$ of infinite order such that $\lvert g^n\rvert _F = O(\log ^4 n)$ for some finite set F.
Here, by a nontrivial alphabet, we mean a finite set A with $2 \leq \lvert A\rvert < \infty $ ; we also use the standard shorthand $\log ^4 n = (\log n)^4$ .
A simple counting argument shows that the word norms of $n$th powers of a group element cannot be $o(\log n)$ with respect to a fixed finite generating set. For our specific automorphism, one can strengthen this: the radius of $g^n$ as a cellular automaton is $\Theta (\log n)$ , so the true growth of word norms of powers of our automorphism is between $\Omega (\log n)$ and $O(\log ^4 n)$ .
Our theorem solves the second subquestion of [Reference Cyr, Franks, Kra and Petite17, Question 5.1] in the affirmative. Most of the present paper deals with the proof of this theorem. The element $g$ in this theorem is essentially the SMART machine [Reference Cassaigne, Ollinger and Torres-Avilés14], so morally this also confirms a conjecture of [Reference Guillon, Salo, Dennunzio, Formenti, Manzoni and Porreca28], although the embedding we use is slightly more involved than the specific one considered in [Reference Guillon, Salo, Dennunzio, Formenti, Manzoni and Porreca28]. The group we use in the proof is given in Lemma 5.14.
The generators of our group are relatively simple, but we have little idea what kind of group they generate. The finitely generated group $\langle F \rangle $ can of course be taken to be larger (the distortion function can only become faster-growing this way), so one can take a more canonical choice of generators, say, all reversible cellular automata with biradius $1$ (on the huge alphabet we use).
One can perform some further massage to get a simpler-sounding example: it is known that the automorphism group of a full shift contains so-called finitely generated (f.g.)-universal subgroups, namely ones containing copies of all f.g. groups of reversible cellular automata [Reference Salo48]. Any such group can be used in the result (although the element $g$ will be more complicated). In particular, one can pick as F the symbol permutations and the partial shift $\sigma \times \mathrm {id}$ on the product full shift $\{0,1\}^{\mathbb {Z}} \times \{0,1,2\}^{\mathbb {Z}}$ .
From the main theorem, we obtain several corollaries of interest, which are proved in §6. First, we obtain the characterization of the class of sofic shifts whose automorphism groups have distortion elements.
Theorem B. Let X be a sofic shift. Then $\mathrm {Aut}(X)$ contains a distortion element if and only if X is uncountable.
It is well known that for sofic shifts, uncountability is equivalent to having positive entropy.
As another immediate consequence, using the argument of [Reference Cyr, Franks, Kra and Petite17], we obtain that the automorphism group of a full shift cannot be embedded in the automorphism group of a low-complexity subshift. Recall that the lower entropy dimension [Reference Meyerovitch37] of a subshift is defined by the formula
$$ \liminf _{k \to \infty } \frac {\log \log N_k(X)}{\log k}, $$
where $N_k(X)$ is the number of words of length $k$ that appear in X. The lower entropy dimension of a (onedimensional) subshift with positive entropy is of course $1$ . The upper entropy dimension is defined analogously, with $\limsup $ in place of $\liminf $ .
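As a rough numerical illustration of entropy dimension, the sketch below evaluates the ratio $\log \log N_k(X) / \log k$ for two complexity functions: $N_k = 2^k$ (the binary full shift) and $N_k = k+1$ (Sturmian subshifts, a classical low-complexity example that is not discussed in this paper). The function names are illustrative only, and the code merely tabulates the ratio at a few values of $k$ ; it does not compute a limit.

```python
import math


def entropy_dimension_ratio(log_word_count, k):
    """Return log(log N_k) / log(k), given log N_k (natural log) and k >= 2."""
    return math.log(log_word_count) / math.log(k)


def full_shift_ratio(k, alphabet_size=2):
    # Full shift: N_k = alphabet_size ** k, so log N_k = k * log(alphabet_size).
    return entropy_dimension_ratio(k * math.log(alphabet_size), k)


def sturmian_ratio(k):
    # Sturmian subshift: N_k = k + 1.
    return entropy_dimension_ratio(math.log(k + 1), k)


for k in (10, 10**3, 10**6, 10**9):
    print(k, round(full_shift_ratio(k), 4), round(sturmian_ratio(k), 4))
```

The first column of ratios creeps up towards $1$ and the second decays towards $0$ , in line with the remark that positive entropy forces entropy dimension $1$ .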
Lemma 1.1. Let X be a subshift with lower entropy dimension less than $1/d$ . If $f \in \mathrm {Aut}(X)$ satisfies $\lvert f^n\rvert = O(\log ^d n)$ , then $f$ is periodic.
Theorem C. The group $\mathrm {Aut}(A^{\mathbb {Z}})$ has a finitely generated subgroup G such that every subshift X with $G \leq \mathrm {Aut}(X)$ has lower entropy dimension at least $1/4$ .
Theorem C is of course an immediate corollary of Lemma 1.1. It states a low-complexity restriction on the automorphism group, that is, it states that automorphism groups of subshifts with low enough complexity (growth of the number of admissible words) cannot have some property. The above theorem seems to be the first low-complexity restriction on automorphism groups where:

(1) the complexity bound is superpolynomial;

(2) there are no additional dynamical restrictions; and

(3) the prevented behavior can be exhibited in the automorphism group of another subshift.
There are previously known restrictions satisfying any two of these items. For items (1) and (2), zero entropy prevents exponential distortion [Reference Cyr, Franks, Kra and Petite17]; for items (1) and (3), [Reference Cyr and Kra18] shows that if X is minimal and has upper entropy dimension less than $1/2$ , then $\mathrm {Aut}(X)$ is amenable (while $\mathrm {Aut}(A^{\mathbb {Z}})$ is not); for items (2) and (3) (very low complexity restrictions), there are many results, see [Reference Pavlov and Schmieding43].
The subgroup where our distortion element lies can itself be seen as a group of Turing machines, indeed restricting its action to a certain sofic subshift directly gives rise to a subgroup of the group ${\mathrm {RTM}}(n, k)$ studied in [Reference Barbieri, Kari, Salo, Cook and Neary3], leading to the following theorem.
Theorem D. Let $n \geq 2, k \geq 1$ . Then the group of Turing machines ${\mathrm {RTM}}(n, k)$ contains a distortion element; indeed there is a finitely generated subgroup $G = \langle F \rangle $ and an element $f$ such that $\lvert f^n\rvert _F = O(\log ^4 n)$ .
All groups of Turing machines in turn embed in the higher-dimensional Brin–Thompson groups $mV$ for $m \geq 2$ introduced by Brin [Reference Brin11], and we obtain the following theorem.
Theorem E. The Brin–Thompson group $mV$ contains a distortion element; indeed there is an element $f$ such that $\lvert f^n\rvert = O(\log ^4 n)$ .
This theorem provides a new restriction for geometries of $2V$ . Namely, it is known that Thompson’s group V admits a proper action by isometries on a CAT $(0)$ cube complex [Reference Farley22]. By [Reference Haglund29, Theorem 1.5], a group with distortion elements does not admit such an action, thus we have the following corollary.
Corollary 1.2. The Brin–Thompson group $mV$ does not act properly on a CAT $(0)$ cube complex for $m \geq 2$ .
Of course, a similar fact is true for the other groups where we exhibit distortion elements.
We conclude with previously known (but possibly not well known) related distortion facts that are easy to prove. First, the fact that the automorphism group of a full shift contains finitely generated subgroups that are distorted is essentially classical, namely $F_2 \times F_2$ embeds in $\mathrm {Aut}(A^{\mathbb {Z}})$ [Reference Kim and Roush33] and has subgroups with arbitrarily bad (recursive) distortion essentially by [Reference Mihaĭlova38]. (Given a subgroup H and an overgroup G equipped with their respective word norms, the subgroup H is distorted in G if $\min \{ \lVert h \rVert _G \mid h \in H \text { and } \lVert h \rVert _H \geq n \} = o(n)$ . In this sense, a distorted element corresponds to a distorted cyclic subgroup.) To give a more down-to-earth example, $\mathbb {Z}_2 \wr \mathbb {Z}^2$ , which embeds in $\mathrm {Aut}(A^{\mathbb {Z}})$ by [Reference Salo47], contains a polynomially distorted copy of itself, by a nice geometric argument [Reference Davis and Olshanskii19]. One can also construct distorted subgroups directly by more intrinsic automorphism group techniques.
Second, in the setting of general expansive homeomorphisms, finding distortion elements is very easy. Namely, if $\mathbb {S}$ is the invertible natural extension of the $\times 2$ map on the circle, $\mathrm {Aut}(\mathbb {S}^n)$ contains a natural copy of $\mathrm {GL}(n, \mathbb {Z})$ by simply summing tracks to each other [Reference Kopra and Salo34]. For $n = 3$ , the group $\mathrm {GL}(n, \mathbb {Z})$ contains the Heisenberg group, thus has distortion elements.
1.3 Turing machines and gates
While our results in the previous section are stated fully in terms of homeomorphism groups, our proof methods rather belong to the theories of dynamical Turing machines and of reversible gates. In this section, we outline some history of these ideas.
1.3.1 Turing machines
As mentioned in §1.1, our automorphism group element simulates a ‘Turing machine’, that is, a dynamical system where a single head moves over an infinite tape of arbitrary data (over a fixed finite alphabet), and all the action happens near the head (which may move around the tape, such movement depending on the content of the tape; or modify said content). The dynamics of Turing machines, also known as one-head machines, is an important branch of symbolic dynamics. This can be seen as initiated in the 1997 paper of Kůrka [Reference Kůrka35], which explicitly defined the moving-head and moving-tape dynamics of Turing machines (although many relevant dynamical ideas appeared in the literature before this [Reference Hooper30, Reference Moore39, Reference Rogozhin45]).
One of the most-studied behaviors of Turing machines is aperiodicity, meaning that the action of the Turing machine has no periodic points. This property is particularly interesting in the moving tape model, where the head is seen as fixed and only the tape moves. Kůrka originally conjectured that Turing machines cannot be aperiodic, but an explicit aperiodic Turing machine was exhibited in 2002 by Blondel, Cassaigne, and Nichitiu [Reference Blondel, Cassaigne and Nichitiu7] (inspired by a technique of Hooper from 1966 [Reference Hooper30]). Later, reversible aperiodic Turing machines (ones whose action is a homeomorphism) were found, the first by Kari and Ollinger [Reference Kari, Ollinger, Ochmański and Tyszkiewicz32]. This culminated in the discovery of the SMART machine $\mathcal {S}$ by Cassaigne, Ollinger, and Torres-Avilés [Reference Cassaigne, Ollinger and Torres-Avilés14], a machine with only four states and three tape-letters, which is reversible and aperiodic, and whose moving-tape dynamics is a minimal homeomorphism on the Cantor space, see also [Reference Ollinger, Kari and Ulidowski41].
Turing machines, in the moving-head dynamics where the tape is not shifted and the head moves over it, can be directly seen as automorphisms of a sofic shift [Reference Barbieri, Kari, Salo, Cook and Neary3]. In fact, it is well known that Turing machines can be ‘embedded’ into automorphism groups of full shifts $\mathrm {Aut}(A^{\mathbb {Z}})$ . There are multiple ways of doing so; in this paper, we use the conveyor belt technique similar to the one used in [Reference Guillon, Salo, Dennunzio, Formenti, Manzoni and Porreca28].
For the purpose of establishing distortion, the first important consideration, already discussed in §1.1, is the ‘speed’ of a Turing machine: a Turing machine with positive speed, meaning the existence of tape contents such that the head moves to infinity at a positive rate, could not possibly give rise to a distortion element. This is because the linear movement of the head (even on a single configuration) means that the radius of powers of the corresponding automorphism must grow at a linear rate as well, which prevents distortion.
It was shown in [Reference Jeandel, Mayr and Portier31] that all aperiodic Turing machines have zero speed, and in [Reference Guillon, Salo, Dennunzio, Formenti, Manzoni and Porreca28], this was strengthened by proving that the maximal offset by which such a machine can move in $t$ time steps is $O(t/\log t)$ . For the SMART machine $\mathcal {S}$ , more is known: in $t$ steps, it can only move by an offset of at most $O(\log t)$ . This makes $\mathcal {S}$ a perfect candidate for a distortion element of a subshift automorphism group, and indeed it was conjectured in [Reference Guillon, Salo, Dennunzio, Formenti, Manzoni and Porreca28] that it is one.
1.3.2 Gates
The next ideas come from the study of reversible gates. By this, we refer to the study of permutation groups acting on (a sublanguage of) $A^n$ , where A is a finite alphabet, which are generated by ‘reversible gates’: that is, permutations that only consider a bounded subset of coordinates at a time. More precisely, if $k \leq n$ and $\pi \in \mathrm {Sym}(A^k)$ , then we can apply $\pi $ to the subword starting at $i$ by the formula $\hat \pi (u \cdot v \cdot w) = u \cdot \pi (v) \cdot w$ , where $u \in A^i, v \in A^k, w \in A^{n - k - i}$ . From now on, we use the term ‘gate’ for reversible gates and ‘classical gate’ to refer to the usual not necessarily reversible gates (in the few places where they are needed).
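The formula for $\hat \pi $ can be sketched in Python as follows. Words are modelled as tuples and a gate $\pi \in \mathrm {Sym}(A^k)$ as a dictionary on $k$ -tuples; the names `apply_gate` and `SWAP`, and the choice of the letter-swapping gate on $A^2$ with $A = \{0,1,2\}$ , are illustrative assumptions rather than objects used later in the paper.

```python
from itertools import product


def apply_gate(pi, k, i, word):
    """Apply the gate pi (a dict permuting k-tuples) to word[i:i+k].

    This is the map u.v.w -> u.pi(v).w with len(u) = i and len(v) = k.
    """
    if not 0 <= i <= len(word) - k:
        raise ValueError("gate does not fit inside the word")
    u, v, w = word[:i], word[i:i + k], word[i + k:]
    return u + pi[v] + w


# Example gate on A^2 with A = {0, 1, 2}: swap the two letters.
ALPHABET = (0, 1, 2)
SWAP = {(a, b): (b, a) for a, b in product(ALPHABET, repeat=2)}

n = 4
words = list(product(ALPHABET, repeat=n))
images = [apply_gate(SWAP, 2, 1, w) for w in words]

# A gate induces a permutation of A^n: the images are the words, reordered.
assert sorted(images) == words
print(apply_gate(SWAP, 2, 1, (0, 1, 2, 0)))  # (0, 2, 1, 0)
```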
A fundamental lemma in this topic is that $\mathrm {Alt}(A^n)$ admits a generating set with bounded $k$ , namely it is generated by the even permutations of $A^2$ if the cardinality $\# A$ is at least $3$ (when we consider them as gates, and allow their applications at any position $i = 0, \ldots , n - 2$ ). A more complete statement appears in [Reference Salo48], while earlier proofs are given in [Reference Boykett8, Reference Boykett, Kari and Salo9, Reference Selinger51].
The connection between gates and Turing machines is as follows. Let us consider generalized Turing machines in the sense of [Reference Barbieri, Kari, Salo, Cook and Neary3], meaning the machine can look at and modify multiple cells at once, although only at a bounded distance from the head. Now, walking on a cyclic tape containing an element of $A^n$ , we can apply permutations of $A^k$ at different relative positions $i$ : simply move by $i$ steps, apply the permutation locally, and then move back by $i$ steps. The above paragraph translates to the fact that there is a finite set of generalized Turing machines that can perform any even permutation of the tape content (relative to the head position). Actually, it turns out that since Turing machines carry a state, $k = 1$ suffices, that is, the generating Turing machines need not be of a generalized type.
2 Definitions
2.1 General notions
We have $\mathbb {N} = \{0,1,2,\ldots \}$ , $\mathbb {Z}_+ = \mathbb {N} \setminus \{0\}$ , and $\mathbb {Z}_{\ell } = \mathbb {Z}/\ell \mathbb {Z}$ is integers modulo $\ell $ . For S a finite set, we denote by $\# S$ the cardinality of S. For $i,j \in \mathbb {N}$ , denote and . If $w \in \{0,1,\ldots , k-1\}^*$ , write $\mathrm {v}_k(w)$ for the value $w$ represents in base $k$ (the leftmost digit having the highest significance by default), that is, $\mathrm {v}_k(w) = \sum _{i=0}^{\lvert w\rvert -1} k^{\lvert w\rvert -1-i} w_{i}$ ; and we write $n_{(k)} \in \{0, \ldots , k-1\}^*$ for the number $n \in \mathbb {N}$ written in base $k$ (with length determined from context or specified in text), that is, $\mathrm {v}_k(n_{(k)}) = n$ .
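The base- $k$ conventions can be checked with a short Python sketch; the function names are illustrative, and the explicit `length` argument stands in for the length that the text leaves to context.

```python
def value_in_base(w, k):
    """v_k(w): the value of the digit word w in base k, most significant first."""
    value = 0
    for digit in w:
        if not 0 <= digit < k:
            raise ValueError("digit out of range for the base")
        value = value * k + digit
    return value


def digits_in_base(n, k, length):
    """n_(k): the base-k digits of n, zero-padded on the left to `length`."""
    digits = []
    for _ in range(length):
        n, digit = divmod(n, k)
        digits.append(digit)
    if n != 0:
        raise ValueError("n does not fit in the requested length")
    return tuple(reversed(digits))


assert value_in_base((1, 0, 2), 3) == 11          # 1*9 + 0*3 + 2
assert digits_in_base(11, 3, 5) == (0, 0, 1, 0, 2)
assert value_in_base(digits_in_base(11, 3, 5), 3) == 11
```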
For $\Sigma $ a finite set, called an alphabet, denote by $\Sigma ^* = \bigcup _{n = 0}^{\infty } \Sigma ^n$ the set of finite words over $\Sigma $ . For $w \in \Sigma ^*$ , denote by ${\mathrm {len}}(w)$ the length of $w$ , that is, the integer $n$ such that $w \in \Sigma ^n$ . For a word $w \in \Sigma ^*$ , denote by $\overline {w}$ the reverse (or ‘mirror image’) of $w$ , that is, if $w = w_0 \cdot w_1 \cdots w_{n-1}$ , then $\overline {w} = w_{n-1} \cdot w_{n-2} \cdots w_0$ . For $w \in \Sigma ^n$ and $J \subseteq \{0, \ldots , n-1\}$ , define $w_{J} = w_{j_0} \cdot w_{j_1} \cdots w_{j_k}$ as the restriction of $w$ to J, for $J = \{j_0,\ldots ,j_k\}$ and $j_0 \leq \cdots \leq j_k$ . Given $a \in \Sigma $ and $j \in \{0, \ldots , n-1\}$ , the cylinder $[a]_j$ denotes the set of words $\{w \in \Sigma ^n \mid w_j = a\}$ . Usually our alphabets are nontrivial, by which we mean $\lvert \Sigma \rvert \geq 2$ .
In Lemma 4.10, we write ${\mathrm {NC}}^1$ for ‘Nick’s Class’ of complexity of level $1$ , that is, the class of languages $L \subseteq \Sigma ^*$ such that L is decidable by Boolean circuits with a polynomial number of gates, each with at most two inputs, and depth $O(\log n)$ (see for example [Reference Arora and Barak2]). The reader need not be familiar with this class to follow our argument. The main technical result we need is Barrington’s theorem from [Reference Barrington4] (but this is also proved from scratch in our context).
For $a,b$ elements of a group, the commutator of $a$ and $b$ is $[a, b] = a^{-1}b^{-1}ab$ . The conjugation convention is $a^b = b^{-1} \circ a \circ b$ . If $\pi \in \mathrm {Sym}(A)$ is a permutation, we may regard it as a permutation of $A \times B$ by $\pi ((a, b)) = (\pi (a), b)$ . Our groups always act from the left. If $g_1, \ldots , g_n$ are commuting elements of a group, we write $\prod _{i=1}^n g_i$ for their ordered product $g_n \cdots g_1$ . In groups of bijections on a set (which almost all our groups are), we denote composition by $\circ $ .
Given a finitely generated group G generated by the finite set S, a presentation of $g \in G$ is a word $w = s_n \cdots s_1 \in (S \cup S^{-1})^*$ such that $g = s_n \cdots s_1$ , and we write $w \equiv g$ . The word norm $\lVert g \rVert _S$ of $g \in G$ relative to S is then the length of a shortest presentation of $g$ , that is, $\lVert g \rVert _S = \min \{ n \in \mathbb {N} : \text { there exists } w \in (S \cup S^{-1})^n, w \equiv g \}$ . This word norm is also the distance in the Cayley graph of G between $1_G$ and $g$ . In this context, an element $g \in G$ is said to be distorted if $\lVert g^n \rVert _S = o(n)$ .
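Since the word norm is a distance in the Cayley graph, it can be computed on small examples by breadth-first search. The sketch below does this for the Heisenberg group, modelled (as an assumption of this illustration) by integer triples with the product $(x,y,z)(x',y',z') = (x+x', y+y', z+z'+xy')$ and generators $a = (1,0,0)$ , $b = (0,1,0)$ ; with the commutator convention above, $[a,b] = (0,0,1)$ . It is meant only to make the definition tangible and plays no role in the proofs.

```python
from collections import deque


# Heisenberg group elements as integer triples (x, y, z) with the product
# (x, y, z) * (x', y', z') = (x + x', y + y', z + z' + x * y').
def multiply(g, h):
    return (g[0] + h[0], g[1] + h[1], g[2] + h[2] + g[0] * h[1])


A, A_INV = (1, 0, 0), (-1, 0, 0)
B, B_INV = (0, 1, 0), (0, -1, 0)
GENERATORS = (A, A_INV, B, B_INV)


def word_norms(max_length):
    """Breadth-first search of the Cayley graph: map each element of word
    norm at most max_length to its word norm."""
    norms = {(0, 0, 0): 0}
    queue = deque([(0, 0, 0)])
    while queue:
        g = queue.popleft()
        if norms[g] == max_length:
            continue
        for s in GENERATORS:
            h = multiply(g, s)
            if h not in norms:
                norms[h] = norms[g] + 1
                queue.append(h)
    return norms


norms = word_norms(12)
# [a, b] = a^{-1} b^{-1} a b is the central element (0, 0, 1).
assert multiply(multiply(A_INV, B_INV), multiply(A, B)) == (0, 0, 1)
for n in (1, 4, 9):
    print(f"word norm of [a,b]^{n}: {norms[(0, 0, n)]}")
```

The identity $[a^m, b^m] = [a,b]^{m^2}$ gives a word of length $4m$ for $[a,b]^{m^2}$ , so the printed norms are at most $4$ , $8$ , and $12$ : the norm of $[a,b]^n$ grows like $O(\sqrt {n})$ , which is the quadratic distortion mentioned in §1.1.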
For sets $X,Y$ , and Z, we say that a map $f : X \to Y$ lifts into $\tilde {f} : X \times Z \to Y \times Z$ (or that $\tilde {f}$ is the lift of $f$ ) if $\tilde {f}(x,z) = (f(x),z)$ . For $S \subseteq X$ a subset of X, and $f : X \to X$ , we call the extended restriction of $f$ to S the map $f\hspace {0.05em}/\hspace {0.2em}_S : X \to X$ defined as
$$ f\hspace {0.05em}/\hspace {0.2em}_S(x) = \begin{cases} f(x) & \text{if } x \in S, \\ x & \text{otherwise,} \end{cases} $$
that is, $f\hspace {0.05em}/\hspace {0.2em}_S$ is the extension of the restriction $f|_S$ back to the full domain X, by fixing elements outside S.
2.2 Subshifts and cellular automata
Let $\Sigma $ be a finite alphabet. An element $x \in \Sigma ^{\mathbb {Z}}$ is called a configuration. An element $w \in \Sigma ^*$ is called a word or a pattern, and a pattern $w \in \Sigma ^*$ is said to appear in a configuration $x \in \Sigma ^{\mathbb {Z}}$ , denoted $w \sqsubseteq x$ , if there exists some $i \in \mathbb {Z}$ such that $x_{i+j} = w_j$ for every $j \in \{0, \ldots , {\mathrm {len}}(w) - 1\}$ .
We endow $\Sigma ^{\mathbb {Z}}$ with the product topology. This topology is generated by the cylinders $[a]_j = \{ x \in \Sigma ^{\mathbb {Z}} : x_j = a \}$ for $a \in \Sigma $ and $j \in \mathbb {Z}$ . The left shift $\sigma : \Sigma ^{\mathbb {Z}} \to \Sigma ^{\mathbb {Z}}$ defined by $\sigma (x)_i = x_{i+1}$ is a $\mathbb {Z}$ -action on $\Sigma ^{\mathbb {Z}}$ . Closed and shift-invariant subsets X of $\Sigma ^{\mathbb {Z}}$ are called subshifts. For X a subshift and $n \in \mathbb {N}$ , we denote by $\mathcal {L}_n(X)$ the set of finite words of length $n$ that appear in X, and by $\mathcal {L}(X) = \bigcup _{n \in \mathbb {N}} \mathcal {L}_n(X)$ its language. We say that a subshift X is sofic if $\mathcal {L}(X)$ is a regular language. If X and Y are subshifts, a continuous and shift-equivariant map $f : X \to Y$ is called a morphism. It is an endomorphism if $X = Y$ and an automorphism if, in addition, it is bijective (in which case $f^{-1}$ is also an endomorphism). Endomorphisms are sometimes called cellular automata, and automorphisms reversible cellular automata. For $f : X \to Y$ a morphism between two subshifts, its radius (as a cellular automaton) is the minimal $r$ such that $f(x)_i$ is a function of $x_{i-r}, \ldots , x_{i+r}$ . The biradius of an automorphism $f$ is the maximum of the radii of $f$ and $f^{-1}$ .
2.3 Turing machines
In this article, we use Turing machines as a specific kind of action on subshifts. We note that despite the terminology, it is not necessarily helpful to think of them as computational devices. (We will later perform computation in a group of Turing machines, but this computation is not related to the usual type of Turing machine computation, in that the iteration of a single machine is not going to be used to perform computation.)
Let Q be a finite set called the state set, and $\Gamma $ be a finite set called the tape alphabet. In the model of [Reference Kari, Ollinger, Ochmański and Tyszkiewicz32] (this model is equivalent to the usual definition of Turing machines, but handles reversibility better), a Turing machine is a triple $\mathcal {M} = (Q, \Gamma , \Delta )$ , where $\Delta \subseteq (Q \times \{+1,-1\} \times Q) \cup (Q \times \Gamma \times Q \times \Gamma )$ is the transition table. A transition $(q,\delta ,q') \in Q \times \{+1,-1\} \times Q$ is called a move transition, and a transition $(q,a,q',b) \in Q \times \Gamma \times Q \times \Gamma $ is called a matching transition.
In the rest of this paper, we focus on the action of Turing machines on two families of objects: bi-infinite tapes and finite cyclic tapes.
2.3.1 Bi-infinite tapes
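In the alphabet $\Gamma \cup (Q \times \Gamma )$ , elements of $H = Q \times \Gamma $ are called heads. Denote by
$$ \mathcal {X} = \{ x \in (\Gamma \cup H)^{\mathbb {Z}} : \# \{ i \in \mathbb {Z} : x_i \in H \} \leq 1 \} $$
the set of bi-infinite tapes with at most one head somewhere. We can associate to $\mathcal {M}$ its so-called moving-head model [Reference Kůrka35], that is, the binary relation $\rightarrow _{\mathcal {M}}$ on $\mathcal {X}$ defined by $x \rightarrow _{\mathcal {M}} x$ if $x \in \mathcal {X}$ contains no head (that is, $x \in \Gamma ^{\mathbb {Z}}$ ); and if $x \in \mathcal {X}$ contains a head at position, say $i_0 \in \mathbb {Z}$ with $x_{i_0} = (q,a)$ for some $q \in Q$ and $a \in \Gamma $ , then $x \rightarrow _{\mathcal {M}} x'$ if there exists $t \in \Delta $ such that:

(1) either $t = (q,\delta ,q')$ is a move transition, $x'_{i_0} = a$ , $x'_{i_0+\delta } = (q', x_{i_0+\delta })$ , and $x'_i = x_i$ for all $i \notin \{i_0, i_0+\delta \}$ ;

(2) or $t = (q,a,q',b)$ is a matching transition, $x'_{i_0} = (q',b)$ , and $x'_i = x_i$ for all $i \neq i_0$ .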
The binary relation $\rightarrow _{\mathcal {M}}$ on $\mathcal {X}$ (denoted $\rightarrow $ for short if the context is clear) is the reachability relation. We write $\rightarrow _{\mathcal {M}}^k$ for its $k$th power, and $\rightarrow _{\mathcal {M}}^*$ for its transitive closure. We say that $\mathcal {M}$ reaches the configuration $x'$ from $x$ in $k \in \mathbb {N}$ steps if $x \rightarrow _{\mathcal {M}}^k x'$ . A transition $x \rightarrow _{\mathcal {M}}^* x'$ is called a move.
The machine $\mathcal {M}$ is deterministic if $\rightarrow _{\mathcal {M}}$ defines a partial function, complete deterministic if it defines a total function (which is then continuous and, obviously, shift-commuting), and complete reversible (or reversible for short) if it defines a bijection (which is then a homeomorphism). When $\mathcal {M}$ is complete deterministic (which all our machines are), when using the relation $\rightarrow _{\mathcal {M}}$ as a function, we write it as $T_{\mathcal {M}} : \mathcal {X} \to \mathcal {X}$ , which is an endomorphism of the subshift $\mathcal {X}$ . Similarly, when the machine $\mathcal {M}$ is reversible, it is an automorphism of $\mathcal {X}$ .
2.3.2 Finite cyclic tapes
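The set of cyclic configurations of length $\ell \in \mathbb {Z}_+$ is the set
$$ \mathcal {C}_{\ell } = \{ x \in (\Gamma \cup H)^{\mathbb {Z}/\ell \mathbb {Z}} : \# \{ i \in \mathbb {Z}/\ell \mathbb {Z} : x_i \in H \} \leq 1 \} $$
of finite configurations containing at most one head. We always assume $\ell \geq 2$ in what follows (the case $\ell = 1$ makes sense, but requires notational modifications and is the least interesting case anyway).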
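The machine $\mathcal {M}$ defines a binary relation $\rightarrow _{\mathcal {M}}$ on $\mathcal {C}_{\ell }$ by considering these finite tapes as cyclic, that is, we define $x \rightarrow _{\mathcal {M}} x$ if $x \in \mathcal {C}_{\ell }$ contains no head (that is, $x \in \Gamma ^{\mathbb {Z}/\ell \mathbb {Z}}$ ); and if $x \in \mathcal {C}_{\ell }$ contains a head at position, say $i_0 \in \mathbb {Z}/\ell \mathbb {Z}$ with $x_{i_0} = (q,a)$ for some $q \in Q$ and $a \in \Gamma $ , then $x \rightarrow _{\mathcal {M}} x'$ if there exists $t \in \Delta $ such that:

(1) either $t = (q,\delta ,q')$ is a move transition, $x'_{i_0} = a$ , $x'_{i_0+\delta } = (q', x_{i_0+\delta })$ , and $x'_i = x_i$ for all $i \notin \{i_0, i_0+\delta \}$ , where $i_0+\delta $ is computed modulo $\ell $ ;

(2) or $t = (q,a,q',b)$ is a matching transition, $x'_{i_0} = (q',b)$ , and $x'_i = x_i$ for all $i \neq i_0$ .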
As above, the relation $\rightarrow _{\mathcal {M}}$ is called the reachability relation. If $\mathcal {M}$ is complete deterministic, the function $\rightarrow _{\mathcal {M}}$ will be denoted by $T_{\ell ,\mathcal {M}} : \mathcal {C}_{\ell } \to \mathcal {C}_{\ell }$ . Note that it is an endomorphism of the shift action of $\mathbb {Z}$ (or $\mathbb {Z}_{\ell }$ ) which translates the cyclic tape around.
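The action on cyclic tapes can be sketched in Python as follows. The sketch assumes the usual reading of the model: a move transition $(q,\delta ,q')$ shifts the head by $\delta $ and changes its state to $q'$ without touching the tape, and a matching transition $(q,a,q',b)$ applies when the head in state $q$ reads $a$ , writes $b$ , and enters state $q'$ . The machine `TOY` is a hypothetical two-state example chosen for illustration; it is not the SMART machine.

```python
def step(tape, delta):
    """One step of a complete deterministic machine on a cyclic tape.

    `tape` is a tuple whose cells are tape symbols, except for at most one
    head cell of the form (state, symbol). `delta` is a set of move
    transitions (q, d, q2) and matching transitions (q, a, q2, b).
    """
    heads = [i for i, cell in enumerate(tape) if isinstance(cell, tuple)]
    if not heads:
        return tape                      # no head: the tape is fixed
    i0 = heads[0]
    q, a = tape[i0]
    new_tape = list(tape)
    for t in delta:
        if len(t) == 3 and t[0] == q:    # move transition
            _, d, q2 = t
            j = (i0 + d) % len(tape)     # positions are taken modulo the length
            new_tape[i0] = a
            new_tape[j] = (q2, tape[j])
            return tuple(new_tape)
        if len(t) == 4 and t[:2] == (q, a):   # matching transition
            _, _, q2, b = t
            new_tape[i0] = (q2, b)
            return tuple(new_tape)
    raise ValueError("no applicable transition: machine is not complete")


# Hypothetical toy machine (not SMART): flip the scanned bit, then move right.
TOY = {("flip", 0, "go", 1), ("flip", 1, "go", 0), ("go", +1, "flip")}

tape = (("flip", 0), 0, 1, 1)
for _ in range(2 * len(tape)):
    tape = step(tape, TOY)
print(tape)  # (('flip', 1), 1, 0, 0): every bit flipped, head back at cell 0
```

On a cyclic tape of length $\ell $ , this toy machine returns to its initial configuration after $4\ell $ steps, so it is periodic on every cyclic tape, in contrast with the aperiodic behavior discussed in §1.3.1.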
Finally, for any machine $\mathcal {M} = (Q,\Gamma ,\Delta )$ , denote by $m: \mathbb {N} \to \mathbb {N}$ its movement function, that is, $m(n)$ is the maximal number of cells the machine can visit in $n$ steps. More precisely, $m(n)$ is the length $r-l+1$ of the largest interval $\{l, \ldots , r\} \subseteq \mathbb {Z}$ such that there exists a sequence of $n$ steps of computation $x_0 \rightarrow _{\mathcal {M}} x_1 \rightarrow _{\mathcal {M}} \cdots \rightarrow _{\mathcal {M}} x_n$ (with $x_i \in \mathcal {X} $ ) such that for every position $i \in \{l, \ldots , r\}$ , at least one of the tapes $x_k$ ( $0 \leq k \leq n$ ) has its head at position $i$ .
3 The SMART machine on cyclic tapes
Let SMART be the Turing machine $(Q,\Gamma ,\Delta )$ , where $Q = \{{\blacktriangleright }_1,{\blacktriangleleft }_1,{\rhd }_1,{\lhd }_1\} \cup \{{\blacktriangleright }_2, {\blacktriangleleft }_2,{\rhd }_2,{\lhd }_2\}$ , $\Gamma = \{0,1,2\}$ and $\Delta $ is the transition table as shown in Figure 1.
We refer to ${\blacktriangleright }_1,{\blacktriangleright }_2,{\blacktriangleleft }_1,{\blacktriangleleft }_2$ (respectively ${\rhd }_1,{\rhd }_2,{\lhd }_1,{\lhd }_2$ ) as filled (respectively hollow) triangles.
Remark 3.1. The SMART machine was introduced with a slightly different formalism in [Reference Cassaigne, Ollinger and Torres-Avilés14], and slightly revised in [Reference Ollinger, Kari and Ulidowski41] (states were renamed and permuted). The machine above adapts the latter in the model of [Reference Kari, Ollinger, Ochmański and Tyszkiewicz32] for Turing machines: in other words, we duplicate the states. We kindly advise readers already familiar with the SMART machine to read these definitions and propositions carefully.
Namely, while our SMART machine is in a sense completely equivalent, in the formulas in Proposition 3.2 describing traversals of SMART over zeroes, the patterns corresponding to filled and hollow initial states are of the same length (unlike the corresponding ones in [Reference Cassaigne, Ollinger and Torres-Avilés14]). This will be helpful later, when we encode the position in the sweep into the corresponding area on the tape without any extra space.
In this section, we consider the action of this machine on finite patterns (denoted with round brackets) like
The argument applies whether or not these are finite subpatterns of a finite cyclic tape, or of an infinite configuration. When specifying a move (with some number of transition steps) between two patterns, it is implicit that the initial and final patterns have the same domain, and the machine does not exit this domain during the intermediate steps. Complete cyclic configurations (where the notation specifies the contents of all $\ell $ cells) will be denoted similarly, but with square brackets.
Proposition 3.2. (Adapted from [Reference Cassaigne, Ollinger and Torres-Avilés14, Lemma 1])
Let $f(k) = 3^{k+1} - 2$ . For all $k$ , $s_* \in \{0,1,2\}$ , and $s_+ \in \{1,2\}$ , the following moves hold:
Additionally, the cell containing $s_*$ is only visited at the last (respectively first) step of the sequences of transitions $M_{\blacktriangleright }$ and $M_{\blacktriangleleft }$ (respectively $M_{\rhd }$ and $M_{\lhd }$ ). And the cell containing $s_+$ is never modified.
Proof. This proof adapts the proof of [Reference Cassaigne, Ollinger and Torres-Avilés14, Lemma 1], and highlights the recursive/nested aspects of these moves. In the case $k = 0$ , one can check that indeed the formula describes a single transition. We reason by induction, and assume $M_{\blacktriangleright }(k)$ , $M_{\blacktriangleleft }(k)$ , $M_{\rhd }(k)$ , and $M_{\lhd }(k)$ hold. We only prove $M_{\blacktriangleright }(k+1)$ and $M_{\rhd }(k+1)$ , by symmetry between ${\blacktriangleright }$ and ${\blacktriangleleft }$ (respectively ${\rhd }$ and ${\lhd }$ ). Since $f(k+1) = 3f(k) + 4$ , we should find $3$ recursions and $4$ extra steps. This is what happens:
3.1 Action of SMART on cyclic tapes
This section studies the action of SMART on cyclic tapes of length $\ell \geq 2$ . We call initial configurations the following four cyclic configurations:
Proposition 3.3. Let $\ell \geq 2$ . The action of the $(2\cdot 3^{\ell })\text {th}$ power of SMART on $C_{\blacktriangleright }$ and $C_{\rhd }$ (respectively $C_{\blacktriangleleft }$ and $C_{\lhd }$ ) is a right-shift (respectively left-shift). Furthermore, the intermediate configurations are all distinct even up to a shift.
Proof. By symmetries between ${\blacktriangleright }$ and ${\blacktriangleleft }$ (respectively ${\rhd }$ and ${\lhd }$ ), we prove the result for $C_{\blacktriangleright }$ and $C_{\rhd }$ .
We used moves $M_{\blacktriangleright }(\ell - 1),M_{\blacktriangleleft }(\ell - 1),M_{\rhd }(\ell - 1)$ and $M_{\lhd }(\ell - 1)$ in patterns that overlap themselves on their first and last letters in the cyclic tape. This is valid, because the cell containing $s_*$ is only visited at the last (respectively first) step of $M_{\blacktriangleright }$ and $M_{\blacktriangleleft }$ (respectively $M_{\rhd }$ and $M_{\lhd }$ ).
For the last claim, by shift-commutation and bijectivity of the action, it is enough to show that a shifted copy of the initial configuration does not appear before the last step. This is clear from looking at the first columns, which have positive values on all but the first step and the last two steps.
Lemma 3.4. For $\ell \geq 1$ , the action of SMART on cyclic tapes of length $\ell $ is composed of four disjoint cycles of length $2\ell \cdot 3^{\ell }$ , which are the orbits of the four initial configurations. Additionally, the action of the $(2\cdot 3^{\ell })\text {th}$ power of SMART on a cyclic tape is a right-shift (respectively left-shift) on the orbits of $C_{\blacktriangleright }$ and $C_{\rhd }$ (respectively $C_{\blacktriangleleft }$ and $C_{\lhd }$ ).
Proof. This is an immediate consequence of Proposition 3.3: the orbits are each of length $2\ell \cdot 3^{\ell }$ (number of shifts $\times $ number of steps for each shift), and are disjoint (by looking at the first column in the previous proof). As there are $8\ell \cdot 3^{\ell }$ different cyclic configurations containing a head (eight different states with $\ell $ possible positions and a ternary tape of length $\ell $ ), any configuration belongs to one of these four orbits: this concludes the proof.
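The counting argument in this proof can be checked numerically. The following Python sketch (the function names are ours, chosen for illustration only) compares the number of cyclic configurations containing a head with four times the orbit length stated in Lemma 3.4.

```python
def head_configurations(length: int, states: int = 8, letters: int = 3) -> int:
    """Number of cyclic tapes of the given length containing exactly one head."""
    # Choose the head position, the head state, and one tape letter per cell.
    return length * states * letters ** length


def orbit_length(length: int) -> int:
    """Length 2 * l * 3^l of each SMART cycle, as stated in Lemma 3.4."""
    return 2 * length * 3 ** length


# Four orbits of equal length exhaust all configurations containing a head.
for length in range(2, 10):
    assert head_configurations(length) == 4 * orbit_length(length)
```

The check is purely arithmetic: it confirms that the four orbits leave no room for further configurations, but it does not simulate the machine itself.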
3.2 Encoding SMART configurations
Recall that for the SMART machine, $\Gamma = \{0,1,2\}$ and $Q \simeq \{ {\blacktriangleright },{\blacktriangleleft },{\rhd },{\lhd }\} \times \{1,2\} $ . Lemma 3.4 implies that the set of cyclic SMART configurations of length $\ell $ that contain a head
is conjugate (as a finite dynamical system or as a permutation) to a disjoint union of four (depending on whether the head is in state ${\blacktriangleright }, {\blacktriangleleft }, {\rhd }$ , or ${\lhd }$ ) disjoint systems of counters ranging in $\mathbb {Z}/\ell \mathbb {Z} \times \{1,2\} \times \Gamma ^{\ell }$ (respectively for the position of the head, the second component $\{1,2\}$ of Q, and the tape of alphabet $\Gamma $ ). Each of these counters encodes $2\ell \cdot 3^{\ell }$ different values, which is exactly the length of any SMART cycle by Lemma 3.4.
We pick a natural conjugacy $E_{\ell } : \mathcal {C}_{\ell } \to \mathcal {C}_{\ell }$ , a shift-invariant bijection that encodes a SMART configuration into its orbit position in base $2\ell \cdot 3^{\ell }$ . We refer to the conjugacy $E_{\ell }$ as the encoding map.
More precisely, if $w \in \mathcal {C}_{\ell }$ contains a head, then by Lemma 3.4, there exists some $0 \leq n < 2\ell \cdot 3^{\ell }$ such that $w = (T_{\ell ,\mathcal {S}})^n(C_q)$ for some initial configuration $C_q$ ( $q \in \{{\blacktriangleright },{\blacktriangleleft },{\rhd },{\lhd }\}$ ), and the map $E_{\ell }$ encodes the tuple $(q,n)$ in $\mathcal {C}_{\ell }$ as
where:

• $q$ is stored in the first component of the state $\{{\blacktriangleright },{\blacktriangleleft },{\rhd },{\lhd }\}$ ;

• $b \cdot c \in \{1,2\} \cdot \{0,1,2\}^{\ell }$ encodes $n \bmod 2\cdot 3^{\ell }$ , that is, $c$ is a ternary word satisfying $v_{(3)}(c) = n \bmod 3^{\ell }$ , and $b$ stores $(\lfloor n/3^{\ell } \rfloor \bmod 2)+1$ in the second component of the state;

• $a$ is the quotient of $n$ by $2 \cdot 3^{\ell }$ , and is encoded in how much the cyclic configuration is shifted;

• $\varepsilon = +1$ if $q \in \{{\blacktriangleright },{\rhd }\}$ and $\varepsilon = -1$ if $q \in \{{\blacktriangleleft },{\lhd }\}$ , that is, the configuration is shifted to the right (respectively left) if $q \in \{{\blacktriangleright },{\rhd }\}$ (respectively $q \in \{{\blacktriangleleft },{\lhd }\}$ ),
and if $w \in \mathcal {C}_{\ell }$ contains no head (that is, $w \in \Gamma ^{\mathbb {Z}/\ell \mathbb {Z}}$ ), then we set $E_{\ell }(w) = w$ .
In other words, given a configuration $w = (T_{\ell ,\mathcal {S}})^n(C_q)$ for some initial configuration $C_q$ ( $q \in \{{\blacktriangleright },{\blacktriangleleft },{\rhd },{\lhd }\}$ ) and $0 \leq n < 2\ell \cdot 3^{\ell }$ , the map $E_{\ell }$ encodes the tuple $(q,n)$ as plainly (and humanly readable) as possible.
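To make the bookkeeping concrete, here is a minimal Python sketch of how an orbit position $n$ splits into the shift amount $a$, the state component $b$ and the ternary word $c$ described above. The function names and the most-significant-digit-first convention for $c$ are assumptions made for illustration; the sketch only reproduces the arithmetic and does not build the encoded tape.

```python
def to_ternary(value: int, width: int) -> str:
    """Write value in base 3 on exactly `width` digits, most significant first."""
    digits = []
    for _ in range(width):
        value, digit = divmod(value, 3)
        digits.append(str(digit))
    return "".join(reversed(digits))


def split_orbit_position(n: int, length: int):
    """Split an orbit position n into (a, b, c) for cyclic tapes of the given length."""
    period = 2 * 3 ** length            # number of steps between two shifts
    assert 0 <= n < length * period     # positions range over one full orbit
    a = n // period                     # how far the configuration is shifted
    b = (n // 3 ** length) % 2 + 1      # second component of the state, in {1, 2}
    c = to_ternary(n % 3 ** length, length)  # ternary word with value n mod 3^l
    return a, b, c


print(split_orbit_position(1919, 6))
```

For $\ell = 6$ and $n = 1919$, this gives one shift, $b = 1$ and the word $122002$.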
In the next two sections (§§3.3 and 3.4), we detail how this bijection can be computed inductively, that is, we define piecewise-defined bijective maps $F_{\mathrm {init}}$ , $F_{k \to k+1}$ (for $0 \leq k \leq \ell - 2$ ), and $F_{\ell ,\mathrm {final}}$ acting on $\mathcal {C}_{\ell }$ such that
The precise definition of these maps is not strictly necessary to understand the rest of this paper (it is only used in Lemmas 4.18 and 4.20, where we need the piecewise defined functions to satisfy the requirements described in §4.2.3). On a first reading, we recommend the reader simply remembers the idea of encoding configurations into a counter, and goes directly to §4 about the finitary distortion of the SMART machine.
3.3 Analysis of SMART configurations
We now explain how, given a cyclic SMART configuration $w$ of length $\ell $ , we determine which orbit it belongs in and its position in this orbit, that is, the number of steps required to obtain it from its corresponding initial configuration $C_{\blacktriangleright },C_{\blacktriangleleft },C_{\rhd }$ or $C_{\lhd }$ .
We say that a cyclic configuration $w \in \mathcal {C}_{\ell }$ is performing the $j\text {th}$ step of computation of $M_{\blacktriangleright }(k)$ (respectively $M_{\blacktriangleleft }(k), M_{\rhd }(k), M_{\lhd }(k)$ ), for $0 \leq k \leq \ell - 1$ and $0 \leq j \leq f(k)$ , if it contains the $j\text {th}$ pattern of the sequence of transitions $M_{\blacktriangleright }(k)$ (respectively $M_{\blacktriangleleft }(k), M_{\rhd }(k), M_{\lhd }(k)$ ) of Proposition 3.2. At this point, it may not be clear that this is unique, but this will follow from our argument.
If a configuration is performing some step of computation from one of the moves $M_{\blacktriangleright }(k)$ , $M_{\blacktriangleleft }(k)$ , $M_{\rhd }(k)$ , or $M_{\lhd }(k)$ , we refer to this move as its computation of level k.
3.3.1 Initialization
We call the following patterns special patterns of level k (for $1 \leq k \leq \ell - 1$ ):
and the following are special patterns of level $\ell $ :
The latter appear exactly in the shifts of the configurations $\mathcal {S}^{-1}(C_q)$ , for $C_q$ the initial configurations ( $q \in \{{\blacktriangleright },{\blacktriangleleft },{\rhd },{\lhd }\}$ ).
By the proof of Proposition 3.2, we see that if a cyclic configuration contains a special pattern $s({\blacktriangleright }_2,k)$ , $s({\blacktriangleright }_1,k)$ , $s({\blacktriangleleft }_2,k)$ , or $s({\blacktriangleleft }_1,k)$ (respectively $s({\rhd }_2,k)$ , $s({\rhd }_1,k)$ , $s({\lhd }_2,k)$ , or $s({\lhd }_1,k)$ ), then it performs the last two steps of $M_{\blacktriangleright }(k)$ or $M_{\blacktriangleleft }(k)$ respectively (respectively the first two steps of $M_{\rhd }(k)$ or $M_{\lhd }(k)$ ).
Claim 3.5. Given a cyclic configuration $w$ of length $\ell $ containing a head, exactly one of the following holds:

• $w$ is the shift of an initial configuration;

• $w$ performs some step of computation of level $0$ from either $M_{\blacktriangleright }(0)$ , $M_{\blacktriangleleft }(0)$ , $M_{\rhd }(0)$ , or $M_{\lhd }(0)$ ;

• $w$ contains a special pattern of level $1 \leq k \leq \ell $ .
Proof. The patterns of $M_{\blacktriangleright }(0)$ , $M_{\blacktriangleleft }(0)$ , $M_{\rhd }(0)$ , and $M_{\lhd }(0)$ (eight in total), along with the special patterns of every level, disjointly cover all the non-initial configurations with a head (the level is determined by the distance to the nearest nonzero symbol in an appropriate direction).
3.3.2 Inductive analysis $k \to k+1$ (for $k+1 \leq \ell - 1$ )
Let $k \leq \ell - 2$ be an integer and $w$ be a non-initial cyclic configuration. If $w$ performs some computation of level $k$ (for $0 \leq k \leq \ell - 2$ ), then it performs some computation of level $k+1$ : indeed, Figure 2 shows that any computation of level $k$ belongs to some computation of level $k+1$ , and that the latter is uniquely determined by considering the value of two cells (circled on the figure) which are left unmodified by the computation of level $k$ .
Additionally, if we know that $w$ performs the $j\text {th}$ step of some computation of level $k$ (for $0 \leq j \leq f(k)$ ), then the same case analysis determines $j'$ ( $0 \leq j' \leq f(k+1)$ ) such that $w$ performs the $j'\text {th}$ step of its computation of level $k+1$ .
Example 3.6. Consider the cyclic configuration of length $\ell = 6$ :
First, $w$ performs step $0$ of $M_{\blacktriangleright }(0)$ (by simply considering all the computations of level $0$ ). Indeed, the highlighted pattern inside the configuration below is the $0\text {th}$ pattern of the move $M_{\blacktriangleright }(0)$ :
Then, by looking at Figure 2 (four cases, with three subcases each), we deduce successively that:

(1) by considering Figure 2 (case $M_{\blacktriangleright }(0)$ , second subcase),
we deduce that $w$ performs step $2 = 0 + (f(0)+1)$ of $M_{\blacktriangleleft }(1)$ . Additionally, the computation extends to the left, that is, the move $M_{\blacktriangleleft }(1)$ appears in the following highlighted pattern:

(2) by considering Figure 2 (case $M_{\blacktriangleleft }(1)$ , third subcase),
we deduce that $w$ performs step $4 = 2 + 2$ of $M_{\lhd }(2)$ . Additionally, the computation extends to the right and the move $M_{\lhd }(2)$ appears in the following highlighted pattern:

(3) by considering Figure 2 (case $M_{\lhd }(2)$ , first subcase),
we deduce that $w$ performs step $58 = 4 + (2f(2)+4)$ of $M_{\lhd }(3)$ . Additionally, the computation extends to the right and the move $M_{\lhd }(3)$ appears in the following highlighted pattern (remember that configurations are cyclic):

(4) by considering Figure 2 (case $M_{\lhd }(3)$ , second subcase),
we deduce that $w$ performs step $218 = 58 + (2f(3)+2)$ of $M_{\blacktriangleleft }(4)$ . Additionally, the computation extends to the left and the move $M_{\blacktriangleleft }(4)$ appears in the following highlighted pattern:

(5) by considering Figure 2 (case $M_{\blacktriangleleft }(4)$ , second subcase),
we deduce that $w$ performs step $460 = 218 + (f(4)+1)$ of $M_{\blacktriangleright }(5)$ .
On the final step, we should extend to the right; since we run out of cells on the tape, this means that we should interpret the circled $2$ cell as both the first and last cell of the $M_{\blacktriangleright }(5)$ computation, that is, the move $M_{\blacktriangleright }(5)$ appears in the following highlighted self-overlapping pattern:
So we claim that $w$ performs step $460 = 218 + (f(4)+1)$ of $M_{\blacktriangleright }(5)$ . And one can verify by a direct calculation (for example, by computer) that
as we just deduced.
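The step counts of this example are easy to re-derive mechanically. The short Python sketch below assumes $f(k) = 3^{k+1} - 2$, the reading of Proposition 3.2 that is consistent with the recurrence $f(k+1) = 3f(k) + 4$ used in its proof; the list `offsets` simply copies the five increments read off Figure 2 in items (1) to (5), and the variable names are ours.

```python
def f(k: int) -> int:
    """Number of transitions of a move of level k (assumed closed form)."""
    return 3 ** (k + 1) - 2


# The closed form agrees with the recurrence from the proof of Proposition 3.2.
assert f(0) == 1
assert all(f(k + 1) == 3 * f(k) + 4 for k in range(20))

# Increments applied when passing from level k to level k + 1 in Example 3.6.
offsets = [f(0) + 1, 2, 2 * f(2) + 4, 2 * f(3) + 2, f(4) + 1]

steps = [0]  # w performs step 0 of its computation of level 0
for offset in offsets:
    steps.append(steps[-1] + offset)

print(steps)
```

The successive entries of `steps` are the step numbers at levels $0$ through $5$, ending with step $460$ of the computation of level $5$.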
3.3.3 Conclusion
Let $w \in \mathcal {C}_{\ell }$ be a cyclic tape of length $\ell $ containing a head. By the claim above, there are three different cases.
Either $w$ is the shift of an initial configuration, in which case the state of $w$ determines which orbit $w$ belongs to, and counting the shift (and multiplying it by $2 \cdot 3^{\ell }$ ) is enough to know the position of the configuration $w$ in its orbit. A similar reasoning applies for configurations $w$ that contain a special pattern of level $\ell $ .
Otherwise, assume $w$ contains some computation of level $0$ . Then $w$ contains computations of every level $k$ for $0 \leq k \leq \ell - 1$ by the previous induction. Similarly, if $w$ contains a special pattern of level $k$ for some $0 \leq k \leq \ell - 1$ , then $w$ corresponds to either the last two steps of $M_{\blacktriangleright }(k)$ or $M_{\blacktriangleleft }(k)$ , or the first two steps of $M_{\rhd }(k)$ or $M_{\lhd }(k)$ . Then $w$ corresponds to computations of every level $k'$ for $k \leq k' \leq \ell - 1$ by the previous induction.
Finally, the structure of each SMART cycle (detailed in the proof of Proposition 3.3) allows us to determine which orbit the configuration $w$ belongs to, and its position in said orbit.
Example 3.7. Consider once again the cyclic configuration of length $\ell = 6$ ,
By the previous example, $w$ performs step $460$ of $M_{\blacktriangleright }(5)$ on the following (self-overlapping) pattern:
Considering the proof of Proposition 3.3, only the orbits of $C_{\rhd }$ and $C_{\lhd }$ have a cell containing tape letter $2$ that is left unchanged during a computation of level $\ell - 1$ . Additionally, out of these two, only the orbit of $C_{\rhd }$ contains the move $M_{\blacktriangleright }(5)$ .
So $w$ belongs to the orbit of $C_{\rhd }$ . From the proof of Proposition 3.3, we deduce that, modulo $2 \cdot 3^{\ell }$ , the position of $w$ in said orbit is $1+460$ . Finally, $2$ being the second cell of the tape, one shift already happened in the orbit: we conclude that the position of $w$ in the orbit of $C_{\rhd }$ is $2\cdot 3^{\ell } + 461 = 1919$ .
Indeed, one can verify by a direct calculation that
as we just deduced.
3.4 Encoding cyclic configurations into their orbit positions in $\mathcal {C}_{\ell }$
Denote by $\mathcal {S}$ the SMART machine introduced above. Recall the encoding map $E_{\ell } : \mathcal {C}_{\ell } \to \mathcal {C}_{\ell }$ defined in §3.2 as
if $w$ contains a head (in which case, there exists $n$ and $q \in \{{\blacktriangleright },{\blacktriangleleft },{\rhd },{\lhd }\}$ such that $w = T_{\ell ,\mathcal {S}}^n(C_q)$ , and we set $b \cdot c \in \{1,2\} \cdot \{0,1,2\}^{\ell }$ to encode $n \bmod 2 \cdot 3^{\ell }$ , and $a$ is the quotient of $n$ by $2 \cdot 3^{\ell }$ ); and if $w \in \mathcal {C}_{\ell }$ contains no head, then we set $E_{\ell }(w) = w$ .
3.4.1 Inductive encoding
In this section, we use the analysis performed in §3.3 to provide a linear-time algorithm that computes this encoding $E_{\ell } : \mathcal {C}_{\ell } \to \mathcal {C}_{\ell }$ inductively.
Figure 3 describes a piecewise-defined bijection $F_{\mathrm {init}} : \mathcal {C}_{\ell } \to \mathcal {C}_{\ell }$ : each case describes how a pattern of length $2$ is bijectively replaced by another. In other words, if a cyclic configuration $w$ contains a subpattern of length $2$ that matches one case of Figure 3, then $F_{\mathrm {init}}$ replaces this subpattern in $w$ by its image in the figure.
Similarly, Figures 4 and 5 together describe a piecewise-defined bijection $F_{k \to k+1} : \mathcal {C}_{\ell } \to \mathcal {C}_{\ell }$ that replaces subpatterns of length $k+4$ . Intuitively, Figure 4 (defining the first half of $F_{k \to k+1}$ ) encodes moves of level $k+1$ into counters, while Figure 5 (the second half of $F_{k \to k+1}$ ) encodes special patterns of level $k+1$ .
Finally, Figure 6 describes a similar bijection $F_{\ell ,\mathrm {final}} : \mathcal {C}_{\ell } \to \mathcal {C}_{\ell }$ .
We can then prove the following lemma.
Lemma 3.8. Let $w$ be a cyclic configuration of $\mathcal {C}_{\ell }$ . Then,
Sketch of proof.
Let $w$ be a configuration and $k$ be some integer $k \leq \ell - 1$ . If $w$ is neither the shift of an initial configuration nor a special configuration of level $> k$ , then there exists some $q \in \{{\blacktriangleright },{\blacktriangleleft },{\rhd },{\lhd }\}$ and a unique word $p = p_0 \cdots p_{k+1}$ of length $k+2$ such that $p \sqsubseteq w$ and $p$ computes the $j\text {th}$ step of $M_q(k)$ for some $0 \leq j \leq f(k) = 3^{k+1} - 2$ .
Then by induction on $k$ , one sees that the partial composition
when applied on $w$ , replaces $p$ in $w$ by another pattern $p'$ of the same length defined as follows:
where $c \in \{0,1,2\}^{k+1}$ is a ternary counter such that $\mathrm {v}_3(c) = j+1$ . Notice that $0 \leq j \leq f(k) - 2$ , where $f(k) = 3^{k+1} - 2$ ; so that $1 \leq j +1 \leq f(k) - 1$ fits exactly in the space of nonzero counters. The zero counters are reserved for (shifts of) initial and special configurations.
Finally,
is equal to $E_{\ell }$ by considering how $F_{\ell ,\mathrm {final}}$ acts in accordance with the structure of the four disjoint cycles of SMART (detailed in the proof of Proposition 3.3).
Example 3.9. Consider once again the cyclic configuration of length $\ell = 6$ from Example 3.6, and let us use the formulas above to encode it into a counter value. This process will roughly mirror Examples 3.6 and 3.7, except that due to our coding convention, the counter value is one larger than the correct one until the very last step. First, we apply $F_{{\mathrm {init}}}$ , which observes based on the highlighted cells
that the first case of $M_{\blacktriangleright }(0)$ applies, and rewrites this as
Next, we apply $F_{0 \to 1}$ . Based on the highlighted cells, the fourth case of $M_{\blacktriangleright }(0)$ applies and we rewrite this as
Next, we apply $F_{1 \to 2}$ . Based on the highlighted cells, the fifth case of $M_{\blacktriangleleft }(1)$ applies and we rewrite this as
Next, we apply $F_{2 \to 3}$ . Based on the highlighted cells, the first case of $M_{\lhd }(2)$ applies and we rewrite this as
Next, we apply $F_{3 \to 4}$ . Based on the highlighted cells, the third case of $M_{\lhd }(3)$ applies and we rewrite this as
Next, we apply $F_{4 \to 5}$ . Based on the highlighted cells, the fourth case of $M_{\blacktriangleleft }(4)$ applies and we rewrite this as
Finally, we apply $F_{{\mathrm {final}}}$ (in Figure 6). We are in the fourth case, where we do not touch the counter and just change the state to ${\rhd }_1$ :
This gives the expected result: as $v_3(122002) = 461$ , and the head is shifted once to the right in the tape, this configuration has position $1 \cdot (2 \cdot 3^{\ell }) + 461 = 1919$ in the orbit of the initial configuration corresponding to state ${\rhd }_1$ .
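The final read-off can be sketched in a few lines of Python. The helper names below are hypothetical, and we assume (as the value $v_3(122002) = 461$ suggests) that the counter is read with its most significant digit first; the formula combines the shift amount $a$, the state component $b$ and the ternary word $c$ as in §3.2.

```python
def ternary_value(word: str) -> int:
    """Value of a ternary word read with the most significant digit first."""
    value = 0
    for digit in word:
        value = 3 * value + int(digit)
    return value


def orbit_position(a: int, b: int, c: str) -> int:
    """Orbit position encoded by shift amount a, state component b and counter c."""
    length = len(c)  # the counter occupies the whole tape of length l
    return a * 2 * 3 ** length + (b - 1) * 3 ** length + ternary_value(c)


# Example 3.9: counter 122002, state component 1, head shifted once.
print(ternary_value("122002"), orbit_position(1, 1, "122002"))
```

This only decodes an already encoded configuration; the maps $F_{\mathrm {init}}$ , $F_{k \to k+1}$ and $F_{\ell ,\mathrm {final}}$ that produce the encoding depend on Figures 3 to 6 and are not reproduced here.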
4 Finitary distortion for SMART
In this section, we first introduce the group $\mathcal {G}_{\ell }$ generated by Turing machine instructions (§4.1.2), and we slightly alter the SMART machine $\mathcal {S}$ (§4.1.1): we call it the decorated SMART since we add some additional components to its states. The action of the SMART machine extends trivially to the new decorations: new components are added as a Cartesian product, and the head simply carries its new components without modifying or reading them.
Denoting by $T_{\ell ,\mathcal {S}} : \mathcal {C}_{\ell } \to \mathcal {C}_{\ell }$ the finite action of this decorated SMART on the cyclic tapes of $\mathcal {C}_{\ell }$ , we then establish Lemma 4.2, an intermediary result about $T_{\ell ,\mathcal {S}}$ . Namely, this automorphism is ‘finitarily distorted’ in $\mathcal {G}_{\ell }$ , in the sense that all its powers (including those exponential in $\ell $ ) have word norm polynomial in $\ell $ under the fixed generators (which is exponentially lower than the order of the group would suggest). The rest of this section is dedicated to the proof of this lemma.
Later in §5, we prove that every nontrivial full shift contains a distortion element of infinite order. Technically, §5 focuses on transporting the finite actions $T_{\ell ,\mathcal {S}}$ of the decorated SMART into an infinite action on a full shift (and showing that this transport preserves distortion). In other words, the ‘distortion’ aspect of Lemma 5.1 entirely comes from Lemma 4.2, that is, from the main result of this section.
4.1 Context and results
4.1.1 Decorated SMART on $\mathcal {C}_{\ell }$
Let $\mathcal {S}^{(\textrm {o})} = (Q^{(\textrm {o})},\Gamma ^{(\textrm {o})},\Delta ^{(\textrm {o})})$ be the SMART machine (see §3). Define the decorated version of the SMART machine as $\mathcal {S}_{\mathrm {dec}} = (Q,\Gamma ,\Delta )$ , with
for $D = \{{d_{1}},{d_{2}}\}$ and a finite set $G$ of ghost symbols, that is, the states of SMART now carry a state $q^{(\textrm {o})} \in Q^{(\textrm {o})}$ of the original machine $\mathcal {S}^{(\textrm {o})}$ , a special symbol $d \in D$ called the duck, and a ghost symbol $x \in G$ . We have $\lvert Q\rvert = 96$ .
Technically, the two SMART machines $\mathcal {S}_{\mathrm {dec}}$ and $\mathcal {S}^{(\textrm {o})}$ are different: for one, they act on different sets of cyclic tapes (since they have different sets of states). However, they have very similar behaviors, as the decorated machine only carries its decoration unmodified in its state while acting on tapes. To refer to the original set of states of SMART, we will use $Q^{(\textrm {o})}$ , and Q will denote $Q = Q^{(\textrm {o})} \times D \times G$ .
Since the remainder of this article only uses the decorated version of SMART, in what follows, $\mathcal {S}$ will refer to the decorated version of the machine, despite lacking the ‘ ${\mathrm {dec}}$ ’ subscript. This should not cause confusion, as the machines act on different sets.
The point of the ghost is to allow us to condition the application of gates, and to build the permutations we perform in §4.2. The duck $d \in \{{d_{1}},{d_{2}}\}$ will be important during intermediate steps of computation in §4.2, to realize piecewise defined functions.
For $S \subseteq \mathcal {C}_{\ell }$ a subset of finite cyclic tapes, we denote by $S[{d_{1}}]$ and $S[{d_{2}}]$ the subsets of the tapes of S containing a head, and whose ducks respectively are $d = {d_{1}}$ and $d = {d_{2}}$ . For $d \in \{{d_{1}},{d_{2}}\}$ and a function $f : S \to S$ , we abuse notation and denote by $f\hspace {0.05em}/\hspace {0.2em}_{d}$ the extended restriction $f\hspace {0.05em}/\hspace {0.2em}_{S[d]}$ of $f$ to $S[d]$ :
4.1.2 Group of Turing machine instructions on finite cyclic tapes
Recall that $\mathcal {C}_{\ell }$ is the set of finite cyclic tapes of length $\ell \geq 2$ (see §2) with states Q and tape alphabet $\Gamma $ , containing at most one head (that is, a letter in $Q \times \Gamma $ ).
Let $\mathcal {G}_{\ell }$ be the finitely generated subgroup of $\mathrm {Sym}(\mathcal {C}_{\ell })$ generated by state-dependent moves, and the unary gates permuting heads. Formally, for $g \in \mathrm {Sym}(Q \times \Gamma )$ , define the unary gate $\pi _g \in \mathrm {Sym}(\mathcal {C}_{\ell })$ as
for a cyclic tape $w \in \mathcal {C}_{\ell }$ . In addition, for $q \in Q$ , define the state-dependent right move $\rho _q \in \mathrm {Sym}(\mathcal {C}_{\ell })$ as
for $w \in \mathcal {C}_{\ell }$ and $\pi _{\Gamma } : \Gamma \cup (Q \times \Gamma ) \to \Gamma $ the natural projection.
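As an illustration, here is a minimal Python sketch of these two kinds of generators under one natural reading of the definitions: a unary gate applies $g$ to the head cell (if any), and the right move $\rho _q$ moves the head one cell to the right, cyclically, exactly when its state is $q$. The tape representation (a tuple whose head cell is a `(state, letter)` pair) and the names `unary_gate`, `right_move` and `cyclic_tapes` are assumptions made for this sketch, not notation from the text.

```python
from itertools import product


def cyclic_tapes(states, letters, length):
    """Enumerate all cyclic tapes of the given length with at most one head."""
    for word in product(letters, repeat=length):
        yield word  # tape without a head
        for i, q in product(range(length), states):
            yield word[:i] + ((q, word[i]),) + word[i + 1:]


def unary_gate(g, tape):
    """Apply the permutation g (a dict on (state, letter) pairs) to the head cell."""
    return tuple(g[cell] if isinstance(cell, tuple) else cell for cell in tape)


def right_move(q, tape):
    """Move the head one cell to the right (cyclically) if its state is q."""
    n = len(tape)
    assert n >= 2  # tapes of length 1 would need a separate convention
    for i, cell in enumerate(tape):
        if isinstance(cell, tuple) and cell[0] == q:
            j = (i + 1) % n
            moved = list(tape)
            moved[i] = cell[1]       # project the old head cell to its tape letter
            moved[j] = (q, tape[j])  # the head arrives on the next cell
            return tuple(moved)
    return tape  # no head, or head in another state: tape is unchanged


tapes = list(cyclic_tapes("pq", (0, 1), 3))
# Both maps permute the finite set of cyclic tapes.
assert {right_move("q", t) for t in tapes} == set(tapes)
```

Under this reading, left moves are obtained as inverses of right moves, which is why only right moves and unary gates are needed as generators.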
We then define the group $\mathcal {G}_{\ell } \leq \mathrm {Sym}(\mathcal {C}_{\ell })$ generated by these permutations:
We can see the group $\mathcal {G}_{\ell }$ as the group generated by the instructions of Turing machines: moving heads based on their states, or permuting their values. To ease notation, we denote $\prod _{q \in Q} \rho _q$ by $\rho $ . This (finite) group is equipped with a metric, that is, the word norm given by the generators $\pi _g$ and $\rho _q$ .
It is easy to see that for any reversible Turing machine $\mathcal {M}$ of states Q and tape alphabet $\Gamma $ , $T_{\ell ,\mathcal {M}}$ is an element of $\mathcal {G}_{\ell }$ . Indeed, a step of computation is the composition of a head permutation $\alpha $ of $Q \times \Gamma $ , followed by state-dependent moves $\beta _{+1}$ and $\beta _{-1}$ :
Finally, we denote by $\delta (\ell ,n)$ the word norm of $(T_{\ell ,\mathcal {M}})^n$ in $\mathcal {G}_{\ell }$ . In this section, we focus on proving that $\delta (\ell ,n)$ is polynomial in $\ell $ for powers of the decorated SMART machine (even for powers exponential in $\ell $ ).
Remark 4.1. It can be shown that $\mathcal {G}_{\ell }$ is, for large enough $\ell $ , $\lvert Q\rvert $ , and $\lvert \Gamma \rvert $ (in particular for all versions of the SMART machine we consider and for $\ell \geq 2$ ), of bounded index in the automorphism group of $\mathcal {C}_{\ell }$ under the shift action of $\mathbb {Z}_{\ell }$ . This fact is not particularly useful to us, however: the role of $\mathcal {G}_{\ell }$ is to provide a group in which the SMART machine corresponds to an element of small word norm (far smaller than the radius of the group).
4.1.3 Main result: finitary distortion of the decorated SMART
Recall that $m: \mathbb {N} \to \mathbb {N}$ is the movement function, that is, $m(n)$ is the maximal number of cells the machine $\mathcal {S}$ can visit in $n$ steps; and that $\delta (\ell ,n)$ is the word norm of $(T_{\ell ,\mathcal {S}})^n$ in $\mathcal {G}_{\ell }$ .
Lemma 4.2. Let $\mathcal {S}$ be the (decorated) SMART machine.

(1) $T_{\mathcal{S}}$ has infinite order.

(2) There exist some $C,C'> 0$ such that $m(n) \leq C \log n + C'$ .

(3) There exists some $p> 0$ such that