Seminorm control for ergodic averages with commuting transformations along pairwise dependent polynomials

NIKOS FRANTZIKINAKIS; BORYS KUCA

doi:10.1017/etds.2022.117

Part of: Measure-theoretic ergodic theory Ergodic theory

Published online by Cambridge University Press: 23 January 2023

NIKOS FRANTZIKINAKIS: Affiliation:
Department of Mathematics and Applied Mathematics, University of Crete, Voutes University Campus, Heraklion 71003, Greece (e-mail: frantzikinakis@gmail.com)
BORYS KUCA: Affiliation:
Department of Mathematics and Applied Mathematics, University of Crete, Voutes University Campus, Heraklion 71003, Greece (e-mail: boryskuca@uoc.gr)

Abstract

We examine multiple ergodic averages of commuting transformations with polynomial iterates in which the polynomials may be pairwise dependent. In particular, we show that such averages are controlled by the Gowers–Host–Kra seminorms whenever the system satisfies some mild ergodicity assumptions. Combining this result with the general criteria for joint ergodicity established in our earlier work, we determine a necessary and sufficient condition under which such averages are jointly ergodic, in the sense that they converge in the mean to the product of integrals, or weakly jointly ergodic, in that they converge to the product of conditional expectations. As a corollary, we deduce a special case of a conjecture by Donoso, Koutsogiannis, and Sun in a stronger form.

Keywords

joint ergodicity; ergodic averages; ergodic seminorms; characteristic factors; Host–Kra factors

MSC classification

Primary: 37A05: Measure-preserving transformations

Secondary: 37A30: Ergodic theorems, spectral theory, Markov operators; 28D05: Measure-preserving transformations

Information

Type: Original Article
Information: Ergodic Theory and Dynamical Systems, Volume 43, Issue 12, December 2023, pp. 4074-4137

DOI: https://doi.org/10.1017/etds.2022.117
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright: © The Author(s), 2023. Published by Cambridge University Press

1 Introduction

1.1 Main results

An important question in ergodic theory is to examine the limiting behavior of multiple ergodic averages of the form

(1)

$$ \begin{align} \frac{1}{N}\sum_{n=1}^N T_1^{p_1(n)}f_1 \cdots T_{\ell}^{p_{\ell}(n)}f_{\ell}. \end{align} $$

Here and throughout the paper, we consider a system $(X, {\mathcal X}, \mu , T_1, \ldots , T_{\ell })$ , that is, invertible commuting measure-preserving transformations $T_1, \ldots , T_{\ell }$ acting on a Lebesgue probability space $(X, {\mathcal X}, \mu )$ , polynomials $p_1, \ldots , p_{\ell }\in {\mathbb Z}[n]$ that need not be distinct but are always assumed to have zero constant terms, and functions $f_1, \ldots , f_{\ell }\in L^{\infty }(\mu )$ . The motivation for studying the limiting behavior of (1) comes from the proof of the multidimensional polynomial Szemerédi theorem [Reference Bergelson and Leibman3], in which the averages from equation (1) are the central object of investigation.

It has been proved by Walsh [Reference Walsh19] that the averages from equation (1) converge in $L^2(\mu )$ ; however, little is known about the nature of the limit except in several special cases. In this paper, we examine the following question.

Question 1. When are the polynomials $p_1, \ldots , p_{\ell }\in {\mathbb Z}[n]$

(i) jointly ergodic for the system $(X, {\mathcal X}, \mu , T_1, \ldots , T_{\ell })$ , in the sense that
(2) $$ \begin{align} \lim_{N\to\infty}\bigg\Vert \frac{1}{N}\sum_{n=1}^N T_1^{p_1(n)}f_1 \cdots T_{\ell}^{p_{\ell}(n)}f_{\ell} - \int f_1\, d\mu \cdots \int f_{\ell}\, d\mu\bigg\Vert_{L^2(\mu)} = 0 \end{align} $$

for all $f_1, \ldots , f_{\ell }\in L^{\infty }(\mu )$ ?
(ii) weakly jointly ergodic for the system, in the sense that
(3) $$ \begin{align} \lim_{N\to\infty}\bigg\Vert \frac{1}{N}\sum_{n=1}^N T_1^{p_1(n)}f_1 \cdots T_{\ell}^{p_{\ell}(n)}f_{\ell} - {\mathbb{E}}(f_1|{\mathcal I}(T_1)) \cdots {\mathbb{E}}(f_{\ell}|{\mathcal I}(T_{\ell}))\bigg\Vert_{L^2(\mu)} = 0 \end{align} $$

for all $f_1, \ldots , f_{\ell }\in L^{\infty }(\mu )$ ?

The first step in deriving the identities of equations (2) and (3) is usually to establish control over the $L^2(\mu )$ limit of equation (1) by one of the Gowers–Host–Kra seminorms constructed in [Reference Host and Kra13], leading to the following question.

Question 2. When are the polynomials $p_1, \ldots , p_{\ell }\in {\mathbb Z}[n]$ good for seminorm control for the system $(X, {\mathcal X}, \mu , T_1, \ldots , T_{\ell })$ , in the sense that there exists $s\in {\mathbb N}$ such that

(4)

$$ \begin{align} \lim_{N\to\infty}\bigg\Vert \frac{1}{N}\sum_{n=1}^N T_1^{p_1(n)}f_1 \cdots T_{\ell}^{p_{\ell}(n)}f_{\ell}\bigg\Vert_{L^2(\mu)} = 0 \end{align} $$

holds for all functions $f_1, \ldots , f_{\ell }\in L^{\infty }(\mu )$ whenever $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_j|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s,T_j} = 0$ for some $j \in \{1, \ldots , \ell \}$ ?

Question 1(i) was originally posed by Bergelson and was motivated by a result of Berend and Bergelson [Reference Berend and Bergelson1] that covered the case of linear polynomials. It was investigated thoroughly by Donoso, Koutsogiannis, and Sun [Reference Donoso, Koutsogiannis and Sun7], as well as in a subsequent work of the three authors and Ferré-Moragues [Reference Donoso, Ferré-Moragues, Koutsogiannis and Sun6], in which they identified a set of sufficient (but not necessary) conditions under which Questions 1(i) and 2 can be answered affirmatively for general polynomials. Their conditions turned out to also be necessary when all the polynomials are equal.

In [Reference Frantzikinakis and Kuca11], we addressed Questions 1 and 2 under the assumption that the polynomials $p_1, \ldots , p_{\ell }$ are pairwise independent, without any extra assumption on the system. Specifically, we showed that for every family of pairwise independent polynomials $p_1, \ldots , p_{\ell }\in {\mathbb Z}[n]$ , there exists $s\in {\mathbb N}$ such that the identity in equation (4) holds for all systems and all $L^{\infty }(\mu )$ functions under the stated seminorm assumptions. We then gave a necessary and sufficient spectral condition under which the identities in equations (2) and (3) hold.

In this paper, we drop the assumption of pairwise independence. We are thus interested in answering Questions 1 and 2 for averages from equation (1) in which some of the polynomial sequences $p_1, \ldots , p_{\ell }$ may be pairwise dependent or even identical. An example of this is the average

(5)

$$ \begin{align} \frac{1}{N}\sum_{n=1}^N T_1^{n^2}f_1 \cdot T_2^{n^2}f_2 \cdot T_3^{n^2+n} f_3. \end{align} $$

The pairwise dependence of the polynomials $n^2, n^2, n^2+n$ means that, contrary to the results in [Reference Frantzikinakis and Kuca11], we cannot establish the seminorm control described in Question 2 for all systems. Rather, we need to identify a special property of the system that makes the seminorm control possible. The needed property turns out to be the following.

Definition. (Good and very good ergodicity property)

Let $\ell \in {\mathbb N}$ and $p_1,\ldots , p_{\ell }\in {\mathbb Z}[n]$ . We say that the system $(X,{\mathcal X},\mu ,T_1,\ldots , T_{\ell })$ has the good ergodicity property for the polynomials $p_1,\ldots , p_{\ell }$ (sometimes we also say that the polynomials $p_1,\ldots , p_{\ell }$ have the good ergodicity property for the system $(X,{\mathcal X},\mu ,T_1,\ldots , T_{\ell })$ , or the tuple $(T_j^{p_j(n)})_{j=1,\ldots , \ell }$ has the good ergodicity property), if whenever $p_i/c_i=p_j/c_j$ for some $i\neq j$ and non-zero $c_i,c_j\in {\mathbb Z}$ with $\gcd (c_i,c_j)=1$ , then $ {\mathcal I}(T_i^{c_i}T_j^{-c_j}) = {\mathcal I}(T_i)\cap {\mathcal I}(T_j)$ (we assume throughout that all the equalities and inclusions of $\sigma $ -algebras hold up to null sets with respect to a given measure on the system), that is, a function cannot be invariant under $T_i^{c_i}T_j^{-c_j}$ except in the trivial case when it is simultaneously invariant under $T_i$ and $T_j$ . If $T_i^{c_i}T_j^{-c_j}$ is ergodic for all the aforementioned indices $i,j$ and values $c_i, c_j$ , then we say that the system has the very good ergodicity property for the polynomials $p_1,\ldots , p_{\ell }$ .

Remark. As we work under the standing assumption that the polynomials have zero constant terms, the equality $p_i/c_i=p_j/c_j$ holds for some non-zero $c_i, c_j\in {\mathbb Z}$ precisely when the polynomials $p_i, p_j$ are linearly dependent.

For instance, the system $(X, {\mathcal X}, \mu , T_1, T_2, T_3)$ has the good ergodicity property for the families $n^2, n^2, n^2 + n$ or $2n^2, 2n^2, n^2+n$ if and only if the only functions invariant under $T_1 T_2^{-1}$ are those invariant under $T_1$ and $T_2$ , and it has the very good ergodicity property if $T_1 T_2^{-1}$ is ergodic, that is, only constant functions are invariant under $T_1 T_2^{-1}$ .
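As a further illustration of the definition (our own example, not taken from the results above), consider the hypothetical pair $p_1 = 2n^2$ and $p_2 = 3n^2$ . Then

$$ \begin{align*} \frac{p_1}{2} = \frac{p_2}{3} = n^2, \qquad \gcd(2,3) = 1, \end{align*} $$

so for this pair, the good ergodicity property asks that ${\mathcal I}(T_1^{2}T_2^{-3}) = {\mathcal I}(T_1)\cap {\mathcal I}(T_2)$ , and the very good ergodicity property asks that $T_1^{2}T_2^{-3}$ be ergodic.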

We first address Question 2 for systems with the good ergodicity property.

Theorem 1.1. (Seminorm control)

Let $D, \ell \in {\mathbb N}$ and $p_1, \ldots , p_{\ell }\in {\mathbb Z}[n]$ be polynomials of degrees at most D with the good ergodicity property for $(X,{\mathcal X}, \mu ,T_1, \ldots , T_{\ell })$ . Then there exists $s\in {\mathbb N}$ , depending only on $D, \ell $ , such that for all $f_1, \ldots , f_{\ell }\in L^{\infty }(\mu )$ , we have

$$ \begin{align*} \lim_{N\to\infty}\bigg\Vert \frac{1}{N}\sum_{n=1}^N T_1^{p_1(n)}f_1 \cdots T_{\ell}^{p_{\ell}(n)}f_{\ell}\bigg\Vert_{L^2(\mu)} = 0 \end{align*} $$

whenever $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_j|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_j} = 0$ for some $j\in \{1, \ldots , \ell \}$ .
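The following sketch, under assumptions of our own choosing, suggests why some hypothesis on the system is needed for the average in equation (5). Suppose, hypothetically, that $T_1 = T_2 = T$ is weakly mixing, let $f\in L^{\infty }(\mu )$ be non-zero with $\int f\, d\mu = 0$ , and take $f_1 = f$ , $f_2 = \overline {f}$ , $f_3 = 1$ . Then $T_1 T_2^{-1}$ is the identity, so the good ergodicity property fails, and

$$ \begin{align*} \frac{1}{N}\sum_{n=1}^N T_1^{n^2}f \cdot T_2^{n^2}\overline{f} \cdot T_3^{n^2+n} 1 = \frac{1}{N}\sum_{n=1}^N T^{n^2}|f|^2, \qquad \int \frac{1}{N}\sum_{n=1}^N T^{n^2}|f|^2\, d\mu = \Vert f\Vert_{L^2(\mu)}^2> 0. \end{align*} $$

On a probability space, the $L^2(\mu )$ norm dominates the absolute value of the integral, so these averages do not converge to $0$ in $L^2(\mu )$ . At the same time, by the standard fact that the Host–Kra factors of a weakly mixing system are trivial, we have $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T} = |\int f\, d\mu | = 0$ for every $s\in {\mathbb N}$ , so no seminorm control of the kind in Theorem 1.1 can hold for this system.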

Subsequently, we use Theorem 1.1 and results from [Reference Frantzikinakis and Kuca11] to address Question 1. All the concepts appearing in the results below will be defined precisely in §2.

Theorem 1.2. (Weak joint ergodicity)

The polynomials $p_1, \ldots , p_{\ell }\in {\mathbb Z}[n]$ are weakly jointly ergodic for the system $(X,{\mathcal X}, \mu ,T_1, \ldots , T_{\ell })$ if and only if the following two conditions hold:

(i) the system has the good ergodicity property for the polynomials;
(ii) for all non-ergodic eigenfunctions $\chi _j\in {\mathcal E}(T_j)$ , $j\in \{1, \ldots , \ell \}$ , we have
(6) $$ \begin{align} \lim_{N\to\infty}\bigg\Vert \frac{1}{N}\sum_{n=1}^N T_1^{p_1(n)}\chi_1 \cdots T_{\ell}^{p_{\ell}(n)}\chi_{\ell} - {\mathbb{E}}(\chi_1|{\mathcal I}(T_1)) \cdots {\mathbb{E}}(\chi_{\ell}|{\mathcal I}(T_{\ell}))\bigg\Vert_{L^2(\mu)} = 0. \end{align} $$

Remark. Using terminology from [Reference Frantzikinakis9, Reference Frantzikinakis and Kuca11], condition (ii) equivalently states that the polynomials $p_1,\ldots , p_{\ell }$ are good for equidistribution for the system $(X,\mu ,T_1,\ldots , T_{\ell })$ .

If additionally the transformations $T_1, \ldots , T_{\ell }$ are ergodic, we get the following result.

Corollary 1.3. (Joint ergodicity)

The polynomials $p_1, \ldots , p_{\ell }\in {\mathbb Z}[n]$ are jointly ergodic for the system $(X,{\mathcal X}, \mu ,T_1, \ldots , T_{\ell })$ if and only if the following two conditions hold:

(i) all the transformations $T_1, \ldots , T_{\ell }$ are ergodic and the system has the very good ergodicity property for the polynomials;
(ii) for eigenvalues $\alpha _j\in \operatorname {\mathrm {Spec}}(T_j)$ , $j\in \{1, \ldots , \ell \}$ , we have
(7) $$ \begin{align} \lim_{N\to\infty} \frac{1}{N}\sum_{n=1}^N e(\alpha_1 p_1(n) +\cdots+ \alpha_{\ell} p_{\ell}(n)) = 0 \end{align} $$

unless $\alpha _1 = \cdots = \alpha _{\ell } = 0$ .
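As an illustration, here is how condition (ii) reads for the family $n^2, n^2, n^2+n$ from equation (5). This worked case is our own and relies only on Weyl's equidistribution theorem; we assume, as is usual, that $\operatorname {\mathrm {Spec}}(T_j)$ consists of those $\alpha \in [0,1)$ for which $e(\alpha )$ is an eigenvalue of $T_j$ . Collecting the coefficients of $n^2$ and $n$ gives

$$ \begin{align*} \alpha_1 n^2 + \alpha_2 n^2 + \alpha_3 (n^2+n) = (\alpha_1+\alpha_2+\alpha_3)\, n^2 + \alpha_3\, n, \end{align*} $$

so the limit in equation (7) is $0$ whenever $\alpha _3$ or $\alpha _1+\alpha _2+\alpha _3$ is irrational. Rational eigenvalues have to be checked separately: for instance, if $1/2$ were a common eigenvalue of $T_1$ and $T_2$ , then the choice $\alpha _1 = \alpha _2 = 1/2$ , $\alpha _3 = 0$ would give $e(n^2) = 1$ for every $n$ , and equation (7) would fail.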

Theorem 1.1 and Corollary 1.3 extend Theorems 2.8 and 2.14 in [Reference Frantzikinakis and Kuca11] that cover the case of pairwise independent polynomials.

Theorem 1.2 and Corollary 1.3 can be put in the context of the following conjecture by Donoso, Koutsogiannis, and Sun (the version presented below is a special case of [Reference Donoso, Koutsogiannis and Sun7, Conjecture 1.5]) that was motivated by previous results of Berend and Bergelson [Reference Berend and Bergelson1]. In the statement that follows, we say that a sequence of commuting transformations $(T_n)_{n\in {\mathbb N}}$ on a probability space $(X, {\mathcal X}, \mu )$ is ergodic for $\mu $ if

$$ \begin{align*} \lim_{N\to\infty}\bigg\Vert \frac{1}{N}\sum_{n=1}^N T_n f - \int f d\mu\bigg\Vert_{L^2(\mu)}=0 \end{align*} $$

for every $f\in L^{\infty }(\mu )$ .

Conjecture 1. The polynomials $p_1, \ldots , p_{\ell }\in {\mathbb Z}[n]$ are jointly ergodic for the system $(X,{\mathcal X}, \mu ,T_1, \ldots , T_{\ell })$ if and only if the following two conditions are satisfied:

(i) for all distinct $i, j\in \{1, \ldots , \ell \}$ , the sequence $(T_i^{p_i(n)} T_j^{-p_j(n)})_{n\in {\mathbb N}}$ is ergodic for $\mu $ ;
(ii) the sequence $(T_1^{p_1(n)}\times \cdots \times T_{\ell }^{p_{\ell }(n)})_{n\in {\mathbb N}}$ is ergodic for $\mu \times \cdots \times \mu $ .

Conjecture 1 thus lists conditions that have to be checked to verify the joint ergodicity of a family of polynomials for a system.

Corollary 1.4. Conjecture 1 holds.

In fact, our Theorem 1.2 is stronger than Corollary 1.4 in a number of ways. First, Theorem 1.2 gives a criterion for weak joint ergodicity, not just for joint ergodicity, meaning that the transformations $T_1, \ldots , T_{\ell }$ need not be ergodic for us to be able to say anything meaningful. Second, our good ergodicity property lists strictly fewer conditions to check to verify joint ergodicity than condition (i) in Conjecture 1. For instance, for the average from equation (5), condition (i) in Conjecture 1 requires us to check the ergodicity of the three sequences $((T_1 T_2^{-1})^{n^2})_{n\in {\mathbb N}}$ , $(T_1^{n^2} T_3^{-(n^2+n)})_{n\in {\mathbb N}}$ , and $(T_2^{n^2}T_3^{-(n^2+n)})_{n\in {\mathbb N}}$ . By contrast, the good ergodicity property for the average in equation (5) holds if and only if ${\mathcal I}(T_1 T_2^{-1}) = {\mathcal I}(T_1)\cap {\mathcal I}(T_2)$ , which is in any case a necessary condition for $((T_1 T_2^{-1})^{n^2})_{n\in {\mathbb N}}$ to be ergodic.

Finally, we remark that the original version of Conjecture 1 from [Reference Donoso, Koutsogiannis and Sun7] is stated for more general tuples

(8)

$$ \begin{align} (T_1^{p_{11}({\underline{n}})}\cdots T_{\ell}^{p_{1\ell}({\underline{n}})}, \ldots, T_1^{p_{\ell 1}({\underline{n}})}\cdots T_{\ell}^{p_{\ell\ell}({\underline{n}})}), \end{align} $$

a simple example of which would be the tuple

$$ \begin{align*} (T_1^{n^2}T_2^{n^2+n}, T_3^{n^2}T_4^{n^2+n}). \end{align*} $$

It is possible that an extension of our method would establish an analog of Theorem 1.1 for such averages. However, in addition to the fact that new problems arise, the technical complexity of some of our arguments in this paper is already formidable, and it would likely grow significantly if we wanted to tackle the more complicated averages from equation (8). We have therefore refrained from seeking an extension of Theorem 1.1 to averages of tuples as in equation (8), sticking instead to the simpler and arguably more natural averages from equation (1).

1.2 Extensions to other averaging schemes

Our arguments can be modified to cover multivariate polynomials and averages over arbitrary Følner sequences. (A sequence $(I_N)_{N\in {\mathbb N}}$ of finite subsets of ${\mathbb Z}^D$ is called Følner, if $\lim _{N\to \infty }{|(I_N+{\underline {h}})\triangle I_N|}/{|I_N|}=0$ for every ${\underline {h}}\in {\mathbb Z}^D$ .) While these modifications do not require any new ideas, they force us to introduce even more complicated notation and deal with straightforward but tedious technicalities. For this reason, we omit their proofs. We start with a generalization of Theorem 1.1.

Theorem 1.5. Let $D, K, \ell \in {\mathbb N}$ be integers, $(I_N)_{N\in {\mathbb N}}$ be a Følner sequence on ${\mathbb Z}^K$ , and $p_1, \ldots , p_{\ell }\in {\mathbb Z}[{\underline {n}}]$ be polynomials of degree at most D. Suppose that $p_1, \ldots , p_{\ell }$ have the good ergodicity property for a system $(X,{\mathcal X}, \mu ,T_1, \ldots , T_{\ell })$ . Then there exists $s\in {\mathbb N}$ , depending only on $D, K, \ell $ , such that for all $1$ -bounded functions $f_1, \ldots , f_{\ell }\in L^{\infty }(\mu )$ , we have

$$ \begin{align*} \lim_{N\to\infty}\bigg\Vert \frac{1}{|I_N|}\sum_{{\underline{n}}\in I_N} T_1^{p_1({\underline{n}})}f_1 \cdots T_{\ell}^{p_{\ell}({\underline{n}})}f_{\ell}\bigg\Vert_{L^2(\mu)} = 0 \end{align*} $$

whenever $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_j|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_j} = 0$ for some $j\in \{1, \ldots , \ell \}$ .

Theorem 1.5 and [Reference Frantzikinakis and Kuca11, Theorem 2.7] give the following generalization of Theorem 1.2.

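Theorem 1.6. Let $K, \ell \in {\mathbb N}$ be integers and $(I_N)_{N\in {\mathbb N}}$ be a Følner sequence on ${\mathbb Z}^K$ . The polynomials $p_1, \ldots , p_{\ell }\in {\mathbb Z}[{\underline {n}}]$ are weakly jointly ergodic for the system $(X,{\mathcal X}, \mu ,T_1, \ldots , T_{\ell })$ along $(I_N)_{N\in {\mathbb N}}$ , in the sense that

$$ \begin{align*} \lim_{N\to\infty}\bigg\Vert \frac{1}{|I_N|}\sum_{{\underline{n}}\in I_N} T_1^{p_1({\underline{n}})}f_1 \cdots T_{\ell}^{p_{\ell}({\underline{n}})}f_{\ell} - {\mathbb{E}}(f_1|{\mathcal I}(T_1)) \cdots {\mathbb{E}}(f_{\ell}|{\mathcal I}(T_{\ell}))\bigg\Vert_{L^2(\mu)} = 0 \end{align*} $$

for all $f_1, \ldots , f_{\ell }\in L^{\infty }(\mu )$ , if and only if the following two conditions hold: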

(i) the system has the good ergodicity property for the polynomials;
(ii) for all non-ergodic eigenfunctions $\chi _j\in {\mathcal E}(T_j)$ , $j\in \{1, \ldots , \ell \}$ , we have
$$ \begin{align*} \lim_{N\to\infty}\bigg\Vert \frac{1}{|I_N|}\sum_{{\underline{n}}\in I_N} T_1^{p_1({\underline{n}})}\chi_1 \cdots T_{\ell}^{p_{\ell}({\underline{n}})}\chi_{\ell} - {\mathbb{E}}(\chi_1|{\mathcal I}(T_1)) \cdots {\mathbb{E}}(\chi_{\ell}|{\mathcal I}(T_{\ell}))\bigg\Vert_{L^2(\mu)} = 0. \end{align*} $$

1.3 Outline of the article

We begin by recalling in §2 basic notions and results from ergodic theory, especially those related to the families of Gowers–Host–Kra and box seminorms, dual functions, as well as non-ergodic eigenfunctions. Next, we state in §3 preliminary technical lemmas that are used to prove our main results, most of which are variations of results from [Reference Donoso, Ferré-Moragues, Koutsogiannis and Sun6, Reference Frantzikinakis9, Reference Frantzikinakis and Kuca11]. Having stated all preliminary definitions and lemmas, we discuss at length in §4 two baby cases of Theorem 1.1 that illustrate some of our techniques and point out the necessity of the good ergodicity property. We then proceed in §5 to discuss the formalism and the general strategy for handling longer families. In §6, we give more details of various maneuvers outlined in §5. These moves take the form of several highly technical propositions that play a crucial part in the inductive proof of Theorem 1.1. By presenting relevant examples, we also show various obstructions that need to be overcome to prove Theorem 1.1 in full generality. Section 7 is entirely devoted to the proof of Theorem 1.1: it contains an intricate induction scheme used for the proof and proofs of various intermediate results that together amount to Theorem 1.1. Lastly, in §8, we derive Theorem 1.2 and Corollaries 1.3 and 1.4.

Some of the techniques used in this paper were inspired by our earlier work in [Reference Frantzikinakis and Kuca11] where we dealt with pairwise independent polynomials $p_1, \ldots , p_{\ell }$ . The lack of pairwise independence introduces serious additional complications. Consequently, we are forced to keep track of more information about the averages from equation (1) than in [Reference Frantzikinakis and Kuca11], particularly concerning the properties of the functions present therein and the coefficients of the polynomial iterates. The methods developed in this paper therefore differ in a number of places from the techniques employed in [Reference Frantzikinakis and Kuca11], and the argument from [Reference Frantzikinakis and Kuca11] is most emphatically not a special case of the argument presented in the current paper.

The need to have a better grip on the averages necessitates more extensive formalism than that in [Reference Frantzikinakis and Kuca11], making our argument rather hard to digest on a first reading. To compensate for this, we have included numerous examples that illustrate the main new obstacles and ideas in the proofs. The reader is invited to first go over these examples before delving into the details of the proofs.

2 Ergodic background and definitions

In this section, we present various notions from ergodic theory together with some basic results.

2.1 Basic notation

We start with explaining basic notation used throughout the paper.

The letters ${\mathbb C}, {\mathbb R}, {\mathbb Z}, {\mathbb N}$ , and ${\mathbb N}_0$ stand for the sets of complex numbers, real numbers, integers, positive integers, and non-negative integers, respectively. With ${\mathbb T}$ , we denote the one-dimensional torus, and we often identify it with ${\mathbb R}/{\mathbb Z}$ or with $[0,1)$ . We let $[N]:=\{1, \ldots , N\}$ for any $N\in {\mathbb N}$ . With ${\mathbb Z}[n]$ , we denote the collection of polynomials with integer coefficients.

For an element $t\in {\mathbb R}$ , we let $e(t):=e^{2\pi i t}$ .

If $a\colon {\mathbb N}^s\to {\mathbb C}$ is a bounded sequence for some $s\in {\mathbb N}$ and A is a non-empty finite subset of ${\mathbb N}^s$ , we let

$$ \begin{align*}\mathop{\mathbb{E}}\limits_{n\in A}\,a(n):=\frac{1}{|A|}\sum_{n\in A}\, a(n). \end{align*} $$

We commonly use the letter $\ell $ to denote the number of transformations in our system or the number of functions in an average while the letter s usually stands for the degree of ergodic seminorms. We normally write tuples of length $\ell $ in bold, e.g. ${\textbf {b}}\in {\mathbb Z}^{\ell }$ , and we underline tuples of length s (or $s+1$ , or $s-1$ ) that are typically used for averaging, e.g. ${\underline {h}}\in {\mathbb Z}^s$ . For a vector ${\textbf {b}}=(b_1,\ldots , b_{\ell })\in {\mathbb Z}^{\ell }$ and a system $(X, {\mathcal X}, \mu , T_1, \ldots , T_{\ell })$ , we let

$$ \begin{align*} T^{{\textbf{b}}}:=T_1^{b_1}\cdots T_{\ell}^{b_{\ell}}, \end{align*} $$

and we denote the $\sigma $ -algebra of $T^{\textbf {b}}$ invariant functions by ${\mathcal I}(T^{\textbf {b}})$ . For $j\in [\ell ]$ , we set ${\mathbf {e}}_j$ to be the unit vector in ${\mathbb Z}^{\ell }$ in the jth direction, and we let ${\mathbf {e}}_0 = \mathbf {0}$ , so that $T^{{\mathbf {e}}_j} = T_j$ for $j\in [\ell ]$ and $T^{{\mathbf {e}}_0}$ is the identity transformation.
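For example (an illustration of this notation that is not part of the original text), with $\ell = 3$ and ${\textbf {b}} = {\mathbf {e}}_1 - {\mathbf {e}}_2 = (1,-1,0)$ , we get

$$ \begin{align*} T^{{\mathbf{e}}_1 - {\mathbf{e}}_2} = T_1 T_2^{-1}, \end{align*} $$

so the good ergodicity property for the average in equation (5) can be written as ${\mathcal I}(T^{{\mathbf {e}}_1-{\mathbf {e}}_2}) = {\mathcal I}(T^{{\mathbf {e}}_1})\cap {\mathcal I}(T^{{\mathbf {e}}_2})$ .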

We often write ${\underline {\epsilon }}\in \{0,1\}^s$ for a vector of 0s and 1s of length s. For ${\underline {\epsilon }}\in \{0,1\}^s$ and ${\underline {h}}, {\underline {h}}'\in {\mathbb Z}^s$ , we set:

• ${\underline {\epsilon }}\cdot {\underline {h}}:=\epsilon _1 h_1+\cdots + \epsilon _s h_s$ ;
• $\mathopen {}| {\underline {h}}\mathclose {}| := |h_1|+\cdots +|h_s|$ ;
• ${\underline {h}}^{\underline {\epsilon }} := (h_1^{\epsilon _1}, \ldots , h_s^{\epsilon _s})$ , where $h_j^0:=h_j$ and $h_j^1:=h_j'$ for $j=1,\ldots , s$ .
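To fix ideas, here is a small worked instance of this notation with hypothetical values $s=2$ , ${\underline {h}}=(3,5)$ , ${\underline {h}}'=(7,11)$ , and ${\underline {\epsilon }}=(1,0)$ :

$$ \begin{align*} {\underline{\epsilon}}\cdot {\underline{h}} = 1\cdot 3 + 0\cdot 5 = 3, \qquad |{\underline{h}}| = 3 + 5 = 8, \qquad {\underline{h}}^{\underline{\epsilon}} = (h_1', h_2) = (7, 5). \end{align*} $$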

We let ${\mathcal C} z := \overline {z}$ be the complex conjugate of $z\in {\mathbb C}$ .

For a tuple $\eta \in {\mathbb N}_0^{\ell }$ and $I\subset [\ell ]$ , we define the restriction $\eta |_I := (\eta _i)_{i\in I}$ .

2.2 Ergodic seminorms

We review some basic facts about two families of ergodic seminorms: the Gowers–Host–Kra seminorms and the box seminorms.

2.2.1 Gowers–Host–Kra seminorms

Given a system $(X, {\mathcal X}, \mu ,T)$ , we will use the family of ergodic seminorms $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| \cdot |\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T}$ , also known as Gowers–Host–Kra seminorms, which were originally introduced in [Reference Host and Kra13] for ergodic systems. A detailed exposition of their basic properties can be found in [Reference Host and Kra14, Ch. 8]. These seminorms are inductively defined for $f\in L^{\infty }(\mu )$ as follows (for convenience, we also define $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| \cdot |\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _0$ , which is not a seminorm):

$$ \begin{align*} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{0,T}:=\int f\, d\mu, \end{align*} $$

and for $s\in {\mathbb N}_0$ , we let

$$ \begin{align*} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{s+1,T}^{2^{s+1}}:=\lim_{H\to\infty}\mathop{\mathbb{E}}\limits_{h\in [H]} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| \Delta_{T; h}f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{s,T}^{2^{s}}, \end{align*} $$

where

$$ \begin{align*} \Delta_{T;h}f:=f\cdot T^h\overline{f}, \quad h\in {\mathbb Z}, \end{align*} $$

is the multiplicative derivative of f with respect to T. The limit can be shown to exist by successive applications of the mean ergodic theorem, and for $f\in L^{\infty }(\mu )$ and $s\in {\mathbb N}_0$ , we have $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s,T}\leq \lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s+1,T}$ (see [Reference Host and Kra13] or [Reference Host and Kra14, Ch. 8]). It follows immediately from the definition that

$$ \begin{align*} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{1,T}=\Vert {\mathbb{E}}(f|{\mathcal I}(T))\Vert_{L^2(\mu)}, \end{align*} $$

where ${\mathcal I}(T):=\{f\in L^2(\mu )\colon Tf=f\}$ . We also have

(9)

$$ \begin{align} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{s,T}^{2^s}=\lim_{H_1\to\infty}\cdots \lim_{H_s\to\infty}\mathop{\mathbb{E}}\limits_{h_1\in [H_1]}\cdots \mathop{\mathbb{E}}\limits_{h_s\in [H_s]} \int \Delta_{s, T; {\underline{h}}}f\, d\mu, \end{align} $$

where for ${\underline {h}}=(h_1,\ldots , h_s)\in {\mathbb Z}^s$ , we let

$$ \begin{align*} \Delta_{s,T;{\underline{h}}}f:=\Delta_{T;h_1}\cdots \Delta_{T;h_s}f=\prod_{{\underline{\epsilon}}\in \{0,1\}^s}\mathcal{C}^{|{\underline{\epsilon}}|}T^{{\underline{\epsilon}}\cdot {\underline{h}}}f \end{align*} $$

be the multiplicative derivative of f of degree s with respect to T.
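As a sanity check on these definitions (a sketch that uses only the mean ergodic theorem), the case $s=1$ of equation (9) recovers the formula for $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{1,T}$ :

$$ \begin{align*} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{1,T}^{2} = \lim_{H\to\infty}\mathop{\mathbb{E}}\limits_{h\in [H]} \int f\cdot T^h\overline{f}\, d\mu = \int f\cdot \overline{{\mathbb{E}}(f|{\mathcal I}(T))}\, d\mu = \Vert {\mathbb{E}}(f|{\mathcal I}(T))\Vert_{L^2(\mu)}^2. \end{align*} $$

For $s=2$ and ${\underline {h}}=(h_1,h_2)$ , expanding the product over ${\underline {\epsilon }}\in \{0,1\}^2$ gives

$$ \begin{align*} \Delta_{2,T;{\underline{h}}}f = f\cdot T^{h_1}\overline{f}\cdot T^{h_2}\overline{f}\cdot T^{h_1+h_2}f. \end{align*} $$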

It can be shown that we can take any $s'\leq s$ of the iterative limits to be simultaneous limits (that is, average over $[H]^{s'}$ and let $H\to \infty $ ) without changing the value of the limit in equation (9). This was originally proved in [Reference Host and Kra13] using the main structural result of [Reference Host and Kra13]; a more ‘elementary’ proof can be deduced from [Reference Bergelson and Leibman4, Lemma 1.12] once the convergence of the uniform Cesàro averages is known (and yet another proof can be found in [Reference Host12, Lemma 1]). For $s':=s$ , this gives the identity

(10)

$$ \begin{align} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{s,T}^{2^s}=\lim_{H\to\infty}\mathop{\mathbb{E}}\limits_{{\underline{h}}\in [H]^s} \int \Delta_{s, T; {\underline{h}}}f\, d\mu. \end{align} $$

Moreover, for $1\leq s'\leq s$ , we have

(11)

$$ \begin{align} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{s,T}^{2^{s}}=\lim_{H\to\infty}\mathop{\mathbb{E}}\limits_{{\underline{h}}\in [H]^{s-s'}} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| \Delta_{s-s',T;{\underline{h}}}f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{s',T}^{2^{s'}}. \end{align} $$

It has been established in [Reference Host and Kra13] for ergodic systems and in [Reference Host and Kra14, Ch. 8, Theorem 14] for general systems that the seminorms are intimately connected with a certain family of factors of the system. Specifically, for every $s\in {\mathbb N}$ , there exists a factor ${\mathcal Z}_s(T)\subseteq {\mathcal X}$ , known as the Host–Kra factor of degree s, with the property that

$$ \begin{align*} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{s, T} = 0 \quad\text{if and only if } f \text{ is orthogonal to } {\mathcal Z}_{s-1}(T). \end{align*} $$

Equivalently, $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| \cdot |\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T}$ defines a norm on the space $L^2({\mathcal Z}_{s-1}(T))$ (for a proof, see [Reference Host and Kra14, Theorem 15, Ch. 9]).

2.2.2 Box seminorms

More generally, we use analogs of equation (10) defined with regard to several commuting transformations. These seminorms originally appeared in the work of Host [Reference Host12]; their finitary versions are often called box seminorms, and we sometimes employ this terminology. Let $(X, {\mathcal X}, \mu , T_1, \ldots , T_{\ell })$ be a system. For each $f\in L^{\infty }(\mu )$ , $h\in {\mathbb Z}$ , and ${\textbf {b}} \in {\mathbb Z}^{\ell }$ , we define

$$ \begin{align*} \Delta_{{\textbf{b}}; h} f := f \cdot T^{{\textbf{b}} h} \overline{f} \end{align*} $$

and for ${\underline {h}}\in {\mathbb Z}^s$ and ${\textbf {b}}_1,\ldots , {\textbf {b}}_s\in {\mathbb Z}^{\ell }$ , we let

$$ \begin{align*} \Delta_{{{\textbf{b}}_1}, \ldots, {{\textbf{b}}_{s}}; {\underline{h}}} f := \Delta_{{{\textbf{b}}_1; h_1}}\cdots\Delta_{{{\textbf{b}}_s; h_s}} f = \prod_{{\underline{\epsilon}}\in\{0,1\}^s} {\mathcal C}^{|{\underline{\epsilon}}|} T^{{\textbf{b}}_1 \epsilon_1 h_1 + \cdots + {\textbf{b}}_s \epsilon_s h_s}f. \end{align*} $$

We let

$$ \begin{align*} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{\emptyset}:=\int f\, d\mu \end{align*} $$

and

(12)

$$ \begin{align} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{{\textbf{b}}_1}, \ldots, {{\textbf{b}}_{s+1}}}^{2^{s+1}}:=\lim_{H\to\infty}\mathop{\mathbb{E}}\limits_{h\in[H]}\lvert\hspace{-1.2pt}|\hspace{-1.2pt}| \Delta_{{{\textbf{b}}_{s+1}}; h}f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{{\textbf{b}}_1}, \ldots, {{\textbf{b}}_{s}}}^{2^s}. \end{align} $$

In particular, if ${\textbf {b}}_1 = \cdots = {\textbf {b}}_s:={\textbf {b}}$ , then $ \Delta _{{\textbf {b}}_1, \ldots , {\textbf {b}}_s; {\underline {h}}}=\Delta _{s, T^{\textbf {b}}; {\underline {h}}}$ and $ \lvert \hspace {-1.2pt}|\hspace {-1.2pt}| \cdot |\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{\textbf {b}}_1, \ldots , {\textbf {b}}_s}=\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| \cdot |\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T^{\textbf {b}}}$ . We remark that these seminorms were defined in a slightly different way in [Reference Host12] and the above identities were established in [Reference Host12, §2.3].

Iterating equation (12), we get the identity

(13)

$$ \begin{align} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{{\textbf{b}}_1}, \ldots, {{\textbf{b}}_{s}}}^{2^{s}}=\lim_{H_1\to\infty}\cdots \lim_{H_s\to\infty}\mathop{\mathbb{E}}\limits_{h_1\in [H_1]}\cdots \mathop{\mathbb{E}}\limits_{h_s\in [H_s]} \int\Delta_{{{\textbf{b}}_1; h_1}}\cdots\Delta_{{{\textbf{b}}_s; h_s}} f\, d\mu, \end{align} $$

which extends equation (9). In a complete analogy with the remarks made for the Gowers–Host–Kra seminorms, we have the following: using [Reference Host12, Lemma 1] (which implies the convergence of the uniform Cesàro averages over ${\underline {h}}\in {\mathbb Z}^s$ of $\int \Delta _{{{\textbf {b}}_1}, \ldots , {{\textbf {b}}_{s}}; {\underline {h}}} f\, d\mu $ ) and [Reference Bergelson and Leibman4, Lemma 1.12], we get that we can take any $s'\leq s$ of the iterative limits to be simultaneous limits (that is, average over $[H]^{s'}$ and let $H\to \infty $ ) without changing the value of the limit in equation (13). Taking $s'=s$ gives the identity

(14)

$$ \begin{align} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{{\textbf{b}}_1}, \ldots, {{\textbf{b}}_{s}}}^{2^s}=\lim_{H\to\infty}\mathop{\mathbb{E}}\limits_{{\underline{h}}\in [H]^s}\,\int \Delta_{{{\textbf{b}}_1}, \ldots, {{\textbf{b}}_{s}}; {\underline{h}}} f\, d\mu. \end{align} $$

More generally, for any $1\leq s'\leq s$ and $f\in L^{\infty }(\mu )$ , we get the identity

(15)

$$ \begin{align} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{\textbf{b}}_1, \ldots, {\textbf{b}}_s}^{2^s} = \lim_{H\to\infty}\mathop{\mathbb{E}}\limits_{{\underline{h}}\in[H]^{s-s'}}\lvert\hspace{-1.2pt}|\hspace{-1.2pt}| \Delta_{{\textbf{b}}_{s'+1}, \ldots, {\textbf{b}}_s; {\underline{h}}}f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{\textbf{b}}_1, \ldots, {\textbf{b}}_{s'}}^{2^{s'}}, \end{align} $$

which generalizes equation (11).

As a consequence of the identity in equation (14), we observe that for any permutation $\sigma :[s]\to [s]$ , we have

$$ \begin{align*} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{\textbf{b}}_1, \ldots, {\textbf{b}}_s} = \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{\textbf{b}}_{\sigma(1)}, \ldots, {\textbf{b}}_{\sigma(s)}}, \end{align*} $$

that is, the order of taking the vectors ${\textbf {b}}_1, \ldots , {\textbf {b}}_s$ does not matter.

As an example of a box seminorm that is not a Gowers–Host–Kra seminorm, consider $s=2$ and the vectors ${\mathbf {e}}_1=(1,0)$ , ${\mathbf {e}}_2=(0,1)$ , in which case

$$ \begin{align*} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{{\mathbf{e}}_1},{{\mathbf{e}}_2}}^4 = \lim_{H\to\infty}\mathop{\mathbb{E}}\limits_{h_1, h_2\in [H]^2}\int f\cdot T_1^{h_1}\overline{f}\cdot T_2^{h_2}\overline{f}\cdot T_1^{h_1}T_2^{h_2}f\, d\mu. \end{align*} $$

More generally, for $s=2$ and ${\mathbf {a}}=(a_1,a_2)$ , ${\textbf {b}}=(b_1,b_2)$ , we have

$$ \begin{align*} &\lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{{\mathbf{a}}},{{\textbf{b}}}}^4\\&\quad=\lim_{H\to\infty}\mathop{\mathbb{E}}\limits_{h_1, h_2\in [H]^2}\int f\cdot T_1^{a_1h_1}T_2^{a_2h_1}\overline{f}\cdot T_1^{b_1h_2}T_2^{b_2h_2}\overline{f}\cdot T_1^{a_1h_1+b_1h_2}T_2^{a_2h_1+b_2h_2}f\, d\mu. \end{align*} $$
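The expansion above can be checked by hand. The following sketch assumes the convention $\Delta_{{\textbf{b}}; h}f = f\cdot T^{{\textbf{b}}h}\overline{f}$ for the multiplicative derivative (this is what the displayed formula for ${\mathbf{e}}_1, {\mathbf{e}}_2$ indicates) and uses only that $T_1$ and $T_2$ commute:

$$ \begin{align*} \Delta_{{\mathbf{a}}; h_1}\Delta_{{\textbf{b}}; h_2}f &= (f\cdot T^{{\textbf{b}}h_2}\overline{f})\cdot T^{{\mathbf{a}}h_1}\overline{(f\cdot T^{{\textbf{b}}h_2}\overline{f})}\\ &= f\cdot T^{{\mathbf{a}}h_1}\overline{f}\cdot T^{{\textbf{b}}h_2}\overline{f}\cdot T^{{\mathbf{a}}h_1+{\textbf{b}}h_2}f, \end{align*} $$

where $T^{{\mathbf{a}}h_1} = T_1^{a_1h_1}T_2^{a_2h_1}$ and $T^{{\textbf{b}}h_2} = T_1^{b_1h_2}T_2^{b_2h_2}$. Integrating and averaging over $[H]^2$ as in equation (14) gives the displayed formula. The resulting expression is symmetric under swapping $({\mathbf{a}}, h_1)$ with $({\textbf{b}}, h_2)$, which is the case $s=2$ of the permutation invariance noted above.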

If the vector ${\mathbf {a}}$ repeats s times, we abbreviate it as ${\mathbf {a}}^{\times s}$ , e.g.

$$ \begin{align*} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{\mathbf{a}}^{\times 2}, {\textbf{b}}^{\times 3}, {\mathbf{c}}}=\lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{\mathbf{a}}, {\mathbf{a}}, {\textbf{b}}, {\textbf{b}}, {\textbf{b}}, {\mathbf{c}}}. \end{align*} $$

Box seminorms satisfy the following Gowers–Cauchy–Schwarz inequality [Reference Host12, Proposition 2]:

(16)

$$ \begin{align} \limsup_{H\to\infty}\mathopen{}\bigg| \mathop{\mathbb{E}}\limits_{{\underline{h}}\in[H]^s}\int \prod_{{\underline{\epsilon}}\in\{0,1\}^s} {\mathcal C}^{|{\underline{\epsilon}}|} T^{{\textbf{b}}_1 \epsilon_1 h_1 + \cdots + {\textbf{b}}_s \epsilon_s h_s}f_{\underline{\epsilon}}\, d\mu\mathclose{}\bigg|\leq \prod_{{\underline{\epsilon}}\in\{0,1\}^s}\lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f_{\underline{\epsilon}}|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{\textbf{b}}_1, \ldots, {\textbf{b}}_s}. \end{align} $$

(One can replace the limsup with a limit since it is known to exist.)

We frequently bound one seminorm in terms of another. An inductive application of equation (12), or alternatively a simple application of the Gowers–Cauchy–Schwarz inequality in equation (16), yields the following monotonicity property:

(17)

$$ \begin{align} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{\textbf{b}}_1, \ldots, {\textbf{b}}_s}\leq \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{\textbf{b}}_1, \ldots,{\textbf{b}}_s, {\textbf{b}}_{s+1}}, \end{align} $$

a special case of which is the aforementioned bound $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s,T}\leq \lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s+1, T}$ for any ${f\in L^{\infty }(\mu )}$ and system $(X, {\mathcal X}, \mu , T)$ .
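For the reader's convenience, here is a sketch of how equation (17) can be obtained from equation (16); it assumes that $\mu $ is a probability measure, so that the constant function $1$ has box seminorm $1$ by equation (14). Apply equation (16) with $s+1$ vectors and with $f_{\underline{\epsilon}} := f$ if $\epsilon_{s+1}=0$ and $f_{\underline{\epsilon}} := 1$ if $\epsilon_{s+1}=1$. The integrand then does not depend on $h_{s+1}$, and equation (14) gives

$$ \begin{align*} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{\textbf{b}}_1, \ldots, {\textbf{b}}_s}^{2^s} &= \lim_{H\to\infty}\mathop{\mathbb{E}}\limits_{{\underline{h}}\in[H]^{s+1}}\int \prod_{{\underline{\epsilon}}\in\{0,1\}^{s+1}} {\mathcal C}^{|{\underline{\epsilon}}|} T^{{\textbf{b}}_1 \epsilon_1 h_1 + \cdots + {\textbf{b}}_{s+1} \epsilon_{s+1} h_{s+1}}f_{\underline{\epsilon}}\, d\mu\\ &\leq \prod_{{\underline{\epsilon}}:\, \epsilon_{s+1}=0}\lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{\textbf{b}}_1, \ldots, {\textbf{b}}_{s+1}}\cdot \prod_{{\underline{\epsilon}}:\, \epsilon_{s+1}=1}\lvert\hspace{-1.2pt}|\hspace{-1.2pt}| 1|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{\textbf{b}}_1, \ldots, {\textbf{b}}_{s+1}} = \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{\textbf{b}}_1, \ldots, {\textbf{b}}_{s+1}}^{2^s}. \end{align*} $$

Taking $2^s$ th roots yields equation (17).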

In many of our arguments, we have to deal simultaneously with a collection of transformations and their powers. The relevant box seminorms are compared in the following lemma.

Lemma 2.1. [Reference Frantzikinakis and Kuca11, Lemma 3.1]

Let $\ell ,s\in {\mathbb N}$ , $(X, {\mathcal X}, \mu , T_1, \ldots , T_{\ell })$ be a system, ${f\in L^{\infty }(\mu )}$ be a function, ${\textbf {b}}_1, \ldots , {\textbf {b}}_s\in {\mathbb Z}^{\ell }$ be vectors, and $r_1, \ldots , r_s\in {\mathbb Z}$ be non-zero. Then

$$ \begin{align*} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{\textbf{b}}_1, \ldots, {\textbf{b}}_s}\leq \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{r_1 {\textbf{b}}_1, \ldots, r_s {\textbf{b}}_s}, \end{align*} $$

and if $s\geq 2$ , we additionally get the bound

$$ \begin{align*} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{r_1 {\textbf{b}}_1, \ldots, r_s {\textbf{b}}_s} \leq (r_1\cdots r_s)^{1/2^{s}}\lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{\textbf{b}}_1, \ldots, {\textbf{b}}_s}. \end{align*} $$
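To illustrate why the second bound is stated only for $s\geq 2$, consider the following small example (an illustration added here, not taken from [Reference Frantzikinakis and Kuca11]). Let $X=\{0,1\}$ with the uniform probability measure, $Tx := x+1 \bmod 2$, $f(x) := (-1)^x$, and take $\ell = s = 1$, ${\textbf{b}}_1 = 1$, $r_1 = 2$. Using the formula $\lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{\textbf{b}}} = \Vert {\mathbb{E}}(f|{\mathcal I}(T^{{\textbf{b}}}))\Vert_{L^2(\mu)}$ for a single vector, the ergodicity of $T$, and the fact that $T^2$ is the identity, we get

$$ \begin{align*} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{1} = \bigg|\int f\, d\mu\bigg| = 0 \quad \textrm{and} \quad \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{2} = \Vert f\Vert_{L^2(\mu)} = 1. \end{align*} $$

Thus the first bound holds (as $0\leq 1$), but no bound of the form $\lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{2}\leq C \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{1}$ is possible when $s=1$.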

The following lemma allows us to compare box seminorms depending on the invariant $\sigma $ -algebras of the transformations involved.

Lemma 2.2. Let $\ell ,s\in {\mathbb N}$ , $(X, {\mathcal X}, \mu , T_1, \ldots , T_{\ell })$ be a system of commuting transformations, and ${\textbf {b}}_1, \ldots , {\textbf {b}}_s, {\mathbf {c}}_1, \ldots , {\mathbf {c}}_s\in {\mathbb Z}^{\ell }$ be vectors with the property that ${\mathcal I}(T^{{\textbf {b}}_i})\subseteq {\mathcal I}(T^{{\mathbf {c}}_i})$ for each $i\in [s]$ . Then $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{\textbf {b}}_1, \ldots , {\textbf {b}}_s}\leq \lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{\mathbf {c}}_1, \ldots , {\mathbf {c}}_s}$ for each $f\in L^{\infty }(\mu )$ .

Proof. We prove this by induction on s. For $s=1$ , we simply have

$$ \begin{align*} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{\textbf{b}}_1}=\Vert {\mathbb{E}}(f|{\mathcal I}(T^{{\textbf{b}}_1}))\Vert_{L^2(\mu)}\leq \Vert {\mathbb{E}}(f|{\mathcal I}(T^{{\mathbf{c}}_1}))\Vert_{L^2(\mu)} = \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{\mathbf{c}}_1}, \end{align*} $$

where we use the fact that $\Vert {\mathbb {E}}(f|{\mathcal A})\Vert _{L^2(\mu )}\leq \Vert {\mathbb {E}}(f|{\mathcal B})\Vert _{L^2(\mu )}$ whenever ${\mathcal A}\subseteq {\mathcal B}$ . For $s>1$ , we use the induction formula for seminorms and the result for $s=1$ to deduce that

$$ \begin{align*} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{\textbf{b}}_1, \ldots, {\textbf{b}}_s}^{2^s} &=\lim_{H\to\infty}\mathop{\mathbb{E}}\limits_{{\underline{h}}\in[H]^{s-1}}\lvert\hspace{-1.2pt}|\hspace{-1.2pt}| \Delta_{{\textbf{b}}_1, \ldots, {\textbf{b}}_{s-1};{\underline{h}}} f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{\textbf{b}}_s}^{2}\\ &\leq \lim_{H\to\infty}\mathop{\mathbb{E}}\limits_{{\underline{h}}\in[H]^{s-1}}\lvert\hspace{-1.2pt}|\hspace{-1.2pt}| \Delta_{{\textbf{b}}_1, \ldots, {\textbf{b}}_{s-1};{\underline{h}}} f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{\mathbf{c}}_s}^{2} = \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{\textbf{b}}_1, \ldots, {\textbf{b}}_{s-1}, {\mathbf{c}}_s}^{2^s}. \end{align*} $$

The claim follows by iterating this procedure $s-1$ more times.
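As a consistency check (an illustration, not part of the original argument), Lemma 2.2 recovers the first bound of Lemma 2.1 for systems of commuting transformations. Indeed, for a vector ${\textbf{b}}\in {\mathbb Z}^{\ell}$, a non-zero integer $r$, and $g\in L^{\infty}(\mu)$, we have

$$ \begin{align*} T^{{\textbf{b}}}g = g \quad \Longrightarrow \quad T^{r{\textbf{b}}}g = (T^{{\textbf{b}}})^r g = g, \quad \textrm{and hence} \quad {\mathcal I}(T^{{\textbf{b}}})\subseteq {\mathcal I}(T^{r{\textbf{b}}}). \end{align*} $$

Applying Lemma 2.2 with ${\mathbf{c}}_i := r_i {\textbf{b}}_i$ then gives $\lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{\textbf{b}}_1, \ldots, {\textbf{b}}_s}\leq \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{r_1 {\textbf{b}}_1, \ldots, r_s {\textbf{b}}_s}$.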

2.3 Dual functions and sequences

Let $s\in {\mathbb N}$ and $\{0,1\}^s_* = \{0,1\}^s\setminus \{\underline {0}\}$ . For a system $(X, {\mathcal X}, \mu , T)$ and $f\in L^{\infty }(\mu ),$ we define

$$ \begin{align*} {\mathcal D}_{s, T}(f) := \lim_{M\to\infty}\mathop{\mathbb{E}}\limits_{{\underline{m}}\in [M]^s}\prod_{{\underline{\epsilon}}\in\{0,1\}^s_*}{\mathcal C}^{|{\underline{\epsilon}}|}T^{{\underline{\epsilon}}\cdot{\underline{m}}}f \end{align*} $$

(the limit exists in $L^2(\mu )$ by [Reference Host and Kra13]). We call ${\mathcal D}_{s,T}(f)$ the dual function of f of level s with respect to T. The name comes from the identity

(18)

$$ \begin{align} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{s, T}^{2^s} = \int f \cdot {\mathcal D}_{s, T}(f)\, d\mu, \end{align} $$

a consequence of which is that the span of dual functions of degree s is dense in $L^1({\mathcal Z}_{s-1}(T))$ .
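To make the definition concrete, here is the case $s=2$ written out (a worked instance of the formulas above, with no new assumptions beyond the definition of ${\mathcal D}_{s,T}$ and equation (14)). The three non-zero elements of $\{0,1\}^2$ are $(1,0)$, $(0,1)$, and $(1,1)$, so

$$ \begin{align*} {\mathcal D}_{2, T}(f) &= \lim_{M\to\infty}\mathop{\mathbb{E}}\limits_{(m_1, m_2)\in [M]^2} T^{m_1}\overline{f}\cdot T^{m_2}\overline{f}\cdot T^{m_1+m_2}f,\\ \int f\cdot {\mathcal D}_{2, T}(f)\, d\mu &= \lim_{M\to\infty}\mathop{\mathbb{E}}\limits_{(m_1, m_2)\in [M]^2}\int f\cdot T^{m_1}\overline{f}\cdot T^{m_2}\overline{f}\cdot T^{m_1+m_2}f\, d\mu = \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{2, T}^{4}. \end{align*} $$

The exchange of the limit and the integral in the second line is justified by the $L^2(\mu)$ convergence of the averages defining ${\mathcal D}_{2,T}(f)$ and the boundedness of $f$.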

Let $(X, {\mathcal X}, \mu , T_1, \ldots , T_{\ell })$ be a system. Using the identities in equations (15) and (18), we get

$$ \begin{align*} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{{\textbf{b}}_1}, \ldots, {{\textbf{b}}_{s}}, {\textbf{b}}_{s+1}^{\times s'}}^{2^{s+s'}}=\lim_{H\to\infty}\mathop{\mathbb{E}}\limits_{{\underline{h}}\in [H]^s}\int \Delta_{{{\textbf{b}}_1}, \ldots, {{\textbf{b}}_{s}}; {\underline{h}}} f \cdot {\mathcal D}_{s', T^{{\textbf{b}}_{s+1}}}(\Delta_{{{\textbf{b}}_1}, \ldots, {{\textbf{b}}_{s}}; {\underline{h}}}f)\, d\mu, \end{align*} $$

a special case of which (with $s'=1$) is

$$ \begin{align*} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{{\textbf{b}}_1}, \ldots, {\textbf{b}}_{s+1}}^{2^{s+1}} = \lim_{H\to\infty}\mathop{\mathbb{E}}\limits_{{\underline{h}}\in [H]^s} \int \Delta_{{{\textbf{b}}_1}, \ldots, {{\textbf{b}}_{s}}; {\underline{h}}}f \cdot {\mathbb{E}}(\Delta_{{{\textbf{b}}_1}, \ldots, {{\textbf{b}}_{s}}; {\underline{h}}}\overline{f}|{\mathcal I}(T^{{\textbf{b}}_{s+1}}))\, d\mu. \end{align*} $$
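The passage to the special case can be sketched as follows. For $s'=1$, the only element of $\{0,1\}^1_*$ is $\epsilon = 1$, so for any system $(X, {\mathcal X}, \mu, T)$ and $g\in L^{\infty}(\mu)$, the mean ergodic theorem gives

$$ \begin{align*} {\mathcal D}_{1, T}(g) = \lim_{M\to\infty}\mathop{\mathbb{E}}\limits_{m\in [M]} T^{m}\overline{g} = {\mathbb{E}}(\overline{g}|{\mathcal I}(T)). \end{align*} $$

Taking $T := T^{{\textbf{b}}_{s+1}}$ and $g := \Delta_{{{\textbf{b}}_1}, \ldots, {{\textbf{b}}_{s}}; {\underline{h}}}f$ in the first identity then yields the second one.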

For $s\in {\mathbb N}$ , we denote

$$ \begin{align*} {\mathfrak D}_s := \{(T_j^n {\mathcal D}_{s', T_j}f)_{n\in{\mathbb Z}}: f\in L^{\infty}(\mu),\, j\in[\ell],\, 1\leq s'\leq s\} \end{align*} $$

to be the set of sequences of 1-bounded functions coming from dual functions of degree up to s for the transformations $T_1,\ldots , T_{\ell }$ , and moreover we define ${\mathfrak D} := \bigcup _{s\in {\mathbb N}}{\mathfrak D}_s$ .

The utility of dual functions comes from the following approximation result.

Proposition 2.3. (Dual decomposition [Reference Frantzikinakis8, Proposition 3.4])

Let $(X, {\mathcal X}, \mu , T)$ be a system, $f\in L^{\infty }(\mu )$ , $s\in {\mathbb N}$ , and $\varepsilon>0$ . Then we can decompose $f = f_1 + f_2 + f_3$ , where

(i) (Structured component) $f_1 = \sum _k c_k {\mathcal D}_{s, T}(g_k)$ is a linear combination of finitely many dual functions of level s with respect to T;
(ii) (Small component) $\Vert f_2\Vert _{L^1(\mu )}\leq \varepsilon $ ;
(iii) (Uniform component) $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_3|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T} = 0$ .

Proposition 2.3 will be used as follows. Suppose that the $L^2(\mu )$ limit of the average ${\mathbb {E}}_{n\in [N]}\, T_1^{p_1(n)}f_1 \cdots T_{\ell }^{p_{\ell }(n)}f_{\ell }$ vanishes whenever $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_{\ell }|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_{\ell }} = 0$ . If

$$ \begin{align*} \limsup_{N\to\infty}\Big\Vert \mathop{\mathbb{E}}\limits_{n\in[N]}\, T_1^{p_1(n)}f_1 \cdots T_{\ell}^{p_{\ell}(n)}f_{\ell}\Big\Vert_{L^2(\mu)}>0 \end{align*} $$

for some functions $f_1, \ldots , f_{\ell }\in L^{\infty }(\mu )$ , then we decompose $f_{\ell }$ as in Proposition 2.3 for sufficiently small $\varepsilon>0$ so that

$$ \begin{align*} \limsup_{N\to\infty}\bigg\Vert \mathop{\mathbb{E}}\limits_{n\in[N]}\prod_{j\in[\ell-1]}T_j^{p_j(n)}f_j \cdot \sum_k c_k T_{\ell}^{p_{\ell}(n)}{\mathcal D}_{s, T_{\ell}}(g_k)\bigg\Vert_{L^2(\mu)}>0 \end{align*} $$

for some (finite) linear combinations of dual functions. Applying the triangle inequality and the pigeonhole principle, we deduce that there exists k for which

$$ \begin{align*} \limsup_{N\to\infty}\bigg\Vert \mathop{\mathbb{E}}\limits_{n\in[N]}\prod_{j\in[\ell-1]}T_j^{p_j(n)}f_j \cdot {\mathcal D}(p_{\ell}(n))\bigg\Vert_{L^2(\mu)}>0, \end{align*} $$

where ${\mathcal D}(n)(x) := T_{\ell }^{n}{\mathcal D}_{s, T_{\ell }}g_k(x)$ for $n\in {\mathbb N}$ and $x\in X$ . This way, we essentially replace the term $T_{\ell }^{p_{\ell }(n)}f_{\ell }$ in the original average by the more structured piece ${\mathcal D}(p_{\ell }(n))$ .
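The pigeonhole step can be spelled out as follows; the notation $A_{N,k}$ is introduced here only for this sketch. Writing

$$ \begin{align*} A_{N,k} := \mathop{\mathbb{E}}\limits_{n\in[N]}\prod_{j\in[\ell-1]}T_j^{p_j(n)}f_j \cdot T_{\ell}^{p_{\ell}(n)}{\mathcal D}_{s, T_{\ell}}(g_k), \end{align*} $$

the triangle inequality and the finiteness of the sum over $k$ give

$$ \begin{align*} 0<\limsup_{N\to\infty}\bigg\Vert \sum_k c_k A_{N,k}\bigg\Vert_{L^2(\mu)}\leq \sum_k |c_k| \limsup_{N\to\infty}\Vert A_{N,k}\Vert_{L^2(\mu)}, \end{align*} $$

so at least one of the finitely many terms on the right-hand side must be positive.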

2.4 Eigenfunctions and criterion for weak joint ergodicity

Following [Reference Frantzikinakis and Host10], we define the notion of eigenfunctions that appears in the statement of Theorem 1.2.

Definition. Let $(X, {\mathcal X}, \mu ,T)$ be a system, $\chi \in L^{\infty }(\mu )$ , and $\lambda \in L^{\infty }(\mu )$ be a T-invariant function. We say that $\chi \in L^{\infty }(\mu )$ is a non-ergodic eigenfunction with eigenvalue $\lambda $ if

(i) $|\chi (x)|$ has value $0$ or $1$ for $\mu $ -almost every $x\in X$ and $\lambda (x)=0$ whenever $\chi (x)=0$ ;
(ii) $T\chi =\lambda \, \chi $ , $\mu $ -almost everywhere (a.e.).

We denote the set of non-ergodic eigenfunctions with respect to T by ${\mathcal E}(T)$ . For ergodic systems, a non-ergodic eigenfunction is either the zero function or a classical unit modulus eigenfunction. For general systems, each function $\chi \in {\mathcal E}(T)$ satisfies $\chi (Tx)\,{=}\,\mathbf {1}_E(x) \, e(\phi (x))\, \chi (x)$ for some T-invariant set $E\in {\mathcal X}$ and measurable T-invariant function $\phi \colon X\to {\mathbb T}$ .
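A standard example, included here purely as an illustration and assuming the usual notation $e(t) := e^{2\pi i t}$: let $X = {\mathbb T}^2$ with the Lebesgue measure and $T(x,y) := (x, y+x)$. Then the functions $\chi(x,y) := e(y)$ and $\lambda(x,y) := e(x)$ satisfy

$$ \begin{align*} (T\chi)(x,y) = \chi(x, y+x) = e(x)\, e(y) = \lambda(x,y)\, \chi(x,y), \end{align*} $$

and $\lambda$ is T-invariant because T fixes the first coordinate. Since $|\chi| = 1$ everywhere, $\chi$ is a non-ergodic eigenfunction with the non-constant eigenvalue $\lambda$, corresponding to $E = X$ and $\phi(x,y) = x$ in the representation above.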

Definition. (Weak joint ergodicity)

We say that a collection of sequences $a_1, \ldots , a_{\ell }: {\mathbb N}\to {\mathbb Z}$ is weakly jointly ergodic for the system $(X, {\mathcal X}, \mu , T_1, \ldots , T_{\ell })$ , if

$$ \begin{align*} \lim_{N\to\infty}\bigg\Vert \frac{1}{N}\sum_{n=1}^N T_1^{a_1(n)}f_1 \cdots T_{\ell}^{a_{\ell}(n)}f_{\ell} - {\mathbb{E}}(f_1|{\mathcal I}(T_1)) \cdots {\mathbb{E}}(f_{\ell}|{\mathcal I}(T_{\ell}))\bigg\Vert_{L^2(\mu)} = 0 \end{align*} $$

for all $f_1, \ldots , f_{\ell }\in L^{\infty }(\mu )$ .

The notion of non-ergodic eigenfunction is important for us because of the following criterion for weak joint ergodicity from [Reference Frantzikinakis and Kuca11].

Theorem 2.4. (Criterion for weak joint ergodicity [Reference Frantzikinakis and Kuca11, Theorem 2.5])

The collection of sequences $a_1, \ldots , a_{\ell }\colon {\mathbb N}\to {\mathbb Z}$ is weakly jointly ergodic for the system $(X, {\mathcal X}, \mu , T_1, \ldots , T_{\ell })$ , if and only if the following two properties hold:

(i) there exists $s\in {\mathbb N}$ such that for every $m\in [\ell ]$ , we have
$$ \begin{align*} \lim_{N\to\infty}\Big\Vert \mathop{\mathbb{E}}\limits_{n\in[N]}T_1^{a_1(n)}f_1\cdots T_{\ell}^{a_{\ell}(n)}f_{\ell}\Big\Vert_{L^2(\mu)} = 0 \end{align*} $$

for all $f_1, \ldots , f_{\ell }\in L^{\infty }(\mu )$ with $f_j\in {\mathcal E}(T_j)$ for $j\in \{m+1, \ldots , \ell \}$ , whenever $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_m|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_m} = 0$ ;
(ii) for all non-ergodic eigenfunctions $\chi _j\in {\mathcal E}(T_j)$ , $j\in [\ell ]$ , we have
(19) $$ \begin{align} \lim_{N\to\infty}\mathop{\mathbb{E}}\limits_{n\in[N]} \, T_1^{a_1(n)}\chi_1\cdots T_{\ell}^{a_{\ell}(n)}\chi_{\ell} ={\mathbb{E}}(\chi_1|{\mathcal I}(T_1))\cdots {\mathbb{E}}(\chi_{\ell}|{\mathcal I}(T_{\ell})) \end{align} $$

in $L^2(\mu )$ .

When $T_1, \ldots , T_{\ell }$ are ergodic, the condition in equation (19) can be restated as follows:

$$ \begin{align*} \lim_{N\to\infty}\mathop{\mathbb{E}}\limits_{n\in[N]} \, e(\alpha_1 a_1(n) + \cdots + \alpha_{\ell} a_{\ell}(n)) = 0 \end{align*} $$

for all $\alpha _j\in \operatorname {\mathrm {Spec}}(T_j)$ , $j\in [\ell ]$ , not all zero. Here,

$$ \begin{align*}\operatorname{\mathrm{Spec}}(T) := \{\alpha\in{\mathbb T}: T\chi = e(\alpha) \chi \textrm{ for some non-zero } \chi\in{\mathcal E}(T)\}.\end{align*} $$
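The restatement can be sketched as follows. If $T_j$ is ergodic and $\chi_j\in {\mathcal E}(T_j)$ is non-zero, then its eigenvalue is a constant $e(\alpha_j)$ with $\alpha_j\in \operatorname{\mathrm{Spec}}(T_j)$ and $|\chi_j| = 1$ a.e., so

$$ \begin{align*} \mathop{\mathbb{E}}\limits_{n\in[N]} \, T_1^{a_1(n)}\chi_1\cdots T_{\ell}^{a_{\ell}(n)}\chi_{\ell} = \Big(\mathop{\mathbb{E}}\limits_{n\in[N]} \, e(\alpha_1 a_1(n) + \cdots + \alpha_{\ell} a_{\ell}(n))\Big)\cdot \chi_1\cdots \chi_{\ell}. \end{align*} $$

On the other hand, ${\mathbb{E}}(\chi_j|{\mathcal I}(T_j)) = \int \chi_j\, d\mu$, and the identity $\int \chi_j\, d\mu = \int T_j\chi_j\, d\mu = e(\alpha_j)\int \chi_j\, d\mu$ shows that this integral vanishes whenever $\alpha_j\neq 0$. Hence, if not all $\alpha_j$ are zero, the right-hand side of equation (19) is $0$, and since $|\chi_1\cdots \chi_{\ell}| = 1$, equation (19) is equivalent to the vanishing of the exponential sums above. If all $\alpha_j$ are zero, then each $\chi_j$ is constant and equation (19) holds trivially.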

We will apply Theorem 2.4 in the proof of Theorem 1.2. The first condition will be satisfied thanks to the stronger result proved in Theorem 1.1.

3 Preliminary results

In this section, we gather auxiliary results needed in the proof of Theorem 1.1. We start with the following simple lemma from [Reference Frantzikinakis and Kuca11, Lemma 5.2], which allows us to pass from averages of sequences $a_{{\underline {h}}-{\underline {h}}'}$ to averages of sequences $a_{{\underline {h}}}$ .

Lemma 3.1. Let $(a_{\underline {h}})_{{\underline {h}}\in {\mathbb N}^s}$ be a sequence of non-negative real numbers. Then

$$ \begin{align*} \mathop{\mathbb{E}}\limits_{{\underline{h}}, {\underline{h}}'\in[H]^s}a_{{\underline{h}}-{\underline{h}}'} \leq \mathop{\mathbb{E}}\limits_{{\underline{h}}\in[H]^s}a_{{\underline{h}}} \end{align*} $$

for every $H\in {\mathbb N}$ .

Subsequently, we state a result that allows us to replace a function $f_m$ in the original average by a more structured averaged term $\tilde {f}_m$ that encodes the information about the original average. This idea originates in the finitary works on the polynomial Szemerédi theorem by Peluse and Prendiville [Reference Peluse15–Reference Peluse and Prendiville17], and it has been successfully applied in the ergodic theoretic setting in [Reference Best and Ferré Moragues5, Reference Frantzikinakis9, Reference Frantzikinakis and Kuca11]. The version below differs from earlier formulations because we additionally show that if the to-be-replaced function $f_m$ is measurable with respect to some sub- $\sigma $ -algebra ${\mathcal A}$ , then the same can be assumed about the function that replaces it. In our applications, ${\mathcal A}$ will always be either the full $\sigma $ -algebra ${\mathcal X}$ or the invariant sub- $\sigma $ -algebra of some measure-preserving transformation.

Lemma 3.2. (Introducing averaged functions)

Let $a_1,\ldots , a_{\ell }\colon {\mathbb N}\to {\mathbb Z}$ be sequences, $(X, {\mathcal X}, \mu , T_1, \ldots , T_{\ell })$ be a system, and 1-bounded functions $f_1,\ldots , f_{\ell }\in L^{\infty }(\mu )$ be such that

$$ \begin{align*} \limsup_{N\to\infty}\Big\Vert \mathop{\mathbb{E}}\limits_{n\in [N]} \, T_1^{a_1(n)}f_1\cdots T_{\ell}^{a_{\ell}(n)}f_{\ell}\Big\Vert_{L^2(\mu)} \geq \delta \end{align*} $$

for some $\delta>0$ . Let $m\in [\ell ]$ . Suppose moreover that $f_m$ is ${\mathcal A}$ -measurable for some sub- $\sigma $ -algebra ${\mathcal A}\subseteq {\mathcal X}$ . Then there exist $N_k\to \infty $ and $g_k\in L^{\infty }(\mu )$ , with $\Vert g_k\Vert _{L^{\infty }(\mu )}\leq 1$ , $k\in {\mathbb N}$ , such that for

$$ \begin{align*} \tilde{f}_m:=\lim_{k\to\infty} \mathop{\mathbb{E}}\limits_{n\in [N_k]} \, T_m^{-a_m(n)}g_k\cdot \prod_{j\in [\ell], j\neq m}T_m^{-a_m(n)} T_j^{a_j(n)}\overline{f}_j, \end{align*} $$

where the limit is a weak limit, we have

$$ \begin{align*} \limsup_{k\to\infty}\bigg\Vert \mathop{\mathbb{E}}\limits_{n\in[N_k]} \, T_m^{a_m(n)}{\mathbb{E}}(\tilde{f}_m|{\mathcal A})\cdot \prod_{j\in [\ell], j\neq m}T_j^{a_j(n)}f_j\bigg\Vert_{L^2(\mu)} \geq \delta^4. \end{align*} $$

Proof. Let $\{\tilde {N}_k\}_{k\in {\mathbb N}}$ be an increasing sequence of integers for which

(20)

$$ \begin{align} \Big\Vert \mathop{\mathbb{E}}\limits_{n\in [\tilde{N}_k]} \, T_1^{a_1(n)}f_1\cdots T_{\ell}^{a_{\ell}(n)}f_{\ell}\Big\Vert_{L^2(\mu)}\geq \delta. \end{align} $$

We set

$$ \begin{align*} \tilde{g}_N := \mathop{\mathbb{E}}\limits_{n\in [N]} \, T_1^{a_1(n)}{f_1}\cdots T_{\ell}^{a_{\ell}(n)}{f_{\ell}} \end{align*} $$

for every $N\in {\mathbb N}$ . The weak compactness of $L^2(\mu )$ implies that there exists a subsequence $(N_k)_{k\in {\mathbb N}}$ of $(\tilde {N}_k)_{k\in {\mathbb N}}$ for which the sequence

$$ \begin{align*} F_k:=\mathop{\mathbb{E}}\limits_{n\in [N_k]} \, T_m^{-a_m(n)}g_k\cdot \prod_{j\in [\ell], j\neq m}T_m^{-a_m(n)} T_j^{a_j(n)}\overline{f}_j, \end{align*} $$

where $g_k := \tilde {g}_{N_k}$ , $k\in {\mathbb N}$ , converges weakly to a 1-bounded function $\tilde {f}_m$ .

We observe from equation (20) that

$$ \begin{align*} \delta^2 &\leq \int {g_k} \cdot \mathop{\mathbb{E}}\limits_{n\in [{N}_k]} \, T_1^{a_1(n)}\overline{f}_1\cdots T_{\ell}^{a_{\ell}(n)}\overline{f}_{\ell}\, d\mu\\ &= \int \overline{f}_m \cdot \mathop{\mathbb{E}}\limits_{n\in [{N}_k]} \, T_m^{-a_m(n)} g_k \cdot \prod_{j\in [\ell], j\neq m}T_m^{-a_m(n)} T_j^{a_j(n)}\overline{f}_j\, d\mu. \end{align*} $$

Taking $k\to \infty $ , using the ${\mathcal A}$ -measurability of the 1-bounded function $f_m$ , and applying the Cauchy–Schwarz inequality, we get

$$ \begin{align*} \delta^2 \leq \int \overline{f}_m \cdot \tilde{f}_m\, d\mu = \int \overline{f}_m \cdot {\mathbb{E}}(\tilde{f}_m|{\mathcal A})\, d\mu \leq \Vert {\mathbb{E}}(\tilde{f}_m|{\mathcal A})\Vert_{L^2(\mu)}. \end{align*} $$

Hence,

$$ \begin{align*} \delta^4 &\leq \Vert {\mathbb{E}}(\tilde{f}_m|{\mathcal A})\Vert_{L^2(\mu)}^2 = \int {\mathbb{E}}(\tilde{f}_m|{\mathcal A}) \cdot \overline{\tilde{f}}_m\, d\mu\\ &= \lim_{k\to\infty}\int {\mathbb{E}}(\tilde{f}_m|{\mathcal A}) \cdot \mathop{\mathbb{E}}\limits_{n\in [N_k]} \, T_m^{-a_m(n)}\overline{g}_k\cdot \prod_{j\in [\ell], j\neq m}T_m^{-a_m(n)} T_j^{a_j(n)}{f}_j\, d\mu\\ &= \lim_{k\to\infty}\int \overline{g}_k \cdot \mathop{\mathbb{E}}\limits_{n\in [N_k]} T_m^{a_m(n)}{\mathbb{E}}(\tilde{f}_m|{\mathcal A}) \cdot \prod_{j\in [\ell], j\neq m}T_j^{a_j(n)}{f}_j\, d\mu. \end{align*} $$

An application of the Cauchy–Schwarz inequality gives the result.
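In more detail, and with the notation $G_k$ introduced only for this sketch, let

$$ \begin{align*} G_k := \mathop{\mathbb{E}}\limits_{n\in [N_k]} T_m^{a_m(n)}{\mathbb{E}}(\tilde{f}_m|{\mathcal A}) \cdot \prod_{j\in [\ell], j\neq m}T_j^{a_j(n)}{f}_j. \end{align*} $$

Since each $g_k$ is 1-bounded, the previous chain of identities and the Cauchy–Schwarz inequality give

$$ \begin{align*} \delta^4\leq \lim_{k\to\infty}\int \overline{g}_k\cdot G_k\, d\mu \leq \limsup_{k\to\infty}\Vert g_k\Vert_{L^2(\mu)}\Vert G_k\Vert_{L^2(\mu)}\leq \limsup_{k\to\infty}\Vert G_k\Vert_{L^2(\mu)}, \end{align*} $$

which is the claimed lower bound.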

We now present two different versions of the dual-difference interchange result that we use in our smoothing argument. While the second version in the proposition below has already been used in [Reference Frantzikinakis and Kuca11], the first one is novel since the extra information that it provides has not been required in earlier arguments.

Proposition 3.3. (Dual-difference interchange)

Let $(X, {\mathcal X}, \mu ,T_1, \ldots , T_{\ell })$ be a system, $s, s'\in {\mathbb N}$ , ${\textbf {b}}_1, \ldots , {\textbf {b}}_{s+1}, {\mathbf {c}}\in {\mathbb Z}^{\ell }$ be vectors, $(f_{n,k})_{n,k\in {\mathbb N}}\subseteq L^{\infty }(\mu )$ be 1-bounded, and $f\in L^{\infty }(\mu )$ be defined by

$$ \begin{align*} f:=\, \lim_{k\to\infty}\mathop{\mathbb{E}}\limits_{n\in [N_k]}\, f_{n,k}, \end{align*} $$

for some $N_k\to \infty $ , where the average is assumed to converge weakly.

(i) If
(21) $$ \begin{align} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| {\mathbb{E}}(f|{\mathcal I}(T^{\mathbf{c}}))|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{\textbf{b}}_1, \ldots, {\textbf{b}}_{s+1}}>0, \end{align} $$

then there exist 1-bounded functions $u_{{\underline {h}}, {\underline {h}}'}$ , invariant under both $T^{{\textbf {b}}_{s+1}}$ and $T^{{\mathbf {c}}}$ , for which the inequality
$$ \begin{align*} \liminf_{H\to\infty}\mathop{\mathbb{E}}\limits_{{\underline{h}}, {\underline{h}}'\in [H]^s}\limsup_{k\to\infty} \mathop{\mathbb{E}}\limits_{n\in[N_k]} \int\Delta_{{\textbf{b}}_1, \ldots, {\textbf{b}}_s; {\underline{h}}-{\underline{h}}'}f_{n,k} \cdot u_{{\underline{h}}, {\underline{h}}'} \, d\mu>0 \end{align*} $$

holds.
(ii) If
$$ \begin{align*} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{\textbf{b}}_1, \ldots, {\textbf{b}}_s, {\textbf{b}}_{s+1}^{\times s'}}>0, \end{align*} $$

then
$$ \begin{align*} \liminf_{H\to\infty}\mathop{\mathbb{E}}\limits_{{\underline{h}}, {\underline{h}}'\in [H]^s}\limsup_{k\to\infty} \mathop{\mathbb{E}}\limits_{n\in[N_k]} \int\Delta_{{\textbf{b}}_1, \ldots, {\textbf{b}}_s; {\underline{h}}-{\underline{h}}'}f_{n,k} \cdot \prod_{j=1}^{2^s}{\mathcal D}_{j,{\underline{h}},{\underline{h}}'} \, d\mu>0 \end{align*} $$

for some 1-bounded dual functions ${\mathcal D}_{j, {\underline {h}}, {\underline {h}}'}$ of $T^{{\textbf {b}}_{s+1}}$ of level $s'$ .

For the proof of Proposition 3.3, we need the following version of the Gowers–Cauchy–Schwarz inequality from [Reference Frantzikinakis and Kuca11].

Lemma 3.4. (Twisted Gowers–Cauchy–Schwarz inequality)

Let $(X, {\mathcal X}, \mu ,T_1, \ldots , T_{\ell })$ be a system, $s\in {\mathbb N}$ , ${\textbf {b}}_1, \ldots , {\textbf {b}}_s\in {\mathbb Z}^{\ell }$ be vectors, and the functions $(f_{\underline {\epsilon }})_{{\underline {\epsilon }}\in \{0,1\}^s}, (u_{\underline {h}})_{{\underline {h}}\in {\mathbb N}^s} \subseteq L^{\infty }(\mu )$ be 1-bounded. Then for every $H\in {\mathbb N}$ , we have

$$ \begin{align*} &\bigg|\mathop{\mathbb{E}}\limits_{{\underline{h}}\in [H]^s}\, \int \prod_{{\underline{\epsilon}}\in \{0,1\}^s} T^{{\textbf{b}}_1 \epsilon_1 h_1+\cdots+{\textbf{b}}_s \epsilon_s h_s} f_{\underline{\epsilon}}\cdot u_{{\underline{h}}}\, d\mu\bigg|^{2^s} \\ &\quad\leq \mathop{\mathbb{E}}\limits_{{\underline{h}},{\underline{h}}'\in [H]^s}\, \int \Delta_{{\textbf{b}}_1, \ldots, {\textbf{b}}_s; {\underline{h}}-{\underline{h}}'}f_{\underline{1}} \cdot T^{-({\textbf{b}}_1 h_1'+\cdots+{\textbf{b}}_s h^{\prime}_s)}\bigg(\prod_{{\underline{\epsilon}}\in \{0,1\}^s}\mathcal{C}^{|{\underline{\epsilon}}|}u_{{\underline{h}}^{\underline{\epsilon}}}\bigg)\, d\mu. \end{align*} $$

We also record two simple observations.

Lemma 3.5. Let $(X, {\mathcal X}, \mu ,T_1, \ldots , T_{\ell })$ be a system, ${\mathbf {a}}, {\textbf {b}}\in {\mathbb Z}^{\ell }$ be vectors, and ${f\in L^{\infty }(\mu )}$ be a function. Then the following two properties hold.

(i) If f is invariant under $T^{\mathbf {a}} T^{-{\textbf {b}}}$ , then $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T^{\mathbf {a}}} = \lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T^{\textbf {b}}}$ for any $s\in {\mathbb N}$ .
(ii) If f is invariant under $T^{\mathbf {a}}$ , then so is the function ${\mathbb {E}}(f|{\mathcal I}(T^{\textbf {b}}))$ .

Proof. For part (i), we notice that

$$ \begin{align*} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{s, T^{\mathbf{a}}}^{2^s} &= \lim_{H\to\infty}\mathop{\mathbb{E}}\limits_{{\underline{h}}\in[H]^s}\int \prod_{{\underline{\epsilon}}\in\{0,1\}^s} {\mathcal C}^{|{\underline{\epsilon}}|}T^{{\mathbf{a}}({\underline{\epsilon}}\cdot {\underline{h}})} f\, d\mu\\ &= \lim_{H\to\infty}\mathop{\mathbb{E}}\limits_{{\underline{h}}\in[H]^s}\int \prod_{{\underline{\epsilon}}\in\{0,1\}^s} {\mathcal C}^{|{\underline{\epsilon}}|}T^{{\textbf{b}}({\underline{\epsilon}}\cdot {\underline{h}})} f\, d\mu = \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{s, T^{\textbf{b}}}^{2^s}. \end{align*} $$

For part (ii), we use the fact that $T^{\mathbf {a}}, T^{{\textbf {b}}}$ commute and the $T^{\mathbf {a}}$ -invariance of f to observe that $T^{\mathbf {a}} T^{h{\textbf {b}}}f = T^{h{\textbf {b}}} T^{\mathbf {a}} f = T^{h{\textbf {b}}} f$ . From this and the mean ergodic theorem, it follows that

$$ \begin{align*} T^{{\mathbf{a}}}{\mathbb{E}}(f|{\mathcal I}(T^{\textbf{b}})) = \lim_{H\to\infty}\mathop{\mathbb{E}}\limits_{h\in[H]}T^{\mathbf{a}} T^{h{\textbf{b}}}f = \lim_{H\to\infty}\mathop{\mathbb{E}}\limits_{h\in[H]}T^{h{\textbf{b}}} f = {\mathbb{E}}(f|{\mathcal I}(T^{\textbf{b}})), \end{align*} $$

and so ${\mathbb {E}}(f|{\mathcal I}(T^{\textbf {b}}))$ is invariant under $T^{\mathbf {a}}$ .
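The middle equality in the proof of part (i) can be justified as follows. Since $T^{\mathbf{a}}$ and $T^{{\textbf{b}}}$ commute and f is invariant under $T^{\mathbf{a}} T^{-{\textbf{b}}}$, for every $n\in {\mathbb Z}$ we have

$$ \begin{align*} T^{{\mathbf{a}}n}f = T^{{\textbf{b}}n}(T^{\mathbf{a}} T^{-{\textbf{b}}})^n f = T^{{\textbf{b}}n}f. \end{align*} $$

Applying this with $n = {\underline{\epsilon}}\cdot {\underline{h}}$ for each ${\underline{\epsilon}}\in \{0,1\}^s$ shows that the two integrands agree factor by factor.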

Proof of Proposition 3.3

Part (ii) follows from [Reference Frantzikinakis and Kuca11, Proposition 5.7], and so we only prove (i). Letting $u_{\underline {h}} := \overline {{\mathbb {E}}(\Delta _{{{\textbf {b}}_1}, \ldots , {{\textbf {b}}_{s}}; {\underline {h}}}{\mathbb {E}}(f|{\mathcal I}(T^{\mathbf {c}}))|{\mathcal I}(T^{{\textbf {b}}_{s+1}}))}$ , we deduce from equation (21) that

$$ \begin{align*} \lim_{H\to\infty}\mathop{\mathbb{E}}\limits_{{\underline{h}}\in [H]^s} \int \Delta_{{{\textbf{b}}_1}, \ldots, {{\textbf{b}}_{s}}; {\underline{h}}}{\mathbb{E}}(f|{\mathcal I}(T^{\mathbf{c}})) \cdot u_{\underline{h}}\, d\mu> 0. \end{align*} $$

The $T^{\mathbf {c}}$ -invariance of ${\mathbb {E}}(f|{\mathcal I}(T^{\mathbf {c}}))$ implies the $T^{\mathbf {c}}$ -invariance of $\Delta _{{{\textbf {b}}_1}, \ldots , {{\textbf {b}}_{s}}; {\underline {h}}}{\mathbb {E}}(f|{\mathcal I}(T^{\mathbf {c}}))$ , so the functions $u_{\underline {h}}$ are invariant under $T^{\mathbf {c}}$ by Lemma 3.5 (their $T^{{\textbf {b}}_{s+1}}$ -invariance is trivial). Using the $T^{\mathbf {c}}$ -invariance of $u_{\underline {h}}$ and the properties of conditional expectations, we deduce that

$$ \begin{align*}&\lim_{H\to\infty}\mathop{\mathbb{E}}\limits_{{\underline{h}}\in [H]^s} \int \Delta_{{{\textbf{b}}_1}, \ldots, {{\textbf{b}}_{s}}; {\underline{h}}}{\mathbb{E}}(f|{\mathcal I}(T^{\mathbf{c}})) \cdot u_{\underline{h}}\, d\mu\nonumber\\ &\quad= \lim_{H\to\infty}\mathop{\mathbb{E}}\limits_{{\underline{h}}\in [H]^s} \int \prod\limits_{{{\underline{\epsilon}}}\in \{0,1\}^s\setminus\{\underline{1}\}}\mathcal{C}^{|{{\underline{\epsilon}}}|} T^{{\textbf{b}}_1 \epsilon_1 h_1+\cdots+ {\textbf{b}}_s \epsilon_s h_s}{\mathbb{E}}(f|{\mathcal I}(T^{\mathbf{c}}))\nonumber\\ &\qquad T^{{\textbf{b}}_1 h_1+\cdots+{\textbf{b}}_s h_s}f \cdot {\mathbb{E}}(u_{\underline{h}}|{\mathcal I}(T^{\mathbf{c}}))\, d\mu. \end{align*} $$

For ${\underline {\epsilon }}\in \{0,1\}^s\setminus \{\underline {1}\}$ , we let $f_{\underline {\epsilon }} := {\mathcal C}^{|{\underline {\epsilon }}|}{\mathbb {E}}(f|{\mathcal I}(T^{\mathbf {c}}))$ . We deduce from the previous identity and the fact $f=\lim _{k\to \infty }{\mathbb {E}}_{n\in [N_k]}\, f_{n,k}$ , where convergence is in the weak sense, that

$$ \begin{align*} &\lim_{H\to\infty} \lim_{k\to\infty}\mathop{\mathbb{E}}\limits_{n\in [N_k]}\, \mathop{\mathbb{E}}\limits_{{\underline{h}}\in [H]^s} \int \prod_{{\underline{\epsilon}}\in \{0,1\}^s\setminus \{\underline{1}\}} T^{{\textbf{b}}_1 \epsilon_1 h_1+\cdots+ {\textbf{b}}_s \epsilon_s h_s} f_{\underline{\epsilon}}\\ &\ \quad\times T^{{\textbf{b}}_1 h_1+\cdots+{\textbf{b}}_s h_s} f_{n,k}\cdot u_{\underline{h}}\, d\mu> 0. \end{align*} $$

For fixed $k,n, H\in {\mathbb N}$ , we apply Lemma 3.4 with $f_{\underline {1}}:= f_{n,k}$ , obtaining

$$ \begin{align*} \liminf_{H\to\infty} \limsup_{k\to\infty}\mathop{\mathbb{E}}\limits_{n\in [N_k]}\, \mathop{\mathbb{E}}\limits_{{\underline{h}},{\underline{h}}'\in [H]^s}\, \int\Delta_{{\textbf{b}}_1, \ldots, {\textbf{b}}_s; {\underline{h}}-{\underline{h}}'}f_{n,k} \cdot u_{{\underline{h}}, {\underline{h}}'} \, d\mu> 0, \end{align*} $$

where

$$ \begin{align*} u_{{\underline{h}}, {\underline{h}}'} := T^{-({\textbf{b}}_1 h_1'+\cdots+{\textbf{b}}_s h^{\prime}_s)}\bigg(\prod_{{\underline{\epsilon}}\in \{0,1\}^s}\mathcal{C}^{|{\underline{\epsilon}}|}u_{{\underline{h}}^{\underline{\epsilon}}}\bigg). \end{align*} $$

The functions $u_{{\underline {h}}, {\underline {h}}'}$ are both $T^{{\textbf {b}}_{s+1}}$ and $T^{{\mathbf {c}}}$ invariant given that each $u_{\underline {h}}$ is and the transformations $T_1,\ldots , T_{\ell }$ commute. The result follows from the fact that the limsup of a sum is at most the sum of the limsups.
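The last step can be sketched as follows, with the shorthand $I_k({\underline{h}}, {\underline{h}}')$ introduced only here and the quantities understood as real numbers, as in the statement of the proposition. Setting

$$ \begin{align*} I_k({\underline{h}}, {\underline{h}}') := \mathop{\mathbb{E}}\limits_{n\in[N_k]} \int\Delta_{{\textbf{b}}_1, \ldots, {\textbf{b}}_s; {\underline{h}}-{\underline{h}}'}f_{n,k} \cdot u_{{\underline{h}}, {\underline{h}}'} \, d\mu, \end{align*} $$

for each fixed $H\in {\mathbb N}$, the average over ${\underline{h}}, {\underline{h}}'\in [H]^s$ is a finite sum with $H^{2s}$ terms, and therefore

$$ \begin{align*} \limsup_{k\to\infty}\mathop{\mathbb{E}}\limits_{{\underline{h}},{\underline{h}}'\in [H]^s} I_k({\underline{h}}, {\underline{h}}')\leq \mathop{\mathbb{E}}\limits_{{\underline{h}},{\underline{h}}'\in [H]^s}\limsup_{k\to\infty} I_k({\underline{h}}, {\underline{h}}'). \end{align*} $$

Taking the liminf as $H\to\infty$ on both sides gives the inequality claimed in part (i).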

The proposition below enables a transition between qualitative and soft quantitative results. Its proof uses rather abstract functional analytic arguments and the mean convergence result of Walsh [Reference Walsh19]. If we instead use the mean convergence result of Zorin-Kranich [Reference Zorin-Kranich20], we can also get a variant that deals with averages over an arbitrary Følner sequence on ${\mathbb Z}^D$ .

Proposition 3.6. (Soft quantitative control [Reference Frantzikinakis and Kuca11, Proposition A.1])

Let $m, \ell , s\in {\mathbb N}$ with $m\in [\ell ]$ , $p_1, \ldots , p_{\ell }\in {\mathbb Z}[n]$ be polynomials, and $(X, {\mathcal X}, \mu , T_1,\ldots , T_{\ell })$ be a system. Let ${\mathcal Y}_1, \ldots , {\mathcal Y}_{\ell }\subseteq {\mathcal X}$ be sub- $\sigma $ -algebras. Suppose that for all $f_j\in L^{\infty }({\mathcal Y}_j, \mu )$ , $j\in [\ell ]$ , the seminorm $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_m|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s}$ controls the average

(22)

$$ \begin{align} \mathop{\mathbb{E}}\limits_{n\in[N]} \, T_1^{p_1(n)}f_1\cdots T_{\ell}^{p_{\ell}(n)}f_{\ell} \end{align} $$

in that equation (22) converges to 0 in $L^2(\mu )$ whenever $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_m|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s} = 0$ . Then for every $\varepsilon>0$ , there exists $\delta>0$ such that if $f_j\in L^{\infty }({\mathcal Y}_j, \mu )$ , $j\in [\ell ]$ are $1$ -bounded and $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_m|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s}\leq \delta $ , then

$$ \begin{align*} \lim_{N\to\infty} \Big\Vert \mathop{\mathbb{E}}\limits_{n\in[N]} \, T_1^{p_1(n)}f_1\cdots T_{\ell}^{p_{\ell}(n)}f_{\ell}\Big\Vert_{L^2(\mu)}\leq \varepsilon. \end{align*} $$

Finally, we need the following polynomial ergodic theorem (PET) result that gives box seminorm control for averages with extra terms involving dual functions. It extends [Reference Donoso, Ferré-Moragues, Koutsogiannis and Sun6, Theorem 2.5], which did not involve dual functions. We remark that results of this kind are proved by combining a complicated variant of Bergelson’s original PET technique [Reference Bergelson2] with concatenation results of Tao and Ziegler [Reference Tao and Ziegler18].

Proposition 3.7. (Box seminorm control [Reference Frantzikinakis and Kuca11, Proposition B.1])

Let $d, \ell , D, L\in {\mathbb N}$ , $\eta \in [\ell ]^{\ell },$ and $p_1, \ldots , p_{\ell }, q_1, \ldots , q_L\in {\mathbb Z}[n]$ be polynomials of degrees at most D. Let $p_0 = 0$ and $p_j(n) = \sum _{i=0}^D a_{ji} n^i$ for $j\in [\ell ]$ . Suppose that $\deg p_{\ell } = D$ and ${d_{\ell j} := \deg (p_{\ell } {\mathbf {e}}_{\eta _{\ell }} - p_j {\mathbf {e}}_{\eta _j})>0}$ for every $j=0, \ldots , \ell -1$ . Then there exist $s\in {\mathbb N}$ , depending only on $d, D, \ell , L, \eta $ , and non-zero vectors

(23)

$$ \begin{align} {\textbf{b}}_1, \ldots, {\textbf{b}}_s \in \{a_{\ell d_{\ell j}}{\mathbf{e}}_{\eta_{\ell}}- a_{j d_{\ell j}}{\mathbf{e}}_{\eta_j}: j = 0, \ldots, \ell-1\}, \end{align} $$

with the following property: for every system $(X, {\mathcal X}, \mu , T_1, \ldots , T_{\ell })$ , functions $f_1, \ldots , f_{\ell }\in L^{\infty }(\mu )$ , and sequences of functions ${\mathcal D}_1, \ldots , {\mathcal D}_L\in {\mathfrak D}_d$ , we have

$$ \begin{align*} \lim_{N\to\infty}\bigg\Vert \mathop{\mathbb{E}}\limits_{n\in[N]}\prod_{j\in[\ell]}T_{\eta_j}^{p_j(n)}f_j\cdot\prod_{j\in[L]}{\mathcal D}_{j}(q_j(n))\bigg\Vert_{L^2(\mu)}=0 \end{align*} $$

whenever $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_{\ell }|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{{\textbf {b}}_1}, \ldots , {{\textbf {b}}_s}}=0$ .

Due to the monotonicity property of box seminorms in equation (17), we may (and will) always assume that $s\geq 2$ in Proposition 3.7 (this is necessary to apply Lemma 2.1 in some of our arguments in §7).

4 Two motivating examples

In this section, we prove Theorem 1.1 for the family $n^2, n^2, n^2+n$ and sketch the changes needed to handle the family $n^2, n^2, 2n^2+n$ . These two cases illustrate some (but not all) key ideas needed in the proof of Theorem 1.1 in a simple setting. Additional complications arise for more general families and the ideas needed to overcome them will be illustrated with examples given in subsequent sections.

Example 1. (Seminorm control for a monic family of length 3)

Our goal is to prove the following result.

Proposition 4.1. (Seminorm control for $n^2,n^2,n^2+n$ )

There exists $s\in {\mathbb N}$ such that for every system $(X, {\mathcal X}, \mu , T_1, T_2, T_3)$ satisfying ${\mathcal I}(T_1 T_2^{-1})={\mathcal I}(T_1)\cap {\mathcal I}(T_2)$ and all functions $f_1, f_2, f_3\in L^{\infty }(\mu )$ , the average

(24)

$$ \begin{align} \mathop{\mathbb{E}}\limits_{n\in[N]}T_1^{n^2} f_1 \cdot T_2^{n^2}f_2 \cdot T_3^{n^2+n}f_3 \end{align} $$

converges to 0 in $L^2(\mu )$ whenever $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_3|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_3} = 0$ .

We subsequently show in Corollary 4.5 that the same conclusion holds if we assume $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_i|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_i} = 0$ for $i=1,2,$ instead.

By Proposition 3.7, there exist vectors ${\textbf {b}}_1, \ldots , {\textbf {b}}_{s+1}\in \{{\mathbf {e}}_3, {\mathbf {e}}_3 - {\mathbf {e}}_1, {\mathbf {e}}_3 - {\mathbf {e}}_2\}$ such that

(25)

$$ \begin{align} \lim_{N\to\infty} \mathop{\mathbb{E}}\limits_{n\in[N]}T_1^{n^2} f_1 \cdot T_2^{n^2}f_2 \cdot T_3^{n^2+n}f_3 = 0 \end{align} $$

whenever $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_3|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{\textbf {b}}_1, \ldots , {\textbf {b}}_{s+1}} = 0$ . The goal is to inductively replace all the vectors ${\textbf {b}}_1, \ldots , {\textbf {b}}_{s+1}$ in the seminorm different from ${\mathbf {e}}_3$ by ${\mathbf {e}}_3^{\times s'}$ for some $s'\in {\mathbb N}$ , which is achieved in the following proposition.
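To make the candidate set of vectors concrete, here is a short worked computation. It is only a sketch, reading equation (23) with $\ell = 3$ , $D = 2$ , the identity indexing tuple $\eta = (1,2,3)$ , and $p_0 = 0$ , $p_1 = p_2 = n^2$ , $p_3 = n^2+n$ . Each of the three differences has degree $d_{3j} = 2$ , and its coefficient of $n^2$ is the corresponding vector in equation (23):

$$ \begin{align*} p_3\,{\mathbf{e}}_3 - 0 &= n^2\,{\mathbf{e}}_3 + n\,{\mathbf{e}}_3, \\ p_3\,{\mathbf{e}}_3 - p_1\,{\mathbf{e}}_1 &= n^2\,({\mathbf{e}}_3 - {\mathbf{e}}_1) + n\,{\mathbf{e}}_3, \\ p_3\,{\mathbf{e}}_3 - p_2\,{\mathbf{e}}_2 &= n^2\,({\mathbf{e}}_3 - {\mathbf{e}}_2) + n\,{\mathbf{e}}_3. \end{align*} $$

This recovers the set $\{{\mathbf {e}}_3, {\mathbf {e}}_3 - {\mathbf {e}}_1, {\mathbf {e}}_3 - {\mathbf {e}}_2\}$ ; which of these vectors actually occur among ${\textbf {b}}_1, \ldots , {\textbf {b}}_{s+1}$ , and how often, is not specified by Proposition 3.7.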

Proposition 4.2. (Box seminorm smoothing)

Let ${\textbf {b}}_1, \ldots , {\textbf {b}}_{s+1}\in \{{\mathbf {e}}_3, {\mathbf {e}}_3 - {\mathbf {e}}_1, {\mathbf {e}}_3 - {\mathbf {e}}_2\}$ and $(X, {\mathcal X}, \mu , T_1, T_2, T_3)$ be a system satisfying ${\mathcal I}(T_1 T_2^{-1})={\mathcal I}(T_1)\cap {\mathcal I}(T_2)$ . Suppose that equation (25) holds for all functions $f_1, f_2, f_3\in L^{\infty }(\mu )$ whenever $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_3|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{\textbf {b}}_1, \ldots , {\textbf {b}}_{s+1}} = 0$ . Then there exists $s'\in {\mathbb N}$ , depending only on s, such that equation (25) holds whenever $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_3|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{\textbf {b}}_1, \ldots , {\textbf {b}}_s, {\mathbf {e}}_3^{\times s'}}=0$ .

Proposition 4.1 follows from Proposition 3.7 and an iterative application of Proposition 4.2.

Passing from a control by $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_3|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{\textbf {b}}_1, \ldots , {\textbf {b}}_{s+1}}$ in Proposition 4.2 to a control by $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_3|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{\textbf {b}}_1, \ldots , {\textbf {b}}_s, {\mathbf {e}}_3^{\times s'}}$ follows a two-step ping-pong strategy similar to the one used in [Reference Frantzikinakis and Kuca11]. Using the control by $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_3|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{\textbf {b}}_1, \ldots , {\textbf {b}}_{s+1}}$ , we first pass to an auxiliary control by some seminorm $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_i|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{\textbf {b}}_1, \ldots , {\textbf {b}}_s, {\mathbf {e}}_i^{\times s_1}}$ for some $i\in \{1,2\}$ and $s_1\in {\mathbb N}$ , and then we use this auxiliary control to go back and control the average from equation (24) by $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_3|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{\textbf {b}}_1, \ldots , {\textbf {b}}_s, {\mathbf {e}}_3^{\times s'}}$ for some $s'\in {\mathbb N}$ . We call the two steps outlined above ping and pong.

The assumption ${\mathcal I}(T_1 T_2^{-1})={\mathcal I}(T_1)\cap {\mathcal I}(T_2)$ is crucial for the following special case that will be invoked in the proof of the general case of Proposition 4.1.

Proposition 4.3. There exists $s\in {\mathbb N}$ such that for every system $(X, {\mathcal X}, \mu , T_1, T_2)$ satisfying ${\mathcal I}(T_1 T_2^{-1})={\mathcal I}(T_1)\cap {\mathcal I}(T_2)$ and all functions $f_1, f_2, f_3\in L^{\infty }(\mu )$ , the average

(26)

$$ \begin{align} \mathop{\mathbb{E}}\limits_{n\in[N]}T_1^{n^2} f_1 \cdot T_2^{n^2}f_2 \cdot T_2^{n^2+n}f_3 \end{align} $$

converges to 0 in $L^2(\mu )$ whenever one of $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_1|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_1}$ , $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_2|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_2}$ , $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_3|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_2}$ is 0.

Proof. We prove that $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_3|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_2} = 0$ implies

(27)

$$ \begin{align} \lim_{N\to\infty}\mathop{\mathbb{E}}\limits_{n\in[N]}T_1^{n^2} f_1 \cdot T_2^{n^2}f_2 \cdot T_2^{n^2+n}f_3 = 0 \end{align} $$

for some $s\in {\mathbb N}$ ; the other cases follow similarly. By Proposition 3.7, the equality in equation (27) holds under the assumption that $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_3|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{\mathbf {e}}_2^{\times s_1}, ({\mathbf {e}}_2 - {\mathbf {e}}_1)^{\times s_2}} = 0$ for some $s_1, s_2\in {\mathbb N}_0$ (which are absolute in that they do not depend on the system or the functions). The assumption ${\mathcal I}(T_1 T_2^{-1})={\mathcal I}(T_1)\cap {\mathcal I}(T_2)$ implies that ${\mathcal I}(T_1 T_2^{-1})\subseteq {\mathcal I}(T_2)$ . Together with Lemma 2.2, this gives

$$ \begin{align*} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f_3|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{\mathbf{e}}_2^{\times s_1}, ({\mathbf{e}}_2 - {\mathbf{e}}_1)^{\times s_2}}\leq \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f_3|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{\mathbf{e}}_2^{\times (s_1+s_2)}} = \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f_3|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{s_1+s_2, T_2}, \end{align*} $$

and so equation (27) holds whenever $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_3|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_2} = 0$ for $s=s_1+s_2$ .
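As a sanity check on which vectors can occur in this special case, the following sketch applies equation (23) to the tuple $(T_1^{n^2}, T_2^{n^2}, T_2^{n^2+n})$ , that is, with $p_0 = 0$ , $p_1 = p_2 = n^2$ , $p_3 = n^2+n$ and indexing tuple $\eta = (1,2,2)$ :

$$ \begin{align*} p_3\,{\mathbf{e}}_2 - 0 &= n^2\,{\mathbf{e}}_2 + n\,{\mathbf{e}}_2 && (d_{30} = 2,\ \text{coefficient } {\mathbf{e}}_2), \\ p_3\,{\mathbf{e}}_2 - p_1\,{\mathbf{e}}_1 &= n^2\,({\mathbf{e}}_2 - {\mathbf{e}}_1) + n\,{\mathbf{e}}_2 && (d_{31} = 2,\ \text{coefficient } {\mathbf{e}}_2 - {\mathbf{e}}_1), \\ p_3\,{\mathbf{e}}_2 - p_2\,{\mathbf{e}}_2 &= n\,{\mathbf{e}}_2 && (d_{32} = 1,\ \text{coefficient } {\mathbf{e}}_2). \end{align*} $$

In the last line, the quadratic terms cancel, so the relevant coefficient is that of $n$ . Only ${\mathbf {e}}_2$ and ${\mathbf {e}}_2 - {\mathbf {e}}_1$ arise, which is consistent with the seminorm $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_3|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{\mathbf {e}}_2^{\times s_1}, ({\mathbf {e}}_2 - {\mathbf {e}}_1)^{\times s_2}}$ used in the proof.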

Proposition 4.3 is invoked in the ping step of the proof of Proposition 4.2; in the pong step, we invoke the result below.

Proposition 4.4. Let $d, L\in {\mathbb N}$ . Then there exists $s\in {\mathbb N}$ depending only on d and L such that for all systems $(X, {\mathcal X}, \mu , T_1, T_2, T_3)$ , functions $f_1, f_3\in L^{\infty }(\mu )$ , and sequences ${\mathcal D}_1, \ldots , {\mathcal D}_L\in {\mathfrak D}_d$ , the average

(28)

$$ \begin{align} \mathop{\mathbb{E}}\limits_{n\in[N]}T_1^{n^2} f_1 \cdot \prod_{j=1}^L {\mathcal D}_{j}(n^2) \cdot T_3^{n^2+n}f_3 \end{align} $$

converges to 0 in $L^2(\mu )$ whenever $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_3|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_3} = 0$ .

Proposition 4.4 follows from [Reference Frantzikinakis and Kuca11, Proposition 8.4] since the two non-dual terms in equation (28) involve the pairwise independent polynomials $n^2$ and $n^2+n$ .

Having stated all the needed auxiliary results, we are finally in the position to prove Proposition 4.2.

Proof of Proposition 4.2

Suppose that equation (25) fails. Then $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_3|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{\textbf {b}}_1, \ldots , {\textbf {b}}_{s+1}}>0$ . If ${\textbf {b}}_{s+1} = {\mathbf {e}}_3$ , then $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_3|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{\textbf {b}}_1, \ldots , {\textbf {b}}_{s}, {\mathbf {e}}_3}>0$ and the claim holds with $s'=1$ , so we assume without loss of generality that ${\textbf {b}}_{s+1} = {\mathbf {e}}_3 - {\mathbf {e}}_2$ , the case ${\textbf {b}}_{s+1} = {\mathbf {e}}_3 - {\mathbf {e}}_1$ being identical.

Step 1 (ping): Obtaining auxiliary control by a seminorm of $f_2$ .

By Proposition 3.2 (applied with ${\mathcal A} = {\mathcal X}$ ), we replace $f_3$ by $\tilde {f}_3$ so that

$$ \begin{align*} \lim_{N\to\infty}\Big\Vert \mathop{\mathbb{E}}\limits_{n\in[N]}T_1^{n^2} f_1 \cdot T_2^{n^2}f_2 \cdot T_3^{n^2+n}\tilde{f}_3\Big\Vert_{L^2(\mu)}>0. \end{align*} $$

We set $f_{j,{\underline {h}}, {\underline {h}}'} := \Delta _{{{\textbf {b}}_1}, \ldots , {{\textbf {b}}_{s}}; {\underline {h}}-{\underline {h}}'} f_j$ for $j\in [3]$ and apply Proposition 3.3(i) (with ${\mathbf {c}} = {\boldsymbol {0}}$ ) to conclude that

$$ \begin{align*} \liminf_{H\to\infty}\mathop{\mathbb{E}}\limits_{{\underline{h}},{\underline{h}}'\in [H]^{s}} \lim_{N\to\infty}\Big\Vert \mathop{\mathbb{E}}\limits_{n\in[N]} \, T_1^{n^2} f_{1, {\underline{h}}, {\underline{h}}'}\cdot T_2^{n^2} f_{2, {\underline{h}}, {\underline{h}}'}\cdot T_3^{n^2+n}u_{{\underline{h}},{\underline{h}}'}\Big\Vert_{L^2(\mu)}>0 \end{align*} $$

for some 1-bounded functions $u_{{\underline {h}}, {\underline {h}}'}$ invariant under $T_3 T_2^{-1}$ . The invariance property implies that

$$ \begin{align*} T_3^{n^2+n} u_{{\underline{h}}, {\underline{h}}'} = T_2^{n^2+n} u_{{\underline{h}}, {\underline{h}}'}, \end{align*} $$

and hence

$$ \begin{align*} \liminf_{H\to\infty}\mathop{\mathbb{E}}\limits_{{\underline{h}},{\underline{h}}'\in [H]^{s}} \lim_{N\to\infty}\Big\Vert \mathop{\mathbb{E}}\limits_{n\in[N]} \, T_1^{n^2} f_{1, {\underline{h}}, {\underline{h}}'}\cdot T_2^{n^2} f_{2, {\underline{h}}, {\underline{h}}'}\cdot T_2^{n^2+n}u_{{\underline{h}},{\underline{h}}'}\Big\Vert_{L^2(\mu)}>0. \end{align*} $$
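The identity used in this substitution can be checked in one line. As a sketch, write $u$ for any function invariant under $T_3 T_2^{-1}$ and let $m\in {\mathbb Z}$ be arbitrary; since $T_2$ and $T_3$ commute,

$$ \begin{align*} T_3^{m} u = T_2^{m}\,(T_3 T_2^{-1})^{m} u = T_2^{m} u. \end{align*} $$

Taking $m = n^2+n$ and $u = u_{{\underline {h}}, {\underline {h}}'}$ gives the identity above.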

Consequently, there exist a set $B\subseteq {\mathbb N}^{2s}$ of positive lower density and $\varepsilon>0$ such that

(29)

$$ \begin{align} \lim_{N\to\infty}\Big\Vert \mathop{\mathbb{E}}\limits_{n\in[N]} \, T_1^{n^2} f_{1, {\underline{h}}, {\underline{h}}'}\cdot T_2^{n^2} f_{2, {\underline{h}}, {\underline{h}}'}\cdot T_2^{n^2+n}u_{{\underline{h}},{\underline{h}}'}\Big\Vert_{L^2(\mu)}>\varepsilon \end{align} $$

for all $({\underline {h}}, {\underline {h}}')\in B$ . Each of the averages in equation (29) takes the form of equation (26); we therefore apply Propositions 4.3 and 3.6 to obtain $s_1\in {\mathbb N}$ (independent of the system or the functions) and $\delta>0$ (which depends only on $\varepsilon $ and the system but not the functions) such that

$$ \begin{align*} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f_{2, {\underline{h}}, {\underline{h}}'}|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{s_1, T_2}> \delta \end{align*} $$

for all $({\underline {h}}, {\underline {h}}')\in B$ . Hence,

(30)

$$ \begin{align} \liminf_{H \to\infty}\mathop{\mathbb{E}}\limits_{{\underline{h}},{\underline{h}}'\in [H]^{s}} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| \Delta_{{{\textbf{b}}_1}, \ldots, {{\textbf{b}}_{s}}; {\underline{h}}-{\underline{h}}'} f_2|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{s_1, T_2}>0. \end{align} $$

Together with Lemma 3.1, the inductive formula for seminorms in equation (15), and the Hölder inequality, the inequality in equation (30) implies that

$$ \begin{align*} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f_2|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{{\textbf{b}}_1}, \ldots, {{\textbf{b}}_{s}}, {\mathbf{e}}_2^{\times s_1}}>0. \end{align*} $$

We deduce from this that the seminorm $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_2|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{{\textbf {b}}_1}, \ldots , {{\textbf {b}}_{s}}, {\mathbf {e}}_2^{\times s_1}}$ controls the average (24).

This seminorm control is not particularly useful as an independent result since the vectors ${\textbf {b}}_1, \ldots , {\textbf {b}}_s$ may not involve the transformation $T_2$ in any way. However, it is of great importance as an intermediate result applied in the next step of our argument.

Step 2 (pong): Obtaining control by a seminorm of $f_3$ .

Using our assumption that equation (25) fails, we now replace $f_2$ by $\tilde {f}_2$ and deduce from Proposition 3.2 (applied again with ${\mathcal A} = {\mathcal X}$ ) that

$$ \begin{align*} \lim_{N\to\infty}\Big\Vert \mathop{\mathbb{E}}\limits_{n\in[N]}T_1^{n^2} f_1 \cdot T_2^{n^2}\tilde{f}_2 \cdot T_3^{n^2+n}f_3\Big\Vert_{L^2(\mu)}>0. \end{align*} $$

From Proposition 3.3 (with ${\mathbf {c}} = {\boldsymbol {0}}$ ), it follows that

$$ \begin{align*} \liminf_{H\to\infty}\mathop{\mathbb{E}}\limits_{{\underline{h}},{\underline{h}}'\in [H]^{s}}\lim_{N\to\infty}\Big\Vert \mathop{\mathbb{E}}\limits_{n\in[N]} \, T_1^{n^2} f_{1, {\underline{h}}, {\underline{h}}'}\cdot {\mathcal D}_{{\underline{h}},{\underline{h}}'}(n^2) \cdot T_3^{n^2+n} f_{3, {\underline{h}}, {\underline{h}}'}\Big\Vert_{L^2(\mu)}>0, \end{align*} $$

where ${\mathcal D}_{{\underline {h}},{\underline {h}}'}$ is a product of $2^s$ elements of ${\mathfrak D}_{s_1}$ . Once again, there exists a set $B'\subseteq {\mathbb N}^{2s}$ of positive lower density and $\varepsilon>0$ such that

(31)

$$ \begin{align} \lim_{N\to\infty}\Big\Vert \mathop{\mathbb{E}}\limits_{n\in[N]} \, T_1^{n^2} f_{1, {\underline{h}}, {\underline{h}}'}\cdot {\mathcal D}_{{\underline{h}},{\underline{h}}'}(n^2) \cdot T_3^{n^2+n} f_{3, {\underline{h}}, {\underline{h}}'}\Big\Vert_{L^2(\mu)}> \varepsilon \end{align} $$

for all $({\underline {h}}, {\underline {h}}')\in B'$ . Each average in equation (31) is of the form

(32)

$$ \begin{align} \mathop{\mathbb{E}}\limits_{n\in[N]} \, T_1^{n^2} g_1 \cdot \prod_{j=1}^{2^s}{\mathcal D}_{j}(n^2)\cdot T_3^{n^2+n} g_3 \end{align} $$

for ${\mathcal D}_1, \ldots , {\mathcal D}_{2^s}\in {\mathfrak D}_{s_1}$ . By Proposition 4.4, the averages from equation (32) are controlled by $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| g_3|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s', T_3}$ for some $s'\in {\mathbb N}$ depending only on s, and so Proposition 3.6 gives $\delta>0$ such that

$$ \begin{align*} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f_{3, {\underline{h}}, {\underline{h}}'}|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{s', T_3}> \delta \end{align*} $$

for all $({\underline {h}}, {\underline {h}}')\in B'$ . (Specifically, we invoke Proposition 3.6 for averages of the form ${\mathbb {E}}_{n\in [N]}\, T_1^{n^2}g_1 \cdot \prod _{j=1}^{2^{s}} T_2^{n^2} g^{\prime }_j \cdot T_3^{n^2+n}g_3$ , where $g^{\prime }_j$ is ${\mathcal Z}_{s_1}(T_2)$ -measurable for each $j\in [2^{s}]$ . One can show that an average like this is qualitatively controlled by $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| g_3|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s', T_3}$ by approximating functions $g^{\prime }_1, \ldots , g^{\prime }_{2^{s}}$ by linear combinations of dual functions using Proposition 2.3 and then applying Proposition 3.7.) Hence,

$$ \begin{align*} \liminf_{H \to\infty}\mathop{\mathbb{E}}\limits_{{\underline{h}},{\underline{h}}'\in [H]^{s}} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| \Delta_{{{\textbf{b}}_1}, \ldots, {{\textbf{b}}_{s}}; {\underline{h}}-{\underline{h}}'} f_3|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{s', T_3}>0. \end{align*} $$

Together with Lemma 3.1, the Hölder inequality, and the inductive formula in equation (15) for the seminorms, this implies that $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_3|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{\textbf {b}}_1, \ldots , {\textbf {b}}_s, {\mathbf {e}}_3^{\times s'}}>0$ , which is what we claim.

Finally, we show how we can use Proposition 4.1 to obtain control of the average from equation (24) by seminorms of other terms.

Corollary 4.5. There exists $s\in {\mathbb N}$ such that for every system $(X, {\mathcal X}, \mu , T_1, T_2, T_3)$ satisfying ${\mathcal I}(T_1 T_2^{-1})={\mathcal I}(T_1)\cap {\mathcal I}(T_2)$ and all functions $f_1, f_2, f_3\in L^{\infty }(\mu )$ , the average from equation (24) converges to 0 in $L^2(\mu )$ whenever one of $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_1|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_1}, \lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_2|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_2}, \lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_3|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_3}$ is 0.

Proof. The statement that for some absolute $s\in {\mathbb N}$ , the identity $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_3|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_3} = 0$ implies the vanishing of the $L^2(\mu )$ limit of equation (24) follows from Propositions 3.7 and 4.1, so the content of Corollary 4.5 is to show control by other terms. Suppose that

$$ \begin{align*} \lim_{N\to\infty} \mathop{\mathbb{E}}\limits_{n\in[N]}T_1^{n^2} f_1 \cdot T_2^{n^2}f_2 \cdot T_3^{n^2+n}f_3 \neq 0. \end{align*} $$

We then apply Proposition 2.3 and the pigeonhole principle to find a 1-bounded dual function ${\mathcal D}_{s, T_3}g$ such that

$$ \begin{align*} \lim_{N\to\infty} \mathop{\mathbb{E}}\limits_{n\in[N]}T_1^{n^2} f_1 \cdot T_2^{n^2}f_2 \cdot {\mathcal D}(n^2+n) \neq 0, \end{align*} $$

where ${\mathcal D}(n) := T_3^n {\mathcal D}_{s, T_3} g$ . By Proposition 3.7, we have $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_2|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{\mathbf {e}}_2^{\times s'}, ({\mathbf {e}}_2-{\mathbf {e}}_1)^{\times s'}}>0$ for some absolute $s'\in {\mathbb N}$ . The assumption ${\mathcal I}(T_1 T_2^{-1})={\mathcal I}(T_1)\cap {\mathcal I}(T_2)$ and Lemma 2.2 imply that $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_2|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{\mathbf {e}}_2^{\times 2 s'}}>0$ , and an analogous argument gives $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_1|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{\mathbf {e}}_1^{\times 2 s'}}>0$ .

The argument in Example 1 is relatively clean because the leading coefficients of the polynomials are all 1. When this is not the case, minor modifications are required as explained in the next example.

Example 2. (Seminorm control for a non-monic family of length 3)

Consider the average

(33)

$$ \begin{align} \mathop{\mathbb{E}}\limits_{n\in[N]}T_1^{n^2} f_1 \cdot T_2^{n^2}f_2 \cdot T_3^{2n^2+n}f_3. \end{align} $$

By Proposition 3.7, this average is controlled by $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_3|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{\textbf {b}}_1, \ldots , {\textbf {b}}_{s+1}}$ for some $s\in {\mathbb N}$ and ${\textbf {b}}_1, \ldots , {\textbf {b}}_{s+1}\in \{2{\mathbf {e}}_3, 2{\mathbf {e}}_3 - {\mathbf {e}}_1, 2{\mathbf {e}}_3 - {\mathbf {e}}_2\}$ , and we want to pass toward the control by $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_3|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{\textbf {b}}_1, \ldots , {\textbf {b}}_s, {\mathbf {e}}_3^{\times s'}}$ . Suppose that the $L^2(\mu )$ limit of equation (33) does not vanish, and suppose moreover that ${\textbf {b}}_{s+1} = 2{\mathbf {e}}_3 - {\mathbf {e}}_2$ . Arguing as in the proof of Proposition 4.2, we arrive at the inequality

$$ \begin{align*} \liminf_{H\to\infty}\mathop{\mathbb{E}}\limits_{{\underline{h}},{\underline{h}}'\in [H]^{s}} \lim_{N\to\infty}\Big\Vert \mathop{\mathbb{E}}\limits_{n\in[N]} \, T_1^{n^2} f_{1, {\underline{h}}, {\underline{h}}'}\cdot T_2^{n^2} f_{2, {\underline{h}}, {\underline{h}}'}\cdot T_3^{2n^2+n}u_{{\underline{h}},{\underline{h}}'}\Big\Vert_{L^2(\mu)}>0, \end{align*} $$

where $f_{j, {\underline {h}}, {\underline {h}}'}:=\Delta _{{\textbf {b}}_1, \ldots , {\textbf {b}}_s; {\underline {h}}-{\underline {h}}'}f_j$ for $j\in [2]$ and the functions $u_{{\underline {h}}, {\underline {h}}'}$ are $T_2 T_3^{-2}$ invariant. We can no longer apply the invariance property in the same way as before since the polynomial $2n^2+n$ is not divisible by 2. Instead, we first split ${\mathbb N}$ into the odd and even part and then apply the triangle inequality to deduce that

$$ \begin{align*} &\mathop{\mathbb{E}}\limits_{r\in\{0,1\}}\limsup_{H\to\infty}\mathop{\mathbb{E}}\limits_{{\underline{h}},{\underline{h}}'\in [H]^{s}}\\ &\ \,\quad\lim_{N\to\infty}\Big\Vert \mathop{\mathbb{E}}\limits_{n\in[N]} \, T_1^{(2n+r)^2} f_{1, {\underline{h}}, {\underline{h}}'}\cdot T_2^{(2n+r)^2} f_{2, {\underline{h}}, {\underline{h}}'}\cdot T_3^{2(2n+r)^2+(2n+r)}u_{{\underline{h}},{\underline{h}}'}\Big\Vert_{L^2(\mu)}>0. \end{align*} $$

Only then can we apply the $T_2 T_3^{-2}$ invariance of $u_{{\underline {h}}, {\underline {h}}'}$ to obtain the identity

$$ \begin{align*} T_3^{2(2n+r)^2+(2n+r)}u_{{\underline{h}},{\underline{h}}'} = T_2^{4n^2 + (4r+1)n}T_3^{2r^2 + r}u_{{\underline{h}}, {\underline{h}}'}. \end{align*} $$
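For the reader's convenience, here is a sketch of the arithmetic behind this identity. Expanding the exponent and isolating its even part gives

$$ \begin{align*} 2(2n+r)^2+(2n+r) &= 8n^2 + (8r+2)n + 2r^2 + r \\ &= 2\big(4n^2 + (4r+1)n\big) + (2r^2 + r). \end{align*} $$

Since $T_2$ and $T_3$ commute and $u_{{\underline {h}}, {\underline {h}}'}$ is $T_2 T_3^{-2}$ invariant, we have $T_3^{2m} u_{{\underline {h}}, {\underline {h}}'} = T_2^{m}(T_2^{-1}T_3^{2})^{m} u_{{\underline {h}}, {\underline {h}}'} = T_2^{m} u_{{\underline {h}}, {\underline {h}}'}$ for every $m\in {\mathbb Z}$ . Applying this with $m = 4n^2 + (4r+1)n$ leaves the residual power $T_3^{2r^2+r}$ , which does not depend on $n$ .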

It follows that for some $r_0\in \{0,1\}$ , we have

$$ \begin{align*} &\limsup_{H\to\infty}\mathop{\mathbb{E}}\limits_{{\underline{h}},{\underline{h}}'\in [H]^{s}} \\ &\quad\lim_{N\to\infty}\Big\Vert \mathop{\mathbb{E}}\limits_{n\in[N]} \, T_1^{4n^2 + 4r_0 n} f^{\prime}_{1, {\underline{h}}, {\underline{h}}'}\cdot T_2^{4n^2 + 4r_0 n} f^{\prime}_{2, {\underline{h}}, {\underline{h}}'}\cdot T_2^{4n^2 + (4r_0+1)n}u^{\prime}_{{\underline{h}},{\underline{h}}'}\Big\Vert_{L^2(\mu)}>0 \end{align*} $$

upon setting $f^{\prime }_{j, {\underline {h}}, {\underline {h}}'} := T_j^{r_0^2} f_{j, {\underline {h}}, {\underline {h}}'}$ for $j\in [2]$ and $u^{\prime }_{{\underline {h}}, {\underline {h}}'} := T_3^{2r_0^2 + r_0} u_{{\underline {h}}, {\underline {h}}'}$ . The rest of the argument proceeds analogously except that we invoke an analog of Proposition 4.3 for the tuple $(T_1^{4n^2 + 4r_0 n}, T_2^{4n^2 + 4r_0 n}, T_2^{4n^2 + (4r_0+1)n})$ . The important part about this new tuple is that the first two polynomials are again pairwise dependent while the last one is pairwise independent with any of the first two, and that the new tuple retains the good ergodicity property.

5 Formalism and general strategy for longer families

We move on towards deriving Theorem 1.1 for longer families. To prove it for averages

(34)

$$ \begin{align} \mathop{\mathbb{E}}\limits_{n\in [N]} \,\prod_{j\in[\ell]}T_{j}^{p_j(n)}f_j, \end{align} $$

we need to analyze more complicated averages of the form

(35)

$$ \begin{align} \mathop{\mathbb{E}}\limits_{n\in [N]} \,\prod_{j\in[\ell]}T_{\eta_j}^{\,\rho_j(n)}f_j \cdot \prod_{j\in[L]}{\mathcal D}_{j}(q_j(n)) \end{align} $$

that appear at the intermediate steps of the proof of Theorem 1.1, much like averages from equations (26) and (28) show up at the intermediate steps of the proof of Proposition 4.1, a special case of Theorem 1.1 for the family $n^2, n^2, n^2+n$ . In equations (34) and (35), $p_1, \ldots , p_{\ell }, \rho _1, \ldots , \rho _{\ell }, q_1, \ldots , q_L$ are (not necessarily distinct) polynomials with integer coefficients and zero constant terms, $(X, {\mathcal X}, \mu , T_1, \ldots , T_{\ell })$ is a system, $f_1, \ldots , f_{\ell }\in L^{\infty }(\mu )$ are 1-bounded functions, and ${\mathcal D}_1, \ldots , {\mathcal D}_L\in {\mathfrak D}$ are sequences of functions. Since ${\mathcal D}_{j}(q_j(n))$ has the form $T_{\pi _j}^{q_j(n)}g_j$ for some $\pi _j\in [\ell ]$ and $g_j\in L^{\infty }(\mu )$ , the averages from equation (35) converge in $L^2(\mu )$ by [Reference Walsh19]. The same comment applies for all limits involving dual sequences that appear in the rest of the paper.

The purpose of this section is to introduce a formalism that helps us to meaningfully discuss averages from equation (35). This will be done in §5.1. Subsequently, we give in §5.2 an overview of the strategy used to prove Theorem 1.1. The details of various moves discussed in §5.2 will be presented in §6.

While discussing various examples in this and the next sections, we often say informally that for $j\in [\ell ]$ , the average from equation (35) is controlled by a $T_{\eta _j}$ -seminorm of $f_j$ (or that we have seminorm control of equation (35) by a $T_{\eta _j}$ -seminorm of $f_j$ ) if for all $d, L\in {\mathbb N}$ , there exists $s\in {\mathbb N}$ such that if the $T_{\eta _j}$ -seminorm of $f_j$ vanishes, then the $L^2(\mu )$ limit of equation (35) is 0 for all sequences ${\mathcal D}_1, \ldots , {\mathcal D}_L\in {\mathfrak D}_d$ and all functions $f_1, \ldots , f_{\ell }\in L^{\infty }(\mu )$ satisfying some explicitly stated invariance properties. We also say informally that we have seminorm control over the average from equation (35) if we have seminorm control by a $T_{\eta _j}$ -seminorm of $f_j$ for every $j\in [\ell ]$ .

5.1 The formalism behind the induction scheme

We start by introducing a handy formalism used for the induction scheme in the proof of Theorem 1.1. We often associate the average from equation (35) with the tuple $\mathopen {}( T_{\eta _j}^{\,\rho _j(n)} \mathclose {})_{j\in [\ell ]}$ . This tuple does not contain any information about the polynomials $q_1, \ldots , q_L$ , but this is not necessary. These terms play no role in our inductive argument and can be easily disposed of using Proposition 3.7. The only thing they do influence is the degree s of the seminorm with which we end up controlling the average from equation (35).

Definition. (Indexing data)

For an average from equation (35) or the associated tuple $\mathopen {}( T_{\eta _j}^{\,\rho _j(n)} \mathclose {})_{j\in [\ell ]}$ , we let $\ell $ be its length, $d:= \max \nolimits _{j\in [\ell ]}\deg \rho _j$ be its degree, and $\eta $ be its indexing tuple. For $j\in [\ell ]$ , we set $d_j := \deg \rho _j$ . We let $K_1$ be the maximum number of pairwise independent polynomials within the family $\rho _1,\ldots , \rho _{\ell }$ (we set $K_1:=1$ if $\ell =1$ or every two polynomials are pairwise dependent). We partition $[\ell ]=\bigcup _{t\in [K_1]} {\mathfrak I}_t$ , where $j_1, j_2$ belong to the same ${\mathfrak I}_t$ if and only if $\rho _{j_1}$ , $\rho _{j_2}$ are linearly dependent. Thus, ${\mathfrak I}_1, \ldots , {\mathfrak I}_{K_1}$ partitions $[\ell ]$ into index sets corresponding to families of pairwise dependent polynomials. Furthermore, we define

$$ \begin{align*} {\mathfrak L} := \{j\in[\ell]: \deg \rho_j = d\} \end{align*} $$

to be the set of indices corresponding to polynomials of maximum degree, and we rearrange ${\mathfrak I}_1, \ldots , {\mathfrak I}_{K_1}$ so that ${\mathfrak L} =\bigcup _{t\in [K_2]}{\mathfrak I}_t$ for some $K_2\leq K_1$ . We also let $K_3 := |{\mathfrak L}|$ be the number of maximum degree polynomials, and we notice that $K_2\leq K_3\leq \ell $ .

Sometimes, we denote ${\mathfrak L} = {\mathfrak L}(\rho _1, \ldots , \rho _{\ell }), {\mathfrak I}_t={\mathfrak I}_t(\rho _1, \ldots , \rho _{\ell })$ , and $K_i = K_i (\rho _1, \ldots , \rho _{\ell })$ to emphasize the dependence on a specific family of polynomials.

Example 3. Consider the tuple

(36)

$$ \begin{align} (T_1^{n^2}, T_2^{n^2}, T_3^{n^2+n}, T_4^{2n^2+2n}, T_3^{n^2+2n}, T_6^n, T_2^{n^2+3n}). \end{align} $$

It has length 7, degree 2, indexing tuple $\eta = (1, 2, 3, 4, 3, 6, 2)$ , $K_1 = 5$ (corresponding to five pairwise independent polynomials $n^2, n^2+n, n^2 + 2n, n^2 + 3n, n$ ), $K_2 = 4$ (corresponding to four quadratic pairwise independent polynomials $n^2, n^2 + n, n^2 + 2n, n^2+3n$ ), $K_3 = 6$ (corresponding to six quadratic polynomials appearing in equation (36)), ${\mathfrak L} = \{1, 2, 3, 4, 5, 7\}$ , and partition ${\mathfrak I}_1 = \{1, 2\}, {\mathfrak I}_2 = \{3, 4\}, {\mathfrak I}_3 = \{5\}, {\mathfrak I}_4 = \{7\}, {\mathfrak I}_5 = \{6\}$ .
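For comparison, the same bookkeeping can be carried out for the much shorter tuples from Example 1. The following is only a sketch obtained by reading off the definitions above. For the tuple $(T_1^{n^2}, T_2^{n^2}, T_3^{n^2+n})$ , we get

$$ \begin{align*} \ell = 3,\quad d = 2,\quad \eta = (1,2,3),\quad K_1 = K_2 = 2,\quad K_3 = 3,\quad {\mathfrak L} = \{1,2,3\},\quad {\mathfrak I}_1 = \{1,2\},\quad {\mathfrak I}_2 = \{3\}, \end{align*} $$

since $n^2$ and $n^2+n$ are the only two pairwise independent polynomials and all three polynomials are quadratic. The tuple $(T_1^{n^2}, T_2^{n^2}, T_2^{n^2+n})$ from Proposition 4.3 involves the same polynomials, so it has the same data except that its indexing tuple is $\eta = (1,2,2)$ .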

The rationale behind introducing the partition ${\mathfrak I}_1, \ldots , {\mathfrak I}_{K_1}$ and the indexing tuple $\eta $ is as follows. As part of our assumptions, we know that $T_i^{\beta _i} T_j^{-\beta _j}$ is ergodic whenever the polynomials $\rho _i, \rho _j$ are dependent, $b_i, b_j$ are their leading coefficients, and $\beta _i := b_i/\gcd (b_i, b_j), \beta _j := b_j/\gcd (b_i, b_j)$ . Introducing the partition ${\mathfrak I}_1, \ldots , {\mathfrak I}_{K_1}$ allows us to keep track of pairs $(i,j)$ for which we have these ergodicity properties. The reason for introducing the indexing tuple $\eta $ is that in our induction procedure, we gradually replace a transformation $T_{\eta _m}$ in the tuple $(T_{\eta _j}^{\,\rho _j(n)})_{j\in [\ell ]}$ by a different transformation $T_{\eta _i}$ , and so $\eta $ keeps track of these substitutions.

Definition. (Good ergodicity property along $\eta $ )

Let $\eta \in [\ell ]^{\ell }$ be an indexing tuple and $b_1, \ldots , b_{\ell }$ be the leading coefficients of the polynomials $\rho _1, \ldots , \rho _{\ell }$ . We say that the system $(X,{\mathcal X}, \mu ,T_1, \ldots , T_{\ell })$ has the good ergodicity property along $\eta $ for the polynomials $\rho _1, \ldots , \rho _{\ell }$ if for every distinct $\eta _{j_1}, \eta _{j_2}$ belonging to the same ${\mathfrak I}_t$ with $t\in [K_1]$ , we have

(37)

$$ \begin{align} {\mathcal I}(T_{\eta_{j_1}}^{\beta_{j_1}}T_{\eta_{j_2}}^{-\beta_{j_2}}) = {\mathcal I}(T_{\eta_{j_1}})\cap {\mathcal I}(T_{\eta_{j_2}}), \end{align} $$

where $\beta _{j} := b_{j}/\gcd (b_{j_1}, b_{j_2})$ for $j=j_1, j_2$ . In other words, the only functions invariant under $T_{\eta _{j_1}}^{\beta _{j_1}}T_{\eta _{j_2}}^{-\beta _{j_2}}$ are those simultaneously invariant under $T_{\eta _{j_1}}$ and $T_{\eta _{j_2}}$ . In particular, having the good ergodicity property corresponds to having the good ergodicity property for the polynomials $p_1,\ldots , p_{\ell }$ along the identity indexing tuple $\eta _0 := (1, \ldots , \ell )$ . We similarly say that the tuple $(T_{\eta _j}^{\,\rho _j(n)})_{j\in [\ell ]}$ has the good ergodicity property if $(X, {\mathcal X}, \mu , T_1, \ldots , T_{\ell })$ has the good ergodicity property along $\eta $ for $\rho _1, \ldots , \rho _{\ell }$ .

What this definition captures is that every time we encounter in our tuple two transformations $T_{\eta _{j_1}}, T_{\eta _{j_2}}$ whose indices $\eta _{j_1}, \eta _{j_2}$ lie in the same set ${\mathfrak I}_t$ , the identity in equation (37) is satisfied. For instance, the tuple in equation (36) has the good ergodicity property along $\eta $ precisely when

$$ \begin{align*} {\mathcal I}(T_1 T_2^{-1}) = {\mathcal I}(T_1)\cap{\mathcal I}(T_2)\quad \textrm{and}\quad {\mathcal I}(T_3 T_4^{-2}) = {\mathcal I}(T_3)\cap {\mathcal I}(T_4). \end{align*} $$

The first identity corresponds to comparing the pairs $(j_1, j_2) = (1,2), (1,7)$ corresponding to the occurrences of transformations $T_1$ and $T_2$ from the cell ${\mathfrak I}_1$ ( $T_1$ occurs at the index 1 and $T_2$ occurs at the indices 2 and 7). The second identity comes from comparing the pairs $(j_1, j_2) = (3, 4), (4, 5)$ corresponding to the transformations $T_3, T_4$ from the cell ${\mathfrak I}_2$ ( $T_3$ occurs at the indices 3 and 5 whereas $T_4$ occurs at the index 4).
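The bookkeeping in this definition can be made concrete with a minimal sketch. The function name `required_identities` and the encoding of an identity as a tuple `(a, beta_a, c, beta_c)` are our own illustrative choices, not notation from the text. As a test case we take the tuple $(T_1^{n^2}, T_2^{n^2}, T_3^{n^2+n})$ from Example 1 and the tuple $(T_1^{n^2}, T_2^{n^2}, T_2^{n^2+n})$ , both with the partition ${\mathfrak I}_1 = \{1,2\}$ , ${\mathfrak I}_2 = \{3\}$ .

```python
from math import gcd


def required_identities(eta, leading_coeffs, partition):
    """List the identities of the form (37) demanded along eta.

    eta            -- indexing tuple (eta_1, ..., eta_l) with values in 1..l
    leading_coeffs -- leading coefficients (b_1, ..., b_l) of rho_1, ..., rho_l
    partition      -- list of sets I_1, ..., I_{K_1} partitioning {1, ..., l}

    Each returned tuple (a, beta_a, c, beta_c) stands for the identity
    I(T_a^{beta_a} T_c^{-beta_c}) = I(T_a) intersected with I(T_c).
    """
    cell_of = {index: t for t, cell in enumerate(partition) for index in cell}
    found = set()
    for j1 in range(len(eta)):
        for j2 in range(j1 + 1, len(eta)):
            a, c = eta[j1], eta[j2]
            # only distinct transformations from the same cell matter
            if a == c or cell_of[a] != cell_of[c]:
                continue
            g = gcd(leading_coeffs[j1], leading_coeffs[j2])
            beta1, beta2 = leading_coeffs[j1] // g, leading_coeffs[j2] // g
            # a transformation and its inverse have the same invariant sets,
            # so we store each identity with the smaller index first
            found.add((a, beta1, c, beta2) if a < c else (c, beta2, a, beta1))
    return sorted(found)


cells = [{1, 2}, {3}]
original = required_identities((1, 2, 3), (1, 1, 1), cells)
replaced = required_identities((1, 2, 2), (1, 1, 1), cells)
```

In both cases the only identity produced is the one for $T_1 T_2^{-1}$ , in line with the fact that replacing $T_3$ by $T_2$ does not create new ergodicity requirements in this example.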

The guiding principle behind our arguments is that we derive seminorm control of the average from equation (35) by inductively applying seminorm control of an average that is ‘simpler’ than the original average in an appropriate sense. For instance, in Proposition 4.2 and Corollary 4.5, we obtained seminorm control for the tuple $(T_1^{n^2}, T_2^{n^2}, T_3^{n^2+n})$ from Example 1 by invoking seminorm control for the following tuples:

• $(T_1^{n^2}, T_2^{n^2}, T_2^{n^2+n})$ in the ping step of the smoothing argument in Proposition 4.2;
• $(T_1^{n^2}, *, T_3^{n^2+n})$ in the pong step of the smoothing argument in Proposition 4.2 (the asterisk is introduced purely for convenience; it denotes the term replaced by a product of dual functions);
• $(T_1^{n^2}, T_2^{n^2}, *)$ in Corollary 4.5.

The relative complexity of a tuple or an average is captured by the following notion.

Definition. (Type)

The type of $\mathopen {}( T_{\eta _j}^{\,\rho _j(n)} \mathclose {})_{j\in [\ell ]}$ is the tuple $w = (w_1, \ldots , w_{K_2})$ , where

$$ \begin{align*} w_t := |\{j\in {\mathfrak L}: {\eta_j}\in {\mathfrak I}_t\}| = |\{j\in[\ell]: \deg \rho_j = d,\; {\eta_j}\in {\mathfrak I}_t\}| \end{align*} $$

counts the number of times the transformations $(T_j)_{j\in {\mathfrak I}_t}$ appear in $\mathopen {}( T_{\eta _j}^{\,\rho _j(n)} \mathclose {})_{j\in [\ell ]}$ with a polynomial iterate of maximum degree. (We note here that the type of $\mathopen {}( T_{\eta _j}^{\,\rho _j(n)} \mathclose {})_{j\in [\ell ]}$ depends not just on $\eta $ and the polynomials $\rho _1, \ldots , \rho _{\ell }$ , but also on the ordering of the sets ${\mathfrak I}_1, \ldots , {\mathfrak I}_{K_1}$ . We do not record this dependence explicitly, instead fixing some ordering of ${\mathfrak I}_1, \ldots , {\mathfrak I}_{K_1}$ a priori.) We note that $|w| := w_1 +\cdots + w_{K_2} = K_3$ . We say that the type w is basic if it has the form $w=(K_3,0,\ldots , 0)$ .

For instance, the tuple in equation (36) has type $(3,3,0,0)$ : this is because $T_1, T_2$ corresponding to ${\mathfrak I}_1$ occur thrice, as do the transformations $T_3, T_4$ corresponding to ${\mathfrak I}_2$ , while the transformations $T_5, T_7$ corresponding to ${\mathfrak I}_3$ and ${\mathfrak I}_4$ do not occur at all. We do not care about the occurrence of $T_6$ since it has a linear iterate.

It is instructive to see what happens when the polynomials $\rho _1, \ldots , \rho _{\ell }$ are pairwise independent. In that case, ${\mathfrak I}_t = \{j_t\}$ for every $t\in [\ell ]$ and $w_t$ counts the number of times the transformation $T_{j_t}$ appears among $(T_{\eta _j})_{j\in {\mathfrak L}}$ , or equivalently the number of times that $T_{j_t}$ attains a polynomial iterate of maximal degree. So for pairwise independent polynomials, this notion of type recovers the concept of type from [11, §8.2] (up to permuting ${\mathfrak I}_1, \ldots , {\mathfrak I}_{K_1}$ ).

For tuples $w, w'\in {\mathbb N}_0^{K_2}$ with $|w| = |w'| = K_3$ , we say $w'<w$ if there exists ${\kappa \in \{0, \ldots , K_2-1\}}$ such that for all $t\in [\kappa ]$ , we have $w^{\prime }_t = w_t$ , and either $w^{\prime }_{\kappa +1} = 0 < w_{\kappa +1}$ or $w^{\prime }_{\kappa +1}>w_{\kappa +1}>0$ . For instance, we have the following chain of type inequalities:

$$ \begin{align*} (4, 0, 0) &< (3, 0, 1) < (3, 1, 0) < (2, 0, 2) < (2, 2, 0)\\ &< (2, 1, 1) < (1, 0, 3) < (1, 2, 1) < (1, 1, 2). \end{align*} $$

The first, third, fifth, sixth, and eighth inequalities follow from the condition $w^{\prime }_{\kappa +1}>w_{\kappa +1}>0$ while the second, fourth, and seventh inequalities are consequences of the condition $w^{\prime }_{\kappa +1} = 0 < w_{\kappa +1}$ . This is a rather atypical ordering, but it turns out to determine well which of the tuples $\mathopen {}( T_{\eta _j}^{\,\rho _j(n)} \mathclose {})_{j\in [\ell ]}$ , $\mathopen {}( T_{\eta ^{\prime }_j}^{\,\rho ^{\prime }_j(n)} \mathclose {})_{j\in [\ell ]}$ is ‘simpler’ than the other. The motivation for this particular choice of ordering is that in the arguments to come, we will be passing from a tuple $\mathopen {}( T_{\eta _j}^{\,\rho _j(n)} \mathclose {})_{j\in [\ell ]}$ of type w to another tuple $\mathopen {}( T_{\eta ^{\prime }_j}^{\,\rho ^{\prime }_j(n)} \mathclose {})_{j\in [\ell ]}$ of type $w'<w$ in two ways. In one of them, the type $w'$ will meet the condition $w^{\prime }_{\kappa +1}>w_{\kappa +1}>0$ while in the other one, it will satisfy the condition $w^{\prime }_{\kappa +1} = 0 < w_{\kappa +1}$ . Arguing this way, we arrive in finitely many steps at a tuple of a basic type $w=(K_3, 0, \ldots , 0)$ , which constitutes the base case of our induction. This transition will be explained in greater detail at the end of §5.2 and illustrated in Example 10.
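The ordering on types is easy to experiment with. The following is a minimal sketch, with hypothetical helper names `comparison_kind` and `type_less`, that compares two types at the first coordinate where they differ (position $\kappa +1$ ) and records which of the two conditions holds. The loop starts at the first coordinate, which amounts to allowing $\kappa = 0$ , as the first inequality of the chain requires.

```python
def comparison_kind(w_new, w_old):
    """Say how w_new < w_old holds: 'vanish', 'grow', or None if it fails.

    'vanish' means w_new[kappa+1] = 0 < w_old[kappa+1];
    'grow'   means w_new[kappa+1] > w_old[kappa+1] > 0.
    """
    if len(w_new) != len(w_old) or sum(w_new) != sum(w_old):
        raise ValueError("types must have the same length and the same total")
    for a, b in zip(w_new, w_old):
        if a != b:
            # first coordinate where the two types differ
            if a == 0 < b:
                return "vanish"
            if a > b > 0:
                return "grow"
            return None
    return None  # equal types are never related by <


def type_less(w_new, w_old):
    return comparison_kind(w_new, w_old) is not None


chain = [(4, 0, 0), (3, 0, 1), (3, 1, 0), (2, 0, 2), (2, 2, 0),
         (2, 1, 1), (1, 0, 3), (1, 2, 1), (1, 1, 2)]
kinds = [comparison_kind(chain[k], chain[k + 1]) for k in range(len(chain) - 1)]
```

The list `kinds` records, for each of the eight inequalities in the chain, which condition is responsible for it.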

Lemma 5.1. For fixed $K_2,K_3\in {\mathbb N}$ , let $A:=\{w\in {\mathbb N}_0^{K_2}: w_1+\cdots +w_{K_2} = K_3\}$ . Then $<$ defines a strict partial order on A.

Proof. It is clear that $<$ is asymmetric and irreflexive, so it remains to show that it is transitive. Suppose that $w''<w'$ , $w'<w$ , and let $\kappa _1, \kappa _2\in \{0, \ldots , K_2-1\}$ be indices such that $w^{\prime \prime }_t=w^{\prime }_t$ for all $t\in [\kappa _2]$ but not for $t=\kappa _2+1$ and $w^{\prime }_t=w_t$ for all $t\in [\kappa _1]$ but not for $t=\kappa _1+1$ . If $\kappa _2<\kappa _1$ , then either $w^{\prime \prime }_{\kappa _2+1}=0<w^{\prime }_{\kappa _2+1} = w_{\kappa _2+1}$ or $w^{\prime \prime }_{\kappa _2+1}>w^{\prime }_{\kappa _2+1} = w_{\kappa _2+1}>0$ , and so $w''<w$ . Otherwise $\kappa _2\geq \kappa _1$ , in which case $w^{\prime \prime }_t = w^{\prime }_t = w_t$ for all $t\in [\kappa _1]$ , and either $w^{\prime \prime }_{\kappa _1+1} = 0 < w_{\kappa _1+1}$ or $w^{\prime \prime }_{\kappa _1+1} \geq w^{\prime }_{\kappa _1+1}> w_{\kappa _1+1}>0$ , implying $w''<w$ again.
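As a sanity check of Lemma 5.1, one can verify the claim by brute force for small parameters. The sketch below does this for $K_2 = 3$ , $K_3 = 4$ ; it covers only this one instance and is no substitute for the proof. It also compares $<$ with a lexicographic ordering in which each coordinate is ordered as $0, \ldots , 3, 2, 1$ (zero first, then positive values in decreasing order). This reformulation is our own reading, and the names `type_key` and `all_types` are hypothetical.

```python
from itertools import product


def type_less(w_new, w_old):
    """w_new < w_old, decided at the first coordinate where they differ."""
    for a, b in zip(w_new, w_old):
        if a != b:
            return (a == 0 < b) or (a > b > 0)
    return False


def type_key(w):
    # assumed coordinate order: 0 comes first, then larger entries before smaller
    return tuple((0, 0) if a == 0 else (1, -a) for a in w)


def all_types(k2, k3):
    """All w in N_0^{k2} with w_1 + ... + w_{k2} = k3."""
    return [w for w in product(range(k3 + 1), repeat=k2) if sum(w) == k3]


A = all_types(3, 4)
irreflexive = all(not type_less(w, w) for w in A)
asymmetric = all(not (type_less(u, v) and type_less(v, u)) for u in A for v in A)
transitive = all(type_less(u, w)
                 for u in A for v in A for w in A
                 if type_less(u, v) and type_less(v, w))
agrees_with_key = all(type_less(u, v) == (type_key(u) < type_key(v))
                      for u in A for v in A)
```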

5.2 The general strategy

In this section, we outline how to obtain a seminorm control of a given tuple using seminorm control for tuples of lower type or shorter length.

Definition. (Controllable and uncontrollable tuples)

Let $t_w := \max \{t: w_t>0\}$ be the last non-zero index of w. We call a tuple $\mathopen {}( T_{\eta _j}^{\,\rho _j(n)} \mathclose {})_{j\in [\ell ]}$ of a non-basic type w (or the corresponding average) controllable, if there exists an index $m\in [\ell ]$ such that

• $\eta _m\in {\mathfrak I}_{t_w}$ ;
• for every other $i\in [\ell ]$ with $\eta _i=\eta _m$ , we have $\rho _i\neq \rho _m$ .

If m satisfies the aforementioned assumption, we say that it satisfies the controllability condition; in this case, Proposition 3.7 guarantees that the average from equation (35) is controlled by $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_m|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{\textbf {b}}_1, \ldots , {\textbf {b}}_{s}}$ for non-zero vectors ${\textbf {b}}_1, \ldots , {\textbf {b}}_{s}$ . If no such index m exists, we call the tuple $\mathopen {}( T_{\eta _j}^{\,\rho _j(n)} \mathclose {})_{j\in [\ell ]}$ uncontrollable.

The previous notions of controllability are supposed to capture whether Proposition 3.7 is applicable to the relevant tuples in our setting.

Example 4. (Controllable versus uncontrollable tuples)

Consider the following two tuples:

(38)

$$ \begin{align} &\mathopen{}( T_1^{n^2}, T_2^{n^2}, T_3^{n^2}, T_4^{n^2}, T_5^{n^2+n}, T_1^{n^2+n}, T_5^{n^2+2n}, T_5^{n^2+2n} \mathclose{}), \end{align} $$

(39)

$$ \begin{align} &\mathopen{}( T_1^{n^2}, T_2^{n^2}, T_3^{n^2}, T_4^{n^2}, T_1^{n^2+n}, T_1^{n^2+n}, T_5^{n^2+2n}, T_5^{n^2+2n} \mathclose{}). \end{align} $$

Defining the partition ${\mathfrak I}_1 = \{1, 2, 3, 4\}$ , ${\mathfrak I}_2 = \{5, 6\}$ , ${\mathfrak I}_3 = \{7, 8\}$ corresponding to the independent polynomials $n^2, n^2+n, n^2+2n$ , respectively, the first tuple has type $(5, 3, 0)$ while the second one has type $(6, 2, 0)$ , and for both tuples, we have $t_w=2$ . The first one is controllable because for the index $m = 5$ , the only values $i\neq m$ such that $\eta _i = 5$ are $i=7,8$ corresponding to the polynomial $n^2+2n$ , which is distinct from $n^2+n$ .

By contrast, the tuple in equation (39) does not possess an index satisfying the controllability condition: the only indices $m\in [8]$ with $\eta _m\in {\mathfrak I}_2$ are $m=7,8$ , and we have both $\eta _7 = \eta _8$ and $\rho _7(n) = n^2+2n = \rho _8(n)$ . Hence, this tuple is uncontrollable.
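The computations in Example 4 can be reproduced mechanically. The following sketch uses hypothetical helper names and encodes each polynomial by its coefficient tuple (constant term first), so that $n^2+2n$ becomes `(0, 2, 1)`. It computes the type of a tuple and lists all indices satisfying the controllability condition.

```python
def tuple_type(eta, max_degree_positions, cells):
    """w_t = number of positions j in L with eta_j in the cell I_t."""
    return tuple(sum(1 for j in max_degree_positions if eta[j - 1] in cell)
                 for cell in cells)


def controllable_indices(eta, rhos, max_degree_positions, cells):
    """Indices m (1-based) satisfying the controllability condition."""
    w = tuple_type(eta, max_degree_positions, cells)
    last_cell = cells[max(t for t, count in enumerate(w) if count > 0)]
    good = []
    for m in range(1, len(eta) + 1):
        if eta[m - 1] not in last_cell:
            continue
        # every other index carrying the same transformation needs a different polynomial
        if all(rhos[i - 1] != rhos[m - 1]
               for i in range(1, len(eta) + 1)
               if i != m and eta[i - 1] == eta[m - 1]):
            good.append(m)
    return good


cells = [{1, 2, 3, 4}, {5, 6}, {7, 8}]
positions = range(1, 9)  # all eight iterates have maximal degree 2
n2, n2_n, n2_2n = (0, 0, 1), (0, 1, 1), (0, 2, 1)

eta_38 = (1, 2, 3, 4, 5, 1, 5, 5)
rho_38 = (n2, n2, n2, n2, n2_n, n2_n, n2_2n, n2_2n)
eta_39 = (1, 2, 3, 4, 1, 1, 5, 5)
rho_39 = rho_38  # the polynomial iterates are the same in both tuples

type_38 = tuple_type(eta_38, positions, cells)
type_39 = tuple_type(eta_39, positions, cells)
good_38 = controllable_indices(eta_38, rho_38, positions, cells)
good_39 = controllable_indices(eta_39, rho_39, positions, cells)
```

For the tuple in equation (38), the only index returned is $m=5$ , while for the tuple in equation (39) the list is empty.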

Our strategy for proving seminorm control will work rather differently for controllable and uncontrollable tuples. Suppose first that the average from equation (35) with tuple $\mathopen {}( T_{\eta _j}^{\,\rho _j(n)} \mathclose {})_{j\in [\ell ]}$ of a non-basic type is controllable, and that an index m satisfies the controllability condition. Then Proposition 3.7 guarantees that the average from equation (35) is controlled by $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_m|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{\textbf {b}}_1, \ldots , {\textbf {b}}_{s+1}}$ for some vectors ${\textbf {b}}_1, \ldots , {\textbf {b}}_{s+1}$ from equation (23), and the controllability implies that these vectors are indeed non-zero. We want to show that this average is also controlled by $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_m|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{\textbf {b}}_1, \ldots , {\textbf {b}}_{s}, {\mathbf {e}}_{\eta _m}^{\times s'}}$ for some $s'\in {\mathbb N}$ via a seminorm smoothing argument that generalizes Proposition 4.2. We then iterate this result s more times to get control by a $T_{\eta _m}$ -seminorm of $f_m$ . The seminorm smoothing argument follows a ping-pong strategy much like in the proof of Proposition 4.2. We first show that control by $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_m|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{\textbf {b}}_1, \ldots , {\textbf {b}}_{s+1}}$ implies control by $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_i|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{\textbf {b}}_1, \ldots , {\textbf {b}}_{s}, {\mathbf {e}}_{\eta _i}^{\times s_1}}$ for some $s_1\in {\mathbb N}$ and $i\neq m$ , and then we use this auxiliary result to obtain control by $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_m|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{\textbf {b}}_1, \ldots , {\textbf {b}}_{s}, {\mathbf {e}}_{\eta _m}^{\times s'}}$ .

The main idea behind the ping step of the seminorm smoothing argument is to show that a seminorm control of the average from equation (35) can be deduced from a seminorm control of a family of averages of the form

(40)

$$ \begin{align} \lim_{N\to\infty}\mathop{\mathbb{E}}\limits_{n\in [N]} \,\prod_{j\in[\ell]}T_{{\eta_j'}}^{\,\rho^{\prime}_j(n)}f^{\prime}_j \cdot \prod_{j\in[L']}{\mathcal D}^{\prime}_{j}(q^{\prime}_j(n)) \end{align} $$

for some polynomials $\rho ^{\prime }_1, \ldots , \rho ^{\prime }_{\ell }, q^{\prime }_1, \ldots , q^{\prime }_{L'}\in {\mathbb Z}[n]$ , 1-bounded functions $f^{\prime }_1, \ldots , f^{\prime }_{\ell }\in L^{\infty }(\mu )$ , and sequences of functions ${\mathcal D}^{\prime }_1, \ldots , {\mathcal D}^{\prime }_{L'}\in {\mathfrak D}$ . Moreover, the indexing tuple $\eta '\in [\ell ]^{\ell }$ is obtained from $\eta $ by changing $\eta _m$ into $\eta _i$ for some $i\neq m$ , that is, the passage from equation (35) to equation (40) goes by replacing $T_{\eta _m}$ at index m with $T_{\eta _i}$ . Importantly, the new average from equation (40) satisfies several key properties:

(i) it has a lower type than the original average, so that we can argue by induction;
(ii) the new average retains the good ergodicity property of the original average;
(iii) the functions $f^{\prime }_j$ in the new average satisfy some invariance properties;
(iv) as long as the aforementioned invariance properties are satisfied, the new average from equation (40) is controlled by the seminorm $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_j'|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_{\eta ^{\prime }_j}}$ for each $j\in [\ell ]$ and some $s\in {\mathbb N}$ .

Proposition 6.1 explains the exact way in which we pass from averages from equation (35) to those from equation (40) so that the property (i) is satisfied, and Proposition 6.4 establishes the property (ii). Proposition 6.5 then ensures that the functions $f_j'$ in equation (40) satisfy needed invariance properties.

We note though that the new average from equation (40) need not be controllable. For instance, if we take the average corresponding to the tuple from equation (38), then in the ping step, we replace $T_5^{n^2+n}$ by the same iterate of one of $T_1, T_2, T_3, T_4$ . The new average is then uncontrollable, as is the tuple from equation (39), corresponding to replacing $T_5^{n^2+n}$ by $T_1^{n^2+n}$ . Hence, controllability may not be preserved while performing the procedure outlined above.

In the pong step of the smoothing argument for equation (35), we deal with averages of the form

(41)

$$ \begin{align} \mathop{\mathbb{E}}\limits_{n\in [N]} \,\prod_{\substack{j\in[\ell],\\ j\neq i}}T_{\eta_j}^{\,\rho_j(n)}f^{\prime\prime}_j \cdot \prod_{j\in[L'']}{\mathcal D}^{\prime\prime}_{j}(q^{\prime\prime}_j(n)). \end{align} $$

Crucially, each function $f^{\prime \prime }_j$ is invariant under some composition of $T_{\eta _j}$ and $T_j$ . This allows us to replace (some iterate of) $T_{\eta _j}$ in equation (41) by (some iterate of) $T_j$ , a procedure that we call flipping, and show that an average from equation (41) essentially equals an average of the form

(42)

$$ \begin{align} \mathop{\mathbb{E}}\limits_{n\in [N]} \,\prod_{\substack{j\in[\ell],\\ j\neq i}}T_j^{\,\rho^{\prime\prime\prime}_j(n)}f^{\prime\prime\prime}_j \cdot \prod_{j\in[L'']}{\mathcal D}^{\prime\prime}_{j}(q^{\prime\prime\prime}_j(n)) \end{align} $$

of length $\ell -1$ . The details of how flipping is performed are presented in Proposition 6.6. An inductive application of a suitable modification of Theorem 1.1 then gives a control of equation (42) by a $T_j$ -seminorm of $f^{\prime \prime \prime }_j$ for each $j\neq i$ , and the invariance property of $f^{\prime \prime }_j$ translates it into a control of equation (41) by a $T_{\eta _j}$ -seminorm of $f^{\prime \prime }_j$ for each $j\neq i$ . A straightforward argument analogous to one at the end of the proof of Proposition 4.2 gives a control of equation (35) by a $T_{\eta _m}$ -seminorm of $f_m$ .

If the average from equation (35) is uncontrollable, then we proceed rather differently. The previous strategy breaks right at the start since there is no index m satisfying the controllability condition. Consequently, whichever index m with $\eta _m\in {\mathfrak I}_{t_w}$ we take, we cannot employ Proposition 3.7 to bound the seminorm by $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_m|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{\textbf {b}}_1, \ldots , {\textbf {b}}_{s+1}}$ for non-zero vectors ${\textbf {b}}_1, \ldots , {\textbf {b}}_{s+1}$ . What we use instead is the inductive assumption that the functions $f_j$ at indices j with $\eta _j\in {\mathfrak I}_{t_w}$ are invariant under a composition of (some power of) $T_{\eta _j}$ and (some power of) $T_j^{-1}$ . Using this invariance property, we perform flipping once more to replace the original average from equation (35) by a new average

(43)

$$ \begin{align} \lim_{N\to\infty}\mathop{\mathbb{E}}\limits_{n\in [N]} \,\prod_{j\in[\ell]}T_{\eta^{\prime\prime\prime\prime}_j}^{\,\rho^{\prime\prime\prime\prime}_j(n)}f^{\prime\prime\prime\prime}_j \cdot \prod_{j\in[L]}{\mathcal D}_{j}(q^{\prime\prime\prime\prime}_j(n)), \end{align} $$

where $\eta ^{\prime \prime \prime \prime }_j = j$ whenever $\eta _j\in {\mathfrak I}_{t_w}$ ; the details are provided in Corollary 6.7. This new average has the good ergodicity property and is controllable. Importantly, it has a lower type, which is established in Proposition 6.8. We can then obtain seminorm control of equation (35) by inductively invoking the seminorm control of equation (43). The seminorm control of equation (43) is proved in turn by the smoothing argument for controllable averages described above.

Thus, whether the tuple $\mathopen {}( T_{\eta _j}^{\,\rho _j(n)} \mathclose {})_{j\in [\ell ]}$ of a non-basic type is controllable or not, the idea is to control it by a Gowers–Host–Kra seminorm by invoking seminorm control for tuples of lower type or smaller length that naturally appear when examining $\mathopen {}( T_{\eta _j}^{\,\rho _j(n)} \mathclose {})_{j\in [\ell ]}$ . If the tuple $\mathopen {}( T_{\eta _j}^{\,\rho _j(n)} \mathclose {})_{j\in [\ell ]}$ of type w is controllable, we will invoke seminorm control for tuples of type $w'$ satisfying $w^{\prime }_t = w_t$ for $t\in [\kappa ]$ and $w^{\prime }_{\kappa +1}>w_{\kappa +1}>0$ for some $\kappa \in \{0, \ldots , K_2-1\}$ in the ping step of the smoothing argument. Specifically, the new type $w'$ is obtained from w by the type operation defined in equation (45). In the pong step of the smoothing argument, we will use seminorm control for tuples of length $\ell -1$ . If the tuple $\mathopen {}( T_{\eta _j}^{\,\rho _j(n)} \mathclose {})_{j\in [\ell ]}$ is uncontrollable, we will invoke seminorm control for tuples of type $w'$ satisfying $w^{\prime }_t = w_t$ for $t\in [\kappa ]$ and $w^{\prime }_{\kappa +1}=0<w_{\kappa +1}$ for some $\kappa \in \{0, \ldots , K_2-1\}$ ; the details are given in Proposition 6.8(v). The way in which we apply seminorm control for tuples of lower type motivates the choice of our somewhat weird ordering on types.

Reducing to tuples of lower type this way and noting that the tuples of length $\ell $ can have at most $(\ell +1)^{\ell }$ distinct types, we arrive after finitely many steps at tuples of basic type $w = (K_3, 0, \ldots , 0)$ , that is, those in which all the transformations come from the same class ${\mathfrak I}_1$ . Tuples of basic type will serve as the basis for our induction procedure. For instance, the tuple $(T_1^{n^2}, T_2^{n^2}, T_2^{n^2+n})$ from Example 1 has basic type $(3, 0)$ because it only involves the transformations $T_1, T_2$ whose indices belong to the set ${\mathfrak I}_1 = \{1, 2\}$ (corresponding to the polynomial $n^2$ ); however, the type $(2, 1)$ of $(T_1^{n^2}, T_2^{n^2}, T_3^{n^2+n})$ is not basic because this tuple involves both the transformations $T_1, T_2$ and the transformation $T_3$ with an index from the set ${\mathfrak I}_2 = \{3\}$ (corresponding to the polynomial $n^2 + n$ ).

6 Further maneuvers and obstructions for longer families

Having presented the general strategy for proving Theorem 1.1 for longer families, we move on to discuss in detail the specific maneuvers outlined in §5.2. In this section, we state and prove various partial results that give substance to the moves discussed in §5.2. We also discuss a number of obstructions that appear in the process and have to be overcome before we can give a complete proof of Theorem 1.1. All of the above are illustrated with examples that will hopefully make the abstract statements in this and the next section more comprehensible to the reader. We then move on in §7 to prove Theorem 1.1.

The plan for this section is as follows. In §6.1, we discuss how to obtain a tuple of lower type in the ping step of the seminorm smoothing argument for controllable tuples. Section 6.2 exhibits the necessity of assuming that the functions appearing in the averages from equation (40) have some invariance properties. In particular, we show by examples how these properties are essentially used to tackle tuples of basic type and to perform the pong step of the seminorm smoothing argument for controllable tuples. We also give details of the flipping procedure that relies on these invariance properties. Subsequently, we discuss in §6.3 how flipping can be used to reduce an uncontrollable tuple to a controllable tuple of a lower type. Finally, we combine the details of the aforementioned moves in §6.4 and show how we can reach a tuple of a basic type in a finite number of steps.

6.1 Reducing controllable tuples to tuples of lower type in the ping step

As explained in §5.2, in the ping part of the smoothing argument, we will replace the original tuple $\mathopen {}( T_{\eta _j}^{\,\rho _j(n)} \mathclose {})_{j\in [\ell ]}$ by a new tuple $\mathopen {}( T_{\eta ^{\prime }_j}^{\,\rho ^{\prime }_j(n)} \mathclose {})_{j\in [\ell ]}$ . The new indexing tuple $\eta '$ will be defined via the operation

(44)

$$ \begin{align} (\tau_{mi}\eta)_j := \begin{cases} \eta_j, &j\neq m,\\ \eta_i, &j = m, \end{cases} \end{align} $$

for some distinct values $m, i\in {\mathfrak L}$ . This indexing tuple corresponds to replacing the term $T_{\eta _m}^{\,\rho _m(n)}$ in $\mathopen {}( T_{\eta _j}^{\,\rho _j(n)} \mathclose {})_{j\in [\ell ]}$ by $T_{\eta _i}^{\,\rho ^{\prime }_m(n)}$ , and all the other terms $T_{\eta _j}^{\,\rho _j(n)}$ by $T_{\eta _j}^{\,\rho ^{\prime }_j(n)}$ . The new tuple $\mathopen {}( T_{\eta ^{\prime }_j}^{\,\rho ^{\prime }_j(n)} \mathclose {})_{j\in [\ell ]}$ has to be chosen carefully: it must preserve the good ergodicity property and allow for seminorm control. Lastly, it must have a lower type. For this reason, we let $\operatorname {\mathrm {Supp}}(w):= \{t\in [K_2]: w_t>0\}$ , and if there exist distinct integers $t_1,t_2\in \operatorname {\mathrm {Supp}}(w)$ , we define the type operation $\sigma _{t_1 t_2}w$ by the formula

(45)

$$ \begin{align} (\sigma_{t_1 t_2}w)_t := \begin{cases} w_t, &t \neq t_1, t_2,\\ w_{t_1}-1, &t = t_1,\\ w_{t_2} + 1, &t = t_2. \end{cases}. \end{align} $$

For instance, $\sigma _{32}(3, 2, 2) = (3, 3, 1)$ . As a consequence of our ordering on types, we have $\sigma _{t_1 t_2}w < w$ whenever $t_2 < t_1$ (the assumption $w_{t_2}>0$ is crucial here), so in particular $(3, 3, 1) < (3, 2, 2)$ .
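The two operations from equations (44) and (45) are simple enough to state as code. The sketch below uses the hypothetical function names `tau` and `sigma`, with 1-based indices as in the text, and checks them on the instance above as well as on the passage from the tuple in equation (38) to the tuple in equation (39).

```python
def tau(m, i, eta):
    """Indexing operation (44): replace eta_m by eta_i, keep the rest."""
    if m == i:
        raise ValueError("m and i must be distinct")
    new_eta = list(eta)
    new_eta[m - 1] = eta[i - 1]
    return tuple(new_eta)


def sigma(t1, t2, w):
    """Type operation (45): move one unit from coordinate t1 to coordinate t2."""
    if t1 == t2 or w[t1 - 1] == 0 or w[t2 - 1] == 0:
        raise ValueError("t1, t2 must be distinct elements of Supp(w)")
    new_w = list(w)
    new_w[t1 - 1] -= 1
    new_w[t2 - 1] += 1
    return tuple(new_w)


example_type = sigma(3, 2, (3, 2, 2))
eta_39 = tau(5, 1, (1, 2, 3, 4, 5, 1, 5, 5))  # replace T_5 at index 5 by T_1
type_39 = sigma(2, 1, (5, 3, 0))
```

Here `tau(5, 1, ...)` turns the indexing tuple of equation (38) into that of equation (39), and `sigma(2, 1, ...)` performs the matching change of type from $(5,3,0)$ to $(6,2,0)$ .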

Proposition 6.1, which we are about to state now, specifies how these tuples of lower type are picked, what form they take, and what properties they enjoy. It will be used in our smoothing argument in Proposition 7.5 in that the tuple of lower type for which we invoke the induction hypothesis in the ping step is constructed in Proposition 6.1.

Proposition 6.1. (Type reduction for controllable tuples)

Let $\ell \in {\mathbb N}$ , $\eta \in [\ell ]^{\ell }$ be an indexing tuple and $\rho _1, \ldots , \rho _{\ell }\in {\mathbb Z}[n]$ be polynomials with leading coefficients $b_1, \ldots , b_{\ell }$ . Let also $(X, {\mathcal X}, \mu , T_1, \ldots , T_{\ell })$ be a system and $\mathopen {}( T_{\eta _j}^{\,\rho _j(n)} \mathclose {})_{j\in [\ell ]}$ be a tuple of a non-basic type w whose last non-zero index is $t_w$ . Suppose that $\mathopen {}( T_{\eta _j}^{\,\rho _j(n)} \mathclose {})_{j\in [\ell ]}$ is controllable. Then there exists $\unicode{x3bb} \in {\mathbb N}$ such that for every $r\in \{0, \ldots , \unicode{x3bb} -1\}$ , there exist an index $t^{\prime }_w\in \operatorname {\mathrm {Supp}}(w)$ distinct from $t_w$ , an index $i\in [\ell ]$ with $\eta _i\in {\mathfrak I}_{t^{\prime }_w}$ , and a tuple $\mathopen {}( T_{\eta ^{\prime }_j}^{\,\rho ^{\prime }_j(n)} \mathclose {})_{j\in [\ell ]}$ satisfying the following properties.

(i) The type $w'$ of the tuple $\mathopen {}( T_{\eta ^{\prime }_j}^{\,\rho ^{\prime }_j(n)} \mathclose {})_{j\in [\ell ]}$ satisfies $w' = \sigma _{t_w t^{\prime }_w}w < w$ .
(ii) The indexing tuple $\eta '$ is given by $\eta ' := \tau _{m i} \eta $ for some $m\in [\ell ]$ satisfying the controllability condition (recall that $\tau _{mi}$ is defined in equation (44)).
(iii) The polynomials $\rho ^{\prime }_{1}, \ldots , \rho ^{\prime }_{\ell }$ have integer coefficients and zero constant terms, and they take the form
$$ \begin{align*} \rho^{\prime}_{j}(n) := \begin{cases} \rho_{j}(\unicode{x3bb} n + r) - \rho_{j}(r), &j\neq m,\\[4pt] \dfrac{b_{i}}{b_{m}}(\rho_j(\unicode{x3bb} n + r) - \rho_j(r)), &j= m. \end{cases} \end{align*} $$

We remark that when the leading coefficients of $\rho _1, \ldots , \rho _{\ell }$ are 1, then $\rho ^{\prime }_j = \rho _j$ for every $j\in [\ell ]$ , so the property (iii) becomes trivial.

Proof. Let $t_w$ be the last non-zero index of the type w of $\mathopen {}( T_{\eta _j}^{\,\rho _j(n)} \mathclose {})_{j\in [\ell ]}$ . By the controllability of the tuple, there exists $m\in [\ell ]$ with $\eta _m\in {\mathfrak I}_{t_w}$ such that $\eta _{m'}=\eta _m$ implies that $\rho _{m'}$ and $\rho _m$ are distinct. Let $t^{\prime }_w\in \operatorname {\mathrm {Supp}}(w)$ be an index different from $t_w$ (it exists since the type w is non-basic) and $i\in [\ell ]$ be an index with $\eta _i\in {\mathfrak I}_{t^{\prime }_w}$ . We define $\eta ':=\tau _{mi}\eta $ , meaning that we replace $T_{\eta _m}$ by $T_{\eta _i}$ and keep the other transformations the same.

We let $\unicode{x3bb} \in {\mathbb N}$ be the smallest number for which $({\unicode{x3bb} }/{b_m})\rho _m\in {\mathbb Z}[n]$ (equivalently, $\unicode{x3bb} $ is the smallest number such that $b_m$ divides the coefficients of the polynomials $\rho _m(\unicode{x3bb} n +~r) - \rho _m(r)$ for $r\in {\mathbb Z}$ ). We also fix an arbitrary $r\in \{0, \ldots , \unicode{x3bb} -1\}$ . We then define the new polynomials $\rho ^{\prime }_1, \ldots , \rho ^{\prime }_{\ell }$ by the formula

$$ \begin{align*} \rho^{\prime}_{j}(n) := \begin{cases} \rho_{j}(\unicode{x3bb} n + r) - \rho_{j}(r), &j\neq m,\\[4pt] \dfrac{b_{i}}{b_{m}}(\rho_j(\unicode{x3bb} n + r) - \rho_j(r)), &j= m. \end{cases} \end{align*} $$

The new polynomials are in ${\mathbb Z}[n]$ by the choice of $\unicode{x3bb} $ , and it is not hard to check that they have zero constant terms. Lastly, the new tuple $\mathopen {}( T_{\eta ^{\prime }_j}^{\,\rho ^{\prime }_j(n)} \mathclose {})_{j\in [\ell ]}$ has the type $w' = \sigma _{t_w t^{\prime }_w}w$ , which is strictly smaller than w because $t^{\prime }_w<t_w$ (which follows from ${t^{\prime }_w\neq t_w}$ and the fact that $t_w$ is the last non-zero index).

Example 5. (Examples of type reduction)

We show how Proposition 6.1 has been implicitly applied to the two tuples from §4.

(i) The tuple $(T_1^{n^2}, T_2^{n^2}, T_3^{n^2+n})$ discussed at length in §4 has type $(2, 1)$ corresponding to the partition ${\mathfrak I}_1 = \{1,2\}, {\mathfrak I}_2 = \{3\}$ , and its good ergodicity property means that ${\mathcal I}(T_1 T_2^{-1})={\mathcal I}(T_1)\cap {\mathcal I}(T_2)$ , that is, the only $T_1 T_2^{-1}$ -invariant functions are those invariant simultaneously under $T_1$ and $T_2$ . In the proof of Proposition 4.2, we applied the type reduction once (with the operation $\tau _{32}$ ) to obtain the new tuple $(T_1^{n^2}, T_2^{n^2}, T_2^{n^2+n})$ with indexing tuple $(1,2,2)$ and basic type $(3,0)$ .
(ii) The tuple $(T_1^{n^2}, T_2^{n^2}, T_3^{2n^2+n})$ presented at the end of §4 also has type $(2, 1)$ corresponding to the partition ${\mathfrak I}_1 = \{1,2\}, {\mathfrak I}_2 = \{3\}$ , and its good ergodicity property also states that ${\mathcal I}(T_1 T_2^{-1})={\mathcal I}(T_1)\cap {\mathcal I}(T_2)$ . In the ping step of the smoothing argument, we obtained (upon taking $r_0 = 1$ ) the new tuple $(T_1^{4n^2+4n}, T_2^{4n^2+4n}, T_2^{4n^2+5n})$ by performing the operation $\tau _{32}$ . This new tuple also has the indexing tuple $(1,2,2)$ and basic type $(3,0)$ , and its ergodicity property is the same as for the original tuple.
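The polynomials in Example 5 can be recomputed from the formula in Proposition 6.1(iii). The following is a minimal sketch under our own conventions: polynomials are lists of integer coefficients with the constant term first and a non-zero last entry, and the names `shift` and `reduce_polynomials` are hypothetical. The value of $\unicode{x3bb} $ is found by direct search following the description in the proof.

```python
from fractions import Fraction


def shift(coeffs, lam, r):
    """Coefficients (constant term first) of rho(lam*n + r) - rho(r)."""
    result = [0]
    for c in reversed(coeffs):
        # Horner step: result <- result * (lam*n + r) + c
        step = [0] * (len(result) + 1)
        for k, a in enumerate(result):
            step[k] += a * r
            step[k + 1] += a * lam
        step[0] += c
        result = step
    result[0] = 0  # the constant term of rho(lam*n + r) equals rho(r)
    while len(result) > 1 and result[-1] == 0:
        result.pop()
    return result


def reduce_polynomials(rhos, m, i, r):
    """Return (lam, [rho'_1, ..., rho'_l]); the indices m and i are 1-based."""
    b_m, b_i = rhos[m - 1][-1], rhos[i - 1][-1]  # leading coefficients
    lam = 1
    while any((lam * c) % b_m != 0 for c in rhos[m - 1]):
        lam += 1  # smallest lam with (lam / b_m) * rho_m in Z[n]
    if not 0 <= r < lam:
        raise ValueError("r must lie in {0, ..., lam - 1}")
    new_rhos = []
    for j, rho in enumerate(rhos, start=1):
        shifted = shift(rho, lam, r)
        if j == m:
            scaled = [Fraction(b_i, b_m) * c for c in shifted]
            assert all(c.denominator == 1 for c in scaled)
            shifted = [int(c) for c in scaled]
        new_rhos.append(shifted)
    return lam, new_rhos


# Example 5(ii): rho = (n^2, n^2, 2n^2 + n), operation tau_32, r = 1
lam, new_rhos = reduce_polynomials([[0, 0, 1], [0, 0, 1], [0, 1, 2]], m=3, i=2, r=1)
```

With these inputs the search gives $\unicode{x3bb} = 2$ and the coefficient lists of $4n^2+4n$ , $4n^2+4n$ , and $4n^2+5n$ .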

Definition. (Descendants)

Let $p_1, \ldots , p_{\ell }\in {\mathbb Z}[n]$ be polynomials with leading coefficients $a_1, \ldots , a_{\ell }$ and $\eta \in [\ell ]^{\ell }$ be an indexing tuple. We say that the polynomials $\rho _1, \ldots , \rho _{\ell }$ are descendants of $p_1, \ldots , p_{\ell }$ along $\eta $ if there exist $\unicode{x3bb} \in {\mathbb N}$ and ${r\in \{0, \ldots , \unicode{x3bb} -1\}}$ such that $\rho _j(n) = ({a_{\eta _j}}/{a_j})(p_j(\unicode{x3bb} n+r) - p_j(r))$ for every $j\in [\ell ]$ . If this is the case, we also say that the tuple $\mathopen {}( T_{\eta _j}^{\,\rho _j(n)} \mathclose {})_{j\in [\ell ]}$ is a descendant of the tuple $\mathopen {}( T_j^{p_j(n)} \mathclose {})_{j\in [\ell ]}$ .

Descendancy enjoys the following transitivity property.

Lemma 6.2. (Descendancy is transitive)

Suppose that the polynomials $\rho _1, \ldots , \rho _{\ell } \in {\mathbb Z}[n]$ are descendants of $p_1, \ldots , p_{\ell }\in {\mathbb Z}[n]$ along $\eta $ , and let $\rho ^{\prime }_j(n) := ({a_{\eta ^{\prime }_j}}/{a_{\eta _j}}) (\rho _j(\unicode{x3bb} ' n + r') - \rho _j(r'))$ for all $j\in [\ell ]$ , where $\eta '\in [\ell ]^{\ell }$ , $\unicode{x3bb} '\in {\mathbb N}$ , $r'\in \{0, \ldots , \unicode{x3bb} '-1\}$ and $a_1, \ldots , a_{\ell }$ are the leading coefficients of $p_1, \ldots , p_{\ell }$ . Then $\rho ^{\prime }_1, \ldots , \rho ^{\prime }_{\ell }$ are descendants of $p_1, \ldots , p_{\ell }$ along $\eta '$ .

Proof. Let $\unicode{x3bb} \in {\mathbb N}$ and $r\in \{0, \ldots , \unicode{x3bb} -1\}$ be such that $\rho _j(n) = {a_{\eta _j}}/{a_j}(p_j(\unicode{x3bb} n+r) - p_j(r))$ . Then a direct computation gives that

$$ \begin{align*} \rho^{\prime}_j(n) = \frac{a_{\eta^{\prime}_j}}{a_j}(p_j(\unicode{x3bb}\unicode{x3bb}' n + \unicode{x3bb} r' + r) - p_j(\unicode{x3bb} r' + r)), \end{align*} $$

giving the claim.
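For completeness, the direct computation in the proof of Lemma 6.2 can be sketched as follows; this is our own expansion of the omitted step, and the final bound uses $0\leq r'\leq \unicode{x3bb} '-1$ . Substituting the formula for $\rho _j$ , the two terms $p_j(r)$ cancel, so that

$$ \begin{align*} \rho_j(\unicode{x3bb}' n + r') - \rho_j(r') &= \frac{a_{\eta_j}}{a_j}\bigl(p_j(\unicode{x3bb}(\unicode{x3bb}' n + r') + r) - p_j(r)\bigr) - \frac{a_{\eta_j}}{a_j}\bigl(p_j(\unicode{x3bb} r' + r) - p_j(r)\bigr)\\ &= \frac{a_{\eta_j}}{a_j}\bigl(p_j(\unicode{x3bb}\unicode{x3bb}' n + \unicode{x3bb} r' + r) - p_j(\unicode{x3bb} r' + r)\bigr). \end{align*} $$

Multiplying both sides by $a_{\eta ^{\prime }_j}/a_{\eta _j}$ gives the displayed formula for $\rho ^{\prime }_j$ . Moreover, $0\leq \unicode{x3bb} r' + r \leq \unicode{x3bb} (\unicode{x3bb} '-1) + (\unicode{x3bb} -1) = \unicode{x3bb} \unicode{x3bb} ' - 1$ , so the pair $(\unicode{x3bb} \unicode{x3bb} ', \unicode{x3bb} r' + r)$ is admissible in the definition of descendants.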

In particular, we get the following corollary of interest to us that follows from a straightforward combination of Proposition 6.1 and Lemma 6.2.

Corollary 6.3. (Type reduction preserves descendancy)

Let $\ell \in {\mathbb N}$ , $\eta \in [\ell ]^{\ell }$ be an indexing tuple, $p_1, \ldots , p_{\ell }, \rho _1, \ldots , \rho _{\ell }\kern1.2pt{\in}\kern1.2pt {\mathbb Z}[n]$ be polynomials, and $(X, {\mathcal X}, \mu , T_1, \ldots , T_{\ell })$ be a system. Suppose that $\mathopen {}( T_{\eta _j}^{\,\rho _j(n)} \mathclose {})_{j\in [\ell ]}$ is a tuple of a non-basic type w that is a descendant of $\mathopen {}( T_j^{p_j(n)} \mathclose {})_{j\in [\ell ]}.$ Then the tuple $\mathopen {}( T_{\eta ^{\prime }_j}^{\,\rho ^{\prime }_j(n)} \mathclose {})_{j\in [\ell ]}$ constructed from $\mathopen {}( T_{\eta _j}^{\,\rho _j(n)} \mathclose {})_{j\in [\ell ]}$ in Proposition 6.1 is also a descendant of $\mathopen {}( T_j^{p_j(n)} \mathclose {})_{j\in [\ell ]}.$

Proof. By assumption, $\rho _1, \ldots , \rho _{\ell }$ are descendants of $p_1, \ldots , p_{\ell }$ along $\eta $ . Letting $a_j, b_j$ be the leading coefficients of $p_j, \rho _j$ , respectively, and $\rho ^{\prime }_j$ be as defined in Proposition 6.1, we have $b_j = a_{\eta _j} (\unicode{x3bb} ')^{d_j}$ for some $\unicode{x3bb} '\in {\mathbb N}$ , where $d_j := \deg p_j = \deg \rho _j = \deg \rho ^{\prime }_j$ . Thus,

$$ \begin{align*}\rho^{\prime}_j(n) = \rho_j(\unicode{x3bb} n + r) - \rho_j(r) =\frac{a_{\eta^{\prime}_j}}{a_{\eta_j}}(\rho_j(\unicode{x3bb} n + r) - \rho_j(r))\end{align*} $$

for $j\neq m$ and

$$ \begin{align*}\rho^{\prime}_j(n) &= \frac{b_i}{b_m}(\rho_j(\unicode{x3bb} n + r) - \rho_j(r))\\ &= \frac{a_{\eta_i}}{a_{\eta_m}}(\rho_j(\unicode{x3bb} n + r) - \rho_j(r)) = \frac{a_{\eta_m'}}{a_{\eta_m}}(\rho_j(\unicode{x3bb} n + r) - \rho_j(r))\end{align*} $$

for $j = m$ , where we use $\eta ^{\prime }_m = \eta _i$ and $\eta ^{\prime }_j = \eta _j$ for $j\neq m$ . Hence, the polynomials $\rho ^{\prime }_1, \ldots , \rho ^{\prime }_{\ell }$ satisfy the condition of Lemma 6.2, implying the claim.

Descendant tuples enjoy the following important properties.

Proposition 6.4. (Properties of descendants)

Let $\ell \in {\mathbb N}$ , $\eta \in [\ell ]^{\ell }$ be an indexing tuple, $p_1, \ldots , p_{\ell }, \rho _1, \ldots , \rho _{\ell }\in {\mathbb Z}[n]$ be polynomials, and $(X, {\mathcal X}, \mu , T_1, \ldots , T_{\ell })$ be a system. Suppose that $\mathopen {}( T_j^{p_j(n)} \mathclose {})_{j\in [\ell ]}$ has the good ergodicity property and $\mathopen {}( T_{\eta _j}^{\,\rho _j(n)} \mathclose {})_{j\in [\ell ]}$ is its descendant. Then the following hold.

(i) We have
$$ \begin{align*} {\mathfrak L}(\rho_1, \ldots, \rho_{\ell}) &= {\mathfrak L}(p_1, \ldots, p_{\ell}), \\ K_i(\rho_1, \ldots, \rho_{\ell}) &=K_i(p_1, \ldots, p_{\ell}),\quad i\in[3],\\ {\mathfrak I}_t(\rho_1, \ldots, \rho_{\ell})&={\mathfrak I}_t(p_1, \ldots, p_{\ell}), \quad t\in[K_1]. \end{align*} $$
(ii) The tuple $\mathopen {}( T_{\eta _j}^{\,\rho _j(n)} \mathclose {})_{j\in [\ell ]}$ has the good ergodicity property.

Property (i) ensures that when passing to descendants, we do not need to redefine the partition ${\mathfrak I}_1, \ldots , {\mathfrak I}_{K_1}$ . Property (ii) is crucial because it shows that descendants retain the essential ergodicity properties of the original tuple.

Proof. Part (i) follows from the fact that for every $j\in [\ell ]$ , the polynomials $p_j$ and $\rho _j$ have the same degree, and that $p_{j_1}, p_{j_2}$ are linearly dependent if and only if $\rho _{j_1}, \rho _{j_2}$ are. We therefore move on to proving part (ii). Let $b_j$ be the leading coefficient of $\rho _j$ , $a_j$ be the leading coefficient of $p_j$ , and $d_j:= \deg p_j = \deg \rho _j$ for every $j\in [\ell ]$ . To check that the tuple $\mathopen {}( T_{\eta _j}^{\,\rho _j(n)} \mathclose {})_{j\in [\ell ]}$ has the good ergodicity property, we need to show that if $\eta _{j_1}, \eta _{j_2}$ are distinct elements of the same set ${\mathfrak I}_{t}$ , then

(46)

$$ \begin{align} {\mathcal I}(T_{\eta_{j_1}}^{\beta_{j_1}} T_{\eta_{j_2}}^{-\beta_{j_2}})={\mathcal I}(T_{\eta_{j_1}})\cap{\mathcal I}(T_{\eta_{j_2}}), \end{align} $$

where

$$ \begin{align*} \beta_{j} := b_{j}/\gcd(b_{j_1}, b_{j_2}) \end{align*} $$

for $j = j_1, j_2$. By construction, $b_j = a_{\eta _j}\lambda ^{d_j}$ for some $\lambda \in {\mathbb N}$, and so

(47)

$$ \begin{align} \beta_{j} = a_{\eta_j}/\gcd(a_{\eta_{j_1}}, a_{\eta_{j_2}})=:\alpha_{\eta_j} \end{align} $$

for $j= j_1, j_2$ . The assumption $\eta _{j_1}, \eta _{j_2}\in {\mathfrak I}_t$ for some fixed t implies that $p_{\eta _{j_1}}, p_{\eta _{j_2}}$ are linearly dependent, and additionally $p_{\eta _{j_1}}/\alpha _{\eta _{j_1}} = p_{\eta _{j_2}}/\alpha _{\eta _{j_2}}$ . Since $\alpha _{\eta _{j_1}}, \alpha _{\eta _{j_2}}$ are coprime, the good ergodicity property of $\mathopen {}( T_j^{p_j(n)} \mathclose {})_{j\in [\ell ]}$ implies that

$$ \begin{align*} {\mathcal I}(T_{\eta_{j_1}}^{\alpha_{\eta_{j_1}}} T_{\eta_{j_2}}^{-\alpha_{\eta_{j_2}}})={\mathcal I}(T_{\eta_{j_1}})\cap{\mathcal I}(T_{\eta_{j_2}}). \end{align*} $$

The equality in equation (46) follows from this and the identification in equation (47).
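The identification in equation (47) rests on a short computation, which can be spelled out as follows. This is only a sketch, and it assumes (a hypothesis not stated explicitly above) that $d_{j_1} = d_{j_2} =: d$, so that the same power of $\lambda$ appears in both leading coefficients:

$$ \begin{align*} \gcd(b_{j_1}, b_{j_2}) = \gcd(a_{\eta_{j_1}}\lambda^{d}, a_{\eta_{j_2}}\lambda^{d}) = \lambda^{d}\gcd(a_{\eta_{j_1}}, a_{\eta_{j_2}}), \qquad \textrm{so}\qquad \beta_{j} = \frac{a_{\eta_j}\lambda^{d}}{\lambda^{d}\gcd(a_{\eta_{j_1}}, a_{\eta_{j_2}})} = \alpha_{\eta_j}. \end{align*} $$

For instance, if $a_{\eta_{j_1}} = 1$, $a_{\eta_{j_2}} = 2$, and $\lambda^{d} = 16$, then $b_{j_1} = 16$, $b_{j_2} = 32$, $\gcd(b_{j_1}, b_{j_2}) = 16$, and $(\beta_{j_1}, \beta_{j_2}) = (1, 2) = (\alpha_{\eta_{j_1}}, \alpha_{\eta_{j_2}})$.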

Example 6. (Type reduction for non-monic polynomials)

We present one more example to show how Proposition 6.1 is applied iteratively for more complicated tuples, and how properties listed in Proposition 6.4 are retained when passing to lower-type descendant tuples. Consider the tuple

(48)

$$ \begin{align} (T_1^{n^2}, T_2^{3n^2}, T_3^{2n^2}, T_4^{2n^2+n}, T_5^{n^2+n}, T_6^{n^2+n}, T_7^n), \end{align} $$

and assume that it has the good ergodicity property, that is,

$$ \begin{align*} {\mathcal I}(T_1 T_2^{-3}) = {\mathcal I}(T_1)\cap{\mathcal I}(T_2),\quad {\mathcal I}(T_1 T_3^{-2}) = {\mathcal I}(T_1)\cap{\mathcal I}(T_3),\\ {\mathcal I}(T_2^3 T_3^{-2}) = {\mathcal I}(T_2)\cap{\mathcal I}(T_3),\quad {\mathcal I}(T_5 T_6^{-1}) = {\mathcal I}(T_5)\cap{\mathcal I}(T_6). \end{align*} $$

This tuple has length 7, degree 2, $K_1 = 4$, $K_2 = 3$, $K_3=6$, and ${\mathfrak L} = \{1, 2, 3, 4, 5, 6\}$. If we define the partition ${\mathfrak I}_1=\{1, 2, 3\}$, ${\mathfrak I}_2 = \{5,6\}$, ${\mathfrak I}_3 = \{4\}$, ${\mathfrak I}_4 = \{7\}$, then the tuple has type $w_0 = (3,2, 1)$; we recall that the term $T_7^n$ plays no part in the type consideration since the polynomial $\rho _{07}(n)=n$ has a lower degree. (Perhaps a more natural way to define the partition would be to have ${\mathfrak I}_1=\{1, 2, 3\}$, ${\mathfrak I}_2 = \{4\}$, ${\mathfrak I}_3 = \{5,6\}$, ${\mathfrak I}_4 = \{7\}$. However, then the tuple would have type $(3, 1, 2)$, and reducing to the tuple of basic type by iteratively applying Proposition 6.1 would take more steps. This shows that choosing the partition strategically can save on the number of iterations of Proposition 6.1 needed to reach a tuple of basic type.) The tuple in equation (48) has the basic indexing tuple $\eta _0 = (1, 2, 3, 4, 5, 6,7)$. The tuple is controllable, and 4 satisfies the controllability condition, so in the first step, we replace $T_4$, that is, we take $m_0 = 4$ (this corresponds to us wanting to first get a $T_4$-seminorm control over the tuple in equation (48)). We are then provided with an index $i_0\in {\mathfrak I}_1\cup {\mathfrak I}_2$ (say, $i_0 = 1$), and we get the new indexing tuple

$$ \begin{align*} \eta_1 := \tau_{m_0 i_0} \eta_0 = \tau_{41}\eta_0 = (1, 2, 3, 1, 5, 6, 7). \end{align*} $$

The leading coefficient 2 of $2n^2+n$ does not divide the linear coefficient, and the smallest $\lambda _0\in {\mathbb N}$ such that $2$ divides the coefficients of $\lambda _0(2n^2+n)$ is $\lambda _0 = 2$. In performing the ping step of the seminorm smoothing argument for the tuple in equation (48), we will want to apply the $T_1 T_4^{-2}$-invariance of some function $u$ to replace $T_4^{2n^2+n}u$ by $T_1^{q(n)}u'$ for some $q\in {\mathbb Z}[n]$ and a function $u'$ related in some way to $u$. We cannot do this directly since $\tfrac 12(2n^2+n)\notin {\mathbb Z}[n]$, but we can do this ‘piecewise’ by splitting ${\mathbb N}$ into arithmetic progressions $(2{\mathbb N} + r)_{r=0, 1}$ and considering the two cases separately (see the sketch of the seminorm smoothing argument for $n^2, n^2, 2n^2+n$ at the end of §4 to see how this was done for that family). We therefore replace the original polynomials $\rho _{01}, \ldots , \rho _{07}$ by new polynomials

$$ \begin{align*} \rho_{1j}(n) := \begin{cases} \rho_{0j}(2 n + r_0) - \rho_{0j}(r_0), &j \neq 4,\\ \frac{1}{2}(\rho_{04}(2 n + r_0) - \rho_{04}(r_0)), &j =4, \end{cases} \end{align*} $$

for some $r_0\in \{0,1\}$ (the choice of $r_0$ is not ours). Assuming that $r_0=1$ , we obtain the new tuple

(49)

$$ \begin{align} (T_1^{4n^2 + 4n}, T_2^{12n^2+12n}, T_3^{8n^2+8n}, T_1^{4n^2+5n}, T_5^{4n^2+6n}, T_6^{4n^2+6n}, T_7^{2n}). \end{align} $$
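The exponents in equation (49) can be checked mechanically. The following is a minimal sketch, not part of the original argument: the helper `substitute` and the coefficient-list representation of polynomials are illustrative choices made here, and the values $\lambda_0 = 2$, $r_0 = 1$, and the factor $\tfrac12$ at index 4 are taken from the text above.

```python
from fractions import Fraction
from math import comb

def substitute(poly, lam, r, scale=Fraction(1)):
    """Coefficients of scale * (poly(lam*n + r) - poly(r)).

    A polynomial is a list [c0, c1, c2, ...] standing for c0 + c1*n + c2*n^2 + ...
    (a hypothetical representation chosen for this sketch).
    """
    out = [Fraction(0)] * len(poly)
    for k, c in enumerate(poly):
        # Binomial expansion of c * (lam*n + r)^k.
        for i in range(k + 1):
            out[i] += c * comb(k, i) * lam**i * r**(k - i)
    out[0] -= sum(c * r**k for k, c in enumerate(poly))  # subtract poly(r)
    return [scale * c for c in out]

# Exponents of the tuple in equation (48), indexed by j = 1, ..., 7.
rho0 = {1: [0, 0, 1], 2: [0, 0, 3], 3: [0, 0, 2], 4: [0, 1, 2],
        5: [0, 1, 1], 6: [0, 1, 1], 7: [0, 1]}

# lambda_0 = 2 and r_0 = 1; only the replaced index 4 is rescaled by 1/2.
rho1 = {j: substitute(p, 2, 1, Fraction(1, 2) if j == 4 else Fraction(1))
        for j, p in rho0.items()}

for j, p in rho1.items():
    print(j, [int(c) for c in p])  # coefficients [c0, c1, c2] of rho_{1j}
```

Each printed list can be compared against the corresponding exponent in equation (49); for example, index 4 gives the coefficients of $4n^2+5n$.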

The type of the new tuple is $w_1 = \sigma _{31}w_0 = (4, 2,0)$ since we now have four transformations with indices coming from ${\mathfrak I}_1$ and two transformations coming from ${\mathfrak I}_2$ . This type is lower than the original type $w_0$ , and so we have successfully obtained a tuple of lower type. The new tuple is controllable, with $m=5,6$ both satisfying the controllability condition.

Although we replaced the polynomials $\rho _{01}, \ldots , \rho _{07}$ by new ones, we note that for any $j_1, j_2\in [7]$, the polynomials $\rho _{1j_1}, \rho _{1j_2}$ are linearly dependent if and only if $\rho _{0j_1}, \rho _{0j_2}$ are; in fact, if they are linearly dependent, then $\rho _{1j_1}/c_1 = \rho _{1j_2}/c_2$ if and only if $\rho _{0j_1}/c_1 = \rho _{0j_2}/c_2$ for any non-zero integers $c_1, c_2$. Moreover, if $\eta _{1j_1} = \eta _{1j_2}$, then the leading coefficients of $\rho _{1j_1}$ and $\rho _{1j_2}$ are identical. These two observations ensure that the ergodicity conditions on $T_1 T_2^{-3}, T_1 T_3^{-2}, T_2^3 T_3^{-2}, T_5 T_6^{-1}$, which constitute the assumption that the original tuple in equation (48) has the good ergodicity property, carry over to the new tuple in equation (49), implying that it also enjoys the good ergodicity property. This exemplifies the claim from Proposition 6.4 that descendants of tuples with the good ergodicity property inherit the property.
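The first observation can be sanity-checked on the explicit polynomials from equations (48) and (49). The sketch below is illustrative only: the function `dependence_classes` and the coefficient-list encoding are assumptions of this example, not notation from the paper. It groups indices whose polynomials are rational multiples of one another and compares the resulting partitions.

```python
from fractions import Fraction

def dependence_classes(polys):
    """Group indices whose polynomials are rational multiples of each other.

    polys maps an index j to a coefficient list [c0, c1, ...] with non-zero
    last entry (hypothetical encoding used only in this sketch).
    """
    classes = {}
    for j, p in polys.items():
        lead = Fraction(p[-1])
        # Two polynomials are proportional iff their monic normalisations agree.
        key = (len(p), tuple(Fraction(c) / lead for c in p))
        classes.setdefault(key, []).append(j)
    return sorted(classes.values())

# Exponents in equation (48) and in equation (49), respectively.
rho0 = {1: [0, 0, 1], 2: [0, 0, 3], 3: [0, 0, 2], 4: [0, 1, 2],
        5: [0, 1, 1], 6: [0, 1, 1], 7: [0, 1]}
rho1 = {1: [0, 4, 4], 2: [0, 12, 12], 3: [0, 8, 8], 4: [0, 5, 4],
        5: [0, 6, 4], 6: [0, 6, 4], 7: [0, 2]}

print(dependence_classes(rho0))
print(dependence_classes(rho1))

# Within a class, the ratios of leading coefficients are also unchanged.
ratio0 = Fraction(rho0[2][-1], rho0[1][-1])
ratio1 = Fraction(rho1[2][-1], rho1[1][-1])
print(ratio0 == ratio1)
```

Both calls return the same partition of $[7]$, in line with part (i) of Proposition 6.4.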

The type $w_1$ is not basic, and so we continue the procedure. This time, we pick some $m_1\in {\mathfrak I}_2$ , say $m_1 = 5$ (it satisfies the controllability condition, as does 6, the other possible choice). We are then handed an index $i_1\in {\mathfrak I}_1$ (say, $i_1 = 3$ ), so that

$$ \begin{align*} \eta_2 := \tau_{m_1 i_1}\eta_1= \tau_{53}\eta_1 = (1, 2, 3, 1, 3, 6, 7). \end{align*} $$

The leading coefficient $4$ of $4n^2+6n$ does not divide the linear coefficient $6$, and so we replace the polynomials $\rho _{11}, \ldots , \rho _{17}$ by new polynomials of the form

$$ \begin{align*} \rho_{2j}(n) := \begin{cases} \rho_{1j}(2 n + r_1) - \rho_{1j}(r_1), &j \neq 5,\\ \frac{8}{4}(\rho_{15}(2 n + r_1) - \rho_{15}(r_1)), &j =5, \end{cases} \end{align*} $$

for some $r_1\in \{0,1\}$ (we pass from $n$ to $2n+r_1$ because 2 is the smallest natural number $\lambda _1$ such that the leading coefficient 4 of $\rho _{15}$ divides the coefficients of $\lambda _1 \rho _{15}$). Hence, the new tuple takes the form (upon assuming $r_1 = 0$)

(50)

$$ \begin{align} (T_1^{16n^2 + 8n}, T_2^{48n^2+24n}, T_3^{32n^2+16n}, T_1^{16n^2+10n}, T_3^{32n^2+24n}, T_6^{16n^2+12n}, T_7^{4n}). \end{align} $$
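The dilation factors used in the first two steps follow the rule stated in the text: take the smallest natural number $\lambda$ such that the leading coefficient divides every coefficient of $\lambda$ times the polynomial. A minimal sketch of that rule is given below; the function name `smallest_lambda` is hypothetical, and the code assumes a positive leading coefficient.

```python
from math import gcd

def smallest_lambda(poly):
    """Smallest natural lam such that the leading coefficient divides every
    coefficient of lam * poly, where poly = [c0, c1, ...] (assumed encoding)
    has a positive leading coefficient poly[-1]."""
    lead = poly[-1]
    lam = 1
    for c in poly:
        need = lead // gcd(lead, c)          # least factor t with lead | t*c
        lam = lam * need // gcd(lam, need)   # lcm of the individual requirements
    return lam

print(smallest_lambda([0, 1, 2]))  # 2n^2 + n, replaced in the first step
print(smallest_lambda([0, 6, 4]))  # 4n^2 + 6n, replaced in the second step
```

For the two polynomials above, the function returns the values $\lambda_0 = 2$ and $\lambda_1 = 2$ used in the example.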

The tuple in equation (50) still has the good ergodicity property; this is once again a consequence of two facts:

• for any $j_1, j_2\in [7]$ and non-zero $c_1, c_2\in {\mathbb Z}$ , we have $\rho _{2j_1}/c_1 = \rho _{2j_2}/c_2$ if and only if $\rho _{0j_1}/c_1 = \rho _{0j_2}/c_2$ ;
• for any $j_1, j_2\in [7]$ , if $\eta _{2j_1} = \eta _{2j_2}$ , then $\rho _{2j_1}, \rho _{2j_2}$ have identical leading coefficients.

We remark though that to ensure the ergodicity property of equation (50), we no longer need the original assumption ${\mathcal I}(T_5 T_6^{-1}) = {\mathcal I}(T_5)\cap {\mathcal I}(T_6)$ because the transformation $T_5$ is not present.

The new tuple in equation (50) has type $w_2 = \sigma _{21}w_1 = (5,1, 0)$ , which is still not basic, and so we continue the procedure one more time. The only index left in ${\mathfrak I}_2$ is $6$ , and it satisfies the controllability assumption, so we replace $T_6$ this time. We are given an index $i_2\in {\mathfrak I}_1$ (say, $i_2 = 1$ ), so that

$$ \begin{align*} \eta_3 := \tau_{m_2 i_2}\eta_2 = \tau_{61}\eta_2 = (1, 2, 3, 1, 3, 1, 7). \end{align*} $$

Since the leading coefficient 16 of $\rho _{21}$ does not divide all the coefficients of the polynomial $\rho _{26}(n) = 16n^2 + 12 n$, and the smallest $\lambda _2\in {\mathbb N}$ for which 16 divides the coefficients of $\lambda _2 \rho _{26}(n) = \lambda _2(16n^2+12n)$ is $\lambda _2 = 4$, we define the new polynomials to be

$$ \begin{align*} \rho_{3j}(n) := \begin{cases} \rho_{2j}(4 n + r_2) - \rho_{2j}(r_2), &j \neq 6,\\[4pt] \dfrac{16}{16}(\rho_{26}(4 n + r_2) - \rho_{26}(r_2)), &j =6, \end{cases} \end{align*} $$

for some $r_2 \in \{0, 1, 2, 3\}$ . Assuming, say, $r_2 = 3$ , we get the new tuple

$$ \begin{align*} (T_1^{256n^2 + 416n}, T_2^{768n^2+1248n}, T_3^{512n^2+832n}, T_1^{256n^2+424n}, T_3^{512n^2+864n}, T_1^{256n^2+432n}, T_7^{16n}). \end{align*} $$

This tuple has the basic type $w_3 = \sigma _{21}w_2 = (6,0,0)$, and so the procedure halts. A similar argument as before shows also that it enjoys the good ergodicity property.
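The bookkeeping of indexing tuples and types in this example can be replayed in a few lines. The sketch below is an illustration under stated assumptions: `tau` implements the rule $\eta'_m = \eta_i$, $\eta'_j = \eta_j$ for $j\neq m$ used in the proof of Corollary 6.3, and `type_of` counts, for each class ${\mathfrak I}_t$ of maximal degree, how many entries of the indexing tuple lie in it (this reading of the type is inferred from the example, not quoted from the formal definition).

```python
def tau(eta, m, i):
    """Indexing tuple with the m-th entry replaced by the i-th one (1-based)."""
    new = list(eta)
    new[m - 1] = eta[i - 1]
    return tuple(new)

def type_of(eta, classes):
    """Number of entries of eta falling in each class of maximal degree."""
    return tuple(sum(1 for e in eta if e in cls) for cls in classes)

# Classes I_1, I_2, I_3 of the degree-2 polynomials in equation (48).
classes = [{1, 2, 3}, {5, 6}, {4}]

eta0 = (1, 2, 3, 4, 5, 6, 7)
eta1 = tau(eta0, 4, 1)  # first step: replace T_4 using index 1
eta2 = tau(eta1, 5, 3)  # second step: replace T_5 using index 3
eta3 = tau(eta2, 6, 1)  # third step: replace T_6 using index 1

for eta in (eta0, eta1, eta2, eta3):
    print(eta, type_of(eta, classes))
```

The printed types decrease at every step and end at $(6,0,0)$, at which point all degree-2 entries of the indexing tuple lie in ${\mathfrak I}_1$.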

Lastly, we observe that $\eta _3|_{{\mathfrak I}_1} = \eta _3|_{\{1, 2, 3\}}$ is the identity, that is, while performing the type reduction procedure, we did not replace the transformations at indices from ${\mathfrak I}_1$. This is a special case of property (vi) from Proposition 6.8, which will play an important role in the proof of Proposition 7.2, a seminorm control argument for tuples of basic type.

6.2 The role of invariance properties

Proposition 6.4 ensures that the lower type tuples to which we pass in the ping step of the smoothing argument have the good ergodicity property. However, this is not enough. For more complicated tuples, we also need to assume that the functions appearing in the associated average have some invariance properties, otherwise the induction breaks. The example that we present now displays the necessity of this extra information. We sketch how, by reducing the original tuple to tuples of shorter length or lower type, we eventually arrive at averages for which we cannot obtain seminorm control unless the functions appearing in the averages satisfy certain invariance properties. We emphasize that our goal in this example is not to give a complete proof of seminorm control, but rather to point out the necessity of the invariance assumptions. Therefore, we assume without proof when convenient that we have seminorm control over certain tuples of lower type or shorter length.

Example 7. (The necessity of invariance properties)

Consider the average

(51)

$$ \begin{align} \mathop{\mathbb{E}}\limits_{n\in[N]}T_1^{n^2}f_1 \cdot T_2^{n^2}f_2 \cdot T_3^{n^2+n}f_3 \cdot T_4^{n^2 + n} f_4. \end{align} $$

It has length 4, degree 2, and type $w=(2,2)$ , corresponding to the partition ${{\mathfrak I}_1 = \{1, 2\}, {\mathfrak I}_2 = \{3, 4\}}$ . Suppose that equation (51) has the good ergodicity property, that is,

$$ \begin{align*} {\mathcal I}(T_1 T_2^{-1}) = {\mathcal I}(T_1)\cap {\mathcal I}(T_2)\quad \textrm{and}\quad {\mathcal I}(T_3 T_4^{-1}) = {\mathcal I}(T_3)\cap{\mathcal I}(T_4). \end{align*} $$

We illustrate the steps that need to be taken to show that this average is controlled by $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_4|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_4}$ for some $s\in {\mathbb N}$ .

By Proposition 3.7, the average from equation (51) is controlled by the seminorm $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_4|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{{\textbf {b}}_1}, \ldots , {{\textbf {b}}_{s+1}}}$ for some vectors

$$ \begin{align*} {\textbf{b}}_1, \ldots, {\textbf{b}}_{s+1}\in\{{\mathbf{e}}_4, {\mathbf{e}}_4-{\mathbf{e}}_3, {\mathbf{e}}_4-{\mathbf{e}}_2, {\mathbf{e}}_4-{\mathbf{e}}_1\}. \end{align*} $$

We want to replace the vector ${\textbf {b}}_{s+1}$ by (multiple copies of) ${\mathbf {e}}_4$ . Iterating this procedure gives a control of equation (51) by a $T_4$ -seminorm of $f_4$ .

If ${\textbf {b}}_{s+1}={\mathbf {e}}_4$ , there is nothing to check. If ${\textbf {b}}_{s+1} = {\mathbf {e}}_4-{\mathbf {e}}_3$ , then this is the consequence of the good ergodicity property of the average and Lemma 2.2. So the only cases to check are when ${\textbf {b}}_{s+1}$ equals ${\mathbf {e}}_4-{\mathbf {e}}_2$ or ${\mathbf {e}}_4-{\mathbf {e}}_1$ . Without loss of generality, we assume that ${\textbf {b}}_{s+1} = {\mathbf {e}}_4-{\mathbf {e}}_2$ .

Suppose that the limit of equation (51) is non-zero. Arguing as in the proof of Proposition 4.1, we deduce that

$$ \begin{align*} \liminf_{H\to\infty}\mathop{\mathbb{E}}\limits_{{\underline{h}},{\underline{h}}'\in [H]^{s}}\lim_{N\to\infty}\Big\Vert \mathop{\mathbb{E}}\limits_{n\in[N]} \, T_1^{n^2} f_{1, {\underline{h}}, {\underline{h}}'}\cdot T_2^{n^2} f_{2, {\underline{h}}, {\underline{h}}'}\cdot T_3^{n^2+n}f_{3, {\underline{h}}, {\underline{h}}'}\cdot T_4^{n^2+n}u_{{\underline{h}},{\underline{h}}'}\Big\Vert_{L^2(\mu)}>0 \end{align*} $$

for some $T_4 T_2^{-1}$ -invariant functions $u_{{\underline {h}}, {\underline {h}}'}$ as well as functions $f_{j, {\underline {h}},{\underline {h}}'} := \Delta _{{{\textbf {b}}_1}, \ldots , {{\textbf {b}}_{s}}; {\underline {h}}-{\underline {h}}'} f_j$ for $j\in [4]$ . The invariance property of the functions $u_{{\underline {h}}, {\underline {h}}'}$ implies that

(52)

$$ \begin{align} \liminf_{H\to\infty}\mathop{\mathbb{E}}\limits_{{\underline{h}},{\underline{h}}'\in [H]^{s}}\lim_{N\to\infty}\Big\Vert\! \mathop{\mathbb{E}}\limits_{n\in[N]} T_1^{n^2} f_{1, {\underline{h}}, {\underline{h}}'}\cdot T_2^{n^2} f_{2, {\underline{h}}, {\underline{h}}'}\cdot T_3^{n^2+n}f_{3, {\underline{h}}, {\underline{h}}'}\cdot T_2^{n^2+n}u_{{\underline{h}},{\underline{h}}'}\Big\Vert_{L^2(\mu)}>0. \end{align} $$

Each of the averages inside the liminf above is of the form

(53)

$$ \begin{align} \mathop{\mathbb{E}}\limits_{n\in[N]}T_1^{n^2} g_1\cdot T_2^{n^2} g_2\cdot T_3^{n^2+n}g_3\cdot T_2^{n^2+n}g_4 \end{align} $$

for 1-bounded functions $g_1, g_2, g_3, g_4\in L^{\infty }(\mu )$ of which $g_4$ is $T_4 T_2^{-1}$-invariant. The averages from equation (53) are controllable, with $3$ satisfying the controllability condition, and they have type $(3, 1)$, which is lower than the type $(2,2)$ of the original average from equation (51). Assuming inductively that we have the seminorm control of averages from equation (53) by a $T_3$-seminorm of $g_3$ (while we only use this particular control, our inductive assumption will guarantee that we control averages from equation (53) by a relevant seminorm of other functions, too), we can deduce from equation (52) (like in the proof of Proposition 4.2) that

$$ \begin{align*} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f_3|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{\textbf{b}}_1, \ldots, {\textbf{b}}_s, {\mathbf{e}}_3^{\times s_1}}>0 \end{align*} $$

for some $s_1\in {\mathbb N}$ , and similarly for other terms. This completes the ping step. For the pong step, this auxiliary control and Proposition 3.3 imply that

(54)

$$ \begin{align} &\liminf_{H\to\infty}\mathop{\mathbb{E}}\limits_{{\underline{h}},{\underline{h}}'\in [H]^{s}}\lim_{N\to\infty}\bigg\Vert \mathop{\mathbb{E}}\limits_{n\in[N]} \, T_1^{n^2} f_{1, {\underline{h}}, {\underline{h}}'}\cdot T_2^{n^2} f_{2, {\underline{h}}, {\underline{h}}'}\nonumber\\ &\quad\times \prod_{j=1}^{2^{s}}{\mathcal D}_{j}(n^2+n)\cdot T_4^{n^2+n}f_{4, {\underline{h}}, {\underline{h}}'}\bigg\Vert_{L^2(\mu)}>0 \end{align} $$

for some ${\mathcal D}_j\in {\mathfrak D}_{s_1}$ . Each average in equation (54) takes the form

(55)

$$ \begin{align} \mathop{\mathbb{E}}\limits_{n\in[N]} \, T_1^{n^2} g_1\cdot T_2^{n^2} g_2 \cdot \prod_{j=1}^{2^{s}}{\mathcal D}_{j}(n^2+n)\cdot T_4^{n^2+n} g_4. \end{align} $$

Assuming inductively that averages of the form in equation (55) are controlled by a $T_4$ -seminorm of the last term, we get the desired claim $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_4|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{\textbf {b}}_1, \ldots , {\textbf {b}}_s, {\mathbf {e}}_4^{\times s'}}>0$ for some $s'\in {\mathbb N}$ using an argument similar to one in the proof of Proposition 4.2.

We have shown how a seminorm control of the original average from equation (51) by a $T_4$-seminorm of $f_4$ follows from the seminorm control of the averages from equations (53) and (55). We have not proved, however, that these auxiliary averages are indeed controlled by Gowers–Host–Kra seminorms, assuming instead that this follows by induction. It turns out that obtaining a seminorm control of the averages from equation (53) involves an interesting twist in that the argument makes essential use of the assumption that the function $g_4$ is $T_4 T_2^{-1}$-invariant. We sketch the steps taken in the seminorm smoothing argument for this average under the extra invariance assumption to show where this invariance property comes up and why it is necessary.

In proving the seminorm control of equation (53), we first prove that the average is controlled by a $T_3$ -seminorm of $g_3$ since $T_3$ is the only transformation with index in ${\mathfrak I}_2$ . Arguing as above (using Proposition 3.7 for equation (53), assuming that the $L^2(\mu )$ limit of equation (53) is non-zero and mimicking the proof of Proposition 4.2), we deduce that

$$ \begin{align*} \liminf_{H\to\infty}\mathop{\mathbb{E}}\limits_{{\underline{h}},{\underline{h}}'\in [H]^{s}}\lim_{N\to\infty}\Big\Vert\! \mathop{\mathbb{E}}\limits_{n\in[N]} T_1^{n^2} g_{1, {\underline{h}}, {\underline{h}}'}\cdot T_2^{n^2} g_{2, {\underline{h}}, {\underline{h}}'}\cdot T_3^{n^2+n}u_{{\underline{h}}, {\underline{h}}'}\cdot T_2^{n^2+n}g_{4, {\underline{h}},{\underline{h}}'}\Big\Vert_{L^2(\mu)}>0 \end{align*} $$

for 1-bounded functions $u_{{\underline {h}}, {\underline {h}}'}$ that are all invariant either under $T_3 T_1^{-1}$ or under $T_3 T_2^{-1}$ . Then we use the relevant invariance property to replace each $T_3^{n^2+n} u_{{\underline {h}}, {\underline {h}}'}$ by $T_1^{n^2+n} u_{{\underline {h}}, {\underline {h}}'}$ or $T_2^{n^2+n} u_{{\underline {h}}, {\underline {h}}'}$ . Hence, in the ping step of the seminorm smoothing argument for equation (53), we need to invoke seminorm control of averages of the form

(56)

$$ \begin{align} \nonumber &\mathop{\mathbb{E}}\limits_{n\in[N]}T_1^{n^2} g_1'\cdot T_2^{n^2} g_2'\cdot T_1^{n^2+n}g_3'\cdot T_2^{n^2+n}g_4'\\ \textrm{and}\quad & \mathop{\mathbb{E}}\limits_{n\in[N]}T_1^{n^2} g_1'\cdot T_2^{n^2} g_2'\cdot T_2^{n^2+n}g_3'\cdot T_2^{n^2+n}g_4', \end{align} $$

where $g^{\prime }_4$ is $T_4 T_2^{-1}$-invariant, while $g^{\prime }_3$ is invariant under $T_3 T_1^{-1}$ in the first average and under $T_3 T_2^{-1}$ in the second. Both averages have basic type.

We show that for arbitrary $g_1', g_2', g_3', g_4'$ , without the aforementioned invariance assumptions, we would not be able to control the average from equation (56) by Gowers–Host–Kra seminorms; specifically, we could not control it by a $T_2$ -seminorm of $g^{\prime }_4$ . Conversely, this is achievable if $g^{\prime }_3, g^{\prime }_4$ are invariant under $T_3T_2^{-1}$ , $T_4T_2^{-1}$ , respectively. Assuming for simplicity that $g_1'=g_2':= 1$ , we have that the second average equals

$$ \begin{align*} \mathop{\mathbb{E}}\limits_{n\in[N]}T_1^{n^2} g_1'\cdot T_2^{n^2} g_2'\cdot T_2^{n^2+n}g_3'\cdot T_2^{n^2+n}g_4' = \mathop{\mathbb{E}}\limits_{n\in[N]}T_2^{n^2+n}(g_3'\cdot g_4'), \end{align*} $$

and so without any additional assumptions, the average from equation (56) is in general not controlled by a $T_2$ -seminorm of $g_3'$ or a $T_2$ -seminorm of $g^{\prime }_4$ . However, the invariance assumptions on $g^{\prime }_3, g^{\prime }_4$ give us

(57)

$$ \begin{align} \mathop{\mathbb{E}}\limits_{n\in[N]}T_2^{n^2+n}(g_3'\cdot g_4') = \mathop{\mathbb{E}}\limits_{n\in[N]} T_3^{n^2+n}g_3'\cdot T_4^{n^2+n}g_4', \end{align} $$

and by Proposition 3.7, these averages can be controlled by $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| g_4'|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{\mathbf {e}}_4^{\times s}, ({\mathbf {e}}_4-{\mathbf {e}}_3)^{\times s}}$ for some $s\in {\mathbb N}$ . Then the assumption ${\mathcal I}(T_4 T_3^{-1})\subseteq {\mathcal I}(T_4)$ and Lemma 2.2 give $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| g_4'|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{\mathbf {e}}_4^{\times s}, ({\mathbf {e}}_4-{\mathbf {e}}_3)^{\times s}} \leq \lvert \hspace {-1.2pt}|\hspace {-1.2pt}| g^{\prime }_4|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{2s, T_4}$ , and so a $T_4$ -seminorm of $g_4'$ does control the average from equation (57). Using the invariance property once again, this time along with Lemma 3.5, we get $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| g_4'|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s', T_4} =\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| g_4'|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s', T_2}$ , so a $T_2$ -seminorm of $g^{\prime }_4$ controls equation (57) and hence equation (56).
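The identity in equation (57) is purely algebraic: it only uses that $g_3'$ and $g_4'$ are invariant under $T_3T_2^{-1}$ and $T_4T_2^{-1}$. The following toy check is a sketch on a finite model chosen here for illustration (the group ${\mathbb Z}_M^3$ with coordinate shifts playing the roles of $T_2, T_3, T_4$); it makes no claim about the ergodicity assumptions and is not part of the original argument.

```python
import itertools
import random

M = 5  # toy modulus; the space is Z_M^3 with coordinates (x2, x3, x4)
points = list(itertools.product(range(M), repeat=3))

def shift(f, coord, k):
    """Return T^k f, where T adds 1 to one coordinate: (T^k f)(x) = f(x + k*e)."""
    def shifted(x):
        y = list(x)
        y[coord] = (y[coord] + k) % M
        return f(tuple(y))
    return shifted

random.seed(0)
h3 = {key: random.random() for key in itertools.product(range(M), repeat=2)}
h4 = {key: random.random() for key in itertools.product(range(M), repeat=2)}

def g3(x):
    # Depends only on (x2 + x3, x4), hence invariant under T3 T2^{-1}.
    return h3[((x[0] + x[1]) % M, x[2])]

def g4(x):
    # Depends only on (x2 + x4, x3), hence invariant under T4 T2^{-1}.
    return h4[((x[0] + x[2]) % M, x[1])]

def flipped_equal(n):
    """Check T2^k (g3 * g4) == T3^k g3 * T4^k g4 pointwise for k = n^2 + n."""
    k = n * n + n
    return all(
        abs(shift(g3, 0, k)(x) * shift(g4, 0, k)(x)
            - shift(g3, 1, k)(x) * shift(g4, 2, k)(x)) < 1e-12
        for x in points
    )

print(all(flipped_equal(n) for n in range(1, 8)))
```

Since the two sides agree for every $n$, their averages over $n\in[N]$ agree as well, which is the content of equation (57) in this toy setting.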

To get control over equation (56) by a $T_2$ -seminorm of $g_4'$ without any simplifying assumptions on $g_1', g_2'$ , we have to run a more complicated argument. Combining Proposition 3.7, the ergodic condition on $T_1T_2^{-1}$ and Lemma 2.2, we first obtain control of equation (56) by a $T_1$ -seminorm of $g^{\prime }_1$ and a $T_2$ -seminorm of $g^{\prime }_2$ . Assuming that the $L^2(\mu )$ limit of the average from equation (56) is positive, we use this newly established control, decompose $g^{\prime }_1$ using Proposition 2.3, and apply the pigeonhole principle to show that the average

(58)

$$ \begin{align} \mathop{\mathbb{E}}\limits_{n\in[N]}{\mathcal D}(n^2)\cdot T_2^{n^2} g_2'\cdot T_2^{n^2+n}g_3'\cdot T_2^{n^2+n}g_4' \end{align} $$

has a non-vanishing limit. The invariance properties of $g^{\prime }_3$ and $g^{\prime }_4$ imply that the average from equation (58) equals

(59)

$$ \begin{align} \mathop{\mathbb{E}}\limits_{n\in[N]}{\mathcal D}(n^2)\cdot T_2^{n^2} g_2'\cdot T_3^{n^2+n}g_3'\cdot T_4^{n^2+n}g_4', \end{align} $$

for which a seminorm control by a $T_4$ -seminorm of $g^{\prime }_4$ follows by inductively invoking seminorm control for averages of length 3. The invariance property of $g^{\prime }_4$ implies once again that $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| g_4'|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s', T_4} =\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| g_4'|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s', T_2}$ for any $s'\in {\mathbb N}$ . It follows that a $T_2$ -seminorm of $g^{\prime }_4$ controls equation (59), and hence also equations (58) and (56).

The invariance properties also come up in the pong step of the smoothing argument for equation (53). In this part, we encounter averages of the form

(60)

$$ \begin{align} \mathop{\mathbb{E}}\limits_{n\in[N]}\prod_{j=1}^L {\mathcal D}_{j}(n^2)\cdot T_2^{n^2} g_2''\cdot T_3^{n^2+n}g_3''\cdot T_2^{n^2+n}g_4''. \end{align} $$

Moreover, the function $g_4''$ is $T_4 T_2^{-1}$-invariant because it is essentially a multiplicative derivative of $g_4'$. For a similar reason as before, such averages could not be controlled by Gowers–Host–Kra seminorms for arbitrary $g_2'', g_3'', g_4''$ without the invariance assumption. However, thanks to the invariance assumption, the average from equation (60) equals

(61)

$$ \begin{align} \mathop{\mathbb{E}}\limits_{n\in[N]}\prod_{j=1}^L {\mathcal D}_{j}(n^2)\cdot T_2^{n^2} g_2''\cdot T_3^{n^2+n}g_3''\cdot T_4^{n^2+n}g_4'', \end{align} $$

which is controlled by $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| g^{\prime \prime }_4|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s', T_4}$ for some $s'\in {\mathbb N}$. (We can assume this inductively, or we can prove that a $T_4$-seminorm of $g_4''$ controls equation (61) in essentially the same way as we argued in Proposition 4.2.) Using the invariance property of $g^{\prime \prime }_4$ again together with Lemma 3.5, we deduce that $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| g_4''|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s', T_4} = \lvert \hspace {-1.2pt}|\hspace {-1.2pt}| g_4''|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s', T_2}$, and so a $T_2$-seminorm of $g^{\prime \prime }_4$ does control equation (60). An argument similar to the one at the end of the proof of Proposition 4.2 implies that a $T_2$-seminorm of $g_4$ controls equation (53).

The example above shows that it is crucial to keep track of the invariance properties of the functions appearing in our averages. These invariance properties turn out to be indispensable for applying Proposition 3.7 to averages of basic type, for obtaining seminorm control in the pong step of the argument, and (as we see later on) for handling uncontrollable averages. Using the invariance property to replace an average like equations (58) and (60), for which we cannot have seminorm control, by an average like equations (59) and (61), respectively, which is controlled by Gowers–Host–Kra seminorms, exemplifies the flipping technique that will be presented in detail in Proposition 6.6.

Recalling how we have performed the ping and pong steps in Examples 1, 2, and 7, we observe that in the ping step, we pass from the average

(62)

$$ \begin{align} \mathop{\mathbb{E}}\limits_{n\in [N]} \,\prod_{j\in[\ell]}T_{\eta_j}^{\,\rho_j(n)}f_j \cdot \prod_{j\in[L]}{\mathcal D}_{j}(q_j(n)) \end{align} $$

to averages

(63)

$$ \begin{align} \mathop{\mathbb{E}}\limits_{n\in[N]}\,\prod_{\substack{j\in[\ell],\\ j\neq m}}T_{\eta_j}^{\,\rho^{\prime}_j(n)}f_{j, {\underline{h}}, {\underline{h}}'} \cdot T_{\eta_i}^{\,\rho^{\prime}_m(n)}u_{{\underline{h}}, {\underline{h}}'} \cdot \prod_{j\in[L']}{\mathcal D}^{\prime}_{j}(q^{\prime}_j(n)), \end{align} $$

where $f_{j, {\underline {h}}, {\underline {h}}'}:= \Delta _{{\textbf {b}}_1, \ldots , {\textbf {b}}_s; {\underline {h}}-{\underline {h}}'}f_j$ for some vectors ${\textbf {b}}_1, \ldots , {\textbf {b}}_s\in {\mathbb Z}^{\ell }$. In particular, the functions $f_{j, {\underline {h}}, {\underline {h}}'}$ are invariant under every transformation under which the functions $f_j$ are invariant. Moreover, the functions $u_{{\underline {h}}, {\underline {h}}'}$ are invariant under $T_{\eta _m}^{a_m}T_{\eta _i}^{-a_i}$, where $a_m$ and $a_i$ are the leading coefficients of $\rho _m$ and $\rho _i$, but they also retain whatever invariance property $f_m$ has. Thus, by passing from equation (62) to equation (63), we do not lose any invariance properties of the original functions, but rather gain new ones.
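The claim that multiplicative derivatives inherit invariance can be illustrated on a finite toy model. The sketch below is an assumption-laden example rather than part of the proof: it works on ${\mathbb Z}_M^2$ with commuting coordinate shifts, takes $s=1$, and uses the convention $\Delta_{{\textbf{b}}; h}f = f\cdot \overline{T^{h{\textbf{b}}}f}$; the helper names are hypothetical.

```python
import cmath
import itertools
import random

M = 6  # toy modulus; the space is Z_M^2 with shifts T1, T2 of the coordinates
points = list(itertools.product(range(M), repeat=2))

random.seed(1)
phase = [cmath.exp(2j * cmath.pi * random.random()) for _ in range(M)]

def f(x):
    # Depends only on x1 + x2, hence invariant under S = T1 T2^{-1}.
    return phase[(x[0] + x[1]) % M]

def translate(x, v, h=1):
    """Apply the translation by h*v on Z_M^2."""
    return tuple((xi + h * vi) % M for xi, vi in zip(x, v))

def derivative(func, b, h):
    """Multiplicative derivative x -> func(x) * conj(func(x + h*b))."""
    return lambda x: func(x) * func(translate(x, b, h)).conjugate()

def is_invariant(func, v):
    """Check func(x + v) == func(x) for every point x."""
    return all(abs(func(translate(x, v)) - func(x)) < 1e-12 for x in points)

S = (1, -1)  # exponent vector of T1 T2^{-1}
checks = [is_invariant(derivative(f, b, h), S)
          for b in [(1, 0), (0, 1), (2, 3)] for h in range(4)]
print(is_invariant(f, S), all(checks))
```

The derivative stays $S$-invariant for every direction ${\textbf{b}}$ because $S$ commutes with all translations, which is the mechanism behind the statement above.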

Similarly, in the pong step, we pass to averages

$$ \begin{align*} \mathop{\mathbb{E}}\limits_{n\in[N]}\,\prod_{\substack{j\in[\ell],\\ j\neq i}}T_{\eta_j}^{\,\rho_j(n)}f_{j, {\underline{h}}, {\underline{h}}'} \cdot \prod_{j\in[L'']}{\mathcal D}^{\prime\prime}_{j}(q_j''(n)), \end{align*} $$

and the functions $f_{j, {\underline {h}}, {\underline {h}}'}$ retain whatever invariance properties $f_j$ have.

Thus, the functions $(f_1, \ldots , f_{\ell })$ get replaced by

(64)

$$ \begin{align} (f_{1, {\underline{h}}, {\underline{h}}'}, \ldots, f_{m-1, {\underline{h}}, {\underline{h}}'}, u_{{\underline{h}}, {\underline{h}}'}, f_{m+1, {\underline{h}}, {\underline{h}}'}, \ldots, f_{\ell, {\underline{h}}, {\underline{h}}'}) \end{align} $$

in the ping step and

(65)

$$ \begin{align} (f_{1, {\underline{h}}, {\underline{h}}'}, \ldots, f_{i-1, {\underline{h}}, {\underline{h}}'}, 1, f_{i+1, {\underline{h}}, {\underline{h}}'}, \ldots, f_{\ell, {\underline{h}}, {\underline{h}}'}) \end{align} $$

in the pong step. We now formalize the idea that these new families of functions retain the original invariance properties and gain new ones.

Definition. (Good invariance property)

Let $\gamma \in {\mathbb N}$. We say that the tuple of functions $(f_1, \ldots , f_{\ell })$ has the $\gamma $-invariance property along $\eta $ with respect to polynomials $p_1, \ldots , p_{\ell }$ with leading coefficients $a_1, \ldots , a_{\ell }$ if for every $j\in [\ell ]$, the function $f_j$ is invariant under $\mathopen {}( T_{\eta _j}^{a_{\eta _j}}T_j^{-a_j} \mathclose {})^{\gamma }$. Let $I$ be a (possibly infinite) indexing set. We say that a collection $(f_{i1}, \ldots , f_{i\ell })_{i\in I}$ has the good invariance property along $\eta $ with respect to polynomials $p_1, \ldots , p_{\ell }$ if there exists $\gamma \in {\mathbb N}$ such that the tuple $(f_{i1}, \ldots , f_{i\ell })$ has the $\gamma $-invariance property along $\eta $ with respect to $p_1, \ldots , p_{\ell }$ for every $i\in I$.

If $\eta =(1, \ldots , \ell )$ is the identity tuple, there is nothing to check and any collection of functions has the $1$ -invariance property with respect to any polynomial family. The property only becomes non-trivial when $\eta $ is not the identity tuple.
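To make the definition concrete, one can list, for a given indexing tuple, the transformations $\mathopen {}( T_{\eta _j}^{a_{\eta _j}}T_j^{-a_j} \mathclose {})^{\gamma }$ as exponent vectors in ${\mathbb Z}^{\ell }$. The sketch below is illustrative: the function `invariance_vectors` is a hypothetical helper, and the sample data are the indexing tuple $(1, 2, 3, 1, 3, 1, 7)$ reached at the end of Example 6 together with the leading coefficients $(1, 3, 2, 2, 1, 1, 1)$ of the polynomials in equation (48).

```python
def invariance_vectors(eta, lead, gamma=1):
    """Exponent vector in Z^ell of (T_{eta_j}^{a_{eta_j}} T_j^{-a_j})^gamma
    for each j, where lead[j-1] = a_j and eta is 1-based (assumed encoding)."""
    ell = len(eta)
    vectors = {}
    for j in range(1, ell + 1):
        v = [0] * ell
        v[eta[j - 1] - 1] += gamma * lead[eta[j - 1] - 1]  # T_{eta_j}^{a_{eta_j}}
        v[j - 1] -= gamma * lead[j - 1]                    # T_j^{-a_j}
        vectors[j] = tuple(v)
    return vectors

lead = (1, 3, 2, 2, 1, 1, 1)
identity = (1, 2, 3, 4, 5, 6, 7)
eta3 = (1, 2, 3, 1, 3, 1, 7)

trivial = invariance_vectors(identity, lead)
nontrivial = invariance_vectors(eta3, lead)

print(all(not any(v) for v in trivial.values()))  # identity tuple: no condition
for j, v in nontrivial.items():
    if any(v):
        print(j, v)
```

For the identity tuple every vector is zero, matching the remark above, while for the second tuple only the functions at indices 4, 5, and 6 are subject to a non-trivial condition.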

In our arguments, we will ensure that the functions $f_j$ in the average

$$ \begin{align*} \mathop{\mathbb{E}}\limits_{n\in[N]}\prod_{j\in[\ell]}T_{\eta_{j}}^{\,\rho_j(n)} f_j\cdot\prod_{j\in[L]}{\mathcal D}_{j}(q_j(n)), \end{align*} $$

obtained by a sequence of reductions from the original average ${\mathbb {E}}_{n\in [N]}\prod _{j\in [\ell ]}T_{j}^{p_j(n)}$ , have the good invariance property with respect to the original polynomials $p_1, \ldots , p_{\ell }$ , that is, there exists $\gamma \in {\mathbb N}$ such that for every $j\in [\ell ]$ , the function $f_j$ is invariant under $\mathopen {}( T_{\eta _j}^{a_{\eta _{j}}}T_j^{-a_{j}} \mathclose {})^{\gamma }$ , where $a_j$ is the leading coefficient of the polynomial $p_j$ . The need to keep track of the invariance property with respect to the original polynomials is explained in Example 8 below. Before we state this example, however, we prove that the invariance properties get preserved when passing from the tuple of functions $(f_1, \ldots , f_{\ell })$ to the tuples in equations (64) and (65).

Proposition 6.5. (Propagation of invariance properties)

Let $\ell , s\in {\mathbb N}$ , $\eta \in [\ell ]^{\ell }$ be an indexing tuple, $\eta ' := \tau _{mi}\eta $ for distinct $m,i\in [\ell ]$ be another indexing tuple, $p_1, \ldots , p_{\ell }\in {\mathbb Z}[n]$ be polynomials with leading coefficients $a_1, \ldots , a_{\ell }$ , $(X, {\mathcal X}, \mu , T_1, \ldots , T_{\ell })$ be a system, and ${\textbf {b}}_1, \ldots ,{\textbf {b}}_s\in {\mathbb Z}^{\ell }$ be vectors. Suppose that for some $\gamma \in {\mathbb N}$ , the functions $(f_1, \ldots , f_{\ell })$ have the $\gamma $ -invariance property along $\eta $ with respect to the polynomials $p_1, \ldots , p_{\ell }$ . Consider the functions $(f^{\prime }_{1, {\underline {h}}}, \ldots , f^{\prime }_{\ell , {\underline {h}}})_{{\underline {h}} \in {\mathbb Z}^s}$ , where $f^{\prime }_{j, {\underline {h}}} := \Delta _{{\textbf {b}}_1, \ldots , {\textbf {b}}_s; {\underline {h}}}f_j$ for $j\neq m$ , and $f_{m, {\underline {h}}}'$ is a function invariant under both $S_1 :=\mathopen {}( T_{\eta _m}^{a_{\eta _m}}T_m^{-a_m} \mathclose {})^{\gamma _1}$ and ${S_2 := \mathopen {}( T_{\eta _i}^{a_{\eta _i}}T_{\eta _m}^{-a_{\eta _m}} \mathclose {})^{\gamma _2}}$ for some $\gamma _1, \gamma _2\in {\mathbb N}$ independent of ${\underline {h}}$ . Then $(f^{\prime }_{1, {\underline {h}}}, \ldots , f^{\prime }_{\ell , {\underline {h}}})_{{\underline {h}} \in {\mathbb Z}^s}$ has the good invariance property along $\eta '$ with respect to $p_1, \ldots , p_{\ell }$ .

Proof of Proposition 6.5

For $j\neq m$ , the functions $f_{j}$ are invariant under $\mathopen {}( T_{\eta _j}^{a_{\eta _j}}T_j^{-a_j} \mathclose {})^{\gamma }$ for some non-zero $\gamma \in {\mathbb Z}$ independent of ${\underline {h}}\in {\mathbb Z}^s$ , and so are their translations

$$ \begin{align*} {\mathcal C}^{|{\underline{\epsilon}}|}T^{\epsilon_1 h_1 {\textbf{b}}_1 + \cdots + \epsilon_{s} h_s {\textbf{b}}_s}f_j. \end{align*} $$

The identity $\eta ^{\prime }_j = \eta _j$ , which holds for $j\neq m$ , and the fact that $f^{\prime }_{j, {\underline {h}}}$ is a product of $\mathopen {}( T_{\eta _j}^{a_{\eta _j}}T_j^{-a_j} \mathclose {})^{\gamma }$ -invariant functions, imply that $f^{\prime }_{j, {\underline {h}}}$ is itself invariant under $\mathopen {}( T_{\eta ^{\prime }_j}^{a_{\eta ^{\prime }_j}}T_j^{-a_j} \mathclose {})^{\gamma }$ . For $j=m$ , the functions $f^{\prime }_{j, {\underline {h}}}$ are invariant under

$$ \begin{align*} S_1^{\gamma_2}S_2^{\gamma_1} = \mathopen{}( T_{\eta_m}^{a_{\eta_m}}T_m^{-a_m}T_{\eta_i}^{a_{\eta_i}}T_{\eta_m}^{-a_{\eta_m}} \mathclose{})^{\gamma_1\gamma_2} = \mathopen{}( T_{\eta_i}^{a_{\eta_i}}T_m^{-a_m} \mathclose{})^{\gamma_1\gamma_2}=\mathopen{}( T_{\eta_m'}^{a_{\eta_m'}}T_m^{-a_m} \mathclose{})^{\gamma_1\gamma_2} \end{align*} $$

by noting $\eta ^{\prime }_m = \eta _i$ and combining the two invariance properties that these functions enjoy. Letting $\gamma ' := \textrm {lcm}(\gamma , \gamma _1\gamma _2)$ , it follows that for every ${\underline {h}}\in {\mathbb Z}^s$ , the collection $(f^{\prime }_{1, {\underline {h}}}, \ldots , f^{\prime }_{\ell , {\underline {h}}})$ has the $\gamma '$ -invariance property along $\eta '$ with respect to $p_1, \ldots , p_{\ell }$ .
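
As a sanity check of the case $j=m$ , consider the situation that arises in Example 8 below when passing from equation (68) to equation (69): all leading coefficients equal 1, $m=6$ , $\eta _6 = 4$ and $\eta ^{\prime }_6 = 1$ . Assuming for illustration that one may take $\gamma _1=\gamma _2 = 1$ there, the two invariance properties combine as

$$ \begin{align*} S_1 = T_4T_6^{-1}, \qquad S_2 = T_1T_4^{-1}, \qquad S_1S_2 = T_4T_6^{-1}T_1T_4^{-1} = T_1T_6^{-1}, \end{align*} $$

so a function invariant under both $S_1$ and $S_2$ is invariant under $T_1T_6^{-1}$ , or equivalently under $T_6T_1^{-1}$ .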

To get desirable seminorm control over the intermediate tuples encountered in Proposition 6.1, it is not sufficient to keep track of the most immediate invariance properties. This is illustrated by the example below.

Example 8. (The necessity of composed invariance properties)

Consider the average

(66)

$$ \begin{align} \mathop{\mathbb{E}}\limits_{n\in[N]}T_1^{n^2}f_1 \cdot T_2^{n^2}f_2\cdot T_3^{n^2+n}f_3\cdot T_4^{n^2+n}f_4 \cdot T_5^{n^2+2n}f_5\cdot T_6^{n^2+2n}f_6. \end{align} $$

It has length $6$ , degree 2, and type $(2,2,2)$ corresponding to the partition

$$ \begin{align*} {\mathfrak I}_1 =\{1, 2\},\quad {\mathfrak I}_2 = \{3, 4\}, \quad {\mathfrak I}_3 = \{5, 6\}. \end{align*} $$

We assume that it has the good ergodicity property, that is,

$$ \begin{align*} {\mathcal I}(T_1 T_2^{-1}) \kern1.2pt{=}\kern1.2pt {\mathcal I}(T_1)\cap{\mathcal I}(T_2),\kern-1pt\!\quad {\mathcal I}(T_3 T_4^{-1}) \kern1.2pt{=}\kern1.2pt {\mathcal I}(T_3)\cap {\mathcal I}(T_4), \kern-1pt\!\quad {\mathcal I}(T_5 T_6^{-1}) \kern1.2pt{=}\kern1.2pt{\mathcal I}(T_5)\cap {\mathcal I}(T_6). \end{align*} $$

Suppose we want to perform the seminorm smoothing argument to obtain a control of the associated average by the $T_6$ -seminorm of $f_6$ . We iteratively pass to averages of lower type as in Proposition 6.1 (all of which turn out to be controllable), and we show that to get seminorm control for the average of basic type at which we arrive, we need to keep track of not just the latest invariance properties that the functions in the intermediate averages enjoy, but of all the invariance properties that the functions in earlier intermediate averages enjoyed.

Step 1: Reducing to an average of basic type.

In the ping part of the seminorm smoothing argument for equation (66), we replace $T_6$ in the original average from equation (66) by some $T_i$ with $i\in [4]={\mathfrak I}_1\cup {\mathfrak I}_2$ , arriving at, say, averages

(67)

$$ \begin{align} \mathop{\mathbb{E}}\limits_{n\in[N]}T_1^{n^2}f_{11} \cdot T_2^{n^2}f_{12}\cdot T_3^{n^2+n}f_{13}\cdot T_4^{n^2+n}f_{14} \cdot T_5^{n^2+2n}f_{15}\cdot T_4^{n^2+2n}f_{16}. \end{align} $$

The new tuple in equation (67) has type $(2,3,1)$ and is controllable, with the index 5 satisfying the controllability condition. The functions inside take the form

$$ \begin{align*} f_{1j} := \begin{cases} \Delta_{{\textbf{b}}_{11}, \ldots, {\textbf{b}}_{1s_1}; {\underline{h}}-{\underline{h}}'}f_{j}, & j \neq 6,\\ u_{1,{\underline{h}}, {\underline{h}}'}, &j=6, \end{cases} \end{align*} $$

where the functions $u_{1, {\underline {h}}, {\underline {h}}'}$ are $T_6 T_4^{-1}$ -invariant.

To obtain seminorm control of the average from equation (67), we need to perform the seminorm smoothing argument for this tuple. We aim to control it first by a $T_5$ -seminorm of $f_{15}$ since $5$ satisfies the controllability condition and is the only index left in ${\mathfrak I}_3$ . As guided by Proposition 6.1, in the ping step of the smoothing argument, we replace $T_5$ in equation (67) by some $T_i$ with $i\in [4]$ . When $i=1$ , for instance, we end up with averages

(68)

$$ \begin{align} \mathop{\mathbb{E}}\limits_{n\in[N]}T_1^{n^2}f_{21} \cdot T_2^{n^2}f_{22}\cdot T_3^{n^2+n}f_{23}\cdot T_4^{n^2+n}f_{24} \cdot T_1^{n^2+2n}f_{25}\cdot T_4^{n^2+2n}f_{26} \end{align} $$

of type $(3, 3, 0)$ . The functions in equation (68) take the form

$$ \begin{align*} f_{2j} := \begin{cases} \Delta_{{\textbf{b}}_{21}, \ldots, {\textbf{b}}_{2s_2}; {\underline{h}}-{\underline{h}}'}f_{1j}, & j \neq 5,\\ u_{2,{\underline{h}}, {\underline{h}}'}, &j=5, \end{cases} \end{align*} $$

where $u_{2, {\underline {h}}, {\underline {h}}'}$ are $T_5 T_1^{-1}$ -invariant. We note by Proposition 6.5 that the functions $f_{26}$ retain the $T_6 T_4^{-1}$ -invariance of $f_{16}$ .

The indices $3, 4, 6$ in the average from equation (68) all satisfy the controllability condition, so if we want to get seminorm control of this average, we should first control it by a relevant seminorm of one of the functions $f_{23}, f_{24}, f_{26}$ . Suppose that we choose to obtain a seminorm control of the tuple in equation (68) with respect to $f_{26}$ first. Then we would replace $T_4$ at index 6 by $T_i$ for any $i\in [2] = {\mathfrak I}_1$ in the ping step, getting, say, averages

(69)

$$ \begin{align} \mathop{\mathbb{E}}\limits_{n\in[N]}T_1^{n^2}f_{31} \cdot T_2^{n^2}f_{32}\cdot T_3^{n^2+n}f_{33}\cdot T_4^{n^2+n}f_{34} \cdot T_1^{n^2+2n}f_{35}\cdot T_1^{n^2+2n}f_{36} \end{align} $$

of type $(4, 2, 0)$ . The functions in equation (69) take the form

$$ \begin{align*} f_{3j} := \begin{cases} \Delta_{{\textbf{b}}_{31}, \ldots, {\textbf{b}}_{3s_3}; {\underline{h}}-{\underline{h}}'}f_{2j}, & j \neq 6,\\ u_{3,{\underline{h}}, {\underline{h}}'}, &j=6. \end{cases} \end{align*} $$

The function $f_{35}$ , being a multiplicative derivative of a $T_5 T_1^{-1}$ -invariant function, is itself invariant under $T_5 T_1^{-1}$ . The function $f_{36}$ is invariant not only under $T_4T_1^{-1}$ , but also under $T_6 T_4^{-1}$ thanks to Proposition 6.5. It is crucial that $f_{36}$ retains the $T_6 T_4^{-1}$ -invariance of $f_{26}$ , and we shall return to this point shortly.

The average from equation (69) is controllable, and so to arrive at an average of a basic type, we need to perform this procedure two more times. Both the indices 3 and 4 satisfy the controllability condition, so we want to get seminorm control in terms of one of $f_{33}, f_{34}$ —say we choose $f_{34}$ . To obtain control of the tuple in equation (69) by a $T_4$ -seminorm of $f_{34}$ , we replace $T_4$ by $T_i$ for any $i\in [2] = {\mathfrak I}_1$ in the ping step, arriving at, say, averages

(70)

$$ \begin{align} \mathop{\mathbb{E}}\limits_{n\in[N]}T_1^{n^2}f_{41} \cdot T_2^{n^2}f_{42}\cdot T_3^{n^2+n}f_{43}\cdot T_2^{n^2+n}f_{44} \cdot T_1^{n^2+2n}f_{45}\cdot T_1^{n^2+2n}f_{46} \end{align} $$

of type $(5, 1, 0)$ if $i=2$ . The functions in equation (70) take the form

$$ \begin{align*} f_{4j} := \begin{cases} \Delta_{{\textbf{b}}_{41}, \ldots, {\textbf{b}}_{4s_4}; {\underline{h}}-{\underline{h}}'}f_{3j}, & j \neq 4,\\ u_{4,{\underline{h}}, {\underline{h}}'}, &j=4. \end{cases} \end{align*} $$

The function $f_{45}$ retains the $T_5 T_1^{-1}$ -invariance of $f_{35}$ while $f_{46}$ , like $f_{36}$ , is invariant under $T_4T_1^{-1}$ and $T_6 T_4^{-1}$ . Moreover, the function $f_{44}$ is $T_4 T_2^{-1}$ -invariant.

Finally, the only index in the average from equation (70) satisfying the controllability condition is $3$ , so if we want to obtain control of the tuple in equation (70), we first want to get this in terms of a $T_3$ -seminorm of $f_{43}$ . Applying Proposition 6.1, we end up replacing $T_3$ by, say, $T_2$ , getting averages

(71)

$$ \begin{align} \mathop{\mathbb{E}}\limits_{n\in[N]}T_1^{n^2}f_{51} \cdot T_2^{n^2}f_{52}\cdot T_2^{n^2+n}f_{53}\cdot T_2^{n^2+n}f_{54} \cdot T_1^{n^2+2n}f_{55}\cdot T_1^{n^2+2n}f_{56}. \end{align} $$

The functions in equation (71) take the form

$$ \begin{align*} f_{5j} := \begin{cases} \Delta_{{\textbf{b}}_{51}, \ldots, {\textbf{b}}_{5s_5}; {\underline{h}}-{\underline{h}}'}f_{4j}, & j \neq 3,\\ u_{5,{\underline{h}}, {\underline{h}}'}, &j=3, \end{cases} \end{align*} $$

in particular, the functions $f_{54}, f_{55}, f_{56}$ retain the invariance properties of $f_{44}, f_{45}, f_{46}$ and $f_{53}$ is $T_3T_2^{-1}$ -invariant.
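
For bookkeeping, the reductions of Step 1 can be summarized as follows. We record the indexing tuple $\eta $ read off from each display together with its type; the reading of the type as counting, in its $t$ th entry, the indices $j$ with $\eta _j\in {\mathfrak I}_t$ is our assumption, chosen because it reproduces the values stated in the text.

$$ \begin{array}{c|c|c} \text{average} & \eta & \text{type} \\ \hline (66) & (1,2,3,4,5,6) & (2,2,2) \\ (67) & (1,2,3,4,5,4) & (2,3,1) \\ (68) & (1,2,3,4,1,4) & (3,3,0) \\ (69) & (1,2,3,4,1,1) & (4,2,0) \\ (70) & (1,2,3,2,1,1) & (5,1,0) \\ (71) & (1,2,2,2,1,1) & (6,0,0) \end{array} $$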

Step 2: Handling an average of basic type.

The average from equation (71) has basic type $(6, 0, 0)$ , and so we want to control it by appropriate seminorms using Proposition 3.7. We show that without the assumption that $f_{56}$ is invariant under both $T_4T_1^{-1}$ and $T_6T_4^{-1}$ , we cannot control this average by a $T_1$ -seminorm of $f_{56}$ , and conversely that this goal can be achieved with both of these assumptions.

We first note that Proposition 3.7, the ergodicity condition ${\mathcal I}(T_1 T_2^{-1}) = {\mathcal I}(T_1)\cap {\mathcal I}(T_2)$ , and Lemma 2.2 allow us to control it by a $T_1$ -seminorm of $f_{51}$ and by a $T_2$ -seminorm of $f_{52}$ . (At the same time, Proposition 3.7, our ergodicity assumptions, and Lemma 2.2 alone cannot be used to control equation (71) by $T_2$ -seminorms of $f_{53}$ or $f_{54}$ or by $T_1$ -seminorms of $f_{55}$ or $f_{56}$ . Without additional information about the invariance properties of the functions, Proposition 3.7, our ergodicity assumptions, and Lemma 2.2 could only give control of equation (71) by a $T_2$ -seminorm of $f_{53}f_{54}$ and by a $T_1$ -seminorm of $f_{55}f_{56}$ , which is insufficient for our purposes.) Suppose that the $L^2(\mu )$ limit of equation (71) is positive. Decomposing $f_{51}$ using Proposition 2.3 and then applying the pigeonhole principle, we deduce the existence of ${\mathcal D}\in {\mathfrak D}$ such that

(72)

$$ \begin{align} \lim_{N\to\infty}\Big\Vert \mathop{\mathbb{E}}\limits_{n\in[N]}{\mathcal D}(n^2)\cdot T_2^{n^2}f_{52}\cdot T_2^{n^2+n}f_{53}\cdot T_2^{n^2+n}f_{54} \cdot T_1^{n^2+2n}f_{55}\cdot T_1^{n^2+2n}f_{56}\Big\Vert_{L^2(\mu)}>0. \end{align} $$

We now want to obtain a seminorm control of the average in equation (72) by inductively invoking seminorm control for some average of length 5. To this end, we attempt to proceed like in Example 7. That is, we use the invariance of $f_{53}, f_{54}, f_{55}, f_{56}$ under $T_3T_2^{-1}, T_4T_2^{-1}, T_5T_1^{-1}, T_4 T_1^{-1}$ , respectively, to conclude that the average in equation (72) equals

(73)

$$ \begin{align} \lim_{N\to\infty}\mathop{\mathbb{E}}\limits_{n\in[N]}{\mathcal D}(n^2)\cdot T_2^{n^2}f_{52}\cdot T_3^{n^2+n}f_{53}\cdot T_4^{n^2+n}f_{54} \cdot T_5^{n^2+2n}f_{55}\cdot T_4^{n^2+2n}f_{56}. \end{align} $$

However, without any extra information, we could not control the average from equation (73) using a $T_1$ -seminorm of $f_{55}$ or a $T_1$ -seminorm of $f_{56}$ . Suppose for simplicity that ${\mathcal D}$ is a constant sequence and $f_{52}=f_{53}=f_{54} = 1$ . Then equation (73) reduces to

(74)

$$ \begin{align} \lim_{N\to\infty}\mathop{\mathbb{E}}\limits_{n\in[N]}T_5^{n^2+2n}f_{55}\cdot T_4^{n^2+2n}f_{56}. \end{align} $$

We know nothing about the composition $T_5 T_4^{-1}$ , and so without additional input, we cannot control equation (74) by a Gowers–Host–Kra seminorm.

This is the moment when we have to use the additional $T_6T_4^{-1}$ -invariance of $f_{56}$ . Since $T_4 f_{56} = T_6 f_{56}$ , we can replace $T_4$ in equation (74) by $T_6$ . Then Proposition 3.7 gives us control over equation (74) by $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_{56}|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{\mathbf {e}}_6^{\times s}, ({\mathbf {e}}_6-{\mathbf {e}}_5)^{\times s}}$ for some $s\in {\mathbb N}$ . The ergodicity condition on $T_6 T_5^{-1}$ and the $T_6 T_1^{-1}$ -invariance of $f_{56}$ then give $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_{56}|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{\mathbf {e}}_6^{\times s}, ({\mathbf {e}}_6-{\mathbf {e}}_5)^{\times s}}\leq \lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_{56}|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{2s, T_6}= \lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_{56}|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{2s, T_1}$ , and so this latter seminorm controls equation (72), and hence also equation (71).
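
The substitution of $T_6$ for $T_4$ and the combined invariance can be verified directly. The following short computation, in which $m$ stands for an arbitrary integer (our notation), uses only that the transformations commute:

$$ \begin{align*} T_6^{m}f_{56} = T_4^{m}\mathopen{}( T_6T_4^{-1} \mathclose{})^{m}f_{56} = T_4^{m}f_{56}, \qquad T_6T_1^{-1}f_{56} = \mathopen{}( T_6T_4^{-1} \mathclose{})\mathopen{}( T_4T_1^{-1} \mathclose{})f_{56} = f_{56}. \end{align*} $$

Taking $m = n^2+2n$ gives the replacement made in equation (74).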

If we want to control equation (72) by a $T_1$ -seminorm of $f_{56}$ without the simplifying assumptions on ${\mathcal D}$ and $f_{52}, f_{53}, f_{54}$ , we proceed similarly. (There is no special reason why we would want to control equation (72) by a seminorm of $f_{56}$ instead of other functions. We just aim to illustrate that using the extra $T_6T_4^{-1}$ -invariance of $f_{56}$ , this can be done.) Using the $T_6T_4^{-1}$ -invariance of $f_{56}$ , we rewrite equation (73) as

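(75)

$$ \begin{align} \lim_{N\to\infty}\mathop{\mathbb{E}}\limits_{n\in[N]}{\mathcal D}(n^2)\cdot T_2^{n^2}f_{52}\cdot T_3^{n^2+n}f_{53}\cdot T_4^{n^2+n}f_{54} \cdot T_5^{n^2+2n}f_{55}\cdot T_6^{n^2+2n}f_{56}. \end{align} $$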

Then we inductively apply the fact that we have seminorm control for averages of length 5 of the form in equation (75) to control this average, and hence also equation (71), by a $T_6$ -seminorm of $f_{56}$ . Subsequently, the $T_6 T_1^{-1}$ -invariance of $f_{56}$ (resulting from its $T_6 T_4^{-1}$ - and $T_4T_1^{-1}$ -invariance) and Lemma 3.5 give control of equation (71) by a $T_1$ -seminorm of $f_{56}$ .

We note that the argument above would not work if we only used the ‘new’ property of $f_{56}$ of being invariant under $T_4 T_1^{-1}$ rather than the ‘combined’ property of being invariant under $T_6 T_1^{-1}$ . The important point that this example shows is that it is not enough to keep track of the new invariance properties that we obtain at each stage of the ping argument and forget the old ones. Rather, we need to keep track of the invariance property with respect to a composition of the original and the most recent transformations, which in our example are $T_6$ and $T_1$ .

Examples 7 and 8 reveal that to obtain seminorm control of a controllable average of basic type and in the pong step of the seminorm smoothing argument, we need to substitute the transformations $T_j$ for $T_{\eta _j}$ to arrive at an average for which we have seminorm control. The following proposition allows us to do just that.

Proposition 6.6. (Flipping)

Let $\gamma , \ell , L\in {\mathbb N}$ , $(X, {\mathcal X}, \mu , T_1, \ldots , T_{\ell })$ be a system, $\eta \in [\ell ]^{\ell }$ be an indexing tuple, $A\subset [\ell ]$ be a subset of indices, and $p_1, \ldots , p_{\ell }, \rho _1, \ldots , \rho _{\ell }, q_1, \ldots , q_L\in {\mathbb Z}[n]$ be polynomials. Suppose that:

(i) the tuple $\mathopen {}( T_{\eta _j}^{\,\rho _j(n)} \mathclose {})_{j\in [\ell ]}$ is a descendant of $\mathopen {}( T_j^{p_j(n)} \mathclose {})_{j\in [\ell ]}$ ;
(ii) $f_1, \ldots , f_{\ell }\in L^{\infty }(\mu )$ are 1-bounded functions having the $\gamma $ -invariance property along $\eta $ with respect to $p_1, \ldots , p_{\ell }$ .

Then there exist 1-bounded functions $f_1', \ldots , f^{\prime }_{\ell }$ , polynomials $\rho ^{\prime }_1, \ldots , \rho ^{\prime }_{\ell }, q_1', \ldots , q_L'\in {\mathbb Z}[n]$ , and an indexing tuple $\eta '$ with the following properties:

(i) the tuple $\eta '$ takes the form
$$ \begin{align*} \eta^{\prime}_j = \begin{cases} j, &j\in A,\\ \eta_j, &j\notin A; \end{cases} \end{align*} $$
(ii) $\mathopen {}( T_{\eta ^{\prime }_j}^{\,\rho ^{\prime }_j(n)} \mathclose {})_{j\in [\ell ]}$ is a descendant of $\mathopen {}( T_j^{p_j(n)} \mathclose {})_{j\in [\ell ]}$ ;
(iii) $f^{\prime }_1, \ldots , f^{\prime }_{\ell }\in L^{\infty }(\mu )$ are 1-bounded and have the $\gamma $ -invariance property along $\eta '$ with respect to $p_1, \ldots , p_{\ell }$ ; moreover, for every $s\geq 2$ and $j\in [\ell ]$ , they satisfy the bound $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f^{\prime }_j|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_{\eta ^{\prime }_j}}\leq C \lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_j|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_{\eta _{j}}}$ for some $C>0$ depending only on $\gamma $ and the leading coefficients of $p_1, \ldots , p_{\ell }$ ; lastly, $f^{\prime }_j = 1$ whenever $f_j = 1$ ;
(iv) we have the inequality
(76) $$ \begin{align} &\lim_{N\to\infty}\bigg\Vert \mathop{\mathbb{E}}\limits_{n\in [N]} \,\prod_{j\in[\ell]}T_{{\eta_{j}}}^{\,\rho_{j}(n)}f_{j} \cdot \prod_{j\in[L]}{\mathcal D}_{j}(q_{j}(n))\bigg\Vert_{L^2(\mu)} \\ \nonumber &\quad\leq \lim_{N\to\infty}\bigg\Vert \mathop{\mathbb{E}}\limits_{n\in [N]} \,\prod_{j\in[\ell]}T_{{\eta^{\prime}_j}}^{\,\rho^{\prime}_j(n)}f^{\prime}_j \cdot\prod_{j\in[L]}{\mathcal D}_{j}(q^{\prime}_j(n))\bigg\Vert_{L^2(\mu)}. \end{align} $$

We note that if the leading coefficients of $p_1, \ldots , p_{\ell }$ are all 1, and the good invariance property takes the form of $f_j$ being invariant under $T_{\eta _j}T_j^{-1}$ for all $j\in [\ell ]$ , then Proposition 6.6 is straightforward, and in fact for every $N\in {\mathbb N}$ , we have

$$ \begin{align*} \mathop{\mathbb{E}}\limits_{n\in [N]} \,\prod_{j\in[\ell]}T_{{\eta_{j}}}^{\,\rho_{j}(n)}f_{j} \prod_{j\in[L]}{\mathcal D}_{j}(q_{j}(n))=\mathop{\mathbb{E}}\limits_{n\in [N]} \,\prod_{j\in[\ell]}T_{\eta^{\prime}_j}^{\,\rho_{j}(n)}f_{j} \prod_{j\in[L]}{\mathcal D}_{j}(q_{j}(n)) \end{align*} $$

and $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_j|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_{\eta _j}} = \lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_j|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_{\eta ^{\prime }_j}}$ for all $j\in [\ell ], s\in {\mathbb N}$ . The need for the more complicated statement of Proposition 6.6 comes from tedious but uninspiring technicalities that appear when the polynomials $p_1, \ldots , p_{\ell }$ have leading coefficients distinct from 1.
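
In this simple case, the equality of averages can be checked term by term. For $j\notin A$ there is nothing to do, and for $j\in A$ (so that $\eta ^{\prime }_j = j$ ), the commutativity of the transformations and the $T_{\eta _j}T_j^{-1}$ -invariance of $f_j$ give, as a sketch,

$$ \begin{align*} T_{\eta_j}^{\,\rho_j(n)}f_j = T_j^{\,\rho_j(n)}\mathopen{}( T_{\eta_j}T_j^{-1} \mathclose{})^{\rho_j(n)}f_j = T_j^{\,\rho_j(n)}f_j = T_{\eta^{\prime}_j}^{\,\rho_j(n)}f_j. \end{align*} $$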

We have used the flipping technique twice in Example 7: in equation (57) and when passing from equation (60) to equation (61) to obtain a seminorm control on the former using the seminorm control on the latter. We also used it in Example 8 to get seminorm control of equation (71). We will also use it shortly to handle uncontrollable tuples.

Proof of Proposition 6.6

For each $j\in [\ell ]$ , let $a_j$ be the leading coefficient of $p_j$ and $\gamma _j\in {\mathbb N}$ be the smallest natural number such that $f_j$ is invariant under $\mathopen {}( T_{\eta _j}^{a_{\eta _j}}T_j^{-a_j} \mathclose {})^{\gamma _j}$ (in particular, $\gamma _j = 1$ if $\eta _j = j$ ). Let $\lambda \in {\mathbb N}$ be the smallest natural number such that $a_{\eta _j} \gamma _j$ divides the coefficients of $\lambda \rho _j$ for every $j\in A$ . We then define

$$ \begin{align*} f^{\prime}_j := T_{\eta_j}^{\,\rho_j(r)}f_j,\quad q^{\prime}_j(n):=q_j(\lambda n + r),\quad \rho^{\prime}_{j}(n) := \frac{a_{\eta^{\prime}_j}}{a_{\eta_{j}}}\mathopen{}( \rho_{j}(\lambda n + r) - \rho_{j}(r) \mathclose{}) \end{align*} $$

for some $r\in \{0, \ldots , \lambda -1\}$ to be chosen later, and we observe that $\rho ^{\prime }_j\in {\mathbb Z}[n]$ for every $j\in [\ell ]$ . By definition of $\eta '$ and the $\gamma $ -invariance of $f_1, \ldots , f_{\ell }$ along $\eta $ , the functions $f_1', \ldots , f_{\ell }'$ are $\gamma $ -invariant along $\eta '$ . Moreover, if $f_j =1$ , then so is $f^{\prime }_j$ . Lastly, they satisfy the bound $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f^{\prime }_j|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_{\eta ^{\prime }_j}}\ll _{a_{\eta _j} \gamma } \lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_j|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_{\eta _{j}}}$ for every $s\geq 2$ and $j\in [\ell ]$ ; this is trivial for $j\notin A$ , and if $j\in A$ , then

$$ \begin{align*} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f^{\prime}_j|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{s, T_{\eta^{\prime}_j}} = \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f^{\prime}_j|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{s, T_{j}}= \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f_j|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{s, T_{j}} \leq \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f_j|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{s, T_{j}^{a_{j}\gamma}} = \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f_j|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{s, T_{\eta_{j}}^{a_{\eta_{j}}\gamma}}\ll_{a_{\eta_j} \gamma}\lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f_j|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{s, T_{\eta_j}}, \end{align*} $$

where we use the fact that $f_j$ is a composition of $f^{\prime }_j$ with respect to a measure-preserving transformation, the invariance property of $f_j$ , Lemma 3.5, and both directions of Lemma 2.1.

We move on to prove the inequality in equation (76). For $j\notin A$ , where $\eta ^{\prime }_j = \eta _j$ , we simply have $T_{\eta _j}^{\,\rho _j(\lambda n+r)}f_j = T_{\eta ^{\prime }_j}^{\,\rho ^{\prime }_j(n)}f_j'$ . For $j\in A$ , the invariance property of $f_j$ gives us the identity

$$ \begin{align*} T_{\eta_j}^{\,\rho_j(\lambda n+r)}f_j = \mathopen{}( T_{\eta_j}^{a_{\eta_j}\gamma_j} \mathclose{})^{({\rho_j(\lambda n+r)-\rho_j(r)})/{a_{\eta_j}\gamma_j}} T_{\eta_j}^{\,\rho_j(r)} f_j = T_{\eta^{\prime}_j}^{\,\rho^{\prime}_j(n)}f_j'. \end{align*} $$

Splitting ${\mathbb N}$ into $(\lambda \cdot {\mathbb N} + r)_{r\in \{0, \ldots , \lambda -1\}}$ , we deduce from the pigeonhole principle that there exists $r\in \{0, \ldots , \lambda -1\}$ for which equation (76) holds.

From the construction of the polynomials $\rho _1', \ldots , \rho ^{\prime }_{\ell }$ , the assumption that $\mathopen {}( T_{\eta _j}^{\,\rho _j(n)} \mathclose {})_{j\in [\ell ]}$ is a descendant of $\mathopen {}( T_j^{p_j(n)} \mathclose {})_{j\in [\ell ]}$ and Lemma 6.2, it follows that $\mathopen {}( T_{\eta ^{\prime }_j}^{\,\rho ^{\prime }_j(n)} \mathclose {})_{j\in [\ell ]}$ is also a descendant of $\mathopen {}( T_j^{p_j(n)} \mathclose {})_{j\in [\ell ]}$ .
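
To see how the dilation and the shift by $r$ interact in the construction above, consider the following toy instance; all numerical values are hypothetical and chosen by us only for illustration. Suppose that for some $j\in A$ , we have $a_j = 2$ , $a_{\eta _j} = 3$ , $\gamma _j = 1$ and $\rho _j(n) = 3n^2+n$ . The smallest natural number whose product with $\rho _j$ has all coefficients divisible by $a_{\eta _j}\gamma _j = 3$ is $3$ , and for each $r\in \{0, 1, 2\}$ , we get

$$ \begin{align*} \rho_j(3n+r)-\rho_j(r) &= 27n^2 + (18r+3)n = 3\mathopen{}( 9n^2+(6r+1)n \mathclose{}),\\ \rho^{\prime}_j(n) &= \tfrac{2}{3}\mathopen{}( \rho_j(3n+r)-\rho_j(r) \mathclose{}) = 18n^2+(12r+2)n \in {\mathbb Z}[n],\\ T_{\eta_j}^{\,\rho_j(3n+r)}f_j &= \mathopen{}( T_{\eta_j}^{3} \mathclose{})^{9n^2+(6r+1)n}\,T_{\eta_j}^{\,\rho_j(r)}f_j = \mathopen{}( T_j^{2} \mathclose{})^{9n^2+(6r+1)n}f^{\prime}_j = T_j^{\,\rho^{\prime}_j(n)}f^{\prime}_j. \end{align*} $$

In the last line, we used that $f^{\prime }_j = T_{\eta _j}^{\,\rho _j(r)}f_j$ inherits the invariance of $f_j$ under $T_{\eta _j}^{3}T_j^{-2}$ because the transformations commute.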

6.3 Handling uncontrollable tuples

We have explained in the previous sections that if a tuple $\mathopen {}( T_{\eta _j}^{\,\rho _j(n)} \mathclose {})_{j\in [\ell ]}$ is controllable, then we control it by a Gowers–Host–Kra seminorm using a seminorm smoothing argument. If the tuple is uncontrollable, however, we use the following variant of the flipping technique from Proposition 6.6 to bound the $L^2(\mu )$ norm of the associated average by an $L^2(\mu )$ norm of a controllable average.

Corollary 6.7. (Flipping uncontrollable tuples)

Let $\gamma , \ell , L\in {\mathbb N}$ , $(X, {\mathcal X}, \mu , T_1, \ldots , T_{\ell })$ be a system, $\eta \in [\ell ]^{\ell }$ be an indexing tuple, and $p_1, \ldots , p_{\ell }, \rho _1, \ldots , \rho _{\ell }, q_1, \ldots , q_L\in {\mathbb Z}[n]$ be polynomials. Suppose that

(i) the tuple $\mathopen {}( T_{\eta _j}^{\,\rho _j(n)} \mathclose {})_{j\in [\ell ]}$ is uncontrollable of type w with the last non-zero index $t_w$ , and it is a descendant of $\mathopen {}( T_j^{p_j(n)} \mathclose {})_{j\in [\ell ]}$ ;
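(ii) $f_1, \ldots , f_{\ell }\in L^{\infty }(\mu )$ are 1-bounded functions having the $\gamma $ -invariance property along $\eta $ with respect to $p_1, \ldots , p_{\ell }$ .

Then there exist 1-bounded functions $f_1', \ldots , f^{\prime }_{\ell }$ , polynomials $\rho ^{\prime }_1, \ldots , \rho ^{\prime }_{\ell }, q_1', \ldots , q_L'\in {\mathbb Z}[n]$ , and an indexing tuple $\eta '$ with the following properties: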

(i) the tuple $\eta '$ takes the form
$$ \begin{align*} \eta^{\prime}_j = \begin{cases} j, &\eta_j\in {\mathfrak I}_{t_w},\\ \eta_j, &\eta_j\notin {\mathfrak I}_{t_w}; \end{cases} \end{align*} $$
(ii) $\mathopen {}( T_{\eta ^{\prime }_j}^{\,\rho ^{\prime }_j(n)} \mathclose {})_{j\in [\ell ]}$ is a descendant of $\mathopen {}( T_j^{p_j(n)} \mathclose {})_{j\in [\ell ]}$ ;
(iii) $f^{\prime }_1, \ldots , f^{\prime }_{\ell }\in L^{\infty }(\mu )$ are 1-bounded and have the $\gamma $ -invariance property along $\eta '$ with respect to $p_1, \ldots , p_{\ell }$ ; moreover, for every $s\geq 2$ and $j\in [\ell ]$ , they satisfy the bound $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f^{\prime }_j|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_{\eta ^{\prime }_j}}\leq C \lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_j|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_{\eta _{j}}}$ for some $C>0$ depending only on $\gamma $ and the leading coefficients of $p_1, \ldots , p_{\ell }$ ; lastly, $f^{\prime }_j = 1$ whenever $f_j = 1$ ;
(iv) we have the inequality
$$ \begin{align*} &\lim_{N\to\infty}\bigg\Vert \mathop{\mathbb{E}}\limits_{n\in [N]} \,\prod_{j\in[\ell]}T_{{\eta_{j}}}^{\,\rho_{j}(n)}f_{j} \cdot \prod_{j\in[L]}{\mathcal D}_{j}(q_{j}(n))\bigg\Vert_{L^2(\mu)} \\ \nonumber & \quad\leq\lim_{N\to\infty}\bigg\Vert \mathop{\mathbb{E}}\limits_{n\in [N]} \,\prod_{j\in[\ell]}T_{{\eta^{\prime}_j}}^{\,\rho^{\prime}_j(n)}f^{\prime}_j \cdot\prod_{j\in[L]}{\mathcal D}_{j}(q^{\prime}_j(n))\bigg\Vert_{L^2(\mu)}. \end{align*} $$

Corollary 6.7 follows from Proposition 6.6 by taking $A=\{j\in [\ell ]: \eta _j\in {\mathfrak I}_{t_w}\}$ .

We emphasize that Corollary 6.7 by itself does not guarantee that the tuple $\mathopen {}( T_{\eta ^{\prime }_j}^{\,\rho ^{\prime }_j(n)} \mathclose {})_{j\in [\ell ]}$ has a lower type than the tuple $\mathopen {}( T_{\eta _j}^{\,\rho _j(n)} \mathclose {})_{j\in [\ell ]}$ . However, this will be the case when we apply it to all the tuples $\mathopen {}( T_{\eta _j}^{\,\rho _j(n)} \mathclose {})_{j\in [\ell ]}$ that appear in our inductive procedure. The crucial ingredient in achieving this type reduction will be the property (iv) in Proposition 6.8 enjoyed by all the tuples showing up in our arguments.

Example 9. Consider the tuple

(77)

$$ \begin{align} \mathopen{}( T_1^{n^2}, T_2^{n^2}, T_3^{n^2}, T_4^{n^2}, T_1^{n^2+n}, T_1^{n^2+n}, T_5^{n^2+2n}, T_5^{n^2+2n} \mathclose{}) \end{align} $$

from Example 4. It is a descendant of the tuple

(78)

$$ \begin{align} \mathopen{}( T_1^{n^2}, T_2^{n^2}, T_3^{n^2}, T_4^{n^2}, T_5^{n^2+n}, T_6^{n^2+n}, T_7^{n^2+2n}, T_8^{n^2+2n} \mathclose{}), \end{align} $$

obtained by four applications of Proposition 6.1, in which we substitute $T_5$ for $T_8$ at the index 8, $T_5$ for $T_7$ at the index 7, $T_1$ for $T_5$ at the index 5, and $T_1$ for $T_6$ at the index 6. Corollary 6.7 gives that if $f_1, \ldots , f_8\in L^{\infty }(\mu )$ are functions such that $f_5, f_6, f_7, f_8$ are invariant under $T_5 T_1^{-1}, T_6T_1^{-1}, T_7T_5^{-1}, T_8T_5^{-1}$ , respectively, then we have

$$ \begin{align*} &\kern-3pt\lim_{N\to\infty}\kern-2pt\Big\Vert\kern-2pt \mathop{\mathbb{E}}\limits_{n\in[N]}\kern-2ptT_1^{n^2}f_1\kern1.2pt{\cdot}\kern1.2pt T_2^{n^2}f_2\kern1.2pt{\cdot}\kern1.2pt T_3^{n^2}f_3\kern1.2pt{\cdot}\kern1.2pt T_4^{n^2}f_4\kern1.2pt{\cdot}\kern1.2pt T_1^{n^2+n}f_5\kern1.2pt{\cdot}\kern1.2pt T_1^{n^2+n}f_6\kern1.2pt{\cdot}\kern1.2pt T_5^{n^2+2n}f_7\kern1.2pt{\cdot}\kern1.2pt T_5^{n^2+2n}f_8\Big\Vert_{L^2(\mu)}\\ &\quad\leq \lim_{N\to\infty}\Big\Vert \mathop{\mathbb{E}}\limits_{n\in[N]}T_1^{n^2}f_1\cdot T_2^{n^2}f_2\cdot T_3^{n^2}f_3\cdot T_4^{n^2}f_4\cdot T_1^{n^2+n}f_5\cdot T_1^{n^2+n}f_6\\&\quad\times T_7^{n^2+2n}f_7\cdot T_8^{n^2+2n}f_8\Big\Vert_{L^2(\mu)} \end{align*} $$

(in fact, a closer look guarantees that we get an equality and not just for the $L^2(\mu )$ limits, but for each finite average). We moreover obtain the equality of seminorms $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_7|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_7} = \lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_7|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_5}, \lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_8|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_8} = \lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_8|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_5},$ and the new tuple

(79)

$$ \begin{align} \mathopen{}( T_1^{n^2}, T_2^{n^2}, T_3^{n^2}, T_4^{n^2}, T_1^{n^2+n}, T_1^{n^2+n}, T_7^{n^2+2n}, T_8^{n^2+2n} \mathclose{}) \end{align} $$

is a descendant of equation (78). Importantly, the new tuple in equation (79) has type $(6, 0, 2)$ , which is lower than the type $(6, 2, 0)$ of equation (77). Applying Corollary 6.7 to equation (77), we have thus successfully replaced it by a tuple of lower type. Lastly, the new tuple in equation (79) is controllable as the indices $7, 8$ satisfy the controllability condition.

6.4 Recapitulation

We conclude this section with Proposition 6.8, which forms an inductive framework for the proof of Theorem 1.1 in the next section. Combining the content of Proposition 6.1 and Corollary 6.7, it shows that—starting with a tuple $\mathopen {}( T_j^{p_j(n)} \mathclose {})_{j\in [\ell ]}$ satisfying the good ergodicity property—we reach a tuple of basic type in a finite number of steps. For tuples of basic type, seminorm control will follow from arguments made in Proposition 7.2, and then we will use this fact and induction to go back and get seminorm control for the original tuple $\mathopen {}( T_j^{p_j(n)} \mathclose {})_{j\in [\ell ]}$ .

Proposition 6.8. (Inductive framework)

Let $\ell \in {\mathbb N}$ , $p_1, \ldots , p_{\ell }\in {\mathbb Z}[n]$ be polynomials, and $(X, {\mathcal X}, \mu , T_1, \ldots , T_{\ell })$ be a system. Suppose that the tuple $\mathopen {}( T_j^{p_j(n)} \mathclose {})_{j\in [\ell ]}$ has the good ergodicity property. Then there exists $r\in {\mathbb N}$ (which depends only on $p_1, \ldots , p_{\ell }$ but can be bounded purely in terms of $\ell $ ) and a sequence of tuples

$$ \begin{align*} \mathopen{}( T_{\eta_{0j}}^{\,\rho_{0j}(n)} \mathclose{})_{j\in[\ell]} \to \mathopen{}( T_{\eta_{1j}}^{\,\rho_{1j}(n)} \mathclose{})_{j\in[\ell]} \to \mathopen{}( T_{\eta_{2j}}^{\,\rho_{2j}(n)} \mathclose{})_{j\in[\ell]}\to \ldots \to \mathopen{}( T_{\eta_{rj}}^{\,\rho_{rj}(n)} \mathclose{})_{j\in[\ell]} \end{align*} $$

with types $w_0, \ldots , w_r$ such that $\mathopen {}( T_{\eta _{0j}}^{\,\rho _{0j}(n)} \mathclose {})_{j\in [\ell ]} := \mathopen {}( T_j^{p_j(n)} \mathclose {})_{j\in [\ell ]}$ and for $k\in \{0, \ldots , r-1\}$ , the following properties hold.

(i) If the tuple $\mathopen {}( T_{\eta _{kj}}^{\,\rho _{kj}(n)} \mathclose {})_{j\in [\ell ]}$ is controllable, then the tuple $\mathopen {}( T_{\eta _{(k+1)j}}^{\,\rho _{(k+1)j}(n)} \mathclose {})_{j\in [\ell ]}$ is chosen using Proposition 6.1.
(ii) If the tuple $\mathopen {}( T_{\eta _{kj}}^{\,\rho _{kj}(n)} \mathclose {})_{j\in [\ell ]}$ is uncontrollable, then the tuple $\mathopen {}( T_{\eta _{(k+1)j}}^{\,\rho _{(k+1)j}(n)} \mathclose {})_{j\in [\ell ]}$ is chosen using Corollary 6.7.
(iii) The tuple $\mathopen {}( T_{\eta _{(k+1)j}}^{\,\rho _{(k+1)j}(n)} \mathclose {})_{j\in [\ell ]}$ is a descendant of $\mathopen {}( T_j^{p_j(n)} \mathclose {})_{j\in [\ell ]}$ . In particular, it has the good ergodicity property.
(iv) For the indexing tuple $\eta _{k+1}=(\eta _{(k+1)1}, \ldots , \eta _{(k+1)\ell })$ , if $ j\in {\mathfrak I}_t$ , then either $\eta _{(k+1)j} = j$ or $\eta _{(k+1)j}\in {\mathfrak I}_{t'}$ for some $t'<t$ .
(v) We have $w_{k+1}<w_k$ , that is, the tuple $\mathopen {}( T_{\eta _{(k+1)j}}^{\,\rho _{(k+1)j}(n)} \mathclose {})_{j\in [\ell ]}$ has a lower type than the tuple $\mathopen {}( T_{\eta _{kj}}^{\,\rho _{kj}(n)} \mathclose {})_{j\in [\ell ]}$ .
(vi) The tuple $\mathopen {}( T_{\eta _{rj}}^{\,\rho _{rj}(n)} \mathclose {})_{j\in [\ell ]}$ has basic type and the restriction $\eta _r|_{{\mathfrak I}_1}$ is the identity sequence.

For the rest of the paper, we call a tuple $\mathopen {}( T_{\eta _{j}}^{\,\rho _{j}(n)} \mathclose {})_{j\in [\ell ]}$ a proper descendant of the tuple $\mathopen {}( T_j^{p_j(n)} \mathclose {})_{j\in [\ell ]}$ if it appears in one of the sequences of tuples constructed from $\mathopen {}( T_j^{p_j(n)} \mathclose {})_{j\in [\ell ]}$ using Proposition 6.8.

Proof. Let $k\in \{0, \ldots , r-1\}$ . Suppose that the tuples $\mathopen {}( T_{\eta _{0j}}^{\,\rho _{0j}(n)} \mathclose {})_{j\in [\ell ]}, \ldots , \mathopen {}( T_{\eta _{kj}}^{\,\rho _{kj}(n)} \mathclose {})_{j\in [\ell ]}$ are already constructed and satisfy the properties (i)–(v). If the tuple $\mathopen {}( T_{\eta _{kj}}^{\,\rho _{kj}(n)} \mathclose {})_{j\in [\ell ]}$ has basic type, we halt. Otherwise, we choose the tuple $\mathopen {}( T_{\eta _{(k+1)j}}^{\,\rho _{(k+1)j}(n)} \mathclose {})_{j\in [\ell ]}$ using Proposition 6.1 if $\mathopen {}( T_{\eta _{kj}}^{\,\rho _{kj}(n)} \mathclose {})_{j\in [\ell ]}$ is controllable and using Corollary 6.7 otherwise, so that the properties (i) and (ii) are satisfied. (We remark that by the property (iv) applied to the tuple $\mathopen {}( T_{\eta _{kj}}^{\,\rho _{kj}(n)} \mathclose {})_{j\in [\ell ]}$ , we have $\eta _{kj} = j$ whenever $j\in {\mathfrak I}_1$ , and hence the type ${w_k=(w_{k1}, \ldots , w_{k\ell })}$ of $\mathopen {}( T_{\eta _{kj}}^{\,\rho _{kj}(n)} \mathclose {})_{j\in [\ell ]}$ satisfies $w_{k1}>0$ . By the assumption that $w_k$ is not basic, we also have $w_{kt}>0$ for some $t>1$ ; therefore, we have at least two non-zero indices in $w_k$ and we can act as in Proposition 6.1.) We note from Corollaries 6.3 and 6.7 that the new tuple is a descendant of $\mathopen {}( T_j^{p_j(n)} \mathclose {})_{j\in [\ell ]}$ , and by Proposition 6.4, it has the good ergodicity property and hence the property (iii) holds as well. The property (iv) holds by induction and the way the tuple $\mathopen {}( T_{\eta _{(k+1)j}}^{\,\rho _{(k+1)j}(n)} \mathclose {})_{j\in [\ell ]}$ is constructed using Proposition 6.1 or Corollary 6.7.

For the property (v), we first note that if $\mathopen {}( T_{\eta _{kj}}^{\,\rho _{kj}(n)} \mathclose {})_{j\in [\ell ]}$ is controllable, then the property (v) holds for $\mathopen {}( T_{\eta _{(k+1)j}}^{\,\rho _{(k+1)j}(n)} \mathclose {})_{j\in [\ell ]}$ by Proposition 6.1. If $\mathopen {}( T_{\eta _{kj}}^{\,\rho _{kj}(n)} \mathclose {})_{j\in [\ell ]}$ is uncontrollable, then we get from Corollary 6.7 that $w_{(k+1)t_{k}} = 0<w_{kt_k}$ , where $t_{k}:=t_{w_{k}}$ is the last non-zero index of $w_k$ , and we deduce from property (iv) that $w_{(k+1)t} = w_{kt}$ for $1\leq t < t_k$ . This point is important, so we explain it in words. What happens is that when we apply Corollary 6.7, the index $w_{(k+1)t_k}$ goes down to 0 (since all the transformations with indices from ${\mathfrak I}_{t_k}$ get flipped), but the new transformations appearing in their place have indices from ${\mathfrak I}_{t_k+1}, \ldots , {\mathfrak I}_{K_2}$ , as given by the property (iv). Hence, $w_{k+1}<w_k$ in this case as well.

It follows from the property (v) and the fact that there are at most $(K_3+1)^{K_2}$ possible types for tuples in ${\mathbb N}_0^{K_2}$ with sum of coordinates $K_3$ that the sequence eventually terminates. And since, by our construction, it can only terminate on the basic type $(K_3,0,\ldots , 0)$ , there exists $r<(K_3+1)^{K_2}\leq (\ell +1)^{\ell }$ such that the tuple $\mathopen {}( T_{\eta _{rj}}^{\,\rho _{rj}(n)} \mathclose {})_{j\in [\ell ]}$ has basic type $(K_3,0,\ldots , 0)$ . The second part of property (vi) for this tuple follows from the property (iv) by taking $t=1$ .
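As a quick sanity check on the counting bound used in this termination argument, the following Python sketch enumerates all vectors in ${\mathbb N}_0^{K_2}$ with coordinate sum $K_3$ and compares their number with $(K_3+1)^{K_2}$. The function name `count_types` and the sample values $K_2 = 3$, $K_3 = 8$ (chosen to match the shape of the types in Example 10) are illustrative assumptions and not notation from the paper.

```python
from itertools import product
from math import comb


def count_types(k2: int, k3: int) -> int:
    """Count vectors in N_0^{k2} whose coordinates sum to k3 (brute force)."""
    return sum(1 for w in product(range(k3 + 1), repeat=k2) if sum(w) == k3)


# Illustrative values only: three blocks and eight transformations.
k2, k3 = 3, 8
n_types = count_types(k2, k3)

# Stars and bars gives the exact count; the proof only uses the cruder bound.
assert n_types == comb(k3 + k2 - 1, k2 - 1)
assert n_types <= (k3 + 1) ** k2
print(n_types, (k3 + 1) ** k2)
```

The exact count is much smaller than $(K_3+1)^{K_2}$; the cruder bound is enough because the argument only needs some bound on $r$ in terms of $\ell $.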

Example 10. (Iterative reduction to tuples of lower type)

For the tuple in equation (78) from Example 9, Proposition 6.8 would give, among other options, the following sequence of tuples:

$$ \begin{align*} &\mathopen{}( T_1^{n^2}, T_2^{n^2}, T_3^{n^2}, T_4^{n^2}, T_5^{n^2+n}, T_6^{n^2+n}, T_7^{n^2+2n}, T_8^{n^2+2n} \mathclose{})\\ \rightarrow & \mathopen{}( T_1^{n^2}, T_2^{n^2}, T_3^{n^2}, T_4^{n^2}, T_5^{n^2+n}, T_6^{n^2+n}, T_7^{n^2+2n}, T_5^{n^2+2n} \mathclose{})\\ \rightarrow & \mathopen{}( T_1^{n^2}, T_2^{n^2}, T_3^{n^2}, T_4^{n^2}, T_5^{n^2+n}, T_6^{n^2+n}, T_5^{n^2+2n}, T_5^{n^2+2n} \mathclose{})\\ \rightarrow & \mathopen{}( T_1^{n^2}, T_2^{n^2}, T_3^{n^2}, T_4^{n^2}, T_1^{n^2+n}, T_6^{n^2+n}, T_5^{n^2+2n}, T_5^{n^2+2n} \mathclose{})\\ \rightarrow & \mathopen{}( T_1^{n^2}, T_2^{n^2}, T_3^{n^2}, T_4^{n^2}, T_1^{n^2+n}, T_1^{n^2+n}, T_5^{n^2+2n}, T_5^{n^2+2n} \mathclose{})\\ \rightarrow & \mathopen{}( T_1^{n^2}, T_2^{n^2}, T_3^{n^2}, T_4^{n^2}, T_1^{n^2+n}, T_1^{n^2+n}, T_7^{n^2+2n}, T_8^{n^2+2n} \mathclose{})\\ \rightarrow & \mathopen{}( T_1^{n^2}, T_2^{n^2}, T_3^{n^2}, T_4^{n^2}, T_1^{n^2+n}, T_1^{n^2+n}, T_7^{n^2+2n}, T_2^{n^2+2n} \mathclose{})\\ \rightarrow & \mathopen{}( T_1^{n^2}, T_2^{n^2}, T_3^{n^2}, T_4^{n^2}, T_1^{n^2+n}, T_1^{n^2+n}, T_2^{n^2+2n}, T_2^{n^2+2n} \mathclose{}). \end{align*} $$

Their types, starting from the top tuple, are

$$ \begin{align*} (4, 2, 2)> (4, 3, 1) > (4, 4, 0) > (5, 3, 0) > (6, 2, 0) > (6, 0, 2) > (7, 0, 1) > (8, 0, 0), \end{align*} $$

which shows that at each step, the new tuple has a lower type than its predecessor. The sixth tuple, counting from the top, has been obtained via Corollary 6.7 (since the fifth tuple is uncontrollable, as explained in Examples 4 and 9), while all the other tuples have been obtained using Proposition 6.1. Each subsequent tuple is a descendant of the original tuple, and so each of them has the good ergodicity property thanks to Proposition 6.4. Lastly, the final tuple has basic type, and moreover for its indexing tuple $\eta $ , the restriction $\eta |_{{\mathfrak I}_1}=\eta |_{[4]}$ is an identity because no substitution has taken place at the first four indices.
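The types listed in this example can be recomputed mechanically. The sketch below assumes, as read off from the example, that ${\mathfrak I}_1 = \{1,2,3,4\}$, ${\mathfrak I}_2 = \{5,6\}$, ${\mathfrak I}_3 = \{7,8\}$ and that the $t$-th coordinate of the type counts the positions $j$ with $\eta _j\in {\mathfrak I}_t$. The names `PARTITION`, `CHAIN`, and `type_of` are hypothetical helpers introduced only for this illustration.

```python
# Assumed partition I_1, I_2, I_3 of the index set [8], read off from Example 10.
PARTITION = [{1, 2, 3, 4}, {5, 6}, {7, 8}]


def type_of(eta, partition=PARTITION):
    """Return (w_1, ..., w_K), where w_t counts positions j with eta_j in I_t."""
    return tuple(sum(1 for e in eta if e in block) for block in partition)


# Each tuple is encoded by its indexing tuple eta: entry j records which
# transformation T_{eta_j} sits in position j. The polynomial iterates play
# no role when computing the type.
CHAIN = [
    (1, 2, 3, 4, 5, 6, 7, 8),
    (1, 2, 3, 4, 5, 6, 7, 5),
    (1, 2, 3, 4, 5, 6, 5, 5),
    (1, 2, 3, 4, 1, 6, 5, 5),
    (1, 2, 3, 4, 1, 1, 5, 5),
    (1, 2, 3, 4, 1, 1, 7, 8),
    (1, 2, 3, 4, 1, 1, 7, 2),
    (1, 2, 3, 4, 1, 1, 2, 2),
]

types = [type_of(eta) for eta in CHAIN]
for eta, w in zip(CHAIN, types):
    print(eta, "->", w)
```

The sketch only recomputes the types. It does not implement the ordering on types, whose definition lies outside this section.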

7 The proof of Theorem 1.1

7.1 Induction scheme

In all statements in this section, we work in the setting of Proposition 6.8, that is, all the lower type tuples at which we arrive from some original average are those constructed in Proposition 6.8 and have the properties listed there.

Theorem 1.1 follows by induction from the result below upon setting $d=L=0$ and letting $\eta $ be the identity tuple.

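Proposition 7.1. (Seminorm control restated)

Let $d, D, \ell , L\in {\mathbb N}$ , $\eta \in [\ell ]^{\ell }$ , and $p_1, \ldots , p_{\ell }$ , $q_1, \ldots , q_L\in {\mathbb Z}[n]$ be polynomials of degrees at most D. Suppose that the polynomials $p_1, \ldots , p_{\ell }$ have the good ergodicity property for the system $(X,{\mathcal X}, \mu ,T_1, \ldots , T_{\ell })$ . Let $\mathopen {}( T_{\eta _{j}}^{\,\rho _{j}(n)} \mathclose {})_{j\in [\ell ]}$ be a proper descendant of the tuple $\mathopen {}( T_j^{p_j(n)} \mathclose {})_{j\in [\ell ]}$ . Then there exists $s\in {\mathbb N}$ , depending only on $d, D, \ell , L$ , such that for all functions $f_1, \ldots , f_{\ell }\in L^{\infty }(\mu )$ with the good invariance property along $\eta $ with respect to $p_1, \ldots , p_{\ell }$ and all sequences of functions ${\mathcal D}_{1}, \ldots , {\mathcal D}_L\in {\mathfrak D}_d$ , we have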

(80)

$$ \begin{align} \lim_{N\to\infty}\bigg\Vert \mathop{\mathbb{E}}\limits_{n\in [N]} \,\prod_{j\in[\ell]}T_{{\eta_j}}^{\,\rho_j(n)}f_j \cdot \prod_{j\in[L]}{\mathcal D}_{j}(q_j(n))\bigg\Vert_{L^2(\mu)} = 0 \end{align} $$

whenever $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_j|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_{\eta _j}} = 0$ for some $j\in [\ell ]$ .

A word of explanation is necessary for the statement of Proposition 7.1. We need both polynomials $p_1, \ldots , p_{\ell }$ and $\rho _1, \ldots , \rho _{\ell }$ . The reason is that for our induction to work, we need the functions $f_1, \ldots , f_{\ell }$ to have the good invariance property with respect to the original family $p_1, \ldots , p_{\ell }$ rather than the descendant family $\rho _1, \ldots , \rho _{\ell }$ . This is necessary for a number of reasons: to prove seminorm control for averages of basic types in the proof of Proposition 7.2; to apply Proposition 6.6 in the pong step of Proposition 7.5; to derive Proposition 7.1 from Proposition 7.4 for controllable tuples; and to invoke Corollary 6.7 for uncontrollable tuples in Proposition 7.1. The necessity of keeping track of the invariance property with regard to the original polynomial family has also been explained in Step 2 of Example 8.

Technically, the value s in Proposition 7.1 also depends on $\eta $ , but since the number of possible tuples $\eta $ is bounded in terms of $\ell $ , this dependence can be removed.

We first prove Proposition 7.1 for averages from equation (35) of basic type. This will serve as the base for induction for Propositions 7.1, 7.4, and 7.5.

Proposition 7.2. (Seminorm control of basic types)

Let $d, D, \ell , L\in {\mathbb N}$ , $\eta \in [\ell ]^{\ell }$ and $p_1, \ldots , p_{\ell }$ , $q_1, \ldots , q_L\in {\mathbb Z}[n]$ be polynomials of degrees at most D. Suppose that the polynomials $p_1, \ldots , p_{\ell }$ have the good ergodicity property for the system $(X,{\mathcal X}, \mu ,T_1, \ldots , T_{\ell })$ . Let $\mathopen {}( T_{\eta _{j}}^{\,\rho _{j}(n)} \mathclose {})_{j\in [\ell ]}$ be a proper descendant of the tuple $\mathopen {}( T_j^{p_j(n)} \mathclose {})_{j\in [\ell ]}$ , and suppose that the type w of $\mathopen {}( T_{\eta _{j}}^{\,\rho _{j}(n)} \mathclose {})_{j\in [\ell ]}$ is basic. Then there exists $s\in {\mathbb N}$ , depending only on $d, D, \ell , L$ , such that for all 1-bounded functions $f_1, \ldots , f_{\ell }\in L^{\infty }(\mu )$ with the good invariance property along $\eta $ with respect to $p_1, \ldots , p_{\ell }$ and all sequences of functions ${\mathcal D}_{1}, \ldots , {\mathcal D}_L\in {\mathfrak D}_d$ , we have equation (80) whenever $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_j|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_{\eta _j}} = 0$ for some $j\in [\ell ]$ .

A special case of Proposition 7.2 has been sketched in Step 2 of Example 8, and we invite the reader to compare the abstract proof presented below with the argument in Step 2 of Example 8.

Proof. We induct on the length $\ell $ of the average. If $\ell = 1$ , the statement holds by Proposition 3.7. We therefore assume that $\ell> 1$ , and we will prove Proposition 7.2 for fixed $\ell>1$ by invoking Proposition 7.1 for an average of length $\ell -1$ . More specifically, we will show first that there exists an index m satisfying the controllability condition, and that we can control the average by a $T_{\eta _m}$ -seminorm of $f_m$ . Then we will replace $f_m$ by a dual function using Proposition 2.3 and the pigeonhole principle, flip the other transformations $T_{\eta _j}$ into $T_j$ using Proposition 6.6, and invoke Proposition 7.1 for averages of length $\ell -1$ to obtain seminorm control in terms of other functions.

Take any $m\in {\mathfrak I}_1$ ; we assume for simplicity that $m = \ell $ . Proposition 6.8(vi) implies that $\eta _{\ell }=\ell $ , and moreover that $\ell $ satisfies the controllability condition. This fact and Proposition 3.7 imply that the identity in equation (80) holds whenever $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_{\ell }|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{\textbf {b}}_1, \ldots , {\textbf {b}}_s} = 0$ for some

(81)

$$ \begin{align} {\textbf{b}}_1, \ldots, {\textbf{b}}_s \in\{b_{\ell}{\mathbf{e}}_{\ell} - b_i{\mathbf{e}}_{\eta_i}: i\in{\mathfrak L}\cup\{0\}\setminus\{\ell\}\}, \end{align} $$

where $s\in {\mathbb N}$ depends only on $d, D, \ell , L$ , the numbers $b_{\ell }, b_i$ are the coefficients of $\rho _{\ell }, \rho _i$ of degree $d_{\ell i}:= \deg (\rho _{\ell } {\mathbf {e}}_{\ell } - \rho _i{\mathbf {e}}_{\eta _i})$ whenever $i\neq 0$ , and for $i=0$ , we simply set $b_0{\mathbf {e}}_{\eta _0} = \mathbf {0}$ . Using the monotonicity property of equation (17), we can assume without loss of generality that $s\geq 2$ . Since the tuple $\mathopen {}( T_{\eta _j}^{\,\rho _j(n)} \mathclose {})_{j\in [\ell ]}$ has a basic type, it follows that the indices $\eta _i$ in equation (81) come from the set ${\mathfrak I}_1$ . We have to show that each transformation $T^{{\textbf {b}}_1}, \ldots , T^{{\textbf {b}}_s}$ is either a non-zero iterate of $T_{\ell }$ or its invariant functions are invariant under a bounded power of $T_{\ell }$ . For $k\in [s]$ , let ${\textbf {b}}_k:= b_{\ell }{\mathbf {e}}_{\ell } - b_i{\mathbf {e}}_{\eta _i}$ . If $i = 0$ , then $T^{{\textbf {b}}_k}$ is indeed a non-zero iterate of $T_{\ell }$ . If $i\in {\mathfrak I}_1$ , then $\eta _i = i\neq \ell $ by the property from Proposition 6.8(vi) that $\eta |_{{\mathfrak I}_1}$ is the identity tuple. Then $T^{{\textbf {b}}_k} = T_{\ell }^{b_{\ell }}T_i^{-b_i} = \mathopen {}( T_{\ell }^{\beta _{\ell }}T_i^{-\beta _i} \mathclose {})^{\gcd (b_{\ell }, b_i)}$ for coprime integers $\beta _{\ell }, \beta _i$ , and the good ergodicity property of $\rho _1, \ldots , \rho _{\ell }$ along $\eta $ (Proposition 6.8(iii)) implies that ${\mathcal I}(T_{\ell }^{\beta _{\ell }}T_i^{-\beta _i})\subseteq {\mathcal I}(T_{\ell })$ . If $i\notin {\mathfrak I}_1\cup \{0\}$ , then we split into the cases $\eta _i \neq \ell $ and $\eta _i = \ell $ . In the former case, we once again use the good ergodicity property of $\rho _1, \ldots , \rho _{\ell }$ along $\eta $ to conclude that $T^{{\textbf {b}}_k} = \mathopen {}( T_{\ell }^{\beta _{\ell }}T_i^{-\beta _i} \mathclose {})^{\gcd (b_{\ell }, b_i)}$ and ${\mathcal I}(T_{\ell }^{\beta _{\ell }}T_i^{-\beta _i})\subseteq {\mathcal I}(T_{\ell })$ . In the latter case, the pairwise independence of $\rho _i$ and $\rho _{\ell }$ implies that ${\textbf {b}}_k$ is non-zero, and hence $T^{{\textbf {b}}_k}$ is a non-zero iterate of $T_{\ell }$ . Rearranging ${\textbf {b}}_1, \ldots , {\textbf {b}}_s$ , it follows that

$$ \begin{align*} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f_{\ell}|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{\textbf{b}}_1, \ldots, {\textbf{b}}_s} = \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f_{\ell}|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{c_1 {\mathbf{e}}_{\ell}, \ldots, c_{s'} {\mathbf{e}}_{\ell}, {\textbf{b}}_{s'+1}, \ldots, {\textbf{b}}_{s}} \end{align*} $$

for some $0 \leq s'\leq s$ , non-zero integers $c_1, \ldots , c_{s'}$ , and transformations $T^{{\textbf {b}}_{s'+1}}, \ldots , T^{{\textbf {b}}_s}$ with the property that for every $j\in \{s'+1, \ldots , s\}$ , there exists ${\textbf {b}}^{\prime }_j\in {\mathbb Z}^{\ell }$ and non-zero $c_j\in {\mathbb Z}$ such that ${\textbf {b}}_j = c_j {\textbf {b}}^{\prime }_j$ and ${\mathcal I}(T^{{\textbf {b}}^{\prime }_j})\subseteq {\mathcal I}(T^{{\mathbf {e}}_{\ell }})$ . By Lemmas 2.1 and 2.2, we have

$$ \begin{align*} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f_{\ell}|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{\textbf{b}}_1, \ldots, {\textbf{b}}_s}= \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f_{\ell}|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{c_1 {\mathbf{e}}_{\ell}, \ldots, c_{s'} {\mathbf{e}}_{\ell}, c_{s'+1}{\textbf{b}}^{\prime}_{s'+1}, \ldots, c_s {\textbf{b}}^{\prime}_s}\ll_{c_1, \ldots, c_s}\lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f_{\ell}|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{s, T_{\ell}}, \end{align*} $$

implying that equation (80) holds whenever $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_{\ell }|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_{\ell }} = 0$ .

We now use the seminorm control at $\ell $ with Proposition 2.3 and the pigeonhole principle to deduce that if equation (80) fails, then it also fails when $T_{\eta _{\ell }}^{\,\rho _{\ell }(n)} f_{\ell }$ is replaced by a sequence ${\mathcal D}_{L+1}(\rho _{\ell }(n))$ with ${\mathcal D}_{L+1}\in {\mathfrak D}_s$ . Letting $q_{L+1} := \rho _{\ell }$ for simplicity, it is enough to show that

$$ \begin{align*} \lim_{N\to\infty}\bigg\Vert \mathop{\mathbb{E}}\limits_{n\in [N]} \,\prod_{j\in[\ell-1]}T_{{\eta_j}}^{\,\rho_j(n)}f_j \cdot \prod_{j\in[L+1]}{\mathcal D}_{j}(q_j(n))\bigg\Vert_{L^2(\mu)}>0 \end{align*} $$

implies $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_j|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s', T_{\eta _j}}> 0$ for every $j\in [\ell -1]$ and some $s'\in {\mathbb N}$ depending only on $d, D, \ell , L$ . By Proposition 6.6, there exist 1-bounded functions $f_1', \ldots , f_{\ell -1}'$ and polynomials $\rho _1', \ldots , \rho _{\ell -1}'$ with the good ergodicity property for the system such that

$$ \begin{align*} \lim_{N\to\infty}\bigg\Vert \mathop{\mathbb{E}}\limits_{n\in [N]} \,\prod_{j\in[\ell-1]}T_{{j}}^{\,\rho^{\prime}_j(n)}f^{\prime}_j \cdot \prod_{j\in[L+1]}{\mathcal D}_{j}(q_j(n))\bigg\Vert_{L^2(\mu)}>0 \end{align*} $$

and $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f^{\prime }_j|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s', T_j} \leq C \lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_j|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s', T_{\eta _j}}$ for all $j\in [\ell -1]$ , $s'\geq 1$ , and some constant $C>0$ depending only on $\gamma $ and the leading coefficients of $p_1, \ldots , p_{\ell }$ . Invoking inductively the case $\ell -1$ of Proposition 7.1, we deduce that $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f^{\prime }_j|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s', T_j}>0$ for some $s'\in {\mathbb N}$ depending only on $d, D, \ell , L$ , and hence $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_j|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s', T_{\eta _j}}>0$ . This proves the claim.

We also need the following quantitative version of Proposition 7.1.

Proposition 7.3. (Soft quantitative estimates)

Let $d, D, \gamma , \ell , L\in {\mathbb N}$ , $\eta \in [\ell ]^{\ell }$ , and $p_1, \ldots , p_{\ell }$ , $q_1, \ldots , q_L\in {\mathbb Z}[n]$ be polynomials of degrees at most D. Suppose that the polynomials $p_1, \ldots , p_{\ell }$ have the good ergodicity property for the system $(X,{\mathcal X}, \mu ,T_1, \ldots , T_{\ell })$ . Let $\mathopen {}( T_{\eta _{j}}^{\,\rho _{j}(n)} \mathclose {})_{j\in [\ell ]}$ be a proper descendant of the tuple $\mathopen {}( T_j^{p_j(n)} \mathclose {})_{j\in [\ell ]}$ . Then there exists $s\in {\mathbb N}$ , depending only on $d, D, \ell , L$ , with the following property: for any $\varepsilon>0$ , there exists $\delta>0$ such that for all $1$ -bounded functions $f_1, \ldots , f_{\ell }\in L^{\infty }(\mu )$ that are $\gamma $ -invariant along $\eta $ with respect to $p_1, \ldots , p_{\ell }$ and all sequences of functions ${\mathcal D}_{1}, \ldots , {\mathcal D}_L\in {\mathfrak D}_d$ , we have

$$ \begin{align*} \lim_{N\to\infty}\bigg\Vert \mathop{\mathbb{E}}\limits_{n\in [N]} \,\prod_{j\in[\ell]}T_{{\eta_j}}^{\,\rho_j(n)}f_j \cdot \prod_{j\in[L]}{\mathcal D}_{j}(q_j(n))\bigg\Vert_{L^2(\mu)} < \varepsilon \end{align*} $$

whenever $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_j|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_{\eta _j}} < \delta $ for some $j\in [\ell ]$ .

Proof. We prove Proposition 7.3 for fixed $d, D, \gamma , \ell , L, \eta $ by assuming Proposition 7.1 for the same parameters.

Let $s\in {\mathbb N}$ be as in the statement of Proposition 7.1 (so in particular it only depends on $d, D, \ell , L$ ) and $b_1, \ldots , b_{\ell }$ be the leading coefficients of $\rho _1, \ldots , \rho _{\ell }$ . Fix $\pi \in [\ell ]^L$ . We first prove the following qualitative claim: for all $f_1, \ldots , f_{\ell }, g_1, \ldots , g_L\in L^{\infty }(\mu )$ , where $f_j$ is $\mathopen {}( T_{\eta _j}^{b_{\eta _j}}T_j^{-b_j} \mathclose {})^{\gamma }$ -invariant for each $j\in [\ell ]$ and $g_j$ is ${\mathcal Z}_d(T_{\pi _j})$ -measurable for each $j\in [L]$ , we have

(82)

$$ \begin{align} \lim_{N\to\infty}\bigg\Vert \mathop{\mathbb{E}}\limits_{n\in [N]} \,\prod_{j\in[\ell]}T_{\eta_j}^{\,\rho_j(n)}f_j \cdot \prod_{j\in[L]}T^{q_j(n)}_{\pi_j}g_j\bigg\Vert_{L^2(\mu)} = 0 \end{align} $$

whenever $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_j|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_{\eta _j}} = 0$ for some $j\in [\ell ]$ .

Fix $f_1, \ldots , f_{\ell }, g_1, \ldots , g_L$ and suppose that equation (82) fails. Using Proposition 2.3 and the pigeonhole principle, we deduce that there exist dual functions $g_1', \ldots , g_L'$ of $T_{\pi _1}, \ldots , T_{\pi _L}$ of level d such that

$$ \begin{align*} \lim_{N\to\infty}\bigg\Vert \mathop{\mathbb{E}}\limits_{n\in [N]} \,\prod_{j\in[\ell]}T_{\eta_j}^{\,\rho_j(n)}f_j \cdot \prod_{j\in[L]}T^{q_j(n)}_{\pi_j}g_j'\bigg\Vert_{L^2(\mu)}> 0. \end{align*} $$

Setting ${\mathcal D}_j(n) := T_{\pi _j}^{n}g^{\prime }_j$ and using Proposition 7.1, we deduce that $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_j|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_{\eta _j}}>0$ for all $j\in [\ell ]$ , and so the claim follows.

We combine the claim above with Proposition 3.6 for

$$ \begin{align*} {\mathcal Y}_j := \begin{cases} {\mathcal I}\mathopen{}( \mathopen{}( T_{\eta_j}^{b_{\eta_j}}T_j^{-b_j} \mathclose{})^{\gamma} \mathclose{}), &1\leq j \leq \ell,\\ {\mathcal Z}_d(T_{\pi_{j-\ell}}), &\ell+1\leq j \leq \ell+L, \end{cases} \end{align*} $$

deducing the following: for every $\varepsilon>0$ , there exists $\delta _{\pi }>0$ such that for all 1-bounded functions $f_1, \ldots , f_{\ell }, g_1, \ldots , g_L\in L^{\infty }(\mu )$ with $f_j\in L^{\infty }\mathopen {}( {\mathcal I}\mathopen {}( \mathopen {}( T_{\eta _j}^{b_{\eta _j}}T_j^{-b_j} \mathclose {})^{\gamma } \mathclose {}) \mathclose {})$ for $j\in [\ell ]$ and $g_j\in L^{\infty }({\mathcal Z}_d(T_{\pi _j}))$ for $j\in [L]$ , we have

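$$ \begin{align*} \lim_{N\to\infty}\bigg\Vert \mathop{\mathbb{E}}\limits_{n\in [N]} \,\prod_{j\in[\ell]}T_{\eta_j}^{\,\rho_j(n)}f_j \cdot \prod_{j\in[L]}T^{q_j(n)}_{\pi_j}g_j\bigg\Vert_{L^2(\mu)} < \varepsilon \end{align*} $$

whenever $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_j|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_{\eta _j}} <\delta _{\pi }$ for some $j\in [\ell ]$ .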

Proposition 7.3 follows by taking $\delta := \min (\delta _{\pi }\colon \, \pi \in [\ell ]^L)$ and recalling that dual functions of $T_j$ of order d are ${\mathcal Z}_d(T_j)$ -measurable, and hence for every $j\in [L]$ , the sequence ${\mathcal D}_{j}(q_j(n))$ has the form ${\mathcal D}_j(q_j(n)) = T_{\pi _j}^{q_j(n)}g_j$ for some $\pi _j\in [\ell ]$ and a 1-bounded ${\mathcal Z}_d(T_{\pi _j})$ -measurable function $g_j\in L^{\infty }(\mu )$ .

For controllable tuples, Proposition 7.1 will be deduced from the following result.

Proposition 7.4. (Iterated box seminorm smoothing)

Let $d, D, \ell , L\in {\mathbb N}$ , $\eta \in [\ell ]^{\ell }$ , and $p_1, \ldots , p_{\ell }$ , $q_1, \ldots , q_L\in {\mathbb Z}[n]$ be polynomials of degrees at most D. Suppose that the polynomials $p_1, \ldots , p_{\ell }$ have the good ergodicity property for the system $(X,{\mathcal X}, \mu ,T_1, \ldots , T_{\ell })$ . Let $\mathopen {}( T_{\eta _{j}}^{\,\rho _{j}(n)} \mathclose {})_{j\in [\ell ]}$ be a proper descendant of the tuple $\mathopen {}( T_j^{p_j(n)} \mathclose {})_{j\in [\ell ]}$ . Suppose that $\mathopen {}( T_{\eta _{j}}^{\,\rho _{j}(n)} \mathclose {})_{j\in [\ell ]}$ of a non-basic type is controllable, and let m be an index satisfying the controllability condition. Then there exists $s\in {\mathbb N}$ , depending only on $d, D, \ell , L$ , such that for all functions $f_1, \ldots , f_{\ell }\in L^{\infty }(\mu )$ with the good invariance property along $\eta $ with respect to $p_1, \ldots , p_{\ell }$ and all sequences of functions ${\mathcal D}_{1}, \ldots , {\mathcal D}_L\in {\mathfrak D}_d$ , we obtain equation (80) whenever $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_m|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_{\eta _m}} = 0$ .

Proposition 7.4 is a consequence of Proposition 3.7, followed by an iterated application of the smoothing result given below.

Proposition 7.5. (Box seminorm smoothing)

Let $d, D, \ell , L\in {\mathbb N}$ , $\eta \in [\ell ]^{\ell }$ and $p_1, \ldots , p_{\ell }$ , $q_1, \ldots , q_L\in {\mathbb Z}[n]$ be polynomials of degrees at most D. Suppose that the polynomials $p_1, \ldots , p_{\ell }$ have the good ergodicity property for the system $(X,{\mathcal X}, \mu ,T_1, \ldots , T_{\ell })$ . Let $\mathopen {}( T_{\eta _{j}}^{\,\rho _{j}(n)} \mathclose {})_{j\in [\ell ]}$ be a proper descendant of the tuple $\mathopen {}( T_j^{p_j(n)} \mathclose {})_{j\in [\ell ]}$ . Suppose that $\mathopen {}( T_{\eta _{j}}^{\,\rho _{j}(n)} \mathclose {})_{j\in [\ell ]}$ of a non-basic type is controllable, and let m be an index satisfying the controllability condition. Then for all $s\geq 2$ and vectors ${\textbf {b}}_1, \ldots , {\textbf {b}}_{s+1}$ satisfying equation (23), there exists $s'\in {\mathbb N}$ , depending only on $d, D, \ell , L, s$ , with the following property: for all functions $f_1, \ldots , f_{\ell }\in L^{\infty }(\mu )$ with the good invariance property along $\eta $ with respect to $p_1, \ldots , p_{\ell }$ and all sequences of functions ${\mathcal D}_{1}, \ldots , {\mathcal D}_L\in {\mathfrak D}_d$ , if $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_m|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{{\textbf {b}}_1},\ldots , {{\textbf {b}}_{s+1}}}= 0$ implies equation (80), then equation (80) also holds under the assumption that $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_m|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{{\textbf {b}}_1},\ldots , {{\textbf {b}}_{s}}, {\mathbf {e}}_{\eta _m}^{\times s'}}= 0$ .

We now explain the induction scheme whereby we prove Propositions 7.1–7.5. Roughly speaking, the proofs proceed by induction on the length $\ell $ of an average and, for fixed $\ell $ , by induction on type, where the base case consists of averages of basic types. More precisely, the induction scheme goes as follows.

(i) For tuples $(T_{\eta _j}^{\,\rho _j(n)})_{j\in [\ell ]}$ of length $\ell $ of type w, Proposition 7.3 follows from Proposition 7.1 as proved before.
(ii) Tuples $(T_{\eta _j}^{\,\rho _j(n)})_{j\in [\ell ]}$ of length $\ell = 1$ have necessarily basic type, and Propositions 7.1 and 7.2 follow easily from Proposition 3.7.
(iii) For tuples $(T_{\eta _j}^{\,\rho _j(n)})_{j\in [\ell ]}$ of length $\ell>1$ and basic type, Proposition 7.1 is a consequence of Proposition 7.2.
(iv) For tuples $(T_{\eta _j}^{\,\rho _j(n)})_{j\in [\ell ]}$ of length $\ell>1$ and non-basic type w, we prove Proposition 7.5 only under the assumption of controllability. This proof goes by inductively invoking Proposition 7.3 in two cases: for tuples of length $\ell $ and type $w'<w$ , and for tuples of length $\ell -1$ . An iterative application of Proposition 7.5 then yields Proposition 7.4 for tuples of length $\ell $ and type w.
(v) For controllable tuples $(T_{\eta _j}^{\,\rho _j(n)})_{j\in [\ell ]}$ of length $\ell>1$ and non-basic type w, we prove Proposition 7.1 by invoking Proposition 7.4 for tuples of length $\ell $ and type w followed by an application of Proposition 7.1 for tuples of length $\ell -1$ .
(vi) Lastly, for uncontrollable tuples $(T_{\eta _j}^{\,\rho _j(n)})_{j\in [\ell ]}$ of length $\ell>1$ and non-basic type w, we prove Proposition 7.1 by inductively invoking Proposition 7.1 for tuples of length $\ell $ and type $w'<w$ .

The way in which step (vi) is carried out has been illustrated in Example 9, in which seminorm control for the uncontrollable average from equation (77) is deduced from seminorm control for the lower-type controllable average from equation (79). The example below summarizes how steps (iv) and (v) proceed for a controllable average.

Example 11. (Inductive steps for a controllable average)

Consider the tuple

(83)

$$ \begin{align} (T_1^{n^2}, T_2^{n^2}, T_3^{n^2+n}, T_2^{n^2 + n}) \end{align} $$

of type $(3, 1)$ , which is a descendant of the tuple

(84)

$$ \begin{align} (T_1^{n^2}, T_2^{n^2}, T_3^{n^2+n}, T_4^{n^2 + n}) \end{align} $$

from Example 7 of type $(2,2)$ . While proving Proposition 7.5 for equation (83), we invoke in the ping step Proposition 7.3 for tuples

$$ \begin{align*} (T_1^{n^2}, T_2^{n^2}, T_1^{n^2+n}, T_2^{n^2 + n})\quad \textrm{and}\quad (T_1^{n^2}, T_2^{n^2}, T_2^{n^2+n}, T_2^{n^2 + n}). \end{align*} $$

They are proper descendants of the original tuple in equation (84), have basic type $(4, 0)$ , and Proposition 7.1 follows for them from Proposition 7.2. In the pong step of the proof of Proposition 7.5 for equation (83), we inductively invoke Proposition 7.3 for the following tuples of length 3, obtained by replacing the first and second term respectively by dual functions:

$$ \begin{align*} (*, T_2^{n^2}, T_3^{n^2+n}, T_2^{n^2 + n})\quad \textrm{and}\quad (T_1^{n^2}, *, T_3^{n^2+n}, T_2^{n^2 + n}). \end{align*} $$

Finally, once we prove the $T_3$ -seminorm control of the third term in equation (83) using Proposition 7.4, an iterated version of Proposition 7.5, we derive seminorm control of other terms in equation (83) as follows. Replacing the third term in equation (83) by a dual function using the newly established $T_3$ -control, Proposition 2.3, and the pigeonhole principle, we get a tuple

$$ \begin{align*} (T_1^{n^2}, T_2^{n^2}, *, T_2^{n^2 + n}), \end{align*} $$

and then we apply Proposition 6.6 to flip the $T_2$ in the last term into $T_4$ , obtaining the tuple

$$ \begin{align*} (T_1^{n^2}, T_2^{n^2}, *, T_4^{3n^2 + 3n}). \end{align*} $$

This tuple has length 3, and it is good for seminorm control by Proposition 7.1 applied inductively to tuples of length 3. Going back, this gives us seminorm control of the other terms in equation (83).

7.2 Proof of Proposition 7.5

We prove Proposition 7.5 for the tuple $\mathopen {}( T_{\eta _{j}}^{\,\rho _{j}(n)} \mathclose {})_{j\in [\ell ]}$ of non-basic type w with the last non-zero index t, which is a proper descendant of $\mathopen {}( T_j^{p_j(n)} \mathclose {})_{j\in [\ell ]}$ , by assuming that Proposition 7.1 holds for tuples of length $\ell -1$ as well as length $\ell $ and type $w'<w$ . For simplicity of notation, we assume that $m = \ell $ satisfies the controllability condition.

By Proposition 3.7, the vector ${\textbf {b}}_{s+1}$ is non-zero and takes the form ${\textbf {b}}_{s+1} = b_{\ell } {\mathbf {e}}_{\eta _{\ell }} - b_i {\mathbf {e}}_{\eta _i}$ for some $i\in \{0, \ldots , \ell -1\}$ , where $b_{\ell }, b_i$ are the coefficients of $p_{\ell }, p_i$ of degree $d_{\ell i} := \deg (p_{\ell } {\mathbf {e}}_{\eta _{\ell }} - p_i{\mathbf {e}}_{\eta _i})$ . If $i = 0$ , then $b_{\ell }\neq 0$ since ${\textbf {b}}_{s+1}$ is non-zero. If $\eta _i = \eta _{\ell }$ , then the controllability condition implies that $\rho _i, \rho _{\ell }$ are independent, and so $T^{{\textbf {b}}_{s+1}}$ is a non-zero iterate of $T_{\eta _{\ell }}$ . In both these cases, the result follows from the bound

$$ \begin{align*} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f_{\ell}|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{{\textbf{b}}_1}, \ldots, {{\textbf{b}}_{s}}, c {\mathbf{e}}_{\eta_{\ell}}}\ll_{|c|} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f_{\ell}|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{{\textbf{b}}_1}, \ldots, {{\textbf{b}}_{s}}, {\mathbf{e}}_{\eta_{\ell}}} \end{align*} $$

for any $c\neq 0$ , which is a consequence of Lemma 2.1.

For the case $\eta _i \neq \eta _{\ell }$ and $\eta _i\in {\mathfrak I}_t$ , recall first that $\eta _{\ell }\in {\mathfrak I}_t$ by assumption, and the good ergodicity property of $p_1, \ldots , p_{\ell }$ implies that $\rho _1, \ldots , \rho _{\ell }$ have the good ergodicity property along $\eta $ . Hence, we have $b_{\ell } = \beta _{\ell } \gcd (b_{\ell }, b_i)$ , $b_i = \beta _i \gcd (b_{\ell }, b_i)$ for coprime integers $\beta _{\ell }, \beta _i\in {\mathbb Z}$ such that ${\mathcal I}(T^{\beta _{\ell } {\mathbf {e}}_{\eta _{\ell }} - \beta _i {\mathbf {e}}_{\eta _i}})\subseteq {\mathcal I}(T_{\ell })$ . By Lemmas 2.1 and 2.2, we have

$$ \begin{align*} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f_{\ell}|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{{\textbf{b}}_1}, \ldots, {{\textbf{b}}_{s+1}}}\ll \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f_{\ell}|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{\textbf{b}}_1, \ldots, {\textbf{b}}_s, \beta_{\ell} {\mathbf{e}}_{\eta_{\ell}} - \beta_i {\mathbf{e}}_{\eta_i}}\leq \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f_{\ell}|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{{\textbf{b}}_1}, \ldots, {{\textbf{b}}_{s}}, {\mathbf{e}}_{\eta_{\ell}}}. \end{align*} $$
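The splitting $b_{\ell } = \beta _{\ell } \gcd (b_{\ell }, b_i)$ , $b_i = \beta _i \gcd (b_{\ell }, b_i)$ used in this case is elementary, and a minimal Python sketch may help keep track of it. The helper name `coprime_parts` and the sample coefficients are hypothetical and serve only as an illustration. With $g := \gcd (b_{\ell }, b_i)$ , the output corresponds to writing $b_{\ell } {\mathbf {e}}_{\eta _{\ell }} - b_i {\mathbf {e}}_{\eta _i} = g(\beta _{\ell } {\mathbf {e}}_{\eta _{\ell }} - \beta _i {\mathbf {e}}_{\eta _i})$ .

```python
from math import gcd


def coprime_parts(b_ell: int, b_i: int):
    """Split non-zero integers as b = beta * g, where g = gcd(b_ell, b_i) > 0."""
    g = gcd(b_ell, b_i)
    return b_ell // g, b_i // g, g


# Hypothetical sample coefficients, chosen only for illustration.
beta_ell, beta_i, g = coprime_parts(6, 4)

# The two parts recover the original coefficients and are coprime.
assert (beta_ell * g, beta_i * g) == (6, 4)
assert gcd(beta_ell, beta_i) == 1
```

Since `math.gcd` returns a non-negative value, any signs of $b_{\ell }, b_i$ are carried by $\beta _{\ell }, \beta _i$ in this sketch.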

The last remaining case to consider, and the most difficult one, is when $\eta _i\in {\mathfrak I}_{t'}$ with $t'\neq t$ and $b_{\ell }, b_i\neq 0$ . The proof of Proposition 7.5 in this case follows the same two-step strategy that was explained in Example 1, but we also have to take into account additional complications explained in Examples 7 and 8. We first obtain the control of equation (35) by $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_i|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{{\textbf {b}}_1}, \ldots , {{\textbf {b}}_{s}}, {\mathbf {e}}_{\eta _i}^{\times s_1}}$ for some $s_1\in {\mathbb N}$ depending only on $d, D, \ell , L, s$ . This is accomplished by using the control by $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_{\ell }|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{{\textbf {b}}_1}, \ldots , {{\textbf {b}}_{s+1}}}$ , given by assumption, for an appropriately defined function $\tilde {f_{\ell }}$ in place of $f_{\ell }$ . Subsequently, we repeat the procedure by applying the newly established control by $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_i|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{{\textbf {b}}_1}, \ldots , {{\textbf {b}}_{s}}, {\mathbf {e}}_{\eta _i}^{\times s_1}}$ for a function $\tilde {f}_i$ in place of $f_i$ . This gives us the claimed result.

Step 1 (ping): Obtaining control by a seminorm of $f_i$ .

Suppose that

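(85)

$$ \begin{align} \lim_{N\to\infty}\bigg\Vert \mathop{\mathbb{E}}\limits_{n\in [N]} \,\prod_{j\in[\ell]}T_{{\eta_j}}^{\,\rho_j(n)}f_j \cdot \prod_{j\in[L]}{\mathcal D}_{j}(q_j(n))\bigg\Vert_{L^2(\mu)}>0. \end{align} $$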

The good invariance property of $f_1, \ldots , f_{\ell }$ implies that $f_{\ell }$ is invariant under $\mathopen {}( T_{\eta _{\ell }}^{a_{\eta _{\ell }}}T_{\ell }^{-a_{\ell }} \mathclose {})^{\gamma } = T^{{\mathbf {c}}}$ for some $\gamma>0$ and ${\mathbf {c}} := \gamma (a_{\eta _{\ell }}{\mathbf {e}}_{\eta _{\ell }}-a_{\ell }{\mathbf {e}}_{\ell })$, where $a_{\eta _{\ell }}, a_{\ell }$ are the leading coefficients of $p_{\eta _{\ell }}$ and $p_{\ell }$. Combining this with Lemma 3.2, we deduce that

$$ \begin{align*} \lim_{N\to\infty}\bigg\Vert \mathop{\mathbb{E}}\limits_{n\in [N]} \,\prod_{j\in[\ell-1]}T_{{\eta_j}}^{\,\rho_j(n)}f_j \cdot T_{\eta_{\ell}}^{\,\rho_{\ell}(n)}{\mathbb{E}}(\tilde{f}_{\ell}|{\mathcal I}(T^{\mathbf{c}}))\cdot \prod_{j\in[L]}{\mathcal D}_{j}(q_j(n))\bigg\Vert_{L^2(\mu)}>0 \end{align*} $$

for some function

$$ \begin{align*} \tilde{f}_{\ell}:=\lim_{k\to\infty} \mathop{\mathbb{E}}\limits_{n\in [N_k]} \, T_{\eta_{\ell}}^{-\rho_{\ell}(n)}g_k\cdot \prod_{\substack{j\in [\ell-1]}}T_{\eta_{\ell}}^{-\rho_{\ell}(n)} T_{{\eta_j}}^{\,\rho_j(n)}\overline{f}_j\cdot \prod_{j\in[L]}T_{\eta_{\ell}}^{-\rho_{\ell}(n)}{\mathcal D}_{j}(q_j(n)), \end{align*} $$

where the limit is a weak limit. Then our assumption gives

$$ \begin{align*} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| {\mathbb{E}}(\tilde{f}_{\ell}|{\mathcal I}(T^{\mathbf{c}}))|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{{\textbf{b}}_1}, \ldots, {{\textbf{b}}_{s+1}}}>0. \end{align*} $$

By Proposition 3.3, we get

$$ \begin{align*} &\hspace{-2pt}\liminf_{H\to\infty}\mathop{\mathbb{E}}\limits_{{\underline{h}},{\underline{h}}'\in [H]^{s}}\\ &\hspace{-2pt}\lim_{N\to\infty}\!\bigg\Vert \mathop{\mathbb{E}}\limits_{n\in[N]} \! \prod_{j\in[\ell-1]} \!T_{{\eta_j}}^{\,\rho_j(n)} (\Delta_{{{\textbf{b}}_1}, \ldots, {{\textbf{b}}_s}; {\underline{h}}-{\underline{h}}'} f_j)\cdot T_{\eta_{\ell}}^{\,\rho_{\ell}(n)}u_{{\underline{h}},{\underline{h}}'}\cdot\! \prod_{j\in[L]} {\mathcal D}^{\prime}_{j, {\underline{h}}, {\underline{h}}'}(q_j(n))\bigg\Vert_{L^2(\mu)}\!>0, \end{align*} $$

where $u_{{\underline {h}},{\underline {h}}'}$ are 1-bounded and invariant under both $T^{{\textbf {b}}_{s+1}}$ and $T^{\mathbf {c}}$ , and

$$ \begin{align*} {\mathcal D}^{\prime}_{j, {\underline{h}}, {\underline{h}}'}(n) := \Delta_{{{\textbf{b}}_1}, \ldots, {{\textbf{b}}_s}; {\underline{h}}-{\underline{h}}'}{\mathcal D}_{j}(n) \end{align*} $$

is a product of $2^s$ elements of ${\mathfrak D}_d$ . As a consequence of the $T^{{\textbf {b}}_{s+1}}$ -invariance of $u_{{\underline {h}},{\underline {h}}'}$ , we have

(86)

$$ \begin{align} T_{\eta_{\ell}}^{b_{\ell}}u_{{\underline{h}},{\underline{h}}'}=T_{\eta_i}^{b_i}u_{{\underline{h}},{\underline{h}}'}, \end{align} $$

where we use the identity $T_{\eta _{\ell }}^{b_{\ell }}=T_{\eta _i}^{b_i}T^{{\textbf {b}}_{s+1}}$. Let $\lambda \in {\mathbb N}$ be the smallest natural number such that $b_{\ell }$ divides the coefficients of $\lambda \rho _{\ell }$. By the triangle inequality,

$$ \begin{align*} &\liminf_{H\to\infty}\mathop{\mathbb{E}}\limits_{{\underline{h}},{\underline{h}}'\in [H]^{s}}\mathop{\mathbb{E}}\limits_{r\in \{0, \ldots, \lambda-1\}}\\ &\quad\lim_{N\to\infty}\bigg\Vert \mathop{\mathbb{E}}\limits_{n\in[N]} \,\prod_{j\in[\ell-1]} T_{{\eta_j}}^{\,\rho_j(\lambda n + r)} (\Delta_{{{\textbf{b}}_1}, \ldots, {{\textbf{b}}_s}; {\underline{h}}-{\underline{h}}'} f_j)\cdot T_{\eta_{\ell}}^{\,\rho_{\ell}(\lambda n + r)}u_{{\underline{h}},{\underline{h}}'}\\ &\qquad \times \prod_{j\in[L]}{\mathcal D}^{\prime}_{j, {\underline{h}}, {\underline{h}}'}(q_j(\lambda n + r))\bigg\Vert_{L^2(\mu)}>0. \end{align*} $$

Using equation (86) and the pigeonhole principle, and setting $\eta '=\tau _{\ell i}\eta $, we get that for some $r_0\in \{0, \ldots , \lambda -1\}$, we have

(87)

$$ \begin{align} \limsup_{H\to\infty}\mathop{\mathbb{E}}\limits_{{\underline{h}},{\underline{h}}'\in [H]^{s}} \lim_{N\to\infty}\bigg\Vert \mathop{\mathbb{E}}\limits_{n\in[N]} \, \prod_{j\in[\ell]} T_{{\eta^{\prime}_j}}^{\,\rho^{\prime}_j(n)} f_{j, {\underline{h}}, {\underline{h}}'}\cdot \prod_{j\in[L]}{\mathcal D}^{\prime}_{j, {\underline{h}}, {\underline{h}}'}(q^{\prime}_j(n))\bigg\Vert_{L^2(\mu)}>0, \end{align} $$

where

$$ \begin{align*} \rho^{\prime}_j(n) &:= \begin{cases} \rho_j(\lambda n+r_0) - \rho_j(r_0), &j\in[\ell-1],\\[4pt] \dfrac{b_i}{b_{\ell}}(\rho_{\ell}(\lambda n+r_0)-\rho_{\ell}(r_0)), &j = \ell, \end{cases}\\ f_{j, {\underline{h}}, {\underline{h}}'} &:= \begin{cases} \Delta_{{{\textbf{b}}_1}, \ldots, {{\textbf{b}}_s}; {\underline{h}}-{\underline{h}}'} T_{\eta_j}^{\,\rho_j(r_0)} f_j, &j\in[\ell-1],\\ T_{\eta_{\ell}}^{\,\rho_{\ell}(r_0)} u_{{\underline{h}},{\underline{h}}'}, &j=\ell, \end{cases} \end{align*} $$

and $q^{\prime }_j(n):=q_j(\lambda n+r_0)$. We note that the polynomials $\rho ^{\prime }_1, \ldots , \rho ^{\prime }_{\ell }$ are as in Proposition 6.1.
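To make the passage to equation (87) explicit, we sketch the substitution for the index $j=\ell $. The sketch assumes that $\eta '=\tau _{\ell i}\eta $ agrees with $\eta $ except that $\eta ^{\prime }_{\ell } = \eta _i$ (our reading of the notation), and the polynomial $m_r$ below is auxiliary notation introduced only for this illustration. By the choice of $\lambda $, all coefficients of $\rho _{\ell }(\lambda n + r) - \rho _{\ell }(r)$ are divisible by $b_{\ell }$, so that $\rho _{\ell }(\lambda n + r) - \rho _{\ell }(r) = b_{\ell }\, m_r(n)$ for some $m_r\in {\mathbb Z}[n]$. Since $u_{{\underline {h}},{\underline {h}}'}$ is invariant under every power of $T^{{\textbf {b}}_{s+1}} = T_{\eta _{\ell }}^{b_{\ell }}T_{\eta _i}^{-b_i}$, equation (86) extends to $T_{\eta _{\ell }}^{b_{\ell } m}u_{{\underline {h}},{\underline {h}}'} = T_{\eta _i}^{b_i m}u_{{\underline {h}},{\underline {h}}'}$ for all $m\in {\mathbb Z}$, and therefore

$$ \begin{align*} T_{\eta_{\ell}}^{\,\rho_{\ell}(\lambda n + r)}u_{{\underline{h}},{\underline{h}}'} = T_{\eta_{\ell}}^{\,\rho_{\ell}(r)}\, T_{\eta_{\ell}}^{\,b_{\ell} m_r(n)}u_{{\underline{h}},{\underline{h}}'} = T_{\eta_{\ell}}^{\,\rho_{\ell}(r)}\, T_{\eta_i}^{\,b_i m_r(n)}u_{{\underline{h}},{\underline{h}}'} = T_{\eta_i}^{\,b_i m_r(n)}\, T_{\eta_{\ell}}^{\,\rho_{\ell}(r)}u_{{\underline{h}},{\underline{h}}'}. \end{align*} $$

Here $b_i m_r(n) = \frac {b_i}{b_{\ell }}(\rho _{\ell }(\lambda n + r) - \rho _{\ell }(r))$, which for $r = r_0$ is the polynomial $\rho ^{\prime }_{\ell }(n)$, while $T_{\eta _{\ell }}^{\,\rho _{\ell }(r_0)}u_{{\underline {h}},{\underline {h}}'}$ is the function $f_{\ell , {\underline {h}}, {\underline {h}}'}$.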

It follows from equation (87) that there exists a set $B\subset {\mathbb N}^{2s}$ of positive upper density and $\varepsilon>0$ such that

(88)

$$ \begin{align} \lim_{N\to\infty}\bigg\Vert \mathop{\mathbb{E}}\limits_{n\in[N]} \, \prod_{j\in[\ell]} T_{{\eta^{\prime}_j}}^{\,\rho^{\prime}_j(n)} f_{j, {\underline{h}}, {\underline{h}}'}\cdot \prod_{j\in[L]}{\mathcal D}^{\prime}_{j, {\underline{h}}, {\underline{h}}'}(q^{\prime}_j(n))\bigg\Vert_{L^2(\mu)}\geq \varepsilon \end{align} $$

for every $({\underline {h}},{\underline {h}}')\in B$ .

By assumption, $t$ is the last non-zero index of $w$, implying that $t'<t$. Hence, the new tuple $\mathopen {}( T_{\eta ^{\prime }_j}^{\,\rho ^{\prime }_j(n)} \mathclose {})_{j\in [\ell ]}$ has type $w'=\sigma _{tt'}w<w$. Furthermore, by Proposition 6.5, the functions $f_{j, {\underline {h}}, {\underline {h}}'}$ have the good invariance property along $\eta '$ with respect to $p_1, \ldots , p_{\ell }$. We therefore inductively apply Proposition 7.3 for tuples of length $\ell $ and type $w'$ to each average from equation (88). This allows us to conclude that there exist $s_1\in {\mathbb N}$, depending only on $d, D, \ell , L, s$, and $\delta>0$ such that

$$ \begin{align*}\lvert\hspace{-1.2pt}|\hspace{-1.2pt}| \Delta_{{{\textbf{b}}_1}, \ldots, {{\textbf{b}}_s}; {\underline{h}}-{\underline{h}}'} f_i|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{s_1, T_{\eta_i}} \geq \delta\end{align*} $$

for $({\underline {h}}, {\underline {h}}')\in B$ . Hence,

(89)

$$ \begin{align} \limsup_{H\to\infty}\mathop{\mathbb{E}}\limits_{{\underline{h}},{\underline{h}}'\in [H]^{s}} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| \Delta_{{{\textbf{b}}_1}, \ldots, {{\textbf{b}}_s}; {\underline{h}}-{\underline{h}}'} f_i|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{s_1, T_{\eta_i}}>0. \end{align} $$

Together with Lemma 3.1, the inductive formula for seminorms in equation (15), and the Hölder inequality, the inequality in equation (89) implies that

$$ \begin{align*} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| f_i|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{{\textbf{b}}_1}, \ldots, {{\textbf{b}}_s}, {\mathbf{e}}_{\eta_i}^{\times s_1}}>0, \end{align*} $$

and so the seminorm $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_i|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{{\textbf {b}}_1}, \ldots , {{\textbf {b}}_s}, {\mathbf {e}}_{\eta _i}^{\times s_1}}$ controls the average from equation (35).

Step 2 (pong): Obtaining control by a seminorm of $f_{\ell }$ .

To get the claim that $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_{\ell }|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{{\textbf {b}}_1}, \ldots , {{\textbf {b}}_s}, {\mathbf {e}}_{\eta _{\ell }}^{\times s'}}$ controls the average for some $s'\in {\mathbb N}$ depending only on $d, D, \ell , L, s$ , we repeat the procedure once more with $f_i$ in place of $f_{\ell }$ . From equation (85), it follows that

$$ \begin{align*} \lim_{N\to\infty}\bigg\Vert \mathop{\mathbb{E}}\limits_{n\in [N]} \,\prod_{\substack{j\in[\ell],\\ j\neq i}}T_{{\eta_j}}^{\,\rho_j(n)}f_j \cdot T_{\eta_i}^{\,\rho_i(n)}\tilde{f_i} \cdot \prod_{j\in[L]}{\mathcal D}_{j}(q_j(n))\bigg\Vert_{L^2(\mu)}>0 \end{align*} $$

for some function

$$ \begin{align*} \tilde{f}_i:=\lim_{k\to\infty} \mathop{\mathbb{E}}\limits_{n\in [N_k]} \, T_{\eta_i}^{-\rho_i(n)}g_k\cdot \prod_{\substack{j\in [\ell],\\ j\neq i}}T_{\eta_i}^{-\rho_i(n)} T_{{\eta_j}}^{\,\rho_j(n)}\overline{f}_j\cdot \prod_{j\in[L]}T_{\eta_i}^{-\rho_i(n)}{\mathcal D}_{j}(q_j(n)), \end{align*} $$

where the limit is a weak limit. Then the previous result gives

$$ \begin{align*} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| \tilde{f}_i|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{{{\textbf{b}}_1}, \ldots, {{\textbf{b}}_s}, {\mathbf{e}}_{\eta_i}^{\times s_1}}>0. \end{align*} $$

By Proposition 3.3, we get

$$ \begin{align*} &\liminf_{H\to\infty}\mathop{\mathbb{E}}\limits_{{\underline{h}},{\underline{h}}'\in [H]^{s}}\lim_{N\to\infty}\bigg\Vert \mathop{\mathbb{E}}\limits_{n\in[N]} \, \prod_{\substack{j\in[\ell],\\ j\neq i}} T_{{\eta_j}}^{\,\rho_j(n)} (\Delta_{{{\textbf{b}}_1}, \ldots, {{\textbf{b}}_{s}}; {\underline{h}}-{\underline{h}}'} f_j)\\ &\quad\times \prod_{j\in[L+1]}{\mathcal D}_{j,{\underline{h}},{\underline{h}}'}(q_j(n))\bigg\Vert_{L^2(\mu)}>0, \end{align*} $$

where

$$ \begin{align*} {\mathcal D}_{j,{\underline{h}},{\underline{h}}'}(n):=\begin{cases} \Delta_{{{\textbf{b}}_1}, \ldots, {{\textbf{b}}_s}; {\underline{h}}-{\underline{h}}'}{\mathcal D}_{j}(n), &j \in[L],\\ T_{\eta_i}^n T^{-({\textbf{b}}_1 h_1'+\cdots+{\textbf{b}}_s h^{\prime}_s)}\prod\limits_{{\underline{\epsilon}}\in\{0,1\}^s}{\mathcal C}^{|{\underline{\epsilon}}|}{\mathcal D}_{s_1, T_{\eta_i}}(\Delta_{{{\textbf{b}}_1}, \ldots, {{\textbf{b}}_{s}}; {\underline{h}}^{{\underline{\epsilon}}}}\tilde{f}_i), &j=L+1. \end{cases} \end{align*} $$

Thus, the sequence of functions ${\mathcal D}_{j,{\underline {h}},{\underline {h}}'}$ is a product of $2^s$ elements of ${\mathfrak D}_d$ if $j\in [L]$, and it is a product of $2^{s}$ elements of ${\mathfrak D}_{s_1}$ for $j=L+1$. Consequently, there exist $\varepsilon>0$ and a set $B'\subset {\mathbb N}^{2s}$ of positive lower density such that for every $({\underline {h}}, {\underline {h}}')\in B'$, we have

$$ \begin{align*} \lim_{N\to\infty}\bigg\Vert \mathop{\mathbb{E}}\limits_{n\in[N]} \, \prod_{\substack{j\in[\ell],\\ j\neq i}} T_{{\eta_j}}^{\,\rho_j(n)} (\Delta_{{{\textbf{b}}_1}, \ldots, {{\textbf{b}}_{s}}; {\underline{h}}-{\underline{h}}'} f_j)\cdot \prod_{j\in[L+1]}{\mathcal D}_{j,{\underline{h}},{\underline{h}}'}(q_j(n))\bigg\Vert_{L^2(\mu)}> \varepsilon. \end{align*} $$

Proposition 6.5 implies that the functions $(g_{1,{\underline {h}}, {\underline {h}}'}, \ldots , g_{\ell , {\underline {h}}, {\underline {h}}'})_{({\underline {h}}, {\underline {h}}')\in {\mathbb N}^{2s}}$ given by

$$ \begin{align*} g_{j, {\underline{h}}, {\underline{h}}'} := \begin{cases}\Delta_{{{\textbf{b}}_1}, \ldots, {{\textbf{b}}_{s}}; {\underline{h}}-{\underline{h}}'} f_j, &j \neq i,\\ 1, &j = i, \end{cases} \end{align*} $$

have the good invariance property along $\eta $ with respect to $p_1, \ldots , p_{\ell }$. Proposition 6.6 then gives polynomials $\rho _1', \ldots , \rho _{\ell }', q^{\prime }_1, \ldots , q^{\prime }_L\in {\mathbb Z}[n]$ and 1-bounded functions $g^{\prime }_{1,{\underline {h}}, {\underline {h}}'}, \ldots , g^{\prime }_{\ell ,{\underline {h}}, {\underline {h}}'}$ with $g^{\prime }_{i, {\underline {h}}, {\underline {h}}'} := 1$ such that

(90)

$$ \begin{align} \lim_{N\to\infty}\bigg\Vert \mathop{\mathbb{E}}\limits_{n\in[N]} \, \prod_{\substack{j\in[\ell],\\ j\neq i}} T_j^{\,\rho^{\prime}_j(n)} g^{\prime}_{j,{\underline{h}}, {\underline{h}}'}\cdot \prod_{j\in[L+1]}{\mathcal D}_{j,{\underline{h}},{\underline{h}}'}(q^{\prime}_j(n))\bigg\Vert_{L^2(\mu)}> \varepsilon. \end{align} $$

It also gives a constant $C>0$ depending only on $\gamma $ and the leading coefficients of $p_1, \ldots , p_{\ell }$ such that

(91)

$$ \begin{align} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| g^{\prime}_{j, {\underline{h}}, {\underline{h}}'}|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{s', T_j}\leq C \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| \Delta_{{{\textbf{b}}_1}, \ldots, {{\textbf{b}}_{s}}; {\underline{h}}-{\underline{h}}'} f_j|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{s', T_{\eta_j}} \end{align} $$

for all $j\in [\ell ]\setminus \{i\}$ , $({\underline {h}},{\underline {h}}')\in B'$ , and $s'\geq 2$ . Observing that the averages from equation (90) have length $\ell -1$ and $\rho ^{\prime }_1, \ldots , \rho ^{\prime }_{i-1}, \rho ^{\prime }_{i+1}, \ldots , \rho ^{\prime }_{\ell }$ have the good ergodicity property for the system $(X, {\mathcal X}, \mu , T_1, \ldots , T_{i-1}, T_{i+1}, \ldots , T_{\ell })$ (another consequence of Propositions 6.6 and 6.4), we conclude from equation (91) and Proposition 7.3 that there exist $s'\in {\mathbb N}$ (depending only on $d, D, \ell , L, s$ ) and $\delta>0$ satisfying

$$ \begin{align*} \lvert\hspace{-1.2pt}|\hspace{-1.2pt}| \Delta_{{{\textbf{b}}_1}, \ldots, {{\textbf{b}}_{s}}; {\underline{h}}-{\underline{h}}'} f_{\ell}|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{s', T_{\eta_{\ell}}}>\delta \end{align*} $$

for every $({\underline {h}}, {\underline {h}}')\in B'$ . Consequently, we deduce that

$$ \begin{align*} \liminf_{H\to\infty}\mathop{\mathbb{E}}\limits_{{\underline{h}}, {\underline{h}}'\in [H]^{s}}\lvert\hspace{-1.2pt}|\hspace{-1.2pt}| \Delta_{{{\textbf{b}}_1}, \ldots, {{\textbf{b}}_{s}}; {\underline{h}}-{\underline{h}}'} f_{\ell}|\hspace{-1.2pt}|\hspace{-1.2pt}\rvert_{s', T_{\eta_{\ell}}}>0. \end{align*} $$

It then follows from Lemma 3.1 and the Hölder inequality that $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_{\ell }|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{{{\textbf {b}}_1}, \ldots , {{\textbf {b}}_{s}}, {\mathbf {e}}_{\eta _{\ell }}^{\times s'}}>0$ , as claimed.

7.3 Proof of Proposition 7.1

We induct on the length $\ell $ of the average, and for each fixed $\ell $, we further induct on type. In the base case $\ell =1$, Proposition 7.1 follows directly from Proposition 3.7. We assume therefore that the average has length $\ell>1$ and type $w$, and that the statement holds for averages of length $\ell -1$ as well as for averages of length $\ell $ and type $w'<w$. If the type $w$ is basic, then Proposition 7.1 follows from Proposition 7.2, so we assume that $w$ is not basic. We argue differently depending on whether the average is controllable or not.

Case 1: Controllable averages.

If the average is controllable, then Proposition 7.4 gives $s\in {\mathbb N}$ depending only on $d, D,\ell , L$ as well as $m\in [\ell ]$ such that equation (80) holds whenever $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_m|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_{\eta _m}}=0$ . Suppose now that

$$ \begin{align*} \lim_{N\to\infty}\bigg\Vert \mathop{\mathbb{E}}\limits_{n\in [N]} \,\prod_{j\in[\ell]}T_{{\eta_j}}^{\,\rho_j(n)}f_j \prod_{j\in[L]}{\mathcal D}_{j}(q_j(n))\bigg\Vert_{L^2(\mu)}> 0. \end{align*} $$

Applying the fact that $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_m|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_{\eta _m}}$ controls this average, Proposition 2.3, and the pigeonhole principle, we replace $f_m$ by a dual function of level $s$, so that we have

$$ \begin{align*} \lim_{N\to\infty}\bigg\Vert \mathop{\mathbb{E}}\limits_{n\in [N]} \,\prod_{\substack{j\in[\ell],\\ j\neq m}}T_{{\eta_j}}^{\,\rho_j(n)}f_j \cdot {\mathcal D}(\rho_m(n)) \cdot\prod_{j\in[L]}{\mathcal D}_{j}(q_j(n))\bigg\Vert_{L^2(\mu)}> 0 \end{align*} $$

for some ${\mathcal D}\in {\mathfrak D}_s$ . By Proposition 6.6, there exist 1-bounded functions $(f_j')_{j\in [\ell ]}$ with $f^{\prime }_m = 1$ and polynomials $\rho _1', \ldots , \rho _{\ell }', q_1', \ldots , q_L'\in {\mathbb Z}[n]$ such that

$$ \begin{align*} \lim_{N\to\infty}\bigg\Vert \mathop{\mathbb{E}}\limits_{n\in [N]} \,\prod_{\substack{j\in[\ell],\\ j\neq m}}T_j^{\,\rho_j'(n)}f_j' \cdot {\mathcal D}(\rho_m(n)) \cdot\prod_{j\in[L]}{\mathcal D}_{j}(q_j'(n))\bigg\Vert_{L^2(\mu)}> 0. \end{align*} $$

Provided $s\geq 2$ (which we can assume without loss of generality), Proposition 6.6 also implies that $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f^{\prime }_j|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_j} \leq C \lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_j|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_{\eta _j}}$ for a constant $C>0$ depending only on the leading coefficients of $p_1, \ldots , p_{\ell }$ and the number $\gamma $ for which $f_1, \ldots , f_{\ell }$ have the $\gamma $ -invariance property along $\eta $ . Moreover, the fact that $\rho ^{\prime }_1, \ldots , \rho ^{\prime }_{\ell }$ are descendants of $p_1, \ldots , p_{\ell }$ and Proposition 6.4 imply that $\mathopen {}( T_j^{\,\rho _j'(n)} \mathclose {})_{j\in [\ell ], j\neq m}$ has the good ergodicity property. By the case $\ell -1$ of Proposition 7.1, we deduce that $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f^{\prime }_j|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_j}>0$ for $j\neq m$ , and hence $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_j|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_{\eta _j}}>0$ for $j\neq m$ .

Case 2: Uncontrollable averages.

If the average is uncontrollable, then we apply Proposition 6.8 to deduce the existence of polynomials $q_1', \ldots , q_L'\in {\mathbb Z}[n]$ , a tuple $\mathopen {}( T_{\eta _j'}^{\,\rho ^{\prime }_j(n)} \mathclose {})_{j\in [\ell ]}$ of type $w'<w$ that is a proper descendant of $\mathopen {}( T_j^{p_j(n)} \mathclose {})_{j\in [\ell ]}$ , as well as functions $f_1', \ldots , f_{\ell }'\in L^{\infty }(\mu )$ satisfying

$$ \begin{align*} &\lim_{N\to\infty}\bigg\Vert \mathop{\mathbb{E}}\limits_{n\in [N]} \,\prod_{j\in[\ell]}T_{{\eta_j}}^{\,\rho_j(n)}f_j \prod_{j\in[L]}{\mathcal D}_{j}(q_j(n))\bigg\Vert_{L^2(\mu)} \\ &\quad\leq\lim_{N\to\infty}\bigg\Vert \mathop{\mathbb{E}}\limits_{n\in [N]} \,\prod_{j\in[\ell]}T_{{\eta^{\prime}_j}}^{\,\rho^{\prime}_j(n)}f^{\prime}_j \prod_{j\in[L]}{\mathcal D}_{j}(q^{\prime}_j(n))\bigg\Vert_{L^2(\mu)} \end{align*} $$

and $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f^{\prime }_j|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_{\eta ^{\prime }_j}}\leq C \lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_j|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_{\eta _j}}$ for every $s\geq 2$ and $j\in [\ell ]$ as well as some constant $C>0$ depending only on $\gamma $ and the leading coefficients of $p_1, \ldots , p_{\ell }$ . Moreover, the functions $f^{\prime }_1, \ldots , f^{\prime }_{\ell }$ have the good invariance property along $\eta '$ with respect to $p_1, \ldots , p_{\ell }$ . By the induction hypothesis, there exists $s\in {\mathbb N}$ such that the second average above vanishes whenever $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f^{\prime }_j|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_{\eta ^{\prime }_j}}=0$ , and so the first average also vanishes whenever $\lvert \hspace {-1.2pt}|\hspace {-1.2pt}| f_j|\hspace {-1.2pt}|\hspace {-1.2pt}\rvert _{s, T_{\eta _j}}=0$ . This establishes the seminorm control over the tuple $\mathopen {}( T_{\eta _j}^{\,\rho _j(n)} \mathclose {})_{j\in [\ell ]}$ .

8 Proofs of joint ergodicity results

In this section, we derive Theorem 1.2 and Corollaries 1.3 and 1.4. We start with two observations that connect the notions of joint ergodicity and weak joint ergodicity. Their proofs are straightforward and hence we skip them.

Lemma 8.1. Let $(X, {\mathcal X}, \mu , T)$ be a system. Suppose that there exists a sequence $a:{\mathbb N}\to {\mathbb Z}$ such that $(T^{a(n)})_{n\in {\mathbb N}}$ is ergodic for $\mu $. Then $T$ is ergodic.

Lemma 8.2. Let $a_1, \ldots , a_{\ell }:{\mathbb N}\to {\mathbb Z}$ be sequences and $(X, {\mathcal X}, \mu , T_1, \ldots , T_{\ell })$ be a system. The sequences are jointly ergodic for the system if and only if they are weakly jointly ergodic and the transformations $T_1, \ldots , T_{\ell }$ are ergodic.

We continue with the proof of Theorem 1.2.

Proof of Theorem 1.2

Suppose first that the polynomials $p_1, \ldots , p_{\ell }$ have the good ergodicity property for the system $(X, {\mathcal X}, \mu , T_1, \ldots , T_{\ell })$ . By Theorem 1.1, this implies that $p_1, \ldots , p_{\ell }$ are good for the seminorm control for the system $(X, {\mathcal X}, \mu , T_1, \ldots , T_{\ell })$ . This result, property (ii), and Theorem 2.4 imply that $p_1, \ldots , p_{\ell }$ are weakly jointly ergodic for the system.

Conversely, suppose that the polynomials are weakly jointly ergodic for the system. Condition (ii) follows by taking $f_1, \ldots , f_{\ell }$ to be non-ergodic eigenfunctions of the respective transformations. To prove condition (i), suppose that $p_i/c_i = p_j/c_j$ for some $i\neq j$ and coprime integers $c_i, c_j$, and that there exists a function $f$ invariant under $T_i^{c_i}T_j^{-c_j}$ that is not simultaneously invariant under $T_i$ and $T_j$. The invariance property of $f$ gives $T_i^{c_i} f = T_j^{c_j}{f}$, and the same holds for $\overline {f}$. The coprimeness of $c_i, c_j$ implies that the polynomial $p_i/c_i = p_j/c_j$ has integer coefficients, and so we have

$$ \begin{align*} \mathop{\mathbb{E}}\limits_{n\in[N]}T_i^{p_i(n)}f \cdot T_j^{p_j(n)}\overline{f} = \mathop{\mathbb{E}}\limits_{n\in[N]} T_i^{p_i(n)}|f|^2 = \mathop{\mathbb{E}}\limits_{n\in[N]} T_j^{p_j(n)}|f|^2, \end{align*} $$

which by the weak joint ergodicity of $p_1, \ldots , p_{\ell }$ converges to ${\mathbb {E}}(|f|^2|{\mathcal I}(T_i)) = {\mathbb {E}}(|f|^2|{\mathcal I}(T_j))$ in $L^2(\mu )$ . However, the weak joint ergodicity of $p_1, \ldots , p_{\ell }$ implies that

$$ \begin{align*} \lim_{N\to\infty}\mathop{\mathbb{E}}\limits_{n\in[N]}T_i^{p_i(n)}f \cdot T_j^{p_j(n)}\overline{f} = {\mathbb{E}}(f|{\mathcal I}(T_i))\cdot \overline{{\mathbb{E}}(f|{\mathcal I}(T_j))} \end{align*} $$

in $L^2(\mu )$ . Hence, ${\mathbb {E}}(|f|^2|{\mathcal I}(T_i)) = {\mathbb {E}}(f|{\mathcal I}(T_i))\cdot \overline {{\mathbb {E}}(f|{\mathcal I}(T_j))}$ . The properties of the conditional expectation and the Cauchy–Schwarz inequality imply that

$$ \begin{align*} \int \lvert {\mathbb{E}}(f|{\mathcal I}(T_i))\cdot \overline{{\mathbb{E}}(f|{\mathcal I}(T_j))}\rvert\, d\mu &\leq \Vert {\mathbb{E}}(f|{\mathcal I}(T_i))\Vert_{L^2(\mu)} \cdot \Vert {\mathbb{E}}(f|{\mathcal I}(T_j))\Vert_{L^2(\mu)}\\ &\leq \Vert f\Vert_{L^2(\mu)}^2 = \int |f|^2\, d\mu = \int {\mathbb{E}}(|f|^2|{\mathcal I}(T_i))\, d\mu. \end{align*} $$

The two inequalities above become equalities precisely when $f = {\mathbb {E}}(f|{\mathcal I}(T_i)) = {\mathbb {E}}(f|{\mathcal I}(T_j))$ holds $\mu $-a.e., that is, when $f$ is simultaneously invariant under $T_i$ and $T_j$, and so either this is the case, contradicting the assumptions on $f$, or ${\mathbb {E}}(|f|^2|{\mathcal I}(T_i)) \neq {\mathbb {E}}(f|{\mathcal I}(T_i))\cdot \overline {{\mathbb {E}}(f|{\mathcal I}(T_j))}$, contradicting the weak joint ergodicity of $p_1, \ldots , p_{\ell }$.
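For completeness, the equality case can be unpacked as follows; this is a sketch that uses only the fact that the conditional expectation onto an invariant $\sigma $-algebra is an orthogonal projection on $L^2(\mu )$. For $k\in \{i, j\}$, the Pythagorean identity gives

$$ \begin{align*} \Vert f\Vert_{L^2(\mu)}^2 = \Vert {\mathbb{E}}(f|{\mathcal I}(T_k))\Vert_{L^2(\mu)}^2 + \Vert f - {\mathbb{E}}(f|{\mathcal I}(T_k))\Vert_{L^2(\mu)}^2, \end{align*} $$

so that $\Vert {\mathbb {E}}(f|{\mathcal I}(T_k))\Vert _{L^2(\mu )}\leq \Vert f\Vert _{L^2(\mu )}$, with equality if and only if $f = {\mathbb {E}}(f|{\mathcal I}(T_k))$ holds $\mu $-a.e. If the chain of inequalities above consists of equalities, then

$$ \begin{align*} \Vert {\mathbb{E}}(f|{\mathcal I}(T_i))\Vert_{L^2(\mu)} \cdot \Vert {\mathbb{E}}(f|{\mathcal I}(T_j))\Vert_{L^2(\mu)} = \Vert f\Vert_{L^2(\mu)}^2, \end{align*} $$

which for $f\neq 0$ forces equality in both factors, and hence the invariance of $f$ under both $T_i$ and $T_j$.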

We now derive Corollary 1.3 from Theorem 1.2.

Proof of Corollary 1.3

By Lemma 8.2, the polynomials $p_1, \ldots , p_{\ell }$ are jointly ergodic for the system $(X, {\mathcal X}, \mu , T_1, \ldots , T_{\ell })$ if and only if they are weakly jointly ergodic for this system and the transformations $T_1, \ldots , T_{\ell }$ are ergodic. Theorem 1.2 in turn implies that this is equivalent to the polynomials having the good ergodicity property for the system, the transformations $T_1, \ldots , T_{\ell }$ being ergodic, and equation (6) holding for all eigenfunctions. Since the transformations are ergodic, all the eigenfunctions $\chi _j$ of $T_j$ satisfy $T_j \chi _j = \lambda _j \chi _j$ for a constant $\lambda _j$, and so the condition of equation (6) reduces to equation (7) upon taking $\lambda _j = e(\alpha _j)$ and realizing that $\int\! \chi _j\, d\mu = 0$ unless $\alpha _j = 0$. Lastly, the good ergodicity property and the ergodicity of the transformations $T_1, \ldots , T_{\ell }$ jointly imply the very good ergodicity property.

Finally, we prove Corollary 1.4.

Proof of Corollary 1.4

The forward direction follows from Donoso, Koutsogiannis and Sun [7, Proposition 5.3], and so it is enough to deduce the reverse direction. Our goal is to show that the conditions (i) and (ii) in the statement of Conjecture 1 imply the conditions (i) and (ii) in the statement of Corollary 1.3. The condition (ii) in Conjecture 1, that is, the ergodicity of $(T_1^{p_1(n)}, \ldots , T_{\ell }^{p_{\ell }(n)})_{n\in {\mathbb N}}$, implies the condition (ii) in Corollary 1.3 by taking eigenfunctions. By Lemma 8.1, it also implies the ergodicity of $T_1, \ldots , T_{\ell }$ because each sequence $(T_j^{p_j(n)})_{n\in {\mathbb N}}$ is ergodic.

To establish the very good ergodicity property of $p_1, \ldots , p_{\ell }$, suppose that $p_i/c_i = p_j/c_j$ for some relatively prime $c_i, c_j\in {\mathbb Z}$, and $f$ is a non-constant function invariant under $T_i^{c_i}T_j^{-c_j}$. Letting $q := p_i/c_i = p_j/c_j$ and noting that it has integer coefficients due to the coprimeness of $c_i, c_j$, we observe that

$$ \begin{align*} \lim_{N\to\infty}\mathop{\mathbb{E}}\limits_{n\in[N]}T_i^{p_i(n)}T_j^{-p_j(n)}f = \lim_{N\to\infty}\mathop{\mathbb{E}}\limits_{n\in[N]}(T_i^{c_i}T_j^{-c_j})^{q(n)} f = f\neq \int f\, d\mu, \end{align*} $$

contradicting the ergodicity of $(T_i^{p_i(n)} T_j^{-p_j(n)})_{n\in {\mathbb N}}$ .
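The two facts used in this argument can be made explicit; the following is an illustrative sketch, in which the Bézout coefficients $u, v$ and the sample polynomials are our own notation rather than part of the statement. Since $c_i, c_j$ are coprime, there exist $u, v\in {\mathbb Z}$ with $u c_i + v c_j = 1$, and so

$$ \begin{align*} q = (u c_i + v c_j)\, q = u\, p_i + v\, p_j \in {\mathbb Z}[n]. \end{align*} $$

Moreover, since $T_i$ and $T_j$ commute,

$$ \begin{align*} T_i^{p_i(n)}T_j^{-p_j(n)} = T_i^{c_i q(n)}T_j^{-c_j q(n)} = (T_i^{c_i}T_j^{-c_j})^{q(n)}, \end{align*} $$

so every function invariant under $T_i^{c_i}T_j^{-c_j}$ is fixed by each term of the average. For example, $p_i(n) = 2n^2$ and $p_j(n) = 3n^2$ give $c_i = 2$, $c_j = 3$ and $q(n) = n^2$.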

Acknowledgements

We thank the anonymous referee for many useful suggestions. The authors were supported by the Hellenic Foundation for Research and Innovation, Project No. 1684.

References

[1] Berend, D. and Bergelson, V. Jointly ergodic measure preserving transformations. Israel J. Math. 49(4) (1984), 307–314.

[2] Bergelson, V. Weakly mixing PET. Ergod. Th. & Dynam. Sys. 7(3) (1987), 337–349. doi:10.1017/S0143385700004090

[3] Bergelson, V. and Leibman, A. Polynomial extensions of van der Waerden’s and Szemerédi’s theorems. J. Amer. Math. Soc. 9 (1996), 725–753. doi:10.1090/S0894-0347-96-00194-4

[4] Bergelson, V. and Leibman, A. Cubic averages and large intersections. Recent trends in ergodic theory and dynamical systems. Contemp. Math. 631 (2015), 5–19. doi:10.1090/conm/631/12592

[5] Best, A. and Ferré Moragues, A. Polynomial ergodic averages for certain countable ring actions. Discrete Contin. Dyn. Syst. 42(7) (2022), 3379–3413. doi:10.3934/dcds.2022019

[6] Donoso, S., Ferré-Moragues, A., Koutsogiannis, A. and Sun, W. Decomposition of multicorrelation sequences and joint ergodicity. Preprint, 2021, arXiv:2106.01058.

[7] Donoso, S., Koutsogiannis, A. and Sun, W. Seminorms for multiple averages along polynomials and applications to joint ergodicity. J. Anal. Math. 146 (2022), 1–64. doi:10.1007/s11854-021-0186-z

[8] Frantzikinakis, N. A multidimensional Szemerédi theorem for Hardy sequences of polynomial growth. Trans. Amer. Math. Soc. 367 (2015), 5653–5692. doi:10.1090/S0002-9947-2014-06275-2

[9] Frantzikinakis, N. Joint ergodicity of sequences. Adv. Math., to appear. Preprint, 2022, arXiv:2102.09967. doi:10.1016/j.aim.2023.108918

[10] Frantzikinakis, N. and Host, B. Weighted multiple ergodic averages and correlation sequences. Ergod. Th. & Dynam. Sys. 38(1) (2018), 81–142. doi:10.1017/etds.2016.19

[11] Frantzikinakis, N. and Kuca, B. Joint ergodicity for commuting transformations and applications to polynomial sequences. Preprint, 2022, arXiv:2207.12288.

[12] Host, B. Ergodic seminorms for commuting transformations and applications. Studia Math. 195(1) (2009), 31–49. doi:10.4064/sm195-1-3

[13] Host, B. and Kra, B. Non-conventional ergodic averages and nilmanifolds. Ann. of Math. (2) 161 (2005), 397–488.

[14] Host, B. and Kra, B. Nilpotent Structures in Ergodic Theory (Mathematical Surveys and Monographs, 236). American Mathematical Society, Providence, RI, 2018. doi:10.1090/surv/236

[15] Peluse, S. On the polynomial Szemerédi theorem in finite fields. Duke Math. J. 168 (2019), 749–774. doi:10.1215/00127094-2018-0051

[16] Peluse, S. Bounds for sets with no polynomial progressions. Forum Math. Pi 8 (2020), e16. doi:10.1017/fmp.2020.11

[17] Peluse, S. and Prendiville, S. Quantitative bounds in the non-linear Roth theorem. Preprint, 2022, arXiv:1903.02592. doi:10.1093/imrn/rnaa261

[18] Tao, T. and Ziegler, T. Concatenation theorems for anti-Gowers-uniform functions and Host–Kra characteristic factors. Discrete Analysis 13 (2016), 60.

[19] Walsh, M. Norm convergence of nilpotent ergodic averages. Ann. of Math. (2) 175(3) (2012), 1667–1688. doi:10.4007/annals.2012.175.3.15

[20] Zorin-Kranich, P. Norm convergence of multiple ergodic averages on amenable groups. J. Anal. Math. 130 (2016), 219–241. doi:10.1007/s11854-016-0035-7