Quantitative bounds in a popular polynomial Szemerédi theorem

Xuancheng Shao; Mengdi Wang

doi:10.1017/prm.2025.10103

Part of: Sequences and sets

Published online by Cambridge University Press: 05 December 2025

Xuancheng Shao*: Affiliation:
Department of Mathematics, University of Kentucky, Lexington, KY, USA (xuancheng.shao@uky.edu)
Mengdi Wang: Affiliation:
École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland (mengdi.wang@epfl.ch)
*: *Corresponding author.

Abstract

We obtain polylogarithmic bounds in the polynomial Szemerédi theorem when the polynomials have distinct degrees and zero constant terms. Specifically, let $P_1, \dots, P_m \in \mathbb Z[y]$ be polynomials with distinct degrees, each having zero constant term. Then there exists a constant $c = c(P_1,\dots,P_m) \gt 0$ such that any subset $A \subset \{1,2,\dots,N\}$ of density at least $(\log N)^{-c}$ contains a nontrivial polynomial progression of the form $x, x+P_1(y), \dots, x+P_m(y)$. In addition, we prove an effective “popular” version, showing that every dense subset $A$ has some non-zero $y$ such that the number of polynomial progressions in $A$ with this difference $y$ is asymptotically at least as large as in a random set of the same density as $A$.

Keywords

polynomial Szemerédi theorem; popular differences

MSC classification

Primary: 11B30: Arithmetic combinatorics; higher degree uniformity

Information

Type: Research Article
Information: Proceedings of the Royal Society of Edinburgh Section A: Mathematics, First View, pp. 1-27

DOI: https://doi.org/10.1017/prm.2025.10103
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright: © The Author(s), 2025. Published by Cambridge University Press on behalf of The Royal Society of Edinburgh.

1. Introduction

The quest to understand the ubiquity of arithmetic structures within subsets of integers has driven some of the most profound advances in additive combinatorics. Szemerédi’s landmark theorem states that every subset of the integers with positive density contains non-trivial $k$-term arithmetic progressions ($k$-APs) for every natural number $k$. A pivotal contribution came from Gowers [Reference Gowers11], who introduced uniformity norms to furnish a quantitative bound for Szemerédi’s theorem, establishing that any subset of $\{1,2,\dots,N\}$ which does not contain non-trivial $k$-APs must have density $O((\log\log N)^{-c_k})$ for some constant $c_k \gt 0$. Over the years, the pursuit of quantitative bounds in Szemerédi’s theorem has remained extremely active, leading to many significant developments. For example, Kelley and Meka [Reference Kelley and Meka19] established a bound of the shape $O(\exp(-c(\log N)^{1/12}))$ for the case $k=3$, where the exponent $1/12$ was later improved to $1/9$ by Bloom and Sisask [Reference Bloom and Sisask3]. Green and Tao [Reference Green and Tao17] achieved a polylogarithmic bound for $k=4$. Leng, Sah, and Sawhney [Reference Leng, Sah and Sawhney20] improved Gowers’ bound to $O(\exp(-(\log \log N)^{c_k}))$ for general $k\geqslant 5$.

The scope of such results broadened significantly with the polynomial Szemerédi theorem of Bergelson and Leibman [Reference Bergelson and Leibman2], which established that every subset of the integers with positive density contains polynomial progressions of the form $x, x+P_1(y), \dots, x+P_m(y)$ with $y \neq 0$, where $P_1,\dots,P_m \in \mathbb{Z}[y]$ and each has zero constant term. Their proof was rooted in ergodic theory and was thus non-quantitative.

The density increment method has long served as the primary tool for deriving quantitative bounds for general additive patterns. However, applying this strategy to more complex polynomial configurations introduces additional difficulties; see [Reference Prendiville29, Section 2.2] for a detailed discussion. While a quantitative version of the general polynomial Szemerédi theorem remains an open challenge, progress has been made in certain special cases. We summarize some of these developments below and refer the reader to [Reference Peluse25] for a more in-depth overview of the history of quantitative results in the polynomial Szemerédi theorem, both in the integers and in the finite field setting.

• For Sárközy’s theorem on square differences (i.e. $x, x+y^2$), Green and Sawhney [Reference Green and Sawhney14] achieved quantitative bounds of the form $O(\exp(-c\sqrt{\log N}))$.
• When $P_1,\cdots,P_m$ are homogeneous and have the same degree, Prendiville [Reference Prendiville29] achieved bounds of the form $O((\log\log N)^{-c})$ (which can likely be improved to $O(\exp(-(\log \log N)^{c}))$ using the quasipolynomial bounds in the inverse theorem for the Gowers norms from [Reference Leng, Sah and Sawhney21], together with the density increment strategy of Heath-Brown and Szemerédi).
• For the nonlinear Roth configuration (i.e. $x, x+y, x+y^2$), Peluse and Prendiville [Reference Peluse and Prendiville26, Reference Peluse and Prendiville27] achieved quantitative bounds of the form $O((\log N)^{-c})$.
• Peluse [Reference Peluse25] extended these techniques from the nonlinear Roth configuration to more general polynomial progressions when $P_1,\dots, P_m$ have distinct degrees and each has zero constant term. She obtained bounds of the form $O((\log\log N)^{-c})$ for subsets of $\{1,2,\dots,N\}$ that do not contain such configurations.

The first main result of this paper is an improvement of the quantitative bound in Peluse’s result [Reference Peluse25, Theorem 1.1].

Theorem 1.1 (Density bound)

Let $P_1,\dots,P_m \in \mathbb{Z}[y]$ be polynomials with distinct degrees, each having zero constant term. If $A \subset \{1,2,\dots,N\}$ does not contain a polynomial progression of the form

\begin{equation*} x, x+P_1(y), \dots, x+P_m(y) \ \ \text{with }y \neq 0, \end{equation*}

then

\begin{equation*} |A| \ll \frac{N}{(\log N)^c}, \end{equation*}

for some constant $c \gt 0$ depending only on $P_1,\dots,P_m$.

In the “structure vs. randomness” framework underlying the density increment argument, if a set $A$ is not pseudorandom, then the number of additive configurations in this set $A$ may deviate significantly from the random bound. Bergelson, Host, and Kra [Reference Bergelson, Host and Kra1] posed the question of whether there always exists a nonzero difference $y$ for which the number of configurations in $A$ with difference parameter $y$ is at least as large as the random bound. For example, in the case when the configurations are $3$-APs, the question asks for $A \subset \{1,2,\cdots,N\}$ with $|A| = \delta N$, whether there exists $y \neq 0$ such that

\begin{equation*} \#\{x \in A: x, x+y, x+2y \in A\} \geqslant (\delta^3 - o(1))N. \end{equation*}

Such a $y$ is loosely referred to as a “popular difference”. This topic in the ergodic setting was initially explored by Bergelson, Host, and Kra [Reference Bergelson, Host and Kra1] and also studied in [Reference Donoso, Le, Moreira and Sun4, Reference Frantzikinakis9, Reference Frantzikinakis and Kra10]. We list some notable results below in the arithmetic setting, and refer the reader to [Reference Peluse, Prendiville and Shao28] for more context surrounding the popular difference results.

• The existence of popular differences is proved by Green [Reference Green13] for $3$-APs and by Green and Tao [Reference Green and Tao16] for $4$-APs. When $k\geqslant 5$, a counterexample was built by Ruzsa in the appendix to [Reference Bergelson, Host and Kra1].
• The quantitative aspects of popular common differences in 3-APs were studied in [Reference Fox and Pham5–Reference Fox, Pham and Zhao7].
• Linear configurations were classified in [Reference Sah, Sawhney and Zhao30] according to when popular differences are guaranteed to exist.
• Popular differences for corners $(x_1,x_2), (x_1+y,x_2), (x_1,x_2+y)$ are studied in [Reference Fox, Sah, Sawhney, Stoner and Zhao8, Reference Mandache23].
• The existence of popular differences for special polynomial progressions is proved by Lyall and Magyar [Reference Lyall and Magyar22] for the square differences $x, x+y^2$ and by Peluse, Prendiville, and the first author [Reference Peluse, Prendiville and Shao28] for the nonlinear Roth pattern $x, x+y, x+y^2$.

In this paper, we prove, with effective bounds, the existence of popular differences in general polynomial progressions when the polynomials have distinct degrees and each has zero constant term.

Theorem 1.2 (Popular difference)

Let $P_1,\dots,P_m \in \mathbb{Z}[y]$ be polynomials with distinct degrees, each having zero constant term. Let $A \subset \{1,2,\dots,N\}$ be a subset with $|A| \geqslant \delta N$ and let $\varepsilon \in (0,1/2)$. Then either $N \leqslant \exp(\exp(\varepsilon^{-O_{P_1,\dots,P_m}(1)}))$ or there exists a positive integer $y$ such that

\begin{equation*} \#\{x \in A: x+P_i(y) \in A\text{ for each }1 \leqslant i \leqslant m\} \geqslant (\delta^{m+1}-\varepsilon)N. \end{equation*}
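The quantity on the left-hand side of Theorem 1.2 is easy to compute by brute force for small sets. The following Python sketch is only an illustration and not part of the paper's argument: the function names `count_progressions` and `most_popular_difference`, the search range `max_y`, and the toy set (the even numbers up to 100 with the nonlinear Roth pattern $x, x+y, x+y^2$) are all assumptions made for the example. The theorem itself is asymptotic, so a single small instance carries no evidential weight.

```python
from typing import Callable, Iterable, Sequence, Tuple

Poly = Callable[[int], int]


def count_progressions(A: Iterable[int], y: int, polys: Sequence[Poly]) -> int:
    """Return #{x in A : x + P_i(y) in A for every i}."""
    members = set(A)
    return sum(1 for x in members if all(x + P(y) in members for P in polys))


def most_popular_difference(
    A: Iterable[int], polys: Sequence[Poly], max_y: int
) -> Tuple[int, int]:
    """Search y = 1..max_y and return (y, count) with the largest count.

    Ties are resolved in favour of the smallest y.
    """
    members = set(A)
    best_y, best_count = 0, -1
    for y in range(1, max_y + 1):
        c = count_progressions(members, y, polys)
        if c > best_count:
            best_y, best_count = y, c
    return best_y, best_count


# Toy instance (hypothetical): A = even numbers in {1,...,100}, density 1/2,
# pattern x, x + y, x + y^2 (distinct degrees, zero constant terms).
N = 100
A = [n for n in range(1, N + 1) if n % 2 == 0]
polys = [lambda y: y, lambda y: y * y]
delta = len(A) / N
best_y, best_count = most_popular_difference(A, polys, max_y=10)
random_bound = delta ** (len(polys) + 1) * N  # delta^{m+1} N with m = 2
```

In this toy instance the structured set has far more progressions with difference $y=2$ than the random benchmark $\delta^{3}N$, which is the kind of behaviour the popular difference phenomenon describes.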

The proofs of Theorems 1.1 and 1.2 are based on an inverse theorem. As the final part of the introduction, we present a simplified version of our more general result, Theorem 2.4. Suppose that $N, M \geqslant 1$, $P_1,\dots,P_m \in \mathbb{Z}[y]$ are polynomials and $f_0,\dots,f_m:\mathbb{Z}\rightarrow\mathbb{C}$ are functions supported on the interval $[N]$. We define the associated counting operator as follows:

(1.1)

\begin{equation} \Lambda_{P_1,\dots,P_m}^{N,M} (f_0,\dots, f_m) := \frac{1}{NM} \sum_{x \in \mathbb{Z}} \sum_{y \in [M]} f_0(x) f_1(x+P_1(y)) \cdots f_m(x+P_m(y)). \end{equation}
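For concreteness, the counting operator (1.1) can be evaluated directly for small parameters. The sketch below is a minimal illustration under stated assumptions: functions are passed as Python callables and are restricted to $[N]=\{1,\dots,N\}$ inside the routine, and the name `counting_operator` is hypothetical rather than notation from the paper.

```python
from typing import Callable, Sequence

Func = Callable[[int], complex]
Poly = Callable[[int], int]


def counting_operator(N: int, M: int, polys: Sequence[Poly], fs: Sequence[Func]) -> complex:
    """Evaluate (1/(N*M)) * sum_x sum_{y in [M]} f_0(x) * prod_i f_i(x + P_i(y)).

    Every f_i is treated as supported on [N] = {1, ..., N}: values outside
    that interval are replaced by 0 before multiplying.
    """
    if len(fs) != len(polys) + 1:
        raise ValueError("need one more function than polynomials")

    def restricted(f: Func, x: int) -> complex:
        return f(x) if 1 <= x <= N else 0

    total = 0
    for x in range(1, N + 1):  # f_0 vanishes outside [N]
        f0x = fs[0](x)
        if f0x == 0:
            continue
        for y in range(1, M + 1):
            term = f0x
            for P, f in zip(polys, fs[1:]):
                term *= restricted(f, x + P(y))
                if term == 0:
                    break
            total += term
    return total / (N * M)


# Example: all f_i equal to the indicator of [N], pattern x, x + y, x + y^2.
one = lambda x: 1
value = counting_operator(10, 3, [lambda y: y, lambda y: y * y], [one, one, one])
```

With indicator functions the operator simply counts the pairs $(x,y)$ for which the whole progression stays inside $[N]$, normalized by $NM$.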

Theorem 1.3 (Inverse theorem)

Let $P_1,\dots,P_m \in \mathbb{Z}[y]$ be polynomials with distinct degrees, and let their coefficients be bounded by $C$. Let $d = \max_{1 \leqslant i \leqslant m}\deg P_i$. Let $N, M$ be positive integers with $M \asymp_C N^{1/d}$, and let $\delta \in (0,1/2)$. Suppose that $f_0,\dots,f_m:\mathbb{Z}\rightarrow \mathbb{C}$ are $1$-bounded functions supported on $[N]$. If

\begin{equation*} \bigl| \Lambda_{P_1,\dots,P_m}^{N,M}(f_0,\dots,f_m) \bigr| \geqslant \delta, \end{equation*}

then either $M \ll_{C,d} \delta^{-O_d(1)}$ or there exist a positive integer $q\ll_{C,d}\delta^{-O_d(1)}$ and a $1$-bounded function $\phi_i:\mathbb{Z}\rightarrow\mathbb{C}$ for each $1 \leqslant i \leqslant m$ which is $O_{C,d}(\delta^{-O_d(1)}M^{-\deg P_i})$-Lipschitz along $q\cdot\mathbb{Z}$, such that

\begin{equation*} \bigl| \sum_{x \in \mathbb{Z}}f_i(x)\phi_i(x) \bigr|\gg_{C,d} \delta^{O_d(1)}N. \end{equation*}

The definition of a $C$-Lipschitz function is given in Definition 2.1. For intuition, we encourage the reader to think of $\phi_i$ as a phase function of the form $x \mapsto e(\alpha_ix)$ with $\left\| q\alpha_i\right\|\ll_{C,d}\delta^{-O_d(1)}M^{-\deg P_i}$. Alternatively, $\phi_i$ is almost constant on progressions of length $\Theta(\delta^{O_d(1)}M^{\deg P_i})$ and common difference $q$. The fact that $q$ is independent of $i$ is unimportant: if $q$ were allowed to depend on $i$, one could simply take $q$ to be the least common multiple of all the $q_i$’s, while maintaining the upper bound for $q$.

Compared to Peluse’s inverse theorem [Reference Peluse25, Theorem 3.3] which concludes correlation only for the $f_i$ corresponding to the lowest degree polynomial, our inverse theorem gives such conclusions for every $f_i$. This generalization, which is fundamental in getting the polylogarithmic bound in Theorem 1.1 and also in obtaining the popular difference result, is largely based on arguments of Peluse [Reference Peluse25] incorporated with some technical generalizations. Instead of proving Theorem 1.3 directly, we will in fact prove a slightly more general version of it, Theorem 2.4. See the discussions following the statement of Theorem 2.4 for a brief overview of the proof ideas.

We would like to make some remarks on general polynomial sequences. When the leading coefficients of the highest-degree terms among $P_1,\dots, P_m$ are distinct, the sequence can be controlled by a Gowers $U^s$-norm for some $s \gt 2$. In other cases, it can only be controlled by an average of box norms. Ideally, in such situations, some suitable local inverse theorems can be applied. As suggested by Leng–Sah–Sawhney [Reference Leng, Sah and Sawhney20], one can obtain quantitative density bounds for linear forms, since the inverse theorem implies that $f$ correlates with a periodic function $\phi$. However, the period of $\phi$ is typically of order $N^{1-\eta}$ for some small $\eta \gt 0$, which means that the density increment achieved at each step is not sufficiently effective. This is also the reason why the density increment argument fails for general polynomial sequences.

The rest of the paper is organized as follows. In Section 2 we prove our main inverse theorem, Theorem 2.4 (of which Theorem 1.3 is a special case). In Section 3 we develop the theory of local factors, which is necessary for our weak regularity lemmas. In Section 4 we prove Theorem 1.1 by establishing a regularity lemma and using the density increment argument, generalizing Peluse and Prendiville’s work [Reference Peluse and Prendiville26] on the nonlinear Roth pattern. In Section 5 we prove Theorem 1.2 by establishing a second weak regularity lemma.

2. Inverse theorems

The goal of this section is to prove the inverse theorem which is the key to both Theorems 1.1 and 1.2. It asserts that if the counting expression (1.1) is large, then the functions exhibit correlation with certain Lipschitz functions. We adopt the same definition as [Reference Peluse and Prendiville26, Definition 6.1].

Definition 2.1. ($C$-Lipschitz)

A function $\phi:\mathbb{Z}\to\mathbb{C}$ is said to be $C$-Lipschitz along $q\cdot\mathbb{Z}$ if for all $x,y\in \mathbb{Z}$ we have

\begin{equation*} |\phi(x+qy)-\phi(x)|\leqslant C|y|. \end{equation*}

Note that if $\alpha\in\mathbb{T}$ is a frequency such that $\left\| q\alpha\right\|$ is sufficiently small for some positive integer $q\geqslant 1$, then the linear phase function $e(\cdot\alpha)$ is $(2\pi \left\| q\alpha\right\|)$-Lipschitz along $q\cdot\mathbb{Z}$. Thus Lipschitz functions serve as an analogue to linear phase functions. The following lemma provides another class of Lipschitz functions.

Lemma 2.2. Let $q,H$ be positive integers and let $f:\mathbb{Z}\to\mathbb{C}$ be a 1-bounded function. Then the function

\begin{equation*} \phi(x) = \mathbb{E}_{h_1,h_2\in[H]} f(x+q(h_1-h_2)), \end{equation*}

is $O(H^{-1})$-Lipschitz along $q\cdot\mathbb{Z}$.

Proof. See [Reference Peluse and Prendiville26, Lemma 6.2].
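Both classes of Lipschitz functions mentioned above can be sanity-checked numerically. The sketch below is an illustration only: the helper names, the sample ranges, and the test data are assumptions, and the explicit constant $2/H$ used for the averaged function is one admissible value of the implied constant in $O(H^{-1})$ (it follows from shifting $h_1$ by one and applying the triangle inequality), not a constant stated in the paper.

```python
import cmath
import math
import random


def e(t: float) -> complex:
    """The standard additive character e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * math.pi * t)


def dist_to_nearest_integer(t: float) -> float:
    """The circle norm ||t||."""
    return abs(t - round(t))


def max_lipschitz_ratio(phi, q, xs, ys) -> float:
    """Largest observed |phi(x + q*y) - phi(x)| / |y| over the sample."""
    return max(abs(phi(x + q * y) - phi(x)) / abs(y) for x in xs for y in ys if y != 0)


def averaged(f, q: int, H: int):
    """phi(x) = E_{h1,h2 in [H]} f(x + q*(h1 - h2)), as in Lemma 2.2."""
    def phi(x: int) -> complex:
        s = sum(f(x + q * (h1 - h2)) for h1 in range(1, H + 1) for h2 in range(1, H + 1))
        return s / (H * H)
    return phi


# (a) Linear phase with q*alpha close to an integer (hypothetical values).
q, alpha = 3, 1 / 3 + 1e-4
bound_phase = 2 * math.pi * dist_to_nearest_integer(q * alpha)
ratio_phase = max_lipschitz_ratio(lambda x: e(alpha * x), q, range(-50, 50), range(-20, 21))

# (b) Averaged function built from an arbitrary 1-bounded f (random signs here).
rng = random.Random(0)
table = {n: rng.choice((-1, 1)) for n in range(-100, 101)}
f = lambda n: table.get(n, 0)
H = 10
ratio_avg = max_lipschitz_ratio(averaged(f, q, H), q, range(-30, 30), range(-5, 6))
```

In both cases the observed ratio stays below the corresponding bound, $2\pi\|q\alpha\|$ for the phase and $2/H$ for the averaged function.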

Before stating our general inverse theorems, we recall the notion of $(C,q)$-coefficients from [Reference Peluse25, Definition 3.1], which serves as an important index for tracking iteration steps.

Definition 2.3. We say a polynomial $P(y)=a_dy^d+\cdots+a_1y \in \mathbb{Z}[y]$ has $(C,q)$-coefficients if $|a_i|\leqslant C|a_d|$ for all $1\leqslant i \leqslant d-1$ and $a_d=a_d'q^{d-1}$ for some $a_d' \in \mathbb{Z}$ with $0 \lt |a_d'|\leqslant C$.
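Definition 2.3 translates directly into a finite check on the coefficient list. The following Python sketch is a minimal illustration; the function name `has_Cq_coefficients` and the convention of passing the coefficients as the list $[a_1,\dots,a_d]$ are assumptions made for this example.

```python
from typing import Sequence


def has_Cq_coefficients(coeffs: Sequence[int], C: float, q: int) -> bool:
    """Check Definition 2.3 for P(y) = a_d y^d + ... + a_1 y.

    `coeffs` lists [a_1, ..., a_d]; the constant term is zero by convention.
    """
    d = len(coeffs)
    if d == 0 or coeffs[-1] == 0:
        return False
    a_d = coeffs[-1]
    # Lower-order coefficients must satisfy |a_i| <= C * |a_d|.
    if any(abs(a) > C * abs(a_d) for a in coeffs[:-1]):
        return False
    # The leading coefficient must factor as a_d = a_d' * q^(d-1) with 0 < |a_d'| <= C.
    scale = q ** (d - 1)
    if a_d % scale != 0:
        return False
    return 0 < abs(a_d // scale) <= C


# Examples: y^2 has (1,1)-coefficients; 4y^2 + y has (2,2)- but not (1,2)-coefficients.
checks = [
    has_Cq_coefficients([0, 1], C=1, q=1),
    has_Cq_coefficients([1, 4], C=2, q=2),
    has_Cq_coefficients([1, 4], C=1, q=2),
]
```

The second and third examples differ only in $C$: for $4y^2+y$ and $q=2$ one has $a_d' = 2$, which is admissible for $C=2$ but not for $C=1$.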

We are now ready to state the inverse theorem, of which Theorem 1.3 from the introduction is a special case corresponding to the situation where the polynomials have $(C,1)$-coefficients and $M$ is relatively large. Recall the counting operator defined in (1.1).

Theorem 2.4 (Inverse theorem)

Let $P_1,\dots,P_m \in \mathbb{Z}[y]$ be polynomials with $(C,q)$-coefficients, and assume they have distinct degrees. Let $d = \max_{1 \leqslant i \leqslant m}\deg P_i$. Let $N, M$ be positive integers with $M \leqslant (N/q^{d-1})^{1/d}$, and let $\delta \in (0,1/2)$. Let $f_0,\dots,f_m:\mathbb{Z}\rightarrow \mathbb{C}$ be $1$-bounded functions supported on $[N]$. Suppose that

\begin{equation*} \bigl| \Lambda_{P_1,\dots,P_m}^{N,M}(f_0,\dots,f_m) \bigr| \geqslant \delta. \end{equation*}

Then either $M \ll_{C,d} (q/\delta)^{O_d(1)}$, or there exist positive integers $q'\ll_{C,d}\delta^{-O_d(1)}$ and $b = O_d(1)$, as well as a $1$-bounded function $\phi_i:\mathbb{Z}\rightarrow\mathbb{C}$ for each $1 \leqslant i \leqslant m$ which is $O_{C,d}((q/\delta)^{O_d(1)}M^{-\deg P_i})$-Lipschitz along $q'q^b\cdot\mathbb{Z}$, such that

\begin{equation*} \bigl| \sum_{x \in \mathbb{Z}}f_i(x)\phi_i(x) \bigr|\gg_{C,d} \delta^{O_d(1)}N. \end{equation*}

Assuming $\deg P_1=\min_{1\leqslant i\leqslant m}\deg P_i$ without loss of generality, we note that Peluse’s inverse theorem [Reference Peluse25, Theorem 3.3] implies the correlation result only for the function $f_1$ in Theorem 2.4, under the assumption that $M = (N/q^{d-1})^{1/d}$ (or, slightly more generally, $M \asymp_C (N/q^{d-1})^{1/d}$). The new ingredient in our proof of Theorem 2.4 involves a crucial inductive step which passes from the correlation conditions for $f_1,\cdots,f_{j-1}$ to the correlation condition for $f_j$. Loosely speaking, if $f_1,\dots,f_{j-1}$ correlate with Lipschitz functions $\phi_1,\dots,\phi_{j-1}$, respectively, then we will deduce that

\begin{equation*} \bigl| \Lambda_{P_1,\dots,P_m}^{N,M}(f_0,\phi_1,\dots,\phi_{j-1},f_j,\dots,f_m) \bigr| \gg \delta^{O(1)}. \end{equation*}

Assuming, for simplicity, that $\phi_1,\dots,\phi_{j-1}$ are constant functions, this inequality implies that there is a 1-bounded function $g_0$ such that

\begin{equation*} \bigl| \Lambda_{P_j,\dots,P_m}^{N,M}(g_0,f_j,\dots,f_m) \bigr| \gg \delta^{O(1)}, \end{equation*}

from which we will then deduce that $f_j$ also correlates with a Lipschitz function. This idea was discussed in [Reference Peluse and Prendiville26, Section 7], and our implementation of it involves technical generalizations of Peluse’s arguments [Reference Peluse25].

The arguments outlined above will be carried out in Sections 2.1 and 2.2, proving Theorem 2.4 when $M \asymp_C (N/q^{d-1})^{1/d}$. In Section 2.3, we will generalize to all $M \leqslant (N/q^{d-1})^{1/d}$ by working in subintervals of $[N]$ of appropriate lengths.

2.1. Inductive step

In this subsection, we prove Proposition 2.5 as a necessary step allowing us to pass from the correlation conditions for $f_1,\dots,f_{j-1}$ to the correlation conclusion for $f_j$.

Proposition 2.5. (Partial correlation)

Let $N,M \gt 0$ be sufficiently large numbers and $q\in\mathbb{N}$. Let $P_1,\dots,P_m\in\mathbb{Z}[y]$ be polynomials with $(C,q)$-coefficients such that $\deg P_1 \lt \cdots \lt \deg P_m$. Define $d=\deg P_m$ and assume $1/C \leqslant q^{d-1}M^d/N \leqslant C$. Let $\delta \in (0,1/2)$. We have either $N\ll_{C,d} (q/\delta)^{O_{d}(1)}$ or the following conclusions hold.

Suppose that $1\leqslant j\leqslant m$, and for each $1\leqslant i \lt j$, let $\phi_i:\mathbb{Z}\to\mathbb{C}$ be a 1-bounded $O_{C,d}((q/\delta)^{O_{d}(1)} M^{-\deg P_i})$-Lipschitz function along $q_iq^{b_i}\cdot\mathbb{Z}$ for some integers $q_i\ll_{C,d}\delta^{-O_{d}(1)}$ and $b_i\ll_{d} 1$. Let $f_0,f_j,\cdots,f_m:\mathbb{Z}\to\mathbb{C}$ be arbitrary 1-bounded functions supported on $[N]$. If

(2.1)

\begin{equation} \bigl| \Lambda_{P_1,\dots,P_m}^{N,M}(f_0,\phi_1,\dots,\phi_{j-1},f_j,\dots,f_m) \bigr| \geqslant \delta, \end{equation}

then there exist positive integers $q_j \ll_{C,d} \delta^{-O_{d}(1)}$, $b_j \ll_{d} 1$ and a 1-bounded function $\phi_j:\mathbb{Z}\to\mathbb{C}$ which is $O_{C,d}((q/\delta)^{O_{d}(1)} M^{-\deg P_j})$-Lipschitz along $q_jq^{b_j}\cdot\mathbb{Z}$, such that

\begin{equation*} \bigl| \sum_x f_j(x) \phi_j(x) \bigr|\gg_{C,d} \delta^{O_{d}(1)} N. \end{equation*}

This is an extension of [Reference Peluse25, Theorem 3.3], which essentially corresponds to the $j=1$ case. The condition $q^{d-1}M^d/N = 1$ is assumed in [Reference Peluse25, Theorem 3.3], but it can easily be relaxed to $q^{d-1}M^d/N \asymp 1$ by following the proof.

The proof of Proposition 2.5 adapts the proof of [Reference Peluse25, Theorem 3.3] in [Reference Peluse25, Section 9]. We outline the arguments before presenting the details. For $N, M \geqslant 1$, polynomials $P_1,\cdots,P_m \in \mathbb{Z}[y]$, functions $f_0,\cdots,f_{\ell}:\mathbb{Z}\rightarrow\mathbb{C}$ supported on $[N]$ and characters $\psi_{\ell+1},\cdots,\psi_m:\mathbb{Z}\rightarrow\mathbb{C}$ of the form $\psi_i(x) = e(\alpha_ix)$ with $\alpha_i\in\mathbb{R}$, we define the counting operator

\begin{equation*} \begin{split} & \Lambda_{P_1,\cdots,P_m}^{N,M}(f_0,\cdots,f_{\ell};\psi_{\ell+1},\cdots,\psi_m) \\ & := \frac{1}{NM} \sum_{x \in \mathbb{Z}} \sum_{y \in [M]} f_0(x) f_1(x+P_1(y)) \cdots f_{\ell}(x+P_{\ell}(y)) \psi_{\ell+1}(P_{\ell+1}(y)) \cdots \psi_m(P_m(y)). \end{split} \end{equation*}

When $\ell=m$ this notation becomes (1.1). The first step in the proof of Proposition 2.5 involves repeated applications of the following lemma which is [Reference Peluse25, Lemma 3.11].

Lemma 2.6. Let $N,M \gt 0$ be sufficiently large numbers, $q\in\mathbb{N}$, and $2 \leqslant \ell \leqslant m$. Let $P_1,\dots,P_m\in\mathbb{Z}[y]$ be polynomials such that $P_1,\cdots,P_{\ell}$ have $(C,q)$-coefficients and $\deg P_1 \lt \cdots \lt \deg P_m$. Let $d = \deg P_{\ell}$ and let $c_{\ell}$ be the leading coefficient of $P_{\ell}$. Assume $1/C \leqslant q^{d-1}M^d/N \leqslant C$. Let $\delta \in (0,1/2)$. We have either $N\ll_{C,d} (q/\delta)^{O_{d}(1)}$ or the following conclusions hold.

Let $f_0,\cdots,f_{\ell}: \mathbb{Z}\rightarrow \mathbb{C}$ be $1$-bounded functions supported on $[N]$ and let $\psi_{\ell+1},\cdots,\psi_m$ be characters. If

\begin{equation*} \bigl| \Lambda_{P_1,\cdots,P_m}^{N,M}(f_0,\cdots,f_{\ell};\psi_{\ell+1},\cdots,\psi_m) \bigr| \geqslant \delta, \end{equation*}

then we have

\begin{equation*} \mathbb{E}_{\substack{u, h = 0,\cdots,|c'|-1 \\ 0 \leqslant w \lt N/(c'C'N')}} \bigl| \Lambda_{P_1^h,\cdots,P_m^h}^{C'N',M'}(f_0^{u,h,w},\cdots,f_{\ell-1}^{u,h,w}; \psi_{\ell,u}, \psi_{\ell+1},\cdots,\psi_m) \bigr| \gg_{C,d} \delta^{O_d(1)}, \end{equation*}

for some characters $\psi_{\ell,u}$, where $C' \asymp_d C$, $c' := d!c_{\ell}$, $M' := M/|c'|$, $N' := M'^{\deg P_{\ell-1}} (q|c'|)^{\deg P_{\ell-1}-1}$,

\begin{equation*} P_i^h(z) := \begin{cases} \frac{1}{c'}(P_i(c'z+h) - P_i(h)) & 1 \leqslant i \leqslant \ell-1, \\ P_i(c'z+h) - P_i(h) & \ell \leqslant i \leqslant m, \end{cases}, \end{equation*}

and

\begin{equation*} f_i^{u,h,w}(x) := \begin{cases} (f_0\psi_{\ell,u})\Big(c'x + c'C'N'w - P_{\ell}(h) - u\Big) \cdot 1_{[C'N']}(x) & i=0, \\ f_i\Big(c'x + c'C'N'w + P_i(h) - P_{\ell}(h)-u\Big) \cdot 1_{[C'N']}(x) & 1 \leqslant i \leqslant \ell-1. \end{cases} \end{equation*}

This allows us to essentially replace $f_{j+1},\cdots,f_m$ in (2.1) by characters. By applying Weyl’s exponential sum estimates stated below (see [Reference Tao32, Lemma 1.1.16]), one can deduce that these characters must be in major arcs.

Lemma 2.7. Let $N\geqslant 1$ and $0 \lt \delta \lt 1$. If $\alpha_1,\dots,\alpha_d\in\mathbb{R}$ satisfy

\begin{equation*} \Bigl| \sum_{n\leqslant N} e\bigl( \alpha_d n^d +\cdots + \alpha_1n \bigr) \Bigr| \geqslant\delta N, \end{equation*}

then there exists a positive integer $1\leqslant q\ll \delta^{-O_d(1)}$ such that $\left\| q\alpha_i\right\|\ll\delta^{-O_d(1)}N^{-i}$ for each $1\leqslant i\leqslant d$.
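The dichotomy behind Lemma 2.7 can be observed numerically. The sketch below is purely illustrative: the function name `normalized_weyl_sum`, the choice $N = 20000$, and the two test frequencies ($\alpha_2 = 1/3$ as a major-arc example and $\alpha_2 = \sqrt{2}$ as a frequency far from rationals with small denominator) are assumptions, and a finite computation of this kind says nothing about the implied constants in the lemma.

```python
import cmath
import math
from typing import Sequence


def normalized_weyl_sum(alphas: Sequence[float], N: int) -> float:
    """Return |sum_{n <= N} e(alpha_d n^d + ... + alpha_1 n)| / N.

    `alphas` lists [alpha_1, ..., alpha_d].
    """
    total = 0j
    for n in range(1, N + 1):
        phase = sum(a * n ** (i + 1) for i, a in enumerate(alphas))
        # Reduce mod 1 before exponentiating to limit floating-point error.
        total += cmath.exp(2j * math.pi * (phase % 1.0))
    return abs(total) / N


N = 20000
major = normalized_weyl_sum([0.0, 1 / 3], N)           # quadratic phase n^2 / 3
minor = normalized_weyl_sum([0.0, math.sqrt(2)], N)    # quadratic phase sqrt(2) n^2
```

For $\alpha_2 = 1/3$ the sum is essentially a repeated Gauss sum of modulus $\sqrt{3}$ per period of length $3$, so the normalized value is close to $1/\sqrt{3}$; for $\alpha_2=\sqrt{2}$ the normalized sum is much smaller, in line with the contrapositive of the lemma.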

At this stage, we essentially have

\begin{equation*} \bigl| \Lambda_{P_1,\dots,P_m}^{N,M}(f_0,\phi_1,\dots,\phi_{j-1},f_j,\psi_{j+1},\cdots,\psi_m) \bigr| \gg \delta^{O(1)}, \end{equation*}

where $\psi_{j+1},\cdots,\psi_m$ are major-arc linear phase functions. By the Lipschitz properties of $\phi_1,\cdots,\phi_{j-1}$, we may split the range of $y \in [M]$ into arithmetic progressions of length $|P| \approx (\delta/q)^{O(1)}M$ such that if $y \in P$ then

\begin{equation*} \phi_i(x) \approx \phi_i(x + P_i(y)) \text{ for }1 \leqslant i \leqslant j-1, \ \ \psi_i(x) \approx \psi_i(x+P_i(y)) \text{ for } j+1 \leqslant i \leqslant m. \end{equation*}

This leads to

\begin{equation*} \frac{1}{N}\sum_{x \in \mathbb{Z}} \sum_{y \in P} f_0(x) f_j(x + P_j(y)) \gg \delta^{O(1)}|P|, \end{equation*}

(after modifying $f_0$ appropriately). The desired correlation of $f_j$ with a Lipschitz function follows by applying Cauchy–Schwarz and van der Corput’s inequality as in [Reference Peluse25, Lemma 4.2].

Proof of Proposition 2.5

Assume that the inequality (2.1) holds. Following the proof of [Reference Peluse25, Theorem 3.3], applying Lemma 2.6 a total of $m-j$ times yields

(2.2)

\begin{equation}\begin{aligned} & \mathbb{E}_{\substack{u_i,h_i=0,\dots,|c_i|-1 \\ 0\leqslant w_i \lt (C_{i+1}N_{i+1}/|c_i|)/C_iN_i \\ j+1\leqslant i\leqslant m}} \\ &\Bigl| \Lambda_{P_1^{\underline{h}},\dots, P_m^{\underline{h}} }^{C_{j+1}N_{j+1}, M_{j+1} } (f_0^{\underline u, \underline h, \underline w}, \phi_1^{\underline u, \underline h, \underline w} ,\dots, \phi_{j-1}^{\underline u, \underline h, \underline w}, f_j^{\underline u, \underline h, \underline w};\psi_{j+1}^{\underline u, \underline h, \underline w},\dots,\psi_m^{\underline u, \underline h, \underline w}) \Bigr| \gg_{C,d} \delta^{O_{d} (1)}. \end{aligned}\end{equation}

The parameters involved in the inequality satisfy the following conditions: $C_{m+1}=1$, $N_{m+1}=N$, $b_i\ll_{d}1$, $c_i\asymp_{C, d} q^{b_i}$, $M_i=M/\prod_{i\leqslant k\leqslant m}|c_k|$, $C_i\asymp_{C,d} 1$ and $N_i = M_i^{\deg P_{i-1}}(q|c_i\cdots c_m|)^{\deg P_{i-1}-1}$ for each $i=j+1,\dots,m$. Besides, $f_0^{\underline u, \underline h, \underline w}$ is a 1-bounded function,

(2.3)

\begin{equation} \phi_i^{\underline u, \underline h, \underline w} (x) =\phi_i(c_{j+1}\cdots c_m x +Q_i(\underline u, \underline h, \underline w)), \end{equation}

for $1 \leqslant i \leqslant j-1$, where $Q_i(\underline u, \underline h, \underline w)$ is a function in terms of $\underline u, \underline h, \underline w$ and $P_i^{\underline h}$, and $f_j^{\underline u, \underline h, \underline w}(x)$ equals $1_{[C_{j+1}N_{j+1}]}(x)$ times

\begin{align*} f_j &\Bigl( c_{j+1}\cdots c_m x + \sum_{j \lt i\leqslant m} (c_{i+1}\cdots c_m)\left[w_ic_iC_i N_i - u_i + \left[P_j^{h_m,\dots,h_{i+1}} (h_i)\right.\right. \\ &\quad -\left.\left. P_i^{h_m,\dots,h_{i+1}} (h_i)\right]\right] \Bigr). \end{align*}

Next, we will remove the functions $\phi_i^{\underline u, \underline h, \underline w}$ and $\psi_i^{\underline u, \underline h, \underline w}$ from (2.2) step by step. First, for each character $\psi_i^{\underline u, \underline h, \underline w}$ ($j+1 \leqslant i \leqslant m$), we can express it as $\psi_i^{\underline u, \underline h, \underline w} (x) =e (\beta_i^{\underline u, \underline h, \underline w}x)$ for some frequency $\beta_i^{\underline u, \underline h, \underline w}\in\mathbb{R}$. By applying the Cauchy–Schwarz inequality $O_{\deg P_1,\dots,\deg P_j} (1)$ times and then invoking Lemma 2.7, we may find positive integers $t'\ll_{C,d}\delta^{-O_{d}(1)}$ and $b_i\ll_{d}1$ such that

\begin{equation*} \left\| t' q^{b_i}\beta_i^{\underline u, \underline h, \underline w} \right\| \ll_{C,d} \delta^{-O_{d}(1)} /M_{j+1}^{\deg P_i}, \end{equation*}

for each $j+1\leqslant i\leqslant m$. The range of summation $y \in [M_{j+1}]$ in (2.2) can be split into arithmetic progressions, each of which has length $M_{j+1}^* =\Theta_{C,d} ((\delta/q)^{O_{d}(1)} M_{j+1})$ and common difference $t'q^s$ for some $s\ll_{d}1$, such that $\psi_i^{\underline u, \underline h, \underline w}$ is nearly constant within each arithmetic progression. Moreover, the number of such arithmetic progressions is $O(t'q^sM_{j+1}/M_{j+1}^*)$. Thus, after a change of variables, we deduce from inequality (2.2) that

(2.4)

\begin{equation}\begin{aligned} &\delta^{O_{d} (1)} \ll_{C,d} \mathbb{E}_{\substack{u_i,h_i=0,\dots,|c_i|-1\\ 0\leqslant w_i \lt (C_{i+1}N_{i+1}/|c_i|)/C_iN_i \\ j+1\leqslant i\leqslant m\\ k_{\underline u, \underline h, \underline w} \in [M_{j+1}/M_{j+1}^*]\\ k'_{\underline u, \underline h, \underline w} \in [t'q^s]}} \Bigl| (C_{j+1}N_{j+1})^{-1}\sum_x \mathbb{E}_{y\in[M_{j+1}^*]} f_0^{\underline u, \underline h, \underline w} (x)\\ &\phi_1^{\underline u, \underline h, \underline w} (x+P_1^{\underline h}(t'q^s (y-M_{j+1}^* k'_{\underline u, \underline h, \underline w}) - k_{\underline u, \underline h, \underline w})) \cdots f_j^{\underline u, \underline h, \underline w}\\ &\quad (x + P_j^{\underline h} (t'q^s (y-M_{j+1}^* k'_{\underline u, \underline h, \underline w}) - k_{\underline u, \underline h, \underline w})) \Bigr|. \end{aligned}\end{equation}

Next, from the Lipschitz property of $\phi_i$ and (2.3), one can deduce that, for each $1 \leqslant i \leqslant j-1$, the function $\phi_i^{\underline u, \underline h, \underline w}$ is $O_{C,d}((q/\delta)^{O_{d}(1)} M^{-\deg P_i})$-Lipschitz along $q_iq^{b_i}\cdot\mathbb{Z}$, noting that $|c_{j+1}\cdots c_m|\ll_{C,d} q^{O_d(1)}$. By increasing $t'$ and $s$ if necessary, we may ensure that $\mathrm{lcm}[q_1,\cdots,q_{j-1}] \mid t'$ and $s \geqslant \max_{1 \leqslant i \leqslant j-1}b_i$, while maintaining the bounds $t' \ll_{C,d} \delta^{-O_d(1)}$ and $s\ll_d1$. Thus $q_iq^{b_i} \mid t'q^s$ for each $1 \leqslant i \leqslant j-1$. It follows that the dependence on $y$ in each of the terms involving $\phi_1^{\underline u, \underline h, \underline w}, \cdots, \phi_{j-1}^{\underline u, \underline h, \underline w}$ can be dropped at a negligible error. More specifically, writing $Q(y): = P_i^{\underline h}(t'q^s(y-M_{j+1}^*k_{\underline u, \underline h, \underline w}')-k_{\underline u, \underline h, \underline w})$, we have for $y \in [M_{j+1}^*]$,

\begin{equation*} \bigl| \phi_i^{\underline u, \underline h, \underline w} (x+Q(y)) - \phi_i^{\underline u, \underline h, \underline w} (x+Q(0)) \bigr| \ll_{C,d} (q/\delta)^{O_{d} (1)} M^{-\deg P_i} |Q(y) - Q(0)|, \end{equation*}

since $q_iq^{b_i} \mid Q(y)-Q(0)$. Since $P_i^{\underline h}$ has degree $\deg P_i$ and all of its coefficients are $\ll_{C,d} q^{O_d(1)}$, it follows that

\begin{equation*} |Q(y) - Q(0)| \ll_{C,d} (q/\delta)^{O_d(1)} M^{\deg P_i-1} \cdot M_{j+1}^*. \end{equation*}

Hence, by decreasing $M_{j+1}^*$ if necessary (while still maintaining the bound $M_{j+1}^* \gg_{C,d} (\delta/q)^{O_d(1)}M_{j+1}$), we may ensure that the function $\phi_i^{\underline u, \underline h, \underline w} (x+Q(y))$ can be replaced by $\phi_i^{\underline u, \underline h, \underline w} (x+Q(0))$ in (2.4). Therefore, we conclude that

\begin{multline*} \delta^{O_{d} (1)} \ll_{C,d} \mathbb{E}_{\substack{u_i,h_i=0,\dots,|c_i|-1\\ 0\leqslant w_i \lt (C_{i+1}N_{i+1}/|c_i|)/C_iN_i \\ j+1\leqslant i\leqslant m\\ k_{\underline u, \underline h, \underline w} \in [M_{j+1}/M^*_{j+1}]\\ k'_{\underline u, \underline h, \underline w} \in [t'q^s]}} (C_{j+1}N_{j+1})^{-1}\\ \Bigl| \sum_x \mathbb{E}_{y\in[M^*_{j+1}]} g_0^{\underline u, \underline h, \underline w} (x) f_j^{\underline u, \underline h, \underline w} (x + P_j^{\underline h} (t'q^s (y-M^*_{j+1} k'_{\underline u, \underline h, \underline w}) - k_{\underline u, \underline h, \underline w})) \Bigr|, \end{multline*}

where $g_0^{\underline u, \underline h, \underline w} (x)$ is $1$-bounded. This inequality has the same shape as the fourth-from-last inequality on [Reference Peluse25, Page 52] with $f_0,f_1$ there replaced by $g_0, f_j$, respectively. One can follow the remaining proof of [Reference Peluse25, Theorem 3.3] to establish that

\begin{equation*} \sum_x \bigl| \mathbb{E}_{y\in[N']}f_j(x+q'q^by) \bigr|\gg_{C,d}\delta^{O_{d}(1)}N, \end{equation*}

where $q'\ll_{C,d} \delta^{-O_{d}(1)} $, $b\ll_{d} 1+ \max_{1\leqslant i\leqslant j-1}b_i \ll_{d} 1$, and

\begin{equation*} N' \gg_{C,d} (\delta/q)^{O_{d}(1)} M_{j+1}^{\deg P_j}\gg_{C,d} (\delta/q)^{O_{d}(1)} M^{\deg P_j}. \end{equation*}

An application of the Cauchy–Schwarz inequality then yields

\begin{equation*} \delta^{O_{d}(1)} \ll_{C,d} N^{-1} \sum_{x\in \mathbb{Z}} \mathbb{E}_{y,y'\in [N']} f_j(x+q'q^b y)\overline{f_j(x+q'q^b y')}. \end{equation*}

Making the change of variables $x+q'q^b y \to x$ then gives

\begin{equation*} \delta^{O_{d}(1)} \ll_{C,d} N^{-1} \sum_{x\in \mathbb{Z}} f_j(x) \mathbb{E}_{y,y'\in [N']} \overline{f_j(x+q'q^b (y'-y))}. \end{equation*}

The proposition follows by noting, from Lemma 2.2, that the function

\begin{equation*} \phi_j (x) = \mathbb{E}_{y,y'\in [N']} \overline{f_j(x+q'q^b (y'-y))}, \end{equation*}

is $O_{C,d}((q/\delta)^{O_d(1)} M^{-\deg P_j})$-Lipschitz along $q'q^b\cdot \mathbb{Z}$.

2.2. Correlation for all functions

In order to establish correlations for all functions, the final tool we require is a decomposition lemma due to Gowers [Reference Gowers12, Proposition 3.6]. If $\|\cdot\|$ is a seminorm on an inner product space, recall that its dual seminorm $\|\cdot\|^*$ is defined by

(2.5)

\begin{equation} \|f\|^{*} := \sup_{\|g\|\leqslant1}|\langle f,g\rangle|. \end{equation}

Hence,

(2.6)

\begin{equation} \left| \left\langle f,g\right\rangle\right| \leqslant \left\| f\right\|^* \left\| g\right\|. \end{equation}

Lemma 2.8. (Decomposition)

Let $\left\| \cdot\right\|$ be a norm on the space of complex-valued functions with support on the interval $[N]$. If $\varepsilon \gt 0$ and $f:\mathbb{Z}\to\mathbb{C}$ is a function supported on the interval $[N]$, then there is a decomposition $f = f_{\mathrm{str}} + f_\mathrm{unf}$ such that

\begin{equation*} \left\| f_\mathrm{str}\right\|^* \leqslant \varepsilon^{-1} \left\| f\right\|_2 \quad \text{and }\quad \left\| f_\mathrm{unf}\right\| \leqslant \varepsilon \left\| f\right\|_2. \end{equation*}

We are now ready to prove Theorem 2.4 under the additional assumption that $1/C \leqslant q^{d-1}M^d/N \leqslant C$. Without loss of generality, we may assume for the remainder of this section that $\deg P_1 \lt \cdots \lt \deg P_m$. Let $d_i = \deg P_i$ and $d = d_m$. We proceed by induction to prove the existence of $1$-bounded functions $\Phi_1,\cdots,\Phi_{m-1}:\mathbb{Z}\to\mathbb{C}$, where each $\Phi_i$ is $O_{C,d}((q/\delta)^{O_{d}(1)} M^{-\deg P_i})$-Lipschitz along $q'q^{b}\cdot\mathbb{Z}$ for some $q'\ll_{C,d}\delta^{-O_{d}(1)}$ and $b\ll_{d}1$, such that

(2.7)

\begin{equation} \bigl| \Lambda_{P_1,\cdots,P_m}^{N,M}(f_0,\Phi_1,\cdots,\Phi_{j-1},f_j,\cdots,f_m) \bigr| \geqslant \delta_j \end{equation}

for every $1 \leqslant j \leqslant m$ and some $\delta_j \gg_{C,d}\delta^{O_d(1)}$. Once this is proved, the desired correlation results for all $f_j$’s follow from Proposition 2.5.

The base case $j=1$ of (2.7) follows from Proposition 2.5 (1) directly. For the induction step, suppose that $\Phi_1,\cdots,\Phi_{j-1}$ have been constructed for some $1 \leqslant j \lt m$ such that (2.7) holds. Our goal is to construct $\Phi_j$ such that (2.7) holds for $j+1$.

Define the seminorm $\left\| \cdot\right\|$ by setting

\begin{equation*} \left\| f\right\| = \sup_{g_0,g_{j+1},\cdots,g_m} N \cdot \Bigl| \Lambda_{P_1,\cdots,P_m}^{N,M}(g_0,\Phi_1,\cdots,\Phi_{j-1},f,g_{j+1},\cdots,g_m) \Bigr|, \end{equation*}

where the supremum is taken over all $1$-bounded functions $g_0,g_{j+1},\cdots,g_m$ supported on $[N]$. Applying Lemma 2.8 with $\varepsilon=\delta_j N^{1/2}/2$ we obtain a decomposition $f_j =f_j^{\text{str}}+ f_j^{\text{unf}}$ with

(2.8)

\begin{equation} \left\| f_j^{\text{str}}\right\|^*\leqslant 2\delta_j^{-1} N^{-1/2} \left\| f_j\right\|_2\leqslant 2\delta_j^{-1}, \end{equation}

and

(2.9)

\begin{equation} \left\| f_j^{\text{unf}}\right\|\leqslant (\delta_j/2)N^{1/2}\left\| f_j\right\|_2\leqslant \delta_j N /2. \end{equation}

After a change of variables replacing $x$ by $x-P_j(y)$, we can write

\begin{equation*} \Lambda_{P_1,\cdots,P_m}^{N,M}(f_0,\Phi_1,\cdots,\Phi_{j-1},f_j,\cdots,f_m) = N^{-1} \langle f_j, G_j\rangle, \end{equation*}

where $G_j$ is the dual function defined by

\begin{align*} G_j(x)&=\mathbb{E}_{y\in [M]} f_0(x-P_j(y)) \prod_{1\leqslant i \leqslant j-1} \Phi_i(x+P_i(y)-P_j(y))\\ &\quad\times \prod_{j+1 \leqslant i \leqslant m}f_i(x+P_i(y)-P_j(y)). \end{align*}

Then it follows from (2.7) and the triangle inequality that

\begin{equation*} \delta_j \leqslant N^{-1} | \left\langle f_j, G_j\right\rangle| \leqslant N^{-1} |\left\langle f_j^{\text{unf}},G_j\right\rangle| + N^{-1} |\left\langle f_j^{\text{str}},G_j\right\rangle|. \end{equation*}

From the definition of the norm $\left\| \cdot\right\|$ and the inequality (2.9) one can deduce that

\begin{equation*} |\left\langle f_j^{\text{unf}},G_j\right\rangle|\leqslant \left\| f_j^{\text{unf}}\right\|\leqslant \delta_j N/2. \end{equation*}

Hence, it follows from (2.6), (2.8) and the above two inequalities that

\begin{equation*} \delta_j/2 \leqslant N^{-1} |\left\langle f_j^{\text{str}},G_j\right\rangle| \leqslant N^{-1} \left\| f_j^{\text{str}}\right\|^* \left\| G_j\right\| \leqslant 2\delta_j^{-1} N^{-1} \left\| G_j\right\|. \end{equation*}

Therefore, $\left\| G_j\right\|\gg \delta_j^2 N$, which implies that there exist 1-bounded functions $g_0,g_{j+1},\cdots,g_m:\mathbb{Z}\to\mathbb{C}$ supported on the interval $[N]$ such that

\begin{equation*} \bigl| \Lambda_{P_1,\dots, P_m}^{N,M}(g_0,\Phi_1,\cdots,\Phi_{j-1},G_j,g_{j+1},\cdots,g_m) \bigr| \gg \delta_j^2 \gg_{C,d}\delta^{O_d(1)}. \end{equation*}

By Proposition 2.5, there exists a $1$-bounded function $\Phi_j:\mathbb{Z}\to\mathbb{C}$ which is $O_{C,d}((q/\delta)^{O_{d}(1)} M^{-\deg P_j})$-Lipschitz along $q_jq^{b_j}\cdot \mathbb{Z}$ for some $q_j\ll_{C,d}\delta^{-O_{d}(1)}$ and $b_j\ll_{d}1$, such that

\begin{equation*} \Bigl| \sum_x G_j(x) \Phi_j(x) \Bigr| \gg_{C,d} \delta^{O_{d}(1)}N. \end{equation*}

We may enlarge $q',b$ if necessary to ensure that all of $\Phi_1,\cdots,\Phi_j$ are Lipschitz functions along $q'q^b \cdot \mathbb{Z}$, while maintaining the bounds $q' \ll_{C,d}\delta^{-O_d(1)}$ and $b\ll_d1$. Expanding the dual function $G_j$, we thus obtain that

(2.10)

\begin{equation} \Bigl| \Lambda_{P_1,\cdots,P_m}^{N,M}(f_0,\Phi_1,\cdots,\Phi_{j-1},\Phi_j,f_{j+1},\cdots,f_m) \Bigr| \gg_{C,d} \delta^{O_{d}(1)}. \end{equation}

This completes the induction step, thereby proving Theorem 2.4 in the case when $1/C \leqslant q^{d-1}M^d/N \leqslant C$.

Remark. In the above argument, we used Lemma 2.8, which is based on the Hahn–Banach theorem, to deduce intermediate conclusions for the dual functions $G_j$, which we then used to deduce conclusions for $f_{j+1}$. A modern way of executing this maneuver (in a much more sophisticated setting) was recently introduced by Manners [Reference Manners24] and is referred to as “stashing”.

2.3. Theorem 2.4 for general $M$

Finally, we address the general case of Theorem 2.4 by reducing to the case considered in Section 2.2 after dividing $[N]$ into subintervals of appropriate lengths.

Since each $P_j$ has $(C,q)$-coefficients and $\deg P_j\leqslant d$, we have for $1 \leqslant y \leqslant M$,

(2.11)

\begin{equation} |P_j(y)| \leqslant d C^2 q^{d-1} |y|^d \leqslant d C^2 q^{d-1}M^d. \end{equation}

Set $N_0 = dC^2 q^{d-1}M^d$. We may assume that $N_0 \leqslant N$, since otherwise $N/(dC^2) \leqslant q^{d-1}M^d \leqslant N$ and the conclusion follows from the previous case (after replacing $C$ by $dC^2$). Now divide $[N]$ into a collection $\mathcal{I}$ of $\asymp N/N_0$ subintervals, such that each interval $I \in \mathcal{I}$ has length between $N_0$ and $2N_0$.

For the rest of the proof, fix some $1 \leqslant i \leqslant m$; we aim to deduce the correlation condition for $f_i$. Writing $f_i\vert_I$ for the restriction of $f_i$ to $I$, we have

(2.12)

\begin{equation} \bigl| \Lambda_{P_1,\dots,P_m}^{N,M}(f_0, \cdots,f_{i-1}, f_i\vert_{I}, f_{i+1},\dots,f_m) \bigr| \gg \delta \cdot \frac{N_0}{N}, \end{equation}

for each interval $I$ in a subcollection $\mathcal{I}'$ of $\mathcal{I}$ with $|\mathcal{I}'| \gg N/N_0$. Note that if $x+P_i(y) \in I$, then from (2.11) we have $x \in J$ and $x + P_j(y) \in J$ for each $1 \leqslant j \leqslant m$, where $J = I + [-2N_0, 2N_0]$ is an interval of length at most $6N_0$. Hence we may assume that all of the functions $f_0,\dots,f_{i-1},f_i\vert_{I},f_{i+1},\dots,f_m$ appearing in (2.12) are supported on $J$. By defining $\tilde{f}_j$ to be a suitable shift of $f_j$ for $j \neq i$ and defining $\tilde{f}_i$ to be a suitable shift of $f_i\vert_I$, we may ensure that each $\tilde{f}_j$ is supported on $[1, 6N_0]$ and obtain

\begin{equation*} \bigl| \Lambda_{P_1,\dots,P_m}^{6N_0,M}(\tilde{f}_0, \tilde{f}_1,\cdots,\tilde{f}_m) \bigr| \gg \delta. \end{equation*}

Since $q^{d-1}M^d\asymp_C N_0$, it follows from the conclusion in Section 2.2 that there exist positive integers $q_I'\ll_{C,d} \delta^{-O(1)}$ and $b_I\ll 1$, as well as a $1$-bounded function $\phi_I: \mathbb{Z}\rightarrow\mathbb{C}$ which is $O_{C,d}((q/\delta)^{O(1)}M^{-d_i})$-Lipschitz along $q_I'q^{b_I}\cdot \mathbb{Z}$, such that

(2.13)

\begin{equation} \bigl| \sum_{x \in \mathbb{Z}} f_i\vert_I(x) \phi_I(x) \bigr| \gg_{C,d} \delta^{O_d(1)}N_0. \end{equation}

By the pigeonhole principle, there exists a subset $\mathcal I^{\prime\prime} \subset \mathcal I'$ satisfying $|\mathcal{I}^{\prime\prime}| \gg_{C,d} \delta^{O_d(1)} N/N_0$, such that $q_I' = q'$ and $b_I = b$ are independent of $I$ for $I \in \mathcal{I}^{\prime\prime}$.

For $I \in \mathcal{I}^{\prime\prime}$, we may change the values of $\phi_I$ near the endpoints of $I$ appropriately to obtain a modified function $\phi_I': \mathbb{Z}\rightarrow\mathbb{C}$ which is supported on $I$ and takes the value $0$ at the endpoints of $I$, while ensuring that $\phi_I'$ remains $O_{C,d}((q/\delta)^{O(1)}M^{-d_i})$-Lipschitz along $q'q^{b}\cdot \mathbb{Z}$ and that (2.13) still holds with $\phi_I$ replaced by $\phi_I'$. More specifically, let

\begin{equation*} \eta = \frac{1}{10N_0} \bigl| \sum_{x \in \mathbb{Z}} f_i\vert_I(x) \phi_I(x) \bigr| \gg_{C,d} \delta^{O_d(1)}. \end{equation*}

If $I = [u, v]$ then we define $\phi_I'$ by $\phi_I'(x)=0$ for $x \leqslant u$ and for $x \geqslant v$, $\phi_I'(x) = \phi_I(x)$ for $x \in [u+\eta N_0, v-\eta N_0]$, and requiring that $\phi_I'$ is linear on $[u, u+\eta N_0]$ and on $[v-\eta N_0, v]$. By linearity, $\phi_I'$ is $O(\eta^{-1}N_0^{-1})$-Lipschitz on $[u, u+\eta N_0]$ and on $[v-\eta N_0, v]$, and thus the desired Lipschitz condition on $\phi_I'$ is satisfied. Moreover, replacing $\phi_I$ by $\phi_I'$ in (2.13) does not change the summand unless $x \in [u, u+\eta N_0]$ or $x \in [v-\eta N_0, v]$. Since there are at most $2\eta N_0$ such values of $x$, it follows that (2.13) remains valid for $\phi_I'$ by our choice of $\eta$.

Next, by replacing $\phi_I'$ by $z_I\phi_I'$ for some complex number $z_I$ with $|z_I|=1$, we may assume that

\begin{equation*} \sum_{x \in \mathbb{Z}} f_i\vert_I(x) \phi_I'(x) \gg_{C,d} \delta^{O(1)}N_0. \end{equation*}

Finally, define $\phi:\mathbb{Z}\rightarrow\mathbb{C}$ by gluing the functions $\phi_I'$ together for $I \in \mathcal{I}^{\prime\prime}$; i.e. set $\phi(x) = \phi_I'(x)$ if $x \in I$ for some $I \in \mathcal{I}^{\prime\prime}$ and set $\phi(x)=0$ otherwise. This establishes the desired conclusion, thereby completing the proof of Theorem 2.4.

3. Local factors

In this section, we adopt the notation for factors from [Reference Green and Tao15], originally derived from ergodic theory; see also [Reference Peluse and Prendiville26, Section 4].

Definition 3.1. (Factor)

We define a factor $\mathcal{B}$ of $[N]$ to be a partition of $[N]$, so that $[N] = \sqcup_{B \in \mathcal{B}} B$. A factor $\mathcal{B}'$ refines $\mathcal{B}$ if every element of $\mathcal{B}$ is a union of elements of $\mathcal{B}'$. The join $\mathcal{B}_1\vee\dots\vee\mathcal{B}_d$ of factors $\mathcal{B}_1, \dots, \mathcal{B}_d$ is the factor formed by taking the $d$-fold intersections of the elements of $\mathcal{B}_1$, …, $\mathcal{B}_d$, that is,

\begin{equation*} \mathcal{B}_1\vee\dots\vee\mathcal{B}_d:=\{B_1\cap\dots\cap B_d:B_i\in\mathcal{B}_i\text{ for }i=1,\dots,d\}. \end{equation*}

Definition 3.2. (Measurability, projection)

Given a factor $\mathcal{B}$ of $[N]$, we say that a function $f : [N] \to \mathbb{C}$ is $\mathcal{B}$-measurable if it is constant on the elements of $\mathcal{B}$. Define the projection of any function $f : [N] \to \mathbb{C}$ onto $\mathcal{B}$ by

(3.1)

\begin{equation} \Pi_\mathcal{B} f(x) = \mathbb{E}_{y \in B(x)} f(y), \end{equation}

where $B(x)$ is the unique atom of $\mathcal{B}$ containing $x$.

Notice that $\Pi_\mathcal{B} f$ is $\mathcal{B}$-measurable and corresponds to the conditional expectation of $f$ with respect to the $\sigma$-algebra generated by the elements of $\mathcal{B}$. The following lemma illustrates why we work with projections onto factors: the $l^2$-norm of such projections is non-decreasing under refinement of the factor. For this reason, we refer to $\left\| \Pi_\mathcal{B} f\right\|_2^2$ as the energy.

Lemma 3.3. (Pythagoras theorem for projections)

Let $\mathcal{B},\mathcal{B}'$ be factors of $[N]$ such that $\mathcal{B}'$ refines $\mathcal{B}$. Let $f: [N]\rightarrow \mathbb{C}$ be a function. Then

\begin{equation*} \big\| \Pi_{\mathcal{B}'}f \big\|_2^2 = \big\| \Pi_{\mathcal{B}}f \big\|_2^2 + \big\| \Pi_{\mathcal{B}'}f -\Pi_\mathcal{B} f \big\|_2^2. \end{equation*}

Proof. See [Reference Peluse and Prendiville26, Lemma 4.3].
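The identity in Lemma 3.3 can be checked numerically on a toy example. The following is only an illustrative sketch: the helper names `project` and `energy`, the test function `f`, and the two factors `coarse` and `fine` are assumptions made for this example, and $\|\cdot\|_2^2$ is taken to be the unnormalised sum of squares.

```python
from fractions import Fraction

def project(f, factor):
    """Return Pi_B f as a dict: each point receives the average of f over its atom."""
    out = {}
    for atom in factor:
        avg = sum(Fraction(f[x]) for x in atom) / len(atom)
        for x in atom:
            out[x] = avg
    return out

def energy(g):
    """Squared l^2 norm of a real-valued function (unnormalised sum of squares)."""
    return sum(v * v for v in g.values())

N = 12
f = {x: (x * x) % 7 for x in range(1, N + 1)}        # arbitrary real-valued test function
coarse = [list(range(1, 7)), list(range(7, 13))]      # a factor B of [N]
fine = [list(range(1, 4)), list(range(4, 7)),
        list(range(7, 10)), list(range(10, 13))]      # a factor B' refining B

pf_coarse = project(f, coarse)
pf_fine = project(f, fine)
diff = {x: pf_fine[x] - pf_coarse[x] for x in f}

# Pythagoras: energy of the finer projection splits into two orthogonal pieces.
assert energy(pf_fine) == energy(pf_coarse) + energy(diff)
```

Exact rational arithmetic is used so that the identity holds with equality rather than up to rounding.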

In this paper, we will exclusively work with factors whose atoms are arithmetic progressions.

Definition 3.4. (Local factor)

Let $\mathcal{B}$ be a factor of $[N]$. Let $q, M \geqslant 1$. We say that $\mathcal{B}$ is a local factor of resolution $M$ and modulus $q$ if every atom of $\mathcal{B}$ is an arithmetic progression of step $q$ and length in $[M, 2M]$.

For example, the trivial factor consisting of only one atom is a local factor of resolution $N$ and modulus $1$. Our definition of local factors is motivated by, but slightly different from, the definition in [Reference Peluse and Prendiville26], where the atoms are required to have length exactly $M$. This flexibility leads to the following simple lemma which turns out to be rather convenient for our arguments.

Lemma 3.5. (Existence of local factors)

Let $\mathcal{B}, \mathcal{B}^{\prime\prime}$ be local factors of $[N]$ of modulus $q$ and resolution $M, M^{\prime\prime}$, respectively, and $100M^{\prime\prime}\leqslant M$. Suppose that $\mathcal{B}^{\prime\prime}$ refines $\mathcal{B}$. Then for any integer $M' \in [10M^{\prime\prime}, M/10]$, there exists a local factor $\mathcal{B}'$ of $[N]$ of resolution $M'$ and modulus $q$, such that $\mathcal{B}'$ refines $\mathcal{B}$ and $\mathcal{B}^{\prime\prime}$ refines $\mathcal{B}'$.

Proof. Let $P$ be an arbitrary atom of $\mathcal{B}$, so that $P$ is an arithmetic progression of step $q$ and length in $[M, 2M]$. Since $\mathcal{B}^{\prime\prime}$ refines $\mathcal{B}$, we have a partition

\begin{equation*} P = P_1 \sqcup P_2 \sqcup \cdots \sqcup P_r, \end{equation*}

where each $P_i$ is an atom of $\mathcal{B}^{\prime\prime}$ which is an arithmetic progression of step $q$ and length in $[M^{\prime\prime}, 2M^{\prime\prime}]$. Without loss of generality, we may assume that all elements of $P_i$ are smaller than all elements of $P_j$ whenever $i \lt j$. Let $x_i = |P_i|/M'$ so that $x_i \in [y, 2y]$ where $y = M^{\prime\prime}/M' \leqslant 1/10$. Then

\begin{equation*} s := x_1 + x_2 + \cdots + x_r = \frac{|P|}{M'} \geqslant \frac{M}{M'} \geqslant 10. \end{equation*}

We will define a sequence

\begin{equation*} 0 = i_0 \lt i_1 \lt \cdots \lt i_{k-1} \lt i_k = r, \end{equation*}

such that

\begin{equation*} x_{i_{j-1}+1} + \cdots + x_{i_j} \in [1, 2], \end{equation*}

for each $1 \leqslant j \leqslant k$. Once this construction is completed, we can partition $P$ into $k$ arithmetic progressions $P_{i_{j-1}+1} \cup \cdots \cup P_{i_j}$ ($1 \leqslant j \leqslant k$), each of which has step $q$ and length in $[M', 2M']$. Performing this procedure for each atom of $\mathcal{B}$, we obtain the desired refinement $\mathcal{B}'$ of $\mathcal{B}$.

To construct the sequence $\{i_j\}$, first choose a positive integer $k$ such that $s/k \in [1.4, 1.6]$. The existence of such $k$ easily follows from $s \geqslant 10$. Now, for each $1 \leqslant j \leqslant k$, define $i_j$ to be the smallest index with the property that

\begin{equation*} x_1 + x_2 + \cdots + x_{i_j} \geqslant \frac{js}{k}. \end{equation*}

Clearly we must have $i_k = r$ and the upper bound

\begin{equation*} x_1 + x_2 + \cdots + x_{i_j} \leqslant x_1 + \cdots + x_{i_j-1} + 2y \leqslant \frac{js}{k} + 0.2. \end{equation*}

It follows that

\begin{equation*} x_{i_{j-1}+1} + \cdots + x_{i_j} \leqslant \frac{js}{k}+0.2 - \frac{(j-1)s}{k} = \frac{s}{k} + 0.2 \leqslant 1.8 \end{equation*}

and

\begin{equation*} x_{i_{j-1}+1} + \cdots + x_{i_j} \geqslant \frac{js}{k} - \left(\frac{(j-1)s}{k}+0.2\right) = \frac{s}{k}-0.2 \geqslant 1.2. \end{equation*}

This completes the proof.
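The grouping procedure in the proof of Lemma 3.5 is constructive and can be sketched in a few lines. The function name `cut_points` and the sample atom lengths below are hypothetical, chosen so that $M^{\prime\prime} = 10$, $M' = 100$ and the atom $P$ of $\mathcal{B}$ has $|P| = 1495$; this is a minimal sketch under those assumptions rather than part of the argument.

```python
from fractions import Fraction
from math import ceil

def cut_points(lengths, M_prime):
    """Given |P_1|,...,|P_r| (consecutive atoms of B'' inside one atom of B),
    return indices 0 = i_0 < i_1 < ... < i_k = r as in the proof of Lemma 3.5."""
    x = [Fraction(L, M_prime) for L in lengths]    # x_i = |P_i| / M'
    s = sum(x)
    k = ceil(s / Fraction(8, 5))                   # smallest k with s/k <= 1.6
    assert Fraction(7, 5) <= s / k <= Fraction(8, 5)   # holds whenever s >= 10
    cuts, partial, i = [0], Fraction(0), 0
    for j in range(1, k + 1):
        # i_j is the smallest index with x_1 + ... + x_{i_j} >= j*s/k
        while partial < j * s / k:
            partial += x[i]
            i += 1
        cuts.append(i)
    return cuts

M_prime = 100
lengths = [10 + (7 * t) % 11 for t in range(100)]  # each length lies in [10, 20]
cuts = cut_points(lengths, M_prime)
block_lengths = [sum(lengths[a:b]) for a, b in zip(cuts, cuts[1:])]
```

Each entry of `block_lengths` is the length of one atom of the new factor $\mathcal{B}'$ inside $P$, and the proof guarantees that it lies in $[1.2M', 1.8M']$.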

In our proof of the popular difference result (Theorem 1.2), we need to work with a chain of local factors where each factor in the chain refines the next one.

Definition 3.6. (Local factor chain)

Let $q, M_1,\dots,M_m$ be positive integers. Let $\mathcal{B}_1,\dots,\mathcal{B}_m$ be local factors of $[N]$ of modulus $q$ and resolution $M_1,\dots,M_m$, respectively. We say that $(\mathcal{B}_1,\dots,\mathcal{B}_m)$ is a local factor chain of resolution $(M_1,\dots,M_m)$ and modulus $q$, if $\mathcal{B}_i$ is a refinement of $\mathcal{B}_{i+1}$ for each $1 \leqslant i \lt m$.

Let $(\mathcal{B}_1,\dots,\mathcal{B}_m)$ and $(\mathcal{B}_1',\dots,\mathcal{B}_m')$ be two local factor chains. We say that $(\mathcal{B}_1',\dots,\mathcal{B}_m')$ refines $(\mathcal{B}_1,\dots,\mathcal{B}_m)$ if $\mathcal{B}_i'$ refines $\mathcal{B}_i$ for each $1 \leqslant i \leqslant m$.

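To visualize the relationships between factors, we use the notation $\mathcal{B} \longrightarrow \mathcal{B}'$ to indicate that $\mathcal{B}$ refines $\mathcal{B}'$. Thus a local factor chain $(\mathcal{B}_1,\cdots,\mathcal{B}_m)$ can be visualized as:

\begin{equation*} \mathcal{B}_1 \longrightarrow \mathcal{B}_2 \longrightarrow \cdots \longrightarrow \mathcal{B}_m, \end{equation*}

and the relationship that $(\mathcal{B}_1',\cdots,\mathcal{B}_m')$ refines $(\mathcal{B}_1,\cdots,\mathcal{B}_m)$ can be visualized as:

\begin{equation*} \begin{array}{ccccccc} \mathcal{B}_1' & \longrightarrow & \mathcal{B}_2' & \longrightarrow & \cdots & \longrightarrow & \mathcal{B}_m' \\ \downarrow & & \downarrow & & & & \downarrow \\ \mathcal{B}_1 & \longrightarrow & \mathcal{B}_2 & \longrightarrow & \cdots & \longrightarrow & \mathcal{B}_m \end{array} \end{equation*}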

Lemma 3.7. (Existence of local factor chains)

Let $(\mathcal{B}_1,\dots,\mathcal{B}_m)$ be a local factor chain of $[N]$ of resolution $(M_1,\dots,M_m)$ and modulus $q$. Let $q', M_1',\dots,M_m'$ be positive integers such that $q'$ is divisible by $q$, and

\begin{equation*} 10M_{i-1} \leqslant \frac{q'}{q}M_i' \leqslant \frac{M_i}{10} \end{equation*}

for each $1 \leqslant i \leqslant m$, with the convention that $M_0=1$. Then there exists a local factor chain $(\mathcal{B}_1',\dots,\mathcal{B}_m')$ of resolution $(M_1',\dots,M_m')$ and modulus $q'$ such that $(\mathcal{B}_1',\dots,\mathcal{B}_m')$ refines $(\mathcal{B}_1,\dots,\mathcal{B}_m)$.

Proof. First we prove the lemma in the special case $q'=q$. For each $1 \leqslant i \leqslant m$, we will apply Lemma 3.5 with $\mathcal{B} = \mathcal{B}_{i}$, $\mathcal{B}^{\prime\prime} = \mathcal{B}_{i-1}$ (taking $\mathcal{B}_0$ to be the trivial factor where each atom is a singleton), $M = M_i$, $M^{\prime\prime} = M_{i-1}$, and $M' = M_i'$. The hypothesis $10M^{\prime\prime} \leqslant M' \leqslant M/10$ is satisfied by our assumption. Hence we obtain a local factor $\mathcal{B}_i'$ of resolution $M_i'$ and modulus $q$, such that $\mathcal{B}_i'$ refines $\mathcal{B}_i$ and $\mathcal{B}_{i-1}$ refines $\mathcal{B}_i'$. The following diagram illustrates the construction in the case $m=3$, where the dotted arrows represent refinement relations from applying Lemma 3.5:

Since $\mathcal{B}_{i-1}'$ refines $\mathcal{B}_{i-1}$ and $\mathcal{B}_{i-1}$ refines $\mathcal{B}_i'$, it follows that $\mathcal{B}_{i-1}'$ refines $\mathcal{B}_i'$ for all $2\leqslant i\leqslant m$, as illustrated in the diagram above by the double arrows. Therefore, $(\mathcal{B}_1',\dots,\mathcal{B}_m')$ is a local factor chain which refines $(\mathcal{B}_1,\cdots,\mathcal{B}_m)$, as desired.

Now we treat general $q'$. By the above procedure, we first obtain a local factor chain $(\mathcal{B}_1^{\prime\prime},\dots,\mathcal{B}_m^{\prime\prime})$ of resolution $(q'M_1'/q, \cdots, q'M_m'/q)$ and modulus $q$ which refines $(\mathcal{B}_1,\dots,\mathcal{B}_m)$. Next, for each $1 \leqslant i \leqslant m$, divide each atom of $\mathcal{B}_i^{\prime\prime}$ into residue classes modulo $q'$ to form a refinement $\mathcal{B}_i'$ of $\mathcal{B}_i^{\prime\prime}$. Since the atoms of $\mathcal{B}_i^{\prime\prime}$ have length in $[q'M_i'/q, 2q'M_i'/q]$ and have step $q$, the atoms of $\mathcal{B}_i'$ have length in $[M_i', 2M_i']$. Hence $(\mathcal{B}_1',\dots,\mathcal{B}_m')$ is a local factor chain of resolution $(M_1',\dots,M_m')$ and modulus $q'$ which refines $(\mathcal{B}_1^{\prime\prime},\cdots,\mathcal{B}_m^{\prime\prime})$, and hence also $(\mathcal{B}_1,\dots,\mathcal{B}_m)$. This completes the proof.
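The passage from modulus $q$ to modulus $q'$ amounts to splitting a progression of step $q$ into $q'/q$ interleaved progressions of step $q'$. A minimal sketch follows; the function name `split_into_residue_classes` and the numerical values ($q = 3$, $q' = 12$, $M' = 5$, an atom of length $27$ starting at $2$) are assumptions made purely for illustration.

```python
def split_into_residue_classes(start, q, length, q_prime):
    """Split the progression {start + q*t : 0 <= t < length} into its
    residue classes modulo q_prime, assuming q divides q_prime."""
    assert q_prime % q == 0
    r = q_prime // q                               # number of residue classes
    atom = [start + q * t for t in range(length)]
    return [atom[c::r] for c in range(r)]          # every r-th term has step q_prime

q, q_prime, M_prime = 3, 12, 5
length = 27                                        # lies in [q'M'/q, 2q'M'/q] = [20, 40]
pieces = split_into_residue_classes(start=2, q=q, length=length, q_prime=q_prime)
piece_lengths = [len(p) for p in pieces]
```

In this example the four pieces have lengths $7, 7, 7, 6$, all within $[M', 2M'] = [5, 10]$.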

4. Polylogarithmic bound in the density result

In this section, we prove Theorem 1.1 using the improved density increment strategy developed by Heath-Brown [Reference Heath-Brown18] and Szemerédi [Reference Szemerédi31]. For convenience, we follow the presentation in Green–Tao [Reference Green and Tao15].

We begin by briefly outlining the idea. Assume that $A\subseteq[N]$ is a subset of density $\alpha$ that contains no nontrivial polynomial configurations. Since Lipschitz functions are nearly constant on long arithmetic progressions, Peluse [Reference Peluse25] deduced from the inverse theorem (Theorem 2.4) and the pigeonhole principle that there exists a progression $P$ on which $A$ exhibits a density increment of size $\alpha^{O(1)}$. It turns out that this approach typically requires $O(\alpha^{-O(1)})$ iterations to reach a contradiction. The strategy adopted in this section improves on this by collecting several Lipschitz functions together to construct a new one, and then decomposing $[N]$ into progressions on which this composite function is nearly constant. This refinement allows us to obtain a density increment of size $c\alpha$ on one of the progressions, thereby reducing the number of iterations to just $O(\log 1/\alpha)$. While the progressions used at each step are shorter than those in Peluse’s approach, it was observed in [Reference Green and Tao15] that the progression length plays a less critical role in the overall argument.

Let us clarify the above argument. We first construct a nonnegative structural function $g\geqslant 0$ such that it has the same mean value as $1_A$ (say, $\alpha$), and the corresponding counting expressions weighted by $g$ and $1_A$ are comparable. This construction is formalized in Lemma 4.1 below. We then carry out the density-increment argument with respect to $g$. Lemma 4.2, in essence, asserts that the density of $g$ increases by a factor of at least $1+c$ on some arithmetic progression.

Lemma 4.1. (Weak regularity lemma-I)

Let $M, N,q$ be positive integers and let $P_1,\dots,P_m\in\mathbb{Z}[y]$ be polynomials with $(C,q)$-coefficients such that $\deg P_1 \lt \cdots \lt \deg P_m$. Suppose that $f:\mathbb{Z}\to\mathbb{C} $ is a 1-bounded function supported on the interval $[N]$. Let $d = \deg P_m$, and assume that $M \leqslant (N/q^{d-1})^{1/d}$. For $\delta \in (0,1/2)$, one of the following two statements holds:

(1) $M\ll_{C,d}(q/\delta)^{O_{d}(1)}$.
(2) There exist positive integers $Q, b, M_1,\dots,M_m$ with
\begin{equation*} Q \leqslant \exp(O_{C,d}(\delta^{-O_d(1)})), \ \ b\ll_d1, \ \ M_i\gg_{C,d} Q^{-1}(\delta/q)^{O_d(1)}M^{\deg P_i}, \end{equation*}
and local factors $\mathcal{B}_1,\dots,\mathcal{B}_m$ on $[N]$ of resolution $M_1,\dots,M_m$, respectively, and modulus $Qq^b$, such that
\begin{equation*} \bigl| \Lambda_{P_1,\dots,P_m}^{N,M} (f,\dots, f) - \Lambda_{P_1,\dots,P_m}^{N,M} (f, \Pi_{\mathcal{B}_1}f,\dots, \Pi_{\mathcal{B}_m}f) \bigr|\leqslant \delta. \end{equation*}

Proof. Let $b' = b'(d)$ be a positive integer chosen sufficiently large in terms of $d$, and let $C' = C'(C,d)$ be a constant taken large enough in terms of $C,d$. Set $Q$ to be the least common multiple of all positive integers at most $C'\delta^{-b'}$, and set $M_i = C'^{-1}Q^{-1}(\delta/q)^{b'} M^{\deg P_i}$. The desired bounds for $Q$ and $M_i$ are then satisfied.

Let $\mathcal{B}_1,\dots,\mathcal{B}_m$ be local factors on $[N]$ of resolution $M_1,\dots,M_m$, respectively, and modulus $Qq^b$. Decompose $\Lambda_{P_1,\dots,P_m}^{N,M}(f,\dots,f)$ into $2^m$ terms of the form

\begin{equation*} \Lambda_{P_1,\dots,P_m}^{N,M}(f, g_1,\dots,g_m), \end{equation*}

where each $g_i \in \{\Pi_{\mathcal{B}_i}f, f - \Pi_{\mathcal{B}_i}f\}$. It suffices to show that if $g_i = f - \Pi_{\mathcal{B}_i}f$ for some $1 \leqslant i \leqslant m$ then

\begin{equation*} |\Lambda_{P_1,\dots,P_m}^{N,M}(f, g_1,\dots,g_m)| \leqslant \frac{\delta}{2^m}. \end{equation*}

Suppose, on the contrary, that this is not the case. Then by Theorem 2.4, either $M\ll_{C,d}(q/\delta)^{O_{d}(1)}$ in which case we are done, or there exist positive integers $q' \ll_{C,d}\delta^{-O_d(1)}$ and $b \ll_d1$, as well as a $1$-bounded function $\phi_i:\mathbb{Z}\rightarrow\mathbb{C}$ which is $L$-Lipschitz along $q'q^b\cdot \mathbb{Z}$ for some $L \ll_{C,d}(q/\delta)^{O_d(1)}M^{-\deg P_i}$, such that

\begin{equation*} \bigl| \sum_{x \in \mathbb{Z}} g_i(x)\phi_i(x) \bigr| \geqslant \eta N, \end{equation*}

for some $\eta \gg_{C,d} \delta^{O_d(1)}$. We may ensure that $b$ is a constant depending only on $d$ and that $q' \mid Q$ by choosing $b', C'$ large enough. If $x,y$ lie in the same atom $P$ of $\mathcal{B}_i$, then $q'q^b \mid x-y$ and $|x-y| \leqslant 2Qq^b M_i$. Hence

\begin{equation*} |\phi_i(x) - \phi_i(y)| \leqslant L|x-y| \leqslant 2L Q q^b M_i, \end{equation*}

which we may ensure to be at most $\eta/2$ by choosing $b',C'$ large enough. Since $P$ is an atom of $\mathcal{B}_i$, the average of $g_i=f-\Pi_{\mathcal{B}_i}f$ on $P$ is $0$. It follows that

\begin{equation*} \bigl| \sum_{x \in P} g_i(x)\phi_i(x) \bigr| \leqslant \frac{1}{2}\eta |P|, \end{equation*}

for each atom $P$ of $\mathcal{B}_i$. This leads to a contradiction after summing over all atoms and concludes the proof.

Lemma 4.2. (Density increment)

Let $M, N,q$ be positive integers and $P_1,\dots,P_m\in\mathbb{Z}[y]$ be polynomials with $(C,q)$-coefficients such that $\deg P_1 \lt \cdots \lt \deg P_m$. Let $d = \deg P_m$ and $M = (N/q^{d-1})^{1/d}$. Suppose that $ A \subseteq [N]$ has density $\alpha := | A|/N$ and contains no nontrivial progression of the form $x,x+P_1(y),\dots, x+P_m(y)$. Then either $N\ll_{C,d} (q/\alpha)^{O_d (1)}$ or there exist integers $q'\leqslant \exp \bigl( O_{C,d} (\alpha^{-O_d (1)}) \bigr) $, $b \ll_d 1$ and an arithmetic progression $P\subseteq[N]$ of modulus $q'q^b$ and of length $\gg_{C,d}(\alpha/q)^{O_d(1)}M^{\deg P_1} /q'$ such that

\begin{equation*} | A \cap P| \geqslant \alpha (1+c) |P| \end{equation*}

for some constant $c=c(C,d) \gt 0$.

Let us state the following $l^1$-control lemma which will be used in proving Lemma 4.2. This lemma is an extension of [Reference Peluse and Prendiville26, Lemma 5.1], and its proof is elementary.

Lemma 4.3. ( $l^1$-control)

For any functions $f_0,\dots,f_m:[N]\to\mathbb{C}$ we have

\begin{equation*} \bigl| \Lambda_{P_1,\dots,P_m}^{N,M} (f_0,\dots, f_m) \bigr|\leqslant N^{-1} \left\| f_i\right\|_1 \prod_{j\neq i} \left\| f_j\right\|_\infty. \end{equation*}
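The bound in Lemma 4.3 can be tested by brute force on small examples. The sketch below assumes the normalisation $\Lambda_{P_1,\dots,P_m}^{N,M}(f_0,\dots,f_m) = (NM)^{-1}\sum_{x}\sum_{y\in[M]} f_0(x)\prod_{i} f_i(x+P_i(y))$, as read off from the displayed formula in the proof of Lemma 4.2; the names `counting_operator` and `l1_bound`, the polynomials $y, y^2$ and the test functions are hypothetical choices for illustration.

```python
from fractions import Fraction

def counting_operator(fs, polys, N, M):
    """Brute-force Lambda^{N,M}(f_0,...,f_m) for functions stored as dicts on [N]."""
    total = Fraction(0)
    for x in range(1, N + 1):                      # f_0 vanishes outside [N]
        for y in range(1, M + 1):
            term = Fraction(fs[0].get(x, 0))
            for f, P in zip(fs[1:], polys):
                term *= f.get(x + P(y), 0)         # value 0 outside the support [N]
            total += term
    return total / (N * M)

def l1_bound(fs, i, N):
    """Right-hand side of Lemma 4.3 with the l^1 norm placed on f_i."""
    bound = Fraction(sum(abs(v) for v in fs[i].values()), N)
    for j, f in enumerate(fs):
        if j != i:
            bound *= max(abs(v) for v in f.values())
    return bound

N, M = 30, 3
polys = [lambda y: y, lambda y: y * y]             # distinct degrees, zero constant term
fs = [{x: ((a * x + 1) % 5) - 2 for x in range(1, N + 1)} for a in (1, 2, 3)]
value = counting_operator(fs, polys, N, M)
bounds = [l1_bound(fs, i, N) for i in range(len(fs))]
```

The inequality $|\Lambda| \leqslant$ `bounds[i]` should hold for every choice of the index $i$, which reflects the fact that the $l^1$ norm may be placed on any one of the functions.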

Proof of Lemma 4.2

By the assumption on $A$, we have $\Lambda_{P_1,\dots,P_m}^{N,M}(1_A,\dots,1_A)=0$. Besides, it is obvious that

\begin{equation*} \Lambda_{P_1,\dots,P_m}^{N,M}(1_A,\alpha 1_{[N]},\dots,\alpha 1_{[N]}) = \frac{\alpha^m}{NM} \sum_{x \in \mathbb{Z}} 1_A(x) \sum_{y \in [M]} 1_{[N]}(x+P_1(y)) \cdots 1_{[N]}(x+P_m(y)). \end{equation*}

Let $\eta \gt 0$ be a constant sufficiently small in terms of $C,d$ so that $|P_i(y)| \leqslant N/10$ whenever $1 \leqslant y \leqslant \eta M$ and $1\leqslant i\leqslant m$. It follows that

(4.1)

\begin{align} &\Lambda_{P_1,\dots,P_m}^{N,M}(1_A,\alpha 1_{[N]},\dots,\alpha 1_{[N]})\nonumber\\ &\geqslant \frac{\alpha^m}{NM} \sum_{x\in [N/3,2N/3]} 1_A(x) \sum_{y\leqslant \eta M} 1_{[N]} (x+P_1(y))\dots 1_{[N]} (x+P_m(y))\nonumber\\ & \geqslant \frac{\alpha^m}{NM} \Big|A \cap [N/3, 2N/3]\Big|\cdot \eta M. \end{align}

If $|A \cap [N/3, 2N/3]| \leqslant \alpha N/10$, it follows from the pigeonhole principle that either $|A \cap [1,N/3]| \geqslant 9\alpha N/20$, or $|A \cap [2N/3, N]| \geqslant 9\alpha N/20$, and the conclusion follows by taking either $P = [1, N/3]$ or $P = [2N/3, N]$. Hence we may assume that $|A \cap [N/3, 2N/3]| \geqslant \alpha N/10$ in the following. In light of (4.1) one has

\begin{equation*} \Lambda_{P_1,\dots,P_m}^{N,M}(1_A,\alpha 1_{[N]},\dots,\alpha 1_{[N]}) \geqslant \frac{\eta}{10}\alpha^{m+1}. \end{equation*}

On the other hand, by the weak regularity lemma (Lemma 4.1) applied to $f=1_A$ and $\delta = \eta \alpha^{m+1}/20$, we can find local factors $\mathcal{B}_1,\dots,\mathcal{B}_m$ on $[N]$ of resolution $M_1,\dots,M_m$, respectively, and modulus $q'q^b$ for some $q'\leqslant \exp(O_{C,d}(\alpha^{-O_d(1)}))$, $b\ll_d1$ and $M_i\gg_{C,d} (\alpha/q)^{O_{d}(1)} M^{\deg P_i}/q'$ such that

\begin{equation*} \bigl| \Lambda_{P_1,\dots,P_m}^{N,M} (1_A, \Pi_{\mathcal{B}_1}1_A,\dots, \Pi_{\mathcal{B}_m}1_A) \bigr|\leqslant \frac{\eta \alpha^{m+1}}{20}. \end{equation*}

We may decompose $\Lambda_{P_1,\dots,P_m}^{N,M}(1_A,\alpha 1_{[N]},\dots,\alpha 1_{[N]})$ into $2^m$ terms, each of which takes the form

\begin{equation*} \Lambda_{P_1,\dots,P_m}^{N,M}(1_A,g_1,\dots,g_m), \end{equation*}

where each $g_j \in \{\Pi_{\mathcal{B}_j}1_A, \alpha 1_{[N]} - \Pi_{\mathcal{B}_j}1_A\}$. It follows that

(4.2)

\begin{equation} \bigl| \Lambda_{P_1,\dots,P_m}^{N,M} (1_A, g_1,\dots,g_m) \bigr| \geqslant \frac{\eta \alpha^{m+1}}{20\cdot 2^m}, \end{equation}

for some choice of $g_1,\dots,g_m$, at least one of which (say $g_i$) is $g_i = \alpha 1_{[N]} - \Pi_{\mathcal{B}_i}1_A$.

Assume, for the sake of contradiction, that the desired density increment does not occur. Then the density of $A$ on each atom of each $\mathcal{B}_j$ is at most $\alpha(1+c)$, so that $\|g_j\|_{\infty} \leqslant \alpha(1+c) \leqslant 2\alpha$ for every $j$. Moreover, since $g_i = \alpha 1_{[N]} - \Pi_{\mathcal{B}_i}1_A$ takes constant values on atoms $P$ of $\mathcal{B}_i$, we have

\begin{equation*} \|g_i\|_1 =\sum_{P\in \mathcal{B}_i} \Bigl| \sum_{x\in P}\Pi_{\mathcal{B}_i} 1_A(x)- \alpha|P| \Bigr|= \sum_{P\in \mathcal{B}_i} \bigl| |A\cap P| - \alpha |P| \bigr|. \end{equation*}

On the other hand, we observe that the assumption $\mathbb{E}_{x\in [N]}1_A(x) =\alpha$ implies that $\sum_{x\in [N]} g_i(x)=0$. From these two formulas, we then deduce that

\begin{equation*} \|g_i\|_1 = 2 \sum_{P\in \mathcal{B}_i} \max\bigl\{|A \cap P| - \alpha |P|, 0 \bigr\} \leqslant 2c\alpha N. \end{equation*}

Thus, it follows from Lemma 4.3 that

\begin{equation*} \bigl| \Lambda_{P_1,\dots,P_m}^{N,M} (1_A, g_1,\dots,g_m) \bigr| \leqslant N^{-1} \|g_i\|_1 \prod_{j \neq i} \|g_j\|_{\infty} \leqslant 2^{m+1} c \alpha^{m+1}. \end{equation*}

This contradicts the inequality (4.2) if $c$ is sufficiently small.
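The identity $\|g_i\|_1 = 2\sum_{P}\max\{|A\cap P| - \alpha|P|, 0\}$ used above relies only on $g_i$ having mean zero, and it can be verified on a toy example. The set `A` and the factor `atoms` below are hypothetical, chosen purely for illustration.

```python
from fractions import Fraction

N = 20
A = {1, 2, 3, 5, 8, 13, 14, 15, 16, 20}            # hypothetical set of density 1/2
alpha = Fraction(len(A), N)
atoms = [range(1, 6), range(6, 11), range(11, 16), range(16, 21)]  # hypothetical factor

# alpha*1_[N] - Pi_B 1_A is constant on each atom P, so its l^1 norm is the
# sum over atoms of | |A & P| - alpha*|P| |.
excess = [len(A & set(P)) - alpha * len(P) for P in atoms]
l1_norm = sum(abs(e) for e in excess)
positive_part = sum(max(e, 0) for e in excess)

# c is the largest relative density increment over all atoms, so that
# |A & P| <= alpha*(1+c)*|P| for every atom P.
c = max(Fraction(len(A & set(P)), len(P)) for P in atoms) / alpha - 1
```

Here the excesses are $3/2, -3/2, 1/2, -1/2$, so the $l^1$ norm equals $4$, which is twice the positive part and at most $2c\alpha N = 12$.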

Proof of Theorem 1.1

For $k \geqslant 0$, we iteratively construct positive integers $N_k, q_k$, polynomials $P_i^{(k)}$ for $1 \leqslant i \leqslant m$, each with $(C,q_k)$-coefficients and satisfying $\deg P_i^{(k)} = \deg P_i$, and subsets $A_k \subset [N_k]$ of density $\alpha_k := |A_k|/N_k$ which contain no nontrivial progressions of the form $x, x+P_1^{(k)}(y),\dots,x+P_m^{(k)}(y)$.

For convenience, we write $d=\deg P_m$ and $d_i=\deg P_i$ for each $i$. We begin the iteration with $N_0 = N$, $q_0=1$, $P_i^{(0)} = P_i$ for $1 \leqslant i \leqslant m$, $A_0 = A$. At step $k$, either we have $N_k \ll_{C,d} (q_k/\alpha_k)^{O_d(1)}$ in which case we terminate the iteration process, or, by Lemma 4.2, there exist $q' \leqslant \exp(O_{C,d}(\alpha_k^{-O_d(1)}))$, $b \ll_d 1$ and an arithmetic progression $P \subset [N_k]$ of modulus $q = q'q_k^b$ and of length $\gg_{C,d} q'^{-1}(\alpha_k/q_k)^{O_d(1)} N_k^{d_1/d}$ such that

(4.3)

\begin{equation} |A_k \cap P| \geqslant \alpha_k(1+c)|P|, \end{equation}

for some constant $c=c(C,d) \gt 0$. Let $N_{k+1} = |P|$ and write $P = q \cdot [N_{k+1}] + m$ for some $m \in \mathbb{Z}$. Define

\begin{equation*} A_{k+1} := \{x \in [N_{k+1}]: qx+m \in A_k\}. \end{equation*}

Then inequality (4.3) yields that $\alpha_{k+1} = |A_{k+1}|/N_{k+1} \geqslant \alpha_k(1+c)$. Since $A_k$ contains no nontrivial progressions of the form

\begin{equation*} qx+m, qx+m+P_1^{(k)}(qy), \cdots, qx+m + P_m^{(k)}(qy), \end{equation*}

it follows that $A_{k+1}$ contains no nontrivial progressions of the form

\begin{equation*} x, x+P_1^{(k+1)}(y), \cdots, x + P_m^{(k+1)}(y), \end{equation*}

where $P_i^{(k+1)}(y) := P_i^{(k)}(qy)/q$ has $(C,q_{k+1})$-coefficients with $q_{k+1}=q_kq = q_k^{b+1}q'$.

Suppose that the process above terminates at step $K$. Since

\begin{equation*} 1 \geqslant \alpha_K \geqslant (1+c)^K\alpha, \end{equation*}

we have

\begin{equation*} K \leqslant \frac{\log (1/\alpha)}{\log (1+c)} \ll_{C,d} \log \frac{1}{\alpha}. \end{equation*}

From $q_{k+1} = q_k^{b+1}q'$ it follows that

\begin{equation*} q_K \leqslant (q')^{1+(b+1)+\cdots+(b+1)^{K-1}} \leqslant (q')^{(b+1)^K} \leqslant \exp(O_{C,d}(\alpha^{-O_d(1)})). \end{equation*}

Since for any $0\leqslant k \lt K$ we have

\begin{equation*} N_{k+1} \gg q_k^{-O_d(1)} \exp(-O_{C,d} (\alpha_k^{-O_d(1)}))N_k^{d_1/d} \gg \exp(-O_{C,d} (\alpha_k^{-O_d(1)}))N_k^{d_1/(2d)}, \end{equation*}

it follows that

\begin{equation*} N_K \gg \exp(-O_{C,d}(\alpha^{-O_d(1)})) N^{d_1/(2d)}. \end{equation*}

Since $N_K \ll (q_K/\alpha_K)^{O_d(1)}\ll \exp(O_{C,d}(\alpha^{-O_d(1)}))$, it follows that

\begin{equation*} N^{d_1/(2d)} \ll \exp(O_{C,d}(\alpha^{-O_d(1)})), \end{equation*}

which implies that $\alpha \ll (\log N)^{-c}$ as desired. Therefore, the proof of Theorem 1.1 is complete.
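The bound $K \leqslant \log(1/\alpha)/\log(1+c)$ on the number of iteration steps can be illustrated numerically. The sketch below is not part of the argument: the values of `alpha` and `c` are arbitrary assumptions, and the function `max_increment_steps` is a hypothetical helper that simply counts how many times a density can be multiplied by $1+c$ before exceeding $1$.

```python
import math

def max_increment_steps(alpha, c):
    """Largest number of steps k such that alpha * (1 + c)**k stays at most 1."""
    steps = 0
    density = alpha
    while density * (1 + c) <= 1:
        density *= 1 + c
        steps += 1
    return steps

# Assumed sample values; in the proof c = c(C, d) is a small constant.
alpha, c = 0.01, 0.1
K = max_increment_steps(alpha, c)
bound = math.log(1 / alpha) / math.log(1 + c)
```

For these sample values the loop stops within one step of the analytic bound, which reflects the fact that the density increment argument cannot run for more than $O_{C,d}(\log(1/\alpha))$ steps.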

5. Popular common difference

In this section we prove Theorem 1.2. The guiding philosophy of our approach aligns closely with that of [13, 28]. Specifically, given a function $f$, our goal is to find a regular function $g$ and a subset $H$ of $[N]$ such that $f$ and $g$ exhibit similar densities of polynomial configurations with common differences in $H$, while $g$ remains nearly constant under shifts by elements of $H$.

To this end, we introduce the following expression in this section. Suppose that $q\leqslant M\leqslant N$ are integers, and $P_1,\dots,P_m\in\mathbb{Z}[y]$ are polynomials. Suppose that $f_0,\dots,f_m:\mathbb{Z}\to\mathbb{C}$ are functions supported on the interval $[N]$. Set

\begin{equation*} \Lambda^{N,M,q}_{P_1,\dots,P_m}(f_0,f_1,\dots,f_m) = N^{-1} \sum_{x\in\mathbb{Z}} \mathbb{E}_{y\in [M] \atop q|y} f_0(x) f_1(x+P_1(y)) \cdots f_m(x+P_m(y)). \end{equation*}

Given a function $f$, the first step towards our goal is to find regular functions $g_1,\dots,g_m$ and a set $H =\left\{y\in[M]: q|y\right\}$ such that

\begin{equation*} \Lambda^{N,M,q}_{P_1,\dots,P_m}(f,\dots,f) \approx \Lambda^{N,M,q}_{P_1,\dots,P_m} (f,g_1,\dots,g_m). \end{equation*}
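To make the counting operator $\Lambda^{N,M,q}_{P_1,\dots,P_m}$ concrete, here is a minimal brute-force Python sketch of its definition. It assumes functions are represented as dictionaries on $[N]$ (zero elsewhere) and polynomials as Python callables; the name `counting_operator` and the sample data are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def counting_operator(fs, polys, N, M, q):
    """Brute-force Lambda^{N,M,q}_{P_1..P_m}(f_0..f_m): average over y in [M] with q | y."""
    ys = [y for y in range(1, M + 1) if y % q == 0]
    if not ys:
        raise ValueError("no multiples of q in [M]")
    total = Fraction(0)
    for x in range(1, N + 1):              # f_0 vanishes outside [N]
        for y in ys:
            term = Fraction(fs[0].get(x, 0))
            for f, P in zip(fs[1:], polys):
                term *= f.get(x + P(y), 0)  # zero if x + P(y) leaves [N]
            total += term
    return total / (len(ys) * N)

# Assumed toy data: the pattern x, x + y, x + y^2 inside a subset of [10].
A = {1, 2, 3, 5, 6, 10}
indicator = {a: 1 for a in A}
polys = [lambda y: y, lambda y: y * y]
value_q1 = counting_operator([indicator] * 3, polys, N=10, M=2, q=1)
value_q2 = counting_operator([indicator] * 3, polys, N=10, M=2, q=2)
```

With $q=1$ the differences $y=1,2$ contribute $3$ and $1$ configurations respectively, giving $4/(2\cdot 10) = 1/5$; with $q=2$ only $y=2$ is averaged over, giving $1/10$.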

Proposition 5.1. (Weak regularity lemma-II)

Let $P_1,\dots,P_m\in \mathbb{Z}[y]$ be polynomials with $(C,1)$-coefficients such that $\deg P_1 \lt \dots \lt \deg P_m$. Let $d = \deg P_m$ and $d_i = \deg P_i$. Let $\varepsilon \in (0,1)$ be real and $N$ a positive integer. Then either $N\leqslant \exp(\exp(O_{C,d}(\varepsilon^{-O_d(1)})))$ or the following statement holds. If $f_0,\dots,f_m:\mathbb{Z}\to\mathbb{C}$ are $1$-bounded functions supported on the interval $[N]$, then there exist $q,M$ with

\begin{equation*} N^{1/d} \geqslant M \geqslant \exp(-\exp(O_{C,d}(\varepsilon^{-O_d(1)})))N^{1/d}, \ \ q\leqslant \exp(\exp(O_{C,d} (\varepsilon^{-O_d(1)}))), \end{equation*}

and a local factor chain $(\mathcal{B}_1,\dots,\mathcal{B}_m)$ on $[N]$ of resolution $(M^{d_1},\dots,M^{d_m})$ and modulus $q$, such that

\begin{equation*} \bigl| \Lambda^{N,\varepsilon M,q}_{P_1,\dots,P_m}(f_0,f_1,\dots,f_m) - \Lambda^{N,\varepsilon M,q}_{P_1,\dots,P_m}(f_0, \Pi_{\mathcal{B}_1}f_1, \dots, \Pi_{\mathcal{B}_m}f_m) \bigr| \leqslant \varepsilon. \end{equation*}

Remark. The quantitative bounds for $q$ and $M$ come from the bounds in our inverse theorem (Theorem 2.4). Roughly speaking, we will iteratively construct a sequence $\{q_j\}_{j \geqslant 0}$ and take $q = q_j$ for some $j \ll \varepsilon^{-O(1)}$. At each iteration step, we apply the inverse theorem to produce $q_{j+1}$ from $q_j$, with $q_{j+1} = q'q_j^b$ for some $q' \ll \varepsilon^{-O(1)}$ and $b = O(1)$. This implies that $q_j$ is of the shape $\varepsilon^{-O(b^j)}$, leading to the doubly exponential bound for $q$, and hence for $M$ as well.

For the nonlinear Roth pattern studied in [28, Theorem 6.1], one can obtain a (single) exponential bound in the popular difference result, thanks to a version of Proposition 5.1 in which one can take $q_{j+1} = q'q_j$ at each iteration of the proof. Attaining this improvement in the general setting of Theorem 1.2 seems to require an inverse theorem with a precise value of $b$, possibly with $b = d$.
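The contrast drawn in this remark between the recurrences $q_{j+1} = q'q_j^b$ and $q_{j+1} = q'q_j$ can be seen by tracking the exponent $e_j$ in $q_j = (q')^{e_j}$. The sketch below is a simplification made for illustration: it assumes the same $q'$ at every step and a hypothetical value $b = 3$, whereas in the proof $q'$ may vary and only an upper bound on it is known.

```python
def exponent_of_q_prime(steps, b):
    """Exponents e_0..e_steps with q_j = (q')**e_j, for q_0 = 1 and q_{j+1} = q' * q_j**b."""
    exponent = 0
    history = [exponent]
    for _ in range(steps):
        exponent = 1 + b * exponent   # one factor q' plus b copies of the old exponent
        history.append(exponent)
    return history

general = exponent_of_q_prime(6, b=3)   # hypothetical b = 3: exponent grows like b**j
linear = exponent_of_q_prime(6, b=1)    # the case q_{j+1} = q' * q_j: exponent equals j
```

With $b = 3$ the exponents are $0, 1, 4, 13, 40, 121, 364$, that is $(b^j-1)/(b-1)$, while with $b=1$ they are $0,1,\dots,6$. After $j \ll \varepsilon^{-O(1)}$ steps the first regime gives a doubly exponential bound in $\varepsilon^{-1}$ and the second a singly exponential one.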

Proof. We shall apply the energy increment argument to prove this proposition. For $j \geqslant 0$, we will iteratively construct a sequence of local factor chains $(\mathcal{B}_1^{(j)},\dots,\mathcal{B}_m^{(j)})$ on $[N]$ of resolution $(M_j^{d_1},\cdots,M_j^{d_m})$ and modulus $q_j$, such that $(\mathcal{B}_1^{(j+1)},\dots,\mathcal{B}_m^{(j+1)})$ refines $(\mathcal{B}_1^{(j)},\dots,\mathcal{B}_m^{(j)})$ for each $j$, where

(5.1)

\begin{align} N^{1/d} \geqslant M_j& \geqslant \exp(-\exp(O_{C,d}((j+1)\varepsilon^{-O_d(1)})))N^{1/d},\ \nonumber\\ &\qquad q_j \leqslant \exp(\exp(O_{C,d} ((j+1)\varepsilon^{-O_d(1)}))). \end{align}

For $j=0$, let $(\mathcal{B}_1^{(0)},\cdots,\mathcal{B}_m^{(0)})$ be a local factor chain of resolution $(M_0^{d_1},\cdots,M_0^{d_m})$ with $M_0=N^{1/d}$ and modulus $q_0=1$. The existence of such a local factor chain easily follows from Lemma 3.5. If we have, for some $j \geqslant 0$,

(5.2)

\begin{equation} \bigl| \Lambda^{N,\varepsilon M_j,q_j}_{P_1,\dots,P_m}(f_0,f_1,\cdots,f_m) - \Lambda^{N,\varepsilon M_j,q_j}_{P_1,\dots,P_m}(f_0, \Pi_{\mathcal{B}_1^{(j)}}f_1, \cdots, \Pi_{\mathcal{B}_m^{(j)}}f_m) \bigr| \leqslant \varepsilon, \end{equation}

then we terminate the iteration process. We will show that the process must be terminated for some $j \ll_{C,d}\varepsilon^{-O(1)}$, which would conclude the proof by taking $\mathcal{B}_i = \mathcal{B}_i^{(j)}$, $q = q_j$, and $M = M_j$.

Now suppose that (5.2) fails. The left-hand side of (5.2) can be decomposed into the sum of $m$ terms of the form

\begin{equation*} \Lambda^{N,\varepsilon M_j,q_j}_{P_1,\dots,P_m}(f_0, g_1,\cdots,g_{i-1}, f_i-\Pi_{\mathcal{B}_i^{(j)}}f_i, g_{i+1},\cdots,g_m), \end{equation*}

where $g_{\ell} = \Pi_{\mathcal{B}_\ell^{(j)}}f_\ell$ for $\ell \lt i$ and $g_{\ell} = f_{\ell}$ for $\ell \gt i$. By the pigeonhole principle, there exists at least one $i\in\left\{1,\dots,m\right\}$ such that

\begin{equation*} \bigl| \Lambda^{N,\varepsilon M_j,q_j}_{P_1,\dots,P_m}(f_0, g_1,\dots, f_i-\Pi_{\mathcal{B}_i^{(j)}}f_i, \dots,g_m) \bigr|\geqslant \varepsilon/m. \end{equation*}
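For reference, the decomposition used in this step is the standard telescoping identity, written out here as a sketch. We abbreviate $\Lambda = \Lambda^{N,\varepsilon M_j,q_j}_{P_1,\dots,P_m}$ and $\Pi_\ell = \Pi_{\mathcal{B}_\ell^{(j)}}$; these abbreviations are introduced only for this display. By multilinearity of $\Lambda$,

\begin{align*} \Lambda(f_0,f_1,\dots,f_m) - \Lambda(f_0,\Pi_1 f_1,\dots,\Pi_m f_m) = \sum_{i=1}^{m} \Lambda\bigl(f_0, \Pi_1 f_1,\dots,\Pi_{i-1}f_{i-1},\, f_i - \Pi_i f_i,\, f_{i+1},\dots,f_m\bigr). \end{align*}

For instance, when $m=2$ the right-hand side is $\Lambda(f_0,f_1-\Pi_1f_1,f_2) + \Lambda(f_0,\Pi_1f_1,f_2-\Pi_2f_2)$, and the two middle terms $\mp\Lambda(f_0,\Pi_1f_1,f_2)$ cancel. If the left-hand side exceeds $\varepsilon$ in absolute value, then at least one of the $m$ summands has absolute value at least $\varepsilon/m$.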

Set $Q_l(y)=P_l(q_jy)$ for all $1\leqslant l\leqslant m$. Recalling Definition 2.1, the assumption that $P_1,\dots,P_m$ have $(C,1)$-coefficients implies that $Q_1,\dots,Q_m$ have $(C,q_j^2)$-coefficients. Therefore, recalling the notation (1.1), we can conclude from the above inequality that

\begin{equation*} \bigl| \Lambda^{N, \varepsilon M_j/q_j}_{Q_1,\dots,Q_m} (f_0, g_1,\dots, f_i-\Pi_{\mathcal{B}_i^{(j)}}f_i, \dots,g_m) \bigr|\geqslant \varepsilon/m. \end{equation*}

By Theorem 2.4, either we have $\varepsilon M_j/q_j \ll_{C,d} (q_j/\varepsilon)^{O_d(1)}$ in which case the bound $N \leqslant \exp(\exp(O_{C,d}(\varepsilon^{-O_d(1)})))$ follows and we are done, or else there exist positive integers $q'\ll_{C,d} \varepsilon^{-O(1)}$, $b \ll_d 1$, and a $1$-bounded function $\phi: \mathbb{Z}\rightarrow\mathbb{C}$ which is $O_{C,d}((q_j/\varepsilon)^{O(1)}M_j^{-d_i})$-Lipschitz along $q'q_j^b\cdot\mathbb{Z}$, such that

(5.3)

\begin{equation} \bigl| \sum_{x \in \mathbb{Z}} (f_i-\Pi_{\mathcal{B}_i^{(j)}}f_i)(x) \phi(x) \bigr| \geqslant \eta N, \end{equation}

for some $\eta \gg_{C,d}\varepsilon^{O_d(1)}$. Set $q_{j+1} = q'q_j^b$ so that $q_{j+1}$ satisfies the bound in (5.1). We may choose $M_{j+1}$ satisfying the bound in (5.1) while ensuring that

\begin{equation*} q_{j+1}M_{j+1}^{d_i} \leqslant \frac{M_j^{d_i}}{10} \end{equation*}

and $\phi$ is $\eta/(10M_{j+1}^{d_i})$-Lipschitz along $q_{j+1}\cdot\mathbb{Z}$. Moreover, for each $1 \leqslant i \leqslant m$ we have (with the convention that $d_0=0$)

\begin{equation*} 10 M_j^{d_{i-1}} \leqslant 10N^{d_{i-1}/d} \leqslant M_{j+1}^{d_i}, \end{equation*}

thanks to the lower bound for $M_{j+1}$ in (5.1), unless $N \leqslant \exp(\exp(O_{C,d}(\varepsilon^{-O(1)})))$ in which case we are done. By the two inequalities above, we may apply Lemma 3.7 to find a local factor chain $(\mathcal{B}_1^{(j+1)},\cdots,\mathcal{B}_m^{(j+1)})$ of resolution $(M_{j+1}^{d_1},\cdots,M_{j+1}^{d_m})$ and modulus $q_{j+1}$ which refines $(\mathcal{B}_1^{(j)},\cdots,\mathcal{B}_m^{(j)})$.

Since the Lipschitz property of $\phi$ implies that

\begin{equation*} |\phi(x+q_{j+1}y) - \phi(x)| \leqslant \frac{1}{5}\eta \text{ for } |y| \leqslant 2M_{j+1}^{d_i}, \end{equation*}

the left-hand side of (5.3) can be written as

\begin{equation*} \sum_{P} \phi(x_P)\sum_{x \in P} (f_i-\Pi_{\mathcal{B}_i^{(j)}}f_i)(x) + \sum_{P}\sum_{x \in P}(f_i-\Pi_{\mathcal{B}_i^{(j)}}f_i)(x) (\phi(x) - \phi(x_P)), \end{equation*}

where the outer summation is over all atoms $P$ of $\mathcal{B}_i^{(j+1)}$ and $x_P$ is an arbitrary element in $P$. In the second term above, each summand is bounded by $\eta/2$ in absolute value, since $|f_i-\Pi_{\mathcal{B}_i^{(j)}}f_i| \leqslant 2$ and $x-x_P = q_{j+1}y$ for some $|y| \leqslant 2M_{j+1}^{d_i}$; hence the second term is at most $\eta N/2$ in absolute value. It follows from inequality (5.3) that

\begin{equation*} \frac{1}{2}\eta N \leqslant \sum_{P} |\phi(x_P)|\Bigl| \sum_{x \in P} (f_i-\Pi_{\mathcal{B}_i^{(j)}}f_i)(x) \Bigr| \leqslant \sum_P \Bigl| \sum_{x \in P} (f_i-\Pi_{\mathcal{B}_i^{(j)}}f_i)(x) \Bigr|. \end{equation*}

Since $\mathcal{B}_i^{(j+1)}$ refines $\mathcal{B}_i^{(j)}$, we have

\begin{equation*} \sum_{x \in P} (f_i-\Pi_{\mathcal{B}_i^{(j)}}f_i)(x) = \sum_{x \in P} \Pi_{\mathcal{B}_i^{(j+1)}}(f_i-\Pi_{\mathcal{B}_i^{(j)}}f_i)(x) = \sum_{x \in P} \Big( \Pi_{\mathcal{B}_i^{(j+1)}}f_i(x) - \Pi_{\mathcal{B}_i^{(j)}}f_i(x)\Big), \end{equation*}

and hence

\begin{equation*} \|\Pi_{\mathcal{B}_i^{(j+1)}}f_i - \Pi_{\mathcal{B}_i^{(j)}}f_i\|_1 \gg \eta N \gg_{C,d}\varepsilon^{O_d(1)}N. \end{equation*}

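By orthogonality (Lemma 3.3) and then Cauchy–Schwarz, it follows that

\begin{equation*} \|\Pi_{\mathcal{B}_i^{(j+1)}}f_i\|_2^2 - \|\Pi_{\mathcal{B}_i^{(j)}}f_i\|_2^2 = \|\Pi_{\mathcal{B}_i^{(j+1)}}f_i - \Pi_{\mathcal{B}_i^{(j)}}f_i\|_2^2 \geqslant N^{-1}\|\Pi_{\mathcal{B}_i^{(j+1)}}f_i - \Pi_{\mathcal{B}_i^{(j)}}f_i\|_1^2 \gg_{C,d} \varepsilon^{O_d(1)}N, \end{equation*}

and hence

\begin{equation*} \sum_{\ell=1}^m\|\Pi_{\mathcal{B}_{\ell}^{(j+1)}}f_\ell\|_2^2 - \sum_{\ell=1}^m\|\Pi_{\mathcal{B}_\ell^{(j)}}f_\ell\|_2^2 \geqslant \|\Pi_{\mathcal{B}_i^{(j+1)}}f_i\|_2^2 - \|\Pi_{\mathcal{B}_i^{(j)}}f_i\|_2^2 \gg_{C,d} \varepsilon^{O_d(1)}N. \end{equation*}

Since each $f_\ell$ is $1$-bounded and supported on $[N]$, the sum $\sum_{\ell=1}^m\|\Pi_{\mathcal{B}_\ell^{(j)}}f_\ell\|_2^2$ never exceeds $mN$, so an increment of this size can occur at most $O_{C,d}(\varepsilon^{-O_d(1)})$ times. Hence (5.2) must hold for some $j \ll_{C,d} \varepsilon^{-O_d(1)}$. This concludes the proof.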
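The energy increment step rests on the Pythagoras identity for projections onto nested factors: if $\mathcal{B}'$ refines $\mathcal{B}$, then $\|\Pi_{\mathcal{B}'}f\|_2^2 - \|\Pi_{\mathcal{B}}f\|_2^2 = \|\Pi_{\mathcal{B}'}f - \Pi_{\mathcal{B}}f\|_2^2$. The following Python sketch verifies this on a small example with exact arithmetic; the function `f`, the partitions `coarse` and `fine`, and the helper names are assumptions chosen for illustration, and the norms are taken unnormalised.

```python
from fractions import Fraction

def project(f, atoms):
    """Conditional expectation of f (a dict on the ground set) onto the partition `atoms`."""
    projected = {}
    for atom in atoms:
        mean = Fraction(sum(f[x] for x in atom), len(atom))
        for x in atom:
            projected[x] = mean
    return projected

def energy(g):
    """Unnormalised squared l^2 norm of g."""
    return sum(value * value for value in g.values())

# Toy data (assumed): a 0/1-valued function on {1,...,8} and two nested partitions.
f = {1: 1, 2: 0, 3: 1, 4: 1, 5: 0, 6: 0, 7: 1, 8: 0}
coarse = [[1, 2, 3, 4], [5, 6, 7, 8]]
fine = [[1, 2], [3, 4], [5, 6], [7, 8]]   # refines `coarse`

p_coarse, p_fine = project(f, coarse), project(f, fine)
increment = energy(p_fine) - energy(p_coarse)
gap = energy({x: p_fine[x] - p_coarse[x] for x in f})
```

In this example the energies are $5/2$ and $3$, and both `increment` and `gap` equal $1/2$.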

Proof of Theorem 1.2

Without loss of generality, we may assume that $\deg P_1 \lt \cdots \lt \deg P_m$. Let $d = \deg P_m$ and $d_i = \deg P_i$ for each $i$. Choose a constant $C$ such that $P_1,\cdots,P_m$ have $(C,1)$-coefficients. By Proposition 5.1 applied with $f_0=f_1=\cdots=f_m=1_A$, we find integers $q, M$ with

\begin{equation*} N^{1/d} \geqslant M \geqslant \exp(-\exp(O_{C,d}(\varepsilon^{-O_d(1)})))N^{1/d}, \ \ q\leqslant \exp(\exp(O_{C,d}(\varepsilon^{-O_d(1)}))), \end{equation*}

and a local factor chain $(\mathcal{B}_1,\cdots,\mathcal{B}_m)$ on $[N]$ of resolution $(M^{d_1},\cdots,M^{d_m})$ and modulus $q$, such that

\begin{equation*} \bigl| \Lambda^{N,\varepsilon M,q}_{P_1,\dots,P_m}(1_A,1_A,\cdots,1_A) - \Lambda^{N,\varepsilon M,q}_{P_1,\dots,P_m}(1_A, \Pi_{\mathcal{B}_1}1_A, \cdots, \Pi_{\mathcal{B}_m}1_A) \bigr| \leqslant \varepsilon. \end{equation*}

We will show that

(5.4)

\begin{equation} \Lambda^{N,\varepsilon M,q}_{P_1,\dots,P_m}(1_A, \Pi_{\mathcal{B}_1}1_A, \cdots, \Pi_{\mathcal{B}_m}1_A) \geqslant \delta^{m+1}-O_{C,d}(\varepsilon). \end{equation}

This would imply that the expression

\begin{equation*} \Lambda_{P_1,\cdots,P_m}^{N,\varepsilon M,q}(1_A,\cdots,1_A) = \frac{1}{N} \sum_{x \in \mathbb{Z}} \mathbb{E}_{\substack{y \leqslant \varepsilon M \\ q\mid y}} 1_A(x) 1_A(x+P_1(y))\cdots 1_A(x+P_m(y)) \end{equation*}

is at least $\delta^{m+1} - O_{C,d}(\varepsilon)$. Hence, by the pigeonhole principle, there exists a positive integer $y \leqslant \varepsilon M$ with $q \mid y$ such that

\begin{equation*} \sum_{x \in \mathbb{Z}} 1_A(x) 1_A(x+P_1(y))\cdots 1_A(x+P_m(y)) \geqslant (\delta^{m+1}-O_{C,d}(\varepsilon))N, \end{equation*}

which concludes the proof.

To prove (5.4), note that the left-hand side is

\begin{equation*} \frac{1}{N} \sum_{x \in \mathbb{Z}} \mathbb{E}_{y \leqslant \varepsilon M \atop q \mid y} 1_A(x) (\Pi_{\mathcal{B}_1}1_A)(x+P_1(y)) \cdots (\Pi_{\mathcal{B}_m}1_A)(x+P_m(y)). \end{equation*}

For $y \leqslant \varepsilon M$ and $q\mid y$, we have $q \mid P_i(y)$ (since $P_i$ has zero constant term) and $|P_i(y)| \ll_{C,d} \varepsilon M^{d_i}$ for each $i$. Thus for such values of $y$, the elements $x, x+P_i(y)$ lie in the same atom of $\mathcal{B}_i$ for all but $O_{C,d}(\varepsilon N)$ values of $x \in [N]$. It follows that the left-hand side of (5.4) is

\begin{equation*} \frac{1}{N} \sum_{x \in \mathbb{Z}} 1_A(x) (\Pi_{\mathcal{B}_1}1_A)(x) \cdots (\Pi_{\mathcal{B}_m}1_A)(x) + O_{C,d}(\varepsilon). \end{equation*}

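The desired lower bound follows from the lemma below. Indeed, the product $(\Pi_{\mathcal{B}_1}1_A)\cdots(\Pi_{\mathcal{B}_m}1_A)$ is constant on the atoms of $\mathcal{B}_1$, the finest factor in the chain, so $1_A$ may be replaced by $\Pi_{\mathcal{B}_1}1_A$ in the main term; applying the lemma to the $m+1$ factors $\mathcal{B}_1,\mathcal{B}_1,\mathcal{B}_2,\dots,\mathcal{B}_m$ then gives the lower bound $\delta^{m+1}$.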

Lemma 5.3. (Combinatorics of projections)

Let $X$ be a finite set of integers and let $f : X \to [0,1]$ be a function with $\mathbb{E}_{x \in X} f(x) = \delta$. Suppose that $\mathcal{B}_1,\cdots,\mathcal{B}_m$ are factors of $X$ such that $\mathcal{B}_i$ refines $\mathcal{B}_{i+1}$ for each $1 \leqslant i \lt m$. Then

\begin{equation*} \mathbb{E}_{x \in X} (\Pi_{\mathcal{B}_1}f)(x) \cdots (\Pi_{\mathcal{B}_m}f)(x) \geqslant \delta^{m}. \end{equation*}

Proof. We first claim that for any atom $P\in \mathcal{B}_m$, writing $\delta_P= \mathbb{E}_{x\in P} f(x)$, the following inequality holds:

\begin{equation*} \mathbb{E}_{x\in P} (\Pi_{\mathcal{B}_1}f)(x) \cdots (\Pi_{\mathcal{B}_m}f)(x) \geqslant \delta_P^m. \end{equation*}

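Assuming this claim, one can deduce from Hölder’s inequality that (here $\mathbb{E}_{P\in \mathcal{B}_m}$ denotes the average over atoms $P$ of $\mathcal{B}_m$, weighted by $|P|/|X|$)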

\begin{align*} \mathbb{E}_{x \in X} (\Pi_{\mathcal{B}_1}f)(x) \cdots (\Pi_{\mathcal{B}_m}f)(x)& = \mathbb{E}_{P\in \mathcal{B}_m} \mathbb{E}_{x\in P} (\Pi_{\mathcal{B}_1}f)(x) \cdots (\Pi_{\mathcal{B}_m}f)(x)\\ & \geqslant \mathbb{E}_{P\in \mathcal{B}_m}\delta_P^m = \mathbb{E}_{P\in \mathcal{B}_m} (\mathbb{E}_{x\in P} f(x))^m\geqslant (\mathbb{E}_{x\in X} f(x))^m\\ &=\delta^m. \end{align*}

Thus, it remains only to prove the claim. We proceed by induction on $m$. For the base case $m=1$, it follows immediately from the definition that

\begin{equation*} \mathbb{E}_{x\in P} \Pi_{\mathcal{B}_1} f(x) =\mathbb{E}_{x\in P} f(x) =\delta_P. \end{equation*}

Now we suppose that $m\geqslant 2$, and assume the inductive hypothesis holds for $m-1$, that is, if $Q\in \mathcal{B}_{m-1}$ and $\delta_Q=\mathbb{E}_Q(f)$ is the density of $f$ on $Q$, then

\begin{equation*} \mathbb{E}_{x\in Q} (\Pi_{\mathcal{B}_1}f)(x) \cdots (\Pi_{\mathcal{B}_{m-1}}f)(x)\geqslant \delta_Q^{m-1}. \end{equation*}

Since $\Pi_{\mathcal{B}_m}f$ takes the value $\delta_P$ on the atom $P\in \mathcal{B}_m$, we have

\begin{equation*} \mathbb{E}_{x\in P} (\Pi_{\mathcal{B}_1}f)(x) \cdots (\Pi_{\mathcal{B}_m}f)(x) =\delta_P \mathbb{E}_{x\in P} (\Pi_{\mathcal{B}_1}f)(x) \cdots (\Pi_{\mathcal{B}_{m-1}}f)(x). \end{equation*}

Since $\mathcal{B}_{m-1}$ refines $\mathcal{B}_m$, we may decompose $P$ as a disjoint union of atoms in $\mathcal{B}_{m-1}$, i.e. $P=\sqcup_{Q\in \mathcal{B}_{m-1}\cap P} Q$. Using the inductive hypothesis and Hölder’s inequality, we get

\begin{align*} \mathbb{E}_{x\in P} (\Pi_{\mathcal{B}_1}f)(x) \cdots (\Pi_{\mathcal{B}_{m-1}}f)(x) &=\mathbb{E}_{Q\in \mathcal{B}_{m-1}\cap P}\mathbb{E}_{x\in Q} (\Pi_{\mathcal{B}_1}f)(x) \cdots (\Pi_{\mathcal{B}_{m-1}}f)(x)\\ &\geqslant \mathbb{E}_{Q\in \mathcal{B}_{m-1}\cap P} \delta_Q^{m-1} \geqslant (\mathbb{E}_{Q\in \mathcal{B}_{m-1}\cap P}\mathbb{E}_{x\in Q} f(x))^{m-1}\\ &\geqslant \delta_P^{m-1}. \end{align*}

Combining the above two inequalities, we conclude that

\begin{equation*} \mathbb{E}_{x\in P} (\Pi_{\mathcal{B}_1}f)(x) \cdots (\Pi_{\mathcal{B}_m}f)(x)\geqslant \delta_P \cdot \delta_P^{m-1} = \delta_P^m, \end{equation*}

as claimed.
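Lemma 5.3 is easy to test numerically. The sketch below checks the inequality $\mathbb{E}_{x}\prod_i (\Pi_{\mathcal{B}_i}f)(x) \geqslant \delta^m$ for factors consisting of consecutive blocks whose lengths divide one another, so that each factor refines the next. The block structure, the random test functions and the names `project` and `check_lemma` are assumptions made for illustration; this is a sanity check, not a proof.

```python
import random
from fractions import Fraction
from math import prod

def project(values, block):
    """Average `values` over consecutive blocks of length `block` (a factor of intervals)."""
    out = []
    for start in range(0, len(values), block):
        chunk = values[start:start + block]
        out.extend([Fraction(sum(chunk), len(chunk))] * len(chunk))
    return out

def check_lemma(values, blocks):
    """Return (E_x prod_i Pi_{B_i} f(x), delta**m) for the nested block factors `blocks`."""
    n = len(values)
    projections = [project(values, block) for block in blocks]
    lhs = Fraction(sum(prod(p[x] for p in projections) for x in range(n)), n)
    delta = Fraction(sum(values), n)
    return lhs, delta ** len(blocks)

# Random 0/1-valued functions on a set of size 16; blocks of length 2, 4, 8 are nested.
rng = random.Random(0)
results = [check_lemma([rng.randint(0, 1) for _ in range(16)], [2, 4, 8])
           for _ in range(200)]
```

Each pair in `results` should satisfy `lhs >= rhs`. Equality can occur, for example for `check_lemma([1, 1, 0, 0], [2, 4])`, where both sides equal $1/4$.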

Acknowledgements

The authors thank Sean Prendiville for valuable discussions, particularly regarding an earlier version of the proof of Theorem 1.2, and for his encouragement. We are grateful to the anonymous referee for a careful reading of the paper and for numerous helpful comments and corrections. XS was supported by NSF grant DMS-2200565. MW was partially supported by the Swiss National Science Foundation grant TMSGI2-2112.

References

[1] Bergelson, V., Host, B. and Kra, B. Multiple recurrence and nilsequences. Inventiones mathematicae 160 (2005), 261–303. doi:10.1007/s00222-004-0428-6

[2] Bergelson, V. and Leibman, A. Polynomial extensions of van der Waerden’s and Szemerédi’s theorems. J. Amer. Math. Soc. 9 (1996), 725–753. doi:10.1090/S0894-0347-96-00194-4

[3] Bloom, T. F. and Sisask, O. An improvement to the Kelley-Meka bounds on three-term arithmetic progressions. arXiv preprint arXiv:2309.02353 (2023). doi:10.2140/ent.2023.2.15

[4] Donoso, S., Le, A. N., Moreira, J. and Sun, W. Optimal lower bounds for multiple recurrence. Ergodic Theory Dynam. Systems 41 (2021), 379–407. doi:10.1017/etds.2019.72

[5] Fox, J. and Pham, H. T. Popular progression differences in vector spaces II. Discrete Anal. 16 (2019), 39.

[6] Fox, J. and Pham, H. T. Popular progression differences in vector spaces. Int. Math. Res. Not. IMRN 7 (2021), 5261–5289. doi:10.1093/imrn/rny240

[7] Fox, J., Pham, H. and Zhao, Y. Tower-type bounds for Roth’s theorem with popular differences. Journal of the European Mathematical Society 25 (2022), 3795–3831. doi:10.4171/jems/1271

[8] Fox, J., Sah, A., Sawhney, M., Stoner, D. and Zhao, Y. Triforce and corners. Math. Proc. Cambridge Philos. Soc. 169 (2020), 209–223. doi:10.1017/S0305004119000173

[9] Frantzikinakis, N. Multiple ergodic averages for three polynomials and applications. Trans. Amer. Math. Soc. 360 (2008), 5435–5475. doi:10.1090/S0002-9947-08-04591-1

[10] Frantzikinakis, N. and Kra, B. Ergodic averages for independent polynomials and applications. J. London Math. Soc. (2) 74 (2006), 131–142. doi:10.1112/S0024610706023374

[11] Gowers, T. A new proof of Szemerédi’s theorem. Geometric & Functional Analysis GAFA 11 (2001), 465–588. doi:10.1007/s00039-001-0332-9

[12] Gowers, T. Decompositions, approximate structure, transference, and the Hahn-Banach theorem. Bull. Lond. Math. Soc. 42 (2010), 573–606. doi:10.1112/blms/bdq018

[13] Green, B. A Szemerédi-type regularity lemma in abelian groups, with applications. Geometric & Functional Analysis GAFA 15 (2005), 340–376. doi:10.1007/s00039-005-0509-8

[14] Green, B. and Sawhney, M. Improved bounds for the Furstenberg-Sárközy theorem. arXiv preprint arXiv:2411.17448 (2024).

[15] Green, B. and Tao, T. New bounds for Szemerédi’s theorem. II. A new bound for $r_4(N)$. In Analytic Number theory, pp. 180–204 (Cambridge Univ. Press, Cambridge, 2009).

[16] Green, B. and Tao, T. An arithmetic regularity lemma, an associated counting lemma, and applications. In An Irregular Mind: Szemerédi is 70, pp. 261–334 (Springer, 2010). doi:10.1007/978-3-642-14444-8_7

[17] Green, B. and Tao, T. New bounds for Szemerédi’s theorem, III: a polylogarithmic bound for $r_4(N)$. Mathematika 63 (2017), 944–1040. doi:10.1112/S0025579317000316

[18] Heath-Brown, D. R. Integer sets containing no arithmetic progressions. J. London Math. Soc. (2) 35 (1987), 385–394. doi:10.1112/jlms/s2-35.3.385

[19] Kelley, Z. and Meka, R. Strong bounds for $3$-progressions. In 2023 IEEE 64th Annual Symposium on Foundations of Computer Science (FOCS 2023), pp. 933–973 (IEEE Computer Soc., Los Alamitos, CA, 2023). doi:10.1109/FOCS57990.2023.00059

[20] Leng, J., Sah, A. and Sawhney, M. Improved bounds for Szemeredi’s Theorem. arXiv preprint arXiv:2402.17995 (2024).

[21] Leng, J., Sah, A. and Sawhney, M. Quasipolynomial bounds on the inverse theorem for the Gowers ${U}^{s+1}[n]$-norm. arXiv preprint arXiv:2402.17994 (2024).

[22] Lyall, N. and Magyar, A. Optimal polynomial recurrence. Canad. J. Math. 65 (2013), 171–194. doi:10.4153/CJM-2012-003-8

[23] Mandache, M. A variant of the Corners theorem. Math. Proc. Cambridge Philos. Soc. 171 (2021), 607–621. doi:10.1017/S0305004121000049

[24] Manners, F. True complexity and iterated Cauchy–Schwarz. arXiv preprint arXiv:2109.05731 (2021).

[25] Peluse, S. Bounds for sets with no polynomial progressions. Forum of Mathematics, Pi 8 (2020), e16.

[26] Peluse, S. and Prendiville, S. A polylogarithmic bound in the nonlinear Roth theorem. International Mathematics Research Notices 2022 (2022), 5658–5684. doi:10.1093/imrn/rnaa261

[27] Peluse, S. and Prendiville, S. Quantitative bounds in the nonlinear Roth theorem. Invent. Math. 238(3) (2024), 865–903. doi:10.1007/s00222-024-01293-x

[28] Peluse, S., Prendiville, S. and Shao, X. Bounds in a popular multidimensional nonlinear Roth theorem. J. Lond. Math. Soc. (2) 110 (2024), e70019, 35. doi:10.1112/jlms.70019

[29] Prendiville, S. Quantitative bounds in the polynomial Szemerédi theorem: the homogeneous case. Discrete Analysis (2017).

[30] Sah, A., Sawhney, M. and Zhao, Y. Patterns without a popular difference. Discrete Anal. 8 (2021), 30.

[31] Szemerédi, E. Integer sets containing no arithmetic progressions. Acta Math. Hungar. 56 (1990), 155–158. doi:10.1007/BF01903717

[32] Tao, T. Higher Order Fourier analysis. Vol. 142 (American Mathematical Soc., 2012). doi:10.1090/gsm/142
