1 Introduction
In the first article [Reference Bhargava, Shankar and Wang11] of this two-part series, we proved that when monic integer polynomials
$f(x)=x^n+a_1x^{n-1}+\cdots +a_n$
of fixed degree n are ordered by
$\mathrm {{max}}\{|a_1|,\ldots ,|a_n|^{1/n}\}$
, a positive proportion have squarefree discriminant. The purpose of this article is to prove the analogous result for integral binary n-ic forms.
Recall that the discriminant
$\Delta (f)$
of a binary n-ic form over a field K is a homogeneous polynomial of degree
$2n-2$
in the coefficients of f, whose nonvanishing is equivalent to f having n distinct linear factors over an algebraic closure
$\overline {K}$
of K. We order integral binary n-ic forms
$f(x,y)=a_0x^n+a_1x^{n-1}y+\cdots +a_ny^n$
by their height
$H(f)$
given by
$H(f):= \mathrm {{max}}\{|a_0|,\ldots ,|a_n|\},$
(i.e., the maximum of the absolute values of the coefficients). Then a natural question is as follows: When ordered by height, what is the density of integral binary n-ic forms whose discriminant is squarefree? For
$n=2$
, classical methods in sieve theory yield the answer. For
$n=3$
and
$n=4$
, results of Davenport–Heilbronn [Reference Davenport and Heilbronn15] and the first and second authors [Reference Bhargava and Shankar10], respectively, answer the question in the related setting in which we consider
$\mathrm {{GL}}_2({\mathbb {Z}})$
-orbits on binary n-ic forms. However, for
$n\geq 5$
, it has not previously been known whether this density exists or even whether the lower density is positive. In this paper, we prove the following:
Theorem 1. Let
$n\geq 2$
be an integer. When integral binary n-ic forms
$f(x,y)=a_0x^n+a_1x^{n-1}y+\cdots +a_n y^n$
are ordered by
$H(f):=\mathrm {{max}}\{|a_0|,\ldots ,|a_n|\}$
, the density of forms having squarefree discriminant exists and is equal to

To any nonzero integral binary n-ic form
$f(x,y) = a_0x^n + \cdots + a_ny^n$
, we may naturally attach a rank-n ring
$R_f$
(see Birch–Merriman [Reference Birch and Merriman12], Nakagawa [Reference Nakagawa24] and Wood [Reference Wood38]), defined as follows when
$a_0\neq 0$
. Let
$\theta $
denote the image of x in
$K_f:={\mathbb {Q}}[x]/(f(x,1))$
. Let
$R_f$
be the free rank-n
${\mathbb {Z}}$
-submodule of
$K_f$
generated by
$1, \,a_0\theta , \,a_0\theta ^2+a_1\theta ,\,\ldots ,\,a_0\theta ^{n-1}+\cdots +a_{n-2}\theta $
. Then
$R_f$
is in fact closed under multiplication and forms a ring whose discriminant is equal to the discriminant of
$f(x)$
. Our next result determines the density of irreducible integral binary forms f for which
$R_f$
is the maximal order in its field of fractions.
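As an illustrative aside, the closure of
$R_f$
under multiplication can be checked numerically for small examples. The following Python sketch is not taken from the cited references: it assumes the basis elements are 1 and
$\zeta _k=a_0\theta ^k+a_1\theta ^{k-1}+\cdots +a_{k-1}\theta $
for k between 1 and n-1, and the function names (mul_mod, basis, coordinates, is_closed) are hypothetical. It multiplies basis elements in
$K_f$
and tests whether the products have integer coordinates in that basis.

```python
from fractions import Fraction

def mul_mod(u, v, f):
    """Multiply u and v (coefficients of 1, theta, ..., theta^(n-1)) in Q[x]/(f(x,1)).

    f = [a_0, ..., a_n] with a_0 != 0, so a_0*theta^n = -(a_1*theta^(n-1) + ... + a_n).
    """
    n = len(f) - 1
    prod = [Fraction(0)] * (2 * n - 1)
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            prod[i + j] += ui * vj
    for k in range(2 * n - 2, n - 1, -1):  # eliminate theta^k for k >= n
        c = prod[k] / f[0]
        prod[k] = Fraction(0)
        for t in range(1, n + 1):
            prod[k - t] -= c * f[t]
    return prod[:n]

def basis(f):
    """Assumed basis: zeta_0 = 1, zeta_k = a_0*theta^k + ... + a_(k-1)*theta."""
    n = len(f) - 1
    vecs = [[Fraction(int(i == 0)) for i in range(n)]]
    for k in range(1, n):
        z = [Fraction(0)] * n
        for i in range(k):
            z[k - i] = Fraction(f[i])
        vecs.append(z)
    return vecs

def coordinates(w, f):
    """Coordinates of w in the basis zeta_0, ..., zeta_(n-1) (triangular solve)."""
    n = len(f) - 1
    zs, w, coords = basis(f), list(w), [Fraction(0)] * n
    for k in range(n - 1, 0, -1):
        coords[k] = w[k] / f[0]
        w = [wi - coords[k] * zi for wi, zi in zip(w, zs[k])]
    coords[0] = w[0]
    return coords

def is_closed(f):
    """True if every product of two basis elements has integer coordinates."""
    zs = basis(f)
    return all(c.denominator == 1
               for u in zs for v in zs
               for c in coordinates(mul_mod(u, v, f), f))
```

For the binary cubic with coefficients (2, 1, 3, 5), the product of the second and third basis elements has coordinates (-10, -3, 0), in agreement with the hand computation
$\zeta _1\zeta _2=-a_0a_3-a_2\zeta _1$
.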
Theorem 2. Let
$n\geq 2$
be an integer. When irreducible integral binary n-ic forms
$f(x,y)=a_0x^n+a_1x^{n-1}y+\cdots +a_ny^n$
are ordered by
$H(f):=\mathrm {{max}}\{|a_0|,\ldots ,|a_n|\}$
, the density of forms f such that
$R_f$
is the ring of integers in its field of fractions exists and is equal to
$\zeta (2)^{-1}\zeta (3)^{-1}.$
In particular, Theorem 2 yields the first unconditional Bertini theorem for arithmetic schemes of dimension
$\geq 2$
as conjectured by Poonen [Reference Poonen28, §5]. Indeed, for a quasiprojective subscheme X of
${\mathbb {P}}^n_{\mathbb {Z}}$
that is regular of dimension m, Poonen conjectured that the density of hyperplane sections of X that are regular of dimension
$m-1$
should equal
$\zeta _X(m+1)^{-1}$
, where
$\zeta _X$
denotes the zeta function of X. Since the subscheme of
${\mathbb {P}}^1_{\mathbb {Z}}$
cut out by an integral binary n-ic form f is regular if and only if
$R_f$
is maximal, and the zeta function of
${\mathbb {P}}^1_{\mathbb {Z}}$
is given by
$\zeta _{{\mathbb {P}}^1_{\mathbb {Z}}}(s)=\zeta (s)\zeta (s-1)$
, we have
$\zeta _{{\mathbb {P}}^1_{\mathbb {Z}}}(\dim ({\mathbb {P}}^1_{\mathbb {Z}})+1)^{-1}=\zeta (2)^{-1}\zeta (3)^{-1}$
. Therefore, Theorem 2 yields an unconditional proof of [Reference Poonen28, Theorem 5.1] for the case
$X={\mathbb {P}}^1_{\mathbb {Z}}$
with the usual ‘box ordering’ on the forms defining the hyperplane sections. In fact, we prove the stronger result that for every fixed
$n\geq 3$
, the density of regular binary n-ic forms is
$\zeta (2)^{-1}\zeta (3)^{-1}$
, while arithmetic Bertini only claims this in the limit as
$n\to \infty $
.
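For orientation, the constant
$\zeta (2)^{-1}\zeta (3)^{-1}$
can be evaluated numerically. The sketch below is only an illustration under stated assumptions: the truncation points (primes up to 10,000 and 100,000 terms of the series for
$\zeta (3)$
) are arbitrary choices, and the function names are hypothetical. It compares a truncated Euler product over primes p of
$(1-p^{-2})(1-p^{-3})$
with the value obtained from
$\zeta (2)=\pi ^2/6$
and a partial sum for
$\zeta (3)$
.

```python
import math

def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [p for p in range(2, limit + 1) if sieve[p]]

def euler_product(limit):
    """Truncated product of (1 - p^-2)(1 - p^-3) over primes p <= limit."""
    prod = 1.0
    for p in primes_up_to(limit):
        prod *= (1 - p ** -2) * (1 - p ** -3)
    return prod

def closed_form(terms=100_000):
    """1 / (zeta(2) * zeta(3)), with zeta(3) approximated by a partial sum."""
    zeta2 = math.pi ** 2 / 6
    zeta3 = sum(k ** -3 for k in range(1, terms + 1))
    return 1 / (zeta2 * zeta3)
```

Both quantities come out slightly above one half, so roughly half of all integral binary n-ic forms are regular in the above sense.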
As a further application of our methods, we obtain the following theorem:
Theorem 3. For each
$n\geq 3$
, the number of isomorphism classes of number fields of degree n with associated Galois group
$S_n$
and absolute discriminant less than X is
$\gg X^{1/2+1/(n-1)}$
.
Our lower bound in Theorem 3 on the number of degree-n
$S_n$
-number fields of absolute discriminant less than X improves the previous best-known lower bound of
$X^{1/2+1/n}$
obtained in [Reference Bhargava, Shankar and Wang11]. We note that the number fields constructed in Theorem 3 can all be taken to have squarefree discriminant.
Our results also correct an error in, and thus resurrect, all the results of Nakagawa [Reference Nakagawa24] and [Reference Nakagawa26] that had been subsequently retracted in [Reference Nakagawa25] and [Reference Nakagawa27]. Specifically, the retracted theorems [Reference Nakagawa24, Theorems 3–4] and [Reference Nakagawa26, Theorem 2] regarding binary forms and
$A_n$
-extensions of quadratic fields can now be taken to be true. In particular, we obtain the following:
Theorem 4. For
$n\geq 3$
, the total number of unramified
$A_n$
-extensions of real
$($
resp., imaginary
$)$
quadratic fields F, across all such F such that
$|\mathrm {{Disc}}(F)|<X$
, is
$\gg X^{(n+1)/(2n-2)}$
.
Theorem 4 yields the best-known lower bounds on the number of unramified
$A_n$
-extensions of quadratic fields when
$n>5$
. For improved bounds in the cases
$n\leq 5$
, see [Reference Bhargava4, Theorem 1.4]. For the best-known bounds on the number of quadratic fields of bounded discriminant admitting an unramified
$A_n$
-extension, see Kedlaya [Reference Kedlaya21, Corollary 1.4]. Other related works include Uchida [Reference Uchida37], Yamamoto [Reference Yamamoto40] and Yamamura [Reference Yamamura41].
The main technical ingredient required to prove all the above results is a ‘tail estimate’ which shows that not too many discriminants of integral binary n-ic forms f are divisible by
$p^2$
when p is large relative to the discriminant of f (here, large means larger than
$H(f)$
, say). It is these tail estimates that were missing in Nakagawa’s work. For a prime p, and an integral binary n-ic form f such that
$p^2\mid \Delta (f)$
, we say that
$p^2$
strongly divides
$\Delta (f)$
if
$p^2\mid \Delta (f + pg)$
for every integral binary n-ic form g; otherwise, we say
$p^2$
weakly divides
$\Delta (f)$
. For any squarefree integer
$m>0$
, let
${\mathcal {W}}_m^{\mathrm {{{(1)}}}}$
(resp.,
${\mathcal {W}}_m^{\mathrm {{{(2)}}}}$
) denote the set of integral binary n-ic forms whose discriminants are strongly divisible (resp., weakly divisible) by
$p^2$
for every prime factor p of m.
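The distinction between strong and weak divisibility can be made concrete for binary cubic forms. The following Python sketch is an illustration only, restricted to n = 3 so that the classical explicit formula for the discriminant can be used; the function names are hypothetical. Since
$\Delta (f+pg)$
modulo
$p^2$
depends only on g modulo p, it suffices to test the finitely many g with coefficients in the range 0 to p-1.

```python
from itertools import product

def disc_cubic(a, b, c, d):
    """Discriminant of a*x^3 + b*x^2*y + c*x*y^2 + d*y^3."""
    return (b * b * c * c - 4 * a * c ** 3 - 4 * b ** 3 * d
            - 27 * a * a * d * d + 18 * a * b * c * d)

def divisibility_type(f, p):
    """Classify how p^2 divides Delta(f): 'none', 'strong' or 'weak'."""
    if disc_cubic(*f) % (p * p) != 0:
        return "none"
    # Delta(f + p*g) mod p^2 depends only on g mod p, so residues suffice.
    for g in product(range(p), repeat=4):
        shifted = [fi + p * gi for fi, gi in zip(f, g)]
        if disc_cubic(*shifted) % (p * p) != 0:
            return "weak"
    return "strong"
```

For p = 5, the form with coefficients (1, 0, 0, 5) reduces to a perfect cube modulo 5 and is classified as strongly divisible, whereas (1, 1, 0, 25) has discriminant -16975, which is divisible by 25 only weakly: replacing the last coefficient by 30 gives discriminant -24420.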
We prove the following tail estimates:
Theorem 5. For an integer
$n\geq 3$
, a positive real number M and any
$\epsilon>0$
, we have

The estimate in the strongly divisible Case (a) of Theorem 5 follows from geometric techniques – namely, the quantitative version of the Ekedahl geometric sieve as developed by the first author [Reference Bhargava4]. The estimates in the weakly divisible Cases (b) and (c) of Theorem 5 are considerably more difficult (particularly (c)), and we describe their proofs in the next section. Our tail estimate, in fact, allows us to prove Theorems 1 and 2 with power-saving error terms:
Theorem 6. Let
$V_n=\mathrm {{Sym}}^n(2)$
denote the space of binary n-ic forms. Define
$\eta _n$
to be
$1/(2n)$
when n is odd and
$1/(88n^6)$
when n is even. Then

These power-saving bounds have applications towards level-of-distribution questions when counting integral binary n-ic forms f of bounded height with
$\Delta (f)$
squarefree (resp.,
$R_f$
maximal) satisfying splitting conditions at finitely many primes. Such level-of-distribution results in turn have applications towards a host of problems in analytic number theory, such as studying statistics of Artin L-functions attached to binary n-ic forms and proving lower bounds on the number of degree-n number fields which are ramified only at a bounded number of primes, among many others. For examples of such applications of level-of-distribution results, see [Reference Belabas and Fouvry1, Reference Cho and Kim13, Reference Shankar, Södergren and Templier33, Reference Shankar, Södergren and Templier34, Reference Taniguchi and Thorne36].
We remark that our methods imply that the analogues of all of the above results also hold when local conditions are imposed at finitely many places (including at infinity); the orders of magnitude in these theorems remain the same, provided that no local conditions are imposed that force the sets being counted in Theorems 1 and 2 to be empty.
Finally, the methods introduced in [Reference Bhargava, Shankar and Wang11] and in the current article have applications beyond just squarefree values of polynomial discriminants. They have been recently adapted in [Reference Bhargava and Ho9] to determine the density of squarefree discriminants of elliptic curves over
${\mathbb {Q}}$
having two marked rational points. Other applications include determining the density of conductors in some families of elliptic curves [Reference Shankar, Shankar and Wang32] and the density of squarefree values taken by
$a^4+b^3$
[Reference Sanjaya and Wang29].
2 Outline of proof
As mentioned in the introduction, the uniformity estimate in Theorem 5 is the key to deducing Theorems 1, 2 and 6 via a squarefree sieve. Case (a) of Theorem 5 follows directly from the results in [Reference Bhargava4]. Case (b), which pertains to odd degrees n, can be proven using methods similar to those developed in our previous work [Reference Bhargava, Shankar and Wang11]. However, these methods fail to work for Case (c), which pertains to even degrees n, and a number of new ideas are required to handle this case. It is the proof of this case to which the bulk of our paper is devoted; it requires, in particular, the introduction of a new technique in the geometry of numbers – namely, the techniques of Eskin–Katznelson [Reference Eskin and Katznelson17] used in counting singular symmetric matrices. We believe that this combining of methods may also be useful in other contexts.
In this section, we give a detailed outline of the proof of Case (b) pertaining to odd n. We then explain why this strategy breaks down (quite spectacularly!) when n is even, and finally we describe the new techniques required to complete the proof of Theorem 5(c).
Sketch of the proof of the tail estimate for odd n
Our proof of Theorem 5(b) makes use of the representation of
$G=\mathrm {{SL}}_n$
on the space
$W=2\otimes \mathrm {{Sym}}_2(n)$
of pairs
$(A,B)$
of symmetric
$n\times n$
matrices, studied in detail in [Reference Wood39, Reference Bhargava5, Reference Bhargava, Gross and Wang7, Reference Bhargava, Gross and Wang8]. The group G acts on W via
$\gamma \cdot (A,B)=(\gamma A\gamma ^t,\gamma B\gamma ^t)$
for
$\gamma \in G$
and
$(A,B)\in W$
. We define the invariant binary form of an element
$(A,B)\in W$
by

Then
$f_{A,B}$
is a binary n-ic form satisfying
$f_{\gamma (A,B)}=f_{A,B}$
. Moreover, the ring of polynomial invariants for the action of G on W is freely generated by the coefficients of the invariant binary form. Define the discriminant
$\Delta (A,B)$
and height
$H(A,B)$
of an element
$(A,B)\in W$
by
$\Delta (A,B)=\Delta (f_{A,B})$
and
$H(A,B)=H(f_{A,B})$
.
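The invariance of the binary form under G can be illustrated on a small example. The following Python sketch computes the coefficients of
$\det (Ax-By)$
by expanding over permutations; this agrees with the invariant binary form up to a normalizing sign, which the sketch does not attempt to track. The function names and the sample matrices are hypothetical choices made for illustration.

```python
from itertools import permutations

def perm_sign(perm):
    """Sign of a permutation given as a tuple of indices."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def det_form(A, B):
    """Coefficients [c_0, ..., c_n] of det(A*x - B*y) = sum c_j x^(n-j) y^j."""
    n = len(A)
    total = [0] * (n + 1)
    for perm in permutations(range(n)):
        term = [perm_sign(perm)]  # polynomial indexed by the power of y
        for i in range(n):
            a, b = A[i][perm[i]], -B[i][perm[i]]
            nxt = [0] * (len(term) + 1)
            for j, c in enumerate(term):
                nxt[j] += c * a
                nxt[j + 1] += c * b
            term = nxt
        total = [t + s for t, s in zip(total, term)]
    return total

def act(gamma, M):
    """Return gamma * M * gamma^t for square integer matrices."""
    n = len(M)
    gm = [[sum(gamma[i][k] * M[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    return [[sum(gm[i][k] * gamma[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]
```

For instance, with A = [[0,1,0],[1,1,2],[0,2,3]] and B = [[0,0,2],[0,1,-1],[2,-1,4]], the function det_form returns [-3, -4, -8, 4], and the same list is returned after acting on both matrices by any integer matrix of determinant 1. The last coefficient equals
$(-1)^n\det (B)$
, so it vanishes exactly when B is singular.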
The first step of our proof is the construction, for every squarefree integer
$m>0$
, of a map

such that
$f_{\sigma _m(f)}(x,y)=f(x,y)$
for every
$f\in {\mathcal {W}}_m^{\mathrm {{{(2)}}}}$
. In our construction, the image of
$\sigma _m$
, in fact, lies in
$W_0({\mathbb {Z}})$
, where
$W_0$
is the subspace of W consisting of pairs of matrices whose top left
$g\times g$
blocks are
$0$
, where
$n=2g+1$
. The action of the group G does not preserve
$W_0$
, and we take
$G_0$
to be the maximal parabolic subgroup of G that does preserve
$W_0$
. When the discriminant polynomial
$\Delta \in {\mathbb {Z}}[W]$
is restricted to
$W_0$
, it is no longer irreducible but rather is divisible by the square of a polynomial
$Q\in {\mathbb {Z}}[W_0]$
. This polynomial Q is a relative invariant for the action of
$G_0$
on
$W_0$
. Its significance is that, by construction of
$\sigma _m$
, every element in the image of
$\sigma _m$
has Q-invariant equal to m. To prove Part (b) of Theorem 5, it therefore suffices to estimate the number of
$G_0({\mathbb {Z}})$
-orbits on
$W_0({\mathbb {Z}})$
having height less than X and Q-invariant greater than M.
Bounding the number of these orbits is complicated by the fact that
$G_0$
is not reductive. We are rescued by using the full action of
$G({\mathbb {Z}})$
on
$W({\mathbb {Z}})$
. This necessitates expanding the definition of the Q-invariant from
$W_0({\mathbb {Z}})$
to all ‘distinguished’ elements of
$W({\mathbb {Z}})$
. An element
$(A,B)\in W({\mathbb {Z}})$
is distinguished if A and B have a common isotropic g-dimensional subspace defined over
${\mathbb {Q}}$
. Thus, every element in
$W_0({\mathbb {Z}})$
(and thus every element in the image of
$\sigma _m$
) is distinguished. The Q-invariant, though defined initially on
$W_0$
, can be extended as a function on the set of all triples
$(A,B,\Lambda )$
, where
$(A,B)\in W({\mathbb {Z}})$
is distinguished, and
$\Lambda $
is a common isotropic subspace of A and B. For all but a negligible number of distinguished elements
$(A,B)\in W({\mathbb {Z}})$
, A and B have exactly one common isotropic subspace
$\Lambda $
defined over
${\mathbb {Q}}$
. Thus, we may define a
$G({\mathbb {Z}})$
-invariant function Q on the set of distinguished pairs
$(A,B)\in W({\mathbb {Z}})$
outside a negligible number of them. It then suffices to bound the number of
$G({\mathbb {Z}})$
-orbits on distinguished elements in
$W({\mathbb {Z}})$
having bounded height and large Q-invariant.
To obtain such a bound, we construct fundamental domains for the action of
$G({\mathbb {Z}})$
on elements in
$W({\mathbb {R}})$
with height less than X. Such a fundamental domain has a natural partition into three parts that we term the main body, the shallow cusp and the deep cusp. We have little control over the Q-invariants of elements in the main body and the shallow cusp. However, it is known [Reference Ho, Shankar and Varma20, Proposition 4.3] that there are a negligible number of integral elements in the shallow cusp. Meanwhile, distinguished elements occur rarely in the main body, a fact we prove via the large sieve.
Finally, the deep cusp lies in
$W_0$
, where an upper bound for the Q-invariant can be obtained. Imposing the condition that this upper bound is greater than M, and counting the number of such points in the deep cusp using the averaging method of [Reference Bhargava3], gives the desired saving for the number of elements in the deep cusp having Q-invariant larger than M. Combining the estimates for the main body, the shallow cusp and the deep cusp yields Part (b) of Theorem 5.
Sketch of the proof of the tail estimate for even n
With W again denoting the space of pairs of symmetric
$n\times n$
matrices, we may attempt to proceed in the same manner as in the case of odd n, by constructing a map

such that
$f_{\sigma _m(f)}(x,y)=f(x,y)$
for every
$f\in {\mathcal {W}}_m^{\mathrm {{{(2)}}}}$
. However, such a map does not exist in the case that n is even! Indeed, there exist integral binary n-ic forms
$f(x,y)$
that cannot be expressed as
$\det (Ax - By)$
– even up to sign – for any integral
$n\times n$
symmetric matrices A and B. This phenomenon was extensively studied in [Reference Bhargava5, Reference Bhargava, Gross and Wang7, Reference Bhargava, Gross and Wang8]. It is in this sense that the strategy to prove Theorem 5(b) for odd n fails spectacularly for even n – and at the very first step.
We address this issue by replacing
$f(x,y)\in {\mathcal {W}}_m^{(2)}$
by
$xf(x,y)$
, which is a reducible binary
$(n+1)$
-ic form whose discriminant, at least generically, remains weakly divisible by
$m^2$
. For these forms
$xf(x,y)$
, we can use the lift
$\sigma _m$
constructed in the odd case. However, since
$xf(x,y)$
has vanishing
$y^{n+1}$
term, the image of
$\sigma _m$
lies within the set of pairs
$(A,B)$
where B is singular.
The singularity of B introduces additional difficulties with respect to both the algebraic and the analytic aspects of the proof. On the algebraic side, the main new problem is that distinguished elements
$(A,B)$
with B singular have at least two values for the Q-invariant, since they share at least two different common isotropic
$(g+1)$
-dimensional subspaces, where
$n=2g+2$
. So it is no longer well defined to impose the condition that Q is large. Imposing the condition that the maximum value of Q is large does not yield sufficient savings to prove an analogue of Theorem 5(b). We thus instead construct a new invariant, termed q, such that for all but a negligible number of elements
$(A,B)$
in the image of our map
$\sigma _m$
, the invariant q is the minimum value taken by Q, and it satisfies
$q(\sigma _m(xf(x,y)))=\pm m$
.
As in the odd degree case, we once again construct fundamental domains
${\mathcal {F}}_X$
for the action of
$G({\mathbb {Z}})$
on
$W({\mathbb {R}})$
with height less than X, and partition such a domain into three parts: the main body, the shallow cusp and the deep cusp. However, we must now only count integer elements
$(A,B)$
where B is singular. The beautiful work of Eskin and Katznelson [Reference Eskin and Katznelson17] provides asymptotics for the number of singular symmetric matrices in homogeneously expanding domains, but this work is not directly applicable to our case since we need to estimate the number of singular symmetric matrices B in skewed domains. To achieve this, we provide a simplification of the proof of the upper bounds in [Reference Eskin and Katznelson17], at the cost of some extra
$\log $
factors, which gives us a flexible method by which to obtain upper bounds on the number of singular symmetric matrices in arbitrarily skewed domains.
Accounting for the singularity of the B’s introduces complications in each region of the fundamental domain. In the main body, the fact that the singular matrices B lie on the subvariety cut out by the determinant means that we cannot directly apply the large sieve, and the lack of an exact count with a power-saving error term means we also cannot directly apply a Selberg sieve to bound the number of distinguished elements. Instead, we fiber over the singular matrices B and apply the large sieve to bound the number of possible A’s. This requires us to prove new density estimates on the number of distinguished elements
$(A,B)$
over
${\mathbb {F}}_p$
, when B is fixed.
Furthermore, unlike in the odd degree case, we no longer have an automatic power-saving on the number of pairs
$(A,B)\in W({\mathbb {Z}})$
lying in the shallow cusp of the fundamental domain and where B is singular. As we go closer to the deep cusp, there are regions in which imposing the condition that B is singular yields no saving whatsoever. To obtain the required bounds, we isolate these regions of the shallow cusp and prove that integral elements
$(A,B)$
in them either satisfy
$\Delta (A,B)=0$
or
$|q(A,B)|$
is small.
Finally, for the deep cusp of
${\mathcal {F}}$
, we once again use the condition that the q-invariant is large to obtain a power saving. Unlike the situation with the Q-invariant in the odd-degree case, the invariant q in the even degree case behaves more wildly and is much harder to control. This is because q is not a polynomial in the coefficients of
$W_0$
but rather is a minimum of the different possible values of Q. In fact, there are regions within the deep cusp where the q-invariants of elements
$(A,B)$
are not small. However, we show that these regions correspond to an archimedean condition on the invariant binary form f of
$(A,B)$
– namely, that the discriminant of f is much smaller than is typical for the height bound on f. Separately bounding the number of such binary forms yields the desired result.
Organization of the paper
This paper is organized as follows. We begin in §3 by recalling the arithmetic invariant theory for the representations
$W_n:=2\otimes \mathrm {{Sym}}_2(n)$
of
$\mathrm {{SL}}_n$
and
$2\otimes g\otimes (g+1)$
of
$\mathrm {{SL}}_2\times \mathrm {{SL}}_g\times \mathrm {{SL}}_{g+1}$
. In particular, we define the fundamental invariants Q and q. We then construct our maps from
${\mathcal {W}}_{m}^{(2)}$
into
$W_n({\mathbb {Z}})$
when n is odd and into
$W_{n+1}({\mathbb {Z}})$
when n is even.
The analytic parts of the paper are carried out in §4–6. In §4, we prove the tail estimates of Theorem 5 for odd degrees n using geometry-of-numbers techniques. In §5, we carry out the necessary groundwork to count the number of singular symmetric matrices that lie in skewed domains. Using these results, we prove the tail estimates for even degrees n in §6, completing the proof of Theorem 5. In §7, we deduce the main results, Theorems 1–4, from the tail estimates using a squarefree sieve, although the exact constants occurring in Theorems 1 and 2 remain conditional upon certain local density computations. Finally, in the Appendix, we compute the local densities of integral binary n-ic forms whose discriminants are indivisible by
$p^2$
(resp., whose associated rings are maximal at p), thereby completing the proofs of Theorems 1 and 2.
3 Invariant theory on spaces associated to binary n-ic forms
Fix a positive integer n and consider the space
$V_n=\mathrm {{Sym}}^n(2)$
of binary n-ic forms. The group
$\mathrm {{SL}}_2$
acts on
$V_n$
via linear change of variables: we have
$\gamma \cdot f(x,y):=f((x,y)\cdot \gamma )$
for
$\gamma \in \mathrm {{SL}}_2$
and
$f\in V_n$
.
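As a small illustration of this action, the following Python sketch substitutes
$(x,y)\cdot \gamma $
into a binary form given by its coefficient list. It is an aside under stated assumptions: the helper names are hypothetical, and the discriminant is implemented only for cubics via the classical explicit formula. The example shows that the discriminant is unchanged by an element of determinant 1, while the height is not.

```python
def poly_mul(u, v):
    """Multiply binary forms stored as [c_0, ..., c_d] = sum c_j x^(d-j) y^j."""
    out = [0] * (len(u) + len(v) - 1)
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            out[i + j] += ui * vj
    return out

def act(gamma, f):
    """gamma . f(x, y) = f((x, y) . gamma) for gamma = [[p, q], [r, s]]."""
    (p, q), (r, s) = gamma
    X, Y = [p, r], [q, s]  # (x, y) . gamma = (p*x + r*y, q*x + s*y)
    n = len(f) - 1
    result = [0] * (n + 1)
    for i, a in enumerate(f):
        term = [a]
        for _ in range(n - i):
            term = poly_mul(term, X)
        for _ in range(i):
            term = poly_mul(term, Y)
        result = [ri + ti for ri, ti in zip(result, term)]
    return result

def disc_cubic(a, b, c, d):
    """Discriminant of a*x^3 + b*x^2*y + c*x*y^2 + d*y^3."""
    return (b * b * c * c - 4 * a * c ** 3 - 4 * b ** 3 * d
            - 27 * a * a * d * d + 18 * a * b * c * d)

def height(f):
    """Maximum absolute value of the coefficients."""
    return max(abs(c) for c in f)
```

With f given by (1, 0, -2, 5) and
$\gamma $
equal to [[1, 1], [0, 1]], the transformed form is (4, 11, 13, 5): both forms have discriminant -643, but the height increases from 5 to 13.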
Let
$W_n=2\otimes \mathrm {{Sym}}_2(n)$
denote the space of pairs of
$n\times n$
symmetric matrices
$(A,B)$
. The group
$\mathrm {{SL}}_2\times \mathrm {{SL}}_n$
acts on
$(A,B)$
via

There is a natural map
$W_n\to V_n$
given by

sending an element of
$W_n$
to its invariant binary n-ic form. The ring of
$\mathrm {{SL}}_n({\mathbb {C}})$
-invariant polynomials on
$W_n({\mathbb {C}})$
is freely generated by the coefficients of the invariant binary n-ic form.
3.1 Arithmetic invariant theory for the representation
$2\otimes \mathrm {{Sym}}_2(n)$
of
$\mathrm {{SL}}_n$
First, let
$n=2g+1$
be an odd integer with
$g\geq 1$
. We recall some of the arithmetic invariant theory of the representation
$W:=W_n$
of
$\mathrm {{SL}}_n$
and its map (1) to
$V:=V_n;$
see [Reference Bhargava, Gross and Wang7] for more details.
Let k be a field of characteristic not
$2$
. For a binary n-ic form
$f(x,y)=a_0x^n + \cdots + a_ny^n\in V(k)$
with
$\Delta (f)\neq 0$
and
$a_0\neq 0$
, let
$C_f$
denote the smooth hyperelliptic curve
$z^2=f(x,y)y$
of genus g viewed as a curve in the weighted projective space
${\mathbb {P}}(1,1,g+1)$
. Let
$J_f$
denote the Jacobian of
$C_f$
. Then the stabilizer of an element
$(A,B)\in W(k)$
with invariant binary form
$f(x,y)$
is isomorphic to
$J_f[2](k)$
. The set of
$\mathrm {{SL}}_n(k)$
-orbits on
$W(k)$
with invariant binary form
$f(x,y)$
maps injectively into
$H^1(k,J_f[2])$
. An element
$(A,B)$
(or an
$\mathrm {{SL}}_n(k)$
-orbit) is distinguished if
$\Delta (A,B)\neq 0$
and there exists a g-dimensional subspace defined over k that is isotropic with respect to both A and B. If
$(A,B)$
is distinguished, then its
$\mathrm {{SL}}_n(k)$
-orbit corresponds to the identity element of
$H^1(k,J_f[2])$
, and the set of these g-dimensional subspaces is in bijection with
$J_f[2](k)$
.
Let
$W_{0}\subset W$
be the subspace of pairs of matrices whose top left
$g\times g$
blocks are zero. Then elements
$(A,B)$
in
$W_{0}(k)$
with nonzero discriminant are all distinguished since the g-dimensional subspace
$Y_g$
spanned by the first g basis vectors is isotropic with respect to both A and B. Moreover, every distinguished element of
$W(k)$
is
$\mathrm {{SL}}_n(k)$
-equivalent to some element in
$W_{0}(k)$
since
$\mathrm {{SL}}_n(k)$
acts transitively on the set of g-dimensional subspaces of
${\mathbb {P}}^{n-1}(k)$
. Let
$G_0$
be the maximal parabolic subgroup of
$\mathrm {{SL}}_n$
consisting of elements
$\gamma $
that preserve
$Y_g$
. Elements of
$W_{0}$
have block matrix form

where
$A^{\mathrm {{top}}}$
,
$B^{\mathrm {{top}}}$
are
$g\times (g+1)$
matrices and
$A_1$
,
$B_1$
are
$(g+1)\times (g+1)$
-symmetric matrices. Meanwhile, elements of
$G_0$
have the block matrix form

An element
$\gamma \in G_0$
acts on the top right
$g\times (g+1)$
block of elements of
$W_{0}$
by

where we use the superscript ‘top’ to denote the top right
$g\times (g+1)$
block of an
$n\times n$
symmetric matrix. The action of
$G_0$
on
$W_{0}$
restricts to an action on the space
$U_g:=2\otimes g\otimes (g+1)$
of pairs of
$g\times (g+1)$
-matrices. Moreover, the unipotent radical
$M_{(g+1)\times g}$
of
$G_0$
acts trivially on
$U_g$
. We study the invariant theory for this action more closely in the next subsection.
We will also need some results in the case when
$n=2g+2$
is even in Section 6 (specifically in the proof of Lemma 6.7). Let
$f(x,y) = a_0x^n + \cdots + a_ny^n\in V(k)$
with
$\Delta (f)\neq 0$
and
$a_0\neq 0$
. Let
$L = k[x]/(f(x,1))$
. Let
$V_f(k)$
denote the set of
$(A,B)\in W_n(k)$
with
$f_{A,B} = f(x,y)$
. Then
$V_f(k)$
is nonempty if and only if
$a_0\in k^{\times 2}N_{L/k}(L^\times )$
(see also [Reference Bhargava, Gross and Wang8, Theorem 7]). Note in particular that if
$f(x,y)\in V({\mathbb {R}})$
is negative definite, so that
$L = {\mathbb {R}}[x]/(f(x,1))\simeq {\mathbb {C}}^{n/2}$
and
$a_0<0$
, then
$V_f({\mathbb {R}})$
is empty. However, if k is a finite field of characteristic not
$2$
, then
$V_f(k)$
is always nonempty and the number of
$\mathrm {{SL}}_n(k)$
-orbits equals the number of even degree factorizations of
$f(x,y)$
over k.
3.2 The representation
$2\otimes g\otimes (g+1)$
of
$\mathrm {{SL}}_2\times \mathrm {{SL}}_g\times \mathrm {{SL}}_{g+1}$
and the Q-invariant
In this section, we collect some algebraic facts about the representation
$U_g:=2\otimes g\otimes (g+1)$
of the group
$H_g:=\mathrm {{SL}}_2\times \mathrm {{SL}}_g\times \mathrm {{SL}}_{g+1}$
. We start with the following proposition.
Proposition 3.1. The representation
$U_g$
of
${\mathbb {G}}_m\times H_g$
is prehomogeneous (i.e., the action of
${\mathbb {G}}_m\times H_g$
on
$U_g$
has a single Zariski open orbit). Furthermore, the stabilizer in
$H_g({\mathbb {C}})$
of an element in the open orbit of
$U_g({\mathbb {C}})$
is isomorphic to
$\mathrm {{SL}}_2({\mathbb {C}})$
.
Proof. We prove this by induction on g. The assertion is clear for
$g=1$
, where the representation is that of
${\mathbb {G}}_m\times \mathrm {{SL}}_2\times \mathrm {{SL}}_2$
on
$2\times 2$
matrices; the single relative invariant in this case is the determinant, and the open orbit consists of nonsingular matrices. For higher g, we note that
$U_g$
is a castling transform of
$U_{g-1}$
in the sense of Sato and Kimura [Reference Sato and Kimura30, §2, Definition 10] (with
$\widetilde {G} = {\mathbb {G}}_m\times \mathrm {{SL}}_2\times \mathrm {{SL}}_g$
,
$m = 2g$
and
$n = g-1$
). As a result, the orbits of
${\mathbb {G}}_m\times \mathrm {{SL}}_2\times \mathrm {{SL}}_g\times \mathrm {{SL}}_{g-1}$
on
$2\otimes g \otimes (g-1)$
are in natural one-to-one correspondence with the orbits of
${\mathbb {G}}_m\times \mathrm {{SL}}_2\times \mathrm {{SL}}_g\times \mathrm {{SL}}_{g+1}$
on
$2\otimes g\otimes (2g-(g-1))=2\otimes g\otimes (g+1)$
, and under this correspondence, the open orbit in
$U_{g-1}$
maps to an open orbit in
$U_g$
(cf. [Reference Sato and Kimura30, §2, Proposition 9]). Thus, all the representations
$U_g$
for the action of
${\mathbb {G}}_m\times H_g$
are prehomogeneous.
Note that castling transforms preserve stabilizers over
${\mathbb {C}}$
. Since the generic stabilizer for the action of
$H_1({\mathbb {C}})$
on
$U_1({\mathbb {C}})$
is clearly isomorphic to
$\mathrm {{SL}}_2({\mathbb {C}})$
, it follows that this remains the generic stabilizer for the action of
$H_g({\mathbb {C}})$
on
$U_g({\mathbb {C}})$
for all
$g\geq 1$
.
Since castling transforms also preserve polynomial invariants and their irreducibility [Reference Sato and Kimura30, Proposition 18], it follows that the ring of polynomial invariants for this action of
$H_g$
on
$U_g$
is generated by an irreducible polynomial. We now give an explicit description of this invariant.
Write an element in
$U_g=2\otimes g\otimes (g+1)$
as a pair
$(A^{\mathrm {{top}}},B^{\mathrm {{top}}})$
of
$g\times (g+1)$
matrices. For
$1\leq i\leq g+1$
, let
$A_i$
and
$B_i$
denote the
$g\times g$
-matrices obtained from
$A^{\mathrm {{top}}}$
and
$B^{\mathrm {{top}}}$
, respectively, by deleting the ith column. Define the binary g-ic form
$f_i(x,y)$
to be
$(-1)^{i+1}\det (A_ix-B_iy)$
. Consider the
$(g+1)\times (g+1)$
matrix C whose
$(i,j)$
-entry is the jth-coefficient of
$f_i(x,y)$
. Taking the determinant of C yields a polynomial
$Q=Q(A^{\mathrm {{top}}},B^{\mathrm {{top}}})$
in the coordinates of
$U_g$
. The polynomial Q is the hyperdeterminant of the
$2\times g \times (g+1)$
matrix
$(A^{\mathrm {{top}}},B^{\mathrm {{top}}})$
(cf. [Reference Gelfond, Kapranov and Zelevinsky18, Chapter 14, Theorem 3.18] with
$m = g$
,
$n = g+1$
,
$p = 2$
). As a consequence, it is irreducible and invariant under the action of
$H_g$
on
$U_g$
and thus generates the ring of polynomial invariants for the action of
$H_g$
on
$U_g$
.
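The construction of Q just described can be sketched in Python as follows. This is an illustration under assumptions rather than a definitive implementation: the coefficients of each
$f_i$
are listed from the
$x^g$
coefficient down to the
$y^g$
coefficient (the ordering is our assumption), determinants are computed by naive permutation expansion (adequate only for small g), and the function names are hypothetical.

```python
from itertools import permutations

def perm_sign(perm):
    """Sign of a permutation given as a tuple of indices."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def det_linear(A, B):
    """Coefficients [c_0, ..., c_g] of det(A*x - B*y) = sum c_j x^(g-j) y^j."""
    g = len(A)
    total = [0] * (g + 1)
    for perm in permutations(range(g)):
        term = [perm_sign(perm)]
        for i in range(g):
            a, b = A[i][perm[i]], -B[i][perm[i]]
            nxt = [0] * (len(term) + 1)
            for j, c in enumerate(term):
                nxt[j] += c * a
                nxt[j + 1] += c * b
            term = nxt
        total = [t + s for t, s in zip(total, term)]
    return total

def det_int(M):
    """Determinant of a small square integer matrix."""
    total = 0
    for perm in permutations(range(len(M))):
        term = perm_sign(perm)
        for i, j in enumerate(perm):
            term *= M[i][j]
        total += term
    return total

def q_invariant(A_top, B_top):
    """Q(A_top, B_top) for a pair of g x (g+1) integer matrices."""
    g = len(A_top)
    C = []
    for i in range(g + 1):  # 0-based i; the sign (-1)^i matches (-1)^(i+1) for 1-based i
        A_i = [[row[c] for c in range(g + 1) if c != i] for row in A_top]
        B_i = [[row[c] for c in range(g + 1) if c != i] for row in B_top]
        C.append([(-1) ** i * c for c in det_linear(A_i, B_i)])
    return det_int(C)
```

For g = 2, the pair with rows (1,0,0), (0,1,0) for the first matrix and (0,1,0), (0,0,1) for the second has Q-invariant -1 under these conventions, and this value is unchanged when a multiple of one column (or one row) is added to another simultaneously in both matrices.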
Let
$n=2g+1$
again be an odd integer. We return to the representation
$W_{0}$
of
$G_0$
. Given an element
$(A,B)\in W_{0}$
, recall that we obtain an element
$(A^{\mathrm {{top}}},B^{\mathrm {{top}}})\in U_g$
by taking the top right
$g\times (g+1)$
blocks of A and B. We define the Q-invariant of
$(A,B)\in W_{0}$
as the Q-invariant of
$(A^{\mathrm {{top}}},B^{\mathrm {{top}}})$
:

Then the Q-invariant is a relative invariant for
$G_0$
. More precisely, for any
$\gamma \in G_0$
in the block matrix form (3), we have

since
$\det (\gamma _1)\det (\gamma _2)=1$
. If
$\gamma \in G_0({\mathbb {Z}})$
, then we have
$\det (\gamma _1)=\det (\gamma _2)=\pm 1$
. Hence, the absolute value
$|Q|$
of Q is an invariant for the action of
$G_0({\mathbb {Z}})$
on
$W_{0}({\mathbb {Z}})$
.
3.3 Divisibility properties of
$\Delta $
when restricted to
$W_{0}$
Let
$n=2g+1$
be an odd integer. Write the coordinates on
$W_{0}$
as
$a_{ij},b_{ij}$
with
$i,j$
in the appropriate ranges. Let R denote the ring of regular functions of
$W_0$
over
${\mathbb {Z}}$
(i.e.,
$R={\mathbb {Z}}[W_0]={\mathbb {Z}}[a_{ij},b_{ij}]$
). Consider the discriminant polynomial
$\Delta \in R$
given by
$\Delta (A,B):=\Delta (f_{A,B})$
. In this section, we prove that
$Q^2\mid \Delta $
as polynomials in R, along with another useful divisibility result.
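Before turning to the proof, the divisibility of
$\Delta $
by
$Q^2$
can be sanity-checked numerically in the smallest case n = 3 (so g = 1). The following Python sketch is an experiment under assumptions, not a substitute for the argument: it uses
$\det (Ax-By)$
in place of the invariant binary form (the discriminant is insensitive to an overall sign), takes
$Q=a_{13}b_{12}-a_{12}b_{13}$
for the 1 x 2 top right blocks, and samples random integral elements of
$W_0$
with a fixed seed. All function names are hypothetical.

```python
import random
from itertools import permutations

def perm_sign(perm):
    """Sign of a permutation given as a tuple of indices."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def det_form(A, B):
    """Coefficients [c_0, ..., c_n] of det(A*x - B*y) = sum c_j x^(n-j) y^j."""
    n = len(A)
    total = [0] * (n + 1)
    for perm in permutations(range(n)):
        term = [perm_sign(perm)]
        for i in range(n):
            a, b = A[i][perm[i]], -B[i][perm[i]]
            nxt = [0] * (len(term) + 1)
            for j, c in enumerate(term):
                nxt[j] += c * a
                nxt[j + 1] += c * b
            term = nxt
        total = [t + s for t, s in zip(total, term)]
    return total

def disc_cubic(a, b, c, d):
    """Discriminant of a*x^3 + b*x^2*y + c*x*y^2 + d*y^3."""
    return (b * b * c * c - 4 * a * c ** 3 - 4 * b ** 3 * d
            - 27 * a * a * d * d + 18 * a * b * c * d)

def q_top(A, B):
    """Q for n = 3: determinant built from the 1 x 2 top right blocks."""
    return A[0][2] * B[0][1] - A[0][1] * B[0][2]

def random_w0_matrix(rng, bound=5):
    """Random symmetric 3 x 3 integer matrix with vanishing top left entry."""
    M = [[0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(i, 3):
            if (i, j) != (0, 0):
                M[i][j] = M[j][i] = rng.randint(-bound, bound)
    return M

def check_divisibility(trials=200, seed=0):
    """True if Q^2 divides Delta on every sampled element of W_0(Z)."""
    rng = random.Random(seed)
    for _ in range(trials):
        A, B = random_w0_matrix(rng), random_w0_matrix(rng)
        Q, delta = q_top(A, B), disc_cubic(*det_form(A, B))
        if (Q == 0 and delta != 0) or (Q != 0 and delta % (Q * Q) != 0):
            return False
    return True
```

When Q vanishes, the check requires the discriminant to vanish as well, which is what divisibility by
$Q^2$
as polynomials predicts.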
Let Z be the closed subvariety of
$W_{0}$
consisting of elements
$(A,B)$
with
$\Delta (A,B)=0$
, and let
$Y\subset Z$
denote the closed subvariety of
$W_{0}$
consisting of elements
$(A,B)$
such that
$f_{A,B}$
is either divisible by the cube of a binary form with degree
$\geq 1$
or the square of a binary form with degree
$\geq 2$
. Both of these varieties Y and Z are defined over
${\mathbb {Z}}$
and are clearly
$\mathrm {{SL}}_2\times G_0$
-invariant.
Our first result states that the variety in
$W_{0}$
cut out by
$Q=0$
does not lie in Y.
Proposition 3.2. Let
$(A,B) = ((a_{ij})_{ij},(b_{ij})_{ij}) \in W_0(R)$
be the generic element. Then

Proof. Fix an odd prime p. Let
$f(x,y)$
be an element of
$V({\mathbb {Z}})$
, such that the reduction of
$f(x,y)$
modulo p factors as
$x^2h(x,y)$
, where h is irreducible. In particular,
$f(x,y)$
mod p is not divisible by either the cube of a binary form with degree
$\geq 1$
, or the square of a binary form with degree
$\geq 2$
. Let
$(A_f,B_f)\in W_{0}({\mathbb {Z}})$
be an element with invariant binary n-ic form equal to f and
$Q(A_f,B_f) = p$
. Such an element
$(A_f,B_f)$
is constructed in the next subsection (see (9) with
$m = p$
).
Let
$\pi :R\rightarrow {\mathbb {Z}}$
denote the specialization map assigning integer values to
$a_{ij},b_{ij}$
such that

Then
$\pi (Q) = p$
and so
$\pi $
induces a map
$R/(Q) \rightarrow {\mathbb {F}}_p$
. Since
$(A_f,B_f)\text { mod }p\notin Y({\mathbb {F}}_p)$
, we see that
$(A,B)\text { mod }Q\notin Y(R/(Q))$
.
The next lemma, which follows from a direct computation, gives the Q-invariant for elements in
$W_{0}$
having a specific form.
Lemma 3.3. Let k be a field and let
$(A,B)\in W_{0}(k)$
be an element such that the top right
$g\times (g+1)$
blocks of
$(A,B)$
are of the following form:

Then

Next, we have the following proposition that gives a normal form for elements
$(A,B)\not \in Y$
whose Q-invariant is
$0$
.
Proposition 3.4. Let k be a field. Let
$(A,B)$
be an element of
$W_{0}(k)\backslash Y(k)$
such that
$Q(A,B)=0$
. Then
$(A,B)$
is
$\mathrm {{SL}}_2(k)\times G_0(k)$
-equivalent to an element of the form
$(A',B')$
where the top right
$g\times (g+1)$
blocks of
$A'$
and
$B'$
are given by

where
$a_1,\ldots ,a_g,b_2,\ldots ,b_g\in k^\times .$
In the displayed matrices above, any empty entry is
$0$
.
Proof. The action of
$G_0(k)$
allows us to perform simultaneous row operations and simultaneous column operations on
$(A^{\mathrm {{top}}},B^{\mathrm {{top}}})$
. As a first step, we perform column operations to ensure that the rightmost column of
$B^{\mathrm {{top}}}$
is
$0$
. Next, recall that the Q-invariant of
$(A,B)$
is the determinant of the
$(g+1)\times (g+1)$
matrix C, whose rows come from the coefficients of the
$g\times g$
minors of
$A^{\mathrm {{top}}} x-B^{\mathrm {{top}}} y$
. It follows that row operations on
$(A^{\mathrm {{top}}},B^{\mathrm {{top}}})$
leave C unchanged, while adding
$\alpha $
times the i-th columns of
$A^{\mathrm {{top}}},B^{\mathrm {{top}}}$
to the j-th column has the effect of adding
$\alpha $
times the j-th row of C to the i-th row of C and leaving the rest unchanged. Since
$\det (C)=Q(A^{\mathrm {{top}}},B^{\mathrm {{top}}})=0$
, it follows that by adding multiples of the last columns of
$A^{\mathrm {{top}}},B^{\mathrm {{top}}}$
to the other columns, we may assume that the last row of C is
$0$
. Denoting the
$g\times g$
matrices obtained by removing the last columns of
$A^{\mathrm {{top}}}$
and
$B^{\mathrm {{top}}}$
by M and N, respectively, we have
$\det (Mx - Ny) = 0$
.
We next claim that by performing simultaneous row and column operations on
$(M,N)$
, we may bring M and N in the form of the first g columns of
$A^{\prime \,\mathrm {{top}}}$
and
$B^{\prime \,\mathrm {{top}}}$
, respectively, for
$(A^{\prime \,\mathrm {{top}}},B^{\prime \,\mathrm {{top}}})$
as given in (7) with
$b_i\neq 0$
for all
$2\leq i\leq g$
. Since
$\det (M)=0$
, after appropriate column operations, we may assume that the first column of M is
$0$
. Now the first column of N cannot be identically
$0$
for otherwise, the invariant binary form of
$(A,B)$
has a factor of the form
$h(x,y)^2$
with
$\deg h = g$
, contradicting
$(A,B)\notin Y(k)$
. By applying row operations, we may ensure that the bottom left entry of N is
$b_g\neq 0$
and the rest of the first column of N is
$0$
. We then use this nonzero coefficient
$b_g$
to clear out the rest of the bottom row of N (without changing M).
Let
$M_1$
and
$N_1$
denote the top right
$(g-1)\times (g-1)$
block of M and N. Then
$\det (Mx - Ny) = (-1)^{g}b_g y\det (M_1x - N_1y).$
Hence,
$\det (M_1x - N_1y) = 0$
and the first column of
$M_1$
can be made
$0$
. As in the previous case, all the coefficients of the first column of
$N_1$
can be made
$0$
except for the bottom left entry, which is
$b_{g-1}\neq 0$
. We then clear out the bottom row of
$N_1$
as before. Proceeding in this way, we transform the first
$g-1$
columns of M and N to be in the required form. Since the
$b_i$
’s are nonzero for
$2\leq i\leq g$
, and since
$\det (Mx-Ny)=0$
, it follows that the top right coefficients of M and N are
$0$
, completing the proof of the claim.
Note that this transformation of M and N did not change the last column of
$B^{\mathrm {{top}}}$
, which remains
$0$
. Thus, to complete the proof of Proposition 3.4, it remains to show that
$a_i\neq 0$
for
$1\leq i\leq g$
. Since the first row and column of
$B'$
are
$0$
, we see that
$x^2a_1^2\mid f_{A',B'}$
. Hence,
$a_1\neq 0$
. Suppose for contradiction that
$i\in \{2,\ldots ,g\}$
is the smallest index such that
$a_i = 0$
. Then we may clear out the i-th row of
$A'$
using the second up to the
$(i-1)$
-th rows of
$A'$
. That is,
$(A',B')$
is
$\mathrm {{SL}}_n(k)$
-equivalent to some
$(A'',B'')$
where the only nonzero entries in the i-th row and the i-th column of
$A''$
appear in the last entry. This allows us to factor out an extra factor of
$y^2$
in
$\det (A''x - B''y)=\pm f_{A,B}$
, contradicting the assumption that
$(A,B)\notin Y(k)$
since we already had
$x^2\mid f_{A,B}$
.
We are now ready to prove that
$Q^2\mid \Delta $
:
Theorem 3.5. We have
$Q^2\mid \Delta $
in
${\mathbb {Z}}[W_{0}]$
.
Proof. Let
$(A,B)\in W_{0}(R)$
be the generic element. We begin by proving that
$(A,B)\in Z(R/(Q))$
, or equivalently that
$Q\mid \Delta $
in R. Let
$(\bar {A},\bar {B})\in W_{0}(R/(Q))$
denote the reduction of
$(A,B)$
mod Q, and let F denote the field of fractions of
$R/(Q)$
. By Proposition 3.2, we know
$(\bar {A},\bar {B})\notin Y(F)$
. Since
$Q(\bar {A},\bar {B}) = 0$
, by Proposition 3.4, there exists
$\gamma \in \mathrm {{SL}}_2(F)\times G_0(F)$
such that
$\gamma (\bar {A},\bar {B})=(A',B')$
, where
$(A^{\prime \,\mathrm {{top}}},B^{\prime \,\mathrm {{top}}})$
is of the form (7). The invariant binary form of
$(A',B')$
has a factor of
$x^2$
, and so
$(A',B')\in Z(F)$
. Since Z is
$\mathrm {{SL}}_2\times G_0$
-invariant, we see that
$(\bar {A},\bar {B})\in Z(F)$
.
Since
$Q\mid \Delta $
in R, there exists an element
$\delta \in R$
such that
$\Delta =Q\delta $
. Let
$Z_1$
denote the closed subvariety of
$W_{0}$
cut out by
$\delta $
. It now suffices to prove that
$Q\mid \delta $
or, equivalently, that the generic element
$(A,B)$
belongs to
$Z_1(R/(Q))$
. We claim that for any field k, and every element
$(A,B)\in W_{0}(k)$
such that
$(A^{\mathrm {{top}}},B^{\mathrm {{top}}})$
has the form (7), we have
$\delta (A,B)=0$
. Indeed, let
$(A,B)$
be such an element. Let
$(A^{(\epsilon )},B^{(\epsilon )})\in W_{0}(k[\epsilon ])$
be such that
$A^{(\epsilon )}=A$
, the
$(1,n-1)$
-entry and the
$(n-1,1)$
-entry of
$B^{(\epsilon )}$
equal
$\epsilon $
, and the other coefficients of
$B^{(\epsilon )}$
are the same as those of B. By Lemma 3.3, we have

Moreover,
$\epsilon ^2$
divides the
$y^n$
-coefficient of
$f_{A^{(\epsilon )},B^{(\epsilon )}}$
and
$\epsilon $
divides the
$xy^{n-1}$
-coefficient of
$f_{A^{(\epsilon )},B^{(\epsilon )}}$
. Hence,
$\epsilon ^2\mid \Delta (A^{(\epsilon )},B^{(\epsilon )})$
, which implies (since
$\epsilon ^2\nmid Q(A^{(\epsilon )},B^{(\epsilon )})$
) that
$\epsilon \mid \delta (A^{(\epsilon )},B^{(\epsilon )})$
. Since
$(A,B)$
is obtained from
$(A^{(\epsilon )},B^{(\epsilon )})$
by setting
$\epsilon =0$
, we have
$\delta (A,B)=0$
. We have proven that the generic element
$(A,B)\in W_{0}(R)$
belongs to
$Z_1(R/(Q))$
. Therefore,
$Q\mid \delta $
.
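The final step of this proof rests on the fact that the discriminant is divisible by the square of a quantity that divides the $xy^{n-1}$-coefficient and whose square divides the $y^n$-coefficient. As a sanity check only, the following Python sketch verifies this numerically for binary cubics (so $n=3$), with an integer p playing the role of $\epsilon $. The function names are illustrative and do not come from the text; the discriminant formula is the classical one for binary cubics.

```python
def cubic_disc(a, b, c, d):
    """Classical discriminant of a*x^3 + b*x^2*y + c*x*y^2 + d*y^3."""
    return (b * b * c * c - 4 * a * c**3 - 4 * b**3 * d
            - 27 * a * a * d * d + 18 * a * b * c * d)

def check_square_divisibility(p, bound=4):
    """Check p^2 | disc whenever p divides the x*y^2 coefficient
    and p^2 divides the y^3 coefficient (small exhaustive search)."""
    rng = range(-bound, bound + 1)
    return all(
        cubic_disc(a, b, p * c, p * p * d) % (p * p) == 0
        for a in rng for b in rng for c in rng for d in rng
    )

print(check_square_divisibility(3), check_square_divisibility(7))
```

Every monomial of the cubic discriminant contains either d or $c^2$, which is why the check succeeds term by term.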
We end this section with another divisibility result for
$\Delta $
, which will be used in §6.
Proposition 3.6. We have
$\det (A^{\mathrm {{top}}} (A^{\mathrm {{top}}})^t)\det (B^{\mathrm {{top}}} (B^{\mathrm {{top}}})^t) \mid \Delta $
as elements in
${\mathbb {Z}}[W_{0}]$
.
Proof. It suffices to prove that
$\det (B^{\mathrm {{top}}} (B^{\mathrm {{top}}})^t)$
divides
$\Delta $
in
${\mathbb {Z}}[W_{0}]$
. Suppose
$(A,B)\in W_{0}({\mathbb {C}})$
with
$\det (B^{\mathrm {{top}}} (B^{\mathrm {{top}}})^t) = 0$
. Then
$B^{\mathrm {{top}}}$
does not have full rank. Hence, there exists some nonzero
$v\in \text {Span}_{\mathbb {C}}\{e_1,\ldots ,e_g\}$
such that
$Bv = 0$
. However, any such v is isotropic with respect to A. As a result,
$\Delta (A,B) = 0$
. Thus, by the Nullstellensatz,
$\det (B^{\mathrm {{top}}} (B^{\mathrm {{top}}})^t) \mid c \Delta ^{d}$
in
${\mathbb {Z}}[W_{0}]$
for some nonzero integer c and positive integer d.
Define
$P_g\in {\mathbb {Z}}[M_{g\times (g+1)}]$
by
$P_g(M) = \det (MM^t)$
. For the purpose of proving Proposition 3.6, it suffices to prove that
$P_g$
is squarefree in
${\mathbb {Z}}[M_{g\times (g+1)}]$
. We proceed by induction on g. Denote the
$(i,j)$
-entry of any
$M\in M_{g\times (g+1)}$
by
$u_{ij}$
. When
$g = 1$
, we have
$P_1 = u_{11}^2 + u_{12}^2$
, which is squarefree in
${\mathbb {Z}}[u_{11},u_{12}]$
. For general
$g\geq 2$
, consider

Then

where
$D_{g-1}$
is the determinant of the top left
$(g-1)\times (g-1)$
block of M. Any square factor of
$\det (MM^t)$
must be a common square factor of
$P_{g-1}$
and
$D_{g-1}^2$
, which can only be
$\pm 1$
since
$P_{g-1}$
is squarefree by induction. We have shown that
$P_g$
is squarefree even after setting certain variables to
$0$
. Therefore,
$P_g$
is squarefree in
${\mathbb {Z}}[M_{g\times (g+1)}]$
.
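For intuition about the polynomial $P_g$, recall that the Cauchy–Binet formula expresses $\det (MM^t)$ as the sum of the squares of the $g\times g$ minors of M, which recovers $P_1=u_{11}^2+u_{12}^2$ when $g=1$. The Python sketch below checks this identity on small integer matrices; it is an illustration under that standard identity, not part of the argument, and the helper names are our own.

```python
from itertools import combinations

def det(m):
    """Determinant of a small square integer matrix by Laplace expansion."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def P(M):
    """P_g(M) = det(M M^t) for a g x (g+1) integer matrix M (list of lists)."""
    g = len(M)
    gram = [[sum(x * y for x, y in zip(M[i], M[j])) for j in range(g)]
            for i in range(g)]
    return det(gram)

def sum_of_squared_minors(M):
    """Sum of squares of all g x g minors of M (Cauchy-Binet right-hand side)."""
    g, cols = len(M), len(M[0])
    return sum(det([[row[c] for c in keep] for row in M]) ** 2
               for keep in combinations(range(cols), g))

M = [[1, 2, 3], [4, 5, 6]]
print(P(M), sum_of_squared_minors(M))
```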
3.4 Embedding
${\mathcal {W}}_{m,n}^{\mathrm {{{(2)}}}}$
into
$W_n({\mathbb {Z}})$
, for n odd
Let
$n=2g+1$
be an odd integer, and set
$W=W_n$
. For an odd squarefree integer
$m>0$
, let
${\mathcal {W}}_m^{\mathrm {{{(2)}}}}={\mathcal {W}}_{m,n}^{\mathrm {{{(2)}}}}$
denote the set of integer n-ic binary forms whose discriminants are weakly divisible by
$p^2$
for every prime factor p of m. Fix an element
$f(x,y)\in {\mathcal {W}}_m^{\mathrm {{{(2)}}}}$
. Then just as shown in [Reference Bhargava, Shankar and Wang11, §3.2], there exists an
$\mathrm {{SL}}_2({\mathbb {Z}})$
-change of variable
$\gamma $
such that
$f((x,y)\gamma )$
has the form

for some integers
$b_0,\ldots ,b_n$
and where m and
$b_0$
are coprime.
Consider the following pair of matrices:

Here, the dots on the antidiagonal of A are all
$1$
and the dots on the antidiagonal of B are all
$0$
. We claim that
$c_i,r$
can be chosen to be integers so that
$(-1)^g\det (xA - yB) = f((x,y)\gamma )$
. It is clear that
$c_0 = b_0$
and
$2mrc_0 + m^2c_1 = -mb_1$
. Choose
$r\in {\mathbb {Z}}$
such that
$m\mid 2rc_0+b_{1}$
; this then determines
$c_{1}$
. It is then not hard to check that the coefficient of
$x^{n-i}y^i$
in
$(-1)^g\det (xA - yB)$
is of the form
$(-1)^ic_i + L(c_0,\ldots ,c_{i-1})$
where L is a linear form with coefficients in
${\mathbb {Z}}[r]$
. The existence of integers
$c_2,\ldots ,c_n$
now follows by induction.
Set
$\sigma _m(f)=\sigma _{m,n}(f)$
to be the element
$(A_f,B_f)$
such that

Then
$f_{\sigma _{m}(f)} = f.$
Next, we note that
$(A,B)$
, and thus,
$(A_f,B_f)$
are in
$W_{0}({\mathbb {Z}})$
, and from Lemma 3.3, we obtain that
$|Q|(A,B) = m$
. Since Q is
$\mathrm {{SL}}_2$
-invariant, we conclude that
$|Q|(\sigma _m(f)) = m.$
We have proven the following theorem.
Theorem 3.7. Let
$m>0$
be a squarefree integer. There exists a map
$\sigma _{m}:{\mathcal {W}}_{m}^{\mathrm {{{(2)}}}}\to W_{0}({\mathbb {Z}})$
such that
$f_{\sigma _m(f)} = f\quad \text {and}\quad |Q|(\sigma _m(f)) = m$
for every
$f\in {\mathcal {W}}_{m}^{\mathrm {{{(2)}}}}$
.
We will later use the image of
$\sigma _{1}$
as a fundamental set for the action of
$\mathrm {{SL}}_n({\mathbb {R}})$
on the set of distinguished elements of
$W({\mathbb {R}})$
. We now extend the function
$|Q|$
to the set of distinguished elements of
$W({\mathbb {Z}})$
having irreducible invariant binary form. Suppose that
$(A,B)$
is a distinguished element of
$W({\mathbb {Z}})$
. Then there is a g-dimensional subspace X isotropic with respect to A and B. Let
$\Lambda = X\cap {\mathbb {Z}}^n$
be the primitive lattice in X. There exists an element
$\gamma $
in
$\mathrm {{SL}}_n({\mathbb {Z}})$
, unique up to left multiplication by an element in
$G_0({\mathbb {Z}})$
, such that
$\Lambda = \gamma ^t\cdot \,\text {Span}_{\mathbb {Z}}\{e_1,\ldots ,e_g\}$
, where
$e_1,\ldots ,e_n$
is the standard basis of
${\mathbb {Z}}^n$
. Then
$\gamma \cdot (A,B)\in W_{0}({\mathbb {Z}})$
, and we can thus define the
$|Q|$
-invariant on the triple
$(A,B,\Lambda )$
by
$|Q|(A,B,\Lambda ) := |Q|(\gamma \cdot (A,B)).$
That is, we complete an integral basis of
$\Lambda $
to an integral basis of
${\mathbb {Z}}^n$
with respect to which the pair
$(A',B')$
of Gram matrices for the quadratic forms defined by A and B lies inside
$W_0({\mathbb {Z}})$
, and we define
$|Q|(A,B,\Lambda )$
to be
$|Q|(A',B').$
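The change of basis described above can be illustrated on a toy example with $n=3$ and $g=1$. The Python sketch below assumes that $\gamma $ acts by $\gamma \cdot A=\gamma A\gamma ^t$, so that the rows of $\gamma $ are the new basis vectors; this convention, the matrices A and B, and the function name are all chosen for illustration and are not taken from the construction in the text. It only checks that the resulting Gram matrices have vanishing top left $g\times g$ block.

```python
def gram(P, A):
    """Gram matrix P*A*P^t of the quadratic form A in the basis given by the rows of P."""
    n = len(A)
    PA = [[sum(P[i][k] * A[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
    return [[sum(PA[i][k] * P[j][k] for k in range(n)) for j in range(n)] for i in range(n)]

# Toy symmetric matrices for which v = (1, 1, 0) is isotropic: v A v^t = v B v^t = 0.
A = [[1, 0, 0], [0, -1, 0], [0, 0, 1]]
B = [[1, 0, 1], [0, -1, 0], [1, 0, 0]]

# Complete v to an integral basis of Z^3; gamma has determinant 1.
gamma = [[1, 1, 0], [0, 1, 0], [0, 0, 1]]

A1, B1 = gram(gamma, A), gram(gamma, B)
print(A1)
print(B1)
```

In the new basis the $(1,1)$-entries of both Gram matrices vanish, which is the $g=1$ instance of the top left block being zero.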
We end with the following result that will be crucial in Section 4.
Proposition 3.8. Let
$n=2g+1$
be an odd integer with
$n\geq 3$
. Let m be an odd positive squarefree integer. Let
$f(x,y)\in {\mathcal {W}}_{m}^{(2)}$
be an irreducible integral binary n-ic form. Let
$(A,B)$
be any element in
$\mathrm {{SL}}_n({\mathbb {Z}})\cdot \sigma _{m}(f)$
. Then there is a unique primitive g-dimensional lattice
$\Lambda $
that is isotropic with respect to both A and B. Moreover,
$|Q|(A,B):=|Q|(A,B,\Lambda )=m$
. In particular, if
$f(x,y)\in {\mathcal {W}}_{m}^{(2)}\cap {\mathcal {W}}_{m'}^{(2)}$
is irreducible where m and
$m'$
are distinct odd positive squarefree integers, then
$\sigma _{m}(f(x,y))$
and
$\sigma _{m'}(f(x,y))$
are not
$\mathrm {{SL}}_{n}({\mathbb {Z}})$
-equivalent.
Proof. Let
$C_f$
denote the smooth hyperelliptic curve
$z^2=f(x,y)y$
of genus g viewed as a curve in the weighted projective space
${\mathbb {P}}(1,1,g+1)$
, and let
$J_f$
denote its Jacobian. Since
$(A,B)$
is
$\mathrm {{SL}}_n({\mathbb {Z}})$
-equivalent to
$\sigma _{m}(f)$
, it follows that
$(A,B)$
is distinguished. Thus, the set of common isotropic g-dimensional subspaces of A and B over
${\mathbb {Q}}$
is in bijection with
$J_f[2]({\mathbb {Q}})$
. Since f is irreducible, we have
$\#J_f[2]({\mathbb {Q}})=1$
. Therefore, there is a unique primitive g-dimensional lattice
$\Lambda $
which is isotropic with respect to both A and B.
Let
$\gamma \in \mathrm {{SL}}_n({\mathbb {Z}})$
be an element such that
$\gamma (A,B) = \sigma _{m}(f) =: (A_f, B_f)\in W_{0}({\mathbb {Z}})$
. Since we know that
$\text {Span}_{\mathbb {Z}}\{e_1,\ldots ,e_g\}$
is a primitive g-dimensional lattice isotropic with respect to
$A_f$
and
$B_f$
, we see that
$\gamma ^t\cdot \,\text {Span}_{\mathbb {Z}}\{e_1,\ldots ,e_g\}$
is a primitive g-dimensional lattice isotropic with respect to A and B. By uniqueness, it follows that
$\Lambda = \gamma ^t\cdot \,\text {Span}_{\mathbb {Z}}\{e_1,\ldots ,e_g\}$
, and so by definition,
$|Q|(A,B,\Lambda )=|Q|(\sigma _{m}(f))=m$
, where the final equality is Theorem 3.7.
3.5 Embedding
${\mathcal {W}}_{m,n}^{(2),\,\mathrm {{gen}}}$
into
$W_{n+1}({\mathbb {Z}})$
, for n even
Suppose now that
$n = 2g+2$
is even with
$g\geq 1$
. For an odd squarefree integer
$m>0$
, let
${\mathcal {W}}_{m,n}^{\mathrm {{{(2)}}}}$
denote the set of integer binary forms having discriminant weakly divisible by
$p^2$
for every prime factor p of m. Let
${\mathcal {W}}_{m,n}^{(2),\,\mathrm {{gen}}}\subset {\mathcal {W}}_{m,n}^{(2)}$
consist of those
$f(x,y)$
with
$f(0,1)$
coprime to m. Since
$\Delta (xf(x,y)) = \Delta (f(x,y))f(0,1)^2$
, we see that if
$f(x,y)\in {\mathcal {W}}_{m,n}^{(2),\,\mathrm {{gen}}}$
, then
$xf(x,y)\in {\mathcal {W}}_{m,n+1}^{(2)}$
. We define
$\sigma _{m,n}:{\mathcal {W}}_{m,n}^{(2),\,\mathrm {{gen}}}\rightarrow W_{n+1}({\mathbb {Z}})$
via
$\sigma _{m,n}(f) = \sigma _{m,n+1}(xf)$
. For the rest of this subsection, we drop subscripts and denote
$\sigma _{m,n}$
by
$\sigma _m$
,
${\mathcal {W}}_{m,n}^{\mathrm {{{(2)}}},\,\mathrm {{gen}}}$
by
${\mathcal {W}}_m^{(2),\,\mathrm {{gen}}}$
, and
$W_{n+1}$
by W.
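The identity $\Delta (xf(x,y)) = \Delta (f(x,y))f(0,1)^2$ used above can be checked numerically. The following Python sketch computes discriminants through the Sylvester resultant of $f(x,1)$ and its derivative; it assumes a nonzero leading coefficient $a_0$, and the function names are illustrative rather than taken from the text.

```python
from fractions import Fraction

def det(mat):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in mat]
    n, d = len(m), Fraction(1)
    for c in range(n):
        piv = next((r for r in range(c, n) if m[r][c] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != c:
            m[c], m[piv] = m[piv], m[c]
            d = -d
        d *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= factor * m[c][k]
    return d

def disc(coeffs):
    """Discriminant of a0*x^n + ... + an*y^n given as [a0, ..., an], assuming a0 != 0."""
    n = len(coeffs) - 1
    deriv = [(n - i) * coeffs[i] for i in range(n)]  # derivative of f(x, 1)
    size = 2 * n - 1
    rows = [[0] * s + coeffs + [0] * (size - s - n - 1) for s in range(n - 1)]
    rows += [[0] * s + deriv + [0] * (size - s - n) for s in range(n)]
    sign = -1 if (n * (n - 1) // 2) % 2 else 1
    return sign * det(rows) / coeffs[0]

f = [1, 0, 0, 1, 2]   # x^4 + x*y^3 + 2*y^4, so f(0,1) = 2
xf = f + [0]          # coefficients of x*f(x,y)
print(disc(f), disc(xf))
```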
We now define the finer q-invariant. Let
$f\in {\mathcal {W}}_{m}^{(2),\,\mathrm {{gen}}}$
and suppose
$(A,B) = \sigma _{m}(f)$
. Then B is singular since
$xf(x,y)$
has vanishing
$y^n$
-term. Moreover, since
$\Delta (xf)\neq 0$
, the kernel of B has dimension exactly
$1$
and is not isotropic with respect to A (see Lemma 6.4). Fix an integral domain D. Let
$W_{1}(D)$
be the subset of
$W(D)$
consisting of pairs
$(A,B)$
of symmetric
$(n+1)\times (n+1)$
matrices satisfying the following conditions:
-
(a) The top left
$(g+1)\times (g+1)$ block of A is
$0$ .
-
(b) The top left
$(g+2)\times (g+2)$ block of B is
$0$ (implying that B is singular).
-
(c) The kernel of B has dimension exactly
$1$ (over the fraction field of D) and is not isotropic with respect to A.
Take any
$(A,B)\in W_{1}(D)$
. Conditions (a) and (c) imply that the first
$g+1$
columns of B are linearly independent over the fraction field of D. Let
$B'$
denote the top right
$(g+1)\times (g+1)$
block of B. Since B is symmetric, we see that
$B'$
is nonsingular. It is now easy to see from the definition of the Q-invariant that as polynomials in the coordinates of
$W_{1}(D)$
, we have

We define the quotient to be the q-invariant of
$(A,B)$
:

Let
$G_1(D)$
denote the subgroup of
$\mathrm {{SL}}_{n+1}(D)$
preserving
$W_{1}(D)$
. Then elements of
$G_1(D)$
have the following block matrix form:

It is easy to check that for any
$(A,B)\in W_{1}(D)$
,

We now consider the situation over
${\mathbb {Z}}$
. Let
$(A,B)\in W({\mathbb {Z}})$
be a distinguished element having nonzero discriminant such that B is singular. Let X denote a common isotropic
$(g+1)$
-dimensional subspace of A and B. We know that the kernel
$\langle v\rangle $
of B has trivial intersection with X. Denote the span of X and v by
$X'$
, which is a
$(g+2)$
-dimensional subspace containing X that is isotropic with respect to B. Let
$\Lambda = X\cap {\mathbb {Z}}^{n+1}$
and
$\Lambda ' = X'\cap {\mathbb {Z}}^{n+1}$
be the primitive lattices in X and
$X'$
, respectively. There exists an element
$\gamma $
in
$\mathrm {{SL}}_{n+1}({\mathbb {Z}})$
, unique up to left multiplication by an element in
$G_1({\mathbb {Z}})$
, such that
$\Lambda = \gamma ^t\cdot \,\text {Span}_{\mathbb {Z}}\{e_1,\ldots ,e_{g+1}\}$
and
$\Lambda ' = \gamma ^t\cdot \,\text {Span}_{\mathbb {Z}}\{e_1,\ldots ,e_{g+2}\}$
. Then
$\gamma (A,B)\in W_{1}({\mathbb {Z}})$
, and we can thus define the
$|q|$
-invariant for the quadruple
$(A,B,\Lambda ,\Lambda ')$
by
$|q|(A,B,\Lambda ,\Lambda ') := |q|(\gamma (A,B)).$
In other words, we complete an integral basis
$\{v_1,\ldots ,v_{g+1}\}$
of
$\Lambda $
to an integral basis
$\{v_1,\ldots ,v_{n+1}\}$
of
${\mathbb {Z}}^{n+1}$
such that
$\{v_1,\ldots ,v_{g+2}\}$
forms an integral basis of
$\Lambda '$
. When expressed in this basis, the pair
$(A',B')$
of Gram matrices for the quadratic forms defined by A and B lies in
$W_{1}({\mathbb {Z}})$
and we define
$|q|(A,B,\Lambda ,\Lambda ') := |q|(A',B').$
Finally, we compute the
$|Q|$
- and
$|q|$
-invariants of
$\sigma _{m}(f(x,y))$
, where
$f(x,y)\in {\mathcal {W}}_{m}^{(2),\,\mathrm {{gen}}}$
is irreducible.
Proposition 3.9. Let
$n=2g+2$
with
$g\geq 1$
. Let m be an odd positive squarefree integer. Let
$f(x,y)\in {\mathcal {W}}_{m}^{(2),\,\mathrm {{gen}}}$
be irreducible. Let
$(A,B)$
be any element in
$\mathrm {{SL}}_{n+1}({\mathbb {Z}})\cdot \sigma _{m}(f(x,y))$
. Let
$\Lambda $
be a
$(g+1)$
-dimensional primitive lattice contained in a
$(g+2)$
-dimensional primitive lattice
$\Lambda '$
such that
$\Lambda $
is isotropic with respect to A and
$\Lambda '$
is isotropic with respect to B. Then
$|Q|(A,B,\Lambda )$
is either m or
$|f(0,1)|m$
, and
$|q|(A,B) := |q|(A,B,\Lambda ,\Lambda ') = m$
, independent of
$(\Lambda ,\Lambda ')$
. In particular, if
$f(x,y)\in {\mathcal {W}}_{m}^{(2),\,\mathrm {{gen}}}\cap {\mathcal {W}}_{m'}^{(2),\,\mathrm {{gen}}}$
is irreducible where m and
$m'$
are distinct odd positive squarefree integers, then
$\sigma _{m}(f(x,y))$
and
$\sigma _{m'}(f(x,y))$
are not
$\mathrm {{SL}}_{n+1}({\mathbb {Z}})$
-equivalent.
Proof. The size of
$J_f[2]({\mathbb {Q}})$
is
$2$
since
$xf(x,y)$
has a unique even degree factor (namely,
$f(x,y)$
) over
${\mathbb {Q}}$
. Therefore, the pair
$(A,B)$
has two
$(g+1)$
-dimensional common isotropic subspaces
$X_1$
and
$X_2$
over
${\mathbb {Q}}$
. Let
$\Lambda _1$
and
$\Lambda _2$
denote the corresponding primitive lattices contained in
$X_1$
and
$X_2$
. The unique
$(g+2)$
-dimensional subspace
$X_1'$
(resp.,
$X_2'$
) isotropic with respect to B and containing
$X_1$
(resp.,
$X_2$
) is the span of
$X_1$
(resp.,
$X_2$
) with the kernel of B. Let
$\Lambda ^{\prime }_1$
and
$\Lambda _2'$
denote the primitive lattices contained in
$X_1'$
and
$X_2'$
. We compute the
$|Q|$
- and
$|q|$
-invariants associated to these lattices.
We may assume that
$(A,B)=\sigma _{m}(f)$
since the action of
$\mathrm {{SL}}_{n+1}({\mathbb {Z}})$
does not change the
$|Q|$
- or
$|q|$
-invariants. Since
$|Q|$
is
$\mathrm {{SL}}_2$
-invariant, and
$|q|$
remains unchanged when we add a multiple of B to A, we may also assume that

and so

Comparing the
$y^{n+1}$
- and the
$xy^{n}$
-coefficients, we have
$c_{n+1} = 0$
and
$c_{n}=b_{n}$
.
Let
$\langle \,,\rangle _A$
and
$\langle \,,\rangle _B$
denote the quadratic forms corresponding to A and B. Let
$e_1,\ldots ,e_{n+1}$
be the standard basis on
${\mathbb {Z}}^{n+1}$
. Since
$c_{n+1}=0$
, the vector
$e_{n+1}$
spans the kernel of B. We may take the subspace spanned by
$e_1,\ldots ,e_{g+1}$
as
$X_1$
. Then by construction,
$|Q|(A,B,\Lambda _1) = m$
. When expressed in terms of the ordered integral basis
$\{e_1,\ldots ,e_{g+1},e_n,e_{g+2},\ldots ,e_{n-1}\}$
, the top right
$(g+1)\times (g+1)$
block of B has
$1$
’s on the antidiagonal and
$0$
’s above the antidiagonal, and so has determinant
$\pm 1$
. Hence,
$|q|(A,B,\Lambda _1,\Lambda _1') = m$
.
The second common isotropic
$(g+1)$
-dimensional subspace
$X_2$
is the reflection of
$X_1$
in the hyperplane perpendicular to
$e_{n+1}$
with respect to
$\langle \,,\rangle _A$
. That is,

Suppose first that
$b_n$
is odd. Then we have the following integral basis for
${\mathbb {Z}}^{n+1}$
:

In terms of this basis, the top right
$(g+1)\times (g+2)$
blocks of A and B have the following form:

It is then easy to check that
$|Q|(A,B,\Lambda _2) = |b_{n}|m$
and
$|q|(A,B,\Lambda _2,\Lambda _2') = m$
.
When
$b_{n}$
is even, we have the following integral basis:

In terms of this basis, the top right
$(g+1)\times (g+2)$
blocks of A and B have the same form as in (13). Hence, the
$|Q|$
- and
$|q|$
-invariants are as stated in the proposition.
4 A uniformity estimate for odd degree polynomials
Throughout this section, we fix an odd integer
$n=2g+1$
with
$g\geq 1$
. Our goal is to prove Theorem 5(b) by obtaining a bound on the number of integral binary n-ic forms having bounded height and discriminant weakly divisible by the square of a large squarefree integer.
Let
$m>0$
be an odd squarefree integer. Recall that we defined a map
$\sigma _m:{\mathcal {W}}^{(2)}_m\to W_0({\mathbb {Z}})$
in Theorem 3.7 with the following two properties:
$f_{\sigma _m(f)}=f$
for every
$f\in {\mathcal {W}}^{(2)}_m$
, and
$|Q|(\sigma _m(f))=m$
. Moreover, in Proposition 3.8, we proved that when
$f\in {\mathcal {W}}_m^{(2)}$
is irreducible, it is possible to naturally extend the definition of the
$|Q|$
-invariant to the set
$\mathrm {{SL}}_n({\mathbb {Z}})\cdot \sigma _m(f)$
.
Let
$W({\mathbb {Z}})^{\mathrm {{dist}}}$
denote the set of distinguished elements in
$W({\mathbb {Z}})$
, and for any set
$L\subset W({\mathbb {Z}})$
, let
$L^{\mathrm {{irr}}}$
denote the set of elements
$w\in L$
such that
$f_w$
is irreducible. There is a natural extension of the
$|Q|$
-invariant to the set
$W({\mathbb {Z}})^{\mathrm {{dist}},\mathrm {{irr}}}$
. For a positive real number M and any set
$S\subset W({\mathbb {Z}})^{\mathrm {{dist}},\mathrm {{irr}}}$
, let
$S_{|Q|>M}$
denote the set of elements
$w\in S$
with
$|Q(w)|>M$
. By [Reference Kuba22, Theorem 1], the number of reducible elements
$f\in V({\mathbb {Z}})$
with
$H(f)<X$
is
$O(X^n)$
. Hence, we have the bound

In this section, we obtain an upper bound on the number of
$\mathrm {{SL}}_n({\mathbb {Z}})$
-orbits on
$W({\mathbb {Z}})^{\mathrm {{dist}},\mathrm {{irr}}}_{|Q|>M}$
with height bounded by X. First, in §4.1, we lay out the reduction theory necessary to express the number of such orbits in terms of the counts of lattice points in certain bounded regions. Then in §4.2, we partition these regions into three parts, the main body, the shallow cusp and the deep cusp. We prove the desired estimate for each of these parts, thereby obtaining Theorem 5(b).
4.1 Reduction theory and averaging over fundamental domains
Recall that the Iwasawa decomposition of
$\mathrm {{SL}}_n({\mathbb {R}})$
is given by
$ \mathrm {{SL}}_n({\mathbb {R}})=NTK, $
where N is the group of unipotent lower triangular matrices in
$\mathrm {{SL}}_n({\mathbb {R}})$
,
$K=\mathrm {{SO}}(n)$
is a maximal compact subgroup of
$\mathrm {{SL}}_n({\mathbb {R}})$
, and T is the split torus of
$\mathrm {{SL}}_n({\mathbb {R}})$
consisting of
$n\times n$
diagonal matrices with positive diagonal entries and determinant
$1$
. We denote elements in T by
$s=\mathrm {{diag}}(t_1^{-1},t_2^{-1},\ldots ,t_n^{-1})$
, where
$t_i>0$
for
$1\leq i\leq n$
and
$t_1t_2\cdots t_n=1$
. It will be convenient to make the following change of variables. For
$1\leq i\leq n-1$
, set
$s_i$
to be

for
$1\leq i<n$
. The Haar measure of
$G({\mathbb {R}})$
in these coordinates is then given by

where
$dn$
and
$dk$
are Haar measures on N and K, respectively, and
$d^\times s=\prod _{i=1}^{n-1}s_i^{-1}ds_i$
.
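The Iwasawa decomposition recalled above can be computed numerically by applying the Gram–Schmidt process to the rows of a matrix. The Python sketch below is a minimal floating-point illustration under that approach; the function names are our own, and the diagonal entries of T correspond to $t_i^{-1}$ in the notation above.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def iwasawa(g):
    """Return (N, T, K) with g = N*T*K, where N is unipotent lower triangular,
    T is diagonal with positive entries, and K has orthonormal rows."""
    n = len(g)
    K = []                                # orthonormal rows from Gram-Schmidt
    L = [[0.0] * n for _ in range(n)]     # lower triangular factor, g = L*K
    for i in range(n):
        v = [float(x) for x in g[i]]
        for j in range(i):
            L[i][j] = sum(g[i][c] * K[j][c] for c in range(n))
            v = [v[c] - L[i][j] * K[j][c] for c in range(n)]
        L[i][i] = math.sqrt(sum(x * x for x in v))
        K.append([x / L[i][i] for x in v])
    T = [[L[i][i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    N = [[L[i][j] / L[j][j] for j in range(n)] for i in range(n)]
    return N, T, K

g = [[2.0, 1.0], [1.0, 1.0]]              # determinant 1
N, T, K = iwasawa(g)
print(N, T, K)
```

Since $\det g=1$ and the diagonal of T is positive, the orthogonal factor K automatically has determinant $1$ in this example.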
We denote the coordinates on W by
$a_{ij},b_{ij}$
for
$1\leq i\leq j\leq n$
. These coordinates are eigenvectors for the action of T on the dual
$W^*$
of W. Denote the T-weight of a coordinate
$\alpha $
on W, or more generally a product
$\alpha $
of powers of such coordinates, by
$w(\alpha )$
. Then
$w(a_{ij}) = w(b_{ij}) = t_i^{-1}t_j^{-1}.$
It will be useful in what follows to compute the weight of the Q-invariant, which is a homogeneous polynomial of degree
$g(g+1)$
in the coordinates of
$W_0$
. We view the torus T as sitting inside
$G_0$
. Then by (5), we have

Let
${\mathcal {F}}$
be a fundamental set for the action of
$\mathrm {{SL}}_n({\mathbb {Z}})$
on
$\mathrm {{SL}}_n({\mathbb {R}})$
that is contained in a Siegel set (i.e., contained in
$N'T'K$
, where
$N'$
is a set consisting of elements in N whose coefficients are absolutely bounded and
$T'\subset T$
consists of elements in
$s\in T$
with
$s_i\geq c$
for some positive constant c). Let
${\mathcal {W}}(1)$
denote the subset of real binary n-ic forms of height bounded by
$1$
and let
$R' = \sigma _1({\mathcal {W}}(1))$
, where
$\sigma _1$
is as in §3.4. Set
$R:={\mathbb {R}}_{>0}\cdot R'$
, and note that every distinguished element of
$W({\mathbb {R}})$
is
$\mathrm {{SL}}_n({\mathbb {R}})$
-equivalent to some element in R.
Let
$H_0$
be a nonempty open bounded left K-invariant set in
$\mathrm {{SL}}_n({\mathbb {R}})$
. Denote the set
$H_0\cdot R'$
by
${\mathcal {B}}_1$
. Then
${\mathcal {B}}_1$
is an absolutely bounded set in
$W({\mathbb {R}})$
. Let
${\mathcal {L}}$
be any
$\mathrm {{SL}}_n({\mathbb {Z}})$
-invariant subset of
$W({\mathbb {Z}})$
consisting of elements that are distinguished over
${\mathbb {R}}$
, and denote the set of elements in
${\mathcal {L}}$
with height less than X by
${\mathcal {L}}_X$
. Throughout this section, let
$Y=X^{1/n}$
. Then the averaging method as described in [Reference Bhargava and Shankar10, §2.3] yields the bound

for some absolutely bounded open set
${\mathcal {B}}$
containing
${\mathcal {B}}_1$
.
We denote the second integral on the right-hand side of (16) by
${\mathcal {I}}_X({\mathcal {L}})$
, and break it up into an integral over the main body, the shallow cusp and the deep cusp. We define the main body to be the range of the integral where
$|a_{11}|\geq 1$
for some element in
$s(Y{\mathcal {B}})$
, and denote the main-body portion of
${\mathcal {I}}_X({\mathcal {L}})$
by
${\mathcal {I}}_X^{\mathrm {{main}}}({\mathcal {L}})$
. We define the shallow cusp to be the range of the integral where
$|a_{11}|< 1$
for all elements in
$s(Y{\mathcal {B}})$
but
$|a_{ij}|\geq 1$
for some
$i,j\leq g$
, and denote the shallow-cusp portion of
${\mathcal {I}}_X({\mathcal {L}})$
by
${\mathcal {I}}_X^{\mathrm {{scusp}}}({\mathcal {L}})$
. We define the deep cusp to be the range of the integral where
$|a_{ij}|<1$
for all
$i, j\leq g$
and all elements in
$s(Y{\mathcal {B}})$
, and denote the deep-cusp portion of
${\mathcal {I}}_X({\mathcal {L}})$
by
${\mathcal {I}}_X^{\mathrm {{dcusp}}}({\mathcal {L}})$
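. Then
${\mathcal {I}}_X({\mathcal {L}}) = {\mathcal {I}}_X^{\mathrm {{main}}}({\mathcal {L}}) + {\mathcal {I}}_X^{\mathrm {{scusp}}}({\mathcal {L}}) + {\mathcal {I}}_X^{\mathrm {{dcusp}}}({\mathcal {L}}).$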
In the next subsection, we prove bounds for the main body, the shallow cusp and the deep cusp when
${\mathcal {L}}=W({\mathbb {Z}})_{|Q|>M}^{\mathrm {{dist}},\mathrm {{irr}}}$
.
We will need the following result of Davenport to estimate the number of lattice points in bounded regions.
Proposition 4.1 [Reference Davenport14].
Let
${\mathcal {R}}$
be a bounded, semi-algebraic multiset in
${\mathbb {R}}^n$
having maximum multiplicity m that is defined by at most k polynomial inequalities, each having degree at most
$\ell $
. Let
${\mathcal {R}}'$
denote the image of
${\mathcal {R}}$
under any
$($
upper or lower
$)$
triangular, unipotent transformation of
${\mathbb {R}}^n$
. Then the number of lattice points
$($
counted with multiplicity
$)$
contained in the region
${\mathcal {R}}'$
is given by

where
$\overline {\mathrm {{Vol}}}(\overline {{\mathcal {R}}})$
denotes the greatest d-dimensional volume of any projection of
${\mathcal {R}}$
onto a coordinate subspace obtained by equating
$n-d$
coordinates to zero, as d ranges over all values in
$\{1, \dots , n-1\}$
. The implied constant in the second summand depends only on n, m, k and
$\ell $
.
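As a simple illustration of the shape of this estimate, consider the disk of radius r in ${\mathbb {R}}^2$: its volume is $\pi r^2$ and its largest coordinate projection has length $2r$, so the number of lattice points should be $\pi r^2 + O(r)$. The Python sketch below counts the lattice points directly and compares with the classical Gauss bound $\pi (\sqrt {2}\,r+1/2)$ on the error; the example region and the function name are our own choices and are not used elsewhere in the argument.

```python
import math

def lattice_points_in_disk(r):
    """Count integer points (x, y) with x^2 + y^2 <= r^2 for an integer radius r >= 0."""
    count = 0
    for x in range(-r, r + 1):
        ymax = math.isqrt(r * r - x * x)  # largest y with x^2 + y^2 <= r^2
        count += 2 * ymax + 1
    return count

for r in (1, 2, 10, 50, 200):
    n_pts = lattice_points_in_disk(r)
    error = abs(n_pts - math.pi * r * r)
    print(r, n_pts, round(error, 2))
```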
4.2 The number of orbits of distinguished elements with large Q-invariant
In this subsection, we obtain the following upper bound on
${\mathcal {I}}_X(W({\mathbb {Z}})^{\mathrm {{dist}},\mathrm {{irr}}}_{|Q|>M})$
, thus yielding the same bound on the quantity
$\#\big (\mathrm {{SL}}_n({\mathbb {Z}})\backslash \{w\in W({\mathbb {Z}})^{\mathrm {{dist}},\mathrm {{irr}}}_{|Q|>M}:H(w)<X\}\big )$
by (16).
Theorem 4.2. We have
${\mathcal {I}}_X(W({\mathbb {Z}})^{\mathrm {{dist}},\mathrm {{irr}}}_{|Q|>M}) \ll _\epsilon X^{n+1-1/(2n)+\epsilon } + {X^{n+1+\epsilon }}/{M}.$
Note that (14), (16), and Theorem 4.2 immediately imply Part (b) of Theorem 5.
We bound
${\mathcal {I}}_X(W({\mathbb {Z}})^{\mathrm {{dist}},\mathrm {{irr}}}_{|Q|>M})$
by obtaining bounds for the main body, the shallow cusp and the deep cusp. We consider first the main body. In [Reference Ho, Shankar and Varma20, Proposition 4.6], an upper bound of
$o(X^{n+1})$
is obtained on
${\mathcal {I}}_X^{\mathrm {{main}}}(W({\mathbb {Z}})^{\mathrm {{dist}}})$
. This is proved using the following two ingredients: estimates with a power saving error term on
${\mathcal {I}}_X^{\mathrm {{main}}}({\mathcal {L}})$
for lattices
${\mathcal {L}}\subset W({\mathbb {Z}})$
, and a proof that the density of elements in
$W({\mathbb {F}}_p)$
that are not
${\mathbb {F}}_p$
-distinguished is bounded below by some positive constant, independent of p. To obtain a power saving bound on
${\mathcal {I}}_X(W({\mathbb {Z}})^{\mathrm {{dist}},\mathrm {{irr}}}_{|Q|>M})$
, we use the large sieve.
Proposition 4.3. Let
$V\cong {\mathbb {A}}^N$
be an affine space. For every prime p, let
$\Omega _p\subset V({\mathbb {F}}_p)$
and let
$\omega (p) = \#\Omega _p/\#V({\mathbb {F}}_p)$
. For a rectangular box
${\mathcal {B}}' = [M_1,M_1+X_1]\times \cdots \times [M_N,M_N+X_N]$
where
$M_1,\ldots ,M_N,X_1,\ldots , X_N$
are real numbers with
$X_1,\ldots , X_N$
positive, let

Then for any
$L> 0$
,

In particular, if
$\omega (p) \gg 1$
, that is, all
$\omega (p)$
are bounded below by some positive constant for large enough p, then

Proof. The bound (18) follows from [Reference Huxley19, Theorem 1] and [Reference Kowalski23, Proposition 2.4]. For the second statement, we have
$\omega (p)/(1 - \omega (p))\gg 1$
, and so

We are then done by taking
$L = \min \{X_1,\ldots ,X_N\}^{1/2}.$
In the situation when, for each prime p, a positive density subset of the lattice is being excluded by the sieve, the large sieve yields a better upper bound than the Selberg sieve. See, for example, [Reference Shankar and Tsimerman35], which gives a power-saving error term of
$O_\epsilon (X^{399/400+\epsilon })$
on the count of quintic fields. Applying the large sieve above instead of the Selberg sieve, and following the argument of [Reference Shankar and Tsimerman35], would yield the better error term of
$O_\epsilon (X^{159/160+\epsilon })$
.
We now apply the large sieve, as stated in Proposition 4.3, to bound the number of distinguished elements in the main body:
Proposition 4.4. We have
$ {\mathcal {I}}_X^{\mathrm {{main}}}(W({\mathbb {Z}})^{\mathrm {{dist}}})\ll _\epsilon X^{n+1-1/{(2n)}+\epsilon }. $
Proof. We apply Proposition 4.3 with
$\Omega _p$
being the set of non-distinguished elements of
$W({\mathbb {F}}_p)$
and the rectangular box being
$s(Y{\mathcal {B}})$
. The shortest side has length
$Yw(a_{11})$
. Note that

Hence, we have

We are now done as
$Y = X^{1/n}$
.
Next, a bound on the shallow cusp follows directly from the proof of [Reference Ho, Shankar and Varma20, Proposition 4.3]:
Proposition 4.5. We have
$ {\mathcal {I}}_X^{\mathrm {{scusp}}}(W({\mathbb {Z}}))\ll X^{n+1-1/{n}}. $
In [Reference Ho, Shankar and Varma20, Proposition 4.3], the shallow and deep cusps were treated simultaneously, but the points in the deep cusp were ruled out since only nondistinguished elements were counted there. Hence, the proof of [Reference Ho, Shankar and Varma20, Proposition 4.3] yields the claimed bound in Proposition 4.5.
Finally, to treat the deep cusp, let
$U=\{a_{ij},b_{ij}:1\leq i\leq j\leq n\}$
denote the set of coordinates on W, and let
$U_0=\{a_{ij},b_{ij}\mid i\leq j, \:j\geq g+1\}$
denote the set of coordinates on
$W_0$
. We define a partial order
$\lesssim $
on U by setting
$\alpha \lesssim \beta $
if all the powers of
$s_i$
in
$w(\alpha )^{-1}w(\beta )$
are nonnegative. Explicitly,
$a_{ij}\lesssim a_{i'j'}$
if and only if
$i\leq i'$
and
$j\leq j'$
(and similarly for
$b_{ij}$
, as
$a_{ij}$
and
$b_{ij}$
have the same weight). A subset
${\mathcal {Z}}$
of
$U_0$
is saturated if for any
$\beta \in {\mathcal {Z}}$
and any
$\alpha \in U_0$
with
$\alpha \lesssim \beta $
, the coordinate
$\alpha $
also lies in
${\mathcal {Z}}$
. We pick positive constants
$c_{ij}$
for
$1\leq i\leq j\leq n$
such that
-
(a) If
$|Yw(a_{ij})|<c_{ij}$ , then
$|a_{ij}|<1$ and
$|b_{ij}|<1$ for every
$(A,B)\in s(Y{\mathcal {B}})$ .
-
(b) For all
$s\in T'$ and
$a_{ij}\lesssim a_{i'j'}$ , we have
$w(a_{ij})/c_{ij}\leq w(a_{i'j'})/c_{i'j'}$ .
More explicitly, we may choose
$c_{nn}$
to be sufficiently small and take

The significance of these constants
$c_{ij}$
is the following: for every
$Y>1$
, first, if
$Yw(a_{ij})<c_{ij}$
, then every integral element in
$s(Y{\mathcal {B}})$
has
$a_{ij}$
- and
$b_{ij}$
-coordinates equal to
$0$
; and second, if
$a_{ij}\lesssim a_{i'j'}$
, then
$Yw(a_{i'j'})<c_{i'j'}$
implies
$Yw(a_{ij})<c_{ij}$
.
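The partial order and the notion of a saturated subset can be explored with a minimal sketch. It makes several simplifying assumptions that are not in the text: only the $a_{ij}$-coordinates are modelled (encoded as pairs `(i, j)`), the universe is all of them rather than $U_0$, the weight is taken to be $w(a_{ij})=t_i^{-1}t_j^{-1}$, the condition $s\in T'$ is replaced by the exact inequalities $t_1\geq \cdots \geq t_n$, and all constants $c_{ij}$ are set equal to a single hypothetical value `c`.

```python
from fractions import Fraction

def coords(n):
    """Coordinates a_ij with 1 <= i <= j <= n, encoded as pairs (i, j)."""
    return [(i, j) for i in range(1, n + 1) for j in range(i, n + 1)]

def precedes(alpha, beta):
    """alpha <~ beta in the partial order: i <= i' and j <= j'."""
    return alpha[0] <= beta[0] and alpha[1] <= beta[1]

def is_saturated(Z, universe):
    """Z is saturated if it is closed downward inside `universe`."""
    Z = set(Z)
    return all(alpha in Z
               for beta in Z
               for alpha in universe
               if precedes(alpha, beta))

def weight(alpha, t):
    """Assumed weight w(a_ij) = 1 / (t_i t_j) for s = diag(1/t_1, ..., 1/t_n)."""
    i, j = alpha
    return Fraction(1) / (t[i - 1] * t[j - 1])

def small_coords(t, Y, c):
    """Coordinates with Y * w(a_ij) < c (one common constant c, a simplification)."""
    return {alpha for alpha in coords(len(t)) if Y * weight(alpha, t) < c}

# Hypothetical torus element with t_1 >= t_2 >= t_3 >= t_4 and product 1.
t = [Fraction(4), Fraction(2), Fraction(1, 2), Fraction(1, 4)]
```

Under these assumptions the set of small coordinates is automatically saturated, which is the reason only saturated subsets need to be considered.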
The following lemma gives conditions that ensure an element in
$W({\mathbb {R}})$
has discriminant
$0$
.
Lemma 4.6. Suppose that
$(A,B)\in W({\mathbb {R}})$
satisfies
$a_{ij}=b_{ij}=0$
for all
$i\leq k$
and
$j\leq n-k$
for some
$k\in \{1,\ldots ,g\}$
. Then the discriminant of
$(A,B)$
is
$0$
.
Proof. One checks that
$f_{A,B}$
has a square factor of degree k and so has discriminant
$0$
.
The next lemma states that when
${\mathcal {L}}\subset W({\mathbb {Z}})$
consists of elements with nonzero discriminant, the integral defining
${\mathcal {I}}_X({\mathcal {L}})$
can be cut off by conditions of the form
$s_i\ll X^{\Theta }$
for some absolute constant
$\Theta $
depending only on n.
Lemma 4.7. There exists an absolute constant
$\Theta $
depending only on n such that if
$s\in T'$
with
$s_i\gg X^\Theta $
for some i, then
$s(Y{\mathcal {B}})\cap W({\mathbb {Z}})$
contains only points with discriminant
$0$
.
Proof. Let
$s=\mathrm {{diag}}(t_1^{-1},\cdots ,t_n^{-1})\in T'$
; then
$t_1\gg t_2\gg \cdots \gg t_n$
and
$t_1t_2\cdots t_n=1$
. Because of the relation between the
$t_j$
’s and the
$s_i$
’s, it suffices to prove that if
$s(Y{\mathcal {B}})$
contains an integral element with nonzero discriminant, then
$t_1$
is bounded from above by some power of X or, equivalently,
$t_n$
is bounded from below by some power of X. By Lemma 4.6, for
$s(Y{\mathcal {B}})$
to contain an integral element with nonzero discriminant, we must have
$Yw(a_{k,n-k})\gg 1$
for every
$k\in \{1,\ldots ,g\}$
. That is,
$t_kt_{n-k}\ll Y$
for every
$k\in \{1,\ldots ,g\}$
. Multiplying these conditions together, we obtain
$t_n\gg Y^{-g}$
. The lemma follows.
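The multiplication step in this proof can be checked numerically. The sketch below assumes $n=2g+1$, which is what makes the indices $k$ and $n-k$ for $1\leq k\leq g$ exhaust $\{1,\ldots ,n-1\}$, and it replaces every implied constant by $1$; the sample values of `t` and `Y` are hypothetical.

```python
from fractions import Fraction
from math import prod

def paired_products(t):
    """Return [t_k * t_{n-k} for k = 1..g], assuming n = 2g + 1."""
    n = len(t)
    g = (n - 1) // 2
    if n != 2 * g + 1:
        raise ValueError("this sketch assumes n = 2g + 1")
    return [t[k - 1] * t[n - k - 1] for k in range(1, g + 1)]

# Hypothetical example with n = 5, g = 2 and t_1 t_2 ... t_n = 1.
t = [Fraction(4), Fraction(2), Fraction(1), Fraction(1, 2), Fraction(1, 4)]
Y = 4
g = (len(t) - 1) // 2
pairs = paired_products(t)

# The pairs multiply to t_1 ... t_{n-1} = 1 / t_n, so if each pair is at
# most Y, then 1 / t_n <= Y^g, that is, t_n >= Y^(-g).
identity_holds = prod(pairs) * t[-1] == 1
bound_holds = t[-1] >= Fraction(1, Y ** g)
```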
We now estimate the contribution to
${\mathcal {I}}_X(W({\mathbb {Z}})_{|Q|>M}^{\mathrm {{dist}},\mathrm {{irr}}})$
coming from the deep cusp.
Proposition 4.8. We have
$ {\mathcal {I}}_X^{\mathrm {{dcusp}}}(W({\mathbb {Z}})_{|Q|>M}^{\mathrm {{irr}}})\ll _\epsilon {X^{n+1+\epsilon }}/{M}. $
Proof. For a subset
${\mathcal {Z}}$
of
$U_0$
, let
$T^{\prime }_{\mathcal {Z}}$
denote the subset of
$s\in T'$
with
$Y^{g(g+1)}w(Q)\gg M$
, and
$|Yw(a_{ij})|< c_{ij}$
precisely for those
$(i,j)$
where
$a_{ij}\in {\mathcal {Z}}$
or
$b_{ij}\in {\mathcal {Z}}$
. Note that
$T^{\prime }_{\mathcal {Z}}$
is empty if
${\mathcal {Z}}$
is not saturated. Define

where the bound on the second line follows from Proposition 4.1. Let
$U' := \{a_{ij},b_{ij}\mid i+j<n\}$
. If
${\mathcal {Z}}$
is saturated and not contained in
$U'$
, then
${\mathcal {Z}}$
contains
$a_{k,n-k}$
for some
$k=1,\ldots ,g$
. Hence, for any
$s\in T^{\prime }_{\mathcal {Z}}$
, every integral element in
$s(Y {\mathcal {B}})\cap W_0({\mathbb {Z}})$
satisfies
$a_{ij}=b_{ij}=0$
, for
$i\leq k$
and
$j\leq n-k$
, and so has zero discriminant by Lemma 4.6. Therefore,

where the sum is over saturated subsets
${\mathcal {Z}}$
of
$U_0$
contained in
$U'$
.
Now

Fix a saturated subset
${\mathcal {Z}}$
of
$U_0$
contained in
$U'$
. We define a map
$\pi :{\mathcal {Z}}\rightarrow U_0\backslash U'$
by

Note that for any
$\alpha \in {\mathcal {Z}}$
, we have
$\pi (\alpha )\notin U'$
and so
$Yw(\pi (\alpha ))\gg 1$
. Furthermore, for every
$\alpha \in U'$
, we have
$\alpha \lesssim \pi (\alpha )$
and so
$w(\pi (\alpha ))/w(\alpha )\gg 1$
. Hence, for any
$s\in T^{\prime }_{\mathcal {Z}}$
,

Here, the first product on the right-hand side is the contribution from all
$a_{ij}\in U'$
, and the second product is the contribution from all
$b_{ij}\in U'$
. Note that when multiplying the right-hand sides of (19) and (20), we get all of the
$t_i/t_j$
for
$1\leq i < j \leq n$
except for the ones with
$i\geq g+1$
and
$j = n$
. For any
$s\in T^{\prime }_{\mathcal {Z}}$
, we have
$t_i/t_j\gg 1$
for any
$i < j$
, and so

Since each
$s_i$
is bounded below by an absolute constant and bounded above by a power of X by Lemma 4.7, we obtain

The proof is completed by summing over all saturated subsets
${\mathcal {Z}}$
contained in
$U'$
.
5 A bound on the number of singular symmetric matrices in skewed boxes
Let
$n\geq 2$
be a positive integer and let
$S=\mathrm {{Sym}}_2(n)$
denote the space of symmetric
$n\times n$
matrices. Let
$|\cdot |$
denote Euclidean length on
$S({\mathbb {R}})$
obtained by identifying
$S({\mathbb {R}})$
with
${\mathbb {R}}^{\dim S}={\mathbb {R}}^{n(n+1)/2}$
. Let
${\mathcal {D}}\subset S({\mathbb {R}})$
be a bounded open set. For an integer r with
$1\leq r<n$
, let
$S({\mathbb {Z}})_{(r)}$
denote the set of elements in
$S({\mathbb {Z}})$
having rank r. In [Reference Eskin and Katznelson17], Eskin and Katznelson obtained asymptotics for the number of elements in
$Y{\mathcal {D}}\cap S({\mathbb {Z}})_{(r)}$
for
$r\in \{1,\ldots ,n-1\}$
.
In this paper, we will not need exact asymptotics; upper bounds will suffice. In this section, our goal is to obtain upper bounds on the number of elements of
$S({\mathbb {Z}})_{(r)}$
in skew balls.
The group
$\mathrm {{SL}}_n$
acts on S via
$\gamma (A)=\gamma A\gamma ^t$
for
$\gamma \in \mathrm {{SL}}_n$
and
$A\in S$
. Let
$T\subset \mathrm {{SL}}_n({\mathbb {R}})$
denote the subgroup of diagonal matrices with positive coefficients. We denote elements in T by
$s=\mathrm {{diag}}(t_1^{-1},\ldots ,t_n^{-1})$
. We are interested in studying the skew ball
$s(Y{\mathcal {D}})$
. By symmetry, we may assume that
$s\in T'$
(i.e., we have
$t_1\gg t_2\gg \ldots \gg t_n$
). Moreover, in light of Lemma 4.7, we will assume that
$t_1\ll Y^\Theta $
and
$t_n\gg Y^{-\Theta }$
for some absolute constant
$\Theta $
depending only on n.
For
$s\in T$
and
$r\in \{1,\ldots ,n-1\}$
, we define the constants
$C(r,s)$
by

When
$s\in T'$
, these constants satisfy

where as before,
$\delta (s)$
is the character of the torus appearing in the Haar measure of
$\mathrm {{SL}}_n({\mathbb {R}})$
.
Finally, for
$1\leq r<n$
, a positive real number Y, and
$s\in T$
, let
$N_r(Y,s)$
denote the number of elements in
$s(Y{\mathcal {D}})\cap S({\mathbb {Z}})_{(r)}$
. We prove the following result.
Theorem 5.1. Let
$n\geq 2$
and
$1\leq r<n$
be positive integers. Let
$\Theta>0$
be a real number. Let
$Y>1$
be a real number, and let
$s\in T'$
with
$t_1\ll Y^\Theta $
and
$t_n\gg Y^{-\Theta }$
. Then

where the implied constants are independent of s and depend only on n,
${\mathcal {D}}$
,
$\Theta $
and the implied constants in the assumed bounds on
$t_1$
and
$t_n$
.
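For very small parameters the quantity $N_r(Y,s)$ can be computed by brute force, which gives a feel for how skewing changes the count. The sketch below is only an illustration under stated assumptions: it takes $n=2$ and $r=1$, and it chooses the hypothetical domain ${\mathcal {D}}=\{|a|,|b|,|c|<1\}$ for matrices $\begin{pmatrix}a&b\\b&c\end{pmatrix}$, so that $s(Y{\mathcal {D}})$ is the box $|a|<Y/t_1^2$, $|b|<Y/(t_1t_2)$, $|c|<Y/t_2^2$.

```python
from math import ceil

def ints_below(bound):
    """All integers k with |k| < bound, for a positive real bound."""
    m = ceil(bound)
    return [k for k in range(-m, m + 1) if abs(k) < bound]

def count_rank_one(Y, t1, t2):
    """Count integral symmetric 2x2 matrices [[a, b], [b, c]] of rank 1 in
    s(Y * D), with D the open unit box and s = diag(1/t1, 1/t2)."""
    count = 0
    for a in ints_below(Y / (t1 * t1)):
        for b in ints_below(Y / (t1 * t2)):
            for c in ints_below(Y / (t2 * t2)):
                # rank 1 means: nonzero and determinant ac - b^2 equal to 0
                if (a, b, c) != (0, 0, 0) and a * c == b * b:
                    count += 1
    return count

unskewed = count_rank_one(3, 1, 1)    # s = 1
skewed = count_rank_one(3, 2, 0.5)    # t_1 = 2, t_2 = 1/2
```

In the skewed example the bound on $a$ drops below $1$, which forces $a=0$ and hence $b=0$, so every rank-one point lies on the $c$-axis.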
The case
$s = 1$
of Theorem 5.1 follows from the work of Eskin–Katznelson [Reference Eskin and Katznelson17]. Their strategy is to express the set of singular symmetric matrices of rank r as a union of lattices, each of which consists of elements having a fixed row span. They count the number of elements in each such lattice having bounded norm, and then sum over all possible row spans. We follow this strategy, explaining the modifications necessary to bound integer points in skew balls.
Fix positive integers k and m with
$k\leq m$
, and a lattice
$\Lambda $
in
${\mathbb {R}}^m$
of rank k. A basis
$\{\ell _1,\ldots ,\ell _k\}$
of
$\Lambda $
is reduced if the product
$|\ell _1||\ell _2|\cdots |\ell _k|$
is minimal among all integral bases of
$\Lambda $
. It is almost reduced if

where
$d(\Lambda )$
denotes the covolume of
$\Lambda $
in
$\Lambda \otimes {\mathbb {R}}$
, and the implied constant in the inequality depends only on n. If we order an almost reduced basis
$\{\ell _1,\ldots ,\ell _k\}$
by length, then the i-th successive minimum of
$\Lambda $
is within a constant multiple (depending only on n) of
$|\ell _i|$
for every
$i = 1,\ldots ,k$
. To bound the number of elements of
$\Lambda $
in a ball, we use the following result of Schmidt [Reference Schmidt31].
Proposition 5.2. Let
$\Lambda $
be a rank k lattice in
${\mathbb {R}}^m$
and
${\mathcal {D}}$
a bounded open domain in
${\mathbb {R}}^m$
. Let
$\mu _1,\ldots ,\mu _k$
be the successive minima of
$\Lambda $
. Then for
$Y>0$
, we have

We use the notation of Theorem 5.1. Given a lattice
$\Lambda \subset {\mathbb {Z}}^n$
of rank r, let
$S(\Lambda )$
denote the set of symmetric matrices
$B\in S({\mathbb {Z}})$
such that the row space (equivalently, the column space) of B is a full rank lattice of
$\Lambda \otimes {\mathbb {R}}$
. Note that
$S(\Lambda )$
will not be a lattice (for example, it does not contain
$0$
unless
$r=0$
); we denote the lattice spanned by
$S(\Lambda )$
in
$S({\mathbb {Z}})$
by
$S'(\Lambda )$
. For two vectors
$v_1$
and
$v_2$
in
${\mathbb {R}}^n$
, we define

Then
$v_1\ast v_2=v_2\ast v_1\in S(\text {Span}\{v_1,v_2\})$
, and

Fix
$\gamma \in \mathrm {{SL}}_n({\mathbb {R}})$
. (For our applications, we will take
$\gamma \in T$
.) Let
$\Lambda \subseteq {\mathbb {Z}}^n$
be a primitive lattice of rank r. We bound the number of elements in
$\gamma ^{-1}(Y{\mathcal {D}})\cap S(\Lambda )$
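using the bijection
$ \gamma ^{-1}(Y{\mathcal {D}})\cap S(\Lambda )\rightarrow Y{\mathcal {D}}\cap \gamma (S(\Lambda )),\qquad A\mapsto \gamma \cdot A, $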
where
$\gamma \cdot A = \gamma A\gamma ^t$
is the action of
$\gamma $
on A, and instead bounding the number of elements in
$Y{\mathcal {D}}\cap \gamma (S(\Lambda ))$
. We thus study the set
$\gamma (S(\Lambda ))\subset S({\mathbb {R}})$
. The next result, which gives an almost reduced basis for
$\gamma (S'(\Lambda ))$
in terms of an almost reduced basis of
$\gamma \Lambda $
, follows from the proofs of [Reference Eskin and Katznelson17, Proposition 3.3] and [Reference Eskin and Katznelson17, Lemma 3.5].
Theorem 5.3. Fix
$\gamma \in \mathrm {{SL}}_n({\mathbb {R}})$
. Let
$\Lambda \subset {\mathbb {Z}}^n$
be a primitive lattice of rank r, and let
$\{\ell _1,\ldots ,\ell _r\}$
be a basis for
$\gamma \Lambda $
. Then
$\{\ell _i\ast \ell _j\colon 1\leq i\leq j\leq r\}$
is a basis for
$\gamma (S'(\Lambda ))$
. Furthermore,

In particular, if
$\{\ell _1,\ldots ,\ell _r\}$
is almost reduced, then so is
$\{\ell _i\ast \ell _j\colon 1\leq i\leq j\leq r\}$
.
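Reduced bases and successive minima are easy to experiment with in rank $2$, where the classical Lagrange–Gauss algorithm produces a basis whose lengths are the two successive minima. The sketch below is an illustration only and is not the procedure used in [Reference Eskin and Katznelson17]; it works with integer vectors so that all arithmetic is exact, and the function names are hypothetical.

```python
from fractions import Fraction

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def gauss_reduce(u, v):
    """Lagrange-Gauss reduction of a basis (u, v) of a rank-2 lattice in Z^m.
    Assumes u and v are linearly independent integer vectors."""
    u, v = tuple(u), tuple(v)
    while True:
        if dot(u, u) > dot(v, v):
            u, v = v, u                       # keep u the shorter vector
        m = round(Fraction(dot(u, v), dot(u, u)))
        if m == 0:
            return u, v                       # |<u, v>| <= |u|^2 / 2: reduced
        v = tuple(y - m * x for x, y in zip(u, v))

def covolume_squared(u, v):
    """Gram determinant, i.e. d(Lambda)^2 for Lambda = Zu + Zv."""
    return dot(u, u) * dot(v, v) - dot(u, v) ** 2

l1, l2 = gauss_reduce((5, 3), (3, 2))         # a skewed basis of Z^2
```

For a basis reduced in this sense one has $|\ell _1|^2|\ell _2|^2\leq \tfrac {4}{3}d(\Lambda )^2$, so the product of the lengths is within a constant multiple of the covolume.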
Next, by the proof of [Reference Eskin and Katznelson17, Lemma 4.1], we have the following result giving a necessary condition for the set
$Y{\mathcal {D}}\cap \gamma (S(\Lambda ))$
to be nonempty.
Proposition 5.4. Let
$\gamma \in \mathrm {{SL}}_n({\mathbb {R}})$
and let
$\Lambda \subset {\mathbb {Z}}^n$
be a primitive lattice of rank r such that the successive minima of
$\gamma \Lambda $
are
$\mu _1\leq \ldots \leq \mu _r$
. If
$\#(Y{\mathcal {D}}\cap \gamma (S(\Lambda )))>0$
, then
$\mu _i\mu _j\leq c_1 Y$
for every pair
$(i,j)$
with
$i+j\leq r+1$
, for some constant
$c_1$
depending only on n.
We now prove an upper bound on
$\#(Y{\mathcal {D}}\cap \gamma (S(\Lambda )))$
.
Proposition 5.5. Let
$\gamma \in \mathrm {{SL}}_n({\mathbb {R}})$
and let
$\Lambda \subset {\mathbb {Z}}^n$
be a primitive lattice of rank r such that the successive minima of
$\gamma \Lambda $
are
$\mu _1\leq \ldots \leq \mu _r$
. Then

Proof. Let
$U(r)$
denote the set of pairs
$(i,j)$
of positive integers such that
$i\leq j\leq r$
and
$i+j>r+1$
. In other words, elements in
$U(r)$
correspond to the successive minima of the lattice
$\gamma (S'(\Lambda ))$
that are
$\gg Y$
. By Proposition 5.2, Theorem 5.3 and (24), we have

Assume that
$\#(Y{\mathcal {D}}\cap \gamma (S(\Lambda )))>0$
. Then
$\mu _{r+1-j}\mu _j\ll Y$
for all
$1\leq j\leq r$
by Proposition 5.4. Thus,

Since
$i\leq j$
and
$i+j>r+1$
for
$(i,j)\in U(r)$
, we have the following injection:

Since
$\frac {\mu _j}{\mu _i}\geq 1$
for
$j>i$
, the injection (27) implies that the product of the ratios
$\mu _j/\mu _i$
in (25) is at least as large as the product of the ratios
$\mu _i/\mu _{r+1-j}$
in (26). The result follows.
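The combinatorial step can be tested on small cases. The sketch below assumes that the injection sends $(i,j)\in U(r)$ to the pair $(r+1-j,\,i)$, which is what the comparison of the ratios $\mu _i/\mu _{r+1-j}$ with the ratios $\mu _j/\mu _i$ suggests; this choice of map, and the sample values of $\mu $, are assumptions made for illustration.

```python
from fractions import Fraction
from math import prod

def U(r):
    """Pairs (i, j) with i <= j <= r and i + j > r + 1."""
    return [(i, j) for i in range(1, r + 1)
                   for j in range(i, r + 1) if i + j > r + 1]

def phi(pair, r):
    """Assumed injection from U(r) into pairs (a, b) with a < b."""
    i, j = pair
    return (r + 1 - j, i)

def ratio_products(mu):
    """For integers mu_1 <= ... <= mu_r, return the product over U(r) of
    mu_i / mu_{r+1-j} and the product over all i < j of mu_j / mu_i."""
    r = len(mu)
    small = prod(Fraction(mu[i - 1], mu[r - j]) for (i, j) in U(r))
    big = prod(Fraction(mu[j - 1], mu[i - 1])
               for i in range(1, r + 1) for j in range(i + 1, r + 1))
    return small, big

small, big = ratio_products([1, 2, 3, 5])
```

Because every factor $\mu _j/\mu _i$ with $i<j$ is at least $1$, injectivity of the map gives `small <= big`.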
We now sum over the appropriate lattices
$\Lambda \subset {\mathbb {Z}}^n$
having rank r. To this end, we fix an element
$s=\mathrm {{diag}}(t_1^{-1},t_2^{-1},\dots ,t_n^{-1})\in T'$
. We will apply the previous results with
$\gamma =s^{-1}$
. Set
$L=(L_1,\ldots ,L_r)$
with
$0<L_1\leq L_2\leq \cdots \leq L_r$
. Let
$\Sigma (L,s)$
denote the set of primitive lattices
$\Lambda \subset {\mathbb {Z}}^n$
of rank r whose successive minima
$\mu _1,\ldots ,\mu _r$
of
$s^{-1}\Lambda $
satisfy
$L_i\leq \mu _i<2L_i$
for each i.
Lemma 5.6. Let
$L=(L_1,\ldots ,L_r)$
and
$s=\mathrm {{diag}}(t_1^{-1},\ldots ,t_n^{-1})\in T'$
. Then there is a constant
$c'>0$
depending only on n such that if
$\#\Sigma (L,s)> 0$
, then
$L_it_j^{-1}> c'$
for all
$(i,j)$
with
$i+j\geq n+1$
.
Proof. Since
$\#\Sigma (L,s)> 0$
, there exists an integral lattice
$\Lambda \subset {\mathbb {Z}}^n$
of rank r with basis
$\{\ell _1,\ldots ,\ell _r\}$
such that
$|s^{-1}\ell _i|<2L_i$
for
$i\in \{1,\ldots ,r\}$
. For
$1\leq j\leq n$
, let
$u_{ij}$
denote the (integral) j-th entry of
$\ell _i$
. Then
$|u_{ij}|\leq 2L_it_j^{-1}$
for every
$1\leq i\leq r$
and
$1\leq j\leq n$
. The assumption that
$s\in T'$
implies that
$L_it_j^{-1}\ll L_{i'}t_{j'}^{-1}$
whenever
$i\leq i'$
and
$j\leq j'$
.
Suppose that there is an integer k with
$1\leq k\leq r$
such that
$L_kt_{n+1-k}^{-1}<c''$
for some sufficiently small constant
$c''>0$
. Then
$|u_{ij}|<1$
, and thus,
$u_{ij}=0$
for all
$(i,j)$
with
$1\leq i\leq k$
and
$1\leq j\leq n+1-k$
. However, this implies that the vectors
$\ell _1,\ldots ,\ell _k$
are not linearly independent, a contradiction. Hence, such a k does not exist and
$L_kt_{n+1-k}^{-1}\gg 1$
for all k, implying the result.
We now determine an upper bound for
$\#\Sigma (L,s)$
.
Proposition 5.7. Let
$L=(L_1,\ldots ,L_r)$
and
$s=\mathrm {{diag}}(t_1^{-1},\ldots ,t_n^{-1})\in T'$
. Then

where
$C(r,s)$
is defined as in (21).
Proof. We count lattices
$\Lambda $
by counting r-tuples of vectors
$(\ell _1,\ldots ,\ell _r)$
such that each
$\ell _i\in s^{-1}{\mathbb {Z}}^n$
satisfies
$L_i\leq |\ell _i| < 2L_i$
and such that
$\{\ell _1,\ldots ,\ell _r\}$
is a reduced basis of the lattice it generates. For each
$i = 1,\ldots ,r$
, let
$\alpha (i)$
be the largest integer such that
$L_it_{\alpha (i)}^{-1}\leq c'$
, where
$c'$
is as in Lemma 5.6, or let
$\alpha (i)=0$
if no such integer exists. By Proposition 5.2, the number of possibilities for
$\ell _i$
is

However, once
$\ell _1$
is fixed, and given a vector
$\ell _2$
, at most two of
$\ell _2-k\ell _1$
can be part of a reduced basis for
$k\in {\mathbb {Z}}$
. Since
$\gg L_2/L_1$
vectors
$\ell _2-k\ell _1$
satisfy the same size bound as
$\ell _2$
(namely, those with
$k\ll L_2/L_1$
), the number of choices for the pair
$(\ell _1,\ell _2)$
that are part of a reduced basis is

Continuing in this way, we obtain the bound

By Lemma 5.6, we have
$\alpha (i)\leq n-i$
for
$i\in \{1,\ldots ,r\}$
. Therefore,

We are now ready to prove the main result of this section.
Proof of Theorem 5.1.
Let
$L=(L_1,\ldots ,L_r)$
be a tuple such that
$0<L_1\leq L_2\leq \cdots \leq L_r$
. Then, by Lemma 5.6, Proposition 5.4 and the definition of
$T'$
, we see that for there to exist a lattice
$\Lambda \in \Sigma (L,s)$
such that
$\#(Y{\mathcal {D}}\cap s^{-1}(S(\Lambda )))>0$
, we must have

for some absolute constants
$\Theta _1,\Theta _2>0$
. For any such
$\Lambda $
, Proposition 5.5 states that

Thus,

where the sum is over r-tuples
$L=(L_1,\ldots ,L_r)$
with
$L_1\leq L_2\leq \cdots \leq L_r$
that partition the region
$\{(\mu _1,\ldots ,\mu _r)\in [Y^{-\Theta _1},Y^{\Theta _2}]^r:\mu _1\leq \ldots \leq \mu _r\}$
into dyadic ranges. The sum over L has length
$O(\log ^r Y)$
. Using the upper bound on
$\#\Sigma (L,s)$
in Proposition 5.7, we obtain

This concludes the proof of Theorem 5.1.
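The dyadic decomposition used in this proof can be made explicit. The following sketch enumerates nondecreasing tuples $L=(L_1,\ldots ,L_r)$ on the grid $Y^{-\Theta _1}2^k$; the grid convention and the numerical values of $Y$, $\Theta _1$, $\Theta _2$ are assumptions chosen so that the arithmetic is exact.

```python
from itertools import combinations_with_replacement
from math import comb, floor, log2

def num_dyadic_ranges(Y, theta1, theta2):
    """Number m of intervals [lo * 2^k, lo * 2^(k+1)), k = 0..m-1, needed to
    cover [lo, hi] with lo = Y^(-theta1) and hi = Y^theta2."""
    return floor((theta1 + theta2) * log2(Y)) + 1

def dyadic_tuples(Y, theta1, theta2, r):
    """Yield the nondecreasing r-tuples (L_1, ..., L_r) of left endpoints."""
    m = num_dyadic_ranges(Y, theta1, theta2)
    lo = Y ** (-theta1)
    for ks in combinations_with_replacement(range(m), r):
        yield tuple(lo * 2 ** k for k in ks)

tuples = list(dyadic_tuples(16, 1, 2, 2))
```

The number of tuples is $\binom {m+r-1}{r}\leq m^r$ with $m\ll \log Y$, which matches the stated length $O(\log ^r Y)$ of the sum over L.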
6 A uniformity estimate for even degree polynomials
We fix an even integer
$n=2g+2$
with
$g\geq 1$
. Our goal is to prove Theorem 5(c) by obtaining a bound on the number of integral binary n-ic forms of bounded height whose discriminant is weakly divisible by the square of a large squarefree integer.
Throughout this section, we write
$V:=V_n$
and
$W:=W_{n+1}$
. Let
$m>0$
be an odd squarefree integer, and let
${\mathcal {W}}_m^{\mathrm {{{(2)}}}}:={\mathcal {W}}_{m,n}^{\mathrm {{{(2)}}}}$
. We also define the following auxiliary sets:




where
$\kappa>0$
is a small constant (whose exact value will be optimized later) and
$\mathrm {{Gal}}$
denotes the Galois group. Then, for any
$M>0$
, we have the following containment:

The number of elements in
$V({\mathbb {Z}})^{\mathrm {{red}}}$
having height less than X was bounded by
$O(X^n)$
in [Reference Kuba22, Theorem 1]. We next prove a bound on the number of elements in
$V({\mathbb {Z}})^{\Delta \text {\,small}}$
of bounded height.
Lemma 6.1. The number of integral binary n-ic forms with height less than X and absolute discriminant less than
$X^{2n-2-\kappa }$
is
$O(X^{n+1 - \frac {\kappa }{2n-2}})$
.
Proof. Set
$\eta := \kappa /(2n-2)$
. The number of integral binary n-ic forms
$a_0x^n + \cdots + a_ny^n$
with height less than X such that
$|a_0|\leq X^{1-\eta }$
is
$O(X^{n+1-\eta })$
. Hence, we assume
$|a_0|>X^{1-\eta }.$
Now fix integers
$a_0,\ldots ,a_{n-1}$
with
$|a_i|\leq X$
and
$|a_0|>X^{1-\eta }$
. The discriminant of
$a_0x^n + \cdots + a_ny^n$
is a polynomial
$F(a_n)$
in
$a_n$
of degree
$n-1$
with leading coefficient
$C_na_0^{n-1}$
for some nonzero constant
$C_n$
. Let
$r_1,\ldots ,r_{n-1}\in {\mathbb {C}}$
be the
$n-1$
roots of
$F(x)$
. Then
$ F(a_n) = C_na_0^{n-1}(a_n-r_1)\cdots (a_n-r_{n-1}). $
Since
$|F(a_n)| < X^{2n-2-\kappa }$
, we have
$(a_n-r_1)\cdots (a_n-r_{n-1})\ll X^{n-1-(n-1)\eta }$
. Hence,
$|a_n - r_i| \ll X^{1-\eta }$
for some
$i=1,\ldots ,n-1$
. The number of such integers
$a_n$
is
$O(X^{1-\eta })$
. Since there are
$O(X^n)$
choices for
$a_0,\ldots ,a_{n-1}$
, we obtain the desired bound.
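The key input of this proof, that the discriminant is a polynomial of degree $n-1$ in $a_n$ with leading coefficient $C_na_0^{n-1}$, can be verified numerically for $n=4$. The sketch below computes discriminants as $(-1)^{n(n-1)/2}\,\mathrm {Res}(f,f')/a_0$ using a Sylvester determinant over ${\mathbb {Q}}$; this normalization, the value $C_4=256$ (read off from the discriminant $256\,a_0^3a_4^3$ of $a_0x^4+a_4y^4$), and the sample coefficients are assumptions of the illustration rather than statements from the text.

```python
from fractions import Fraction

def det(M):
    """Exact determinant over Q by Gaussian elimination."""
    M = [[Fraction(x) for x in row] for row in M]
    size, d = len(M), Fraction(1)
    for c in range(size):
        pivot = next((r for r in range(c, size) if M[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, size):
            factor = M[r][c] / M[c][c]
            for k in range(c, size):
                M[r][k] -= factor * M[c][k]
    return d

def disc(a):
    """Discriminant of a_0 x^n + ... + a_n y^n (requires a_0 != 0), computed
    from the Sylvester matrix of f(x, 1) and its derivative."""
    n = len(a) - 1
    da = [(n - i) * a[i] for i in range(n)]          # coefficients of f'
    size = 2 * n - 1
    rows = [[0] * i + list(a) + [0] * (size - i - (n + 1)) for i in range(n - 1)]
    rows += [[0] * i + da + [0] * (size - i - n) for i in range(n)]
    sign = -1 if (n * (n - 1) // 2) % 2 else 1
    return sign * det(rows) / a[0]

def finite_differences(values, order):
    """Apply the forward difference operator `order` times."""
    for _ in range(order):
        values = [b - a for a, b in zip(values, values[1:])]
    return values

# Fix a_0, ..., a_3 (hypothetical values) and vary a_4 = t.
values = [disc([2, 1, -3, 5, t]) for t in range(6)]
third = finite_differences(values, 3)     # constant: 3! * C_4 * a_0^3
fourth = finite_differences(values, 4)    # zero: the degree in a_4 is 3
```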
A direct application of a quantitative version of the Ekedahl sieve as in [Reference Bhargava4, Theorem 3.3] implies the following bound on the number of elements of bounded height belonging to
${\mathcal {W}}^{(1\#)}_{m}$
for large m.
Lemma 6.2. We have
$ \#\!\!\bigcup _{\substack {m>\sqrt {M}\\ m\;\mathrm {squarefree}}} \!\!\{f\in {\mathcal {W}}_{m}^{(1\#)}:H(f)<X\} = O\Big (\frac {X^{n+1}}{\sqrt {M}} + X^n\Big ). $
To prove Theorem 5(c), it thus remains to obtain an upper bound for

In §3.5, we defined a map
$\sigma _m$
from the set of elements
$f\in {\mathcal {W}}_m^{\mathrm {{{(2)}}}}$
with
$\gcd (m,f(0,1))=1$
to
$W({\mathbb {Z}})$
such that
$f_{\sigma _{m}(f)} = xf$
and
$ |q|(\sigma _{m}(f)) = m.$
For any
$M>0$
, define the set
${{\mathcal {L}}(M)}$
by

Then (36) is
$\ll $

where

is as defined immediately after (16), where Y is now taken to be
$X^{1/(n+1)}$
throughout this section. Moreover, exactly as in the paragraph leading up to (17), we break up
${\mathcal {I}}_X({{\mathcal {L}}(M)})$
into three parts – corresponding to the main body, the shallow cusp and the deep cusp – and again write

The rest of this section is dedicated to obtaining an upper bound on
${\mathcal {I}}_X({{\mathcal {L}}(M)} )$
. Every element
$(A,B)\in {{\mathcal {L}}(M)} $
satisfies
$\det (B)=0$
since
$f_{A,B}$
is divisible by x. In §4, we used vanishing conditions on the coefficients
$\{a_{ij},b_{ij}\}$
of W to estimate the number of integral pairs
$(A,B)$
in skewed domains of
$W({\mathbb {R}})$
. Now, since we also need to impose the condition that B has determinant
$0$
, we use the setup of §5 to count the number of such B’s in skewed bounded domains by fibering over the row space of B.
In §6.1, we thus further break up the three parts of
${\mathcal {I}}_X({{\mathcal {L}}(M)})$
into sums over row spaces of the singular matrix B. We also obtain some preliminary bounds on
${\mathcal {I}}_X({{\mathcal {L}}(M)})$
and give some conditions that ensure that a pair
$(A,B)$
has discriminant
$0$
. In §6.2, §6.3 and §6.4, we then prove the desired upper bounds on
${\mathcal {I}}^{\mathrm {{main}}}_X({{\mathcal {L}}(M)})$
,
${\mathcal {I}}^{\mathrm {{scusp}}}_X({{\mathcal {L}}(M)})$
, and
${\mathcal {I}}^{\mathrm {{dcusp}}}_X({{\mathcal {L}}(M)})$
, respectively. In conjunction with (35), (37) and Lemmas 6.1–6.2, this will yield Theorem 5(c).
6.1 Setup and preliminary bounds
Coordinate systems, weight functions and summing over row spaces
Let
$S({\mathbb {Z}})$
denote the set of
$(n+1)\times (n+1)$
integral symmetric matrices. For any primitive lattice
$\Lambda $
of
${\mathbb {Z}}^{n+1}$
, let
$S(\Lambda )$
denote the sublattice of
$S({\mathbb {Z}})$
consisting of elements
$B\in S({\mathbb {Z}})$
with row space contained in
$\Lambda $
. For
$L=(L_1,\ldots ,L_n)$
with
$L_i\in {\mathbb {R}}$
and
$L_1\leq L_2\leq \cdots \leq L_n$
and
$s\in T'$
, let
$\Sigma (L,s)$
denote the set of primitive lattices
$\Lambda \subset {\mathbb {Z}}^{n+1}$
of rank n such that the successive minima
$\mu _1,\ldots ,\mu _n$
of
$s^{-1}\Lambda $
satisfy
$L_i\leq \mu _i\leq 2L_i$
for each i. We define
${\mathcal {S}}(L,s)\subset S({\mathbb {Z}})$
by

We next introduce coordinate systems and weight functions. Let

denote the set of coordinates of n-tuples of vectors in
${\mathbb {R}}^{n+1}$
. We define

The significance of
$w_L$
is the following. Let
$\Lambda \in \Sigma (L,s)$
be a lattice with an integral basis
$\{\ell _1,\ldots ,\ell _n\}$
such that
$\{s^{-1}\ell _1,\ldots ,s^{-1}\ell _n\}$
is a Minkowski-reduced basis for
$s^{-1}\Lambda $
. Then the jth coefficient of
$\ell _i$
is
$\ll L_it_j^{-1}=w_L(\ell _{ij})$
. In particular, for the absolute value of the jth coefficient of
$\ell _i$
to be nonzero, we must have
$w_L(\ell _{ij})\gg 1$
. When L is implicit, we will write w in place of
$w_L$
.
Let
${\mathcal {K}}$
denote the set of coefficients
$\{a_{ij}:1\leq i\leq j\leq n+1\}$
, and recall the weight function

Define a partial order on
${\mathcal {K}}$
by setting
$a_{ij}\lesssim a_{i'j'}$
if
$i\leq i'$
and
$j\leq j'$
, and on
${\mathcal {M}}$
by setting
$\ell _{ij}\lesssim \ell _{i'j'}$
if
$i\leq i'$
and
$j\leq j'$
. The significance of this partial order is that if
$\alpha ,\beta \in {\mathcal {K}}$
with
$\alpha \lesssim \beta $
and
$s\in T'$
, then
$w(\alpha )\ll w(\beta )$
and similarly,
$w_L(\alpha )\ll w_L(\beta )$
if
$\alpha ,\beta \in {\mathcal {M}}.$
We say that a subset
${\mathcal {Z}}$
of
${\mathcal {K}}\cup {\mathcal {M}}$
is saturated if for any
$\alpha \in {\mathcal {Z}}$
, all the
$\alpha '\in {\mathcal {K}}\cup {\mathcal {M}}$
with
$\alpha '\lesssim \alpha $
are also contained in
${\mathcal {Z}}$
.
Let
${\mathcal {D}}\subset S({\mathbb {R}})$
be a bounded domain such that
${\mathcal {B}}\subset {\mathcal {D}}\times {\mathcal {D}}$
. We pick positive constants
$c_{ij}$
for
$1\leq i\leq j\leq n+1$
and
$c_i^{\prime }$
for
$1\leq i\leq n$
such that
-
(a) if
$|Yw(a_{ij})|<c_{ij}$ , then the
$a_{ij}$-coordinate of any integral element in
$s(Y{\mathcal {D}})$ is
$0$ ;
-
(b) if
$|w_L(\ell _{ij})|<c_j^{\prime }$ , then the jth coefficient of
$\ell _i$ for any lattice
$\Lambda \in \Sigma (L,s)$ is
$0$ ;
-
(c)
$c_i^{\prime }<c'$ for all
$i=1,\ldots ,n$ , where
$c'$ is the constant in Lemma 5.6;
-
(d)
$c_1c_{g+1,g+1}\leq c_{g+1}^{\prime \,2}$ , where
$c_1$ is the constant in Proposition 5.4;
-
(e) for any
$i\leq i'$ and
$j\leq j'$ , we have
$w(a_{ij})/c_{ij} \leq w(a_{i'j'})/c_{i'j'}$ and
$w(\ell _{ij})/c_j^{\prime } \leq w(\ell _{i'j'})/c_{j'}^{\prime }.$
More explicitly, we choose
$c_{n+1,n+1}$
and
$c_n'$
to be sufficiently small and take

For any nondecreasing n-tuple L of positive real numbers, and a saturated subset
${\mathcal {Z}}$
of
${\mathcal {K}}\cup {\mathcal {M}}$
, we define the following subset
$T_{\mathcal {Z}}(L,Y)$
of
$T'$
:

where
$\Theta $
is the absolute constant from Lemma 4.7.
For X, Y, L,
${\mathcal {Z}}$
as above and any subset
${\mathcal {L}}$
of
$W({\mathbb {Z}})$
, we define the quantity

In the proof of Theorem 5.1, we showed that unless
$Y^{-\Theta _1}<L_1$
and
$Y^{\Theta _2}>L_n$
for some absolute positive constants
$\Theta _1$
and
$\Theta _2$
, we have
${\mathcal {S}}(L,s)=\emptyset $
, which implies that
$N({{\mathcal {L}}(M)} ,L,{\mathcal {Z}},X)=0$
. Therefore,

where the inner sum is over saturated subsets
${\mathcal {Z}}$
of
${\mathcal {K}}\cup {\mathcal {M}}$
, and the outer sum is over n-tuples
$L=(L_1,\ldots ,L_n)$
with
$L_1\leq L_2\leq \cdots \leq L_n$
that partition the region
$\{(\mu _1,\ldots ,\mu _n)\in [Y^{-\Theta _1},Y^{\Theta _2}]^n:\mu _1\leq \ldots \leq \mu _n\}$
into dyadic ranges.
We may therefore bound the main-body, the shallow-cusp and the deep-cusp parts of
${\mathcal {I}}_X({{\mathcal {L}}(M)} )$
in terms of sums over
$N({{\mathcal {L}}(M)} ,L,{\mathcal {Z}},X)$
. We have

A preliminary upper bound
We now prove some preliminary results on
$N({{\mathcal {L}}(1)} ,L,{\mathcal {Z}},X)$
. We start with an upper bound on
$N({{\mathcal {L}}(1)} ,L,{\mathcal {Z}},X)$
, which also bounds
$N({{\mathcal {L}}(M)} ,L,{\mathcal {Z}},X)$
by directly counting the number of possible A’s and then using the results of §5 to count B’s. For a saturated subset
${\mathcal {Z}}$
of
${\mathcal {K}}\cup {\mathcal {M}}$
, define

In what follows, the n-tuple L will be clear from the context, and we simply write w in place of
$w_L$
.
Proposition 6.3. Suppose that
${\mathcal {Z}}$
is a saturated subset of
${\mathcal {K}}\cup {\mathcal {M}}$
. Then

Proof. By Proposition 4.1, the number of elements
$A\in s(Y{\mathcal {D}})\cap S({\mathbb {Z}})$
is

By the definition of
$T_{\mathcal {Z}}(L,Y)$
, it follows from (29) that for every
$s\in T_{\mathcal {Z}}(L,Y)$
, we have

For each
$\Lambda \in \Sigma (L,s)$
, Proposition 5.5 implies that the number of integral symmetric matrices
$B\in s(Y{\mathcal {D}})$
whose row space is contained in
$\Lambda $
is

Combining (42), (43) and (44), and recalling that
$X = Y^{n+1}$
, gives (41).
Conditions for vanishing discriminant
Next, we give some conditions on
${\mathcal {Z}}$
that ensure
$N({{\mathcal {L}}(1)} ,L,{\mathcal {Z}},X) = 0$
. We start with the following algebraic result that gives sufficient conditions on a pair
$(A,B)\in W({\mathbb {C}})$
that ensure it has discriminant
$0$
.
Lemma 6.4. Suppose that
$(A,B)$
is an element of
$W({\mathbb {C}})$
such that one of the following three conditions is satisfied:
-
(a) The kernel of B has dimension at least
$2$ .
-
(b) There is a nonzero vector
$v\in {\mathbb {C}}^{n+1}$ that is in the kernel of B and isotropic with respect to A.
-
(c) There exists
$k\in \{ 1,\ldots ,g+1\}$ such that
$a_{ij} = b_{ij} = 0$ for all
$1\leq i\leq k$ and all
$1\leq j\leq n+1-k$ .
Then
$\Delta (A,B) = 0$
.
Proof. This is a standard result in the algebraic geometric theory of pencils of quadrics. We give another proof using the explicit formula for
$f(x,y) = f_{A,B}(x,y).$
The claim regarding Condition (c) is Lemma 4.6. If the kernel of B has dimension at least
$2$
, then the quadratic form defined by A restricted to the kernel of B admits a nonzero isotropic vector in
${\mathbb {C}}^{n+1}$
. Thus Condition (a) implies Condition (b). Suppose now that Condition (b) is satisfied. Then the
$y^{n+1}$
-coefficient of
$f(x,y)$
is
$0$
since B is singular. The
$xy^n$
-coefficient of
$f(x,y)$
equals, up to sign, the alternating sum of the determinants of the matrices obtained by replacing the i-th column of B by the i-th column of A. By translating the vector v to
$(1,0,0,\ldots ,0)$
using an element of
$\mathrm {{SL}}_{n+1}({\mathbb {C}})$
, we may assume that the first column (and row) of B is
$0$
and the
$(1,1)$
-entry of A is
$0$
. It is then easy to see that the determinant of the matrix obtained by replacing the i-th column of B by the i-th column of A is
$0$
for any i. Hence,
$\Delta (A,B) = \Delta (f) = 0$
.
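The column-replacement computation in this proof can be checked on a small example. The sketch below uses $3\times 3$ matrices purely for illustration (the lemma concerns size $n+1\geq 5$), with $v=(1,0,0)$ in the kernel of B; it computes the plain sum of the determinants obtained by replacing the i-th column of B by the i-th column of A, which agrees up to sign with the coefficient in question by multilinearity of the determinant. The matrices and function names are hypothetical.

```python
def det(M):
    """Determinant of a small integer matrix by cofactor expansion."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j in range(len(M)):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

def replace_column(B, A, i):
    """Copy of B whose i-th column is replaced by the i-th column of A."""
    size = len(B)
    return [[A[r][c] if c == i else B[r][c] for c in range(size)]
            for r in range(size)]

def column_swap_sum(A, B):
    """Sum over i of det(B with column i taken from A)."""
    return sum(det(replace_column(B, A, i)) for i in range(len(B)))

# First row and column of B vanish, so v = e_1 lies in the kernel of B.
B = [[0, 0, 0], [0, 1, 2], [0, 2, 5]]
# The (1,1)-entry of A is 0, so v is isotropic with respect to A.
A_isotropic = [[0, 1, 2], [1, 3, 4], [2, 4, 6]]
# Same matrix with (1,1)-entry 1: v is no longer isotropic.
A_generic = [[1, 1, 2], [1, 3, 4], [2, 4, 6]]
```

In the isotropic case each of the three determinants vanishes separately, as the proof asserts; in the generic case the sum is nonzero.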
We now translate these conditions into the vanishing of
$N({{\mathcal {L}}(1)} ,L,{\mathcal {Z}},X)$
for certain sets
${\mathcal {Z}}$
. To this end, define the set
${\mathcal {Z}}_1\subset {\mathcal {K}}\cup {\mathcal {M}}$
by

Lemma 6.5. Let
${\mathcal {Z}}$
be a saturated subset of
${\mathcal {K}}\cup {\mathcal {M}}$
satisfying one of the following two conditions:
-
(a) The set
${\mathcal {Z}}$ is not contained in
${\mathcal {Z}}_1$ .
-
(b) There exists
$k\in \{1,\ldots ,g+1\}$ such that
$a_{kk}\in {\mathcal {Z}}$ and
$\ell _{n+1-k,k}\in {\mathcal {Z}}$ .
Then
$N({{\mathcal {L}}(1)} ,L,{\mathcal {Z}},X) = 0$
.
Proof. If
${\mathcal {Z}}$
contains some
$\ell _{ij}\notin {\mathcal {Z}}_1$
, then for every
$s\in T_{\mathcal {Z}}(L,Y)$
, the set
$\Sigma (L,s)$
(and hence
${\mathcal {S}}(L,s)$
) is empty by Lemma 5.6. This implies that
$N({{\mathcal {L}}(1)} ,L,{\mathcal {Z}},X)=0$
. If
${\mathcal {Z}}$
contains some
$a_{ij}\notin {\mathcal {Z}}_1$
, then every integral
$(A,B)\in s(Y{\mathcal {D}})\times s(Y{\mathcal {D}})$
has discriminant
$0$
by Condition (c) of Lemma 6.4. Once again, this implies that
$N({{\mathcal {L}}(1)} ,L,{\mathcal {Z}},X)=0$
.
Let k be an integer satisfying Condition (b) of the lemma, and let
$s\in T_{\mathcal {Z}}(L,Y)$
. Let
$(A,B)$
be such that
$A\in s(Y{\mathcal {D}})$
and
$B\in {\mathcal {S}}(L,s)$
. Since
$\ell _{n+1-k,k}\in {\mathcal {Z}}$
, it follows that there exists a nonzero vector
$v\in {\mathbb {C}}^{n+1}$
of the form
$(v_1,\ldots ,v_k,0,\ldots ,0)$
that is in the kernel of B. Since
$a_{kk}\in {\mathcal {Z}}$
, it follows that v is isotropic with respect to A. By Condition (b) of Lemma 6.4, it follows that
$\Delta (A,B) = 0$
, implying that
$N({{\mathcal {L}}(1)} ,L,{\mathcal {Z}},X)=0$
, as desired.
6.2 Bounding the number of distinguished elements in the main body
In this subsection, we bound the number of distinguished elements in the main body:
Theorem 6.6. We have
${\mathcal {I}}_X^{\mathrm {{main}}}({{\mathcal {L}}(1)} )=O\big (X^{n+1-1/(4n) + \epsilon }\big )$
.
As
${{\mathcal {L}}(M)} \subset {{\mathcal {L}}(1)} $
for
$M\geq 1$
, it follows that
${\mathcal {I}}_X^{\mathrm {{main}}}({{\mathcal {L}}(M)} )$
satisfies the same bound.
We will use the Selberg sieve to show that distinguished elements are negligible in number in the main body. However, applying the Selberg sieve requires asymptotics along with a power saving error term. Our methods in §5 do not yield such results.
Hence, we will instead fiber over
$B\in s(Y{\mathcal {D}})\cap S({\mathbb {Z}})$
having determinant
$0$
, apply the Selberg sieve to prove that there are negligibly many
$A\in s(Y{\mathcal {D}})\cap S({\mathbb {Z}})$
such that
$(A,B)$
is distinguished, and then bound the number of possible B’s using the results of Section 5. To carry out the middle step, we require the following lower bound on the number of nondistinguished elements modulo primes p that is independent of p and B.
Lemma 6.7. Let
$B_0$
be an element in
$S({\mathbb {F}}_p)$
with
${\mathbb {F}}_p$
-rank n. Let
$S_{B_0}^{\mathrm {{ndist}}}({\mathbb {F}}_p)$
denote the set of elements
$A\in S({\mathbb {F}}_p)$
such that
$(A,B_0)$
has nonzero discriminant and A and
$B_0$
do not have a common isotropic
$(g+1)$
-dimensional subspace. Then

Proof. For an element
$B\in S({\mathbb {F}}_p)$
with
${\mathbb {F}}_p$
-rank n and kernel spanned by v, let
$d(B)$
denote the discriminant of the corresponding quadratic form on
${\mathbb {F}}_p^{n+1}/({\mathbb {F}}_p v)$
. If
$B_1,B_2\in S({\mathbb {F}}_p)$
have
${\mathbb {F}}_p$
-rank n and
$d(B_1)/d(B_2)\in {\mathbb {F}}_p^{\times 2}$
, then
$B_1$
and
$B_2$
are
$\mathrm {{SL}}_{n+1}({\mathbb {F}}_p)$
-equivalent. Indeed, by using
$\mathrm {{SL}}_{n+1}({\mathbb {F}}_p)$
transformations, we may assume the last row and column of
$B_1$
and
$B_2$
are all
$0$
. The nondegenerate forms defined by the top left
$n\times n$
blocks of
$B_1$
and
$B_2$
have discriminants
$d(B_1)$
and
$d(B_2)$
, which are in the same quadratic residue class. Hence, they are equivalent via an element
$\gamma \in \mathrm {{GL}}_n({\mathbb {F}}_p)$
. Expanding
$\gamma $
to an element in
$\mathrm {{SL}}_{n+1}({\mathbb {F}}_p)$
by appending an additional row and column whose entries are all
$0$
, except for the
$(n+1,n+1)$
-entry which is
$\det \gamma ^{-1}$
, gives an element in
$\mathrm {{SL}}_{n+1}({\mathbb {F}}_p)$
that takes
$B_1$
to
$B_2$
.
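The extension step can be illustrated over a small prime field. In the sketch below the prime $p=7$, the matrices and the helper names `extend` and `pad_zero` are assumptions chosen for illustration, and the action is taken to be $B\mapsto \gamma B\gamma ^t$ as used later in the proof. It checks that the extended matrix has determinant $1$ and carries the zero-padded form of $B_1$ to the zero-padded form of $\gamma B_1\gamma ^t$.

```python
p = 7  # illustrative prime, not from the text


def mat_mul(X, Y):
    """Matrix product with entries reduced modulo p."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) % p
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    return [list(col) for col in zip(*X)]


def det(m):
    """Determinant of a small square integer matrix by cofactor expansion."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([r[:j] + r[j + 1:] for r in m[1:]])
               for j in range(len(m)))


def extend(gamma):
    """Append a zero row and column, with det(gamma)^(-1) mod p in the corner."""
    n = len(gamma)
    d_inv = pow(det(gamma) % p, p - 2, p)  # inverse via Fermat's little theorem
    return [row + [0] for row in gamma] + [[0] * n + [d_inv]]


def pad_zero(B):
    """Append a zero row and column, giving a form of rank n in dimension n + 1."""
    n = len(B)
    return [row + [0] for row in B] + [[0] * (n + 1)]


gamma = [[2, 1], [0, 3]]   # element of GL_2(F_7)
B1 = [[1, 0], [0, 3]]      # nondegenerate top left block
gamma_ext = extend(gamma)
B2 = mat_mul(mat_mul(gamma, B1), transpose(gamma))
B2_ext = mat_mul(mat_mul(gamma_ext, pad_zero(B1)), transpose(gamma_ext))
print(det(gamma_ext) % p, B2_ext == pad_zero(B2))
```

The discriminants of the two top left blocks differ by $\det (\gamma )^2$, which is why only the square class of $d(B)$ matters.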
Let
$B_0\in S({\mathbb {F}}_p)$
have
${\mathbb {F}}_p$
-rank n. For each binary n-ic form
$f(x,y) = a_0x^n + \cdots + a_ny^n$
over
${\mathbb {F}}_p$
that splits completely over
${\mathbb {F}}_p$
such that
$\Delta (xf(x,y))\neq 0$
and
$a_0\neq 0$
, we construct a nondistinguished element
$(A_0,B_0)$
with
$f_{A_0,B_0}=xf(x,y)$
. Let f be such a form. Then
$a_n\neq 0$
. Let
$\alpha = d(B_0)/a_n$
. As noted in §3.1, there exist at least two (in fact
$2^{n-1}$
)
$\mathrm {{SL}}_n({\mathbb {F}}_p)$
-orbits of
$(A,B)\in W_n({\mathbb {F}}_p)$
such that
$f_{A,B} = \alpha f(x,y)$
. Pick two inequivalent representatives
$(A_1,B_1)$
and
$(A_2,B_2)$
. Let
$A_1'$
and
$A_2'$
be the
$(n+1)$
-ary quadratic forms obtained from
$A_1$
and
$A_2$
, respectively, by appending an additional row and column whose entries are all
$0$
except for the
$(n+1,n+1)$
-entry which is
$\alpha ^{-1}$
. Let
$B_1'$
and
$B_2'$
be the
$(n+1)$
-ary quadratic forms obtained from
$B_1$
and
$B_2$
, respectively, by appending an additional row and column whose entries are all
$0$
. Then
$f_{A_1',B_1'} = f_{A_2',B_2'} = xf(x,y).$
Since
$(A_1,B_1)$
and
$(A_2,B_2)$
are
$\mathrm {{SL}}_n({\mathbb {F}}_p)$
-inequivalent, it follows that
$(A_1',B_1')$
and
$(A_2',B_2')$
are
$\mathrm {{SL}}_{n+1}({\mathbb {F}}_p)$
-inequivalent. Hence, without loss of generality, we may assume that
$(A_1',B_1')$
is nondistinguished. Now
$d(B_1')=\alpha a_n = d(B_0)$
, and so there exists
$\gamma \in \mathrm {{SL}}_{n+1}({\mathbb {F}}_p)$
such that
$\gamma B_1'\gamma ^t = B_0$
. Then
$A_0 = \gamma A_1'\gamma ^t$
does the job.
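The effect of appending the entry $\alpha ^{-1}$ can be verified directly in a small case. The sketch below assumes the convention $f_{A,B}(x,y)=\det (Ax-By)$; the prime, the matrices and the value of $\alpha $ are illustrative. It checks the identity $f_{A',B'}(x,y)=\alpha ^{-1}x\,f_{A,B}(x,y)$ at every point of ${\mathbb {F}}_p^2$, so that $f_{A,B}=\alpha f$ gives $f_{A',B'}=xf$.

```python
p = 11  # illustrative prime, not from the text


def det(m):
    """Determinant of a small square integer matrix by cofactor expansion."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([r[:j] + r[j + 1:] for r in m[1:]])
               for j in range(len(m)))


def form_value(A, B, x, y):
    """Evaluate det(A*x - B*y) modulo p (assumed convention for f_{A,B})."""
    n = len(A)
    return det([[A[i][j] * x - B[i][j] * y for j in range(n)]
                for i in range(n)]) % p


A = [[1, 2], [2, 5]]
B = [[3, 1], [1, 4]]
alpha = 6
alpha_inv = pow(alpha, p - 2, p)

# A' gets alpha^(-1) in the new corner; B' gets a zero row and column.
A_ext = [A[0] + [0], A[1] + [0], [0, 0, alpha_inv]]
B_ext = [B[0] + [0], B[1] + [0], [0, 0, 0]]

identity_holds = all(
    form_value(A_ext, B_ext, x, y) == (alpha_inv * x * form_value(A, B, x, y)) % p
    for x in range(p) for y in range(p)
)
print(identity_holds)
```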
We complete the proof of the lemma via the orbit-stabilizer theorem. By the above construction, there are
$\gg _n p^{n+1}$
binary
$(n+1)$
-ic forms
$xf(x,y)$
, with
$\Delta (xf(x,y))\neq 0$
and
$a_0\neq 0$
, such that there exists an element
$A\in S({\mathbb {F}}_p)$
with
$f_{A,B_0}=xf(x,y)$
and
$(A,B_0)$
nondistinguished. The group
$G_{B_0}({\mathbb {F}}_p) = \{\gamma \in \mathrm {{SL}}_{n+1}({\mathbb {F}}_p)\colon \gamma B_0\gamma ^t = B_0\}$
acts on the set of such A with stabilizer of size
$\# J_{xf}[2]({\mathbb {F}}_p)$
, where
$J_{xf}$
is the Jacobian of the hyperelliptic curve defined by
$z^2 = xf(x,y)y$
. Any element
$\gamma \in G_{B_0}({\mathbb {F}}_p)$
preserves the kernel
${\mathbb {F}}_p v$
of
$B_0$
and stabilizes the nondegenerate form
$b_0$
on
${\mathbb {F}}_p^{n+1}/({\mathbb {F}}_p v)$
induced by
$B_0$
. The determinant
$1$
condition then gives

Finally, since
$\#J_{xf}[2]({\mathbb {F}}_p)\ll _n 1$
, we have

as desired.
Corollary 6.8. Fix
$a\in {\mathbb {F}}_p^\times $
and
$B_0\in S({\mathbb {F}}_p)$
with rank n. Let
$S_{B_0}^{\mathrm {{ndist}}}({\mathbb {F}}_p)_{a_{11}=a}$
denote the set of all elements
$A\in S_{B_0}^{\mathrm {{ndist}}}({\mathbb {F}}_p)$
with
$a_{11}=a$
. Then

Proof. Since the property of
$(A,B_0)$
being nondistinguished is preserved when A is multiplied by an element of
${\mathbb {F}}_p^\times $
, the claim follows immediately from Lemma 6.7.
We now bound the number of pairs
$(A,B)$
in the main body where the first row and column of B are zero.
Proposition 6.9. We have

Proof. Let
$s\in T'$
be an element with
$Yw(a_{11})\gg 1$
. Then

For each
$B\in s(Y{\mathcal {D}})\cap S({\mathbb {Z}})$
having rank n, we bound the number of
$A\in s(Y{\mathcal {D}})\cap S({\mathbb {Z}})$
such that
$(A,B)$
is distinguished. Indeed, after additionally fibering over the coefficient
$a_{11}$
, we apply Corollary 6.8 in conjunction with the large sieve (Proposition 4.3) to obtain a saving of
$(Yw(a_{12}))^{-1/2+\epsilon }$
.
Therefore, the left-hand side of (45) is

In particular, the power of
$s_i$
above is negative for all
$i\in \{2,\ldots ,n\}$
, and hence, the integral over
$s_2,\ldots ,s_n$
is absolutely bounded. The condition that
$Yw(a_{11})\gg 1$
on the integrand implies that we have
$s_1\ll Y^{1/(2n)}$
. Therefore, the terms in (46) are

Since
$Y=X^{1/(n+1)}$
, we obtain the result.
Remark 6.10. Our use of the large sieve saves a power of the smallest range of any coordinate. In the above proof, we fiber over
$a_{11}$
because in the region of the main body close to the cusp, just before we enter the shallow cusp, the range of
$a_{11}$
has size
$\ll 1$
. In this case, the large sieve gives no saving at all. Once we fiber over
$a_{11}$
, the next smallest range is that of
$a_{12}$
. Implicit in our proof is an argument that either the range of
$a_{12}$
is large, in which case the large sieve gives the desired saving, or the number of pairs
$(A,B)$
is automatically small.
Proof of Theorem 6.6.
Recall from (40) that we have

where the second sum is over all saturated
${\mathcal {Z}}$
. Since
${\mathcal {Z}}$
is saturated and
$a_{11}\not \in {\mathcal {Z}}$
, we have
${\mathcal {Z}}\subset {\mathcal {M}}$
. If
$\ell _{k,1}=0$
for every
$k=1,\ldots ,n$
, then
$(1,0,\ldots ,0)$
is in the kernel of B implying that the top row of B is zero. The number of such pairs
$(A,B)$
has already been bounded in Proposition 6.9, and hence, we may assume that
$\ell _{n,1}\not \in {\mathcal {Z}}$
. Fix a nondecreasing n-tuple L of positive real numbers, and a saturated
${\mathcal {Z}}\subset {\mathcal {M}}$
with
$\ell _{n,1}\not \in {\mathcal {Z}}$
such that
$N({{\mathcal {L}}(1)} ,L,{\mathcal {Z}},X)\neq 0$
. We partition the domain of integration
$T_{\mathcal {Z}}(L,Y)$
into two parts: let
$T_1$
denote the subset of
$T_{\mathcal {Z}}(L,Y)$
consisting of elements
$s=(s_i)_i$
with
$s_n\geq Y^\delta $
, and let
$T_2$
denote the subset of elements s with
$1\ll s_n< Y^\delta $
, where
$\delta $
is a positive constant to be optimized later.
We first bound the contribution to
$N({{\mathcal {L}}(1)} ,L,{\mathcal {Z}},X)$
from
$T_1$
. Since
$Yw(a_{11})\gg 1$
, we have

for
$s\in T_1$
. Integrating over
$T_1$
gives the bound

Next, we consider the contribution from
$T_2$
. Define the map
$\pi :{\mathcal {Z}}_1\cap {\mathcal {M}}\to {\mathcal {M}}$
by

Since we have assumed that
$N({{\mathcal {L}}(1)} ,L,{\mathcal {Z}},X)\neq 0$
, Lemma 6.5 implies that
${\mathcal {Z}}\subset {\mathcal {Z}}_1$
, and so the image of
$\pi $
lies in
${\mathcal {M}}\backslash {\mathcal {Z}}$
. Then for any
$\alpha \in {\mathcal {Z}}_1\cap {\mathcal {M}}$
and any
$s\in T_{\mathcal {Z}}(L,Y)$
, we have
$w_L(\pi (\alpha ))\gg w_L(\alpha )$
and
$w_L(\pi (\alpha ))\gg 1$
. These inequalities along with (43) and (44) imply that for any
$s\in T_{\mathcal {Z}}(L,Y)$
, the number
$\# (s(Y{\mathcal {D}}) \cap {\mathcal {S}}(L,s))$
of possible B’s is

For each possible B, applying the large sieve (Proposition 4.3) using Lemma 6.7 gives us a bound of

for the number of possible choices for A. Therefore,

for
$s\in T_2$
. We compute the ratio of these weights: For any
$i\geq 2$
and
$j=1$
, we have

For any other
$i,j$
, we have

As the
$L_i$
are nondecreasing and positive, we multiply by the Haar measure character
$\delta (s)$
to obtain

The powers of
$s_i$
in the above expression are negative for
$1\leq i\leq n-1$
, while the power of
$s_n$
is
$1$
. Integrating over
$T_2$
now gives the bound

Combining (47) and (48) and choosing
$\delta =\frac {n+3/2}{n^2 + n + 1}$
yields

The summation of this bound over the
$O(1)$
different possible
${\mathcal {Z}}$
’s and the
$O(Y^\epsilon )$
different possible L’s, in conjunction with the bound in Proposition 6.9, implies Theorem 6.6.
6.3 Bounding the number of distinguished elements in the shallow cusp
In this subsection, we bound the number of distinguished elements having large q-invariant that lie in the shallow cusp of the fundamental domain.
Theorem 6.11. Let
$\eta>0$
be any real number. Assume that
$M>X^{\eta }$
. Then

We will take
$\eta = 1/4$
when we prove Theorem 5 in §6.5.
6.3.1 A preliminary bound of
$O_\epsilon (X^{n+1+\epsilon })$
We again use (40) to write

where the sum is over nondecreasing n-tuples
$L=(L_1,\ldots ,L_n)$
of positive real numbers that partition the region
$\{(\mu _1,\ldots ,\mu _n)\in [Y^{-\Theta _1},Y^{\Theta _2}]^n:\mu _1\leq \mu _2\leq \ldots \leq \mu _n\}$
into dyadic ranges, and over saturated
${\mathcal {Z}}\subset {\mathcal {K}}\cup {\mathcal {M}}$
such that
$a_{11}\in {\mathcal {Z}}$
and
$a_{g+1,g+1}\not \in {\mathcal {Z}}$
. By Lemma 6.5, we have
$N({{\mathcal {L}}(M)} ,L,{\mathcal {Z}},X)>0$
only when
${\mathcal {Z}}\subset {\mathcal {Z}}_1$
, which we henceforth assume.
For
$k\in \{0,\ldots ,g\}$
, define the map
$\pi _k:{\mathcal {Z}}_1\to {\mathcal {K}}\cup {\mathcal {M}}$
by

We define the auxiliary set
${\mathcal {Z}}^*$
by

Then, when restricted to
${\mathcal {Z}}^*\subset {\mathcal {Z}}_1$
, the functions
$\pi _k$
are equal for every k.
Lemma 6.12. For any
$k\in \{0,\ldots ,g\}$
, we have

Proof. We directly compute

which is
$\delta (s)^{-1}$
.
Fix a saturated set
${\mathcal {Z}}\subset {\mathcal {Z}}_1$
such that
$a_{11}\in {\mathcal {Z}}$
,
$a_{g+1,g+1}\notin {\mathcal {Z}}$
and
$N({{\mathcal {L}}(M)} ,L,{\mathcal {Z}},X)>0$
. Let
$k\in \{1,\ldots ,g\}$
be the largest integer such that
$a_{kk}\in {\mathcal {Z}}$
. Then we have the following results.
Lemma 6.13. Let
${\mathcal {Z}}$
and k be as above. Then for every
$\alpha \in {\mathcal {Z}}$
, we have
$\pi _k(\alpha )\notin {\mathcal {Z}}$
. In particular, for any
$s\in T_{\mathcal {Z}}(L,Y)$
, we have
$Yw(\pi _k(\alpha ))\gg 1.$
Proof. Since
$a_{n+1-j,j}\not \in {\mathcal {Z}}_1$
for any j and
${\mathcal {Z}}\subset {\mathcal {Z}}_1$
, we have
$\pi _k(a_{ij})\not \in {\mathcal {Z}}$
for any
$a_{ij}\in {\mathcal {Z}}$
. Moreover, since
$a_{jj}\in {\mathcal {Z}}$
for every
$j\leq k$
, it follows from Lemma 6.5 that
$\ell _{n+1-j,j}\not \in {\mathcal {Z}}$
. Furthermore,
$\ell _{i,n+2-i}\not \in {\mathcal {Z}}_1$
. Hence,
$\pi _k(\ell _{ij})\not \in {\mathcal {Z}}$
for any
$\ell _{ij}\in {\mathcal {Z}}$
.
Lemma 6.14. Let
${\mathcal {Z}}$
and k be as above. Then, uniformly for
$s\in T_{\mathcal {Z}}(L,Y)$
, we have

Proof. Since we have

it follows that
$w(\pi _k(\alpha ))/w(\alpha )\gg 1$
for every k,
$\alpha \in {\mathcal {Z}}_1$
, and
$s\in T_{\mathcal {Z}}(L,Y)$
. Thus, by adding elements in
${\mathcal {Z}}_1$
to
${\mathcal {Z}}$
, if necessary, we can assume that
${\mathcal {Z}}$
is equal to

Denote the four sets on the right-hand side of the above equation as
$S_1$
,
$S_2$
,
$S_3$
and
$S_4$
, respectively. For an element
$\ell _{ij}\in S_2$
, we have

Therefore,

where the second equality follows from Lemma 6.12, and the last inequality follows because the
$L_i$
’s are nondecreasing.
Proposition 6.3 and Lemmas 6.13 and 6.14 thus yield the bound

We now work towards obtaining a power saving.
6.3.2 Strategy towards a power saving
In light of Proposition 6.3, it is enough to have a bound of the form

for some
$\delta> 0$
, for all
$s\in T_{\mathcal {Z}}(L,Y).$
By modifying
$\pi _k$
on a certain subset of
${\mathcal {Z}}$
, we are able to obtain (51) except for some
$s\in T_{\mathcal {Z}}(L,Y)$
satisfying some special conditions. We then consider the contribution from these special s using a different count for
$\#\big ((s(Y{\mathcal {D}})\times s(Y{\mathcal {D}}))\cap {{\mathcal {L}}(M)} \big )$
.
More precisely, let
${\mathcal {K}}_1 := \{a_{1j}: 1\leq j \leq g+2\}$
. Then
${\mathcal {K}}_1$
consists exactly of those
$\alpha \in {\mathcal {K}}$
such that the exponent of every
$s_i$
is negative in
$w(\alpha )$
. As such, one expects that the hardest case is when
${\mathcal {Z}} = {\mathcal {K}}_1$
. We show first in Lemma 6.15 how to reduce to considering only
${\mathcal {Z}}\cap {\mathcal {K}}_1$
.
Lemma 6.15. Let
${\mathcal {Z}}\subset {\mathcal {Z}}_1$
be saturated with
$a_{11}\in {\mathcal {Z}}$
,
$a_{g+1,g+1}\not \in {\mathcal {Z}}$
and
$N({{\mathcal {L}}(M)} ,L,{\mathcal {Z}},X)> 0$
. For any
${\mathcal {Z}}'\subset {\mathcal {K}}_1$
and any
$s\in T'$
, we write

Then for any
$s\in T_{\mathcal {Z}}(L,Y)$
, we have

We then prove in Lemma 6.16 the following bound for
$I({\mathcal {Z}}\cap {\mathcal {K}}_1,s)$
when
${\mathcal {Z}}\cap {\mathcal {K}}_1$
is a proper subset of
${\mathcal {K}}_1$
, which gives a bound of the form (51) when
$s_n\ll Y^{1/2-\delta }$
.
Lemma 6.16. Let
${\mathcal {Z}}\subset {\mathcal {Z}}_1$
be saturated with
$a_{11}\in {\mathcal {Z}}$
,
$a_{g+1,g+1}\not \in {\mathcal {Z}}$
and
$N({{\mathcal {L}}(M)} ,L,{\mathcal {Z}},X)> 0$
. Suppose
${\mathcal {Z}}\cap {\mathcal {K}}_1\neq {\mathcal {K}}_1$
. For any
$s\in T_{\mathcal {Z}}(L,Y)$
, if
$I({\mathcal {Z}}\cap {\mathcal {K}}_1,s) \gg Y^{-2\delta }$
, then
$s_n\gg Y^{1/2 - \delta }.$
In the case where
$s_n\gg Y^{1/2-\delta }$
, the Haar measure turns out to be very small, so we may simply ignore the singularity condition of B and prove the following bound.
Lemma 6.17. Let
${\mathcal {Z}}\subset {\mathcal {Z}}_1$
be saturated with
$a_{11}\in {\mathcal {Z}}$
,
$a_{g+1,g+1}\not \in {\mathcal {Z}}$
and
$N({{\mathcal {L}}(M)} ,L,{\mathcal {Z}},X)> 0$
. Suppose
${\mathcal {Z}}\cap {\mathcal {K}}_1\neq {\mathcal {K}}_1$
. Then for any
$s\in T_{\mathcal {Z}}(L,Y)$
with
$s_n\gg Y^{1/2-\delta }$
,

Therefore, by taking
$\delta = (n-2)/(4n^2 + 14n + 4)$
, we obtain the following result from Proposition 6.3 and Lemmas 6.15, 6.16 and 6.17:
Proposition 6.18. Let
${\mathcal {Z}}\subset {\mathcal {Z}}_1$
be saturated with
$a_{11}\in {\mathcal {Z}}$
,
$a_{g+1,g+1}\not \in {\mathcal {Z}}$
and
$N({{\mathcal {L}}(M)} ,L,{\mathcal {Z}},X)> 0$
. Suppose
${\mathcal {Z}}\cap {\mathcal {K}}_1\neq {\mathcal {K}}_1$
. Then

We next handle the case
${\mathcal {K}}_1\subset {\mathcal {Z}}$
. We give necessary conditions in Lemma 6.19 on s so that a bound of the form (51) does not hold.
Lemma 6.19. Let
${\mathcal {Z}}\subset {\mathcal {Z}}_1$
be saturated with
$a_{11}\in {\mathcal {Z}}$
,
$a_{g+1,g+1}\not \in {\mathcal {Z}}$
and
$N({{\mathcal {L}}(M)} ,L,{\mathcal {Z}},X)> 0$
. Suppose
${\mathcal {K}}_1\subset {\mathcal {Z}}$
. For any
$s\in T_{\mathcal {Z}}(L,Y)$
, if
$I({\mathcal {K}}_1,s)\gg X^{-\delta }$
, then

where

Note that the coefficients of
$\delta $
in the exponents in the above bounds are not optimal and are simply chosen to make the formula look nice. The optimal coefficients can be obtained from the proof.
When s satisfies (52), we give further conditions in Lemma 6.20 on s so that simply using the Haar measure and ignoring the singularity condition by counting all symmetric matrices is not enough for a power saving.
Lemma 6.20. Let
${\mathcal {Z}}\subset {\mathcal {Z}}_1$
be saturated with
$a_{11}\in {\mathcal {Z}}$
,
$a_{g+1,g+1}\not \in {\mathcal {Z}}$
and
$N({{\mathcal {L}}(M)} ,L,{\mathcal {Z}},X)> 0$
. Suppose
${\mathcal {K}}_1\subset {\mathcal {Z}}$
. For any
$s\in T_{\mathcal {Z}}(L,Y)$
, if

then

To obtain a further saving, we need to use the
$|q|$
-invariant.
Lemma 6.21. Suppose
$M>X^\eta $
where
$\eta>0$
is some fixed constant. Let
${\mathcal {Z}}\subset {\mathcal {Z}}_1$
be saturated with
$a_{11}\in {\mathcal {Z}}$
,
$a_{g+1,g+1}\not \in {\mathcal {Z}}$
and
$N({{\mathcal {L}}(M)} ,L,{\mathcal {Z}},X)> 0$
. Suppose
${\mathcal {K}}_1\subset {\mathcal {Z}}$
. Then for
$\delta < \min (\eta ,1)/(1355g^6)$
and any
$s\in T_{\mathcal {Z}}(L,Y)$
such that (52) and (53) hold, we have

Therefore, by taking
$\delta = 64\min (\eta ,1)/(1355n^6)$
, we obtain the following result from Proposition 6.3 and Lemmas 6.15, 6.19, 6.20 and 6.21.
Proposition 6.22. Suppose
$M>X^\eta $
where
$\eta>0$
is some fixed constant. Let
${\mathcal {Z}}\subset {\mathcal {Z}}_1$
be saturated with
$a_{11}\in {\mathcal {Z}}$
,
$a_{g+1,g+1}\not \in {\mathcal {Z}}$
and
$N({{\mathcal {L}}(M)} ,L,{\mathcal {Z}},X)> 0$
. Suppose
${\mathcal {K}}_1\subset {\mathcal {Z}}$
. Then

Theorem 6.11 then follows immediately from (40), Proposition 6.18, Proposition 6.22 and summing over the
$O(1)$
different possible
${\mathcal {Z}}$
’s and the
$O(Y^\epsilon )$
different possible L’s.
6.3.3 Proofs of Lemmas 6.15, 6.16, 6.17, 6.19, 6.20 and 6.21.
We fix a saturated
${\mathcal {Z}}\subset {\mathcal {Z}}_1$
with
$a_{11}\in {\mathcal {Z}}$
,
$a_{g+1,g+1}\not \in {\mathcal {Z}}$
and
$N({{\mathcal {L}}(M)} ,L,{\mathcal {Z}},X)> 0$
.
Proof of Lemma 6.15.
Recall that
${\mathcal {K}}_1 := \{a_{1j}: 1\leq j \leq g+2\}$
. Let
$k\in \{1,\ldots ,g\}$
be the largest integer such that
$a_{kk}\in {\mathcal {Z}}$
. Then, applying Lemma 6.14 to the saturated set
${\mathcal {Z}}\cup {\mathcal {K}}_1$
, we have

Hence, by Lemma 6.13, we obtain for any
$s\in T_{\mathcal {Z}}(L,Y)$
,

as desired.
Note that a direct computation yields

Proof of Lemma 6.16.
Since
${\mathcal {Z}}$
is saturated and
${\mathcal {Z}}\cap {\mathcal {K}}_1\neq {\mathcal {K}}_1$
, we have
${\mathcal {Z}}\cap {\mathcal {K}}_1 = \{a_{11},\ldots ,a_{1j}\}$
for some
$j=1,\ldots ,g+1$
. Since
$a_{g+1,g+1}$
and
$a_{1,g+2}$
do not belong to
${\mathcal {Z}}$
, we have for
$s\in T_{\mathcal {Z}}(L,Y)$
,

since the powers of the
$s_i$
’s in the third line are negative for
$i<n$
.
Similarly, for any
$j=1,\ldots ,g$
, we compute

as desired.
Proof of Lemma 6.17.
Suppose now
$s_n\gg Y^{1/2-\delta }$
. First note that the inequality

implies that we have

Since each
$s_i\gg 1$
, we also have
$s_n\ll Y^{1/2}$
by (56). Hence,

Thus,

Multiplying these weights together and applying Proposition 4.1 gives the estimate

Meanwhile, in this region where
$s_n\gg Y^{1/2-\delta }$
, the quantity
$\delta (s)$
satisfies

Multiplying the bounds in (57) and (58) together yields

as desired.
Proof of Lemma 6.19.
Suppose now
${\mathcal {K}}_1\subset {\mathcal {Z}}\subset {\mathcal {Z}}_1$
and
$I({\mathcal {K}}_1,s) \gg X^{-\delta }$
for some
$s\in T_{\mathcal {Z}}(L,Y)$
. We prove first that for any
$i = 1,\ldots ,g-1$
, we have

Indeed, since
$a_{j,n+1-j}\notin {\mathcal {Z}}$
for all j, we have from (54) that, for any
$k = 1,\ldots ,g$
,

Hence,

Denote the product of the two factors in the final line by
$J_k$
. Then

Since, by assumption,
$I({\mathcal {K}}_1,s)\gg X^{-\delta }$
, we have
$J_k\gg Y^{-(n+1)\delta }$
for every
$k=1,\ldots ,g$
. Therefore, for every
$i=1,\ldots ,g-1$
, we have

The claimed bound (59) follows.

where

We next prove the desired lower bounds:

The bound
$s_{g+1}\gg 1$
follows from the definition of
$T'$
. For the bounds on
$s_g$
and
$s_{g+2}$
, we use the assumption that
$a_{g+1,g+1}\notin {\mathcal {Z}}$
and the computation of
$I({\mathcal {K}}_1,s)$
in (54) to obtain

which along with
$I({\mathcal {K}}_1,s)\gg Y^{-(n+1)\delta }$
implies the desired lower bound on
$s_{g+2}$
; and

implying the desired lower bound on
$s_g$
, where in the last inequality we used the already-established lower bounds on
$s_{g+1}$
and
$s_{g+2}$
.
The desired upper bounds for
$s_g$
,
$s_{g+1}$
,
$s_{g+2}$
then follow by combining the upper bound on
$s_g^gs_{g+1}^{g+1}s_{g+2}^{g+2}$
in (60) and the individual lower bounds on
$s_g$
,
$s_{g+1}$
,
$s_{g+2}$
in (61). The desired upper bound on
${\mathcal {R}}$
follows by comparing the upper bound on
$s_g$
and the trivial lower bound
$s_g\gg 1$
.
Proof of Lemma 6.20.
Suppose
${\mathcal {K}}_1\subset {\mathcal {Z}}$
and
$s\in T_{\mathcal {Z}}(L,Y)$
satisfies (52). Then

where the upper bound on
${\mathcal {R}}$
also gives
$t_j^{-1} \ll Y^{-1/2+20g^2\delta }$
for
$j = 1,\ldots ,g$
. For
$i,j\leq g+2$
, we have
$Yt_i^{-1}t_j^{-1}\ll Y^{40g^2\delta }$
. For
$j\geq g+3$
and
$i\leq n-j+1$
, we have
$Yt_i^{-1}t_j^{-1}\ll Y^{28g^2\delta }$
. Using (62) for the rest of the coordinates gives

The Haar measure satisfies the following bound:

Hence,

Suppose now
$\#\big ((s(Y{\mathcal {D}})\times s(Y{\mathcal {D}}))\cap W({\mathbb {Z}})\big )\,\delta (s)\gg X^{n+1-\delta }$
. Then, for any
$i = g+3,\ldots ,n$
,

as desired.
Proof of Lemma 6.21.
Suppose
$M>X^\eta $
where
$\eta>0$
is some fixed constant. Suppose
$\delta < \min (\eta ,1)/(1355g^6)$
. Suppose
${\mathcal {K}}_1\subset {\mathcal {Z}}$
and
$s\in T_{\mathcal {Z}}(L,Y)$
satisfies (52) and (53). We now impose the conditions
$\det (B) = 0$
and
$|q|(A,B)>M$
for any
$(A,B)\in {{\mathcal {L}}(M)} $
to obtain a further saving for
$\#\big ((s(Y{\mathcal {D}})\times s(Y{\mathcal {D}}))\cap {{\mathcal {L}}(M)} \big )\,\delta (s)$
.
The bound (53) on
$s_{g+3},\ldots ,s_n$
gives
${\mathcal {R}}\ll Y^{258g^4\delta }.$
Hence,

thus improving (62). In this case,


Since
$\delta < 1/(1315g^4)$
, we may assume that every
$(A,B)\in (s(Y{\mathcal {D}})\times s(Y{\mathcal {D}}))\cap W({\mathbb {Z}})$
satisfies the following:
(a) The top left $g\times (g+2)$ blocks of A and B are $0$.
(b) The entries of the top right $g\times (g+1)$ blocks of A and B are $O(Y^{1315g^5\delta })$.
(c) The entries $a_{g+1,g+1}$, $a_{g+1,g+2}$, $a_{g+2,g+2}$, $b_{g+1,g+1}$, $b_{g+1,g+2}$, and $b_{g+2,g+2}$ are $O(Y^{40g^2\delta })$.
(d) The entries $a_{g+1,j}$, $a_{g+2,j}$, $b_{g+1,j}$ and $b_{g+2,j}$ are $O(Y^{(n+1)/2+43g^2\delta })$ for $g+3\leq j\leq n+1$.
(e) The entries $a_{ij}$ and $b_{ij}$ are $O(Y^{n+1+46g^2\delta })$ for $g+3\leq i,j\leq n+1$.
Suppose now that
$(A,B)$
is an element of
$(s(Y{\mathcal {D}})\times s(Y{\mathcal {D}}))\cap {{\mathcal {L}}(M)} $
. Then
$f_{A,B} = xg(x,y)$
, where
$g(x,1)$
is a degree n polynomial with Galois group
$S_n$
.
Lemma 6.23. Let
$(A,B)$
be as above. If
$b_{g+1,g+1}=b_{g+1,g+2}=b_{g+2,g+2}=0$
, then

Proof. Since
$(A,B)$
is distinguished over
${\mathbb {Q}}$
, the set of
$(g+1)$
-dimensional common isotropic subspaces defined over any number field L is in bijection with
$J[2](L)$
, where J is the Jacobian of the hyperelliptic curve
$y^2 = xg(x,1)$
(which has a rational Weierstrass point at infinity), and
$J[2](L)$
is in bijection with the factorizations of
$xg(x,1)$
over L. Since
$g(x,1)$
has Galois group
$S_n$
, it does not admit any factorization over any quadratic extension of
${\mathbb {Q}}$
. Therefore, for any quadratic extension K of
${\mathbb {Q}}$
, we have
$J[2](K) = J[2]({\mathbb {Q}})$
, and so any
$(g+1)$
-dimensional K-subspace isotropic with respect to A and B admits a
${\mathbb {Q}}$
-basis.
Suppose
$x_0,y_0\in K$
for some quadratic extension K of
${\mathbb {Q}}$
such that
$(x_0,y_0)$
is a solution to

By the assumption
$b_{g+1,g+1}=b_{g+1,g+2}=b_{g+2,g+2}=0$
, we see that

is a
$(g+1)$
-dimensional K-subspace isotropic with respect to A and B. Let
$v_1,\ldots ,v_{g+1}\in {\mathbb {Q}}^{n+1}$
be such that

We now complete
$\{e_1,\ldots ,e_g\}$
into a
${\mathbb {Q}}$
-basis
$\{e_1,\ldots ,e_g,v_0\}$
for
$\mathrm {Span}_{\mathbb {Q}}\{v_1,\ldots ,v_{g+1}\}$
. We may use
$e_1,\ldots ,e_g$
to clear out the first g coordinates of
$v_0$
and take
$v_0$
to be of the form
$x_0'e_{g+1} + y_0'e_{g+2}$
with
$x_0',y_0'\in {\mathbb {Q}}$
, which implies that
$(x_0',y_0')$
is a nonzero rational solution to (65). In particular, the discriminant
$a_{g+1,g+2}^2 - 4a_{g+1,g+1}a_{g+2,g+2}\in {\mathbb {Z}}$
is a square.
If
$a_{g+1,g+1}\neq 0$
, let

If
$a_{g+1,g+1} = 0$
, let
$x_1 = 1,\,y_1 = 0$
. Then
$x_1,y_1$
are integers
$\ll Y^{40g^2\delta }$
, not both zero, and are solutions to (65). Let
$x_0 = x_1/\gcd (x_1,y_1)$
and
$y_0 = y_1/\gcd (x_1,y_1)$
. There then exist integers
$x_2,y_2\ll Y^{40g^2\delta }$
such that

forms an integral basis for
${\mathbb {Z}}^{n+1}$
such that the first
$g+1$
vectors generate a primitive lattice isotropic with respect to A and B, and the first
$g+2$
vectors generate a primitive lattice isotropic with respect to B. Thus, we may compute the
$|q|$
-invariant of
$(A,B)$
using this basis. When so expressed, the top right
$(g+1)\times (g+2)$
blocks of the Gram matrices of A and B have the form

where entries labeled ‘
$0$
’ are
$0$
, entries labeled ‘
$\flat $
’ are
$O(Y^{1355g^5\delta })$
, and entries labeled ‘
$*$
’ are
$O(Y^{(n+1)/2 + 83g^2\delta })$
. Let
$M_1$
denote the
$(g+2)\times (g+2)$
matrix whose ith row consists of the coefficients of
$\det (A_ix - B_iy)$
, where
$A_i$
and
$B_i$
are the
$(g+1)\times (g+1)$
matrices formed by removing the i-th columns from
$A^{\mathrm {{top}}}$
and
$B^{\mathrm {{top}}}$
, respectively. Then
$M_1$
is of the form

where entries labeled ‘
$0$
’ are
$0$
, entries labeled ‘
$\sharp $
’ are
$O(Y^{2710g^6\delta })$
, and entries labeled ‘
$*$
’ are
$O(Y^{(n+1)/2+1438g^6\delta })$
, where the top right coefficient
$m'$
of
$M_1$
is the determinant of the top right
$(g+1)\times (g+1)$
block
$B'$
of
$B^{\mathrm {{top}}}$
, up to sign. Thus,

where
$M_1'$
is the bottom left
$(g+1)\times (g+1)$
block of
$M_1$
. Since the coefficients of
$M_1'$
are
$\ll Y^{2710g^6\delta }$
, it follows that
$|q|(A,B)\ll X^{2710g^6(g+1)\delta /(n+1)} \ll X^{1355g^6\delta }$
.
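The passage from a square discriminant to a primitive integral solution, and then to a unimodular completion $(x_2,y_2)$, can be sketched as follows. Several points here are assumptions made for illustration: the quadratic form in (65) is taken to be $ax^2+bxy+cy^2$ with $(a,b,c)=(a_{g+1,g+1},a_{g+1,g+2},a_{g+2,g+2})$, matching the discriminant expression in the proof; the explicit solution $(x_1,y_1)=(-b+\sqrt {b^2-4ac},\,2a)$ is one natural choice and may differ from the display in the paper; and the function names are hypothetical. The completion is normalized so that $x_0y_2-y_0x_2=1$.

```python
from math import gcd, isqrt


def primitive_zero(a, b, c):
    """Primitive integer zero (x0, y0) of a*x^2 + b*x*y + c*y^2,
    assuming b^2 - 4ac is a perfect square."""
    disc = b * b - 4 * a * c
    if disc < 0 or isqrt(disc) ** 2 != disc:
        raise ValueError("discriminant is not a perfect square")
    root = isqrt(disc)
    if a != 0:
        x1, y1 = -b + root, 2 * a  # assumed explicit solution
    else:
        x1, y1 = 1, 0              # the case a = 0 from the text
    g = gcd(x1, y1)
    return x1 // g, y1 // g


def extended_gcd(a, b):
    """Return (r, u, v) with u*a + v*b = r, where r is gcd(a, b) up to sign."""
    old_r, r = a, b
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_u, u = u, old_u - q * u
        old_v, v = v, old_v - q * v
    return old_r, old_u, old_v


def unimodular_completion(x0, y0):
    """Integers (x2, y2) with x0*y2 - y0*x2 = 1, for a primitive pair (x0, y0)."""
    r, u, v = extended_gcd(x0, y0)  # r is +1 or -1 here
    return -v * r, u * r


x0, y0 = primitive_zero(2, -7, 3)
x2, y2 = unimodular_completion(x0, y0)
print((x0, y0), (x2, y2), x0 * y2 - y0 * x2)
```

The extended Euclidean algorithm returns Bézout coefficients of size at most that of its inputs (up to a constant), which is consistent with the size bound on $x_2,y_2$ stated in the proof.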
We now return to the proof of Lemma 6.21. For any
$(A,B)\in (s(Y{\mathcal {D}})\times s(Y{\mathcal {D}}))\cap {{\mathcal {L}}(M)} $
, since
$|q|(A,B)> M > X^\eta $
, we may assume that
$b_{g+1,g+1}$
,
$b_{g+1,g+2}$
, and
$b_{g+2,g+2}$
are not all
$0$
since
$\delta < \eta /(1355g^6)$
.
We now fix
$b_{ij}$
for
$1\leq i\leq g, g+3\leq j\leq n+1$
, and
$i = g+1,g+2$
,
$j=g+1,g+2$
. We consider the number of pairs
$(A,B)\in (s(Y{\mathcal {D}})\times s(Y{\mathcal {D}}))\cap {{\mathcal {L}}(M)} $
with these prescribed coefficients by viewing
$\det (B)$
as a polynomial F in
$b_{ij}$
for
$g+1\leq i\leq n+1$
and
$g+3\leq j\leq n+1$
. Note that all of these remaining coefficients have range at least

Hence, to complete the proof of Lemma 6.21, it remains to prove that F is a nonzero polynomial, for then we would have, using (63), that

We may assume that the top right
$g\times (g+1)$
block of B has full rank, for otherwise the kernel of B would be isotropic with respect to A forcing
$\Delta (A,B) = 0$
by Lemma 6.4. Hence, we may also assume that the top right
$g\times (g+1)$
block of B equals
$(I_g\,\, 0)$
, where
$I_g$
denotes the
$g\times g$
identity matrix. Then

Since

we see that
$\det (B)$
is a nonzero polynomial in
$b_{g+1,n+1}$
,
$b_{g+2,n+1}$
, and
$b_{n+1,n+1}$
.
6.4 Bounding the number of distinguished elements in the deep cusp
In this subsection, we bound the number of elements with large q-invariant that lie in the deep cusp.
Theorem 6.24. We have
$ {\mathcal {I}}_X^{\mathrm {{dcusp}}}({{\mathcal {L}}(M)} )=O\Big (\frac {X^{n+1+\frac 12\kappa }}{M}\log ^{2n}X\Big ). $
Recall from (40) that

here, the first sum is over n-tuples
$L=(L_1,\ldots ,L_n)$
with
$L_1\leq L_2\leq \cdots \leq L_n$
that partition the region
$\{(\mu _1,\ldots ,\mu _n)\in [Y^{-\Theta _1},Y^{\Theta _2}]^n:\mu _1\leq \ldots \leq \mu _n\}$
into dyadic ranges, and the second sum is over saturated subsets
${\mathcal {Z}}$
of
${\mathcal {K}}\cup {\mathcal {M}}$
, where

The set
${\mathcal {S}}(L,s)$
is the union over
$\Lambda \in \Sigma (L,s)$
of
$S(\Lambda )$
, where
$S(\Lambda )$
denotes the lattice of integral symmetric matrices whose row space is contained in
$\Lambda \otimes {\mathbb {R}}$
, and
$\Sigma (L,s)$
denotes the set of primitive lattices
$\Lambda \subset {\mathbb {Z}}^{n+1}$
of rank n such that the successive minima
$\mu _1,\ldots ,\mu _n$
of
$s^{-1}(\Lambda )$
satisfy
$L_i\leq \mu _i<2L_i$
for each
$i\in \{1,\ldots ,n\}$
. Finally, recall from §6.1 and Proposition 5.4 that

for every
$s\in T_{\mathcal {Z}}(L,Y)$
. Hence, we may assume that
$\ell _{g+1,g+1}\in {\mathcal {Z}}$
.
The deep cusp contains
$\asymp X^{n+1}$
elements, and we obtain a saving because the elements we are counting have q-invariant greater than M. To make use of this condition, we require an upper bound on the size of the
$|q|$
-invariant of elements in
$(s(Y{\mathcal {D}})\times s(Y{\mathcal {D}})) \cap {{\mathcal {L}}(1)} $
. To accomplish this, we have the following preliminary result.
Lemma 6.25. Let
$(A,B)\in (Y{\mathcal {D}}\times Y{\mathcal {D}})\cap W_{0}({\mathbb {R}})$
be such that
$|\Delta (A,B)|> X^{2n-2-\kappa }$
. Denote the top right
$(g+1)\times (g+2)$
block of B by
$B^{\mathrm {{top}}}$
. Then

Proof. Let
$(A',B')=Y^{-1}(A,B)\in ({\mathcal {D}}\times {\mathcal {D}})\cap W_{0}({\mathbb {R}})$
. Then it suffices to prove that

Since
$|\Delta (A,B)|>X^{2n-2-\kappa }$
, we have
$|\Delta (A',B')|>X^{-\kappa }$
. By Proposition 3.6, there is a polynomial
$P\in {\mathbb {Z}}[W_{0}]$
such that

for any
$(A'',B'')\in W_{0}({\mathbb {R}})$
. Since
$(A',B')\in {\mathcal {D}}\times {\mathcal {D}}$
, which is an absolutely bounded region, we have
$|P(A',B')|\ll 1$
. Hence,

as desired.
Next, we have the following upper bound on the
$|q|$
-invariant.
Proposition 6.26. Let
${\mathcal {Z}}\subset {\mathcal {Z}}_1$
be a saturated set containing
$a_{g+1,g+1}$
and
$\ell _{g+1,g+1}$
. Let
$L=(L_1,\ldots ,L_n)$
be a nondecreasing sequence of positive real numbers. Then for any
$s\in T_{\mathcal {Z}}(L,Y)$
and
$(A,B)\in (s(Y{\mathcal {D}})\times s(Y{\mathcal {D}}))\cap {{\mathcal {L}}(1)} $
, we have

Proof. Suppose
$(A,B)\in (s(Y{\mathcal {D}})\times s(Y{\mathcal {D}}))\cap {{\mathcal {L}}(1)} $
. Since
$a_{g+1,g+1}\in {\mathcal {Z}}$
, we have
$(A,B)\in W_{0}({\mathbb {Z}})$
. By Lemma 6.4,
$\ker (B)$
is
$1$
-dimensional and does not lie inside
$\text {Span}\{e_1,\ldots ,e_{g+1}\}$
as this
$(g+1)$
-plane is isotropic with respect to A. Let
$w_1\in \text {Span}_{\mathbb {Z}}\{e_{g+2},\ldots ,e_{n+1}\}$
be a primitive vector so that
$\{e_1,\ldots ,e_{g+1},w_1\}$
forms a basis for the primitive lattice in
$\text {Span}_{\mathbb {R}}\{e_1,\ldots ,e_{g+1}\} + \ker (B).$
Complete
$w_1$
to an integral basis
$\{w_1,\ldots ,w_{g+2}\}$
for
$\text {Span}_{\mathbb {Z}}\{e_{g+2},\ldots ,e_{n+1}\}$
. We can now use the integral basis

of
${\mathbb {Z}}^{n+1}$
to compute the
$|q|$
-invariant of
$(A,B)$
, as the first
$g+1$
vectors generate a primitive lattice isotropic with respect to A and B, and the first
$g+2$
vectors generate a primitive lattice isotropic with respect to B. Note also that with respect to the standard inner product on
${\mathbb {R}}^{n+1}$
, since
$w_1\in \text {Span}_{\mathbb {R}}\{e_1,\ldots ,e_{g+1}\} + \ker (B)$
, we have

where
$C(B)$
denotes the column space of B.
Let
$A'$
and
$B'$
be the Gram matrices of the quadratic forms defined by A and B with respect to this new basis (67). Since the first
$g+1$
vectors of this basis are part of the standard basis, we see that
$(A,B)$
and
$(A',B')$
are
$G_0({\mathbb {Z}})$
-equivalent, where
$G_0$
is defined in §3.1. Hence,

Let
$B''$
denote the top right
$(g+1)\times (g+1)$
block of
$B'$
. Then, by the definition of q, we have

We now work towards proving a lower bound on
$|\det (B'')|$
. Let
$p_1$
(resp.,
$p_2$
) denote the projection of
${\mathbb {R}}^{n+1}$
onto the first
$g+1$
coefficients (resp., the last
$g+2$
coefficients). Let
$B^{\mathrm {{top}}}$
(resp.,
$B^{\prime \mathrm {{top}}}$
) denote the top right
$(g+1)\times (g+2)$
block of B (resp.,
$B'$
). Then by (68), we have
$B^{\mathrm {{top}}} p_2(w_1) = 0$
. Consider the following two
$(g+2)\times (g+2)$
matrices in block form:

Then

Let
$\Lambda _2$
denote the rank
$g+1$
lattice in
${\mathbb {Z}}^{g+2}$
spanned by the rows of
$B^{\mathrm {{top}}}$
. Then
$|\det (B^*)| = d(\Lambda _2)|w_1|.$
Since
$\{p_2(w_1),\ldots ,p_2(w_{g+2})\}$
is an integral basis for
${\mathbb {Z}}^{g+2}$
, we have
$\det \gamma = \pm 1$
, and so

We now use the fact that
$B\in {\mathcal {S}}(L,s)$
. This means that the row span of B lies in an n-dimensional primitive lattice
$\Lambda \subset {\mathbb {Z}}^{n+1}$
with basis of the form
$\{s\ell _1,\ldots ,s\ell _n\}$
where
$L_i\leq |\ell _i|<2L_i$
and
$\{\ell _1,\ldots ,\ell _n\}$
are reduced. By assumption,
$\ell _{g+1,g+1}\in {\mathcal {Z}}$
, and hence,
$\ell _{i,j}\in {\mathcal {Z}}$
for all
$i\leq g+1$
and
$j\leq g+1$
. Thus, the first
$g+1$
coefficients of
$s\ell _1,\ldots ,s\ell _{g+1}$
are all
$0$
, and
$\{s\ell _1,\ldots ,s\ell _{g+1}\}$
forms an integral basis of a primitive lattice
$\Lambda _1$
of rank
$g+1$
in
$\text {Span}_{\mathbb {R}}\{e_{g+2},\ldots ,e_{n+1}\}$
. By (68),
$w_1$
is a primitive vector in
$\text {Span}_{\mathbb {R}}\{e_{g+2},\ldots ,e_{n+1}\}$
orthogonal to
$\Lambda _1$
. Hence,

By (68), we have
$\text {Span}_{\mathbb {R}}\{e_{g+2},\ldots ,e_{n+1}\}\cap C(B)\neq \text {Span}_{\mathbb {R}}\{e_{g+2},\ldots ,e_{n+1}\}$
, and so

In particular, since
$\Lambda _1$
is primitive, the first
$g+1$
columns of B belong to
$\Lambda _1$
. That is, there is a
$(g+1)\times (g+1)$
matrix C (with integer coefficients) such that

and so

To obtain a lower bound on
$|\det (C)|$
, we write

Let
$M^{\mathrm {{top}}}$
denote the
$(g+1)\times (g+2)$
matrix with rows
$p_2(\ell _1)^t,\ldots ,p_2(\ell _{g+1})^t$
. Then

Consider the pair
$(A_0,B_0):=s^{-1}(A,B)\in (Y{\mathcal {D}}\times Y{\mathcal {D}})\cap W_{0,n+1}({\mathbb {R}})$
satisfying

since
$(A,B)\in {{\mathcal {L}}(1)} $
. The top right
$(g+1)\times (g+2)$
block
$B_{0}^{\mathrm {{top}}}$
of
$B_0$
satisfies

and so

The rows of
$M^{\mathrm {{top}}}$
form a reduced basis for a lattice
$\Lambda _3\subset {\mathbb {Z}}^{g+2}$
with
$L_i\leq |p_2(\ell _i)|<2L_i$
. Thus,


The result now follows from Lemma 6.25.
Proof of Theorem 6.24.
We write

and obtain upper bounds on
$N({{\mathcal {L}}(M)} ,L,{\mathcal {Z}},X)$
for each
${\mathcal {Z}}\subset {\mathcal {Z}}_1$
with
$a_{g+1,g+1},\ell _{g+1,g+1}\in {\mathcal {Z}}$
. Fix such a set
${\mathcal {Z}}$
with
$N({{\mathcal {L}}(M)} ,L,{\mathcal {Z}},X)>0$
and an element
$s\in T_{\mathcal {Z}}(L,Y)$
. Then

We begin by bounding
$\#(s(Y{\mathcal {D}})\cap S({\mathbb {Z}}))$
. Let
${\mathcal {K}}_{\mathrm {{dist}}} := \{a_{ij}\mid 1\leq i\leq j\leq g+1\}$
. By assumption,
${\mathcal {K}}_{\mathrm {{dist}}}$
is a subset of
${\mathcal {Z}}\cap {\mathcal {K}}$
. Define
$\pi _{\mathcal {K}}: {\mathcal {Z}}_1\cap {\mathcal {K}}\rightarrow {\mathcal {K}}\backslash {\mathcal {Z}}_1$
by

This agrees with the
$\pi _k$
as defined in §6.3.1 when restricted to
${\mathcal {K}}$
. For any
$\alpha \in {\mathcal {Z}}_1\cap {\mathcal {K}}$
, we have
$Yw(\pi _{\mathcal {K}}(\alpha ))\gg 1$
and
$w(\pi _{\mathcal {K}}(\alpha ))\gg w(\alpha )$
. For any
$a_{ij}\in ({\mathcal {Z}}_1\cap {\mathcal {K}})\backslash {\mathcal {K}}_{\mathrm {{dist}}}$
, we have
$i< g+1<j$
. Thus,

Therefore,

We now obtain an upper bound on
$\#(s(Y{\mathcal {D}})\cap {\mathcal {S}}(L,s))$
. Recall that

Let
$\Lambda \in \Sigma (L,s)$
be a lattice such that
$s^{-1}(\Lambda )$
has reduced basis
$\{\ell _1,\ldots ,\ell _n\}$
with
$L_i\leq |\ell _i| < 2L_i$
for each
$i=1,\ldots ,n$
. Suppose there exists
$(A,B)\in (s(Y{\mathcal {D}})\times s(Y{\mathcal {D}}))\cap {{\mathcal {L}}(M)} $
with
$B\in s(Y{\mathcal {D}})\cap S(\Lambda )$
. By Proposition 6.26,

Recall also from the proof of Proposition 6.26 that

Hence,

It follows that the set
$\{p_1(s\ell _{g+2}),\ldots ,p_1(s\ell _n)\}$
, and thus the set
$\{p_1(\ell _{g+2}),\ldots ,p_1(\ell _n)\}$
, are both linearly independent. There then exist vectors
$v_{g+2},\ldots ,v_n\in \text {Span}_{\mathbb {R}}\{e_1,\ldots ,e_{g+1}\}$
such that

is the identity matrix. Let
$B'\in s(Y{\mathcal {D}})\cap S(\Lambda )$
be any element and write

where
$\ell _i\ast \ell _j$
is as defined in (23). Then for
$g+2\leq i\leq j\leq n$
, since
$v_i,v_j\perp \ell _1,\ldots ,\ell _{g+2}$
, we have

Since the top left
$(g+1)\times (g+1)$
block of
$B'\in s(Y{\mathcal {D}})\cap S(\Lambda )$
is
$0$
, the same is true for
$s^{-1}B'$
. Hence,
$\beta _{ij} = 0$
whenever
$g+2\leq i\leq j\leq n$
. In other words,

By Proposition 5.2, we have

where the second bound follows since
$L_{n+1-j}L_j\ll Y$
for all j by Proposition 5.4; and the last bound follows because the map from
$\{(i,j):1\leq i\leq j\leq n\text { and }i\leq g+1\}$
to
$\{(k,\ell )\}$
sending
$(i,j)$
to
$(n+1-j,i)$
is one-to-one with its image contained within the set of pairs
$(k,\ell )$
with
$k<\ell \leq g+1$
, and because the
$L_i$
’s are nondecreasing.
To obtain a bound on the size of
$\Sigma (L,s)$
, we use (43):

Let
${\mathcal {M}}_{\mathrm {{dist}}} = \{\ell _{i,j}\mid 1\leq i\leq g+1,\,1\leq j\leq g+1\}$
. Recall that elements
$\ell _{i,j}\in {\mathcal {Z}}_1$
satisfy
$i+j\leq n+1$
. Hence, for any
$\ell _{ij}\in ({\mathcal {Z}}_1\cap {\mathcal {M}})\backslash {\mathcal {M}}_{\mathrm {{dist}}}$
, exactly one of i and j is
$\leq g+1$
. Define

We claim that the image of
$\pi _{\mathcal {M}}$
is disjoint from
${\mathcal {Z}}$
. Indeed, when
$i\leq g+1$
, we have
$\pi _{\mathcal {M}}(\ell _{i,j})\not \in {\mathcal {Z}}_1$
, and when
$j\leq g+1$
, we have
$\pi _{\mathcal {M}}(\ell _{i,j})\not \in {\mathcal {Z}}$
by Lemma 6.5 and the fact that
$a_{g+1,g+1}\in {\mathcal {Z}}$
. Thus,
$w(\pi _{\mathcal {M}}(\alpha ))\gg 1$
and
$w(\pi _{\mathcal {M}}(\alpha ))\gg w(\alpha )$
for every
$\alpha \in ({\mathcal {Z}}_1\cap {\mathcal {M}})\backslash {\mathcal {M}}_{\mathrm {{dist}}}$
. It follows that

so that

Combining (74), (76) and (77) and the identity

now yields

where the fourth line follows since
$L_{n+1-j}L_j\ll Y$
for all j by Proposition 5.4. Finally, note that

Therefore, combining (73), (75) and (78) gives

Theorem 6.24 now follows immediately from (72) by summing over the
$O(1)$
different possible
${\mathcal {Z}}$
’s and the
$O(\log ^n Y)$
different possible L’s.
6.5 Proof of the main uniformity estimates
Proof of Theorem 5.
Case (a) of Theorem 5 follows from an application of the quantitative version of the Ekedahl geometric sieve developed in [Reference Bhargava4, Theorem 3.3]. Case (b) follows from (14), (16), and Theorem 4.2.
We now use the results of this section to prove the most intricate case – namely, Case (c). For any prime p, the number of binary n-ic forms mod
$p^2$
having discriminant
$0$
mod
$p^2$
is
$O(p^{2n})$
since the p-adic density of these forms is
$O(1/p^2)$
by Proposition A.1. For any squarefree m, we then have
$O_\epsilon (m^{2n+\epsilon })$
binary n-ic forms mod
$m^2$
having discriminant
$0$
mod
$m^2$
. Hence, for any squarefree
$m \leq X^{1/2}$
, we have the bound

Using this bound for
$m\leq X^{1/2}$
, we may assume that
$M>X^{1/2}$
. We note next that from (35), (37), and Lemmas 6.1 and 6.2, we have

Applying Theorems 6.6, 6.11 and 6.24, we obtain

Setting
$\kappa = (2n-2)/(88n^6)$
yields the desired result.
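The passage from a prime p to a squarefree modulus m in the count at the start of Case (c) rests on the Chinese remainder theorem: the number of forms mod $m^2$ with discriminant $0$ mod $m^2$ is multiplicative in m. The following is a minimal sketch of that step, under the simplifying assumption n = 2 (binary quadratic forms, with discriminant $b^2-4ac$); the helper name `count_disc_zero` is hypothetical and chosen only for illustration.

```python
from itertools import product


def count_disc_zero(modulus):
    """Count triples (a, b, c) mod `modulus` with b^2 - 4ac == 0 (mod modulus).

    Illustrative helper (not from the paper): brute force over all binary
    quadratic forms a x^2 + b x y + c y^2 with coefficients mod `modulus`.
    """
    residues = range(modulus)
    return sum(
        1
        for a, b, c in product(residues, repeat=3)
        if (b * b - 4 * a * c) % modulus == 0
    )


count_mod_4 = count_disc_zero(2 ** 2)    # p = 2
count_mod_9 = count_disc_zero(3 ** 2)    # p = 3
count_mod_36 = count_disc_zero(6 ** 2)   # m = 6 = 2 * 3

# Chinese remainder theorem: the count for m = 6 is the product of the
# counts for its prime factors.
assert count_mod_36 == count_mod_4 * count_mod_9
```

Each prime factor contributes a bounded constant times $p^{2n}$, so the product over the prime factors of m picks up a factor $C^{\omega (m)}\ll _\epsilon m^{\epsilon }$ beyond $m^{2n}$.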
Theorem 5 has the following immediate consequence. For a positive squarefree integer m, let
${\mathcal {W}}_m$
denote the set of integral binary n-ic forms whose discriminants are divisible by
$m^2$
.
Corollary 6.27. For a positive integer
$n\geq 3$
, and positive real numbers M and X, we have

where
$\delta _n=1/2,\,\xi _n = 0,\,\eta _n=1/(2n)$
when n is odd and
$\delta _n=1/3,\,\xi _n = 1/(88n^5),\,\eta _n=1/(88n^6)$
when n is even.
Proof. Suppose
$f\in {\mathcal {W}}_{m}$
for some squarefree
$m>M$
. Note that for a fixed f, the number of such
$m> M$
such that
$f\in {\mathcal {W}}_m$
is
$\ll _\epsilon X^\epsilon $
. Hence, it suffices to consider the cardinality of the union over squarefree
$m>M$
. Let
$m_1$
be the product of primes
$p\mid m$
such that
$f\in {\mathcal {W}}_p^{(1)}$
. Let
$m_2$
be the product of primes
$p\mid m$
such that
$f\in {\mathcal {W}}_p^{(2)}$
. Then
$m_1m_2 = m$
. For any positive real numbers
$M_1,M_2$
such that
$M_1M_2 = M$
, we have either
$m_1> M_1$
or
$m_2> M_2$
, and so

Optimizing, we take
$M_1 = M_2 = \sqrt {M}$
when n is odd, and take
$M_1 = M^{1/3}$
,
$M_2 = M^{2/3}$
when n is even. A direct application of Theorem 5 now yields the result.
7 Proofs of the main results
We begin by proving a more general form of Theorem 6. Let N be a positive squarefree integer, and for each
$p\mid N$
, let
$\Sigma _p\subset V_n({\mathbb {Z}}/p^2{\mathbb {Z}})$
be a nonempty subset. Denote the collection
$(\Sigma _p)_{p\mid N}$
by
$\Sigma $
. Let
$V_n(\Sigma )$
be the set of all
$f\in V_n({\mathbb {Z}})$
such that the reduction of f modulo
$p^2$
lies in
$\Sigma _p$
for all
$p\mid N$
. For
$p\mid N$
, let
$\alpha _n(\Sigma ,p)$
(resp.,
$\beta _n(\Sigma ,p)$
) denote the density of elements
$f\in V_n({\mathbb {Z}})$
such that
$p^2\nmid \Delta (f)$
(resp.,
$R_f$
is maximal at p) and such that the reduction of f modulo
$p^2$
lies in
$\Sigma _p$
. For
$p\nmid N$
, simply set
$\alpha _n(\Sigma ,p)=\alpha _n(p)$
and
$\beta _n(\Sigma ,p)=\beta _n(p)$
. Finally, define

We are now ready to carry out our sieve.
Theorem 7.1. We have

where the error term is given by

where
$\eta _n = 1/(2n),\, \xi _n = 0$
when n is odd, and
$\eta _n = 1/(88n^6),\,\xi _n = 1/(88n^5)$
when n is even.
Proof. For any squarefree integer m that is relatively prime to N, let
${\mathcal {W}}_{m}(\Sigma )$
denote the set of elements
$f\in V({\mathbb {Z}})$
such that
$m^2\mid \Delta (f)$
, and such that the reduction of f modulo
$p^2$
belongs to
$\Sigma _p$
for every
$p\mid N$
. Note that
${\mathcal {W}}_m(\Sigma )$
is a union of

translates of
$m^2N^2V({\mathbb {Z}})$
. By inclusion-exclusion and Corollary 6.27, we have for any
$M>0$
,

Recalling that
$\delta _n = 1/2$
or
$1/3$
, we may take
$M = X^{3\eta _n+3\xi _n}$
to obtain the first claim in Theorem 7.1. The second claim follows identically.
Taking
$N=1$
in Theorem 7.1 yields Theorem 6. Theorems 1 and 2 are then immediate consequences of Theorem 6.
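The inclusion-exclusion step in the proof of Theorem 7.1 can be illustrated on a small scale. The sketch below is restricted, as an assumption made purely for illustration, to binary quadratic forms (n = 2) of height at most 4, with forms of discriminant zero set aside. It compares a direct count of forms with squarefree discriminant against the Möbius sum $\sum _m \mu (m)\,\#\{f : m^2\mid \Delta (f)\}$. Because the box is finite, the sum runs over all relevant m; in the paper, the sum is instead truncated at M and the tail is controlled by Corollary 6.27. The names `mobius`, `is_squarefree` and `HEIGHT` are hypothetical.

```python
from itertools import product
from math import isqrt


def mobius(m):
    """Moebius function of a positive integer m, by trial division."""
    result = 1
    p = 2
    while p * p <= m:
        if m % p == 0:
            m //= p
            if m % p == 0:
                return 0  # p^2 divides the original m
            result = -result
        p += 1
    if m > 1:
        result = -result  # one remaining prime factor
    return result


def is_squarefree(d):
    """True if the nonzero integer d has no square factor larger than 1."""
    d = abs(d)
    return d != 0 and all(d % (k * k) != 0 for k in range(2, isqrt(d) + 1))


HEIGHT = 4  # illustrative height bound
discs = [
    b * b - 4 * a * c
    for a, b, c in product(range(-HEIGHT, HEIGHT + 1), repeat=3)
]
nonzero_discs = [d for d in discs if d != 0]

# Direct count of forms with squarefree discriminant.
direct_count = sum(1 for d in nonzero_discs if is_squarefree(d))

# Inclusion-exclusion: sum of mu(m) * #{f : m^2 | disc(f)} over all m.
max_m = isqrt(max(abs(d) for d in nonzero_discs))
sieve_count = sum(
    mobius(m) * sum(1 for d in nonzero_discs if d % (m * m) == 0)
    for m in range(1, max_m + 1)
)

assert direct_count == sieve_count
```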
Next, we prove lower bounds on the number of
$S_n$
-fields having bounded discriminant. Let
$f(x,y) = a_0x^n + a_1x^{n-1}y + \cdots + a_ny^n$
be a real binary n-ic form with
$a_0\neq 0$
and nonzero discriminant. Let
$\theta $
be the image of x in
${\mathbb {R}}[x]/(f(x,1))$
, and write
$R_f$
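for the lattice spanned by
$\{1,\ a_0\theta ,\ a_0\theta ^2+a_1\theta ,\ \ldots ,\ a_0\theta ^{n-1}+a_1\theta ^{n-2}+\cdots +a_{n-2}\theta \}$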
in
${\mathbb {R}}[x]/(f(x,1))$
. Here, we identify
${\mathbb {R}}[x]/(f(x,1))$
with
${\mathbb {R}}^n$
via its real and complex embeddings and by identifying
${\mathbb {C}}={\mathbb {R}}\oplus i{\mathbb {R}}$
with
${\mathbb {R}}^2$
.
We say that
$f(x,y)$
is Minkowski-reduced if the basis
$\{1,\zeta _1,\ldots ,\zeta _{n-1}\}$
of
$R_f$
is Minkowski-reduced. We say that
$f(x,y)$
, or its
$\mathrm {{SL}}_2({\mathbb {Z}})$
-orbit, is quasi-reduced if there exists
$\gamma \in \mathrm {{SL}}_2({\mathbb {Z}})$
such that
$\gamma .f$
is Minkowski-reduced. We add the prefix ‘strongly’ if the relevant lattice has a unique Minkowski-reduced basis. The relevance of being strongly quasi-reduced is contained in the following lemma.
Lemma 7.2. Let
$n\geq 3$
and let
$f(x,y)$
and
$f^*(x,y)$
be strongly quasi-reduced integral binary n-ic forms. Suppose the corresponding rank-n rings
$R_f$
and
$R_{f^*}$
are isomorphic. Then
$f(x,y)$
and
$f^*(x,y)$
are
$\mathrm {{SL}}_2({\mathbb {Z}})$
-equivalent.
Proof. It suffices to assume
$f(x,y)=a_0x^n + \cdots + a_ny^n$
and
$f^*(x,y) = a_0^*x^n + \cdots + a_n^*y^n$
are strongly Minkowski-reduced with
$R_f\simeq R_{f^*}$
. We show
$f(x,y) = f^*(x,y)$
. Let
$\phi :R_f\rightarrow R_{f^*}$
be a ring isomorphism. By the uniqueness of Minkowski-reduced bases,
$\phi $
must map the basis elements
$1,\zeta _1,\ldots ,\zeta _{n-1}$
for
$R_f$
to the corresponding basis elements
$1,\zeta ^*_1,\ldots ,\zeta ^*_{n-1}$
for
$R_{f^*}$
. Let
$\theta $
denote the image of x in
${\mathbb {Q}}[x]/(f(x,1))$
and
$\theta ^*$
the image of x in
${\mathbb {Q}}[x]/(f^*(x,1))$
. Then
$\phi (a_0\theta ) = a_0^*\theta ^*$
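and
$a_0^*\theta ^{*2}+a_1^*\theta ^* = \phi (a_0\theta ^2+a_1\theta ) = \frac {a_0^{*2}}{a_0}\theta ^{*2}+\frac {a_1a_0^*}{a_0}\theta ^*.$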
Since
$\theta ^*$
and
$\theta ^{*2}$
are linearly independent, we have
$a_0 = a_0^*$
,
$a_1 = a_1^*$
, and
$\phi (\theta ) = \theta ^*$
, where we extend
$\phi $
naturally to
$R_f\otimes {\mathbb {Q}} = {\mathbb {Q}}[x]/(f(x,1))$
. Then since
$\phi (\zeta _{n-1}) = \zeta ^*_{n-1}$
, we have
$a_i=a_i^*$
for
$i=0,\ldots ,n-2$
. Finally,
$\phi (-a_{n-1}\theta - a_n) = \phi (\theta \zeta _{n-1}) =\theta ^*\zeta _{n-1}^*= -a_{n-1}^*\theta ^* - a_n^*$
. Hence,
$a_{n-1} = a^*_{n-1}$
and
$a_n = a^*_n$
.
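The last step of the proof uses the relation $\theta \zeta _{n-1} = -a_{n-1}\theta - a_n$ in ${\mathbb {Q}}[x]/(f(x,1))$. The sketch below checks this relation for one sample quartic, assuming the standard basis elements $\zeta _k = a_0\theta ^k+\cdots +a_{k-1}\theta $. The helpers `poly_mod` and `zeta`, and the sample coefficients, are hypothetical and serve only as an illustration.

```python
from fractions import Fraction


def poly_mod(poly, modulus):
    """Remainder of `poly` modulo `modulus`.

    Both arguments are coefficient lists, highest degree first.
    """
    rem = [Fraction(c) for c in poly]
    deg_m = len(modulus) - 1
    while len(rem) - 1 >= deg_m:
        factor = rem[0] / modulus[0]
        for i, c in enumerate(modulus):
            rem[i] -= factor * c
        rem.pop(0)  # the leading coefficient is now zero
    return rem


def zeta(coeffs, k):
    """Coefficients of zeta_k = a_0 x^k + a_1 x^(k-1) + ... + a_(k-1) x."""
    return list(coeffs[:k]) + [0]


# Sample form f(x, 1) = 2x^4 + 3x^3 - x^2 + 5x + 7 (an arbitrary choice).
a = [2, 3, -1, 5, 7]
n = len(a) - 1

# Multiplying by x shifts the coefficient list by one place.
x_times_zeta = zeta(a, n - 1) + [0]
remainder = poly_mod(x_times_zeta, a)

# Expect -a_(n-1) x - a_n, padded with leading zeros.
assert remainder == [0] * (n - 2) + [-a[n - 1], -a[n]]
```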
Proof of Theorem 3.
The condition of being strongly quasi-reduced is open in
$V_n({\mathbb {R}})$
. Therefore, given a strongly quasi-reduced element
$f\in V_n({\mathbb {R}})$
, there exists an open neighbourhood
${\mathcal {B}}$
of f in which every element is strongly quasi-reduced. Moreover, since the action of
$\mathrm {{SL}}_2({\mathbb {Z}})$
on
$V_n({\mathbb {R}})$
is discrete, we may ensure that no two elements of
${\mathcal {B}}$
are
$\mathrm {{SL}}_2({\mathbb {Z}})$
-equivalent. We may further scale
${\mathcal {B}}$
in order to assume that every element in
${\mathcal {B}}$
has discriminant bounded by
$1$
.
Consider the set
${\mathcal {B}}_X:=X^{1/(2n-2)}\cdot {\mathcal {B}}$
. No two elements in it are
$\mathrm {{SL}}_2({\mathbb {Z}})$
-equivalent, and every element in it is strongly quasi-reduced. Therefore, the rings corresponding to any two elements in
${\mathcal {B}}_X$
are nonisomorphic. However, applying Theorem 7.1, we see that
$\gg X^{(n+1)/(2n-2)}$
integral elements in
${\mathcal {B}}_X$
have discriminant less than X and correspond to maximal orders in degree-n number fields. Since these rings are pairwise nonisomorphic, so are their fields of fractions. Hence, we have constructed
$\gg X^{(n+1)/(2n-2)}$
nonisomorphic degree-n number fields of absolute discriminant less than X. Restricting to counting forms that have squarefree discriminant yields
$\gg X^{(n+1)/(2n-2)}$
nonisomorphic
$S_n$
-number fields.
We note that Theorem 7.1 also allows us to construct
$\gg X^{(n+1)/(2n-2)}$
$S_n$
-number fields satisfying any finite set of splitting conditions.
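The exponent $(n+1)/(2n-2)$ in the proof of Theorem 3 comes from two facts: the discriminant is homogeneous of degree $2n-2$ in the coefficients, and a binary n-ic form has $n+1$ coefficients. A short sketch, using the classical discriminant formula for binary cubics as an illustrative special case (the function names and the sample form are hypothetical):

```python
from fractions import Fraction


def disc_cubic(a, b, c, d):
    """Discriminant of the binary cubic a x^3 + b x^2 y + c x y^2 + d y^3."""
    return (b * b * c * c - 4 * a * c ** 3 - 4 * b ** 3 * d
            - 27 * a * a * d * d + 18 * a * b * c * d)


def count_exponent(n):
    """Exponent (n+1)/(2n-2): n+1 coefficients, each scaled by X^(1/(2n-2))."""
    return Fraction(n + 1, 2 * n - 2)


f = (1, -2, 0, 5)  # sample cubic with nonzero discriminant
t = 3
scaled = tuple(t * coeff for coeff in f)

# Homogeneity of degree 2n - 2 = 4 for n = 3.
assert disc_cubic(*scaled) == t ** 4 * disc_cubic(*f)
```

Scaling a region by $X^{1/(2n-2)}$ therefore multiplies discriminants by X and volumes by $X^{(n+1)/(2n-2)}$, which `count_exponent(n)` records as an exact fraction.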
A Computations of the local densities
$\alpha _n(p),\beta _n(p)$
Let
$n\geq 2$
be a fixed integer. For a prime p, let
$\alpha _n(p)$
denote the density of the set of binary n-ic forms having discriminant indivisible by
$p^2$
, and let
$\beta _n(p)$
denote the density of binary n-ic forms whose associated rank-n rings are maximal at p. In this section, we compute
$\alpha _n(p)$
and
$\beta _n(p)$
for all integers
$n\geq 2$
and all primes p.
Proposition A.1. We have
$\alpha _2(2) = 1/2$
and
$\alpha _n(2) = 3/8$
for
$n\geq 3$
. For odd primes p, we have

Proof. For
$j\geq 0$
,
$n\geq 1$
, and p prime, we let
$\nu _{j}(n,p)$
denote the density within monic degree-n integer polynomials of the set of those whose discriminants have p-adic valuation j. Then
$\nu _0(n,p)$
and
$\nu _1(n,p)$
are computed in [Reference Ash, Brakenhoff and Zarrabi2, Proposition 6.4 and Theorem 6.8]:

To compute the densities
$\alpha _n(p)$
, we partition the set of integral binary n-ic forms
$f(x,y)=a_0x^n+a_1x^{n-1}y+\cdots +a_ny^n$
whose discriminants are not divisible by
$p^2$
into three subsets, and compute each of their densities. For any binary form
$f(x,y)$
in
${\mathbb {Z}}[x,y]$
or in
$({\mathbb {Z}}/p^2{\mathbb {Z}})[x,y]$
, we write
$\bar {f}(x,y)$
for its reduction modulo p.
Subset 1: The set of
$f(x,y)$
with
$p\nmid a_0$
and
$p^2\nmid \Delta (f)$
. Here, for any fixed leading coefficient
$a_0\not \equiv 0\pmod p$
, the density of
$f(x,y)$
having discriminant indivisible by
$p^2$
is simply given by
$\nu _0(n,p)+\nu _1(n,p)$
. Therefore, the p-adic density of this subset is equal to

Subset 2: The set of
$f(x,y)$
with
$p\mid a_0$
,
$p\nmid a_1$
, and
$p^2\nmid \Delta (f)$
. In this case, we begin by proving that the density of elements f with fixed
$a_0$
and
$a_1$
and with
$p^2\nmid \Delta (f)$
is the same as the density of binary
$(n-1)$
-ic forms g, with fixed leading coefficient
$a_1$
such that
$p^2\nmid \Delta (g)$
. Indeed, given any
$(a_2,\ldots ,a_n)\in ({\mathbb {Z}}/p^2{\mathbb {Z}})^{n-1}$
, we write

Define

Recall that
$p^2$
strongly divides the discriminant of f if and only if
$\bar {f}(x,y)$
has a factor of the form
$h(x,y)^3$
for some linear form h or a factor of the form
$j(x,y)^2$
where j is a binary form of degree at least 2. Since
$\overline {f_{a_2,\ldots ,a_n}}(x,y)\equiv y\,\overline {g_{a_2,\ldots ,a_n}}(x,y)$
, and since y does not divide
$\overline {g_{a_2,\ldots ,a_n}}(x,y)$
, we see that
$\overline {f_{a_2,\ldots ,a_n}}(x,y)$
admits such a factor if and only if
$\overline {g_{a_2,\ldots ,a_n}}(x,y)$
does. Hence,
$S_f^{(1)} = S_g^{(1)}$
. However, we have

The density
$(\#S_g^{(1)}+\#S_g^{(2)})/p^{2(n-1)}$
is
$\nu _0(n-1,p)+\nu _1(n-1,p)$
. Taking into account that
$p\mid a_0$
and
$p\nmid a_1$
, we see that the density of this second subset is

Subset 3: The set of
$f(x,y)$
with
$p\mid a_0$
,
$p\mid a_1$
, and
$p^2\nmid \Delta (f)$
. Note that we already have
$p\mid \Delta (f)$
in this case. To ensure that
$p^2\nmid \Delta (f)$
, we must have
$p>2$
,
$p^2\nmid a_0$
, and
$p\nmid a_2$
. Indeed, if
$p=2$
, then since
$2\mid \Delta (f)$
, we have
$4\mid \Delta (f)$
; if
$p^2\mid a_0$
, then
$p^2$
(weakly) divides
$\Delta (f)$
; and if
$p\mid a_2$
, then
$y^3\mid \bar {f}$
and so
$p^2$
(strongly) divides
$\Delta (f)$
. As polynomials in
$a_0,\ldots ,a_n$
, we have

Hence, if
$p>2$
,
$p^2\nmid a_0$
, and
$p\nmid a_2$
, then
$p^2\nmid \Delta (f)$
if and only if
$p\nmid \Delta (a_2x^{n-2}+a_3x^{n-3}y+\cdots +a_ny^{n-2})$
. Hence, the density of this third subset is

Adding together these three densities yields the proposition.
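The values at $p=2$ stated in Proposition A.1 can be confirmed by brute force for small n, since whether $p^2$ divides $\Delta (f)$ depends only on the coefficients of f modulo $p^2$. The sketch below does this for $n=2$ and $n=3$; the helper `alpha` and the explicit discriminant formulas are supplied for illustration and are not taken from the paper.

```python
from fractions import Fraction
from itertools import product


def disc_quadratic(a, b, c):
    """Discriminant of a x^2 + b x y + c y^2."""
    return b * b - 4 * a * c


def disc_cubic(a, b, c, d):
    """Discriminant of a x^3 + b x^2 y + c x y^2 + d y^3."""
    return (b * b * c * c - 4 * a * c ** 3 - 4 * b ** 3 * d
            - 27 * a * a * d * d + 18 * a * b * c * d)


def alpha(disc, num_coeffs, p):
    """Density of forms mod p^2 whose discriminant is not divisible by p^2."""
    q = p * p
    good = sum(
        1
        for coeffs in product(range(q), repeat=num_coeffs)
        if disc(*coeffs) % q != 0
    )
    return Fraction(good, q ** num_coeffs)


alpha_2_at_2 = alpha(disc_quadratic, 3, 2)
alpha_3_at_2 = alpha(disc_cubic, 4, 2)

assert alpha_2_at_2 == Fraction(1, 2)
assert alpha_3_at_2 == Fraction(3, 8)

# Cubic discriminants are congruent to 0 or 1 mod 4, so 2 | disc forces 4 | disc.
assert all(disc_cubic(*c) % 4 in (0, 1) for c in product(range(4), repeat=4))
```

The final assertion reflects the observation used in Subset 3 that, for $p=2$, divisibility of the discriminant by 2 already forces divisibility by 4.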
Next, we compute the value of
$\beta _n(p)$
for integers
$n\geq 2$
and primes p.
Proposition A.2. We have

Proof. The density of monic degree-n integer polynomials that are maximal at p was computed in [Reference Ash, Brakenhoff and Zarrabi2, Proposition 3.5] to be
$1-p^{-2}$
for all
$n\geq 2$
and all primes p.
We compute
$\beta _n(p)$
by working over
${\mathbb {Z}}_p$
. Fix a binary n-ic form
$f(x,y)\in V_n({\mathbb {Z}}_p)$
. Suppose
$f(x,y)$
(mod p) factors as
$y^kg(x,y)$
, where
$g(x,y)$
is a binary
$(n-k)$
-ic form over
${\mathbb {F}}_p$
with nonzero
$x^{n-k}$
-term for some
$k\in \{0,\ldots ,n\}$
. Then, by Hensel’s lemma,
$f(x,y)$
factors as
$h_1(x,y)h_2(x,y)$
where
$h_1(x,y)\in {\mathbb {Z}}_p[x,y]$
is a binary k-ic form such that
$h_1(x,y)$
(mod p) is
$y^k$
and
$h_2(x,y)\in {\mathbb {Z}}_p[x,y]$
is a binary
$(n-k)$
-ic such that
$h_2(x,y)$
(mod p) is
$g(x,y)$
. By scaling
$h_1$
and
$h_2$
, we may further assume that the leading coefficient of
$h_2(x,y)$
is
$1$
.
Since
$h_1(x,y)$
and
$h_2(x,y)$
share no common factors (mod p), the rank-n ring over
${\mathbb {Z}}_p$
associated to
$f(x,y)$
is isomorphic to the product of the rings associated to
$h_1(x,y)$
and
$h_2(x,y)$
. Since
$h_1(x,y)$
reduces to a unit times
$y^k$
modulo p, the rank-k ring associated to
$h_1(x,y)$
is always maximal when
$k\leq 1$
and is maximal when
$k\geq 2$
if and only if
$p^2$
does not divide the
$x^k$
-coefficient. However,
$h_2(x,y)$
is monic, and so the probability that it is maximal is exactly
$1-p^{-2}$
when
$n-k\geq 2$
, and
$1$
when
$n-k=1$
. When
$k=n$
,
$f(x,y)$
is a multiple of p and is automatically nonmaximal. Summing over k, we have for
$n\geq 3$
,

When
$n=2$
, we have

This concludes the proof of Proposition A.2.
Acknowledgements
We are grateful for comments from the anonymous referees.
Competing interest
The authors have no competing interests to declare.
Funding statement
The first-named author was supported by a Simons Investigator Grant and NSF Grant DMS-1001828. The second-named author was supported by an NSERC Discovery Grant and Sloan Research Fellowship. The third-named author was supported by an NSERC Discovery Grant.