1. Introduction
Parts of Sections 1–2 are repeated nearly verbatim, for the convenience of the reader and with permission of the publisher, from [11]. Note, however, that we have updated the literature review, perhaps most notably including an excellent sequel to this paper, namely, [20]; see especially Remark 1.2.
In this section, we describe QuickSelect and QuickQuant and give historical background. The main result of this paper, Theorem 5.1, concerns an algorithm very closely related to QuickQuant known as QuickVal, which is described in Section 4. In Section 3 we will define and consider the algorithm QuickMin, which can be viewed as a special case of either QuickQuant or QuickVal.
QuickSelect (also known as FIND), introduced by Hoare [17], is a randomized algorithm (a close cousin of the randomized sorting algorithm QuickSort, also introduced by Hoare [18]) for selecting a specified order statistic from an input sequence of objects, or rather their identifying labels, usually known as keys. The keys can be numeric or symbol strings, or indeed any labels drawn from a given linearly ordered set. Suppose we are given keys
$y_1,\ldots , y_n$
and we want to find the
$m$
th smallest among them. The algorithm first selects a key (called the pivot) uniformly at random. It then compares every other key to the pivot, thereby determining the rank, call it
$r$
, of the pivot among the
$n$
keys. If
$r = m$
, then the algorithm terminates, returning the pivot key as output. If
$r \gt m$
, then the algorithm is applied recursively to the keys smaller than the pivot to find the
$m$
th smallest among those; while if
$r \lt m$
, then the algorithm is applied recursively to the keys larger than the pivot to find the
$(m - r)$
th smallest among those. More formal descriptions of QuickSelect can be found in [17] and [22], for example.
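The recursion just described can be captured by the following minimal Python sketch (ours, not taken from the cited references):

```python
import random

def quickselect(keys, m):
    """Return the m-th smallest (1-indexed) of a list of distinct keys."""
    pivot = random.choice(keys)            # pivot chosen uniformly at random
    smaller = [y for y in keys if y < pivot]
    larger = [y for y in keys if y > pivot]
    r = len(smaller) + 1                   # rank r of the pivot among the keys
    if r == m:
        return pivot                       # r = m: the pivot is the answer
    if r > m:
        return quickselect(smaller, m)     # r > m: m-th smallest is among the smaller keys
    return quickselect(larger, m - r)      # r < m: (m - r)-th smallest among the larger keys
```

Each call compares every non-pivot key to the pivot; summing comparison costs over all calls gives the total cost of the algorithm.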
The cost of running QuickSelect can be measured (somewhat crudely) by assessing the cost of comparing keys. We assume that every comparison of two (distinct) keys costs some amount that is perhaps dependent on the values of the keys, and then the cost of the algorithm is the sum of the comparison costs.
Historically, it was customary to assign unit cost to each comparison of two keys, irrespective of their values. We denote the (random) key-comparisons-count cost for QuickSelect by
$K_{n, m}$
. There have been many studies of the random variables
$K_{n, m}$
, including [3, 24, 15, 23, 14, 4, 19, 5, 9, 10]. But unit cost is not always a reasonable model for comparing two keys. For example, if each key is a string of symbols, then a more realistic model for the cost of comparing two keys is the value of the first index at which the two symbol strings differ. To date, only a few papers [29, 6, 7, 20] have considered QuickSelect from this more realistic symbol-comparisons perspective. As in [7] and [11], in this paper we will treat a rather general class of cost functions that includes both key-comparisons cost and symbol-comparisons cost.
In our set-up (to be described in detail in Section 2) for this paper, we will consider a variety of probabilistic models (called probabilistic sources) for how a key is generated as an infinite-length string of symbols, but we will always assume that the keys form an infinite sequence of independent and identically distributed and almost surely distinct symbol strings. This gives us, on a single probability space, all the randomness needed to run QuickSelect for every value of
$n$
and every value of
$m \in \{1, \ldots , n\}$
by always choosing the first key in the sequence as the pivot (and maintaining initial relative order of keys when the algorithm is applied recursively); this is what is meant by the natural coupling (cf. [8, Section 1]) of the runs of the algorithm for varying
$n$
and
$m$
(and varying cost functions).
When considering asymptotics of the cost of QuickSelect as the number of keys tends to
$\infty$
, it becomes necessary to let the order statistic
$m_n$
depend on the number of keys
$n$
. When
$m_n / n \to \alpha \in [0, 1]$
, we refer to QuickSelect for finding the
$m_n$
th order statistic among
$n$
keys as QuickQuant
$(n, \alpha )$
. As explained in [8, Section 1], the natural coupling allows us to consider stronger forms of convergence for the cost of QuickQuant
$(n, \alpha )$
than convergence in distribution, such as almost sure convergence and convergence in
$L^p$
. Fill and Nakama [7] prove, under certain `tameness' conditions (to be reviewed later) on the probabilistic source and the cost function, that, for each fixed
$\alpha$
, the cost of QuickQuant
$(n, \alpha )$
, when scaled by
$n$
, converges both in
$L^p$
and almost surely to a limiting random variable. Fill and Matterer [11] extend these univariate results to convergence results for certain related stochastic processes.
Closely related to QuickQuant
$(n, \alpha )$
is an algorithm called QuickVal
$(n, \alpha )$
, detailed in Section 4. Employing the natural coupling, QuickVal
$(n, \alpha )$
searches (almost surely unsuccessfully) for a specified population quantile
$\alpha \in [0, 1]$
in an input sample of size
$n$
. Call the total cost of comparisons for this algorithm
$S_n$
. For a general class of cost functions, Fill and Nakama [7] proved, under mild assumptions, that the scaled cost
$S_n / n$
of QuickVal converges in
$L^p$
and almost surely to a limit random variable
$S$
. For a general cost function, we consider what we term the QuickVal residual:
$\rho _n \,{:\!=}\, \sqrt {n} \left ( \frac {S_n}{n} - S \right ). \qquad (1.1)$
The residual is of natural interest, especially in light of the previous analogous work on QuickSort [1, 26, 13, 16, 28].
An outline for this paper is as follows. First, in Section 2 we carefully describe our set-up and, in some detail, discuss probabilistic sources, cost functions, and tameness; we also discuss the idea of seeds, which allow a unified treatment of all sources. Section 3 concerns QuickMin (the case
$\alpha = 0$
of QuickVal) with unit cost per key-comparison, for which we are able to calculate – à la Bindjeme and Fill for QuickSort [1] – the exact (and asymptotic)
$L^2$
-norm of the residual; the result is Theorem 3.1, which we take as motivation for the scaling factor
$\sqrt {n}$
for the QuickVal residual for general population quantiles and for general cost. The remainder of the paper is devoted to establishing convergence of the QuickVal residual. Section 4 introduces notation needed to state the main theorem and establishes an important preliminary result (Lemma 4.2). We state and prove the main theorem (Theorem 5.1), which asserts that the scaled QuickVal residual converges in law to a scale mixture of centered Gaussians, in Section 5; and in Section 6 we prove the corresponding convergence of moments.
Remark 1.1.
As recalled from [7] at the end of our Section 2.1
, many common sources, including memoryless and Markov sources, have the property that the source-specific cost function
$\beta$
corresponding to the symbol-comparisons cost for comparing keys is
$\epsilon$
-tame for every
$\epsilon \gt 0$
. Thus our main result, Theorem 5.1, applies to all such sources.
Remark 1.2.
In very recent work, Ischebeck and Neininger [20] extend our main Theorem 5.1 from univariate normal convergence for each
$\alpha$
to Gaussian-process convergence, treating
$\alpha$
as a parameter, in the metric space of càdlàg functions endowed with the Skorokhod metric.
To motivate the reader, here is a fairly easily understood instance of our main Theorem 5.1. Suppose that keys arrive as i.i.d. uniform
$(0, 1)$
random variables, and suppose that cost is measured classically as the number of key comparisons. Using QuickVal to search for population quantile
$\alpha \in [0, 1]$
, suppose for each
$k \geq 0$
that the search has been narrowed to the (random) interval
$(L_k, R_k)$
after
$k$
steps of the algorithm have been carried out; in particular,
$L_0 = 0$
and
$R_0 = 1$
. Let
$I_k \,{:\!=}\, R_k - L_k$
. Then, as shown in the proof of Theorem 5.1, the random series (of positive terms) in the expressions

converge with probability one, and the residual
$\rho _n$
given by (1.1) converges in distribution to
$\sigma _{\infty } Z$
, where
$Z$
has a standard normal distribution and is independent of
$\sigma _\infty$
.
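This example is easy to simulate (a sketch of ours, not code from the paper): with unit cost per key comparison and each pivot taken as the first arriving key in the current interval, as in the natural coupling, one can watch the scaled cost $S_n / n$ stabilize near its almost sure limit.

```python
import random

def quickval_cost(keys, alpha):
    """Key-comparisons cost of QuickVal searching for population quantile alpha:
    the first key is the pivot (natural coupling), every other key is compared
    to it, and we recurse on the side of the pivot containing alpha."""
    if not keys:
        return 0
    pivot, rest = keys[0], keys[1:]
    cost = len(rest)  # every other key in range is compared to the pivot
    # keep the keys on the same side of the pivot as alpha (keys are a.s. distinct)
    side = [y for y in rest if (y < pivot) == (alpha < pivot)]
    return cost + quickval_cost(side, alpha)

random.seed(1)
n = 20_000
ratio = quickval_cost([random.random() for _ in range(n)], 0.5) / n  # approximates S
```

On the tiny input `[0.5, 0.25, 0.75]` with `alpha = 0.1`, the two non-pivot keys are compared to the pivot `0.5` and the recursion on `[0.25]` adds nothing, for a total cost of 2.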
2. Set-up
2.1 Probabilistic sources
Let us define the fundamental probabilistic structure underlying the analysis of QuickSelect. We assume that keys arrive independently and with the same distribution and that each key is composed of a sequence of symbols from some finite or countably infinite alphabet. Let
$\Sigma$
be this alphabet (which we assume is totally ordered by
$\leq$
). Then a key is an element of
$\Sigma ^\infty$
[ordered by the lexicographic order, call it
$\preceq$
, corresponding to
$(\Sigma ,\leq )$
] and a probabilistic source is a stochastic process
$W=(W_1, W_2, W_3, \ldots )$
such that for each
$i$
the random variable
$W_i$
takes values in
$\Sigma$
. We will impose restrictions on the distribution of
$W$
that will have as a consequence that (with probability one) all keys are distinct.
We denote the cost (assumed to be non-negative) of comparing two keys
$w,w^\prime$
by
$\textrm { cost}(w, w^\prime )$
. As two examples, the choice
$\textrm { cost}(w, w') \equiv 1$
gives rise to a key-comparisons analysis, whereas if words are symbol strings then a symbol-comparisons analysis is obtained by letting
$\textrm { cost}(w, w')$
be the first index at which
$w$
and
$w'$
disagree.
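In code, this symbol-comparisons cost is simply the first index of disagreement; a small sketch (the function name is ours):

```python
def symbol_cost(w, w_prime):
    """Symbol-comparisons cost: the first (1-based) index at which the two
    symbol strings differ.  For the sources considered in the text, distinct
    keys differ at an almost surely finite index."""
    for i, (a, b) in enumerate(zip(w, w_prime), start=1):
        if a != b:
            return i
    raise ValueError("the two strings agree on every compared symbol")
```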
Since
$\Sigma ^\infty$
is totally ordered, a probabilistic source
$W$
is governed by a distribution function
$F$
defined for
$w \in \Sigma ^\infty$
by

Then the corresponding inverse probability transform
$M$
, defined by

has the property that if
$U \sim \text{uniform}(0, 1)$
, then
$M(U)$
has the same distribution as
$W$
. We refer to such uniform random variables
$U$
as seeds.
Using this technique, we can define a source-specific cost function

by
$\beta (u,v) \,{:\!=}\, \textrm { cost}(M(u), M(v))$
.
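As a concrete illustration (our own sketch, not from the references), consider the memoryless source emitting iid uniform bits: its inverse probability transform $M$ is just the binary expansion of the seed, truncated here to a finite prefix, and the source-specific symbol-comparisons cost $\beta (u, v)$ becomes the first bit index at which the two expansions differ.

```python
def M(u, length=64):
    """Inverse probability transform for the uniform-bits memoryless source:
    the binary expansion of the seed u in (0, 1), truncated to `length` bits."""
    bits = []
    for _ in range(length):
        u *= 2
        bit = int(u >= 1.0)
        bits.append(bit)
        u -= bit
    return bits

def beta(u, v):
    """Source-specific cost beta(u, v) := cost(M(u), M(v)) with
    symbol-comparisons cost: first (1-based) index of disagreement."""
    wu, wv = M(u), M(v)
    for i, (a, b) in enumerate(zip(wu, wv), start=1):
        if a != b:
            return i
    return len(wu) + 1  # fallback: expansions agree on the whole truncated prefix
```

For example, the seeds $0.25$ and $0.3$ have binary expansions $0100\ldots$ and $01001\ldots$, which first differ at index 5.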
Definition 2.1.
Let
$0\lt c \lt \infty$
and
$0\lt \epsilon \lt \infty$
. A source-specific cost function
$\beta$
is said to be
$(c,\epsilon )$
-tame if for
$0 \lt u \lt t \lt 1$
, we have
$\beta (u, t) \leq c\, (t - u)^{-\epsilon },$
and is said to be
$\epsilon$
-tame if it is
$(c, \epsilon )$
-tame for some
$c$
.
For further important background on sources, cost functions, and tameness, we refer the reader to Section 2.1 (see especially Definitions 2.3–2.4 and Remark 2.5) in Fill and Nakama [7]. Note in particular that many common sources, including memoryless and Markov sources, have the property that the source-specific cost function
$\beta$
corresponding to symbol-comparisons cost for comparing keys is
$\epsilon$
-tame for every
$\epsilon \gt 0$
.
2.2 Tree of seeds and the QuickSelect tree processes
Let
$\mathcal{T}$
be the collection of (finite or infinite) rooted ordered binary trees (whenever we refer to a binary tree we will assume it is of this variety) and let
$\overline {T}\in \mathcal{T}$
be the complete infinite binary tree. We will label each node
$\theta$
in a given tree
$T \in \mathcal{T}$
by a binary sequence representing the path from the root to
$\theta$
, where
$0$
corresponds to taking the left child and
$1$
to taking the right. We consider the set of real-valued stochastic processes each with index set equal to some
$T \in \mathcal{T}$
. For such a process, we extend the index set to
$\overline {T}$
by defining
$X_{\theta } = 0$
for
$\theta \in \overline {T} \setminus T$
. We will have need for the following definition of levels of a binary tree.
Definition 2.2.
For
$0\leq k \lt \infty$
, we define the
$k^{\textrm { th}}$
level
$\Lambda _k$
of a binary tree as the collection of vertices that are at distance
$k$
from the root.
Let
$\Theta = \bigcup _{0 \le k \lt \infty }\{0,1\}^k$
be the set of all finite-length binary strings, where
$\{0,1\}^0 = \{\varepsilon \}$
with
$\varepsilon$
denoting the empty string. Set
$L_\varepsilon \,{:\!=}\, 0$
,
$R_\varepsilon \,{:\!=}\, 1$
, and
$\tau _{\varepsilon } \,{:\!=}\, 1$
. Then, for
$\theta \in \Theta$
, we define
$|\theta |$
to be the length of the string
$\theta$
, and
$\upsilon _\theta (n)$
to be the size (through the arrival of the
$n^{\text{th}}$
key) of the subtree rooted at node
$\theta$
. Given a sequence of independent and identically distributed (iid) seeds
$U_1,U_2,U_3,\ldots$
, we recursively define

where
$\theta _1 \theta _2$
denotes the concatenation of
$\theta _1, \theta _2 \in \Theta$
. For a source-specific cost function
$\beta$
and
$0\leq p \lt \infty$
, we define

In some later definitions we will make use of the positive part function defined as usual by
$x^+ \,{:\!=}\, x \textbf { 1}(x \gt 0)$
. Given a source-specific cost function
$\beta$
and the seeds
$U_1,U_2,U_3,\ldots$
, we define the
$n$
th QuickSelect seed process as the
$n$
-node binary-tree-indexed stochastic process obtained by successive insertions of
$U_1,\ldots , U_n$
into an initially empty binary search tree (BST).
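The construction just described can be sketched as follows (names ours): seeds are inserted, in arrival order, into an initially empty BST; each node is labeled by its binary string $\theta$; and we record the arrival index $\tau _\theta$ together with the interval $(L_\theta , R_\theta )$.

```python
def seed_tree(seeds):
    """Insert seeds (in arrival order) into a BST whose nodes are labeled by
    binary strings theta; record for each occupied node theta the pair
    (tau_theta, U_{tau_theta}) and the seed interval (L_theta, R_theta)."""
    nodes = {}                     # theta -> (tau_theta, seed stored at theta)
    intervals = {"": (0.0, 1.0)}   # theta -> (L_theta, R_theta); root gets (0, 1)
    for tau, u in enumerate(seeds, start=1):
        theta = ""
        while theta in nodes:      # walk down until an empty node is found
            _, pivot = nodes[theta]
            L, R = intervals[theta]
            if u < pivot:
                intervals[theta + "0"] = (L, pivot)   # left child narrows on the right
                theta += "0"
            else:
                intervals[theta + "1"] = (pivot, R)   # right child narrows on the left
                theta += "1"
        nodes[theta] = (tau, u)
    return nodes, intervals

nodes, intervals = seed_tree([0.5, 0.25, 0.75, 0.6])
```

In this small run the fourth seed, $0.6$, lands at node $\theta = 10$ with interval $(0.5, 0.75)$.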
Before using these random variables, we provide some intuition for them.  The arrival time
$\tau _\theta$
is the index of the seed that is slotted into node
$\theta$
in the construction of the QuickSelect seed process. Note that for each
$\theta \in \Theta$
we have
$P(\tau _\theta \lt \infty ) = 1$
. The interval
$(L_\theta , R_\theta )$
provides sharp bounds for all seeds arriving after time
$\tau _\theta$
that interact with
$U_{\tau _\theta }$
in the sense of being placed in the subtree rooted at
$U_{\tau _{\theta }}$
. A crucial observation is that, conditioned on
$C_\theta$
, the seeds
$U_{\tau _\theta +1}, U_{\tau _\theta + 2}, \ldots$
are iid uniform
$(0,1)$
; thus, again conditioned on
$C_\theta$
, the sum
$S_{n,\theta }$
is the sum of
$(n-\tau _\theta )^+$
iid random variables. Note that when
$n \leq {\tau _\theta }$
the sum defining
$S_{n,\theta }$
is empty and so
$S_{n,\theta }=0$
; in this case, we shall conveniently interpret
$S_{n,\theta } / (n-{\tau _\theta })^+ = 0 / 0$
as
$0$
. The random variable
$S_{n, \theta }$
is the total cost of comparing the key with seed
$U_{\tau _{\theta }}$
with keys (among the first
$n$
overall to arrive) whose seeds fall in the interval
$(L_{\theta }, R_{\theta })$
, and
$I_{p, \theta }$
is the conditional
$p$
th moment of the cost of one such comparison: If we let
$U \sim \text{uniform}(0,1)$
independent of
$C_\theta$
, then

Conditioned on
$C_\theta$
, the term
$S_{n,\theta }$
is the sum of
$(n-\tau _\theta )^+$
iid random variables with
$p$
th moment
$I_{p,\theta }$
.
We define the
$n^{\textrm { th}}$
QuickSelect tree process as the binary-tree-indexed stochastic process
$S_n = (S_{n,\theta })_{\theta \in \Theta }$
and the limit QuickSelect tree process (so called in light of [11, Master Theorem 4.1]) by
$I = (I_\theta )_{\theta \in \Theta }$
.
We recall from [11] an easily established lemma that will be invoked in Remark 4.4 and in the proof of Lemma 6.3.
Lemma 2.3 (Lemma 3.1 of [11]). If
$\beta$
is
$(c,\epsilon )$
-tame with
$0\leq \epsilon \lt 1/s$
, then for each fixed node
$\theta \in \Lambda _k$
and
$0\leq r \lt \infty$
we have

3. Exact $L^2$ asymptotics for the QuickMin residual
Before deriving a limit law for QuickVal under general source-specific cost functions
$\beta$
, we motivate the scaling factor of
$\sqrt {n}$
in Theorem 5.1. We consider the case of QuickMin (i.e., QuickSelect for the minimum key) with key-comparisons cost (
$\beta \equiv 1$
). Note that the operations of QuickMin and of QuickVal with
$\alpha =0$
are identical. Our goal in this section is to establish Theorem 3.1, which gives exact and asymptotic expansions for the second moment of the residual in this special case.
Let
$K_n$
denote the key-comparisons cost of QuickMin and define

where
$\mu _n \,{:\!=}\,\ {\mathbb{E}} K_n = 2(n-H_n)$
for each
$n$
[22]. A consequence of [19, Theorem 1] is that
$K_n/n {\stackrel {{\mathcal L}}{\longrightarrow }} D$
, where
$D{\,\stackrel {{\mathcal L}}{=}\,} \sum _{k=0}^{\infty } \prod _{j=0}^k U_j$
has a Dickman distribution [19], with
$\mu \,{:\!=}\, {\mathbb{E}} D = 2$
(here
$U_0 \,{:\!=}\, 1$
). (Note that [19] refers to
$D-1$
as having a Dickman distribution; we ignore this distinction.) Applying [7, Theorems 3.1 and 3.2] to the special case of QuickMin using key-comparisons costs yields the stronger result that
$Y_n$
converges to a limit random variable
$Y$
in
$L^p$
for any
$p\geq 1$
and almost surely. We can then set

and this
$D$
has a Dickman distribution as defined above.
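The series representation of $D$ above makes the value ${\mathbb{E}} D = 2$ easy to check by Monte Carlo (a sketch of ours; the almost surely convergent series is truncated once the running product of uniforms is negligible):

```python
import random

def dickman_sample(rng, tol=1e-12):
    """One sample of D = sum_{k>=0} prod_{j=0}^k U_j with U_0 := 1,
    truncating the a.s.-convergent series once the product drops below tol."""
    total, prod = 0.0, 1.0
    while prod > tol:
        total += prod            # add the current partial product
        prod *= rng.random()     # multiply in the next iid uniform factor
    return total

rng = random.Random(0)
samples = 50_000
mean = sum(dickman_sample(rng) for _ in range(samples)) / samples  # should be near 2
```

Since the leading term of the series is $U_0 = 1$, every sample is at least 1, and the empirical mean settles close to 2.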
The main result of this section is the exact calculation of the second moment of
$Y_n -Y$
:
Theorem 3.1.
For
$Y_n$
and
$Y$
defined previously, we have

The remainder of this section builds to the proof of Theorem 3.1. Define

(i.e., the number of keys that fall into the left subtree of the QuickSelect seed process). To begin the derivation, note that
$K_n = n - 1 + \widetilde {K}_{N_n}$
, where
$\widetilde {K}_{N_n}$
is the key-comparisons cost for QuickMin applied to the left subtree. Note also that the same equation holds as equality in law if the process
$\widetilde {K}$
has the same distribution as the process
$K$
and is independent of
$N_n$
. We also have
$D = 1 + U \widetilde {D}$
with
$U \,{:\!=}\, U_1$
and
$\widetilde {D} {\,\stackrel {{\mathcal L}}{=}\,} D$
independent.
Make the following definitions:

Then we can express the residual
$Y_n-Y$
in terms of these "smaller versions" of
$Y_n$
and
$Y$
:

where
$C_n(i)\,{:\!=}\, n^{-1}(n-1 + \mu _i-\mu _n)$
and
$C(x)\,{:\!=}\, \mu x - 1 = 2x-1$
. Observe that with these definitions, we can break up the previous equation as

where

Conditionally given
$N_n$
and
$U$
, the random variable
$W_2$
is constant and
$W_1$
has mean zero, so

Consider the first term
${\mathbb{E}} W_1^2$
.
Lemma 3.2.

Proof. If we define

then
$W_1 = Z_1 + Z_2$
and so
${\mathbb{E}} W_1^2 = {\mathbb{E}} Z_1^2+{\mathbb{E}} Z_2^2 + 2{\mathbb{E}} (Z_1 Z_2)$
.
For the cross term
${\mathbb{E}} (Z_1 Z_2)$
, conditionally given
$N_n$
the random variable
$U$
is distributed Beta
$(N_n+1,n-N_n)$
. Therefore,

Next consider the term
${\mathbb{E}} Z_1^2$
.
Remark 3.3.
The conditional joint distribution of the process
$(Y_{n, 0})_{n \geq 0}$
and the random variable
$Y^{(0)}$
given
$N_n$
is the conditional joint distribution of the process
$(Y^*_{N_n})_{n \geq 0}$
and the random variable
$Y^*$
given
$N_n$
, where the process
$(Y^*_n)$
and the random variable
$Y^*$
are independent of
$N_n$
and have (unconditionally) the same joint distribution as the process
$(Y_n)$
and the random variable
$Y$
.
In light of the preceding remark,

Since
$N_n\sim \text{unif}\left \{0,1,2,\ldots ,n-1\right \}$
, conditioning on
$N_n$
gives

Finally, consider the term
${\mathbb{E}} Z_2^2$
. Since
$Y^{(0)}$
is independent of
$N_n$
and
$U$
, we have

Recall that
$N_n \sim \text{unif}\left \{0,1,\ldots ,n-1\right \}$
and that conditionally given
$N_n$
, we have
$U\sim \text{Beta}(N_n+1,n-N_n)$
; therefore,

Since
${\mathbb{E}} \big ( Y^{(0)} \big )^2 = 1/2$
[21], we have that

Putting these calculations together, we get that

Now we consider the term
${\mathbb{E}} W_2^2$
.
Lemma 3.4.

Proof. We have

Squaring and then taking expectations, we find

Recall that conditionally given
$N_n$
we have
$U\sim \text{Beta}(N_n+1,n-N_n)$
, which implies that the cross term vanishes; therefore, it suffices to consider the two squared terms. From (3.7) we know that

We now proceed to treat the final term

We can compute the first and second moments of
$H_n - H_{N_n}$
as follows. Fixing
$n$
, for
$1 \leq i \leq n$
consider the events
$B_i \,{:\!=}\, \{N_n \lt i\}$
, which satisfy
$\mathbb{P}(B_i) = \frac {i}{n}$
and
$B_i \cap B_j = B_{i \wedge j}$
(where
$\wedge$
denotes minimum). Note that

Thus

and

Therefore we get that

as desired.
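As a sanity check on the first-moment computation: since $N_n$ is uniform on $\{0, 1, \ldots , n-1\}$, we have ${\mathbb{E}} (H_n - H_{N_n}) = \sum _{i=1}^n \frac {1}{i}\, \mathbb{P}(B_i) = \sum _{i=1}^n \frac {1}{n} = 1$ exactly, for every $n$; this is quickly verified in exact rational arithmetic (sketch ours):

```python
from fractions import Fraction

def harmonic(n):
    """H_n = sum_{i=1}^n 1/i in exact arithmetic, with H_0 = 0."""
    return sum(Fraction(1, i) for i in range(1, n + 1))

def mean_gap(n):
    """E[H_n - H_{N_n}] for N_n uniform on {0, 1, ..., n-1}."""
    return sum(harmonic(n) - harmonic(m) for m in range(n)) / n
```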
We will also need the following well-known (and very easy to derive) solution to a "divide-and-conquer" recurrence in the proof of Theorem 3.1.
Lemma 3.5.
Let
$(A_n)_{n\geq 0}$
and
$(B_n)_{n\geq 1}$
be sequences of real numbers that satisfy

for
$n\geq 1$
. Then for
$n\geq 0$
we have

with
$B_0 \,{:\!=}\, 0$
.
Proof of Theorem 3.1. Combining the expressions for
${\mathbb{E}} W_1^2$
and
${\mathbb{E}} W_2^2$
gives

If we define

then Lemma 3.5 implies

Plugging in
$b_n$
gives

Simplifying this expression gives

where
$a_0^2=1/2$
was substituted in the second equality. Therefore, we can conclude that

4. QuickVal and mixing distribution for residual limit distribution
Our main theorem, Theorem 5.1, asserts that the scaled QuickVal residual converges in law to a scale mixture of centered Gaussians. In this section, we introduce needed notation and prove Lemma 4.2, which gives an explicit representation of the mixing distribution.
Consider a BST constructed by the insertion (in order) of the
$n$
seeds. Then QuickQuant(
$n, \alpha$
) follows the path from the root to the node storing the
$m_n^{\textrm { th}}$
smallest key, where
$m_n/n\rightarrow \alpha$
.
For QuickVal(
$n,\alpha$
), consider the same BST of seeds with the additional value
$\alpha$
inserted (last). Then QuickVal(
$n,\alpha$
) follows the path from the root to this
$\alpha$
-node. Almost surely for
$n$
large and
$k$
fixed, the difference between these two algorithms in costs associated with the
$k$
-th pivot is negligible to lead order [Reference Fill and Nakama7, (4.2)]. See [Reference Vallée, Clément, Fill and Flajolet29] or [Reference Fill and Nakama7] for a more complete description.
When considering QuickVal, we will simplify the notation since we will only need to reference one path of nodes from the root to a leaf in the QuickSelect process tree. For this we define similar notation indexed by the pivot index (i.e., by the level in the tree). Set
$L_0 \,{:\!=}\, 0$
,
$R_0 \,{:\!=}\, 1$
, and
$\tau _0 \,{:\!=}\, 1$
. Then, for
$k \geq 1$
, we define






Remark 4.1.
Note that [7] used the notation
$S_{n,k}$
for what we have called
$S_{k,n}$
.
The random variable
$\tau _k$
is the arrival time/index of the
$k^{\textrm { th}}$
pivot. The interval
$(L_k,R_k)$
gives the range of seeds to be compared to the
$k^{\textrm { th}}$
pivot in the operation of the QuickVal algorithm. The cost of comparing seed
$i$
to the
$k^{\textrm { th}}$
pivot is given by
$X_{k,i}$
. The total comparison cost attributed to the
$k^{\textrm { th}}$
pivot is
$S_{k,n}$
.
The cost of QuickVal on
$n$
keys is then given by

Define

and

Then, conditionally given
$\widehat {C}_K$
, the random variable

is the sum of
$(n-\tau _K)^+$
independent and identically distributed random variables, each with the same conditional distribution as
$\widehat {X}_K \,{:\!=}\, \sum _{k = 1}^K X_k$
, where

and
$U$
is uniformly distributed on
$(0, 1)$
and independent of all the
$U_j$
’s. Here,
$\widehat {X}_{K,i}$
is the cost incurred by comparing seed
$i$
to pivots
$1,2,\ldots ,K$
and
$\widehat {S}_{K,n}$
is the comparison cost of all seeds that arrive after the
$K$
-th pivot to pivots
$1,2,\ldots ,K$
.
It will be helpful to condition on
$\widehat {C}_K$
later. In anticipation of this, we establish notation for the conditional expectation of
$X_k$
given
$C_k$
(which equals the conditional expectation given
$\widehat {C}_k$
) and, for
$k \leq \ell$
, the conditional expected product of
$X_k$
and
$X_\ell$
given
$\widehat {C}_{\ell }$
, as follows:


We symmetrize the definition of
$I_{2, k, \ell }$
in the indices
$k$
and
$\ell$
by setting
$I_{2, k, \ell } \,{:\!=}\, I_{2, \ell , k}$
for
$k \gt \ell$
. Finally, we write
$I_{2, k}$
as shorthand for
$I_{2, k, k}$
.
We now calculate the mean and variance of
$\widehat {X}_K$
with the intention of applying the classical central limit theorem; everything is done conditionally given
$\widehat {C}_K$
. Define
$\mu _{K}$
and
$\sigma _K^2$
to be the conditional mean and conditional variance of
$\widehat {X}_K$
given
$\widehat {C}_K$
, respectively. Then

We next present a condition guaranteeing that
$\sigma _K^2$
behaves well as
$K \to \infty$
. We note in passing that this condition is also the sufficient condition of Theorem 3.1 in [7] ensuring that
$S_n / n$
converges in
$L^2$
to

Lemma 4.2. If

then both almost surely and in
$L^1$
we have that (i) the two series on the right in the equation

converge absolutely, (ii) the equation holds, and (iii)
$\sigma _K \overset {L^1}{\longrightarrow } \sigma _{\infty }$
as
$K \to \infty$
.
Proof.
Recall the notation
$X_k = \textbf { 1}(L_{k - 1} \lt U \lt R_{k - 1}) \beta (U, U_{\tau _k})$
from above. Consider
$1 \leq k \leq \ell$
. The term
$I_{2,k,\ell } - I_k I_\ell$
equals the conditional covariance of
$X_k$
and
$X_{\ell }$
given
$\widehat {C}_{\ell }$
, and the absolute value of this conditional covariance is bounded above by the product of the conditional
$L^2$
-norms, namely,
$I_{2, k}^{1/2} I_{2, \ell }^{1/2}$
. Thus for the three desired conclusions it is sufficient that
${\mathbb{E}} \left ( \sum _{k = 1}^{\infty } I_{2, k}^{1/2} \right )^2 \lt \infty$
. But

Remark 4.3. In light of the absolute convergence noted in conclusion (i) of Lemma 4.2, we may unambiguously write

both in
$L^1$
and almost surely.
5. Convergence
Our main result is that, for a suitably tame cost function, the QuickVal residual converges in law to a scale mixture of centered Gaussians. Furthermore, we have the explicit representation of Lemma 4.2 for the random scale
$\sigma _{\infty }$
as an infinite series of random variables that depend on conditional variances and covariances related to the source-specific cost functions [see (4.13) and (4.8)–(4.9)].
Theorem 5.1.
Suppose that the cost function
$\beta$
is
$\epsilon$
-tame with
$\epsilon \lt 1/2$
. Then

where
$Z$
has a standard normal distribution and is independent of
$\sigma _\infty$
.
We approach the proof of Theorem 5.1 in two parts. First, in Proposition 5.4 we apply the central limit theorem to an approximation
$\widehat {S}_{K,n}$
of the cost of QuickVal
$S_n$
. Second, we show that the error due to the approximation
$\widehat {S}_{K,n}$
is negligible in the limit, culminating in the results of Propositions 5.9 and 5.11.
Before proving Theorem 5.1, we state a corollary of it for QuickMin. Recall that QuickMin is QuickSelect applied to find the minimum of the keys. Using a general source-specific cost function
$\beta$
, denote the cost of QuickMin on
$n$
keys by
$V_n$
. Since the operation of QuickMin is the same as that of QuickVal with
$\alpha =0$
, Theorem 5.1 implies the following convergence for the cost of QuickMin with the same mild tameness condition on the source-specific cost function.
Corollary 5.2.
Suppose that the source-specific cost function
$\beta$
is
$\epsilon$
-tame with
$\epsilon \lt 1/2$
. Then

where
$Z$
has a standard normal distribution and is independent of
$\sigma _\infty$
.
Remark 5.3.
In the key-comparisons case
$\beta = 1$
(which is
$\epsilon$
-tame for every
$\epsilon \geq 0$
) for
$k \geq 0$
we have
$L_k \equiv 0$
and
$R_k \equiv U_{\tau _k}$
, with the convention
$U_{\tau _0} \,{:\!=}\, 1$
. Hence
$I_k = U_{\tau _{k - 1}}$
for
$k \geq 1$
, and
$I_{2, k, \ell } = U_{\tau _{\ell - 1}}$
for
$1 \leq k \leq \ell$
. Therefore
$S = \sum _{k \geq 1} U_{\tau _{k - 1}} = 1 + \sum _{k \geq 1} U_{\tau _k}$
and

in Corollary 5.2. To further simplify the understanding of
$\sigma ^2_{\infty }$
, and hence of the limit in Corollary 5.2 in this case, observe that
$U_{\tau _1}, U_{\tau _2}, \ldots$
have the same joint distribution as the cumulative products
$U_1, U_1 U_2, \ldots$
. Thus

Define

Proposition 5.4.
Fix
$K \in \{1, 2, \ldots \}$
. Suppose that

for
$k=1,2,\ldots , K$
. Then

as
$n\rightarrow \infty$
, where
$Z$
has a standard normal distribution independent of
$\sigma _K$
.
Proof.
The classical central limit theorem for independent and identically distributed random variables applied conditionally given
$\widehat {C}_K$
yields

Since
$\tau _K$
is finite almost surely, Slutsky’s theorem (applied conditionally given
$\widehat {C}_K$
) implies that we can replace
$\sqrt {(n-\tau _K)^+}$
by
$\sqrt {n}$
in the denominator of (5.1). Finally, applying the dominated convergence theorem to conditional distribution functions gives that the resulting conditional convergence in distribution in (5.1) holds unconditionally.
Define

and let

Note that
$W_n$
does not depend on
$K$
. We can write
$W_n$
in terms of the cost of QuickVal as follows:

We prove that
$W_n\,{\stackrel {{\mathcal L}}{\longrightarrow }}\,\sigma _\infty Z$
(which is Proposition 5.9) in three parts. First (Lemma 5.5) we show that
$\left \lvert T_{K,n} - W_{K,n} \right \rvert \to 0$
almost surely. Next (Lemma 5.7) we show that
$\left \lVert \overline {W}_{K,n} \right \rVert _2$
is negligible as first
$n \rightarrow \infty$
and then
$K\rightarrow \infty$
. Lastly (see the proof below of Proposition 5.9), an application of Markov’s inequality gives the desired convergence.
Lemma 5.5.
For
$K$
fixed, if
${\mathbb{E}} I_k \lt \infty$
for
$k=1,2,\ldots ,K$
, then

almost surely as
$n\rightarrow \infty$
.
Remark 5.6.
The condition
${\mathbb{E}} I_k \lt \infty$
in Lemma 5.5 is weaker than the condition
${\mathbb{E}} I_{2,k} \lt \infty$
in Proposition 5.4.
Proof of Lemma 5.5. When
$n \gt \tau _K$
we have

For a fixed
$K$
with
$k\leq K$
, the almost sure finiteness of
$\tau _k$
and
$\tau _K$
implies that the sum

consists of an almost surely finite number of terms. Since each term
$\left \lvert X_{k,i} - I_k \right \rvert$
is finite almost surely, the sum in (5.2) is finite almost surely. Therefore,
$\left \lvert T_{K,n}-W_{K,n} \right \rvert \rightarrow 0$
almost surely as
$n\rightarrow \infty$
.
Lemma 5.7. Let

Then

Remark 5.8.
A necessary and sufficient condition for
$\epsilon _K \rightarrow 0$
as
$K\rightarrow \infty$
is (4.11). Therefore, by Remark 4.4,
$\epsilon$
-tameness for some
$\epsilon \lt 1/2$
is sufficient.
Proof of Lemma 5.7. Minkowski’s inequality yields

By conditioning on
$C_k$
, we can calculate the square of the
$L^2$
-norm here:

where we use the fact that, conditionally given
$C_k$
, the random variables
$X_{k,i}-I_k$
for
$i \gt \tau _k$
are iid with zero mean. Substituting (5.4) into (5.3) gives the result.
Proposition 5.9. Suppose that

Then

where
$Z$
has a standard normal distribution independent of
$\sigma _\infty$
.
Proof.
Let
$t\in {\mathbb{R}}$
and
$\delta \gt 0$
. Since
$W_n \leq t$
implies either

we have

Markov’s inequality and Lemma 5.7 imply

Taking limits superior as
$n\rightarrow \infty$
gives

by (5.5)–(5.6), Lemma 5.5, and Proposition 5.4, respectively. Now taking limits as
$K\rightarrow \infty$
gives

by Lemma 4.2 and the assumption that
$\epsilon _K\to 0$
. Letting
$\delta \rightarrow 0$
yields

Applying the previous argument with limsup replaced by liminf to

implies

Since
$\sigma _{\infty } Z$
has a continuous distribution, combining (5.7) and (5.8) gives the result.
For completeness we include the following simple lemma, which will be needed in the sequel.
Lemma 5.10.
Let
$0 \lt p \lt 1$
and
$a_1, \ldots , a_n$
be non-negative real numbers. Then
$\left ( \sum _{i=1}^n a_i \right )^p \leq \sum _{i=1}^n a_i^p$.
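A quick numeric sketch checking this subadditivity bound $\left ( \sum _{i=1}^n a_i \right )^p \leq \sum _{i=1}^n a_i^p$ for $0 \lt p \lt 1$ on random non-negative inputs (the sampled ranges are illustrative):

```python
import random

random.seed(1)
ok = True
for _ in range(100):
    n = random.randint(1, 10)
    a = [random.uniform(0.0, 10.0) for _ in range(n)]
    p = random.uniform(0.01, 0.99)
    # subadditivity of t -> t^p for 0 < p < 1: (sum a_i)^p <= sum a_i^p
    ok = ok and sum(a) ** p <= sum(x ** p for x in a) + 1e-12
print(ok)
```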
The final step in the proof of Theorem 5.1 is to show that the difference between the centering random variable

in
$W_n$
and the more natural

is negligible (when scaled by
$1 / \sqrt {n}$
) in the limit as
$n\rightarrow \infty$
.
Proposition 5.11.
If the source-specific cost function
$\beta$
is
$\epsilon$
-tame with
$\epsilon \lt 1/2$
, then

almost surely as
$n\rightarrow \infty$
.
Proof.
Observe that for any
$0 \lt \delta \lt 1/2$
, we have

Therefore, if we let
$0 \lt \delta \lt (1/2) - \epsilon$
, it suffices to show that

almost surely. We prove this by showing that the random variable in (5.9) has finite expectation. An application of [Reference Fill and Nakama7, Lemma 3.2] shows that, for the
$\epsilon$
-tameness constant
$c$
, we have

Define, for
$k=1,2,\ldots$
, the sigma-field
$\mathcal{F}_k\,{:\!=}\, \sigma \langle (L_1,R_1),\ldots , (L_{k-1}, R_{k-1})\rangle$
. Conditionally given
$\mathcal{F}_k$
, the distribution of
$\tau _k$
is the convolution over
$j=0,\ldots , k-1$
of geometric distributions with success probabilities
$R_j - L_j$
. This distribution is stochastically smaller than the convolution of
$k$
geometric distributions with success probability
$R_{k-1}-L_{k-1}$
. Let
$G_k, G_{k,0}, \ldots , G_{k,k-1}$
be
$k + 1$
iid geometric random variables with success probability
$R_{k-1}-L_{k-1}$
. Then

where

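A small Monte Carlo sketch of this stochastic comparison, with illustrative (not algorithm-derived) success probabilities standing in for the $R_j - L_j$: the convolution of Geometric($R_j - L_j$) waiting times is stochastically dominated by the sum of $k$ iid Geometric($R_{k-1} - L_{k-1}$) waiting times, so in particular its mean is no larger.

```python
import random

def geom(p, rng):
    # geometric waiting time on {1, 2, ...} with success probability p
    n = 1
    while rng.random() >= p:
        n += 1
    return n

rng = random.Random(0)
probs = [0.9, 0.5, 0.3]  # illustrative values of R_j - L_j, decreasing in j
reps = 20_000
# convolution of Geometric(p_j), j = 0, ..., k-1
actual = sum(sum(geom(p, rng) for p in probs) for _ in range(reps)) / reps
# dominating sum: k iid Geometric(min_j p_j) = Geometric(R_{k-1} - L_{k-1})
dominating = sum(sum(geom(min(probs), rng) for _ in probs)
                 for _ in range(reps)) / reps
# exact means: 1/0.9 + 1/0.5 + 1/0.3 = 6.44...  versus  3/0.3 = 10
print(actual, dominating)
```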
We can now compute

where
$z=1-(R_{k-1}-L_{k-1})\in [0,1)$
for
$k\geq 2$
is the failure probability and
$p=(1/2) + \delta$
. Note that the infinite series in (5.11) can be written in terms of a polylogarithm function, as follows:

Therefore [Reference Flajolet12, Theorem 1] implies the existence of an
$\eta \in (0,1)$
such that for
$1-\eta \lt z \lt 1$
, we have

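In Flajolet's notation, $\text{Li}_{-p, 0}(z) = \sum _{i \geq 1} i^p z^i$, and the leading term of the cited singular expansion is $\Gamma (p+1)(1-z)^{-(p+1)}$ as $z \uparrow 1$. A quick numeric sketch of this leading-order growth (the exponent $p$ below is illustrative, chosen in the spirit of $p = (1/2) + \delta$):

```python
from math import gamma

def polylog_neg(p, z, terms=200_000):
    # partial sum of Li_{-p}(z) = sum_{i >= 1} i^p z^i  (converges for |z| < 1)
    total, zi = 0.0, 1.0
    for i in range(1, terms + 1):
        zi *= z
        total += (i ** p) * zi
    return total

p, z = 0.6, 0.999  # z near 1, where the singular expansion kicks in
series = polylog_neg(p, z)
leading = gamma(p + 1) * (1.0 - z) ** (-(p + 1))
# the ratio should be close to 1 for z this close to 1
print(series / leading)
```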
On
$0 \leq z \leq 1-\eta$
, the polylogarithm
$\text{Li}_{-p, 0}(z)$
is increasing and therefore we have the bound

Defining

for
$z\in [0,1)$
we get

Substituting the bound from (5.12) in (5.11) gives

Therefore, after substituting
$p=(1/2) + \delta$
, an application of the monotone convergence theorem yields

where
$C_3\,{:\!=}\,C_1C_2$
. Let
$q\,{:\!=}\, (1 / 2) - \epsilon - \delta$
; then by the restriction placed on
$\delta$
, we know
$q \gt 0$
. By [Reference Fill and Nakama7, Lemma 3.1], we have

Therefore, after defining

we have

Consequently, to check the convergence in (5.9), it suffices to check that
$ \sum _{j=0}^\infty \gamma _j^2 \lt \infty$
; however, this follows trivially from the observation that
$ \gamma _j^2 \leq 4 / j^2$
. Therefore, it remains to show that the
$k = 1$
and
$k = 2$
terms in (5.9) have finite expectation. The first arrival time
$\tau _1$
equals
$1$
identically and
${\mathbb{E}} I_1 \lt \infty$
. Applying (5.10) when
$k=2$
gives

Since
$(R_1-L_1)^{1-\epsilon } \lt 1$
a.s., it suffices to show that

for
$p=(1/2) + \delta$
. However, we can calculate the expectation in (5.13) exactly. Since
$R_1-L_1 {\,\stackrel {{\mathcal L}}{=}\,} 1-U$
, where
$U$
has a unif
$(0,1)$
distribution,

which is finite because
$p\lt 1$
.
6. Convergence of moments for QuickVal residual
The main result of this section is that, under suitable tameness assumptions for the cost function, the moments of the normalized QuickVal residual converge to those of its limiting distribution.
Theorem 6.1.
Let
$p \in [2, \infty )$
. Suppose that the cost function
$\beta$
is
$\epsilon$
-tame with
$\epsilon \lt 1 / p$
. Then the moments of orders
$\leq p$
for the normalized QuickVal residual

converge to the corresponding moments of the limit-law random variable
$\sigma _\infty Z$
.
Remark 6.2.
We will prove Theorem 6.1 using the second assertion in [Reference Chung2, Theorem 4.5.2]. Use of the first assertion in that theorem shows that, for all real
$r \in [1, p]$
, we also have convergence of
$r$
th absolute moments.
As mentioned in Remark 6.2, we prove Theorem 6.1 using [Reference Chung2, Theorem 4.5.2] by proving that, for some
$q \gt p$
, the
$L^q$
-norms of the normalized QuickVal residuals are bounded in
$n$
. Choosing
$q$
arbitrarily from the nonempty interval
$[2, 1 / \epsilon )$
and using the triangle inequality for
$L^q$
-norm, we do this by showing (in Lemmas 6.3 and 6.4, respectively) that the same
$L^q$
-boundedness holds for each of the following two sequences:

and the sequence previously treated in Proposition 5.11:

Lemma 6.3.
Let
$q \in [2, \infty )$
, and suppose that the cost function
$\beta$
is
$\epsilon$
-tame with
$0 \leq \epsilon \lt 1 / q$
. Then, the sequence
$(W_n)$
is
$L^q$
-bounded.
Proof.
This is straightforward. We proceed as at (5.3), except that we use the triangle inequality for
$L^q$
-norm rather than for
$L^2$
-norm:

To bound the
$L^q$
-norm on the right, we employ Rosenthal’s inequality [Reference Rosenthal27] conditionally given
$C_k$
to find

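(For the reader's convenience, we recall one standard form of Rosenthal's inequality, which is what is applied here conditionally given $C_k$: for independent zero-mean random variables $Y_1, \ldots , Y_n$ and $q \geq 2$, there is a constant $C(q)$ depending only on $q$ such that
${\mathbb{E}} \left \lvert \sum _{i=1}^n Y_i \right \rvert ^q \leq C(q) \left [ \sum _{i=1}^n {\mathbb{E}} \left \lvert Y_i \right \rvert ^q + \left ( \sum _{i=1}^n {\mathbb{E}} Y_i^2 \right )^{q/2} \right ].$)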
and so, by Lemma 5.10,

But by the argument at (5.4) we have

and

by again conditioning on
$C_k$
to obtain the equality here. Consider a generalization of the definition of
$I_{2, k} = I_{2, k, k}$
given in (4.9):

Therefore

Three applications of Lemma 2.3 (requiring
$\epsilon \lt 1 / q$
,
$\epsilon \lt 1$
, and
$\epsilon \lt 1 / 2$
to handle
${\mathbb{E}} I_{q, k}$
,
$\|I_k\|_q$
, and
${\mathbb{E}} I_{2, k}$
, respectively) do the rest.
Lemma 6.4.
Suppose that the cost function
$\beta$
is
$\epsilon$
-tame with
$0 \leq \epsilon \lt 1 / 2$
. Then, the sequence
$(\widehat {W}_n)$
is
$L^q$
-bounded for every
$q \lt \infty$
.
Proof.
We may and do suppose
$q \geq 2$
. We begin as in the proof of Proposition 5.11, except that there is now no harm in choosing
$\delta = 0$
. So, it is sufficient to prove that

We follow the proof of Proposition 5.11 to a large extent; in particular, what we will show is that all the terms in this sum are finite and that, for sufficiently large
$K$
, the series
$\sum _{k = K}^{\infty }$
converges. As in the proof of Proposition 5.11, we utilize the bound

which requires only
$\epsilon$
-tameness with
$\epsilon \lt 1$
. Then, we proceed much the same way as at (5.10), but now substituting convexity of
$q$
th power for use of Lemma 5.10:

where, as before,
$C_1 = 2^{\epsilon } c / (1 - \epsilon )$
.
Arguing from here just as in the proof of Proposition 5.11, we find

where
$C_2 \,{:\!=}\, \max (\Gamma (1+(q/2)), C_{q/2,\eta })$
. (See the proof of Proposition 5.11 for the definition of
$C_{q/2, \eta }$
.) Therefore, with
$C_3 \,{:\!=}\, C_1^{q/2} C_2$
, we have

By [Reference Fill and Nakama7, Lemma 3.1], we have (using our assumption
$\epsilon \lt 1/2$
for the
$j = 0$
term)

where

decreases in
$j$
and vanishes in the limit as
$j \to \infty$
. Therefore, taking
$q$
th roots and using Lemma 5.10,

If we bound the factor
$k^{1/2}$
here by
$k$
and then sum the right side over
$k \geq K$
, the result is

where

like
$\gamma _{j, q, \epsilon }$
, decreases in
$j$
and vanishes in the limit as
$j \to \infty$
. Since
$\Gamma _j \lt (2 / j)^{1 / q}$
, it follows if we take
$K \geq 2 q + 1$
that

It remains to show that
$\left \| \tau _k^{1 / 2} I_k \right \|_q \lt \infty$
for every
$k$
. For this we use (6.2) to note, since
$0 \lt \gamma _{j, q, \epsilon } \lt 2 / j$
, that it clearly suffices to consider the cases
$k = 1$
and
$k = 2$
. When
$k = 1$
we have
$\tau _1 = 1$
identically, and hence
$\left \| \tau _1^{1 / 2} I_1 \right \|_q = \left \| I_1 \right \|_q \leq C_1 \lt \infty$
. Applying (6.1) when
$k = 2$
gives

and we can exactly compute

where
$U \sim \mbox{unif}(0, 1)$
. Each of the terms in this last sum is finite, and by Stirling’s formula the
$i$
th term equals
$(1 + o(1)) i^{-[2 + ((1/2) - \epsilon ) q]} = o(i^{-2})$
as
$i \to \infty$
, so the sum converges. Hence
$\| \tau _2^{1/2} I_2 \|_q \lt \infty$
.
Remark 6.5. Matterer [Reference Matterer25, Chapter 7] describes the approach, involving the contraction method for inspiration and the method of moments for proof, that we initially took in trying to establish a limiting distribution for the QuickVal residual in the special case of QuickMin with key-comparisons cost. It turns out that, for this approach, we must consider the QuickMin limit and the residual from it bivariately. However, we discovered that, unfortunately, the limiting QuickMin residual distribution is not uniquely determined by its moments (we omit the proof here); so the method-of-moments approach is ultimately unsuccessful, unlike for QuickSort [Reference Fuchs13]. We nevertheless find that approach instructive, since it does yield a rather direct proof of convergence of moments for the residual in the special case of QuickMin with key-comparisons cost; see [Reference Matterer25, Chapter 7] for details.
Acknowledgements
The authors thank two anonymous referees for helpful comments. In particular, one of the referees provided an argument enabling us to simplify and shorten the proof of Lemma 3.4 substantially.
Data availability statement
Data availability is not applicable to this article since no data were generated or analysed.
Funding statement
Research for both authors was supported by the Acheson J. Duncan Fund for the Advancement of Research in Statistics. The funder had no role in study design, decision to publish, or preparation of the manuscript. The research of the second author was conducted while he was affiliated with The Johns Hopkins University.
Competing interests
The authors declare none.