Radix Sort is a sorting algorithm based on analyzing digital data. We study the number of swaps made by Radix Select (a one-sided version of Radix Sort) to find an element with a randomly selected rank. This kind of grand average provides a smoothing over all individual distributions for specific fixed-order statistics. We give an exact analysis for the grand mean and an asymptotic analysis for the grand variance, obtained by poissonization, the Mellin transform, and depoissonization. The digital data model considered is the Bernoulli(p). The distributions involved in the swaps experience a phase change between the biased cases (p ≠ ½) and the unbiased case (p = ½). In the biased cases, the grand distribution for the number of swaps (when suitably scaled) converges to that of a perpetuity built from a two-point distribution. The tool for this proof is contraction in the Wasserstein metric space, and identifying the limit as the fixed-point solution of a distributional equation. In the unbiased case the same scaling for the number of swaps gives a limiting constant in probability.
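As a rough illustration of the quantity studied in the abstract above, the following sketch selects the key of a given rank from Bernoulli(p) bit strings and counts swaps. The two-pointer in-place partition, the fixed key length, and the names `radix_select` and `random_keys` are assumptions made for illustration; the paper's exact swap accounting may differ.

```python
import random

def random_keys(n, nbits, p, rng):
    """Generate n keys of nbits independent Bernoulli(p) bits (hypothetical helper)."""
    return [tuple(1 if rng.random() < p else 0 for _ in range(nbits))
            for _ in range(n)]

def radix_select(keys, rank):
    """Return (key with the given 0-based rank, number of swaps performed).

    One-sided radix partitioning: split the active range on the current
    bit (0s to the left, 1s to the right), then continue only in the
    side that contains the requested rank.
    """
    if not 0 <= rank < len(keys):
        raise ValueError("rank out of range")
    a = list(keys)
    lo, hi = 0, len(a) - 1
    nbits = len(a[0])
    bit = 0
    swaps = 0
    while lo < hi and bit < nbits:
        i, j = lo, hi
        while True:
            while i <= j and a[i][bit] == 0:   # skip keys already on the 0 side
                i += 1
            while i <= j and a[j][bit] == 1:   # skip keys already on the 1 side
                j -= 1
            if i >= j:
                break
            a[i], a[j] = a[j], a[i]            # misplaced pair: one swap
            swaps += 1
            i += 1
            j -= 1
        # a[lo:i] now has bit 0 and a[i:hi+1] has bit 1 at this position
        if rank < i:
            hi = i - 1
        else:
            lo = i
        bit += 1
    return a[rank], swaps
```

Averaging `swaps` over many data sets and over a uniformly chosen `rank` gives an empirical counterpart of the grand mean discussed above.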
Let $(X_n)$ be a sequence of integrable real random variables, adapted to a filtration $(\mathcal{G}_n)$. Define $C_n = \sqrt{n}\,\{(1/n)\sum_{k=1}^{n} X_k - E(X_{n+1} \mid \mathcal{G}_n)\}$ and $D_n = \sqrt{n}\,\{E(X_{n+1} \mid \mathcal{G}_n) - Z\}$, where $Z$ is the almost-sure limit of $E(X_{n+1} \mid \mathcal{G}_n)$ (assumed to exist). Conditions for $(C_n, D_n) \to \mathcal{N}(0, U) \times \mathcal{N}(0, V)$ stably are given, where $U$ and $V$ are certain random variables. In particular, under such conditions, we obtain $\sqrt{n}\,\{(1/n)\sum_{k=1}^{n} X_k - Z\} = C_n + D_n \to \mathcal{N}(0, U + V)$ stably. This central limit theorem has natural applications to Bayesian statistics and urn problems. The latter are investigated, paying special attention to multicolor randomly reinforced urns.
Exact lower bounds on the exponential moments of $\min(y, X)$ and $X\mathbf{1}\{X < y\}$ are provided given the first two moments of a random variable $X$. These bounds are useful in work on large deviation probabilities and nonuniform Berry-Esseen bounds, when the Cramér tilt transform may be employed. Asymptotic properties of these lower bounds are presented. Comparative advantages of the so-called Winsorization $\min(y, X)$ over the truncation $X\mathbf{1}\{X < y\}$ are demonstrated. An application to option pricing is given.
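To make the two transformations concrete, here is a minimal sketch that computes empirical exponential moments of the Winsorized and the truncated variable from a sample. The function names and the tilt parameter `lam` are illustrative assumptions; the code only evaluates sample averages and does not reproduce the paper's bounds.

```python
import math

def exp_moment_winsorized(xs, y, lam=1.0):
    """Sample mean of exp(lam * min(y, X)): values above y are capped at y."""
    return sum(math.exp(lam * min(y, x)) for x in xs) / len(xs)

def exp_moment_truncated(xs, y, lam=1.0):
    """Sample mean of exp(lam * X * 1{X < y}): values at or above y become 0."""
    return sum(math.exp(lam * (x if x < y else 0.0)) for x in xs) / len(xs)
```

For $y \ge 0$ and `lam > 0`, $\min(y, x) \ge x\mathbf{1}\{x < y\}$ pointwise, so the Winsorized moment is never smaller than the truncated one in this setting.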
We investigate the maximal number $M_k$ of offspring amongst all individuals in a critical Galton-Watson process started with $k$ ancestors. We show that when the reproduction law has a regularly varying tail with index $-\alpha$ for $1 < \alpha < 2$, then $k^{-1}M_k$ converges in distribution to a Fréchet law with shape parameter 1 and scale parameter depending only on $\alpha$.
We provide necessary and sufficient conditions for the asymptotic normality of $N_n$, the number of records among the first $n$ observations from a sequence of independent and identically distributed random variables, with general distribution $F$. In the case of normality we identify the centering and scaling sequences. Also, we characterize distributions for which the limit is not normal in terms of their discrete and continuous components.
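The statistic in the abstract above can be computed in one pass. The sketch below assumes strict upper records, so a tie with the current maximum does not count. That convention matters when $F$ has atoms, and it is an assumption here, not something stated in the abstract.

```python
def count_records(xs):
    """Count strict upper records: the first value, then each value
    that is strictly larger than everything seen before it."""
    count = 0
    best = None
    for x in xs:
        if best is None or x > best:
            count += 1
            best = x
    return count
```

For example, `count_records([3, 1, 4, 1, 5, 9, 2, 6])` counts the records 3, 4, 5, 9.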
We consider the level hitting times $\tau_y = \inf\{t \ge 0 \mid X_t = y\}$ and the running maximum process $M_t = \sup\{X_s \mid 0 \le s \le t\}$ of a growth-collapse process $(X_t)_{t \ge 0}$, defined as a $[0, \infty)$-valued Markov process that grows linearly between random ‘collapse’ times at which downward jumps with state-dependent distributions occur. We show how the moments and the Laplace transform of $\tau_y$ can be determined in terms of the extended generator of $X_t$ and give a power series expansion of the reciprocal of $E e^{-s\tau_y}$. We prove asymptotic results for $\tau_y$ and $M_t$: for example, if $m(y) = E\tau_y$ is of rapid variation then $M_t / m^{-1}(t) \xrightarrow{w} 1$ as $t \to \infty$, where $m^{-1}$ is the inverse function of $m$, while if $m(y)$ is of regular variation with index $a \in (0, \infty)$ and $X_t$ is ergodic, then $M_t / m^{-1}(t)$ converges weakly to a Fréchet distribution with exponent $a$. In several special cases we provide explicit formulae.
This paper is motivated by relations between association and independence of random variables. It is well known that, for real random variables, independence implies association in the sense of Esary, Proschan and Walkup (1967), while, for random vectors, this simple relationship breaks down. We modify the notion of association in such a way that any vector-valued process with independent increments also has associated increments in the new sense: association between blocks. The new notion is quite natural and admits a nice characterization for some classes of processes. In particular, using the covariance interpolation formula due to Houdré, Pérez-Abreu and Surgailis (1998), we show that within the class of multidimensional Gaussian processes, block association of increments is equivalent to supermodularity (in time) of the covariance functions. We also define corresponding versions of weak association, positive association, and negative association. It turns out that the central limit theorem for weakly associated random vectors due to Burton, Dabrowski and Dehling (1986) remains valid if weak association is relaxed to weak association between blocks.
We consider a feed-forward network with a single-server station serving jobs with multiple levels of priority. The service discipline is preemptive in that the server always serves a job with the current highest level of priority. For this system with discontinuous dynamics, we establish the sample path large deviation principle using a weak convergence argument. In the special case where jobs have two different levels of priority, we also explicitly identify the exponential decay rate of the total population overflow probabilities by examining the geometry of the zero-level sets of the system Hamiltonians.
We consider a serialized coin-tossing leader election algorithm that proceeds in rounds until a winner is chosen, or all contestants are eliminated. The analysis allows for either biased or fair coins. We find the exact distribution for the duration of any fixed contestant; asymptotically, it turns out to be a geometric distribution. Rice's method (an analytic technique) shows that the moments of the duration contain oscillations, which we give explicitly for the mean and variance. We also use convergence in the Wasserstein metric space to show that the distribution of the total number of coin flips (among all participants), suitably normalized, approaches a normal limiting random variable.
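A simulation sketch of a coin-tossing election of this kind follows. It assumes that in each round every remaining contestant flips once and survives with probability `p`, and that the process stops when exactly one contestant (the winner) or nobody remains. This elimination rule, the function name, and the `max_rounds` safety cap are assumptions for illustration, since the abstract does not spell out the protocol.

```python
import random

def leader_election(n, p=0.5, rng=None, max_rounds=10_000):
    """Simulate one election among n contestants.

    Returns (winner, rounds, flips), where winner is a contestant index,
    or None if everyone was eliminated or the round cap was reached.
    """
    rng = rng or random.Random()
    alive = list(range(n))
    rounds = 0
    flips = 0
    while len(alive) > 1 and rounds < max_rounds:
        rounds += 1
        flips += len(alive)                    # one flip per surviving contestant
        alive = [c for c in alive if rng.random() < p]
    winner = alive[0] if len(alive) == 1 else None
    return winner, rounds, flips
```

Under these assumptions a fixed contestant survives each round independently with probability `p`, which is consistent with the geometric behaviour of the duration mentioned above.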
In this paper we study the number of random records in an arbitrary split tree (or, equivalently, the number of random cuttings required to eliminate the tree). We show that a classical limit theorem for the convergence of sums of triangular arrays to infinitely divisible distributions can be used to determine the distribution of this number. After normalization the distributions are shown to be asymptotically weakly 1-stable. This work is a generalization of our earlier results for the random binary search tree in Holmgren (2010), which is one specific case of split trees. Other important examples of split trees include m-ary search trees, quad trees, medians of (2k + 1)-trees, simplex trees, tries, and digital search trees.
We analyze the mean cost of the partial match queries in random two-dimensional quadtrees. The method is based on fragmentation theory. The convergence is guaranteed by a coupling argument of Markov chains, whereas the value of the limit is computed as the fixed point of an integral equation.
We show how the extremal behavior of d-variate Archimedean copulas can be deduced from their stochastic representation as the survival dependence structure of an $\ell_1$-symmetric distribution (see McNeil and Nešlehová (2009)). We show that the extremal behavior of the radial part of the representation is determined by its Williamson d-transform. This leads in turn to simple proofs and extensions of recent results characterizing the domain of attraction of Archimedean copulas, their upper and lower tail-dependence indices, as well as their associated threshold copulas. We outline some of the practical implications of these results for the construction of Archimedean models with specific tail behavior and give counterexamples of Archimedean copulas whose coefficient of lower tail dependence does not exist.
We consider a branching process with Poissonian immigration where individuals have inheritable types. At rate θ, new individuals singly enter the total population and start a new population which evolves like a supercritical, homogeneous, binary Crump-Mode-Jagers process: individuals have independent and identically distributed lifetime durations (not necessarily exponential) during which they give birth independently at a constant rate b. First, using spine decomposition, we relax previously known assumptions required for almost-sure convergence of the total population size. Then, we consider three models of structured populations: either all immigrants have a different type, or types are drawn in a discrete spectrum or in a continuous spectrum. In each model, the vector $(P_1, P_2, \ldots)$ of relative abundances of surviving families converges almost surely. In the first model, the limit is the GEM distribution with parameter θ / b.
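For readers unfamiliar with the GEM distribution named above, the sketch below samples its first `k` weights by the standard stick-breaking construction: $P_i = B_i \prod_{j<i}(1 - B_j)$ with independent $B_i \sim \mathrm{Beta}(1, \theta)$. The function name and the truncation to `k` terms are illustrative choices; to match the abstract one would pass `theta` equal to θ / b.

```python
import random

def gem_weights(theta, k, rng=None):
    """Return the first k stick-breaking weights of a GEM(theta) sequence."""
    rng = rng or random.Random()
    weights = []
    remaining = 1.0                      # length of the stick not yet assigned
    for _ in range(k):
        b = rng.betavariate(1.0, theta)  # fraction broken off the remaining stick
        weights.append(remaining * b)
        remaining *= 1.0 - b
    return weights
```

The returned weights are nonnegative and sum to at most 1; the missing mass belongs to the terms beyond the first `k`.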
We consider a generalized form of the coupon collection problem in which a random number, $S$, of balls is drawn at each stage from an urn initially containing $n$ white balls (coupons). Each white ball drawn is colored red and returned to the urn; red balls drawn are simply returned to the urn. The question considered is then: how many white balls (uncollected coupons) remain in the urn after $k_n$ draws? Our analysis is asymptotic as $n \to \infty$. We concentrate on the case when $k_n$ draws are made, where $k_n / n \to \infty$ (the superlinear case), although we sketch known results for other ranges of $k_n$. A Gaussian limit is obtained via a martingale representation for the lower superlinear range, and a Poisson limit is derived for the upper boundary of this range via the Chen-Stein approximation.
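The urn scheme described above can be simulated directly. In this sketch the batch size is supplied by a caller-provided function `draw_size`, and the balls within one stage are drawn without replacement. Both choices, and the function name, are assumptions for illustration, since the abstract does not fix the distribution of the batch size or the within-stage sampling rule.

```python
import random

def remaining_white(n, stages, draw_size, rng=None):
    """Simulate the urn and return the number of white balls left.

    n         -- initial number of white balls (coupons)
    stages    -- number of stages
    draw_size -- function rng -> int giving the batch size S for a stage
    """
    rng = rng or random.Random()
    white = [True] * n
    for _ in range(stages):
        s = min(draw_size(rng), n)             # cannot draw more balls than exist
        for i in rng.sample(range(n), s):      # distinct balls within one stage
            white[i] = False                   # a drawn white ball turns red
    return sum(white)
```

For instance, `remaining_white(1000, 50, lambda r: r.randint(1, 5))` runs 50 stages with a batch size uniform on {1, ..., 5}.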
We consider the sample paths of the order statistics of independent and identically distributed random variables with common distribution function $F$. If $F$ is strictly increasing but possibly has discontinuities, we prove that the sample paths of the order statistics satisfy the large deviation principle in the Skorokhod $M_1$ topology. Sanov's theorem is deduced in the Skorokhod $M'_1$ topology as a corollary to this result. A number of illustrative examples are presented, including applications to the sample paths of trimmed means and Hill plots.
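As background for the Hill plots mentioned above, the following sketch computes the classical Hill estimator from the top $k$ order statistics, $H_{k,n} = \frac{1}{k}\sum_{i=1}^{k} \log\bigl(X_{(n-i+1)} / X_{(n-k)}\bigr)$, and the points of a Hill plot. The function names are illustrative, and the code assumes strictly positive data; it is not taken from the paper.

```python
import math

def hill_estimator(xs, k):
    """Hill estimator based on the k largest observations.

    Estimates the reciprocal of the tail index for heavy-tailed data.
    """
    s = sorted(xs)
    n = len(s)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < len(xs)")
    threshold = s[n - k - 1]              # the order statistic X_(n-k)
    if threshold <= 0:
        raise ValueError("data must be positive at the threshold")
    return sum(math.log(s[n - i] / threshold) for i in range(1, k + 1)) / k

def hill_plot(xs):
    """Return the points (k, H_{k,n}) for k = 1, ..., n - 1."""
    return [(k, hill_estimator(xs, k)) for k in range(1, len(xs))]
```

Plotting the pairs returned by `hill_plot` against `k` gives the sample path whose large deviations the abstract refers to.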
Uniform large deviation principles for positive functionals of all equivalent types of infinite-dimensional Brownian motions acting together with a Poisson random measure are established. The core of our approach is a variational representation formula, which for an infinite sequence of independent and identically distributed real Brownian motions and a Poisson random measure was shown in [A. Budhiraja, P. Dupuis and V. Maroulas, Variational representations for continuous time processes. Ann. Inst. H. Poincaré (to appear)].
We propose a model for the presence/absence of a population in a collection of habitat patches. This model assumes that colonisation and extinction of the patches occur as distinct phases. Importantly, the local extinction probabilities are allowed to vary between patches. This permits an investigation of the effect of habitat degradation on the persistence of the population. The limiting behaviour of the model is examined as the number of habitat patches increases to ∞. This is done in the case where the number of patches and the initial number of occupied patches increase at the same rate, and for the case where the initial number of occupied patches remains fixed.
The point process of vertices of an iteration infinitely divisible or, more specifically, of an iteration stable random tessellation in the Euclidean plane is considered. We explicitly determine its covariance measure and its pair-correlation function, as well as the cross-covariance measure and the cross-correlation function of the vertex point process and the random length measure in the general nonstationary regime. We also give special formulae in the stationary and isotropic setting. Exact formulae are given for vertex count variances in compact and convex sampling windows, and asymptotic relations are derived. Our results are then compared with those for a Poisson line tessellation having the same length density parameter. Moreover, a functional central limit theorem for the joint process of suitably rescaled total edge counts and edge lengths is established with the process (ξ, tξ), t > 0, arising in the limit, where ξ is a centered Gaussian variable with explicitly known variance.
In this paper we consider the stochastic analysis of information ranking algorithms of large interconnected data sets, e.g. Google's PageRank algorithm for ranking pages on the World Wide Web. The stochastic formulation of the problem results in an equation of the form $R \stackrel{D}{=} \sum_{i=1}^{N} C_i R_i + Q$, where $N$, $Q$, $\{R_i\}_{i \ge 1}$, and $\{C, C_i\}_{i \ge 1}$ are independent nonnegative random variables, the $\{C, C_i\}_{i \ge 1}$ are identically distributed, and the $\{R_i\}_{i \ge 1}$ are independent copies of $R$; $\stackrel{D}{=}$ stands for equality in distribution. We study the asymptotic properties of the distribution of $R$ that, in the context of PageRank, represents the frequencies of highly ranked pages. The preceding equation is interesting in its own right since it belongs to a more general class of weighted branching processes that have been found to be useful in the analysis of many other algorithms. Our first main result shows that if $E[N]\,E[C^{\alpha}] = 1$, $\alpha > 0$, and $Q$, $N$ satisfy additional moment conditions, then $R$ has a power law distribution of index $\alpha$. This result is obtained using a new approach based on an extension of Goldie's (1991) implicit renewal theorem. Furthermore, when $N$ is regularly varying of index $\alpha > 1$, $E[N]\,E[C^{\alpha}] < 1$, and $Q$, $C$ have higher moments than $\alpha$, then the distributions of $R$ and $N$ are tail equivalent. The latter result is derived via a novel sample path large deviation method for recursive random sums. Similarly, we characterize the situation when the distribution of $R$ is determined by the tail of $Q$. The preceding approaches may be of independent interest, as they can be used for analyzing other functionals on trees. We also briefly discuss the engineering implications of our results.
Melamed's theorem states that, for a Jackson queueing network, the equilibrium flow along a link follows a Poisson distribution if and only if no customers can travel along the link more than once. Barbour and Brown (1996) considered the Poisson approximate version of Melamed's theorem by allowing the customers a small probability p of travelling along the link more than once. In this note, we prove that the customer flow process is a Poisson cluster process and then establish a general approximate version of Melamed's theorem that accommodates all possible cases of 0 ≤ p < 1.