This study is motivated by problems of molecular sequence comparison for multiple marker arrays with correlated distributions. In this paper, the model assumes two (or more) kinds of markers, say Markers A and B, distributed along the DNA sequence. The two primary conditions of interest are (i) many of Marker B (say ≥ m) occur, and (ii) few of Marker B (say ≤ l) occur. We call these the conditional r-scan models and ask to what extent Marker A clusters or is over-dispersed in regions satisfying condition (i) or (ii). Limiting distributions for the extremal r-scan statistics from the A array satisfying conditions (i) and (ii) are derived by extending the Chen-Stein Poisson approximation method.
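As a rough illustration of the quantities involved, the sketch below computes r-scans (the distance spanned by r consecutive gaps between sorted marker positions) for the A array and keeps only those whose interval satisfies a condition on the number of B markers it contains. The function names, the `keep` predicate, and the choice to count B markers inside the closed interval of each A r-scan are assumptions made for this example; the abstract does not specify how regions are delimited.

```python
from bisect import bisect_left, bisect_right


def r_scans(positions, r):
    """Return all r-scans x[i + r] - x[i] of the sorted marker positions."""
    pts = sorted(positions)
    return [pts[i + r] - pts[i] for i in range(len(pts) - r)]


def conditional_r_scans(a_positions, b_positions, r, keep):
    """Return the r-scans of Marker A whose interval passes the B-count test.

    `keep` is a hypothetical predicate on the number of B markers that fall
    in the closed interval [a[i], a[i + r]], e.g. lambda c: c >= m.
    """
    a = sorted(a_positions)
    b = sorted(b_positions)
    kept = []
    for i in range(len(a) - r):
        lo, hi = a[i], a[i + r]
        # Number of B markers with lo <= position <= hi.
        b_count = bisect_right(b, hi) - bisect_left(b, lo)
        if keep(b_count):
            kept.append(hi - lo)
    return kept


a_markers = [1, 2, 4, 7, 11]
b_markers = [1.5, 3, 8, 9, 10]
many_b = conditional_r_scans(a_markers, b_markers, 2, lambda c: c >= 2)
few_b = conditional_r_scans(a_markers, b_markers, 2, lambda c: c <= 1)
```

A small minimum over the kept r-scans suggests clustering of Marker A in the selected regions, and a large maximum suggests over-dispersion.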
Global weak continuity of M-functionals in a neighbourhood of the parametric distribution is established. This has implications for robustness of M-estimators vis-à-vis definitions put forward by Hampel. For instance, the Tukey bisquare location estimator is robust on neighbourhoods of the parametric model, but the median is not.
We study a random record model where the observation X_i has continuous distribution function F^{α_i} (α_i > 0) and the number of available observations is random and independent of the observations. We obtain the joint distribution of the record values and inter-record times for our model. We investigate the distribution of the number of records when the number of observations has one of the common distributions and the α's increase geometrically or linearly. A particularly interesting case arises when the observations arrive at time points paced by a Poisson point process. For this model we obtain distributional results for the inter-arrival times of records for a large class of combinations of α structures and intensity functions.
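A minimal sketch of the record-counting setup, for a fixed (non-random) number of observations. It relies on a standard fact about this kind of model that the abstract itself does not state: when the k-th observation has distribution function F raised to the power α_k, it is a record with probability α_k / (α_1 + … + α_k). The function names are illustrative only.

```python
def count_records(observations):
    """Count upper records: values strictly larger than all earlier ones."""
    records = 0
    current_max = None
    for x in observations:
        if current_max is None or x > current_max:
            records += 1
            current_max = x
    return records


def expected_records(alphas):
    """Expected number of records among len(alphas) observations.

    Uses P(k-th observation is a record) = alpha_k / (alpha_1 + ... + alpha_k),
    a standard property assumed here rather than taken from the abstract.
    """
    total = 0.0
    running_sum = 0.0
    for alpha in alphas:
        running_sum += alpha
        total += alpha / running_sum
    return total


iid_case = expected_records([1, 1, 1, 1])    # harmonic sum 1 + 1/2 + 1/3 + 1/4
geometric_case = expected_records([1, 2, 4])  # alphas doubling at each step
```

With equal α's this reduces to the classical harmonic-sum result for i.i.d. observations; geometrically increasing α's make later observations much more likely to be records.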
Let n random points be uniformly and independently distributed in the unit square, and count the number W of subsets of k of the points which are covered by some translate of a small square C. If n|C| is small, the number of such clusters is approximately Poisson distributed, but the quality of the approximation is poor. In this paper, we show that the distribution of W can be much more closely approximated by an appropriate compound Poisson distribution CP(λ_1, λ_2, …). The argument is based on Stein's method, and is far from routine, largely because the approximating distribution does not satisfy the simplifying condition that iλ_i be decreasing.
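The statistic W can be computed by brute force for small examples. The sketch below assumes C is an axis-aligned square of side `side` and ignores boundary effects, so a set of points is covered by some translate of C exactly when both its x-range and its y-range are at most `side`. The function name and parameters are hypothetical.

```python
from itertools import combinations


def count_covered_subsets(points, k, side):
    """Count k-subsets of points that fit inside some translate of a square.

    Assumes an axis-aligned square of side `side`; a subset fits if and only
    if its bounding box is no wider and no taller than `side`.
    """
    count = 0
    for subset in combinations(points, k):
        xs = [p[0] for p in subset]
        ys = [p[1] for p in subset]
        if max(xs) - min(xs) <= side and max(ys) - min(ys) <= side:
            count += 1
    return count


sample = [(0.1, 0.1), (0.15, 0.12), (0.9, 0.9)]
w_pairs = count_covered_subsets(sample, 2, 0.1)
```

The enumeration costs on the order of n choose k subset checks, so it is only meant for checking small cases, not for large simulations.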
This paper arose from interest in assessing the quality of random number generators. The problem of testing randomness of a string of binary bits produced by such a generator gained importance with the wide use of public key cryptography and the need for secure encryption algorithms. All such algorithms are based on a generator of (pseudo) random numbers; the testing of such generators for randomness became crucial for the communications industry where digital signatures and key management are vital for information processing.
The concept of approximate entropy has been introduced in a series of papers by S. Pincus and co-authors. The corresponding statistic is designed to measure the degree of randomness of observed sequences. It is based on incremental contrasts of empirical entropies based on the frequencies of different patterns in the sequence. Sequences with large approximate entropy must have substantial fluctuation or irregularity. Alternatively, small values of this characteristic imply strong regularity, or lack of randomness, in a sequence. Pincus and Kalman (1997) evaluated approximate entropies for binary and decimal expansions of e, π, √2 and √3 with the surprising conclusion that the expansion of √3 demonstrated much less irregularity than that of π. Tractable small sample distributions are hardly available, and testing randomness is based, as a rule, on fairly long strings. Therefore, to have rigorous statistical tests of randomness based on this approximate entropy statistic, one needs the limiting distribution of this characteristic under the randomness assumption. Until now this distribution remained unknown and was thought to be difficult to obtain. To derive the limiting distribution of approximate entropy we modify its definition. It is shown that the approximate entropy as well as its modified version converges in distribution to a χ² random variable. The P-values of approximate entropy test statistics for binary expansions of e, π and √3 are plotted. Although some of these values for √3 digits are small, they do not provide enough statistical significance against the randomness hypothesis.
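A test of this kind can be sketched as follows. The code uses the cyclic (wrap-around) block counts and the normalisation 2n(log 2 − ApEn(m)) with 2^m degrees of freedom, as in the approximate entropy test of the NIST SP 800-22 suite; treating that form as the one meant in the abstract is an assumption, since the abstract gives neither the normalisation nor the degrees of freedom. All function names are illustrative.

```python
import math
from collections import Counter


def phi(bits, m):
    """Sum of pi * log(pi) over the observed overlapping m-blocks (cyclic)."""
    n = len(bits)
    extended = bits + bits[:m - 1]  # wrap around so every start has a block
    counts = Counter(extended[i:i + m] for i in range(n))
    return sum((c / n) * math.log(c / n) for c in counts.values())


def approximate_entropy(bits, m):
    """ApEn(m) = phi(m) - phi(m + 1) for a string of '0'/'1' characters."""
    return phi(bits, m) - phi(bits, m + 1)


def apen_chi_square(bits, m):
    """Test statistic 2n(log 2 - ApEn(m)); assumed normalisation, see lead-in."""
    return 2.0 * len(bits) * (math.log(2) - approximate_entropy(bits, m))


def chi_square_sf_even(x, df):
    """Upper tail of a chi-square law with an even number of degrees of freedom."""
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


sequence = "0100110101"
apen = approximate_entropy(sequence, 3)
statistic = apen_chi_square(sequence, 3)
p_value = chi_square_sf_even(statistic, 2 ** 3)
```

A sequence this short only illustrates the arithmetic; as the abstract notes, such tests are meant for fairly long strings, where the limiting distribution is a reasonable reference.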
We derive explicit closed expressions for the moment generating functions of whole collections of quantities associated with the waiting time till the occurrence of composite events in either discrete or continuous-time models. The discrete-time models are independent, or Markov-dependent, binary trials and the events of interest are collections of successes with the property that each two consecutive successes are separated by no more than a fixed number of failures. The continuous-time models are renewal processes and the relevant events are clusters of points. We provide a unifying technology for treating both the discrete and continuous-time cases. This is based on first embedding the problems into similar ones for suitably selected Markov chains or Markov renewal processes, and second, applying tools from the exponential family technology.
Techniques currently available in the literature for dealing with problems in geometric probabilities seem to rely heavily on results from differential and integral geometry. This paper provides a radical departure in this respect. By using purely algebraic procedures and making use of some properties of Jacobians of matrix transformations and functions of matrix argument, the distributional aspects of the random p-content of a p-parallelotope in Euclidean n-space are studied. The common assumptions of independence and rotational invariance of the random points are relaxed and the exact distributions and arbitrary moments, not just integer moments, are derived in this article. General real matrix-variate families of distributions, whose special cases include the multivariate Gaussian, a multivariate type-1 beta, a multivariate type-2 beta and spherically symmetric distributions, are considered.
We exhibit solutions of Monge–Kantorovich mass transportation problems with constraints on the support of the feasible transportation plans and additional capacity restrictions. The Hoeffding–Fréchet inequalities are extended for bivariate distribution functions having fixed marginal distributions and satisfying additional constraints. Sharp bounds for different probabilistic functionals (e.g. Lp-distances, covariances, etc.) are given when the family of joint distribution functions has prescribed marginal distributions, satisfies restrictions on the support, and is bounded from above, or below, by other distributions.
We consider a sequence matching problem involving the optimal alignment score for contiguous sequences; rewarding matches and penalizing for deletions and mismatches. Arratia and Waterman conjectured in [1] that the score constant a(μ, δ) is a strictly monotone function (i) in δ for all positive δ and (ii) in μ if 0 ≤ μ ≤ 2δ. Here we prove that (i) is true for all δ and (ii) is true for some μ.
We provide a probabilistic proof of Stein's factors based on properties of birth and death Markov chains, solving a tantalizing puzzle in using Markov chain knowledge to view the celebrated Stein–Chen method for Poisson approximations. This work complements the work of Barbour (1988) for the case of Poisson random variable approximation.
In applied probability, the distribution of a sum of n independent Bernoulli random variables with success probabilities p_1, p_2, …, p_n is often approximated by a Poisson distribution with parameter λ = p_1 + p_2 + … + p_n. Popular bounds for the approximation error are excellent for small values, but less efficient for moderate values of p_1, p_2, …, p_n.
Upper bounds for the total variation distance are established, improving conventional estimates if the success probabilities are of medium size. The results may be applied directly, e.g. to approximation problems in risk theory.
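For small n the approximation error can be computed exactly and set against the conventional estimates. The sketch below evaluates the total variation distance between the law of the Bernoulli sum and the Poisson law with the same mean, and compares it with two classical bounds: Le Cam's bound, the sum of the squared p_i, and the Barbour-Hall bound, which multiplies that sum by (1 − e^{−λ})/λ. These are the well-known estimates, not the improved bounds of the paper, which the abstract does not state. Function names are illustrative.

```python
import math


def poisson_binomial_pmf(ps):
    """Exact pmf of a sum of independent Bernoulli(p_i) variables."""
    pmf = [1.0]
    for p in ps:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)   # this trial fails
            nxt[k + 1] += mass * p       # this trial succeeds
        pmf = nxt
    return pmf


def total_variation_to_poisson(ps):
    """Total variation distance between the sum and Poisson(sum of p_i)."""
    lam = sum(ps)
    exact = poisson_binomial_pmf(ps)
    pois = [math.exp(-lam) * lam ** k / math.factorial(k)
            for k in range(len(exact))]
    body = sum(abs(a - b) for a, b in zip(exact, pois))
    tail = max(0.0, 1.0 - sum(pois))  # Poisson mass beyond n; the sum has none
    return 0.5 * (body + tail)


def le_cam_bound(ps):
    return sum(p * p for p in ps)


def barbour_hall_bound(ps):
    lam = sum(ps)
    return (1.0 - math.exp(-lam)) / lam * sum(p * p for p in ps)


probabilities = [0.3] * 10
distance = total_variation_to_poisson(probabilities)
```

Comparing `distance` with the two bounds for moderate p_i such as 0.3 shows how much room the classical estimates leave in that regime.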
The no-aging property and the ℓ1-isotropic model it implies have been introduced to overcome certain shortcomings of the exponential model. However, its definition is abstract and not very useful for practitioners. This paper presents several additional characterizations of the no-aging property. Included are (1) characterizations that appropriately generalize the memoryless property and the constant-failure-rate property of the exponential, (2) behavioral characterizations based on fair bets, and (3) geometric characterizations of the survival and density function and differential-geometric characterizations based on tensor methods.
Size-biased permutation (SBP) is a random arrangement of frequencies of distinct categories in the order in which the categories appear for the first time in the sampling process. We study the conditions under which the SBPs converge in distribution and discuss extended versions of SBP for the case when the sum of positive frequencies is less than 1.
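For a finite list of positive frequencies, a size-biased permutation can be sampled directly: the order of first appearance in repeated sampling has the same law as drawing categories one at a time without replacement, each with probability proportional to its frequency among those not yet drawn. The sketch below uses that description; the function name and the `rng` parameter are choices made for this example.

```python
import random


def size_biased_permutation(freqs, rng=None):
    """Return category indices in size-biased order.

    Assumes every entry of freqs is strictly positive. At each step an
    undrawn index is picked with probability proportional to its frequency.
    """
    rng = rng or random.Random()
    remaining = list(range(len(freqs)))
    order = []
    while remaining:
        weights = [freqs[i] for i in remaining]
        pick = rng.choices(range(len(remaining)), weights=weights, k=1)[0]
        order.append(remaining.pop(pick))
    return order


rng = random.Random(12345)
frequencies = [0.7, 0.2, 0.1]
example_order = size_biased_permutation(frequencies, rng)
```

The first index returned equals a given category with probability equal to that category's frequency (after normalising), which is a quick sanity check on any implementation.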
The accuracy of compound Poisson approximation can be estimated using Stein's method in terms of quantities similar to those which must be calculated for Poisson approximation. However, the solutions of the relevant Stein equation may, in general, grow exponentially fast with the mean number of ‘clumps’, leading to many applications in which the bounds are of little use. In this paper, we introduce a method for circumventing this difficulty. We establish good bounds for those solutions of the Stein equation which are needed to measure the accuracy of approximation with respect to Kolmogorov distance, but only in a restricted range of the argument. The restriction on the range is then compensated by a truncation argument. Examples are given to show that the method clearly outperforms its competitors, as soon as the mean number of clumps is even moderately large.
The characterization of the exponential distribution via the coefficient of variation of the blocking time in a queueing system with an unreliable server, as given by Lin (1993), is improved by substantially weakening the conditions. Based on the coefficient of variation of certain random variables, including the blocking time, the normal service time and the minimum of the normal service and the server failure times, two new characterizations of the exponential distribution are obtained.
Asymptotic expansions are obtained for the distribution function of a studentized estimator of the offspring mean sequence in an array branching process with immigration. The expansion result is shown to hold in a test function topology. As an application of this result, it is shown that the bootstrapping distribution of the estimator of the offspring mean in a sub-critical branching process with immigration also admits the same expansion (in probability). From these considerations, it is concluded that the bootstrapping distribution provides a better approximation asymptotically than the normal distribution.
In this paper we introduce a quantile dispersion measure. We use it to characterize different classes of ageing distributions. Based on the quantile dispersion measure, we propose a new partial ordering for comparing the spread or dispersion in two probability distributions. This new partial ordering is weaker than the well known dispersive ordering and it retains most of its interesting properties.
We present general criteria for analyzing the crossing characteristics of R_I, the reliability function of an m-of-n system of components operating within a laboratory (or test-bench) environment, and R_O, the reliability function of the same system now operating subject to an external environment. Inside the laboratory the components' lifetimes may be dependently distributed, and the external environment is modeled using the general approach of Lindley and Singpurwalla (1986). Our techniques, which utilize results basic to the theory of order statistics, apply to broad classes of external environment models.
The dynamical aspects of single channel gating can be modelled by a Markov renewal process, with states aggregated into two classes corresponding to the receptor channel being open or closed, and with brief sojourns in either class not detected. This paper is concerned with the relation between the amount of time, for a given record, in which the channel appears to be open compared to the amount in which it is actually open and the difference in their proportions; this may be used to obtain information on the unobserved actual process from the observed one. Results, with extensions, on exponential families have been applied to obtain relevant generating functions and asymptotic normal distributions, including explicit forms for the parameters. Numerical results are given as illustration in special cases.
It is shown that totally positive order 2 (TP2) properties of the infinitesimal generator of a continuous-time Markov chain with totally ordered state space carry over to the chain's transition distribution function. For chains with such properties, failure rate characteristics of the first passage times are established. For Markov chains with partially ordered state space, it is shown that the first passage times have an IFR distribution under a multivariate total positivity condition on the transition function.