This paper applies the theory of the quasi-likelihood method to model-based inference for sample surveys. Currently, much of the theory related to sample surveys is based on the theory of maximum likelihood. The maximum likelihood approach is available only when the full probability structure of the survey data is known. However, this knowledge is rarely available in practice. Based on central limit theory, statisticians are often willing to accept the assumption that data have, say, a normal probability structure. However, such an assumption may not be reasonable in many situations in which sample surveys are used. We establish a framework for sample surveys which is less dependent on the exact underlying probability structure using the quasi-likelihood method.
A sequence of first-order integer-valued autoregressive (INAR(1)) processes is investigated, where the autoregressive-type coefficient converges to 1. It is shown that the limiting distribution of the conditional least squares estimator for this coefficient is normal and the rate of convergence is n^{3/2}. Nearly critical Galton–Watson processes with unobservable immigration are also discussed.
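As an illustration only, the following sketch simulates a single INAR(1) path and computes a conditional least squares estimate by regressing X_t on X_{t-1}. It assumes binomial thinning and Poisson innovations, and it uses a fixed coefficient well below 1 rather than the nearly unstable regime studied in the paper. The function names and parameter values are hypothetical choices for this sketch and are not taken from the paper.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson(lam) variate with Knuth's multiplication method (small lam)."""
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def simulate_inar1(n, alpha, lam, seed=0):
    """Simulate X_t = alpha o X_{t-1} + eps_t with binomial thinning.

    Assumption of this sketch: eps_t ~ Poisson(lam), independent over t.
    """
    rng = random.Random(seed)
    x = [poisson(rng, lam / (1.0 - alpha))]  # start near the stationary mean
    for _ in range(n):
        # Binomial thinning: each of the X_{t-1} individuals survives w.p. alpha.
        survivors = sum(rng.random() < alpha for _ in range(x[-1]))
        x.append(survivors + poisson(rng, lam))
    return x


def cls_estimate(x):
    """Least squares fit of E[X_t | X_{t-1}] = alpha * X_{t-1} + mu."""
    prev, curr = x[:-1], x[1:]
    n = len(prev)
    mean_prev, mean_curr = sum(prev) / n, sum(curr) / n
    sxy = sum((a - mean_prev) * (b - mean_curr) for a, b in zip(prev, curr))
    sxx = sum((a - mean_prev) ** 2 for a in prev)
    alpha_hat = sxy / sxx
    mu_hat = mean_curr - alpha_hat * mean_prev
    return alpha_hat, mu_hat


if __name__ == "__main__":
    path = simulate_inar1(n=5000, alpha=0.5, lam=2.0, seed=1)
    print(cls_estimate(path))
```

The estimate here is the ordinary regression slope and intercept. The n^{3/2} rate quoted above concerns the regime where the coefficient tends to 1, which this fixed-coefficient sketch does not reproduce.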
This paper considers a competing risks system with p pieces of software where each piece follows the model by Littlewood (1980) described as follows. The failure rate of a piece of software relies on the residual number of bugs remaining in the software where each bug produces failures at varying rates. In effect, bugs with higher failure rates tend to be observed earlier in the testing period. Tasks are assigned to the system and the task completion times as well as the software failure times are assumed to be independent of each other. The system is observed over a fixed testing period and the system reliability upon test termination is examined. An estimator of the system reliability is presented and its asymptotic properties as well as finite-sample properties are obtained.
Estimation methods for the directional measure of a stationary planar random set Z, based only on discretized realizations of Z, are discussed. Properties of the discretized set that can be derived by comparing neighbouring grid points are used. Larger grid configurations of more than two grid points are considered. It is shown that the probabilities of observing the various types of configurations can be expressed in terms of the first contact distribution function of Z (with a finite structuring element). An important prerequisite result concerning deterministic dilation areas is also established. The inference on the mean normal measure based on 2×2 configurations is discussed in detail.
The goal of this paper is to investigate properties of statistical procedures based on numbers of different patterns by using generating functions for the probabilities of a prescribed number of occurrences of given patterns in a random text. The asymptotic formulae are derived for the expected value of the number of words occurring a given number of times and for the covariance matrix. The form of the optimal linear test based on these statistics is established. These problems appear in testing for the randomness of a string of binary bits, DNA sequencing, source coding, synchronization, quality control protocols, etc. Indeed, the probabilities of repeated (overlapping) patterns are important in information theory (the second-order properties of relative frequencies) and molecular biology problems (finding patterns with unexpectedly low or high frequencies).
We consider a parametrization of the Heath-Jarrow-Morton (HJM) family of term structure of interest rate models that allows a finite-dimensional Markovian representation of the stochastic dynamics. This parametrization results from letting the volatility function depend on time to maturity and on two factors: the instantaneous spot rate and one fixed-maturity forward rate. Our main purpose is an estimation methodology for which we have to model the observations under the historical probability measure. This leads us to consider as an additional third factor the market price of interest rate risk, that connects the historical and the HJM martingale measures. Assuming that the information comes from noisy observations of the fixed-maturity forward rate, the purpose is to estimate recursively, on the basis of this information, the three Markovian factors as well as the parameters in the model, in particular those in the volatility function. This leads to a nonlinear filtering problem, for the solution of which we describe an approximation methodology, based on time discretization and quantization. We prove the convergence of the approximate filters for each of the observed trajectories.
The tapered (or generalized) Pareto distribution, also called the modified Gutenberg-Richter law, has been used to model the sizes of earthquakes. Unfortunately, maximum likelihood estimates of the cutoff parameter are substantially biased. Alternative estimates for the cutoff parameter are presented, and their properties discussed.
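For orientation, the sketch below uses the standard parametrization of the tapered Pareto survival function, S(x) = (a/x)^β exp((a − x)/θ) for x ≥ a, where a is the lower threshold, β the shape and θ the cutoff. These symbol names are assumptions of this sketch, since the abstract does not state them. Because S is the product of a Pareto and a shifted exponential survival function, a variate can be drawn as the minimum of the two. The code does not implement the alternative cutoff estimators the paper proposes.

```python
import math
import random


def tapered_pareto_survival(x, a, beta, theta):
    """P(X > x) = (a/x)**beta * exp((a - x)/theta) for x >= a, else 1."""
    if x < a:
        return 1.0
    return (a / x) ** beta * math.exp((a - x) / theta)


def sample_tapered_pareto(rng, a, beta, theta):
    """Draw one variate as min(Pareto, shifted exponential).

    The minimum of independent variables has survival function equal to the
    product of the individual survival functions, which is exactly S above.
    """
    pareto = a * rng.paretovariate(beta)            # P(> x) = (a/x)**beta
    shifted_exp = a + rng.expovariate(1.0 / theta)  # P(> x) = exp((a - x)/theta)
    return min(pareto, shifted_exp)


if __name__ == "__main__":
    rng = random.Random(0)
    a, beta, theta = 1.0, 0.7, 10.0  # hypothetical values for illustration
    draws = [sample_tapered_pareto(rng, a, beta, theta) for _ in range(20000)]
    empirical = sum(d > 3.0 for d in draws) / len(draws)
    print(empirical, tapered_pareto_survival(3.0, a, beta, theta))
```

The exponential taper only matters for observations comparable to or larger than θ, and such observations are rare, which gives some intuition for why the cutoff is hard to estimate.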
Consider an inhomogeneous Poisson process X on [0, T] whose unknown intensity function ‘switches’ from a lower function g∗ to an upper function h∗ at some unknown point θ∗. What is known are continuous bounding functions g and h such that g∗(t) ≤ g(t) ≤ h(t) ≤ h∗(t) for 0 ≤ t ≤ T. It is shown that on the basis of n observations of the process X the maximum likelihood estimate of θ∗ is consistent for n → ∞, and also that converges in law and in pth moment to limits described in terms of the unknown functions g∗ and h∗.
We introduce the notion of weakly approaching sequences of distributions, which is a generalization of the well-known concept of weak convergence of distributions. The main difference is that the suggested notion does not demand the existence of a limit distribution. A similar definition for conditional (random) distributions is presented. Several properties of weakly approaching sequences are given. The tightness of some of them is essential. The Cramér-Lévy continuity theorem for weak convergence is generalized to weakly approaching sequences of (random) distributions. It has several applications in statistics and probability. A few examples of applications to resampling are given.
We define a class of anticipative flows on Poisson space and compute its Radon-Nikodym derivative. This result is applied to statistical testing in an anticipative queueing problem.
This paper arose from interest in assessing the quality of random number generators. The problem of testing randomness of a string of binary bits produced by such a generator gained importance with the wide use of public key cryptography and the need for secure encryption algorithms. All such algorithms are based on a generator of (pseudo) random numbers; the testing of such generators for randomness became crucial for the communications industry where digital signatures and key management are vital for information processing.
The concept of approximate entropy has been introduced in a series of papers by S. Pincus and co-authors. The corresponding statistic is designed to measure the degree of randomness of observed sequences. It is based on incremental contrasts of empirical entropies based on the frequencies of different patterns in the sequence. Sequences with large approximate entropy must have substantial fluctuation or irregularity. Alternatively, small values of this characteristic imply strong regularity, or lack of randomness, in a sequence. Pincus and Kalman (1997) evaluated approximate entropies for binary and decimal expansions of e, π, √2 and √3 with the surprising conclusion that the expansion of √3 demonstrated much less irregularity than that of π. Tractable small sample distributions are hardly available, and testing randomness is based, as a rule, on fairly long strings. Therefore, to have rigorous statistical tests of randomness based on this approximate entropy statistic, one needs the limiting distribution of this characteristic under the randomness assumption. Until now this distribution remained unknown and was thought to be difficult to obtain. To derive the limiting distribution of approximate entropy we modify its definition. It is shown that the approximate entropy as well as its modified version converges in distribution to a χ^2 random variable. The P-values of approximate entropy test statistics for binary expansions of e, π and √3 are plotted. Although some of these values for √3 digits are small, they do not provide enough statistical significance against the randomness hypothesis.
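A minimal sketch of the statistic for binary strings follows. It assumes a cyclic (wrap-around) count of overlapping blocks, as commonly used in randomness test suites, which may differ in detail from the modified definition in the paper. Here Φ(m) = Σ π log π over the observed relative frequencies π of m-bit blocks, ApEn(m) = Φ(m) − Φ(m+1), and the chi-square-type statistic is 2n(log 2 − ApEn(m)). The function names are hypothetical.

```python
import math
from collections import Counter


def phi(bits, m):
    """Sum of pi * log(pi) over relative frequencies of overlapping m-blocks.

    Blocks are counted cyclically: the string is extended by its first m - 1
    symbols so that exactly len(bits) blocks are observed.
    """
    n = len(bits)
    if m == 0:
        return 0.0
    extended = bits + bits[: m - 1]
    counts = Counter(extended[i : i + m] for i in range(n))
    return sum((c / n) * math.log(c / n) for c in counts.values())


def approximate_entropy(bits, m):
    """ApEn(m) = Phi(m) - Phi(m + 1); close to log 2 for an irregular binary string."""
    return phi(bits, m) - phi(bits, m + 1)


def apen_chi2(bits, m):
    """Chi-square-type statistic 2n(log 2 - ApEn(m)); large values suggest regularity."""
    return 2.0 * len(bits) * (math.log(2.0) - approximate_entropy(bits, m))


if __name__ == "__main__":
    periodic = "01" * 50  # perfectly regular string
    print(approximate_entropy(periodic, 1), apen_chi2(periodic, 1))
```

For the perfectly periodic string "01" repeated 50 times, both the 1-block and 2-block frequencies are split evenly between two patterns, so ApEn(1) is 0 and the statistic equals 200 log 2, far from what a random string would produce.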
This paper considers a branching process generated by an offspring distribution F with mean m < ∞ and variance σ^2 < ∞ and such that, at each generation n, there is an observed δ-migration, according to a binomial law Bpvn*Nnbef which depends on the total population size Nnbef. The δ-migration is defined as an emigration, an immigration or a null migration, depending on the value of δ, which is assumed constant throughout the different generations. The process with δ-migration is a generation-dependent Galton-Watson process, whereas the observed process is not in general a martingale. Under the assumption that the process with δ-migration is supercritical, we generalize for the observed migrating process the results relative to the Galton-Watson supercritical case that concern the asymptotic behaviour of the process and the estimation of m and σ^2, as n → ∞. Moreover, an asymptotic confidence interval of the initial population size is given.
Is the Ewens distribution the only one-parameter family of partition structures where the total number of types sampled is a sufficient statistic? In general, the answer is no. It is shown that all counterexamples can be generated via an urn scheme. The urn scheme need only satisfy two general conditions. In fact, the conditions are both necessary and sufficient. However, in particular, for a large class of partition structures that naturally arise in the infinite alleles theory of population genetics, the Ewens distribution is the only one in this class where the total number of types is sufficient for estimating the mutation rate. Finally, asymptotic sufficiency for parametric families of partition structures is discussed.
In this paper, we consider the question of which convergence properties of Markov chains are preserved under small perturbations. Properties considered include geometric ergodicity and rates of convergence. Perturbations considered include roundoff error from computer simulation. We are motivated primarily by interest in Markov chain Monte Carlo algorithms.
A Bayesian approach for analyzing layered defense systems is presented. This approach incorporates the dependence of penetration probabilities on the size of attackers going into any layer. A general formula is developed for computing the predictive distribution of the number of attackers surviving any layer as well as the posterior distribution of the penetration probabilities under the a priori assumptions that: (i) the probabilities are dependent and their joint distribution is Dirichlet, and (ii) the probabilities are independent. Positive dependence of the penetration probabilities as well as the number of attackers surviving the different layers is also established.
In the Bayesian estimation of higher-order Markov transition functions on finite state spaces, a prior distribution may assign positive probability to arbitrarily high orders. If there are n observations available, we show (for natural priors) that, with probability one, as n → ∞ the Bayesian posterior distribution ‘discriminates accurately’ for orders up to β log n, if β is smaller than an explicitly determined β0. This means that the ‘large deviations’ of the posterior are controlled by the relative entropies of the true transition function with respect to all others, much as the large deviations of the empirical distributions are governed by their relative entropies with respect to the true transition function. An example shows that the result can fail even for orders β log n if β is large.
Under the assumptions of the neutral infinite alleles model, K (the total number of alleles present in a sample) is sufficient for estimating θ (the mutation rate). This is a direct result of the Ewens sampling formula, which gives a consistent, asymptotically normal estimator for θ based on K. It is shown that the same estimator used to estimate θ under neutrality is consistent and asymptotically normal, even when the assumption of selective neutrality is violated.
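To make the estimator based on K concrete: under the Ewens sampling formula for a sample of size n, E[K] = Σ_{i=0}^{n−1} θ/(θ + i), and the maximum likelihood estimate of θ solves K = Σ_{i=0}^{n−1} θ/(θ + i). The sketch below solves this equation by bisection on a logarithmic scale. The function names and the search bounds are hypothetical choices for this sketch.

```python
import math


def expected_types(theta, n):
    """E[K] under the Ewens sampling formula for sample size n."""
    return sum(theta / (theta + i) for i in range(n))


def theta_mle(k, n, lo=1e-9, hi=1e6, iters=200):
    """Solve k = sum_{i<n} theta/(theta + i) for theta.

    expected_types is strictly increasing in theta, rising from 1 towards n,
    so a solution exists only for 1 < k < n. The bounds lo and hi are
    assumptions of this sketch and must bracket the solution.
    """
    if not 1 < k < n:
        raise ValueError("need 1 < k < n for a finite positive estimate")
    for _ in range(iters):
        mid = math.sqrt(lo * hi)  # geometric midpoint: bisection in log(theta)
        if expected_types(mid, n) < k:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


if __name__ == "__main__":
    n = 100
    k = 12  # hypothetical observed number of distinct alleles
    print(theta_mle(k, n))
```

Because the estimating equation depends on the sample only through K and n, the same computation applies unchanged whether or not neutrality actually holds, which is the setting the abstract addresses.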
Consider a sequence of possibly dependent random variables having the same marginal distribution F, whose tail 1 − F is regularly varying at infinity with an unknown index −α < 0 which is to be estimated. For i.i.d. data or for dependent sequences with the same marginal satisfying mixing conditions, it is well known that Hill's estimator is consistent for α^{-1} and asymptotically normally distributed. The purpose of this paper is to emphasize the central role played by the tail empirical process for the problem of consistency. This approach allows us to easily prove Hill's estimator is consistent for infinite order moving averages of independent random variables. Our method also suffices to prove that, for the case of an AR model, the unknown index can be estimated using the residuals generated by the estimation of the autoregressive parameters.
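For reference, Hill's estimator based on the k largest observations is H = (1/k) Σ_{i=1}^{k} log(X_(i) / X_(k+1)), where X_(1) ≥ X_(2) ≥ ... are the descending order statistics. The sketch below applies it to i.i.d. Pareto data, the simplest case mentioned above, and does not cover the moving-average or AR-residual settings. The function name, the sample size and the choice of k are hypothetical.

```python
import math
import random


def hill_estimator(data, k):
    """Hill's estimate of 1/alpha from the k largest of the positive observations."""
    xs = sorted(data, reverse=True)
    if not 0 < k < len(xs):
        raise ValueError("k must satisfy 0 < k < len(data)")
    threshold = xs[k]  # the (k+1)-th largest observation
    if threshold <= 0:
        raise ValueError("the top k + 1 observations must be positive")
    return sum(math.log(x / threshold) for x in xs[:k]) / k


if __name__ == "__main__":
    rng = random.Random(1)
    alpha = 2.0
    # random.paretovariate(alpha) has tail P(X > x) = x**(-alpha) for x >= 1.
    sample = [rng.paretovariate(alpha) for _ in range(20000)]
    print(hill_estimator(sample, k=1000), 1.0 / alpha)
```

The choice of k trades bias against variance; in this exact Pareto example there is no bias from the slowly varying part, so any moderate k gives a value near 1/α.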
The stationary distribution for the population frequencies under an infinite alleles model is described as a random sequence (x_1, x_2, ...) such that Σ x_i = 1. Likelihood ratio theory is developed for random samples drawn from such populations. As a result of the theory, it is shown that any parameter distinguishing an infinite alleles model with selection from the neutral infinite alleles model cannot be consistently estimated based on gene frequencies at a single locus. Furthermore, the likelihood ratio (neutral versus selection) converges to a non-trivial random variable under both hypotheses. This shows that if one wishes to test a completely specified infinite alleles model with selection against neutrality, the test will not obtain power 1 in the limit.
We study convergence in total variation of non-stationary Markov chains in continuous time and apply the results to the image analysis problem of object recognition. The input is a grey-scale or binary image and the desired output is a graphical pattern in continuous space, such as a list of geometric objects or a line drawing. The natural prior models are Markov point processes found in stochastic geometry. We construct well-defined spatial birth-and-death processes that converge weakly to the posterior distribution. A simulated annealing algorithm involving a sequence of spatial birth-and-death processes is developed and shown to converge in total variation to a uniform distribution on the set of posterior mode solutions. The method is demonstrated on a tame example.