The cumulative residual extropy has recently been proposed as an alternative measure of extropy, based on the cumulative distribution function of a random variable. In this paper, the concept of cumulative residual extropy is extended to cumulative residual extropy inaccuracy (CREI) and dynamic cumulative residual extropy inaccuracy (DCREI). Some lower and upper bounds for these measures are provided. A characterization problem for the DCREI measure under the proportional hazard rate model is studied. Nonparametric estimators for the CREI and DCREI measures based on kernel and empirical methods are suggested, and a simulation study is presented to evaluate their performance. Simulation results show that the kernel-based estimator performs better than the empirical-based estimator. Finally, applications of the DCREI measure to model selection are provided using two real data sets.
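To illustrate the two estimation routes the abstract compares, the following sketch contrasts an empirical and a Gaussian-kernel estimator of the survival function, the ingredient from which such cumulative residual measures are built. The plug-in functional $-\tfrac12\int \bar F(t)^2\,dt$, the bandwidth rule, and all parameter choices below are illustrative assumptions, not the paper's estimators.

```python
import math

import numpy as np

def empirical_survival(x, grid):
    """Empirical survival function S_n(t) = (1/n) #{x_i > t}."""
    x = np.asarray(x, dtype=float)
    return np.array([(x > t).mean() for t in grid])

def kernel_survival(x, grid, h=None):
    """Kernel-smoothed survival function with a Gaussian kernel.

    Integrating a Gaussian kernel density estimate gives
    S_h(t) = (1/n) sum_i Phi((x_i - t) / h), with Phi the standard
    normal CDF; h defaults to Silverman's rule-of-thumb bandwidth.
    """
    x = np.asarray(x, dtype=float)
    if h is None:
        h = 1.06 * x.std(ddof=1) * len(x) ** (-0.2)

    def phi(z):  # standard normal CDF, evaluated via math.erf
        return 0.5 * (1.0 + np.array([math.erf(v / math.sqrt(2.0)) for v in z]))

    return np.array([phi((x - t) / h).mean() for t in grid])

rng = np.random.default_rng(0)
sample = rng.exponential(scale=1.0, size=500)
grid = np.linspace(0.0, 6.0, 121)
dt = grid[1] - grid[0]

S_emp = empirical_survival(sample, grid)
S_ker = kernel_survival(sample, grid)

# Plug-in estimates of a cumulative residual extropy-type functional,
# -(1/2) * integral of S(t)^2 dt (definition assumed for illustration;
# for Exp(1) the integral over [0, infinity) equals 1/2, giving -1/4)
cre_emp = -0.5 * float(np.sum(S_emp**2) * dt)
cre_ker = -0.5 * float(np.sum(S_ker**2) * dt)
```

Both estimators are monotone survival curves; the kernel version trades a small boundary bias near zero for smoothness.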
We derive large-sample and other limiting distributions of components of the allele frequency spectrum vector, $\mathbf{M}_n$, joint with the number of alleles, $K_n$, from a sample of n genes. Models analysed include those constructed from gamma and $\alpha$-stable subordinators by Kingman (thus including the Ewens model), the two-parameter extension by Pitman and Yor, and a two-parameter version constructed by omitting large jumps from an $\alpha$-stable subordinator. In each case the limiting distribution of a finite number of components of $\mathbf{M}_n$ is derived, joint with $K_n$. New results include that in the Poisson–Dirichlet case, $\mathbf{M}_n$ and $K_n$ are asymptotically independent after centering and norming for $K_n$, and it is notable, especially for statistical applications, that in other cases the limiting distribution of a finite number of components of $\mathbf{M}_n$, after centering and an unusual $n^{\alpha/2}$ norming, conditional on that of $K_n$, is normal.
This paper studies a bi-dimensional compound risk model with quasi-asymptotically independent and consistently varying-tailed random numbers of claims and establishes an asymptotic formula for the finite-time sum-ruin probability. Additionally, some results related to tail probabilities of random sums are presented, which are of significant interest in their own right. Some numerical studies are carried out to check the accuracy of the asymptotic formula.
Qu, Dassios, and Zhao (2021) suggested an exact simulation method for tempered stable Ornstein–Uhlenbeck processes, but their algorithms contain some errors. This short note aims to correct their algorithms and conduct some numerical experiments.
We obtain exact formulas for the cumulative distribution function of the variance-gamma distribution, as infinite series involving the modified Bessel function of the second kind and the modified Lommel function of the first kind. From these formulas, we deduce exact formulas for the cumulative distribution function of the product of two correlated zero-mean normal random variables.
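A quick Monte Carlo sanity check is available for the second result: for zero-mean, unit-variance jointly normal $X, Y$ with correlation $\rho$, Isserlis' theorem gives $E[XY]=\rho$ and $\mathrm{Var}(XY)=1+\rho^2$, which the empirical distribution of the product should reproduce. The sample size and $\rho$ below are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
rho = 0.6
n = 200_000

# Draw correlated standard normals via the Cholesky-style construction
x = rng.standard_normal(n)
z = rng.standard_normal(n)
y = rho * x + np.sqrt(1.0 - rho**2) * z

p = x * y  # the product whose CDF the paper treats exactly

# Known moments of the product of correlated standard normals:
# E[XY] = rho,  Var(XY) = 1 + rho^2
print(p.mean(), p.var())
```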
We use Stein’s method to establish the rates of normal approximation in terms of the total variation distance for a large class of sums of score functions of samples arising from random events driven by a marked Poisson point process on $\mathbb{R}^d$. As in the study under the weaker Kolmogorov distance, the score functions are assumed to satisfy stabilisation and moment conditions. At the cost of an additional non-singularity condition, we show that the rates are in line with those under the Kolmogorov distance. We demonstrate the use of the theorems in four applications: Voronoi tessellations, k-nearest-neighbours graphs, timber volume, and maximal layers.
We study fluctuations of the error term for the number of integer lattice points lying inside a three-dimensional Cygan–Korányi ball of large radius. We prove that the error term, suitably normalized, has a limiting value distribution which is absolutely continuous, and we provide estimates for the decay rate of the corresponding density on the real line. In addition, we establish the existence of all moments for the normalized error term, and we prove that these are given by the moments of the corresponding density.
Suppose that a system is affected by a sequence of random shocks that occur over certain time periods. In this paper we study the discrete censored $\delta$-shock model, $\delta \ge 1$, for which the system fails whenever no shock occurs within a $\delta$-length time period from the last shock, by supposing that the interarrival times between consecutive shocks are described by a first-order Markov chain (as well as under the binomial shock process, i.e., when the interarrival times between successive shocks have a geometric distribution). Using the Markov chain embedding technique introduced by Chadjiconstantinidis et al. (Adv. Appl. Prob.32, 2000), we study the joint and marginal distributions of the system’s lifetime, the number of shocks, and the number of periods in which no shocks occur, up to the failure of the system. The joint and marginal probability generating functions of these random variables are obtained, and several recursions and exact formulae are given for the evaluation of their probability mass functions and moments. It is shown that the system’s lifetime follows a Markov geometric distribution of order $\delta$ (a geometric distribution of order $\delta$ under the binomial setup) and also that it follows a matrix-geometric distribution. Some reliability properties are also given under the binomial shock process, by showing that a shift of the system’s lifetime random variable follows a compound geometric distribution. Finally, we introduce a new mixed discrete censored $\delta$-shock model, for which the system fails when no shock occurs within a $\delta$-length time period from the last shock, or the magnitude of the shock is larger than a given critical threshold $\gamma >0$. Similarly, for this mixed model, we study the joint and marginal distributions of the system’s lifetime, the number of shocks, and the number of periods in which no shocks occur, up to the failure of the system, under the binomial shock process.
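Under the binomial shock process the model is straightforward to simulate: interarrival times are geometric, and the system fails at the first interarrival exceeding $\delta$. The sketch below uses one natural convention for the failure time (time of the last shock plus $\delta$); the parameter values are illustrative.

```python
import numpy as np

def censored_delta_shock_lifetime(p, delta, rng):
    """Simulate one lifetime of the discrete censored delta-shock model
    under the binomial shock process: interarrival times between shocks
    are Geometric(p) on {1, 2, ...}, and the system fails the first time
    an interarrival exceeds delta.

    Returns (lifetime, number_of_shocks_before_failure); the failure
    time is taken as (time of last shock) + delta, one common convention.
    """
    t, shocks = 0, 0
    while True:
        gap = rng.geometric(p)       # periods until the next shock
        if gap > delta:              # no shock within delta periods
            return t + delta, shocks
        t += gap
        shocks += 1

rng = np.random.default_rng(1)
p, delta = 0.5, 3
samples = [censored_delta_shock_lifetime(p, delta, rng) for _ in range(20_000)]
lifetimes = np.array([s[0] for s in samples])
shocks = np.array([s[1] for s in samples])
```

For $p=0.5$, $\delta=3$, the failure probability per gap is $(1-p)^\delta = 0.125$, so on average $7$ shocks occur before failure and the mean lifetime works out to $14$.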
Consider the problem of determining the Bayesian credibility mean $E(X_{n+1}|X_1,\cdots, X_n)$ whenever the random claims $X_1,\cdots, X_n$, given the parameter vector $\boldsymbol{\Psi}$, are sampled from a K-component mixture family of distributions whose members are the union of different families of distributions. This article begins by deriving a recursive formula for such a Bayesian credibility mean. Moreover, under the assumption that, using additional information $Z_{i,1},\cdots,Z_{i,m}$, one may probabilistically determine whether a random claim $X_i$ belongs to a given population (or distribution), the recursive formula simplifies to an exact Bayesian credibility mean whenever all components of the mixture distribution belong to exponential families of distributions. For situations where a 2-component mixture family of distributions is an appropriate choice for data modelling, we show how one may employ such additional information, via the logistic regression model, to derive a Bayesian credibility model for a finite mixture of distributions, the Logistic Regression Credibility (LRC) model. A comparison between the LRC model and its competitor, the Regression Tree Credibility (RTC) model, is given: under the squared error loss function, the LRC’s risk function dominates the RTC’s risk function at least on an interval about $0.5$. Several examples illustrate the practical application of our findings.
We examine urn models under random replacement schemes, and the related distributions, by using generating functions. A fundamental isomorphism between urn models and a certain system of differential equations has previously been established. We study the joint distribution of the numbers of balls in the urn and determine recurrence relations for the probability generating functions. The associated partial differential equation satisfied by the generating function is derived. We develop analytical methods for the study of urn models that can lead to new perspectives on urn-related problems from analytic combinatorics. The results presented here provide a broader framework for the study of exactly solvable urns than the existing one. Finally, we examine several applications and their numerical results in order to demonstrate how our theoretical results can be employed in the study of urn models.
We consider the problem of group testing (pooled testing), first introduced by Dorfman. For nonadaptive testing strategies, we refer to a nondefective item as “intruding” if it only appears in positive tests. Such items cause misclassification errors in the well-known COMP algorithm and can make other algorithms produce an error. It is therefore of interest to understand the distribution of the number of intruding items. We show that, under Bernoulli matrix designs, this distribution is well approximated in a variety of senses by a negative binomial distribution, allowing us to understand the performance of the two-stage conservative group testing algorithm of Aldridge.
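The quantity being approximated can be simulated directly. In the sketch below, a Bernoulli test design is drawn and a nondefective item is counted as intruding when it appears in no negative test (the COMP misclassification criterion; items appearing in no test at all are included under this convention). The sizes and design density are illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(7)
n, k, T = 500, 10, 100      # items, defectives, tests (illustrative sizes)
p = 1.0 / k                 # Bernoulli design density (a common choice)

# Bernoulli(p) design: X[t, i] = True if item i is placed in test t
X = rng.random((T, n)) < p
defective = np.zeros(n, dtype=bool)
defective[:k] = True

# A test is positive iff it contains at least one defective item
positive = (X & defective).any(axis=1)

# Intruding nondefectives: items never seen in a negative test, so COMP
# (which clears exactly the items appearing in some negative test)
# misclassifies them as defective
in_negative_test = (X & ~positive[:, None]).any(axis=0)
intruding = ~defective & ~in_negative_test
num_intruding = int(intruding.sum())
```

Repeating this over many design draws gives the empirical distribution that the negative binomial approximation describes.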
This article derives quantitative limit theorems for multivariate Poisson and Poisson process approximations. Employing the solution of the Stein equation for Poisson random variables, we obtain an explicit bound for the multivariate Poisson approximation of random vectors in the Wasserstein distance. The bound is then utilized in the context of point processes to provide a Poisson process approximation result in terms of a new metric called $d_\pi$, stronger than the total variation distance, defined as the supremum over all Wasserstein distances between random vectors obtained by evaluating the point processes on arbitrary collections of disjoint sets. As applications, the multivariate Poisson approximation of the sum of m-dependent Bernoulli random vectors, the Poisson process approximation of point processes of U-statistic structure, and the Poisson process approximation of point processes with Papangelou intensity are considered. Our bounds in $d_\pi$ are as good as those already available in the literature.
It is well known that each statistic in the family of power divergence statistics, across n trials and r classifications with index parameter $\lambda\in\mathbb{R}$ (the Pearson, likelihood ratio, and Freeman–Tukey statistics correspond to $\lambda=1,0,-1/2$, respectively), is asymptotically chi-square distributed as the sample size tends to infinity. We obtain explicit bounds on this distributional approximation, measured using smooth test functions, that hold for a given finite sample n and all index parameters ($\lambda>-1$) for which such finite-sample bounds are meaningful. We obtain bounds that are of the optimal order $n^{-1}$. The dependence of our bounds on the index parameter $\lambda$ and the cell classification probabilities is also optimal, and the dependence on the number of cells is also respectable. Our bounds generalise, complement, and improve on recent results from the literature.
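The family in question admits a compact closed form, $\frac{2}{\lambda(\lambda+1)}\sum_j O_j[(O_j/E_j)^\lambda - 1]$, with the $\lambda\to 0$ limit giving the likelihood ratio statistic. The sketch below evaluates the three named special cases; the data vector is made up for illustration.

```python
import numpy as np

def power_divergence(obs, expected, lam):
    """Cressie-Read power divergence statistic with index parameter lam.

    2 / (lam * (lam + 1)) * sum obs * ((obs / expected)^lam - 1),
    with the lam -> 0 limit giving the likelihood ratio statistic
    2 * sum obs * log(obs / expected).
    """
    obs = np.asarray(obs, dtype=float)
    expected = np.asarray(expected, dtype=float)
    if lam == 0:
        return 2.0 * np.sum(obs * np.log(obs / expected))
    return 2.0 / (lam * (lam + 1.0)) * np.sum(obs * ((obs / expected) ** lam - 1.0))

obs = np.array([30, 14, 34, 45, 27])
expected = np.full(5, obs.sum() / 5)   # uniform null over r = 5 classes

pearson = power_divergence(obs, expected, 1.0)   # Pearson chi-square
lr = power_divergence(obs, expected, 0.0)        # likelihood ratio
ft = power_divergence(obs, expected, -0.5)       # Freeman-Tukey
```

At $\lambda=1$ the formula algebraically reduces to $\sum_j (O_j-E_j)^2/E_j$, and at $\lambda=-1/2$ to $4\sum_j(\sqrt{O_j}-\sqrt{E_j})^2$, which gives two built-in consistency checks.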
We take the first steps towards generalising the theory of stochastic block models, in the sparse regime, to a model where the discrete community structure is replaced by an underlying geometry. We consider a geometric random graph over a homogeneous metric space where the probability that two vertices are connected is an arbitrary function of the distance between them. We give sufficient conditions under which the locations can be recovered (up to an isomorphism of the space) in the sparse regime. Moreover, we define a geometric counterpart of the model of flow of information on trees, due to Mossel and Peres, in which one considers a branching random walk on a sphere and the goal is to recover the location of the root based on the locations of the leaves. We give sufficient conditions for percolation and for non-percolation of information in this model.
This paper extends the fundamental concepts of entropy, information, and divergence to the case where the distribution function and the respective survival function play the central role in their definition. The main aim is to provide an overview of these three categories of measures of information and their cumulative and survival counterparts. It also aims to introduce and discuss Csiszár-type cumulative and survival divergences and the analogous Fisher-type information based on cumulative and survival functions.
A method for the construction of Stein-type covariance identities for a nonnegative continuous random variable is proposed, using a probabilistic analogue of the mean value theorem and weighted distributions. A generalized covariance identity is obtained, and applications focused on actuarial and financial science are provided. Some characterization results for gamma and Pareto distributions are also given. Identities for risk measures which have a covariance representation are obtained; these measures are connected with the Bonferroni, De Vergottini, Gini, and Wang indices. Moreover, under some assumptions, an identity for the variance of a function of a random variable is derived, and its performance is discussed with respect to well-known upper and lower bounds.
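As a concrete instance of the kind of identity the paper builds on, the classical Stein identity for the gamma distribution, $E[(X-a)g(X)] = E[X g'(X)]$ for $X \sim \mathrm{Gamma}(a, 1)$ and smooth $g$, can be checked by Monte Carlo. The shape parameter and test function below are arbitrary choices for illustration, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(3)
a = 2.5                      # shape parameter (illustrative)
x = rng.gamma(a, size=400_000)

g = lambda t: t**2           # a smooth test function
dg = lambda t: 2.0 * t

# Stein identity for Gamma(a, 1):  E[(X - a) g(X)] = E[X g'(X)];
# with g(x) = x^2 both sides equal 2a(a + 1), here 17.5
lhs = np.mean((x - a) * g(x))
rhs = np.mean(x * dg(x))
print(lhs, rhs)
```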
We derive the large-sample distribution of the number of species in a version of Kingman’s Poisson–Dirichlet model constructed from an $\alpha$-stable subordinator but with an underlying negative binomial process instead of a Poisson process. Thus it depends on parameters $\alpha\in (0,1)$ from the subordinator and $r>0$ from the negative binomial process. The large-sample distribution of the number of species is derived as sample size $n\to\infty$. An important component in the derivation is the introduction of a two-parameter version of the Dickman distribution, generalising the existing one-parameter version. Our analysis adds to the range of Poisson–Dirichlet-related distributions available for modeling purposes.
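The growth of species counts in such models can be illustrated by sampling from the two-parameter Chinese restaurant process, the standard sequential construction for Pitman–Yor models (not the negative binomial construction of this paper): for $\alpha \in (0,1)$ the number of species grows like $n^\alpha$, versus logarithmically in the Ewens case $\alpha = 0$. Parameter values below are arbitrary.

```python
import numpy as np

def crp_num_species(n, alpha, theta, rng):
    """Number of distinct species among n individuals sampled from the
    two-parameter (alpha, theta) Chinese restaurant process."""
    counts = []                                   # current species abundances
    for i in range(n):                            # i individuals seated so far
        k = len(counts)
        p_new = (theta + alpha * k) / (theta + i)   # start a new species
        if rng.random() < p_new:
            counts.append(1)
        else:
            # join an existing species j with probability (n_j - alpha)/(i - alpha*k)
            probs = (np.array(counts) - alpha) / (i - alpha * k)
            counts[rng.choice(k, p=probs)] += 1
    return len(counts)

rng = np.random.default_rng(11)
n = 1000
k_py = [crp_num_species(n, 0.5, 1.0, rng) for _ in range(40)]     # ~ n^alpha growth
k_ewens = [crp_num_species(n, 0.0, 1.0, rng) for _ in range(40)]  # ~ theta log n
```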
There are two types of tempered stable (TS) based Ornstein–Uhlenbeck (OU) processes: (i) the OU-TS process, the OU process driven by a TS subordinator, and (ii) the TS-OU process, the OU process with TS marginal law. They have various applications in financial engineering and econometrics. In the literature, only the second type under the stationary assumption has an exact simulation algorithm. In this paper we develop a unified approach to exactly simulate both types without the stationary assumption. It is mainly based on the distributional decomposition of stochastic processes with the aid of an acceptance–rejection scheme. As the inverse Gaussian distribution is an important special case of TS distribution, we also provide tailored algorithms for the corresponding OU processes. Numerical experiments and tests are reported to demonstrate the accuracy and effectiveness of our algorithms, and some further extensions are also discussed.
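The acceptance–rejection idea at the core of such exact schemes can be shown in miniature: below, the half-normal distribution is sampled exactly from an Exp(1) proposal, accepting a draw $x$ with probability $\exp(-(x-1)^2/2)$. This is a textbook example of the scheme, not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(5)

def half_normal_ar(size, rng):
    """Sample |Z|, Z standard normal, by acceptance-rejection.

    Proposal: Exp(1). Since f(x)/g(x) = sqrt(2/pi) * exp(x - x^2/2) is
    maximised at x = 1, accepting with probability exp(-(x - 1)^2 / 2)
    yields exact draws from the half-normal distribution.
    """
    out = np.empty(size)
    i = 0
    while i < size:
        x = rng.exponential()
        if rng.random() < np.exp(-0.5 * (x - 1.0) ** 2):
            out[i] = x
            i += 1
    return out

s = half_normal_ar(100_000, rng)
```

The acceptance rate is $1/c = \sqrt{\pi/(2e)} \approx 0.76$, and the output should have mean $\sqrt{2/\pi} \approx 0.798$ and second moment $1$.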
Explicit bounds are given for the Kolmogorov and Wasserstein distances between a mixture of normal distributions, by which we mean that the conditional distribution given some $\sigma$-algebra is normal, and a normal distribution with properly chosen parameter values. The bounds depend only on the first two moments of the first two conditional moments given the $\sigma$-algebra. The proof is based on Stein’s method. As an application, we consider the Yule–Ornstein–Uhlenbeck model, used in the field of phylogenetic comparative methods. We obtain bounds for both distances between the distribution of the average value of a phenotypic trait over n related species, and a normal distribution. The bounds imply and extend earlier limit theorems by Bartoszek and Sagitov.