
Stein’s method for distributions modelling competing and complementary risk problems

Published online by Cambridge University Press:  23 June 2025

Anum Fatima*
Affiliation:
University of Oxford, UK & Lahore College for Women University, Pakistan
Gesine Reinert*
Affiliation:
University of Oxford & The Alan Turing Institute, London, UK
*Postal address: Department of Statistics, University of Oxford. Email: anumfatimam@gmail.com
**Postal address: Department of Statistics, University of Oxford. Email: reinert@stats.ox.ac.uk

Abstract

Competing and complementary risk (CCR) problems are often modelled using a class of distributions of the maximum, or minimum, of a random number of independent and identically distributed random variables, called the CCR class of distributions. While CCR distributions generally do not have an easy-to-calculate density or probability mass function, two special cases, namely the Poisson–exponential and exponential–geometric distributions, have densities that are easy to calculate. Hence, it is of interest to approximate CCR distributions with these simpler distributions. In this paper, we develop Stein’s method for the CCR class of distributions to provide a general comparison method for bounding the distance between two CCR distributions, and we contrast this approach with bounds obtained using a Lindeberg argument. We detail the comparisons for Poisson–exponential and exponential–geometric distributions.

Information

Type
Original Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press on behalf of Applied Probability Trust

1. Introduction

Competing and complementary risk (CCR) problems typically focus on the failure of a system composed of multiple components or a system with several, sometimes even a countably infinite number of, risk factors that can cause its failure. Here we think of components and risk factors as having random lifetimes, denoted by a sequence of independent and identically distributed (i.i.d.) positive random variables $X_1, X_2, \ldots.$ At the end of its lifetime the component fails, or the risk occurs.

CCR systems are either sequential or parallel; in sequential systems, the whole system fails at the occurrence of the first among an unknown number N of risk factors; risks compete with each other to cause failure. In this case, the observed failure time is the minimum of the lifetimes of the risk factors. In parallel systems, the system fails after all of an unknown number N of risk factors occur, and the observed failure time is the maximum of the lifetimes of these risk factors. In both settings, the lifetimes of the components or risks are modelled as random, as is the number of risks N, which is assumed to be independent of the lifetimes. These two CCR settings are frequently studied together, as $\max(X_1, \ldots, X_N) = - \min(\!-X_1, \ldots, -X_N)$ ; see [Reference Basu and Klein6].

CCR settings arise in many fields, such as industrial reliability, demography, biomedical studies, public health, and actuarial sciences. For example, in a study of vertical transmission of HIV from an infected mother to her newborn child, several factors increase the risk of transmission, such as high maternal virus load and low birth weight. Exactly which factors determine the timing of transmission is an ongoing area of research. Even with a list of known risk factors, there are possibly many unknown risk factors involved. The problem is therefore modelled as a CCR problem in [Reference Cancho, Louzada-Neto and Barriga8], [Reference Louzada, Bereta and Franco20], [Reference Louzada, Ramos and Perdoná21], and [Reference Tojeiro, Louzada, Roman and Borges35]. The number of successive failures of the air conditioning system of each member of a fleet of 13 Boeing 720 jet aeroplanes was modelled as a CCR problem in [Reference Kus17], with potential risk factors including defective components in the air conditioning system and errors during the production process. In [Reference Adamidis and Loukas3] the period between successive coal-mining disasters is modelled as a CCR problem, with construction faults or human errors committed by inexperienced miners as examples of risks. The daily ozone concentrations in New York during May–September 1973 are treated as a CCR problem in [Reference Jayakumar, Babu and Bakouch16].

We call the distributions used to model the lifetimes in CCR problems the CCR family of distributions. Examples from this family that have been studied in the literature include the exponential–Poisson distribution [Reference Kus17], the Poisson–exponential (PE) lifetime distribution [Reference Cancho, Louzada-Neto and Barriga8], the Weibull–Poisson distribution [Reference Lu and Shi22], the extended Weibull–Poisson distribution, and the extended generalised extreme value Poisson distribution [Reference Ramos, Dey, Louzada and Lachos29]. In [Reference Tahir and Cordeiro34] a detailed review is given of CCR distributions, and additional ones are proposed therein.

The density or probability mass function of a CCR distribution can be quite unwieldy. Hence, it is of interest to approximate a CCR distribution by a simpler CCR distribution such as the PE distribution or the exponential–geometric distribution. For these two distributions, the number of risks or components follows a zero-truncated Poisson or geometric distribution, respectively. To assess such approximations, we develop Stein’s method for CCR distributions.

The seminal work of Charles Stein [Reference Stein32] derives bounds on the approximation error for normal approximations. In [Reference Chen10], Stein’s method is adapted to Poisson approximation; see also [Reference Arratia, Goldstein and Gordon4] and [Reference Barbour, Holst and Janson5]. Generalisations to many other distributions and dimensions are available; see, for example, [Reference Chen, Goldstein and Shao9], [Reference Mijoule, Raič, Reinert and Swan24], and [Reference Nourdin and Peccati25]. The Stein operators in this paper are based on the density approach; see [Reference Ley, Reinert and Swan18]. A main difficulty in distributional comparison problems is comparing a discrete and a continuous distribution; related works on Stein’s method include [Reference Goldstein and Reinert15], which bounds the Wasserstein distance between a beta distribution and the distribution of the number of white balls in a Pólya–Eggenberger urn, and [Reference Germain and Swan14], where standardisations related to the ones we propose are used. Maxima of a fixed number of random variables were treated with Stein’s method in [Reference Feidt13]; Stein’s method applied to a random sum of random variables can be found in [Reference Peköz, Röllin and Ross28]. Here we propose a comparison of the distributions of a maximum (or minimum) of discrete random variables and a maximum (or minimum) of continuous random variables when the number of these random variables is itself an independent random variable. In future work, the Stein characterisations derived in the present paper could be employed to construct Stein-based goodness-of-fit statistics as in [Reference Betsch and Ebner7].

The remainder of this paper is organised as follows. Section 2 gives a brief introduction to Stein’s method using the density approach and applies it to obtain a general representation for the CCR class of distributions through a Stein operator. Section 3 develops Stein’s method for comparing CCR distributions; as an alternative approach, it also provides a comparison based on a Lindeberg argument. As the main illustration of our results, Section 4 details Stein’s method for the PE distribution and uses it to bound the total variation distance between a PE distribution and a distribution from the CCR family, as well as between a PE distribution and a distribution not from the CCR family. As a second illustration, in Section 5 we develop Stein’s method for the exponential–geometric distribution and perform distributional comparisons in total variation distance. Section 6 gives a bound on the bounded Wasserstein distance between the distribution of the maximum waiting time of sequence patterns in Bernoulli trials and a PE distribution. Proofs that are standard but which would disturb the flow of the argument are postponed to the Appendix.

2. Stein’s method for CCR distributions

We use the notation $\mathbb{R}^+_{{>0}} = (0, \infty)$ and $\mathbb{N} = \{1, 2, 3, \ldots\}.$ The backward difference operator $\Delta^-$ operates on a function $g\,:\, \mathbb{R} \rightarrow \mathbb{R}$ by $\Delta^-g(x) = g(x) - g(x-1)$ . We note that $\Delta^- (gq)(y) = q(y)\Delta^- g(y) + g(y-1) \Delta^- q(y)$ . For a function $h \in \textrm{Lip}_b(1)$ , its derivative, which exists almost everywhere by Rademacher’s theorem, is denoted by $h'$ .

For two probability distributions q and p on $\mathbb{R}^+_{{>0}}$ , we seek bounds on distances of the form

(2.1) \begin{equation} d_{\mathcal{H}} (p,q) \,:\!= \underset{h \in \mathcal{H}}{\sup} \, \big|\mathbb{E} h(X) - \mathbb{E} h(Z)\big|,\end{equation}

where $\mathcal{H}$ is a set of test functions, $Z\sim q$ , and $X \sim p$ . The sets of functions $\mathcal{H}$ in (2.1) are, for the total variation distance $d_{\rm TV}$ , $\mathcal{H} = \{\mathbb{I}[\cdot\in A] \,:\, A \in \mathcal{B} ( \mathbb{R})\}$ and, for the bounded Wasserstein distance $d_{\rm BW}$ ,

\begin{align*}\mathcal{H} = \textrm{Lip}_b(1)\,:\!=\{h\, : \,\mathbb{R}^+_{{>0}} \rightarrow \mathbb{R}\,:\, |h(x) - h(y)| \le |x-y| \mbox{ for all } x,y\in\mathbb{R}^+_{{>0}}; \, \|h\| \le 1 \}.\end{align*}

Here $\mathcal{B} ( \mathbb{R})$ denotes the Borel sets of $\mathbb{R}$ and $\| \cdot \|$ is the supremum norm in $\mathbb{R}$ . We note here the alternative formulation for total variation distance based on Borel-measurable functions $h\,:\, \mathbb{R} \rightarrow \mathbb{R}$ ,

(2.2) \begin{eqnarray} d_{\rm TV}(p,q) = \frac{1}{2} \underset{\|h\| \le 1}{\sup}\big|\mathbb{E} h(X) - \mathbb{E} h(Z)\big|.\end{eqnarray}

2.1. Stein’s method for distributional comparisons

To obtain explicit bounds on the distance between a probability distribution $\mathcal{L}(X)$ of interest and a usually well-understood approximating distribution $\mathcal{L}_0(Z)$ , often called the target distribution, Stein’s method connects the test function $h \in \mathcal{H}$ in (2.1) to the distribution of interest through a Stein equation

(2.3) \begin{equation} h(x) - \mathbb{E} h(Z) = \mathcal{T}g(x).\end{equation}

In (2.3), $\mathcal{T}$ is a Stein operator for the distribution $\mathcal{L}_0 (Z)$ , with an associated Stein class $ \mathcal{F}(\mathcal{T})$ of functions such that $\mathbb{E}[\mathcal{T}g(Z)] = 0 \mbox{ for all } g \in \mathcal{F}(\mathcal{T}) $ if and only if $ Z \sim \mathcal{L}_0 (Z);$ thus, a Stein operator characterises the distribution. The distance (2.1) can then be bounded by $ d_{\mathcal{H}} (\mathcal{L}(X),\mathcal{L}_0(Z)) \le \sup_{g \in \mathcal{F}(\mathcal{H})}|\mathbb{E}\mathcal{T}g(X)| $ where $ \mathcal{F}(\mathcal{H}) = \{g_h \,:\, h \in \mathcal{H}\}$ is the set of solutions of the Stein equation (2.3) for the set of test functions $h \in \mathcal{H}$ .

The Stein operator $\mathcal{T}$ for a probability distribution is not unique; see e.g. [Reference Ley, Reinert and Swan18]. In this paper, we employ the so-called density method, which uses the score function of a probability distribution to construct a Stein operator, called a score Stein operator. Following [Reference Ley and Swan19, Reference Stein, Diaconis, Holmes and Reinert33], a score Stein operator for a continuous distribution with probability density function (PDF) p and support $[a,b] \subseteq \mathbb{R}$ acts on functions g for which the derivative exists by

(2.4) \begin{equation} {\mathcal T}_p g (x) = \frac{ (gp)' (x) }{p (x)},\end{equation}

where we take $0/0 =0$ . For differentiable g and p, (2.4) simplifies to ${\mathcal T}_p g (x) = g'(x) + g(x) \rho (x)$ , where $\rho = p'/p$ is the score function of p. The Stein class ${\mathcal F}(\mathcal{T}_p)$ is the collection of functions $ g\, :\, \mathbb{R} \rightarrow \mathbb{R}$ such that g(x) p(x) is differentiable with integrable derivative and $ \lim_{x \to a} g(x)p(x) = \lim_{x \to b} g(x)p(x) = 0 $ . It is straightforward to see that for a PDF p on ${[a,b] \subset \mathbb{R}}$ ,

(2.5) \begin{align} g(x) = g_h(x) = \frac{ 1 }{p (x)} \int_a^x [h(t)-\mathbb{E} h(X)] p(t) \,\mathrm{d}t \end{align}

solves (2.3) for h and that if h is bounded then $g \in \mathcal{F}(\mathcal{T}_p)$ .

To use the Stein equation for a distributional comparison, let X and Y be two random variables with PDFs $p_X$ and $p_Y$ , defined on the same probability space and with nested supports ${\textrm{supp}}(p_Y) \subset {\textrm{supp}}(p_X) =[a,b] \subset \mathbb{R}$ , score functions $\rho_X$ and $\rho_Y$ , and corresponding score Stein operators $\mathcal T_X$ and $\mathcal T_Y$ . Then

(2.6) \begin{equation} \mathbb{E} h(Y)-\mathbb{E} h(X) = \mathbb{E} g_{X, h}(Y)(\rho_Y(Y)-\rho_X(Y)),\end{equation}

where $g_{X, h}(x)$ is the solution of the Stein equation for h and $\mathcal T_X$ . Equation (2.6) is a special case of Equation 23 in Section 2.5 of [Reference Ley, Reinert and Swan18].

For a discrete distribution with probability mass function (PMF) q having support $\mathcal{I}=[a,b] \subset {\mathbb{N}}$ , the discrete backward score function is $\frac{ \Delta^- q(y)}{q(y)}$ , and, as in Remark 3.2 and Example 3.13 of [Reference Ley, Reinert and Swan18], a discrete backward score Stein operator is

(2.7) \begin{equation}{\mathcal T}_q g (y) = \Delta^- g(y) + \frac{ \Delta^- q(y)}{q(y)} g(y-1), \quad y \in \mathcal{I}.\end{equation}

With an abuse of notation, we often refer to a Stein operator for the distribution of X as the Stein operator for X, and similarly refer to the score function of the distribution of X as the score function of X. Further, if $X \sim p$ , we also write $\mathcal{T}_X$ for $\mathcal{T}_p$ .

2.2. CCR distributions

Let $N \in \mathbb{N}$ be a random variable with finite second moment and let ${\textbf{Y}}= (Y_1, Y_2, \ldots)$ be a sequence of positive i.i.d. random variables, with cumulative distribution function (CDF) $F_Y$ independent of N. Then

(2.8) \begin{equation} W_{\alpha} =W_{\alpha}(N, {\textbf{Y}})= \begin{cases} \min\{Y_1, Y_2, \ldots, Y_N\} & \mbox{ if } \alpha = -1, \\ \max\{Y_1, Y_2, \ldots, Y_N\} & \mbox{ if }\alpha = 1 \end{cases}\end{equation}

is called a CCR random variable with type indicator $\alpha \in \{-1, 1\}.$ Setting

(2.9) \begin{equation} U_Y^{\alpha}(\cdot) = \begin{cases} 1- F_Y(\cdot) & \mbox{ if }\alpha = -1, \\ F_Y(\cdot) & \mbox{ if }\alpha = 1\end{cases}\end{equation}

and writing $G_N(x) = \mathbb{E}\, x^N$ for the probability generating function of N, the random variable $W_{\alpha} >0$ has CDF

(2.10) \begin{align} F_{W_{\alpha}}(w) &= \mathbb{P} (W_\alpha \le w) \end{align}
(2.11) \begin{align}&= \begin{cases} 1-\sum_{n}\mathbb{P} (N =n)U_Y^{\alpha}(w)^{n} = 1-(G_N\circ U_Y^{\alpha})(w) & \mbox{ if } \alpha = -1, \\ \sum_{n}\mathbb{P} (N =n)U_Y^{\alpha}(w)^{n} = (G_N\circ U_Y^{\alpha})(w) & \mbox{ if } \alpha = 1. \end{cases} \\[8pt]\nonumber\end{align}

If the $Y_i$ have a continuous distribution with PDF $f_Y$ , then $W_\alpha$ has PDF

(2.12) \begin{equation} {f}_{W_{\alpha}}(w) = \sum_{n} \mathbb{P} (N =n)\, n ({U_Y^{\alpha}}(w))^{n-1} {f}_Y(w) = f_Y(w)(G_N'\circ U_Y^{\alpha})(w);\end{equation}

see also [Reference Tahir and Cordeiro34]. Here we have used that $(U_Y^\alpha)'(w) = \alpha f_Y(w)$ .

If the $Y_i$ are discrete with PMF $p_Y$ on $\mathbb{N},$ the resulting random variable $W_{\alpha}$ has PMF $p_{W_{\alpha}}$ , which can be expressed in terms of $G_N(\cdot)$ as

(2.13) \begin{equation} p_{W_{\alpha}}(x) = \alpha \bigl(G_N (U_Y^{\alpha}(x)) - G_N (U_Y^{\alpha}(x-1)) \bigr) = \alpha\Delta^- (G_N \circ U_Y^{\alpha}) (x).\end{equation}

Example 1.

  (i) If N is a zero-truncated Poisson random variable with parameter $\theta$ , then (2.12) simplifies to $ f_{W_{\alpha}}(w) = f_Y(w) \frac{\theta }{(1-{\mathrm{e}}^{-\theta})} {\mathrm{e}}^{-\theta(1-U_Y^{\alpha}(w))} $ for $w > 0.$ This is the PDF of the extended Poisson family of distributions given in Equation 3 of [Reference Ramos, Dey, Louzada and Lachos29], where it is called the G-Poisson class of distributions [Reference Tahir and Cordeiro34].

  (ii) If N is a $\mathrm{Geometric}(p)$ random variable with PMF $\mathbb{P}(N=n) = (1-p)^{(n-1)} p$ for $n \in \mathbb{N} $ , then (2.12) yields $f_{W_{\alpha}}(w) = f_Y(w) {p}/{(1-(1-p)U_Y^{\alpha}(w))^2}$ for $ w > 0,$ which gives the distributions in Equations 5.2 and 5.3 of [Reference Marshall and Olkin23].
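For concreteness, the following Python sketch (our own illustration, not part of the derivation; the parameter values are arbitrary) simulates the CCR maximum of Example 1(i) and compares its empirical CDF with the closed form (2.11).

```python
import numpy as np

rng = np.random.default_rng(0)
theta, lam = 2.0, 1.5   # illustrative parameter values

def sample_zt_poisson(theta, size, rng):
    """Zero-truncated Poisson(theta) by resampling zeros."""
    n = rng.poisson(theta, size)
    while (n == 0).any():
        idx = n == 0
        n[idx] = rng.poisson(theta, idx.sum())
    return n

# W_1(N, Y): maximum of N i.i.d. Exp(lam) lifetimes, as in Example 1(i)
n = sample_zt_poisson(theta, 100_000, rng)
u = rng.uniform(size=n.size)
w = -np.log(1 - u ** (1.0 / n)) / lam   # inverse CDF of the maximum given N = n

# Compare the empirical CDF with (2.11): F_W = G_N o F_Y, where
# G_N(u) = (e^{theta*u} - 1)/(e^theta - 1) is the pgf of a zero-truncated Poisson
t = np.linspace(0.1, 4.0, 9)
F_W = (np.exp(theta * (1 - np.exp(-lam * t))) - 1) / (np.exp(theta) - 1)
print(np.abs((w[:, None] <= t).mean(axis=0) - F_W).max())   # small, ~1e-3
```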

2.3. Stein’s method for the CCR class of distributions

To obtain a Stein operator for the CCR random variable $W_{\alpha} = W_{\alpha}(N, \textbf{Y})$ in (2.8), we use the density method. First, we assume that the $Y_i$ are continuous with differentiable PDF $f_Y$ . From (2.12), the score function for the distribution of $W_{{\alpha}}$ is

(2.14) \begin{equation} \rho_{W_{{\alpha}}}({w}) = {\alpha} {f}_Y({w}) \frac{(G_N'' \circ U_Y^{\alpha})(w)}{(G_N'\circ U_Y^{\alpha})({w})} + \rho_Y(w),\end{equation}

where $\rho_Y = {f_Y'}/{f_Y}$ is the score function of Y. Hence $\mathcal{T}_{W_{\alpha}}$ given by

(2.15) \begin{equation} \mathcal{T}_{W_{\alpha}}g(w) = g'(w) + \rho_{W_{\alpha}}(w) g(w)\end{equation}

for differentiable g is a Stein operator acting on the functions $g \in {\mathcal F}(\mathcal{T}_{W_{\alpha}})$ . For a test function $h \in \mathcal{H}$ , the corresponding Stein equation is

(2.16) \begin{equation} g'({w}) + \rho_{{W_{{\alpha}}}}({w}) g({w}) = h({w}) - \mathbb{E} h ( {W_{\alpha}}).\end{equation}

Thus, for any random variable X, the distance $d_{\mathcal{H}}$ from (2.1) between the distributions of X and $W_{\alpha}$ can be bounded by bounding the expectation of the left-hand side of (2.16). For $Y_i$ taking values in $\mathbb{N}$ with PMF $p_Y$ , the backward score function is

(2.17) \begin{equation} \rho_{W_{{\alpha}}}(w) = \frac{\Delta^- p_{W_{\alpha}}(w)}{p_{W_{\alpha}}(w)} = \frac{\Delta^-( \Delta^- (G_N \circ U_Y^{\alpha}) )(w) }{\Delta^- (G_N \circ U_Y^{\alpha})(w)},\end{equation}

with corresponding discrete backward score Stein operator $\mathcal{T}_{W_{\alpha}}$ operating as

(2.18) \begin{align} \mathcal{T}_{W_{\alpha}} g (w) = \Delta^- g(w) + \rho_{W_{\alpha}}(w) g(w-1).\end{align}
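As a numerical sanity check of the discrete operator (2.18) (our own illustration; the choice of geometric lifetimes with a zero-truncated Poisson N anticipates the Poisson–geometric example of Section 4.3, and $g(w) = \sin(w)$ is an arbitrary bounded test function with $g(0)=0$), one can verify that the operator has mean zero under $p_{W_\alpha}$:

```python
import numpy as np

theta, p = 2.0, 0.3   # N ~ zero-truncated Poisson(theta), Y_i ~ Geometric(p) on {1,2,...}

G_N = lambda u: (np.exp(theta * u) - 1.0) / (np.exp(theta) - 1.0)   # pgf of N
F_Y = lambda y: 1.0 - (1.0 - p) ** y                                # CDF of Y at integer y >= 0

w = np.arange(1, 60)                       # effective support; the tail beyond is negligible
pmf = G_N(F_Y(w)) - G_N(F_Y(w - 1))        # PMF of the maximum W_1(N, Y), from (2.13)
pmf_prev = np.append(0.0, pmf[:-1])        # p(w - 1), with p(0) = 0
rho = (pmf - pmf_prev) / pmf               # backward score (2.17)

g = np.sin                                 # bounded test function with g(0) = 0
stein = (g(w) - g(w - 1)) + rho * g(w - 1) # operator (2.18)
print((pmf * stein).sum())                 # ~ 0, up to truncation and floating-point error
```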

3. A general comparison approach

To illustrate the use of Stein’s method for CCR distributions, we compare the distributions of two maxima or two minima of a random number of i.i.d. random variables $W_{{\alpha}}(N, \textbf{Y})$ and $W_{{\alpha}}(M, \textbf{Z})$ .

Proposition 1. Let $W_{\alpha_1}(N, \textbf{Y})$ and $W_{\alpha_2}(M, \textbf{Z})$ for $\alpha_1, \alpha_2 \in \{-1, 1\}$ be CCR random variables with PDFs $f_Y$ and $f_Z$ and score functions $\rho_Y$ and $\rho_Z$ . Then for any test function h such that the $W_{\alpha_1}(N, \textbf{Y})$ Stein equation (2.16) for h has a solution $g=g_h$ ,

(3.1) \begin{align}& \left|\mathbb{E} h(W_{\alpha_1}(N, \textbf{Y}))-\mathbb{E} h(W_{\alpha_2}(M, \textbf{Z}))\right| \nonumber \\& = \Biggl| \mathbb{E} g ( W) \biggl(\alpha_1 f_Y(W) \frac{(G_N'' \circ U_Y^{\alpha_1})(W)}{(G_N' \circ U_Y^{\alpha_1})(W)} - \alpha_2 f_Z(W) \frac{(G_M'' \circ U_Z^{\alpha_2})(W)}{(G_M' \circ U_Z^{\alpha_2})(W)} \biggr) + \mathbb{E} g (W) ( \rho_Y - \rho_Z) (W) \Biggr|, \end{align}

where $W = W_{\alpha_{2}}(M, \textbf{Z})$ and $U_{\cdot}^{\alpha}$ is as in (2.9).

If the $Y_i$ are discrete with PMF $f_Y$ and the $Z_i$ are discrete with PMF $f_Z$ , then

(3.2) \begin{align} & \left|\mathbb{E} h(W_{\alpha_1}(N, \textbf{Y}))-\mathbb{E} h(W_{\alpha_2}(M, \textbf{Z}))\right|\nonumber\\ &= \left| \mathbb{E} g (W-1) \left(f_Y(W) \frac{\Delta^-( \Delta^- (G_N \circ U_Y^{\alpha_1}) )(W) }{\Delta^- (G_N \circ U_Y^{\alpha_1})(W) } - f_Z(W) \frac{\Delta^-( \Delta^- (G_M \circ U_Z^{\alpha_2}) )(W) }{\Delta^- (G_M \circ U_Z^{\alpha_2})(W) } \right) \right|. \end{align}

Proof. Substitute the score functions (2.14) and (2.17) into (2.6); simplifying then gives (3.1) and (3.2) respectively.

To compare a discrete random variable and a continuous random variable, we use the concept of standardised Stein equations as in [Reference Germain and Swan14] and [Reference Ley, Reinert and Swan18]. For a continuous random variable W with score function $\rho_W$ and a differentiable function $c\,:\, \mathbb{R} \rightarrow \mathbb{R}$ , define a c-standardised Stein operator ${\mathcal{T}_{W}^{(c)}}$ by

(3.3) \begin{equation} \mathcal{T}_W^{(c)} (g(w)) = {\mathcal{T}_W}(cg)(w) = c(w) g'(w) + [ c(w) \rho_W (w) + c'(w) ] \,g(w).\end{equation}

For a random variable $V \in \mathbb{N}$ with discrete backward score function $\rho_V$ and a function $d\,:\, \mathbb{N} \rightarrow \mathbb{R}$ , define a d-standardised Stein operator for V by

(3.4) \begin{equation}\begin{split} {\mathcal{T}_V^{(d)}} (g(w)) & = \mathcal{T}_{V}(dg)({w}) = \Delta^- (dg) (w) + \rho_V(w) (dg)(w-1) \\ & = d(w-1)\Delta^-g(w) + g(w)\Delta^- d(w) + d(w-1) g(w-1)\rho_V (w) . \end{split}\end{equation}

For CCR random variables $W_{\alpha}(N, {\textbf{Y}})$ and $W_{\alpha}(M, {\textbf{Z}})$ , when the $Y_i$ are continuous on $\mathbb{R}^+_{{>0}} $ with differentiable PDF $f_Y$ and the $Z_i$ take values in $\mathbb{N}$ , we rescale $W_{\alpha}(M, \textbf{Z})$ by dividing it by n, to obtain $W_n = W_{\alpha, n} = \frac1n W_{\alpha}(M, \textbf{Z})$ . If ${W_{\alpha}(M, \textbf{Z})} \in \mathbb{N}$ has PMF p and backward score function $\rho$ , then $W_n$ has PMF $ \mathbb{P} (W_n = z) = p (nz)$ and backward score function $\tilde{\rho}_n(z) = \rho (nz)$ . We note here that the ratio $\rho(nz) = \frac{ p(nz) - p(nz-1)}{p(nz)} $ is the score function of ${W_{\alpha}(M, \textbf{Z})}$ evaluated at nz, which for $n \ne 1$ does not equal the score function of $W_n$ . With $ \Delta^{-n} f(x) \,:\!= f(x) - f(x-1/n)$ , we obtain the Stein operator $ {\mathcal T}_{n}^{(d)}$ given by

(3.5) \begin{equation} {\mathcal T}_{n}^{(d)} g(z) = {\mathcal T}_n (dg) (z) = \Delta^{-n} (dg)(z) +\tilde{\rho}_n(z) (dg) \left( z - 1/n \right).\end{equation}

Proposition 2. Let ${W_{\alpha}(M, \textbf{Z})}$ be a discrete CCR random variable with discrete backward score function ${\rho_W}$ ; for $n \in \mathbb{N}$ set $W_n = {W_{\alpha}(M, \textbf{Z})}/n$ and $\tilde{\rho}_n(z) = \rho_W(nz)$ . Let ${W = W_{\alpha}(N, \textbf{Y})}$ be a continuous CCR random variable with score function $\rho$ . Let $h \in \mathcal{H}$ be a test function such that the ${\mathcal{L}}({W})$ Stein equation (2.3) has solution $g=g_h$ . Then for any differentiable function $c\,:\, \mathbb{R}^+_{{>0}} \rightarrow \mathbb{R}$ and any function $d\,:\, \mathbb{N} \rightarrow \mathbb{R}$ ,

(3.6) \begin{equation} \begin{split} \big|\mathbb{E} h(W_n) - \mathbb{E} h(W)\big| \le \big| & \mathbb{E} [ n \Delta^{-n} (dg)(W_n) - (cg)'(W_n) ]\\ & + \mathbb{E}[ n {\tilde \rho}_n (W_n) (dg)( W_n - 1/n ) - (cg)( W_n) \rho(W_n)] \big|. \end{split}\end{equation}

Proof. To compare the two distributions, for a given test function h we have

(3.7) \begin{align} \mathbb{E} h(W_n) - \mathbb{E} h(W) = \mathbb{E} {\mathcal T}_{W} (cg)(W_n) = \mathbb{E} [ (cg)'(W_n) + (cg)(W_n) \rho(W_n)] \end{align}

with g solving the continuous Stein equation (3.3) for h. Next, we note that for the Stein operator given in (3.5), $\mathbb{E} {\mathcal T}_{n}^{(d)} (g) ({W_n}) = 0$ by construction. Hence, also $n \mathbb{E} {\mathcal T}_n^{(d)} (g) (W_n) = 0.$ Thus, (3.7) yields

\begin{equation*} \begin{split} \mathbb{E} h(W_n) - \mathbb{E} h(W) = \mathbb{E} \big[ & (cg)'(W_n) + (cg)(W_n) \rho(W_n) \\ & - n \Delta^{-n} (dg)(W_n) - n {\tilde \rho}_n (W_n) (dg)( W_n - 1/n ) \big] . \nonumber \end{split}\end{equation*}

Rearranging gives the assertion.

Adaptation to other deterministic scaling functions should be straightforward; as in our examples, we concentrate on the case where we scale by dividing by n. As an aside, while such standardisations and scalings could perhaps be used for a ‘bespoke derivative’ as in [Reference Germain and Swan14], the connection is not obvious.

An alternative comparison of CCR distributions can be achieved using a Lindeberg argument, to arrive at the following result.

Proposition 3. Let $W_{\alpha}(M, \textbf{X})$ and $W_{\alpha}(N, \textbf{E})$ be given in (2.8). Then for functions $h \in \textrm{ Lip}_b(1)$ ,

(3.8) \begin{align} & \big| \mathbb{E} h ( W_{\alpha}(N, {\textbf{E}})) - \mathbb{E} h ({W_{\alpha}(M, {\textbf{X}})}) \big| \nonumber \\ & \quad \le \| h'\| \sum_{i=1}^\infty \mathbb{E} | E_i - X_i| \,\mathbb{P} (M \ge i) \\ & \qquad + {\sum_{m,n=1}^\infty \mathbb{P} (M=m, N=n) \,\big|\mathbb{E} h(W_{\alpha}(n, {\textbf{E}})) - \mathbb{E} h(W_{\alpha}(m, {\textbf{E}}))\big|.} \nonumber\end{align}

If M and N are identically distributed random variables, then

(3.9) \begin{equation} \big|\mathbb{E} h(W_{\alpha}(N, {\textbf{E}})) - \mathbb{E} h (W_{\alpha}(N, {\textbf{X}})) \big| \le \| h'\|\sum_{i=1}^\infty \mathbb{E} | E_i - X_i| \,\mathbb{P} (N \ge i) . \end{equation}

Proof. We employ a Lindeberg argument, as follows. Defining $X_1, X_2, \ldots$ and $E_1, E_2, \ldots$ on the same probability space, we have

(3.10) \begin{align}\big| \mathbb{E} h ( W_{\alpha}(N, {\textbf{E}})) - \mathbb{E} h ({W_{\alpha}(M, {\textbf{X}})}) \big| & \le \big| \mathbb{E} h ( W_{\alpha}(N, {\textbf{E}})) - \mathbb{E} h ({W_{\alpha}(M, {\textbf{E}})}) \big| \end{align}
(3.11) \begin{align}& \quad + \big| \mathbb{E} h ( W_{\alpha}(M, {\textbf{E}})) - \mathbb{E} h ({W_{\alpha}(M, {\textbf{X}})}) \big|.\\[8pt]\nonumber\end{align}

To bound (3.10), we simply note that

(3.12) \begin{align} & |\mathbb{E} h(W_{\alpha}(N, {\textbf{E}})) - \mathbb{E} h(W_{\alpha}(M, {\textbf{E}}))| \nonumber \\ & \quad \le \| h'\| \sum_{m,n=1}^\infty \mathbb{P} (M=m, N=n) \,\mathbb{E} \big| W_{\alpha}(n, {\textbf{E}}) - W_{\alpha}(m, {\textbf{E}})\big| .\end{align}

Now, to bound (3.11), if $\alpha =1$ , then

(3.13) \begin{align} &\big| \mathbb{E} h(\max\{E_1, \ldots, E_{M}\}) - \mathbb{E} h(\max\{X_1, \ldots, X_{M}\}) \big| \nonumber \\ & \quad \le \sum_{m=1}^\infty \mathbb{P}(M=m) \sum_{i=1}^{m} \mathbb{E} \big|h(\max\{ E_1,\ldots,E_i, X_{i+1}, \ldots, X_{m} \} ) \nonumber \\ & \qquad\qquad\qquad\qquad\qquad - h (\max\{E_1,\ldots,E_{i-1}, X_i, \ldots, X_{m}\})\big| \nonumber \\ & \quad \le \sum_{m=1}^\infty \mathbb{P}(M=m) \sum_{i=1}^{m} \mathbb{E}| E_i - X_i| \|h'\| = \sum_{i=1}^\infty \mathbb{E}| E_i - X_i| \mathbb{P} (M \ge i) \|h'\|. \end{align}

The minimum case ( $\alpha =-1$ ) follows similarly, since the minimum is also 1-Lipschitz in each coordinate. Adding (3.12) and (3.13) and taking the supremum over all $h \in \textrm{Lip}_b(1)$ gives the first assertion. The second assertion follows from coupling M and N so that $M=N$ almost surely, in which case (3.12) vanishes.

Propositions 1, 2 and 3 complement each other; the first two yield bounds in $d_{\rm TV}$ distance, for a general variable W, while the last result can be translated into a bound in $d_{\rm BW}$ distance, for comparing CCR distributions; see Remark 4 for more details. We note that Proposition 1 can be used to compare a maximum and a minimum, whereas Propositions 2 and 3 require the same value of $\alpha$ .

4. Application to the Poisson–exponential distribution

Cancho et al. [Reference Cancho, Louzada-Neto and Barriga8] introduced the PE distribution as the distribution of the maximum of N i.i.d. exponential random variables from an infinite sequence ${\textbf{E}}= (E_1, E_2, \ldots)$ such that $E_i \sim \mathrm{Exp}(\lambda)$ (having mean $1/{\lambda}$ ), where N follows a zero-truncated Poisson distribution with parameter $\theta$ , independently of ${\textbf{E}}$ . This maximum has the PE distribution with parameters $\theta, \lambda >0$ , denoted by $\mathrm{PE}(\theta, \lambda)$ , which has the differentiable PDF

(4.1) \begin{equation} p({w}| \theta, \lambda)= \frac{\theta \lambda {\mathrm{e}}^{-\lambda {w} - \theta {\mathrm{e}}^{-\lambda {w}}}}{1 - {\mathrm{e}}^{-\theta}}, \quad {w} > 0.\end{equation}

To obtain a Stein operator for $\mathrm{PE}(\theta, \lambda)$ , we use (2.14) with $G_N({u}) = \frac{{\mathrm{e}}^{-\theta}}{1 - {\mathrm{e}}^{-\theta}} ( {\mathrm{e}}^{\theta{u}} - 1 )$ and $\frac{G_N''({u})}{G_N'({u})} = \theta,$ yielding the score function

(4.2) \begin{align} \rho({w}) = \lambda (\theta {\mathrm{e}}^{-\lambda {w}}-1).\end{align}

Equation (2.15) gives

(4.3) \begin{equation} {\mathcal T} f ({w}) = f'({w}) + \lambda (\theta {\mathrm{e}}^{-\lambda {w}}-1) f({w}).\end{equation}
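A direct Monte Carlo check of the characterising property $\mathbb{E}[\mathcal{T}f(W)] = 0$ (our own illustration; $f(w)=\sin(w)$ is an arbitrary bounded member of the Stein class, and the parameter values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
theta, lam = 2.0, 1.5

# W ~ PE(theta, lam): max of N i.i.d. Exp(lam) with N zero-truncated Poisson(theta)
n = rng.poisson(theta, 300_000)
n = n[n > 0]                              # rejection gives the zero-truncated law
u = rng.uniform(size=n.size)
w = -np.log(1 - u ** (1.0 / n)) / lam     # inverse-CDF sampling of the maximum

stein = np.cos(w) + lam * (theta * np.exp(-lam * w) - 1) * np.sin(w)   # (4.3) with f = sin
print(stein.mean())                       # ~ 0, up to Monte Carlo error
```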

For a bounded test function $h\in \mathcal{H}$ , the score $\mathrm{PE}(\theta, \lambda)$ Stein equation is

(4.4) \begin{equation} f'(w) + \lambda(\theta {\mathrm{e}}^{-\lambda{w}}-1)f({w}) = h({w}) - \mathbb{E} h({W}),\end{equation}

with the solution $f_h$ , given by (2.5), satisfying the following bounds.

Lemma 1. Let $ h\,:\, \mathbb{R}^+_{{>0}} \rightarrow \mathbb{R} $ be bounded and let f denote the solution (2.5) of the Stein equation (4.4) for h. Let $\tilde{h}(w) = h(w)-\mathbb{E} h(W)$ for $W \sim \mathrm{PE}(\theta, \lambda)$ . Then for all $w>0$ ,

(4.5) \begin{align} \left|{\mathrm{e}}^{-\lambda w}f(w) \right| &\le\frac{ \|\tilde{h}\|}{\theta \lambda } \big( 1-{\mathrm{e}}^{-\theta + \theta {\mathrm{e}}^{-\lambda w}}\big)\end{align}
(4.6) \begin{align} &\le \frac{\|\tilde{h}\|}{\theta \lambda};\end{align}
(4.7) \begin{align} \big|\lambda (\theta {\mathrm{e}}^{-\lambda w}-1) f(w)\big| &\le \|\tilde{h}\|;\end{align}
(4.8) \begin{align} |f(w)|& \le \frac{2\|\tilde{h}\| }{\lambda};\end{align}
(4.9) \begin{align} |f'(w)|& \le 2\|\tilde{h}\|.\\[8pt]\nonumber\end{align}

If in addition $h \in \textrm{Lip}_b(1)$ , then at all points w at which $h'$ exists,

(4.10) \begin{equation} |f''(w)| \le \|h'\| + 2\lambda\theta \|\tilde{h}\| + 3\lambda \|\tilde{h}\|.\end{equation}

Proof. We write p for the PDF of $\mathrm{PE}(\theta, \lambda)$ . To prove (4.5) and (4.6), we bound

\begin{eqnarray*} \left|{\mathrm{e}}^{-\lambda w}f(w) \right| \le \|\tilde{h}\| \frac{{\mathrm{e}}^{-\lambda w}}{p(w)} \int_0^{w} p(t) \,\mathrm{d}t = \|\tilde{h}\| \frac{1-{\mathrm{e}}^{-\theta(1-{\mathrm{e}}^{-\lambda w})}}{\theta \lambda},\end{eqnarray*}

and (4.5) follows. From $1-{\mathrm{e}}^{-y} < 1$ for all $ y>0$ , we get (4.6).

Proof of (4.7). Case 1: $ \theta {\mathrm{e}}^{-\lambda w} - 1 > 0$ . In this case $0< w < \frac{\ln \theta} {\lambda}$ and we have

\begin{align*}\big|\lambda (\theta {\mathrm{e}}^{-\lambda w}-1)f(w)\big|\le \|\tilde{h}\|\frac{\lambda (\theta {\mathrm{e}}^{-\lambda w}-1)}{p(w)} \int_0^w p(t) \,\mathrm{d}t.\end{align*}

As $p'(t) = \lambda (\theta {\mathrm{e}}^{-\lambda t}-1) p(t)\ge \lambda (\theta {\mathrm{e}}^{-\lambda w}-1) p(t)$ for $ 0< t < w < \frac{\ln \theta }{\lambda}$ , it follows that

\begin{align*} 0 & < \frac{\lambda (\theta {\mathrm{e}}^{-\lambda w}-1)}{p(w)} \int_0^w p(t) \,\mathrm{d}t \le \frac{1}{p(w)} \int_0^w p'(t) \,\mathrm{d}t = \big(1-{\mathrm{e}}^{-\theta (1 - {\mathrm{e}}^{-\lambda w}) + \lambda w}\big)\le 1.\end{align*}

Hence we obtain the bound (4.7) for $0< w < \frac{\ln\theta} {\lambda}$ .

Case 2: $ \theta {\mathrm{e}}^{-\lambda w} - 1 \le 0$ . In this case $w \ge \frac{\ln\theta}{\lambda}$ and

\begin{align*} 0 < \lambda (1-\theta {\mathrm{e}}^{-\lambda w}) p(t)< \lambda (1-\theta {\mathrm{e}}^{-\lambda t}) p(t) = p'(t).\end{align*}

Using (2.5) gives

\begin{align*} \big|\lambda (\theta {\mathrm{e}}^{-\lambda w}-1) f(w)\big| \le \|\tilde{h}\| \frac{ \lambda (1-\theta {\mathrm{e}}^{-\lambda w}) }{p (w)} \int_{w}^{\infty} p(t)\,\mathrm{d}t \le \|\tilde{h}\| \frac{1}{p (w)} \int_{w}^{\infty} p'(t)\,\mathrm{d}t \le \|\tilde{h}\|.\end{align*}

Hence the bound (4.7) follows for all $w>0$ .

Proof of (4.8)–(4.10). As $ \lambda(\theta {\mathrm{e}}^{-\lambda w} -1)f(w) = \lambda\theta {\mathrm{e}}^{-\lambda w} f(w) - \lambda f(w),$ the triangle inequality along with (4.7) and (4.6) gives (4.8). To show (4.9), using the triangle inequality, from (4.4) we obtain $|f'({w})| \le |h({w})-\mathbb{E} h({W})|+|\lambda (\theta {\mathrm{e}}^{-\lambda {w}}-1)f({w})|$ , and using (4.7) yields the bound (4.9).

Now, for h differentiable at w, taking the first-order derivative in (4.4) gives $ |f''(w)| \le |h'(w)| + |\lambda(\theta {\mathrm{e}}^{-\lambda w} -1)f'(w)| + |\theta \lambda^2 {\mathrm{e}}^{-\lambda w}f(w)|. $ Using (4.6) and (4.9), we obtain the bound (4.10) through

\begin{align*} |f''(w)| &\le \|h'\| + \lambda|\theta {\mathrm{e}}^{-\lambda w} -1| 2\|\tilde{h}\| + \theta \lambda^2 \frac{\|\tilde{h}\|}{\theta \lambda} \le \|h'\| + 2\lambda\theta {\mathrm{e}}^{-\lambda w} \|\tilde{h}\| + 2\lambda \|\tilde{h}\| + \lambda \|\tilde{h}\|.\end{align*}

This completes the proof.

Remark 1. If $\theta \rightarrow 0$ , the PE distribution converges to the exponential distribution $\mathrm{Exp} (\lambda)$ ; when $\lambda = 1$ , (4.4) reduces to the Stein equation (4.2) in [Reference Peköz and Röllin27]. For this simplified version, the bound in [Reference Peköz and Röllin27] is only one-half of the bound (4.9); this discrepancy arises through our use of the triangle inequality for $\theta > 0$ .

In the following subsections, we compare a PE distribution with a distribution of a maximum of a random number of i.i.d. random variables, with a generalised Poisson–exponential distribution, and with a Poisson–geometric distribution.

4.1. Approximating the distribution of the maximum of a random number of i.i.d. random variables by a PE distribution

Let $M\in \mathbb{N}$ be independent of ${\textbf{X}} = (X_1, X_2, \ldots)$ , a sequence of i.i.d. random variables, and let $W=W_1(M, {\textbf{X}})$ have a PDF of the form (2.12) for $\alpha = 1$ . Our first comparison result employs Stein’s method.

Corollary 1. Assume that the $X_i$ have differentiable PDF $p_X$ , CDF $F_X$ , and score function $\rho_X$ . Let $W = W_1(M, {\textbf{X}}) = \max \{X_1, \ldots, X_M\}$ . Then

(4.11) \begin{equation} \begin{split} & d_{\rm TV}\big(\mathrm{PE}(\theta, \lambda), \mathcal{L}(W_1(M, {\textbf{X}}))\big) \\ & \quad\le \frac{4}{\lambda} \left(\mathbb{E}\left|\theta \lambda {\mathrm{e}}^{-\lambda W}- \frac{(G_M'' \circ F_X)(W)}{(G_M' \circ F_X)(W)} p_X(W) \right| + \mathbb{E} \bigl| - \lambda - \rho_X (W) \bigr|\right) . \end{split}\end{equation}

If M is a zero-truncated Poisson $(\theta_m)$ random variable, then (4.11) reduces to

\begin{align*} d_{\rm TV}\big(\mathrm{PE}(\theta, \lambda), \mathcal{L}(W_1(M, {\textbf{X}}))\big) \le \frac{4}{\lambda} \Big(\mathbb{E} \bigl|\theta\lambda {\mathrm{e}}^{-\lambda W} - \theta_mp_X(W) \bigr| + \mathbb{E} \bigl| - \lambda - \rho_X (W) \bigr| \Big) . \end{align*}

Proof. We employ Proposition 1. For a zero-truncated Poisson random variable N with parameter $\theta$ such that $G_N''(\cdot) / G_N'(\cdot) = \theta$ and for $Y \sim \mathrm{Exp}(\lambda)$ with PDF $f_Y(y) = \lambda {\mathrm{e}}^{-\lambda y}$ , using (3.1) along with (4.8) and taking h to be an indicator function so that $\| \tilde{h} \| \le 2 \|h\| \le 2$ gives (4.11). The simplification when M is a zero-truncated Poisson random variable follows from $G_M''(\cdot) / G_M'(\cdot) = \theta_m$ .

Remark 2. As $-\lambda$ is the score function of the exponential distribution $\mathrm{Exp}(\lambda)$ , for M being a zero-truncated $\mathrm{Poisson}(\theta)$ random variable, the bound in Corollary 1 is close to zero if the density and the score function of X are close to those of $\mathrm{Exp}(\lambda)$ .
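To illustrate Remark 2 numerically, the following sketch (ours; the Gamma lifetime with shape close to 1 is an arbitrary choice of a distribution near $\mathrm{Exp}(\lambda)$ ) evaluates the bound of Corollary 1 by Monte Carlo:

```python
import numpy as np
from scipy.stats import gamma

rng = np.random.default_rng(2)
theta = theta_m = 2.0
lam, a = 1.5, 1.05
rate = a * lam                       # Gamma(a, rate) has mean a/rate = 1/lam, close to Exp(lam)

m = rng.poisson(theta_m, 200_000)
m = m[m > 0]                         # zero-truncated Poisson(theta_m)
u = rng.uniform(size=m.size)
W = gamma.ppf(u ** (1.0 / m), a, scale=1 / rate)   # max of m i.i.d. Gamma via inverse CDF

p_X = gamma.pdf(W, a, scale=1 / rate)
rho_X = (a - 1) / W - rate           # score function of the Gamma(a, rate) density
term1 = np.abs(theta * lam * np.exp(-lam * W) - theta_m * p_X).mean()
term2 = np.abs(-lam - rho_X).mean()
print(4 / lam * (term1 + term2))     # Monte Carlo value of the bound on d_TV
```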

The next result is based on the Lindeberg argument.

Corollary 2. Let $W_1(N, {\textbf{E}}) \sim \mathrm{PE}(\theta, \lambda)$ and let $W_{1}(M, {\textbf{X}}) = \max(X_1, \ldots, X_M)$ have a CCR distribution. Then

\begin{align*} d_{\rm BW}\bigl(\mathrm{PE}(\theta, \lambda), \mathcal{L}(W_{1}(M, {\textbf{X}}))\bigr) & \le \mathbb{E} M \, \mathbb{E}| E_1 - X_1| \nonumber \\ & \quad + \frac{1}{\lambda}\sum_{m,n=1}^\infty \mathbb{P} (M=m, N=n) \left( H_{\max\{m,n\}} - H_{\min\{m,n\}}\right), \end{align*}

where $H_n$ is the nth harmonic number.

If M is also a zero-truncated $\mathrm{Poisson}(\theta)$ random variable, then

\begin{align*} d_{\rm BW}\bigl(\mathrm{PE}(\theta, \lambda), \mathcal{L}(W_{1}(M, {\textbf{X}}))\bigr) \le \frac{\theta}{1 - {\mathrm{e}}^{-\theta} } \,\mathbb{E} | E_1 - X_1|. \end{align*}

Proof. Proposition 3 gives

\begin{align*} & \big|\mathbb{E} h(W_{1}(N, {\textbf{E}})) - \mathbb{E} h (W_{1}(M, {\textbf{X}})) \big| \\ & \quad \le \| h'\| \sum_{i=1}^\infty \mathbb{E} | E_i - X_i| \,\mathbb{P} (M \ge i) \\ &\qquad + \sum_{m,n=1}^\infty \mathbb{P} (M=m, N=n) \,\big|\mathbb{E} h(W_{1}(n, {\textbf{E}})) - \mathbb{E} h(W_{1}(m, {\textbf{E}}))\big|. \end{align*}

Since the random variables in ${\textbf{X}}$ are i.i.d., those in ${\textbf{E}}$ are i.i.d. exponential with parameter $\lambda$ , and the expectation of the maximum of n i.i.d. $\mathrm{Exp}(\lambda)$ random variables is $\frac{1}{\lambda}\sum_{i=1}^n \frac1i = H_n/\lambda$ , bounding the second term by $\|h'\| \sum_{m,n} \mathbb{P}(M=m, N=n)\, \frac{1}{\lambda}\bigl( H_{\max\{m,n\}} - H_{\min\{m,n\}}\bigr)$ and using $\|h'\| \le 1$ for $h \in \textrm{Lip}_b(1)$ , the assertion follows.
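The second bound of Corollary 2 is straightforward to evaluate by Monte Carlo; the sketch below (ours) uses the quantile coupling of $E_1$ and $X_1$ , with a Weibull lifetime as an arbitrary example:

```python
import numpy as np

rng = np.random.default_rng(3)
theta, lam, k = 2.0, 1.5, 1.1        # M, N zero-truncated Poisson(theta); X_i Weibull(shape k)

u = rng.uniform(size=500_000)
e = -np.log(1 - u) / lam                      # E_1 ~ Exp(lam), via inverse CDF
x = (-np.log(1 - u)) ** (1 / k) / lam         # X_1: Weibull coupled through the same uniform

print(theta / (1 - np.exp(-theta)) * np.abs(e - x).mean())   # bound on d_BW
```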

4.2. Approximating the generalised Poisson–exponential distribution

The PE distribution has an increasing or constant failure rate. To model decreasing failure rate as well, Fatima & Roohi [Reference Fatima and Roohi12] introduced the family of generalised Poisson–exponential (GPE) distributions. The differentiable PDF of a GPE distribution with parameters $\beta, \theta,\lambda > 0 $ , denoted by $\mathrm{GPE}(\theta,\lambda,\beta)$ , is

(4.12) \begin{equation} p(x| \theta, \lambda, \beta)= \frac{\beta \theta \lambda {\mathrm{e}}^{-\lambda x - \theta \beta {\mathrm{e}}^{-\lambda x}}}{(1 - {\mathrm{e}}^{-\theta})^{\beta}}\, \Big(1-{\mathrm{e}}^{-\theta + \theta {\mathrm{e}}^{-\lambda x}} \Big)^{\beta -1} , \quad x > 0.\end{equation}

For $\beta = 1$ the density of the GPE distribution simplifies to that of the PE distribution given in (4.1). For ${0 < \beta < 1}$ the PDF of the GPE distribution is monotonically decreasing, while for $\beta \ge 1$ it is unimodal and positively skewed, with skewness depending on the shape parameters $\beta$ and $\theta$ . The shape of the hazard function also depends on these two shape parameters. For example, for $\theta = 1$ and $\lambda =2$ , the failure rate is decreasing for $ 0 < \beta <1$ and increasing for $\beta \ge 1$ , as in Figure 2 of [Reference Fatima and Roohi12].

For a data set from [Reference Aarset1] which consists of 50 observations on the time to first failure of devices, in [Reference Fatima and Roohi12] it was shown that the GPE distribution provides a better fit than the PE and some other candidate distributions. However, a GPE distribution is not as easy to manipulate and interpret as a PE distribution. Therefore, a natural question is how to quantify the sacrifice when approximating a GPE distribution with a PE distribution. Here we note that GPE distributions are not from the CCR family, and hence Proposition 3 cannot be applied. Instead we use Stein’s method to bound the approximation error.

For such an approximation to be intuitive, the failure rate of the approximating distribution should be qualitatively similar. Hence we restrict attention to the case of $\beta \ge 1$ , for which both the GPE and the PE distributions have an increasing failure rate. As an aside, we note that for $\beta \ge 1$ , the limit of the PDF at 0 is 0, while for $0 < \beta < 1$ the PDF diverges at 0; the latter behaviour leads to a Stein class for the GPE score Stein operator which differs from the Stein class for the PE score Stein operator.

We note here that if $X \sim \mathrm{GPE}(\theta, \lambda, \beta)$ , then for $\beta \ge 1$ we have

(4.13) \begin{equation} \mathbb{E} X \le \frac{\beta \theta}{\lambda(1 - {\mathrm{e}}^{-\theta})^{\beta}}.\end{equation}

The proof of (4.13) is given in the Appendix.
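The bound (4.13) is easy to check numerically; for instance (our own illustration, with arbitrary parameter values):

```python
import numpy as np
from scipy.integrate import quad

theta, lam, beta = 2.0, 1.5, 1.2   # illustrative values with beta >= 1

def pdf_gpe(x):
    """GPE(theta, lam, beta) density (4.12)."""
    return (beta * theta * lam
            * np.exp(-lam * x - theta * beta * np.exp(-lam * x))
            / (1 - np.exp(-theta)) ** beta
            * (1 - np.exp(-theta + theta * np.exp(-lam * x))) ** (beta - 1))

mean_x = quad(lambda x: x * pdf_gpe(x), 0, np.inf)[0]
print(mean_x, beta * theta / (lam * (1 - np.exp(-theta)) ** beta))  # mean vs. bound (4.13)
```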

The GPE random variable is not of the CCR form (2.8), but its score function, given in (4.14) below, can be derived from its PDF (4.12) with parameters $\beta, \theta, \lambda > 0$ as

(4.14) \begin{equation} \rho(x) = \frac{p'(x)}{p(x)} = \lambda \theta {\mathrm{e}}^{-\lambda x} \Bigg[\beta +\frac{(\beta-1) \,{\mathrm{e}}^{-\theta + \theta {\mathrm{e}}^{-\lambda x}}}{1-{\mathrm{e}}^{-\theta + \theta {\mathrm{e}}^{-\lambda x}}}\Bigg] - \lambda, \quad x > 0.\end{equation}

In the following theorem we bound the distance between a GPE with $\beta \ge 1$ and a PE distribution, using their corresponding score Stein operators in (2.6).

Theorem 1. Let $W \sim \mathrm{PE}(\theta_1, \lambda_1)$ and $X \sim \mathrm{GPE}(\theta_2, \lambda_2, \beta)$ , let $h\,:\, \mathbb{R}^{+}_{>0} \rightarrow \mathbb{R}$ be bounded, and let $\tilde h(x) = h(x) - \mathbb{E} h(W)$ . Then for $\lambda_1 \le \lambda_2$ and $\beta \ge 1$ ,

(4.15) \begin{equation} \big|\mathbb{E} h(X)-\mathbb{E} h(W)\big| \le \|\tilde{h}\| \left\{ \frac{\lambda_2}{\lambda_1}|\beta-1| + \left|\frac{\lambda_2 \theta_2}{\lambda_1 \theta_1}\beta - 1 \right| + \left( \frac{\lambda_2}{\lambda_1} - 1\right) \left( \lambda_1 \mathbb{E} X + 2\right) \right\}.\end{equation}

Proof. Let $p_W$ and $p_X$ denote the PDFs of W and X, and let $\mathcal{T}_{X}$ denote a score Stein operator for a GPE distribution. To employ (2.6), we first check that $f_W$ as in (2.5) for the PE distribution, for h bounded, is in the Stein class of $\mathcal{T}_X$ . Invoking Lemma 1, $f_W$ is bounded. Now $ {\mathbb{E}} [{\mathcal T}_X\ f_W (X) ]= \int_0^\infty \frac{(f_W p_X)' (x)}{p_X(x)} p_X (x) \,\mathrm{d} x= \lim_{x \to \infty } (f_W p_X) (x) - \lim_{x \rightarrow 0} (f_W p_X) (x),$ and for $\beta \ge 1$ we have that $ {(f_Wp_{X})(x)} \rightarrow 0$ as $x \rightarrow 0$ and as $x \rightarrow \infty$ , showing that $f_W$ is in the Stein class for $\mathcal{T}_{X}$ . Applying (2.6) with the score functions from (4.2) and (4.14), we have

\begin{align*} & \mathbb{E} h(X) -\mathbb{E} h(W) \nonumber \\ &= \mathbb{E} \bigg[\lambda_2 \theta_2 {\mathrm{e}}^{-\lambda_2 X} \Bigg(\beta +\frac{(\beta-1) {\mathrm{e}}^{-\theta_2 + \theta_2 {\mathrm{e}}^{-\lambda_2 X}}}{1-{\mathrm{e}}^{-\theta_2 + \theta_2 {\mathrm{e}}^{-\lambda_2 X}}}\Bigg) - \lambda_2 - \lambda_1 (\theta_1 {\mathrm{e}}^{-\lambda_1 X}-1) \bigg]f_{W}(X) \\ &\le \mathbb{E} \left[\lambda_2 \theta_2 (\beta-1) {\mathrm{e}}^{(\lambda_1-\lambda_2) X} \frac{ {\mathrm{e}}^{-\theta_2 + \theta_2 {\mathrm{e}}^{-\lambda_2 X}}}{1-{\mathrm{e}}^{-\theta_2 + \theta_2 {\mathrm{e}}^{-\lambda_2 X}}} \,{\mathrm{e}}^{-\lambda_1 X} |f_W(X)|\right] \nonumber \\ &\quad+ \mathbb{E}\left|\lambda_2\theta_2\beta {\mathrm{e}}^{-\lambda_2 X} - \lambda_1 \theta_1 {\mathrm{e}}^{-\lambda_1 X}\right| |f_W(X)| + \mathbb{E}|\lambda_1 - \lambda_2|\,|f_W(X)|.\end{align*}

Next,

(4.16) \begin{equation}{ |\theta_2 \lambda_2 \beta {\mathrm{e}}^{-\lambda_2 x} - \theta_1 \lambda_1 {\mathrm{e}}^{-\lambda_1 x}|} \le |\theta_2 \lambda_2 \beta -\theta_1 \lambda_1 |\,{\mathrm{e}}^{(\lambda_1 - \lambda_2) x} {\mathrm{e}}^{-\lambda_1 x} + (\lambda_2-\lambda_1) x \theta_1 \lambda_1 {\mathrm{e}}^{-\lambda_1 x}, \end{equation}

since $|{\mathrm{e}}^{-x} - 1 | \le x$ for all $x \ge 0$ . With ${\mathrm{e}}^{(\lambda_1 - \lambda_2) x} \le 1$ for $\lambda_1 \le \lambda_2$ , we write

(4.17) \begin{align} & |\mathbb{E} h(X) -\mathbb{E} h(W)| \nonumber \\ &\le \lambda_2 \theta_2 |\beta-1| \,\mathbb{E} \left|\frac{ {\mathrm{e}}^{-\theta_2 + \theta_2 {\mathrm{e}}^{-\lambda_2 X}}}{1-{\mathrm{e}}^{-\theta_2 + \theta_2 {\mathrm{e}}^{-\lambda_2 X}}} {\mathrm{e}}^{-\lambda_1 X} f_W(X)\right| + ( \lambda_2 - \lambda_1) \,\mathbb{E}|f_W(X)| \nonumber \\ &\quad + \lambda_1 \theta_1 \left|\frac{\lambda_2 \theta_2}{\lambda_1 \theta_1}\beta - 1 \right|\mathbb{E}|{\mathrm{e}}^{-\lambda_1 X}f_W(X)| + \lambda_1 \theta_1 ( \lambda_2 - \lambda_1) \,\mathbb{E} X |{\mathrm{e}}^{-\lambda_1X}f_W(X)| . \end{align}

Now, using (4.5) as well as ${\mathrm{e}}^{x} - 1 \ge x$ and $1-{\mathrm{e}}^{-x} \le x$ for $x \ge 0$ , we have

(4.18) \begin{align} \left|\frac{{\mathrm{e}}^{-\theta_2 + \theta_2 {\mathrm{e}}^{-\lambda_2 x}}}{1-{\mathrm{e}}^{-\theta_2 + \theta_2 {\mathrm{e}}^{-\lambda_2 x}}}{\mathrm{e}}^{-\lambda_1 x} f_W(x)\right| &\le \frac{\|\tilde{h}\|}{\lambda_1 \theta_1} \frac{{\mathrm{e}}^{-\theta_2 + \theta_2 {\mathrm{e}}^{-\lambda_2 x}}}{1-{\mathrm{e}}^{-\theta_2 + \theta_2 {\mathrm{e}}^{-\lambda_2 x}}}\, (1-{\mathrm{e}}^{-\theta_1(1 - {\mathrm{e}}^{-\lambda_1 x})}) \nonumber \\ & \le \frac{\|\tilde{h}\|}{\lambda_1 \theta_2} .\end{align}

Using (4.6), (4.8), (4.13) and (4.18) in (4.17) gives the bound (4.15).

Remark 3. For $\lambda_1=\lambda_2$ and $\theta_1=\theta_2$ in Theorem 1, the bound (4.15) can be improved by a factor of 2, using that with $f_{W}$ being the solution of the PE score Stein equation,

\begin{align*} \big|\mathbb{E} h(X) -\mathbb{E} h(W)\big| \le \lambda \theta |\beta - 1|\, \mathbb{E} \! \left(\frac{{\mathrm{e}}^{-\lambda X}}{1-{\mathrm{e}}^{-\theta + \theta {\mathrm{e}}^{-\lambda X}}}\,|f_{W}(X)| \right).\end{align*}

With (4.5) and $\tilde h(x) = h(x) - \mathbb{E} h(W)$ we obtain

(4.19) \begin{equation} \big|\mathbb{E} h(X)-\mathbb{E} h(W)\big|\le \|\tilde{h}\|\, |\beta -1 |.\end{equation}

This bound depends solely on the parameter $\beta$ and tends to 0 as $\beta\rightarrow 1$ , which is in line with the fact that for $\beta \rightarrow 1$ , $\mathrm{GPE}(\theta, \lambda, \beta)$ converges to $\mathrm{PE}(\theta, \lambda)$ ; see [Reference Fatima and Roohi12].
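Via (2.2), the bound (4.19) with $\|h\| \le 1$ (so $\|\tilde h\| \le 2$ ) gives $d_{\rm TV}(\mathrm{GPE}(\theta,\lambda,\beta), \mathrm{PE}(\theta,\lambda)) \le |\beta - 1|$ . A numerical comparison with the exact total variation distance (our own check, arbitrary parameter values):

```python
import numpy as np
from scipy.integrate import quad

theta, lam, beta = 2.0, 1.5, 1.2

def pdf_pe(x):      # (4.1)
    return theta * lam * np.exp(-lam * x - theta * np.exp(-lam * x)) / (1 - np.exp(-theta))

def pdf_gpe(x):     # (4.12)
    return (beta * theta * lam * np.exp(-lam * x - theta * beta * np.exp(-lam * x))
            / (1 - np.exp(-theta)) ** beta
            * (1 - np.exp(-theta + theta * np.exp(-lam * x))) ** (beta - 1))

# d_TV as half the L1 distance between the densities, versus the bound |beta - 1|
d_tv = 0.5 * quad(lambda x: abs(pdf_gpe(x) - pdf_pe(x)), 0, np.inf, limit=200)[0]
print(d_tv, abs(beta - 1))
```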

The next result follows immediately from Theorem 1 and (4.13).

Corollary 3. Let $W_1 \sim \mathrm{PE}(\theta_1, \lambda_1)$ and $W_2 \sim \mathrm{PE}(\theta_2, \lambda_2)$ with $\lambda_1 \le \lambda_2$ . Let $\mathcal{H} = \{h\,:\, \mathbb{R}^+_{>0} \rightarrow \mathbb{R}, \,\|h\| \le 1\}$ . Then for all $h \in \mathcal{H}$ , letting $\tilde h(w) = h(w) - \mathbb{E} h(W_1)$ , we have from (4.15) with $\beta = 1$ that

(4.20) \begin{equation} \big|\mathbb{E} h(W_2)-\mathbb{E} h(W_1)\big| \le \|\tilde{h}\| \left\{ \left|\frac{\lambda_2 \theta_2}{\lambda_1 \theta_1} - 1 \right| + \left( \frac{\lambda_2}{\lambda_1} - 1\right) \left( \frac{\lambda_1 \theta_2}{\lambda_2 (1 - {\mathrm{e}}^{-\theta_2})} + 2\right) \right\}.\end{equation}

Remark 4.

  (i) For $\|h\| \le 1$ , so that $\|\tilde{h}\| \le 2$ , the bounds can easily be converted into bounds in total variation distance using (2.2).

  (ii) To bound $d_{\mathcal{H}}(\mathrm{PE}(\theta_1,\lambda_1), \mathrm{GPE}(\theta_2,\lambda_2,\beta))$ when $\lambda_1 > \lambda_2$ , we can use

    \begin{align*} d_{\mathcal{H}} \bigl(\mathrm{GPE}(\theta_2,\lambda_2,\beta) , \mathrm{PE}(\theta_1,\lambda_1)\bigr) & \le d_{\mathcal{H}} \bigl(\mathrm{GPE}(\theta_2,\lambda_2,\beta) ,\mathrm{PE}(\theta_2,\lambda_2)\bigr) \\ &\quad + d_{\mathcal{H}}\bigl( \mathrm{PE}(\theta_2,\lambda_2), \mathrm{PE}(\theta_1,\lambda_1)\bigr)\end{align*}
    and apply Remark 3 and Corollary 3.

4.3. Approximating the Poisson–geometric distribution

Next we consider the distribution of $W_G = \max\{T_1, \ldots, T_M\}$ , where $T_1, T_2, \ldots \in \{1, 2, \ldots\}$ are i.i.d. $\mathrm{Geometric}(p)$ random variables and M is an independent zero-truncated Poisson( $\theta$ ) random variable. The distribution of $W_G$ is the Poisson–geometric (PG) distribution, which has PMF

(4.21) \begin{align} \mathbb{P}(W_G=w) = \frac{{\mathrm{e}}^{-(1-p)^w\theta}- {\mathrm{e}}^{-(1-p)^{w-1}\theta}}{(1-{\mathrm{e}}^{-\theta})}, \quad {w} \in \mathbb{N} \end{align}

and the discrete backward score function

\begin{align*}\rho_{W_{G}}(w) = 1- \frac{{\mathrm{e}}^{-(1-p)^{w-1}\theta}- {\mathrm{e}}^{-(1-p)^{w-2}\theta}}{{\mathrm{e}}^{-(1-p)^{w}\theta}- {\mathrm{e}}^{-(1-p)^{w-1}\theta}}, \quad w \in \mathbb{N}.\end{align*}

For $T \sim \mathrm{Geometric}(\lambda/n)$ , the random variable $n^{-1} {T}$ converges in distribution to $\mathrm{Exp}(\lambda)$ as $n \rightarrow \infty$ , and hence it is plausible to approximate the distribution of $Z_n={W_{G,n}}/n$ , for ${W_{G,n}} \sim \mathrm{PG}(\theta, \lambda/n)$ , by a corresponding PE distribution. With $q_n = 1-\lambda/n $ and $\tilde{\rho}_n (z) = \rho_{W_G} (nz)$ ,

\begin{align*} \tilde{\rho}_n (z) = 1- \frac{{\mathrm{e}}^{-q_n^{nz-1}\theta}- {\mathrm{e}}^{-q_n^{nz-2}\theta}}{{\mathrm{e}}^{-q_n^{nz}\theta}- {\mathrm{e}}^{-q_n^{nz-1}\theta}}, \quad nz \in \mathbb{N}.\end{align*}

This function is the ratio of two exponential functions, complicating the comparison using (2.6). To simplify the comparison, we use Proposition 2. From (3.4), a standardised PG Stein operator for $Z_n$ is

(4.22) \begin{equation} {\mathcal{T}_{Z_n} ^{(d)}} (g(z)) = g(z) d(z) - g \!\left( z - \frac1n \right) d\! \left( z - \frac1n \right) +\tilde{\rho}_n(z) g \!\left( z - \frac1n \right)d \!\left( z - \frac1n \right);\end{equation}

here we choose

(4.23) \begin{equation} d(z) = {\mathrm{e}}^{-q_n^{nz + 1}\theta}- {\mathrm{e}}^{-q_n^{nz}\theta}= {\mathrm{e}}^{ -\theta {q_n^{nz}}} \bigl( {\mathrm{e}}^{ -\theta (q_n-1) q_n^{nz}} -1 \bigr)\end{equation}

so that $ \tilde{\rho}_n(z) d\left( z - \frac1n \right) = {\mathrm{e}}^{-q_n^{nz }\theta}- 2 {\mathrm{e}}^{-q_n^{nz-1}\theta} + {\mathrm{e}}^{-q_n^{nz -2}\theta}.$ As $nd(z)\rightarrow \lambda \theta {\mathrm{e}}^{-\lambda z - \theta {\mathrm{e}}^{-\lambda z}}$ as $n \rightarrow \infty$ , for the approximating PE distribution we choose

\begin{align*} c(z) = \frac{\lambda \theta}{n}{\mathrm{e}}^{-\lambda z - \theta {\mathrm{e}}^{-\lambda z}}, \quad z > 0\end{align*}

as a standardisation function in (3.3), giving rise to the standardised PE Stein equation

(4.24) \begin{equation} c(w) g'(w) + \big[c(w) \lambda(\theta {\mathrm{e}}^{-\lambda w}-1)+ c'(w)\big] g(w) = h(w) - \mathbb{E} h(W)\end{equation}

for $W \sim \mathrm{PE}(\theta, \lambda)$ . Again we can bound the solution of this Stein equation as follows.

Lemma 2. Let $W \sim \mathrm{PE}(\theta, \lambda)$ , let g(w) be the solution of the Stein equation (4.24) for bounded differentiable h such that $\|h\| \le 1$ and $\|h'\| \le 1$ , set $\tilde{h} (x) = h(x) - \mathbb{E} h(W)$ , and let $ c(w) = \frac{\lambda \theta}{n}\,{\mathrm{e}}^{-\lambda {w} - \theta {\mathrm{e}}^{-\lambda {w}}}$ . Then we have

(4.25) \begin{align} \big|{\mathrm{e}}^{-2\lambda w - \theta {\mathrm{e}}^{-\lambda w}} g(w)\big| &\le \frac{n}{\lambda^2\theta^2} \|\tilde{h}\|, \end{align}
(4.26) \begin{align} \big|{\mathrm{e}}^{-\lambda w - \theta {\mathrm{e}}^{-\lambda w}} (\theta {\mathrm{e}}^{-\lambda w} - 1)g(w) \big| &\le \frac{n}{\lambda^2 \theta}\|\tilde{h}\|, \end{align}
(4.27) \begin{align} \big|{\mathrm{e}}^{-\lambda w - \theta {\mathrm{e}}^{-\lambda w}}g(w)\big| &\le 2\frac{n}{\lambda^2\theta} \|\tilde{h}\|,\end{align}
(4.28) \begin{align} \big|{\mathrm{e}}^{-\lambda w - \theta {\mathrm{e}}^{-\lambda w}}g'(w) \big| &\le 3 \frac{n}{\lambda \theta} \|\tilde{h}\|, \end{align}
(4.29) \begin{align} \bigg|\frac{\lambda \theta}{n}{\mathrm{e}}^{-\lambda w - \theta {\mathrm{e}}^{-\lambda w}}g''(w)\bigg| &\le \|h'\| + 9 \lambda\theta \|\tilde{h}\| + 11 \lambda \|\tilde{h}\|, \\[8pt]\nonumber \end{align}

and

(4.30) \begin{equation} \frac{c(x)}{ c\left( x + \frac{\rho}{n}\right)} \le \begin{cases}\! {\mathrm{e}}^{\theta ({\mathrm{e}}^{ \frac{\lambda}{n}} -1)}, & \rho = -1, \\ {\mathrm{e}}^{\frac{\lambda}{n}}, & 0 < \rho <1. \end{cases}\end{equation}

Proof. In (4.24), $cg = f$ is the solution of the Stein equation (4.4), so we use the bounds for f to bound g. The bound (4.6) in Lemma 1 immediately gives (4.25). Also $c'(w) = c(w) \lambda (\theta {\mathrm{e}}^{-\lambda w} - 1) $ so that $c'(w)g(w) = \lambda (\theta {\mathrm{e}}^{-\lambda w} - 1) f(w)$ ; using (4.7) we get (4.26). Combining (4.25), (4.26), and the triangle inequality gives (4.27).

Since $(cg)' = cg' + c'g$ , rearranging and using (4.9) and (4.7) with the triangle inequality gives (4.28). Next, since $cg'' = (cg)'' - c''g - 2c'g'$ , we bound the last two terms; using (4.28) we have

\begin{align*} |c'(w)g'(w)| \le \lambda|\theta {\mathrm{e}}^{-\lambda w} - 1|\,|c(w)g'(w)| \le 3\lambda(\theta+1) \|\tilde{h}\|.\end{align*}

For $c''(w)g(w) = \lambda^2(\theta {\mathrm{e}}^{-\lambda w} -1)^2 c(w)g(w) - \lambda^2\theta {\mathrm{e}}^{-\lambda w}c(w)g(w)$ , using the triangle inequality, (4.26), and (4.25) yields

\begin{align*} |c''(w)g(w)| \le \lambda|\theta {\mathrm{e}}^{-\lambda w} - 1|\|\tilde{h}\| + \lambda \|\tilde{h}\| \le \lambda\theta \|\tilde{h}\| + 2\lambda \|\tilde{h}\|.\end{align*}

These two results along with (4.10) give (4.29). Now, for any $0 < \rho < 1$ ,

\begin{align*} \frac{c(x)}{ c\left( x + \frac{\rho}{n}\right)} = {\mathrm{e}}^{\lambda \frac{\rho}{n} - \theta {\mathrm{e}}^{-\lambda x} (1- {\mathrm{e}}^{-\lambda \frac{\rho}{n}}) } \le {\mathrm{e}}^{\lambda \frac{\rho}{n}};\end{align*}

for $\rho = -1$ we have $\frac{c(x)}{ c\left( x - \frac{1}{n}\right)} = {\mathrm{e}}^{- \frac{\lambda}{n} + \theta {\mathrm{e}}^{-\lambda x} ({\mathrm{e}}^{ \frac{\lambda}{n}} -1) } \le {\mathrm{e}}^{ \theta ({\mathrm{e}}^{\frac{\lambda }{n}} -1)}$ , yielding (4.30).

Theorem 2. Let $W_{G,n} \sim \mathrm{PG}(\theta, p_n) $ with $p_n = \lambda/n$ where $ 0< \lambda < n$ , and let $W \sim \mathrm{PE}(\theta,\lambda)$ . Then for the scaled PG random variable $Z_n = W_{G,n}/n$ and any bounded function h with bounded first derivative, we have

(4.31) \begin{align} &\big|\mathbb{E} h(Z_n) - \mathbb{E} h(W)\big| \nonumber\\ & \le \frac{{\mathrm{e}}^{\frac{\lambda}{n}}}{n} \biggl(1+\frac{{\mathrm{e}}^{ \frac{\theta\lambda}{n {\mathrm{e}}}}}{n}B_1(\theta, \lambda)\biggr)\|h'\| \nonumber \\ & \quad + \frac{1}{n} \left(\left(1-\frac{\lambda}{n}\right)^{-6} {\mathrm{e}}^{\theta {\mathrm{e}}^{ \frac{\lambda}{n}} + \frac{\theta\lambda}{n} + 2\frac{\theta\lambda}{n {\mathrm{e}}} + \frac{\lambda}{n}}B_2(\theta, \lambda) + \frac{1}{2n} {\mathrm{e}}^{\frac{\lambda}{n} + \frac{\theta\lambda}{n {\mathrm{e}}}}B_3(\theta, \lambda)\right)\|\tilde{h}\| , \end{align}

where

\begin{align*} B_1(\theta, \lambda) &= \lambda \max(1,\theta) + \frac{{\mathrm{e}}^{\lambda \theta + {\frac{\theta}{{\mathrm{e}}}}}}{2\theta\lambda } + \frac{\theta\lambda}{2{\mathrm{e}}}\frac{(1+{\mathrm{e}}-{\mathrm{e}}^{-\theta})}{(1-{\mathrm{e}}^{-\theta})} ,\\B_2(\theta, \lambda) &= 11 \lambda + 9 \lambda \theta + 6\lambda \max(1,\theta) + \frac{3{\mathrm{e}}^{\lambda \theta + {\frac{\theta}{{\mathrm{e}}}}}}{\theta \lambda} + 3\theta\lambda \frac{(1+{\mathrm{e}}-{\mathrm{e}}^{-\theta})}{(1-{\mathrm{e}}^{-\theta}){\mathrm{e}}} \nonumber \\&\quad + 4\lambda {\mathrm{e}}^{-\theta}\left(1+ 2 \max(1,\theta) + \frac{(3{\mathrm{e}} + 1) \theta}{{\mathrm{e}}}+ \frac{(5{\mathrm{e}} + 3)\theta^2}{3{\mathrm{e}}} + \frac{\theta(2\theta +1)}{(1-{\mathrm{e}}^{-\theta})} \right),\\B_3(\theta, \lambda) & = (9 \lambda \theta + 11 \lambda ) \left(2\lambda \max(1,\theta) + \frac{{\mathrm{e}}^{\lambda \theta + {\frac{\theta}{{\mathrm{e}}}}}}{\theta \lambda} + \theta\lambda \frac{(1+{\mathrm{e}}-{\mathrm{e}}^{-\theta})}{(1-{\mathrm{e}}^{-\theta}){\mathrm{e}}} \right).\end{align*}

Remark 5.

  (i) For fixed $\lambda $ and $\theta$ , the bound (4.31) is of $O(n^{-1})$ ; see the numerical illustration following this remark. As $n \rightarrow \infty$ , for $\lambda = \lambda(n)$ and $\theta = \theta(n)$ the bound decreases to 0 as long as ${\lambda(n) \theta(n)}/{n} \rightarrow 0$ and ${\lambda(n)}/{n} \rightarrow 0$ .

  (ii) Equation (4.31) can be translated into a bound in the bounded Wasserstein distance using $\mathrm{Lip}_b(1)$ as the class of test functions.
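A numerical illustration of the $O(n^{-1})$ rate (our own check; we use the Kolmogorov distance between the CDF of $Z_n$ and that of $\mathrm{PE}(\theta,\lambda)$ as an easily computed proxy):

```python
import numpy as np

theta, lam = 2.0, 1.5

def cdf_pe(w):   # CDF of PE(theta, lam), from (2.11)
    return (np.exp(theta * (1 - np.exp(-lam * w))) - 1) / (np.exp(theta) - 1)

for n in [10, 100, 1000]:
    p = lam / n
    w = np.arange(1, 60 * n)   # effective support of W_{G,n} ~ PG(theta, lam/n)
    # CDF of W_{G,n}, obtained by telescoping the PMF (4.21)
    cdf_pg = (np.exp(-theta * (1 - p) ** w) - np.exp(-theta)) / (1 - np.exp(-theta))
    print(n, np.abs(cdf_pg - cdf_pe(w / n)).max())   # decreases roughly like 1/n
```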

Proof. We employ Proposition 2. First we note that $ c(w) \rho(w) = c'(w)$ and that $ {\tilde \rho}_n (w) d\bigl( w - \frac1n \bigr)= d \bigl( w - \frac1n \bigr) - d \bigl( w - \frac2n \bigr)$ . Thus, for (3.6), with $h \in \textrm{Lip}_b(1)$ we have

(4.32) \begin{align} & {\big|\mathbb{E} h(Z_n) - \mathbb{E} h(W)\big|} \nonumber \\ & \le \bigg| \mathbb{E} [ n \Delta^{-n} (dg)(Z_n) - (cg)'(Z_n) ] + \mathbb{E}\Big[ n {\tilde \rho}_n (Z_n) (dg)\Bigl( Z_n - \frac1n \Bigr) - (cg)\left( Z_n\right) \rho(Z_n)\Big] \bigg| \nonumber \\ & \le \mathbb{E}\left| n \left\{ g(Z_n) - g \left( Z_n - \frac1n \right)\right\} c(Z_n) - g'(Z_n) c(Z_n) \right| \end{align}
(4.33) \begin{align} & \quad + \mathbb{E} \left| n \left\{ g(Z_n) - g \left( Z_n - \frac1n \right)\right\} \left(d(Z_n) - c(Z_n) -\frac{2}{n} c'(Z_n)\right) \right|\end{align}
(4.34) \begin{align} & \quad + \mathbb{E} \left| d(Z_n) - d \left( Z_n - \frac2n \right) - \frac{2}{n} c'(Z_n)\right| \left|ng\left( Z_n - \frac1n \right)\right| .\\[8pt]\nonumber\end{align}

To bound the term (4.32), for some $0 < \rho < 1$ we write

\begin{align*} \mathbb{E}\left| n \left\{ g(Z_n) - g \left( Z_n - \frac1n \right)\right\} c(Z_n) - g'(Z_n) c(Z_n) \right| & \le \frac1n \,\mathbb{E} \left|c(Z_n) g''\left( Z_n + \frac{\rho}{n}\right) \right| \\ &\le \frac1n \| c g''\| \,\mathbb{E} \left| \frac{c(Z_n)}{ c\left( Z_n + \frac{\rho}{n}\right)} \right|.\end{align*}

Then (4.29) and (4.30) give the bound

(4.35) \begin{align} \mathbb{E}\left| n \{ g(Z_n) - g ( Z_n - n^{-1})\} c(Z_n) - g'(Z_n) c(Z_n) \right| \le \frac{{\mathrm{e}}^{\frac{\lambda}{n}}}{n} \big\{ \| h'\| + 9 \lambda \theta \| {\tilde h}\| + 11 \lambda \| {\tilde h}\|\big\}. \nonumber \\\end{align}

To bound (4.33), we let $\tau(z) = {\mathrm{e}}^{\lambda z + \theta {\mathrm{e}}^{-\lambda z}}$ , so that $\tau^{-1} g = \frac{n}{\theta\lambda}cg$ can be bounded as in Lemma 2, and write

\begin{align*} &\mathbb{E} \left| n \left\{ g(Z_n) - g \left( Z_n - \frac1n \right)\right\} \left(d(Z_n) - c(Z_n) -\frac{2}{n} c'(Z_n)\right) \right| \\ & \quad = \mathbb{E} \left| l(Z_n) n \left\{ g(Z_n) - g \left( Z_n - \frac1n \right)\right\}\tau^{-1}(Z_n) \right|,\end{align*}

where $l(Z_n) = \tau(Z_n)\bigl(d(Z_n) - c(Z_n) -\frac{2}{n} c'(Z_n)\bigr)$ . We show in the Appendix that

(4.36) \begin{align} |l(Z_n)| \le \frac{2\lambda^2 \theta}{n^2} \max(1,\theta) + \frac{1}{n^2} {\mathrm{e}}^{\lambda \theta + {\frac{\theta}{{\mathrm{e}}}}} + \frac{\theta^2\lambda^2}{n^2 } {\mathrm{e}}^{ \frac{\theta\lambda}{n {\mathrm{e}}}-1} + \frac{\theta\lambda^3Z_n}{n^2} {\mathrm{e}}^{ \frac{\theta\lambda}{n {\mathrm{e}}}}.\end{align}

By Taylor expansion, for some $0 < \epsilon < 1$ we have

(4.37) \begin{align} \left| n \left\{ g(Z_n) - g \left( Z_n - \frac1n \right)\right\}\tau^{-1}(Z_n) \right| &= \left| g'(Z_n)\tau^{-1}(Z_n) + \frac{1}{2n}g''\left(Z_n + \frac{\epsilon}{n}\right)\tau^{-1}(Z_n) \right| \nonumber \\ &\le \frac{n}{\theta\lambda}\|cg'\| + \frac{1}{2\theta\lambda} \|cg''\| \frac{c(Z_n)}{ c\left(Z_n + \frac{\epsilon}{n}\right)}. \end{align}

Using (4.36) and (4.37) along with (4.28), (4.29) and (4.30) yields

(4.38) \begin{align} &{\mathbb{E} \left| n \left\{ g(Z_n) - g \left( Z_n - \frac1n \right)\right\} \left(d(Z_n) - c(Z_n) -\frac{2}{n}c'(Z_n)\right) \right|} \nonumber \\ &\le \left(\frac{2\lambda^2 \theta}{n^2} \max(1,\theta) + \frac{1}{n^2} {\mathrm{e}}^{\lambda \theta + {\frac{\theta}{{\mathrm{e}}}}}+ \frac{\theta^2\lambda^2}{n^2 } {\mathrm{e}}^{ \frac{\theta\lambda}{n {\mathrm{e}}}-1} + \frac{\theta\lambda^3 \mathbb{E} Z_n}{n^2} {\mathrm{e}}^{ \frac{\theta\lambda}{n {\mathrm{e}}}}\right) \nonumber \\ &\quad\times \left\{ \frac{n}{\theta\lambda}3\|\tilde{h}\| + \frac{1}{2\theta\lambda} \{ \| h'\| + 9 \lambda \theta \| {\tilde h}\| + 11 \lambda \| {\tilde h}\|\} {\mathrm{e}}^{\frac{\lambda}{n}} \right\} . \end{align}

Finally, to bound (4.34), we show in the Appendix that

(4.39) \begin{align} & \left|\frac{2}{n} c'(z) - d(z) + d \left( z - \frac2n \right)\right|\left|ng\left( z - \frac1n\right)\right| \\ &\le 2\lambda \left\{ \left(1-\frac{\lambda}{n}\right)^{-2} -1 \right\} \max(1,\theta) c(z) \left| g\left( z - \frac1n\right)\right| \nonumber \\ &\quad + \frac{\theta\lambda^2}{n} \left(1-\frac{\lambda}{n}\right)^{-2} {\mathrm{e}}^{{\frac{\theta\lambda}{n {\mathrm{e}}}}} \Biggl\{ \frac{1}{3} \left(1-\frac{\lambda}{n}\right)^{-1} \nonumber \biggl[\theta + \theta {\mathrm{e}}^{\frac{\theta\lambda}{n}} + 6 {\mathrm{e}}^{\frac{\theta\lambda}{n}} + 12 \left(1-\frac{\lambda}{n}\right)^{-1} \\ & \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad\; + 8\theta \left(1-\frac{\lambda}{n}\right)^{-3}\biggr] + 2\left({\frac{\theta}{{\mathrm{e}}}} + 2\lambda z \right)\Biggr\}\nonumber c(z) \left| g\left( z - \frac1n\right)\right| \\ &\quad + \frac{2\lambda^2}{n}\left( 1- \frac{\lambda}{n}\right)^{-2}{\mathrm{e}}^{{\frac{\theta\lambda}{n {\mathrm{e}}}}}\left[ \left(1-\frac{\lambda}{n}\right)^{-1}{\mathrm{e}}^\frac{\theta \lambda}{n} + \lambda z + {\frac{\theta}{{\mathrm{e}}}} \right] c(z) \left|g\left( z - \frac1n\right)\right|. \nonumber\end{align}

Note that

(4.40) \begin{equation} 0 \le \left(1-\frac{\lambda}{n}\right)^{-2} -1 = \frac{\lambda}{n} \left(1-\frac{\lambda}{n}\right)^{-2}\left( 2 - \frac{\lambda}{n} \right)\end{equation}

and that, using (4.27) and (4.30),

\begin{align*}c(z) \left|g\left( z - \frac1n\right)\right|= \left|c(z) g\left( z - \frac1n\right)\right| \le \|cg\|\left| \frac{c(z)}{ c\left( z - \frac{1}{n}\right)} \right| \le \frac{2}{\lambda} {\mathrm{e}}^{\theta ({\mathrm{e}}^{ \frac{\lambda}{n}} -1)}\|\tilde{h}\|.\end{align*}

Combining this with (4.39) and (4.40), we bound (4.34) as

(4.41) \begin{align} & \mathbb{E} \left| d(Z_n) - d \left( Z_n - \frac2n \right) - \frac{2}{n} c'(Z_n)\right| \left|ng\left( Z_n - \frac1n \right)\right| \\ & \le \frac{4 \lambda}{n}\max(1,\theta) \left(1-\frac{\lambda}{n}\right)^{-2}\left|\frac{\lambda}{n} - 2 \right| {\mathrm{e}}^{\theta ({\mathrm{e}}^{ \frac{\lambda}{n}} -1)}\|\tilde{h}\| + \frac{2\theta\lambda}{n} \left(1-\frac{\lambda}{n}\right)^{-2} {\mathrm{e}}^{{\frac{\theta\lambda}{n {\mathrm{e}}}}} {\mathrm{e}}^{\theta ({\mathrm{e}}^{ \frac{\lambda}{n}} -1)}\|\tilde{h}\| \nonumber \\ &\quad \times \bigg\{ \frac{1}{3} \left(1-\frac{\lambda}{n}\right)^{-1} \left[\theta + \theta {\mathrm{e}}^{\frac{\theta\lambda}{n}} + 6 {\mathrm{e}}^{\frac{\theta\lambda}{n}} + 12 \left(1-\frac{\lambda}{n}\right)^{-1} + 8\theta \left(1-\frac{\lambda}{n}\right)^{-3}\right] \nonumber \\ &\qquad + 2\left({\frac{\theta}{{\mathrm{e}}}} + 2\lambda \mathbb{E} Z_n \right) + \frac{2}{\theta} \bigg[ \left(1-\frac{\lambda}{n}\right)^{-1}{\mathrm{e}}^{\frac{\theta \lambda}{n}} + \lambda \mathbb{E} Z_n + {\frac{\theta}{{\mathrm{e}}}} \bigg] \bigg\}. \nonumber\end{align}

To bound $\mathbb{E} Z_n $ , we argue as for (4.13); $W_{G,n} = \max \{ T_1, \ldots, T_M\} \le \sum_{i=1}^{M} T_i$ and so $ \mathbb{E} {W_{G,n}} \le \mathbb{E} \sum_{i=1}^{M} T_i = \frac{n\theta}{\lambda(1-{\mathrm{e}}^{-\theta})}$ , giving that $ \mathbb{E} Z_n = \frac{1}{n} \mathbb{E} W_{G,n}\le \frac{\theta}{\lambda(1-{\mathrm{e}}^{-\theta})}.$ Adding (4.35), (4.38), and (4.41) and simplifying gives (4.31).

The next result instead uses Proposition 3 to bound the distance between a PG and a PE distribution.

Corollary 4. Let $W = W_1(M,{\boldsymbol{E}}) \sim \mathrm{PE}(\theta, \lambda)$ and ${W_{G,n}} = W_1(M,{\boldsymbol{G}}) \sim \mathrm{PG}(\theta, \frac{\lambda}{n})$ with $G_i \sim \mathrm{Geometric}(\lambda/n)$ be two CCR random variables, and let $Z_n = \frac{{W_{G,n}}}{n}$ . Then, for all bounded Lipschitz functions $h\,:\, \mathbb{R}_{>0}^{+} \rightarrow \mathbb{R}$ ,

(4.42) \begin{equation} \big| \mathbb{E} h(W) - \mathbb{E} h (Z_n ) \big| \le \frac{\theta}{1 - {\mathrm{e}}^{-\theta}} \frac{2}{n}\left(\frac{n - {\mathrm{e}}^{\frac{\lambda}{n}}(n-\lambda)}{\lambda({\mathrm{e}}^{\frac{\lambda}{n}}-1)}\right)\|h'\| .\end{equation}

Proof. Using (3.9) in Proposition 3 gives

\begin{align*} \big| \mathbb{E} h(W) - \mathbb{E} h (Z_n ) \big| = \left| \mathbb{E} h(W) - \mathbb{E} h \left(\frac{W_{G,n}}{n} \right) \right| &\le \frac{\theta}{1 - {\mathrm{e}}^{-\theta} } \,\mathbb{E} \left|E - \frac{G}{n}\right| \| h'\|.\end{align*}

With the coupling $\tilde{G} = \lceil n{E}\rceil \sim \mathrm{Geometric}(1-{\mathrm{e}}^{-\lambda/n})$ , $\tilde{G}$ is stochastically greater than or equal to G and $\mathbb{E} |n{E} - G| \le \mathbb{E} |n{E}-\tilde{G}| + \mathbb{E} (\tilde{G} - G)$ . Moreover,

\begin{align*} \mathbb{E} |nE - \tilde{G}| = \sum_{k=1}^{\infty} \int^{\frac{k}{n}}_{\frac{(k-1)}{n}} \lambda {\mathrm{e}}^{-\lambda x} \left(k - nx \right) {\mathrm{d}} x = \frac{n - {\mathrm{e}}^{\frac{\lambda}{n}}(n-\lambda)}{\lambda({\mathrm{e}}^{\frac{\lambda}{n}}-1)}\end{align*}

and $\mathbb{E}(\tilde{G} - G) = ({1-{\mathrm{e}}^{-\frac{\lambda}{n}}})^{-1} - {n}/{\lambda}.$ Hence the bound (4.42) follows.
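As a sanity check on the coupling calculation, the following sketch (NumPy assumed; the seed and sample size are arbitrary) compares a Monte Carlo estimate of $\mathbb{E}|nE - \tilde{G}|$ under the coupling $\tilde{G} = \lceil nE \rceil$ with the closed-form expression above.

```python
import numpy as np

rng = np.random.default_rng(0)
lam, n = 1.0, 10

E_sample = rng.exponential(1 / lam, size=10**6)   # E ~ Exp(lam)
G_tilde = np.ceil(n * E_sample)                   # coupled Geometric(1 - exp(-lam/n))

empirical = np.abs(n * E_sample - G_tilde).mean()
exact = (n - np.exp(lam / n) * (n - lam)) / (lam * (np.exp(lam / n) - 1))
print(empirical, exact)   # the two values should agree to Monte Carlo accuracy
```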

Remark 6. To compare (4.31) and (4.42), first note that as no fixed bound on $\| h\| $ is assumed, through rescaling h we can make $\|\tilde{h}\|$ as small as desired. Therefore, for a comparison we focus on the terms involving $\|h'\|$ . For large n, (4.31) then outperforms the bound (4.42), since ${\lim}_{n \rightarrow \infty} {\mathrm{e}}^{\frac{\lambda}{n}} \bigl(1+\frac{{\mathrm{e}}^{ \frac{\theta\lambda}{n {\mathrm{e}}}}}{n}B_1(\theta, \lambda)\bigr) = 1$ and ${\lim}_{n \rightarrow \infty}\frac{\theta}{1 - {\mathrm{e}}^{-\theta}} 2\Bigl(\frac{n - {\mathrm{e}}^{\frac{\lambda}{n}}(n-\lambda)}{\lambda({\mathrm{e}}^{\frac{\lambda}{n}}-1)}\Bigr) = \frac{\theta}{(1-{\mathrm{e}}^{-\theta})},$ while $\frac{\theta}{1-{\mathrm{e}}^{-\theta}} > 1$ for all $\theta > 0$ . In particular, for any $n \ge n_0$ the right-hand side of (4.42) is larger than the coefficient of $\|h'\|$ in the bound (4.31), where $n_0 =n_0(\theta, \lambda)$ is the smallest n satisfying $ {\mathrm{e}}^{\frac{\lambda}{n}} \le \frac{2n\theta}{\lambda(1 - {\mathrm{e}}^{-\theta})({\mathrm{e}}^{\frac{\lambda}{n}}-1)} \Bigl(\frac{n - {\mathrm{e}}^{\frac{\lambda}{n}}(n-\lambda)}{n+{\mathrm{e}}^{ \frac{\theta\lambda}{n {\mathrm{e}}}}B_1(\theta, \lambda)}\Bigr).$ Table 1 shows such values of $n_0$ .

Table 1. Values of n above which the coefficient of $\|h'\|$ is smaller in (4.31) than in (4.42)
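Values such as those in Table 1 can be computed directly; here is a minimal search sketch (Python; the function name n_0 and the search cap are ours) that returns the smallest n at which the coefficient of $\|h'\|$ in (4.31) falls below the right-hand side of (4.42), with B1 as in Theorem 2.

```python
import numpy as np

def B1(theta, lam):
    # B1 as defined in Theorem 2
    E = np.e
    return (lam * max(1.0, theta)
            + np.exp(lam * theta + theta / E) / (2 * theta * lam)
            + theta * lam / (2 * E) * (1 + E - np.exp(-theta)) / (1 - np.exp(-theta)))

def n_0(theta, lam, n_max=10**6):
    """Smallest n at which the ||h'||-coefficient of (4.31) is below the bound (4.42)."""
    for n in range(int(lam) + 2, n_max):   # the setting requires 0 < lam < n
        coeff_431 = np.exp(lam / n) / n * (1 + np.exp(theta * lam / (n * np.e)) / n * B1(theta, lam))
        coeff_442 = (theta / (1 - np.exp(-theta))) * (2 / n) \
            * (n - np.exp(lam / n) * (n - lam)) / (lam * (np.exp(lam / n) - 1))
        if coeff_431 <= coeff_442:
            return n
    return None

print(n_0(theta=1.0, lam=0.5))
```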

5. Application to the exponential–geometric distribution

The exponential–geometric (EG) distribution $\mathrm{EG}(\lambda, p)$ introduced in [Reference Adamidis and Loukas3] is the distribution of the minimum of N i.i.d. random variables $\textbf{E}= (E_1, E_2,\ldots)$ with $E_i \sim \mathrm{Exp}(\lambda)$ for $i \in \mathbb{N}$ , where N is a Geometric(p) random variable and independent of all the $E_i$ . We set $q=1-p$ . An EG random variable $W = W_{-1}(N, {\textbf{E}})$ is thus a CCR random variable. As $G_N(u) = \frac{up}{(1-qu)}$ , $G'_N(u) = \frac{p}{(1-qu)^2}$ , and $G''_N(u) = \frac{2pq}{(1-qu)^3}$ , the PDF of $W \sim \text{EG}(\lambda, p)$ is

(5.1) \begin{equation} f(w; \lambda, p) = \frac{\lambda p {\mathrm{e}}^{-\lambda w}} {(1-q{\mathrm{e}}^{-\lambda w})^2} \quad\mbox{for } w, \lambda >0, \; 0 \le p \le 1,\end{equation}

with score function

(5.2) \begin{equation} \rho(w) = - \lambda \left(\frac{1+q{\mathrm{e}}^{-\lambda w}}{1-q{\mathrm{e}}^{-\lambda w}}\right).\end{equation}

This gives the score Stein operator

(5.3) \begin{equation} \mathcal{T} g(w) = g'(w) - \lambda \left(\frac{1+q{\mathrm{e}}^{-\lambda w}}{1-q{\mathrm{e}}^{-\lambda w}}\right)g(w)\end{equation}

and the Stein equation

(5.4) \begin{equation} g'(w) - \lambda \left(\frac{1+q{\mathrm{e}}^{-\lambda w}}{1-q{\mathrm{e}}^{-\lambda w}}\right)g(w) = h(w) - \mathbb{E} h(W).\end{equation}
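Before solving (5.4), the density (5.1) itself can be sanity-checked by simulation; the sketch below (NumPy assumed; parameter values arbitrary) compares the empirical survival function of the minimum of a Geometric(p) number of $\mathrm{Exp}(\lambda)$ lifetimes with the closed form $1-F(w) = p{\mathrm{e}}^{-\lambda w}/(1-q{\mathrm{e}}^{-\lambda w})$ , which also appears in the proof of Lemma 3 below.

```python
import numpy as np

rng = np.random.default_rng(0)
lam, p = 2.0, 0.3
q = 1 - p

# W = min(E_1, ..., E_N) with N ~ Geometric(p) supported on {1, 2, ...}
N = rng.geometric(p, size=10**5)
W = np.array([rng.exponential(1 / lam, size=m).min() for m in N])

for w in (0.1, 0.5, 1.0):
    empirical = (W > w).mean()
    exact = p * np.exp(-lam * w) / (1 - q * np.exp(-lam * w))
    print(w, empirical, exact)
```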

The next lemma bounds the solution (2.5) of this EG Stein equation.

Lemma 3. For any bounded test function $h\,:\, \mathbb{R}^+_{>0} \rightarrow \mathbb{R}$ , let $\tilde{h}(w) = h(w)-\mathbb{E} h(W)$ for $W \sim \mathrm{EG}(\lambda,p)$ . Then the solution $g=g_h$ of the EG Stein equation (5.4) satisfies

(5.5) \begin{align} |g(w)| \le \frac{\|\tilde{h}\|}{\lambda}, \end{align}
(5.6) \begin{align} |g'(w)| \le \frac{2\|\tilde{h}\|}{p}. \end{align}

Proof. For the solution (2.5) of the EG Stein equation (5.4),

\begin{eqnarray*} |g(w)| \le \|\tilde{h}\| \left| \frac{ -1 }{f(w)} \int_w^{\infty} f(t)\,\mathrm{d}t \right| = \|\tilde{h}\| \left| \frac{ 1-F(w)}{f(w)} \right| = \|\tilde{h}\| \left| \frac{(1-q{\mathrm{e}}^{-\lambda w})}{\lambda} \right| \le \frac{\|\tilde{h}\|}{\lambda}.\end{eqnarray*}

The last inequality follows since $0 \le {\mathrm{e}}^{-\lambda w} \le 1$ and $0 \le q \le 1$ imply $1 -q \le 1 - q{\mathrm{e}}^{-\lambda w} \le 1$ and $1 \le 1 + q{\mathrm{e}}^{-\lambda w} \le 1+q$ . For a bound on $| g'(w)| $ , we use the Stein equation (5.4) and the triangle inequality to obtain

\begin{eqnarray*} |g'(w)| \le \|\tilde{h}\| + \left| \lambda \left(\frac{1+q{\mathrm{e}}^{-\lambda w}}{1-q{\mathrm{e}}^{-\lambda w}}\right)\right| \frac{\|\tilde{h}\|}{\lambda} \le \|\tilde{h}\| \left(1 + \frac{1+q}{1-q}\right).\end{eqnarray*}

Simplifying the last inequality gives (5.6).

5.1. The minimum of a geometric number of i.i.d. random variables

Let $N\in \mathbb{N}$ be a random variable which is independent of ${\textbf{X}} = (X_1, X_2, \ldots)$ , a sequence of i.i.d. random variables, and let $W=W_{-1}(N, {\textbf{X}})$ have PDF of the form (2.12) for $\alpha = -1$ . In this subsection we approximate its distribution by an EG distribution. First we apply Proposition 1.

Corollary 5. Assume that the $X_i$ have CDF $F_X$ , differentiable PDF $f_X$ , and score function $\rho_X$ , and let $N \sim \mathrm{Geometric}(p)$ . Then

\begin{align*} d_{\rm TV}\bigl(\text{EG}(\lambda,p),\mathcal{L}(W)\bigr)\le \frac{2}{\lambda}\,\biggl|\, & \mathbb{E} (\! -\lambda- \rho_X(W)) \\ & - \mathbb{E} \left( \frac{2(1-p)\lambda {\mathrm{e}}^{-\lambda W}}{1-(1-p){\mathrm{e}}^{-\lambda W}} - \frac{2(1-p)f_X(W)}{1-(1-p)(1-F_X(W))} \right) \biggr|.\end{align*}

Proof. Substituting $\frac{G_N''(1-F(w))}{G_N'(1-F(w))} = \frac{2(1-p)f(w)}{1-(1-p)(1-F(w))}$ and the score function of the exponential distribution into (3.1), taking h an indicator function so that $\| \tilde{h} \| \le 2 \|h\| \le 2$ , and using (5.5) gives the bound.

Next, we instead use Proposition 3; the corollary follows immediately from (3.9).

Corollary 6. For $W = W_{-1}(N, {\boldsymbol{X}})$ with $N \sim \mathrm{Geometric}(p)$ ,

\begin{align*} d_{\rm BW}\bigl(\mathrm{EG}(\lambda,p), \mathcal{L}(W)\bigr) \le {\mathbb{E} | E - X|}/{p} . \end{align*}
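As a simple illustration (ours, not taken from the general theory above): if the $X_i$ are $\mathrm{Exp}(\mu)$ and the expectation is evaluated under the quantile coupling $E = -\lambda^{-1}\log U$ , $X = -\mu^{-1}\log U$ for a common uniform $U$ on (0,1), then $\mathbb{E}|E - X| = |\lambda^{-1} - \mu^{-1}|\,\mathbb{E}(-\log U) = |\lambda^{-1} - \mu^{-1}|$ , so the corollary gives $d_{\rm BW}\bigl(\mathrm{EG}(\lambda,p), \mathcal{L}(W)\bigr) \le |\lambda^{-1} - \mu^{-1}|/p$ .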

5.2. Approximating the extended exponential–geometric distribution

Motivated by population heterogeneity, Adamidis et al. [Reference Adamidis, Dimitrakopoulou and Loukas2] developed the extended exponential–geometric (EEG) distribution by assuming that individual units in a population have increasing failure rates that depend on a random scale parameter A. Their lifetimes $X/A$ are modelled by a modified extreme value distribution; if $A = \alpha$ , then the PDF is $f(x|\alpha;\beta) = \alpha \beta {\mathrm{e}}^{\beta x + \alpha(1-{\mathrm{e}}^{\beta x})}$ , where $x, \alpha, \beta \in \mathbb{R}^+_{>0}$ ; it is assumed that A has an $\mathrm{Exp}(\gamma)$ distribution. Then the unconditional lifetime distribution X has PDF

(5.7) \begin{equation} f(x;\,\beta, \gamma) = \frac{\beta \gamma {\mathrm{e}}^{-\beta x}} {(1- (1-\gamma) {\mathrm{e}}^{-\beta x})^2}, \quad x>0,\end{equation}

with $\beta, \gamma \in \mathbb{R}^+_{>0}$ ; we use the notation $X \sim \mathrm{EEG}(\beta, \gamma)$ . Its score function is

(5.8) \begin{equation} \rho(x) = \frac{f'(x)}{f(x)} = - \beta \left(\frac{1+(1-\gamma){\mathrm{e}}^{-\beta x}}{1-(1-\gamma){\mathrm{e}}^{-\beta x}}\right).\end{equation}

This distribution is not in the CCR family. However, the EG distribution $\mathrm{EG}(\beta, \gamma)$ is a special case when $\gamma \in (0,1)$ . To assess the total variation distance between the distributions, we use the general approach developed in Section 2.

Theorem 3. For $X \sim \mathrm{EEG}(\beta, \gamma)$ and $W \sim \mathrm{EG}(\lambda, p)$ , with $\beta, \gamma, \lambda \in \mathbb{R}^+_{>0}$ and $p \in (0,1)$ , and for a bounded test function h, we have

(5.9) \begin{align} \big|\mathbb{E} h(X)-\mathbb{E} h(W)\big| \le \begin{cases} \frac{\|\tilde{h}\|}{\lambda}\left[| \lambda - \beta| \left( \frac{2-p}{p} + \frac{2(1-\gamma)}{{\mathrm{e}} \gamma^2} \right) + \frac{2\beta} {\min(\gamma, p)^2} | p - \gamma| \right], & 0 < \gamma <1 ,\\ \frac{\|\tilde{h}\|}{\lambda p } \left[\left|\lambda - \beta\right|\left(1+(\gamma-1)(1-p)\right) + (\lambda+\beta)\left(\gamma -p \right)\right], & \gamma \ge 1. \end{cases}\end{align}

Proof. Using the score functions (5.2) and (5.8) in (2.6) yields

(5.10) \begin{align} & |\mathbb{E} h(X)-\mathbb{E} h(W)| = |\mathbb{E}(\rho_X(X)-\rho_Y(X))g(X)| \nonumber \\ &\quad= \left|\mathbb{E} \left(\!- \beta \left(\frac{1+(1-\gamma){\mathrm{e}}^{-\beta X}}{1-(1-\gamma){\mathrm{e}}^{-\beta X}}\right)+ \lambda \left(\frac{1+ (1-p){\mathrm{e}}^{-\lambda X}}{1- (1-p){\mathrm{e}}^{-\lambda X}}\right)\right)g(X)\right| \nonumber \\ &\quad\le \frac{\|\tilde{h}\|}{\lambda} \mathbb{E} \left| - \beta \left(\frac{1+(1-\gamma){\mathrm{e}}^{-\beta X}}{1-(1-\gamma){\mathrm{e}}^{-\beta X}}\right)+ \lambda \left(\frac{1+ (1-p){\mathrm{e}}^{-\lambda X}}{1- (1-p){\mathrm{e}}^{-\lambda X}}\right)\right|. \end{align}

To bound the expectation in the above equation, we let

\begin{align*}R(x) = - \beta \left(\frac{1+(1-\gamma){\mathrm{e}}^{-\beta x}}{1-(1-\gamma){\mathrm{e}}^{-\beta x}}\right)+ \lambda \left(\frac{1+ (1-p){\mathrm{e}}^{-\lambda x}}{1- (1-p){\mathrm{e}}^{-\lambda x}}\right).\end{align*}

Case 1: $0 < \gamma <1$ . In this case we decompose $|R(x)|$ as

(5.11) \begin{align} |R(x)| & \le \left|(\lambda - \beta) \frac{1+ (1-p){\mathrm{e}}^{-\lambda x}}{1- (1-p){\mathrm{e}}^{-\lambda x}}\right| \end{align}
(5.12) \begin{align} &\quad + \left|\beta \left( \frac{1+ (1-p){\mathrm{e}}^{-\lambda x}}{1- (1-p){\mathrm{e}}^{-\lambda x}} - \frac{1+ (1-\gamma){\mathrm{e}}^{-\lambda x}}{1- (1-\gamma){\mathrm{e}}^{-\lambda x}}\right) \right| \end{align}
(5.13) \begin{align} &\quad + \left|\beta \left( \frac{1+ (1-\gamma){\mathrm{e}}^{-\lambda x}}{1- (1-\gamma){\mathrm{e}}^{-\lambda x}} - \frac{1+ (1-\gamma){\mathrm{e}}^{-\beta x}}{1- (1-\gamma){\mathrm{e}}^{-\beta x}}\right) \right| \\[8pt]\nonumber\end{align}

and bound these terms separately. For (5.11) we use that $ \frac{1 + \alpha q}{1 - \alpha q} \le \frac{1+q}{1-q}$ when $\alpha \in [0,1]$ and $q \in (0,1).$ Hence

\begin{eqnarray*} \left\vert (\lambda - \beta) \frac{1+ (1-p){\mathrm{e}}^{-\lambda x}}{1- (1-p){\mathrm{e}}^{-\lambda x}}\right\vert &\le& | \lambda - \beta| \frac{2-p}{p}.\end{eqnarray*}

For (5.12), Taylor expansion about $1-\gamma$ of the function $f(q) = \frac{1+aq }{1-aq} $ with $a = {\mathrm{e}}^{-\lambda x} \in (0,1)$ and $q=1-p$ gives $f'(q) = \frac{a}{1 - aq} + \frac{a(1+aq)}{(1-aq)^2} = \frac{2a}{(1-aq)^2}>0$ . Moreover, for $0< a< 1$ we have $ \frac{2a}{(1-aq)^2} \le \frac{2}{(1-q)^2}$ , and hence for $\theta \in (0,1)$ we have $0< f'(\theta (1-p) + (1- \theta) (1-\gamma)) \le \frac{2}{(1- \max(1-\gamma, 1-p))^2} = \frac{2}{(\min (\gamma, p))^2}$ . Therefore,

\begin{eqnarray*} \left\vert \beta \left( \frac{1+ (1-p){\mathrm{e}}^{-\lambda x}}{1- (1-p){\mathrm{e}}^{-\lambda x}} - \frac{1+ (1-\gamma){\mathrm{e}}^{-\lambda x}}{1- (1-\gamma){\mathrm{e}}^{-\lambda x}}\right)\right\vert &\le& \beta \frac{2| p - \gamma|}{(\min (\gamma, p))^2}.\end{eqnarray*}

For (5.13), first-order Taylor expansion of the function $f(\beta) = \frac{1+(1-\gamma) {\mathrm{e}}^{-\beta x}}{1-(1-\gamma) {\mathrm{e}}^{-\beta x}} $ gives

\begin{eqnarray*}\left\vert \beta \left( \frac{1+ (1-\gamma){\mathrm{e}}^{-\lambda x}}{1- (1-\gamma){\mathrm{e}}^{-\lambda x}} - \frac{1+ (1-\gamma){\mathrm{e}}^{-\beta x}}{1- (1-\gamma){\mathrm{e}}^{-\beta x}}\right) \right\vert&\le& \beta | \lambda - \beta|\, | f'(\theta \lambda + (1-\theta) \beta)|\end{eqnarray*}

for some $\theta \in (0,1).$ Now the function $f'(\beta) = \frac{{2 x (1-\gamma)} {\mathrm{e}}^{-\beta x}}{(1 - (1-\gamma) {\mathrm{e}}^{-\beta x})^2}$ is positive; moreover, $x {\mathrm{e}}^{-\beta x} \le ({\mathrm{e}} \beta)^{-1}$ for $x \ge 0$ , so that $f'(\beta) \le \frac{1}{\beta {\mathrm{e}}} \frac{2(1-\gamma)}{ (1 - (1-\gamma) {\mathrm{e}}^{-\beta x})^2} \le \frac{2(1-\gamma)}{{\mathrm{e}} \beta \gamma^2}.$ Hence we can bound

\begin{eqnarray*}\left\vert \beta \left( \frac{1+ (1-\gamma){\mathrm{e}}^{-\lambda x}}{1- (1-\gamma){\mathrm{e}}^{-\lambda x}} - \frac{1+ (1-\gamma){\mathrm{e}}^{-\beta x}}{1- (1-\gamma){\mathrm{e}}^{-\beta x}}\right) \right\vert&\le& \frac{2(1-\gamma)}{{\mathrm{e}} \gamma^2} | \lambda - \beta|.\end{eqnarray*}

This gives as overall bound

\begin{eqnarray*} | R(x) | &\le& | \lambda - \beta| \left( \frac{2-p}{p} + \frac{2(1-\gamma)}{{\mathrm{e}} \gamma^2} \right) + \frac{2\beta} {\min(\gamma, p)^2} | p - \gamma| .\end{eqnarray*}

Substituting this bound into (5.10) gives the final bound for the case where $0 < \gamma <1$ .

Case 2: For $\gamma \ge 1$ , we have

\begin{align*} |R(x)| & \le \left|\frac{(\lambda - \beta)(1-(1-\gamma)(1-p){\mathrm{e}}^{-(\lambda+\beta)x})}{\left(1-(1-\gamma){\mathrm{e}}^{-\beta x}\right) \left(1- (1-p){\mathrm{e}}^{-\lambda x}\right)}\right| \\ &\quad + \left|\frac{(\lambda+\beta)\left((1-p){\mathrm{e}}^{-\lambda x} - (1-\gamma) {\mathrm{e}}^{-\beta x}\right)}{\left(1-(1-\gamma){\mathrm{e}}^{-\beta x}\right) \left(1- (1-p){\mathrm{e}}^{-\lambda x}\right)}\right| \\ &\le \frac{\left|\lambda - \beta\right|\left(1+(\gamma-1)(1-p)\right)}{p} + \frac{(\lambda+\beta)\left(\gamma -p \right)}{p}.\end{align*}

Substituting this result into (5.10) gives the bound for $\gamma \ge 1$ in (5.9).

Remark 7.

  1. (i) The bounds are not optimised for numerical value, but for $\beta = \lambda$ and $\gamma = p$ , (5.9) reduces to zero, as it should; see the numerical sketch after this remark.

  2. (ii) For $\gamma \in (0,1)$ , (5.9) gives a bound on the distance between an $\mathrm{EG}(\beta, \gamma)$ and an $\mathrm{EG}(\lambda, p)$ distribution.
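As a quick numerical companion to this remark, the following minimal sketch (Python; the function name is ours) evaluates the right-hand side of (5.9) and shows it vanishing at $\beta = \lambda$ , $\gamma = p$ .

```python
import numpy as np

def bound_5_9(beta, gamma, lam, p, norm_h_tilde=1.0):
    """Right-hand side of (5.9); norm_h_tilde plays the role of ||h-tilde||."""
    if 0 < gamma < 1:
        value = (abs(lam - beta) * ((2 - p) / p + 2 * (1 - gamma) / (np.e * gamma**2))
                 + 2 * beta * abs(p - gamma) / min(gamma, p)**2) / lam
    else:  # gamma >= 1
        value = (abs(lam - beta) * (1 + (gamma - 1) * (1 - p))
                 + (lam + beta) * (gamma - p)) / (lam * p)
    return value * norm_h_tilde

print(bound_5_9(beta=1.0, gamma=0.5, lam=1.0, p=0.5))   # 0.0, as in Remark 7(i)
print(bound_5_9(beta=1.2, gamma=0.5, lam=1.0, p=0.4))   # small parameter perturbation
print(bound_5_9(beta=1.0, gamma=1.5, lam=1.0, p=0.4))   # the gamma >= 1 case
```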

6. The maximum waiting time of sequence patterns in Bernoulli trials

This application is motivated by the results in Section 4 of [Reference Peköz26], which gives bounds on the distribution of the number of trials preceding the first appearance of a pattern in dependent Bernoulli trials. Here we are interested in the distribution of the maximum of such random variables.

Consider M independent parallel systems $(X_1^{(i)},X_2^{(i)},\ldots)$ , for $i=1,2,\ldots ,M,$ of possibly dependent Bernoulli(a) trials, which are jointly independent of $M \in \mathbb{N} $ . For each sequence i, let $I_j^{(i)}$ be the indicator function that a fixed non-overlapping binary sequence pattern of length k occurs starting at $X_j^{(i)}$ ; the pattern may be specific to sequence i. Let $V_i =\min\{j\,:\, I_j^{(i)}=1\}$ denote the first occurrence of the pattern of interest in the ith system; assume that $\mathbb{P}(V_i = 1) = p$ for all $i \in \mathbb{N} $ . We denote the maximum waiting time for the occurrence of a corresponding sequence pattern in all M parallel systems by $W = W_1(M,{\textbf{V}})$ , where ${\textbf{V}} = (V_1, V_2, \ldots)$ .

Example 2. If the Bernoulli trials are independent and the pattern of interest is a zero followed by a run of k ones, then $\mathbb{P}(V_i = 1) = p = (1-a)a^k$ , as given in Corollary 2 of [Reference Peköz26]. Intuitively, the waiting time for the occurrence of this pattern in an individual sequence is approximately geometric with parameter p. For an approximation by a PE distribution we are particularly interested in instances where we can write the probability $\mathbb{P}(V_i = 1) = p$ as $p = \lambda/n$ ; here this leads to scaling the run length k as $k \sim {\log n}/{\log (1/a)}$ .
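The quantities in Example 2 are easy to check by simulation. The sketch below (NumPy assumed; the helper first_occurrence is ours) estimates $\mathbb{P}(V_i = 1)$ and the mean waiting time for the pattern 'a zero followed by k ones' in independent Bernoulli(a) trials, illustrating the geometric heuristic.

```python
import numpy as np

rng = np.random.default_rng(1)
a, k = 0.5, 3
p = (1 - a) * a**k   # P(V_i = 1) for the pattern: a zero followed by k ones

def first_occurrence(max_len=10**4):
    """First (1-based) index at which the pattern starts in a Bernoulli(a) sequence."""
    x = rng.random(max_len) < a
    for j in range(max_len - k):
        if (not x[j]) and x[j + 1:j + 1 + k].all():
            return j + 1
    raise RuntimeError("pattern not found; increase max_len")

V = np.array([first_occurrence() for _ in range(2000)])
print((V == 1).mean(), p)   # empirical vs. exact P(V = 1)
print(V.mean(), 1 / p)      # the waiting time is approximately Geometric(p)
```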

Corollary 7. In the above setting, let $U_n = W/n$ with $p =\frac{\lambda}{n}$ , $W_1(M', {\textbf{E}}) \sim \mathrm{PE}(\theta, \lambda)$ with $M'$ a zero-truncated Poisson( $\theta$ ) random variable, and $E_i \sim \mathrm{Exp}(\lambda)$ . Then we have

\begin{align*} & d_{\rm BW}\big(\mathcal{L}(U_n), \mathrm{PE}(\theta, \lambda)\big) \\ &\quad\le \frac{2\lambda(k-1)}{n}\,\mathbb{E} M + \frac{1}{\ln(p^{-1})} \,\mathbb{E} \sum_{k= \min(M,M')}^{\max(M,M')}\frac1k + \frac{{\mathrm{e}}^{\frac{\lambda}{n}}}{n} \left(1+\frac{{\mathrm{e}}^{ \frac{\theta\lambda}{n {\mathrm{e}}}}}{n}B_1(\theta, \lambda)\right) \nonumber \\ &\qquad+ \frac{1}{n} \left(\left(1-\frac{\lambda}{n}\right)^{-6} {\mathrm{e}}^{\theta {\mathrm{e}}^{ \frac{\lambda}{n}} + 2\frac{\theta\lambda}{n} + \frac{\theta\lambda}{n {\mathrm{e}}} + \frac{\lambda}{n}}B_2(\theta, \lambda) + \frac{1}{2n} {\mathrm{e}}^{\frac{\lambda}{n} + \frac{\theta\lambda}{n {\mathrm{e}}}}B_3(\theta, \lambda)\right),\end{align*}

where $B_1(\theta, \lambda)$ , $B_2(\theta, \lambda)$ , and $B_3(\theta, \lambda)$ are as given in Theorem 2.

Example 3. In the above example of independent Bernoulli trials, with the pattern of interest a zero followed by a run of $k \sim {\log n}/{\log (1/a)}$ ones, the bound in Corollary 7 is of order $(\log n)^{-1}.$

Proof. We couple $U_n = \frac1n \max\{V_1, \cdots, V_M\}$ and $Z_n = \frac1n \max\{T_1, \cdots, T_M\}$ , where $T_i \sim \mathrm{Geometric}(\frac{\lambda}{n})$ for $i = 1, \ldots, M$ , by using the same random variable M. Then

(6.1) \begin{equation} d_{\rm BW}\big(\mathcal{L}(U_n), \mathcal{L}(\mathrm{PE}(\theta, \lambda))\big) \le d_{\rm BW}\big(\mathcal{L}(U_n), \mathcal{L}(Z_n)\big) + d_{\rm BW}\big(\mathcal{L}(Z_n), \mathcal{L} (\mathrm{PE}(\theta, \lambda))\big).\end{equation}

Taking a union bound, we have

\begin{align*} d_{\rm TV}\big(\mathcal{L}(U_n), \mathcal{L}(Z_n)\big) & \le \sum_{m=1}^{\infty} \sum_{i=1}^{m}{d_{\rm TV}\left(\mathcal{L}\left(\frac{V{_i}}{n}\right), \mathcal{L}\left(\frac{T{_i}}{n}\right) \right)} \,\mathbb{P}(M=m)\le \frac{\lambda(k-1)}{n} \,\mathbb{E} M,\end{align*}

where in the last step we have used Corollary 1 from [Reference Peköz26]. With (2.2),

(6.2) \begin{equation} d_{\rm BW}\big(\mathcal{L}(U_n), \mathcal{L}(Z_n)\big)\le 2d_{\rm TV}\big(\mathcal{L}(U_n), \mathcal{L}(Z_n)\big) \le \frac{2\lambda(k-1)}{n}\,\mathbb{E} M.\end{equation}

Now

(6.3) \begin{align} d_{\rm BW}\big(\mathcal{L}(Z_n), \mathrm{PE}(\theta, \lambda)\big) &\le d_{\rm BW}\left(\mathcal{L}(Z_n), \mathcal{L}\left(\frac{W_1(M', {\textbf{T}})}{n}\right)\right) \end{align}
(6.4) \begin{align} &\quad + d_{\rm BW}\left(\mathcal{L}\left(\frac{W_1(M', {\textbf{T}})}{n}\right), \mathrm{PE}(\theta, \lambda)\right).\\[8pt]\nonumber \end{align}

The term (6.4) is bounded in (4.31). To bound (6.3), (3.12) gives

\begin{align*} d_{\rm BW}\left(\mathcal{L}(Z_n), \mathcal{L}\left(\frac{W_1(M', {\textbf{T}})}{n}\right)\right) &\le \sum_{m,m'=1}^\infty \mathbb{P} (M=m, M'=m') \, \mathbb{E} \left| W_1(m', {\textbf{T}}) - W_1(m, {\textbf{T}})\right|.\end{align*}

The expectation of the maximum of m Geometric(p) variables satisfies

\begin{align*} \frac{1}{\ln(p^{-1})} \sum_{k=1}^m\frac1k \le \mathbb{E} W_1(m, {\textbf{T}}) \le 1 + \frac{1}{\ln(p^{-1})} \sum_{k=1}^m\frac1k,\end{align*}

as given in [Reference Eisenberg11, p. 136]. Hence

\begin{align*} & d_{\rm BW}\left(\mathcal{L}(Z_n), \mathcal{L}\left(\frac{W_1(M', {\textbf{T}})}{n}\right)\right) \\ &\quad\le \sum_{m,m'=1}^\infty \mathbb{P} (M=m, M'=m') \frac{1}{\ln(p^{-1})} \sum_{k= \min(m,m')}^{\max(m,m')}\frac1k \\ &\quad= \frac{1}{\ln(p^{-1})} \left| \mathbb{E} \sum_{k= 1}^{M}\frac1k - \mathbb{E} \sum_{k= 1}^{M'}\frac1k \right|.\end{align*}

Combining this result with (4.31) and (6.2) in (6.1), we obtain the assertion.

Remark 8. For $M=M'$ , the bound in Corollary 7 reduces to

\begin{align*} d_{\rm BW}\big(\mathcal{L}(U_n), \mathrm{PE}(\theta, \lambda)\big) & \le \frac{2\theta\lambda(k-1)}{n(1-{\mathrm{e}}^{-\theta})} + \frac{{\mathrm{e}}^{\frac{\lambda}{n}}}{n} \left(1+\frac{{\mathrm{e}}^{ \frac{\theta\lambda}{n {\mathrm{e}}}}}{n}B_1(\theta, \lambda)\right) \\ &\quad + \frac{1}{n} \left(\left(1-\frac{\lambda}{n}\right)^{-6} {\mathrm{e}}^{\theta {\mathrm{e}}^{ \frac{\lambda}{n}} + 2\frac{\theta\lambda}{n} + \frac{\theta\lambda}{n {\mathrm{e}}} + \frac{\lambda}{n}}B_2(\theta, \lambda) + \frac{1}{2n} {\mathrm{e}}^{\frac{\lambda}{n} + \frac{\theta\lambda}{n {\mathrm{e}}}}B_3(\theta, \lambda)\right).\end{align*}

The assumption of i.i.d. sequences can be weakened to that of a Markov chain by applying Theorem 5.5 from [Reference Reinert, Schbath and Waterman30] with $M=M'$ . This theorem gives a Poisson process approximation for the number of ‘declumped’ counts of each pattern, which in turn yields that the waiting time for each pattern to occur is approximately exponentially distributed. The theorem also gives an explicit bound on the approximation, but this result requires considerable notation and hence we do not pursue it here.

Acknowledgements

We thank Christina Goldschmidt, David Steinsaltz, and Tadas Temcinas for helpful discussions. We would also like to thank the editor and the anonymous reviewers for their suggestions, which have led to overall improvements of the paper.

Funding information

AF is supported by the Commonwealth Scholarship Commission, United Kingdom, and in part by EPSRC grant EP/X002195/1. GR is supported in part by EPSRC grants EP/T018445/1, EP/R018472/1, EP/X002195/1, and EP/Y028872/1. For the purpose of Open Access, the authors have applied a CC BY public copyright licence to any Author Accepted Manuscript version arising from this submission.

Competing interests

There were no competing interests to declare during the preparation or publication process of this article.

Appendix. Further proofs

Proof of (4.13)

For $X \sim \text{GPE}(\theta, \lambda, \beta)$ with $\lambda,\theta >0$ and $\beta \ge 1$ , we have $(1-{\mathrm{e}}^{-\theta + \theta {\mathrm{e}}^{-\lambda x}})^{\beta-1} \le 1$ , and hence

\begin{align*} \mathbb{E}(X) &= \frac{\beta \theta \lambda }{(1 - {\mathrm{e}}^{-\theta})^{\beta}} \int_0^{\infty} x {\mathrm{e}}^{-\lambda x - \theta \beta {\mathrm{e}}^{-\lambda x}} \left(1-{\mathrm{e}}^{-\theta + \theta {\mathrm{e}}^{-\lambda x}} \right)^{\beta -1}\,\mathrm{d}x \\ &\le \frac{\beta \theta \lambda}{(1 - {\mathrm{e}}^{-\theta})^{\beta}} \int_0^{\infty} x {\mathrm{e}}^{ -\lambda x }\,\mathrm{d}x = \frac{\beta \theta }{\lambda (1 - {\mathrm{e}}^{-\theta})^{\beta}}.\end{align*}

Proof of the inequality (4.36)

First note that for $n \in \mathbb{N}$ and $|x| < n$ ,

(A.1) \begin{equation} {\mathrm{e}}^x\left(1-\frac{x^2}{n}\right) \le \left(1+\frac{x}{n}\right)^n,\qquad 0 \le {\mathrm{e}}^x - \left( 1 + \frac{x}{n}\right)^n \le \frac{x^2}{n} {\mathrm{e}}^x, \end{equation}

and $x {\mathrm{e}}^{-x} \le {\mathrm{e}}^{-1}$ for $x > 0$ . Hence, for $0 < \lambda < n$ and $z > 0$ ,

(A.2) \begin{align} 0 \le {\mathrm{e}}^{-\lambda z} - \left(1-\frac{\lambda z }{n z } \right)^{nz} \le \frac{ (\lambda z)^2}{nz} {\mathrm{e}}^{-\lambda z} &= \frac{ \lambda^2 z}{n} {\mathrm{e}}^{-\lambda z} \end{align}
(A.3) \begin{align} &\le \frac{ \lambda}{ne}. \\[8pt]\nonumber \end{align}

Also, we can write ${\mathrm{e}}^{-\theta \left(1-\frac{\lambda}{n}\right)^{nz}}= {\mathrm{e}}^{\theta \left[ {\mathrm{e}}^{-\lambda z}- \left(1-\frac{\lambda}{n}\right)^{nz}\right]} {\mathrm{e}}^{-\theta {\mathrm{e}}^{-\lambda z}}$ , and so

(A.4) \begin{equation} {\mathrm{e}}^{-\theta \left(1-\frac{\lambda}{n}\right)^{nz}} \le {\mathrm{e}}^{ \frac{\theta\lambda}{ n e}} {\mathrm{e}}^{-\theta {\mathrm{e}}^{-\lambda z}} .\end{equation}
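These elementary inequalities are straightforward to confirm numerically; the following quick grid check of (A.1) (our own verification script, not part of the proof) may be helpful.

```python
import numpy as np

n = 7
x = np.linspace(-n + 0.1, n - 0.1, 1001)   # (A.1) requires |x| < n

lower = np.exp(x) * (1 - x**2 / n)
power = (1 + x / n)**n
gap = np.exp(x) - power

assert (lower <= power + 1e-12).all()                  # first inequality in (A.1)
assert (gap >= -1e-12).all()                           # e^x dominates (1 + x/n)^n
assert (gap <= x**2 / n * np.exp(x) + 1e-12).all()     # second inequality in (A.1)
print("inequalities (A.1) hold on the grid")
```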

Now, to bound $l = \tau \bigl(d - c - \frac{2}{n} c'\bigr)$ , we have

\begin{align*}\tau(z) c(z) = {\mathrm{e}}^{\lambda z + \theta {\mathrm{e}}^{-\lambda z}} c(z) = \frac{\lambda \theta}{n}, \qquad\frac{2}{n} \tau(z)c'(z) = \frac{2\lambda^2 \theta}{n^2} \left(\theta {\mathrm{e}}^{-\lambda z} -1\right),\end{align*}

and

\begin{align*} {\mathrm{e}}^{\lambda z + \theta {\mathrm{e}}^{-\lambda z}} d(z) &= {\mathrm{e}}^{\lambda z + \theta {\mathrm{e}}^{-\lambda z} - \theta\left( 1 - \frac{\lambda}{n}\right)^{nz}} \left( {\mathrm{e}}^{\frac{\lambda \theta}{n} \left( 1 - \frac{\lambda}{n}\right)^{nz}} - 1 \right) \\ &= {\mathrm{e}}^{\lambda z + \theta {\mathrm{e}}^{-\lambda z} - \theta\left( 1 - \frac{\lambda}{n}\right)^{nz}} \left( {\mathrm{e}}^{\frac{\lambda \theta}{n} \left( 1 - \frac{\lambda}{n}\right)^{nz}} - 1 - \frac{\lambda \theta}{n}\left( 1 - \frac{\lambda}{n}\right)^{nz} \right)\\ &\quad + {\mathrm{e}}^{\lambda z + \theta {\mathrm{e}}^{-\lambda z} - \theta\left( 1 - \frac{\lambda}{n}\right)^{nz}}\frac{\lambda \theta}{n}\left( 1 - \frac{\lambda}{n}\right)^{nz}.\end{align*}

Thus,

(A.5) \begin{align}l(z) &= \frac{2\lambda^2 \theta}{n^2} \left(\theta {\mathrm{e}}^{-\lambda z} -1\right) \nonumber \\ &\quad+ {\mathrm{e}}^{\lambda z + \theta \left({\mathrm{e}}^{-\lambda z} - \left( 1 - \frac{\lambda}{n}\right)^{nz}\right)} \left( {\mathrm{e}}^{\frac{\lambda \theta}{n} \left( 1 - \frac{\lambda}{n}\right)^{nz}} - 1 - \frac{\lambda \theta}{n}\left( 1 - \frac{\lambda}{n}\right)^{nz} \right) \end{align}
(A.6) \begin{align} &\quad+ \frac{\lambda \theta}{n} {\mathrm{e}}^{\lambda z + \theta {\mathrm{e}}^{-\lambda z}}\left( {\mathrm{e}}^{ - \theta\left( 1 - \frac{\lambda}{n}\right)^{nz}}\left( 1 - \frac{\lambda}{n}\right)^{nz} - {\mathrm{e}}^{-\lambda z - \theta {\mathrm{e}}^{-\lambda z}}\right).\\[8pt]\nonumber\end{align}

Here

(A.7) \begin{equation} \frac{2\lambda^2 \theta}{n^2} \left(\theta {\mathrm{e}}^{-\lambda z} -1\right) \le \frac{2\lambda^2 \theta}{n^2}\max(1,\theta).\end{equation}

To bound (A.5), we use (A.3) and series expansion, recalling that $0 < \lambda < n$ , to get

(A.8) \begin{align} & {\mathrm{e}}^{\lambda z + \theta {\mathrm{e}}^{-\lambda z} - \left( 1 - \frac{\lambda}{n}\right)^{nz}\theta} \left( {\mathrm{e}}^{\frac{\lambda \theta}{n} \left( 1 - \frac{\lambda}{n}\right)^{nz}} - 1 - \frac{\lambda \theta}{n}\left( 1 - \frac{\lambda}{n}\right)^{nz} \right)\end{align}
(A.9) \begin{align} & \le {\mathrm{e}}^{\lambda z + {\frac{\theta \lambda}{n {\mathrm{e}}}}} \sum_{k=2}^\infty \frac{(\lambda \theta \left( 1 - \frac{\lambda}{n}\right)^{nz})^k} {n^k k!} \nonumber \\ &\le {\mathrm{e}}^{\lambda z + {\frac{\theta \lambda}{n {\mathrm{e}}}}} \sum_{k=2}^\infty \frac{(\lambda \theta)^k {\mathrm{e}}^{-\lambda z k} } {n^k k!} \le \frac{1}{n^2} {\mathrm{e}}^{-\lambda z + {\frac{\theta\lambda}{n {\mathrm{e}}}} + \lambda \theta} \le \frac{1}{n^2} {\mathrm{e}}^{{\frac{\theta}{{\mathrm{e}}}} + \lambda \theta}.\\[8pt]\nonumber\end{align}

To bound (A.6), we first bound

(A.10) \begin{align} &{{\mathrm{e}}^{-\lambda z - \theta {\mathrm{e}}^{-\lambda z}} - {\mathrm{e}}^{ - \theta\left( 1 - \frac{\lambda}{n}\right)^{nz}}\left( 1 - \frac{\lambda}{n}\right)^{nz}} \nonumber \\ &= \left( {\mathrm{e}}^{-\lambda z} - \left(1-\frac{\lambda}{n}\right)^{nz} \right) {\mathrm{e}}^{-\theta \left(1-\frac{\lambda}{n}\right)^{nz}} + {\mathrm{e}}^{-\lambda z -\theta {\mathrm{e}}^{-\lambda z}} \left(1 - {\mathrm{e}}^{\theta \{{\mathrm{e}}^{-\lambda z} - \left(1-\frac{\lambda}{n}\right)^{nz}\} }\right) \nonumber \\ &\le \left( {\mathrm{e}}^{- \lambda z} - \left(1-\frac{\lambda}{n}\right)^{nz} \right) {\mathrm{e}}^{-\theta \left(1-\frac{\lambda}{n}\right)^{nz}} \nonumber \\ &\quad + \theta {\mathrm{e}}^{- \lambda z -\theta {\mathrm{e}}^{-\lambda z}} \left({\mathrm{e}}^{-\lambda z} - \left(1-\frac{\lambda}{n}\right)^{nz}\right) {\mathrm{e}}^{\theta \left\{{\mathrm{e}}^{-\lambda z} - \left(1-\frac{\lambda}{n}\right)^{nz}\right\}} \nonumber \\ &\le \frac{\lambda^2 z}{n} {\mathrm{e}}^{ \frac{\theta\lambda}{ n {\mathrm{e}}}} {\mathrm{e}}^{-\lambda z -\theta {\mathrm{e}}^{-\lambda z}} + \frac{\theta\lambda}{n {\mathrm{e}}} {\mathrm{e}}^{ \frac{\theta\lambda}{n {\mathrm{e}}}} {\mathrm{e}}^{- \lambda z -\theta {\mathrm{e}}^{-\lambda z}}. \end{align}

Here we have used Property 4 from [Reference Salas31], that for all $x>0$ we have $ \left(1+\frac{x}{n}\right)^n - 1 \le x{\mathrm{e}}^{x} $ and $ {\mathrm{e}}^x - 1 \le x {\mathrm{e}}^x$ , along with (A.2) and (A.4). Thus

(A.11) \begin{equation} \frac{\lambda \theta}{n} {\mathrm{e}}^{\lambda z + \theta {\mathrm{e}}^{-\lambda z}}\left( {\mathrm{e}}^{ - \theta\left( 1 - \frac{\lambda}{n}\right)^{nz}}\left( 1 - \frac{\lambda}{n}\right)^{nz} - {\mathrm{e}}^{-\lambda z - \theta {\mathrm{e}}^{-\lambda z}}\right) \le \frac{\theta\lambda^3z}{n^2} {\mathrm{e}}^{ \frac{\theta\lambda}{n {\mathrm{e}}}} + \frac{\theta^2\lambda^2}{n^2 } {\mathrm{e}}^{ \frac{\theta\lambda}{n {\mathrm{e}}}-1}. \end{equation}

Combining (A.7), (A.9), and (A.11), we get

\begin{eqnarray*} |l(z)| \le \frac{2\lambda^2 \theta}{n^2}\max(1,\theta) + \frac{1}{n^2} {\mathrm{e}}^{\lambda \theta + {\frac{\theta}{{\mathrm{e}}}}} + \frac{\theta^2\lambda^2}{n^2 } {\mathrm{e}}^{ \frac{\theta\lambda}{n {\mathrm{e}}}-1} + \frac{\theta\lambda^3z}{n^2} {\mathrm{e}}^{ \frac{\theta\lambda}{n {\mathrm{e}}}}.\end{eqnarray*}

Replacing z by $Z_n$ gives (4.36).

Bounding (4.39)

For $z>0$ ,

\begin{eqnarray}&&{2\frac{\lambda^2 \theta}{n^2}{\mathrm{e}}^{-\lambda z - \theta {\mathrm{e}}^{-\lambda z}}(\theta {\mathrm{e}}^{-\lambda z}-1) } \nonumber\\ &&- \left({\mathrm{e}}^{-\theta(1-\frac{\lambda}{n})^{nz+1}}- {\mathrm{e}}^{-\theta(1-\frac{\lambda}{n})^{nz}}- {\mathrm{e}}^{-\theta(1-\frac{\lambda}{n})^{nz-1}}+ {\mathrm{e}}^{-\theta(1-\frac{\lambda}{n})^{nz-2}}\right) \nonumber \\ &=& 2\frac{\lambda^2 \theta}{n^2}{\mathrm{e}}^{-\lambda z - \theta {\mathrm{e}}^{-\lambda z}}(\theta {\mathrm{e}}^{-\lambda z}-1) - k\left( \frac{n}{n-\lambda} \right), \nonumber\end{eqnarray}

where, with $a = \theta \left(1-\frac{\lambda}{n} \right)^{nz} $ ,

\begin{align*} k(b) = {\mathrm{e}}^{-\frac{a}{b}} - {\mathrm{e}}^{-a} - {\mathrm{e}}^{-ab} + {\mathrm{e}}^{-ab^2} = 2 (b-1)^2 a{\mathrm{e}}^{-a} (a-1) + R_1\end{align*}

with $R_1$ being the remainder term from the Taylor expansion for k(b) about 1,

\begin{align*} R_1 & = \frac{1}{3} (b-1)^3 a \left(a^2{\mathrm{e}}^{-a\xi} + \frac{a^2 {\mathrm{e}}^{-\frac{a}{\xi}}}{\xi^6} -\frac{6a {\mathrm{e}}^{-\frac{a}{\xi}}}{\xi^5} + \frac{6 {\mathrm{e}}^{-\frac{a}{\xi}}}{\xi^4} + 12 a \xi {\mathrm{e}}^{-a\xi^2} -8a^2\xi^3 {\mathrm{e}}^{-a\xi^2} \right) \end{align*}

for some $1 < \xi \le b = \frac{n}{n-\lambda}$ . To bound $R_1$ , we use that $1<\xi \le \frac{n}{n-\lambda}$ so that ${\mathrm{e}}^{-a \xi} \le {\mathrm{e}}^{-a}$ and ${\mathrm{e}}^{-a \xi^2} \le {\mathrm{e}}^{-a};$ also ${\mathrm{e}}^{-\frac{a}{\xi}} \le {\mathrm{e}}^{-a \frac{n-\lambda}{n}}.$ Hence, with the crude bounds $a \le \theta$ and ${\mathrm{e}}^{-a} \le 1,$

\begin{align*} | R_1 | &\le \frac{1}{3} (b-1)^3 a \left(a^2 {\mathrm{e}}^{-a} + \left( \frac{n-\lambda}{n} \right)^6 a^2 {\mathrm{e}}^{-a \frac{n-\lambda}{n}} + \left( \frac{n-\lambda}{n} \right)^5 6a {\mathrm{e}}^{-a \frac{n-\lambda}{n}} \right.\\ & \left. \qquad\qquad\qquad\; + \left( \frac{n-\lambda}{n} \right)^4 6 {\mathrm{e}}^{-a \frac{n-\lambda}{n}} + 12 a\left(\frac{n}{n-\lambda}\right) {\mathrm{e}}^{-a} + 8a^2 \left(\frac{n}{n-\lambda}\right)^3 {\mathrm{e}}^{-a} \right) \\ & \le \frac{1}{3} (b-1)^3 a^2 {\mathrm{e}}^{-a} \left(\theta + \theta {\mathrm{e}}^{\theta \frac{\lambda}{n}} + 6 {\mathrm{e}}^{\theta \frac{\lambda}{n}} + 6 a^{-1} {\mathrm{e}}^{\theta \frac{\lambda}{n}} + 12 \left(\frac{n}{n-\lambda}\right) + 8\theta \left(\frac{n}{n-\lambda}\right)^3 \right). \end{align*}

Substituting the expressions for a and b gives

\begin{align*} |R_1| & \le \frac{1}{3} \left(\frac{\lambda}{n-\lambda}\right)^3 \theta^2 \left(1-\frac{\lambda}{n}\right)^{2nz} {\mathrm{e}}^{-\theta \left(1-\frac{\lambda}{n}\right)^{nz}} \bigg[\theta + \theta {\mathrm{e}}^{\frac{\lambda \theta}{n}} + 6 {\mathrm{e}}^{\frac{\lambda \theta}{n}} \\ & \quad + 6 {\mathrm{e}}^{\frac{\lambda \theta}{n}}\frac{1}{\theta}\left(1-\frac{\lambda}{n}\right)^{-nz}+ 12 \left(\frac{n}{n-\lambda}\right) + 8\theta \left(\frac{n}{n-\lambda}\right)^3\bigg] \\ & \le \frac{\theta^2\lambda^3}{3n^3} \left(1-\frac{\lambda}{n}\right)^{-3} {\mathrm{e}}^{-2\lambda z} {\mathrm{e}}^{-\theta \left(1-\frac{\lambda}{n}\right)^{nz}} \bigg[\theta + \theta {\mathrm{e}}^{\frac{\lambda \theta}{n}} + 6 {\mathrm{e}}^{\frac{\lambda \theta}{n}} + 12 \left(1-\frac{\lambda}{n}\right)^{-1} \\ & \quad + 8\theta \left(1-\frac{\lambda}{n}\right)^{-3}\bigg] + \frac{2\theta\lambda^3}{n^3} \left(1-\frac{\lambda}{n}\right)^{-3} {\mathrm{e}}^{\frac{\lambda \theta}{n}} {\mathrm{e}}^{-\lambda z} {\mathrm{e}}^{-\theta \left(1-\frac{\lambda}{n}\right)^{nz}},\end{align*}

where we have used ${\mathrm{e}}^{-\theta \left(1-\frac{\lambda}{n}\right)^{nz}} = {\mathrm{e}}^{\theta \left[ {\mathrm{e}}^{-\lambda z}- \left(1-\frac{\lambda}{n}\right)^{nz}\right]} {\mathrm{e}}^{-\theta {\mathrm{e}}^{-\lambda z}}.$ Hence, with (A.4),

(A.12) \begin{equation} \begin{split} |R_1| &\le \frac{\theta^2\lambda^3}{3n^3}{\mathrm{e}}^{{\frac{\theta \lambda}{n {\mathrm{e}}}}} \left(1-\frac{\lambda}{n}\right)^{-3} \bigg[\theta + \theta {\mathrm{e}}^{\frac{\lambda \theta}{n}} + 6 {\mathrm{e}}^{\frac{\lambda \theta}{n}} + 12 \left(1-\frac{\lambda}{n}\right)^{-1} \\ &\quad + 8\theta \left(1-\frac{\lambda}{n}\right)^{-3}\bigg] {\mathrm{e}}^{-2\lambda z-\theta {\mathrm{e}}^{-\lambda z}} + 2\frac{\theta\lambda^3}{ n^3} \left(1-\frac{\lambda}{n}\right)^{-3}{\mathrm{e}}^{ {\frac{\theta\lambda}{n {\mathrm{e}}} ({\mathrm{e}}+1)}} {\mathrm{e}}^{-\lambda z-\theta {\mathrm{e}}^{-\lambda z} }. \end{split}\end{equation}

Thus,

(A.13) \begin{align} \bigg| & 2\frac{\lambda^2 \theta}{n^2}{\mathrm{e}}^{-\lambda z - \theta {\mathrm{e}}^{-\lambda z}}(\theta {\mathrm{e}}^{-\lambda z}-1) \nonumber \\ & - \left({\mathrm{e}}^{-\theta(1-\frac{\lambda}{n})^{nz+1}}- {\mathrm{e}}^{-\theta(1-\frac{\lambda}{n})^{nz}} - {\mathrm{e}}^{-\theta(1-\frac{\lambda}{n})^{nz-1}}+ {\mathrm{e}}^{-\theta(1-\frac{\lambda}{n})^{nz-2}}\right)\bigg|\nonumber \\ & \quad \le \bigg| 2\frac{\lambda^2 \theta}{n^2}{\mathrm{e}}^{-\lambda z - \theta {\mathrm{e}}^{-\lambda z}}(\theta {\mathrm{e}}^{-\lambda z}-1) \bigg[1-\left(1-\frac{\lambda}{n}\right)^{-2} \bigg] \bigg| + |R_1| + |R_2| ,\end{align}

with

\begin{align*} R_2= 2\left(\frac{\lambda}{n-\lambda}\right)^2 \bigg\{ & \theta^2 \bigg[\left(1-\frac{\lambda}{n}\right)^{2nz} {\mathrm{e}}^{-\theta \left(1-\frac{\lambda}{n}\right)^{nz}} -{\mathrm{e}}^{-2\lambda z-\theta {\mathrm{e}}^{-\lambda z}} \bigg] \\ & - \theta \bigg[\left(1-\frac{\lambda}{n}\right)^{nz} {\mathrm{e}}^{-\theta \left(1-\frac{\lambda}{n}\right)^{nz}} - {\mathrm{e}}^{-\lambda z-\theta {\mathrm{e}}^{-\lambda z}} \bigg] \bigg\}.\end{align*}

Using (A.10), we obtain the bound

(A.14) \begin{equation} \begin{split} |R_2| & \le 4\frac{ \theta^2 \lambda^4}{n^3} \left( 1- \frac{\lambda}{n}\right)^{-2} z {\mathrm{e}}^{{\frac{\theta \lambda}{n {\mathrm{e}}}}} {\mathrm{e}}^{-2\lambda z -\theta {\mathrm{e}}^{-\lambda z}} + 2\frac{\theta^3 \lambda^3}{n^3 {{\mathrm{e}}}} \left( 1- \frac{\lambda}{n}\right)^{-2}{\mathrm{e}}^{{\frac{\theta\lambda}{n {\mathrm{e}}}}} {\mathrm{e}}^{-2 \lambda z -\theta {\mathrm{e}}^{-\lambda z}} \\ & \quad + 2\frac{ \theta \lambda^4 }{n^3}\left( 1- \frac{\lambda}{n}\right)^{-2} z {\mathrm{e}}^{{\frac{\theta \lambda}{n {\mathrm{e}}}}} {\mathrm{e}}^{-\lambda z -\theta {\mathrm{e}}^{-\lambda z}} + 2\frac{\theta^2 \lambda^3}{n^3 {{\mathrm{e}}}} \left( 1- \frac{\lambda}{n}\right)^{-2} {\mathrm{e}}^{{\frac{\theta\lambda}{n {\mathrm{e}}}}} {\mathrm{e}}^{- \lambda z -\theta {\mathrm{e}}^{-\lambda z}}. \end{split}\end{equation}

Combining (A.12), (A.13), and (A.14) and simplifying then gives the bound

\begin{align*}\bigg| & 2\frac{\lambda^2 \theta}{n^2}{\mathrm{e}}^{-\lambda z - \theta {\mathrm{e}}^{-\lambda z}}(\theta {\mathrm{e}}^{-\lambda z}-1) \nonumber \\ &- \left({\mathrm{e}}^{-\theta(1-\frac{\lambda}{n})^{nz+1}}- {\mathrm{e}}^{-\theta(1-\frac{\lambda}{n})^{nz}} - {\mathrm{e}}^{-\theta(1-\frac{\lambda}{n})^{nz-1}}+ {\mathrm{e}}^{-\theta(1-\frac{\lambda}{n})^{nz-2}}\right)\bigg|\nonumber \\ & \le \frac{2\lambda}{n} \max(1,\theta) \left\{ \left(1-\frac{\lambda}{n}\right)^{-2} -1\right\} c(z) + \frac{\theta\lambda^2}{n^2} \left(1-\frac{\lambda}{n}\right)^{-2} {\mathrm{e}}^{{\frac{\theta\lambda}{n {\mathrm{e}}}}} \bigg\{ \frac{1}{3} \left(1-\frac{\lambda}{n}\right)^{-1} \nonumber \\ & \quad \times\left[\theta + \theta {\mathrm{e}}^{\frac{\theta\lambda}{n}} + 6 {\mathrm{e}}^{\frac{\theta\lambda}{n}} + 12 \left(1-\frac{\lambda}{n}\right)^{-1} + 8\theta \left(1-\frac{\lambda}{n}\right)^{-3}\right] + 2\left({\frac{\theta}{{\mathrm{e}}}} + 2\lambda z \right)\bigg\} {\mathrm{e}}^{-\lambda z}c(z) \nonumber \\ & \quad + \frac{2\lambda^2}{n^2}\left( 1- \frac{\lambda}{n}\right)^{-2}{\mathrm{e}}^{{\frac{\theta\lambda}{n {\mathrm{e}}}}}\left[ \left(1-\frac{\lambda}{n}\right)^{-1}{\mathrm{e}}^{\frac{\theta \lambda}{n}} + \lambda z + {\frac{\theta}{{\mathrm{e}}}} \right] c(z).\end{align*}

Multiplying the above inequality by $n\left|g\left(z-\frac1n\right)\right|$ gives (4.39).

References

Aarset, M. V. (1987). How to identify a bathtub hazard rate. IEEE Transactions on Reliability R-36, 106–108.
Adamidis, K., Dimitrakopoulou, T. and Loukas, S. (2005). On an extension of the exponential–geometric distribution. Statistics & Probability Letters 73, 259–269.
Adamidis, K. and Loukas, S. (1998). A lifetime distribution with decreasing failure rate. Statistics & Probability Letters 39, 35–42.
Arratia, R., Goldstein, L. and Gordon, L. (1989). Two moments suffice for Poisson approximations: the Chen–Stein method. The Annals of Probability 17, 9–25.
Barbour, A. D., Holst, L. and Janson, S. (1992). Poisson Approximation. The Clarendon Press, Oxford, UK.
Basu, A. P. and Klein, J. P. (1982). Some recent results in competing risks theory. Lecture Notes–Monograph Series 2, 216–229.
Betsch, S. and Ebner, B. (2019). A new characterization of the Gamma distribution and associated goodness-of-fit tests. Metrika 82, 779–806.
Cancho, V. G., Louzada-Neto, F. and Barriga, G. D. C. (2011). The Poisson-exponential lifetime distribution. Computational Statistics & Data Analysis 55, 677–686.
Chen, L. H. Y., Goldstein, L. and Shao, Q.-M. (2011). Normal Approximation by Stein's Method. Springer, Heidelberg.
Chen, L. H. Y. (1975). Poisson approximation for dependent trials. The Annals of Probability 3, 534–545.
Eisenberg, B. (2008). On the expectation of the maximum of IID geometric random variables. Statistics & Probability Letters 78, 135–143.
Fatima, A. and Roohi, A. (2015). The generalized Poisson-exponential distribution. Journal of ISOSS 1, 103–118.
Feidt, A. (2013). Stein's method for multivariate extremes. Preprint, arXiv:1310.2564.
Germain, G. and Swan, Y. (2025). Bounding the $L^1$-distance between one-dimensional continuous and discrete distributions via Stein's method. Journal of Theoretical Probability 38, 9.
Goldstein, L. and Reinert, G. (2013). Stein's method for the Beta distribution and the Pólya–Eggenberger urn. Journal of Applied Probability 50, 1187–1205.
Jayakumar, K., Babu, M. G. and Bakouch, H. S. (2021). General classes of complementary distributions via random maxima and their discrete version. Japanese Journal of Statistics and Data Science 4, 797–820.
Kus, C. (2007). A new lifetime distribution. Computational Statistics & Data Analysis 51, 4497–4509.
Ley, C., Reinert, G. and Swan, Y. (2017). Stein's method for comparison of univariate distributions. Probability Surveys 14, 1–52.
Ley, C. and Swan, Y. (2013). Stein's density approach and information inequalities. Electronic Communications in Probability 18, 1–14.
Louzada, F., Bereta, E. and Franco, M. (2012). On the distribution of the minimum or maximum of a random number of i.i.d. lifetime random variables. Applied Mathematics 3, 350–353.
Louzada, F., Ramos, P. L. and Perdoná, G. S. C. (2016). Different estimation procedures for the parameters of the extended exponential geometric distribution for medical data. Computational and Mathematical Methods in Medicine 2016, 8727951.
Lu, W. and Shi, D. (2012). A new compounding life distribution: the Weibull–Poisson distribution. Journal of Applied Statistics 39, 21–38.
Marshall, A. W. and Olkin, I. (1997). A new method for adding a parameter to a family of distributions with application to the exponential and Weibull families. Biometrika 84, 641–652.
Mijoule, G., Raič, M., Reinert, G. and Swan, Y. (2023). Stein's density method for multivariate continuous distributions. Electronic Journal of Probability 28, 1–40.
Nourdin, I. and Peccati, G. (2012). Normal Approximations with Malliavin Calculus: From Stein's Method to Universality. Cambridge University Press, Cambridge, UK.
Peköz, E. A. (1996). Stein's method for geometric approximation. Journal of Applied Probability 33, 707–713.
Peköz, E. A. and Röllin, A. (2011). New rates for exponential approximation and the theorems of Rényi and Yaglom. The Annals of Probability 39, 587–608.
Peköz, E. A., Röllin, A. and Ross, N. (2013). Total variation error bounds for geometric approximation. Bernoulli 19, 610–632.
Ramos, P. L., Dey, D. K., Louzada, F. and Lachos, V. H. (2020). An extended Poisson family of life distribution: a unified approach in competitive and complementary risks. Journal of Applied Statistics 47, 306–322.
Reinert, G., Schbath, S. and Waterman, M. S. (2000). Probabilistic and statistical properties of words: an overview. Journal of Computational Biology 7, 1–46.
Salas, A. H. (2012). The exponential function as a limit. Applied Mathematical Sciences 6, 4519–4526.
Stein, C. (1986). Approximate Computation of Expectations. Institute of Mathematical Statistics, Hayward, CA.
Stein, C., Diaconis, P., Holmes, S. and Reinert, G. (2004). Use of exchangeable pairs in the analysis of simulations. Lecture Notes–Monograph Series 46, 1–26.
Tahir, M. H. and Cordeiro, G. M. (2016). Compounding of distributions: a survey and new generalized classes. Journal of Statistical Distributions and Applications 3, 13.
Tojeiro, C., Louzada, F., Roman, M. and Borges, P. (2014). The complementary Weibull geometric distribution. Journal of Statistical Computation and Simulation 84, 1345–1362.