
The time until a random walk exceeds a square root and other barriers

Published online by Cambridge University Press:  03 September 2025

Sheldon M. Ross
Affiliation:
Department of Industrial and Systems Engineering, University of Southern California, Los Angeles, CA, USA
Tianchi Zhao*
Affiliation:
Department of Industrial and Systems Engineering, University of Southern California, Los Angeles, CA, USA
Corresponding author: Tianchi Zhao; Email: tianchiz@usc.edu

Abstract

This paper investigates the time N until a random walk first exceeds a specified barrier. Letting $X_i, i \geq 1,$ be a sequence of independent, identically distributed random variables with a log-concave density or probability mass function, we derive both lower and upper bounds on the probability $P(N \gt n),$ as well as bounds on the expected value $E[N].$ For barriers of the form $a + b \sqrt{k},$ where a is nonnegative, b is positive, and k is the number of steps, we provide additional bounds on $E[N].$

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press.

1. Introduction

Let $X_i, i \geq 1,$ be a sequence of independent and identically distributed random variables with a density or probability mass function f(x) for which $\log(f(x)) $ is a concave function. Let $S_n = \sum_{i=1}^n X_i, n \geq 1.$ For given constants $s_n, n \geq 1$, let

\begin{equation*} N = \min\{k \geq 1: S_k \geq s_k\} \end{equation*}

In Section 2 we present bounds on $P(N \gt n)$. In Section 3, we specialize to the case $s_k = a + b \sqrt{k},\, k \geq 1,$ where a is non-negative and b is positive, and present bounds on $E[N].$

The exploration of random walk behavior with threshold boundaries has a rich history in probability theory. Blackwell and Freedman [1] presented foundational insights into exit times for sums of independent random variables. They considered a simple coin-tossing model where $X_i, i \geq 1,$ take values ±1 with probability $\frac{1}{2}$ each. Let $\tau(N, c)$ be the least $n \geq N$ with $\left|S_n\right| \gt c n^{\frac{1}{2}}$, where c is a constant. Their work demonstrated that $E[\tau(1, 1)]$ is infinite but that, when $0 \lt c \lt 1$, $E[\tau(N, c)]$ is finite for all N.

Building on these concepts, Breiman [2] investigated the asymptotic distribution of first exit times for random walks with a square root boundary, examining both discrete sums of i.i.d. random variables and continuous processes such as Brownian motion. Breiman’s work established an approximation for the probability $P(N \gt n)$ as $n \to \infty$, and highlighted that while invariance principles apply for certain distributions, they may not extend to more general cases. This line of work provides a framework for understanding the impact of varying boundary functions on exit time distributions.

In more recent work, Hansen [4] examined random walks reflected at general boundaries, focusing on conditions under which the global maximum remains finite almost surely. Specifically, Hansen considered random walks with light-tailed, negatively biased increments, showing that the tail of the distribution of the maximum decays exponentially.

To the best of our knowledge, this is the first paper to derive bounds on $P(N \gt n)$, and in particular on $E[N]$, for log-concave random walks.

2. Bounds on P(N > n)

Proposition 2.1.

\begin{equation*}P(N \gt n) \geq P(S_1 \lt s_1) \prod_{k=2}^n P(S_k \lt s_k | S_{k-1} \lt s_{k-1}).\end{equation*}

To establish Proposition 2.1, we present some lemmas. The first is Efron’s theorem [3].

Lemma 2.2. Efron’s Theorem

If $X_1, \ldots, X_r$ are independent log-concave random variables, then $(X_1, \ldots, X_r) \,|\, \sum_{t=1}^r X_t = x$ is stochastically increasing in $x.$ That is, for any component-wise increasing function $g(x_1, \ldots, x_r)$

\begin{equation*}E\big [g(X_1, \ldots, X_r) | \sum_{t=1}^r X_t = x\big ] \;\;\mbox{is increasing in}\;\;x\end{equation*}

Our next lemma states that $S_k$ conditional on $S_1 \lt s_1, \ldots, S_{k} \lt s_k$ is likelihood ratio smaller than $S_k$ conditional on $S_k \lt s_k.$

Lemma 2.3.

\begin{equation*}S_k| (S_1 \lt s_1, \ldots, S_{k} \lt s_k) \; \leq_{lr} \; S_k| S_{k} \lt s_k.\end{equation*}

Proof. We need to show that the ratio of the conditional density of Sk given $S_1 \lt s_1, \ldots, S_{k} \lt s_k$ to the conditional density of Sk given $ S_{k} \lt s_k$ is decreasing. Now, for $t \leq s_k$

\begin{eqnarray*} f_{S_k| S_1 \lt s_1, \ldots, S_{k} \lt s_k}(t)\!\! &=& \!\!\frac{ f_{S_k}(t) P(S_1 \lt s_1, \ldots, S_{k} \lt s_k|S_k = t)}{P( S_1 \lt s_1, \ldots, S_{k} \lt s_k)} \\ f_{S_k| S_{k} \lt s_k}(t)\!\! &=& \!\!\frac{ f_{S_k}(t) P( S_{k} \lt s_k|S_k = t)}{P( S_{k} \lt s_k)} = \frac{ f_{S_k}(t) }{P( S_{k} \lt s_k)} \end{eqnarray*}

Hence we need to show that $ P(S_1 \lt s_1, \ldots, S_{k} \lt s_k|S_k = t) $ is a decreasing function of t. However, this follows from Efron’s theorem because $ g(x_1, \ldots, x_k) = 1- I\{x_1 \lt s_1, x_1 + x_2 \lt s_2, \ldots, x_1+ \ldots + x_k \lt s_k\} $ is an increasing function of $(x_1, \ldots, x_k)$.

Lemma 2.4.

\begin{equation*}P(S_k \lt s_k | S_1 \lt s_1, \ldots, S_{k-1} \lt s_{k-1} ) \geq P(S_k \lt s_k | S_{k-1} \lt s_{k-1})\end{equation*}

Proof. Because being likelihood ratio smaller implies being stochastically smaller, it follows from Lemma 2.3 that $S_{k-1} | (S_1 \lt s_1, \ldots, S_{k-1} \lt s_{k-1} ) $ is stochastically smaller than $S_{k-1} | S_{k-1} \lt s_{k-1}$. Now, if $X \leq_{st} Y$ and Z is independent of both X and Y, then $X +Z \leq_{st} Y + Z.$ The result thus follows because $S_k = S_{k-1} + X_k.$

Proof of Proposition 2.1.

Proposition 2.1 follows from Lemma 2.4 upon using that

\begin{eqnarray*} P(N \gt n) &=& P(S_1 \lt s_1, \ldots, S_n \lt s_n ) \\ &=& P(S_1 \lt s_1) \prod_{k=2}^n P(S_k \lt s_k | S_1 \lt s_1, \ldots, S_{k-1} \lt s_{k-1}) \end{eqnarray*}

Proposition 2.1 yields the following lower bound on $E[N]$.

Corollary 2.5.

\begin{eqnarray*} E[N] &=& \sum_{n=0}^{\infty} P(N \gt n) \\ & \geq & 1 + P(S_1 \lt s_1) \big (1 + \sum_{n=2}^{\infty} \prod_{k=2}^n P(S_k \lt s_k|S_{k-1} \lt s_{k-1})\big ) \end{eqnarray*}
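The lower bound of Corollary 2.5 is straightforward to evaluate numerically once the conditional probabilities $P(S_k \lt s_k | S_{k-1} \lt s_{k-1})$ are available (see Examples 2.8 and 2.9). The following is a minimal sketch, assuming a user-supplied function cond_prob(k) returning that conditional probability; the series is truncated at n_max terms, and since every term is nonnegative the truncated sum is itself a valid lower bound.

def lower_bound_EN(p1, cond_prob, n_max=10_000):
    """p1 = P(S_1 < s_1); cond_prob(k) = P(S_k < s_k | S_{k-1} < s_{k-1})."""
    total = 1.0 + p1            # the terms P(N > 0) = 1 and P(N > 1) = P(S_1 < s_1)
    prod = p1
    for n in range(2, n_max + 1):
        prod *= cond_prob(n)    # Proposition 2.1 lower bound on P(N > n)
        total += prod
        if prod < 1e-12:        # remaining terms are negligible
            break
    return total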

Remark 2.6. The log-concavity condition is essential for establishing Proposition 2.1. For a counterexample, suppose that $p_0 = \epsilon,\; p_2 = .2, \; p_5 = .8 - \epsilon,$ where $p_j = P(X_i = j)$ and ϵ is a small positive number. Then

\begin{equation*} P(S_3 \lt 6.5|S_1 \lt 1, S_2 \lt 5.5) = P(X_2 + X_3 \lt 6.5) \approx .04\end{equation*}

whereas

\begin{equation*} P(S_3 \lt 6.5| S_2 \lt 5.5) \approx .2\end{equation*}
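The two conditional probabilities in this counterexample can be checked by direct enumeration. A short sketch, taking ϵ = 10⁻⁶ for concreteness:

from itertools import product

eps = 1e-6
pmf = {0: eps, 2: 0.2, 5: 0.8 - eps}   # p_0 = eps, p_2 = .2, p_5 = .8 - eps

def prob(event):
    """Total probability of all (x1, x2, x3) triples satisfying `event`."""
    return sum(pmf[x1] * pmf[x2] * pmf[x3]
               for x1, x2, x3 in product(pmf, repeat=3) if event(x1, x2, x3))

num1 = prob(lambda x1, x2, x3: x1 < 1 and x1 + x2 < 5.5 and x1 + x2 + x3 < 6.5)
den1 = prob(lambda x1, x2, x3: x1 < 1 and x1 + x2 < 5.5)
num2 = prob(lambda x1, x2, x3: x1 + x2 < 5.5 and x1 + x2 + x3 < 6.5)
den2 = prob(lambda x1, x2, x3: x1 + x2 < 5.5)

print(num1 / den1)   # P(S_3 < 6.5 | S_1 < 1, S_2 < 5.5), approximately .04
print(num2 / den2)   # P(S_3 < 6.5 | S_2 < 5.5), approximately .2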

The conditional expectation inequality (see [6]) can be used to obtain an upper bound on $P(N \gt n).$

Lemma 2.7. The Conditional Expectation Inequality

For events $B_1, \ldots, B_n$

\begin{equation*}P( \cup_{i=1}^n B_i) \geq \sum_{i=1}^n \frac{P(B_i)}{1 + \sum_{j \neq i} P(B_j|B_i)} .\end{equation*}

With $B_i = \{S_i \geq s_i\}, $ the inequality yields that

\begin{equation*}P(N \leq n) \geq \sum_{i=1}^n \frac{P^2(B_i) } {P(B_i) + \sum_{j \neq i} P(B_i B_j) } \end{equation*}
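Since $P(N \leq n) = P(\cup_{i=1}^n B_i)$, this gives an upper bound on $P(N \gt n) = 1 - P(N \leq n)$. A minimal sketch of how the bound is assembled, assuming the marginal probabilities $P(S_i \geq s_i)$ and pairwise probabilities $P(S_i \geq s_i, S_j \geq s_j)$ have already been computed (for instance as in Examples 2.8 and 2.9) and are stored in an array p and an n-by-n array joint:

import numpy as np

def upper_bound_tail(p, joint):
    """Upper bound on P(N > n) via the conditional expectation inequality.

    p[i]        = P(S_{i+1} >= s_{i+1}),  i = 0, ..., n-1
    joint[i, j] = P(S_{i+1} >= s_{i+1}, S_{j+1} >= s_{j+1})   (numpy arrays)
    """
    n = len(p)
    lower_union = 0.0
    for i in range(n):
        denom = p[i] + sum(joint[i, j] for j in range(n) if j != i)
        if denom > 0:
            lower_union += p[i] ** 2 / denom     # term of the lower bound on P(N <= n)
    return 1.0 - lower_union                     # P(N > n) <= 1 - lower bound on the union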

Whereas for many log-concave distributions it is difficult to compute $P(S_k \lt s_k | S_{k-1} \lt s_{k-1})$, this is easily accomplished in important special cases such as the normal, exponential, binomial, and Poisson. Example 2.8 considers the normal case and Example 2.9 the exponential case.

Example 2.8. Suppose the $X_i$ are normal random variables with mean µ and variance 1. Let Z be a standard normal random variable whose distribution function is Φ; let U be uniform on $(0, 1);$ and let $c_{n-1} = \frac{s_{n-1} - (n-1)\mu}{\sqrt{n-1}}.$ Because $S_{n-1}$ is normal with mean $(n-1) \mu$ and variance $n-1,$ it follows that

\begin{eqnarray*} S_{n-1} | S_{n-1} \lt s_{n-1}\!\! & =_{st} & \!\!(n-1) \mu + \sqrt{n-1} \,Z | \;Z \lt c_{n-1} \\ \!\!& =_{st} & \!\!(n-1) \mu + \sqrt{n-1} \, \Phi^{-1}(U) \; | \; \Phi^{-1}(U) \lt c_{n-1} \\ \!\!& =_{st} & \!\!( n-1) \mu + \sqrt{n-1} \, \Phi^{-1}(U) \; | \; U \lt \Phi(c_{n-1} ) \\ \!\!& =_{st} & \!\!( n-1) \mu + \sqrt{n-1}\, \Phi^{-1}(U\Phi(c_{n-1} ) ) \end{eqnarray*}

Hence

\begin{eqnarray*} P(S_n \lt s_n | S_{n-1} \lt s_{n-1}) &=& \!E[ E[ I\{S_n \lt s_n \} | S_{n-1} ] | S_{n-1} \lt s_{n-1} ] \\ \!&=& \!E[ \Phi( s_n - \mu - S_{n-1} ) | S_{n-1} \lt s_{n-1} ] \\ \!&=& \!E[ \Phi( s_n - n \mu - \sqrt{n-1}\, \Phi^{-1}(U \Phi(c_{n-1})) ) ] \\ \!&=& \!\int_0^1 g(x) dx, \end{eqnarray*}

where $\;g(x) = \Phi \left( s_n - n \mu - \sqrt{n-1} \, \Phi^{-1}(x \Phi(c_{n-1})) \right).$ Because Φ and $\Phi^{-1}$ are both increasing functions, it follows that g(x) is a decreasing function of x. Since $\;\int_0^1 g(x) dx = \sum_{i=1}^r \int_{(i-1)/r}^{i/r} g(x) dx ,$ this shows that, for any $r,$

\begin{equation*}\frac{1}{r} \sum_{i=1}^r g( \frac{i-1}{r}) \geq \int_0^1 g(x) dx \geq \frac{1}{r} \sum_{i=1}^r g(\frac{i}{r})\end{equation*}
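These Riemann-sum bounds are easy to evaluate with scipy.stats.norm. The following is a minimal sketch, assuming $n \geq 2$; the helper name cond_prob_bounds, the barrier function s with s(k) $= s_k$, and the grid size r are illustrative choices.

import numpy as np
from scipy.stats import norm

def cond_prob_bounds(n, s, mu, r=1000):
    """Riemann-sum bounds on P(S_n < s_n | S_{n-1} < s_{n-1}) when X_i ~ N(mu, 1), n >= 2.

    s is a function with s(k) = s_k; returns (lower, upper)."""
    c = (s(n - 1) - (n - 1) * mu) / np.sqrt(n - 1)

    def g(x):
        return norm.cdf(s(n) - n * mu - np.sqrt(n - 1) * norm.ppf(x * norm.cdf(c)))

    grid = np.arange(1, r + 1) / r
    lower = g(grid).mean()            # g is decreasing, so right endpoints underestimate the integral
    upper = g(grid - 1.0 / r).mean()  # and left endpoints overestimate it
    return lower, upper

For example, cond_prob_bounds(5, lambda k: 2 + 2 * np.sqrt(k), mu=1.0) brackets $P(S_5 \lt s_5 \mid S_4 \lt s_4)$ for the boundary used in the numerical results below.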

To utilize the conditional expectation inequality we need to compute $P(S_i \gt s_i, S_j \gt s_j), i \neq j.$ To do so, suppose that i < j, and let $c_i = \frac{s_{i} - i\mu}{\sqrt{i}}$. Then, with $\bar{\Phi} = 1 - \Phi,$ arguing as before yields

\begin{eqnarray*} S_{i} | S_{i} \gt s_{i} & =_{st} & i \mu + \sqrt{i} \; \Phi^{-1}(U) \; | \; U \gt \Phi(c_i ) \\ & =_{st} & i \mu + \sqrt{i}\; \Phi^{-1} \left(\Phi(c_i ) + \bar{\Phi}(c_i) U \right) \end{eqnarray*}

Using that $S_j |S_i$ is normal with mean $S_i + (j-i) \mu$ and variance $j-i$ yields that

\begin{eqnarray*} P(S_j \gt s_j|S_i \gt s_i)\! &=& \!E\big[ E[ I\{S_j \gt s_j\} | S_{i} ] | S_{i} \gt s_{i} \big] \\ \!&=& \! E\big[ \bar{\Phi}\big( \frac{s_j - S_i -(j-i)\mu}{\sqrt{j-i}} \big ) | S_{i} \gt s_{i} \big] \\ \!&=& \! E \big [ \bar{\Phi} \left( \frac{s_j - j \mu - \sqrt{i}\; \Phi^{-1} \left(\Phi(c_i ) + \bar{\Phi}(c_i) U \right) }{\sqrt{j-i}} \right) \big ] \\ \!&=& \!\int_0^1 h_{i,j} (x) dx, \end{eqnarray*}

where $h_{i,j}(x) = \bar{\Phi} \left( \frac{s_j - j \mu - \sqrt{i}\; \Phi^{-1} \left(\Phi(c_i ) + \bar{\Phi}(c_i) x \right) }{\sqrt{j-i}} \right) .$ Because $h_{i,j}(x)$ is an increasing function of x, this gives

\begin{equation*} \frac{1}{r} \sum_{m=1}^r h_{i,j} \big(\frac{m}{r}\big ) \geq \int_0^1 h_{i,j}(x) dx \geq \frac{1}{r} \sum_{m=1}^r h_{i,j}\big( \frac{m-1}{r}\big). \end{equation*}
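A companion sketch bounds the joint tail probability $P(S_i \geq s_i, S_j \geq s_j)$ required by the conditional expectation inequality, by bounding $\int_0^1 h_{i,j}(x) dx = P(S_j \gt s_j | S_i \gt s_i)$ and multiplying by $P(S_i \geq s_i) = \bar{\Phi}(c_i)$. It reuses the imports above and assumes $i \lt j$; the clipping of the argument of $\Phi^{-1}$ is a purely numerical safeguard.

def joint_tail_bounds(i, j, s, mu, r=1000):
    """Bounds on P(S_i >= s_i, S_j >= s_j) for i < j when X_k ~ N(mu, 1)."""
    c_i = (s(i) - i * mu) / np.sqrt(i)

    def h(x):
        z = norm.ppf(np.clip(norm.cdf(c_i) + norm.sf(c_i) * x, 0.0, 1.0))
        return norm.sf((s(j) - j * mu - np.sqrt(i) * z) / np.sqrt(j - i))

    grid = np.arange(1, r + 1) / r
    p_i = norm.sf(c_i)                      # P(S_i >= s_i)
    lower = p_i * h(grid - 1.0 / r).mean()  # h is increasing, so left endpoints underestimate
    upper = p_i * h(grid).mean()            # and right endpoints overestimate the integral
    return lower, upper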

Example 2.9. Suppose the $X_i$ are exponential random variables with rate $\lambda.$ Let N(t) be the number of events by time t of the Poisson process that has $X_i$ as its $i$th interarrival time, $i \geq 1.$ Now, if $s_{n-1} \leq s_n$ then

\begin{eqnarray*} P(S_n \lt s_n, S_{n-1} \lt s_{n-1})\! &=& \!P(N(s_n) \geq n , N(s_{n-1}) \geq n -1) \\ \!&=& \!\sum_{i=n-1}^{\infty} P(N(s_n) \geq n | N(s_{n-1}) = i) P(N(s_{n-1}) = i) \\ \!&=& \!(1 - e^{- \lambda (s_n - s_{n-1})}) P(N(s_{n-1}) = n-1) + \sum_{i=n}^{\infty} P(N(s_{n-1}) = i) \\ \!&=& \!P(N(s_{n-1}) \geq n -1) - e^{- \lambda (s_n - s_{n-1})} P(N(s_{n-1}) = n-1) \end{eqnarray*}

giving that

\begin{equation*} P(S_n \lt s_n| S_{n-1} \lt s_{n-1}) = 1- \frac{e^{- \lambda (s_n - s_{n-1})} P(N(s_{n-1}) = n-1)}{P(N(s_{n-1}) \geq n -1) }, \quad s_{n-1} \leq s_n\end{equation*}

If $s_{n-1} \geq s_n,$ then $\; P(S_n \lt s_n, S_{n-1} \lt s_{n-1}) = P(N(s_n) \geq n ) , $ giving that

\begin{equation*}P(S_n \lt s_n | S_{n-1} \lt s_{n-1}) = \frac{ P(N(s_n) \geq n ) }{P(N(s_{n-1}) \geq n-1)} \end{equation*}
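Both expressions for $P(S_n \lt s_n | S_{n-1} \lt s_{n-1})$ involve only Poisson point and tail probabilities, so they can be evaluated with scipy.stats.poisson. A minimal sketch, assuming $n \geq 2$ and a barrier function s with s(k) $= s_k$:

import numpy as np
from scipy.stats import poisson

def cond_prob_exp(n, s, lam):
    """P(S_n < s_n | S_{n-1} < s_{n-1}) for exponential(lam) increments, n >= 2."""
    mean_prev = lam * s(n - 1)                  # N(s_{n-1}) ~ Poisson(lam * s_{n-1})
    tail_prev = poisson.sf(n - 2, mean_prev)    # P(N(s_{n-1}) >= n - 1)
    if s(n) >= s(n - 1):
        point = poisson.pmf(n - 1, mean_prev)   # P(N(s_{n-1}) = n - 1)
        return 1.0 - np.exp(-lam * (s(n) - s(n - 1))) * point / tail_prev
    return poisson.sf(n - 1, lam * s(n)) / tail_prev   # P(N(s_n) >= n) / P(N(s_{n-1}) >= n - 1)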

To compute $P(S_i \gt s_i, S_j \gt s_j) = P(N(s_i) \lt i, N(s_j ) \lt j)$ suppose that $i \lt j.$ If $s_j \gt s_i,$ conditioning on $N(s_i)$ yields

\begin{equation*}P(S_i \gt s_i, S_j \gt s_j) = \sum_{r=0}^{i-1} P(N(s_i) = r) P(N(s_j - s_i ) \lt j-r) \end{equation*}

If $s_i \gt s_j,$ then

\begin{equation*}P(S_i \gt s_i, S_j \gt s_j) = P(S_i \gt s_i) = P(N(s_i) \lt i).\end{equation*}
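The corresponding joint tail probabilities follow the same pattern. A short sketch, reusing the imports above and assuming $i \lt j$:

def joint_tail_exp(i, j, s, lam):
    """P(S_i > s_i, S_j > s_j) for i < j with exponential(lam) increments."""
    if s(j) > s(i):
        # condition on N(s_i) = r and use the independent increments of the Poisson process
        return sum(poisson.pmf(r, lam * s(i)) * poisson.cdf(j - r - 1, lam * (s(j) - s(i)))
                   for r in range(i))
    return poisson.cdf(i - 1, lam * s(i))       # s_i >= s_j: the event reduces to N(s_i) < i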

In Tables 1–2 and Figure 1, we present two sets of numerical results for the probability bounds on $P(N \gt n)$. One assumes that $X_i$ follows a normal distribution with mean 1 and variance 1, denoted $X_i \sim N(1,1)$; the other assumes that $X_i$ follows an exponential distribution with rate parameter 1, denoted $X_i \sim Exp(1)$. The boundary is $s_k = 2 + 2 \sqrt{k},\, k \geq 1$. For values of n ranging from 2 to 20, we calculate the lower and upper bounds for $P(N \gt n)$ using the analytical methods described above, alongside Monte Carlo simulation estimates.
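Monte Carlo estimates of this kind can be produced by directly simulating the walk until it first reaches the boundary. The following is a minimal sketch; the step distribution, seed, cap, and number of replications are illustrative choices, not necessarily those used for the tables.

import numpy as np

rng = np.random.default_rng(0)
s = lambda k: 2 + 2 * np.sqrt(k)

def simulate_N(draw_step, s, n_cap=10_000):
    """One realization of N = min{k >= 1 : S_k >= s_k}, capped at n_cap."""
    total = 0.0
    for k in range(1, n_cap + 1):
        total += draw_step()
        if total >= s(k):
            return k
    return n_cap

# 10^5 replications with X_i ~ N(1, 1); Exp(1) steps would use rng.exponential(1.0) instead
Ns = np.array([simulate_N(lambda: rng.normal(1.0, 1.0), s) for _ in range(100_000)])
for n in range(2, 21):
    print(n, np.mean(Ns > n))     # Monte Carlo estimate of P(N > n)
print(Ns.mean())                  # Monte Carlo estimate of E[N] (used in Section 3)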

Figure 1. Probability bounds and Monte Carlo estimates for $P(N \gt n)$ with $s_k = 2 + 2\sqrt{k}$. Left: $X_i \sim N(1,1)$. Right: $X_i \sim \mathrm{Exp}(1)$.

Table 1. Probability bounds and Monte Carlo estimates for $P(N \gt n)$ with $X_i \sim N(1,1)$ and $s_k = 2 + 2 \sqrt{k}$ for n = 2 to 20.

Table 2. Probability bounds and Monte Carlo estimates for $P(N \gt n)$ with $X_i \sim Exp(1)$ and $s_k = 2 + 2 \sqrt{k}$ for n = 2 to 20.

Whereas the bounds on $P(N \gt n)$ also yield bounds on $E[N]$, additional bounds for a square root barrier are given in the next section.

3. Additional bounds on $E[N] $ for a square root barrier

Suppose that $s_k = a + b \sqrt{k}, k \geq 1,$ where $a \geq 0, b \gt 0.$ Also, suppose the log-concave random variables $X_i$ have a positive mean. Now, conditional on N and $S_{N-1},$ the random variable $S_N$ is distributed as $a+b \sqrt{N}$ plus the amount by which $X,$ a random variable having density f, exceeds the positive value $a+b \sqrt{N} - S_{N-1},$ given that it does exceed that value. But a log-concave random variable X conditioned to be positive has an increasing failure rate (see Shaked and Shanthikumar [5]), so its excess over a threshold, given that the threshold is exceeded, is stochastically decreasing in that threshold; this implies that $S_N - (a + b \sqrt{N})$ is stochastically smaller than $X|X \gt 0.$ As this is true no matter what the values of N and $S_{N-1}$, it follows that

\begin{equation*}E[S_N] \leq a + bE[ \sqrt{N} ] + E[X|X \gt 0] \end{equation*}

Using Wald’s equation and Jensen’s inequality the preceding implies that

\begin{equation*}\mu E[N] \leq a + b \sqrt{E[N]} + E[X|X \gt 0] \end{equation*}

With $d = a + E[X|X \gt 0],$ the preceding can be written as

\begin{equation*}\mu E[N] - d \leq b \sqrt{E[N]} \end{equation*}

If $ d \leq \mu E[N] ,$ which can be checked using Corollary 2.5, the preceding yields that

\begin{equation*}\mu^2 E^2[N] + d^2 - (2d \mu+ b^2) E[N] \leq 0\end{equation*}

Because the function $g(x) = \mu^2 x^2 - (2d \mu+ b^2) x + d^2$ is convex with $g(0) = d^2 \gt 0$ and $\lim_{x \rightarrow \infty} g(x) = \infty$, it is nonpositive only in the region between the two roots of $g(x) = 0.$ Since the preceding inequality states that $g(E[N]) \leq 0$, it follows that $E[N]$ lies between these two roots.

Remarks. 1. If f is the normal density with mean µ > 0 and variance $1,$ then $ E[X|X \gt 0] = \mu + \frac{e^{- \mu^2/2} } {\sqrt{2 \pi} \Phi(\mu)},$ where Φ is the standard normal distribution function.

2. Whereas the condition $d \leq \mu E[N]$ involves the unknown $E[N],$ it can often be verified by showing that $d \leq \mu \, \text{LB}$, where LB is the lower bound for $E[N]$ given by Corollary 2.5. (Of course, it is possible that $d \leq \mu E[N]$ but $d \gt \mu \, \text{LB}$).
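For the normal case, the pieces above combine into a simple numerical recipe: compute d via Remark 1, verify the sufficient condition $d \leq \mu \, \text{LB}$ of Remark 2, and bracket $E[N]$ by the two roots of $\mu^2 x^2 - (2d \mu+ b^2) x + d^2 = 0$. A minimal sketch, assuming $X_i \sim N(\mu, 1)$ with µ > 0 and a user-supplied value LB from Corollary 2.5:

import numpy as np
from scipy.stats import norm

def EN_root_bounds(a, b, mu, LB):
    """Bracket E[N] for X_i ~ N(mu, 1), mu > 0, and barrier s_k = a + b*sqrt(k).

    LB is the Corollary 2.5 lower bound on E[N]."""
    d = a + mu + norm.pdf(mu) / norm.cdf(mu)        # d = a + E[X | X > 0], per Remark 1
    if d > mu * LB:
        raise ValueError("sufficient condition d <= mu * LB is not verified")
    coef = [mu ** 2, -(2 * d * mu + b ** 2), d ** 2]
    r1, r2 = sorted(np.roots(coef).real)            # roots of g(x) = 0
    return r1, r2                                   # E[N] lies in [r1, r2]; r2 is UB2 in Table 3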

In Table 3, we give numerical results for the lower and upper bounds on $E[N]$ and compare them with the Monte Carlo estimate of $E[N]$. In this case, $X_i \sim N(1,1)$ and $s_k = a + b \sqrt{k}, k \geq 1$.

Table 3. Comparison of Simulated $E[N]$ with Lower and Upper Bounds for Different Values of a and b.

Let $\widehat{E[N]}$ be the Monte Carlo estimate of $E[N]$, and let LB denote the lower bound in Corollary 2.5. Since the smaller root of $ \mu^2 x^2 - (2d \mu+ b^2) x + d^2 = 0$ does not yield a useful bound, it is excluded. UB1 is the upper bound obtained from the conditional expectation inequality, and UB2 is the larger root. The results are as follows.

Remark. From the numerical results across all cases shown in Table 3, UB2 is consistently smaller than UB1.

Funding statement

This work was supported, in whole or in part, by the National Science Foundation under contract/grant CMMI2132759.

Conflict of interest statement

The authors declare they have no conflict of interest.

References

[1] Blackwell, D. & Freedman, D. (1964). A remark on the coin tossing game. The Annals of Mathematical Statistics 35(3): 1345–1347.
[2] Breiman, L. (1967). First exit times from a square root boundary. In Fifth Berkeley Symposium, Vol. 2, pp. 9–16.
[3] Efron, B. (1965). Increasing properties of Polya frequency functions. The Annals of Mathematical Statistics 36(1): 272–279.
[4] Hansen, N.R. (2006). The maximum of a random walk reflected at a general barrier. The Annals of Applied Probability 16(1): 15–29.
[5] Shaked, M. & Shanthikumar, J.G. (1987). Characterization of some first passage times using log-concavity and log-convexity as aging notions. Probability in the Engineering and Informational Sciences 1(3): 279–291.
[6] Ross, S.M. (2002). Probability Models for Computer Science. Academic Press.