Markov theory is a huge subject (much bigger than indicated by these notes), and consequently there are many books written on it. Three books that have influenced the present text are the ones by Brémaud [B], Grimmett & Stirzaker [GS], and (the somewhat more advanced book by) Durrett [Du]. Another nice introduction to the topic is the book by Norris [N]. Some of my Swedish compatriots will perhaps prefer to consult the texts by Rydén & Lindgren [RL] and Enger & Grandell [EG]. The reader can find plenty of additional material (more general theory, as well as other directions for applications) in any of these references.
Still on the Markov theory side (Chapters 2–6) of this text, there are two particular topics that I would warmly recommend for further study to anyone with a taste for mathematical elegance and the power and simplicity of probabilistic arguments: The first one is the coupling method, which was used to prove Theorems 5.2 and 8.1, and which also underlies the algorithms in Chapters 10–12; see the books by Lindvall [L] and by Thorisson [T]. The second topic is the relation between reversible Markov chains and electrical networks, which is delightfully treated in the book by Doyle & Snell [DSn]. Häggström [H] gives a short introduction in Swedish.
Another goldmine for the ambitious student is the collection of papers edited by Snell [Sn], where many exciting topics in probability, several of which concern Markov chains and/or randomized algorithms, are presented on a level accessible to advanced undergraduates.
The first version of these lecture notes was composed for a last-year undergraduate course at Chalmers University of Technology, in the spring semester 2000. I wrote a revised and expanded version for the same course one year later. This is the third and final (?) version.
The notes are intended to be sufficiently self-contained that they can be read without any supplementary material, by anyone who has previously taken (and passed) some basic course in probability or mathematical statistics, plus some introductory course in computer programming.
The core material falls naturally into two parts: Chapters 2–6 on the basic theory of Markov chains, and Chapters 7–13 on applications to a number of randomized algorithms.
Markov chains are a class of random processes exhibiting a certain “memoryless property”, and the study of these – sometimes referred to as Markov theory – is one of the main areas in modern probability theory. This area cannot be avoided by a student aiming at learning how to design and implement randomized algorithms, because Markov chains are a fundamental ingredient in the study of such algorithms. In fact, any randomized algorithm can (often fruitfully) be viewed as a Markov chain.
I have chosen to restrict the discussion to discrete time Markov chains with finite state space. One reason for doing so is that several of the most important ideas and concepts in Markov theory arise already in this setting; these ideas are more digestible when they are not obscured by the additional technicalities arising from continuous time and more general state spaces.
For several of the most interesting results in Markov theory, we need to put certain assumptions on the Markov chains we are considering. It is an important task, in Markov theory just as in all other branches of mathematics, to find conditions that on the one hand are strong enough to have useful consequences, but on the other hand are weak enough to hold (and be easy to check) for many interesting examples. In this chapter, we will discuss two such conditions on Markov chains: irreducibility and aperiodicity. These conditions are of central importance in Markov theory, and in particular they play a key role in the study of stationary distributions, which is the topic of Chapter 5. We shall, for simplicity, discuss these notions in the setting of homogeneous Markov chains, although they do have natural extensions to the more general setting of inhomogeneous Markov chains.
We begin with irreducibility, which, loosely speaking, is the property that “all states of the Markov chain can be reached from all others”. To make this more precise, consider a Markov chain (X0, X1, …) with state space S = {s1, …, sk} and transition matrix P. We say that a state si communicates with another state sj, writing si → sj, if the chain has positive probability of ever reaching sj when we start from si.
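This reachability criterion is easy to check computationally: build the directed graph with an edge si → sj whenever the transition probability Pij is positive, and test that every state reaches every other. A minimal sketch (the 3-state transition matrix below is a hypothetical example, not one from the text):

```python
# Check irreducibility of a finite Markov chain by graph search on the
# directed graph of positive transition probabilities.

def reachable(P, i):
    """Set of states reachable from state i."""
    k = len(P)
    seen, stack = {i}, [i]
    while stack:
        a = stack.pop()
        for b in range(k):
            if P[a][b] > 0 and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen

def is_irreducible(P):
    k = len(P)
    return all(reachable(P, i) == set(range(k)) for i in range(k))

# Hypothetical 3-state chain: every state communicates with every other.
P = [[0.5, 0.5, 0.0],
     [0.25, 0.5, 0.25],
     [0.0, 0.5, 0.5]]
print(is_irreducible(P))  # True
```

For a reducible chain (e.g. one with an absorbing state that cannot be left), the same check returns False.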
The general problem considered in this chapter is the following. We have a set S = {s1, …, sk} and a function f : S → R. The objective is to find an si ∈ S which minimizes (or, sometimes, maximizes) f (si).
When the size k of S is small, then this problem is of course totally trivial – just compute f (si) for i = 1, …, k and keep track sequentially of the smallest value so far, and for which si it was attained. What we should have in mind is the case where k is huge, so that this simple method becomes computationally too heavy to be useful in practice. Here are two examples.
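The small-k strategy just described can be sketched in a few lines; the set S and the function f below are hypothetical stand-ins for a concrete problem:

```python
# Brute-force minimization of f over a small state space S,
# keeping track sequentially of the smallest value seen so far.

def brute_force_min(S, f):
    best_s, best_val = S[0], f(S[0])
    for s in S[1:]:
        v = f(s)
        if v < best_val:
            best_s, best_val = s, v
    return best_s, best_val

S = list(range(1, 11))           # s_1, ..., s_k with k = 10
f = lambda s: (s - 7) ** 2 + 3   # hypothetical f, minimized at s = 7
print(brute_force_min(S, f))     # (7, 3)
```

The point of the chapter is precisely that this loop becomes useless when k is astronomically large.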
Example 13.1: Optimal packing. Let G be a graph with vertex set V and edge set E. Suppose that we want to pack objects at the vertices of this graph, in such a way that
(i) at most one object can be placed at each vertex, and
(ii) no two objects can occupy adjacent vertices,
and that we want to squeeze in as many objects as possible under these constraints. If we represent objects by 1's and empty vertices by 0's, then, in the terminology of Example 7.1 (the hard-core model), the problem is to find (one of) the feasible configuration(s) ξ ∈ {0, 1}V which maximizes the number of 1's. […]
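For a graph small enough to enumerate, the packing problem can be solved by exactly this kind of exhaustive search over feasible hard-core configurations. A sketch on a hypothetical 4-cycle (not an example from the text):

```python
from itertools import product

# Exhaustive search for a maximum packing: the feasible hard-core
# configuration xi in {0,1}^V with the most 1's, on a 4-cycle.

V = [0, 1, 2, 3]
E = [(0, 1), (1, 2), (2, 3), (3, 0)]

def feasible(xi):
    # no two 1's on adjacent vertices
    return all(not (xi[u] and xi[v]) for u, v in E)

best = max((xi for xi in product([0, 1], repeat=len(V)) if feasible(xi)),
           key=sum)
print(best, sum(best))  # two objects on opposite vertices of the cycle
```

Again, the number of configurations is 2^|V|, so this enumeration is hopeless for large graphs, which is what motivates the randomized methods of this chapter.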
This chapter is devoted to the first-passage properties of fractal and nonfractal networks including the Cayley tree, hierarchically branched trees, regular and hierarchical combs, hierarchical blob structures, and other networks. One basic motivation for extending our study of first passage to these geometries is that many physical realizations of diffusive transport, such as hopping conductivity in amorphous semiconductors, gel chromatography, and hydrodynamic dispersion, occur in spatially disordered media. For general references see, e.g., Havlin and ben-Avraham (1987), Bouchaud and Georges (1990), and ben-Avraham and Havlin (2000). Judiciously constructed idealized networks can offer simple descriptions of these complex systems and their first-passage properties are often solvable by exact renormalization of the master equations.
In the spirit of simplicity, we study first passage on hierarchical trees, combs, and blobs. The hierarchical tree is an iterative construction in which one bond is replaced with three identical bonds at each stage; this represents a minimalist branched structure. The comb and the blob structures consist of a main backbone and an array of sidebranches or blob regions where the flow rate is vanishingly small. By varying the relative geometrical importance of the sidebranches (or blobs) to the backbone, we can fundamentally alter the first-passage characteristics of these systems.
When transport along the backbone predominates, first-passage properties are essentially one dimensional in character. For hierarchical trees, the roles of the sidebranches and the backbone are comparable, leading to a mean first-passage time that grows more quickly than the square of the system length. Correspondingly, such structures can be viewed as having an effective spatial dimension greater than one.
We now develop the ideas of the previous chapter to determine basic first-passage properties for both continuum diffusion and the discrete random walk in a finite one-dimensional interval. This is a simple system with which we can illustrate the physical implications of first-passage processes and the basic techniques for their solution. Essentially all of the results of this chapter are well known, but they are scattered throughout the literature. Much information about the finite-interval system is contained in texts such as Cox and Miller (1965), Feller (1968), Gardiner (1985), Risken (1988), and Gillespie (1992). An important early contribution for the finite-interval system is given by Darling and Siegert (1953). Finally, some of the approaches discussed in this chapter are similar in spirit to those of Fisher (1988).
For continuum diffusion, we start with the direct approach of first solving the diffusion equation and then computing first-passage properties from the time dependence of the flux leaving the system. Much of this is classical and well-known material. These same results will then be rederived more elegantly by the electrostatic analogies introduced in Chap. 1. This provides a striking illustration of the power of these analogies and sets the stage for their use in higher dimensions and in more complex geometries (Chaps. 5–7).
We also derive parallel results for the discrete random walk. One reason for this redundancy is that random walks are often more familiar than diffusion, because the former often arise in elementary courses. It will therefore be satisfying to see the essential unity of their first-passage properties. It is also instructive to introduce various methods for analyzing the recursion relations for the discrete random walk.
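One classical finite-interval result that both the continuum and the discrete approaches yield is the mean exit time of a symmetric nearest-neighbor walk on {0, …, L} with absorbing endpoints: starting from site x, it equals x(L − x). A Monte Carlo sketch with hypothetical parameters:

```python
import random

# Monte Carlo check of a classical finite-interval result: for a
# symmetric nearest-neighbor walk on {0, ..., L} absorbed at both
# ends, the mean exit time from site x is x * (L - x).

def mean_exit_time(L, x0, trials=20000, seed=1):
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        x, t = x0, 0
        while 0 < x < L:
            x += 1 if rng.random() < 0.5 else -1
            t += 1
        total += t
    return total / trials

L, x0 = 10, 5
print(mean_exit_time(L, x0), x0 * (L - x0))  # simulation ~ 25 vs exact 25
```

Starting from the midpoint, the exit time grows as the square of the interval length, the hallmark of diffusive transport.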
You arrange a 7 P.M. date at a local bistro. Your punctual date arrives at 6:55, waits until 7:05, concludes that you will not show up, and leaves. At 7:06, you saunter in – “just a few minutes” after 7 (see Cover). You assume that you arrived first and wait for your date. The wait drags on and on. “What's going on?” you think to yourself. By 9 P.M., you conclude that you were stood up, return home, and call to make amends. You explain, “I arrived around 7 and waited 2 hours! My probability of being at the bistro between 7 and 9 P.M., P(bistro, t), was nearly one! How did we miss each other?” Your date replies, “I don't care about your occupation probability. What mattered was your first-passage probability, F(bistro, t), which was zero at 7 P.M. GOOD BYE!” Click!
The moral of this juvenile parable is that first passage underlies many stochastic processes in which the event, such as a dinner date, a chemical reaction, the firing of a neuron, or the triggering of a stock option, relies on a variable reaching a specified value for the first time. In spite of the wide applicability of first-passage phenomena (or perhaps because of it), there does not seem to be a pedagogical source on this topic. For those with a serious interest, essential information is scattered and presented at diverse technical levels. In my attempts to learn the subject, I also encountered the proverbial conundrum that a fundamental result is “well known to (the vanishingly small subset of) those who know it well.”
This chapter is devoted to first-passage properties in spherically symmetric systems. We shall see how the contrast between persistence, for spatial dimension d ≤ 2, and transience, for d > 2, leads to very different first-passage characteristics. We will solve first-passage properties both by the direct time-dependent solution of the diffusion equation and by the much simpler and more elegant electrostatic analogy of Section 1.6.
The case of two dimensions is particularly interesting, as the inclusion of a radial potential drift ν(r) ∝ 1/r is effectively the same as changing the spatial dimension. Thus general first-passage properties for isotropic diffusion in d dimensions are closely related to those of diffusion in two dimensions with a superimposed radial potential bias. This leads to nonuniversal behavior for the two-dimensional system.
As an important application of our study of first passage to an isolated sphere, we will obtain the classic Smoluchowski expression for the chemical reaction rate, a result that underlies much of chemical kinetics. Because of the importance of this result, we will derive it by time-dependent approaches as well as by the quasi-static approximation introduced in Section 3.6. The latter approach also provides an easy way to understand detailed properties of the spatial distribution of reactants around a spherical trap.
First Passage between Concentric Spheres
We begin by computing the splitting (or exit) probabilities and the corresponding mean hitting times to the inner and outer boundaries of the annular region R− ≤ r ≤ R+ as functions of the starting radius r (Fig. 6.1).
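The splitting probability follows from solving Laplace's equation in the annulus with boundary values 1 on the inner sphere and 0 on the outer one; a sketch of the standard closed form (the numerical parameter values are hypothetical):

```python
import math

# Probability E_-(r) that a diffusing particle started at radius r in
# the annulus R- <= r <= R+ hits the inner sphere first, in spatial
# dimension d.  This is the harmonic function of r with E_-(R-) = 1 and
# E_-(R+) = 0: logarithmic in d = 2, a power law r^(2-d) otherwise.

def hit_inner_first(r, R_minus, R_plus, d):
    if d == 2:
        return math.log(R_plus / r) / math.log(R_plus / R_minus)
    return ((r**(2 - d) - R_plus**(2 - d))
            / (R_minus**(2 - d) - R_plus**(2 - d)))

print(hit_inner_first(2.0, 1.0, 4.0, 3))  # 1/3 for these radii in d = 3
```

Note how the d = 3 answer depends on the starting radius only through 1/r, the electrostatic potential of a point charge.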
In this last chapter, we investigate simple particle reactions whose kinetics can be understood in terms of first-passage phenomena. These are typically diffusion-controlled reactions, in which diffusing particles are immediately converted to a product whenever a pair of them meets. The term diffusion controlled refers to the fact that the reaction itself is fast and the overall kinetics is controlled by the transport mechanism that brings reactive pairs together. Because the reaction occurs when particles first meet, first-passage processes provide a useful perspective for understanding the kinetics.
We begin by treating the trapping reaction, in which diffusing particles are permanently captured whenever they meet immobile trap particles. For a finite density of randomly distributed static traps, the asymptotic survival probability S(t) is controlled by rare, but large, trap-free regions. We obtain this survival probability exactly in one dimension and by a Lifshitz tail argument in higher dimensions that focuses on these rare configurations [Lifshitz, Gredeskul, & Pastur (1988)]. At long times, we find that S(t) ∼ exp(−At^{d/(d+2)}), where A is a constant and d is the spatial dimension. This peculiar form for the survival probability was the focus of considerable theoretical effort that ultimately elucidated the role of extreme fluctuations on asymptotic behavior [see, e.g., Rosenstock (1969), Balagurov & Vaks (1974), Donsker & Varadhan (1975, 1979), Bixon & Zwanzig (1981), Grassberger & Procaccia (1982a), Kayser & Hubbard (1983), Havlin et al. (1984), and Agmon & Glasser (1986)].
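The Lifshitz-tail argument can be sketched numerically in d = 1: a trap-free gap of length L occurs with probability ∼ e^{−ρL}, and survival inside such a gap decays as ∼ e^{−π²Dt/L²}, so −ln S(t) is obtained by minimizing the sum of the two exponents over L. The optimization reproduces the t^{1/3} = t^{d/(d+2)} growth. The constants below are hypothetical:

```python
import math

# Lifshitz-tail estimate in d = 1: balance the cost rho*L of a
# trap-free gap of size L against the slow decay pi^2*D*t/L^2 of
# survival inside it, and minimize over L on a fine grid.

def neg_log_S(t, rho=1.0, D=1.0):
    return min(rho * L + math.pi**2 * D * t / L**2
               for L in (0.01 * i for i in range(1, 100000)))

# Exponent estimate from the slope of ln(-ln S) versus ln t:
t1, t2 = 1e4, 1e6
slope = (math.log(neg_log_S(t2)) - math.log(neg_log_S(t1))) / math.log(t2 / t1)
print(round(slope, 3))  # 0.333, i.e. d/(d+2) with d = 1
```

The optimal gap grows as L* ∼ t^{1/3}: at long times survival is dominated by ever larger, ever rarer trap-free regions.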
We next discuss diffusion-controlled reactions in one dimension.
A natural counterpart to the finite interval is the first-passage properties of the semi-infinite interval [0, ∞) with absorption at x = 0. Once again, this is a classical geometry for studying first-passage processes, and many of the references mentioned at the outset of Chap. 2 are again pertinent. In particular, the text by Karlin and Taylor (1975) gives a particularly comprehensive discussion of first-passage times in the semi-infinite interval with arbitrary hopping rates between neighboring sites. Once again, however, our focus is on simple diffusion or the nearest-neighbor random walk. For these processes, the possibility of a diffusing particle making arbitrarily large excursions before certain trapping takes place leads to an infinite mean lifetime. On the other hand, the recurrence of diffusion in one dimension means that the particle must eventually return to its starting point. This dichotomy between infinite lifetime and certain trapping leads to a variety of extremely surprising first-passage-related properties, both for the semi-infinite interval and the infinite system.
Perhaps the most amazing such property is the arcsine law for the probability of long leads in a symmetric nearest-neighbor random walk in an unbounded domain. Although this law applies to the unrestricted random walk, it is intimately based on the statistics of returns to the origin and thus fits naturally in our discussion of first passage on the semi-infinite interval. Our natural expectation is that, for a random walk that starts at x = 0, approximately one-half of the total time would be spent on the positive axis and the remaining one-half on the negative axis. Surprisingly, this is the least probable outcome.
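This counterintuitive claim is easy to check by simulation. The sketch below (with hypothetical run lengths, and using the fraction of steps spent strictly above the origin as a simple proxy for the time on the positive side) shows that fractions near 0 or 1 are far more common than fractions near 1/2:

```python
import random

# Monte Carlo illustration of the arcsine law: for a symmetric
# nearest-neighbor walk started at the origin, the fraction of time
# on the positive side piles up near 0 and 1, not near 1/2.

def fraction_positive(n_steps, rng):
    x, pos = 0, 0
    for _ in range(n_steps):
        x += 1 if rng.random() < 0.5 else -1
        if x > 0:
            pos += 1
    return pos / n_steps

rng = random.Random(0)
fracs = [fraction_positive(500, rng) for _ in range(2000)]
extremes = sum(1 for f in fracs if f < 0.1 or f > 0.9)
middle = sum(1 for f in fracs if 0.45 <= f <= 0.55)
print(extremes, middle)  # extremes greatly outnumber the middle
```

The limiting density 1/(π√(x(1−x))) of the arcsine law diverges at both endpoints, which is exactly the U shape this histogram exhibits.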