In Chapter 8, conditional probabilities are introduced by conditioning upon the occurrence of an event B of nonzero probability. In applications, this event B is often of the form Y = b for a discrete random variable Y. However, when the random variable Y is continuous, the condition Y = b has probability zero for any number b. The purpose of this chapter is to develop techniques for handling a condition provided by the observed value of a continuous random variable. We will see that the conditional probability density function of X given Y = b for continuous random variables is analogous to the conditional probability mass function of X given Y = b for discrete random variables. The conditional distribution of X given Y = b enables us to define the natural concept of conditional expectation of X given Y = b. This concept allows for an intuitive understanding and is of utmost importance. In statistical applications, it is often more convenient to work with conditional expectations instead of the correlation coefficient when measuring the strength of the relationship between two dependent random variables. In applied probability problems, the computation of the expected value of a random variable X is often greatly simplified by conditioning on an appropriately chosen random variable Y. Learning the value of Y provides additional information about the random variable X and for that reason the computation of the conditional expectation of X given Y = b is often simple.
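As a concrete illustration of computing an expected value by conditioning (the uniform example below is our own, not from the text): suppose Y is uniform on (0, 1) and, given Y = y, X is uniform on (0, y). Then E[X | Y = y] = y/2, and averaging this conditional expectation over the distribution of Y gives E[X] = 1/4.

```python
# Our own illustrative example (not from the text): Y is uniform on (0, 1),
# and given Y = y, X is uniform on (0, y), so E[X | Y = y] = y / 2.
# The law of total expectation then gives E[X] = E[E[X | Y]] = E[Y] / 2 = 1/4.

def cond_exp_x_given_y(y):
    """Conditional expectation E[X | Y = y] for this example."""
    return y / 2.0

def total_expectation(n=100_000):
    """Midpoint Riemann sum of E[X | Y = y] * f_Y(y) over (0, 1), with f_Y = 1."""
    h = 1.0 / n
    return sum(cond_exp_x_given_y((i + 0.5) * h) * h for i in range(n))

print(round(total_expectation(), 4))  # → 0.25
```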
In previous chapters we have dealt with sequences of independent random variables. However, many random systems evolving in time involve sequences of dependent random variables. Think of the outdoor temperature on successive days, or the price of IBM stock at the end of successive trading days. For many such systems it is reasonable to assume that the probability of going from one state to another state depends only on the current state of the system and thus is not influenced by additional information about past states. The probability model with this feature is called a Markov chain. The concepts of state and state transition are at the heart of Markov chain analysis. The line of thinking through the concepts of state and state transition is very useful for analyzing many practical problems in applied probability.
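The state/transition idea can be made concrete with a small sketch (the two-state "weather" chain and its transition probabilities below are our own illustration, not from the text). Each row of the transition matrix gives the probabilities of the next state given the current one, and repeated multiplication propagates the state distribution forward in time.

```python
# A minimal sketch (our own assumed numbers): a two-state weather Markov chain
# with states 0 = "dry" and 1 = "rainy". The next state depends only on the
# current state, via the transition matrix P.
P = [[0.8, 0.2],   # P[i][j] = probability of moving from state i to state j
     [0.5, 0.5]]

def step(dist, P):
    """One state transition: the new distribution is dist multiplied by P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

dist = [1.0, 0.0]      # start in the dry state with certainty
for _ in range(20):    # after many transitions the distribution settles down
    dist = step(dist, P)
print([round(p, 4) for p in dist])  # → [0.7143, 0.2857], i.e. (5/7, 2/7)
```

The limiting values 5/7 and 2/7 solve the equilibrium equations π = πP, which foreshadows the long-run analysis of Markov chains.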
Markov chains are named after the Russian mathematician Andrey Markov (1856–1922), who first developed this probability model in order to analyze the alternation of vowels and consonants in Pushkin's poem “Eugene Onegin.” His work helped to launch the modern theory of stochastic processes (a stochastic process is a collection of random variables, indexed by an ordered time variable). The characteristic property of a Markov chain is that its memory goes back only to the most recent state. Knowledge of the current state alone is sufficient to describe the future development of the process. A Markov model is the simplest model for random systems evolving in time when the successive states of the system are not independent.
How does one calculate the probability of throwing heads more than 15 times in 25 tosses of a fair coin? What is the probability of winning a lottery prize? Is it exceptional for a city that averages eight serious fires per year to experience 12 serious fires in one particular year? These kinds of questions can be answered by the probability distributions that we will be looking at in this chapter. These are the binomial distribution, the Poisson distribution, and the hypergeometric distribution. A basic knowledge of these distributions is essential in the study of probability theory. This chapter gives insight into the different types of problems to which these probability distributions can be applied. The binomial model refers to a series of independent trials of an experiment that has two possible outcomes. Such an elementary experiment is also known as a Bernoulli experiment, after the famous Swiss mathematician Jakob Bernoulli (1654–1705). In most cases, the two possible outcomes of a Bernoulli experiment will be specified as “success” or “failure.” Many probability problems boil down to determining the probability distribution of the total number of successes in a series of independent trials of a Bernoulli experiment. The Poisson distribution is another important distribution and is used, in particular, to model the occurrence of rare events. When you know the expected value of a Poisson distribution, you know enough to calculate all of the probabilities of that distribution.
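The two opening questions can be answered directly from these distributions. A short sketch (our own code, using only the standard library): the number of heads in 25 tosses is binomial with n = 25 and p = 1/2, and the yearly number of serious fires can be modeled as Poisson with expected value 8.

```python
from math import comb, exp, factorial

def binom_pmf(k, n, p):
    """P(exactly k successes in n independent Bernoulli trials with success probability p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """P(exactly k events) for a Poisson distribution with expected value lam."""
    return exp(-lam) * lam**k / factorial(k)

# Probability of throwing heads more than 15 times in 25 tosses of a fair coin
p_heads = sum(binom_pmf(k, 25, 0.5) for k in range(16, 26))

# Probability of 12 or more serious fires in a year, when the yearly average is 8
p_fires = 1 - sum(poisson_pmf(k, 8) for k in range(12))

print(round(p_heads, 4), round(p_fires, 4))  # about 0.115 and 0.112
```

Both events have probability on the order of 11%, so twelve fires in a year for a city averaging eight is noteworthy but hardly exceptional.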
In this chapter, we provide a number of probability problems that challenge the reader to test his or her feeling for probabilities. As stated in the Introduction, it is possible to fall wide of the mark when using intuitive reasoning to calculate a probability, or to estimate the order of magnitude of a probability. To find out how you fare in this regard, it may be useful to try one or more of these 12 problems. They are playful in nature but are also illustrative of the surprises one can encounter in the solving of practical probability problems. Think carefully about each question before looking up its solution. All of the solutions to these problems can be found scattered throughout the ensuing chapters.
Question 1. A birthday problem (§3.1, §4.2.3)
You go with a friend to a football (soccer) game. The game involves the 22 players of the two teams and one referee. Your friend wagers that, among these 23 persons on the field, at least two will have birthdays on the same day. You will receive ten dollars from your friend if this is not the case. If the wager is to be a fair one, how much money should you pay your friend if he is right?
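If you would like to check your intuition numerically before turning to the analytic solution later in the book, a direct computation is possible (this sketch is ours, assuming 365 equally likely birthdays and ignoring leap years):

```python
# Our own numerical check (the book treats this problem analytically):
# probability that at least two of 23 people share a birthday, assuming
# all 365 birthdays are equally likely and ignoring leap years.
all_distinct = 1.0
for i in range(23):
    all_distinct *= (365 - i) / 365
p_shared = 1 - all_distinct

# Fair wager: your stake x must satisfy x * p_shared = 10 * (1 - p_shared).
stake = 10 * (1 - p_shared) / p_shared
print(round(p_shared, 4), round(stake, 2))  # roughly 0.5073 and 9.71
```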
Question 2. Probability of winning streaks (§2.1.3, §5.9.1)
A basketball player has a 50% success rate in free throw shots. Assuming that the outcomes of all free throws are independent of one another, what is the probability that, within a sequence of 20 shots, the player scores five baskets in a row?
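Before looking up the analytic treatment later in the book, you can get a feel for the answer by simulation. A Monte Carlo sketch (our own, with an arbitrary fixed seed for reproducibility):

```python
import random

# Monte Carlo sketch (ours, not the book's analytic solution): estimate the
# probability that a 50% free-throw shooter scores at least five baskets in
# a row somewhere within a sequence of 20 shots.
def has_streak(rng, n_shots=20, streak=5):
    run = 0
    for _ in range(n_shots):
        if rng.random() < 0.5:   # a made shot extends the run
            run += 1
            if run >= streak:
                return True
        else:                    # a miss resets the run
            run = 0
    return False

rng = random.Random(42)          # fixed seed for reproducibility
trials = 100_000
hits = sum(has_streak(rng) for _ in range(trials))
print(hits / trials)             # estimate near 0.25
```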
In many practical applications of probability, physical situations are better described by random variables that can take on a continuum of possible values rather than a discrete number of values. Examples are the decay time of a radioactive particle, the time until the occurrence of the next earthquake in a certain region, the lifetime of a battery, the annual rainfall in London, and so on. These examples make clear what the fundamental difference is between discrete random variables and continuous random variables. Whereas a discrete random variable associates positive probabilities to its individual values, any individual value has probability zero for a continuous random variable. It is only meaningful to speak of the probability of a continuous random variable taking on a value in some interval. Taking the lifetime of a battery as an example, it will be intuitively clear that the probability of this lifetime taking on a specific value becomes zero when a finer and finer unit of time is used. If you can measure the heights of people with infinite precision, the height of a randomly chosen person is a continuous random variable. In reality, heights cannot be measured with infinite precision, but the mathematical analysis of the distribution of heights of people is greatly simplified when using a mathematical model in which the height of a randomly chosen person is modeled as a continuous random variable.
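To make the interval idea concrete, here is a small sketch (the exponential lifetime model and its mean are our own assumed example, not from the text). Probabilities are obtained as differences of the cumulative distribution function, and any single point carries probability zero.

```python
from math import exp

# Assumed illustration (not from the text): model a battery lifetime X as an
# exponential random variable with mean 2 years, so F(x) = 1 - e^(-x/2).
rate = 1.0 / 2.0   # rate parameter = 1 / mean lifetime

def cdf(x):
    """Cumulative distribution function F(x) = P(X <= x)."""
    return 1 - exp(-rate * x)

def prob_interval(a, b):
    """P(a < X <= b) = F(b) - F(a): probability lives on intervals."""
    return cdf(b) - cdf(a)

print(round(prob_interval(1.0, 3.0), 4))  # → 0.3834
print(prob_interval(2.0, 2.0))            # → 0.0: a single value has probability zero
```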
In the midst of a coin-tossing game, after seeing a long run of tails, we are often tempted to think that the chances that the next toss will be heads must be getting larger. Or, if we have rolled a die many times without seeing a six, we are sure that finally we will roll a six. These notions are known as the gambler's fallacy. Of course, it is a mistake to think that the previous tosses will influence the outcome of the next toss: a coin or die has no memory. With each new toss, each of the possible outcomes remains equally likely. Irregular patterns of heads and tails are even characteristic of tosses with a fair coin. Unexpectedly long runs of heads or tails can already occur within a relatively small number of tosses. To see five or six heads in a row in 20 tosses is not exceptional. It is true that, as the number of tosses increases, the fractions of heads and tails should become about equal, but this is guaranteed only in the long run. In the theory of probability, this fact is known as the law of large numbers. Just as the name implies, this law only says something about the game after a large number of tosses. This law does not imply that the absolute difference between the numbers of heads and tails should oscillate close to zero.
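A quick simulation sketch (ours, with an arbitrary fixed seed) illustrates both points: the fraction of heads settles near 1/2, while the absolute difference between the counts of heads and tails need not shrink toward zero and typically grows with the number of tosses.

```python
import random

# Simulation sketch (ours): the fraction of heads approaches 1/2, but the
# absolute difference |heads - tails| does not oscillate close to zero.
rng = random.Random(7)           # fixed seed for reproducibility
heads = tails = 0
checkpoints = {100, 10_000, 1_000_000}
for toss in range(1, 1_000_001):
    if rng.random() < 0.5:
        heads += 1
    else:
        tails += 1
    if toss in checkpoints:
        # print: number of tosses, fraction of heads, |heads - tails|
        print(toss, round(heads / toss, 4), abs(heads - tails))
```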