In Chapter 8, conditional probabilities are introduced by conditioning upon the occurrence of an event B of nonzero probability. In applications, this event B is often of the form Y = b for a discrete random variable Y. However, when the random variable Y is continuous, the condition Y = b has probability zero for any number b. The purpose of this chapter is to develop techniques for handling a condition provided by the observed value of a continuous random variable. We will see that the conditional probability density function of X given Y = b for continuous random variables is analogous to the conditional probability mass function of X given Y = b for discrete random variables. The conditional distribution of X given Y = b enables us to define the natural concept of conditional expectation of X given Y = b. This concept allows for an intuitive understanding and is of utmost importance. In statistical applications, it is often more convenient to work with conditional expectations instead of the correlation coefficient when measuring the strength of the relationship between two dependent random variables. In applied probability problems, the computation of the expected value of a random variable X is often greatly simplified by conditioning on an appropriately chosen random variable Y. Learning the value of Y provides additional information about the random variable X and for that reason the computation of the conditional expectation of X given Y = b is often simple.
In previous chapters we have dealt with sequences of independent random variables. However, many random systems evolving in time involve sequences of dependent random variables. Think of the outside temperature on successive days, or the price of IBM stock at the end of successive trading days. For many such systems it is reasonable to assume that the probability of going from one state to another state depends only on the current state of the system and thus is not influenced by additional information about past states. The probability model with this feature is called a Markov chain. The concepts of state and state transition are at the heart of Markov chain analysis. Thinking in terms of states and state transitions is very useful for analyzing many practical problems in applied probability.
Markov chains are named after the Russian mathematician Andrey Markov (1856–1922), who first developed this probability model in order to analyze the alternation of vowels and consonants in Pushkin's poem “Eugene Onegin.” His work helped to launch the modern theory of stochastic processes (a stochastic process is a collection of random variables, indexed by an ordered time variable). The characteristic property of a Markov chain is that its memory goes back only to the most recent state. Knowledge of the current state alone is sufficient to describe the future development of the process. A Markov model is the simplest model for random systems evolving in time when the successive states of the system are not independent.
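The idea of state and state transition can be made concrete with a small sketch. The two-state weather chain below is purely hypothetical: the state names, the transition probabilities, and the helper names `step` and `distribution_after` are assumptions chosen for illustration and do not come from the text. The sketch shows only that the distribution of tomorrow's state is computed from today's distribution and the transition probabilities, without any reference to earlier days.

```python
# Hypothetical two-state weather chain. The transition probabilities are
# illustrative assumptions, not values taken from the text.
STATES = ["sunny", "rainy"]
P = [
    [0.8, 0.2],  # from sunny: P(sunny tomorrow), P(rainy tomorrow)
    [0.4, 0.6],  # from rainy: P(sunny tomorrow), P(rainy tomorrow)
]


def step(dist, matrix):
    """Advance one day: new_dist[j] = sum over i of dist[i] * matrix[i][j]."""
    n = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def distribution_after(days, start, matrix=P):
    """State distribution after the given number of days, starting from `start`."""
    dist = list(start)
    for _ in range(days):
        dist = step(dist, matrix)
    return dist


today = [1.0, 0.0]  # it is sunny today with certainty
for days in (1, 2, 7):
    dist = distribution_after(days, today)
    print(days, dict(zip(STATES, dist)))
```

Only the current distribution is carried from one day to the next, which is exactly the memoryless property described above.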
How does one calculate the probability of throwing heads more than 15 times in 25 tosses of a fair coin? What is the probability of winning a lottery prize? Is it exceptional for a city that averages eight serious fires per year to experience 12 serious fires in one particular year? These kinds of questions can be answered by the probability distributions that we will be looking at in this chapter. These are the binomial distribution, the Poisson distribution, and the hypergeometric distribution. A basic knowledge of these distributions is essential in the study of probability theory. This chapter gives insight into the different types of problems to which these probability distributions can be applied. The binomial model refers to a series of independent trials of an experiment that has two possible outcomes. Such an elementary experiment is also known as a Bernoulli experiment, after the famous Swiss mathematician Jakob Bernoulli (1654–1705). In most cases, the two possible outcomes of a Bernoulli experiment will be specified as “success” or “failure.” Many probability problems boil down to determining the probability distribution of the total number of successes in a series of independent trials of a Bernoulli experiment. The Poisson distribution is another important distribution and is used, in particular, to model the occurrence of rare events. When you know the expected value of a Poisson distribution, you know enough to calculate all of the probabilities of that distribution.
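As a rough sketch of how the first and third questions above could be answered numerically, the following uses the standard binomial and Poisson probability mass functions. The function names are hypothetical, and modeling the yearly number of serious fires as Poisson distributed with expected value 8 is an assumption in line with the use of the Poisson distribution for rare events.

```python
from math import comb, exp, factorial


def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_tail(k_min, n, p):
    """P(at least k_min successes in n trials)."""
    return sum(binomial_pmf(k, n, p) for k in range(k_min, n + 1))


def poisson_pmf(k, mean):
    """P(exactly k events) for a Poisson distribution with the given expected value."""
    return exp(-mean) * mean**k / factorial(k)


def poisson_tail(k_min, mean):
    """P(at least k_min events), computed as one minus P(fewer than k_min events)."""
    return 1.0 - sum(poisson_pmf(k, mean) for k in range(k_min))


# More than 15 heads in 25 tosses of a fair coin means 16 or more heads.
print("P(more than 15 heads in 25 tosses):", binomial_tail(16, 25, 0.5))

# 12 or more serious fires in a year when the expected number is 8.
print("P(12 or more fires, mean 8):", poisson_tail(12, 8.0))
```

Note that `poisson_tail` needs only the expected value as input, which reflects the remark that the expected value determines all probabilities of a Poisson distribution.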
In this chapter, we provide a number of probability problems that challenge the reader to test his or her feeling for probabilities. As stated in the Introduction, it is possible to fall wide of the mark when using intuitive reasoning to calculate a probability, or to estimate the order of magnitude of a probability. To find out how you fare in this regard, it may be useful to try one or more of these 12 problems. They are playful in nature but are also illustrative of the surprises one can encounter in the solving of practical probability problems. Think carefully about each question before looking up its solution. All of the solutions to these problems can be found scattered throughout the ensuing chapters.
Question 1. A birthday problem (§3.1, §4.2.3)
You go with a friend to a football (soccer) game. The game involves 22 players of the two teams and one referee. Your friend wagers that, among these 23 persons on the field, at least two people will have birthdays on the same day. You will receive ten dollars from your friend if this is not the case. How much money should you, if the wager is to be a fair one, pay out to your friend if he is right?
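Readers who want to check their answer to Question 1 after thinking it through could use a sketch along the following lines. It assumes 365 equally likely birthdays, independent from person to person, and ignores February 29; the function names are hypothetical. A wager is taken to be fair when the expected gain of each party is zero.

```python
def prob_shared_birthday(people, days=365):
    """P(at least two of `people` share a birthday), assuming equally likely days."""
    p_all_distinct = 1.0
    for i in range(people):
        # The (i + 1)-th person must avoid the i birthdays already taken.
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct


def fair_payout(amount_received, p_friend_wins):
    """Payout x that makes the bet fair: amount_received * (1 - p) = x * p."""
    return amount_received * (1 - p_friend_wins) / p_friend_wins


p = prob_shared_birthday(23)
print("P(shared birthday among 23 persons):", p)
print("Fair payout to your friend in dollars:", fair_payout(10.0, p))
```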
Question 2. Probability of winning streaks (§2.1.3, §5.9.1)
A basketball player has a 50% success rate in free throw shots. Assuming that the outcomes of all free throws are independent from one another, what is the probability that, within a sequence of 20 shots, the player can score five baskets in a row?
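One possible way to check an answer to Question 2 is a short recursion over the length of the current streak. This is a sketch under the assumptions stated in the question (independent shots, success probability 0.5); the function name and the state bookkeeping are hypothetical and are not necessarily the method used in the referenced sections.

```python
def prob_run_at_least(run_length, shots, p_success=0.5):
    """P(some run of at least `run_length` consecutive successes in `shots` trials)."""
    # state[r] = probability that no run of the required length has occurred so far
    # and the current streak of successes has length r (0 <= r < run_length).
    state = [0.0] * run_length
    state[0] = 1.0
    for _ in range(shots):
        nxt = [0.0] * run_length
        for r, prob in enumerate(state):
            nxt[0] += prob * (1 - p_success)  # a miss resets the streak to zero
            if r + 1 < run_length:
                nxt[r + 1] += prob * p_success  # a hit extends the streak
            # Otherwise the streak reaches run_length and this probability mass
            # leaves the "no run yet" states for good.
        state = nxt
    return 1.0 - sum(state)


print("P(five baskets in a row within 20 shots):", prob_run_at_least(5, 20))
```

Tracking only the current streak length suffices because future shots depend on the past only through that streak.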
In many practical applications of probability, physical situations are better described by random variables that can take on a continuum of possible values rather than a discrete number of values. Examples are the decay time of a radioactive particle, the time until the occurrence of the next earthquake in a certain region, the lifetime of a battery, the annual rainfall in London, and so on. These examples make clear what the fundamental difference is between discrete random variables and continuous random variables. Whereas a discrete random variable assigns positive probabilities to its individual values, any individual value has probability zero for a continuous random variable. It is only meaningful to speak of the probability of a continuous random variable taking on a value in some interval. Taking the lifetime of a battery as an example, it will be intuitively clear that the probability of this lifetime taking on a specific value becomes zero when a finer and finer unit of time is used. If you could measure the heights of people with infinite precision, the height of a randomly chosen person would be a continuous random variable. In reality, heights cannot be measured with infinite precision, but the mathematical analysis of the distribution of heights of people is greatly simplified when using a mathematical model in which the height of a randomly chosen person is modeled as a continuous random variable.
In the midst of a coin-tossing game, after seeing a long run of tails, we are often tempted to think that the chances that the next toss will be heads must be getting larger. Or, if we have rolled a die many times without seeing a six, we are sure that finally we will roll a six. These notions are known as the gambler's fallacy. Of course, it is a mistake to think that the previous tosses will influence the outcome of the next toss: a coin or die has no memory. With each new toss, each of the possible outcomes remains equally likely. Irregular patterns of heads and tails are even characteristic of tosses with a fair coin. Unexpectedly long runs of heads or tails can already occur with a relatively small number of tosses. To see five or six heads in a row in 20 tosses is not exceptional. It is the case, however, that as the number of tosses increases, the fractions of heads and tails should be about equal, but that is guaranteed only in the long run. In the theory of probability, this fact is known as the law of large numbers. Just as the name implies, this law only says something about the game after a large number of tosses. This law does not imply that the absolute difference between the numbers of heads and tails should oscillate close to zero.
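A small simulation can illustrate the distinction between the fraction of heads and the absolute difference between the numbers of heads and tails. The sketch below uses Python's pseudo-random number generator as a stand-in for a fair coin; the function name and the seed value are arbitrary assumptions, and the printed numbers depend on the seed.

```python
import random


def toss_summary(num_tosses, seed=0):
    """Simulate fair-coin tosses; return (fraction of heads, |heads - tails|)."""
    rng = random.Random(seed)  # a fixed seed only makes the run reproducible
    heads = sum(rng.random() < 0.5 for _ in range(num_tosses))
    tails = num_tosses - heads
    return heads / num_tosses, abs(heads - tails)


for n in (100, 10_000, 1_000_000):
    fraction, difference = toss_summary(n)
    print(f"{n:>9} tosses: fraction of heads = {fraction:.4f}, "
          f"|heads - tails| = {difference}")
```

In runs of this kind the fraction of heads typically settles near 0.5 as the number of tosses grows, while the absolute difference shows no tendency to stay near zero.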
This appendix first gives some background material on counting methods. Many probability problems require counting techniques. In particular, these techniques are extremely useful for computing probabilities in a chance experiment in which all possible outcomes are equally likely. In such experiments, one needs effective methods to count the number of outcomes in any specific event. In counting problems, it is important to know whether the order in which the elements are counted is relevant or not. After the discussion on counting methods, the appendix summarizes a number of properties of the famous number e and the exponential function e^x, both of which play an important role in probability.
Permutations
In how many different ways can you arrange a number of different objects such as letters or numbers? For example, what is the number of different ways that the three letters A, B, and C can be arranged? By writing out all the possibilities ABC, ACB, BAC, BCA, CAB, and CBA, you can see that the total number is six. This brute-force method of writing down all the possibilities and counting them is naturally not practical when the number of possibilities gets large, for example the number of different ways to arrange the 26 letters of the alphabet. You can also determine that the three letters A, B, and C can be written down in six different ways by reasoning as follows. For the first position, there are three available letters to choose from, for the second position there are two letters left to choose from, and only one letter remains for the third position. The total number of arrangements is therefore 3 × 2 × 1 = 6.
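Both approaches, brute-force enumeration and the position-by-position counting argument, can be sketched in a few lines. The example below uses only the Python standard library; the variable names are arbitrary.

```python
from itertools import permutations
from math import factorial

letters = "ABC"

# Brute force: write out every arrangement and count them.
arrangements = ["".join(p) for p in permutations(letters)]
print(arrangements)
print("Number found by enumeration:", len(arrangements))

# Counting argument: 3 choices, then 2, then 1, which is 3! = 3 * 2 * 1.
print("Number found by counting:", factorial(len(letters)))

# Enumeration is hopeless for the 26 letters of the alphabet,
# but the counting argument still gives the number directly.
print("Arrangements of 26 letters:", factorial(26))
```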
In the first part of this book, we worked many times with models of random variables. In performing a chance experiment, one is often not interested in the particular outcome that occurs but in a specific numerical value associated with that outcome. Any function that assigns a real number to each outcome in the sample space of the experiment is called a random variable. The purpose of this chapter is to familiarize the reader with a number of basic rules for calculating characteristics of random variables such as the expected value and the variance. These rules are most easily explained and understood in the context of discrete random variables. Therefore, the discussion in this chapter is restricted to the case of discrete random variables. However, the rules for discrete random variables apply with obvious modifications to other types of random variables as well. In Chapter 10, we discuss so-called continuous random variables. Such random variables have a continuous interval as the range of possible values.
Random variables
Intuitively, a random variable is a variable that takes on its values by chance. The convention is to use capital letters such as X, Y, Z to denote random variables. Formally, a random variable is defined as a real-valued function on the sample space of a chance experiment. A random variable X assigns a numerical value X(ω) to each element ω of the sample space. For example, if X is the sum of the dots when one fair die is rolled twice, the random variable X assigns the numerical value i + j to the outcome (i, j) of the chance experiment.
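The formal definition can be mirrored directly in code: the sample space is a list of outcomes and the random variable is an ordinary function on it. The sketch below treats the two-dice example this way, assuming all 36 outcomes (i, j) are equally likely; the names `sample_space`, `X`, and `pmf` are chosen for illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space of rolling one fair die twice: all ordered pairs (i, j).
sample_space = list(product(range(1, 7), repeat=2))


def X(outcome):
    """The random variable: maps the outcome (i, j) to the sum i + j."""
    i, j = outcome
    return i + j


# Probability mass function of X: add 1/36 for each outcome with that sum.
pmf = {}
for omega in sample_space:
    pmf[X(omega)] = pmf.get(X(omega), Fraction(0)) + Fraction(1, len(sample_space))

expected_value = sum(x * p for x, p in pmf.items())

for x in sorted(pmf):
    print(f"P(X = {x}) = {pmf[x]}")
print("E(X) =", expected_value)
```

Exact fractions are used instead of floating-point numbers so that the probabilities can be read off as multiples of 1/36.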