This chapter develops more tools for working with random variables. The probability generating function is the key tool for working with sums of independent, nonnegative, integer-valued random variables. When random variables are only uncorrelated, we can work with averages (normalized sums) by using the weak law of large numbers. We emphasize that the weak law makes the connection between probability theory and the everyday practice of using averages of observations to estimate probabilities of real-world measurements. The last two sections introduce conditional probability and conditional expectation. The three important tools here are the law of total probability, the law of substitution, and, for independent random variables, “dropping the conditioning.”
The foregoing concepts are developed here for discrete random variables, but they will all be extended to more general settings in later chapters.
Probability generating functions
In many problems we have a sum of independent random variables and would like to know its probability mass function. For example, in an optical communication system, the received signal might be Y = X + W, where X is the number of photoelectrons due to incident light on a photodetector, and W is the number of electrons due to dark current noise in the detector. An important tool for solving these kinds of problems is the probability generating function. The name derives from the fact that it can be used to compute the probability mass function.
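As a rough illustration of why the generating function helps with Y = X + W: for independent X and W, the generating function of the sum is the product of the individual generating functions, and multiplying two power series convolves their coefficients. The sketch below assumes small, made-up pmfs (a binomial(2, 1/2) for X and a Bernoulli(1/2) for W); they are illustrative only and are not taken from the text, and `pmf_of_sum` is a hypothetical helper name.

```python
def pmf_of_sum(pmf_x, pmf_w):
    """pmf of Y = X + W for independent X and W taking values 0, 1, 2, ...

    pmf_x[k] = P(X = k) and pmf_w[k] = P(W = k). These lists are the
    coefficients of the generating functions G_X(z) and G_W(z); the
    coefficients of the product G_X(z) * G_W(z) give the pmf of Y.
    """
    pmf_y = [0.0] * (len(pmf_x) + len(pmf_w) - 1)
    for i, px in enumerate(pmf_x):
        for j, pw in enumerate(pmf_w):
            pmf_y[i + j] += px * pw  # contributes to the coefficient of z**(i + j)
    return pmf_y


# Illustrative pmfs (assumed for this sketch, not from the text).
pmf_x = [0.25, 0.5, 0.25]  # X ~ binomial(2, 1/2)
pmf_w = [0.5, 0.5]         # W ~ Bernoulli(1/2)
pmf_y = pmf_of_sum(pmf_x, pmf_w)
print(pmf_y)  # [0.125, 0.375, 0.375, 0.125], the binomial(3, 1/2) pmf
```

With these inputs the result is the binomial(3, 1/2) pmf, as expected for a sum of three independent fair Bernoulli trials.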
Why do electrical and computer engineers need to study probability?
Probability theory provides powerful tools to explain, model, analyze, and design technology developed by electrical and computer engineers. Here are a few applications.
Signal processing. My own interest in the subject arose when I was an undergraduate taking the required course in probability for electrical engineers. We considered the situation shown in Figure 1.1. To determine the presence of an aircraft, a known radar pulse v(t) is sent out. If there are no objects in range of the radar, the radar's amplifiers produce only a noise waveform, denoted by X_t. If there is an object in range, the reflected radar pulse plus noise is produced. The overall goal is to decide whether the received waveform is noise only or signal plus noise. To get an idea of how difficult this can be, consider the signal plus noise waveform shown at the top in Figure 1.2. Our class addressed the subproblem of designing an optimal linear system to process the received waveform so as to make the presence of the signal more obvious. We learned that the optimal transfer function is given by the matched filter. If the signal at the top in Figure 1.2 is processed by the appropriate matched filter, we get the output shown at the bottom in Figure 1.2. You will study the matched filter in Chapter 10.
In Chapters 2 and 3, the only random variables we considered specifically were discrete ones such as the Bernoulli, binomial, Poisson, and geometric. In this chapter we consider a class of random variables allowed to take a continuum of values. These random variables are called continuous random variables and are introduced in Section 4.1. Continuous random variables are important models for integrator output voltages in communication receivers, file download times on the Internet, velocity and position of an airliner on radar, etc. Expectation and moments of continuous random variables are computed in Section 4.2. Section 4.3 develops the concepts of moment generating function (Laplace transform) and characteristic function (Fourier transform). In Section 4.4 expectation of multiple random variables is considered. Applications of characteristic functions to sums of independent random variables are illustrated. In Section 4.5 the Markov inequality, the Chebyshev inequality, and the Chernoff bound illustrate simple techniques for bounding probabilities in terms of expectations.
Densities and probabilities
Introduction
Suppose that a random voltage in the range [0,1) is applied to a voltmeter with a one-digit display. Then the display output can be modeled by a discrete random variable Y taking values .0, .1, .2, …, .9 with P(Y = k/10) = 1/10 for k = 0, …, 9.
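The voltmeter model can be checked with a short simulation. This is only a sketch under two assumptions that the passage implies but does not spell out: the voltage is uniformly distributed on [0,1), and the display truncates (rather than rounds) to one decimal digit. The function name `display_reading` and the sample size are hypothetical choices.

```python
import random
from collections import Counter


def display_reading(v):
    """One-digit display for a voltage v in [0, 1): keep the first decimal digit."""
    return int(10 * v) / 10


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 100_000             # number of simulated voltages (arbitrary choice)
counts = Counter(display_reading(rng.random()) for _ in range(n))

# Relative frequency of each displayed value; each should be close to 1/10.
for k in range(10):
    print(k / 10, counts[k / 10] / n)
```

Each relative frequency should come out near 1/10, in line with P(Y = k/10) = 1/10.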
This book is a primary text for graduate-level courses in probability and random processes that are typically offered in electrical and computer engineering departments. The text starts from first principles and contains more than enough material for a two-semester sequence. The level of the text varies from advanced undergraduate to graduate as the material progresses. The principal prerequisite is the usual undergraduate electrical and computer engineering course on signals and systems, e.g., Haykin and Van Veen or Oppenheim and Willsky (see the Bibliography at the end of the book). However, later chapters that deal with random vectors assume some familiarity with linear algebra; e.g., determinants and matrix inverses.
How to use the book
A first course. In a course that assumes at most a modest background in probability, the core of the offering would include Chapters 1–5 and 7. These cover the basics of probability and discrete and continuous random variables. As the chapter dependencies graph on the preceding page indicates, there is considerable flexibility in the selection and ordering of additional material as the instructor sees fit.
A second course. In a course that assumes a solid background in the basics of probability and discrete and continuous random variables, the material in Chapters 1–5 and 7 can be reviewed quickly.
Prior to the 1990s, network analysis and design was carried out using long-established Markovian models such as the Poisson process. As self similarity was observed in the traffic of local-area networks, wide-area networks, and in World Wide Web traffic, a great research effort began to examine the impact of self similarity on network analysis and design. This research has yielded some surprising insights into questions about buffer size versus bandwidth, multiple-time-scale congestion control, connection duration prediction, and other issues.
The purpose of this chapter is to introduce the notion of self similarity and related concepts so that the student can be conversant with the kinds of stochastic processes being used to model network traffic. For more information, the student may consult the text by Beran, which includes numerous physical models and a historical overview of self similarity and long-range dependence.
Section 15.1 introduces the Hurst parameter and the notion of distributional self similarity for continuous-time processes. The concept of stationary increments is also presented. As an example of such processes, fractional Brownian motion is developed using the Wiener integral. In Section 15.2, we show that if one samples the increments of a continuous-time self-similar process with stationary increments, then the samples have a covariance function with a specific formula. It is shown that this formula is equivalent to specifying the variance of the sample mean for all values of n.
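The summary above does not state the covariance formula itself. The sketch below assumes the standard form for sampled increments of a self-similar process with stationary increments, C(n) = (σ²/2)(|n+1|^(2H) − 2|n|^(2H) + |n−1|^(2H)), where H is the Hurst parameter, and checks numerically that it yields a sample-mean variance of σ² n^(2H−2). The function names and the value H = 0.8 are illustrative assumptions.

```python
def covariance(n, hurst, sigma2=1.0):
    """Assumed covariance at lag n of the sampled increments."""
    h2 = 2 * hurst
    return 0.5 * sigma2 * (abs(n + 1) ** h2 - 2 * abs(n) ** h2 + abs(n - 1) ** h2)


def sample_mean_variance(n, hurst, sigma2=1.0):
    """Variance of (X_1 + ... + X_n)/n, summed directly from the covariance."""
    total = sum(covariance(i - j, hurst, sigma2) for i in range(n) for j in range(n))
    return total / n ** 2


hurst = 0.8  # illustrative Hurst parameter
for n in (1, 2, 5, 10, 50):
    # The two printed values should agree up to rounding error.
    print(n, sample_mean_variance(n, hurst), n ** (2 * hurst - 2))
```

With H = 1/2 the assumed covariance vanishes at every nonzero lag, and the variance of the sample mean reduces to the familiar σ²/n for uncorrelated samples.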
As we have seen, most problems in probability textbooks start out with random variables having a given probability mass function or density. However, in the real world, problems start out with a finite amount of data, X_1, X_2, …, X_n, about which very little is known based on the physical situation. We are still interested in computing probabilities, but we first have to find the pmf or density with which to do the calculations. Sometimes the physical situation determines the form of the pmf or density up to a few unknown parameters. For example, the number of alpha particles given off by a radioactive sample is Poisson(λ), but we need to estimate λ from measured data. In other situations, we may have no information about the pmf or density. In this case, we collect data and look at histograms to suggest possibilities. In this chapter, we not only look at parameter estimators and histograms, we also try to quantify how confident we are that our estimate or density choice is a good one.
Section 6.1 introduces the sample mean and sample variance as unbiased estimators of the true mean and variance. The concept of strong consistency is introduced and used to show that estimators based on the sample mean and sample variance inherit strong consistency. Section 6.2 introduces histograms and the chi-squared statistic for testing the goodness-of-fit of a hypothesized pmf or density to a histogram.
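A minimal sketch of the two estimators follows. The data values are made up for illustration, and the function names are hypothetical. The sample variance divides by n − 1 rather than n, which is what makes it unbiased.

```python
def sample_mean(xs):
    """Average of the observations."""
    return sum(xs) / len(xs)


def sample_variance(xs):
    """Unbiased sample variance: divide by n - 1, not n (needs n >= 2)."""
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative observations, not from the text
print(sample_mean(data))      # 5.0
print(sample_variance(data))  # 32/7, about 4.571
```

If the data were counts modeled as Poisson(λ), the sample mean would serve as the estimate of λ, since λ is the mean of that distribution.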
A Markov chain is a random process with the property that given the values of the process from time zero up through the current time, the conditional probability of the value of the process at any future time depends only on its value at the current time. This is equivalent to saying that the future and the past are conditionally independent given the present (cf. Problem 70 in Chapter 1).
Markov chains often have intuitively pleasing interpretations. Some examples discussed in this chapter are random walks (without barriers and with barriers, which may be reflecting, absorbing, or neither), queuing systems (with finite or infinite buffers), birth–death processes (with or without spontaneous generation), life (with states being “healthy,” “sick,” and “death”), and the gambler's ruin problem.
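To make one of these examples concrete, here is a small sketch of the gambler's ruin problem as a Markov chain on the states 0, 1, …, N, where 0 and N are absorbing. The function name, the target fortune of 5, and the iterative solution method are choices made for this illustration and are not taken from the text.

```python
def ruin_probabilities(target, p_win, sweeps=2000):
    """Probability of hitting 0 before `target`, for each starting fortune.

    States 0 and `target` are absorbing; from any state i in between, the
    chain moves to i + 1 with probability p_win and to i - 1 otherwise.
    Conditioning on the first step gives
        r[i] = p_win * r[i + 1] + (1 - p_win) * r[i - 1],
    which is solved here by repeated in-place sweeps over the interior states.
    """
    r = [0.0] * (target + 1)
    r[0] = 1.0  # starting at 0 means the gambler is already ruined
    for _ in range(sweeps):
        for i in range(1, target):
            r[i] = p_win * r[i + 1] + (1 - p_win) * r[i - 1]
    return r


print(ruin_probabilities(target=5, p_win=0.5))
```

For a fair game (p_win = 1/2) the computed values should match the known closed form 1 − i/N for a starting fortune of i.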
Section 12.1 briefly highlights some simple properties of conditional probability that are very useful in studying Markov chains. Sections 12.2–12.4 cover basic results about discrete-time Markov chains. Continuous-time chains are discussed in Section 12.5.
Preliminary results
We present some easily derived properties of conditional probability. These observations will greatly simplify some of our calculations for Markov chains.