Probably the most common theme in analyzing complex data is the classification, or categorization, of elements. Described abstractly, the task is to classify a given data instance into a prespecified set of categories. Applied to the domain of document management, the task is known as text categorization (TC) – given a set of categories (subjects, topics) and a collection of text documents, the process of finding the correct topic (or topics) for each document.
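As a concrete, if toy, illustration of the task, the following minimal sketch trains a statistical classifier with scikit-learn (assumed available); the corpus, labels, and test sentence are invented purely for illustration:

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB

    # Toy corpus and category labels, invented for illustration only.
    docs = ["the team won the league match",
            "shares fell on weak quarterly earnings",
            "the striker scored in the tournament",
            "the fund raised its dividend"]
    labels = ["sports", "finance", "sports", "finance"]

    vec = CountVectorizer()
    clf = MultinomialNB().fit(vec.fit_transform(docs), labels)
    print(clf.predict(vec.transform(["earnings and dividend news"])))
    # ['finance']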
The study of automated text categorization dates back to the early 1960s (Maron 1961). Then, its main projected use was for indexing scientific literature by means of controlled vocabulary. It was only in the 1990s that the field fully developed, with the availability of ever-increasing numbers of text documents in digital form and the necessity to organize them for easier use. Nowadays automated TC is applied in a variety of contexts – from the classical automatic or semiautomatic (interactive) indexing of texts to personalized delivery of commercials, spam filtering, Web page categorization under hierarchical catalogues, automatic generation of metadata, detection of text genre, and many others.
As with many other artificial intelligence (AI) tasks, there are two main approaches to text categorization. The first is the knowledge engineering approach in which the expert's knowledge about the categories is directly encoded into the system either declaratively or in the form of procedural classification rules.
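A minimal sketch of what such procedural classification rules might look like in practice; the rules and category names here are hypothetical, not drawn from any cited system:

    # Knowledge engineering approach: category membership decided by
    # hand-written keyword rules.  Rules and categories are hypothetical.
    RULES = {
        "sports": {"football", "league", "tournament"},
        "finance": {"stock", "dividend", "portfolio"},
    }

    def classify(document):
        """Return every category whose keyword rule fires."""
        words = set(document.lower().split())
        return [cat for cat, keywords in RULES.items() if words & keywords]

    print(classify("the stock rallied after the dividend announcement"))
    # ['finance']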
The basic tools for describing and analyzing random processes have all been developed in the preceding chapters along with a variety of examples of random processes with and without memory. The goal of this chapter is to use these tools to describe a menagerie of useful random processes, usually by taking a simple random process and applying some form of signal processing such as linear filtering in order to produce a more complicated random process. In Chapter 5 the effect of linear filtering on second-order moments was considered. In this chapter we look in more detail at the resulting output process and consider other forms of signal processing as well. In the course of the development a few new tools and several variations on old tools for deriving distributions are introduced. Much of this chapter can be considered as practice in the methods developed in the previous chapters, with names often being given to the specific examples developed. In fact, several processes with memory have been encountered previously: the binomial counting process and the discrete time Wiener process, in particular. The goal now is to extend the techniques used in these special cases to more general situations and to introduce a wider variety of processes.
The development of examples begins with a continuation of the study of the output processes of linear systems with random process inputs.
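As a minimal numerical sketch of this program (NumPy assumed; the 5-tap moving-average filter is an arbitrary illustrative choice), filtering a memoryless input produces an output process with memory:

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.standard_normal(100_000)        # i.i.d. (memoryless) input
    h = np.ones(5) / 5                      # 5-tap moving-average filter
    y = np.convolve(x, h, mode="valid")     # output process with memory

    # Sample autocorrelation of the output at lags 0..6: nonzero for
    # lags below the filter length, near zero beyond it.
    r = [np.mean(y[:len(y) - k] * y[k:]) for k in range(7)]
    print(np.round(r, 3))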
In this appendix we provide some suggestions for supplementary reading. Our goal is to provide some leads for the reader interested in pursuing the topics treated in more depth. Admittedly we only scratch the surface of the large literature on probability and random processes. The books are selected based on our own tastes — they are books from which we have learned and from which we have drawn useful results, techniques, and ideas for our own research.
A good history of the theory of probability may be found in Maistrov, who details the development of probability theory from its gambling origins through its combinatorial and relative frequency theories to the development by Kolmogorov of its rigorous axiomatic foundations. A somewhat less serious historical development of elementary probability is given by Huff and Geis. Several early papers on the application of probability are given in Newman. Of particular interest are the papers by Bernoulli on the law of large numbers and the paper by George Bernard Shaw comparing the vice of gambling and the virtue of insurance.
An excellent general treatment of the theory of probability and random processes may be found in Ash, along with treatments of real analysis, functional analysis, and measure and integration theory. Ash is a former engineer turned mathematician, and his book is one of the best available for someone with an engineering background who wishes to pursue the mathematics beyond the level treated in this book.
A random or stochastic process is a mathematical model for a phenomenon that evolves in time in an unpredictable manner from the viewpoint of the observer. The phenomenon may be a sequence of real-valued measurements of voltage or temperature, a binary data stream from a computer, a modulated binary data stream from a modem, a sequence of coin tosses, the daily Dow–Jones average, radiometer data or photographs from deep space probes, a sequence of images from cable television, or any of an infinite number of possible sequences, waveforms, or signals of any imaginable type. It may be unpredictable because of such effects as interference or noise in a communication link or storage medium, or it may be an information-bearing signal, deterministic from the viewpoint of an observer at the transmitter but random to an observer at the receiver.
The theory of random processes quantifies the above notions so that one can construct mathematical models of real phenomena that are both tractable and meaningful in the sense of yielding useful predictions of future behavior. Tractability is required in order for the engineer (or anyone else) to be able to perform analyses and syntheses of random processes, perhaps with the aid of computers. The “meaningful” requirement is that the models must provide a reasonably good approximation of the actual phenomena. An oversimplified model may provide results and conclusions that do not apply to the real phenomenon being modeled.
In engineering practice we are often interested in the average behavior of measurements on random processes. The goal of this chapter is to link the two distinct types of averages that are used – long-term time averages taken by calculations on an actual physical realization of a random process and averages calculated theoretically by probabilistic calculus at some given instant of time, averages that are called expectations. As we shall see, both computations often (but by no means always) give the same answer. Such results are called laws of large numbers or ergodic theorems.
At first glance from a conceptual point of view, it seems unlikely that long-term time averages and instantaneous probabilistic averages would be the same. If we take a long-term time average of a particular realization of the random process, say {X(t, ω0); t ∈ T}, we are averaging for a particular ω which we cannot know or choose; we do not use probability in any way and we are ignoring what happens with other values of ω. Here the averages are computed by summing the sequence or integrating the waveform over t while ω0 stays fixed. If, on the other hand, we take an instantaneous probabilistic average, say at the time t0, we are taking a probabilistic average and summing or integrating over ω for the random variable X(t0, ω).
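A minimal simulation sketch (NumPy assumed) of why the two kinds of average can nonetheless agree, using an i.i.d. Gaussian process for which a law of large numbers holds:

    import numpy as np

    rng = np.random.default_rng(1)
    # Rows index omega (realizations), columns index t, for an i.i.d.
    # Gaussian process, which satisfies a law of large numbers.
    ensemble = rng.standard_normal((1000, 1000))

    time_average = ensemble[0].mean()         # fix omega_0, average over t
    ensemble_average = ensemble[:, 0].mean()  # fix t_0, average over omega
    print(round(time_average, 3), round(ensemble_average, 3))
    # both near the expectation, 0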
The theory of random processes is a branch of probability theory and probability theory is a special case of the branch of mathematics known as measure theory. Probability theory and measure theory both concentrate on functions that assign real numbers to certain sets in an abstract space according to certain rules. These set functions can be viewed as measures of the size or weight of the sets. For example, the precise notion of area in two-dimensional Euclidean space and volume in three-dimensional space are both examples of measures on sets. Other measures on sets in three dimensions are mass and weight. Observe that from elementary calculus we can find volume by integrating a constant over the set. From physics we can find mass by integrating a mass density or summing point masses over a set. In both cases the set is a region of three-dimensional space. In a similar manner, probabilities will be computed by integrals of densities of probability or sums of “point masses” of probability.
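For instance (a minimal sketch, assuming SciPy is available), the probability that a standard Gaussian measurement lands in [0, 1] is found by integrating its density over that set:

    from scipy.integrate import quad
    from math import exp, pi, sqrt

    # Probability computed as the integral of a density:
    # P(0 <= X <= 1) for a standard Gaussian random variable X.
    def gaussian_pdf(x):
        return exp(-x * x / 2) / sqrt(2 * pi)

    p, _ = quad(gaussian_pdf, 0.0, 1.0)
    print(round(p, 4))  # 0.3413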
Both probability theory and measure theory consider only nonnegative real-valued set functions. The value assigned by the function to a set is called the probability or the measure of the set, respectively. The basic difference between probability theory and measure theory is that the former considers only set functions that are normalized in the sense of assigning the value of 1 to the entire abstract space, corresponding to the intuition that the abstract space contains every possible outcome of an experiment and hence should happen with certainty or probability 1.
The theory of random processes is constructed on a large number of abstractions. These abstractions are necessary to achieve generality with precision while keeping the notation used manageably brief. Students will probably find learning facilitated if, with each abstraction, they keep in mind (or on paper) a concrete picture or example of a special case of the abstraction. From this the general situation should rapidly become clear. Concrete examples and exercises are introduced throughout the book to help with this process.
Set theory
In this section the basic set theoretic ideas that are used throughout the book are introduced. The starting point is an abstract space, or simply a space, consisting of elements or points, the smallest quantities with which we shall deal. This space, often denoted by Ω, is sometimes referred to as the universal set. To describe a space we may use braces notation with either a list or a description contained within the braces { }. Examples are:
[A.0] The abstract space consisting of no points at all, that is, an empty (or trivial) space. This possibility is usually excluded by assuming explicitly or implicitly that the abstract space is nonempty, that is, it contains at least one point.
In Chapter 4 we saw that the second-order moments of a random process – the mean and covariance or, equivalently, the autocorrelation – play a fundamental role in describing the relation of limiting sample averages and expectations. We also saw, for example in Section 4.6.1 and Problem 4.26, that these moments also play a key role in signal processing applications of random processes, especially in linear least squares estimation. Because of the fundamental importance of these particular moments, this chapter considers their properties in greater depth and their evaluation for several important examples. A primary focus is on a second-order moment analog of a derived distribution problem. Suppose we are given the second-order moments of one random process and this process is then used as an input to a linear system. What are the resulting second-order moments of the output random process? These results are collectively known as second-order moment input/output or I/O relations for linear systems.
Linear systems may seem to be a very special case. As we will see, their most obvious attribute is that they are easier to handle analytically, which leads to more complete, useful, and stronger results than can be obtained for the class of all systems. This special case, however, plays a central role and is by far the most important class of systems.
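As a minimal sketch of one such I/O relation (NumPy assumed; the impulse response and input variance are hypothetical), consider a white input applied to a finite impulse response filter:

    import numpy as np

    # Second-order-moment I/O relation for a discrete-time LTI filter
    # with white input: if R_X[k] = sigma^2 * delta[k], the output
    # autocorrelation is R_Y[k] = sigma^2 * sum_n h[n] h[n + k], the
    # deterministic autocorrelation of the impulse response.
    h = np.array([1.0, 0.5, 0.25])   # hypothetical impulse response
    sigma2 = 2.0                     # assumed input variance
    R_Y = sigma2 * np.correlate(h, h, mode="full")
    print(R_Y)  # lags -2..2: [0.5  1.25  2.625  1.25  0.5]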
This chapter provides theoretical foundations and examples of random variables, vectors, and processes. All three concepts are variations on a single theme and may be included in the general term of random object. We will deal specifically with random variables first because they are the simplest conceptually – they can be considered to be special cases of the other two concepts.
Random variables
The name random variable suggests a variable that takes on values randomly. In a loose, intuitive way this is the right interpretation – e.g., an observer who is measuring the amount of noise on a communication link sees a random variable in this sense. We require, however, a more precise mathematical definition for analytical purposes. Mathematically a random variable is neither random nor a variable – it is just a function mapping one sample space into another space. The first space is the sample space portion of a probability space, and the second space is a subset of the real line (some authors would call this a “real-valued” random variable). The careful mathematical definition will place a constraint on the function to ensure that the theory makes sense, but for the moment we informally define a random variable as a function.
A random variable is perhaps best thought of as a measurement on a probability space; that is, for each sample point ω the random variable produces some value, denoted functionally as f(ω).
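A minimal sketch of this viewpoint (the coin-flip space is our own illustrative choice, not an example from the text): the sample space is the eight outcomes of three coin flips, each with probability 1/8, and the random variable f counts the heads.

    from itertools import product
    from collections import Counter

    omega_space = list(product("HT", repeat=3))   # sample space
    prob = {omega: 1 / 8 for omega in omega_space}

    def f(omega):
        return omega.count("H")   # a measurement on the sample space

    dist = Counter()              # induced distribution P(f = k)
    for omega in omega_space:
        dist[f(omega)] += prob[omega]
    print(dict(dist))  # {3: 0.125, 2: 0.375, 1: 0.375, 0: 0.125}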
The origins of this book lie in our earlier book Random Processes: A Mathematical Approach for Engineers (Prentice Hall, 1986). This book began as a second edition to the earlier book and the basic goal remains unchanged – to introduce the fundamental ideas and mechanics of random processes to engineers in a way that accurately reflects the underlying mathematics, but does not require an extensive mathematical background and does not belabor detailed general proofs when simple cases suffice to get the basic ideas across. In the years since the original book was published, however, it has evolved into something bearing little resemblance to its ancestor. Numerous improvements in the presentation of the material have been suggested by colleagues, students, teaching assistants, and reviewers, and by our own teaching experience. The emphasis of the book shifted increasingly towards examples and a viewpoint that better reflected the title of the courses we taught using the book for many years at Stanford University and at the University of Maryland: An Introduction to Statistical Signal Processing. Much of the basic content of this course and of the fundamentals of random processes can be viewed as the analysis of statistical signal processing systems: typically one is given a probabilistic description for one random object, which can be considered as an input signal. An operation is applied to the input signal (signal processing) to produce a new random object, the output signal.
We review a number of engineering problems that can be posed or solved using Fourier transforms for the groups of rigid-body motions of the plane or three-dimensional space. Mathematically and computationally these problems can be divided into two classes: (1) physical problems that are described as degenerate diffusions on motion groups; (2) enumeration problems in which fast Fourier transforms are used to efficiently compute motion-group convolutions. We examine engineering problems including the analysis of noise in optical communication systems, the allowable positions and orientations reachable with a robot arm, and the statistical mechanics of polymer chains. In all of these cases, concepts from noncommutative harmonic analysis are put to use in addressing real-world problems, thus rendering them tractable.
1. Introduction
Noncommutative harmonic analysis is a beautiful and powerful area of pure mathematics that has connections to analysis, algebra, geometry, and the theory of algorithms. Unfortunately, it is also an area that is almost unknown to engineers. In our research group, we have addressed a number of seemingly intractable “real-world” engineering problems that are easily modeled and/or solved using techniques of noncommutative harmonic analysis. In particular, we have addressed physical/mechanical problems that are described well as functions or processes on the rotation and rigid-body-motion groups. The interactions and evolution of these functions are described using group-theoretic convolutions and diffusion equations, respectively. In this paper we provide a survey of some of these applications and show how computational harmonic analysis on motion groups is used.
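The motion-group machinery itself is beyond a short snippet, but the central computational idea, convolution evaluated through a group Fourier transform, can be sketched on the commutative cyclic group Z_n (NumPy assumed; the motion-group case replaces the scalar transforms below with matrix-valued ones):

    import numpy as np

    # Convolution over the cyclic group Z_n via the Fourier transform:
    # (f * g)[k] = sum_j f[j] g[(k - j) mod n].  The convolution theorem
    # plays the same role for noncommutative motion groups.
    n = 8
    rng = np.random.default_rng(2)
    f, g = rng.random(n), rng.random(n)

    direct = np.array([sum(f[j] * g[(k - j) % n] for j in range(n))
                       for k in range(n)])
    via_fft = np.real(np.fft.ifft(np.fft.fft(f) * np.fft.fft(g)))
    print(np.allclose(direct, via_fft))  # True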