This chapter lays the foundation for the remainder of the book by introducing key concepts and definitions for complex random vectors and processes. The structure of this chapter is as follows.
In Section 2.1, we relate descriptions of complex random vectors to the corresponding descriptions in terms of their real and imaginary parts. We will see that operations that are linear when applied to real and imaginary parts generally become widely linear (i.e., linear–conjugate-linear) when applied to complex vectors. We introduce a matrix algebra that enables a convenient description of these widely linear transformations.
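As a minimal numerical sketch of this idea (the matrices and dimension here are arbitrary choices, not from the text): a widely linear map y = Ax + Bx* becomes an ordinary matrix–vector product once x is augmented with its conjugate.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
# Hypothetical widely linear transformation y = A x + B conj(x)
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)

y = A @ x + B @ np.conj(x)

# Augmented representation: stack x on top of its conjugate
x_aug = np.concatenate([x, np.conj(x)])
M = np.block([[A, B], [np.conj(B), np.conj(A)]])  # augmented matrix
y_aug = M @ x_aug

assert np.allclose(y_aug[:n], y)           # top half recovers y
assert np.allclose(y_aug[n:], np.conj(y))  # bottom half is conj(y)
```

The block structure of M, with conjugated copies in the second row, is what makes the augmented algebra closed under composition.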
Section 2.2 introduces a complete second-order statistical characterization of complex random vectors. The key finding is that the information in the standard, Hermitian, covariance matrix must be complemented by a second, complementary, covariance matrix. We establish the conditions that a pair of Hermitian and complementary covariance matrices must satisfy, and show what role the complementary covariance matrix plays in power and entropy.
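A scalar sketch of the two second-order quantities (the example signals are ours, not the text's): the Hermitian variance E[|z|²] and the complementary variance E[z²], estimated from samples of a proper and an improper random variable.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 200_000

# Proper example: independent real and imaginary parts, equal variances
z_prop = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
# Improper example: a real Gaussian regarded as a complex variable
z_improp = rng.standard_normal(N).astype(complex)

def second_order(z):
    r = np.mean(z * np.conj(z)).real  # Hermitian variance E[|z|^2]
    c = np.mean(z * z)                # complementary variance E[z^2]
    return r, c

r1, c1 = second_order(z_prop)    # r1 ~ 1, c1 ~ 0: proper
r2, c2 = second_order(z_improp)  # r2 ~ 1, c2 ~ 1: maximally improper
```

Both statistics together are needed for a complete second-order description; the Hermitian variance alone cannot distinguish the two examples.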
In Section 2.3, we explain that probability distributions and densities for complex random vectors must be interpreted as joint distributions and densities of their real and imaginary parts. We present two important distributions: the complex multivariate Gaussian distribution and its generalization, the complex multivariate elliptical distribution. These distributions depend both on the Hermitian covariance matrix and on the complementary covariance matrix, and their well-known versions are obtained for the zero complementary covariance matrix.
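To illustrate the interpretation through real and imaginary parts (a scalar construction of our own, with an assumed complementary variance rho): an improper Gaussian with E[|z|²] = 1 and E[z²] = rho can be built from independent real and imaginary parts with unequal variances (1 ± rho)/2.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 200_000
rho = 0.6  # assumed complementary variance E[z^2]; Hermitian variance is 1

# Real and imaginary parts: independent zero-mean Gaussians with
# variances (1 + rho)/2 and (1 - rho)/2
u = rng.standard_normal(N) * np.sqrt((1 + rho) / 2)
v = rng.standard_normal(N) * np.sqrt((1 - rho) / 2)
z = u + 1j * v

print(np.mean(np.abs(z) ** 2))  # ~ 1.0
print(np.mean(z * z))           # ~ 0.6, nonzero: z is improper
```

Setting rho = 0 recovers the familiar circularly symmetric Gaussian with equal real/imaginary variances.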
Detection is the electrical engineer's term for the statistician's hypothesis testing. The problem is to determine which of two or more competing models best describes experimental measurements. If the competition is between two models, then the detection problem is a binary detection problem. Such problems apply widely to communication, radar, and sonar. But even a binary problem can be composite, which is to say that one or both of the hypotheses may consist of a set of models. We shall denote by H0 the hypothesis that the underlying model, or set of models, is M0 and by H1 the hypothesis that it is M1.
There are two main lines of development for detection theory: Neyman–Pearson and Bayes. The Neyman–Pearson theory is a frequentist theory that assigns no prior probability of occurrence to the competing models. Bayesian theory does. Moreover, the measure of optimality is different. To a frequentist the game is to maximize the detection probability under the constraint that the false-alarm probability is not greater than a prespecified value. To a Bayesian the game is to assign costs to incorrect decisions, and then to minimize the average (or Bayes) cost. The solution in any case is to evaluate the likelihood of the measurement under each hypothesis, and to choose the model whose likelihood is higher. Well – not quite. It is the likelihood ratio that is evaluated, and when this ratio exceeds a threshold, determined either by the false-alarm rate or by the Bayes cost, one or other of the hypotheses is accepted.
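A toy Neyman–Pearson sketch (the Gaussian shift problem and all numbers are our choices): H0: y ~ N(0,1) versus H1: y ~ N(1,1). The log-likelihood ratio log p1(y)/p0(y) = y − 1/2 is monotone in y, so thresholding the likelihood ratio is equivalent to thresholding y itself, with the threshold fixed by the false-alarm rate.

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(3)
N = 100_000
pfa_target = 0.1

# Threshold on y chosen so that P(y > gamma | H0) = pfa_target
gamma = NormalDist().inv_cdf(1 - pfa_target)

y0 = rng.standard_normal(N)        # measurements under H0
y1 = 1.0 + rng.standard_normal(N)  # measurements under H1

pfa = np.mean(y0 > gamma)  # ~ 0.10 by construction
pd = np.mean(y1 > gamma)   # ~ 0.39, the detection probability at this pfa
```

A Bayesian treatment of the same problem would instead compare the likelihood ratio against a threshold built from priors and costs; only the threshold changes, not the statistic.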
In this chapter, we discuss in detail the second-order description of a complex random vector x. We have seen in Chapter 2 that the second-order averages of x are completely described by the augmented covariance matrix Rxx. We shall now be interested in those second-order properties of x that are invariant under two types of transformations: widely unitary and nonsingular strictly linear.
The eigenvalues of the augmented covariance matrix Rxx constitute a maximal invariant for Rxx under widely unitary transformation. Hence, any function of Rxx that is invariant under widely unitary transformation must be a function of these eigenvalues only. In Section 3.1, we consider the augmented eigenvalue decomposition (EVD) of Rxx for a complex random vector x. Since we are working with an augmented matrix algebra, this EVD looks somewhat different from what one might expect. In fact, because all factors in the EVD must be augmented matrices, widely unitary diagonalization of Rxx is generally not possible. As an application for the augmented EVD, we discuss rank reduction and transform coding.
In Section 3.2, we introduce the canonical correlations between x and x*, which have been called the circularity coefficients. These constitute a maximal invariant for Rxx under nonsingular strictly linear transformation. They are interesting and useful for a number of reasons.
They determine the loss in entropy that an improper Gaussian random vector incurs compared with its proper version (see Section 3.2.1).
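As a numerical sketch (our construction; one common way to compute the circularity coefficients is as the singular values of the coherence matrix R⁻¹ᐟ² C R⁻ᵀᐟ², with R the Hermitian and C the complementary covariance): a vector with independent improper components of assumed complementary variances 0.8 and 0.3 should yield exactly those circularity coefficients.

```python
import numpy as np

rng = np.random.default_rng(4)
N = 500_000
rho = np.array([0.8, 0.3])  # assumed circularity coefficients

# Improper Gaussian vector with R = I and C = diag(rho): each component
# built from independent real/imaginary parts with variances (1 +- rho)/2
u = rng.standard_normal((N, 2)) * np.sqrt((1 + rho) / 2)
v = rng.standard_normal((N, 2)) * np.sqrt((1 - rho) / 2)
x = u + 1j * v

R = x.T @ x.conj() / N  # Hermitian covariance estimate E[x x^H]
C = x.T @ x / N         # complementary covariance estimate E[x x^T]

# Coherence matrix whose singular values are the circularity coefficients
w, V = np.linalg.eigh(R)
Rm12 = V @ np.diag(w ** -0.5) @ V.conj().T  # R^{-1/2}
K = Rm12 @ C @ Rm12.T                       # note: transpose, not Hermitian
k = np.linalg.svd(K, compute_uv=False)      # ~ [0.8, 0.3]
```

Because these are canonical correlations between x and x*, they are unchanged by any nonsingular strictly linear transformation of x.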
Complex-valued random signals are embedded into the very fabric of science and engineering, being essential to communications, radar, sonar, geophysics, oceanography, optics, electromagnetics, acoustics, and other applied sciences. A great many problems in detection, estimation, and signal analysis may be phrased in terms of two channels' worth of real signals. It is common practice in science and engineering to place these signals into the real and imaginary parts of a complex signal. Complex representations bring economies and insights that are difficult to achieve with real representations.
In the past, it has often been assumed – usually implicitly – that complex random signals are proper and circular. A proper complex random variable is uncorrelated with its complex conjugate, and a circular complex random variable has a probability distribution that is invariant under rotation in the complex plane. These assumptions are convenient because they simplify computations and, in many aspects, make complex random signals look and behave like real random signals. Yet, while these assumptions can often be justified, there are also many cases in which proper and circular random signals are very poor models of the underlying physics. This fact has been known and appreciated by oceanographers since the early 1970s, but it has only recently been accepted across disciplines by acousticians, optical scientists, and communication theorists.
This book develops the tools and algorithms that are necessary to deal with improper complex random variables, which are correlated with their complex conjugate, and with noncircular complex random variables, whose probability distribution varies under rotation in the complex plane.
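A concrete communications example of the distinction (the BPSK/QPSK illustration is ours): BPSK symbols in complex baseband are improper, while QPSK symbols are proper.

```python
import numpy as np

rng = np.random.default_rng(5)
N = 100_000

# BPSK symbols +/-1 satisfy E[x^2] = 1 != 0: improper
bpsk = rng.choice([-1.0, 1.0], N).astype(complex)
# QPSK symbols satisfy E[x^2] = 0: proper
qpsk = rng.choice([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j], N) / np.sqrt(2)

print(np.mean(bpsk ** 2))  # = 1 exactly: improper
print(np.mean(qpsk ** 2))  # ~ 0: proper
```

Treating a BPSK signal as if it were proper discards the information carried by its complementary correlation, which is exactly what the widely linear methods of this book recover.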
All parameter estimation begins with a measurement and an algorithm for extracting a parameter estimate from the measurement. The algorithm is the estimator.
There are two ways to think about performance analysis. One way is to begin with a particular estimator and then to compute its performance. Typically this would amount to computing the bias of the estimator and its error covariance matrix. The practitioner then draws or analyzes concentration ellipsoids to decide whether or not the estimator meets specifications. But the other, more general, way is to establish a limit on the accuracy of any estimator of the parameter. We might call this a uniform limit, uniform over an entire class of estimators. Such a limit would speak to the information that the measurement carries about the underlying parameter, independently of how the information is extracted.
Performance bounds are fundamental to signal processing because they tell us when the number and quality of spatial, temporal, or spatial–temporal measurements are sufficient to meet performance specifications. That is, these general bounds speak to the quality of the experiment or the sensing scheme itself, rather than to the subsequent signal processing. If the sensing scheme carries insufficient information about the underlying parameter, then no amount of sophisticated signal processing can extract information that is not there. In other words, if the bound says that the error covariance is larger than specifications require, then the experiment or measurement scheme must be redesigned.
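The simplest instance of such a bound (our worked example): for N i.i.d. samples from N(θ, σ²), the Fisher information is N/σ², so any unbiased estimator of θ has variance at least σ²/N, the Cramér–Rao bound. The sample mean attains it.

```python
import numpy as np

rng = np.random.default_rng(6)
sigma2, N, trials = 4.0, 50, 20_000
theta = 1.0

# Cramér-Rao bound: var(any unbiased estimator) >= sigma2 / N
crb = sigma2 / N

y = theta + np.sqrt(sigma2) * rng.standard_normal((trials, N))
theta_hat = y.mean(axis=1)  # sample mean, an unbiased estimator

print(theta_hat.var())  # ~ 0.08, attains the bound
print(crb)              # 0.08
```

If the specification demanded an error variance below 0.08, this bound says no estimator can deliver it; only more or better measurements can.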
Assessing multivariate association between two random vectors x and y is an important problem in many research areas, ranging from the natural sciences (e.g., oceanography and geophysics) to the social sciences (in particular psychometrics and behaviormetrics) and to engineering. While “multivariate association” is often simply visualized as “similarity” between two random vectors, there are many different ways of measuring it. In this chapter, we provide a unifying treatment of three popular correlation analysis techniques: canonical correlation analysis (CCA), multivariate linear regression (MLR), and partial least squares (PLS). Each of these techniques transforms x and y into their respective internal representations ξ and ω. Different correlation coefficients may then be defined as functions of the diagonal cross-correlations {ki} between the internal representations ξi and ωi.
The key differences among CCA, MLR, and PLS are revealed in their invariance properties. CCA is invariant under nonsingular linear transformation of x and y, MLR is invariant under nonsingular linear transformation of y but only unitary transformation of x, and PLS is invariant under unitary transformation of x and y. Correlation coefficients then share the invariance properties of the correlation analysis technique on which they are based.
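The CCA invariance can be checked numerically (a sketch with our own data model, using the standard computation of canonical correlations as singular values of the coherence matrix): applying arbitrary nonsingular transformations to x and y leaves the canonical correlations unchanged.

```python
import numpy as np

rng = np.random.default_rng(7)
N, p, q = 50_000, 3, 3

# Correlated zero-mean Gaussian data (construction is illustrative)
z = rng.standard_normal((N, p))
x = z + 0.5 * rng.standard_normal((N, p))
y = z + 0.5 * rng.standard_normal((N, q))

def canonical_correlations(x, y):
    # Singular values of Rxx^{-1/2} Rxy Ryy^{-1/2} (data assumed zero-mean)
    Rxx, Ryy, Rxy = x.T @ x / len(x), y.T @ y / len(y), x.T @ y / len(x)
    def inv_sqrt(R):
        w, V = np.linalg.eigh(R)
        return V @ np.diag(w ** -0.5) @ V.T
    return np.linalg.svd(inv_sqrt(Rxx) @ Rxy @ inv_sqrt(Ryy),
                         compute_uv=False)

k = canonical_correlations(x, y)   # ~ [0.8, 0.8, 0.8] for this model
# Invariance: nonsingular linear transformations leave the k_i unchanged
A = rng.standard_normal((p, p))
B = rng.standard_normal((q, q))
k2 = canonical_correlations(x @ A, y @ B)
assert np.allclose(k, k2, atol=1e-6)
```

Running the same experiment for MLR or PLS coefficients would show the invariance breaking when A (or both A and B) is not unitary, which is precisely the distinction drawn above.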
Analyzing multivariate association of complex data is further complicated by the fact that there are different types of correlation. Two scalar complex random variables x and y are called rotationally dependent if x = ky for some complex constant k.
Wide-sense stationary (WSS) processes admit a spectral representation (see Result 8.1) in terms of the Fourier basis, which allows a frequency interpretation. The transform-domain description of a WSS signal x(t) is a spectral process ξ(f) with orthogonal increments dξ(f). For nonstationary signals, we have to sacrifice either the Fourier basis, and thus its frequency interpretation, or the orthogonality of the transform-domain representation. We will discuss both possibilities.
The Karhunen–Loève (KL) expansion uses an orthonormal basis other than the Fourier basis but retains the orthogonality of the transform-domain description. The KL expansion is applied to a continuous-time signal of finite duration, which means that its transform-domain description is a countably infinite number of orthogonal random coefficients. This is analogous to the Fourier series, which produces a countably infinite number of Fourier coefficients, as opposed to the Fourier transform, which is applied to an infinite-duration continuous-time signal. The KL expansion presented in Section 9.1 takes into account the complementary covariance of an improper signal. It can be considered the continuous-time equivalent of the eigenvalue decomposition of improper random vectors discussed in Section 3.1.
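A discrete-time sketch of the KL idea, assuming a real-valued signal for simplicity (the improper complex case would use the augmented covariance as in the eigenvalue decomposition of Section 3.1): the eigenvectors of the covariance matrix form the KL basis, and the transform-domain coefficients are uncorrelated with variances given by the eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(8)
n, N = 16, 100_000

# A nonstationary real test signal: accumulated white noise with a
# time-varying gain, observed on n samples (construction is illustrative)
g = np.linspace(0.5, 2.0, n)
w = rng.standard_normal((N, n))
x = g * np.cumsum(w, axis=1) / np.sqrt(np.arange(1, n + 1))

R = x.T @ x / N             # sample covariance
lam, U = np.linalg.eigh(R)  # KL basis = eigenvectors of R
c = x @ U                   # KL coefficients

# Coefficients are uncorrelated, with variances equal to the eigenvalues
Rc = c.T @ c / N
assert np.allclose(Rc, np.diag(lam))
```

Unlike the Fourier basis, the KL basis depends on the covariance of the signal, which is the price paid for keeping orthogonal transform-domain coefficients.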
An alternative approach is the Cramér–Loève (CL) spectral representation, which retains the Fourier basis and its frequency interpretation but sacrifices the orthogonality of the increments dξ(f). As discussed in Section 9.2, the increments dξ(f) of the spectral process of an improper signal can have nonzero Hermitian correlation and complementary correlation between different frequencies.
In statistical signal processing, we often deal with a real nonnegative cost function, such as a likelihood function or a quadratic form, which is then either analytically or numerically optimized with respect to a vector or matrix of parameters. This involves taking derivatives with respect to vectors or matrices, leading to gradient vectors and Jacobian and Hessian matrices. What happens when the parameters are complex-valued? That is, how do we differentiate a real-valued function with respect to a complex argument?
What makes this situation confusing is that classical complex analysis tells us that a complex function is differentiable on its entire domain if and only if it is holomorphic (which is a synonym for complex analytic). A holomorphic function with nonzero derivative is conformal because it preserves angles (including their orientations) and the shapes of infinitesimally small figures (but not necessarily their size) in the complex plane. Since nonconstant real-valued functions defined on the complex domain cannot be holomorphic, their classical complex derivatives do not exist.
We can, of course, regard a function f defined on ℂn as a function defined on ℝ2n. If f is differentiable on ℝ2n, it is said to be real-differentiable, and if f is differentiable on ℂn, it is complex-differentiable. A function is complex-differentiable if and only if it is real-differentiable and the Cauchy–Riemann equations hold. Is there a way to define generalized complex derivatives for functions that are real-differentiable but not complex-differentiable?
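Yes: the Wirtinger (generalized complex) derivatives ∂/∂z = (∂/∂x − i ∂/∂y)/2 and ∂/∂z* = (∂/∂x + i ∂/∂y)/2 exist for any real-differentiable function. A numerical check (our example) for f(z) = |z|² = z z*, which is real-differentiable but not holomorphic, where ∂f/∂z = z* and ∂f/∂z* = z:

```python
import numpy as np

def wirtinger(f, z, h=1e-6):
    # Central differences for the partials wrt real and imaginary parts
    dfdx = (f(z + h) - f(z - h)) / (2 * h)
    dfdy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)
    ddz = 0.5 * (dfdx - 1j * dfdy)   # d/dz  = (d/dx - i d/dy)/2
    ddzc = 0.5 * (dfdx + 1j * dfdy)  # d/dz* = (d/dx + i d/dy)/2
    return ddz, ddzc

f = lambda z: np.abs(z) ** 2
z0 = 1.0 + 2.0j
ddz, ddzc = wirtinger(f, z0)
print(ddz)   # ~ conj(z0) = 1 - 2j
print(ddzc)  # ~ z0       = 1 + 2j
```

For a holomorphic function the second derivative, ∂f/∂z*, vanishes identically, which is a compact restatement of the Cauchy–Riemann equations.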
Cyclostationary processes are an important class of nonstationary processes that have periodically varying correlation properties. They can model periodic phenomena occurring in science and technology, including communications (modulation, sampling, and multiplexing), meteorology, oceanography, climatology, astronomy (rotation of the Earth and other planets), and economics (seasonality). While cyclostationarity can manifest itself in statistics of arbitrary order, we will restrict our attention to phenomena in which the second-order correlation and complementary correlation functions are periodic in their global time variable.
Our program for this chapter is as follows. In Section 10.1, we discuss the spectral properties of harmonizable cyclostationary processes. We have seen in Chapter 8 that the second-order averages of a WSS process are characterized by the power spectral density (PSD) and complementary power spectral density (C-PSD). These each correspond to a single δ-ridge (the stationary manifold) in the spectral correlation and complementary spectral correlation. Cyclostationary processes have a (possibly countably infinite) number of so-called cyclic PSDs and C-PSDs. These correspond to δ-ridges in the spectral correlation and complementary spectral correlation that are parallel to the stationary manifold. In Section 10.2, we derive the cyclic PSDs and C-PSDs of linearly modulated digital communication signals. We will see that there are two types of cyclostationarity: one related to the symbol rate, the other to impropriety and carrier modulation.
Because cyclostationary processes are spectrally correlated between different frequencies, they have spectral redundancy. This redundancy can be exploited in optimum estimation.
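A sketch of symbol-rate cyclostationarity (the pulse shape and all parameters are our choices): for a linearly modulated BPSK signal, the cyclic correlation is nonzero at cycle frequency 1/sps (the symbol rate) but vanishes at frequencies that are not cycle frequencies.

```python
import numpy as np

rng = np.random.default_rng(9)
K, sps = 100_000, 4                  # symbols, samples per symbol
p = np.array([0.5, 1.0, 1.0, 0.5])   # hypothetical transmit pulse

a = rng.choice([-1.0, 1.0], K)       # BPSK symbols
x = (a[:, None] * p).ravel()         # linearly modulated, symbol rate 1/sps

def cyclic_corr(x, alpha, lag=0):
    # Cyclic autocorrelation at cycle frequency alpha and given lag
    t = np.arange(len(x) - lag)
    return np.mean(x[t + lag] * x[t] * np.exp(-2j * np.pi * alpha * t))

print(abs(cyclic_corr(x, 0.0)))      # ~ 0.625: time-averaged power
print(abs(cyclic_corr(x, 1 / sps)))  # ~ 0.265: nonzero -> cyclostationary
print(abs(cyclic_corr(x, 0.1)))      # ~ 0: not a cycle frequency
```

The nonzero value at 1/sps is the spectral redundancy alluded to above; an optimum estimator can combine spectral components separated by the cycle frequencies.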
One of the most important applications of probability in science and engineering is to the theory of statistical inference, wherein the problem is to draw defensible conclusions from experimental evidence. The three main branches of statistical inference are parameter estimation, hypothesis testing, and time-series analysis. Or, as we say in the engineering sciences, the three main branches of statistical signal processing are estimation, detection, and signal analysis.
A common problem is to estimate the value of a parameter, or vector of parameters, from a sequence of measurements. The underlying probability law that governs the generation of the measurements depends on the parameter. Engineering language would say that a source of information, loosely speaking, generates a signal x and a channel carries this information in a measurement y, whose probability law p(y∣x) depends on the signal. There is usually little controversy over this aspect of the problem because the measurement scheme generally determines the probability law. There is, however, a philosophical divide about the modeling of the signal x. Frequentists adopt the point of view that to assign a probability law to the signal assumes too much. They argue that the signal should be treated as an unknown constant and the data should be allowed to speak for itself. Bayesians argue that the signal should be treated as a random variable whose prior probability distribution is to be updated to a posterior distribution as measurements are made.
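The two viewpoints can be contrasted on the simplest estimation problem (our example, with an assumed Gaussian prior): estimating a Gaussian mean, the frequentist ML estimate is the sample mean, while the Bayesian posterior mean shrinks it toward the prior mean.

```python
import numpy as np

rng = np.random.default_rng(10)
N, sigma2 = 10, 1.0    # few measurements, known noise variance
mu0, tau2 = 0.0, 1.0   # assumed Gaussian prior x ~ N(mu0, tau2)
x_true = 1.5

y = x_true + np.sqrt(sigma2) * rng.standard_normal(N)

# Frequentist (ML) estimate: x treated as an unknown constant
x_ml = y.mean()

# Bayesian posterior for x given y is Gaussian; its mean weights the
# prior and the data by their precisions
post_var = 1.0 / (1.0 / tau2 + N / sigma2)
x_map = post_var * (mu0 / tau2 + y.sum() / sigma2)
# Here x_map = x_ml * N / (N + 1): shrinkage toward mu0
```

As N grows, the shrinkage factor N/(N + 1) approaches one and the two estimates coincide: with enough data, the prior ceases to matter.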