This chapter provides an overview of matrices. Basic matrix operations, such as addition, multiplication, and transposition, are introduced first. Determinants and matrix inverses are then defined. The rank and Kruskal rank of matrices are defined and explained, and the connection between rank, determinant, and invertibility is elaborated. Eigenvalues and eigenvectors are then reviewed. Many equivalent characterizations of singularity (non-invertibility) of matrices are summarized. Unitary matrices are reviewed. Finally, linear equations are discussed: the conditions under which a solution exists and the condition for the solution to be unique are explained and demonstrated with examples.
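As a minimal illustration of the connection between rank, determinant, invertibility, and solvability of linear equations, the following NumPy sketch can be used; the matrix and right-hand side below are invented here for illustration and are not taken from the chapter.

```python
import numpy as np

# Hypothetical 3x3 example (chosen here, not from the chapter)
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])

rank = np.linalg.matrix_rank(A)   # full rank (3) for this matrix
det = np.linalg.det(A)            # nonzero determinant <=> A is invertible
x = np.linalg.solve(A, b)         # unique solution because A is invertible

print(rank, det, np.allclose(A @ x, b))
```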
This chapter discusses the Fourier series representation for continuous-time signals. This is applicable to signals which are either periodic or have a finite duration. The connections between the continuous-time Fourier transform (CTFT), the discrete-time Fourier transform (DTFT), and Fourier series are also explained. Properties of Fourier series are discussed and many examples presented. For real-valued signals it is shown that the Fourier series can be written as a sum of a cosine series and a sine series; examples include rectified cosines, which have applications in electric power supplies. It is shown that the basis functions used in the Fourier series representation satisfy an orthogonality property. This makes the truncated version of the Fourier representation optimal in a certain sense. The so-called principal component approximation derived from the Fourier series is also discussed. A detailed discussion of the properties of musical signals in the light of Fourier series theory is presented, and leads to a discussion of musical scales, consonance, and dissonance. Also explained is the connection between Fourier series and the function-approximation property of multilayer neural networks, used widely in machine learning. An overview of wavelet representations and the contrast with Fourier series representations is also given.
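The truncated Fourier series mentioned above can be sketched numerically. In the example below (the signal, its period, and the number of harmonics are choices made here, not taken from the text), a full-wave rectified cosine is approximated by a finite sum of harmonics with coefficients computed by numerical integration.

```python
import numpy as np

# Truncated Fourier series of |cos(t)|, which has period T = pi
T = np.pi
w0 = 2 * np.pi / T
t = np.linspace(0, T, 10000, endpoint=False)
x = np.abs(np.cos(t))

K = 5  # number of harmonics kept (an illustrative choice)
a0 = np.mean(x)
xhat = np.full_like(t, a0)
for k in range(1, K + 1):
    ak = 2 * np.mean(x * np.cos(k * w0 * t))  # cosine coefficients
    bk = 2 * np.mean(x * np.sin(k * w0 * t))  # sine coefficients (about 0 here; x is even)
    xhat += ak * np.cos(k * w0 * t) + bk * np.sin(k * w0 * t)

print(np.max(np.abs(x - xhat)))  # truncation error shrinks as K grows
```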
This chapter examines discrete-time LTI systems in detail. It shows that the input–output behavior of an LTI system is characterized by the so-called impulse response. The output is shown to be the so-called convolution of the input with the impulse response. It is then shown that exponentials are eigenfunctions of LTI systems. This property leads to the ideas of transfer functions and frequency responses for LTI systems. It is argued that the frequency response gives a systematic meaning to the term “filtering.” Image filtering is demonstrated with examples. The discrete-time Fourier transform (DTFT) is introduced to describe the frequency-domain behavior of LTI systems, and allows one to represent a signal as a superposition of single-frequency signals (the Fourier representation). The DTFT is discussed in detail, with many examples. The z-transform, which is of great importance in the study of LTI systems, is also introduced and its connection to the Fourier transform explained. Attention is also given to real signals and real filters, because of their additional properties in the frequency domain. Homogeneous time-invariant (HTI) systems are also introduced. Continuous-time counterparts of these topics are explained. B-splines, which arise as examples in continuous-time convolution, are presented.
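A minimal NumPy sketch of these ideas, with an impulse response and input invented here rather than drawn from the chapter, computes a convolution and samples the frequency response of a 3-point moving-average filter.

```python
import numpy as np

# Hypothetical FIR example: 3-point moving average
h = np.ones(3) / 3.0              # impulse response of the LTI system
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.convolve(x, h)             # output = convolution of input with h

# Frequency response H(e^{jw}) = sum_n h[n] e^{-jwn}, sampled on a grid
w = np.linspace(0, np.pi, 512)
H = h @ np.exp(-1j * np.outer(np.arange(len(h)), w))

print(y)
print(np.abs(H[0]), np.abs(H[-1]))  # lowpass: gain 1 at w = 0, reduced to 1/3 at w = pi
```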
This chapter discusses many interesting properties of bandlimited signals. The subspace of bandlimited signals is introduced. It is shown that uniformly shifted versions of an appropriately chosen sinc function constitute an orthogonal basis for this subspace. It is also shown that the integral and the energy of a bandlimited signal can be obtained exactly from samples if the sampling rate is high enough. For non-bandlimited functions, such a result is only approximately true, with the approximation getting better as the sampling rate increases. A number of less obvious consequences of these results are also presented. Thus, well-known mathematical identities are derived using sampling theory. For example, the Madhava–Leibniz formula for the approximation of π can be derived in this way. When samples of a bandlimited signal are contaminated with noise, the reconstructed signal is also noisy. This noise depends on the reconstruction filter, which in general is not unique. Excess bandwidth in this filter increases the noise, and this is quantitatively analyzed. An interesting connection between bandlimited signals and analytic functions (entire functions) is then presented. This has many implications, one being that bandlimited signals are infinitely smooth.
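The energy-from-samples property can be checked numerically. The sketch below (the signal and the sampling intervals are chosen here for illustration) uses x(t) = sinc(t), whose energy is 1, and evaluates T times the sum of squared samples at several sampling intervals at or above the Nyquist rate.

```python
import numpy as np

# For a bandlimited signal sampled fast enough, the energy integral equals
# T * sum_n |x(nT)|^2. Here x(t) = sinc(t), with energy exactly 1.
def x(t):
    return np.sinc(t)   # np.sinc(t) = sin(pi t) / (pi t)

for T in (1.0, 0.5, 0.25):                 # at and above the Nyquist rate
    n = np.arange(-5000, 5001)
    energy_from_samples = T * np.sum(x(n * T) ** 2)
    print(T, energy_from_samples)          # each is approximately 1, the true energy
```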
This chapter introduces the discrete Fourier transform (DFT), which is different from the discrete-time Fourier transform (DTFT) introduced earlier. The DFT transforms an N-point sequence x[n] in the time domain to an N-point sequence X[k] in the frequency domain by sampling the DTFT of x[n]. A matrix representation for this transformation is introduced, and the properties of the DFT matrix are studied. The fast Fourier transform (FFT), which is a fast algorithm to compute the DFT, is also introduced. The FFT makes the computation of the Fourier transforms of large sets of data practical. The digital signal processing revolution of the 1960s was possible because of the FFT. This chapter introduces the simplest form of FFT, called the radix-2 FFT, and a number of its properties. The chapter also introduces circular or cyclic convolution, which has a special place in DFT theory, and explains the connection to ordinary convolution. Circular convolution paves the way for fast algorithms for ordinary convolution, using the FFT. The chapter also summarizes the relationships between the four types of Fourier transform studied in this book: CTFT, DTFT, DFT, and Fourier series.
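The FFT-based route to circular convolution can be sketched as follows; the sequences are invented here and NumPy's FFT is used purely for illustration.

```python
import numpy as np

# Circular convolution of two N-point sequences equals the inverse DFT of the
# product of their DFTs; the FFT makes this fast for large N.
x = np.array([1.0, 2.0, 3.0, 4.0])
h = np.array([1.0, 0.0, -1.0, 0.5])
N = len(x)

# Direct circular convolution: y[n] = sum_m x[m] h[(n - m) mod N]
y_direct = np.array([sum(x[m] * h[(n - m) % N] for m in range(N)) for n in range(N)])

# FFT-based computation: DFT, pointwise multiply, inverse DFT
y_fft = np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(h)))

print(np.allclose(y_direct, y_fft))   # True
```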
In many applications, dimensionality reduction is important. Uses of dimensionality reduction include visualization, removing noise, and reducing computation and memory requirements, for example in image compression. This chapter focuses on low-rank approximation of a matrix. There are theoretical models for why large matrices should be approximately low rank. Low-rank approximations are also used to compress large neural network models to reduce computation and storage. The chapter begins with the classic approach to approximating a matrix by a low-rank matrix, using a nonconvex formulation that has a remarkably simple singular value decomposition solution. It then applies this approach to the source localization application via the multidimensional scaling method and to the photometric stereo application. It then turns to convex formulations of low-rank approximation based on proximal operators that involve singular value shrinkage. It discusses methods for choosing the rank of the approximation, and describes the optimal shrinkage method called OptShrink. It discusses related dimensionality reduction methods including (linear) autoencoders and principal component analysis. It applies the methods to learning low-dimensional subspaces from training data for subspace-based classification problems. Finally, it extends the method to streaming applications with time-varying data. This chapter bridges the classical singular value decomposition tool with modern applications in signal processing and machine learning.
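A minimal sketch of the classic truncated-SVD approach (the random matrix and target rank are chosen here, not taken from the chapter) also illustrates the Eckart-Young characterization of the approximation error.

```python
import numpy as np

# Rank-r approximation of a matrix via the truncated SVD
rng = np.random.default_rng(0)
A = rng.standard_normal((8, 6))

U, s, Vt = np.linalg.svd(A, full_matrices=False)
r = 2                                            # target rank (an illustrative choice)
A_r = U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]      # best rank-r approximation

# Eckart-Young: the Frobenius error equals the norm of the discarded singular values
print(np.linalg.norm(A - A_r, 'fro'), np.sqrt(np.sum(s[r:] ** 2)))
```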
An important operation in signal processing and machine learning is dimensionality reduction. There are many such methods, but the starting point is usually linear methods that map data to a lower-dimensional set called a subspace. When working with matrices, the notion of dimension is quantified by rank. This chapter reviews subspaces, span, dimension, rank, and nullspace. These linear algebra concepts are crucial to thoroughly understanding the SVD, a primary tool for the rest of the book (and beyond). The chapter concludes with a machine learning application, signal classification by nearest subspace, which builds on all the concepts of the chapter.
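A toy version of nearest-subspace classification might look like the sketch below, where the training data, feature dimension, and subspace dimension are all invented for illustration.

```python
import numpy as np

# Learn an r-dimensional subspace per class from training data, then assign a
# test vector to the class whose subspace is closest.
rng = np.random.default_rng(1)
r = 2
classes = {}
for label in (0, 1):
    X = rng.standard_normal((10, 50))         # 10-dim features, 50 training samples
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    classes[label] = U[:, :r]                 # orthonormal basis for the class subspace

y = rng.standard_normal(10)                   # a test vector
# Distance from y to a subspace with orthonormal basis Q is || y - Q Q^T y ||
dists = {label: np.linalg.norm(y - Q @ (Q.T @ y)) for label, Q in classes.items()}
print(min(dists, key=dists.get))              # predicted class label
```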
This chapter contains topics related to matrices with special structures that arise in many applications. It discusses companion matrices, a classic linear algebra topic. It constructs circulant matrices from a particular companion matrix and describes their signal processing applications. It discusses the closely related family of Toeplitz matrices. It describes the power iteration, which is used later in the chapter for Markov chains. It discusses nonnegative matrices and their relationships to graphs, leading to the analysis of Markov chains. The chapter ends with two applications: Google’s PageRank method and spectral clustering using graph Laplacians.
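A minimal sketch of the power iteration (the matrix below is invented here) estimates the dominant eigenvalue and eigenvector by repeated multiplication and normalization.

```python
import numpy as np

# Power iteration: repeated multiplication by A converges to the eigenvector
# associated with the dominant eigenvalue (for a suitable starting vector).
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
v = np.array([1.0, 0.0])
for _ in range(100):
    v = A @ v
    v = v / np.linalg.norm(v)      # normalize to avoid overflow

lam = v @ A @ v                    # Rayleigh quotient estimate of the eigenvalue
print(lam, np.linalg.eigvalsh(A))  # lam matches the largest eigenvalue
```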
Many of the preceding chapters involved optimization formulations: linear least squares, Procrustes, low-rank approximation, multidimensional scaling. All of these have analytical solutions, such as the pseudoinverse for minimum-norm least squares problems and the truncated singular value decomposition for low-rank approximation. But often we need iterative optimization algorithms, for example if no closed-form minimizer exists, or if the analytical solution requires too much computation and/or memory (e.g., the singular value decomposition for large problems). To solve an optimization problem via an iterative method, we start with some initial guess and then the algorithm produces a sequence of iterates that hopefully converges to a minimizer. This chapter describes the basics of gradient-based iterative optimization algorithms, including preconditioned gradient descent (PGD) for the linear LS problem. PGD uses a fixed step size, whereas preconditioned steepest descent uses a line search to determine the step size. The chapter then considers gradient descent and accelerated versions for general smooth convex functions. It applies gradient descent to the machine learning application of binary classification via logistic regression. Finally, it summarizes stochastic gradient descent.
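A bare-bones sketch of gradient descent with a fixed step size for the linear least-squares cost, with problem data generated randomly here for illustration, might look like this:

```python
import numpy as np

# Gradient descent for f(x) = (1/2) ||A x - b||^2 with fixed step size 1/L
rng = np.random.default_rng(2)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)

L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
x = np.zeros(5)
for _ in range(500):
    grad = A.T @ (A @ x - b)           # gradient of f at x
    x = x - grad / L                   # fixed step size 1/L

# Distance to the least-squares solution shrinks with each iteration
print(np.linalg.norm(x - np.linalg.lstsq(A, b, rcond=None)[0]))
```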
This chapter introduces matrix factorizations – somewhat like the reverse of matrix multiplication. It starts with the eigendecomposition of symmetric matrices, then generalizes to normal and asymmetric matrices. It introduces the basics of the singular value decomposition (SVD) of general matrices. It discusses a simple application of the SVD that uses the largest singular value of a matrix (the spectral norm), posed as an optimization problem, and then describes optimization problems related to eigenvalues and the smallest singular value. (The “real” SVD applications appear in subsequent chapters.) It discusses the special situations when one can relate the eigendecomposition and an SVD of a matrix, leading to the special class of positive (semi)definite matrices. Along the way there are quite a few small eigendecomposition and SVD examples.
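One of the relationships mentioned above, between the eigendecomposition and the SVD of a symmetric positive semidefinite matrix, can be checked with a short NumPy sketch (the matrix is generated here for illustration).

```python
import numpy as np

# For a symmetric positive semidefinite matrix, the eigenvalues and singular
# values coincide, and the spectral norm equals the largest singular value.
rng = np.random.default_rng(3)
B = rng.standard_normal((4, 4))
A = B.T @ B                                  # symmetric PSD by construction

eigvals = np.linalg.eigvalsh(A)              # ascending order
svals = np.linalg.svd(A, compute_uv=False)   # descending order

print(np.allclose(eigvals[::-1], svals))             # True
print(np.isclose(np.linalg.norm(A, 2), svals[0]))    # spectral norm = largest singular value
```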