During the last three decades Digital Signal Processing (DSP) has evolved into a core area of study in electrical and computer engineering. Today, DSP provides the methodology and algorithms for the solution of a continuously growing number of practical problems in scientific, engineering, and multimedia applications.
Despite the existence of a number of excellent textbooks focusing either on the theory of DSP or on the application of DSP algorithms using interactive software packages, we feel there is a strong need for a book that bridges the two approaches by combining the best of both worlds. This was our motivation for writing this book: to help students and practicing engineers understand the fundamental mathematical principles underlying the operation of a DSP method, appreciate its practical limitations, and grasp its practical implementation in sufficient detail.
Objectives
The principal objective of this book is to provide a systematic introduction to the basic concepts and methodologies of digital signal processing, based whenever possible on fundamental principles. A secondary objective is to develop a foundation that can be used by students, researchers, and practicing engineers as the basis for further study and research in this field. To achieve these objectives, we have focused on fundamental material whose scope of application is not limited to the solution of specialized problems, that is, material with broad applicability.
In this chapter we introduce the concept of Fourier or frequency-domain representation of signals. The basic idea is that any signal can be described as a sum or integral of sinusoidal signals. However, the exact form of the representation depends on whether the signal is continuous-time or discrete-time and whether it is periodic or aperiodic. The underlying mathematical framework is provided by the theory of Fourier series, introduced by Jean Baptiste Joseph Fourier (1768–1830).
The major justification for the frequency-domain approach is that LTI systems have a simple behavior with sinusoidal inputs: the response of an LTI system to a sinusoid is a sinusoid with the same frequency but different amplitude and phase.
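This sinusoidal steady-state behavior is easy to verify numerically. The Python sketch below (the three-tap filter and the frequency are illustrative choices, not examples from the text) passes a discrete-time cosine through an LTI filter and compares the output with the amplitude and phase predicted by the frequency response.

```python
import numpy as np

# An LTI system: a 3-tap lowpass FIR filter (illustrative coefficients).
h = np.array([0.25, 0.5, 0.25])

# Discrete-time sinusoid at normalized frequency omega (radians per sample).
omega = 0.2 * np.pi
n = np.arange(200)
x = np.cos(omega * n)

# Output of the LTI system (convolution), truncated to the input length.
y = np.convolve(x, h)[: len(n)]

# The frequency response H(e^{j omega}) predicts the gain and phase shift.
H = np.sum(h * np.exp(-1j * omega * np.arange(len(h))))
gain, phase = np.abs(H), np.angle(H)

# After the transient (first len(h) - 1 samples), the output is a sinusoid
# with the same frequency but scaled amplitude and shifted phase.
y_pred = gain * np.cos(omega * n + phase)
print(np.max(np.abs(y[3:] - y_pred[3:])))  # essentially zero
```

After the short transient, the output coincides with the predicted sinusoid, which is the eigenfunction property that makes the frequency domain so useful.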
Study objectives
After studying this chapter you should be able to:
Understand the fundamental differences between continuous-time and discrete-time sinusoidal signals.
Evaluate analytically the Fourier representation of continuous-time signals using the Fourier series (periodic signals) and the Fourier transform (aperiodic signals).
Evaluate analytically and numerically the Fourier representation of discrete-time signals using the Fourier series (periodic signals) and the Fourier transform (aperiodic signals).
Choose the proper mathematical formulas to determine the Fourier representation of any signal based on whether the signal is continuous-time or discrete-time and whether it is periodic or aperiodic.
Understand the use and implications of the various properties of the discrete-time Fourier transform.
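The numerical evaluation mentioned in these objectives can be previewed with a small Python sketch (an illustrative example using NumPy's FFT, not code from the text): for a periodic discrete-time signal, the Fourier series coefficients can be computed from a single period.

```python
import numpy as np

# One period of a periodic discrete-time signal: x[n] = cos(2*pi*n/N), N = 8.
N = 8
n = np.arange(N)
x = np.cos(2 * np.pi * n / N)

# Discrete-time Fourier series coefficients,
# c_k = (1/N) * sum_n x[n] * exp(-j*2*pi*k*n/N), computed via the FFT.
c = np.fft.fft(x) / N

# A real cosine at frequency 2*pi/N has exactly two nonzero coefficients,
# c_1 = c_{N-1} = 1/2; all other coefficients are numerically zero.
print(np.round(np.abs(c), 6))
```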
Signal processing is a discipline concerned with the acquisition, representation, manipulation, and transformation of signals required in a wide range of practical applications. In this chapter, we introduce the concepts of signals, systems, and signal processing. We first discuss different classes of signals, based on their mathematical and physical representations. Then, we focus on continuous-time and discrete-time signals and the systems required for their processing: continuous-time systems, discrete-time systems, and interface systems between these classes of signal. We continue with a discussion of analog signal processing, digital signal processing, and a brief outline of the book.
Study objectives
After studying this chapter you should be able to:
Understand the concept of signal and explain the differences between continuous-time, discrete-time, and digital signals.
Explain how the physical representation of signals influences their mathematical representation and vice versa.
Explain the concepts of continuous-time and discrete-time systems and justify the need for interface systems between the analog and digital worlds.
Recognize the differences between analog and digital signal processing and explain the key advantages of digital over analog processing.
Signals
For our purposes a signal is defined as any physical quantity that varies as a function of time, space, or any other variable or variables. Signals convey information in their patterns of variation. The manipulation of this information involves the acquisition, storage, transmission, and transformation of signals.
There are many signals that could be used as examples in this section. However, we shall restrict our attention to a few signals that illustrate several important concepts and will be useful in later chapters.
A key feature of the discrete-time systems discussed so far is that the signals at the input, output, and every internal node have the same sampling rate. However, there are many practical applications that either require or can be implemented more efficiently by processing signals at different sampling rates. Discrete-time systems with different sampling rates at various parts of the system are called multirate systems. The practical implementation of multirate systems requires changing the sampling rate of a signal using discrete-time operations, that is, without reconstructing and resampling a continuous-time signal. The fundamental operations for changing the sampling rate are decimation and interpolation. The subject of this chapter is the analysis, design, and efficient implementation of decimation and interpolation systems, and their application to two important areas of multirate signal processing: sampling rate conversion and multirate filter banks.
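The two fundamental operations can be sketched in a few lines of Python (an illustrative sketch, not the book's own code; the linear-interpolation filter is a deliberately simple choice of lowpass filter): decimation keeps every Mth sample, and interpolation inserts zeros and then lowpass filters to remove the spectral images.

```python
import numpy as np

def decimate(x, M):
    # Keep every Mth sample. In general a lowpass filter must precede this
    # step; it is omitted here because the test signal is already bandlimited.
    return x[::M]

def interpolate(x, L):
    # Insert L - 1 zeros between samples, then lowpass filter to remove the
    # spectral images. A simple linear interpolator serves as the filter.
    expanded = np.zeros(len(x) * L)
    expanded[::L] = x
    h = np.concatenate([np.arange(1, L + 1), np.arange(L - 1, 0, -1)]) / L
    return np.convolve(expanded, h)[L - 1 : L - 1 + len(expanded)]

n = np.arange(32)
x = np.cos(2 * np.pi * n / 16)  # slowly varying, safe to decimate by 2
y = decimate(x, 2)              # sampling rate halved
z = interpolate(y, 2)           # back to (approximately) the original rate
print(len(x), len(y), len(z))   # 32 16 32
```

Both operations act entirely on discrete-time sequences; no continuous-time signal is ever reconstructed or resampled.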
Study objectives
After studying this chapter you should be able to:
Understand the operations of decimation, interpolation, and arbitrary sampling rate change in the time and frequency domains.
Understand the efficient implementation of discrete-time systems for sampling rate conversion using polyphase structures.
Design a special type of filter, the Nyquist filter, which is widely used for the efficient implementation of multirate filters and filter banks.
Understand the operation, properties, and design of two-channel filter banks with perfect reconstruction analysis and synthesis capabilities.
Sampling rate conversion
The need for sampling rate conversion arises in many practical applications, including digital audio, communication systems, image processing, and high-definition television.
In this chapter we discuss the basic concepts and the mathematical tools that form the basis for the representation and analysis of discrete-time signals and systems. We start by showing how to generate, manipulate, plot, and analyze basic signals and systems using Matlab. Then we discuss the key properties of causality, stability, linearity, and time-invariance, which are possessed by the majority of systems considered in this book. We continue with the mathematical representation, properties, and implementation of linear time-invariant systems. The principal goal is to understand the interaction between signals and systems to the extent that we can adequately predict the effect of a system upon the input signal. This is extremely difficult, if not impossible, for arbitrary systems. Thus, we focus on linear time-invariant systems because they are amenable to a tractable mathematical analysis and have important signal processing applications.
Study objectives
After studying this chapter you should be able to:
Describe discrete-time signals mathematically and generate, manipulate, and plot discrete-time signals using Matlab.
Check whether a discrete-time system is linear, time-invariant, causal, and stable; show that the input-output relationship of any linear time-invariant system can be expressed in terms of the convolution sum formula.
Determine analytically the convolution for sequences defined by simple formulas, write computer programs for the numerical computation of convolution, and understand the differences between stream and block processing.
Determine numerically the response of discrete-time systems described by linear constant-coefficient difference equations.
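As a preview of the convolution sum mentioned in these objectives, the following Python sketch (illustrative; the book's own examples use Matlab) evaluates y[n] = Σ_k x[k] h[n − k] directly and checks the result against a library routine.

```python
import numpy as np

def convolve_direct(x, h):
    # Direct evaluation of the convolution sum y[n] = sum_k x[k] * h[n - k].
    y = np.zeros(len(x) + len(h) - 1)
    for n in range(len(y)):
        for k in range(len(x)):
            if 0 <= n - k < len(h):
                y[n] += x[k] * h[n - k]
    return y

x = np.array([1.0, 2.0, 3.0])
h = np.array([1.0, 1.0, 1.0])    # impulse response of a 3-point running sum
print(convolve_direct(x, h))     # [1. 3. 6. 5. 3.]
print(np.convolve(x, h))         # same result from the library routine
```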
Independent scalar quantization of a sequence of samples from a source is especially inefficient when there is dependence or memory among the samples. Even if there is no memory, where the samples are truly statistically independent, scalar quantization of each sample independently, although practical, is a suboptimal method. When neighboring samples provide information about a sample to be quantized, one can make use of this information to reduce the rate needed to represent a quantized sample. In this chapter, we shall describe various methods that exploit the memory of the source sequence in order to reduce the rate needed to represent the sequence with a given distortion or reduce the distortion needed to meet a given rate target. Such methods include predictive coding, vector coding, and tree- and trellis-based coding. This chapter presents detailed explanations of these methods.
Predictive coding
The first approach toward coding of sources with memory uses scalar quantization, because of its simplicity and effectiveness. We resort to a simple principle to motivate our approach. Consider the scenario in Figure 6.1, where a quantity u_n is subtracted from the source sample x_n at time n prior to scalar quantization (Q). The same quantity u_n is added to the quantized difference ẽ_n to yield the reconstructed output y_n.
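A minimal Python sketch of this loop follows, assuming the standard closed-loop choice u_n = y_{n−1} (the previous reconstruction, which the decoder can also form) and a uniform quantizer for Q; both are illustrative assumptions, not details fixed by Figure 6.1.

```python
import numpy as np

def quantize(e, step=0.5):
    # A uniform midtread scalar quantizer Q (illustrative choice).
    return step * np.round(e / step)

def predictive_code(x, step=0.5):
    # Closed-loop prediction: u_n is the previous reconstruction y_{n-1},
    # which the decoder can form as well, so both sides stay in step.
    y = np.zeros_like(x)
    u = 0.0                        # initial prediction
    for n in range(len(x)):
        e = x[n] - u               # prediction error e_n
        e_q = quantize(e, step)    # quantized difference (the ẽ_n of Fig. 6.1)
        y[n] = u + e_q             # reconstructed output y_n
        u = y[n]                   # prediction for the next sample
    return y

rng = np.random.default_rng(0)
x = np.cumsum(rng.normal(0.0, 0.2, 100))   # a correlated (random-walk) source
y = predictive_code(x)
print(np.max(np.abs(x - y)))               # bounded by step/2 = 0.25
```

Because x_n − y_n = e_n − Q(e_n), the reconstruction error never exceeds half a quantizer step, while the differences being quantized are much smaller than the source samples themselves.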
This book is an outgrowth of a graduate-level course taught for several years at Rensselaer Polytechnic Institute (RPI). When the course started in the early 1990s, there were only two textbooks available that taught signal compression, those by Jayant and Noll and by Gersho and Gray. Certainly these are excellent textbooks and valuable references, but they did not teach some material considered to be necessary at that time, so the textbooks were supplemented with handwritten notes where needed. Eventually, these notes grew to many pages, as the reliance on published textbooks diminished. The lecture notes remained the primary source even after the publication of the excellent book by Sayood, which served as a supplement and a source of some problems. While the Sayood book was up to date, well written, and authoritative, it was written to be accessible to undergraduate students, so it lacked the depth suitable for graduate students wanting to do research or practice in the field. The book at hand teaches the fundamental ideas of signal compression at a level that both graduate students and advanced undergraduate students can approach with confidence and understanding. The book is also intended to be a useful resource for the practicing engineer or computer scientist in the field. For that purpose, and also to aid understanding, the 40 algorithms listed under Algorithms in the Index are not only fully explained in the text but are also set out step-by-step in special algorithm format environments.
In this chapter, we introduce the concept that correlated sources need not be encoded jointly to achieve greater efficiency than encoding them independently. In fact, if they are encoded independently and decoded jointly, it is theoretically possible under certain conditions to achieve the same efficiency as when they are encoded jointly. Such a method for coding correlated sources is called distributed source coding (DSC). Figure 14.1 depicts the paradigm of DSC with independent encoding and joint decoding. In certain applications, such as sensor networks and mobile communications, circuit complexity and power drain are too burdensome to be tolerated at the transmission side. DSC shifts complexity and power consumption from the transmission side to the receiver side, where they can be more easily handled and tolerated. This chapter presents the conditions under which DSC is ideally efficient and discusses some practical schemes that attempt to realize rate savings in the DSC paradigm. There has been a plethora of recent work on this subject, so an encyclopedic account is impractical and ill-advised in a textbook. The goal here is to explain the principles clearly and elucidate them with a few examples.
Slepian–Wolf coding for lossless compression
Consider two correlated, discrete scalar sources X and Y. Theoretically, these sources can be encoded independently without loss using H(X) and H(Y) bits, respectively, where H(X) and H(Y) are the entropies of these sources. However, if encoded jointly, both these sources can be reconstructed perfectly using only H(X, Y) bits, the joint entropy of these sources.
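These entropies are easy to compute for a small example. The joint distribution below is an illustrative choice, not one from the text; it shows that correlation makes H(X, Y) strictly smaller than H(X) + H(Y).

```python
import numpy as np

# Joint pmf of two correlated binary sources X and Y (illustrative numbers):
# the sources agree with probability 0.8.
p_xy = np.array([[0.4, 0.1],
                 [0.1, 0.4]])

def entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

Hx = entropy(p_xy.sum(axis=1))   # marginal entropy H(X) = 1 bit
Hy = entropy(p_xy.sum(axis=0))   # marginal entropy H(Y) = 1 bit
Hxy = entropy(p_xy.ravel())      # joint entropy H(X, Y), about 1.722 bits

# Joint encoding needs only H(X, Y) bits, less than the H(X) + H(Y) = 2 bits
# required when the sources are encoded and decoded independently.
print(Hx, Hy, Hxy)
```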
In the previous chapter, Chapter 3, we presented the theory of lossless coding and derived properties for optimality of uniquely decodable, prefix-free source codes. In particular, we showed that entropy is the absolute lower rate limit of a prefix-free code and presented tree and arithmetic structures that support prefix-free codes. In this chapter, we shall present coding methods that utilize these structures and whose rates approach the entropy limit. These methods are given the generic name of entropy coding. Huffman coding is one common form of entropy coding; another is arithmetic coding, whose adaptive, context-based enhancements are part of several standard methods of data compression. Nowadays, lossless codes, whether close to optimal or not, are often called entropy codes. In addition to Huffman and arithmetic coding, we shall develop other important lossless coding methods, including run-length coding, Golomb coding, and Lempel–Ziv coding.
Huffman codes
The construction invented by Huffman [1] in 1952 yields the minimum-length prefix-free code for a source with a given set of probabilities. First, we shall motivate this construction. We consider only binary (D = 2) codes in this chapter, since extensions to the non-binary case are usually obvious from the binary one, and binary codes are predominant in practice and in the literature. We have learned that a prefix-free code can be constructed so that its average length is no more than 1 bit above the source entropy, which is the absolute lower limit.
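The merging procedure at the heart of the construction can be sketched compactly in Python (an illustrative implementation that tracks only the codewords; the source probabilities are made-up values):

```python
import heapq
import math

def huffman_code(probs):
    # Build the binary Huffman code by repeatedly merging the two least
    # probable nodes; each merge prepends one bit to every codeword in the
    # two merged groups. Heap entries: (probability, tiebreak id, symbols).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    codes = [""] * len(probs)
    tie = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)   # least probable node
        p2, _, s2 = heapq.heappop(heap)   # second least probable node
        for s in s1:
            codes[s] = "0" + codes[s]
        for s in s2:
            codes[s] = "1" + codes[s]
        heapq.heappush(heap, (p1 + p2, tie, s1 + s2))
        tie += 1
    return codes

probs = [0.4, 0.2, 0.2, 0.1, 0.1]        # made-up source probabilities
codes = huffman_code(probs)
avg = sum(p * len(c) for p, c in zip(probs, codes))
H = -sum(p * math.log2(p) for p in probs)
print(codes)                 # a prefix-free code
print(avg, H)                # average length is within 1 bit of the entropy
```

For these probabilities the average length is 2.2 bits against an entropy of about 2.12 bits, illustrating the 1-bit bound stated above.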
Compression of a digital signal source is just its representation with fewer information bits than its original representation. We exclude from compression cases when the source is trivially over-represented, such as an image with gray levels 0 to 255 written with 16 bits each when 8 bits are sufficient. The mathematical foundation of the discipline of signal compression, or what is more formally called source coding, began with the seminal paper of Claude Shannon [1, 2], entitled “A mathematical theory of communication,” which established what is now called Information Theory. This theory sets the ultimate limits on achievable compression performance. Compression is theoretically and practically realizable even when the reconstruction of the source from the compressed representation is identical to the original. We call this kind of compression lossless coding. When the reconstruction is not identical to the source, we call it lossy coding. Shannon also introduced the discipline of Rate-distortion Theory [1–3], in which he derived the fundamental limits in performance of lossy coding and proved that they are achievable. Lossy coding results in loss of information and hence distortion, but this distortion can be made tolerable for the given application, and the loss is often necessary and unavoidable in order to satisfy transmission bandwidth and storage constraints. The payoff is that the degree of compression is often far greater than that achievable by lossless coding.
In many circumstances, data are collected that must be preserved perfectly. Data that are especially expensive to collect, require substantial computation to analyze, or involve legal liability consequences for imprecise representation should be stored and retrieved without any loss of accuracy. Medical data, such as images acquired from X-ray, CT (computed tomography), and MRI (magnetic resonance imaging) machines, are the most common examples where perfect representation is required in almost all circumstances, regardless of whether it is really necessary to preserve the integrity of the diagnostic task. The inaccuracies resulting from the acquisition and digitization processes are ignored in this requirement of perfection. It is only in the subsequent compression that the digitized data must be perfectly preserved. Physicists and materials scientists conduct experiments that produce data written as long streams or large arrays of samples in floating point format. These experiments are very expensive to set up, so there is often insistence that, if compressed, the decompressed data must be identical to the original.
Nowadays, storage and transmission systems are overwhelmed with huge quantities of data. Although storage technology has made enormous strides in increasing density and reducing cost, it seems that whatever progress is made is not enough. The users and producers of data continue to adapt to these advances almost instantaneously and fuel demand for even more storage at less cost. Even when huge quantities of data can be accommodated, retrieval and transmission delays remain serious issues.
It is easy to recognize the importance of data compression technology by observing the way it already pervades our daily lives. For instance, we currently have more than a billion users [1] of digital cameras that employ JPEG image compression, and a comparable number of users of portable audio players that use compression formats such as MP3, AAC, and WMA. Users of video cameras, DVD players, and digital cable or satellite TV hear about MPEG-2, MPEG-4, and H.264/AVC. In each case, the acronym is used to identify the type of compression. While many people do not know exactly what compression means or how it works, they have to learn some basic facts about it in order to properly use their devices, or to make purchase decisions.
Compression's usefulness is not limited to multimedia. An increasingly important fraction of the world's economy is in the transmission, storage, and processing of all types of digital information. As Negroponte [2] succinctly put it, economic value is indeed moving “from atoms to bits.” While it is true that many constraints from the physical world do not affect this “digital economy,” we cannot forget that, due to the huge volumes of data, there has to be a large physical infrastructure for data transmission, processing, and storage. Thus, just as in the traditional economy it is very important to consider the efficiency of transportation, space, and material usage, the efficiency in the representation of digital information also has great economic importance.
Due to advances in metrology and storage density, the acquisition of images is now increasingly practiced in more than two dimensions, at ever finer resolutions, and in numerous scientific fields. Scientific data are being acquired, stored, and analyzed as images. Volume or three-dimensional images are generated in clinical medicine by CT or MRI scans of regions of the human body. Such an image can be regarded as a sequence of two-dimensional slice images of a bodily section. Tomographic methods in electron microscopy, for example, produce images in slices through the material being surveyed. The material can be a biological specimen or a materials microstructure. In remote sensing, a surface is illuminated with broad-spectrum radiation and the reflectance spectrum of each point on the surface is measured and recorded. In this way, a reflectance “image” of the surface is generated for a number of spectral bands. One uses the term multi-spectral imaging when the number of bands is relatively small, say less than 20, and the term hyper-spectral imaging when the number of bands is larger, usually hundreds. In either case, one can view the data either as a sequence of images or as a single image of vector pixels. One particular kind of multi-spectral image is a color image, where the spectrum is in the range of visible wavelengths. Because of the properties of the human visual system, only three particular color images are needed to generate a visual response in the human viewer.
In normal circumstances, lossless compression reduces file sizes by a factor of about 2, sometimes a little more and sometimes a little less. Often it is acceptable, and even necessary, to tolerate some loss or distortion between the original and its reproduction. In such cases, much greater compression becomes possible. For example, the highest-quality JPEG-compressed images and MP3 audio are compressed about 6 or 7 to 1. The objective is to minimize the distortion, as measured by some criterion, for a given rate in bits per sample or, equivalently, to minimize the rate for a given level of distortion.
In this chapter, we make a modest start toward understanding how to compress realistic sources by presenting the theory and practice of quantization and coding of sources of independent and identically distributed random variables. Later in the chapter, we shall explain some aspects of optimal lossy compression, so that we can assess how well our methods perform compared to what is theoretically possible.
Quantization
The sources of data that we recognize as digital are discrete in value or amplitude and these values are represented by a finite number of bits. The set of these discrete values is a reduction from a much larger set of possible values, because of the limitations of our computers and systems in precision, storage, and transmission speed. We therefore accept the general model of our data source as continuous in value. The discretization process is called quantization.
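The simplest instance of quantization is the uniform scalar quantizer, sketched below in Python (the range, bit depth, and test source are illustrative choices):

```python
import numpy as np

def uniform_quantize(x, B, xmin=-1.0, xmax=1.0):
    # B-bit uniform scalar quantizer over [xmin, xmax): 2**B cells of equal
    # width; each sample is mapped to the midpoint of its cell.
    levels = 2 ** B
    step = (xmax - xmin) / levels
    idx = np.clip(np.floor((x - xmin) / step), 0, levels - 1).astype(int)
    return xmin + (idx + 0.5) * step

rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, 10000)   # continuous-amplitude source
xq = uniform_quantize(x, B=4)       # only 16 distinct output values

# For a uniform source, the mean squared error is close to step**2 / 12.
step = 2.0 / 16
mse = np.mean((x - xq) ** 2)
print(mse, step ** 2 / 12)
```

The continuum of source values is thus reduced to a finite set of 2^B levels, each representable with B bits, at the cost of a small, controllable distortion.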