This book is all about iterative channel decoding. Two other names which are often used to identify the same area are probabilistic coding and codes on graphs. Iterative decoding was originally conceived by Gallager in his remarkable Ph.D. thesis of 1960. Gallager's work was, evidently, far ahead of its time. Limitations in computational resources in the 1960s were such that the power of his approach could not be fully demonstrated, let alone developed. Consequently, iterative decoding attracted only passing interest and slipped into a long dormancy. It was rediscovered by Berrou, Glavieux, and Thitimajshima in 1993 in the form of turbo codes, and then independently in the mid 1990s by MacKay and Neal, Sipser and Spielman, as well as Luby, Mitzenmacher, Shokrollahi, Spielman, and Stemann in a form much closer to Gallager's original construction. Iterative techniques have subsequently had a strong impact on coding theory and practice and, more generally, on the whole of communications.
The title Modern Coding Theory is clearly a hyperbole. There have been several other important recent developments in coding theory. To mention one prominent example: Sudan's algorithm and the Guruswami-Sudan improvement for list decoding of Reed-Solomon codes, together with their extension to soft-decision decoding, have breathed new life into this otherwise mature subject. So what is our excuse? Iterative methods and their theory are strongly tied to advances in current computing technology and they are therefore inherently modern. They have also brought about a break with the past.
We now look at a select list of applications beyond the simple model of binary memoryless symmetric channels. We formalize each problem and point out how the system can be analyzed. Rather than discussing applications in their full generality, we limit ourselves to interesting special cases. In the same spirit, we do not present highly tuned solutions but explain how each system can be optimized. This keeps the exposition simple. The generalizations are quite routine. Since the Forney-style factor graph (FSFG) of a large system is the composition of the individual FSFGs of its components, it suffices for the most part to study those components in isolation. Real transmission scenarios typically involve combinations of the various components discussed in the following.
Each example introduces one new ingredient. A simple model of a fading channel is discussed in Section 5.1. We next discuss the prototypical asymmetric channel (the so-called Z channel) in Section 5.2. We then turn in Section 5.3 to an information-theoretic application of factor graphs – computing information rates of channels with memory. We also discuss how to code over channels with memory. In Section 5.4 we see how to construct systems with high spectral efficiency from simple binary ones. Very similar in spirit is the discussion on multiple-access channels in Section 5.5.
We now broaden our scope of channels from the binary erasure channel (BEC) to the class of binary memoryless symmetric (BMS) channels. Many concepts encountered during our study of the BEC still apply and reappear suitably generalized. It might at first seem that this expanded class of channels is still restricted and special and that we are only covering a small portion of the large volume of channels encountered in practice. Actually, however, a wide range of situations can be dealt with rather straightforwardly once we have mastered BMS channels. One should therefore view the following as part of the foundation upon which much of communications rests.
Sections 4.2–4.10 recapitulate Sections 3.5–3.6 and 3.8–3.14 in this more general setting; these sections form the core material. The remaining sections can be read in essentially any order. They contain either more advanced topics or less accessible material.
General BMS channels are more mathematically challenging than the BEC. Section 4.1 summarizes the necessary prerequisites. Our advice: quickly skim it so you know what material it contains but do not study it in detail. At any point later, when the need arises, you can return to fill in gaps.
The technology of communication and computing advanced at a breathtaking pace in the 20th century, especially in the second half. A significant part of this advance in communication began some 60 years ago when Shannon published his seminal paper “A Mathematical Theory of Communication.” In that paper Shannon framed and posed a fundamental question: how can we efficiently and reliably transmit information? Shannon also gave a basic answer: coding can do it. Since that time the problem of finding practical coding schemes that approach the fundamental limits established by Shannon has been at the heart of information theory and communications. Recently, significant advances have taken place that bring us close to answering this question. Perhaps, at least in a practical sense, the question has been answered. This book is about that answer.
The advance came with a fundamental paradigm shift in the area of coding that took place in the early 1990s. In Modern Coding Theory, codes are viewed as large complex systems described by random sparse graphical models, and encoding as well as decoding are accomplished by efficient local algorithms. The local interactions of the codebits are simple but the overall code is nevertheless complex (and so sufficiently powerful to allow reliable communication) because of the large number of interactions. The idea of random codes is in the spirit of Shannon's original formulation. What is new is the sparseness of the description and the local nature of the algorithms.
Density evolution plays a fundamental role in the analysis of iterative systems; it is also a valuable tool in the design of such systems. Actual computation of density evolution for low-density parity-check (LDPC) codes requires an algorithmic implementation. Finding an efficient such implementation is a challenging problem. In this section we show how this can be done.
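As a point of reference, on the binary erasure channel density evolution collapses to a scalar recursion on the erasure probability, which makes the idea easy to try out. The following is a minimal sketch for a regular (dv, dc) ensemble only, not the general algorithm for BMS channels that this section develops. The function names, the iteration cap, and the convergence tolerance are assumptions chosen for illustration.

```python
def de_step(x, eps, dv, dc):
    """One density-evolution step for a (dv, dc)-regular ensemble on the BEC.

    x is the erasure probability of a variable-to-check message and
    eps is the channel erasure probability.
    """
    return eps * (1.0 - (1.0 - x) ** (dc - 1)) ** (dv - 1)


def converges(eps, dv=3, dc=6, max_iters=5000, tol=1e-9):
    """Return True if the erasure probability is driven below tol."""
    x = eps
    for _ in range(max_iters):
        x = de_step(x, eps, dv, dc)
        if x < tol:
            return True
    return False


def bp_threshold(dv=3, dc=6, steps=30):
    """Bisect on eps for the largest value at which the recursion converges."""
    lo, hi = 0.0, 1.0
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        if converges(mid, dv, dc):
            lo = mid
        else:
            hi = mid
    return lo


if __name__ == "__main__":
    print(f"estimated threshold for the (3,6) ensemble: {bp_threshold():.4f}")
```

Because the iteration count is finite, the estimate can fall slightly below the true threshold; convergence slows sharply as eps approaches it.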
When implementing density evolution one may or may not assume symmetry of the densities. Working directly in the space of symmetric distributions yields the most efficient implementations. The interaction of the symmetry with the practical constraints of finite support typically leads to optimistic predictions of the threshold. On the other hand, sometimes one is interested specifically in consequences of non-symmetry. Moreover, by allowing densities to be non-symmetric one can compute density evolution for a message-passing decoder which corresponds to a quantized version of belief propagation. Since belief propagation is optimal, thresholds computed this way are lower bounds on the true belief propagation threshold. We will not assume strict symmetry in general, but we will assume that densities are “nearly” symmetric.
We use the notation * to denote standard convolution over ℝ, ℤ, or ℤ/Nℤ – the ring of integers modulo N. Variable-node domain convolution, which we have denoted by *, is standard convolution but we shall use ⊛ to emphasize computational aspects.
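To make the two discrete cases concrete, here is a minimal sketch of convolution over ℤ (linear) and over ℤ/Nℤ (cyclic) for finitely supported sequences. It is a direct quadratic-time version written for clarity, not the efficient implementation this section is concerned with. The function names and the list-of-floats representation are assumptions made for illustration.

```python
def linear_convolve(a, b):
    """Convolution over Z of two finitely supported sequences.

    a[i] and b[j] are the masses at integer positions i and j;
    the result has support 0 .. len(a) + len(b) - 2.
    """
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def cyclic_convolve(a, b, n):
    """Convolution over Z/nZ: positions are added modulo n."""
    out = [0.0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[(i + j) % n] += ai * bj
    return out


if __name__ == "__main__":
    p = [0.5, 0.5]
    q = [0.25, 0.75]
    print(linear_convolve(p, q))
    print(cyclic_convolve(p, q, 2))
```

If the two inputs are probability mass functions of independent integer-valued quantities, the linear convolution is the mass function of their sum, and the cyclic version is that same distribution with positions folded modulo n.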
In practical LDPC design one is invariably interested in achieving the best possible performance. Although we have used the framework of irregular low-density parity-check (LDPC) ensembles throughout this book, the notion of “irregularity” we have employed is not the most general possible and the best performing structures are not properly captured in that framework.
We start by presenting in Section 7.1 a generalization of irregular LDPC ensembles called multi-edge-type LDPC ensembles. In principle, any of the ensembles suggested to date can be represented in this form, and we demonstrate this by discussing a few examples.
In Section 7.2 we review the theory surrounding these ensembles. It is largely the same as for the standard irregular case, which is why we have avoided using this more complex notion so far. The multi-edge-type generalization enjoys several advantages. With multi-edge-type LDPC codes one can achieve better performance at lower complexity. The generalization is especially useful under extreme conditions where standard irregular LDPC codes do not fare so well. Examples of these conditions include very low code rates, high rate codes that target very low bit error rates, and codes used in conjunction with turbo equalization schemes.
We discuss in Section 7.3 an alternative and complementary way of describing structured ensembles. These ensembles are the result of lifting a very small graph to a large graph by first replicating the structure of the small graph a large number of times and then by choosing the connections between these copies in a random way.
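The lifting construction can be sketched as follows, assuming the small graph is given as a binary base parity-check matrix with at most one edge between any check and variable node. Each edge of the base graph is replaced by a uniformly random permutation connecting the m copies of its endpoints. The function name `lift`, the seeded random generator, and the example base matrix are hypothetical choices for illustration.

```python
import random


def lift(base, m, rng=None):
    """Lift a binary base matrix by a factor m.

    Every 1 in `base` becomes an m x m random permutation matrix and
    every 0 becomes an m x m all-zero block, so node degrees are preserved.
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is reproducible
    rows, cols = len(base), len(base[0])
    lifted = [[0] * (cols * m) for _ in range(rows * m)]
    for r in range(rows):
        for c in range(cols):
            if base[r][c]:
                perm = list(range(m))
                rng.shuffle(perm)
                for k in range(m):
                    # copy k of check r connects to copy perm[k] of variable c
                    lifted[r * m + k][c * m + perm[k]] = 1
    return lifted


if __name__ == "__main__":
    base = [[1, 1, 0],
            [0, 1, 1]]
    for row in lift(base, 3):
        print(row)
```

Since each block is a permutation matrix, every copy of a node has the same degree as the node it was copied from, which is what keeps the local structure of the small graph intact in the large one.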
Asserting a specific property about an individual code is typically a hard task. By contrast, it is often easy to show that most codes in a properly chosen ensemble possess this property. In the realm of classical coding theory an important such instance is the minimum distance of a code. For Elias's generator ensemble G a few lines suffice to show that with high probability an element chosen uniformly at random has a relative minimum distance of at least δGV, where δGV is the Gilbert-Varshamov distance discussed on page 7. But as mentioned on page 33, it is known that the corresponding decision problem – whether a given code has relative minimum distance at least δGV – is NP-complete.
We encounter a similar situation in the realm of message-passing decoding. The whole analysis rests on the investigation of ensembles of codes; concentration theorems, which assert that most codes in the ensemble behave close to the ensemble average, are therefore at the center of the theory. There is one big difference, though, which makes concentration theorems invaluable for message-passing decoding, whereas in the classical setting they play only a marginal role. The main obstacle which we encounter in classical coding is that a random code (in G) is unlikely to have an efficient decoding algorithm, and, therefore, a random code is unlikely to be of much practical value. In the message-passing world the choice of the ensemble (e.g., LDPC, turbo, multi-edge, etc.) guarantees that every element can be decoded with equal ease.
Modulation refers to the representation of digital information in terms of analog waveforms that can be transmitted over physical channels. A simple example is depicted in Figure 2.1, where a sequence of bits is translated into a waveform. The original information may be in the form of bits taking the values 0 and 1. These bits are translated into symbols using a bit-to-symbol map, which in this case could be as simple as mapping the bit 0 to the symbol +1, and the bit 1 to the symbol −1. These symbols are then mapped to an analog waveform by multiplying with translates of a transmit waveform (a rectangular pulse in the example shown): this is an example of linear modulation, to be discussed in detail in Section 2.5. For the bit-to-symbol map just described, the bitstream encoded into the analog waveform shown in Figure 2.1 is 01100010100.
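The bit-to-symbol map and rectangular-pulse modulation described above can be sketched in discrete time as follows. The function names and the number of samples per symbol are assumptions; Figure 2.1 itself is not reproduced here, so the sampled waveform is only an illustration of the mapping, not of the figure's exact scaling.

```python
def bits_to_symbols(bits):
    """Map bit 0 to symbol +1 and bit 1 to symbol -1."""
    return [1 if b == "0" else -1 for b in bits]


def rect_modulate(symbols, samples_per_symbol=4):
    """Linear modulation with a rectangular transmit pulse.

    Each symbol scales one translate of the pulse, so the sampled
    waveform holds the symbol value for one symbol interval.
    """
    waveform = []
    for s in symbols:
        waveform.extend([float(s)] * samples_per_symbol)
    return waveform


if __name__ == "__main__":
    symbols = bits_to_symbols("01100010100")
    print(symbols)
    print(rect_modulate(symbols, samples_per_symbol=2))
```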
While a rectangular timelimited transmit waveform is shown in the example of Figure 2.1, in practice, the analog waveforms employed for modulation are often constrained in the frequency domain. Such constraints arise either from the physical characteristics of the communication medium, or from external factors such as government regulation of spectrum usage. Thus, we typically classify channels, and the signals transmitted over them, in terms of the frequency bands they occupy. In this chapter, we discuss some important modulation techniques, after first reviewing some basic concepts regarding frequency domain characterization of signals and systems.
Freedom from wires is an attractive, and often indispensable, feature for many communication applications. Examples of wireless communication include radio and television broadcast, point-to-point microwave links, cellular communications, and wireless local area networks (WLANs). Increasing integration of transceiver functionality using DSP-centric design has driven down implementation costs, and has led to explosive growth in consumer and enterprise applications of wireless, especially cellular telephony and WLANs.
While the focus of this chapter is on wireless link design, we comment briefly on some system design issues in this introductory section. In terms of system design, a key difference between wireless and wireline communication is that wireless is a broadcast medium. That is, users “close enough” to each other can “hear,” and potentially interfere with, each other. Thus, appropriate resource sharing mechanisms must be put in place if multiple users are to co-exist in a particular frequency band. The wireless channel can be shared among multiple users using several different approaches. One possibility is to eliminate potential interference by assigning different frequency channels to different users; this is termed frequency division multiple access (FDMA). Similarly, we can assign different time slots to different users; this is termed time division multiple access (TDMA). If we use orthogonal multiple access such as FDMA or TDMA, then we can focus on single-user wireless link design. However, there are also nonorthogonal forms of multiple access, in which different users can signal at the same time over the same frequency band.
In this chapter, we provide an introduction to some commonly used channel coding techniques. The key idea of channel coding is to introduce redundancy in the transmitted signal so as to enable recovery from channel impairments such as errors and erasures. We know from the previous chapter that, for any given set of channel conditions, there exists a Shannon capacity, or maximum rate of reliable transmission. Such Shannon-theoretic limits provide the ultimate benchmark for channel code design. A large number of error control techniques are available to the modern communication system designer, and in this chapter, we provide a glimpse of a small subset of these. Our emphasis is on convolutional codes, which have been a workhorse of communication link design for many decades, and turbo-like codes, which have revolutionized communication systems by enabling implementable designs that approach Shannon capacity for a variety of channel models.
Map of this chapter: We begin in Section 7.1 with binary convolutional codes. We introduce the trellis representation and the Viterbi algorithm for ML decoding, and develop performance analysis techniques. The structure of the memory introduced by a convolutional code is similar to that introduced by a dispersive channel. Thus, the techniques are similar to (but simpler than) those developed for MLSE for channel equalization in Chapter 5. Concatenation of convolutional codes leads to turbo codes, which are iteratively decoded by exchanging soft information between the component convolutional decoders.