In Chapter 1, we discussed transformations between soliton equations and bilinear equations and the solution of such equations. But what is a bilinear equation, or, more concretely, what mathematical structures are characteristic of bilinear equations? One answer to this question is the existence of groups (related to affine Lie algebras) which act on bilinear equations. In fact, a unified understanding of soliton equations has developed from this viewpoint, and many new soliton equations have been found using this group-theoretical method. This approach, however, calls for a deep knowledge of algebra and, even when this has been attained, it is difficult to apply. Soon after the birth of quantum mechanics, group theory became such a craze (the so-called Gruppenpest) that many people studied nothing but group theory and never managed to apply it.
Since this book is aimed at students of science and technology, we omit most of the group theory.
A new viewpoint, discovered by Mikio Sato [12, 13], is to regard bilinear equations as equivalent to Plücker relations in a Grassmann manifold. This interpretation of soliton equations, based on a deep knowledge of mathematics, is admirable and beautiful, and has had a strong influence on the author. However, its later development has been so abstract that the author has not been able to understand it completely.
A soliton is a particular type of solitary wave, one that is not destroyed when it collides with another wave of the same kind. Such behaviour is suggested by numerical simulation, but is it really possible that a soliton completely recovers its original shape after a collision? In a detailed analysis of the results of such numerical simulations, some ripples can be observed after a collision, and so it seems that the original shape is not completely recovered. Therefore, in order to clarify whether or not solitons are destroyed by their collisions, it is necessary to find exact solutions of soliton equations.
Generally, it is a very hard task to find exact solutions of nonlinear partial differential equations, soliton equations included. Moreover, even if one manages to find a method for solving one nonlinear equation, in general that method will not be applicable to other equations. Does there exist a successful and universal tool, one that does not require a deep understanding of mathematics, for solving many types of nonlinear equations? It is for this purpose that the direct method has been developed.
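As a concrete illustration of the transformations mentioned in connection with Chapter 1 (a standard worked example, assumed here rather than quoted from this excerpt), the Korteweg–de Vries (KdV) equation becomes bilinear under a logarithmic change of dependent variable, and the simplest solution of the bilinear form is the one-soliton solution:

```latex
% KdV equation and the dependent-variable transformation u = 2 (log f)_xx:
\[
  u_t + 6\,u\,u_x + u_{xxx} = 0, \qquad
  u = 2\,\frac{\partial^2}{\partial x^2} \log f .
\]
% In terms of Hirota's bilinear D-operators the equation becomes
\[
  D_x\left(D_t + D_x^{3}\right)\, f \cdot f = 0 .
\]
% Taking f = 1 + e^{\eta} with a linear phase \eta solves the bilinear
% equation and yields the one-soliton solution of KdV:
\[
  f = 1 + e^{\eta}, \quad \eta = kx - k^{3}t + \eta_0
  \quad\Longrightarrow\quad
  u = \frac{k^{2}}{2}\,\mathrm{sech}^{2}\!\frac{\eta}{2}.
\]
```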
In Chapter 1, we discuss in an intuitive way the conditions under which a solitary wave is formed and we show that a nonlinear solitary wave cannot be made by the superposition of linear waves.
In this chapter we briefly summarize some of the important results in information theory which we have not been able to treat in detail. We shall give no proofs, but instead refer the interested reader elsewhere, usually to a textbook, sometimes to an original paper, for details.
We choose to restrict our attention solely to generalizations and extensions of the twin pearls of information theory, Shannon's channel coding theorem (Theorem 2.4 and its corollary) and his source coding theorem (Theorem 3.4). We treat each in a separate section.
The channel coding theorem
We restate the theorem for reference (see Corollary to Theorem 2.4).
Associated with each discrete memoryless channel, there is a nonnegative number C (called the channel capacity) with the following property. For any ε > 0 and R < C, for large enough n, there exists a code of length n and rate ≥ R (i.e., with at least 2^{Rn} distinct codewords), and an appropriate decoding algorithm, such that, when the code is used on the given channel, the probability of decoder error is < ε.
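To make the quantities in the theorem concrete, here is a minimal numerical sketch (not from the text; the binary symmetric channel and its well-known capacity formula C = 1 − H(p) are used as an assumed example) showing how the capacity C bounds the achievable rate R, and how the codeword count 2^{Rn} grows with the block length n:

```python
import math

def binary_entropy(p: float) -> float:
    """H(p) = -p log2 p - (1-p) log2 (1-p), the binary entropy function."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(p: float) -> float:
    """Capacity of the binary symmetric channel with crossover probability p."""
    return 1.0 - binary_entropy(p)

p = 0.1                      # crossover probability (assumed example value)
C = bsc_capacity(p)          # ~0.531 bits per channel use
R = 0.4                      # any fixed rate R < C is achievable per the theorem

for n in (10, 100, 1000):
    # A rate-R code of length n has at least 2^(R n) distinct codewords.
    print(f"n={n:5d}: capacity C={C:.3f}, rate R={R}, "
          f"codewords >= 2^{R * n:.0f}")
```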
We shall now conduct a guided tour through the theorem, pointing out as we go places where the hypotheses can be weakened or the conclusions strengthened. The points of interest will be the phrases ‘discrete memoryless channel’, ‘a nonnegative number C’, ‘for large enough n’, and ‘there exists a code … and … decoding algorithm’. We shall also briefly discuss various converses to the coding theorem.
This book is meant to be a self-contained introduction to the basic results in the theory of information and coding. It was written during 1972–1976, when I taught this subject at Caltech. About half my students were electrical engineering graduate students; the others were majoring in all sorts of other fields (mathematics, physics, biology, even one English major!). As a result the course was aimed at nonspecialists as well as specialists, and so is this book.
The book is in three parts: Introduction, Part one (Information Theory), and Part two (Coding Theory). It is essential to read the introduction first, because it gives an overview of the whole subject. In Part one, Chapter 1 is fundamental, but it is probably a mistake to read it first, since it is really just a collection of technical results about entropy, mutual information, and so forth. It is better regarded as a reference section, and should be consulted as necessary to understand Chapters 2–5. Chapter 6 is a survey of advanced results, and can be read independently. In Part two, Chapter 7 is basic and must be read before Chapters 8 and 9; but Chapter 10 is almost, and Chapter 11 is completely, independent from Chapter 7. Chapter 12 is another survey chapter independent of everything else.
The problems at the end of the chapters are very important. They contain verification of many omitted details, as well as many important results not mentioned in the text. It is a good idea to at least read the problems.
A large body of mathematics consists of facts that can be presented and described much like any other natural phenomenon. These facts, at times explicitly brought out as theorems, at other times concealed within a proof, make up most of the applications of mathematics, and are the most likely to survive changes of style and of interest.
This ENCYCLOPEDIA will attempt to present the factual body of all mathematics. Clarity of exposition, accessibility to the non-specialist, and a thorough bibliography are required of each author. Volumes will appear in no particular order, but will be organized into sections, each one comprising a recognizable branch of present-day mathematics. Numbers of volumes and sections will be reconsidered as times and needs change.
It is hoped that this enterprise will make mathematics more widely used where it is needed, and more accessible in fields in which it can be applied but where it has not yet penetrated because of insufficient information.
Information theory is a success story in contemporary mathematics. Born out of very real engineering problems, it has left its imprint on such far-flung endeavors as the approximation of functions and the central limit theorem of probability. It is an idea whose time has come.
Most mathematicians cannot afford to ignore the basic results in this field. Yet, because of the enormous outpouring of research, it is difficult for anyone who is not a specialist to single out the basic results and the relevant material. Robert McEliece has succeeded in giving a presentation that achieves this objective, perhaps the first of its kind.
In Chapter 7 we gave one useful generalization of the (7, 4) Hamming code of the Introduction: the family of (2^m − 1, 2^m − m − 1) single-error-correcting Hamming codes. In Chapter 8 we gave a further generalization, to a class of codes capable of correcting a single burst of errors. In this chapter, however, we will give a far more important and extensive generalization, the multiple-error-correcting BCH and Reed–Solomon codes.
To motivate the general definition, recall that the parity-check matrix of a Hamming code of length n = 2^m − 1 is given by (see Section 7.4)

H = [ v_0  v_1  ⋯  v_{n−1} ],

where (v_0, v_1, …, v_{n−1}) is some ordering of the 2^m − 1 nonzero (column) vectors from V_m = GF(2)^m. The matrix H has dimensions m × n, which means that it takes m parity-check bits to correct one error. If we wish to correct two errors, it stands to reason that m more parity checks will be required. Thus we might guess that a matrix of the general form

H_2 = [ v_0  v_1  ⋯  v_{n−1}
        w_0  w_1  ⋯  w_{n−1} ],

where w_0, w_1, …, w_{n−1} ∈ V_m, will serve as the parity-check matrix for a two-error-correcting code of length n. Since, however, the v_i's are distinct, we may view the correspondence v_i → w_i as a function f from V_m into itself, and write H_2 as

H_2 = [ v_0     v_1     ⋯  v_{n−1}
        f(v_0)  f(v_1)  ⋯  f(v_{n−1}) ].
But how should the function f be chosen? According to the results of Section 7.3, H_2 will define a two-error-correcting code iff the syndromes of the error patterns of weight 0, 1, and 2 are all distinct.
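The classical choice, which leads to the BCH codes named in the chapter title, is f(v) = v³, computed in the finite field GF(2^m). As a sanity check, one can verify the syndrome-distinctness condition by brute force; the following sketch (an illustration under that assumption, not code from the book) does so for m = 4, i.e., for the double-error-correcting code of length 15:

```python
from itertools import combinations

M = 4                       # m = 4, so n = 2^m - 1 = 15
PRIM = 0b10011              # primitive polynomial x^4 + x + 1 for GF(2^4)

def gf_mul(a: int, b: int) -> int:
    """Carry-less multiplication in GF(2^4), reducing modulo PRIM."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & (1 << M):
            a ^= PRIM
    return r

def gf_cube(v: int) -> int:
    return gf_mul(v, gf_mul(v, v))

# Columns of H_2: each column is the pair (v, f(v)) with f(v) = v^3,
# packed into one 2m-bit integer.
columns = [(v << M) | gf_cube(v) for v in range(1, 1 << M)]

# Syndrome of an error pattern = XOR of the columns where errors occur.
syndromes = {0}                       # the weight-0 pattern
for i in range(len(columns)):         # weight-1 patterns
    syndromes.add(columns[i])
for i, j in combinations(range(len(columns)), 2):   # weight-2 patterns
    syndromes.add(columns[i] ^ columns[j])

expected = 1 + 15 + 15 * 14 // 2      # 1 + n + C(n,2) = 121
print(f"distinct syndromes: {len(syndromes)} (expected {expected})")
assert len(syndromes) == expected     # all weight <= 2 syndromes distinct
```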
At the beginning of Chapter 7, we said that by restricting our attention to linear codes (rather than arbitrary, unstructured codes), we could hope to find some good codes which are reasonably easy to implement. And it is true that (via syndrome decoding, for example) a “small” linear code, say with dimension or redundancy at most 20, can be implemented in hardware without much difficulty. However, in order to obtain the performance promised by Shannon's theorems, it is necessary to use larger codes, and in general, a large code, even if it is linear, will be difficult to implement. For this reason, almost all block codes used in practice are in fact cyclic codes; cyclic codes form a very small and highly structured subset of the set of linear codes. In this chapter, we will give a general introduction to cyclic codes, discussing both the underlying mathematical theory (Section 8.1) and the basic hardware circuits used to implement cyclic codes (Section 8.2). In Section 8.3 we will show that Hamming codes can be implemented as cyclic codes, and in Sections 8.4 and 8.5 we will see how cyclic codes are used to combat burst errors. Our story will be continued in Chapter 9, where we will study the most important family of cyclic codes yet discovered: the BCH/Reed–Solomon family.
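To make the defining property of a cyclic code concrete, here is a small sketch (not from the book; it uses the standard fact, shown in Section 8.3 of the text, that the (7, 4) Hamming code has a cyclic representation, here with generator polynomial g(x) = x³ + x + 1). The code is the set of multiples of g(x) modulo x⁷ − 1, and every cyclic shift of a codeword is again a codeword:

```python
def poly_mul_mod2(a: int, b: int, n: int = 7) -> int:
    """Multiply two GF(2) polynomials (bit i = coefficient of x^i),
    reducing modulo x^n - 1 (exponents wrap around mod n)."""
    r = 0
    for i in range(n):
        if (a >> i) & 1:
            for j in range(n):
                if (b >> j) & 1:
                    r ^= 1 << ((i + j) % n)
    return r

G = 0b1011                       # g(x) = x^3 + x + 1
# The cyclic (7,4) code: all multiples m(x)g(x) with deg m < 4.
codewords = {poly_mul_mod2(m, G) for m in range(16)}

def cyclic_shift(c: int, n: int = 7) -> int:
    """One cyclic shift: multiplication by x modulo x^n - 1."""
    return ((c << 1) | (c >> (n - 1))) & ((1 << n) - 1)

# Every cyclic shift of every codeword is again a codeword.
assert all(cyclic_shift(c) in codewords for c in codewords)
print(f"{len(codewords)} codewords, closed under cyclic shifts")
```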
This chapter serves the same function for Part two as Chapter 6 served for Part one, that is, it summarizes some of the most important results in coding theory which have not been treated in Chapters 7–11. In Sections 12.2, 12.3, and 12.4 we treat channel coding (block codes, convolutional codes, and a comparison of the two). Finally in Section 12.5 we discuss source coding.
Block codes
The theory of block codes is older and richer than the theory of convolutional codes, and so this section is much longer than Section 12.3. (This imbalance does not apply to practical applications, however; see Section 12.4.) In order to give this section some semblance of organization, we shall classify the results to be cited according to Berlekamp's list of the three major problems of coding theory:
How good are the best codes?
How can we design good codes?
How can we decode such codes?
• How good are the best codes? One of the earliest problems to arise in coding theory was that of finding perfect codes. If we view a code of length n over the finite field F_q as a subset {x_1, x_2, …, x_M} of the vector space V_n(F_q), the code is said to be perfect (or close-packed) if, for some integer e, the Hamming spheres of radius e around the M codewords completely fill V_n(F_q) without overlap.
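In concrete terms, perfection is a counting condition: M times the volume of a Hamming sphere of radius e must equal q^n. The following sketch checks this condition (the binary Hamming and Golay parameters below are standard examples, assumed here rather than drawn from this excerpt):

```python
from math import comb

def sphere_volume(n: int, e: int, q: int = 2) -> int:
    """Number of vectors within Hamming distance e of a fixed vector."""
    return sum(comb(n, i) * (q - 1) ** i for i in range(e + 1))

def is_perfect(n: int, M: int, e: int, q: int = 2) -> bool:
    """Spheres of radius e around M codewords tile V_n(F_q) exactly."""
    return M * sphere_volume(n, e, q) == q ** n

# The (7,4) Hamming code: 2^4 codewords, e = 1 -> 16 * 8 = 128 = 2^7.
print(is_perfect(n=7, M=2**4, e=1))    # True
# The binary Golay code: 2^12 codewords, e = 3 -> fills 2^23 exactly.
print(is_perfect(n=23, M=2**12, e=3))  # True
```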