At the beginning of Chapter 7, we said that by restricting our attention to linear codes (rather than arbitrary, unstructured codes), we could hope to find some good codes which are reasonably easy to implement. And it is true that (via syndrome decoding, for example) a “small” linear code, say with dimension or redundancy at most 20, can be implemented in hardware without much difficulty. However, in order to obtain the performance promised by Shannon's theorems, it is necessary to use larger codes, and in general, a large code, even if it is linear, will be difficult to implement. For this reason, almost all block codes used in practice are in fact cyclic codes; cyclic codes form a very small and highly structured subset of the set of linear codes. In this chapter, we will give a general introduction to cyclic codes, discussing both the underlying mathematical theory (Section 8.1) and the basic hardware circuits used to implement cyclic codes (Section 8.2). In Section 8.3 we will show that Hamming codes can be implemented as cyclic codes, and in Sections 8.4 and 8.5 we will see how cyclic codes are used to combat burst errors. Our story will be continued in Chapter 9, where we will study the most important family of cyclic codes yet discovered: the BCH/Reed–Solomon family.
We begin our studies with the innocuous-appearing definition of the class of cyclic codes.
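The definition in question is simply that every cyclic shift of a codeword must again be a codeword. As a minimal illustrative sketch (the `is_cyclic` helper and the parity-check example are mine, not from the text), this closure property can be checked mechanically for a small code:

```python
from itertools import product

def is_cyclic(code):
    """A block code is cyclic if the cyclic shift of every
    codeword is again a codeword."""
    return all(w[1:] + w[:1] in code for w in code)

# The length-4 single parity-check (even-weight) code is linear
# and closed under cyclic shifts, hence cyclic.
parity = {w for w in product((0, 1), repeat=4) if sum(w) % 2 == 0}
print(is_cyclic(parity))          # True
print(is_cyclic({(1, 0, 0, 0)}))  # False: the shift (0, 0, 0, 1) is missing
```

Cyclic shifts preserve Hamming weight, which is why the even-weight code passes this test for every length.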
Introduction: The generator and parity-check matrices
We have already noted that the channel coding Theorem 2.4 is unsatisfactory from a practical standpoint. This is because the codes whose existence is proved there suffer from at least three distinct defects:
(a) They are hard to find (although the proof of Theorem 2.4 suggests that a code chosen “at random” is likely to be pretty good, provided its length is large enough).
(b) They are hard to analyze. (Given a code, how are we to know how good it is? The impossibility of computing the error probability for a fixed code is what led us to the random coding artifice in the first place!)
(c) They are hard to implement. (In particular, they are hard to decode: the decoding algorithm suggested in the proof of Theorem 2.4—search the region S(y) for codewords, and so on—is hopelessly complex unless the code is trivially small.)
In fact, virtually the only coding scheme we have encountered so far which suffers from none of these defects is the (7, 4) Hamming code of the Introduction. In this chapter we show that the Hamming code is a member of a very large class of codes, the linear codes, and in Chapters 7–9 we show that there are some very good linear codes which are free from the three defects cited above.
In 1948, in the introduction to his classic paper, “A mathematical theory of communication,” Claude Shannon wrote:
“The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point.”
To solve that problem he created, in the pages that followed, a completely new branch of applied mathematics, which is today called information theory and/or coding theory. This book's object is the presentation of the main results of this theory as they stand 30 years later.
In this introductory chapter we illustrate the central ideas of information theory by means of a specific pair of mathematical models, the binary symmetric source and the binary symmetric channel.
The binary symmetric source (the source, for short) is an object which emits one of two possible symbols, which we take to be “0” and “1,” at a rate of R symbols per unit of time. We shall call these symbols bits, an abbreviation of binary digits. The bits emitted by the source are random, and a “0” is as likely to be emitted as a “1.” We imagine that the source rate R is continuously variable, that is, R can assume any nonnegative value.
The binary symmetric channel (the BSC for short) is an object through which it is possible to transmit one bit per unit of time. However, the channel is not completely reliable: there is a fixed probability p (called the raw bit error probability), 0 ≤ p ≤ ½, that the output bit will not be the same as the input bit.
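The action of the BSC is easy to simulate directly: each input bit is flipped independently with probability p. A minimal sketch (the function name and the seeding convention are mine, for illustration only):

```python
import random

def bsc(bits, p, seed=0):
    """Pass a bit sequence through a binary symmetric channel with
    raw bit error probability p: flip each bit independently."""
    rng = random.Random(seed)
    return [b ^ (rng.random() < p) for b in bits]

print(bsc([0, 1, 1, 0], 0.0))  # [0, 1, 1, 0] -- a noiseless channel
print(bsc([0, 1, 1, 0], 1.0))  # [1, 0, 0, 1] -- every bit flipped
```

For 0 < p < ½ the output agrees with the input in roughly a fraction 1 − p of the positions, which is the regime the channel coding theorem addresses.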
Young tableaux have had a long history since their introduction by A. Young a century ago. It was only in the 1960s that a monoid structure on them came to the fore, a structure that accounts for most of their combinatorial properties and has applications in the various fields in which Young tableaux are used.
Summarizing what had been his motivation to spend so much time on the plactic monoid, M.-P. Schützenberger drew out three reasons: (1) it allows us to embed the ring of symmetric polynomials into a noncommutative ring; (2) it is the syntactic monoid of a function on words generalizing the maximal length of a nonincreasing subword; (3) it is a natural generalization to alphabets with more than two letters of the monoid of parentheses.
The starting point of the theory is an algorithm, due to C. Schensted, for the determination of the maximal length of a nondecreasing subword of a given word. The output of this algorithm is a tableau, and if one decides to identify the words leading to the same tableau, one arrives at the plactic monoid, whose defining relations were determined by D. Knuth.
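Schensted's insertion procedure is short enough to state in code. The sketch below (the function and variable names are mine) builds the tableau row by row by the usual bumping rule; the length of the first row equals the maximal length of a nondecreasing subword:

```python
from bisect import bisect_right

def schensted(word):
    """Schensted row insertion: build the insertion tableau of a word.
    Each letter bumps the leftmost strictly greater entry of the first
    row into the next row, and so on; if no entry is greater, it is
    appended at the end of the row."""
    rows = []
    for x in word:
        for row in rows:
            i = bisect_right(row, x)   # leftmost entry strictly > x
            if i == len(row):
                row.append(x)          # x fits at the end of this row
                break
            row[i], x = x, row[i]      # bump the displaced entry down
        else:
            rows.append([x])           # bumped out of the last row
    return rows

tab = schensted([3, 1, 2, 1, 3])
print(tab)          # [[1, 1, 3], [2], [3]]
print(len(tab[0]))  # 3: e.g. the nondecreasing subword 1, 2, 3
```

Identifying words that produce the same tableau under this procedure is exactly what yields the plactic monoid, via Knuth's defining relations.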
The first significant application of the plactic monoid was to provide a complete proof of the Littlewood-Richardson rule, a combinatorial algorithm for multiplying Schur functions (or equivalently, to decompose tensor products of representations of unitary groups, a fundamental issue in many applications, e.g., in particle physics), which had been in use for almost 50 years before being fully understood.
A large body of mathematics consists of facts that can be presented and described much like any other natural phenomenon. These facts, at times explicitly brought out as theorems, at other times concealed within a proof, make up most of the applications of mathematics, and are the most likely to survive changes of style and of interest.
This ENCYCLOPEDIA will attempt to present the factual body of all mathematics. Clarity of exposition, accessibility to the non-specialist, and a thorough bibliography are required of each author. Volumes will appear in no particular order, but will be organized into sections, each one comprising a recognizable branch of present-day mathematics. Numbers of volumes and sections will be reconsidered as times and needs change.
It is hoped that this enterprise will make mathematics more widely used where it is needed, and more accessible in fields in which it can be applied but where it has not yet penetrated because of insufficient information.
Information theory is a success story in contemporary mathematics. Born out of very real engineering problems, it has left its imprint on such far-flung endeavors as the approximation of functions and the central limit theorem of probability. It is an idea whose time has come.
Most mathematicians cannot afford to ignore the basic results in this field. Yet, because of the enormous outpouring of research, it is difficult for anyone who is not a specialist to single out the basic results and the relevant material.
In Chapter 1, avoidable and unavoidable sets of words were defined. The focus was then on the case of finite sets of words. In the present chapter, we turn to particular infinite sets of words, defined as pattern languages. A pattern is a word that contains special symbols called variables, and the associated pattern language is obtained by replacing the variables with arbitrary nonempty words, with the condition that two occurrences of the same variable have to be replaced with the same word.
The archetype of a pattern is the square, αα. The associated pattern language is L = {uu|u ∈ A+}, and it is now a classical result that L is an avoidable set of words if A has at least three elements, whereas it is an unavoidable set of words if A has only one or two elements. Indeed, an infinite square-free word on three letters can be constructed, and it is easy to check that every binary word of length 4 contains a square. For short, we will say that the pattern αα is 3-avoidable and 2-unavoidable.
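Both halves of this claim are easy to verify mechanically for small lengths. A brute-force sketch (the `has_square` helper is mine, for illustration):

```python
from itertools import product

def has_square(w):
    """Does the word w contain a factor of the form uu, u nonempty?"""
    n = len(w)
    return any(w[i:i + l] == w[i + l:i + 2 * l]
               for l in range(1, n // 2 + 1)
               for i in range(n - 2 * l + 1))

# Every binary word of length 4 contains a square...
print(all(has_square(''.join(w)) for w in product('01', repeat=4)))  # True
# ...while 010, of length 3, is square-free.
print(has_square('010'))  # False
```

The exhaustive check over all 16 binary words of length 4 is exactly the "easy to check" step mentioned above; the ternary square-free word, being infinite, of course requires an actual construction rather than a search.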
General patterns can contain more than just one variable. For instance, αβα represents words of the form uvu, with u, v ∈ A+ (this pattern is unavoidable whatever the size of the alphabet; see Proposition 3.1.2). Patterns could also be allowed to contain constant letters, which unlike variables are never replaced with arbitrary words, but this is not very useful in the context of avoidability, so we will consider here only “pure” patterns, consisting only of variables.
This chapter serves the same function for Part two as Chapter 6 served for Part one, that is, it summarizes some of the most important results in coding theory which have not been treated in Chapters 7–11. In Sections 12.2, 12.3, and 12.4 we treat channel coding (block codes, convolutional codes, and a comparison of the two). Finally in Section 12.5 we discuss source coding.
Block codes
The theory of block codes is older and richer than the theory of convolutional codes, and so this section is much longer than Section 12.3. (This imbalance does not apply to practical applications, however; see Section 12.4.) In order to give this section some semblance of organization, we shall classify the results to be cited according to Berlekamp's [15] list of the three major problems of coding theory:
How good are the best codes?
How can we design good codes?
How can we decode such codes?
• How good are the best codes? One of the earliest problems which arose in coding theory was that of finding perfect codes. If we view a code of length n over the finite field Fq as a subset {x1, x2, …, xM} of the vector space Vn(Fq), the code is said to be perfect (or close packed) if for some integer e, the Hamming spheres of radius e around the M codewords completely fill Vn(Fq) without overlap.
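The "fill without overlap" condition is a simple counting identity: M times the volume of a Hamming sphere of radius e must equal q^n, the total number of vectors. A quick sketch (the function names are mine):

```python
from math import comb

def sphere_size(n, e, q=2):
    """Number of vectors within Hamming distance e of a fixed
    vector in V_n(F_q)."""
    return sum(comb(n, i) * (q - 1) ** i for i in range(e + 1))

def is_perfect(n, M, e, q=2):
    """Sphere-packing condition for a perfect code: the M spheres of
    radius e exactly fill the q**n vectors of V_n(F_q)."""
    return M * sphere_size(n, e, q) == q ** n

# The (7, 4) binary Hamming code: M = 16 codewords, e = 1.
print(is_perfect(7, 16, 1))        # True: 16 * 8 == 128 == 2**7
# The binary Golay code: n = 23, M = 2**12, e = 3.
print(is_perfect(23, 2 ** 12, 3))  # True
```

Satisfying the counting identity is necessary but not sufficient for a perfect code to exist; the identity only rules candidates in or out.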
Periodicity is an important property of words that has applications in various domains. The first significant results on periodicity are the theorem of Fine and Wilf and the critical factorization theorem. These two results refer to two kinds of phenomena concerning periodicity: the theorem of Fine and Wilf considers the simultaneous occurrence of different periods in one finite word, whereas the critical factorization theorem relates local and global periodicity of words. Starting from these basic results the study of periodicity has grown along both directions. This chapter contains a systematic and self-contained exposition of this theory, including very recent results.
In Section 8.1 we analyze the structure of the set of periods of one finite word. This section includes a proof of the theorem of Fine and Wilf and also a generalization of this result to words having three periods. We next give the characterization of Guibas and Odlyzko concerning those sets of integers that yield the periods that can simultaneously occur in a single finite word. Another property is further investigated (similar to the one stated by the theorem of Fine and Wilf) in which the occurrence of two periods in a word of a certain length forces the word to have a shorter period only in a prefix (or suffix) of the word. The golden ratio appears in such a result as an extremal value of a parameter involved in the property. This section also contains some results concerning the squares that can appear as factors in a word. This is a prelude to the next section, since squares describe a special kind of local periodicity.
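For a small concrete illustration of these notions (the `periods` helper is mine, using the convention that p is a period of w when w[i] = w[i + p] for all valid positions i):

```python
from math import gcd

def periods(w):
    """All periods p of w, i.e. those p with w[i] == w[i + p] for
    0 <= i < len(w) - p; len(w) itself counts vacuously."""
    n = len(w)
    return [p for p in range(1, n + 1)
            if all(w[i] == w[i + p] for i in range(n - p))]

w = 'ababab'
print(periods(w))  # [2, 4, 6]
# w has periods 2 and 4 and length 6 >= 2 + 4 - gcd(2, 4), so the
# theorem of Fine and Wilf guarantees the period gcd(2, 4) = 2,
# which indeed appears in the list.
```

A word shorter than p + q − gcd(p, q) can carry periods p and q without having period gcd(p, q), which is what makes the Fine and Wilf bound sharp.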