In this chapter we consider several more advanced topics related to the two-party communication model.
Direct Sum
The direct-sum problem is the following: Alice gets two inputs xf ∈ Xf and xg ∈ Xg. Bob gets two inputs yf ∈ Yf and yg ∈ Yg. They wish to compute both f(xf, yf) and g(xg, yg). The obvious solution would be for Alice and Bob to use the best protocol for f to compute the first value, f(xf, yf), and the best protocol for g to compute the second value, g(xg, yg). We stress that the two subproblems are totally independent. Thus one would tend to conjecture that nothing better than the obvious solution can be done: Alice and Bob cannot “save” any communication over the obvious protocol. As we shall see, in some cases and for some measures of complexity, this intuition is wrong.
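The obvious protocol immediately yields an upper bound, and the direct-sum question asks whether it can ever be improved upon:

$$D(f, g) \le D(f) + D(g) \qquad \text{and} \qquad D(f^{\ell}) \le \ell \cdot D(f).$$

Both bounds follow simply by running the optimal protocols for the subproblems one after another.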
Denote by D(f, g) the (deterministic) communication complexity of this computation. Similarly, we define all other complexity measures, such as R(f, g), N(f, g), and so forth. We also use the notation D(fℓ) for the (deterministic) communication complexity of computing f on ℓ instances; that is, computing f(x1, y1), f(x2, y2), …, f(xℓ, yℓ).
Open Problem 4.1: Can D(f, g) be smaller than D(f) + D(g)? How much smaller can it be? How much smaller can D(fℓ) be compared to ℓ · D(f)?
In some cases we are not interested in computing both f and g but rather some function of the two.
Results about communication complexity have all kinds of applications. The most obvious ones are applications to communication problems. For example, for the management of a distributed system it is often required to check whether two copies of a file that reside in two different sites are the same. Clearly, this is just solving the “equality” problem EQ, whose communication complexity was extensively studied in the first part of this book. It is also very useful to compare a whole directory; namely, to obtain, for each file in the directory, a bit indicating whether the two copies of that particular file are the same. This is the same as solving the direct-sum version of EQ (see Section 4.1).
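As a sketch of the fingerprinting idea that underlies efficient randomized protocols for EQ — written in Standard ML for concreteness; the function names are illustrative, and in an actual protocol the prime p would be chosen at random from a suitable range:

```sml
(* Fingerprint a file, viewed as a list of byte values, by evaluating it
   as a polynomial modulo a prime p.  Equal files always agree; distinct
   files disagree with high probability over a random choice of p. *)
fun fingerprint p bytes =
      foldl (fn (b, acc) => (256 * acc + b) mod p) 0 bytes;

(* To compare two directory copies, one site sends a short fingerprint
   per file and the other compares it with its own. *)
fun sameFile p bytes1 bytes2 =
      fingerprint p bytes1 = fingerprint p bytes2;
```

Sending one short fingerprint per file, rather than the files themselves, is exactly the saving that the randomized complexity of EQ promises.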
Most of the results in Part III of this book are devoted to applications in which communication does not appear explicitly in the statement of the problem. These applications show that in fact communication is an essential part of more problems than it may seem at first glance. We start (in Sections 8.1 and 8.2) with several applications in which the relation to communication complexity is obvious. Then, we show (in Section 8.3) how to apply communication complexity results to the study of VLSI chips.
Bisection Width of Networks
A network of k processors can be viewed as a graph G, where nodes correspond to the processors in the network and edges represent the connection of two processors by a communication channel. We will be proving lower bounds and we will do so regardless of the implementation of “processors” and “channels”. We will only rely on the assumption that in each time step a single bit can be sent on each of the channels.
In the first part of this book we were interested in computing functions. That is, for any input (x, y) there was a unique value f(x, y) that Alice and Bob had to compute. More general types of problems are relations. In this case, on input (x, y) there might be several values that are valid outputs. Formally,
Definition 5.1: A relation R is a subset R ⊆ X × Y × Z. The communication problem R is the following: Alice is given x ∈ X, Bob is given y ∈ Y, and their task is to find some z ∈ Z that satisfies the relation. That is, (x, y, z) ∈ R.
Note that functions are a special case of the above definition, where z is uniquely defined. Also note that it may be the case that for a certain input pair (x, y) there is no value z such that (x, y, z) ∈ R. We say that such an input is illegal and we assume that it is never given to Alice and Bob. Alternatively, we can assume that for every (x, y) there exists a possible value z, for example by extending the relation R so that every output z is allowed on the illegal pairs (that is, (x, y, z) ∈ R for all z ∈ Z).
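A standard concrete example, stated here for illustration: Alice and Bob hold distinct strings and must find a coordinate where they differ,

$$R = \{\, (x, y, i) \;:\; x, y \in \{0,1\}^n,\ x \neq y,\ x_i \neq y_i \,\}.$$

Pairs with x = y are then illegal inputs in the sense just described, and for any legal pair several indices i may be valid outputs.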
One of the simplest models of computation is the decision tree model. In this model we are concerned with computing a function f: {0, 1}m → {0, 1} by using queries. Each query is given by specifying a function q on {0, 1}m taken from some fixed family Q of allowed queries (the queries need not be Boolean). The answer given for the query is simply the value of q(x1, …, xm). The algorithm is completely adaptive; that is, the i-th query asked may depend in an arbitrary manner on the answers received for the first i − 1 queries. The only way to gain information about the input x is through these queries. The algorithm can therefore be described as a labeled tree, whose internal nodes are labeled by queries q ∈ Q, the outgoing edges of each node are labeled by the possible values of q(x1, …, xm), and the leaves are labeled by output values. Each sequence of answers describes a path in the tree leading either to the next query or to a leaf holding the output value. In Figure 9.1 a decision tree is shown that computes (on inputs x1, …, x4) whether at least three of the input bits are 1s. It uses a family of queries Q consisting of all disjunctions of input variables and all conjunctions of input variables.
The cost measure we are interested in is the number of queries performed on the worst case input; that is, the depth of the tree.
Definition 9.1: The decision tree complexity of a function f using the family of queries Q, denoted TQ(f), is the minimum cost of a decision tree algorithm over Q for f.
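The tree description above can be sketched as a small datatype — written in Standard ML for concreteness, since the chapter itself fixes no programming language; the names are illustrative:

```sml
(* A decision tree: a leaf holds an output value; an internal node holds
   a query together with one subtree for each possible answer. *)
datatype ('q, 'ans, 'out) dtree =
    Leaf of 'out
  | Node of 'q * ('ans -> ('q, 'ans, 'out) dtree);

(* Run the tree on input x, where ask interprets a query on an input.
   Each answer selects the branch containing the next query. *)
fun run ask (Leaf v)           x = v
  | run ask (Node (q, branch)) x = run ask (branch (ask q x)) x;
```

The cost of a run is the number of queries asked, so the worst-case cost of the algorithm is the depth of the tree, as in Definition 9.1.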
In Chapter 1 we saw that every communication protocol induces a partition of the space of possible inputs into monochromatic rectangles and learned of two lower bound techniques for the number of rectangles in such a partition. In this section we study how closely these combinatorial measures relate to communication complexity and to each other.
Covers and Nondeterminism
Although every protocol induces a partition of X × Y into f-monochromatic rectangles, simple examples show that the opposite is not true. In Figure 2.1, a partition of X × Y into monochromatic rectangles is given that does not correspond to any protocol. To see this, consider any protocol P for computing the corresponding function f. Since the function is not constant, there must be a first player who sends a message that is not constant. Suppose that this player is Alice. Since the messages that Alice sends on x, x′, and x″ are not all the same, there are two possibilities: (1) her messages on x and x′ are different. In this case the rectangle {x, x′} × {y} is not a monochromatic rectangle induced by the protocol P; or (2) her messages on x′ and x″ are different. In this case the rectangle {x′, x″} × {y″} is not a monochromatic rectangle induced by the protocol P. Similarly, if Bob is the first player to send a nonconstant message, then this message is inconsistent with either the rectangle {x} × {y′, y″} or with the rectangle {x″} × {y, y′}.
Everyone accepts that large programs should be organized as hierarchical modules. Standard ml's structures and signatures meet this requirement. Structures let us package up declarations of related types, values and functions. Signatures let us specify what components a structure must contain. Using structures and signatures in their simplest form we have treated examples ranging from the complex numbers in Chapter 2 to infinite sequences in Chapter 5.
A modular structure makes a program easier to understand. Better still, the modules ought to serve as interchangeable parts: replacing one module by an improved version should not require changing the rest of the program. Standard ml's abstract types and functors can help us meet this objective too.
A module may reveal its internal details. When the module is replaced, other parts of the program that depend upon such details will fail. ml provides several ways of declaring an abstract type and related operations, while hiding the type's representation.
If structure B depends upon structure A, and we wish to replace A by another structure A′, we could edit the program text and recompile the program. That is satisfactory if A is obsolete and can be discarded. But what if A and A′ are both useful, such as structures for floating point arithmetic in different precisions?
ml lets us declare B to take a structure as a parameter.
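A minimal sketch of this parametrisation, with hypothetical signature and structure names; only the mechanism is the point:

```sml
(* A signature describing what B requires of its parameter. *)
signature ARITH =
  sig
    type t
    val zero : t
    val sum  : t * t -> t
  end;

(* B written as a functor: any structure matching ARITH may be plugged in. *)
functor Summation (A : ARITH) =
  struct
    fun total xs = foldl A.sum A.zero xs
  end;

(* One possible parameter; a structure for a different precision could
   replace it without touching Summation itself. *)
structure RealArith : ARITH =
  struct
    type t = real
    val zero = 0.0
    fun sum (x, y) : real = x + y
  end;

structure RealSummation = Summation (RealArith);
```

Replacing A by A′ now means writing one line — another functor application — rather than editing and recompiling B.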
This book originated in lectures on Standard ml and functional programming. It can still be regarded as a text on functional programming — one with a pragmatic orientation, in contrast to the rather idealistic books that are the norm — but it is primarily a guide to the effective use of ml. It even discusses ml's imperative features.
Some of the material requires an understanding of discrete mathematics: elementary logic and set theory. Readers will find it easier if they already have some programming experience, but this is not essential.
The book is a programming manual, not a reference manual; it covers the major aspects of ml without getting bogged down with every detail. It devotes some time to theoretical principles, but is mainly concerned with efficient algorithms and practical programming.
The organization reflects my experience with teaching. Higher-order functions appear late, in Chapter 5. They are usually introduced at the very beginning with some contrived example that only confuses students. Higher-order functions are conceptually difficult and require thorough preparation. This book begins with basic types, lists and trees. When higher-order functions are reached, a host of motivating examples is at hand.
The exercises vary greatly in difficulty. They are not intended for assessing students, but for providing practice, broadening the material and provoking discussion.
Overview of the book. Most chapters are devoted to aspects of ml. Chapter 1 introduces the ideas behind functional programming and surveys the history of ml.
Functional programming has its merits, but imperative programming is here to stay. It is the most natural way to perform input and output. Some programs are specifically concerned with managing state: a chess program must keep track of where the pieces are! Some classical data structures, such as hash tables, work by updating arrays and pointers.
Standard ml's imperative features include references, arrays and commands for input and output. They support imperative programming in full generality, though with a flavour unique to ml. Looping is expressed by recursion or using a while construct. References behave differently from Pascal and C pointers; above all, they are secure.
Imperative features are compatible with functional programming. References and arrays can serve in functions and data structures that exhibit purely functional behaviour. We shall code sequences (lazy lists) using references to store each element. This avoids wasteful recomputation, which is a defect of the sequences of Section 5.12. We shall code functional arrays (where updating creates a new array) with the help of mutable arrays. This representation of functional arrays can be far more efficient than the binary tree approach of Section 4.15.
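The key device can be sketched as a memoizing suspension (an illustrative fragment, not the book's actual code): a reference holds either the delayed computation or, once forced, its value.

```sml
(* A suspension remembers its result: after the first force, the stored
   closure is replaced by the computed value. *)
datatype 'a memo = Delayed of unit -> 'a
                 | Forced  of 'a;

type 'a susp = 'a memo ref;

fun delay f = ref (Delayed f);

fun force r =
      case !r of
          Forced v  => v
        | Delayed f => let val v = f ()
                       in  r := Forced v;  v  end;
```

A lazy list can then store each tail as a suspension, so that no element is ever computed twice, yet callers observe purely functional behaviour.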
A typical ml program is largely functional. It retains many of the advantages of functional programming, including readability and even efficiency: garbage collection can be faster for immutable objects. Even for imperative programming, ml has advantages over conventional languages.
The first ml compiler was built in 1974. As the user community grew, various dialects began to appear. The ml community then got together to develop and promote a common language, Standard ml — sometimes called sml, or just ml. Good Standard ml compilers are available.
Standard ml has become remarkably popular in a short time. Universities around the world have adopted it as the first programming language to teach to students. Developers of substantial applications have chosen it as their implementation language. One could explain this popularity by saying that ml makes it easy to write clear, reliable programs. For a more satisfying explanation, let us examine how we look at computer systems.
Computers are enormously complex. The hardware and software found in a typical workstation are more than one mind can fully comprehend. Different people understand the workstation on different levels. To the user, the workstation is a word processor or spreadsheet. To the repair crew, it is a box containing a power supply, circuit boards, etc. To the machine language programmer, the workstation provides a large store of bytes, connected to a processor that can perform arithmetic and logical operations. The applications programmer understands the workstation through the medium of the chosen programming language.
Here we take ‘spreadsheet’, ‘power supply’ and ‘processor’ as ideal, abstract concepts. We think of them in terms of their functions and limitations, but not in terms of how they are built.
This chapter brings together all the concepts we have learned so far. For an extended example, it presents a collection of modules to implement the λ-calculus as a primitive functional programming language. Terms of the λ-calculus can be parsed, evaluated and the result displayed. It is hardly a practical language. Trivial arithmetic calculations employ unary notation and take minutes. However, its implementation involves many fundamental techniques: parsing, representing bound variables and reducing expressions to normal form. These techniques can be applied to theorem proving and computer algebra.
Chapter outline
We consider parsing and two interpreters for λ-terms, with an overview of the λ-calculus. The chapter contains the following sections:
A functional parser. An ml functor implements top-down recursive descent parsing. Parsers can be combined using infix operators that resemble the symbols for combining grammatical phrases.
Introducing the λ-calculus. Terms of this calculus can express functional programs. They can be evaluated using either the call-by-value or the call-by-name mechanism. Substitution must be performed carefully, avoiding variable name clashes.
Representing λ-terms in ml. Substitution, parsing and pretty printing are implemented as ml structures.
The λ-calculus as a programming language. Typical data structures of functional languages, including infinite lists, are encoded in the λ-calculus. The evaluation of recursive functions is demonstrated.
A functional parser
Before discussing the λ-calculus, let us consider how to write scanners and parsers in a functional style.
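To fix ideas, here is a sketch of the combinator style this section develops (the names and details are illustrative, not the chapter's actual code): a parser maps a token list to a result paired with the unparsed remainder.

```sml
exception SynError;

(* tok t  recognises exactly the token t, returning it together with
   the tokens left over. *)
fun tok t [] = raise SynError
  | tok t (u :: toks) = if t = u then (t, toks) else raise SynError;

(* ph1 -- ph2  parses one phrase followed by another. *)
fun -- (ph1, ph2) toks =
      let val (x, toks2) = ph1 toks
          val (y, toks3) = ph2 toks2
      in  ((x, y), toks3)  end;

(* ph1 || ph2  tries the first parser, falling back on the second. *)
fun || (ph1, ph2) toks = ph1 toks handle SynError => ph2 toks;

infix 5 --;
infix 0 ||;
```

Declared infix, these combinators let a grammar be transcribed almost symbol for symbol into a working parser.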
The exercises in this book are intended to deepen your understanding of ml and improve your programming skills. But such exercises cannot turn you into a programmer, let alone a software engineer. A project is more than a large programming exercise; it involves more than programming. It demands careful preparation: background study, analysis of requirements, design. The finished program should be evaluated fairly but thoroughly.
Each suggestion is little better than a hint but, with a little effort, can be developed into a proper proposal. Follow the attached references and prepare a project description including a statement of objectives, a provisional timetable and a list of required resources. The next stage is to write a detailed requirements analysis, listing all functions in sufficient detail to allow someone else to carry out eventual testing. Then specify the basic design; ml functors and signatures can describe the main components and their interfaces.
The preparatory phases outlined above might be done by the instructor, a student or a team of students. This depends upon the course aims, which might be concerned purely with ml, with project management, or with demonstrating some methodology of software engineering. The final evaluation might similarly be done by the instructor, the implementor or another team of students.
The evaluation should consider to what extent the program meets its objectives.
The most powerful techniques of functional programming are those that treat functions as data. Most functional languages give function values full rights, free of arbitrary restrictions. Like other values, functions may be arguments and results of other functions and may belong to pairs, lists and trees.
Procedural languages like Fortran and Pascal accept this idea as far as is convenient for the compiler writer. Functions may be arguments: say, the comparison to be used in sorting or a numerical function to be integrated. Even this restricted case is important.
A function is higher-order (or a functional) if it operates on other functions. For instance, the functional map applies a function to every element of a list, creating a new list. A sufficiently rich collection of functionals can express all functions without using variables. Functionals can be designed to construct parsers (see Chapter 9) and theorem proving strategies (see Chapter 10).
Infinite lists, whose elements are evaluated upon demand, can be implemented using functions as data. The tail of a lazy list is a function that, if called, produces another lazy list. A lazy list can be infinitely long and any finite number of its elements can be evaluated.
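Both ideas can be sketched in a few lines of ml (illustrative definitions; Chapter 5 develops the real ones):

```sml
(* map applies f to every element of a list, building a new list. *)
fun map f [] = []
  | map f (x :: xs) = f x :: map f xs;

(* A lazy list: the tail is a function that produces the rest on demand. *)
datatype 'a lseq = LNil
                 | LCons of 'a * (unit -> 'a lseq);

(* The infinite sequence k, k+1, k+2, ... *)
fun from k = LCons (k, fn () => from (k + 1));

(* Force the first n elements only. *)
fun take 0 _    = []
  | take n LNil = []
  | take n (LCons (x, tailf)) = x :: take (n - 1) (tailf ());

(* take 3 (from 5) evaluates to [5, 6, 7] *)
```

Here from builds an infinitely long list, yet take evaluates only as many elements as it is asked for.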
Chapter outline
The first half presents the essential programming techniques involving functions as data. The second half serves as an extended, practical example. Lazy lists can be represented in ml (despite its strict evaluation rule) by means of function values.