Data abstraction is perhaps the most important technique for structuring programs. The main idea is to introduce an interface that serves as a contract between the client and the implementor of an abstract type. The interface specifies what the client may rely on for its own work, and, simultaneously, what the implementor must provide to satisfy the contract. The interface serves to isolate the client from the implementor so that each may be developed in isolation from the other. In particular, one implementation may be replaced by another without affecting the behavior of the client, provided that the two implementations meet the same interface and are, in a sense to be made precise shortly, suitably related to one another. (Roughly, each simulates the other with respect to the operations in the interface.) This property is called representation independence for an abstract type.
Data abstraction may be formalized by extending the language ℒ{→ ∀} with existential types. Interfaces are modeled as existential types that provide a collection of operations acting on an unspecified, or abstract, type. Implementations are modeled as packages, the introductory form for existentials, and clients are modeled as uses of the corresponding elimination form. It is remarkable that the programming concept of data abstraction is modeled so naturally and directly by the logical concept of existential type quantification. Existential types are closely connected with universal types and hence are often treated together.
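The package/client pattern can be sketched concretely. The following Python sketch is purely illustrative (the queue interface, the two representations, and all names are ours, not from the text): each "package" implements the same interface over a hidden representation, and a client written against the interface cannot distinguish them.

```python
# Illustrative sketch of packages and clients (names are assumptions, not
# from the text). Each package implements a queue interface {emp, ins, rem}
# over a hidden representation type.

def list_queue():
    """Package: queues represented as Python lists (front at index 0)."""
    return {
        "emp": [],
        "ins": lambda x, q: q + [x],
        "rem": lambda q: (q[0], q[1:]) if q else None,
    }

def pair_queue():
    """Package: queues as a pair of stacks (a different representation)."""
    def rem(q):
        front, back = q
        if not front:
            front, back = back[::-1], []    # transfer the back stack
        if not front:
            return None
        return front[-1], (front[:-1], back)
    return {
        "emp": ([], []),
        "ins": lambda x, q: (q[0], q[1] + [x]),
        "rem": rem,
    }

def client(pkg):
    """Client: uses only the interface, never the representation."""
    q = pkg["emp"]
    for x in (1, 2, 3):
        q = pkg["ins"](x, q)
    out = []
    while (r := pkg["rem"](q)) is not None:
        x, q = r
        out.append(x)
    return out

# Representation independence: both packages induce the same client behavior.
```

Either package may be swapped for the other without affecting any client, which is the informal content of representation independence.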
Types are the central organizing principle of the theory of programming languages. Language features are manifestations of type structure. The syntax of a language is governed by the constructs that define its types, and its semantics is determined by the interactions among those constructs. The soundness of a language design – the absence of ill-defined programs – follows naturally.
The purpose of this book is to explain this remark. A variety of programming language features are analyzed in the unifying framework of type theory. A language feature is defined by its statics, the rules governing the use of the feature in a program, and its dynamics, the rules defining how programs using this feature are to be executed. The concept of safety emerges as the coherence of the statics and the dynamics of a language.
In this way we establish a foundation for the study of programming languages. But why these particular methods? The main justification is provided by the book itself. The methods we use are both precise and intuitive, providing a uniform framework for explaining programming language concepts. Importantly, these methods scale to a wide range of programming language concepts, supporting rigorous analysis of their properties. Although it would require another book in itself to justify this assertion, these methods are also practical in that they are directly applicable to implementation and uniquely effective as a basis for mechanized reasoning. No other framework offers as much.
The binary product of two types consists of ordered pairs of values, one from each type in the order specified. The associated eliminatory forms are projections, which select the first and second components of a pair. The nullary product, or unit, type consists solely of the unique “null tuple” of no values and has no associated eliminatory form. The product type admits both a lazy and an eager dynamics. According to the lazy dynamics, a pair is a value without regard to whether its components are values; they are not evaluated until (if ever) they are accessed and used in another computation. According to the eager dynamics, a pair is a value only if its components are values; they are evaluated when the pair is created.
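The difference between the two dynamics can be observed by recording when components are evaluated. The following sketch (our own illustration, not from the text) models lazy pairs with thunks:

```python
# Sketch contrasting eager and lazy pair dynamics (illustrative).
# Eager: components are evaluated when the pair is created.
# Lazy: components are suspended as thunks, forced only on projection.

trace = []

def expensive(tag, value):
    trace.append(tag)     # records when evaluation actually happens
    return value

# Eager pair: both components evaluate immediately at pair creation.
eager = (expensive("e1", 1), expensive("e2", 2))

# Lazy pair: components are zero-argument thunks; creating the pair
# evaluates nothing.
lazy = (lambda: expensive("l1", 1), lambda: expensive("l2", 2))

fst = lazy[0]()   # only now is the first component evaluated
# The second component is never forced, so "l2" never appears in the trace.
```

After running this, the trace contains the eager components and only the one lazy component that was actually projected.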
More generally, we may consider the finite product ⟨τi⟩i∈I indexed by a finite set of indices I. The elements of the finite product type are I-indexed tuples whose ith component is an element of the type τi for each i ∈ I. The components are accessed by I-indexed projection operations, generalizing the binary case. Special cases of the finite product include n-tuples, indexed by sets of the form I = {0, …, n − 1}, and labeled tuples, or records, indexed by finite sets of symbols. Similar to binary products, finite products admit both an eager and a lazy interpretation.
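The two special cases can be illustrated side by side; this sketch is ours, not from the text:

```python
from typing import NamedTuple

# Sketch: finite products as n-tuples and as records (illustrative).
# The index set I determines which projections are available.

# n-tuple: indexed by {0, ..., n-1}; projections are positional.
triple = (True, 7, "abc")
second = triple[1]           # projection at index 1

# Record (labeled tuple): indexed by a finite set of symbols {x, y};
# projections are by label.
class Point(NamedTuple):
    x: float
    y: float

p = Point(x=3.0, y=4.0)
px = p.x                     # projection at label x
```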
The technique of structural dynamics is very useful for theoretical purposes, such as proving type safety, but is too high level to be directly usable in an implementation. One reason is that the use of “search rules” requires the traversal and reconstruction of an expression in order to simplify one small part of it. In an implementation we would prefer to use some mechanism to record “where we are” in the expression so that we may resume from that point after a simplification. This can be achieved by introducing an explicit mechanism, called a control stack, that keeps track of the context of an instruction step for just this purpose. By making the control stack explicit, the transition rules avoid the need for any premises—every rule is an axiom. This is the formal expression of the informal idea that no traversals or reconstructions are required to implement it. This chapter introduces an abstract machine K{nat ⇀} for the language ℒ{nat ⇀}. The purpose of this machine is to make control flow explicit by introducing a control stack that maintains a record of the pending subcomputations of a computation. We then prove the equivalence of K{nat ⇀} with the structural dynamics of ℒ{nat ⇀}.
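The flavor of such a machine can be conveyed by a much smaller sketch. The following is our own illustration in the spirit of K{nat ⇀}, restricted to numerals and addition; the state and frame names are assumptions, not the book's notation. Each transition is an axiom: the machine never searches inside a subexpression, because the stack records where it is.

```python
# Illustrative control-stack machine for numerals and addition.
# Expressions: an int, or ("add", e1, e2).
# States: ("eval", stack, e)  -- evaluate e against the pending stack
#         ("ret",  stack, v)  -- return value v to the topmost frame

def run(e):
    state = ("eval", [], e)
    while True:
        mode, stack, arg = state
        if mode == "eval":
            if isinstance(arg, int):                    # numerals are values
                state = ("ret", stack, arg)
            else:                                       # ("add", e1, e2)
                _, e1, e2 = arg
                # push a frame recording the pending right operand
                state = ("eval", stack + [("add1", e2)], e1)
        else:  # mode == "ret"
            if not stack:
                return arg                              # final state: empty stack
            frame, rest = stack[-1], stack[:-1]
            if frame[0] == "add1":                      # left operand finished
                state = ("eval", rest + [("add2", arg)], frame[1])
            else:                                       # ("add2", v1): both done
                state = ("ret", rest, frame[1] + arg)

# Every step is premise-free: the stack, not a search rule, tracks context.
```

A structural dynamics would instead need search rules to descend into the left and right operands; here those traversals have been compiled away into explicit frames.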
Constructive logic codifies the principles of mathematical reasoning as they are actually practiced. In mathematics a proposition may be judged to be true exactly when it has a proof and may be judged to be false exactly when it has a refutation. Because there are, and always will be, unsolved problems, we cannot expect in general that a proposition is either true or false, for in most cases we have neither a proof nor a refutation of it. Constructive logic may be described as logic as if people matter, as distinct from classical logic, which may be described as the logic of the mind of god. From a constructive viewpoint the judgment “ϕ true” means that “there is a proof of ϕ.”
What constitutes a proof is a social construct, an agreement among people as to what a valid argument is. The rules of logic codify a set of principles of reasoning that may be used in a valid proof. The valid forms of proof are determined by the outermost structure of the proposition whose truth is asserted. For example, a proof of a conjunction consists of a proof of each of its conjuncts, and a proof of an implication consists of a transformation of a proof of its antecedent to a proof of its consequent. When spelled out in full, the forms of proof are seen to correspond exactly to the forms of expression of a programming language.
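The correspondence can be made concrete in any typed language. The sketch below is our own illustration, with propositions stood in for by string tokens: a proof of a conjunction is a pair, and a proof of an implication is a function from proofs to proofs.

```python
from typing import Callable, Tuple

# Illustrative proofs-as-programs sketch (our own; the tokens and names
# are assumptions). Suppose A and B are propositions with proofs a and b.
a, b = "proof of A", "proof of B"

# Proof of A ∧ B: a pair consisting of a proof of each conjunct.
and_proof: Tuple[str, str] = (a, b)

# Proof of (A ∧ B) ⊃ A: a transformation of a proof of the antecedent
# into a proof of the consequent -- here, the first projection.
def fst(p: Tuple[str, str]) -> str:
    return p[0]

# Proof of A ⊃ (B ⊃ A): a curried function that ignores its second argument.
def const(x: str) -> Callable[[str], str]:
    return lambda _: x
```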
A distributed computation is one that takes place at many different sites, each of which controls some resources located at that site. For example, the sites might be nodes on a network, and a resource might be a device or sensor located at that site or a database controlled by that site. Only programs that execute at a particular site may access the resources situated at that site. Consequently, command execution always takes place at a particular site, called the locus of execution. Access to resources at a remote site from a local site is achieved by moving the locus of execution to the remote site, running code to access the local resource, and returning a value to the local site.
In this chapter we consider the language ℒ{nat cmd ⇀∥}, an extension of Concurrent Algol with a spatial type system that mediates access to located resources on a network. The type safety theorem ensures that all accesses to a resource controlled by a site are through a program executing at that site, even though references to local resources may be freely passed around to other sites on the network. The key idea is that channels and events are located at a particular site and that synchronization on an event may occur only at the site appropriate to that event. Issues of concurrency, which are to do with nondeterministic composition, are thereby cleanly separated from those of distribution, which are to do with the locality of resources on a network.
Programming languages are languages, a means of expressing computations in a form comprehensible to both people and machines. The syntax of a language specifies the means by which various sorts of phrases (expressions, commands, declarations, and so forth) may be combined to form programs. But what sort of thing are these phrases? What is a program made of?
The informal concept of syntax may be seen to involve several distinct concepts. The surface, or concrete, syntax is concerned with how phrases are entered and displayed on a computer. The surface syntax is usually thought of as given by strings of characters from some alphabet (say, ASCII or Unicode). The structural, or abstract, syntax is concerned with the structure of phrases, specifically how they are composed from other phrases. At this level a phrase is a tree, called an abstract syntax tree, whose nodes are operators that combine several phrases to form another phrase. The binding structure of syntax is concerned with the introduction and use of identifiers: how they are declared and how declared identifiers are to be used. At this level phrases are abstract binding trees, which enrich abstract syntax trees with the concepts of binding and scope.
We do not concern ourselves in this book with matters of concrete syntax, but instead work at the level of abstract syntax. To prepare the ground for the rest of the book, this chapter begins by defining abstract syntax trees and abstract binding trees and some functions and relations associated with them.
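The distinction between abstract syntax trees and abstract binding trees can be sketched in a few lines. The following is our own illustration (the operator names and the `free_vars` helper are assumptions): an abstractor node binds a variable within its body, which is what lets us compute free variables.

```python
from dataclasses import dataclass

# Illustrative sketch of syntax trees with binding (abstract binding trees).

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Op:                  # an operator combining several phrases
    op: str
    args: tuple

@dataclass(frozen=True)
class Abs:                 # an abstractor: binds `name` within `body`
    name: str
    body: object

def free_vars(t):
    """Free variables of a tree; the abstractor removes its bound variable."""
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, Op):
        return set().union(*(free_vars(a) for a in t.args), set())
    return free_vars(t.body) - {t.name}

# let x = 3 in x + y : here x is bound by the abstractor, y is free.
term = Op("let", (Op("num3", ()), Abs("x", Op("plus", (Var("x"), Var("y"))))))
```

A plain abstract syntax tree has only `Var` and `Op`; adding `Abs` is what introduces the concepts of binding and scope.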
It frequently arises that the values of a type are partitioned into a variety of classes, each classifying data with a distinct internal structure. A good example is provided by the type of points in the plane, which may be classified according to whether they are represented in Cartesian or polar form. Both are represented by a pair of real numbers, but in the Cartesian case these are the x and y coordinates of the point, whereas in the polar case these are its distance r from the origin and its angle θ with the polar axis. A classified value is said to be an object, or instance, of its class. The class determines the type of the classified data, which are called the instance type of the class. The classified data itself is called the instance data of the object.
Functions that act on classified values are sometimes called methods. The behavior of a method is determined by the class of its argument. The method is said to dispatch on the class of the argument. Because it happens at run time, this is called dynamic dispatch. For example, the squared distance of a point from the origin is calculated differently according to whether the point is represented in Cartesian or polar form. In the former case the required distance is x² + y², whereas in the latter it is simply r².
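The point example can be sketched directly with classes; this code is our own illustration (the class and method names are assumptions), using the language's built-in dispatch on the run-time class of the receiver.

```python
import math

# Illustrative sketch of dynamic dispatch on classified points.

class Cartesian:
    def __init__(self, x, y):
        self.x, self.y = x, y          # instance data: coordinates
    def dist_squared(self):
        return self.x ** 2 + self.y ** 2   # x² + y²

class Polar:
    def __init__(self, r, theta):
        self.r, self.theta = r, theta  # instance data: distance and angle
    def dist_squared(self):
        return self.r ** 2             # r², independent of the angle

# Which method body runs is determined at run time by the class of p.
points = [Cartesian(3.0, 4.0), Polar(5.0, math.pi / 6)]
distances = [p.dist_squared() for p in points]
```

Both objects here lie at squared distance 25 from the origin, but the computation performed depends on the class of each.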
The motivation for introducing polymorphism was to enable more programs to be written—those that are “generic” in the sense of Chapter 20. Then if a program does not depend on the choice of types, we can code it by using polymorphism. Moreover, if we wish to insist that a program cannot depend on a choice of types, we demand that it be polymorphic. Thus polymorphism can be used both to expand the collection of programs we may write and also to limit the collection of programs that are permissible in a given context.
The restrictions imposed by polymorphic typing give rise to the experience that in a polymorphic functional language, if the types are correct, then the program is correct. Roughly speaking, if a function has a polymorphic type, then the strictures of type genericity vastly cut down the set of programs with that type. Thus if you have written a program with this type, it is quite likely to be the one you intended!
The technical foundation for these remarks is called parametricity. The goal of this chapter is to give an account of parametricity for ℒ{→ ∀} under a call-by-name interpretation.
Overview
We begin with an informal discussion of parametricity based on a “seat of the pants” understanding of the set of well-formed programs of a type.
Suppose that a function value f has the type ∀(t .t → t).
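The intuition can be previewed with a sketch, here in Python's gradual-typing notation (our own illustration, not the book's development): a function of type ∀(t. t → t) must work uniformly at every type, so it has no type-specific operations available and, setting aside divergence, can only be the identity.

```python
from typing import TypeVar

T = TypeVar("T")

# Illustrative sketch of the parametricity intuition for ∀(t. t → t).

def f(x: T) -> T:
    # The body may not depend on what type x has: no arithmetic, no string
    # operations, no class tests are available generically. Returning x
    # is essentially the only option.
    return x

# Any value of any type passes through unchanged.
results = (f(3), f("abc"), f([1, 2]))
```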