The Shorter Oxford English Dictionary defines the word paradigm as meaning pattern or example, but it is used here in its generally accepted sense in this field, where it is taken to imply a fundamental technique or key idea. This chapter, therefore, is concerned with describing the fundamental ideas behind the implementation of parallel computation.
Two matters need to be dealt with before we begin. First, the reader should avoid confusion between the basic approaches set out in Chapter 1 and the paradigms described here. In the final chapter of this book, I develop a taxonomy of parallel computing systems, i.e. a structured analysis of systems in which each succeeding stage is based on increasingly detailed properties. In this taxonomy, the first two levels of differentiation are on the basis of the three approaches of the first chapter, whereas the third level is based on the paradigms described here. This is shown in Figure 2.1.
Next, there is the whole subject of optical computing. In one sense, an optical component, such as a lens, is a data parallel computer of dedicated functionality (and formidable power). There is certainly an overlap in the functions of such components and those of, say, an image processing parallel computer of the conventional sort. A lens can perform a Fourier transform (a kind of frequency analysis) on an image, literally at the speed of light, whereas a conventional computer requires many cycles of operation to achieve the same result.
Before attempting to understand the complexities of the subject of parallel computing, the intending user or student ought, perhaps, to ask why such an exotic approach is necessary. After all, ordinary, serial computers are in successful and widespread use in every area of society in industrially developed nations, and obtaining a sufficient understanding of their use and operation is no simple task. It might even be argued that, since the only reason for using two computers in place of one is that the single device is insufficiently powerful, a better approach is to increase the power (presumably by technological improvements) of the single machine.
As is usually the case, such a simplistic approach to the problem conceals a number of significant points. There are many application areas where the available power of ‘ordinary’ computers is insufficient to obtain the desired results. In the area of computer vision, for example, this insufficiency is related to the amount of time available for computation, results being required at a rate suitable for, perhaps, autonomous vehicle guidance. In the case of weather forecasting, existing models, running on single computers, are certainly able to produce results. Unfortunately, these are somewhat lacking in accuracy, and improvements here depend on significant extensions to the scope of the computer modelling involved.
This chapter describes GTC, an alternative approach to the use of type classes that avoids the problems associated with context reduction, while retaining much of the flexibility of HTC. In addition, GTC benefits from a remarkably clean and efficient implementation that does not require sophisticated compile-time analysis or transformation. As in the previous chapter, we concentrate more on implementation details than on formal properties of GTC.
An early description of GTC was distributed to the Haskell mailing list in February 1991 and subsequently used as a basis for Gofer, a small experimental system based on Haskell and described in (Jones, 1991c). The two languages are indeed very close, and many programs that are written with one system in mind can be used with the other with little or no change. On the other hand, the underlying type systems are slightly different: using explicit type signature declarations, it is possible to construct examples that are well typed in one but not in the other.
Section 8.1 describes the basic principles of GTC and its relationship to HTC. The only significant differences between the two systems are in the methods used to simplify the context part of an inferred type. While HTC relies on the use of context reduction, GTC adopts a weaker form of simplification that does not make use of the information provided in instance declarations.
Section 8.2 describes the implementation of dictionaries used in the current version of Gofer. As an alternative to the treatment of dictionaries as tuples of values in the previous chapter, we give a representation which guarantees that the translation of each member function definition requires at most one dictionary parameter.
The principal aim of this chapter is to show how the concept of evidence can be used to give a semantics for OML programs with implicit overloading.
Outline of chapter
We begin by describing a version of the polymorphic λ-calculus called OP that includes the constructs for evidence application and abstraction described in the previous chapter (Section 5.1). One of the main uses of OP is as the target of a translation from OML with the semantics of each OML term being defined by those of its translation. In Section 5.2 we show how the OML typing derivations for a term E can be interpreted as OP derivations for terms with explicit overloading, each of which is a potential translation for E. It is immediate from this construction that every well-typed OML term has a translation and that all translations obtained in this way are well-typed in OP.
Given that each OML typing typically has many distinct derivations it follows that there will also be many distinct translations for a given term and it is not clear which should be chosen to represent the original term. The OP term corresponding to the derivation produced by the type inference algorithm in Section 3.4 gives one possible choice but it seems rather unnatural to base a definition of semantics on any particular type inference algorithm. A better approach is to show that any two translations of a term are semantically equivalent so that an implementation is free to use whichever translation is more convenient in a particular situation while retaining the same, well-defined semantics.
Whilst the two issues of control (what are all these parallel processors going to do?) and programming (how is the user to tell them all what to do?) are perhaps the most difficult conceptual aspects of parallel computing, the question of the connections between all the many system components is probably the hardest technical problem. There are two main levels at which the problem must be considered. At the first level, the connections between major system components, such as CPU, controller, data input and output devices, must be given careful consideration to avoid introducing bottlenecks which might destroy hoped-for performance. At the second level, connections within these major components, particularly between processing units and their associated memories, will be the major controlling factor of overall system performance. Within this area, an important conceptual difference exists between two alternative approaches to inter-processor communication. Individual pairs of elements may be either externally synchronised, in which case a controller ensures that if data is input at one end of a line it is simultaneously accepted at the other, or unsynchronised. In this latter case, the donating element signals that data is available, but the data is not transferred until the receiving element is ready to accept it.
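The unsynchronised case described above can be loosely illustrated in software. The following is a minimal sketch, assuming that a one-place `MVar` from Haskell's `Control.Concurrent.MVar` is an acceptable stand-in for a communication line between two elements; the name `transfer` is hypothetical and nothing here models real hardware.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- Hypothetical helper: pass each item from a donating thread to the
-- receiving (calling) thread through a one-place slot.
transfer :: [a] -> IO [a]
transfer items = do
  slot <- newEmptyMVar
  -- Donor: putMVar marks data as available. It blocks while the slot is
  -- still full, i.e. until the receiver has accepted the previous item.
  _ <- forkIO (mapM_ (putMVar slot) items)
  -- Receiver: takeMVar blocks until data is available, then accepts it.
  mapM (const (takeMVar slot)) items

main :: IO ()
main = transfer "abc" >>= putStrLn
```

Neither side relies on an external controller; each simply waits until the other is ready, which is the essential property of the unsynchronised scheme.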
In addition to this, technical problems are present. The first is the position and purpose of memory in the general structure.
This chapter expands on the implementation of type classes in Haskell using dictionary values as proposed by Wadler and Blott (1989) and sketched in Section 4.5. For brevity, we refer to this approach to the use of type classes as HTC. The main emphasis in this chapter is on concrete implementation and we adopt a less rigorous approach to formal properties of HTC than in previous chapters. In particular, we describe a number of optimisations that are necessary to obtain an efficient implementation of HTC, i.e. to minimise the cost of overloading. We do not consider the more general problems associated with the efficient implementation of non-strict functional languages like Haskell, which are beyond the scope of this thesis.
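To make the idea of dictionary values concrete, the following is a minimal hand-written sketch of how an overloaded function might look after translation. All names (`EqDict`, `mkEqDict`, `eqInt`, `eqList`, `member`) are hypothetical and chosen for illustration; this is not the representation used by any particular compiler.

```haskell
-- Hypothetical dictionary type standing in for a class `Eq a`
-- with two member functions.
data EqDict a = EqDict
  { eq  :: a -> a -> Bool
  , neq :: a -> a -> Bool
  }

-- Builds a dictionary from an equality test, filling in `neq`
-- with a default definition in terms of `eq`.
mkEqDict :: (a -> a -> Bool) -> EqDict a
mkEqDict e = EqDict { eq = e, neq = \x y -> not (e x y) }

-- Dictionary for an instance at type Int (borrowing the built-in
-- equality on Int purely for brevity).
eqInt :: EqDict Int
eqInt = mkEqDict (==)

-- Dictionary for an instance `Eq a => Eq [a]`: a function that
-- builds a list dictionary from an element dictionary.
eqList :: EqDict a -> EqDict [a]
eqList d = mkEqDict go
  where
    go []     []     = True
    go (x:xs) (y:ys) = eq d x y && go xs ys
    go _      _      = False

-- An overloaded function `member :: Eq a => a -> [a] -> Bool`
-- becomes a function with an explicit dictionary parameter.
member :: EqDict a -> a -> [a] -> Bool
member d x = any (eq d x)

main :: IO ()
main = do
  print (member eqInt 3 [1, 2, 3])
  print (member (eqList eqInt) [1, 2] [[1], [2, 1]])
```

Each use of `member` at a particular type supplies the matching dictionary, so the cost of overloading shows up as dictionary construction and member selection at run time.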
Section 7.1 describes an important aspect of the system of type classes in Haskell which means that only a particularly simple form of predicate expression can be used in the type signature of an overloaded function. The set of predicates in a Haskell type signature is usually referred to as the context and hence we will use the term context reduction to describe the process of reducing the context to an acceptable form. Context reduction usually results in a small context, acts as a partial check of satisfiability and helps to guarantee decidability of predicate entailment. Unfortunately, it can also interfere with the use of data abstraction and limits the possibilities for extending the Haskell system of type classes.
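As a small illustration of context reduction (a sketch, with the function name `sameList` chosen here for the example): comparing two lists requires equality at the type `[a]`, but the signature mentions only equality on the element type.

```haskell
-- The body uses (==) at type [a], which gives rise to the predicate
-- Eq [a]. Using the standard instance `Eq a => Eq [a]`, context
-- reduction simplifies this to Eq a, the form that standard Haskell
-- accepts in a type signature.
sameList :: Eq a => [a] -> [a] -> Bool
sameList xs ys = xs == ys

main :: IO ()
main = do
  print (sameList [1, 2 :: Int] [1, 2])
  print (sameList "abc" "abd")
```

The unreduced signature `Eq [a] => [a] -> [a] -> Bool` is not of the simple form that standard Haskell permits in a context, which is why the reduced form appears here.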
The main ideas used in the implementation of HTC are described in Section 7.2 including the treatment of default definitions which were omitted from our previous descriptions.
In this thesis we have developed a general formulation of overloading based on the use of qualified types. Applications of qualified types can be described by choosing an appropriate system of predicates and we have illustrated this with particular examples including Haskell type classes, explicit subtyping and extensible records. We have shown how these ideas can be extended to construct a system that combines ML-style polymorphism and overloading in an implicitly typed programming language. Using the concept of evidence we have extended this work to describe the semantics of overloading in this language, establishing sufficient conditions to guarantee that the meaning of a given term is well-defined. Finally, we have described techniques that can be used to obtain efficient concrete implementations of systems based on this framework.
From a theoretical perspective, some of the main contributions of this thesis are:
The formulation of a general purpose system that can be used to describe a number of different applications of overloading.
The extension of standard results, for example the existence of principal types, to the type system of OML.
A new approach to the proof of coherence, based on the use of conversions.
From a practical perspective, we mention:
The implementation of overloading using the template-based approach, and the closely related implementation of type class overloading in Gofer.
A new implementation for extensible records, based on the use of evidence.
The use of information about satisfiability of predicate sets to obtain more informative inferred types.
The key feature of a system of qualified types that distinguishes it from other systems based solely on parametric polymorphism is the use of a language of predicates to describe sets of types (or more generally, relations between types). Exactly which sets of types and relations are useful will (of course) vary from one application to another and it does not seem appropriate to base a general theory on any particular choice. Our solution, outlined in this chapter, is to work in a framework where the properties of a (largely unspecified) language of predicates are described in terms of an entailment relation that is expected to satisfy a few simple laws. In this way, we are able to treat the choice of a language of predicates as a parameter for each of the type systems described in subsequent chapters. This approach also has the advantage that it enables us to investigate how the properties of particular type systems are affected by properties of the underlying systems of predicates.
The basic notation for predicates and entailment is outlined in Section 2.1. The remaining sections illustrate this general framework with applications to: Haskell-style type classes (Section 2.2), subtyping (Section 2.3) and extensible records (Section 2.4). Although we consider each of these examples independently, this work opens up the possibility of combining elements of each in a single concrete programming language.
Basic definitions
For much of this thesis we deal with an abstract language of predicates on types.
The study of parallel computing is just about as old as that of computing itself. Indeed, the early machine architects and programmers (neither category would have described themselves in these terms) recognised no such delineations in their work, although the natural human predilection for describing any process as a sequence of operations on a series of variables soon entrenched this philosophy as the basis of all normal systems.
Once this basis had become firmly established, it required a definite effort of will to perceive that alternative approaches might be worthwhile, especially as the proliferation of potential techniques made understanding more difficult. Thus, today, newcomers to the field might be told, according to their informer's inclination, that parallel computing means the use of transputers, or neural networks, or systolic arrays, or any one of a seemingly endless number of possibilities. At this point, students have the alternatives of accepting that a single facet comprises the whole, or attempting their own processes of analysis and evaluation. The potential users of a system are as likely to be set on the wrong path as the right one toward fulfilling their own set of practical aims.
This book is an attempt to set out the general principles of parallel computing in a way which will be useful to student and user alike. The approach I adopt to the subject is top-down – the simplest and most fundamental principles are enunciated first, with each important area being subsequently treated in greater depth.
Many programming languages rely on the use of a system of types to distinguish between different kinds of value. This in turn is used to identify two classes of program: those which are well-typed and accepted by the type system, and those that it rejects. Many different kinds of type system have been considered but, in each case, the principal benefits are the same:
The ability to detect program errors at compile time: A type discipline can often help to detect simple program errors such as passing an inappropriate number of parameters to a function.
Improved performance: If, by means of the type system, it is possible to ensure that the result of a particular calculation will always be of a certain type, then it is possible to omit the corresponding runtime checks that would otherwise be needed before using that value. The resulting program will typically be slightly shorter and faster.
Documentation: The types of the values defined in a program are often useful as a simple form of documentation. Indeed, in some situations, just knowing the type of an object can be enough to deduce properties about its behaviour (Wadler, 1989).
The main disadvantage is that no effective type system is complete; there will always be programs that are rejected by the type system, even though they would have produced well-defined results if executed without consideration of the types of the terms involved.