It has long been recognized that computer designs employing more than one processor are a promising approach — some say the only approach — toward more powerful computing machines. Once one adopts this view, several issues immediately emerge: how to connect processors and memories, how processors can communicate efficiently, how to tolerate faults, how to exploit the redundancy inherent in multiprocessors to perform on-line maintenance and repair, and so forth.
This book confronts the above issues with two key insights. The first, due to Michael Rabin, is that there exist error-correcting codes whose redundancy is efficient in terms of the number of bits; such redundancy can be used to correct errors and erasures caused by component failures and resource limitations (such as limited buffer size). The second, due to Leslie Valiant, is that randomization is critical to achieving communication efficiency.
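Because the later chapters build on the first insight, a toy illustration may help. The sketch below (an invented illustration of the general dispersal idea, not Rabin's actual construction) encodes data into n pieces over the prime field GF(257) so that any k pieces suffice to reconstruct it; all names and parameters are the author's assumptions.

```python
# Toy information dispersal over GF(257): encode data into n pieces so
# that any k of them reconstruct it. Illustrative only; names invented.

P = 257  # prime modulus; every byte value 0..255 embeds directly

def modinv(a):
    return pow(a, P - 2, P)  # inverse via Fermat's little theorem

def disperse(data, n, k):
    """Encode data (ints < P, length a multiple of k) into n pieces."""
    pieces = [[] for _ in range(n)]
    for s in range(0, len(data), k):
        block = data[s:s + k]          # k field elements per block
        for i in range(n):
            x = i + 1                  # evaluation point of piece i
            val = 0
            for c in reversed(block):  # Horner evaluation of the block
                val = (val * x + c) % P
            pieces[i].append(val)
    return pieces

def reconstruct(ids, pieces, k):
    """Recover the data from any k pieces; ids are their original indices."""
    xs = [i + 1 for i in ids]
    out = []
    for pos in range(len(pieces[0])):
        # Solve the Vandermonde system sum_j block[j] * x^j = y (mod P).
        A = [[pow(x, j, P) for j in range(k)] + [pieces[r][pos]]
             for r, x in enumerate(xs)]
        for col in range(k):           # Gaussian elimination mod P
            pivot = next(r for r in range(col, k) if A[r][col])
            A[col], A[pivot] = A[pivot], A[col]
            inv = modinv(A[col][col])
            A[col] = [v * inv % P for v in A[col]]
            for r in range(k):
                if r != col and A[r][col]:
                    f = A[r][col]
                    A[r] = [(A[r][c] - f * A[col][c]) % P
                            for c in range(k + 1)]
        out.extend(A[j][k] for j in range(k))
    return out
```

With k = 3 and n = 5, each piece is one third the size of the data, yet any three pieces recover it; losing two pieces, say to faulty components or overflowing buffers, is harmless.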
We intend this book to be an up-to-date account of the information dispersal approach as applied to parallel computation. We also discuss related work in the general area of parallel communication and computation and provide an extensive bibliography, in the hope that either might be helpful to researchers and students who want to explore a particular topic. Although the material in this book extends across several disciplines (algebra, coding theory, number theory, arithmetic, algorithms, graph theory, combinatorics, and probability), the book is, the author believes, self-contained: adequate introductions are given and every proof is complete.
In this chapter, I consider the potential and actual reuse opportunities within UNIX. First, several methods are suggested that could increase the likelihood that the next submission matches an item in a small set of predictions offered to the user for review and reuse. All methods are applied to the UNIX traces, and the predictive “quality” of each method is measured and contrasted against the others. In the second part of the chapter, I investigate how well the reuse facilities supplied by the UNIX shell are used in practice.
Conditioning the distribution
In the last chapter, particular attention was paid to the recurrence of command lines during csh use, and to the probability distribution of the next line given a sequential history list of previous ones. We saw that the most striking feature of the collected statistics is the tremendous potential for a historical reuse facility: the recurrence rate is high and the last few submissions are the likeliest to be repeated.
One may predict what the user will do next by looking at those recent submissions. But there is still room for improvement, because a significant portion of recurrences are not recent submissions. Can better predictions of the user's next step be offered? This section proposes and evaluates alternative models of arranging a user's command line history that will condition the distribution in different ways.
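As a concrete illustration of the baseline recency model, the sketch below (the trace and all names are invented for illustration, not drawn from the collected data) offers the r most recent distinct command lines as predictions and measures how often the next submission is among them.

```python
# A minimal sketch of recency-based prediction: predict that the next
# submission will be one of the r most recent distinct command lines,
# and measure the hit rate over a (made-up) trace.

def recency_hit_rate(trace, r):
    history, hits = [], 0
    for line in trace:
        # Predictions: most recent distinct lines, newest first.
        seen, predictions = set(), []
        for prev in reversed(history):
            if prev not in seen:
                seen.add(prev)
                predictions.append(prev)
            if len(predictions) == r:
                break
        if line in predictions:
            hits += 1
        history.append(line)
    return hits / len(trace)

trace = ["ls", "cd src", "ls", "vi main.c", "make", "ls", "make", "vi main.c"]
print(recency_hit_rate(trace, r=3))   # → 0.5
```

Conditioning the history differently, as the models below do, amounts to changing how the prediction set is assembled from the same trace.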
The recurrence distributions of Section 5.4.2 were derived by considering all input for a user as one long sequential stream, with no barriers placed between sessions.
The simulation of the ideal parallel computation model, the PRAM, on the hypercube network is considered in this chapter. It is shown that a class of PRAM programs can be simulated with a slowdown of O(log N) with near certainty and without using hashing, where N denotes the number of processors. It is also shown that general PRAM programs can be simulated with a slowdown of O(log N) with the help of hashing. Both schemes are IDA-based and fault-tolerant.
Introduction
Parallel algorithms are notoriously hard to write and debug. Hence, it is only natural to turn to ideal models that provide good abstraction. Because these models do not assume any particular hardware configuration, they have the additional benefit that programs written for them can be executed on any hardware that supports the model, much as in the sequential case, where the existence of, say, a C compiler on a particular platform implies that standard C programs can be compiled and executed there [370]. The PRAM (“Parallel Random Access Machine”) is one such model. It completely abstracts away the cost of communication and allows us to focus on the computational aspect. Such convenience and generality, however, are not without their price: the PRAM model, unlike the von Neumann machine model, is not physically feasible to build [369]. The simulation of PRAMs by feasible computers is therefore important and forms the major theme of this chapter.
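The hashing idea behind such simulations can be conveyed with a small sketch. The code below illustrates the generic technique, not this chapter's scheme, and every parameter in it is an invented assumption: shared-memory addresses are mapped to the N processors through a randomly chosen linear hash, so the memory requests of a PRAM step tend to be spread evenly over the memory modules.

```python
# Sketch: distribute a shared address space over N processors with a
# random linear hash, so any fixed set of requests is balanced with
# high probability. Illustrative parameters only.
import random

N = 8                      # number of processors (hypercube of dimension 3)
PRIME = 2_147_483_647      # a large prime for the random linear hash

a = random.randrange(1, PRIME)
b = random.randrange(PRIME)

def home(address):
    """Processor (memory module) responsible for a shared address."""
    return ((a * address + b) % PRIME) % N

# One simulated PRAM step: each processor reads one shared address.
requests = [17, 1024, 33, 4096, 5, 98, 33, 7]
load = [0] * N
for addr in requests:
    load[home(addr)] += 1
print(max(load))   # congestion at the busiest module for this step
```

The slowdown of the simulation is governed by this congestion plus the time to route the requests across the network.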
After briefly examining the challenges facing the design of ever more powerful computers, we discuss the important issues in parallel processing and outline solutions. An overview of the book is also given.
The von Neumann Machine Paradigm
The past five decades have witnessed the birth of the first electronic computer [257] and the rapid growth of the computing industry, whose annual revenue exceeds $1,000 billion in the U.S. alone [162]. The demand for high-performance machines is further fueled by the advent of many crucial problems whose solutions require enormous computing power: environmental issues, the search for cures for diseases, and accurate and timely weather forecasting, to mention just a few [271]. Moreover, although the unrelenting decrease in feature size continues to improve the computing capability per chip, turning that capability into a corresponding increase in computing performance is a major challenge [152, 316]. All these factors point toward the necessity of sustained innovation in the design of computers.
It is not hard to see that the von Neumann machine paradigm [37], the conceptual framework for most computers, will impede further performance gains when considered at the system level. Input and output excluded, the von Neumann machine conceptually consists of a processing unit, a memory storing both programs and data, and a wire connecting the two.
In this final chapter, we briefly review techniques and concepts in fault-tolerant computing. Then we sketch the design of a fault-tolerant parallel computer, the HPC (“hypercube parallel computer”), based on the results and ideas of previous chapters.
Introduction
A fault-free computer, or indeed any fault-free human artifact, has never been built and never will be. No matter how reliable each component is, there is always a possibility, however small, that it will fail. Statistical principles dictate that, other things being equal, this possibility increases with the number of components. Such an event, if not anticipated and safeguarded against, will eventually make the computer malfunction, leading to anything from minor annoyance and inconvenience to disaster.
Recently, the same enormous decrease in hardware cost that makes parallel computers economically feasible has also made fault tolerance more affordable [297]. In other words, the low cost of hardware makes possible both a high degree of fault tolerance through redundancy and high performance. Indeed, most fault-tolerant computers today employ multiple processors; see [241, 254, 317] for good surveys.
It is against this background that we take the extra step of designing a hypercube parallel computer (HPC for short). In the HPC, processors are grouped into logical clusters consisting of physically close processors, and each program execution is replicated at all members of a cluster. Clusters overlap, however. The concept of a cluster — logical or physical — introduces a two-level, rather than flat, organization and can be found in, for example, the Cm [344], Cedar [187], and FTPP (“Fault Tolerant Parallel Processor”) [148] computers.
Humans are the most versatile of creatures, and computers the most versatile of their creations. Human–Computer Interaction (HCI) is the study of what they do together; in particular, HCI aims to make interaction better suit humans. Computers contribute to art, science, engineering, … all areas of human endeavor. It is no surprise, then, that there is heated debate about what the essence of HCI is and what it should be. What is good HCI? The answer will remain elusive, given that there is good engineering that is not art, good art that is not science, and good science that is not engineering.
It's easier to see what form of answer there can be by taking a quick excursion into another field. Imagine the discovery of a dye, such as W. H. Perkin's breakthrough discovery of mauve. Is it science? Yes: certain chemicals must react to produce the dyestuff, and the principles of chemistry suggest other possibilities. Is it art? Yes: it makes an attractive color. Is it engineering? Yes: its quantity production, fastness in materials, and so forth, are engineering. Perkin's work made the once royal purple accessible to all. Fortunately there is no subject “Human Chemical Interaction” to slide us into thinking that there is, or should be, one right view of the work of making or using, designing, standardizing, or evaluating a dye. Nevertheless, we appreciate a readily available, stunning color, used by an able artist, and one that lasts without deteriorating.
Schemes for activity reuse are based upon the assumption that the human–computer dialog has many recurring activities. Yet there is almost no empirical evidence confirming the existence of these recurrences or suggestions of how observed patterns of recurrences in one dialog would generalize to other dialogs. The next few chapters address this dearth. They provide empirical evidence that people not only repeat their activities, but that they do so in quite regular ways. This chapter starts with the general notion of recurrent systems, where most users predominantly repeat their previous activities. Such systems suggest potential for activity reuse because there is opportunity to give preferential treatment to the large number of repeated actions. A few suspected recurrent systems from both non-computer and computer domains are examined in this context to help pinpoint salient features. Particular attention is paid to repetition of activities in telephone use, information retrieval in technical manuals, and command lines in UNIX. The following chapters further examine UNIX as a recurrent system, and then generalize the results obtained into a set of design properties.
A definition of recurrent systems
An activity is loosely defined as the formulation and execution of one or more actions whose result is expected to gratify the user's immediate intention. It is the unit entered into incremental interaction systems (as defined in Section 1.2.1) (Thimbleby, 1990). Entering command lines, querying databases, and locating and selecting items in a menu hierarchy are some examples.
This chapter serves two purposes: review of previous work and preview of our new results on the parallel routing problem. Our general approach to the problem is also outlined. The routing schemes and their analysis will appear in the next chapter.
Introduction
Perhaps the most important issue in parallel computers is interprocessor communication. Because cost considerations dictate that processors be linked by a relatively sparse network, packets may traverse many links and be delayed by other packets, owing to conflicts or full buffers, before reaching their destinations. Communication time can therefore easily dominate the total execution time, making the design of fast communication schemes a primary concern. It is also preferable, for the sake of scalability, that the buffer size be a constant independent of the size of the network. Finally, as more components mean more faults [326], other things being equal, the issue of fault tolerance without loss of efficiency should be addressed at the same time.
We review previous work and preview our approach and results in this chapter. In the next chapter, it will be shown how the above-mentioned problems can be tackled simultaneously using information dispersal.
Fast, fault-tolerant communication schemes using constant-size buffers on the hypercube and de Bruijn networks are presented. All our schemes route successfully with probability 1 − N^(−Θ(log N)), where N denotes the number of processors. A summary of this chapter's results can be found in Section 4.4.
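To make the randomized flavor of such schemes concrete, here is a sketch of the classic two-phase idea due to Valiant — an illustration of the general technique only, not of our IDA-based schemes. Each packet first travels to a random intermediate node and then to its true destination, correcting one address bit per hop.

```python
# Valiant-style two-phase randomized routing on a hypercube (sketch):
# phase 1 routes to a random intermediate node, phase 2 to the true
# destination, fixing one differing address bit per hop.
import random

def bit_fix_path(src, dst, dim):
    """Hops from src to dst, correcting address bits from low to high."""
    path, cur = [src], src
    for b in range(dim):
        if (cur ^ dst) >> b & 1:   # bit b still differs from dst
            cur ^= 1 << b          # cross the dimension-b link
            path.append(cur)
    return path

def two_phase_route(src, dst, dim):
    mid = random.randrange(1 << dim)   # random intermediate node
    return bit_fix_path(src, mid, dim) + bit_fix_path(mid, dst, dim)[1:]

route = two_phase_route(0b0000, 0b1011, dim=4)
assert route[0] == 0b0000 and route[-1] == 0b1011
# Consecutive nodes differ in exactly one bit (valid hypercube links).
assert all(bin(u ^ v).count("1") == 1 for u, v in zip(route, route[1:]))
```

The random intermediate destination breaks up worst-case traffic patterns, which is what allows high-probability bounds of the kind quoted above.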
Parallel Communication Scheme
The notion of a symmetric parallel communication scheme (SPCS) captures the essence, and facilitates the analysis, of our parallel routing algorithms.
Definition 5.1 In an epoched communication scheme (ECS), packets are sent in epochs, numbered from 0. In any epoch, packets that demand transmission to neighboring nodes in that epoch, as determined by the routing algorithm, are sent along the links as requested. A symmetric parallel communication scheme (SPCS) is an ECS that satisfies the following conditions.
1. Each node initially has h packets.
2. All packets are routed independently by the routing algorithm; that is, in any epoch, a packet crosses an out-going link independently with equal probability.
3. The expected number of packets at each node is h at the end of each epoch.
Such a scheme is called an h-SPCS.
Comment 5.2 Condition 3 implies that no packets can be lost. In all our schemes, however, pieces (that is, packets in the sense of the ECS definition) can be lost to buffer overflow. This is not a serious problem, since we can analyze a scheme as if pieces exceeding the buffer capacity were not lost.
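A toy simulation may make the definition concrete. The sketch below uses illustrative assumptions throughout: it realizes an h-SPCS-like scheme on a directed ring, where every node starts with h packets and each packet independently stays or crosses the single outgoing link with equal probability, so by symmetry the expected load per node remains h after every epoch.

```python
# Toy h-SPCS on a directed ring: each packet independently stays or
# crosses the outgoing link with probability 1/2 each epoch; by symmetry
# the expected number of packets per node stays h. Illustrative only.
import random

def simulate(nodes, h, epochs, trials=2000):
    totals = [0] * nodes
    for _ in range(trials):
        load = [h] * nodes
        for _ in range(epochs):
            nxt = [0] * nodes
            for v in range(nodes):
                for _ in range(load[v]):
                    # cross the outgoing link with probability 1/2
                    dest = (v + 1) % nodes if random.random() < 0.5 else v
                    nxt[dest] += 1
            load = nxt
        for v in range(nodes):
            totals[v] += load[v]
    return [t / trials for t in totals]   # empirical E[load] per node

avg = simulate(nodes=4, h=3, epochs=5)
print(avg)   # each entry should be close to h = 3
```

Packets are conserved here, as Condition 3 demands; modeling lost pieces would require relaxing exactly the part of the definition that Comment 5.2 addresses.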
If I send a man to buy a horse for me, I expect him to tell me that horse's points – not how many hairs he has in his tail.
— Carl Sandburg's Abraham Lincoln
This final chapter will be brief. First, the argument of the book is reviewed. Next, the original contributions are identified. Finally, new directions for research are sketched. The individual components of the book are not evaluated or criticized because this has been done at the end of each chapter.
Argument of the book
We began with the observation that orders given to interactive computer systems resemble tools used by people. Like tools, orders are employed to pursue activities that shape one's environment and the objects it contains. People have two general strategies for keeping track of the diverse tools they wield in their physical workshops. Recently used tools are kept available for reuse, and tools are organized into functional and task-oriented collections. Surprisingly, these strategies have not been transferred effectively to interactive systems.
This raises the possibility of an interactive support facility that allows people to use, reuse, and organize their on-line activities. The chief difficulty with this enterprise is the dearth of knowledge of how users behave when giving orders to general-purpose computer systems. As a consequence, existing user support facilities are based on ad hoc designs that do not adequately support a person's natural and intuitive way of working.
A portion of a trace belonging to a randomly selected expert programmer follows in the next few pages. The nine login sessions shown cover slightly over one month of the user's UNIX interactions, and include 155 command lines in total.
As mentioned in Chapter 2, all trace records have been made publicly available through a research report and an accompanying magnetic tape (Greenberg, 1988b). This report may be obtained from the Department of Computer Science, University of Calgary, or the author.
Because the raw data collected is not easily read, it was syntactically transformed to the listing presented here. The number and starting time of each login session are marked in italics. The first column shows the lines processed by csh after history expansions were made. The current working directory is given in the middle column. Blank entries indicate that the directory has not changed since the previous command line, and the “∼” is csh shorthand for the user's home directory. The final column lists any extra annotations recorded. These include alias expansions of the line by csh, error messages given to the user, and whether history was used to enter the line. Long alias expansions are shown truncated and suffixed with “…”.