It has been known since at least the time of Leonardo da Vinci that encoded within a pair of stereo images is information detailing the scene geometry (Leonardo, 1989). The animal brain has known this for millions of years and has developed as yet barely understood neuronal mechanisms for decoding this information. Hold your hand inches in front of your face and, with both eyes focused, stare at your fingers – they appear vividly in three dimensions (3-D). In fact, everywhere you gaze, you are aware of the relative depths of the observed objects.
Stereo vision is not the only cue to depth; there is a whole host of monocular cues that humans bring to bear in determining depth, evidenced by the fact that if you close one eye, it is still relatively simple to determine 3-D spatial relations. Nevertheless, monocular cues are less exact and often ambiguous. Otherwise, why would the mammalian anatomy have bothered to narrow the visual field in order to reposition the eyes for stereo vision? As a simple demonstration of the precision of stereo vision, try to touch the tips of two pencils with your arms outstretched, one pencil in each hand. With one eye closed, the task is frustratingly difficult; with both eyes open, the relative depths of the tips of the pencils are clear, and the task becomes as simple as touching your nose.
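As a rough geometric illustration (not drawn from the chapter; the parallel-view setup and the symbols f, b, Z and d are standard textbook assumptions), the information a stereo pair encodes can be made explicit. For two parallel views with focal length f, separated by a baseline b, a point at depth Z is shifted between the two images by the disparity

\[ d = \frac{f\,b}{Z}, \qquad\text{so}\qquad Z = \frac{f\,b}{d}. \]

Nearer points therefore produce larger disparities, and measuring the disparity of matched points recovers their relative depths – the decoding task that the visual system performs so effortlessly.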
This chapter considers the major change that has taken place in natural language processing research over the last five years. It begins by providing a brief guide to the structure of the field and then presents a caricature of two competing paradigms of 1980s NLP research and indicates the reasons why many of those involved have now seen fit to abandon them in their pure forms. Attention is then directed to the lexicon, a component of NLP systems which started out as Cinderella but which has finally arrived at the ball. This brings us to an account of what has been going on in the field most recently, namely a merging of the two 1980s paradigms in a way that is generating a host of interesting new research questions. The chapter concludes by trying to identify some of the key conceptual, empirical and formal issues that now stand in need of resolution.
Introduction
The academic discipline that studies computer processing of natural languages is known as natural language processing (NLP) or computational linguistics (the terms are interchangeable). NLP is most conveniently seen as a branch of AI, although it is a branch into which many linguists (and a few psycholinguists) have moved. In Europe, NLP is dominated by ex-linguists, but this is not the case in the USA, where there is a tradition of people moving into the field from a standard computer science background.
It is tempting to say that NLP is the academic discipline that studies computer processing of the written forms of natural languages. But that would be misleading. The discipline that studies computer processing of the spoken forms of natural languages is known as speech processing, or just speech.
Progress in the computer and communication industries has run in parallel for a great many years, and convergence or even unity has often been predicted. Do they not, after all, both use digital circuits and the associated logic? The present chapter seeks to investigate why the convergence has not happened, and to look into the future.
The earliest contacts, at any rate in the civil sphere, were in some sense casual. Data transmission, as the subject was known, was not in any way regarded as an integral part of computing but rather as an added facility that one was well advised to have some understanding of – rather like air-conditioning.
Early Contacts
What started to bring the two activities into more serious contact was the development of time-sharing systems. As soon as it was demonstrated that these systems were a very valuable means of improving the human convenience of using computers, it was obviously necessary to connect terminals in considerable numbers. At first these tended to be in the same buildings as the computers they served, and the physical connections were wholly under the control of the computer people. There was not much, if any, suitable equipment one could go and buy; accordingly, computer engineers learned about transmission, designed multiplexors, and so on. They found it not unfamiliar, and not too difficult. Before long, however, it became necessary to communicate off site, or off campus. This revealed that in most, if not all, countries there were legal restrictions on who could provide telecommunication facilities outside a single site. The rules varied. Maybe it was a matter of crossing a highway, or of going to land in different ownership.
Are there distinct principles and concepts which underlie computing, so that we are justified in calling it an independent science? Or is computing a resource or commodity – like water – which is perfectly well understood in terms of existing science, for which we merely have to find more and better uses?
In this essay I argue that a rich conceptual development is in progress, to which we cannot predict limits, and whose outcome will be a distinct science. This development has all the excitement and unpredictability of any science. We cannot predict how the conceptual landscape will lie in a decade's time; the subject is still young and has many surprises in store, and there is no sure way to extrapolate from the concepts which we now understand to those which will emerge. I therefore support my argument by explaining in outline some semantic ideas which have emerged in the last two or three decades, and some which are just now emerging.
I try to present the ideas here in a way which is accessible to someone with an understanding of programming and a little mathematical background. This volume aims to give a balanced picture of computer science; to achieve this, those parts which are mathematical must be presented as such. The essence of foundational work is to give precise meaning to formulations of processes and information; clearly, we should employ mathematics in this work whenever it strengthens our analytical power. Thus, rather than avoiding equations, I try to surround them with helpful narrative.
It is a somewhat arbitrary matter to decide when a scientific discipline is mature and stands significantly on its own.
Artificial Intelligence (AI) has had a turbulent history. It has alternated between periods of optimism and periods of pessimism. Why does this field of computer science evoke such strong feelings? What has it achieved in the past and what can we expect of it in the future?
I will present my personal view of the nature of AI research and use this to try to answer some of the questions above.
A Potted History of Artificial Intelligence
In artificial intelligence we attempt to emulate human (and other animal) mental abilities using computer programs and associated hardware.
The goal of building an intelligent artificial entity is a potent one and has excited enthusiasts throughout history. The advent of the electronic computer reinvigorated this enthusiasm and initiated the field of artificial intelligence. The first call to arms came from Alan Turing's classic 1950 paper in Mind (reprinted in Turing, 1963), but the birth of the field can be dated from the 1956 Dartmouth conference, which AI pioneers like McCarthy, Minsky, Newell and Simon attended.
These were heady days. The pioneers were young and conscious of the power of the new computing machinery. They quickly discovered some new computational techniques which appeared to be the key to general-purpose intelligence. The prospects for artificial intelligence looked good. Large-scale projects sprang up to take advantage of the new technology. Unfortunately, these pioneers drastically underestimated the difficulties of AI and made optimistic predictions that have proved to be an embarrassment to the field.
By the end of the 1960s it was clear that AI had not made the predicted progress. Many promising new computational techniques had not scaled up to real problems.
In the end we design the tool for the material – in the end, but never in the beginning. In the beginning we have still to find out the first things about the ways in which the material is and is not workable; and we explore it by trying out implements with which we have already learned to work with other materials. There is no other way to start.
Gilbert Ryle, Dilemmas (1962, p. 66)
The general conception of the ideal plant newspaper is pretty well defined… it should carry lucid articles on efficiency, personal betterment, shop news and personals, with the aim of securing cooperation.
Hi Sibley, Factory (April 1918, p. 776)
As Ryle laments, the initial stages of design with a new “material” often amount to solving yesterday's problems and meeting yesterday's requirements. New possibilities and problems have indeed emerged in the era of CSCW applications. CSCW research and development communities have also taken on older, better-recognized, and still-important problems, problems that their predecessors addressed in efforts to shape groups through the use of artifacts. Several salient cultural objects appear to have traveled with all of these kinds of efforts, most notably “efficiency.”
A number of problems are associated with efforts to define or construct a group (in terms of its membership, sphere of action, and scope of accountability). Concern with these problems has surfaced in various forms since the early part of this century (as Sibley's advice in the epigraph suggests).
Science makes progress by constructing mathematical models, deducing their observable consequences, and testing them by experiment. Successful theoretical models are later taken as the basis for engineering methods and codes of practice for design of reliable and useful products. Models can play a similar central role in the progress and practical application of computing science.
A model of a computational paradigm starts with choice of a set of potential direct or indirect observations that can be made of a computational process. A particular process is modelled as the subset of observations to which it can give rise. Process composition is modelled by relating observations of a composite process to those of its components. Indirect observations play an essential role in such compositions. Algebraic properties of the composition operators are derived with the aid of the simple theory of sets and relations. Feasibility is checked by a mapping from a more operational model.
A model constructed as a family of sets is easily adapted as a calculus of design for total correctness. A specification is given by an arbitrary set containing all observations permitted in the required product. It should be expressed as clearly as possible with the aid of the full power of mathematics and logic. A product meets a specification if its potential observations form a subset of those the specification permits. This principle requires that all envisaged failure modes of a product are modelled as indirect observations, so that their avoidance can be proved. Specifications of components can be composed mathematically by the same operators as the components themselves. This permits top-down proof of correctness of designs even before their implementation begins. Algebraic properties and reasoning are helpful throughout development.
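To make the set-theoretic reading concrete, here is a minimal sketch in standard notation; the names obs and S and the composition operator \otimes are illustrative choices, not notation taken from this chapter. Writing obs(P) for the set of observations that a process P can give rise to, and S for a specification,

\[ P \text{ meets } S \iff \mathrm{obs}(P) \subseteq S, \qquad \mathrm{obs}(P \parallel Q) = \mathrm{obs}(P) \otimes \mathrm{obs}(Q), \]

where \otimes is the operator on observation sets that models the composition \parallel. If \otimes is monotonic with respect to \subseteq, then

\[ \mathrm{obs}(P) \subseteq S_P \;\wedge\; \mathrm{obs}(Q) \subseteq S_Q \;\Rightarrow\; \mathrm{obs}(P \parallel Q) \subseteq S_P \otimes S_Q, \]

which is the top-down design principle described above: prove that the composed specification S_P \otimes S_Q meets the overall requirement, then implement each component against its own specification.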
Let us hope … that in the years ahead we can construct a society that is less in need of suffering and a self that is less a sacrifice to the nihilistic economics and politics of our time.
Philip Cushman (1990, pp. 608–609)
This book is about virtual individuals and virtual groups. It is also about a specific set of computer system applications – groupware and other network-based systems – and the way we employ them in construction, dissemination, and manipulation of these virtual entities. To an increasing extent, management in organizational contexts has become the management of virtual individuals and groups. These virtual entities are employed in establishing the patterns and setting the standards by which we are evaluated and with which we often must conform.
Virtuality has become a common theme in American life, taking on connotations of the “imaginary,” as well as the “designed” or “engineered”: “virtual corporations” are created when corporations design sets of linkages with each other and with critical environmental factors, thus extending their effective spheres of influence (Davidow and Malone, 1992). Instead of tales about lonely teens and their imaginary companions, stories about an engineered “virtual girl” are consumed in the mass market (Thomson, 1993).
A virtual individual is a selection or compilation of various traces, records, imprints, photographs, profiles, and statistical information that pertain (or could reasonably be said to pertain) to an individual – along with writing done, images produced, sounds associated with, and impressions managed by the individual. The amalgam that results (whatever its components) is associated with the individual in the context of particular genres and artifacts.
Much of the software we use today arose from new ideas that emanated from researchers at universities or at industrial research laboratories such as Xerox PARC. Researchers who are concerned with building novel software are usually keen to get their creations out into the field as soon as possible. This is partly because they may get extra brownie points from their research sponsors, but, perhaps more importantly, because usage in the field often leads to new insights, which in turn result in the next step forward for the research.
Nevertheless, building something that people will use means making big sacrifices in the research aims. In particular the prime quality of any research, novelty, must be kept in check.
My work for the past decade has been in on-line documents, a topic that is fairly close to the market-place. I will relate some experiences of the trade-offs between research aims and the market-place, and put this in the context of the future lines of development of the subject.
Introduction
The aim of this chapter is different from most of the others. I wish to consider researchers at the applied end of the spectrum: people engaged in creating novel software tools. I will analyse my experience over the past ten years in trying, reasonably successfully, to create novel software tools in the electronic publishing area – specifically in hypertext – and will try to draw some general lessons which may help future research.
A typical aim of the software tools researcher is to pioneer a novel interface style, such as a successor to the desktop metaphor, or to pioneer a new approach to an application area – the Walter Mitty dream that most of us have is a breakthrough equivalent to the discovery of the spreadsheet.
The modern theory of algorithms dates from the late 1960s, when the method of asymptotic execution time measurement began to be used. It is argued that the subject has both an engineering and a scientific wing. The engineering wing consists of well-understood design methodologies, while the scientific one is concerned with theoretical underpinnings. The key issues of both wings are surveyed. Finally, some personal opinions on where the subject will go next are offered.
Introduction
The concept of ‘algorithm’ is the oldest one in computer science. It is of such antiquity, in fact, that it may appear presumptuous for computer scientists to claim it as part of their subject. However, although algorithms have been part of scientific tradition for millennia, it was not until the computer era that they assumed the major role they now play in these traditions. Indeed, until a few decades ago most scientists would have been hard-pressed to give significant examples of any algorithms at all. Yet algorithms have been routinely used for centuries. Generations of school children learnt the algorithms for addition, subtraction, multiplication, and division and could use them with all the blinkered competence of a mechanical computer. Few of them would ever have stopped to wonder how it was that they allowed the computation, in a matter of moments, of quantities far beyond what could be counted or imagined. These elementary algorithms are amazingly efficient, but it was not until the computer era that questions of efficiency were seriously addressed. Had they been so addressed, the mathematical curriculum might have been different.
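As a small illustration of the kind of question asymptotic analysis asks of these elementary algorithms (a sketch of ours, not code from the chapter; the function names are invented for the example), the following Python counts the single-digit operations performed by the schoolbook methods: doubling the number of digits roughly doubles the work of addition, which is linear, but quadruples the work of long multiplication, which is quadratic.

def schoolbook_add(a_digits, b_digits):
    """Add two numbers given as lists of base-10 digits, least significant first.
    Returns (result_digits, single_digit_operations)."""
    n = max(len(a_digits), len(b_digits))
    result, carry, ops = [], 0, 0
    for i in range(n):
        da = a_digits[i] if i < len(a_digits) else 0
        db = b_digits[i] if i < len(b_digits) else 0
        ops += 1                     # one column: add two digits and a carry
        s = da + db + carry
        result.append(s % 10)
        carry = s // 10
    if carry:
        result.append(carry)
    return result, ops               # ops = n, i.e. linear in the number of digits

def schoolbook_multiply(a_digits, b_digits):
    """Long multiplication; counts single-digit multiplications."""
    result = [0] * (len(a_digits) + len(b_digits))
    ops = 0
    for i, da in enumerate(a_digits):
        for j, db in enumerate(b_digits):
            ops += 1                 # one single-digit multiplication
            result[i + j] += da * db
    carry = 0
    for k in range(len(result)):     # propagate carries in one final pass
        total = result[k] + carry
        result[k] = total % 10
        carry = total // 10
    return result, ops               # ops = len(a_digits) * len(b_digits), i.e. quadratic

digits = lambda n: [int(c) for c in str(n)[::-1]]
print(schoolbook_add(digits(1234), digits(5678))[1])        # 4 column additions
print(schoolbook_multiply(digits(1234), digits(5678))[1])   # 16 digit multiplications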
Overconcern for privacy may indicate retreat from responsibility and sagging motivation.
Propst (1968, p. 2)
Half the world today is engaged in keeping the other half “under surveillance.”
McLuhan and Nevitt (1972)
What can “privacy,” “anonymity,” or “agency” mean for members of a collaborative workgroup? The presumption that groups are supposed to work together in harmony and close contact may seem to exclude the need for private spaces, the opportunity to make an anonymous contribution, or the capacity to have one's work done by an agent or surrogate. However, these cultural objects have served to shape CSCW and other network-based system applications.
Privacy, anonymity, and agency in the realm of the societal construction of computing applications have a number of common denominators. Each has special associations at the group level: individuals’ expectations for privacy, or the meanings they attach to a statement delivered under the cloak of anonymity, are affected by whether or not they are working in a group or team context. Roles that agents play in computing applications are also sensitive to group context; for instance, groups may set standards for the kinds of activities in which agents can be utilized.
These three cultural objects have also been topics of frequent discussion within computer application design communities. “Social analogues,” system features that are intentionally linked by developers to specific cultural objects, have been formulated for each of the three. The question of whether a social analogue will be successful in maintaining a strong linkage with a certain cultural object (beyond the association that designers have attempted to make) has a number of complex dimensions.
If our computer systems break down, we might find an enormous dependency of which we were not truly aware. We may, by then, have become functionally illiterate – unable to deal with each other except with the aid of mechanisms.
Laurie (1979, p. 141)
A tool is but the extension of a man's hand, and a machine is but a complex tool. And he that invents a machine augments the power of a man and the well-being of mankind.
Henry Ward Beecher (1813–1887), Proverbs from Plymouth Pulpit
Genres are associated with various cultural objects, objects that can attract us to the genres, make us suspicious of them, or even avoid their use. Sites for constructing, expressing, and modifying cultural objects include debate, writing, speeches, legal decisions, imagery, and design, as well as CSCW and other network-based system applications. Construction of cultural objects, like that of artifacts and other physical objects, should be considered in light of its reflexive dimensions. The objects serve to shape the cultures, individuals, and genres that are associated with them, and in turn are given shape.
Issues of dependence, autonomy, and intellectual augmentation are critical aspects of construction of the “first-person plural” – which involves the authority to attach the word “we” to a document, product, or decision. This authority is not automatically given by group members, and is often not recognized by parties outside the group. In some situations, establishment of this authority may depend on the ability of group participants and audiences to segregate the group from the “computer-mediated group.”
The work of a crowd is always inferior, whatever its nature, to that of an isolated individual.
Gustave LeBon (1895/1960, p. 200)
The release of productivity is the product of cooperatively organized intelligence.
Dewey and Tufts (1939, p. 446)
“Collaboration” and “cooperation” among individuals – the harnessing of people's skills and talents to conduct projects, make decisions, and create new ideas – are notions that are both commonplace and elusive. The contradiction between the two epigraphs underscores the fact that controversies concerning the value of collaboration are not new. We have all participated in meetings and team projects, in informal exchanges as well as structured games, but these activities remain only vaguely understood and nearly impossible to predict and control with any precision. Our modes of individual and group expression (our “virtual individuals” and “virtual groups”) are intimately linked with the technologies that support group interaction – technologies that have undergone dramatic change in the past decades.
Network-based computer applications designed to support joint efforts (“computer-supported cooperative work,” or CSCW, applications) have both staunch supporters and fierce critics. Promoters have characterized these systems as “coaches” and “educators” (Winograd and Flores, 1986); critics, in turn, have labeled the same systems as “oppressors” and “masters” with a “digitized whip” (Dvorak and Seymour, 1988). The terms “groupware” and “workgroup computing” can be found in many computing, management, and social science publications, along with words of high praise, condemnation, or ennui. Virtual reality (VR) applications have been incorporated into some CSCW initiatives, sometimes compounding confusion about the systems and further steepening the learning curve.
Picture a modern office setting, perhaps an insurance company headquarters. Some people are writing on sheets of paper. Others are looking into computer screens, entering numbers into a spreadsheet. Still others are conversing. Which of these individuals are working individually, and which are engaging in cooperative work? And which of the individuals engaging in cooperative activity are participating in healthy, well-working groups? Some of these issues might appear to be riddles or trick questions. Whether or not “riddles” (or linguistic puzzles) are involved, the question of how best to construe cooperative work activity is one of the most salient focal points of research and theory on CSCW applications.
In much the same manner as healthy and unhealthy forms of individual behavior have been constructed by social scientists, today's administrative theorists, network-based system developers, and CSCW researchers are attempting to construct notions of “functional” and “dysfunctional” collaborative behavior. Several of the theorists whose work is described in this chapter are attempting to segregate some kinds of work as “cooperative” and give them special forms of support. Others want to transform existing forms of work from their current, supposedly noncooperative form into cooperative work. Still others label all work as cooperative and want us to see work itself in a new light. Many have identified “right” ways of thinking about and engaging in cooperative activity, their conclusions bolstered with theoretical scaffolding, empirical research, and appeals to common sense.