The preceding chapter presented the basic difficulties associated with producing semantic representations of sentences in context. This chapter surveys several well-known natural language processors, concentrating on their efforts at overcoming these particular difficulties. The processors use different styles of semantic representation as well as different methods for producing the chosen semantic representation from the syntactic parse. Ideally, clearly defined methods of producing semantic representations should be based on a linguistic theory of semantic analysis, that is, on a theory about the relationship between the given syntactic and semantic representations, and not just on the particular style of semantic representation. Computational linguistics has a unique contribution to make to the study of linguistics, in that it offers the opportunity of realizing the processes that must underlie the theories. Unfortunately, it seems to be the case that those systems that adhere most closely to a particular linguistic theory have the least clearly defined processing methods, and vice versa.
Another important aspect to examine is whether or not any of the methods make significant use of procedural representations. An important contribution hoped for from computational linguistics is an understanding of procedural semantics as “a paradigm or a framework for developing and expressing theories of meaning” [Woods, 1981, p. 302]. It is argued that adding procedures to a framework should greatly enrich its expressive power [Wilks, 1982]. In spite of the intuitive appeal of this argument, much work remains to be done before the benefits can be convincingly demonstrated.
A primary problem in the area of natural language processing is the problem of semantic analysis. This involves both formalizing the general and domain-dependent semantic information relevant to the task at hand and developing a uniform method of access to that information. Natural language interfaces generally also require access to the syntactic analysis of a sentence, as well as knowledge of the prior discourse, to produce a detailed semantic representation adequate for the task.
Previous approaches to semantic analysis, specifically those which can be described as using templates, use several levels of representation to go from the syntactic parse level to the desired semantic representation. The different levels are largely motivated by the need to preserve context-sensitive constraints on the mappings of syntactic constituents to verb arguments. An alternative to the template approach, inference-driven mapping, is presented here, which goes directly from the syntactic parse to a detailed semantic representation without requiring the same intermediate levels of representation. This is accomplished by defining a grammar for the set of mappings represented by the templates. The grammar rules can be applied to generate, for a given syntactic parse, just that set of mappings that corresponds to the template for the parse. This avoids the necessity of having to represent all possible templates explicitly. The context-sensitive constraints on mappings to verb arguments that templates preserved are now preserved by filters on the application of the grammar rules.
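The idea of replacing explicit templates with grammar rules whose application is guarded by filters can be sketched in miniature. The following is an illustrative toy, not the author's implementation: the rule set, role names, and the single "voice" feature standing in for the predicate-environment state are all hypothetical.

```python
# Hypothetical sketch: mapping rules from syntactic constituents to semantic
# roles, each guarded by a filter on the current interpretation state.
# Applying the rules generates just the mappings a template would license,
# without enumerating every template explicitly.
RULES = [
    # (constituent, role, filter on state)
    ("subject", "agent",   lambda state: state["voice"] == "active"),
    ("subject", "patient", lambda state: state["voice"] == "passive"),
    ("object",  "patient", lambda state: state["voice"] == "active"),
]

def map_constituents(parse, state):
    """Apply every rule whose filter admits the current state."""
    return {role: parse[const]
            for const, role, ok in RULES
            if ok(state) and const in parse}

# "The string connects the pulleys" (active voice):
roles = map_constituents({"subject": "string", "object": "pulleys"},
                         {"voice": "active"})
```

The filters here play the role of the context-sensitive constraints the text describes: the same constituent ("subject") maps to different roles depending on the state in which the rule is applied.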
This chapter presents the semantic processor that performs the semantic role assignments at the same time as it is decomposing the verb representation. Chapter 3 has described how semantic roles are defined as arguments to the semantic predicates that appear in the lexical entries. These arguments are instantiated as the lexical entries are interpreted. A possible instantiation of a predicate-argument is the referent of a syntactic constituent of the appropriate syntactic and semantic type. The syntactic constituent instantiations correspond to the desired mappings of syntactic constituents onto semantic roles. Other instantiations can be made using pragmatic information to deduce appropriate fillers from previous knowledge about other syntactic constituents or from general world knowledge.
These tasks are performed by interpreting the lexical entries procedurally similarly to the way that Prolog interprets Horn clauses procedurally [Kowalski, 1979]. The lexical entries are in fact Horn clauses, and the predicate-arguments that correspond to the semantic roles are terms that consist of function symbols with one argument. The procedural interpretation drives the application of the lexical entries, and allows the function symbols to be “evaluated” as a means of instantiating the arguments. The predicate environments associated with the mapping constraints correspond to states that may or may not occur during the procedural interpretation of the entries. Thus the same argument can be constrained differently depending on the state the verb interpretation is in. The state can vary according to instantiations of arguments or by the predicates included in the predicate decomposition.
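The procedural reading of Horn clauses that the text invokes ("to prove the head, prove each goal in the body") can be shown with a minimal propositional backward-chainer. This is a generic illustration of the Prolog-style interpretation strategy, not the lexical entries themselves; the clause base below is a made-up example.

```python
# Minimal backward-chaining over propositional Horn clauses.
# Each head maps to a list of alternative bodies; a fact is a clause
# with an empty body.  The clause names are hypothetical.
CLAUSES = {
    "cause(motion)": [["do(agent)", "connected(agent, object)"]],
    "do(agent)": [[]],                      # fact: empty body
    "connected(agent, object)": [[]],       # fact: empty body
}

def prove(goal):
    """Succeed if some clause for `goal` has a body whose goals all succeed."""
    return any(all(prove(subgoal) for subgoal in body)
               for body in CLAUSES.get(goal, []))
```

A full Prolog interpreter additionally performs unification, which is what lets arguments (the semantic roles, in the text's scheme) become instantiated as clauses are applied; this sketch shows only the control regime.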
1. Two pulleys of weights 12 lb and 8 lb are connected by a fine string hanging over a smooth fixed pulley. Over the former is hung a fine string with weights 3 lb and 6 lb at its ends, and over the latter a fine string with weights 4 lb and x lb. Find x so that the string over the fixed pulley remains stationary, and find the tension in it.
2. (Part of Humphrey, p. 75, no. 566)
A mass of 9 lb resting on a smooth horizontal table is connected by a light string, passing over a smooth pulley at the edge of the table to a mass of 7 lb hanging freely. Find the common acceleration, the tension in the string and the pressure on the pulley.
3. Two particles of mass B and C are connected by a light string passing over a smooth pulley. Find their common acceleration.
4. Particles of mass 3 lb and 6 lb are connected by a light string passing over a smooth weightless pulley; this pulley is suspended from a smooth weightless pulley and counterbalanced by a particle of mass 8 lb. Find the acceleration of each particle.
5. A man of 12 stone and a weight of 10 stone are connected by a light rope passing over a pulley. Find the acceleration of the man. If the man pulls himself up the rope so that his acceleration is one half its former value, what is the upward acceleration of the weight?
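For the simple Atwood machine of exercise 3 (masses B and C hanging over a smooth pulley), the standard Newtonian result can be checked numerically. The sketch below states the textbook formulas; the sample masses 6 lb and 3 lb are chosen for illustration only.

```python
# Simple Atwood machine: masses b and c (b >= c) over a smooth, light pulley.
# Newton's second law on each mass gives:
#   a = (b - c) g / (b + c)        common acceleration
#   T = 2 b c g / (b + c)          tension in the string
def atwood(b, c):
    """Return (acceleration/g, tension/g) for masses b >= c."""
    accel = (b - c) / (b + c)
    tension = 2 * b * c / (b + c)
    return accel, tension

# Illustrative masses 6 lb and 3 lb: a = g/3, T = 4g (in poundals).
a, t = atwood(6, 3)
```

Note that the compound systems of exercises 1 and 4 require applying these equations separately to each string and pulley, since the movable pulleys themselves accelerate.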
This chapter presents the formalization of the pulley domain. In this domain, the entities involved tend to be simple solid entities like particles and strings, while the relationships between them include notions of support, contact, or motion of some form. Section 3.2 describes the formalization of the pulley world in terms of the types of entities and their properties. The relationships are used for the decompositions of the verbs which are described in section 3.3 where the lexical entries of the verbs are listed. Each verb is subcategorized in terms of the primary relationship involved in the decomposition. The semantic roles are arguments of these relationships. The lexical entries include the decompositions of these primary relationships. Section 3.5 introduces the mapping constraints for assigning syntactic constituents to semantic roles. Examples demonstrate how the syntactic cues can be used with predicate environments to preserve the same semantic role interdependences that are preserved by templates. The last section describes the semantic constraints used in conjunction with the mapping constraints to test that the referent of a syntactic constituent is of the correct semantic type. The last category of constraints described, the pragmatic constraints, are used by inference-driven mapping to fill semantic roles that do not have mappings to syntactic constituents. Chapter 4 describes how inference-driven mapping interprets the lexical entries procedurally to drive the semantic analysis of paragraphs of text.
This chapter summarizes the results that have been presented in the preceding chapters, in particular the process by which inference-driven mapping goes directly from the syntactic parse of a sentence to a “deep” semantic representation that corresponds to a traditional linguistic decomposition. The summary illustrates two of the most important benefits offered by inference-driven mapping over the template approach, namely, (1) the clear distinction between the verb definition and the final semantic representation achieved, and (2) an integrated approach to semantic analysis. The first benefit is of special relevance to linguistic theories about semantic representations, in that it provides a testing ground for such theories. The second benefit is of more relevance to computational models of natural language processors, in terms of interfacing semantic processing with syntactic parsing. The last section suggests directions of future research for pursuing these objectives.
Integrated semantic analysis
As discussed in chapter 2, traditional approaches to semantic processing need several levels of description to produce a “deep” semantic representation from a syntactic parse. The most popular of these approaches, termed the template approach, can be seen as using at least two intermediate levels of description: (1) the template level, which is used for assigning mappings of syntactic constituents to semantic roles, and (2) the canonical level, where the semantic roles are grouped together to simplify derivation of a “deep” semantic representation. These separate levels of description impose several stages of processing on the implementations, since only certain pieces of information are available at any one stage.
Futurologists have proclaimed the birth of a new species, machina sapiens, that will share (perhaps usurp) our place as the intelligent sovereigns of our earthly domain. These “thinking machines” will take over our burdensome mental chores, just as their mechanical predecessors were intended to eliminate physical drudgery. Eventually they will apply their “ultra-intelligence” to solving all of our problems. Any thoughts of resisting this inevitable evolution are just a form of “speciesism,” born from a romantic and irrational attachment to the peculiarities of the human organism.
Critics have argued with equal fervor that “thinking machine” is an oxymoron – a contradiction in terms. Computers, with their foundations of cold logic, can never be creative or insightful or possess real judgment. No matter how competent they appear, they do not have the genuine intentionality that is at the heart of human understanding. The vain pretensions of those who seek to understand mind as computation can be dismissed as yet another demonstration of the arrogance of modern science.
Although my own understanding developed through active participation in artificial intelligence research, I have now come to recognize a larger grain of truth in the criticisms than in the enthusiastic predictions. But the story is more complex. The issues need not (perhaps cannot) be debated as fundamental questions concerning the place of humanity in the universe. Indeed, artificial intelligence has not achieved creativity, insight, and judgment. But its shortcomings are far more mundane: we have not yet been able to construct a machine with even a modicum of common sense or one that can converse on everyday topics in ordinary language.
Systems of interconnected and interdependent computers are qualitatively different from the relatively isolated computers of the past. Such “open systems” uncover important limitations in current approaches to artificial intelligence (AI). They require a new approach, one closer to organizational design and management than to current approaches. Here we'll take a look at some of the implications and constraints imposed by open systems.
Open systems are always subject to communications and constraints from outside. They are characterized by the following properties:
Continuous change and evolution. Distributed systems are always adding new computers, users and software. As a result, systems must be able to change as the components and demands placed upon them change. Moreover, they must be able to evolve new internal components in order to accommodate the shifting work they perform. Without this capability, every system must reach the point where it can no longer expand to accommodate new users and uses.
Arm's-length relationships and decentralized decision making. In general, the computers, people, and agencies that make up open systems do not have direct access to one another's internal information. Arm's-length relationships imply that the architecture must accommodate multiple computers at different physical sites that do not have access to the internal components of others. This leads to decentralized decision making.
Perpetual inconsistency among knowledge bases. Because of privacy and discretionary concerns, different knowledge bases will contain different perspectives and conflicting beliefs. Thus, all the knowledge bases of a distributed AI system taken together will be perpetually inconsistent. Decentralization makes it impossible to update all knowledge bases simultaneously.
“But why,” Aunty asked with perceptible asperity, “does it have to be a language?” Aunty speaks with the voice of the Establishment, and her intransigence is something awful. She is, however, prepared to make certain concessions in the present case. First, she concedes that there are beliefs and desires and that there is a matter of fact about their intentional contents; there's a matter of fact, that is to say, about which proposition the intentional object of a belief or a desire is. Second, Aunty accepts the coherence of physicalism. It may be that believing and desiring will prove to be states of the brain, and if they do that's OK with Aunty. Third, she is prepared to concede that beliefs and desires have causal roles, and that overt behavior is typically the effect of complex interactions among these mental causes. (That Aunty was raised as a strict behaviorist goes without saying. But she hasn't been quite the same since the sixties. Which of us has?) In short, Aunty recognizes that psychological explanations need to postulate a network of causally related intentional states. “But why,” she asks with perceptible asperity, “does it have to be a language?” Or, to put it more succinctly than Aunty often does, what – over and above mere Intentional Realism – does the Language of Thought Hypothesis buy? That is what this discussion is about.
A prior question: what – over and above mere Intentional Realism – does the Language of Thought Hypothesis (LOT) claim? Here, I think, the situation is reasonably clear.
Artificial intelligence is still a relatively young science, in which there are still various influences from different parent disciplines (psychology, philosophy, computer science, etc.). One symptom of this situation is the lack of any clearly defined way of carrying out research in the field (see D. McDermott, 1981, for some pertinent comments on this topic). There used to be a tendency for workers (particularly Ph.D. students) to indulge in what McCarthy has called the “look-ma-no-hands” approach (Hayes, 1975b), in which the worker writes a large, complex program, produces one or two impressive printouts and then writes papers stating that he has done this. The deficiency of this style of “research” is that it is theoretically sterile – it does not develop principles and does not clarify or define the real research problems. What has happened over recent years is that some attempt is now made to outline the principles which a program is supposed to implement. That is, the worker still constructs a complex program with impressive behaviour, but he also provides a statement of how it achieves this performance. Unfortunately, in some cases, the written “theory” may not correspond to the program in detail, but the writer avoids emphasizing (or sometimes even conceals) this discrepancy, resulting in methodological confusion. The “theory” is supposedly justified, or given empirical credibility, by the presence of the program (although the program may have been designed in a totally different way); hence the theory is not subjected to other forms of argument or examination.
Rational reconstruction (reproducing the essence of the program's significant behavior with another program constructed from descriptions of the purportedly important aspects of the original program) has been one approach to assessing the value of published claims about programs.
Campbell attempts to account for why the status of AI vis-a-vis the conventional sciences is a problematic issue. He outlines three classes of theories, the distinguishing elements of which are: equations; entities, operations and a set of axioms; and general principles capable of particularization in different forms. Models in AI, he claims, tend to fall in the last class of theory.
He argues for the methodology of rational reconstruction as an important component of a science of AI, even though the few attempts so far have not been particularly successful, if success is measured in terms of the similarity of behavior between the original AI system and the subsequent rational reconstruction. But, as Campbell points out, it is analysis and exploration of exactly these discrepancies that is likely to lead to significant progress in AI.
The second paper in this section is a reprint of one of the more celebrated attempts to analyse a famous AI program, Lenat's ‘creative rediscovery’ system AM. In addition to an analysis of the published descriptions of how the program works with respect to the program's actual behaviour, Ritchie and Hanna discuss more general considerations of the rational-reconstruction methodology.
There is a continuing concern in AI that proof and correctness, the touchstones of the theory of programming, are being abandoned to the detriment of AI as a whole. On the other hand, we can find arguments to support just the opposite view: that attempts to fit AI programming into the specify-and-prove (or at least, specify-and-test correctness) paradigm of conventional software engineering are contrary to the role of programming in AI research.
Similarly, the move to establish conventional logic as the foundational calculus of AI (currently seen in the logic programming approach and in knowledge-based decision-making implemented as a proof procedure) is another aspect of correctness in AI, and one whose validity is questioned (for example, Chandrasekaran's paper in section 1 opened the general discussion of such issues when it examined logic-based theories in AI, and Hewitt, in section 11, takes up the more specific question of the role of logic in expert systems). Both sides of this correctness question are presented below.