What is the objective of semantic analysis? We could say that it is to determine what a sentence means, but by itself this is not a very helpful answer. It may be more enlightening to say that, for declarative sentences, semantics seeks to determine the conditions under which a sentence is true or, almost equivalently, what the inference rules are among sentences of the language. Characterizing the semantics of questions and imperatives is a bit more problematic, but we can see the connection with declaratives by noting that, roughly speaking, questions are requests to be told whether a sentence is true (or to be told the values for which a certain sentence is true) and imperatives are requests to make a sentence true.
People who study natural language semantics find it desirable (or even necessary) to define a formal language with a simple semantics, thus changing the problem to one of determining the mapping from natural language into this formal language. What properties should this formal language have (which natural language does not)? It should
*be unambiguous,
*have simple rules of interpretation and inference, and in particular
*have a logical structure determined by the form of the sentence.
We shall examine some such languages, the languages of the various logics, shortly.
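For instance (an illustration of my own, not an example from the text), the English sentence "Every student read a book" is scopally ambiguous, whereas each of its first-order translations has exactly one reading:

```latex
% Wide-scope "every": for each student, some (possibly different) book
\forall x\,\bigl(\mathit{Student}(x) \rightarrow \exists y\,(\mathit{Book}(y) \wedge \mathit{Read}(x,y))\bigr)
% Wide-scope "a": one particular book read by every student
\exists y\,\bigl(\mathit{Book}(y) \wedge \forall x\,(\mathit{Student}(x) \rightarrow \mathit{Read}(x,y))\bigr)
```

The logical structure of each formula is fully determined by its form, and standard inference rules apply to it directly, which is exactly what the three properties above demand.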
Of course, when we build a practical natural-language system, our interest generally lies not just in finding out whether sentences are true or false.
Speech act theory has its roots in the work of Wittgenstein, who in Philosophical Investigations proposed an analogy between using language and playing games. His basic point was that language is a form of rule-governed behavior, much the same as game-playing, employing rules and conventions that are mutually known to all the participants.
The field of speech act theory is usually considered to have been founded by Austin (1962) who analyzed certain utterances called performatives. He observed that some utterances do more than express something that is true about the world. In uttering a sentence like “I promise to take out the garbage,” the speaker is not saying anything about the world, but is rather undertaking an obligation. An utterance like “I now pronounce you man and wife” not only does not say anything that is true about the world, but when uttered in an appropriate context by an appropriate speaker, actually changes the state of the world. Austin argued that an account of performative utterances required an extension of traditional truth-theoretic semantics.
The most significant contribution to speech act theory has been made by philosopher John Searle (1969, 1979a, 1979b), who was the first to develop an extensive formulation of the theory of speech acts.
KAMP represents the first step in a very ambitious program of research. It is appropriate at this time to reflect upon this program, how far we have come, and what lies in the future.
KAMP represents not merely an attempt to devise an expedient strategy for getting text out of a computer, but rather embodies an entire theory of communication. The goal of such a theory is to account for how agents manage intentionally to affect the beliefs, desires, and intentions of other agents. Developing such a theory requires examining utterances to determine the goals the speakers are attempting to achieve thereby, and in the process explicating the knowledge about their environment, about their audience, and about their language that these speakers must have. Language generation has been chosen as an ideal vehicle for the study of problems arising from such a theory because it requires one to face a problem that language understanding does not: why speakers choose to do the things they do. Theories of language understanding make heavy use of the fact that the speaker is behaving according to a coherent plan. Language generation requires producing such a coherent plan in the first place, and therefore requires uncovering the underlying principles that make such a plan coherent.
This chapter discusses in detail a typical example that requires KAMP to form a plan involving several physical and illocutionary acts, and then to integrate the illocutionary acts into a single utterance. This example does not reflect every aspect of utterance planning, but it touches upon enough of them to convey how KAMP works, to illustrate the principles discussed in earlier chapters of this book, and to demonstrate both KAMP's power and some of its limitations. It is important to bear in mind that KAMP was implemented to test the feasibility of a particular approach to multiagent planning and language generation. Since it is not intended to be a “production” system, many details of efficiency, involving both fundamental issues and engineering problems, have been purposely disregarded in this discussion.
KAMP is based on a first-order logic natural-deduction system that is similar in many respects to the one proposed by Moore (1980). The current implementation does not take advantage of well-known techniques such as structure sharing and indexing that could be used to reduce some of the computational effort required. Nevertheless, the system is reliable, albeit inefficient, in making the necessary deductions to solve problems similar to the one described here.
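To give a flavor of rule-driven deduction (this is a toy propositional sketch of my own; Moore's system and KAMP's prover are first-order natural-deduction systems and considerably richer), consider a minimal forward-chaining loop, likewise without structure sharing or indexing:

```python
# A schematic sketch only: a toy forward-chainer over propositional
# Horn clauses, illustrating the flavor of rule-driven deduction.
# It is NOT the natural-deduction system described in the text.

def forward_chain(facts, rules):
    """facts: set of atoms; rules: list of (premises, conclusion) pairs."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # Apply a rule (modus ponens) when all premises are derived.
            if conclusion not in derived and all(p in derived for p in premises):
                derived.add(conclusion)
                changed = True
    return derived

# Example: from P, P -> Q, and Q -> R, derive R.
facts = {"P"}
rules = [({"P"}, "Q"), ({"Q"}, "R")]
assert "R" in forward_chain(facts, rules)
```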
This chapter examines some of the special requirements of a knowledge representation formalism that arise from the planning of linguistic actions. Utterance planning requires the ability to reason about a wide variety of intensional concepts that include knowledge per se, mutual knowledge, belief, and intention. Intensional concepts can be represented in intensional logic by operators that apply to both individuals and sentences. What makes intensional operators different from ordinary extensional ones such as conjunction and disjunction is that one cannot substitute terms that have the same truth-value within the scope of one of these operators without sometimes changing the truth-value of the entire sentence. For example, suppose that John knows Mary's phone number. Suppose that unbeknown to John, Mary lives with Bill — and therefore Bill's phone number is the same as Mary's. It does not follow from these premises that John knows what Bill's phone number is.
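The blocked inference can be rendered schematically as follows (the notation here is my own rendering of the example, not necessarily the book's exact formalism):

```latex
% Premises:
\exists n\, \mathit{Know}(\mathit{John},\ \mathit{phone}(\mathit{Mary}) = n)
\qquad
\mathit{phone}(\mathit{Bill}) = \mathit{phone}(\mathit{Mary})
% Blocked conclusion -- substituting co-referring terms inside Know is invalid:
\not\Rightarrow\ \exists n\, \mathit{Know}(\mathit{John},\ \mathit{phone}(\mathit{Bill}) = n)
```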
The planning of linguistic actions requires reasoning about several different types of intensional operators. In this research we shall be concerned with the operators Know (and occasionally the related operator Believe), Mutually-Know, Knowref (knowing the denotation of a description), Intend (intending to make a proposition true) and Intend-To-Do (intending to perform a particular action).
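One schematic way to display these operators (my notation; the book's formalism may differ in detail) is as modal formulas over agents, propositions, descriptions, and actions:

```latex
\mathit{Know}(A, P)                    % A knows that P
\mathit{Believe}(A, P)                 % A believes that P
\mathit{MutuallyKnow}(A, B, P)         % P is mutually known to A and B
\mathit{Knowref}(A, \iota x.\,D(x))    % A knows the denotation of description D
\mathit{Intend}(A, P)                  % A intends to make P true
\mathit{IntendToDo}(A, \alpha)         % A intends to perform action \alpha
```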
This chapter discusses the problems of planning surface linguistic actions, including surface speech acts, concept activation actions, and focusing actions. What distinguishes these surface linguistic acts from the illocutionary acts considered in Chapter 5 is that they correspond directly to parts of the utterance that are produced by the planning agent. An agent intends to convey a proposition by performing an illocutionary act. There may be many choices available to him for the purpose of conveying the proposition with the intended illocutionary force. For example, he may make a direct request by using an imperative, or perform the act of requesting indirectly by asking a question. He usually has many options available to him for referring to objects in the world.
A surface linguistic act, on the other hand, represents a particular linguistic realization of the intended illocutionary act. Planning a surface speech act entails making choices about the many options that are left open by a high-level specification of an illocutionary act. In addition, the surface speech act must satisfy a multitude of constraints imposed by the grammar of the language. The domain of reasoning done by the planner includes actions along with their preconditions and effects. The grammatical constraints lie outside this domain of actions and goals (excluding, of course, the implicit goal of producing coherent English), and are therefore most suitably specified within a different system.
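Schematically (the data structures and strings below are hypothetical illustrations of my own, not KAMP's actual representations), the planner's freedom can be pictured as a one-to-many mapping from an illocutionary act to candidate surface speech acts, with grammatical constraints enforced by a separate component:

```python
# Hypothetical sketch: one illocutionary act, several candidate surface
# realizations. A grammar component (not shown) would filter candidates
# that violate grammatical constraints.

def surface_realizations(illocutionary_act):
    """Map a high-level REQUEST to candidate surface speech acts."""
    act, hearer, action = illocutionary_act
    # `hearer` would condition politeness choices; unused in this sketch.
    if act == "REQUEST":
        return [
            ("IMPERATIVE", f"{action.capitalize()}."),            # direct request
            ("INTERROGATIVE", f"Can you {action}?"),               # indirect request
            ("DECLARATIVE", f"I would like you to {action}."),     # indirect request
        ]
    return []

for mood, text in surface_realizations(("REQUEST", "hearer", "remove the pump")):
    print(mood, "->", text)
```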
The planning of natural-language utterances builds on contributions from a number of disciplines. The construction of the multiagent planning system is relevant to artificial intelligence research on planning and knowledge representation. The axiomatization of illocutionary acts discussed in Chapter 5 relies on results in speech act theory and the philosophy of language. Constructing a grammar of English builds on the study of syntax in linguistics and of semantics in both linguistics and philosophy. A complete survey of the relevant literature would go far beyond the scope of this book. This chapter is included to give the reader an overview of some of the most important research that is pertinent to utterance planning.
Language generation
It was quite a long time before the problem of language generation began to receive the attention it deserves. Since about 1982, however, there has been a virtual explosion in the quantity of research being done in this field, and a complete review of all of it could well fill a book (see Bolc and McDonald, forthcoming). This chapter presents an overview of some of the earlier work that provides a foundation for the research that follows.
Several early language-generation systems (e.g., Friedman, 1969) were designed more for the purpose of testing grammars than for communication.
This book is based on research I did in the Stanford University Computer Science Department for the degree of Doctor of Philosophy. I express my sincere gratitude to my dissertation reading committee: Terry Winograd, Gary Hendrix, Doug Lenat and Nils Nilsson. Their discussion and comments contributed greatly to the research reported here. Barbara Grosz's thoughtful comments on my thesis contributed significantly to the quality of the research. I also thank Phil Cohen and Bonnie Webber for providing detailed comments on the first draft of this book and for providing many useful suggestions, and Aravind Joshi for his efforts in editing the Cambridge University Press Studies in Natural Language Processing series.
This research was supported by the Office of Naval Research under contract N00014-80-C-0296, and by the National Science Foundation under grant MCS-8115105. The preparation of this book was in part made possible by a gift from the System Development Foundation to SRI International as part of a coordinated research effort with the Center for the Study of Language and Information at Stanford University.
This book would be totally unreadable were it not for the efforts of SRI International Senior Technical Editor Savel Kliachko, who transformed my muddled ramblings into golden prose.
Toward a theory of language generation and communication
A primary goal of natural-language generation research in artificial intelligence is to design a system that is capable of producing utterances with the same fluency as that of a human speaker. One could imagine a “Turing Test” of sorts in which a person was presented with a dialogue between a human and a computer and, on the basis of the naturalness of its use of the English language, asked to identify which participant was the computer. Unfortunately, no natural-language generation system yet developed can pass the test for an extended dialogue.
A language-generation system capable of passing this test would obviously have a great deal of syntactic competence. It would be capable of using correctly and appropriately such syntactic devices as conjunction and ellipsis; it would be competent at fitting its utterances into a discourse, using pronominal references where appropriate, choosing syntactic structures consistent with the changing focus, and giving an overall feeling of coherence to the discourse. The system would have a large knowledge base of basic concepts and commonsense knowledge so that it could converse about any situation that arose naturally in its domain.
However, even if a language-generation system met all the above criteria, it might still not be able to pass our “Turing Test,” for knowing only the syntactic and semantic rules of the language is not enough.
This chapter deals with the design and implementation of a planning system called KAMP (an acronym for Knowledge And Modalities Planner) that is capable of planning to influence another agent's knowledge and intentions. The motivation for the development of such a planning system is the production of natural-language utterances. However, a planner with such capabilities is useful in any domain in which information-gathering actions play an important role, even though the domain does not necessarily involve planning speech acts or coordinating actions among multiple agents.
One could imagine, for example, a police crime laboratory to which officers bring for analysis substances found at the scene of a crime. The system's goal is to identify the unknown substance. The planner would know of certain laboratory operations that agents would be capable of performing — in effect actions that would produce knowledge about what the substance is or is not. A plan would consist of a sequence of such information-gathering actions, and the result of executing the entire plan would be that the agent performing the actions knows the identity of the mystery substance. Since the primary motivation for KAMP is a linguistic one, most of the examples will be taken from utterance planning; the reader should note, however, that the mechanisms proposed are general and appear to have interesting applications in other areas as well.
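A minimal sketch of the crime-laboratory idea (the substances, tests, and function names below are invented for illustration, not taken from the text): each information-gathering action eliminates candidate identities, and the plan succeeds when the agent knows which candidate remains.

```python
# Hypothetical sketch of information-gathering planning: each laboratory
# test partitions the candidate substances; executing tests until one
# candidate remains models the agent "coming to know" the identity.

CANDIDATES = {"salt", "sugar", "chalk"}

# Each test maps a substance to an observable outcome.
TESTS = {
    "dissolve_in_water": {"salt": "dissolves", "sugar": "dissolves", "chalk": "insoluble"},
    "taste":             {"salt": "salty",     "sugar": "sweet",     "chalk": "none"},
}

def plan_and_execute(actual):
    possible = set(CANDIDATES)   # what the agent considers possible
    plan = []
    for test, outcomes in TESTS.items():
        if len(possible) == 1:   # identity is known; stop planning
            break
        observed = outcomes[actual]  # perform the information-gathering action
        possible = {s for s in possible if outcomes[s] == observed}
        plan.append(test)
    return plan, possible

plan, known = plan_and_execute("sugar")
print(plan, "->", known)  # after executing the plan, the agent knows the identity
```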
In order to understand the relationship between syntactic theory and how people parse sentences, it is first necessary to understand the more general relationship between the grammar and the general cognitive system (GCS). The Chomskyan view, adhered to by most linguists working within the modern generative framework, is that the grammar is a cognitive subsystem whose vocabulary and operations are defined independently of the GCS and account for the structure of language (Chomsky, 1980). Linguistics is thus the branch of theoretical cognitive psychology which explains language structure.
There is another possible relationship between the grammar and the GCS in which linguistics does not play a primary theoretical role in explaining language structure. On this view, the structure of language is explained by basic principles of the GCS – for example, the nature of concepts in interaction with basic properties of the human information processing system. If this view is correct, grammars become convenient organizational frameworks for describing the structure of language. Linguistics is then a descriptive rather than a theoretical branch of cognitive psychology. The linguistics-as-descriptive position was held by the American Structuralists and is presently being revived from a somewhat different perspective in the form of “cognitive grammar” (Lakoff, in press).
These two frameworks for understanding the relationship between grammars and the cognitive system – linguistics as explanation and linguistics as description – suggest different research strategies for answering the question posed by the theme of this book: namely, What is the relationship between syntactic theory and how listeners parse sentences?
There has been some interest in recent years in finding functional explanations for various properties of human languages. The general form of these explanations is
Languages have property P because if they did not, we
couldn't learn them; or
couldn't plan and produce sentences efficiently; or
couldn't understand sentences reliably and efficiently; or
wouldn't be able to express the sorts of messages we typically want to express.
Some linguists are dubious about the legitimacy of such investigations, which are indeed a notoriously risky undertaking. It is all too easy to be seduced by what looks like a plausible explanation for some linguistic phenomenon, but there is really no way of proving that it is the correct explanation, or even that functional considerations are relevant at all. What, then, can be said in favor of this line of research?
Setting aside the sheer fascination of finding answers to why-questions, we can point to some more practical benefits that may result. First, we may find out something about the learning mechanism, or the sentence processing mechanism, or whichever component of the language faculty provides a likely functional explanation for the linguistic facts. In this paper we will concentrate on the sentence parsing mechanism. (See Fodor and Crain, in preparation, for discussion of language learning.) It is clear that one can derive at least some interesting hypotheses about how the parser is structured, by considering how it would have to be structured in order to explain why certain sentences are ungrammatical, why there are constraints excluding certain kinds of ambiguity, and so forth.
Since the late 1970s there has been vigorous activity in constructing highly constrained grammatical systems by eliminating the transformational component either totally or partially. There is increasing recognition of the fact that the entire range of dependencies that transformational grammars in their various incarnations have tried to account for can be captured satisfactorily by classes of rules that are nontransformational and at the same time highly constrained in terms of the classes of grammars and languages they define.
Two types of dependencies are especially important: subcategorization and filler-gap dependencies. Moreover, these dependencies can be unbounded, and accounting for unbounded dependencies was one of the motivations for transformations. The so-called nontransformational grammars account for unbounded dependencies in different ways. In a tree adjoining grammar (TAG), unboundedness is achieved by factoring the dependencies and recursion in a novel and linguistically interesting manner. All dependencies are defined on a finite set of basic structures (trees), which are bounded; unboundedness is then a corollary of a particular composition operation called adjoining. In a sense, then, there are no unbounded dependencies at all.
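To make adjoining concrete (a toy encoding of my own, not a formal definition from the text): a tree can be represented as a nested list, and adjoining splices an auxiliary tree into a host tree at a node bearing the auxiliary's root label, reattaching the excised subtree at the foot node.

```python
# Toy sketch of TAG adjoining. A tree is [label, child, ...]; an
# auxiliary tree has root label X and a foot node "X*" where the
# excised host subtree is reattached. For simplicity this splices at
# the first matching node found; a real TAG designates a specific node.

def adjoin(tree, aux):
    root = aux[0]
    def walk(node):
        if isinstance(node, list) and node[0] == root:
            return substitute_foot(aux, node)   # splice the auxiliary tree here
        if isinstance(node, list):
            return [node[0]] + [walk(c) for c in node[1:]]
        return node
    return walk(tree)

def substitute_foot(aux, subtree):
    def walk(node):
        if node == aux[0] + "*":                # the foot node
            return subtree                      # reattach the excised subtree
        if isinstance(node, list):
            return [node[0]] + [walk(c) for c in node[1:]]
        return node
    return walk(aux)

initial = ["S", ["NP", "Harvey"], ["VP", ["V", "snores"]]]
aux     = ["VP", "VP*", ["Adv", "loudly"]]      # adjoins at the VP node
print(adjoin(initial, aux))
# ['S', ['NP', 'Harvey'], ['VP', ['VP', ['V', 'snores']], ['Adv', 'loudly']]]
```

Because the auxiliary tree can adjoin into structures derived by earlier adjoinings, dependencies stated locally on the elementary trees can end up arbitrarily far apart in the derived tree, which is the sense in which unboundedness is a corollary of adjoining.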
This factoring of recursion and dependencies is in contrast to transformational grammars (TG), where recursion is defined in the base and the transformations essentially carry out the checking of the dependencies. The phrase linking grammars (PLGs) (Peters and Ritchie, 1982) and the lexical functional grammars (LFGs) (Kaplan and Bresnan, 1983) share this aspect of TGs; that is, recursion builds up a set of structures, some of which are then filtered out by transformations in a TG, by the constraints on linking in a PLG, and by the constraints introduced via the functional structures in an LFG.