The planning of natural-language utterances builds on contributions from a number of disciplines. The construction of the multiagent planning system is relevant to artificial intelligence research on planning and knowledge representation. The axiomatization of illocutionary acts discussed in Chapter 5 relies on results in speech act theory and the philosophy of language. Constructing a grammar of English builds on the study of syntax in linguistics and of semantics in both linguistics and philosophy. A complete survey of the relevant literature would go far beyond the scope of this book. This chapter is included to give the reader an overview of some of the most important research that is pertinent to utterance planning.
Language generation
It was quite a long time before the problem of language generation began to receive the attention it deserves. Since about 1982, however, there has been a virtual explosion in the quantity of research being done in this field, and a complete review of all of it could well fill a book (see Bolc and McDonald, forthcoming). This chapter presents an overview of some of the earlier work that provides a foundation for the research that follows.
Several early language-generation systems (e.g., Friedman, 1969) were designed more for the purpose of testing grammars than for communication.
This book is based on research I did in the Stanford University Computer Science Department for the degree of Doctor of Philosophy. I express my sincere gratitude to my dissertation reading committee: Terry Winograd, Gary Hendrix, Doug Lenat and Nils Nilsson. Their discussion and comments contributed greatly to the research reported here. Barbara Grosz's thoughtful comments on my thesis contributed significantly to the quality of the research. I also thank Phil Cohen and Bonnie Webber for providing detailed comments on the first draft of this book and for providing many useful suggestions, and Aravind Joshi for his efforts in editing the Cambridge University Press Studies in Natural Language Processing series.
This research was supported by the Office of Naval Research under contract N00014-80-C-0296, and by the National Science Foundation under grant MCS-8115105. The preparation of this book was in part made possible by a gift from the System Development Foundation to SRI International as part of a coordinated research effort with the Center for the Study of Language and Information at Stanford University.
This book would be totally unreadable were it not for the efforts of SRI International Senior Technical Editor Savel Kliachko, who transformed my muddled ramblings into golden prose.
Toward a theory of language generation and communication
A primary goal of natural-language generation research in artificial intelligence is to design a system that is capable of producing utterances with the same fluency as that of a human speaker. One could imagine a “Turing Test” of sorts in which a person was presented with a dialogue between a human and a computer and, on the basis of the naturalness of its use of the English language, asked to identify which participant was the computer. Unfortunately, no natural-language generation system yet developed can pass the test for an extended dialogue.
A language-generation system capable of passing this test would obviously have a great deal of syntactic competence. It would be capable of using correctly and appropriately such syntactic devices as conjunction and ellipsis; it would be competent at fitting its utterances into a discourse, using pronominal references where appropriate, choosing syntactic structures consistent with the changing focus, and giving an overall feeling of coherence to the discourse. The system would have a large knowledge base of basic concepts and commonsense knowledge so that it could converse about any situation that arose naturally in its domain.
However, even if a language-generation system met all the above criteria, it might still not be able to pass our “Turing Test” because to know only about the syntactic and semantic rules of the language is not enough.
This chapter deals with the design and implementation of a planning system called KAMP (an acronym for Knowledge And Modalities Planner) that is capable of planning to influence another agent's knowledge and intentions. The motivation for the development of such a planning system is the production of natural-language utterances. However, a planner with such capabilities is useful in any domain in which information-gathering actions play an important role, even though the domain does not necessarily involve planning speech acts or coordinating actions among multiple agents.
One could imagine, for example, a police crime laboratory to which officers bring for analysis substances found at the scene of a crime. The system's goal is to identify the unknown substance. The planner would know of certain laboratory operations that agents would be capable of performing — in effect actions that would produce knowledge about what the substance is or is not. A plan would consist of a sequence of such information-gathering actions, and the result of executing the entire plan would be that the agent performing the actions knows the identity of the mystery substance. Since the primary motivation for KAMP is a linguistic one, most of the examples will be taken from utterance planning; the reader should note, however, that the mechanisms proposed are general and appear to have interesting applications in other areas as well.
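The crime-lab scenario can be made concrete with a small sketch. The substances, the tests, and the simple sequential control structure below are invented for illustration; they are not KAMP's actual representation of actions and knowledge. The point is only that a plan is a sequence of information-gathering actions whose combined outcomes leave the agent knowing the identity of the substance.

```python
# Illustrative sketch: planning with information-gathering actions.
# Candidate substances and tests are hypothetical examples.

CANDIDATES = {"heroin", "sugar", "aspirin", "baking soda"}

# Each test is an action whose outcome tells the agent whether the
# sample lies in the given subset of candidates.
TESTS = {
    "flame_test": {"sugar", "baking soda"},
    "solubility_test": {"sugar", "heroin"},
}

def identify(hidden, candidates, tests):
    """Execute tests in sequence, narrowing the candidate set after each
    outcome; the plan succeeds when exactly one candidate remains, i.e.
    when the agent knows what the substance is."""
    remaining = set(candidates)
    plan = []
    for name, positive in tests.items():
        if len(remaining) == 1:
            break
        in_positive = hidden in positive          # simulated lab result
        remaining &= positive if in_positive else (set(candidates) - positive)
        plan.append((name, in_positive))
    return plan, remaining
```

Whatever the hidden substance is, executing both tests narrows the four candidates to one, so the result of executing the entire plan is that the agent knows the identity of the mystery substance.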
In order to understand the relationship between syntactic theory and how people parse sentences, it is first necessary to understand the more general relationship between the grammar and the general cognitive system (GCS). The Chomskyan view, adhered to by most linguists working within the modern generative framework, is that the grammar is a cognitive subsystem whose vocabulary and operations are defined independently of the GCS and account for the structure of language (Chomsky, 1980). Linguistics is thus the branch of theoretical cognitive psychology which explains language structure.
There is another possible relationship between the grammar and the GCS in which linguistics does not play a primary theoretical role in explaining language structure. On this view, the structure of language is explained by basic principles of the GCS – for example, the nature of concepts in interaction with basic properties of the human information processing system. If this view is correct, grammars become convenient organizational frameworks for describing the structure of language. Linguistics is then a descriptive rather than a theoretical branch of cognitive psychology. The linguistics-as-descriptive position was held by the American Structuralists and is presently being revived from a somewhat different perspective in the form of “cognitive grammar” (Lakoff, in press).
These two frameworks for understanding the relationship between grammars and the cognitive system – linguistics as explanation and linguistics as description – suggest different research strategies for answering the question posed by the theme of this book: namely, What is the relationship between syntactic theory and how listeners parse sentences?
There has been some interest in recent years in finding functional explanations for various properties of human languages. The general form of these explanations is
Languages have property P because if they did not, we
couldn't learn them; or
couldn't plan and produce sentences efficiently; or
couldn't understand sentences reliably and efficiently; or
wouldn't be able to express the sorts of messages we typically want to express.
Some linguists are dubious about the legitimacy of such investigations, and they are indeed a notoriously risky undertaking. It is all too easy to be seduced by what looks like a plausible explanation for some linguistic phenomenon, but there is really no way of proving that it is the correct explanation, or even that functional considerations are relevant at all. What, then, can be said in favor of this line of research?
Setting aside the sheer fascination of finding answers to why-questions, we can point to some more practical benefits that may result. First, we may find out something about the learning mechanism, or the sentence processing mechanism, or whichever component of the language faculty provides a likely functional explanation for the linguistic facts. In this paper we will concentrate on the sentence parsing mechanism. (See Fodor and Crain, in preparation, for discussion of language learning.) It is clear that one can derive at least some interesting hypotheses about how the parser is structured by considering how it would have to be structured in order to explain why certain sentences are ungrammatical, why there are constraints excluding certain kinds of ambiguity, and so forth.
Since the late 1970s there has been vigorous activity in constructing highly constrained grammatical systems by eliminating the transformational component either totally or partially. There is increasing recognition of the fact that the entire range of dependencies that transformational grammars in their various incarnations have tried to account for can be captured satisfactorily by classes of rules that are nontransformational and at the same time highly constrained in terms of the classes of grammars and languages they define.
Two types of dependencies are especially important: subcategorization and filler-gap dependencies. Moreover, these dependencies can be unbounded. One of the motivations for transformations was to account for unbounded dependencies. The so-called nontransformational grammars account for the unbounded dependencies in different ways. In a tree adjoining grammar (TAG) unboundedness is achieved by factoring the dependencies and recursion in a novel and linguistically interesting manner. All dependencies are defined on a finite set of basic structures (trees), which are bounded. Unboundedness is then a corollary of a particular composition operation called adjoining. There are thus no unbounded dependencies in a sense.
This factoring of recursion and dependencies is in contrast to transformational grammars (TG), where recursion is defined in the base and the transformations essentially carry out the checking of the dependencies. The phrase linking grammars (PLGs) (Peters and Ritchie, 1982) and the lexical functional grammars (LFGs) (Kaplan and Bresnan, 1983) share this aspect of TGs; that is, recursion builds up a set of structures, some of which are then filtered out by transformations in a TG, by the constraints on linking in a PLG, and by the constraints introduced via the functional structures in an LFG.
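The factoring of dependencies and recursion can be sketched with a toy adjoining operation. The trees and labels below are illustrative assumptions, not a serious linguistic analysis, and for simplicity adjoining applies at the topmost node whose label matches the auxiliary tree's root.

```python
# Toy TAG adjoining. A tree is (label, [children]); leaves are strings,
# and the auxiliary tree's foot node is the leaf "S*".

def plug_foot(aux, subtree):
    """Replace the foot node of the auxiliary tree with `subtree`."""
    if isinstance(aux, str):
        return subtree if aux.endswith("*") else aux
    label, children = aux
    return (label, [plug_foot(c, subtree) for c in children])

def adjoin(tree, aux):
    """Adjoin `aux` at the topmost node of `tree` matching aux's root."""
    if isinstance(tree, str):
        return tree
    label, children = tree
    if label == aux[0]:
        return plug_foot(aux, tree)
    return (label, [adjoin(c, aux) for c in children])

def words(tree):
    """The terminal yield of a tree."""
    if isinstance(tree, str):
        return [tree]
    return [w for c in tree[1] for w in words(c)]

# Elementary (initial) tree: its subject-verb dependency is local and bounded.
initial = ("S", [("NP", ["Harry"]), ("VP", [("V", ["left"])])])

# Auxiliary tree rooted in S, with foot node S*.
aux = ("S", [("NP", ["Mary"]), ("VP", [("V", ["thinks"]), "S*"])])

once = adjoin(initial, aux)     # "Mary thinks Harry left"
twice = adjoin(once, aux)       # "Mary thinks Mary thinks Harry left"
```

Repeated adjoining embeds the elementary tree at arbitrary depth, yet the dependency between "Harry" and "left" is stated once, on the bounded elementary tree; unboundedness comes entirely from the composition operation.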
In this paper I want to draw together a number of observations bearing on how people interpret constituent questions. The observations concern the interpretation possibilities for “moved” and “unmoved” wh-phrases, as well as wide scope interpretation of quantifiers in embedded sentences. I will argue that languages typically display a correlation between positions that do not allow extractions and positions where a constituent cannot be interpreted with wide scope. Given this correlation, it seems natural to investigate the processes of extraction and wide-scope interpretation from the perspective of sentence processing, in the hope of explaining correlations between the two. I have singled out constituent questions because they illustrate the parsing problem for sentences with nonlocal filler-gap dependencies; they are a particularly interesting case to consider because of interactions between scope determining factors and general interpretive strategies for filler-gap association.
Gap-filling
To what extent is the process of gap-filling sensitive to formal, as opposed to semantic, properties of the linguistic input? One type of evidence that is relevant here is the existence of a morphological dependency between the filler and the environment of the gap, as illustrated in (1).
(1) a. Which people did Mary say — were invited to dinner?
b. *Which people did Mary say — was invited to dinner?
In languages with productive case marking, a similar type of dependency will hold between the case of the filler and the local environment of the gap. This kind of morphological agreement is typically determined by properties having to do with the surface form of the items in question, or with inherent formal properties, such as which noun class a given noun belongs to.
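The number dependency in (1) can be stated as a trivially checkable condition. The mini-lexicon and the convention of marking the gap with '_' are illustrative assumptions, not a model of the human gap-filling mechanism.

```python
# Toy check of filler-gap number agreement, as in (1a)/(1b).
# The lexicon records only the formal feature relevant here: number.

NUMBER = {"people": "pl", "person": "sg", "were": "pl", "was": "sg"}

def gap_agreement_ok(tokens):
    """tokens: a sentence with the gap marked '_', the filler noun
    immediately after 'which', and the agreeing verb after the gap."""
    filler = tokens[tokens.index("which") + 1]
    verb = tokens[tokens.index("_") + 1]
    return NUMBER.get(filler) == NUMBER.get(verb)

ok = "which people did Mary say _ were invited to dinner".split()
bad = "which people did Mary say _ was invited to dinner".split()
```

The check passes for (1a) and fails for (1b), reflecting that the dependency is carried by a formal property of the filler (its number) rather than by its meaning.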
The ostensible goal of this paper is to construct a general complexity metric for the processing of natural language sentences, focusing on syntactic determinants of complexity in sentence comprehension. The ultimate goal, however, is to determine how the grammars of natural languages respond to different types of syntactic processing complexity.
A complexity metric that accurately predicts the relative complexity of processing different syntactic structures is not, in itself, of much theoretical interest. There does not seem to be any compelling reason for linguistic theory or psycholinguistic theory to incorporate such a metric. Rather, ultimately the correct complexity metric should follow directly as a theorem or consequence of an adequate theory of sentence comprehension.
Different theories of sentence comprehension typically lead to distinct predictions concerning the relative perceptual difficulty of sentences. Hence, one reason for developing a complexity metric is simply to help pinpoint inadequacies of current theories of sentence comprehension and to aid in the evaluation and refinement of those theories. An explicit complexity metric should also help to reveal the relation between the human sentence processor and the grammars of natural languages. In particular, developing a well-motivated complexity metric is a crucial prerequisite for evaluating the hypothesis that the grammars of natural languages are shaped in some respect by the properties of the human sentence processor, since the most common form of this hypothesis claims that grammars tend to avoid generating sentences that are extremely difficult to process.
In this paper we describe an experiment in sentence processing intended to relate two properties of syntactic structures that have received much discussion in linguistics and psychology (see references cited in the next section). First, some syntactic structures, such as the passive construction, require more processing effort than corresponding structures that express the same grammatical relations. Passive sentences in particular have been the subject of much experimental work. Second, it is clear, as Jespersen (1924) observed, that the difference between active and passive sentences has something to do with focus of attention on a particular constituent, the grammatical subject. The consequences of this difference in focus of attention are in some way related to the context formed by the discourse in which the sentence occurs. In this experiment we studied syntactic structures that might have properties similar to those of the active/passive alternation, so as to define exactly which features of passive sentences are responsible for their observed greater processing demands and for their definition of focus of attention, or sentence topic. One premise underlying the hypotheses we wanted to test is that processing load and definition of sentence topic are related in some way.
We combined sentences exemplifying five different syntactic constructions with context sentences having different relations to the target sentences, and measured reaction time for reading and understanding the second, or target, sentence. The results show that there is a fairly consistent ordering of processing load for the other constructions as well as for the passive, and that overall processing time is sensitive to both syntactic structure and contextual information.
Language is a system for encoding and transmitting ideas. A theory that seeks to explain linguistic phenomena in terms of this fact is a functional theory. One that does not misses the point. In particular, a theory that shows how the sentences of a language are all generable by rules of a particular formal system, however restricted that system may be, does not explain anything. It may be suggestive, to be sure, because it may point to the existence of an encoding device whose structure that formal system reflects. But, if it points to no such device, it simply constitutes a gratuitous and wholly unwelcome addition to the set of phenomena to be explained.
A formal system that is decorated with informal footnotes and amendments explains even less. If I ask why some phenomenon, say relativization from within the subject of a sentence, does not take place in English and am told that it is because it does not take place in any language, I go away justifiably more perplexed than I came. The theory that attempts to explain things in this way is not functional. It tells me only that the source of my perplexity is more widespread than I had thought. The putative explanation makes no reference to the only assertion that is sufficiently self-evident to provide a basis for linguistic theory, namely that language is a system for encoding and transmitting ideas.
Kimball's parsing principles (Kimball, 1973), Frazier and Fodor's Sausage Machine (Frazier and Fodor, 1978; Fodor and Frazier, 1980) and Wanner's augmented transition network (ATN) model (Wanner, 1980) have tried to explain why certain readings of structurally ambiguous sentences are preferred to others, in the absence of semantic information. The kinds of ambiguity under discussion are exemplified by the following two sentences.
(1) Tom said that Bill had taken the cleaning out yesterday.
(2) John bought the book for Susan.
For sentence (1), the reading ‘Yesterday Bill took the cleaning out’ is preferred to ‘Tom spoke yesterday about Bill taking the cleaning out.’ Kimball (1973) introduced the principle of Right Association (RA) to account for this kind of preference. The basic idea of Right Association is that, in the absence of other information, phrases are attached to a partial analysis as far to the right as possible.
For sentence (2), the reading ‘The book was bought for Susan’ is preferred to ‘John bought a book that had been beforehand destined for Susan.’ To account for this preference, Frazier and Fodor (1978) introduced the principle of Minimal Attachment (MA), which may be summarized as stating that, in the absence of other information, phrases are attached so as to minimize the complexity of the analysis.
Much of the debate about the formulation and interaction of such principles is caused by their lack of precision and, at the same time, by their being too specific. I propose a simple, precise, and general framework in which improved versions of Right Association and Minimal Attachment can be formulated.
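One way to see how the two principles interact is as an ordered preference over candidate attachments. The encoding below, with its node counts and site positions, is my own schematic simplification for sentences (1) and (2), not the framework proposed in this paper.

```python
# Schematic sketch: Minimal Attachment decides first (fewest new nodes);
# Right Association breaks ties (rightmost attachment site).
# Each candidate is (description, nodes_added, site_position), where a
# larger site_position means an attachment site further to the right.

def preferred(candidates):
    """Choose the candidate with fewest added nodes, then rightmost site."""
    return min(candidates, key=lambda c: (c[1], -c[2]))

# Sentence (2): attaching 'for Susan' to the VP builds fewer new nodes
# than attaching it inside the NP 'the book', so MA prefers the VP reading.
pp_attachment = [
    ("attach PP to VP", 1, 2),
    ("attach PP to NP", 2, 4),
]

# Sentence (1): both attachments of 'yesterday' build the same number of
# nodes, so RA breaks the tie in favor of the rightmost (lower) clause.
adverb_attachment = [
    ("modify 'said'", 1, 1),
    ("modify 'had taken'", 1, 6),
]
```

On this encoding the sketch selects the VP attachment for (2) and the lower-clause attachment for (1), matching the reported preferences; the interesting empirical questions, of course, concern how nodes and positions are to be counted in the first place.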
Speakers in certain bilingual communities systematically produce utterances in which they switch from one language to another, possibly several times in the course of an utterance, a phenomenon called code switching. Production and comprehension of utterances with intrasentential code switching is part of the linguistic competence of the speakers and hearers of these communities. Much of the work on code switching has been sociolinguistic or at the discourse level; there have been few studies of code switching within the scope of a single sentence, and until recently the phenomenon had not been studied in a formal or computational framework.
The discourse level of code switching is important; however, it is only at the intrasentential level that we are able to observe with some certainty the interaction between two grammatical systems. These interactions, to the extent they can be systematically characterized, provide a nice framework for investigating some processing issues both from the generation and the parsing points of view.
There are some important characteristics of intrasentential code switching which give hope for the kind of work described here. These are as follows. (1) The situation we are concerned with involves participants who are about equally fluent in both languages. (2) Participants have fairly consistent judgments about the “acceptability” of mixed sentences. (In fact it is amazing that participants have such acceptability judgments at all.) (3) Mixed utterances are spoken without hesitation, pauses, repetitions, corrections, etc., suggesting that intrasentential code switching is not some random interference of one system with the other.
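The bare notion of a switch point within a sentence can be made concrete with a toy sketch. The two miniature lexicons below are invented, and the model is far simpler than any formal system for code switching; it merely locates where a mixed utterance changes language.

```python
# Toy illustration of intrasentential switch points. Each word is
# assigned to one of two invented lexicons; a switch point is an
# adjacent pair of words drawn from different languages.

LEX_EN = {"the", "book", "is", "on", "table"}
LEX_ES = {"el", "libro", "esta", "en", "la", "mesa"}

def switch_points(tokens):
    """Count adjacent word pairs that come from different languages."""
    langs = []
    for w in tokens:
        if w in LEX_EN:
            langs.append("en")
        elif w in LEX_ES:
            langs.append("es")
        else:
            raise ValueError(f"word not in either lexicon: {w}")
    return sum(1 for a, b in zip(langs, langs[1:]) if a != b)
```

A mixed sentence such as "el libro is on the table" has one switch point, while a monolingual sentence has none; a formal account of code switching must then say which of the possible switch points yield sentences the participants judge acceptable.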