When there is light at the end of the tunnel, order more tunnel.
—Anonymous
But in our enthusiasm, we could not resist a radical overhaul of the system, in which all of its major weaknesses have been exposed, analyzed, and replaced with new weaknesses.
—Bruce Leverett
The research I have described in this book necessarily leaves unanswered many questions, big and small. Many sections, especially 3.8, 5.3.6, 5.4, and 7.4, have discussed things left undone by Absity, Polaroid Words, and the Semantic Enquiry Desk. In this chapter, I list a number of other open questions, partially baked ideas, and wild speculations, sometimes with my thoughts on how they might be answered or developed. Some could be dissertation topics; others may be good subjects for a term paper or course project. Several are psycholinguistic experiments. At the start of each question (if appropriate), I give in brackets the section or sections of this book in which the matter is discussed.
The representation of knowledge
Exercise 1.1 [1.1.2,1.3.1,5.2,5.6.3] Could a non-discrete representation of knowledge be developed for AI? Such a representation would be able to handle close similarities and differences, such as the head of a pin compared with the head of a hammer. Consider the possibility of pseudo-continuous representations that are to discrete representations as floating-point numbers are to integers. Candidates to consider include some kind of network of neuron-like nodes (cf. Feldman and Ballard 1982; Feldman 1985), a value-passing machine such as Fahlman, Hinton and Sejnowski's (1983) Thistle system, and a simulated-annealing or Boltzmann network (Kirkpatrick, Gelatt and Vecchi 1983; Fahlman, Hinton and Sejnowski 1983; Smolensky 1983) (see also exercise 4.8).
In the description of Absity in chapter 3, I made the unrealistic assumption that each word and pseudo-word corresponds to the same unique semantic object whenever and wherever it occurs; that is, I assumed there to be no lexical ambiguity and no case flag ambiguity. In this chapter, I will remove this assumption. The goal will be to develop a method for disambiguating words and case flags within the framework of Absity, finding the correct semantic object for an ambiguous lexeme.
Since Absity is “Montague-inspired” (sections 2.2.2 and 3.2), the obvious thing to do first is see how Montague handled lexical ambiguity in his PTQ formalism (Montague 1973) (see section 2.2.2). It turns out, however, that Montague had nothing to say on the matter. His PTQ fragment assumes, as we did in chapter 3 but no longer wish to, that there is a unique semantic object for each lexeme. Nor does Montague explicitly use case flags. The verbs of the fragment are all treated as one-place or two-place functions, and syntactic position in the sentence distinguishes the arguments. Nevertheless, there is an easy opening in the formalism where we may deal with lexical ambiguity: except for a few special words, Montague's formalism does not specify where the translation of a word comes from; rather, there is assumed to be a function g that maps a word α to its translation, or semantic object, g(α), and as long as g(α) (which is usually denoted α') is of the correct semantic type, it doesn't really matter how g does its mapping. This means that if we can “hide” disambiguation inside g, we need make no change to the formalism itself to deal with ambiguity in PTQ.
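The idea of hiding disambiguation inside g can be sketched in a few lines of Python. Everything here is illustrative and not from PTQ or Absity: the lexicon entries, sense names, and the context-matching policy are all invented for the example. The point is only that the caller of g sees a single, well-typed semantic object, never the ambiguity.

```python
# Hypothetical sketch: hiding lexical disambiguation inside the
# translation function g.  Sense names and the scoring policy are
# invented for illustration; only the *interface* of g matters.

# Each word maps to a set of candidate semantic objects, each with a
# declared semantic type.
LEXICON = {
    "bar": [("drinking-place", "noun-frame"),
            ("law-exam", "noun-frame")],
    "serve": [("bring-food", "verb-frame"),
              ("tennis-serve", "verb-frame")],
}

def g(word, context):
    """Return the translation of `word`.  Disambiguation happens
    inside the mapping, so the surrounding formalism never sees it."""
    candidates = LEXICON.get(word, [])
    if len(candidates) == 1:
        return candidates[0][0]
    # Prefer a sense already present in the current context; this
    # policy is hidden from the caller, as described in the text.
    for sense, _sem_type in candidates:
        if sense in context:
            return sense
    return candidates[0][0] if candidates else None

print(g("serve", context={"bring-food", "restaurant"}))  # bring-food
```

Because the formalism constrains only the semantic *type* of g(α), any such internal policy is admissible so long as the returned object has the right type.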
Like all other scientists, linguists wish they were physicists. They dream of performing classic feats like dropping grapefruits off the Leaning Tower of Pisa, of stunning the world with pithy truths like F = ma … [But instead,] the modern linguist spends his or her time starring or unstarring terse unlikely sentences like “John, Bill, and Tom killed each other”, which seethe with repressed frustration and are difficult to work into a conversation.
—Joseph D. Becker
When I took my first linguistics course, freshman transformational syntax, in 1974, we were taught that syntax was now basically under control. Sure, people still argued over particular transformations, and this was still all new and exciting stuff, but there was general agreement on the approach. Semantics, on the other hand, was a difficult and tenuous territory; no one yet understood what a semantic was. Semantics was said to have the same qualities as God or Mind—fun to argue about, but inherently unknowable. The study of semantics was left, therefore, until junior year.
Given linguists with attitudes like those toward semantics, it is not surprising that consumers of linguistic theory, such as researchers in natural language understanding, took semantic matters into their own hands. The result was approaches to semantics that were exemplary in their own terms but lacked a firm theoretical basis and hence were inadequate in their relationship to other aspects of language and to wider issues of meaning and representation of meaning. The best example of this is the dissertation of Woods (1967), which I will discuss in some detail in section 2.3.1.
This book is based on my doctoral dissertation at Brown University, submitted in December 1983. I have revised it extensively; in particular, I have kept the literature reviews up to date, and tried to take account of related work on the same topic that has been published since the original dissertation.
The work herein is interdisciplinary, and is perhaps best described as being in cognitive science. It takes in artificial intelligence, computational linguistics, and psycholinguistics, and I believe that it will be of interest to researchers in all three areas. Accordingly, I have tried to make it comprehensible to all by not assuming too much knowledge on the reader's part about any field. The incorporation of complete introductory courses was, however, impractical, and the reader may wish to occasionally consult introductory texts outside his or her main research area.
Organization of the book
Chapter 1 is an introductory chapter that sets out the topic of the work and the general approach. The problems of semantic interpretation, lexical disambiguation, and structural disambiguation are explained. For readers who haven't come across them before, there are also brief overviews of frame systems and case theories of language; people who already know it all can skip this. I then describe the research, and in particular the Frail frame language and the Paragram parser, that was the starting point for the present work.
Chapters 2 and 3 form Part I, on semantic interpretation. Chapter 2 is a detailed examination of past research on the topic, and discusses properties desirable in a semantic interpreter.
A great interpreter ought not to need interpretation.
—John Morley
Introduction
In this chapter, I describe the Absity semantic interpreter. Absity meets five of the six requirements for an interpreter listed in section 2.5, and provides a foundation for further research in meeting the remaining requirement.
Absity is part of the artificial intelligence research project at Brown University that was described in section 1.3. It uses one of the project's parsers, Paragram (see section 1.3.1), and the project's frame representation language, Frail (section 1.3.2). The implementation to be described is therefore necessarily dependent upon the nature of these other components, as are many aspects of Absity's design. Nevertheless, in keeping with the goals of this work, the design has been kept as independent as possible of the representation formalism and the parser. The main ideas in Absity should be usable with other representations that have a suitable notion of semantic object and also, in particular, with other parsers, transformational or otherwise.
The organization of this chapter is as follows. In the first half, Absity is gradually built up, by explaining alternately a strategy and then its use in Absity. I then give some examples and some implementation details. In the second half, Absity is put on trial, and its strengths and weaknesses are evaluated.
Two strategies: Strong typing and tandem processing
In the design of Absity, we will make use of two features of Montague's formalism (see section 2.2.2): a strong typing of semantic objects, and running syntax and semantics not just in parallel but in tandem. These strategies will allow us to simplify the system of semantic rules.
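The two strategies can be illustrated with a small sketch. This is not Absity's actual code: the class names, type labels, and the single NP rule below are invented for the example. It shows strong typing (a combination rule refuses ill-typed arguments) and tandem processing (the semantic object is built at the moment the parser makes the corresponding attachment).

```python
# Illustrative sketch, not Absity itself: each syntactic rule is paired
# with a semantic rule, so syntax and semantics run in tandem and every
# partial result is a well-formed, well-typed semantic object.

from dataclasses import dataclass

@dataclass
class SemObj:
    sem_type: str   # semantic type, mirroring a syntactic category
    value: str      # the (much simplified) semantic object itself

def interpret_np(det: SemObj, noun: SemObj) -> SemObj:
    # Strong typing: refuse to combine objects of the wrong types.
    assert det.sem_type == "determiner" and noun.sem_type == "frame"
    return SemObj("frame-instance", f"({det.value} {noun.value})")

# As the parser attaches "the" + "lasagna", the interpreter
# immediately builds the corresponding semantic object:
np = interpret_np(SemObj("determiner", "the"),
                  SemObj("frame", "lasagna"))
print(np.value)  # (the lasagna)
```

The payoff suggested in the text is that one generic semantic rule per syntactic rule suffices, because the typing does the work of ruling out nonsensical combinations.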
A year working in Artificial Intelligence is enough to make one believe in God.
—Alan Perlis
Introduction
In this chapter, I will show how structural disambiguation may be added to the Paragram–Absity–Polaroid Words system. I will do this in two stages. First, I will consider the present version of the system and the structural ambiguities that it handles, a small subset of those that I listed in section 6.2. (Not all of the sentence types that I listed can even be parsed by the present Paragram grammar.) The disambiguation methods will include a synthesis of some of the ones that we saw in section 6.3, as well as my own. Then, second, I will consider methods for extending the system's present limited range of abilities.
Although Paragram is basically a Marcus parser (see section 1.3.2), it has a somewhat different approach to semantics from that taken by Parsifal (Marcus 1980), which we saw in section 6.3.2. The two are similar in that they both assume the existence of a semantic process that they can ask for guidance when they need it. However, unlike Parsifal, Paragram is a trifle paranoid: it will never attach anything to anything, whether an ambiguity is possible or not, without first asking for permission from semantics. The semantic process that Paragram uses is called the Semantic Enquiry Desk (SED); it is the operation of this process that we discuss in the remainder of this chapter.
At present, Paragram knows about two types of structural ambiguity for which it requires assistance from the SED: prepositional phrase attachment and gap finding in relative clauses. In the following sections, I will show how the SED handles each of these.
In this chapter, I look at previous approaches to semantics and semantic interpretation in linguistics and NLU. The work I will examine addresses one or more of the following questions:
What kind of formalism, representation, or model can adequately capture the semantics of a natural language utterance? That is, what is a “semantic object”?
What kind of process can map a natural language utterance to these semantic objects?
Can these semantic objects be used in artificial intelligence applications? Once a sentence has been interpreted, how can its meaning be applied by the system to which it was input?
All three questions are of interest to AI researchers, but linguists care only about the first two and philosophers only about the first. We should therefore not be surprised or too disappointed if, in looking at work on semantics outside AI, we find that it stops short of where we wanted to go. Conversely, we may also find that a semantic theory is adequate for AI without satisfying the linguists or philosophers.
It is worth emphasizing the point that our concerns in semantics are not identical to everyone else's. We can identify two distinct (but obviously related) concerns (cf. Winograd 1984):
The study of abstract meaning and its relation to language and the world.
The study of how agents understand the meaning of a linguistic utterance in the world.
Too often the label ‘doing semantics’ is used vaguely to mean one or the other of these, or some random, shifting mixture of both, and the word ‘semantics’ is unwittingly used differently by different participants in discussions.
The goal of this chapter is to put the work described in previous chapters into perspective. First, I summarize the virtues of the work, and its potential. Then I compare it with similar work carried out concurrently by others. Finally, I list some of the questions that it leaves unanswered that ought to be answered.
The work in review
What has been achieved
I have presented a semantic interpreter and two disambiguation systems: one for lexical ambiguity and one for structural ambiguity. The systems have been designed to work closely with one another and with an existing parser and knowledge-representation system.
The semantic interpreter, Absity, is “Montague-inspired”, in that it adapts to AI several aspects of Montague's (1973) way of thinking about semantics: it is compositional; it has a strong notion of “semantic object”; it operates in tandem with a parser; its partial results are always well-formed semantic objects; and it imposes a strong typing upon its semantic objects. The semantic objects are objects of the knowledge representation, and the types are the types that the representation permits (which, we saw, correspond to the syntactic categories of English).
The structural disambiguator is the Semantic Enquiry Desk. It tells the parser what to do whenever the parser needs semantic help to decide between two or more alternative structures. The SED makes its decisions by looking at the semantic objects in the partial results of the semantic interpreter, and, if necessary, by calling the knowledge base for information on plausibility and referential success.
There's no need to think, dear. Just do what's in the script, and it'll all come out right.
—Alfred Hitchcock
What is necessary for lexical disambiguation?
The problem of determining the correct sense of a lexically ambiguous word in context has often been seen as one primarily of context recognition, a word being disambiguated to the unique meaning appropriate to the frame or script representing the known or newly-established context. For example, in the well-known SAM program (Cullingford 1978; Schank and the Yale AI Project 1975; Schank and Abelson 1977), each script has associated with it a set of word meanings appropriate to that script; in the restaurant script, there will be unique meanings given for such words as waiter and serve, and when (4-1) is processed:
(4-1) The waiter served the lasagna.
the fact that serve has quite a different meaning in the tennis script will not even be noticed (Schank and Abelson 1977:183). (Certain words naming well-known people and things, such as Manhattan, are always present (Cullingford 1978:13).)
In its most simple-minded form, the script approach can easily fail:
(4-2) The lawyer stopped at the bar for a drink.
If only the lawyering script is active, then bar as an establishment purveying alcoholic beverages by the glass will be unavailable, and its legal sense will be incorrectly chosen. If resolution is delayed until the word drink has had a chance to bring in a more suitable script, then there will be the problem of deciding which script to choose the sense from; as Charniak (1984) points out, it is reasonable to expect in practice to have fifty or more scripts simultaneously active, each with its own lexicon, necessitating extensive search and choice among alternatives.
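The failure mode just described can be made concrete with a toy model. The script names, sense names, and first-match policy below are all invented for illustration; they are not SAM's actual lexicons. With only the lawyering script active, the wrong sense of bar is inevitable, and activating more scripts only raises Charniak's problem of choosing among them.

```python
# A deliberately simple-minded toy model of script-based lexical
# disambiguation, illustrating the failure described above.  Script
# and sense names are invented for the example.

SCRIPT_LEXICONS = {
    "lawyering": {"bar": "bar-examination"},
    "drinking":  {"bar": "drinking-establishment"},
}

def disambiguate(word, active_scripts):
    """Return the first sense found in any active script's lexicon."""
    for script in active_scripts:
        sense = SCRIPT_LEXICONS[script].get(word)
        if sense is not None:
            return sense
    return None

# With only the lawyering script active, the wrong sense is chosen:
print(disambiguate("bar", ["lawyering"]))              # bar-examination

# Even after "drink" brings in another script, some policy must decide
# WHICH script's sense to prefer; first-match still gets it wrong:
print(disambiguate("bar", ["lawyering", "drinking"]))  # bar-examination
```

With fifty or more scripts simultaneously active, each with its own lexicon, any such policy entails extensive search and choice among alternatives.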
Most programming languages allow programmer-defined data structures (e.g. array … of …), and when there is a rich choice available (array, record, set, pointer, etc.) there is no doubt that very neat, expressive data models can be built. However, there is one major drawback: the syntax used for accessing each type of structure is distinctive and fixed. This has two effects. Firstly, if, for example, a list structure is altered from an array implementation to a record-with-pointer implementation, then every reference to the list in the program must be changed: the distinctive array-reference syntax (a[i]) has to be changed to record/pointer reference syntax (p↑.field). Secondly, the program becomes more machine-oriented and less problem-oriented because of the intrusion of programming details.
The way of avoiding the problems mentioned above is to think of a data structure not just as a storage area but as a collection of distinctive operations on certain data. This almost establishes the informal definition of an abstract data type (ADT):
ADT = Data Structure + Distinctive Operations
We have been using one abstract data type (the list) without naming it as such. Its distinctive operations are head and tail, ‘concatenate’ (∥) and ‘creation from elements’ (〈e1, e2, …, en〉). We have also introduced realisations or implementations of the abstract data type list in various languages – see Chapter 6. In fact in Section 6.3, Templates for FORTRAN, the implementation of LIST as a module shows the clear intention to treat the data space and the operations as an indivisible unit.
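The list ADT and its distinctive operations can be sketched as follows. This is a minimal realisation for illustration (not one of the Chapter 6 templates): the representation, here a Python tuple, is hidden behind the four operations, so it could be changed to a record-with-pointer style without touching any client code.

```python
# A minimal realisation of the list ADT: the distinctive operations
# head, tail, concatenate, and creation from elements.  The underlying
# representation (a tuple) is hidden behind the operations, so clients
# never use representation-specific access syntax.

def create(*elements):      # creation from elements: <e1, e2, ..., en>
    return tuple(elements)

def head(lst):              # first element
    return lst[0]

def tail(lst):              # everything after the first element
    return lst[1:]

def concatenate(a, b):      # the || operation
    return a + b

xs = create(1, 2, 3)
print(head(xs))                         # 1
print(tail(xs))                         # (2, 3)
print(concatenate(xs, create(4, 5)))    # (1, 2, 3, 4, 5)
```

Because clients call only these operations, swapping the tuple for any other implementation changes nothing outside the four definitions, which is exactly the drawback of fixed access syntax being avoided.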
In this chapter we address the issues of coding from a PDL into various real imperative programming languages. The PDL stage described in the previous chapter contains a complete (imperative) solution to the original problem so that the coding can now be finished without reference to the original problem. The intention in this chapter is to show that the final code generation can be accomplished using coding templates. Coding templates are shown for a variety of programming languages in common use.
Templates
Coding templates are stylised translations of each feature of a PDL. The methodical application of the templates to the PDL solution will yield the final code.
For any particular final coding language, a set of (coding) templates is created to translate each feature of the PDL in use. This means, for example, that every ‘if’ statement in the PDL is translated in the same way: each time the ‘if’ statement is met, it is coded using the same pattern or template. The templates are different for each final coding language. They are chosen more for generality than for elegance or efficiency. There may well be features of a final coding language that are not used in any template; in this methodology those features will never be used. This may seem an unacceptable loss at first sight. However, the experience of the authors is that the features not used in templates are those which are less widely used anyway, or not universally or consistently supported, and so their omission leads to more portable programs.
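The mechanical character of template application can be sketched as follows. The PDL fragment and the C-style target pattern are invented for illustration; the point is only that every ‘if’ in the PDL comes out through the same fixed pattern.

```python
# Illustrative sketch of a coding template: every PDL 'if' statement
# is translated by the same fixed pattern.  The PDL feature and the
# C-style target pattern are invented for the example.

# Template translating a PDL "if <cond> then <body> endif" into C:
C_IF_TEMPLATE = """if ({cond}) {{
    {body}
}}"""

def apply_if_template(cond, body):
    """Apply the template methodically; every 'if' is coded the same way."""
    return C_IF_TEMPLATE.format(cond=cond, body=body)

print(apply_if_template("x > 0", "y = 1;"))
# if (x > 0) {
#     y = 1;
# }
```

A full set of such templates, one per PDL feature, is what lets the final code be produced without reference back to the original problem.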