The next part in this book concerns linguistic issues for lexical semantics. The main issues addressed are based on the notion of Generative Lexicon (Pustejovsky, 1991) and its consequences for the construction of lexicons.
The first chapter, “Linguistic constraints on type coercion,” by James Pustejovsky, summarizes the foundations of the Generative Lexicon, which he introduced a few years earlier. The text investigates how best to characterize the formal mechanisms and the linguistic data necessary to explain the behavior of logical polysemy. It then studies a comprehensive range of polymorphic behaviors that account for the variations in semantic expressiveness found in natural languages.
Within the same formal linguistic paradigm, we then have a contribution by Sabine Bergler, “From lexical semantics to text analysis,” which illustrates several issues of the Generative Lexicon using data from the Wall Street Journal. This chapter addresses in depth an important question for the Generative Lexicon: what methods can be used to derive, from linguistic analysis, lexical entries with precise semantic content for the qualia roles. Special attention is devoted to the production of partial representations and to the incremental analysis of texts.
The next chapter, “Lexical functions, generative lexicons and the world” by Dirk Heylen, explores the convergences and divergences between Mel'čuk's analysis of lexical functions and the generative lexicon approach. The author then proposes an interesting and original knowledge representation method based on lexical functions mainly following Mel'čuk's approach.
In this chapter, we present a synopsis of several notions of psycholinguistics and linguistics that are relevant to the field of lexical semantics. We mainly focus on the notions or theoretical approaches that are broadly used and accepted in computational linguistics. Lexical semantics now plays a central role in computational linguistics, alongside grammar formalisms for parsing and generation and the production of sentence and discourse semantic representations. This central role can be explained by the fact that lexical entries contain a considerable part of the information related to the word senses they represent.
This introduction will provide the reader with some basic concepts in the field of lexical semantics and should also be considered as a guide to the chapters included in this book. We first present some basic concepts of psycholinguistics which have some interest for natural language processing. We then focus on the linguistic aspects which are commonly admitted to contribute substantially to the field. It has not, however, been possible to include all aspects of lexical semantics: the absence of certain approaches should not be considered as an a priori judgment on their value.
The first part of this text introduces psycholinguistic notions of interest to lexical semantics; we then present linguistic notions more in depth. At the end of this chapter, we review the chapters in this volume.
Contribution of psycholinguistics to the study of word meaning
Results from psycholinguistic research can give us a good idea of how concepts are organized in memory, and how this information is accessed in the mental lexicon.
Many words have two or more very distinct meanings. For example, the word pen can refer to a writing implement or to an enclosure. Many natural language applications, including information retrieval, content analysis and automatic abstracting, and machine translation, require the resolution of lexical ambiguity for words in an input text, or are significantly improved when this can be accomplished. That is, the preferred input to these applications is a text in which each word is “tagged” with the sense of that word intended in the particular context. However, at present there is no reliable way to automatically identify the correct sense of a word in running text. This task, called word sense disambiguation, is especially difficult for texts from unconstrained domains because the number of ambiguous words is potentially very large. The magnitude of the problem can be reduced by considering only very gross sense distinctions (e.g., between the pen-as-implement and pen-as-enclosure senses of pen, rather than between finer sense distinctions within, say, the category of pen-as-enclosure – i.e., enclosure for animals, enclosure for submarines, etc.), which is sufficient for many applications. But even so, substantial ambiguity remains: for example, even the relatively small lexicon (20,000 entries) of the TRUMP system, which includes only gross sense distinctions, finds an average of about four senses for each word in sentences from the Wall Street Journal (McRoy, 1992). The resulting combinatorial explosion demonstrates the magnitude of the lexical ambiguity problem.
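The scale of this combinatorial explosion is easy to make concrete. The following sketch (illustrative only; the four-senses-per-word figure is the TRUMP average cited above, not an exact count for any particular sentence) shows how the number of candidate sense assignments grows with sentence length:

```python
# Illustrative sketch: with one sense chosen per word, the number of
# full sense assignments for a sentence is the product of the per-word
# sense counts, so it grows exponentially with sentence length.

def candidate_readings(senses_per_word):
    """Number of ways to assign one sense to each word in a sentence."""
    total = 1
    for n in senses_per_word:
        total *= n
    return total

# A ten-word sentence where each word has ~4 senses (the TRUMP average):
print(candidate_readings([4] * 10))  # 4**10 = 1048576 candidate readings
```

Even with only gross sense distinctions, a ten-word sentence admits over a million sense assignments, which is why purely enumerative disambiguation is infeasible.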
Several different kinds of information can contribute to the resolution of lexical ambiguity.
Lexical choice cannot be processed in text generation without appealing to a lexicon which takes into account many lexico-semantic relations. The text generation system must be able to treat the immediate and the larger lexical context.
a) The immediate lexical context consists of the lexemes that surround the lexical item to be generated. This context must absolutely be taken into account in the case of collocational constraints, which restrict ways of expressing a precise meaning to certain lexical items, for example as in expressions like pay attention, receive attention or narrow escape (Wanner & Bateman, 1990; Iordanskaja et al., 1991; Nirenburg & Nirenburg, 1988; Heid & Raab, 1989).
b) The larger textual context consists of the linguistic content of previous and subsequent clauses. This context is the source for cohesive links (Halliday & Hasan, 1976) with the lexical items to be generated in the current clause, as in:
(1) Professor Elmuck was lecturing on lexical functions to third-year students. The lecturer was interesting and the audience was very attentive.
In the second sentence, lecturer is coreferential with Professor Elmuck and the audience is coreferential with third-year students. These semantic links are due to the lexico-semantic relations between, on the one hand, lecturer (“agent noun”) and lecture, and, on the other, audience (“patient noun”) and lecture.
In this chapter, we will show that the Lexical Functions (LFs) of the Explanatory Combinatorial Dictionary (hereafter ECD) (Mel'čuk et al., 1984a, 1988; Mel'čuk & Polguère, 1987; Meyer & Steele, 1990) are well suited for these tasks in text generation.
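The idea can be sketched as a simple lookup table. The LF names below (Oper1, Magn, S1, S2) are standard ECD notation; the sample entries are illustrative values chosen to match the examples above (pay attention, narrow escape, lecturer/audience), not excerpts from an actual ECD:

```python
# Minimal sketch of ECD-style lexical functions for lexical choice in
# text generation. Each lexical function maps a keyword to the lexeme
# that realizes a given meaning in combination with that keyword.

LEXICAL_FUNCTIONS = {
    # Oper1: the support ("light") verb taken by the noun's first actant
    ("Oper1", "attention"): "pay",
    ("Oper1", "lecture"):   "give",
    # Magn: an intensifier appropriate to the keyword
    ("Magn", "escape"):     "narrow",
    # S1 / S2: agent and patient nouns derived from the keyword
    ("S1", "lecture"):      "lecturer",
    ("S2", "lecture"):      "audience",
}

def apply_lf(lf, keyword):
    """Return the value of lexical function `lf` on `keyword`, if listed."""
    return LEXICAL_FUNCTIONS.get((lf, keyword))

# Collocational choice: the right support verb for 'attention' is 'pay'.
print(apply_lf("Oper1", "attention"))  # pay
# Cohesive link from example (1): the agent noun of 'lecture'.
print(apply_lf("S1", "lecture"))       # lecturer
```

A real ECD entry carries much richer information (government patterns, semantic decompositions), but even this table form shows how LFs serve both collocational constraints and cohesive links.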
In order to help characterize the expressive power of natural languages in terms of semantic expressiveness, it is natural to think of semantic systems with increasing functional power, and a natural way of capturing this is through the type system which the grammar refers to for its interpretation. What I would like to discuss in this chapter is a method for describing how semantic systems fall on a hierarchy of increasing expressiveness and descriptive power, and to investigate various phenomena in natural language which indicate (1) that we need a degree of expressiveness in our semantics that we have not considered before, but also (2) that the data make it clear we need natural constraints on the mechanisms which give rise to such expressive systems. After reviewing the range of semantic types from monomorphic languages to unrestricted polymorphic languages, I will argue that we should aim for a model which permits only a restricted amount of polymorphic behavior. I will characterize this class of languages as semi-polymorphic.
I will outline what kind of semantic system produces just this class of languages. I will argue that something like the generative lexicon framework is necessary to capture the richness of type shifting and sense extension phenomena.
Let me begin the discussion on expressiveness by reviewing how this same issue was played out in the realm of syntactic frameworks in the 1950s.
The work described in this chapter starts from the observation that a word in a text has a semantic value which is seldom identical with any of the definitions found in a dictionary. This fact was of little importance as long as dictionaries were primarily intended for human beings, since the process used to convert the lexical meaning into the semantic value seems well mastered by humans – at least by the potential users of dictionaries – but it becomes of prominent importance now that we need computer-oriented dictionaries.
As a matter of fact, the computation of the semantic value for a given word requires lexical information about that word, about the other words of the text, about syntax, and about the world. The set of possible values for a given word X is open-ended: given any list of values, it is always possible to build some context in which X takes a value not present in the list. As a consequence, no dictionary, however large, can contain all of them. It is therefore necessary, in any case, to implement a mechanism which constructs the value from information taken from the dictionary, as well as from knowledge of the grammar and of the world.
In artificial intelligence (henceforth A.I.), the main objective is not to find the “correct” meaning of each word or phrase, but to get the set of consequences which can be drawn from a text; if the same set is obtained from “different” interpretations (i.e., interpretations using different values for some word), then the difference is irrelevant.
The first topic, psycholinguistic and cognitive aspects of lexical semantics, is addressed in the first two chapters. This area is particularly active but relatively ignored in computational circles, perhaps because of a lack of precise methods and formalisms. It is, however, crucial for the construction of well-designed semantic lexicons, since it brings empirical, psychologically grounded analysis into the domain of computational lexical semantics.
“Polysemy and related phenomena from a cognitive linguistic viewpoint” by Alan Cruse surveys the ways in which the contribution that the same grammatical word makes to the meaning of a larger unit varies with context. Two main explanatory hypotheses accounting for this contextual variation are explored: lexical semantics and pragmatics. Alternatives to this approach are studied in the other parts of the volume.
The second chapter, “Mental lexicon and machine lexicon: Which properties are shared by machine and mental word representations? Which are not?” by J.-F. Le Ny, sets language comprehension and interpretation within a general cognitive science perspective. Properties common to natural and artificial semantic units (e.g., denotation, super-ordination, case roles, etc.) are first explored. Then, problems related to activability and accessibility in the memory are addressed in both a theoretical and experimental way.
There has been some disagreement about when circles (or closed curves) began being used for representing classical syllogisms. They seem to have first been put to this use in the Middle Ages. However, there seems to be agreement that it was Leonhard Euler, in the eighteenth century, who proposed using circles to illustrate relations between classes. This diagrammatic method of Euler's was greatly improved by the nineteenth-century logician John Venn. And in this century, it was Charles Peirce who made a great contribution to the further development of Venn diagrams.
This chapter explores the essence of Euler diagrams and their descendants, and will serve to prepare the reader for my approach to Venn diagrams presented in the following chapters. In each section, along with the main ideas of each system and its limits, I focus on how some of the main limits of one system are overcome by the following system. That is, the Venn system solves some of the main problems that the Euler system has. This improvement was significant enough to make necessary a distinction between Euler diagrams and Venn diagrams. I will show that Peirce's revolutionary ideas about diagrams not only overcame some important defects of Venn diagrams but opened the way to a totally new horizon for logical diagrams. This last aspect will be discussed in detail in the third section. I will also point out where this new horizon stopped, and will claim that my approach to Venn diagrams (in Chapters 3 and 4) is the natural completion of these predecessors' incomplete projects.
In this chapter, I claim that Venn-II is equivalent to a first-order language ℒ0, which I will specify in the first section. This claim is supported by two subclaims. One is that for every diagram D of Venn-II, there is a sentence ϕ of ℒ0 such that the set assignments that satisfy D are isomorphic to the structures that satisfy ϕ. The other is that for every sentence ϕ of ℒ0, there is a diagram D of Venn-II such that the structures that satisfy ϕ are isomorphic to the set assignments that satisfy D.
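The two subclaims can be stated compactly (the notation $\mathrm{Sat}(D)$ for the set assignments satisfying a diagram and $\mathrm{Mod}(\varphi)$ for the structures satisfying a sentence is introduced here only for brevity):

```latex
\forall D \in \text{Venn-II} \;\exists \varphi \in \mathcal{L}_0 :\;
  \mathrm{Sat}(D) \cong \mathrm{Mod}(\varphi),
\qquad
\forall \varphi \in \mathcal{L}_0 \;\exists D \in \text{Venn-II} :\;
  \mathrm{Mod}(\varphi) \cong \mathrm{Sat}(D).
```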
In this section, we want to show that there is an isomorphism between the set of set assignments for Venn-II and the set of structures for ℒ0.
Because we have only one closed-curve type and one rectangle type, we need an extra mechanism in the semantics of this Venn system: a counterpart relation among tokens of a closed curve, or among tokens of a rectangle. Before we define a mapping between sets and structures, we need to deal with these cp-related regions.
We are about to formalize a way of using Venn diagrams. Before we present a formalism for this system (we call this system Venn-I), let us see how Venn diagrams are used to test the validity of syllogisms:
(1) Draw diagrams to represent the facts that the two premises of a syllogism convey. (Let us call one D1 and the other D2.)
(2) Draw a diagram to represent the fact that the conclusion of the syllogism conveys. (Let us call this diagram D.)
(3) See if we can read off diagram D from diagram D1 and diagram D2. If we can, then the syllogism is valid; if we cannot, it is invalid.
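For universal syllogisms, the procedure above can be sketched computationally. The encoding below is my own simplification, not the Venn-I formalism developed later in this book: a minimal region of a three-circle diagram is encoded as the set of curve labels containing it, a universal premise shades (declares empty) certain regions, and "reading off" the conclusion amounts to checking that its shaded regions are already shaded by the premises:

```python
from itertools import product

# Minimal regions of a three-circle diagram (S, M, P), encoded as
# frozensets of the curves containing them; the region outside all
# three curves is ignored since universal statements never shade it.
CURVES = ("S", "M", "P")
REGIONS = [frozenset(c for c, inside in zip(CURVES, bits) if inside)
           for bits in product([True, False], repeat=3)
           if any(bits)]

def shade_all(x, y):
    """Regions shaded by 'All x are y': inside x but outside y."""
    return {r for r in REGIONS if x in r and y not in r}

def valid(premises, conclusion):
    """Conclusion is readable off iff its shading is already present."""
    shaded = set().union(*premises)
    return conclusion <= shaded

# Barbara: All M are P, All S are M |- All S are P  (valid)
print(valid([shade_all("M", "P"), shade_all("S", "M")],
            shade_all("S", "P")))   # True
# All M are P, All M are S |- All S are P  (invalid)
print(valid([shade_all("M", "P"), shade_all("M", "S")],
            shade_all("S", "P")))   # False
```

Handling particular statements ("Some A are B") requires the x-marks that Venn added to Euler's circles, which is exactly the kind of extension the Venn-I formalism makes precise.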
Let us try to be more precise about each step. Step (1) and step (2) raise the following question: How is it possible for a diagram drawn on a piece of paper to represent the information that a premise or a conclusion conveys? These two steps are analogous to the translation from English to a first-order language. Suppose that we test the validity of a syllogism by using a first-order language. How does this translation take place? First of all, we need to know the syntax and the semantics of each language – English and the first-order language. We want to translate an English sentence into a first-order sentence whose meaning is the same as the meaning of the English sentence.
I started this work with the following assumption: There is a distinction between diagrammatic and linguistic representation, and Venn diagrams are a nonlinguistic form of representation. By showing that the Venn system is sound and complete, I aimed to provide a legitimate reason why logicians who care about valid reasoning should be interested in nonlinguistic representation systems as well. However, the following objection might undermine the import of my project: How do we know that Venn diagrammatic representation is a nonlinguistic system? After all, it might be another linguistic representation system which is too restricted in its expressiveness to be used in our reasoning. If we accept this criticism, what I have done so far would be reduced to the following: A very limited linguistic system was chosen and proven to be sound and complete. Considering how far symbolic logic has developed, this would hardly be an interesting or meaningful project. Therefore, it seems very important to clarify the assumptions that diagrammatic systems are different from linguistic ones and that the Venn systems are nonlinguistic.
There has been some controversy over how to define diagrams in general, despite the fact that we all seem to have some intuitive understanding of diagrams. For example, all of us seem to make some distinction between Venn diagrams and first-order languages. Everyone would classify Euler circles under the same category as Venn diagrams. Or suppose that both a map and verbal instructions are available for us to locate a certain place.
Diagrams have been widely used in reasoning, for example, in solving problems in physics, mathematics, and logic. Mathematicians, psychologists, philosophers, and logicians have been aware of the value of diagrams and, moreover, there has been an increase in the research on visual representation. Many interesting and important issues have been discussed: the distinction, if any, between linguistic symbols and diagrams, the advantages of diagrams over linguistic symbols, the importance of imagery to human reasoning, diagrammatic knowledge representation (especially in artificial intelligence systems), and so on.
The work presented in this book was mainly motivated by the fact that we use diagrams in our reasoning. Despite the great interest shown in diagrams, a negative attitude toward them has been prevalent among logicians and mathematicians: they consider any nonlinguistic form of representation to be a heuristic tool only, and no diagram or collection of diagrams is considered a valid proof at all. It is all the more interesting to note that nobody has offered any legitimate justification for this attitude. Let me call this traditional attitude, that is, that diagrams can be only heuristic tools but not valid proofs, the general prejudice against diagrams. This prejudice has gone unquestioned even as proponents of diagrams have worked on the applications of diagrams in many areas and argued for the advantages of diagrams over linguistic symbols. This is why it is quite worthwhile to question the legitimacy of this prejudice, that is, whether it is well grounded or not.