One of the most difficult areas for research in machine translation (MT) is the representation of meanings in the lexicon. The lexicon plays a central role in any MT system, regardless of the theoretical foundations upon which the system is based. However, it is only recently that MT researchers have begun to focus more specifically on issues that concern the lexicon, e.g., cross-linguistic variations that arise during the mapping between lexical items in the source and target languages.
The traditional approach to constructing dictionaries for MT has been to massage on-line dictionaries that are primarily intended for human consumption. Given that most natural language applications have focused primarily on syntactic information that can be extracted from the lexicon, these methods have constituted a reasonable first-pass approach to the problem. However, it is now widely accepted that MT requires language-independent conceptual information in order to successfully process a wide range of phenomena in more than one language. Thus, the task of constructing lexical entries has become a much more difficult problem as researchers endeavor to extend the concept base to support more phenomena and additional languages.
This chapter describes how parameterization of the lexicon allows an MT system to account for a number of cross-linguistic variations, called divergences, during translation. There are many cases in which the natural translation of one language into another results in a very different form than that of the original. These divergences make the straightforward transfer from source structures into target structures impractical.
This volume on computational lexical semantics emerged from a workshop on lexical semantics issues organized in Toulouse, France, in January 1992. The chapters presented here are extended versions of the original texts.
Lexical semantics is now becoming a major research area in computational linguistics, and it is playing an increasingly central role in various types of applications involving natural language parsers as well as generators.
Lexical semantics covers a wide spectrum of problems from different disciplines, from psycholinguistics to knowledge representation and computer architecture, which makes this field relatively difficult to perceive as a whole. The goal of this volume is to present the state of the art in lexical semantics from a computational linguistics point of view and from a range of perspectives: psycholinguistics, linguistics (formal and applied), computational linguistics, and application development. The following points are particularly developed in this volume:
psycholinguistics: mental lexicons, access to lexical items, form of lexical items, links between concepts and words, and lexicalizing operations;
linguistics and formal aspects of lexical semantics: lexical semantics relations, prototypes, conceptual representations, event structure, argument structure, and lexical redundancy;
knowledge representation: systems of rules, treatment of type coercion, aspects of inheritance, and relations between linguistics and world knowledge;
applications: creation and maintenance of large-size lexicons, the role of the lexicon in parsing and generation, lexical knowledge bases, and acquisition of lexical data;
operational aspects: processing models and architecture of lexical systems.
Lexical semantics offers a large variety of uses in natural language processing and it obviously allows for more refined treatments. One of the main problems is to identify exactly the lexical semantic resources that one needs to solve a particular problem. Another main difficulty is to know how best to organize this knowledge in order to keep the system reasonably efficient and maintainable; this is particularly crucial for a number of large-scale applications. This volume contains two chapters that explore application of lexical semantics in the area of natural language generation and in the area of machine translation with an interlingua representation.
The first chapter of this section, “Lexical functions of the Explanatory Combinatorial Dictionary for lexicalization in text generation”, by Margarita Alonso Ramos et al., applies Mel'čuk's framework to natural language generation. It shows that lexicalization, i.e., the relation between a concept (or a combination of concepts) and its linguistic realization, cannot be carried out correctly without reference to a lexicon that takes into account the diversity of lexico-semantic relations. This approach views lexicalization both as a local process (lexicalization is solved within a restricted phrase) and as a more global one, taking into account the ‘contextual effects’ of a certain lexicalization with respect to the others in a sentence or in a text. Paradigmatic lexical functions are shown to be well adapted to treating lexicalization in the context of a text, whereas syntagmatic ones operate at the sentence or proposition levels.
Language comprehension and interpretation can be set within a general cognitive science perspective that encompasses both human minds and intelligent machine systems. This comprehensive view, however, must necessarily entail the search for compatibility between various types of concepts or models subsuming classes of facts obtained through specific methods, and hence belonging to various scientific domains, from artificial intelligence to cognitive psychology, and ranging over linguistics, logic, neuroscience, and other fields.
No sub-domain of cognitive science is more fascinating for the exploration of such functional identities between artificial and natural processing or representation and none deserves more to be comparatively worked out than language comprehension or interpretation. The purpose of this chapter is to highlight some of these identities, but also some differences, in particular as concerns lexical semantics.
A large number of concepts are obviously shared by computational semantics and cognitive psychology. I will approach them in this chapter mainly in the form of relational properties, belonging to lexical-semantic, or lexical-conceptual, units, and will classify them according to whether they can be attributed, or not, to both mental and machine representations. For the sake of simplicity I will often restrict this comparison to consideration of lexical units that are expressed by nouns in most natural languages and which, as a rule, denote classes of objects or individuals. But other kinds of lexical-conceptual units, expressed in these natural languages by verbs, adjectivals, prepositions, function words, etc., and which denote events or actions, properties, various sorts of relations, etc., could also be submitted to this type of analysis and comparison.
The basic elements of most recent Natural Language Understanding (NLU) systems are a syntactic parser, which is used to determine sentence structure, and a semantic lexicon, whose purpose is to access the system's factual knowledge from the natural language input. In this regard, the semantic lexicon plays the key role of relating words to world knowledge. But the semantic lexicon is also used in solving some specifically linguistic issues when recovering sentence structure, and should contain linguistic knowledge. In this chapter we discuss the issue of lexical content in terms of linguistic and world knowledge, through the so-called dictionary–encyclopedia controversy. To illustrate the discussion we will describe the lexical semantics approach adopted in our NLU program processing sentences from medical records (Zweigenbaum and Cavazza, 1990). This program is a small-scale but fully implemented prototype adopting a broad view of NLU, from syntactic analysis to complex domain inferences through model-based reasoning (Grishman and Ksiezyk, 1990). The dictionary–encyclopedia controversy opposes two extreme conceptions of word definitions: according to the dictionary approach, a word is described in terms of linguistic elements only, without recourse to world knowledge, whereas an encyclopedic definition includes
an indication of the different species or different stages of the object or process denoted by the word, the main types of behavior of this object or process,… (Mel'čuk and Zholkovsky, 1988).
This point has been discussed by many authors including Katz and Fodor (1963), Eco (1984), Wierzbicka (1985) and Taylor (1989).
Recent works in Computational Linguistics show the central role played by the lexicon in language processing, and in particular by the lexical semantics component. Lexicons are no longer a mere enumeration of feature-value pairs but tend to exhibit an intelligent behavior. This is the case, for example, for generative lexicons (Pustejovsky, 1991), which contain, besides feature structures, a number of rules to create new (partial) definitions of word-senses, such as rules for conflation and type coercion. As a result, the size of lexical entries describing word-senses has substantially increased. These lexical entries become very hard for a natural language parser or generator to use directly, because their size and complexity leave a priori little flexibility.
Most natural language systems consider a lexical entry as an indivisible whole which is percolated up the parse/generation tree. Access to features and feature values at the grammar-rule level is realized by more or less complex procedures (Shieber, 1986; Johnson, 1990; Günthner, 1988). The complexity of real natural language processing systems makes such an approach very inefficient and not necessarily linguistically adequate. In this chapter, we propose a dynamic treatment of features in grammars, embedded within a Constraint Logic Programming framework (hereafter CLP), which permits us to access a feature-value pair associated with a certain word-sense directly in the lexicon, and only when this feature is explicitly required by the grammar, for example, to make a check. More precisely, this dynamic treatment of features will make use of both constraint propagation techniques embedded within CLP and CLP resolution mechanisms.
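The idea of fetching a feature only when a rule demands it can be illustrated outside CLP with a minimal sketch (a Python stand-in, not the CLP mechanism described above; the word-senses and feature names are invented for illustration):

```python
# Illustrative lexicon: word-sense entries mapped to feature-value pairs.
# Instead of percolating whole entries up the tree, a grammar rule asks
# for exactly one feature, directly in the lexicon, only when it needs it.
LEXICON = {
    ("bank", "institution"): {"cat": "noun", "number": "sg", "animate": False},
    ("bank", "river-edge"):  {"cat": "noun", "number": "sg", "animate": False},
}

def get_feature(word_sense, feature):
    """Fetch a single feature-value pair on demand from the lexicon."""
    return LEXICON[word_sense].get(feature)

def agreement_check(word_sense, expected_number):
    """A grammar-rule-level check that touches only the feature it tests."""
    return get_feature(word_sense, "number") == expected_number

print(agreement_check(("bank", "institution"), "sg"))  # True
```

The point of the sketch is the access pattern, not the data: the entry as a whole is never copied into the parse tree, only the requested feature value crosses the lexicon boundary.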
Lexical semantics knowledge often requires various kinds of inferences to be made in order to make more precise and more explicit the meaning of a word with respect to the context in which it is uttered. The treatment of inheritance is particularly crucial from this point of view since linguistic as well as conceptual hierarchies allow us to relate a word with more generic terms that contribute, in a certain sense, to the definition of the meaning of that word. Because hierarchies are complex, inheritance often involves non-monotonic forms of reasoning. Another relation with artificial intelligence is the development of models and techniques for the automatic acquisition of linguistic knowledge.
In “Blocking”, Ted Briscoe et al. elaborate on the introduction of default inheritance mechanisms into theories of lexical organization. Blocking is a kind of inference rule that makes a choice when two inheritances can be drawn from resources occurring at different levels in the lexical organization. For example, a general morphological rule adds -ed to verbs to form the past tense. A subclass of verbs consists of irregular verbs, which receive a particular mark for the past. A default rule states that the exceptional mark is preferred to the general case. However, there exist verbs such as dream that accept both inflections (dreamt, dreamed). For these, the default rule is no longer adequate and a more flexible solution is required. In this chapter, a framework based on a non-monotonic logic that incorporates more powerful mechanisms for resolving conflicts of various kinds is introduced.
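The dream example can be rendered as a toy lookup (plain Python rather than the non-monotonic logic the chapter develops; the verb list is illustrative):

```python
# Default rule: past tense is formed by adding -ed.
REGULAR_SUFFIX = "ed"

# Exceptional resources at a more specific level of the lexical hierarchy.
# "dream" deliberately lists both an irregular and a regular form, which is
# exactly the case where simple blocking breaks down.
IRREGULAR_PAST = {
    "go":    ["went"],
    "sing":  ["sang"],
    "dream": ["dreamt", "dreamed"],
}

def past_forms(verb):
    """Blocking as a lookup: a listed irregular form pre-empts the
    general -ed rule; otherwise the default applies."""
    if verb in IRREGULAR_PAST:
        return IRREGULAR_PAST[verb]
    return [verb + REGULAR_SUFFIX]

print(past_forms("walk"))   # ['walked']
print(past_forms("sing"))   # ['sang']
print(past_forms("dream"))  # ['dreamt', 'dreamed']
```

The sketch makes the problem visible: plain blocking can only say "the exception wins", whereas dream needs both outcomes to survive, which is what motivates the more powerful conflict-resolution mechanisms of the chapter.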
Various theories are nowadays used in linguistic analysis, particularly in syntax, and it does not seem reasonable to expect a reduction of their number in the near future. Nor does it seem reasonable to expect that a single theory will subsume them all, as a kind of meta-theory. Nevertheless, all these theories have in common the need for a lexicon which would include the necessary and sufficient information for combining lexical items and extracting a representation of the meaning of such combinations.
If it is not possible to propose a canonical theory to organize the storage of the lexical information, it is necessary to adopt a “polytheoretic” conception of the lexicon (Hellwig, Minkwitz, and Koch, 1991). A point shared by different theories concerns the need for some semantic information even for a syntactic parsing of a sentence (and a fortiori, beyond this level, for the parsing of a text).
During a first period, the semantic information required was above all concentrated on the thematic roles (Gruber, 1965; Jackendoff, 1972). These thematic roles were introduced because the grammatical functions were insufficient for discriminating between various interpretations and for describing the similarities of sentences. For instance, in:
(1) The door opened
(2) Charlie opened the door
the door is considered to have the same semantic function but not the same grammatical function – subject in (1) and object in (2). Gruber (1965) assigns the door the same thematic role of theme in both sentences. Case theory (Fillmore, 1968) pursued a similar objective.
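The contrast can be made concrete with a small data sketch (the role labels follow Gruber's analysis of the two sentences above; the representation itself is only illustrative):

```python
# Grammatical function varies across the two sentences, but the
# thematic role of "the door" stays constant.
sentences = {
    "(1) The door opened": {
        "the door": {"function": "subject", "role": "theme"},
    },
    "(2) Charlie opened the door": {
        "Charlie":  {"function": "subject", "role": "agent"},
        "the door": {"function": "object",  "role": "theme"},
    },
}

# Same thematic role in both sentences, despite different functions:
roles = {s: info["the door"]["role"] for s, info in sentences.items()}
print(roles)
```

This is precisely the discrimination that grammatical functions alone cannot express: subject in (1), object in (2), yet theme in both.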
LexLog is not a lexical theory, but a package of logical specifications for constructing explicit representations for lexical items in restricted information domains. We usually call a domain “restricted” in two main cases.
The domain is notionally bounded. It is sufficiently simple to be decomposed into an organized finite set of notions, which might be enriched, if necessary, by applying a finite number of explicit combination rules. Metaphorically, one could say that the domain has been “axiomatized”. A clear example of this situation can be found in Palmer (1990).
The domain is lexically bounded. The representations to be constructed apply to a finite set of lexical items, and not to the whole of a language, or to an indefinite set of terms.
While they are often associated, neither of these properties strictly implies the other. For example, although the use of a statistical tool like PATHFINDER (Schvaneveldt, 1990) allows the construction of limited lexical clusters (case 2), their notional homogeneity is not guaranteed by the clustering procedure. E.g., in a study of the word marriage based on the definitions in the Longman Dictionary of Contemporary English (Wilks et al., 1989), the words marriage and muslim co-occur. Although the religious link is obvious, making precise the amount of knowledge about muslim that would be relevant to marriage is not easy. LexLog is designed only for situations that exhibit both forms of boundedness, that is, it can be of some help only when the domain allows the use of both a finite notional basis and a finite lexical basis.
Knowledge representation and reasoning are central to all fields of Artificial Intelligence research. They include the development of formalisms for the representation of given subject matters as well as the development of inference procedures to reason about the represented knowledge. Before developing a knowledge representation formalism, one must determine what type of knowledge has to be modeled with the formalism. Since a lot of our knowledge of the world can easily be described using natural language, it is an interesting task to examine to what extent the contents of natural language utterances can be formalized and represented with a given representation formalism. Every approach to representing natural language utterances must include a method to formalize aspects of the meaning of single lexical units.
An early attempt in this direction was Quillian's Semantic Memory (Quillian, 1968), an associational model of human memory. A semantic memory consists of nodes corresponding to English words and different associative links connecting the nodes. Based on that approach, various knowledge representation systems have been developed which can be subsumed under the term semantic network. Common to all these systems is that knowledge is represented by a network of nodes and links. The nodes usually represent concepts or meanings whereas the links represent relations between concepts. In most semantic network formalisms, a special kind of link between more specific and more general concepts exists. This link, often called IS-A or AKO (a kind of), organizes the concepts into a hierarchy in which information can be inherited from more general to more specific concepts.
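A minimal sketch of such a network, with property inheritance along IS-A links (the concepts and properties are illustrative, not Quillian's actual data):

```python
# IS-A links organize concepts into a hierarchy.
ISA = {"canary": "bird", "bird": "animal"}

# Properties attached at the level where they naturally belong.
PROPS = {
    "animal": {"breathes": True},
    "bird":   {"flies": True},
    "canary": {"color": "yellow"},
}

def lookup(concept, prop):
    """Walk IS-A links upward until the property is found: information
    is inherited from more general to more specific concepts."""
    while concept is not None:
        if prop in PROPS.get(concept, {}):
            return PROPS[concept][prop]
        concept = ISA.get(concept)
    return None

print(lookup("canary", "flies"))     # True  (inherited from bird)
print(lookup("canary", "breathes"))  # True  (inherited from animal)
print(lookup("canary", "color"))     # yellow (stored locally)
```

The single `lookup` function captures the essential economy of semantic networks: a property is stored once, at the most general node that has it, and retrieved for every concept below.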
The connections between the study of natural language semantics, especially lexical semantics, and knowledge representation are manifold. One of the reasons why lexical semantics holds this place is obvious when one looks at compositional denotational theories of meaning. Here, one tries to account for the meaning of expressions in terms of a relation between linguistic expressions and the world. The dictionary makes explicit what part of the world each basic item refers to, whereas the grammar rules are associated with general instructions to combine the meanings of parts into the meanings of wholes. Most natural language understanding systems cannot relegate the interpretation of basic items (as contained in the dictionary) to some mysterious interpretation function, as in the case of Montague semantics, but have to be more explicit about the world and the substantive relation between basic expressions and the assumed ontology. Actual explanatory dictionaries can be viewed as stating complex relationships between natural language expressions. This perspective focuses on the fact that definitions are stated in language. The other perspective is focused on what is described: one could also say that a definition is the representation of a constraint on the world/model or the specification of the (now less mysterious) interpretation function for basic expressions.
Although it may not be realistic to argue that knowledge representation problems can be totally equated with problems of lexical semantics, there is enough reason to take notice of the latter when dealing with the former. Certainly this is the case where one deals with knowledge representation for natural language understanding. Within this general perspective we take the following position.
One of the major challenges today is coping with an overabundance of potentially important information. With newspapers such as the Wall Street Journal available electronically as a large text database, the analysis of natural language texts for the purpose of information retrieval has found renewed interest. Knowledge extraction and knowledge detection in large text databases are challenging problems, most recently under investigation in the TIPSTER projects funded by DARPA, the U.S. Department of Defense research funding agency. Traditionally, the parameters in the task of information retrieval are the style of analysis (statistical or linguistic), the domain of interest (TIPSTER, for instance, focuses on news concerning micro-chip design and joint ventures), the task (filling database entries, question answering, etc.), and the representation formalism (templates, Horn clauses, KL-ONE, etc.).
It is the premise of this chapter that much more detailed information can be gleaned from a careful linguistic analysis than from a statistical analysis. Moreover, a successful linguistic analysis provides more reliable data, as we hope to illustrate here. The problem is, however, that linguistic analysis is very costly and that systems that perform complete, reliable analysis of newspaper articles do not currently exist.
The challenge then is to find ways to do linguistic analysis when it is possible and to the extent that it is feasible. We claim that a promising approach is to perform a careful linguistic preprocessing of the texts, representing linguistically encoded information in a task independent, faithful, and reusable representation scheme.
The next part in this book concerns linguistic issues for lexical semantics. The main issues addressed are based on the notion of Generative Lexicon (Pustejovsky, 1991) and its consequences for the construction of lexicons.
The first chapter, “Linguistic constraints on type coercion,” by James Pustejovsky, summarizes the foundations of the Generative Lexicon which he defined a few years ago. This text investigates how best to characterize the formal mechanisms and the linguistic data necessary to explain the behavior of logical polysemy. A comprehensive range of polymorphic behaviors that account for the variations in semantic expressiveness found in natural languages is studied.
Within the same formal linguistic paradigm, we then have a contribution by Sabine Bergler, “From lexical semantics to text analysis,” which illustrates several issues of the Generative Lexicon using data from the Wall Street Journal. This chapter addresses in depth an important question: what kinds of methods can be used to create, from linguistic analysis, lexical entries with precise semantic content for the Qualia roles of the Generative Lexicon. Special attention is devoted to the production of partial representations and to incremental analysis of texts.
The next chapter, “Lexical functions, generative lexicons and the world” by Dirk Heylen, explores the convergences and divergences between Mel'čuk's analysis of lexical functions and the generative lexicon approach. The author then proposes an interesting and original knowledge representation method based on lexical functions mainly following Mel'čuk's approach.
In this chapter, we present a synopsis of several notions of psycholinguistics and linguistics that are relevant to the field of lexical semantics. We mainly focus on the notions or theoretical approaches that are broadly used and admitted in computational linguistics. Lexical semantics is now playing a central role in computational linguistics, besides grammar formalisms for parsing and generation, and sentence and discourse semantic representation production. The central role of lexical semantics in computational linguistics can be explained by the fact that lexical entries contain a considerable part of the information that is related to the word-sense they represent.
This introduction will provide the reader with some basic concepts in the field of lexical semantics and should also be considered as a guide to the chapters included in this book. We first present some basic concepts of psycholinguistics which have some interest for natural language processing. We then focus on the linguistic aspects which are commonly admitted to contribute substantially to the field. It has not, however, been possible to include all aspects of lexical semantics: the absence of certain approaches should not be considered as an a priori judgment on their value.
The first part of this text introduces psycholinguistic notions of interest to lexical semantics; we then present linguistic notions more in depth. At the end of this chapter, we review the chapters in this volume.
Contribution of psycholinguistics to the study of word meaning
Results from psycholinguistic research can give us a good idea of how concepts are organized in memory, and how this information is accessed in the mental lexicon.
Many words have two or more very distinct meanings. For example, the word pen can refer to a writing implement or to an enclosure. Many natural language applications, including information retrieval, content analysis and automatic abstracting, and machine translation, require the resolution of lexical ambiguity for words in an input text, or are significantly improved when this can be accomplished. That is, the preferred input to these applications is a text in which each word is “tagged” with the sense of that word intended in the particular context. However, at present there is no reliable way to automatically identify the correct sense of a word in running text. This task, called word sense disambiguation, is especially difficult for texts from unconstrained domains because the number of ambiguous words is potentially very large. The magnitude of the problem can be reduced by considering only very gross sense distinctions (e.g., between the pen-as-implement and pen-as-enclosure senses of pen, rather than between finer sense distinctions within, say, the category of pen-as-enclosure – i.e., enclosure for animals, enclosure for submarines, etc.), which is sufficient for many applications. But even so, substantial ambiguity remains: for example, even the relatively small lexicon (20,000 entries) of the TRUMP system, which includes only gross sense distinctions, finds an average of about four senses for each word in sentences from the Wall Street Journal (McRoy, 1992). The resulting combinatoric explosion demonstrates the magnitude of the lexical ambiguity problem.
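To make the scale of this explosion concrete, a back-of-the-envelope sketch, taking as its only input the four-senses-per-word average reported for the TRUMP lexicon above:

```python
# With an average of about four senses per word, an n-word sentence
# admits roughly 4**n combinations of sense assignments.
AVG_SENSES = 4

def readings(n_words):
    """Approximate number of sense-assignment combinations for a sentence."""
    return AVG_SENSES ** n_words

for n in (5, 10, 20):
    print(f"{n} words: {readings(n)} possible sense assignments")
# A ten-word sentence already yields 1048576 combinations.
```

Even under the crude assumption of independence between words, the search space grows exponentially with sentence length, which is why enumeration is hopeless and disambiguation must exploit constraining information instead.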
Several different kinds of information can contribute to the resolution of lexical ambiguity.
In text generation, lexical choice cannot be carried out without appealing to a lexicon which takes into account many lexico-semantic relations. The text generation system must be able to treat both the immediate and the larger lexical context.
a) The immediate lexical context consists of the lexemes that surround the lexical item to be generated. This context must absolutely be taken into account in the case of collocational constraints, which restrict ways of expressing a precise meaning to certain lexical items, for example as in expressions like pay attention, receive attention or narrow escape (Wanner & Bateman, 1990; Iordanskaja et al., 1991; Nirenburg & Nirenburg, 1988; Heid & Raab, 1989).
b) The larger textual context consists of the linguistic content of previous and subsequent clauses. This context is the source for cohesive links (Halliday & Hasan, 1976) with the lexical items to be generated in the current clause, as in:
(1) Professor Elmuck was lecturing on lexical functions to third-year students. The lecturer was interesting and the audience was very attentive.
In the second sentence, lecturer is coreferential with Professor Elmuck and the audience is coreferential with third-year students. These semantic links are due to the lexico-semantic relations between, on the one hand, lecturer (“agent noun”) and lecture, and on the other hand, between audience (“patient noun”) and lecture.
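The relations behind example (1) can be sketched as a small lookup table (toy data; S1 and S2 are Mel'čuk's standard labels for the typical agent noun and the typical patient noun of a keyword):

```python
# Toy table of paradigmatic lexico-semantic relations for "lecture".
# S1 = typical agent noun, S2 = typical patient noun (Mel'čuk's notation);
# the entries themselves are illustrative.
LEXICAL_FUNCTIONS = {
    ("S1", "lecture"): "lecturer",   # the one who lectures
    ("S2", "lecture"): "audience",   # those who are lectured to
}

def apply_lf(lf, keyword):
    """Look up the value of a lexical function for a given keyword."""
    return LEXICAL_FUNCTIONS.get((lf, keyword))

print(apply_lf("S1", "lecture"))  # lecturer
print(apply_lf("S2", "lecture"))  # audience
```

Modeled this way, the cohesive links of example (1) reduce to table lookups: the generator can choose lecturer and the audience in the second sentence precisely because both are reachable from lecture through declared relations.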
In this chapter, we will show that the Lexical Functions (LFs) of the Explanatory Combinatorial Dictionary (hereafter ECD) (Mel'čuk et al., 1984a, 1988; Mel'čuk & Polguère, 1987; Meyer & Steele, 1990) are well suited for these tasks in text generation.