Where is the knowledge we have lost in information?
T. S. Eliot, The Rock
Our world continues to become increasingly complex, interconnected, and dynamic: There are more people and institutions; they engage in more relationships and exchange; and the rates of change continue to grow, largely because of developments in technology and the importance of information to human and technical development. We live in an information society in which more people must manage more information, which in turn requires more technological support, which both demands and creates more information. Electronic technology and information are mutually reinforcing phenomena, and one of the key aspects of living in the information society is the growing level of interactions we have with this complex and increasingly electronic environment. The general consequences of the information society are threefold: larger volumes of information, new forms and aggregations of information, and new tools for working with information.
First, we find ourselves dealing with more information in all aspects of our lives. More of us are “knowledge workers,” generating, managing, and communicating information to produce and provide goods and services for an increasingly global economy. In addition to the often-noted trend toward more people managing more information in the workplace, people must go beyond the workplace to learn new skills and acquire new knowledge to do their jobs.
The open society, the unrestricted access to knowledge, the unplanned and uninhibited association of men for its furtherance – these are what make a vast, complex, ever growing, ever changing, ever more specialized and expert technological world, nevertheless a world of community.
J. Robert Oppenheimer, Science and the Common Understanding
Evolution proceeds in many waves, some brief by human temporal sensibilities and some lasting for centuries. Some changes in information seeking take place before technological investments are fully amortized (e.g., the latest CPU or software upgrade brings with it access to new information resources), and some take place over careers as strategies and patterns of use learned in school evolve based on new sources and tools. This final chapter examines one long-term change that computing technology brings to cognition in general and to information seeking in particular, considers how different domains interact to influence the evolution of information seeking, and concludes with some ideas about what types of systems we should strive to build.
Amplification and augmentation
Applying computational power to information problems has been a research and design goal from the first days of computing. The dreams of language translation and cybernetic assistants have given way to dreams of artificial realities and intelligent agents, but our fascination with the manipulation of symbolic data and with interactivity remains a driving force behind much of the research in artificial intelligence and engineering. One way to consider how computation may be applied to information problems is to examine how it may be applied to amplify and augment intellect.
Physics does not change the nature of the world it studies, and no science of behavior can change the essential nature of man, even though both sciences yield technologies with a vast power to manipulate their subject matters.
B. F. Skinner, Cumulative Record
Throughout our lives we develop knowledge, skills, and attitudes that allow us to seek and use information. This chapter introduces the notion of personal information infrastructure, which will be used to describe this complex of knowledge, skills, and attitudes. It also introduces the notion of interactivity, a key characteristic of computer technology that allows information seekers to use electronic environments in ways that emulate interactions with human sources of information. The chapter also provides an overview of the technological developments that underlie information seeking in electronic environments.
Personal information infrastructures
The primary activities of scientists, physicians, businesspersons, and other professionals are gathering information from the world, mentally integrating that information with their own knowledge – thus creating new knowledge – and acting on this new knowledge to accomplish their goals. Most often, this knowledge and the consequences of using it are articulated to the external world as information. All humans develop mental structures and skills for conducting such activities according to their individual abilities, experiences, and physical resources. An individual person's collection of abilities, experience, and resources to gather, use, and communicate information is referred to as a personal information infrastructure.
Many things difficult to design prove easy to performance.
Samuel Johnson, Rasselas
Imagination is more important than knowledge.
Albert Einstein, On Science
Many specific system features have been shown to invite and support browsing as an information-seeking strategy. We are beginning to acquire a set of techniques that define what is possible in designing such features. Determining what is optimal for different users, tasks, and settings requires systematically testing techniques across the range of information-seeking factors. Because browsing requires users to coordinate physical and mental activities, systems that support browsing must solve both technical and conceptual problems. Technical challenges, such as the computational power needed to manipulate huge vector spaces on the fly, and display problems, such as resolution limitations, refresh and scroll rates, window sizes, and juxtapositions, are difficult enough in isolation. They must, however, be coordinated with other technical problems, such as mechanisms for selecting and controlling information, and with conceptual problems, such as what the best representations of meaning are for specific information items and what should be displayed at what time, in what form, and at what level of granularity. Programs of research are needed that address the technical problems of designing interaction styles for browsing, that determine the physiological and psychological boundaries of browsing activities, and that test various representations for browsable information. These are technical, user, and organizational areas, respectively. Although different researchers and groups typically specialize in one of these problem areas, ultimately the support for browsing will depend on integrating results from all three.
The storage and retrieval of scientific texts were early applications of computers, and by the early 1960s, schemes for automatic indexing and abstracting had emerged (e.g., Doyle, 1965; Luhn, 1957, 1958; O'Connor, 1964; Tasman, 1957). As online systems emerged in the 1960s and 1970s, more databases and new search features were created to give professional intermediaries more power in searching for information. Searching in online systems was complex, and so intermediaries created systematic strategies for eliciting users' needs; selecting terms, synonyms, and morphological variants appropriate to the need and the system; using Boolean operators to formulate precise queries; restricting those queries to specific database fields; forming intermediate sets of results; manipulating those sets; and selecting appropriate display formats. The strategies and tactics that professional intermediaries use are meant to maximize retrieval effectiveness while minimizing online costs. These strategies are goal oriented and systematic and are termed analytical strategies. In this chapter, we describe several analytical strategies to illustrate how electronic environments have changed information seeking by allowing searchers to systematically manipulate large sets of potentially relevant documents. These strategies in turn influenced subsequent designs of online systems. Next we look at studies of novice users working with various online systems, showing how difficult analytical strategies are to learn and apply, and the need for electronic systems that support informal information-seeking strategies for end users.
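The set manipulations at the heart of these analytical strategies can be sketched with a toy inverted index (the terms, documents, and field values below are illustrative, not drawn from any actual online system): intermediate result sets are formed with Boolean operators and then restricted by a database field.

```python
# Toy inverted index: term -> set of document ids
index = {
    "information": {1, 2, 3, 5},
    "retrieval":   {2, 3, 6},
    "seeking":     {3, 5},
}

# Toy field data: document id -> publication year
year = {1: 1990, 2: 1993, 3: 1995, 5: 1989, 6: 1995}

# Boolean formulation: (information AND retrieval) OR seeking
s1 = index["information"] & index["retrieval"]   # intermediate set 1
s2 = s1 | index["seeking"]                       # intermediate set 2

# Field restriction: limit the set to documents published after 1992
s3 = {d for d in s2 if year[d] > 1992}

print(sorted(s3))  # -> [2, 3]
```

Each intermediate set can be inspected, combined with further sets, or displayed in different formats – the sequence of operations mirrors the step-by-step set building that professional intermediaries performed on early online systems.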
Marco Polo had the opportunity of acquiring a knowledge, either by his own observation or what he collected from others, of so many things, until his time unknown.
The Travels of Marco Polo
The laws of behavior yield to the energy of the individual.
Emerson, Essays, Second Series: Manners
In contrast with the formal, analytical strategies developed by professional intermediaries, information seekers also use a variety of informal, heuristic strategies. These informal, interactive strategies are clustered together under the term browsing strategies. In general, browsing is an approach to information seeking that is informal and opportunistic and depends heavily on the information environment. Four browsing strategies are distinguished in this chapter: scanning, observing, navigating, and monitoring. The term browsing reflects the general behavior that people exhibit as they seek information by using one of these strategies.
Browsing is a natural and effective approach to many types of information-seeking problems. It is natural because it coordinates human physical, emotive, and cognitive resources in the same way that humans monitor the physical world and search for physical objects. It can be effective because environments, and particularly human-created environments, are generally organized and highly redundant – especially information environments designed according to organizational principles. Browsing is particularly effective for information problems that are ill defined or interdisciplinary and when the goal of information seeking is to gather overview information about a topic or to keep abreast of developments in a field.
One of the fundamental problems of lexical semantics is the fact that what C. Ruhl (1989) calls the ‘perceived meaning’ of a word can vary so greatly from one context to another. In this chapter I want to survey the ways in which the contribution the same grammatical word makes to the meaning of a larger unit may differ in different contexts. There are two main sources of explanatory hypotheses for contextual variations in word meaning: lexical semantics and pragmatics. While there are probably no contexts where each of these is not involved in some way, their relative contributions can vary. For instance, in the following examples the difference between 1 and 2 in respect of the interpretation of the word teacher (i.e., “male teacher” and “female teacher”, respectively) can be accounted for entirely by differential contextual enrichment of a single lexical meaning for teacher (in other words, pragmatically):
1. The teacher stroked his beard.
2. Our maths teacher is on maternity leave.
The only involvement of lexical semantics here is that the specification of the meaning of teacher must somehow make it clear that although it is unspecified for sex, it is, unlike, say, chair, specifiable for sex. Examples 3 and 4 exemplify a slightly different type of contextual enrichment, in that the extra specificity in context is of a meronymous rather than a hyponymous type:
3. John washed the car.
4. The mechanic lubricated the car.
The different actions performed on the car allow us to infer that the agents were occupied with different parts of the car in each case.
This chapter builds on the title and theme of Apresjan's 1974 paper, Regular Polysemy. Apresjan was concerned merely to define the phenomenon and identify where it occurred. Here, we shall explore how it can be exploited.
Regular polysemy occurs where two or more words each have two senses, and all the words exhibit the same relationship between the two senses. The phenomenon is also called ‘sense extension’ (Copestake & Briscoe, 1991), ‘semantic transfer rules’ (Leech, 1981), ‘lexical implication rules’ (Ostler & Atkins, 1991), or simply ‘lexical rules’. An example, taken directly from a dictionary (Longman Dictionary of Contemporary English, hereafter LDOCE) is:
gin (a glass of) a colourless strong alcoholic drink …
martini (a glass of) an alcoholic drink …
In each case, two senses are referred to, one with the ‘bracketed optional part’ included in the definition and the other with it omitted; the relation between the two is the same in both cases.
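The ‘bracketed optional part’ convention can be sketched as a single productive rule applied across a toy lexicon (the entries and glosses below are illustrative abbreviations, not the actual LDOCE representation): every entry of the drink type receives a derived ‘glass of’ sense, so the regularity is stated once rather than listed entry by entry.

```python
# Toy lexicon: each word has a semantic type and a base gloss
lexicon = {
    "gin":     {"type": "drink", "gloss": "a colourless strong alcoholic drink"},
    "martini": {"type": "drink", "gloss": "an alcoholic drink"},
    "chair":   {"type": "artifact", "gloss": "a seat for one person"},
}

def extend_senses(lexicon):
    """Apply the drink -> portion-of-drink sense extension to every entry."""
    senses = {}
    for word, entry in lexicon.items():
        senses[word] = [entry["gloss"]]          # the base sense
        if entry["type"] == "drink":             # rule applies only to drinks
            senses[word].append("a glass of " + entry["gloss"])
    return senses

senses = extend_senses(lexicon)
print(senses["gin"])    # base sense plus the derived 'glass of' sense
print(senses["chair"])  # no derived sense: the rule does not apply
```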
Recent work on lexical description has stressed the need for the structure of a lexical knowledge base (LKB) to reflect the structure of the lexicon (Atkins & Levin, 1991) and for the LKB to incorporate productive rules, so the rulebound ways in which words may be used are captured without the lexicon needing to list all options for all words (Boguraev & Levin, 1990). These arguments suggest that generalizations regarding regular polysemy should be expressed in the LKB, and that the formalism in which the LKB is written should be such that, once the generalization is stated, the specific cases follow as consequences of the inference rules of the formalism.
“We want the ring. Without the ring there can be no wedding. May we break the finger?”
“Old age,” said Sir Benjamin, breathing deeply and slowly, “is a time when deafness brings its blessings. I didn't hear what you said then. You have another chance.”
“May we break the finger?” asked Ambrose again. “We could do it with a hammer.”
“I thought, sir,” said Sir Benjamin, “that those were the words. The world is all before you. By God, my hiccoughs have gone, and no wonder. As for ‘break’, ‘break’ is a trull of a word, it will take in everything. Waves, dawns, news, wind, hearts, banks, maidenheads. But never dream of tucking into the same predicate my statues as object and that loose-favoured verb. That would be a most reprehensible solecism.”
The Eve of Saint Venus, Anthony Burgess
Introduction
We are interested in how a verb's description changes as it evolves into different usages. In order to examine this issue we are attempting to combine a LEXICAL ANALYSIS of a particular verb with a CONCEPTUAL ANALYSIS. LEXICAL ANALYSIS is aimed at producing a LEXICAL DEFINITION which functions as a linguistic paraphrase. This should ideally be a lexical decomposition composed of terms with simpler, “more primitive” meanings than the term being defined (Mel'čuk 1988; Wierzbicka 1984). In contrast, a conceptual analysis produces a CONCEPTUAL DESCRIPTION of the lexical item.
Criteria for building a conceptual description are linked to arguments borrowed from “real-world” considerations or expressed in terms of denotations.
A major motivation for the introduction of default inheritance mechanisms into theories of lexical organization has been to account for the prevalence of the family of phenomena variously described as blocking (Aronoff, 1976:43), the elsewhere condition (Kiparsky, 1973), or preemption by synonymy (Clark & Clark, 1979:798). In Copestake and Briscoe (1991) we argued that productive processes of sense extension also undergo the same process, suggesting that an integrated account of lexical semantic and morphological processes must allow for blocking. In this chapter, we review extant accounts which follow from theories of lexical organization based on default inheritance, such as Paradigmatic Morphology (Calder, 1989), DATR (Evans & Gazdar, 1989), ELU (Russell et al., 1991, in press), Word Grammar (Hudson, 1990; Fraser & Hudson, 1992), or the LKB (Copestake 1992; this volume; Copestake et al., in press). We argue that these theories fail to capture the full complexity of even the simplest cases of blocking and sketch a more adequate framework, based on a non-monotonic logic that incorporates more powerful mechanisms for resolving conflict among defeasible knowledge resources (Common-sense Entailment, Asher & Morreau, 1991). Finally, we explore the similarities and differences between various phenomena which have been intuitively felt to be cases of blocking within this formal framework, and discuss the manner in which such processes might interact with more general interpretative strategies during language comprehension. Our presentation is necessarily brief and rather informal; we are primarily concerned to point out the potential advantages of using a more expressive default logic for remedying some of the inadequacies of current theories of lexical description.
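The core intuition behind blocking can be sketched in a few lines (a deliberately naive toy, not the Common-sense Entailment formalization discussed in the chapter, and with an illustrative lexicon): a productive animal-to-meat sense extension applies by default, but a specific listed form such as pork or beef preempts the derived sense.

```python
# Forms listed in the lexicon that preempt the productive rule
# (blocking / preemption by synonymy)
listed_meat = {"pig": "pork", "cow": "beef"}

def meat_sense(animal):
    """Return the mass noun denoting this animal's meat.

    A specific listed form, when present, overrides (blocks) the
    default output of the productive animal -> meat extension.
    """
    if animal in listed_meat:   # the more specific entry wins
        return listed_meat[animal]
    return animal               # default: zero-derived mass noun

print([meat_sense(a) for a in ["lamb", "pig", "cow"]])  # -> ['lamb', 'pork', 'beef']
```

The inadequacy the chapter targets is visible even here: a hard `if` makes blocking absolute, whereas real usage allows the blocked form to surface in marked contexts, which is why a defeasible logic with finer conflict resolution is wanted.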
The work reported here is part of research on the ACQUILEX project which is aimed at the eventual development of a theoretically motivated, but comprehensive and computationally tractable, multilingual lexical knowledge base (LKB) usable for natural language processing, lexicography and other applications. One of the goals of the ACQUILEX project was to demonstrate the feasibility of building an LKB by acquiring a substantial portion of the information semi-automatically from machine readable dictionaries (MRDs). We have paid particular attention to lexical semantic information. Our work therefore attempts to integrate several strands of research:
• Linguistic theories of the lexicon and lexical semantics. In this chapter we will concentrate on the lexical semantics of nominals where our treatment is broadly based on that of Pustejovsky (1991), and in particular on his concepts of the generative lexicon and of qualia structure.
• Knowledge representation techniques. The formal lexical representation language (LRL) used in the ACQUILEX LKB system is based on typed feature structures similar to those of Carpenter (1990, 1992), augmented with default inheritance and lexical rules. Our lexicons can thus be highly structured, hierarchical and generative.
• Lexicography and computational lexicography. The work reported here makes extensive use of the Longman Dictionary of Contemporary English (LDOCE; Procter, 1978). MRDs do not just provide data about individual lexical items; our theories of the lexicon have been developed and refined by considering the implicit organization of dictionaries and the insights of lexicographers.
In this chapter we will show how these strands can be combined in developing an appropriate representation for group nouns in the LRL, and in extracting the requisite information automatically from MRDs.
Lexical semantics is still at a rather early stage of development, which explains why there are relatively few elaborated systems for representing its various aspects. A number of semantic aspects can be represented straightforwardly by means of feature-value pairs and typed feature structures. Others, such as qualia structure or lexical semantic relations, are more difficult to represent. The difficulty is twofold: (1) appropriate models must be defined to represent the various levels of semantic information, including their associated inference systems and their properties (e.g., transitivity, monotonicity); and (2) algorithms must be developed that allow these models to be processed as efficiently as possible. The first chapter of this section tackles the modeling problem; the second addresses some algorithmic problems.
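How feature-value pairs support such representations can be sketched with a naive recursive unification operation (a toy, untyped version using nested dicts; real LKB-style systems use typed feature structures constrained by a type hierarchy):

```python
def unify(fs1, fs2):
    """Unify two feature structures (nested dicts); return None on clash."""
    result = dict(fs1)
    for feat, val in fs2.items():
        if feat not in result:
            result[feat] = val                  # new feature: just add it
        elif isinstance(result[feat], dict) and isinstance(val, dict):
            sub = unify(result[feat], val)      # recurse into substructures
            if sub is None:
                return None                     # clash inside a substructure
            result[feat] = sub
        elif result[feat] != val:
            return None                         # atomic value clash
    return result

noun = {"cat": "noun", "agr": {"num": "sg"}}
sg_context = {"agr": {"num": "sg", "per": "3"}}

print(unify(noun, sg_context))              # merged, more specific structure
print(unify(noun, {"agr": {"num": "pl"}}))  # -> None: number clash
```

Unification here is monotonic: information only accumulates or fails. Representing qualia structure or defaults requires the richer machinery (typing, default inheritance) discussed in this section.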
“Introducing Lexlog”, by J. Jayez, presents a set of specifications for constructing explicit representations of lexical objects in restricted domains. Lexlog offers two types of functions: control functions, which formalize representations and update them in a controlled way, and expression functions, which express different semantic operators and tailor these operators with syntactic operators, for example, the trees of the Tree-Adjoining Grammar framework. The discussion ends with a detailed presentation of the implementation in Prolog.
The last chapter, “Constraint propagation techniques for lexical semantics descriptions,” by Patrick Saint-Dizier, addresses the problem of the propagation in parse trees of large feature structures. The motivation is basically to avoid computations of intermediate results which later turn out to be useless.
Another current major issue in lexical semantics is the definition and construction of full-scale lexical databases that can be used by parsers and generators in conjunction with a grammatical system. Word meaning, terminological knowledge representation, and the extraction of knowledge from machine-readable dictionaries are the main topics addressed. Together they represent the backbone of lexical semantic knowledge base construction.
The first chapter, “Lexical semantics and terminological knowledge representation”, by Gerrit Burkert, shows the practical and formal inadequacies of semantic networks for representing knowledge, and the advantages of using a term subsumption language. The chapter first shows how several aspects of word meaning can be adequately described using a term subsumption language; it then proposes some extensions that make the system more suitable for lexical semantics. The formal aspects are strongly motivated by examples drawn from an in-depth study of terminological knowledge extraction, a rather challenging area for lexical semantics.
“Word meaning between lexical and conceptual structure”, by Peter Gerstl, presents a method and a system for introducing world knowledge or domain-dependent knowledge into a lexicon. The meaning of a word is derived from general lexical information on the one hand and from ontological knowledge on the other. The notion of semantic scope is explored on an empirical basis by systematically analyzing the influences involved in natural language expressions. This component has been integrated into the Lilog system developed at IBM Stuttgart.