In this paper, we address the tasks of recognition and interpretation of affect communicated through text messaging in online communication environments. Specifically, we focus on Instant Messaging (IM) and blogs, where people often use an informal or garbled style of writing. We introduce a novel rule-based linguistic approach for affect recognition from text. Our Affect Analysis Model (AAM) was designed to deal not only with grammatically and syntactically correct textual input, but also with informal messages written in an abbreviated or expressive manner. The proposed rule-based approach processes each sentence in stages, including symbolic cue processing, detection and transformation of abbreviations, sentence parsing and word/phrase/sentence-level analyses. Our method is capable of processing sentences of different complexity, including simple, compound, complex (with complement and relative clauses) and complex–compound sentences. Affect in text is classified into nine emotion categories (or neutral). The strength of the resulting emotional state depends on vectors of emotional words, relations among them, the tense of the analysed sentence and the presence of first-person pronouns. The evaluation of the Affect Analysis Model algorithm showed promising results regarding its capability to accurately recognize fine-grained emotions reflected in sentences from diary-like blog posts (averaged accuracy up to 77 per cent), fairy tales (averaged accuracy up to 70.2 per cent) and news headlines (our algorithm outperformed eight other systems on several measures).
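A minimal sketch of such staged processing is given below; the emoticon map, abbreviation table and emotion lexicon are toy placeholders, not the AAM's actual rules or resources:

```python
# Hypothetical, much-simplified sketch of a staged rule-based affect
# pipeline. The lexicons below are toy placeholders, not the published
# AAM resources.

EMOTICONS = {":)": "joy", ":(": "sadness", ":D": "joy", ">:(": "anger"}
ABBREVIATIONS = {"u": "you", "r": "are", "gr8": "great"}
EMOTION_WORDS = {"happy": ("joy", 0.8), "great": ("joy", 0.6),
                 "sad": ("sadness", 0.7), "angry": ("anger", 0.9)}

def analyse(sentence: str) -> dict:
    scores: dict[str, float] = {}
    tokens = sentence.lower().split()

    # Stage 1: symbolic cue processing (emoticons are strong signals).
    for tok in tokens:
        if tok in EMOTICONS:
            scores[EMOTICONS[tok]] = scores.get(EMOTICONS[tok], 0.0) + 1.0

    # Stage 2: expand informal abbreviations before lexical lookup.
    tokens = [ABBREVIATIONS.get(t, t) for t in tokens]

    # Stage 3: word-level analysis against the emotion lexicon.
    for tok in tokens:
        if tok in EMOTION_WORDS:
            label, strength = EMOTION_WORDS[tok]
            scores[label] = scores.get(label, 0.0) + strength

    # A full system would continue with parsing and phrase- and
    # sentence-level rules (tense, pronouns, clause structure).
    return scores or {"neutral": 1.0}

print(analyse("u r gr8 :)"))   # {'joy': 1.6}
```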
The generation of referring expressions is a central topic in computational linguistics. Natural referring expressions – both definite references like ‘the baseball cap’ and pronouns like ‘it’ – are dependent on discourse context. We examine the practical implications of context-dependent referring expression generation for the design of spoken systems. Currently, not all spoken systems aim to generate natural referring expressions; many researchers believe that the context-dependency of natural referring expressions actually makes systems less usable. Using the dual-task paradigm, we demonstrate that generating natural referring expressions that depend on discourse context reduces cognitive load. Somewhat surprisingly, we also demonstrate that practice does not reduce cognitive load in systems that generate consistent (context-independent) referring expressions. We discuss practical implications for spoken systems as well as for other areas of referring expression generation.
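The context-dependency at issue can be illustrated with a toy generator that pronominalizes a referent only once it is salient in the discourse; this hypothetical sketch is not the experimental system used in the paper:

```python
# Hypothetical sketch: choose between a definite NP and a pronoun
# depending on discourse salience. Not the paper's experimental system.

def refer(entity: str, noun_phrase: str, recently_mentioned: set) -> str:
    """Use a pronoun if the entity was mentioned recently, else a full NP."""
    if entity in recently_mentioned:
        return "it"
    recently_mentioned.add(entity)
    return f"the {noun_phrase}"

context: set = set()
print(refer("cap1", "baseball cap", context))  # 'the baseball cap'
print(refer("cap1", "baseball cap", context))  # 'it'
```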
This volume contains the refereed and invited papers presented at Expert Systems 92, the twelfth annual conference of the British Computer Society's Specialist Group on Expert Systems, held in Cambridge in December 1992. Together with its predecessors, this volume is essential reading for those who wish to keep up to date with developments and opportunities in this important field.
Although widely employed in image processing, the use of fractal techniques and the fractal dimension for speech characterisation and recognition is a relatively new concept which is now receiving serious attention. This book represents the fruit of research carried out to develop novel fractal-based techniques for speech and audio signal processing. Much of this work is finding its way into practical commercial applications with Nokia Communications and other key organisations. The book starts with an introduction to speech processing and fractal geometry, setting the scene for the heart of the book, where fractal techniques are described in detail with numerous applications and examples, and concludes with a chapter summing up the advantages and potential of these new techniques over conventional processing methods. A valuable reference for researchers, academics and practising engineers working in the field of audio signal processing and communications.
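To give a flavour of the central quantity, the sketch below estimates the fractal dimension of a 1-D signal with Higuchi's method, one standard estimator; it is not necessarily among the specific techniques the book develops:

```python
# Illustrative only: Higuchi's method, one standard estimator of the
# fractal dimension of a 1-D signal; not necessarily the specific
# techniques the book develops.
import numpy as np

def higuchi_fd(x: np.ndarray, kmax: int = 8) -> float:
    n = len(x)
    log_inv_k, log_l = [], []
    for k in range(1, kmax + 1):
        lengths = []
        for m in range(k):
            idx = np.arange(m, n, k)      # subsampled series x[m], x[m+k], ...
            if len(idx) < 2:
                continue
            dist = np.abs(np.diff(x[idx])).sum()
            norm = (n - 1) / ((len(idx) - 1) * k)   # Higuchi normalisation
            lengths.append(dist * norm / k)
        log_inv_k.append(np.log(1.0 / k))
        log_l.append(np.log(np.mean(lengths)))
    # The fractal dimension is the slope of log L(k) against log(1/k).
    return np.polyfit(log_inv_k, log_l, 1)[0]

t = np.linspace(0, 1, 1000)
print(higuchi_fd(np.sin(2 * np.pi * 5 * t)))   # smooth tone: FD near 1
print(higuchi_fd(np.random.randn(1000)))       # white noise: FD near 2
```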
The relation between ontologies and language is currently at the forefront of natural language processing (NLP). Ontologies, as widely used models in semantic technologies, have much in common with the lexicon. A lexicon organizes words as a conventional inventory of concepts, while an ontology formalizes concepts and their logical relations. A shared lexicon is the prerequisite for knowledge-sharing through language, and a shared ontology is the prerequisite for knowledge-sharing through information technology. In building models of language, computational linguists must be able to accurately map the relations between words and the concepts to which they can be linked. This book focuses on the technology involved in enabling integration between lexical resources and semantic technologies. It will be of interest to researchers and graduate students in NLP, computational linguistics, and knowledge engineering, as well as in semantics, psycholinguistics, lexicology and morphology/syntax.
We show how a quantitative context may be established for what is essentially qualitative in nature by topologically embedding a lexicon (here, WordNet) in a complete metric space. This novel transformation establishes a natural connection between the order relation in the lexicon (e.g., hyponymy) and the notion of distance in the metric space, giving rise to effective word-level and document-level lexical semantic distance measures. We provide a formal account of the topological transformation and demonstrate the value of our metrics on several experiments involving information retrieval and document clustering tasks.
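For a concrete baseline, NLTK can turn WordNet's hyponymy order into a simple path-based distance; this is a standard measure, not the paper's topological embedding, but it illustrates deriving a distance from the order relation:

```python
# Baseline illustration: a path-based distance over WordNet's hyponymy
# hierarchy via NLTK — a standard measure, not the paper's metric-space
# embedding. Requires the WordNet data: nltk.download('wordnet').
from nltk.corpus import wordnet as wn

def path_distance(word1: str, word2: str) -> float:
    """Shortest-path distance between the closest senses of two words."""
    best = float("inf")
    for s1 in wn.synsets(word1):
        for s2 in wn.synsets(word2):
            sim = s1.path_similarity(s2)   # in (0, 1]; higher = closer
            if sim:
                best = min(best, 1.0 / sim - 1.0)  # recover path length
    return best

print(path_distance("dog", "cat"))      # small: nearby in the hierarchy
print(path_distance("dog", "algebra"))  # larger: distant concepts
```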
We explore the use of independent component analysis (ICA) for the automatic extraction of linguistic roles or features of words. The extraction is based on the unsupervised analysis of text corpora. We contrast ICA with singular value decomposition (SVD), which is widely used in statistical text analysis in general and in latent semantic analysis (LSA) in particular. The representations found using SVD, however, cannot easily be interpreted by humans. In contrast, ICA applied to word context data gives distinct features which reflect linguistic categories. In this paper, we provide justification for our approach, called WordICA, present the WordICA method in detail, compare the obtained results with traditional linguistic categories and with the results achieved using an SVD-based method, and discuss the use of the method in practical natural language engineering solutions such as machine translation systems. As the WordICA method is based on unsupervised learning and thus provides a general means for efficient knowledge acquisition, we foresee that the approach has clear potential for practical applications.
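A toy contrast of the two decompositions, assuming scikit-learn is available; a small document-by-word count matrix stands in here for the much larger word-context data used in WordICA:

```python
# Toy contrast of SVD (as in LSA) and ICA on a small term matrix.
# The corpus and component count are invented for illustration.
import numpy as np
from sklearn.decomposition import FastICA, TruncatedSVD
from sklearn.feature_extraction.text import CountVectorizer

corpus = ["the cat sat on the mat", "the dog sat on the rug",
          "stocks fell on the news", "markets rose on the report"]
X = CountVectorizer().fit_transform(corpus)   # document-by-word counts

svd = TruncatedSVD(n_components=2, random_state=0).fit_transform(X)
ica = FastICA(n_components=2, random_state=0).fit_transform(X.toarray())

print(np.round(svd, 2))   # dense mixtures; axes are hard to interpret
print(np.round(ica, 2))   # components tend to separate the two topics
```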
This paper focuses on an important step in the creation of a system of meaning representation and the development of semantically annotated parallel corpora, for use in applications such as machine translation, question answering, text summarization, and information retrieval. The work described here constitutes the first effort of any kind to annotate multiple translations of foreign-language texts with interlingual content. Three levels of representation are introduced: deep syntactic dependencies (IL0), intermediate semantic representations (IL1), and a normalized representation that unifies conversives, nonliteral language, and paraphrase (IL2). The resulting annotated, multilingually induced, parallel corpora will be useful as an empirical basis for a wide range of research, including the development and evaluation of interlingual NLP systems and paraphrase-extraction systems as well as a host of other research and development efforts in theoretical and applied linguistics, foreign language pedagogy, translation studies, and other related disciplines.
This paper presents a novel approach to ontology localization with the objective of obtaining multilingual ontologies. Within the ontology development process, ontology localization has been defined as the activity of adapting an ontology to a concrete linguistic and cultural community. Depending on the ontology layers – terminological and/or conceptual – involved in the ontology localization activity, three heterogeneous multilingual ontology metamodels have been identified, of which we propose one. Our proposal consists of associating the ontology metamodel with an external model for representing and structuring lexical and terminological data in different natural languages. Our model has been called the Linguistic Information Repository (LIR). The main advantages of this modelling approach lie in its flexibility, allowing (1) the enrichment of any ontology element with as much linguistic information as needed by the final application, and (2) the establishment of links among linguistic elements within and across different natural languages. The LIR model has been designed as an ontology of linguistic elements and is currently available in the Web Ontology Language (OWL). The set of lexical and terminological data that it provides to ontology elements enables the localization of any ontology to a certain linguistic and cultural universe. The LIR has been evaluated against the multilingual requirements of the Food and Agriculture Organization of the United Nations in the framework of the NeOn project. It has been shown to solve multilingual representation problems related to the establishment of well-defined relations among lexicalizations within and across languages, as well as conceptualization mismatches among different languages. Finally, we present an extension to the Ontology Metadata Vocabulary, the so-called LexOMV, with the aim of reporting on multilinguality at the ontology metadata level. By adding this contribution to the LIR model, we account for multilinguality at the three levels of an ontology: the data level, the knowledge representation level and the metadata level.
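At its simplest, associating ontology elements with multilingual lexical information looks like the following rdflib sketch, in which plain rdfs:label properties and a placeholder namespace stand in for the LIR's far richer lexical structure:

```python
# Much-simplified sketch of attaching multilingual lexicalizations to an
# ontology element with rdflib. The namespace is a placeholder and plain
# rdfs:label stands in for the LIR's far richer lexical model.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDFS

EX = Namespace("http://example.org/onto#")
g = Graph()
river = EX.River                      # an ontology class (a URIRef)

g.add((river, RDFS.label, Literal("river", lang="en")))
g.add((river, RDFS.label, Literal("río", lang="es")))
g.add((river, RDFS.label, Literal("fleuve", lang="fr")))

# Retrieve the Spanish lexicalization of the class.
for label in g.objects(river, RDFS.label):
    if label.language == "es":
        print(label)   # río
```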
We investigate the use of instance-based ranking methods for surface realization in natural language generation. Our approach to instance-based natural language generation (IBNLG) employs two components: a rule system that ‘overgenerates’ a number of realization candidates from a meaning representation and an instance-based ranker that scores the candidates according to their similarity to examples taken from a training corpus. We develop an efficient search technique for identifying the optimal candidate based on a novel extension of the A* algorithm. The rule system is produced automatically from a semantically annotated fragment of the Penn Treebank II containing management succession texts. We detail the annotation scheme and grammar induction algorithm and evaluate the efficiency and output of the generator. We also discuss issues such as input coverage (completeness) and fluency that are relevant to surface generation in general.
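The overgenerate-and-rank architecture can be caricatured in a few lines; the hand-written rules, training sentences and string-similarity ranker below are toy stand-ins for the induced grammar and A*-based search described in the paper:

```python
# Hypothetical sketch of overgenerate-and-rank surface realization.
# Rules, training sentences and similarity measure are toy placeholders.
from difflib import SequenceMatcher

def overgenerate(meaning: dict) -> list[str]:
    """A tiny 'rule system' producing several candidate realizations."""
    agent, post = meaning["agent"], meaning["post"]
    return [f"{agent} was named {post}.",
            f"{agent} has been appointed {post}.",
            f"The company named {agent} as {post}."]

TRAINING = ["Smith was named chief executive.",
            "Jones was named president of the unit."]

def rank(candidates: list[str]) -> str:
    """Instance-based ranking: prefer the candidate most similar to
    some example from the training corpus."""
    def score(c: str) -> float:
        return max(SequenceMatcher(None, c, ex).ratio() for ex in TRAINING)
    return max(candidates, key=score)

print(rank(overgenerate({"agent": "Brown", "post": "chairman"})))
# expected: 'Brown was named chairman.'
```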
This outstanding collection is designed to address the fundamental issues and principles underlying the task of Artificial Intelligence. The editors have selected not only papers now recognized as classics but also many specially commissioned papers which examine the methodological and theoretical foundations of the discipline from a wide variety of perspectives: computer science and software engineering, cognitive psychology, philosophy, formal logic and linguistics. Carefully planned and structured, the volume tackles many of the contentious questions of immediate concern to AI researchers and interested observers. Is Artificial Intelligence in fact a discipline, or is it simply part of computer science? What is the role of programs in AI and how do they relate to theories? What is the nature of representation and implementation, and how should the challenge of connectionism be viewed? Can AI be characterized as an empirical science? The comprehensiveness of this collection is further enhanced by the full, annotated bibliography. All readers who want to consider what Artificial Intelligence really is will find this sourcebook invaluable, and the editors will undoubtedly succeed in their secondary aim of stimulating a lively and continuing debate.
Learning without thought is labor lost; thought without learning is perilous.
Confucius (551 BC – 479 BC), The Confucian Analects
This chapter goes beyond the supervised learning of Chapter 7. It covers learning richer representations and learning what to do; this enables learning to be combined with reasoning. First we consider unsupervised learning, in which the classifications are not given in the training set. This is a special case of learning a belief network, which is considered next. Finally, we consider reinforcement learning, in which an agent learns how to act while interacting with an environment.
Clustering
Chapter 7 considered supervised learning, where the target features that must be predicted from input features are observed in the training data. In clustering or unsupervised learning, the target features are not given in the training examples. The aim is to construct a natural classification that can be used to cluster the data.
The general idea behind clustering is to partition the examples into clusters or classes. Each class predicts feature values for the examples in the class. Each clustering therefore has a prediction error on its predictions, and the best clustering is the one that minimizes this error.
Example 11.1 A diagnostic assistant may want to group the different treatments into groups that predict the desirable and undesirable effects of the treatment. The assistant may not want to give a patient a drug because similar drugs may have had disastrous effects on similar patients. […]
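A minimal k-means sketch makes this error-minimization view of clustering concrete; the data and cluster count below are invented for illustration:

```python
# Minimal k-means sketch: clustering as prediction-error minimization.
# Each class 'predicts' its mean; the algorithm alternates between
# assigning examples to the closest prediction and re-estimating means.
import numpy as np

def kmeans(X: np.ndarray, k: int, iters: int = 20, seed: int = 0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        # Assignment step: each example joins the class whose
        # predicted feature values (the class mean) are closest.
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        # Update step: each class re-predicts as the mean of its members.
        centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
    error = ((X - centers[labels]) ** 2).sum()   # total prediction error
    return labels, error

X = np.array([[1.0, 1.0], [1.2, 0.9], [5.0, 5.1], [4.8, 5.2]])
labels, error = kmeans(X, k=2)
print(labels, round(error, 3))   # e.g. two tight clusters, error 0.05
```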
The most serious problems standing in the way of developing an adequate theory of computation are as much ontological as they are semantical. It is not that the semantic problems go away; they remain as challenging as ever. It is just that they are joined – on center stage, as it were – by even more demanding problems of ontology.
–Smith [1996, p. 14]
How do you go about representing knowledge about a world so it is easy to acquire, debug, maintain, communicate, share, and reason with? This chapter explores how to specify the meaning of symbols in intelligent agents, how to use the meaning for knowledge-based debugging and explanation, and, finally, how an agent can represent its own reasoning and how this may be used to build knowledge-based systems. As Smith points out in the quote above, the problems of ontology are central for building intelligent computational agents.
Knowledge Sharing
Having an appropriate representation is only part of the story of building a knowledge-based agent. We should also be able to ensure that the knowledge can be acquired, particularly when it comes from diverse sources and at multiple points in time and must interoperate with other knowledge, and that the knowledge can be reasoned about effectively.
Instead of reasoning explicitly in terms of states, it is often better to describe states in terms of features and then to reason in terms of these features. Often these features are not independent and there are hard constraints that specify legal combinations of assignments of values to variables. As Falen's elegant poem emphasizes, the mind discovers and exploits constraints to solve tasks. Common examples occur in planning and scheduling, where an agent must assign a time for each action that must be carried out; typically, there are constraints on when actions can be carried out and constraints specifying that the actions must actually achieve a goal. There are also often preferences over values that can be specified in terms of soft constraints. This chapter shows how to generate assignments that satisfy a set of hard constraints and how to optimize a collection of soft constraints.
Features and States
For any practical problem, an agent cannot reason in terms of states; there are simply too many of them. Moreover, most problems do not come with an explicit list of states; the states are typically described implicitly in terms of features.
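A generic backtracking sketch makes both points concrete: states are assignments of values to feature variables, and hard constraints prune illegal combinations. This is an illustration in the spirit of the chapter, not its specific algorithms; the tiny scheduling problem is invented:

```python
# Generic sketch: states as assignments of values to feature variables,
# with hard constraints pruning illegal combinations during backtracking.
# The tiny scheduling problem is invented for illustration.

VARIABLES = ["A", "B", "C"]                      # activities to schedule
DOMAINS = {v: [1, 2, 3, 4] for v in VARIABLES}   # candidate time slots

# Each constraint: (variables it mentions, predicate over their values).
CONSTRAINTS = [
    (("A", "B"), lambda a, b: a != b),   # A and B need different slots
    (("B", "C"), lambda b, c: b < c),    # B must happen before C
]

def satisfies(assignment: dict) -> bool:
    """Check every constraint whose variables are all assigned."""
    return all(pred(*(assignment[v] for v in scope))
               for scope, pred in CONSTRAINTS
               if all(v in assignment for v in scope))

def backtrack(assignment: dict) -> dict | None:
    if len(assignment) == len(VARIABLES):
        return assignment
    var = next(v for v in VARIABLES if v not in assignment)
    for value in DOMAINS[var]:
        assignment[var] = value
        if satisfies(assignment) and (result := backtrack(assignment)):
            return result
        del assignment[var]
    return None

print(backtrack({}))   # {'A': 1, 'B': 2, 'C': 3}
```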
He who every morning plans the transaction of the day and follows out that plan, carries a thread that will guide him through the maze of the most busy life. But where no plan is laid, where the disposal of time is surrendered merely to the chance of incidence, chaos will soon reign.
–Victor Hugo (1802–1885)
Planning is about how an agent achieves its goals. To achieve anything but the simplest goals, an agent must reason about its future. Because an agent does not usually achieve its goals in one step, what it should do at any time depends on what it will do in the future. What it will do in the future depends on the state it is in, which, in turn, depends on what it has done in the past. This chapter considers how an agent can represent its actions and their effects and use these models to find a plan to achieve its goals.
In particular, this chapter considers the case where:
- the agent's actions are deterministic; that is, the agent can predict the consequences of its actions;
- there are no exogenous events beyond the control of the agent that change the state of the world;
- the world is fully observable; thus, the agent can observe the current state of the world;
- time progresses discretely from one state to the next.
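Under exactly these assumptions, planning can be sketched as forward search over deterministic state transitions; the STRIPS-style action encoding and the delivery-robot toy domain below are illustrative, not the book's own code:

```python
# Sketch of forward state-space planning under the assumptions above:
# deterministic actions, no exogenous events, full observability,
# discrete time. The delivery-robot domain is an invented toy example.
from collections import deque

# Actions as (name, preconditions, add effects, delete effects).
ACTIONS = [
    ("pick_up",     {"at_mailroom"},            {"has_mail"},       set()),
    ("go_office",   {"at_mailroom"},            {"at_office"},      {"at_mailroom"}),
    ("go_mailroom", {"at_office"},              {"at_mailroom"},    {"at_office"}),
    ("deliver",     {"at_office", "has_mail"},  {"mail_delivered"}, {"has_mail"}),
]

def plan(initial: frozenset, goal: set) -> list[str] | None:
    """Breadth-first forward search from the initial state to a goal state."""
    frontier = deque([(initial, [])])
    visited = {initial}
    while frontier:
        state, path = frontier.popleft()
        if goal <= state:
            return path
        for name, pre, add, delete in ACTIONS:
            if pre <= state:                            # action applicable
                nxt = frozenset((state - delete) | add) # deterministic effect
                if nxt not in visited:
                    visited.add(nxt)
                    frontier.append((nxt, path + [name]))
    return None

print(plan(frozenset({"at_mailroom"}), {"mail_delivered"}))
# ['pick_up', 'go_office', 'deliver']
```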
It is remarkable that a science which began with the consideration of games of chance should become the most important object of human knowledge … The most important questions of life are, for the most part, really only problems of probability …
The theory of probabilities is at bottom nothing but common sense reduced to calculus.
–Pierre Simon de Laplace [1812]
All of the time, agents are forced to make decisions based on incomplete information. Even when an agent senses the world to find out more information, it rarely finds out the exact state of the world. A robot does not know exactly where an object is. A doctor does not know exactly what is wrong with a patient. A teacher does not know exactly what a student understands. When intelligent agents must make decisions, they have to use whatever information they have. This chapter considers reasoning under uncertainty: determining what is true in the world based on observations of the world. This is used in Chapter 9 as a basis for acting under uncertainty, where the agent must make decisions about what action to take even though it cannot precisely predict the outcomes of its actions. This chapter starts with probability, shows how to represent the world by making appropriate independence assumptions, and shows how to reason with such representations.
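The flavour of such reasoning can be shown with a single application of Bayes' rule; the probabilities below are invented for illustration:

```python
# Worked illustration: Bayesian conditioning with invented numbers.
prior = 0.01            # P(disease)
sensitivity = 0.95      # P(positive test | disease)
false_positive = 0.05   # P(positive test | no disease)

p_positive = sensitivity * prior + false_positive * (1 - prior)
posterior = sensitivity * prior / p_positive    # P(disease | positive)

print(round(posterior, 3))   # 0.161 — a positive test is far from proof
```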