Search results for Artificial Intelligence and Natural Language Processing

1 - Memory-Based Learning in Natural Language Processing
Walter Daelemans, Antal van den Bosch
Book:

Memory-Based Language Processing

Published online:

22 September 2009

Print publication:

01 September 2005, pp 3-14
- Chapter
Summary

This book presents a simple and efficient approach to solving natural language processing problems. The approach is based on the combination of two powerful techniques: the efficient storage of solved examples of the problem, and similarity-based reasoning on the basis of these stored examples to solve new ones.
Natural language processing (NLP) is concerned with the knowledge representation and problem solving algorithms involved in learning, producing, and understanding language. Language technology, or language engineering, uses the formalisms and theories developed within NLP in applications ranging from spelling error correction to machine translation and automatic extraction of knowledge from text.
Although the origins of NLP are both logical and statistical, as in other disciplines of artificial intelligence, historically the knowledge-based approach has dominated the field. This has resulted in an emphasis on logical semantics for meaning representation, on the development of grammar formalisms (especially lexicalist unification grammars), and on the design of associated parsing methods and lexical representation and organization methods. Well-known textbooks such as Gazdar and Mellish (1989) and Allen (1995) provide an overview of this ‘rationalist’ or ‘deductive’ approach.
The approach in this book is firmly rooted in the alternative empirical (inductive) approach. From the early 1990s onwards, empirical methods based on statistics derived from corpora have been adopted widely in the field. There were several reasons for this.
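The two ingredients named in this summary, storing solved examples and reasoning by similarity over them, can be illustrated with a minimal sketch. It is not the book's implementation: the class name `MemoryBasedClassifier`, the plain overlap distance, the unweighted majority vote, and the toy article-selection data are all assumptions chosen for illustration.

```python
from collections import Counter


def overlap_distance(a, b):
    """Count the feature positions at which two instances differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


class MemoryBasedClassifier:
    """Hypothetical minimal learner: keeps every example, abstracts nothing."""

    def __init__(self, k=1):
        self.k = k
        self.memory = []

    def fit(self, instances, labels):
        # "Learning" is just storage of the solved examples.
        self.memory = list(zip(instances, labels))
        return self

    def predict(self, instance):
        # "Processing" is similarity-based reasoning over the stored examples:
        # rank memory by distance and let the k closest examples vote.
        ranked = sorted(self.memory,
                        key=lambda ex: overlap_distance(instance, ex[0]))
        votes = Counter(label for _, label in ranked[:self.k])
        return votes.most_common(1)[0][0]


# Toy data (invented): pick "a" or "an" from the first two letters
# of the following word.
instances = [("a", "p"), ("e", "g"), ("b", "o"), ("c", "a"), ("o", "w")]
labels = ["an", "an", "a", "a", "an"]

clf = MemoryBasedClassifier(k=1).fit(instances, labels)
print(clf.predict(("a", "n")))  # query for a word such as "ant"
print(clf.predict(("b", "a")))  # query for a word such as "bag"
```

Ties in distance are broken here by storage order, a simplification; a real system would typically add feature weighting and a principled tie-breaking rule.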

Bibliography
Walter Daelemans, Antal van den Bosch
Book:

Memory-Based Language Processing

Published online:

22 September 2009

Print publication:

01 September 2005, pp 168-185
- Chapter

Preface
Walter Daelemans, Antal van den Bosch
Book:

Memory-Based Language Processing

Published online:

22 September 2009

Print publication:

01 September 2005, pp 1-2
- Chapter
Summary

This book is a reflection of about twelve years of work on memory-based language processing. It reflects on the central topic from three perspectives. First, it describes the influences from linguistics, artificial intelligence, and psycholinguistics on the foundations of memory-based models of language processing. Second, it highlights applications of memory-based learning to processing tasks in phonology and morphology, and in shallow parsing. Third, it ventures into answering the question why memory-based learning fills a unique role in the larger field of machine learning of natural language – because it is the only algorithm that does not abstract away from its training examples. In addition, we provide tutorial information on the use of TIMBL, a software package for memory-based learning, and an associated suite of software tools for memory-based language processing.
For us, the direct inspiration for starting to experiment with extensions of the k-nearest neighbor classifier to language processing problems was the successful application of the approach by Stanfill and Waltz to grapheme-to-phoneme conversion in the eighties. During the past decade we have been fortunate to have expanded our work with a great team of fellow researchers and students on memory-based language processing in two locations: the ILK (Induction of Linguistic Knowledge) research group at Tilburg University, and CNTS (Center for Dutch Language and Speech) at the University of Antwerp. Our own first implementations of memory-based learning were soon superseded by well-coded software systems by Peter Berck, Jakub Zavrel, Bertjan Busser, and Ko van der Sloot.

Index
Walter Daelemans, Antal van den Bosch
Book:

Memory-Based Language Processing

Published online:

22 September 2009

Print publication:

01 September 2005, pp 186-189
- Chapter

4 - Application to morpho-phonology
Walter Daelemans, Antal van den Bosch
Book:

Memory-Based Language Processing

Published online:

22 September 2009

Print publication:

01 September 2005, pp 57-84
- Chapter
Summary

As argued in chapter 1, if a natural language processing task is formulated as either a disambiguation task or a segmentation task, it can be presented as a classification task to a memory-based learner, as well as to any other machine learning algorithm capable of learning from labeled examples. In this chapter as well as in the next we provide examples of how we formulate tasks in an MBLP framework. We start with one disambiguation and one segmentation task operating at the phonological and morphological levels, respectively.
A non-trivial portion of the complexity of natural languages is determined at the phonological and morphological levels, where phonemes and morphemes come together to form words. A language's phoneme inventory is based on many individual observations in which changing one particular speech sound of a spoken word into another changes the meaning of the word. A morpheme is usually identified as a string of phonemes carrying meaning on its own; a special class of morphemes, affixes, does not carry meaning on its own, but instead affixes have the ability to add or change some aspect of meaning when attached to a morpheme or string of morphemes.
One major problem of natural language processing in the phonological and morphological domains is that many existing sequences of phonemes and morphemes have highly ambiguous surface written forms, especially in alphabetic writing systems, where the relation between letters and phonemes is ambiguous.
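One common way to present a letter-level task to a classifier is to slide a fixed-width window over the word and emit one labeled instance per letter. The summary does not spell out the encoding used in the chapter, so the sketch below is an assumption: the function name `window_instances`, the padding symbol, the window width, and the illustrative phoneme labels for "book" are all hypothetical.

```python
def window_instances(word, labels, width=1, pad="_"):
    """Turn a word and its per-letter labels into (features, label) pairs.

    Each instance holds the focus letter plus `width` letters of context
    on either side; positions beyond the word edges are filled with `pad`.
    """
    if len(word) != len(labels):
        raise ValueError("need exactly one label per letter")
    padded = pad * width + word + pad * width
    instances = []
    for i, label in enumerate(labels):
        features = tuple(padded[i:i + 2 * width + 1])
        instances.append((features, label))
    return instances


# Illustrative labels only: one phoneme symbol per letter, with "-" marking
# a letter that maps to no phoneme of its own.
for features, label in window_instances("book", ["b", "u", "-", "k"]):
    print(features, "->", label)
```

Each resulting pair can be fed to any learner that accepts labeled examples, which is the sense in which a disambiguation or segmentation task becomes a classification task.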

6 - Abstraction and generalization
Walter Daelemans, Antal van den Bosch
Book:

Memory-Based Language Processing

Published online:

22 September 2009

Print publication:

01 September 2005, pp 104-147
- Chapter
Summary

The concepts of abstraction and generalization are tightly coupled to Ockham's razor, a medieval scientific principle, which is still regarded in many branches of modern science as fundamentally true. Sources quote the principle as “non preterio necessitate delendam”, or freely translated in the imperative form, delete all elements in a theory that are not necessary. The goal of its application is to maximize economy and generality: it favors small theories over large ones, when they have the same expressive power. The latter can be read as ‘having the same generalization accuracy’, which, as we have exemplified in the previous chapters, can be estimated through validation tests with held-out material.
A twentieth-century incarnation of Ockham's razor is the minimal description length (MDL) principle (Rissanen, 1983), coined in the context of computational learning theory. It has been used as the leading principle in the design of decision tree induction algorithms such as C4.5 (Quinlan, 1993) and rule induction algorithms such as RIPPER (Cohen, 1995). The goal of these algorithms is to find a compact representation of the classification information in the given learning material that at the same time generalizes well to unseen material. C4.5 uses decision trees; RIPPER uses ordered lists of rules to meet that end.
In contrast, memory-based learning is not minimal: its description length is equal to the amount of memory it takes to store the learning examples. Keeping all learning examples in memory is anything but economical.

2 - Inspirations from linguistics and artificial intelligence
Walter Daelemans, Antal van den Bosch
Book:

Memory-Based Language Processing

Published online:

22 September 2009

Print publication:

01 September 2005, pp 15-25
- Chapter
Summary

Memory-Based Language Processing, MBLP, is based on the idea that learning and processing are two sides of the same coin. Learning is the storage of examples in memory, and processing is similarity-based reasoning with these stored examples. Although we have developed a specific operationalization of these ideas, they have been around for a long time. In this chapter we provide an overview of similar ideas in linguistics, psychology, and computer science, and end with a discussion of the crucial lesson learned from this literature, namely, that generalization from experience to new decisions is possible without the creation of abstract representations such as rules.
Inspirations from linguistics
While the rise of Chomskyan linguistics in the 1960s is considered a turning point in the development of linguistic theory, it is mostly before this time that we find explicit and sometimes adamant arguments for the use of memory and analogy that explain both the acquisition and the processing of linguistic knowledge in humans. We compress this into a brief review of thoughts and arguments voiced by the likes of Ferdinand de Saussure, Leonard Bloomfield, John Rupert Firth, Michael Halliday, Zellig Harris, and Royal Skousen, and we point to related ideas in psychology and cognitive linguistics.

Frontmatter
Walter Daelemans, Antal van den Bosch
Book:

Memory-Based Language Processing

Published online:

22 September 2009

Print publication:

01 September 2005, pp i-iv
- Chapter

Contents
Walter Daelemans, Antal van den Bosch
Book:

Memory-Based Language Processing

Published online:

22 September 2009

Print publication:

01 September 2005, pp v-viii
- Chapter

7 - Extensions
Walter Daelemans, Antal van den Bosch
Book:

Memory-Based Language Processing

Published online:

22 September 2009

Print publication:

01 September 2005, pp 148-167
- Chapter
Summary

This chapter describes two complementary extensions to memory-based learning: a search method for optimizing parameter settings, and methods for reducing the near-sightedness of the standard memory-based learner to its own contextual decisions in sequence processing tasks. Both complement the core algorithm as discussed so far. Both methods have a wider applicability than just memory-based learning, and can be combined with any classification-based supervised learning algorithm.
First, in section 7.1 we introduce a search method for finding optimal algorithmic parameter settings. No universal rules of thumb exist for setting parameters such as the k in the k-NN classification rule, or the feature weighting metric, or the distance weighting metric. They also interact in unpredictable ways. Yet, parameter settings do matter; they can seriously change generalization performance on unseen data. We show that applying heuristic search methods in an experimental wrapping environment (in which a training set is further divided into training and validation sets) can produce good parameter settings automatically.
Second, in section 7.2 we describe two technical solutions to the problem of “sequence near-sightedness”, which affects many machine-learning classifiers and stochastic models that predict class symbols without coordinating one prediction with another in some way. When such a classifier performs natural language sequence tasks, producing class symbol by class symbol, it cannot stop itself from generating impossible or invalid output sequences, because information on the output sequence being generated is not available to the learner.
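The wrapping environment mentioned for section 7.1, in which the training set is further divided into training and validation sets, can be sketched as follows. This is a simplified illustration rather than the chapter's method: it tries every candidate value of k exhaustively instead of using a heuristic search, and the helper names (`knn_predict`, `accuracy`, `wrapped_search`) and the toy data are assumptions.

```python
from collections import Counter


def knn_predict(memory, instance, k):
    """Classify by majority vote among the k nearest stored examples."""
    ranked = sorted(
        memory,
        key=lambda ex: sum(a != b for a, b in zip(instance, ex[0])),
    )
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]


def accuracy(memory, held_out, k):
    """Fraction of held-out examples that are classified correctly."""
    correct = sum(knn_predict(memory, x, k) == y for x, y in held_out)
    return correct / len(held_out)


def wrapped_search(training, candidate_ks, validation_fraction=0.25):
    """Pick the k that scores best on a validation split of the training set.

    The test set is never touched: the split happens inside the training data.
    """
    split = int(len(training) * (1 - validation_fraction))
    inner_train, validation = training[:split], training[split:]
    if not inner_train or not validation:
        raise ValueError("training set too small to split")
    scores = {k: accuracy(inner_train, validation, k) for k in candidate_ks}
    best_k = max(candidate_ks, key=lambda k: scores[k])
    return best_k, scores


# Toy data (invented): two-feature instances with labels "X" and "Y".
toy = [
    (("a", "a"), "X"), (("a", "b"), "X"), (("a", "c"), "X"),
    (("b", "b"), "Y"), (("b", "c"), "Y"), (("c", "c"), "Y"),
    (("a", "d"), "X"), (("b", "d"), "Y"),
]
best_k, scores = wrapped_search(toy, candidate_ks=[1, 5])
print(best_k, scores)
```

The same wrapper could range over feature-weighting or distance-weighting choices as well; with several interacting parameters the grid grows quickly, which is where a heuristic search becomes attractive.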

The head-modifier principle and multilingual term extraction
ANDREW HIPPISLEY, DAVID CHENG, KHURSHID AHMAD
Journal:

Natural Language Engineering / Volume 11 / Issue 2 / June 2005

Published online by Cambridge University Press:

19 May 2005, pp. 129-157
- Article
Advances in language engineering may be dependent on theoretical principles originating from linguistics, since both share a common object of enquiry, natural language structures. We outline an approach to term extraction that rests on theoretical claims about the structure of words. We use the structural properties of compound words to specifically elicit the sets of terms defined by type hierarchies such as hyponymy and meronymy. The theoretical claims revolve around the head-modifier principle, which determines the formation of a major class of compounds. Significantly it has been suggested that the principle operates in languages other than English. To demonstrate the extendibility of our approach beyond English, we present a case study of term extraction in Chinese, a language whose written form is the vehicle of communication for over 1.3 billion language users, and therefore has great significance for the development of language engineering technologies.

Machine learning-based named entity recognition via effective integration of various evidences
GUODONG ZHOU, JIAN SU
Journal:

Natural Language Engineering / Volume 11 / Issue 2 / June 2005

Published online by Cambridge University Press:

19 May 2005, pp. 189-206
- Article
Named entity recognition identifies and classifies entity names in a text document into some predefined categories. It resolves the “who”, “where” and “how much” problems in information extraction and leads to the resolution of the “what” and “how” problems in further processing. This paper presents a Hidden Markov Model (HMM) and proposes an HMM-based named entity recognizer implemented as the system PowerNE. Through the HMM and an effective constraint relaxation algorithm to deal with the data sparseness problem, PowerNE is able to effectively apply and integrate various internal and external evidences of entity names. Currently, four evidences are included: (1) a simple deterministic internal feature of the words, such as capitalization and digitalization; (2) an internal semantic feature of the important triggers; (3) an internal gazetteer feature, which determines the appearance of the current word string in the provided gazetteer list; and (4) an external macro context feature, which deals with the name alias phenomena. In this way, the named entity recognition problem is resolved effectively. PowerNE has been benchmarked with the Message Understanding Conferences (MUC) data. The evaluation shows that, using the formal training and test data of the MUC-6 and MUC-7 English named entity tasks, it achieves F-measures of 96.6 and 94.1, respectively. Compared with the best reported machine learning system, it achieves a 1.7 higher F-measure with one quarter of the training data on MUC-6, and a 3.6 higher F-measure with one ninth of the training data on MUC-7. In addition, it performs slightly better than the best reported handcrafted rule-based systems on MUC-6 and MUC-7.

The Penn Chinese TreeBank: Phrase structure annotation of a large corpus
NAIWEN XUE, FEI XIA, FU-DONG CHIOU, MARTA PALMER
Journal:

Natural Language Engineering / Volume 11 / Issue 2 / June 2005

Published online by Cambridge University Press:

19 May 2005, pp. 207-238
- Article
With growing interest in Chinese Language Processing, numerous NLP tools (e.g., word segmenters, part-of-speech taggers, and parsers) for Chinese have been developed all over the world. However, since no large-scale bracketed corpora are available to the public, these tools are trained on corpora with different segmentation criteria, part-of-speech tagsets and bracketing guidelines, and therefore, comparisons are difficult. As a first step towards addressing this issue, we have been preparing a large bracketed corpus since late 1998. The first two installments of the corpus, 250 thousand words of data, fully segmented, POS-tagged and syntactically bracketed, have been released to the public via LDC (www.ldc.upenn.edu). In this paper, we discuss several Chinese linguistic issues and their implications for our treebanking efforts and how we address these issues when developing our annotation guidelines. We also describe our engineering strategies to improve speed while ensuring annotation quality.

Finite-state multimodal integration and understanding
MICHAEL JOHNSTON, SRINIVAS BANGALORE
Journal:

Natural Language Engineering / Volume 11 / Issue 2 / June 2005

Published online by Cambridge University Press:

19 May 2005, pp. 159-187
- Article
Multimodal interfaces are systems that allow input and/or output to be conveyed over multiple channels such as speech, graphics, and gesture. In addition to parsing and understanding separate utterances from different modes such as speech or gesture, multimodal interfaces also need to parse and understand composite multimodal utterances that are distributed over multiple input modes. We present an approach in which multimodal parsing and understanding are achieved using a weighted finite-state device which takes speech and gesture streams as inputs and outputs their joint interpretation. In comparison to previous approaches, this approach is significantly more efficient and provides a more general probabilistic framework for multimodal ambiguity resolution. The approach also enables tight-coupling of multimodal understanding with speech recognition. Since the finite-state approach is more lightweight in computational needs, it can be more readily deployed on a broader range of mobile platforms. We provide speech recognition results that demonstrate compensation effects of exploiting gesture information in a directory assistance and messaging task using a multimodal interface.

Robust parsing with weighted constraints
KILIAN FOTH, WOLFGANG MENZEL, INGO SCHRÖDER
Journal:

Natural Language Engineering / Volume 11 / Issue 1 / March 2005

Published online by Cambridge University Press:

28 February 2005, pp. 1-25
- Article
Based on constraint optimization techniques, an architecture for robust parsing of natural language utterances has been developed. The resulting system is able to combine possibly contradicting evidence from a variety of information sources, using a plausibility-based arbitration procedure to derive fairly rich structural representations, comprising aspects of syntax, semantics and other description levels of language. The results of a series of experiments are reported which demonstrate the high potential for robust behaviour with respect to ungrammaticality, incomplete utterances, and temporal pressure.

Visualization-enabled multi-document summarization by Iterative Residual Rescaling
RIE ANDO, BRANIMIR BOGURAEV, ROY BYRD, MARY NEFF
Journal:

Natural Language Engineering / Volume 11 / Issue 1 / March 2005

Published online by Cambridge University Press:

28 February 2005, pp. 67-86
- Article
This paper describes a novel approach to multi-document summarization, which explicitly addresses the problem of detecting, and retaining for the summary, multiple themes in document collections. We place equal emphasis on the processes of theme identification and theme presentation. For the former, we apply Iterative Residual Rescaling (IRR); for the latter, we argue for graphical display elements. IRR is an algorithm designed to account for correlations between words and to construct multi-dimensional topical space indicative of relationships among linguistic objects (documents, phrases, and sentences). Summaries are composed of objects with certain properties, derived by exploiting the many-to-many relationships in such a space. Given their inherent complexity, our multi-faceted summaries benefit from a visualization environment. We discuss some essential features of such an environment.

A comparison of parsing technologies for the biomedical domain
CLAIRE GROVER, ALEX LASCARIDES, MIRELLA LAPATA
Journal:

Natural Language Engineering / Volume 11 / Issue 1 / March 2005

Published online by Cambridge University Press:

28 February 2005, pp. 27-65
- Article
This paper reports on a number of experiments which are designed to investigate the extent to which current NLP resources are able to syntactically and semantically analyse biomedical text. We address two tasks: (a) parsing a real corpus with a hand-built wide-coverage grammar, producing both syntactic analyses and logical forms and (b) automatically computing the interpretation of compound nouns where the head is a nominalisation (e.g. hospital arrival means an arrival at hospital, while patient arrival means an arrival of a patient). For the former task we demonstrate that flexible and yet constrained pre-processing techniques are crucial to success: these enable us to use part-of-speech tags to overcome inadequate lexical coverage, and to package up complex technical expressions prior to parsing so that they are blocked from creating misleading amounts of syntactic complexity. We argue that the XML-processing paradigm is ideally suited for automatically preparing the corpus for parsing. For the latter task, we compute interpretations of the compounds by exploiting surface cues and meaning paraphrases, which in turn are extracted from the parsed corpus. This provides an empirical setting in which we can compare the utility of a comparatively deep parser vs. a shallow one, exploring the trade-off between resolving attachment ambiguities on the one hand and generating errors in the parses on the other. We demonstrate that a model of the meaning of compound nominalisations is achievable with the aid of current broad-coverage parsers.

Industry Watch
ROBERT DALE
Journal:

Natural Language Engineering / Volume 11 / Issue 1 / March 2005

Published online by Cambridge University Press:

28 February 2005, pp. 113-117
- Article
One way to keep in touch with what is happening in the commercial speech and language technology world is to pay occasional visits to the websites of HLT Central (at www.hltcentral.org) and LT World (at www.lt-world.org). Both sites provide links to news stories and press releases from companies and other organizations active in the area. The people who run these sites trawl the web for news stories of relevance, saving you the trouble of doing that yourself.

Erratum
Journal:

Natural Language Engineering / Volume 11 / Issue 1 / March 2005

Published online by Cambridge University Press:

28 February 2005, p. 119
- Article
Journal of Natural Language Engineering, volume 9, issue 4, pp. 365–80, December 2003

Correcting real-word spelling errors by restoring lexical cohesion
GRAEME HIRST, ALEXANDER BUDANITSKY
Journal:

Natural Language Engineering / Volume 11 / Issue 1 / March 2005

Published online by Cambridge University Press:

28 February 2005, pp. 87-111
- Article
Spelling errors that happen to result in a real word in the lexicon cannot be detected by a conventional spelling checker. We present a method for detecting and correcting many such errors by identifying tokens that are semantically unrelated to their context and are spelling variations of words that would be related to the context. Relatedness to context is determined by a measure of semantic distance initially proposed by Jiang and Conrath (1997). We tested the method on an artificial corpus of errors; it achieved recall of 23–50% and precision of 18–25%.

3242 results in Artificial Intelligence and Natural Language Processing
