One of the main challenges in question answering is the potential mismatch between the expressions in questions and the expressions in texts. While humans appear to use inference rules such as ‘X writes Y’ implies ‘X is the author of Y’ in answering questions, such rules are generally unavailable to question-answering systems due to the inherent difficulty in constructing them. In this paper, we present an unsupervised algorithm for discovering inference rules from text. Our algorithm is based on an extended version of Harris’ Distributional Hypothesis, which states that words that occur in the same contexts tend to be similar. Instead of applying this hypothesis to words, we apply it to paths in the dependency trees of a parsed corpus. Essentially, if two paths tend to link the same sets of words, we hypothesize that their meanings are similar. We use examples to show that our system discovers many inference rules easily missed by humans.
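To make the path-based reading of the Distributional Hypothesis concrete, here is a minimal Python sketch: each dependency path is characterised by the words filling its two slots, and paths whose slots overlap are scored as similar. The toy fillers and the Jaccard-based measure are illustrative assumptions, not the algorithm's actual similarity function.

```python
# Sketch: paths whose X and Y slots are filled by the same words
# are hypothesised to have similar meanings.

def slot_jaccard(fillers_a, fillers_b):
    """Jaccard overlap between two sets of slot fillers."""
    if not fillers_a and not fillers_b:
        return 0.0
    return len(fillers_a & fillers_b) / len(fillers_a | fillers_b)

def path_similarity(path_a, path_b, slot_fillers):
    """Combine the similarity of the X slots and of the Y slots."""
    sim_x = slot_jaccard(slot_fillers[path_a]["X"], slot_fillers[path_b]["X"])
    sim_y = slot_jaccard(slot_fillers[path_a]["Y"], slot_fillers[path_b]["Y"])
    return (sim_x * sim_y) ** 0.5  # geometric mean of the two slot similarities

# Toy slot fillers observed in a hypothetical parsed corpus.
slot_fillers = {
    "X writes Y":           {"X": {"Austen", "Orwell", "Tolstoy"},
                              "Y": {"Emma", "1984", "War and Peace"}},
    "X is the author of Y": {"X": {"Orwell", "Tolstoy", "Woolf"},
                              "Y": {"1984", "War and Peace", "Orlando"}},
}

print(path_similarity("X writes Y", "X is the author of Y", slot_fillers))  # 0.5
```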
The Text REtrieval Conference (TREC) question answering track is an effort to bring the benefits of large-scale evaluation to bear on a question answering (QA) task. The track has run twice so far, first in TREC-8 and again in TREC-9. In each case, the goal was to retrieve small snippets of text that contain the actual answer to a question rather than the document lists traditionally returned by text retrieval systems. The best performing systems were able to answer about 70% of the questions in TREC-8 and about 65% of the questions in TREC-9. While the 65% score is a slightly worse result than the TREC-8 scores in absolute terms, it represents a very significant improvement in question answering systems. The TREC-9 task was considerably harder than the TREC-8 task because TREC-9 used actual users’ questions while TREC-8 used questions constructed for the track. Future tracks will continue to challenge the QA community with more difficult, and more realistic, question answering tasks.
We investigate the problem of complex answers in question answering. Complex answers consist of several simple answers. We describe the online question answering system SHAPAQA, and using data from this system we show that the problem of complex answers is quite common. We define nine types of complex questions, and suggest two approaches, based on answer frequencies, that allow question answering systems to tackle the problem.
As users struggle to navigate the wealth of on-line information now available, the need for automated question answering systems becomes more urgent. We need systems that allow a user to ask a question in everyday language and receive an answer quickly and succinctly, with sufficient context to validate the answer. Current search engines can return ranked lists of documents, but they do not deliver answers to the user.
Question answering systems address this problem. Recent successes have been reported in a series of question-answering evaluations that started in 1999 as part of the Text Retrieval Conference (TREC). The best systems are now able to answer more than two thirds of factual questions in this evaluation.
In this paper, we take a detailed look at the performance of components of an idealized question answering system on two different tasks: the TREC Question Answering task and a set of reading comprehension exams. We carry out three types of analysis: inherent properties of the data, feature analysis, and performance bounds. Based on these analyses we explain some of the performance results of the current generation of Q/A systems and make predictions on future work. In particular, we present four findings: (1) Q/A system performance is correlated with answer repetition; (2) relative overlap scores are more effective than absolute overlap scores; (3) equivalence classes on scoring functions can be used to quantify performance bounds; and (4) perfect answer typing still leaves a great deal of ambiguity for a Q/A system because sentences often contain several items of the same type.
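As a rough illustration of finding (2), the following Python sketch contrasts an absolute word-overlap score with a relative one. The definitions are plausible stand-ins, not the scoring functions analysed in the paper; the point of the normalised score is that it is comparable across questions of different lengths.

```python
def absolute_overlap(question_words, sentence_words):
    """Raw count of distinct question words appearing in the sentence."""
    return len(set(question_words) & set(sentence_words))

def relative_overlap(question_words, sentence_words):
    """Fraction of distinct question words appearing in the sentence."""
    q = set(question_words)
    return len(q & set(sentence_words)) / len(q) if q else 0.0

question = "who wrote the novel 1984".split()
sentence = "George Orwell wrote the dystopian novel 1984 in 1948".split()

print(absolute_overlap(question, sentence))            # 4
print(round(relative_overlap(question, sentence), 2))  # 0.8
```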
The syntactic structure of a nominal compound must be analyzed before it can be interpreted semantically. In addition, the syntactic analysis of nominal compounds is very useful for NLP applications such as information extraction, since a nominal compound often has a linguistic structure similar to that of a simple sentence, while expressing the concrete, compound meaning of an object through several combined nouns. In this paper, we present a novel model for the structural analysis of nominal compounds that couples linguistic and statistical knowledge through lexical information. That is, the syntactic relations defined between nouns (complement-predicate and modifier-head relations) are obtained from large corpora and then used to analyze the structures of nominal compounds and to identify the underlying relations between their nouns. Experiments show that the model gives good results and can be used effectively in application systems that do not require deep semantic information.
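For a flavour of how corpus statistics can guide such analysis, here is a minimal Python sketch that brackets a three-noun compound by comparing hypothetical association scores between adjacent noun pairs. It illustrates the general idea only and is not the model proposed in the paper.

```python
ASSOCIATION = {  # hypothetical corpus-derived association scores
    ("computer", "science"): 0.82,
    ("science", "department"): 0.35,
    ("computer", "department"): 0.05,
}

def bracket(n1, n2, n3):
    """Return the preferred bracketing of a three-noun compound."""
    left = ASSOCIATION.get((n1, n2), 0.0)   # evidence for [[N1 N2] N3]
    right = ASSOCIATION.get((n2, n3), 0.0)  # evidence for [N1 [N2 N3]]
    return f"[[{n1} {n2}] {n3}]" if left >= right else f"[{n1} [{n2} {n3}]]"

print(bracket("computer", "science", "department"))  # [[computer science] department]
```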
This paper presents a flexible bottom-up process for incrementally generating several versions of the same text, building up the core text from its kernel version into other versions that vary in their level of detail. We devise a method for identifying the question/answer relations holding between the propositions of a text, we give rules for characterizing the kernel version of a text, and we provide a procedure, based on causal and temporal expansions of sentences, which semantically distinguishes these levels of detail according to their importance. This rests on the assumption that a stock of information from the interpreter's knowledge base is available. The sentence expansion operation is formally defined according to three principles: (1) the kernel principle ensures that the gist of the information is obtained; (2) the expansion principle defines an incremental augmentation of a text; and (3) the subsume principle defines an importance-based order among the possible details of the information. The system developed allows users to generate, in a follow-up fashion, their own version of the text, one that meets their expectations and the demands they express as questions about the text under consideration.
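As a rough illustration of the three principles, the following Python sketch expands a kernel text by adding details in order of importance, so that each version subsumes the previous one. The propositions, importance scores, and realisation step are invented for the example and do not reflect the system's actual machinery.

```python
kernel = ["The dam failed."]                                 # kernel principle
details = [                                                  # expansion candidates
    (0.9, "Heavy rain raised the reservoir above its design level."),  # causal
    (0.6, "The failure occurred on Tuesday night."),                    # temporal
    (0.3, "Engineers had inspected the dam the previous spring."),
]

def versions(kernel, details):
    """Yield successively more detailed versions of the text."""
    text = list(kernel)
    yield " ".join(text)
    # subsume principle: more important details are added first
    for _, detail in sorted(details, reverse=True):
        text.append(detail)
        yield " ".join(text)

for i, version in enumerate(versions(kernel, details)):
    print(f"version {i}: {version}")
```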
We describe two newly developed computational tools for morphological processing: a program for analysis of English inflectional morphology, and a morphological generator, automatically derived from the analyser. The tools are fast, being based on finite-state techniques, have wide coverage, incorporating data from various corpora and machine readable dictionaries, and are robust, in that they are able to deal effectively with unknown words. The tools are freely available. We evaluate the accuracy and speed of both tools and discuss a number of practical applications in which they have been put to use.
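The following Python sketch illustrates rule-based analysis of English inflectional morphology with a handful of suffix rules. The real tools use finite-state techniques and broad-coverage lexical data, so this is only an illustrative approximation; the rules and test words are assumptions.

```python
import re

SUFFIX_RULES = [
    # (pattern, replacement, morphological tag)
    (r"(\w+)ies$", r"\1y", "+s"),    # ponies  -> pony+s
    (r"(\w+)es$",  r"\1",  "+s"),    # boxes   -> box+s
    (r"(\w+)s$",   r"\1",  "+s"),    # cats    -> cat+s
    (r"(\w+)ied$", r"\1y", "+ed"),   # tried   -> try+ed
    (r"(\w+)ed$",  r"\1",  "+ed"),   # walked  -> walk+ed
    (r"(\w+)ing$", r"\1",  "+ing"),  # walking -> walk+ing
]

def analyse(word):
    """Return (lemma, tag) guesses for an inflected word form."""
    analyses = []
    for pattern, replacement, tag in SUFFIX_RULES:
        if re.fullmatch(pattern, word):
            analyses.append((re.sub(pattern, replacement, word), tag))
    return analyses or [(word, "")]  # robustness: unknown words pass through

for w in ["ponies", "walked", "walking", "syzygy"]:
    print(w, "->", analyse(w))
```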
Generating text in a hypermedia environment places different demands on a text generation system than non-interactive environments do. This paper describes some of these demands, then shows how the architecture of one text generation system, ILEX, has been shaped by them. The architecture is described in terms of the levels of linguistic representation used, and the processes which map between them. Particular attention is paid to the processes of content selection and text structuring.
ANVIL is an information retrieval system using natural language processing techniques, intended for retrieval of captioned images. It extracts dependency structures from the image captions and user queries, and then applies a high accuracy matching algorithm which recursively explores the dependency structures to determine their similarity. A further algorithm allows additional contextual information to be extracted following a successful match, with the intention of helping users understand and organise the retrieval results. ANVIL was developed to high engineering standards, and as well as looking at the research aspects of the system, we also look at some of the design and development issues. English and Japanese versions of the system have been developed.
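A minimal Python sketch of recursive matching over dependency structures follows; the tree representation and scoring are illustrative assumptions rather than ANVIL's actual algorithm.

```python
def node_similarity(a, b):
    """Crude lexical match between two dependency nodes (head words)."""
    return 1.0 if a["word"].lower() == b["word"].lower() else 0.0

def tree_similarity(query_node, caption_node):
    """Recursively score how well a query subtree matches a caption subtree."""
    score = node_similarity(query_node, caption_node)
    for q_child in query_node.get("children", []):
        # Align each query child with its best-matching caption child.
        best = 0.0
        for c_child in caption_node.get("children", []):
            best = max(best, tree_similarity(q_child, c_child))
        score += best
    return score

query = {"word": "dog", "children": [{"word": "black"}]}
caption = {"word": "dog",
           "children": [{"word": "black"}, {"word": "small"}]}

print(tree_similarity(query, caption))  # 2.0: head and modifier both match
```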
The automated analysis of natural language data has become a central issue in the design of Intelligent Information Systems. The term natural language is intended to cover all the possible modalities of human communication and is not restricted to written or spoken language. Processing unrestricted natural language is still considered an AI-hard task. However, various analysis techniques have been proposed to address specific aspects of natural language. In particular, recent interest has focused on approximate analysis techniques, on the assumption that perfect analysis is not possible but that partial results are still very useful.
Compound noun segmentation is one of the crucial problems in Korean language processing, because a series of nouns in Korean may appear without spaces in real text, which makes it difficult to identify the morphological constituents. This paper presents an effective method of Korean compound noun segmentation based on lexical data extracted from a corpus. The segmentation consists of two tasks. First, a Hand-Built Segmentation Dictionary (HBSD) is used to segment compound nouns which occur frequently or need exceptional treatment. Second, a segmentation algorithm using data from a corpus is proposed, in which simple nouns and their frequencies are stored in a Simple Noun Dictionary (SND) for segmentation. The analysis is carried out using modified tabular parsing with a min-max operation. Our experiments show an accuracy of about 97.29%, which demonstrates the effectiveness of the method.
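The following Python sketch conveys the corpus-driven idea with a romanised toy dictionary: candidate splits are enumerated against a simple noun list, and the split whose least frequent constituent is most frequent is preferred, a min-max flavoured criterion. The dictionary and scoring are assumptions for illustration, not the paper's tabular parsing procedure.

```python
SIMPLE_NOUN_FREQ = {   # hypothetical Simple Noun Dictionary (SND) with counts
    "hakkyo": 120,     # 'school'
    "seonsaeng": 80,   # 'teacher'
    "nim": 40,         # honorific element treated as a unit here
    "seon": 5,
}

def segmentations(text):
    """Enumerate all ways to split text into dictionary nouns."""
    if not text:
        yield []
        return
    for end in range(1, len(text) + 1):
        head = text[:end]
        if head in SIMPLE_NOUN_FREQ:
            for rest in segmentations(text[end:]):
                yield [head] + rest

def best_segmentation(text):
    """Pick the split whose least frequent constituent is most frequent."""
    candidates = list(segmentations(text))
    if not candidates:
        return None
    return max(candidates, key=lambda seg: min(SIMPLE_NOUN_FREQ[n] for n in seg))

print(best_segmentation("hakkyoseonsaengnim"))  # ['hakkyo', 'seonsaeng', 'nim']
```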
Transformation-Based Learning (TBL) is a relatively new machine learning method that has achieved notable success on language problems. This paper presents a variant of TBL, called Randomized TBL, that overcomes the training time problems of standard TBL without sacrificing accuracy. It includes a set of experiments on part-of-speech tagging in which the size of the corpus and template set are varied. The results show that Randomized TBL can address problems that are intractable in terms of training time for standard TBL. In addition, for language problems such as dialogue act tagging where the most effective features have not been identified through linguistic studies, Randomized TBL allows the researcher to experiment with a large set of templates capturing many potentially useful features and feature interactions.
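A minimal Python sketch of the idea follows: the usual TBL loop of generating, scoring, and applying transformation rules, with the randomisation step of sampling candidates rather than scoring them all. The single template and toy data are invented for illustration and are not the paper's experimental setup.

```python
import random
from collections import Counter

corpus = [("the", "DT"), ("dog", "NN"), ("runs", "NN"),   # 'runs' mis-tagged
          ("a", "DT"), ("cat", "NN"), ("sleeps", "NN")]   # 'sleeps' mis-tagged
gold = ["DT", "NN", "VBZ", "DT", "NN", "VBZ"]

def candidate_rules(tags):
    """One template: change tag OLD to NEW when the previous tag is PREV."""
    rules = Counter()
    for i in range(1, len(tags)):
        if tags[i] != gold[i]:
            rules[(tags[i], gold[i], tags[i - 1])] += 1
    return rules

def apply_rule(tags, rule):
    old, new, prev = rule
    return [new if i > 0 and t == old and tags[i - 1] == prev else t
            for i, t in enumerate(tags)]

tags = [t for _, t in corpus]
for _ in range(3):                       # a few TBL iterations
    rules = candidate_rules(tags)
    if not rules:
        break
    sampled = random.sample(list(rules), min(2, len(rules)))  # randomised step
    best = max(sampled, key=lambda r: rules[r])
    tags = apply_rule(tags, best)

print(tags)  # converges towards the gold tags on this toy corpus
```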
Automatic Accent Insertion (AAI) is the problem of re-inserting accents (diacritics) into a text where they are missing. Unaccented French texts are still quite common in electronic media, as a result of a long history of character encoding problems and the lack of well-established conventions for typing accented characters on computer keyboards. An AAI method for French is presented, based on a statistical language model. Next, it is shown how this AAI method can be used to do real-time accent insertions within a word processing environment, making it possible to type in French without having to type accents. Various mechanisms are proposed to improve the performance of real-time AAI, by exploiting online corrections made by the user. Experiments show that, on average, such a system produces less than one accentuation error for every 200 words typed.
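As a minimal illustration, the following Python sketch re-accents words by choosing the most frequent accented variant from a lexicon. A unigram model stands in for the statistical language model described in the paper (which uses context), and the tiny lexicon with its counts is an assumption.

```python
VARIANTS = {  # unaccented form -> possible accented forms with toy corpus counts
    "ete":  {"été": 950, "ête": 1},
    "a":    {"a": 5000, "à": 4000},
    "cote": {"côte": 300, "cote": 250, "côté": 600, "coté": 20},
}

def reaccent(word):
    """Return the most probable accented form of an unaccented word."""
    options = VARIANTS.get(word.lower())
    if not options:
        return word  # unknown words are left untouched
    return max(options, key=options.get)

for w in ["ete", "cote", "a", "xyz"]:
    print(w, "->", reaccent(w))
```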
System evaluation has mattered since research on automatic language and information processing began. However, the (D)ARPA conferences have raised the stakes substantially in requiring and delivering systematic evaluations and in sustaining these through long term programmes; and it has been claimed that this has both significantly raised task performance, as defined by appropriate effectiveness measures, and promoted relevant engineering development. These controlled laboratory evaluations have made very strong assumptions about the task context. The paper examines these assumptions for six task areas, considers their impact on evaluation and performance results, and argues that for current tasks of interest, e.g. summarising, it is now essential to play down the present narrowly-defined performance measures in order to address the task context, and specifically the role of the human participant in the task, so that new measures, of larger value, can be developed and applied.