Ontologies and text retrieval

Published online by Cambridge University Press: 23 August 2002

JAMES MAYFIELD
Affiliation:
The Johns Hopkins University Applied Physics Laboratory, 11100 Johns Hopkins Rd., Laurel MD 20723–6099, USA; e-mail: James.Mayfield@jhuapl.edu

Abstract

Analogues to much of today's work in ontologies have existed for centuries in text retrieval. The use of controlled vocabularies, or thesauri, has been fundamental to document indexing in library science. Thesauri serve several purposes, including:

• Knowledge organisation: A thesaurus provides a hierarchy of concepts that organises domain-specific knowledge.

• Terminology normalisation: By selecting a unique word or phrase to represent each domain concept, then linking synonymous terms to it, a thesaurus enforces terminological consistency.

• Query expansion: A thesaurus facilitates the addition of terms to a query by providing explicit hierarchical and lateral relationships among terms.

These properties serve to mediate the information flow from indexer to user. Thesauri thus serve many of the same functions for people that ontologies are designed to serve for software agents. As automated retrieval has developed over the decades since the inception of computer processing of text, many techniques have been introduced to apply this typically manual work to the automated arena (see Soergel (1985) for an introduction to library information systems, also Anderson and Pérez-Carballo (2001a, 2001b) for a summary of the intersection of human and machine indexing).
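
The three thesaurus functions listed above can be illustrated with a small sketch. This is not taken from the article: the `Concept` and `Thesaurus` classes, their field names, and the toy vocabulary about automobiles are all hypothetical, chosen only to show how a concept hierarchy, a synonym-to-preferred-term mapping, and hierarchical and lateral links might combine to normalise and expand a query.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set


@dataclass
class Concept:
    """One thesaurus entry, keyed by its preferred term (hypothetical layout)."""
    preferred: str
    synonyms: Set[str] = field(default_factory=set)   # non-preferred entry terms
    broader: Set[str] = field(default_factory=set)    # hierarchical parents
    narrower: Set[str] = field(default_factory=set)   # hierarchical children
    related: Set[str] = field(default_factory=set)    # lateral links


class Thesaurus:
    def __init__(self, concepts: Iterable[Concept]) -> None:
        concepts = list(concepts)
        # Knowledge organisation: concepts indexed by preferred term.
        self.concepts: Dict[str, Concept] = {c.preferred: c for c in concepts}
        # Terminology normalisation: every known surface form, lower-cased,
        # points at the single preferred term for its concept.
        self.lookup: Dict[str, str] = {}
        for c in concepts:
            self.lookup[c.preferred.lower()] = c.preferred
            for s in c.synonyms:
                self.lookup[s.lower()] = c.preferred

    def normalise(self, term: str) -> str:
        """Return the preferred term, or the input unchanged if it is unknown."""
        return self.lookup.get(term.lower(), term)

    def expand(self, query_terms: Iterable[str]) -> List[str]:
        """Query expansion: normalise each term, then add its narrower and
        related terms. Order is preserved and duplicates are dropped."""
        expanded: List[str] = []
        for term in query_terms:
            preferred = self.normalise(term)
            candidates = [preferred]
            concept = self.concepts.get(preferred)
            if concept is not None:
                candidates += sorted(concept.narrower) + sorted(concept.related)
            for candidate in candidates:
                if candidate not in expanded:
                    expanded.append(candidate)
        return expanded


# Toy vocabulary, invented for illustration only.
toy = Thesaurus([
    Concept("automobile",
            synonyms={"car", "motor car"},
            broader={"vehicle"},
            narrower={"sedan", "convertible"},
            related={"road transport"}),
    Concept("vehicle", narrower={"automobile", "bicycle"}),
])

print(toy.normalise("Car"))
print(toy.expand(["car"]))
```

In this sketch a query for "car" is first mapped to the preferred term "automobile" and then widened with that concept's narrower and related terms. Whether to expand towards broader terms as well is a design choice; it is left out here because it tends to trade precision for recall.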

Type
Research Article
Copyright
© 2002 Cambridge University Press

Footnotes

This work was supported in part by the Defense Advanced Research Projects Agency (DARPA) under contract number F30602-00-2-0591 AO K528. I would like to thank Tim Finin, who was instrumental in my involvement in this line of enquiry.