A parser is a function that analyses a piece of text to determine its logical structure. The text is a string of characters describing some value of interest, such as an arithmetic expression, a poem or a spreadsheet. The output of a parser is a representation of the value, such as a tree of some kind for an arithmetic expression, a list of verses for a poem, or something more complicated for a spreadsheet. Most programming tasks involve decoding the input in some way, so parsing is a pervasive component of computer programming. In this chapter we will describe a monadic approach to parsing, mainly designing simple parsers for expressions of various kinds. We will also say a little more about the converse process of encoding the output as a string; in other words, more about the type class Show. This material will be used in the final chapter.
Parsers as monads
Parsers return different values of interest, so as a first cut we can think of a parser as a function that takes a string and returns a value:
type Parser a = String → a
This type is basically the same as that of the standard prelude function
read :: Read a ⇒ String → a
Indeed, read is a parser, though not a very flexible one. One reason is that all the input must be consumed.
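The limitation can be seen in a short sketch that uses only standard functions: readMaybe from Text.Read, which behaves like read but reports failure instead of raising an error, and the Prelude function reads, which returns each parsed value together with the unconsumed input. The names strictInt and prefixInt are illustrative and do not come from the text.

```haskell
import Text.Read (readMaybe)

-- Strict reading: succeeds only when the whole string denotes one value.
-- (strictInt is a hypothetical name chosen for this sketch.)
strictInt :: String -> Maybe Int
strictInt = readMaybe

-- Lenient reading: each result pairs a parsed value with the input
-- that was left over. (prefixInt is likewise a hypothetical name.)
prefixInt :: String -> [(Int, String)]
prefixInt = reads
```

Here strictInt "12 apples" yields Nothing because of the trailing text, whereas prefixInt "12 apples" yields [(12," apples")], leaving the remainder available to whatever parser runs next.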
This final chapter is devoted to a single programming project, the design and implementation of a simple calculator for carrying out point-free equational proofs. Although the calculator provides only a small subset of the facilities one might want in an automatic proof assistant, and is highly restrictive in a number of other ways, it will nevertheless be powerful enough to prove many of the point-free laws described in previous chapters – well, provided we are prepared to give it a nudge in the right direction if necessary. The project is also a case study in the use of modules. Each component of the calculator, its associated types and functions, is defined in an appropriate module and linked to other modules through explicit import and export lists.
Basic considerations
The basic idea is to construct a single function calculate with type
calculate :: [Law] → Expr → Calculation
The first argument of calculate is a list of laws that may be applied. Each law consists of a descriptive name and an equation. The second argument is an expression and the result is a calculation. A calculation consists of a starting expression and a sequence of steps. Each step consists of the name of a law and the expression that results from applying the left-hand side of the law to the current expression.
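One possible way to render this description as Haskell types is sketched below. The constructor names and the placeholder definition of Expr are assumptions made for illustration; the excerpt does not say how expressions are represented, and the helper conclusion is likewise hypothetical.

```haskell
-- A law pairs a descriptive name with an equation.
type LawName  = String
type Equation = (Expr, Expr)          -- (left-hand side, right-hand side)
data Law      = Law LawName Equation deriving Show

-- Placeholder expression type, assumed only so the sketch compiles.
data Expr = Var String | Compose [Expr] deriving (Eq, Show)

-- A step records the law used and the expression it produced.
type Step = (LawName, Expr)

-- A calculation is a starting expression followed by its steps.
data Calculation = Calc Expr [Step] deriving Show

-- The final expression of a calculation: the result of the last step,
-- or the starting expression when there are no steps.
conclusion :: Calculation -> Expr
conclusion (Calc e steps) = last (e : map snd steps)
```

Keeping the starting expression separate from the list of steps means that a calculation with no applicable laws is still a well-formed value.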
In this chapter we present a variant of the tableau algorithm that can accommodate the SHIQ language. We first summarise the elements of the tableau reasoning that have been introduced so far. Next, we discuss the techniques to support each new construct of the SHIQ language: transitivity, role hierarchy, inverse roles, functional restrictions and qualified number restrictions. Finally we describe the full SHIQ tableau algorithm, first only for TBox inference and then for reasoning over ABoxes and TBoxes together. The chapter is concluded with a discussion of optimisation techniques for the SHIQ tableau algorithm.
The discussion follows the papers [63, 60], although parts of the formalism have been simplified. Detailed proofs of the properties of the algorithm can also be found there.
An outline of the SHIQ tableau algorithm
First we reiterate the main characteristics of the ALCN tableau algorithm with respect to a possibly non-empty TBox (see Section 5.4). The purpose of this discussion is to highlight the most important features that will be reused in the SHIQ algorithm.
(1) The goal of the tableau algorithm is to decide whether a root concept C0 is satisfiable w.r.t. a (possibly empty) TBox T. To make the following discussion simpler, we will often omit the reference to the TBox.
The aim of this chapter is to introduce some major problems associated with the World Wide Web that have led to the development of its new generation, the Semantic Web.
The chapter has two main parts. In the first we describe the structure of the Internet, the different kinds of web pages (static and dynamic) and their role in the process of information storage. Here we also introduce the concept of web forms and Common Gateway Interface (CGI) technology and its more advanced alternatives.
In the second part of the chapter we examine how traditional search engines work, what their limits are and how they fare with heterogeneous information sources. We illustrate the problems associated with searching the Web and briefly describe possible solutions. One of these is the Semantic Web approach, which is described in more detail in later chapters.
For readers familiar with the Internet we suggest skipping the first section and starting at Section 1.2.
The architecture of the web
The World Wide Web is made up of servers and clients. Servers store different kinds of information in various ways. Most often these pieces of information are stored in the form of web pages (also called homepages), which are essentially standard text files with a special structure.
In this chapter we focus on heavyweight ontologies and on the role they play in the Semantic Web. We introduce the Web Ontology Language OWL, which is based on Description Logic and which was created as an extension of the RDF schema language (discussed in Section 2.6).
In Section 8.1 we give an introductory overview of the OWL language. Next, in Section 8.2, we present the details of the first OWL standard. Finally, in Section 8.3, we discuss the recently released extension of OWL, the OWL 2 standard.
Unless stated otherwise, we will use the term “OWL” when discussing the features present in both the initial and the recent standards. The terms “OWL 1” and “OWL 2” are used to refer to the first and the recent variant of the Web Ontology Language, respectively.
The language OWL – an introduction
The deficiencies of RDF schema and increasing demand led to the development of a large number of ontology languages. At the end of the last millennium, the two most significant languages were the OIL (Ontology Inference Layer) [37] and the DAML-ONT (DARPA Agent Markup Language) [52] languages. The former was developed by the University of Manchester, the latter by the United States Department of Defense (DoD). Both languages were designed with Description Logic in mind; OIL, in particular, is suited for the FaCT reasoner [55] and hence it realises the SHIQ language class.
The present chapter introduces the Semantic Web and its philosophy. This involves two main ideas. The first is to associate meta-information with Internet-based resources. The second is to reason about this type of information. We show how these two ideas can help in solving the problems mentioned in the previous chapter.
Having introduced the main concepts we continue the chapter by describing technologies that can be used for representing meta-information in a uniform way. First we introduce the XML language, which forms the basis of the Semantic Web as a standard information exchange format. Then we describe the RDF language; this has an XML notation as well as other representations and can be used to associate meta-information with an arbitrary resource. By doing this we can extend web contents with computer-processable semantics.
Subsequently, we introduce the RDF schema language, which provides the background knowledge that is essential for reasoning about meta-information. We discuss the similarities and differences between RDF schemas and traditional object-oriented modelling paradigms.
We conclude the chapter by presenting several applications that directly or indirectly use RDF descriptions during their operation.
Introduction
The Semantic Web approach was originated by Tim Berners-Lee, the father of the World Wide Web and related technologies (URI, HTTP, HTML etc.). The approach is based on two fundamental ideas.
The Semantic Web is a new area of computer science that is being developed with the main aim of making it easier for computers to process intelligently the huge amount of information on the web. In other words, as the common slogan of the Semantic Web says: computers should not only read but also understand the information on the web. To achieve this, it is necessary to associate metadata with web-based information. For example, in the case of a picture one should formally provide information regarding its author, title and contents. Furthermore, computers should be able to perform reasoning tasks. For example, if it is known that a river appears in a picture, the computer should be able to deduce that water can also be found in the picture.
Research into hierarchical terminology systems, i.e. ontologies, is strongly connected to the area of the Semantic Web. Ontologies are formal systems that allow the description of concrete knowledge about objects of interest as well as of general background knowledge. The description logic formalism is the most widespread approach providing the mathematical foundation of this field. It is not a coincidence that both OWL and its second edition, OWL 2, which are Semantic Web languages standardised by the World Wide Web Consortium (W3C), are based on Description Logic.
In this chapter we discuss the family of description logic (DL) languages. Following an introduction in Section 4.1, we present informally the most important language elements of Description Logic (Section 4.2). Next, we give the exact syntax and semantics of each language (Sections 4.3–4.5). Section 4.6 gives an overview of reasoning tasks for Description Logic while Section 4.7 deals with the simplification of reasoning tasks. Section 4.8 introduces the so-called assertion boxes (ABoxes). In Section 4.9 we explain the links between Description Logic and first-order logic, while Section 4.10 gives an overview of advanced features of DL languages.
In the present chapter and the rest of the book we follow a common notation: names in DL formulae are typeset using the grotesque font.
Introduction
Description Logic allows us to build a mathematical model describing the notions used in a specific area of interest, or in common knowledge [4]. Description Logic deals with concepts representing sets of individuals: for instance the concept “human” describes the set of all human beings. Furthermore, one can also describe relationships between individuals. In Description Logic, as in RDF, only binary (i.e. two-argument) relationships can be used, which here are referred to as roles. For instance, the role “has child” holds between a parent and a child individual.
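As a rough illustration of this set-based reading, a concept can be modelled as a collection of individuals and a role as a collection of pairs. The individuals, the type names and the helper someRelated are invented for this sketch, and lists stand in for sets to keep it dependency-free.

```haskell
import Data.List (nub)

type Individual = String
type Concept    = [Individual]                -- a set of individuals
type Role       = [(Individual, Individual)]  -- a binary relationship

-- Hypothetical example data.
human :: Concept
human = ["anna", "bob"]

hasChild :: Role
hasChild = [("anna", "bob"), ("carl", "rex")]

-- Individuals related by the role to at least one member of the concept,
-- for example everyone who has a human child.
someRelated :: Role -> Concept -> [Individual]
someRelated role concept = nub [x | (x, y) <- role, y `elem` concept]
```

With this data, someRelated hasChild human evaluates to ["anna"], since "rex" is not listed as a human.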
This chapter presents an implementation of a description logic reasoning engine. Building on the theoretical foundations discussed in the previous chapters we present a program, written in the Haskell functional programming language, which is able to answer concept satisfiability queries over an arbitrary TBox using the ALCN language.
Following the introduction we give some examples illustrating the use of the program to be described. We then present the data structures of the program. Next, we describe the transformation of ALCN concepts into negation normal form and present the entry point of the tableau algorithm. The bulk of the chapter deals with the implementation of the main components of the tableau algorithm: the transformation rules and blocking. Finally, we describe the auxiliary functions used in the preceding sections and discuss possible improvements to the reasoning engine.
Introduction
The ALCN tableau algorithm considered in this chapter is implemented in Haskell, a functional programming language with lazy evaluation [65]. Haskell has been chosen for this task because it allows us to present the inference engine in a concise, simple and easily understandable way which is quite close to the mathematical notation employed in the previous chapters. No prior knowledge of Haskell is required to understand this chapter, since the various features of the Haskell programming language are explained at their first use.
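To give a flavour of how closely Haskell can follow the mathematical notation, the transformation of ALCN concepts into negation normal form might be sketched as follows. This is not the program presented in the chapter: the data type, its constructor names and the functions nnf and neg are assumptions, and number restrictions are assumed to carry non-negative bounds.

```haskell
data Concept
  = Top | Bottom
  | Atom String
  | Not Concept
  | And Concept Concept
  | Or Concept Concept
  | Exists String Concept   -- some R-successor is in C
  | Forall String Concept   -- every R-successor is in C
  | AtLeast Int String      -- at least n R-successors
  | AtMost Int String       -- at most n R-successors
  deriving (Eq, Show)

-- Push negations inwards until they apply only to atomic concepts.
nnf :: Concept -> Concept
nnf (Not c)      = neg c
nnf (And c d)    = And (nnf c) (nnf d)
nnf (Or c d)     = Or (nnf c) (nnf d)
nnf (Exists r c) = Exists r (nnf c)
nnf (Forall r c) = Forall r (nnf c)
nnf c            = c

-- neg c is the negation normal form of (Not c).
neg :: Concept -> Concept
neg Top          = Bottom
neg Bottom       = Top
neg (Atom a)     = Not (Atom a)
neg (Not c)      = nnf c
neg (And c d)    = Or (neg c) (neg d)
neg (Or c d)     = And (neg c) (neg d)
neg (Exists r c) = Forall r (neg c)
neg (Forall r c) = Exists r (neg c)
neg (AtLeast n r)
  | n <= 0       = Bottom               -- "at least 0" always holds
  | otherwise    = AtMost (n - 1) r
neg (AtMost n r) = AtLeast (n + 1) r
```

Each equation corresponds to one of the usual equivalences, such as De Morgan's laws and the duality between the existential and universal restrictions.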
In this chapter we deal with the way in which RDF descriptions are stored, processed and queried as well as the applications and languages involved in the process.
In Section 3.1 we describe how to make RDF meta-information on the web available for search engines. Next, in Section 3.2, we give an overview of development tools which can be used to parse and manage RDF-based sources. In Section 3.3 we describe RDF query languages and show why XML query languages are not suitable for this purpose. Subsequently, in Section 3.4, we discuss the possible reasoning tasks involved in answering RDF queries. Finally, in Section 3.5, we describe problems which arise in the course of optimising RDF queries and outline possible solutions.
RDF descriptions on the web
The RDF language is a generic framework that helps to associate meta-information with resources in a uniform way. RDF is by no means limited to the web, because anything that is identified by a URI can be used in RDF statements. However, as we saw in the previous chapter, practically anything can have a URI: a person, a rucksack, a house etc. This allows us to use RDF in environments other than the web, for example, traditional databases, information integration and other knowledge-intensive systems.
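A minimal sketch of RDF statements as data is given below, assuming the usual subject, predicate and object reading of a statement. The type names, the example.org URIs and the helper statementsAbout are hypothetical, and the model is simplified (blank nodes, for instance, are left out).

```haskell
type URI = String

-- The object of a statement is either another resource or a literal value.
data Node = Resource URI | Literal String
  deriving (Eq, Show)

data Statement = Statement
  { subject   :: URI
  , predicate :: URI
  , object    :: Node
  } deriving (Eq, Show)

-- Hypothetical statements about things that are not web pages.
examples :: [Statement]
examples =
  [ Statement "http://example.org/people/anna"
              "http://example.org/terms/owns"
              (Resource "http://example.org/things/rucksack-1")
  , Statement "http://example.org/things/rucksack-1"
              "http://example.org/terms/colour"
              (Literal "green")
  ]

-- All statements whose subject is the given resource.
statementsAbout :: URI -> [Statement] -> [Statement]
statementsAbout s = filter ((== s) . subject)
```

Because subjects are just URIs, the same representation serves equally for a person, a rucksack or a database record.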