An important application of ontologies is to provide semantics and domain knowledge for data. Traditionally, data has been stored and managed inside relational database systems (aka SQL databases) where it is organised according to a pre-specified schema that describes its structure and meaning. In recent years, though, less and less data comes from such controlled sources. In fact, a lot of data is now found on the web, in social networks and so on, where typically neither its structure nor its meaning is explicitly specified; moreover, data coming from such sources is typically highly incomplete. Ontologies can help to overcome these problems by providing semantics and background knowledge, leading to a paradigm that is often called ontology-mediated querying. As an example, consider data about used-car offers. The ontology can add knowledge about the domain of cars, stating for example that a grand tourer is a kind of sports car. In this way, a car that the data identifies only as a grand tourer can be returned as an answer to a query that asks for all sports cars.
When data is involved, a fundamental description logic reasoning service is answering database queries in the presence of ontologies. Since answering full SQL queries is undecidable in the presence of ontologies, the prevailing query languages are conjunctive queries (CQs) and slight extensions thereof such as unions of conjunctive queries (UCQs) and positive existential queries. Conjunctive queries are essentially the select-from-where fragment of SQL, written in logic.
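To make the correspondence between conjunctive queries and select-from-where SQL concrete, the following is a minimal sketch in Python using the standard sqlite3 module. The table names (SportsCar, GrandTourer, offeredBy), the individuals (car1, car2, dealerA, dealerB) and the helper functions are hypothetical and chosen only to mirror the used-car example; the storage scheme of one table per concept or role name is likewise an assumption. The sketch evaluates the CQ q(x) = ∃y. SportsCar(x) ∧ offeredBy(x, y) directly on the data, without any ontology.

```python
import sqlite3


def build_abox():
    """Create a small in-memory database: one table per concept or role name (an assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE SportsCar(x TEXT);
        CREATE TABLE GrandTourer(x TEXT);
        CREATE TABLE offeredBy(x TEXT, y TEXT);
        INSERT INTO SportsCar VALUES ('car1');
        INSERT INTO GrandTourer VALUES ('car2');
        INSERT INTO offeredBy VALUES ('car1', 'dealerA');
        INSERT INTO offeredBy VALUES ('car2', 'dealerB');
    """)
    return conn


# The CQ q(x) = exists y. SportsCar(x) AND offeredBy(x, y):
# each atom becomes a table in FROM, each shared variable a join condition in WHERE,
# and the answer variable x is the only selected column.
CQ_SQL = """
    SELECT DISTINCT s.x
    FROM SportsCar AS s, offeredBy AS o
    WHERE s.x = o.x
"""


def answers_without_ontology(conn):
    """Evaluate the CQ on the stored data only, ignoring any domain knowledge."""
    return sorted(row[0] for row in conn.execute(CQ_SQL))
```

Evaluated this way, the query returns only car1: the data records car2 as a grand tourer, and without the ontology nothing tells the database that a grand tourer is a sports car.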
In this chapter, we study conjunctive query answering in the presence of ontologies that take the form of a DL TBox. In particular, we show how to implement this reasoning service using standard database systems such as relational (SQL) systems and Datalog engines, taking advantage of those systems’ efficiency and maturity. Since database systems are not prepared to deal with TBoxes, we need a way to “sneak them in”. While there are several approaches to achieve this, here we will concentrate on query rewriting: given a CQ q to be answered and a TBox T, produce a query q_T such that, for any ABox A, the answers to q on A and T are identical to the answers to q_T given by a database system that stores A as data.
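The idea of query rewriting can be illustrated with a deliberately restricted sketch, again in Python with sqlite3. It assumes a TBox that contains only inclusions between concept names (such as GrandTourer ⊑ SportsCar) and a query consisting of a single concept atom; under these assumptions, the rewritten query is simply the union of the queries for all concept names subsumed by the queried one. The names TBOX, subsumees, rewrite, answer and build_abox are hypothetical, and the sketch is not the rewriting procedure developed in this chapter, which has to handle more expressive TBoxes and arbitrary CQs.

```python
import sqlite3

# Hypothetical TBox: each pair (sub, sup) stands for the concept inclusion sub ⊑ sup.
TBOX = [("GrandTourer", "SportsCar"), ("SportsCar", "Car")]


def subsumees(concept, tbox):
    """Return all concept names that the TBox entails to be subsumed by `concept`
    (including `concept` itself), by following inclusions backwards."""
    found = {concept}
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for sub, sup in tbox:
            if sup == current and sub not in found:
                found.add(sub)
                frontier.append(sub)
    return sorted(found)


def rewrite(concept, tbox):
    """Rewrite the query q(x) = concept(x) into a union of queries, expressed in SQL.
    The TBox is compiled into the query, so the database never sees it."""
    return " UNION ".join(
        f'SELECT x FROM "{name}"' for name in subsumees(concept, tbox)
    )


def build_abox():
    """Store the ABox as plain data: one unary table per concept name (an assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Car(x TEXT);
        CREATE TABLE SportsCar(x TEXT);
        CREATE TABLE GrandTourer(x TEXT);
        INSERT INTO SportsCar VALUES ('car1');
        INSERT INTO GrandTourer VALUES ('car2');
    """)
    return conn


def answer(concept, tbox, conn):
    """Answer the query by evaluating the rewritten query with the database engine."""
    return sorted(row[0] for row in conn.execute(rewrite(concept, tbox)))
```

With this data, answer("SportsCar", TBOX, build_abox()) yields both car1 and car2, although car2 is stored only as a grand tourer. The rewriting depends on the query and the TBox but not on the data, so the same rewritten query can be reused for any ABox stored in tables of this shape.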