Capability-based localization of distributed and heterogeneous queries


One key aspect of data-centric applications is the manipulation of data stored in persistent repositories, which is shifting rapidly from querying a single centralized relational database to combining constellations of data sources ad hoc. Extending general-purpose languages with query operations is increasingly popular as a way to improve the reasoning and optimization capabilities of interpreters and compilers. However, little has been done to integrate and orchestrate separate, heterogeneous sources of data. We present a data manipulation language that abstracts the nature and location of data sources. We define its semantics and a type-directed query localization mechanism that development tools for heterogeneous environments can use to compile queries efficiently into native queries for each source. We introduce a localization procedure based on rewriting of query expressions that is confluent and terminating, and that provides the maximum mapping between site capabilities and the structure of the query. We provide formal type safety results that support the sound distribution of query fragments over remote sites. Our approach is also suitable for interactive query construction environments, in which rich user interfaces provide immediate feedback on data manipulation operations. This approach is currently the basis of the data layer of a development platform for mobile and web applications.
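
To give a rough intuition for capability-based localization, the following Python sketch pushes each operator of a query expression to a remote site whenever all of its operands already live at that site and the site declares support for the operator. It is only an illustration under stated assumptions, not the rewriting system or the type system of this work: the algebra (`Source`, `Op`, `Remote`), the operator names, the site names, and the capability table are all hypothetical, and a single bottom-up pass stands in for the rewriting procedure.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Tuple, Union


# Hypothetical query algebra; none of these names come from the paper.
@dataclass(frozen=True)
class Source:
    name: str   # collection or table name
    site: str   # site that stores the data


@dataclass(frozen=True)
class Op:
    kind: str                     # e.g. "filter", "project", "join"
    args: Tuple["Query", ...]     # operand expressions


@dataclass(frozen=True)
class Remote:
    """A fragment shipped whole to one site, to run as a native query."""
    site: str
    query: "Query"


Query = Union[Source, Op, Remote]
Capabilities = Dict[str, FrozenSet[str]]


def localize(q: Query, caps: Capabilities) -> Query:
    """Wrap the largest single-site fragments of q in Remote nodes."""
    if isinstance(q, Remote):
        return q
    if isinstance(q, Source):
        return Remote(q.site, q)
    args = tuple(localize(a, caps) for a in q.args)
    # Site of each operand, or None if the operand must run at the mediator.
    sites = {a.site if isinstance(a, Remote) else None for a in args}
    if len(sites) == 1:
        (site,) = sites
        if site is not None and q.kind in caps.get(site, frozenset()):
            # All operands share one site that supports this operator:
            # absorb the operator into that site's fragment.
            return Remote(site, Op(q.kind, tuple(a.query for a in args)))
    # Otherwise the operator stays at the mediator, above its operands.
    return Op(q.kind, args)


# Assumed capability table: "sql" supports three operators, "rest" only one.
caps: Capabilities = {
    "sql": frozenset({"filter", "project", "join"}),
    "rest": frozenset({"filter"}),
}

query = Op("join", (
    Op("filter", (Source("orders", "sql"),)),
    Op("project", (Op("filter", (Source("users", "rest"),)),)),
))

plan = localize(query, caps)
```

In the resulting `plan`, the filter over `orders` is delegated to the `sql` site, the filter over `users` is delegated to the `rest` site, and the projection and the join remain at the mediator because `rest` does not declare `project` and the join spans two sites.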

