
Push versus pull-based loop fusion in query engines

Published online by Cambridge University Press: 10 April 2018

AMIR SHAIKHHA, MOHAMMAD DASHTI and CHRISTOPH KOCH
Affiliation:
École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland (e-mails: amir.shaikhha@epfl.ch, mohammad.dashti@epfl.ch, christoph.koch@epfl.ch)

Abstract

Database query engines use pull-based or push-based approaches to avoid the materialization of data across query operators. In this paper, we study these two types of query engines in depth and present the limitations and advantages of each engine. Similarly, the programming languages community has developed loop fusion techniques to remove intermediate collections in the context of collection programming. We draw parallels between database (DB) and programming language (PL) research by demonstrating the connection between pipelined query engines and loop fusion techniques. Based on this connection, we propose a new type of pull-based engine, inspired by a loop fusion technique, which combines the benefits of both approaches. Then, we experimentally evaluate the various engines, in the context of query compilation, for the first time in a fair environment, eliminating the biasing impact of ancillary optimizations that have traditionally been used with only one of the approaches. We show that for realistic analytical workloads, neither form of pipelined query engine has a considerable advantage, contrary to what recent research suggests. Also, using micro-benchmarks that demonstrate certain edge cases on which one approach or the other performs better, we show that our proposed engine dominates the existing engines by combining the benefits of both.
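
To make the distinction between the two pipelining styles concrete, the following is a minimal illustrative sketch in Python; it is not taken from the paper. It assumes a toy pipeline (scan, select, map, sum) over integers, and every name in it (`scan_pull`, `select_push`, `SumSink`, and so on) is hypothetical. In the pull-based version the consumer requests rows from its child operator; in the push-based version the producer drives the loop and hands each row to a consumer callback. Neither version builds an intermediate collection between operators.

```python
from typing import Callable, Iterable, Iterator

# --- Pull-based (iterator-style): each operator pulls rows from its child. ---

def scan_pull(rows: Iterable[int]) -> Iterator[int]:
    for row in rows:
        yield row

def select_pull(child: Iterator[int], pred: Callable[[int], bool]) -> Iterator[int]:
    for row in child:
        if pred(row):
            yield row

def map_pull(child: Iterator[int], fn: Callable[[int], int]) -> Iterator[int]:
    for row in child:
        yield fn(row)

def sum_pull(child: Iterator[int]) -> int:
    total = 0
    for row in child:  # the final consumer drives the whole pipeline
        total += row
    return total

# --- Push-based: the source drives the loop and pushes rows to consumers. ---

def scan_push(rows: Iterable[int], consume: Callable[[int], None]) -> None:
    for row in rows:  # the producer drives the whole pipeline
        consume(row)

def select_push(pred: Callable[[int], bool],
                consume: Callable[[int], None]) -> Callable[[int], None]:
    def on_row(row: int) -> None:
        if pred(row):
            consume(row)
    return on_row

def map_push(fn: Callable[[int], int],
             consume: Callable[[int], None]) -> Callable[[int], None]:
    def on_row(row: int) -> None:
        consume(fn(row))
    return on_row

class SumSink:
    """Terminal consumer that accumulates the rows pushed into it."""
    def __init__(self) -> None:
        self.total = 0

    def __call__(self, row: int) -> None:
        self.total += row

def is_even(x: int) -> bool:
    return x % 2 == 0

def times_ten(x: int) -> int:
    return x * 10

def run_pull(rows: Iterable[int]) -> int:
    return sum_pull(map_pull(select_pull(scan_pull(rows), is_even), times_ten))

def run_push(rows: Iterable[int]) -> int:
    sink = SumSink()
    scan_push(rows, select_push(is_even, map_push(times_ten, sink)))
    return sink.total
```

Both `run_pull` and `run_push` compute the same aggregate over the same input; they differ only in which end of the pipeline controls the loop, which is the design axis the abstract contrasts.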

Information

Type
Research Article
Copyright
Copyright © Cambridge University Press 2018 