
A SQL to C compiler in 500 lines of code

  • TIARK ROMPF (a1) and NADA AMIN (a2)


We present the design and implementation of a SQL query processor that outperforms existing database systems and is written in just about 500 lines of Scala code – a convincing case study that high-level functional programming can handily beat C for systems-level programming where the last drop of performance matters. The key enabler is a shift in perspective toward generative programming. The core of the query engine is an interpreter for relational-algebra operations, written in Scala. Using the open-source lightweight modular staging framework, we turn this interpreter into a query compiler with very low effort. To do so, we capitalize on an old and widely known result from partial evaluation: the first Futamura projection, which states that a process that can specialize an interpreter to any given input program is equivalent to a compiler. In this context, we discuss lightweight modular staging programming patterns such as mixed-stage data structures (e.g., data records with static schema and dynamic field components) and techniques to generate low-level C code, including specialized data structures and data loading primitives.
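The idea behind the first Futamura projection and mixed-stage records can be illustrated with a minimal sketch. It is written in Python rather than the paper's Scala with lightweight modular staging, and it emits Python source rather than C. All names here (`Scan`, `Filter`, `Project`, `interpret`, `generate`, `compile_query`) are hypothetical and are not taken from the paper. The sketch shows an interpreter for three relational-algebra operators next to a generator with the same structure, in which each record pairs a static schema (field names known at generation time) with dynamic field values (code strings evaluated later).

```python
from dataclasses import dataclass

# Hypothetical mini relational algebra; illustrative only, not the paper's code.
@dataclass
class Scan:
    table: str
    schema: tuple      # field names, known when the query is planned

@dataclass
class Filter:
    field: str
    value: object
    child: object

@dataclass
class Project:
    fields: tuple
    child: object

def interpret(op, tables, emit):
    """Plain interpreter: dispatches on the plan for every row it processes."""
    if isinstance(op, Scan):
        for row in tables[op.table]:
            emit(dict(zip(op.schema, row)))
    elif isinstance(op, Filter):
        interpret(op.child, tables,
                  lambda rec: emit(rec) if rec[op.field] == op.value else None)
    elif isinstance(op, Project):
        interpret(op.child, tables,
                  lambda rec: emit({f: rec[f] for f in op.fields}))

def generate(op, emit, lines, indent):
    """Same shape as interpret, but a record maps each static field name
    to a code string; operator dispatch happens once, at generation time."""
    pad = "    " * indent
    if isinstance(op, Scan):
        lines.append(f"{pad}for row in tables[{op.table!r}]:")
        rec = {f: f"row[{i}]" for i, f in enumerate(op.schema)}
        emit(rec, indent + 1)
    elif isinstance(op, Filter):
        def on_record(rec, ind):
            lines.append(f"{'    ' * ind}if {rec[op.field]} == {op.value!r}:")
            emit(rec, ind + 1)
        generate(op.child, on_record, lines, indent)
    elif isinstance(op, Project):
        generate(op.child,
                 lambda rec, ind: emit({f: rec[f] for f in op.fields}, ind),
                 lines, indent)

def compile_query(op):
    """Specialize the generator to one plan and return (source, function)."""
    lines = ["def query(tables, out):"]
    def emit(rec, ind):
        fields = ", ".join(rec.values())
        lines.append(f"{'    ' * ind}out.append(({fields},))")
    generate(op, emit, lines, 1)
    source = "\n".join(lines)
    namespace = {}
    exec(source, namespace)   # turn the generated text into a callable
    return source, namespace["query"]

# Made-up example data and plan: names of people whose city is "nyc".
tables = {"t": [("alice", 30, "nyc"), ("bob", 25, "sf"), ("carol", 41, "nyc")]}
plan = Project(("name",),
               Filter("city", "nyc", Scan("t", ("name", "age", "city"))))

interpreted = []
interpret(plan, tables, lambda rec: interpreted.append(tuple(rec.values())))

source, query = compile_query(plan)
compiled = []
query(tables, compiled)
print(source)
```

For this plan the generated function is a single loop containing one `if row[2] == 'nyc':` test. The operator dispatch and the field-name lookups have been resolved during generation, which is the effect that specializing an interpreter to a fixed query is meant to achieve.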




