
Fixpoint semantics and optimization of recursive Datalog programs with aggregates*



A highly desirable extension of Datalog, investigated by many researchers over the last 30 years, is to allow the basic SQL aggregates min, max, count and sum in recursive rules. In this paper, we propose a simple, comprehensive solution that extends the declarative least-fixpoint semantics of Horn Clauses, along with the optimization techniques used in the bottom-up implementation approach adopted by many Datalog systems. We start by identifying a large class of programs of great practical interest in which the use of min or max in recursive rules does not compromise the declarative fixpoint semantics of the programs using those rules. Then, we revisit the monotonic versions of the count and sum aggregates proposed by Mazuran et al. (2013b, The VLDB Journal 22, 4, 471–493) and named, respectively, mcount and msum. Since mcount, and also msum on positive numbers, are monotonic in the lattice of set-containment, they preserve the fixpoint semantics of Horn Clauses. However, in many applications of practical interest, their use can lead to inefficiencies that can be eliminated by combining them with max, whereby mcount and msum become the standard count and sum. Therefore, the semantics and optimization techniques of Datalog are extended to recursive programs with min, max, count and sum, making possible the advanced applications of superior performance and scalability demonstrated by BigDatalog (Shkapsky et al. 2016. In SIGMOD. ACM, 1135–1149) and Datalog-MC (Yang et al. 2017. The VLDB Journal 26, 2, 229–248).
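As an informal illustration of what using min inside a recursive rule can look like operationally, the following Python sketch computes shortest path costs by naive bottom-up iteration, keeping only the current minimum cost for each (source, target) pair. It is not the paper's implementation or that of any system cited here: the function name `shortest_paths`, the edge-triple representation, and the assumption of strictly positive arc costs (which guarantees termination on cyclic graphs) are all hypothetical choices made for this example.

```python
from typing import Dict, Iterable, List, Tuple

Edge = Tuple[str, str, int]  # (source, target, cost); hypothetical representation


def shortest_paths(edges: Iterable[Edge]) -> Dict[Tuple[str, str], int]:
    """Naive bottom-up fixpoint that keeps one minimum cost per (source, target).

    Assumes strictly positive costs, so the iteration terminates on cycles.
    """
    edge_list: List[Edge] = list(edges)
    best: Dict[Tuple[str, str], int] = {}

    # Exit rule: every arc is a path; min keeps the cheapest parallel arc.
    for x, y, c in edge_list:
        if c < best.get((x, y), float("inf")):
            best[(x, y)] = c

    # Recursive rule: extend a known path by one arc, keeping only improvements.
    changed = True
    while changed:
        changed = False
        for (x, z), c1 in list(best.items()):
            for z2, y, c2 in edge_list:
                if z2 != z:
                    continue
                cost = c1 + c2
                if cost < best.get((x, y), float("inf")):
                    best[(x, y)] = cost
                    changed = True
    return best


if __name__ == "__main__":
    demo = [("a", "b", 1), ("b", "c", 2), ("a", "c", 5), ("c", "a", 1)]
    for pair, cost in sorted(shortest_paths(demo).items()):
        print(pair, cost)
```

Discarding non-minimal costs at every step is what keeps the computation finite on the cyclic demo graph. Enumerating all path costs first and applying min only at the end would not terminate there.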




Work done while at UCLA.


This work was supported in part by NSF grants IIS-1218471, IIS-1302698 and CNS-1351047 and U54EB020404 awarded by NIH Big Data to Knowledge (BD2K).



Aref, M. et al. 2015. Design and implementation of the LogicBlox system. In Proc. of SIGMOD. ACM, 1371–1382.
Arni, F., Ong, K., Tsur, S., Wang, H. and Zaniolo, C. 2003. The deductive database system LDL++. Theory and Practice of Logic Programming 3, 1, 61–94.
Chimenti, D., O'Hare, A. B., Krishnamurthy, R., Tsur, S., West, C. and Zaniolo, C. 1987. An overview of the LDL system. IEEE Data Engineering Bulletin 10, 4, 52–62.
Condie, T., et al. 2017. Advanced Applications by Least-Fixpoint Algorithms Specified using Aggregates in Datalog. Technical Report 170012, UCLA CSD.
Faber, W., Pfeifer, G. and Leone, N. 2011. Semantics and complexity of recursive aggregates in answer set programming. Artificial Intelligence 175, 1, 278–298.
Furfaro, F., Greco, S., Ganguly, S. and Zaniolo, C. 2002. Pushing extrema aggregates to optimize logic queries. Information Systems 27, 5 (July), 321–343.
Ganguly, S., Greco, S. and Zaniolo, C. 1995. Extrema predicates in deductive databases. Journal of Computer and System Sciences 51, 2, 244–259.
Gelfond, M. and Zhang, Y. 2014. Vicious circle principle and logic programs with aggregates. Theory and Practice of Logic Programming 14, 4–5, 587–601.
Greco, S., Zaniolo, C. and Ganguly, S. 1992. Greedy by choice. In Proc. of PODS. ACM, 105–113.
Kemp, D., Meenakshi, K., Balbin, I. and Ramamohanarao, K. 1989. Propagating constraints in recursive deductive databases. In North American Conference on Logic Programming, 981–998.
Kemp, D. B. and Stuckey, P. J. 1991. Semantics of logic programs with aggregates. In Proc. of ISLP, 387–401.
Liu, Y. A., Stoller, S. D., Lin, B. and Gorbovitski, M. 2012. From clarity to efficiency for distributed algorithms. SIGPLAN Notices 47, 10 (October), 395–410.
Mazuran, M., Serra, E. and Zaniolo, C. 2013a. A declarative extension of Horn clauses, and its significance for Datalog and its applications. Theory and Practice of Logic Programming 13, 4–5, 609–623.
Mazuran, M., Serra, E. and Zaniolo, C. 2013b. Extending the power of Datalog recursion. The VLDB Journal 22, 4, 471–493.
Morris, K. A., Ullman, J. D. and Gelder, A. V. 1986. Design overview of the NAIL! system. In Proc. of ICLP, 554–568.
Mumick, I. S., Pirahesh, H. and Ramakrishnan, R. 1990. The magic of duplicates and aggregates. In VLDB. Morgan Kaufmann Publishers Inc., 264–277.
Mumick, I. S. and Shmueli, O. 1995. How expressive is stratified aggregation? Annals of Mathematics and Artificial Intelligence 15, 3–4, 407–435.
Pelov, N., Denecker, M. and Bruynooghe, M. 2007. Well-founded and stable semantics of logic programs with aggregates. Theory and Practice of Logic Programming 7, 3, 301–353.
Przymusinski, T. C. 1988. Perfect model semantics. In Proc. of ICLP/SLP, 1081–1096.
Ramakrishnan, R., Srivastava, D. and Sudarshan, S. 1992. CORAL - control, relations and logic. In Proc. of PVLDB, 238–250.
Ross, K. A. and Sagiv, Y. 1992. Monotonic aggregation in deductive databases. In Proc. of PODS, 114–126.
Seo, J., Park, J., Shin, J. and Lam, M. S. 2013. Distributed socialite: a Datalog-based language for large-scale graph analysis. Proceedings of the VLDB Endowment 6, 14, 1906–1917.
Shkapsky, A., Yang, M., Interlandi, M., Chiu, H., Condie, T. and Zaniolo, C. 2016. Big data analytics with Datalog queries on Spark. In Proc. of SIGMOD. ACM, 1135–1149.
Shkapsky, A., Yang, M. and Zaniolo, C. 2015. Optimizing recursive queries with monotonic aggregates in DeALS. In Proc. of ICDE. IEEE, 867–878.
Shkapsky, A., Zeng, K. and Zaniolo, C. 2013. Graph queries in a next-generation Datalog system. Proceedings of the VLDB Endowment 6, 12, 1258–1261.
Son, T. C. and Pontelli, E. 2007. A constructive semantic characterization of aggregates in answer set programming. Theory and Practice of Logic Programming 7, 3, 355–375.
Srivastava, D. and Ramakrishnan, R. 1992. Pushing constraint selections. In Journal of Logic Programming, 301–315.
Sudarshan, S. and Ramakrishnan, R. 1991. Aggregation and relevance in deductive databases. In Proc. of VLDB, 501–511.
Swift, T. and Warren, D. S. 2010. Tabling with answer subsumption: Implementation, applications and performance. In Proc. of JELIA, 300–312.
Vaghani, J., Ramamohanarao, K., Kemp, D. B., Somogyi, Z., Stuckey, P. J., Leask, T. S. and Harland, J. 1994. The Aditi deductive database system. VLDB Journal 3, 2, 245–288.
van Emden, M. H. and Kowalski, R. A. 1976. The semantics of predicate logic as a programming language. Journal of the ACM 23, 4, 733–742.
Van Gelder, A. 1993. Foundations of aggregation in deductive databases. In Deductive and Object-Oriented Databases. Springer, 13–34.
Wang, J., Balazinska, M. and Halperin, D. 2015. Asynchronous and fault-tolerant recursive Datalog evaluation in shared-nothing engines. Proceedings of the VLDB Endowment 8, 12, 1542–1553.
Yang, M., Shkapsky, A. and Zaniolo, C. 2015. Parallel bottom-up evaluation of logic programs: DeALS on shared-memory multicore machines. In Technical Communications of ICLP.
Yang, M., Shkapsky, A. and Zaniolo, C. 2017. Scaling up the performance of more powerful Datalog systems on multicore machines. The VLDB Journal 26, 2, 229–248.
Yang, M. and Zaniolo, C. 2014. Main memory evaluation of recursive queries on multicore machines. In Proc. of IEEE Big Data, 251–260.
Zaniolo, C., Ceri, S., Faloutsos, C., Snodgrass, R. T., Subrahmanian, V. S. and Zicari, R. 1997. Advanced Database Systems. Morgan Kaufmann.
Zaniolo, C., Yang, M., Das, A. and Interlandi, M. 2016. The magic of pushing extrema into recursion: Simple, powerful Datalog programs. In Proc. of AMW.
Zhou, N.-F., Barták, R. and Dovier, A. 2015. Planning as tabled logic programming. Theory and Practice of Logic Programming 15, 4–5, 543–558.
Zhou, N.-F., Kameya, Y. and Sato, T. 2010. Mode-directed tabling for dynamic programming, machine learning, and constraint solving. In Proc. of ICTAI '10. Washington, DC, USA, 213–218.
Theory and Practice of Logic Programming
  • ISSN: 1471-0684
  • EISSN: 1475-3081