
Inductive synthesis of structurally recursive functional programs from non-recursive expressions

Part of: POPL 23

Published online by Cambridge University Press: 15 August 2025

HANGYEOL CHO
Affiliation:
Hanyang University, Department of Computer Science & Engineering, South Korea (e-mail: pigon8@hanyang.ac.kr)
WOOSUK LEE
Affiliation:
Hanyang University, Department of Computer Science & Engineering, South Korea (e-mail: woosuk@hanyang.ac.kr)

Abstract

We present a novel approach to synthesizing recursive functional programs from input–output examples. Synthesizing a recursive function is challenging because recursive subexpressions must be constructed before the target function has been fully defined. We address this challenge with a new technique we call block-based pruning. A block is a recursion- and conditional-free expression (i.e., straight-line code) that yields an output from a particular input. We first synthesize as many blocks as possible for each input–output example, and then explore the space of recursive programs, pruning candidates that are inconsistent with the blocks. Our method is based on efficient version space learning, which lets it handle a possibly enormous number of blocks. In addition, we present a method that uses sampled input–output behaviors of library functions to enable a goal-directed search for a recursive program that uses the library. We have implemented our approach in a system called Trio and evaluated it on synthesis tasks from prior work and on new tasks. Our experiments show that Trio significantly outperforms prior work.
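
The idea of block-based pruning described in the abstract can be illustrated with a toy sketch. The code below is not Trio's implementation: the component set (`zero`, `one`, `add`), the depth bound, the fixed list-recursion template, and all function names are assumptions made for illustration. It enumerates blocks explicitly, whereas the abstract states that the actual method relies on version space learning to represent them. The sketch collects the straight-line expressions that produce the output of one example, unfolds two partial candidates (each with a hole in the base case) on the example input, and discards the candidate whose unfolding matches no block.

```python
# Toy illustration of block-based pruning (a simplification, not Trio itself).
# Expressions are nested tuples over a hypothetical component set:
#   ("zero",), ("one",), ("add", e1, e2), plus HOLE for unfinished parts.

HOLE = ("hole",)


def evaluate(expr):
    """Evaluate a closed (hole-free) straight-line expression to an int."""
    tag = expr[0]
    if tag == "zero":
        return 0
    if tag == "one":
        return 1
    if tag == "add":
        return evaluate(expr[1]) + evaluate(expr[2])
    raise ValueError(f"cannot evaluate {expr!r}")


def enumerate_exprs(depth):
    """All closed expressions whose nesting depth is at most `depth`."""
    exprs = {("zero",), ("one",)}
    for _ in range(depth):
        exprs = exprs | {("add", a, b) for a in exprs for b in exprs}
    return exprs


def blocks_for(output, depth=3):
    """Blocks for one example: expressions that evaluate to its output."""
    return {e for e in enumerate_exprs(depth) if evaluate(e) == output}


def matches(open_block, block):
    """True if `block` is `open_block` with every hole filled in somehow."""
    if open_block == HOLE:
        return True
    if open_block[0] != block[0] or len(open_block) != len(block):
        return False
    return all(matches(o, b) for o, b in zip(open_block[1:], block[1:]))


def unfold(base, step, xs):
    """Unfold `rec f(x) = match x with [] -> base | _ :: t -> step(f t)` on xs."""
    result = base
    for _ in xs:
        result = step(result)
    return result


def consistent(open_block, blocks):
    """A partial candidate survives if its unfolding matches some block."""
    return any(matches(open_block, b) for b in blocks)


# One input-output example for list length: length [7; 8] = 2.
example_input, example_output = [7, 8], 2
blocks = blocks_for(example_output)

# Candidate A: [] -> HOLE | _ :: t -> 1 + f t
open_a = unfold(HOLE, lambda r: ("add", ("one",), r), example_input)
# Candidate B: [] -> HOLE | _ :: t -> (1 + 1) + f t
open_b = unfold(
    HOLE, lambda r: ("add", ("add", ("one",), ("one",)), r), example_input
)

keep_a = consistent(open_a, blocks)  # matches the block 1 + (1 + 0)
keep_b = consistent(open_b, blocks)  # every completion evaluates to at least 4
print(keep_a, keep_b)
```

In this sketch candidate B is rejected before its base case is ever filled in, because no way of completing the hole yields an expression that evaluates to 2. That is the sense in which the blocks prune the search over partial recursive programs.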

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press

Fig. 1. High-level architecture of our synthesis algorithm.

Fig. 2. Our ML-like language.

Algorithm 1. The Trio algorithm.

Fig. 3. Inference rules for Deduce.

Fig. 4. Inference rules for BlockGen.

Fig. 5. Rules for unfolding (symbolic evaluation interleaved with concrete evaluation) for deriving open blocks from $P=\textsf{rec}\ \texttt{f}(\texttt{x}) = e_{\textrm{body}}$ with input $i$.

Fig. 6. Matching rules for checking block consistency.

Fig. 7. Termination checking procedure.

Table 1. List of 20 new benchmarks collected from the exercises in the official OCaml online tutorial (https://ocaml.org/exercises) and their variants.

Table 2. Results for the IO benchmark suite (with 15 easy problems omitted), where “Time” gives synthesis time in seconds, and “Size” shows the size of the synthesized program (measured by number of AST nodes). Synthesis time of the fastest tool for each problem is highlighted in bold.

Table 3. Results for the Ref benchmark suite where “# Iter” shows the number of CEGIS iterations.

Table 4. Number of instances among the 20 newly added benchmarks that can be solved by each of four variants of Trio.

Fig. 8. Comparison of different variants of Trio.

Table 5. Comparison of Trio and SyRup on the 43 benchmarks with random input–output examples. Each row represents the results for a different number of examples. “Succ. Rate” gives the success rate of each tool. “Avg Time” shows the average synthesis time for successful trials. “# T/O” denotes the number of time-outs.

Fig. 9. Success rates of Trio and SyRup for 12 chosen benchmarks for different numbers of examples (1–8). The x-axis label indicates the number of examples, and the y-axis label indicates the success rate. The plots for the other 31 benchmarks are available in the appendix.

Table 6. Results for the 15 easy problems in the IO benchmark suite, where “Time” gives synthesis time in seconds, and “Size” shows the size of the synthesized program (measured by number of AST nodes). Synthesis time of the fastest tool for each problem is highlighted in bold.

Table 7. Results for the 15 easy problems in the Ref benchmark suite, where “# Iter” shows the number of CEGIS iterations.

Fig. 10. Full results for Fig. 9. The x-axis represents the number of examples, and the y-axis represents the success rate. An empty plot indicates that both tools failed to synthesize a program within the time limit.
