05/2020 Tweet
Inductive program synthesis in Javascript
Any compiler or transpiler is doing some form of program synthesis. But “proper” program synthesis usually means generation of high level programs from scratch, given only high level problem descriptions.
This post is about one of the most simple and straightforward methods of automatic program synthesis: inductive synthesis based on input-output examples specification and brute-force enumerative search.
Synthesis task specification
Let's start with a simple example. Suppose that we want to automatically generate a program that reverses an array.
How to formulate this task for the automatic program generator? One option is to just describe it in natural language. But it would be quite hard to automatically parse and understand. A simpler way is to provide examples of inputs and expected outputs:
[1, 2, 3, 4, 5, 6] → [6, 5, 4, 3, 2, 1]
[5, 2, 7, 6] → [6, 7, 2, 5]
It's similar to a programmer working on a task using test-driven development, with a set of self-explanatory tescases.
Brute-force search
How to find a program that satisfies some input-output examples? There are several possible search strategies, but the simplest one is just with a brute force enumeration.
The idea is to generate all programs (within some size liimts and a limited language subset), and pick the first one that works correctly for all specified examples.
This is probably the least efficient approach of all, but with some additional optimizations and heuristics for reducing the search space it can actually be practical.
For example, an algorithm that won last year's synthesis competition SyGuS-Comp 2019 in programming-by-examples track, is based on brute-force enumerative search strategy. As well as some previous years winners (EUSOLVER, Enumerative CEGIS Solver). There are even some practical applications, for example brute-force search can be used for synthesising distributed systems protocols.
Generating Javascript
For generating and manipulating programs, code has to be represented as a data structure. This is not a problem in homoiconic languages like Lisp, but in Javascript it’s harder.
One simple idea (used, for example in miniMAL project https://github.com/kanaka/miniMAL), is to use JSON for representing programs. In this case, JSON arrays are serving as an analog of Lisp s-expressions.
Similar to an s-expression, each JavaScript expression is represented as a JSON array. First element is a function name, and the rest of elements are for function arguments.
For example:
Javascript | JSON |
Math.pow(x, 2) | ["Math.pow", "x", 2] |
Math.pow(Math.sin(x), 2) | [“Math.pow”, [“Math.sin”, “x”], 2] |
For operators, can use prefix notation:
x = 2 + 2 | ["=", x, ["+", 2, 2]] |
There are some special expressions formats: control flow, variables declaration.
Javascript | JSON |
let x = 1 | ["let", "x", 1] |
for (let i = 0; i < 10; i += 1) { console.log(i) } | ["for”, ”i”, ”0”, ”10”, ”1”,["block", [“console.log”, “i”]]] |
if (i > 1) { i = 1; } | ["if", [“>“, “i”, 1], ["block", [“=“, “i”, 1]]] |
Also need syntax conventions for calling methods, accessing properties, and indexing arrays:
let a = new Array(0, 1, 2) | ["let", "a", ["new Array", 0, 1, 2]] |
a.shift() | [".shift", "a"] |
let n1 = a.length | ["let", "n1", [".length_", "a"]] |
let n2 = a[1] | ["let", "n2", ["get_index", a, 1] |
a[1] = 2 | ["set_index", a, 0, 0] |
This allows to represent a wide range of possible JavaScript expressions, and covers all language features needed for a simple demo.
Code for transforming from JSON representation to Javascript is fairly simple. It just has to traverse arrays, print out expressions elements in correct order, and deal with code formatting.
Simple DSL
For brute-force search to be feasible, search space has to be limited to some restricted domain-specific language subset, which depends on the kind of the target problem.
For example, can design a simple DSL targeted for simple array processing. It should have functions for array access and manipulation, simple arithmetics, and control flow functions.
function | Javascript | JSON |
subtract | number = number?? - number?? | ["-", "number??", "number??", "number"] |
add | number = number?? + number?? | ["+", "number??", "number??", "number"] |
greater than | boolean = number?? > number?? | [">", "number??", "number??", "boolean"] |
less than | boolean = number?? < number?? | ["<", "number??", "number??", "boolean"] |
equal | boolean = number?? == number?? | ["==", "number??", "number??", "boolean"] |
array length | number = array??.length | [".length_", "array??", "number"] |
get array element at index | number = array??[number??] | ["get_index", "array??", "number??", "number"] |
set array element at index | array??[number??] = number | ["set_index", "output", "number??", "number??", ""] |
for loop | for (let i in array??) { ?? } | ["for", "i", ["block", "??"], "in", "array??", ""] |
If condition | if (boolean??) { ?? } | ["if", "boolean??", ["block", "??"], ""], |
In the DSL description, "??" stands for an undefined value, that can be replaced with an expression or a variable. This notation is taken from SKETCH language, where "??" represents a hole, a placeholder that the synthesizer can replace with an integer constant.
Some undefined values have annotated types. For example "number??" means that the argument should be replaced with an expression that has "number" return value. This allows for additional limitations on the search space.
Algorithm for program generation
How to generate all programs that can be represented with the DSL? The algorithm is fairly simple: start with some undefined values, and iteratively replace them with all possible options picked from the list of DSL expressions. Selected and pasted expressions would contain some undefined values of their own, so the process is repeated until all the options are used, or until expressions tree depth reaches some limit.
It’s like iterative template rewriting, and it’s easier to demonstrate with an example:
1. Start with an empty/undefined program | ["??", "??", "??", "??"] |
2. Traverse the expressions tree, and pick the first occurring undefined value | ["??", "??", "??", "??"] |
3. List all options for replacing this value (expressions that have an empty return type) | ["set_index", "output", "number??", "number??", ""] ["if", "boolean??", ["block", "??"], ""] ["if", "boolean??", ["block", "??"], ""] |
4. Repace undefined value with the first option | [["set_index", "output", "number??", "number??", ""], "??", "??" ,"??"] |
Steps 2-4 are repeated iteratively. Expressions tree is traversed depth-first, and each found leaf corresponds to a unique program. Press “next” button in the demo below, to see next iterations.
Generated programs are verified with the given input-output examples:
- Translate from JSON represenation to Javascript
- Fill pre-defined input array variable with an example
- Evaluate the program with eval()
- Check if contents of the pre-defined output variable correspond to the output example.
The demo below shows program generation for "array reversal" problem. Press “generate” button to see continuous generation, or use “next” to see generation steps. It generates and checks around a million of short programs per minute, and is able so solve the problem in ~10-20s. I tweaked the DSL to have a bare minimum of available expressions, for search to be faster. Size of the program is also limited by number of used varialbes (limited to 4). It would still work with larger more generic DSL, just would take more time.
Some more examples are show below. They also have custom DSLs tuned for each problem, so that it does not take too much time (~10-20s).
Example: generate a program that finds sum of array elements
Example: generate a program that finds max element in an array
Refenences & links
(book) Program Synthesis
Sumit Gulwani, Oleksandr Polozov, Rishabh Singh
(course materials) Introduction to Program Synthesis
http://people.csail.mit.edu/asolar/SynthesisCourse/Lecture2.htm
Papers
Dimensions in Program Synthesis
Sumit Gulwani
Inductive Programming (A Survey of Program Synthesis Techniques)
Emanuel Kitzelmann
↑
Scaling Enumerative Program Synthesis via Divide and Conquer
Rajeev Alur, Arjun Radhakrishna, and Abhishek Udupa
CVC4SY for SyGuS-COMP 2019
Andrew Reynolds, Haniel Barbosa, Andres Nötzli, Clark Barrett, Cesare Tinelli
↑
The Sketching Approach to Program Synthesis
Armando Solar-Lezama
Blog posts
Software that writes software: on program synthesis
Adithya Ganesh
A Program Synthesis Primer
Aws Albarghouthi
Program Synthesis in 2017-18
Alex Polozov
Program Synthesis is Possible
Adrian Sampson
Homoiconicity, Lisp, and Program Synthesis
Rajesh Jayaprakash
Program Synthesis in 2019
Program Synthesis Explained
Building a Program Synthesizer
Building Your First Program Synthesizer
James Bornholt