Leonardo de Moura

Universal Policies for Software-Defined MDPs

Dec 21, 2020

Daniel Selsam, Jesse Michael Han, Leonardo de Moura, Patrice Godefroid

Abstract: We introduce a new programming paradigm called oracle-guided decision programming, in which a program specifies a Markov Decision Process (MDP) and the language provides a universal policy. We prototype a new programming language, Dodona, that manifests this paradigm using a primitive 'choose' representing nondeterministic choice. The Dodona interpreter returns either a value or a choicepoint that includes a lossless encoding of all information necessary in principle to make an optimal decision. Meta-interpreters query Dodona's (neural) oracle on these choicepoints to get policy and value estimates, which they can use to perform heuristic search on the underlying MDP. We demonstrate Dodona's potential for zero-shot heuristic guidance by meta-learning over hundreds of synthetic tasks that simulate basic operations over lists, trees, Church data structures, polynomials, first-order terms, and higher-order terms.
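
The interpreter/meta-interpreter split described in the abstract can be illustrated with a small Python sketch. This is not Dodona and does not reflect its syntax or its oracle: the generator-based `choose`, the names `subset_sum_program`, `run`, `uniform_oracle`, and `search`, and the subset-sum task are all hypothetical stand-ins chosen for illustration. The sketch only shows the general shape: a program yields choicepoints, an interpreter returns either a value or a choicepoint, and a search loop asks an oracle to rank the options.

```python
from typing import Callable, Generator, List, Optional, Sequence, Tuple, Union

# Hypothetical model: a program is a generator that yields a list of options
# at each `choose` and receives the selected option back via send().
Program = Callable[[], Generator[Sequence[int], int, bool]]
Oracle = Callable[[List[int], List[int]], List[float]]


def subset_sum_program(items: Sequence[int], target: int) -> Program:
    """Toy decision program: pick a subset of `items` summing to `target`."""
    def program() -> Generator[Sequence[int], int, bool]:
        total = 0
        for item in items:
            take = yield [0, 1]  # stand-in for a nondeterministic `choose`
            if take:
                total += item
        return total == target   # success flag of this run
    return program


def run(program: Program, decisions: List[int]) -> Tuple[str, Union[bool, List[int]]]:
    """Replay `decisions`; return ('value', v) or ('choicepoint', options)."""
    gen = program()
    try:
        options = next(gen)
        for decision in decisions:
            options = gen.send(decision)
    except StopIteration as stop:
        return ("value", stop.value)
    return ("choicepoint", list(options))


def uniform_oracle(decisions: List[int], options: List[int]) -> List[float]:
    """Placeholder policy (assumption): a learned oracle would go here."""
    return [1.0 / len(options)] * len(options)


def search(program: Program, oracle: Oracle) -> Optional[List[int]]:
    """Depth-first search over choicepoints, trying higher-scored options first."""
    stack: List[List[int]] = [[]]
    while stack:
        decisions = stack.pop()
        kind, payload = run(program, decisions)
        if kind == "value":
            if payload:
                return decisions
            continue
        scores = oracle(decisions, payload)
        # Sort ascending so the best-scored option is pushed last and popped first.
        for _, option in sorted(zip(scores, payload)):
            stack.append(decisions + [option])
    return None


items = [3, 5, 8]
solution = search(subset_sum_program(items, 11), uniform_oracle)
```

Here the search replays the program from scratch for every decision prefix, which keeps the sketch short; a real system would presumably resume from the choicepoint instead. Swapping `uniform_oracle` for a better-informed scoring function changes only the order in which branches are explored, not the program itself.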

Learning a SAT Solver from Single-Bit Supervision

Feb 13, 2018

Daniel Selsam, Matthew Lamm, Benedikt Bünz, Percy Liang, Leonardo de Moura, David L. Dill

Abstract: We present NeuroSAT, a message passing neural network that learns to solve SAT problems after only being trained as a classifier to predict satisfiability. Although it is not competitive with state-of-the-art SAT solvers, NeuroSAT can solve problems that are substantially larger and more difficult than it ever saw during training by simply running for more iterations. Moreover, NeuroSAT generalizes to novel distributions; after training only on random SAT problems, at test time it can solve SAT problems encoding graph coloring, clique detection, dominating set, and vertex cover problems, all on a range of distributions over small random graphs.
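
To make "message passing" and "running for more iterations" concrete, the following Python sketch passes messages between the literals and clauses of a CNF formula for a configurable number of rounds. It is only an illustration under stated assumptions: the update rule is a fixed, untrained `tanh` of summed neighbor states on scalars, whereas the abstract describes a learned network; the function names and the DIMACS-style signed-integer clause encoding are choices made here, not taken from the abstract.

```python
import math
from typing import List, Tuple


def literal_index(lit: int, num_vars: int) -> int:
    """Map a signed literal (e.g. 2 or -2) to an index in [0, 2 * num_vars)."""
    return lit - 1 if lit > 0 else num_vars + (-lit) - 1


def build_edges(clauses: List[List[int]], num_vars: int) -> List[Tuple[int, int]]:
    """One (literal_index, clause_index) edge per literal occurrence."""
    return [
        (literal_index(lit, num_vars), c)
        for c, clause in enumerate(clauses)
        for lit in clause
    ]


def message_passing(clauses: List[List[int]], num_vars: int, rounds: int) -> List[float]:
    """Run `rounds` rounds of untrained scalar message passing; return literal states."""
    edges = build_edges(clauses, num_vars)
    num_lits = 2 * num_vars
    lit_state = [1.0] * num_lits
    for _ in range(rounds):
        # Clauses aggregate the states of the literals they contain.
        clause_in = [0.0] * len(clauses)
        for l, c in edges:
            clause_in[c] += lit_state[l]
        clause_state = [math.tanh(x) for x in clause_in]
        # Literals aggregate the states of the clauses they appear in,
        # plus the state of their own negation.
        lit_in = [0.0] * num_lits
        for l, c in edges:
            lit_in[l] += clause_state[c]
        lit_state = [
            math.tanh(lit_in[l] + lit_state[(l + num_vars) % num_lits])
            for l in range(num_lits)
        ]
    return lit_state


# (x1 or not x2) and (x2 or x3) and (not x1 or not x3)
clauses = [[1, -2], [2, 3], [-1, -3]]
edges = build_edges(clauses, 3)
states = message_passing(clauses, 3, rounds=5)
```

Because `rounds` is an argument rather than part of the graph or the update rule, the same procedure can be run longer on larger formulas, which is the property the abstract relies on at test time.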
