Title: Synthesizing Programmatic Policies with Actor-Critic Algorithms and ReLU Networks

Aug 04, 2023

Spyros Orfanos, Levi H. S. Lelis

Abstract: Programmatically Interpretable Reinforcement Learning (PIRL) encodes policies in human-readable computer programs. Novel algorithms were recently introduced with the goal of handling the lack of gradient signal to guide the search in the space of programmatic policies. Most such PIRL algorithms first train a neural policy that is used as an oracle to guide the search in the programmatic space. In this paper, we show that such PIRL-specific algorithms are not needed, depending on the language used to encode the programmatic policies. This is because one can use actor-critic algorithms to directly obtain a programmatic policy. We use a connection between ReLU neural networks and oblique decision trees to translate the policy learned with actor-critic algorithms into programmatic policies. This translation from ReLU networks allows us to synthesize policies encoded in programs with if-then-else structures, linear transformations of the input values, and PID operations. Empirical results on several control problems show that this translation approach is capable of learning short and effective policies. Moreover, the translated policies are at least competitive and often far superior to the policies PIRL algorithms synthesize.
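The connection between ReLU networks and oblique decision trees mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the weights `W1`, `B1`, `W2`, `B2` and all function names are hypothetical, and the network is assumed to have a single hidden layer. Each hidden neuron contributes one oblique test of the form w·x + b > 0, and for any fixed on/off pattern of the neurons the network reduces to a linear function of the input, which becomes a leaf of the tree.

```python
# Hypothetical weights for a tiny policy network:
# 2 inputs, 2 hidden ReLU neurons, 1 output (all values are illustrative).
W1 = [[1.0, -1.0], [0.5, 2.0]]  # one row of input weights per hidden neuron
B1 = [0.0, -1.0]                # hidden biases
W2 = [2.0, -3.0]                # output weights, one per hidden neuron
B2 = 0.5                        # output bias


def pre_activations(x):
    """Value w.x + b of each hidden neuron before the ReLU is applied."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W1, B1)]


def relu_policy(x):
    """Ordinary forward pass of the one-hidden-layer ReLU network."""
    hidden = [max(0.0, z) for z in pre_activations(x)]
    return sum(v * h for v, h in zip(W2, hidden)) + B2


def leaf_for_pattern(pattern):
    """Linear function (coefficients, intercept) that the network computes
    when the hidden neurons' on/off states equal `pattern`."""
    coeffs = [0.0] * len(W1[0])
    intercept = B2
    for active, row, b, v in zip(pattern, W1, B1, W2):
        if active:  # inactive neurons output 0 and contribute nothing
            coeffs = [c + v * w for c, w in zip(coeffs, row)]
            intercept += v * b
    return coeffs, intercept


def tree_policy(x):
    """Same policy written as an oblique decision tree: the tests
    w.x + b > 0 select a leaf, and the leaf is a linear function of x."""
    pattern = tuple(z > 0 for z in pre_activations(x))
    coeffs, intercept = leaf_for_pattern(pattern)
    return sum(c * xi for c, xi in zip(coeffs, x)) + intercept


if __name__ == "__main__":
    for x in ([1.0, 2.0], [3.0, -1.0], [-2.0, -2.0], [0.0, 0.0]):
        print(x, relu_policy(x), tree_policy(x))
```

Under these assumptions the two functions agree on every input, so the tree form is an exact rewrite of the network as if-then-else tests over linear transformations of the input, not an approximation. With n hidden neurons there are up to 2^n activation patterns, which suggests why small networks matter if the resulting program is to stay short.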
