Dweep Trivedi

Learning to Synthesize Programs as Interpretable and Generalizable Policies

Aug 31, 2021

Dweep Trivedi, Jesse Zhang, Shao-Hua Sun, Joseph J. Lim

Abstract: Recently, deep reinforcement learning (DRL) methods have achieved impressive performance on tasks in a variety of domains. However, neural network policies produced with DRL methods are not human-interpretable and often have difficulty generalizing to novel scenarios. To address these issues, prior works explore learning programmatic policies that are more interpretable and structured for generalization. Yet, these works either employ limited policy representations (e.g. decision trees, state machines, or predefined program templates) or require stronger supervision (e.g. input/output state pairs or expert demonstrations). We present a framework that instead learns to synthesize a program, which details the procedure to solve a task in a flexible and expressive manner, solely from reward signals. To alleviate the difficulty of learning to compose programs to induce the desired agent behavior from scratch, we propose to first learn a program embedding space that continuously parameterizes diverse behaviors in an unsupervised manner and then search over the learned program embedding space to yield a program that maximizes the return for a given task. Experimental results demonstrate that the proposed framework not only learns to reliably synthesize task-solving programs but also outperforms DRL and program synthesis baselines while producing interpretable and more generalizable policies. We also justify the necessity of the proposed two-stage learning scheme as well as analyze various methods for learning the program embedding.

* 52 pages, 16 figures, 12 tables
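
The second stage described in the abstract, searching a learned continuous embedding space for the point whose decoded program maximizes return, can be illustrated with a small sketch. The abstract does not say which search method is used; the cross-entropy method below is an assumption chosen because it is a common gradient-free search over continuous spaces. The names `cross_entropy_search` and `toy_return` are hypothetical, and `toy_return` merely stands in for "decode the embedding into a program, execute it, and measure the return".

```python
import random


def cross_entropy_search(return_fn, dim, iterations=30, population=64,
                         elite_frac=0.125, init_std=1.0, seed=0):
    """Gradient-free search over a continuous embedding space (sketch).

    return_fn is a hypothetical stand-in: in the setting of the abstract it
    would decode an embedding into a program, run it, and report the return.
    """
    rng = random.Random(seed)
    mean = [0.0] * dim
    std = [init_std] * dim
    n_elite = max(2, int(population * elite_frac))
    best_z, best_ret = None, float("-inf")

    for _ in range(iterations):
        # Sample candidate embeddings from the current diagonal Gaussian.
        candidates = [[rng.gauss(m, s) for m, s in zip(mean, std)]
                      for _ in range(population)]
        # Score every candidate and sort from highest to lowest return.
        scored = sorted(((return_fn(z), z) for z in candidates),
                        key=lambda pair: pair[0], reverse=True)
        if scored[0][0] > best_ret:
            best_ret, best_z = scored[0]
        # Refit the Gaussian to the top-scoring (elite) candidates.
        elites = [z for _, z in scored[:n_elite]]
        mean = [sum(z[i] for z in elites) / n_elite for i in range(dim)]
        std = [max(1e-3, (sum((z[i] - mean[i]) ** 2 for z in elites)
                          / n_elite) ** 0.5)
               for i in range(dim)]

    return best_z, best_ret


# Toy stand-in for the task return: highest (zero) at a hidden target point.
TARGET = [0.5, -0.3, 0.8]


def toy_return(z):
    return -sum((a - b) ** 2 for a, b in zip(z, TARGET))


best_z, best_ret = cross_entropy_search(toy_return, dim=3)
print("best embedding:", [round(v, 3) for v in best_z])
print("best return:", round(best_ret, 5))
```

In this sketch only the return signal guides the search, which mirrors the "solely from reward signals" setting; the first stage (learning the embedding space itself) is not modeled here.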

Multi-agent Trajectory Prediction with Fuzzy Query Attention

Oct 29, 2020

Nitin Kamra, Hao Zhu, Dweep Trivedi, Ming Zhang, Yan Liu

Abstract: Trajectory prediction for scenes with multiple agents and entities is a challenging problem in numerous domains such as traffic prediction, pedestrian tracking and path planning. We present a general architecture to address this challenge which models the crucial inductive biases of motion, namely, inertia, relative motion, intents and interactions. Specifically, we propose a relational model to flexibly model interactions between agents in diverse environments. Since it is well known that human decision making is fuzzy by nature, at the core of our model lies a novel attention mechanism which models interactions by making continuous-valued (fuzzy) decisions and learning the corresponding responses. Our architecture demonstrates significant performance gains over existing state-of-the-art predictive models in diverse domains such as human crowd trajectories, US freeway traffic, NBA sports data and physics datasets. We also present ablations and augmentations to understand the decision-making process and the source of gains in our model.

* NeurIPS 2020 Camera-ready version. Code: https://github.com/nitinkamra1992/FQA
