Abstract:Inspired by recent work in attention models for image captioning and question answering, we present a soft attention model for the reinforcement learning domain. This model uses a soft, top-down attention mechanism to create a bottleneck in the agent, forcing it to focus on task-relevant information by sequentially querying its view of the environment. The output of the attention mechanism allows direct observation of the information used by the agent to select its actions, enabling easier interpretation of this model than of traditional models. We analyze different strategies that the agents learn and show that a handful of strategies arise repeatedly across different games. We also show that the model learns to query separately about space and content ('where' vs. 'what'). We demonstrate that an agent using this mechanism can achieve performance competitive with state-of-the-art models on ATARI tasks while still being interpretable.
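A minimal numpy sketch of this kind of soft attention read, assuming a single query vector and a grid of keys/values extracted from the agent's view; the names and dimensions are illustrative placeholders, not the paper's exact parameterization:

    import numpy as np

    def soft_attention_read(query, keys, values):
        # soft top-down read: weight each spatial location by how well it matches the query
        scores = keys @ query / np.sqrt(query.shape[-1])   # (num_locations,)
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()                            # softmax over locations
        answer = weights @ values                           # weighted sum of location features
        return answer, weights                              # the weights are directly inspectable

    # toy example: 64 spatial locations with 32-dim features
    rng = np.random.default_rng(0)
    keys = rng.normal(size=(64, 32))
    values = rng.normal(size=(64, 32))
    query = rng.normal(size=(32,))
    answer, weights = soft_attention_read(query, keys, values)

Because the attention weights form an explicit distribution over locations, they can be visualized directly, which is what makes the agent's information use observable.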
Abstract:Artificial intelligence (AI) has undergone a renaissance recently, making major progress in key domains such as vision, language, control, and decision-making. This has been due, in part, to cheap data and cheap compute resources, which have fit the natural strengths of deep learning. However, many defining characteristics of human intelligence, which developed under much different pressures, remain out of reach for current approaches. In particular, generalizing beyond one's experiences--a hallmark of human intelligence from infancy--remains a formidable challenge for modern AI. The following is part position paper, part review, and part unification. We argue that combinatorial generalization must be a top priority for AI to achieve human-like abilities, and that structured representations and computations are key to realizing this objective. Just as biology uses nature and nurture cooperatively, we reject the false choice between "hand-engineering" and "end-to-end" learning, and instead advocate for an approach which benefits from their complementary strengths. We explore how using relational inductive biases within deep learning architectures can facilitate learning about entities, relations, and rules for composing them. We present a new building block for the AI toolkit with a strong relational inductive bias--the graph network--which generalizes and extends various approaches for neural networks that operate on graphs, and provides a straightforward interface for manipulating structured knowledge and producing structured behaviors. We discuss how graph networks can support relational reasoning and combinatorial generalization, laying the foundation for more sophisticated, interpretable, and flexible patterns of reasoning. As a companion to this paper, we have released an open-source software library for building graph networks, with demonstrations of how to use them in practice.
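A minimal sketch of the edge/node/global update pattern that a graph network block generalizes; the update functions phi_e, phi_v, phi_u and the sum aggregation below are illustrative placeholders, and the released open-source library, not this snippet, is the reference implementation:

    import numpy as np

    def gn_block(nodes, edges, senders, receivers, globals_, phi_e, phi_v, phi_u):
        # 1. per-edge update, conditioned on sender, receiver, and global attributes
        new_edges = np.stack([
            phi_e(e, nodes[s], nodes[r], globals_)
            for e, s, r in zip(edges, senders, receivers)
        ])
        # 2. per-node update, aggregating (here: summing) incoming edges
        new_nodes = []
        for i, v in enumerate(nodes):
            incoming = new_edges[receivers == i]
            agg = incoming.sum(axis=0) if len(incoming) else np.zeros_like(new_edges[0])
            new_nodes.append(phi_v(agg, v, globals_))
        new_nodes = np.stack(new_nodes)
        # 3. global update from aggregated edges and nodes
        new_globals = phi_u(new_edges.sum(axis=0), new_nodes.sum(axis=0), globals_)
        return new_nodes, new_edges, new_globals

    # placeholder update functions (all attributes share one dimensionality here)
    phi_e = lambda e, v_s, v_r, u: np.tanh(e + v_s - v_r + u)
    phi_v = lambda agg_e, v, u: np.tanh(agg_e + v + u)
    phi_u = lambda agg_e, agg_v, u: np.tanh(agg_e + agg_v + u)

    rng = np.random.default_rng(0)
    nodes = rng.normal(size=(3, 8))                  # 3 entities with 8-dim attributes
    edges = rng.normal(size=(2, 8))                  # 2 directed relations
    senders, receivers = np.array([0, 1]), np.array([1, 2])
    globals_ = np.zeros(8)
    nodes, edges, globals_ = gn_block(nodes, edges, senders, receivers, globals_,
                                      phi_e, phi_v, phi_u)

The relational inductive bias comes from sharing the same update functions across all edges and nodes, so the block applies unchanged to graphs of any size or topology.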
Abstract:Planning problems are among the most important and well-studied problems in artificial intelligence. They are most typically solved by tree search algorithms that simulate ahead into the future, evaluate future states, and back-up those evaluations to the root of a search tree. Among these algorithms, Monte-Carlo tree search (MCTS) is one of the most general, powerful and widely used. A typical implementation of MCTS uses cleverly designed rules, optimized to the particular characteristics of the domain. These rules control where the simulation traverses, what to evaluate in the states that are reached, and how to back-up those evaluations. In this paper we instead learn where, what and how to search. Our architecture, which we call an MCTSnet, incorporates simulation-based search inside a neural network, by expanding, evaluating and backing-up a vector embedding. The parameters of the network are trained end-to-end using gradient-based optimisation. When applied to small searches in the well known planning problem Sokoban, the learned search algorithm significantly outperformed MCTS baselines.
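A toy sketch of the expand/evaluate/back-up pattern on vector embeddings; the selection, expansion, and backup functions below are stubs standing in for the learned networks of an MCTSnet, not the paper's architecture:

    import numpy as np
    rng = np.random.default_rng(0)

    class Node:
        def __init__(self, state):
            self.state = state
            self.embedding = np.tanh(state)          # "evaluate": embed the state (placeholder)
            self.children = {}

    def simulate(root, select, expand, backup, depth=3):
        # one simulation: traverse/expand the tree, then back up vector embeddings
        path, node = [root], root
        for _ in range(depth):
            action = select(node.embedding)                               # learned selection (stub)
            if action not in node.children:
                node.children[action] = Node(expand(node.state, action))  # expand a new child
            node = node.children[action]
            path.append(node)
        for parent, child in zip(reversed(path[:-1]), reversed(path[1:])):
            parent.embedding = backup(parent.embedding, child.embedding)  # learned backup (stub)

    select = lambda emb: int(emb[0] > 0)
    expand = lambda s, a: np.tanh(s + a)
    backup = lambda p, c: 0.5 * (p + c)

    root = Node(rng.normal(size=8))
    simulate(root, select, expand, backup)

Because every step operates on differentiable embeddings rather than hand-designed statistics, the whole search procedure can be trained end-to-end with gradients.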
Abstract:Memory-based neural networks model temporal data by leveraging an ability to remember information for long periods. It is unclear, however, whether they also have an ability to perform complex relational reasoning with the information they remember. Here, we first confirm our intuitions that standard memory architectures may struggle at tasks that heavily involve an understanding of the ways in which entities are connected -- i.e., tasks involving relational reasoning. We then address these deficits with a new memory module -- a Relational Memory Core (RMC) -- which employs multi-head dot product attention to allow memories to interact. Finally, we test the RMC on a suite of tasks that may profit from more capable relational reasoning across sequential information, and show large gains in RL domains (e.g. Mini PacMan), program evaluation, and language modeling, achieving state-of-the-art results on the WikiText-103, Project Gutenberg, and GigaWord datasets.
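A minimal numpy sketch of the core operation, multi-head dot-product attention that lets memory slots interact; a real RMC also attends over the current input and gates the memory update, both of which are omitted here:

    import numpy as np

    def multi_head_self_attention(memory, num_heads, Wq, Wk, Wv):
        # every memory slot attends to every other slot, separately per head
        n, d = memory.shape
        head_dim = d // num_heads
        out = np.zeros_like(memory)
        for h in range(num_heads):
            sl = slice(h * head_dim, (h + 1) * head_dim)
            q, k, v = memory @ Wq[:, sl], memory @ Wk[:, sl], memory @ Wv[:, sl]
            scores = q @ k.T / np.sqrt(head_dim)                      # (n, n) interactions
            weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
            weights /= weights.sum(axis=-1, keepdims=True)            # softmax over slots
            out[:, sl] = weights @ v
        return out

    rng = np.random.default_rng(0)
    memory = rng.normal(size=(8, 64))                                 # 8 memory slots, 64-dim each
    Wq, Wk, Wv = (rng.normal(size=(64, 64)) * 0.1 for _ in range(3))
    updated_memory = multi_head_self_attention(memory, num_heads=4, Wq=Wq, Wk=Wk, Wv=Wv)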
Abstract:We introduce Imagination-Augmented Agents (I2As), a novel architecture for deep reinforcement learning combining model-free and model-based aspects. In contrast to most existing model-based reinforcement learning and planning methods, which prescribe how a model should be used to arrive at a policy, I2As learn to interpret predictions from a learned environment model to construct implicit plans in arbitrary ways, by using the predictions as additional context in deep policy networks. I2As show improved data efficiency, performance, and robustness to model misspecification compared to several baselines.
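A schematic numpy sketch of the idea: encode imagined rollouts from a learned environment model and concatenate them with a model-free path as extra context for the policy; the linear model, mean-pooling rollout encoder, and tanh model-free path below are stand-ins, not the paper's networks:

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.normal(size=(16, 16)) * 0.1   # stand-in learned environment model (linear here)
    B = rng.normal(size=(16, 4)) * 0.1

    def imagine_rollout(state, action, length=5):
        # roll the model forward and return the predicted feature sequence
        feats, s, a = [], state, np.eye(4)[action]
        for _ in range(length):
            s = np.tanh(A @ s + B @ a)    # a real I2A predicts observations and rewards
            feats.append(s)
        return np.stack(feats)

    def i2a_features(state, num_actions=4):
        # encode one imagined rollout per action, then append a model-free path
        rollout_codes = [imagine_rollout(state, a).mean(axis=0) for a in range(num_actions)]
        model_free = np.tanh(state)       # placeholder for the model-free network path
        return np.concatenate(rollout_codes + [model_free])   # context for the policy head

    policy_input = i2a_features(rng.normal(size=16))

Because the rollout encodings are just additional inputs, the policy is free to learn how much to trust an imperfect model, which is the source of the robustness to model misspecification.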
Abstract:A key challenge in model-based reinforcement learning (RL) is to synthesize computationally efficient and accurate environment models. We show that carefully designed generative models that learn and operate on compact state representations, so-called state-space models, substantially reduce the computational costs for predicting outcomes of sequences of actions. Extensive experiments establish that state-space models accurately capture the dynamics of Atari games from the Arcade Learning Environment from raw pixels. The computational speed-up of state-space models, achieved while maintaining high accuracy, makes their application in RL feasible: We demonstrate that agents which query these models for decision making outperform strong model-free baselines on the game MSPACMAN, highlighting the potential of using learned environment models for planning.
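A toy sketch of why planning in a compact state space is cheap: rollouts and reward predictions stay entirely in the latent space, so no pixels are ever generated; the linear-tanh transition and linear reward readout below are placeholders for a learned state-space model:

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.normal(size=(16, 16)) * 0.1          # latent transition (stand-in for a learned model)
    B = rng.normal(size=(16, 4)) * 0.1
    r = rng.normal(size=16)                      # reward readout from the latent state

    def rollout_return(z, actions):
        # predict the return of an action sequence entirely in the compact latent space
        total = 0.0
        for a in actions:
            z = np.tanh(A @ z + B @ np.eye(4)[a])
            total += r @ z
        return total

    z0 = rng.normal(size=16)
    candidates = [rng.integers(0, 4, size=10) for _ in range(8)]
    best = max(candidates, key=lambda acts: rollout_return(z0, acts))   # query the model to act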
Abstract:Learning from a few examples remains a key challenge in machine learning. Despite recent advances in important domains such as vision and language, the standard supervised deep learning paradigm does not offer a satisfactory solution for learning new concepts rapidly from little data. In this work, we employ ideas from metric learning based on deep neural features and from recent advances that augment neural networks with external memories. Our framework learns a network that maps a small labelled support set and an unlabelled example to its label, obviating the need for fine-tuning to adapt to new class types. We then define one-shot learning problems on vision (using Omniglot, ImageNet) and language tasks. Our algorithm improves one-shot accuracy on ImageNet from 87.6% to 93.2% and from 88.0% to 93.8% on Omniglot compared to competing approaches. We also demonstrate the usefulness of the same model on language modeling by introducing a one-shot task on the Penn Treebank.
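A minimal numpy sketch of the matching idea: the query's label distribution is an attention-weighted vote over the embedded support set, with no fine-tuning; the random embeddings below are stand-ins for the learned feature network:

    import numpy as np

    def cosine(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)

    def matching_predict(query_emb, support_embs, support_labels, num_classes):
        # attention over the support set, then a weighted vote over its labels
        sims = np.array([cosine(query_emb, s) for s in support_embs])
        attn = np.exp(sims - sims.max())
        attn /= attn.sum()                          # softmax over support examples
        one_hot = np.eye(num_classes)[support_labels]
        return attn @ one_hot                       # class distribution for the query

    rng = np.random.default_rng(0)
    support = rng.normal(size=(5, 64))              # 5-way, 1-shot support set (already embedded)
    labels = np.arange(5)
    query = support[2] + 0.1 * rng.normal(size=64)  # noisy copy of class 2
    probs = matching_predict(query, support, labels, num_classes=5)

Adapting to a new class only requires placing its labelled examples in the support set, which is why no gradient-based fine-tuning is needed at test time.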
Abstract:Conventional wisdom holds that model-based planning is a powerful approach to sequential decision-making. It is often very challenging in practice, however, because while a model can be used to evaluate a plan, it does not prescribe how to construct a plan. Here we introduce the "Imagination-based Planner", the first model-based, sequential decision-making agent that can learn to construct, evaluate, and execute plans. Before any action, it can perform a variable number of imagination steps, which involve proposing an imagined action and evaluating it with its model-based imagination. All imagined actions and outcomes are aggregated, iteratively, into a "plan context" which conditions future real and imagined actions. The agent can even decide how to imagine: testing out alternative imagined actions, chaining sequences of actions together, or building a more complex "imagination tree" by navigating flexibly among the previously imagined states using a learned policy. And our agent can learn to plan economically, jointly optimizing for external rewards and computational costs associated with using its imagination. We show that our architecture can learn to solve a challenging continuous control problem, and also learn elaborate planning strategies in a discrete maze-solving task. Our work opens a new direction toward learning the components of a model-based planning system and how to use them.
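A toy sketch of the imagination loop: propose an imagined action, evaluate it with the model, and fold the action and its imagined outcome into a plan context that conditions the next real or imagined action; the stubs below stand in for the learned manager, model, and aggregator, and the simple additive context is only illustrative:

    import numpy as np

    rng = np.random.default_rng(0)
    W_model = rng.normal(size=(8, 8)) * 0.1         # stand-in learned dynamics model

    def propose_action(state, context):
        # placeholder imagination/acting policy: here it just samples a random action vector
        return rng.normal(size=8)

    def imagine(state, action):
        # placeholder model-based imagination of the next state
        return np.tanh(W_model @ (state + action))

    def plan(state, num_imagination_steps=3):
        # aggregate imagined actions and outcomes into a plan context, then act
        context = np.zeros(16)
        for _ in range(num_imagination_steps):
            a = propose_action(state, context)
            outcome = imagine(state, a)
            context = context + np.concatenate([a, outcome])   # simple additive aggregation
        return propose_action(state, context)                  # real action, conditioned on the plan

    action = plan(rng.normal(size=8))

In the full agent, a learned manager also decides whether to keep imagining and from which previously imagined state to continue, which is what produces the "imagination tree".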
Abstract:We train a generator by maximum likelihood and we also train the same generator architecture by Wasserstein GAN. We then compare the generated samples, exact log-probability densities and approximate Wasserstein distances. We show that an independent critic trained to approximate Wasserstein distance between the validation set and the generator distribution helps detect overfitting. Finally, we use ideas from the one-shot learning literature to develop a novel fast learning critic.
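A small sketch of the overfitting check with an independent critic, assuming the critic has already been trained with a Lipschitz constraint to maximise the score gap between the two sample sets; the linear critic below is only a stand-in:

    import numpy as np

    def wasserstein_estimate(critic, real_samples, fake_samples):
        # critic-based estimate of the Wasserstein distance between two sample sets
        return critic(real_samples).mean() - critic(fake_samples).mean()

    rng = np.random.default_rng(0)
    w = rng.normal(size=32)
    critic = lambda x: x @ w                          # stand-in for a trained Lipschitz critic
    valid_batch = rng.normal(size=(128, 32))          # held-out validation samples
    generated = rng.normal(loc=0.5, size=(128, 32))   # samples from the generator
    d_valid = wasserstein_estimate(critic, valid_batch, generated)
    # a generator that memorises its training data looks much closer to the training set
    # than to the validation set under the same critic, which flags overfitting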
Abstract:Models that can simulate how environments change in response to actions can be used by agents to plan and act efficiently. We improve on previous environment simulators from high-dimensional pixel observations by introducing recurrent neural networks that are able to make temporally and spatially coherent predictions for hundreds of time-steps into the future. We present an in-depth analysis of the factors affecting performance, providing the most extensive attempt to date at understanding the properties of these models. We address the issue of computational inefficiency with a model that does not need to generate a high-dimensional image at each time-step. We show that our approach can be used to improve exploration and is adaptable to many diverse environments, namely 10 Atari games, a 3D car racing environment, and complex 3D mazes.
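A toy sketch of the computational point: the recurrent state can be rolled forward from actions alone for many steps, and a frame is decoded only when it is actually needed; the linear-tanh recurrence and linear decoder below are placeholders for the paper's deep recurrent simulators:

    import numpy as np

    rng = np.random.default_rng(0)
    Wh = rng.normal(size=(32, 32)) * 0.1
    Wa = rng.normal(size=(32, 4)) * 0.1
    Wd = rng.normal(size=(84 * 84, 32)) * 0.01       # decoder to a frame, used only on demand

    def step(h, action):
        # advance the recurrent state from the action alone, without rendering a frame
        return np.tanh(Wh @ h + Wa @ np.eye(4)[action])

    h = np.zeros(32)
    for t in range(100):                             # many cheap, state-only prediction steps
        h = step(h, action=t % 4)
    frame = (Wd @ h).reshape(84, 84)                 # decode an image only when required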