Jim Portegies

Is Vanilla Policy Gradient Overlooked? Analyzing Deep Reinforcement Learning for Hanabi

Mar 22, 2022

Bram Grooten, Jelle Wemmenhove, Maurice Poot, Jim Portegies

Abstract: In pursuit of enhanced multi-agent collaboration, we analyze several on-policy deep reinforcement learning algorithms in the recently published Hanabi benchmark. Our research suggests a perhaps counter-intuitive finding, where Proximal Policy Optimization (PPO) is outperformed by Vanilla Policy Gradient over multiple random seeds in a simplified environment of the multi-agent cooperative card game. In our analysis of this behavior we look into Hanabi-specific metrics and hypothesize a reason for PPO's plateau. In addition, we provide proofs for the maximum length of a perfect game (71 turns) and any game (89 turns). Our code can be found at: https://github.com/bramgrooten/DeepRL-for-Hanabi

* Accepted at ALA 2022 (Adaptive and Learning Agents Workshop at AAMAS 2022)

PDE-based Group Equivariant Convolutional Neural Networks

Jan 24, 2020

Bart Smets, Jim Portegies, Erik Bekkers, Remco Duits

Abstract: We present a PDE-based framework that generalizes Group equivariant Convolutional Neural Networks (G-CNNs). In this framework, a network layer is seen as a set of PDE-solvers where the equation's geometrically meaningful coefficients become the layer's trainable weights. Formulating our PDEs on homogeneous spaces allows these networks to be designed with built-in symmetries such as rotation equivariance instead of being restricted to just translation equivariance as in traditional CNNs. Having all the desired symmetries included in the design obviates the need to include them by means of costly techniques such as data augmentation. Roto-translation equivariance for image analysis applications is the example we will be using throughout the paper. Our default PDE is solved by a combination of linear group convolutions and non-linear morphological group convolutions. Just as for a linear convolution, a morphological convolution is specified by a kernel, and this kernel is what is optimized during the training process. We demonstrate how the common CNN operations of max/min-pooling and ReLUs arise naturally from solving a PDE and how they are subsumed by morphological convolutions. We present a proof-of-concept experiment to demonstrate the potential of this framework in increasing the performance of deep-learning-based imaging applications.
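
As a rough illustration of the claim that max-pooling is subsumed by morphological convolutions, the sketch below uses a one-dimensional max-plus dilation on a plain list. This is not the paper's group-convolution implementation: the function names, the 1D setting, and the sample values are assumptions made for illustration only. With a flat (all-zero) kernel the dilation reduces to a sliding maximum, and subsampling it with a stride equal to the window size gives ordinary max-pooling.

```python
def morphological_dilation_1d(signal, kernel):
    """Max-plus dilation (hypothetical helper): out[i] = max_j (signal[i + j] + kernel[j])."""
    k = len(kernel)
    return [
        max(signal[i + j] + kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def max_pool_1d(signal, window):
    """Non-overlapping max-pooling with stride equal to the window size."""
    return [
        max(signal[i:i + window])
        for i in range(0, len(signal) - window + 1, window)
    ]


# Illustrative values only.
signal = [1.0, 3.0, 2.0, 5.0, 4.0, 0.0]
window = 2

# A flat kernel adds nothing, so the dilation is a sliding maximum.
flat_kernel = [0.0] * window
dilated = morphological_dilation_1d(signal, flat_kernel)

# Keeping every `window`-th output recovers max-pooling.
pooled_via_dilation = dilated[::window]
pooled_directly = max_pool_1d(signal, window)

print(pooled_via_dilation == pooled_directly)
```

A non-flat kernel would weight neighbours differently before taking the maximum, which is the sense in which a trainable morphological kernel is more general than a fixed pooling window.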

SciSports: Learning football kinematics through two-dimensional tracking data

Aug 14, 2018

Anatoliy Babic, Harshit Bansal, Gianluca Finocchio, Julian Golak, Mark Peletier, Jim Portegies, Clara Stegehuis, Anuj Tyagi, Roland Vincze, William Weimin Yoo

Abstract: SciSports is a Dutch startup company specializing in football analytics. This paper describes a joint research effort with SciSports, during the Study Group Mathematics with Industry 2018 at Eindhoven, the Netherlands. The main challenge that we addressed was to automatically process empirical football players' trajectories, in order to extract useful information from them. The data provided to us was two-dimensional positional data during entire matches. We developed methods based on Newtonian mechanics and the Kalman filter, Generative Adversarial Nets and Variational Autoencoders. In addition, we trained a discriminator network to recognize and discern different movement patterns of players. The Kalman-filter approach yields an interpretable model, in which a small number of player-dependent parameters can be fit; in theory this could be used to distinguish among players. The Generative-Adversarial-Nets approach appears promising in theory, and some initial tests showed an improvement with respect to the baseline, but the limits in time and computational power meant that we could not fully explore it. We also trained a Discriminator network to distinguish between two players based on their trajectories; after training, the network managed to distinguish between some pairs of players, but not between others. After training, the Variational Autoencoders generated trajectories that are difficult to distinguish, visually, from the data. These experiments provide an indication that deep generative models can learn the underlying structure and statistics of football players' trajectories. This can serve as a starting point for determining player qualities based on such trajectory data.

* This report was made for the Study Group Mathematics with Industry 2018
