Avraam Chatzimichailidis

Combating Mode Collapse in GAN training: An Empirical Analysis using Hessian Eigenvalues

Dec 17, 2020

Ricard Durall, Avraam Chatzimichailidis, Peter Labus, Janis Keuper

Abstract: Generative adversarial networks (GANs) provide state-of-the-art results in image generation. However, despite being so powerful, they remain very challenging to train. This is largely due to their highly non-convex optimization space, which leads to a number of instabilities. Among them, mode collapse stands out as one of the most daunting. This undesirable event occurs when the model can only fit a few modes of the data distribution while ignoring the majority of them. In this work, we combat mode collapse using second-order gradient information. To do so, we analyse the loss surface through its Hessian eigenvalues and show that mode collapse is related to convergence towards sharp minima. In particular, we observe how the eigenvalues of the generator $G$ are directly correlated with the occurrence of mode collapse. Finally, motivated by these findings, we design a new optimization algorithm called nudged-Adam (NuGAN) that uses spectral information to overcome mode collapse, leading to empirically more stable convergence properties.
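The link between sharp minima and large Hessian eigenvalues can be illustrated with a small sketch. The code below is not the NuGAN algorithm or the authors' implementation; it is a generic power iteration that estimates the largest-magnitude Hessian eigenvalue of a toy quadratic loss using finite-difference Hessian-vector products. The function names (`grad`, `hvp`, `top_hessian_eigenvalue`) and the toy loss are assumptions made for illustration only.

```python
import math
import random


def grad(w):
    # Gradient of the hypothetical toy loss f(w) = 0.5 * (10 * w[0]**2 + w[1]**2).
    # Its Hessian is diag(10, 1), so the sharpest direction has curvature 10.
    return [10.0 * w[0], 1.0 * w[1]]


def hvp(grad_fn, w, v, eps=1e-4):
    # Central finite-difference approximation of the Hessian-vector product H @ v.
    plus = grad_fn([wi + eps * vi for wi, vi in zip(w, v)])
    minus = grad_fn([wi - eps * vi for wi, vi in zip(w, v)])
    return [(p - m) / (2.0 * eps) for p, m in zip(plus, minus)]


def top_hessian_eigenvalue(grad_fn, w, iters=100, seed=0):
    # Power iteration: repeatedly apply H to a unit vector and renormalise.
    # Converges to the eigenvalue of largest magnitude.
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in w]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    eig = 0.0
    for _ in range(iters):
        hv = hvp(grad_fn, w, v)
        eig = sum(a * b for a, b in zip(v, hv))  # Rayleigh quotient v^T H v
        norm = math.sqrt(sum(x * x for x in hv))
        if norm == 0.0:
            break
        v = [x / norm for x in hv]
    return eig


top = top_hessian_eigenvalue(grad, [0.3, -0.2])
print(f"estimated top Hessian eigenvalue: {top:.4f}")
```

In a real network the gradient would come from automatic differentiation rather than a closed form, and the Hessian-vector product would typically be computed exactly instead of by finite differences.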

GradVis: Visualization and Second Order Analysis of Optimization Surfaces during the Training of Deep Neural Networks

Sep 27, 2019

Avraam Chatzimichailidis, Franz-Josef Pfreundt, Nicolas R. Gauger, Janis Keuper

Abstract: Current training methods for deep neural networks boil down to very high-dimensional, non-convex optimization problems, which are usually solved by a wide range of stochastic gradient descent methods. While these approaches tend to work in practice, there are still many gaps in the theoretical understanding of key aspects like convergence and generalization guarantees, which are induced by the properties of the optimization surface (loss landscape). In order to gain deeper insights, a number of recent publications proposed methods to visualize and analyze the optimization surfaces. However, the computational cost of these methods is very high, making it nearly impossible to use them on larger networks. In this paper, we present the GradVis Toolbox, an open-source library for efficient and scalable visualization and analysis of deep neural network loss landscapes in TensorFlow and PyTorch. By introducing more efficient mathematical formulations and a novel parallelization scheme, GradVis makes it possible to plot 2D and 3D projections of optimization surfaces and trajectories, as well as high-resolution second-order gradient information for large networks.
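As a rough illustration of what a 2D projection of an optimization surface involves, the sketch below evaluates a loss on a grid spanned by two random unit directions around a parameter vector. It does not use the GradVis API; the helper names (`random_direction`, `loss_surface`) and the toy loss are hypothetical, and refinements such as direction normalisation per layer or parallel evaluation are left out.

```python
import random


def loss(w):
    # Hypothetical toy loss with very different curvature along each axis.
    return 0.5 * (10.0 * w[0] ** 2 + w[1] ** 2 + 0.1 * w[2] ** 2)


def random_direction(dim, rng):
    # Random direction in parameter space, scaled to unit length.
    d = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = sum(x * x for x in d) ** 0.5
    return [x / norm for x in d]


def loss_surface(loss_fn, w, d1, d2, steps=5, radius=1.0):
    # Evaluate loss_fn at w + a * d1 + b * d2 for a, b on a regular grid
    # covering [-radius, radius]; grid[i][j] corresponds to (coords[i], coords[j]).
    coords = [radius * (2.0 * i / (steps - 1) - 1.0) for i in range(steps)]
    grid = []
    for a in coords:
        row = []
        for b in coords:
            point = [wi + a * x + b * y for wi, x, y in zip(w, d1, d2)]
            row.append(loss_fn(point))
        grid.append(row)
    return coords, grid


rng = random.Random(0)
w_star = [0.0, 0.0, 0.0]  # minimum of the toy loss
d1 = random_direction(len(w_star), rng)
d2 = random_direction(len(w_star), rng)
coords, grid = loss_surface(loss, w_star, d1, d2)
for row in grid:
    print(" ".join(f"{value:7.3f}" for value in row))
```

The resulting grid can be passed to any contour or surface plotting routine. For a real network, each grid point costs one full loss evaluation, which is why the cost of such visualizations grows quickly with resolution and model size.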

* 10 pages, 8 figures
