Gowtham R. Kurri

Addressing GAN Training Instabilities via Tunable Classification Losses

Oct 27, 2023

Monica Welfert, Gowtham R. Kurri, Kyle Otstot, Lalitha Sankar

Abstract: Generative adversarial networks (GANs), modeled as a zero-sum game between a generator (G) and a discriminator (D), allow generating synthetic data with formal guarantees. Noting that D is a classifier, we begin by reformulating the GAN value function using class probability estimation (CPE) losses. We prove a two-way correspondence between CPE loss GANs and $f$-GANs which minimize $f$-divergences. We also show that all symmetric $f$-divergences are equivalent in convergence. In the finite sample and model capacity setting, we define and obtain bounds on estimation and generalization errors. We specialize these results to $\alpha$-GANs, defined using $\alpha$-loss, a tunable CPE loss family parametrized by $\alpha\in(0,\infty]$. We next introduce a class of dual-objective GANs to address training instabilities of GANs by modeling each player's objective using $\alpha$-loss to obtain $(\alpha_D,\alpha_G)$-GANs. We show that the resulting non-zero sum game simplifies to minimizing an $f$-divergence under appropriate conditions on $(\alpha_D,\alpha_G)$. Generalizing this dual-objective formulation using CPE losses, we define and obtain upper bounds on an appropriately defined estimation error. Finally, we highlight the value of tuning $(\alpha_D,\alpha_G)$ in alleviating training instabilities for the synthetic 2D Gaussian mixture ring as well as the large publicly available Celeb-A and LSUN Classroom image datasets.

* arXiv admin note: text overlap with arXiv:2302.14320

Towards Addressing GAN Training Instabilities: Dual-objective GANs with Tunable Parameters

Feb 28, 2023

Monica Welfert, Kyle Otstot, Gowtham R. Kurri, Lalitha Sankar

Abstract: In an effort to address the training instabilities of GANs, we introduce a class of dual-objective GANs with different value functions (objectives) for the generator (G) and discriminator (D). In particular, we model each objective using $\alpha$-loss, a tunable classification loss, to obtain $(\alpha_D,\alpha_G)$-GANs, parameterized by $(\alpha_D,\alpha_G)\in (0,\infty]^2$. For a sufficiently large number of samples and sufficiently large capacities for G and D, we show that the resulting non-zero sum game simplifies to minimizing an $f$-divergence under appropriate conditions on $(\alpha_D,\alpha_G)$. In the finite sample and capacity setting, we define estimation error to quantify the gap in the generator's performance relative to the optimal setting with infinite samples and obtain upper bounds on this error, showing it to be order optimal under certain conditions. Finally, we highlight the value of tuning $(\alpha_D,\alpha_G)$ in alleviating training instabilities for the synthetic 2D Gaussian mixture ring and the Stacked MNIST datasets.
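
The dual-objective idea in this abstract can be illustrated with a small sketch. The abstract does not state the value function, so the form used here is an assumption taken from the standard $\alpha$-loss definition: $V_\alpha = \frac{\alpha}{\alpha-1}\left(\mathbb{E}[D(x)^{(\alpha-1)/\alpha}] + \mathbb{E}[(1-D(G(z)))^{(\alpha-1)/\alpha}] - 2\right)$, with D maximizing $V_{\alpha_D}$ and G minimizing $V_{\alpha_G}$ (a saturating generator objective, also an assumption). The function names and the toy discriminator outputs are hypothetical.

```python
import math
from statistics import fmean

def alpha_value(d_real, d_fake, alpha):
    """Assumed alpha-loss value function on discriminator outputs in (0, 1).

    d_real: D's outputs on real samples; d_fake: D's outputs on generated samples.
    """
    if alpha <= 0:
        raise ValueError("alpha must lie in (0, inf]")
    if alpha == 1.0:
        # Limit alpha -> 1: the vanilla GAN (log-loss) value function.
        return fmean(math.log(d) for d in d_real) + fmean(math.log(1 - d) for d in d_fake)
    # k = (alpha - 1) / alpha; k = 1 is the alpha = inf limit.
    k = 1.0 if math.isinf(alpha) else (alpha - 1.0) / alpha
    return (fmean(d ** k for d in d_real) + fmean((1 - d) ** k for d in d_fake) - 2.0) / k

def dual_objectives(d_real, d_fake, alpha_d, alpha_g):
    """Return (value D maximizes, value G minimizes) for an (alpha_D, alpha_G) pair."""
    return alpha_value(d_real, d_fake, alpha_d), alpha_value(d_real, d_fake, alpha_g)

# Hypothetical discriminator outputs on a tiny batch.
d_obj, g_obj = dual_objectives([0.9, 0.8], [0.2, 0.3], alpha_d=0.5, alpha_g=2.0)
```

Setting $\alpha_D = \alpha_G = 1$ in this sketch gives both players the vanilla GAN value function, so the game is zero-sum again.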

$α$-GAN: Convergence and Estimation Guarantees

May 12, 2022

Gowtham R. Kurri, Monica Welfert, Tyler Sypherd, Lalitha Sankar

Abstract: We prove a two-way correspondence between the min-max optimization of general CPE loss function GANs and the minimization of associated $f$-divergences. We then focus on $\alpha$-GAN, defined via the $\alpha$-loss, which interpolates several GANs (Hellinger, vanilla, Total Variation) and corresponds to the minimization of the Arimoto divergence. We show that the Arimoto divergences induced by $\alpha$-GAN equivalently converge, for all $\alpha\in \mathbb{R}_{>0}\cup\{\infty\}$. However, under restricted learning models and finite samples, we provide estimation bounds which indicate diverse GAN behavior as a function of $\alpha$. Finally, we present empirical results on a toy dataset that highlight the practical utility of tuning the $\alpha$ hyperparameter.

* Extended version of a paper accepted to ISIT 2022. 12 pages, 7 figures

Realizing GANs via a Tunable Loss Function

Jun 09, 2021

Gowtham R. Kurri, Tyler Sypherd, Lalitha Sankar

Abstract: We introduce a tunable GAN, called $\alpha$-GAN, parameterized by $\alpha \in (0,\infty]$, which interpolates between various $f$-GANs and Integral Probability Metric based GANs (under constrained discriminator set). We construct $\alpha$-GAN using a supervised loss function, namely, $\alpha$-loss, which is a tunable loss function capturing several canonical losses. We show that $\alpha$-GAN is intimately related to the Arimoto divergence, which was first proposed by Österreicher (1996), and later studied by Liese and Vajda (2006). We posit that the holistic understanding that $\alpha$-GAN introduces will have practical benefits of addressing both the issues of vanishing gradients and mode collapse.

* 6 pages, 2 figures
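
To make "a tunable loss function capturing several canonical losses" concrete, here is a minimal sketch of $\alpha$-loss. The abstract does not give the formula; the definition below, $\ell_\alpha(p) = \frac{\alpha}{\alpha-1}\left(1 - p^{(\alpha-1)/\alpha}\right)$ for the probability $p$ assigned to the true label, is the standard one from the $\alpha$-loss literature and is assumed here. The function name is hypothetical.

```python
import math

def alpha_loss(p_true: float, alpha: float) -> float:
    """alpha-loss for probability p_true assigned to the correct label (assumed form)."""
    if not 0.0 < p_true <= 1.0:
        raise ValueError("p_true must lie in (0, 1]")
    if alpha <= 0:
        raise ValueError("alpha must lie in (0, inf]")
    if alpha == 1.0:
        # Limit alpha -> 1: log-loss.
        return -math.log(p_true)
    if math.isinf(alpha):
        # Limit alpha -> inf: 1 - p, a soft version of the 0-1 loss.
        return 1.0 - p_true
    return alpha / (alpha - 1.0) * (1.0 - p_true ** ((alpha - 1.0) / alpha))

# The same prediction scored under three settings of alpha.
losses = {a: alpha_loss(0.5, a) for a in (0.5, 1.0, float("inf"))}
```

With $\alpha = 1/2$ the expression reduces to $1/p - 1$, which corresponds to the exponential loss, so a single parameter sweeps through exponential-type, logarithmic, and 0-1-type penalties.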
