By formulating the generation of data samples as a Markov denoising process, diffusion models achieve state-of-the-art performance on a wide range of tasks. Recently, many variants of diffusion models have been proposed to enable controlled sample generation. Most existing methods either feed the controlling information to the noise approximator as an input (i.e., a conditional representation), or introduce a pre-trained classifier at test time to guide the Langevin dynamics toward the conditional goal. However, the former line of methods works only when the controlling information can be formulated as a conditional representation, while the latter requires the pre-trained guidance classifier to be differentiable. In this paper, we propose a novel framework named RGDM (Reward-Guided Diffusion Model) that guides the training phase of diffusion models via reinforcement learning (RL). The proposed training framework bridges the objective of weighted log-likelihood and maximum entropy RL, which enables policy gradients to be computed from samples drawn from a pay-off distribution proportional to exponentially scaled rewards, rather than from the policies themselves. Such a framework alleviates high gradient variance and enables diffusion models to explore for highly rewarded samples in the reverse process. Experiments on 3D shape and molecule generation tasks show significant improvements over existing conditional diffusion models.
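As a rough illustration of the connection stated above (a sketch only; the symbols $r(\cdot)$, $\alpha$, $p_r$, and $p_\theta$ are generic placeholders rather than the paper's exact notation), a pay-off distribution proportional to exponentially scaled rewards and the corresponding weighted log-likelihood objective can be written as
\[
p_r(x_0) \;=\; \frac{\exp\!\big(r(x_0)/\alpha\big)}{\int \exp\!\big(r(x')/\alpha\big)\,dx'},
\qquad
\mathcal{J}(\theta) \;=\; \mathbb{E}_{x_0 \sim p_r}\!\big[\log p_\theta(x_0)\big].
\]
Under this form, the gradient $\nabla_\theta \mathcal{J}(\theta) = \mathbb{E}_{x_0 \sim p_r}\!\big[\nabla_\theta \log p_\theta(x_0)\big]$ is estimated from samples of the reward-induced distribution $p_r$ rather than from roll-outs of the current policy $p_\theta$, which is consistent with the variance-reduction argument sketched in the abstract.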