Abstract:We propose a Bayesian framework for fine-tuning large diffusion models with a novel network structure called Bayesian Power Steering (BPS). We clarify the meaning behind adaptation from a \textit{large probability space} to a \textit{small probability space} and explore the task of fine-tuning pre-trained models using learnable modules from a Bayesian perspective. BPS extracts task-specific knowledge from a pre-trained model's learned prior distribution. It efficiently leverages large diffusion models, differentially intervening different hidden features with a head-heavy and foot-light configuration. Experiments highlight the superiority of BPS over contemporary methods across a range of tasks even with limited amount of data. Notably, BPS attains an FID score of 10.49 under the sketch condition on the COCO17 dataset.
Abstract:We propose a conditional stochastic interpolation (CSI) approach to learning conditional distributions. CSI learns probability flow equations or stochastic differential equations that transport a reference distribution to the target conditional distribution. This is achieved by first learning the drift function and the conditional score function based on conditional stochastic interpolation, which are then used to construct a deterministic process governed by an ordinary differential equation or a diffusion process for conditional sampling. In our proposed CSI model, we incorporate an adaptive diffusion term to address the instability issues arising during the training process. We provide explicit forms of the conditional score function and the drift function in terms of conditional expectations under mild conditions, which naturally lead to an nonparametric regression approach to estimating these functions. Furthermore, we establish non-asymptotic error bounds for learning the target conditional distribution via conditional stochastic interpolation in terms of KL divergence, taking into account the neural network approximation error. We illustrate the application of CSI on image generation using a benchmark image dataset.