Title: Diff-2-in-1: Bridging Generation and Dense Perception with Diffusion Models

Nov 07, 2024

Shuhong Zheng, Zhipeng Bao, Ruoyu Zhao, Martial Hebert, Yu-Xiong Wang

Abstract: Beyond high-fidelity image synthesis, diffusion models have recently exhibited promising results in dense visual perception tasks. However, most existing work treats diffusion models as a standalone component for perception tasks, employing them either solely for off-the-shelf data augmentation or as mere feature extractors. In contrast to these isolated and thus sub-optimal efforts, we introduce a unified, versatile, diffusion-based framework, Diff-2-in-1, that can simultaneously handle both multi-modal data generation and dense visual perception, through a unique exploitation of the diffusion-denoising process. Within this framework, we further enhance discriminative visual perception via multi-modal generation, by utilizing the denoising network to create multi-modal data that mirror the distribution of the original training set. Importantly, Diff-2-in-1 optimizes the utilization of the created diverse and faithful data by leveraging a novel self-improving learning mechanism. Comprehensive experimental evaluations validate the effectiveness of our framework, showcasing consistent performance improvements across various discriminative backbones and high-quality multi-modal data generation characterized by both realism and usefulness.

* 26 pages, 14 figures
