Title: Customizing Text-to-Image Diffusion with Camera Viewpoint Control

Apr 18, 2024

Nupur Kumari, Grace Su, Richard Zhang, Taesung Park, Eli Shechtman, Jun-Yan Zhu

Abstract: Model customization introduces new concepts to existing text-to-image models, enabling the generation of the new concept in novel contexts. However, such methods lack accurate camera view control with respect to the object, and users must resort to prompt engineering (e.g., adding "top-view") to achieve coarse view control. In this work, we introduce a new task: enabling explicit control of camera viewpoint for model customization. This allows us to modify object properties across various background scenes via text prompts, while incorporating the target camera pose as additional control. This new task presents significant challenges in merging a 3D representation from the multi-view images of the new concept with a general, 2D text-to-image model. To bridge this gap, we propose to condition the 2D diffusion process on rendered, view-dependent features of the new object. During training, we jointly adapt the 2D diffusion modules and 3D feature predictions to reconstruct the object's appearance and geometry while reducing overfitting to the input multi-view images. Our method outperforms existing image editing and model personalization baselines in preserving the custom object's identity while following the input text prompt and the object's camera pose.

* project page: https://customdiffusion360.github.io
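
The "target camera pose" mentioned in the abstract can be pictured as a camera placed on a sphere around the object and aimed at it. The sketch below illustrates that common parameterization only; it is not the authors' implementation. The function name `camera_pose`, the azimuth/elevation/radius parameters, and the z-up world convention are all assumptions chosen for illustration.

```python
import math


def _cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def _normalize(v):
    """Scale a 3-vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def camera_pose(azimuth_deg, elevation_deg, radius=1.0):
    """Hypothetical helper: camera on a sphere, looking at the origin.

    Returns (position, (right, up, forward)), where the three axes are
    unit vectors in world coordinates. Assumes a z-up world; the exact
    top and bottom views are excluded because the "up" direction is
    ambiguous there.
    """
    if radius <= 0:
        raise ValueError("radius must be positive")
    if not -90.0 < elevation_deg < 90.0:
        raise ValueError("elevation must lie strictly between -90 and 90")

    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)

    # Camera position on the sphere around the object (at the origin).
    position = (
        radius * math.cos(el) * math.cos(az),
        radius * math.cos(el) * math.sin(az),
        radius * math.sin(el),
    )

    # Forward axis points from the camera toward the object.
    forward = _normalize(tuple(-c for c in position))
    # Right axis is perpendicular to forward and the world up (0, 0, 1).
    right = _normalize(_cross(forward, (0.0, 0.0, 1.0)))
    # Camera up completes the orthonormal frame.
    up = _cross(right, forward)
    return position, (right, up, forward)
```

For instance, `camera_pose(0, 0, 2.0)` places the camera at (2, 0, 0) with its forward axis pointing toward the origin. A pose like this is the kind of explicit viewpoint input that replaces coarse prompt hints such as "top-view".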
