Title: Reality's Canvas, Language's Brush: Crafting 3D Avatars from Monocular Video

Dec 08, 2023

Yuchen Rao, Eduardo Perez Pellitero, Benjamin Busam, Yiren Zhou, Jifei Song

Abstract: Recent advancements in 3D avatar generation excel with multi-view supervision for photorealistic models. However, monocular counterparts lag in quality despite broader applicability. We propose ReCaLab to close this gap. ReCaLab is a fully differentiable pipeline that learns high-fidelity 3D human avatars from just a single RGB video. A pose-conditioned deformable NeRF is optimized to volumetrically represent a human subject in canonical T-pose. The canonical representation is then leveraged to efficiently associate viewpoint-agnostic textures using 2D-3D correspondences. This enables albedo and shading to be generated separately and then jointly composed into an RGB prediction. The design allows intermediate results for human pose, body shape, texture, and lighting to be controlled with text prompts. An image-conditioned diffusion model thereby helps to animate the appearance and pose of the 3D avatar to create video sequences with previously unseen human motion. Extensive experiments show that ReCaLab outperforms previous monocular approaches in terms of image quality for image synthesis tasks. ReCaLab even outperforms multi-view methods that leverage up to 19x more synchronized videos for the task of novel pose rendering. Moreover, natural language offers an intuitive user interface for creative manipulation of 3D human avatars.
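
The abstract says that albedo and shading are generated separately and jointly compose an RGB prediction, but it does not state how the two are combined. As a minimal sketch, assuming the multiplicative composition common in intrinsic image decomposition (an assumption, not confirmed by the abstract), the idea could look like the following. The names `compose_rgb` and `relight` are hypothetical and do not come from the paper.

```python
from typing import List, Tuple

Pixel = Tuple[float, float, float]  # RGB values in [0, 1]


def compose_rgb(albedo: List[Pixel], shading: List[float]) -> List[Pixel]:
    """Combine per-pixel albedo with a scalar per-pixel shading term.

    Assumption: RGB = albedo * shading, clamped to [0, 1]. The abstract
    does not specify the actual composition used by ReCaLab.
    """
    if len(albedo) != len(shading):
        raise ValueError("albedo and shading must cover the same pixels")
    rgb = []
    for (r, g, b), s in zip(albedo, shading):
        # Scale each colour channel by the shading value and clamp.
        rgb.append(tuple(min(1.0, max(0.0, c * s)) for c in (r, g, b)))
    return rgb


def relight(albedo: List[Pixel], shading: List[float], gain: float) -> List[Pixel]:
    """Change lighting by scaling shading only; albedo stays untouched."""
    return compose_rgb(albedo, [s * gain for s in shading])


if __name__ == "__main__":
    albedo = [(0.5, 0.25, 1.0), (0.8, 0.8, 0.8)]
    shading = [0.5, 1.0]
    print(compose_rgb(albedo, shading))
    print(relight(albedo, shading, gain=0.5))
```

Keeping the two factors separate is what makes independent edits possible in such a setup: texture changes touch only the albedo, while lighting changes touch only the shading.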

* Video link: https://youtu.be/Oz83z1es2J4
