David Pankratz

SV3D: Novel Multi-view Synthesis and 3D Generation from a Single Image using Latent Video Diffusion

Mar 18, 2024

Vikram Voleti, Chun-Han Yao, Mark Boss, Adam Letts, David Pankratz, Dmitry Tochilkin, Christian Laforte, Robin Rombach, Varun Jampani

Abstract: We present Stable Video 3D (SV3D), a latent video diffusion model for high-resolution, image-to-multi-view generation of orbital videos around a 3D object. Recent work on 3D generation proposes techniques to adapt 2D generative models for novel view synthesis (NVS) and 3D optimization. However, these methods have several disadvantages due to either limited views or inconsistent NVS, which degrades the performance of 3D object generation. In this work, we propose SV3D, which adapts an image-to-video diffusion model for novel multi-view synthesis and 3D generation, thereby leveraging the generalization and multi-view consistency of video models while further adding explicit camera control for NVS. We also propose improved 3D optimization techniques to use SV3D and its NVS outputs for image-to-3D generation. Extensive experimental results on multiple datasets with 2D and 3D metrics, as well as a user study, demonstrate SV3D's state-of-the-art performance on NVS and 3D reconstruction compared to prior work.

* Project page: https://sv3d.github.io/
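
As a rough illustration of what an orbital camera path around an object can look like, the sketch below places cameras at evenly spaced azimuth angles on a circle at a fixed elevation. It is not taken from the SV3D code or paper: the function name `orbit_camera_positions`, the default view count, elevation, radius, and the z-up axis convention are all assumptions chosen for illustration.

```python
import math


def orbit_camera_positions(num_views=8, elevation_deg=10.0, radius=2.0):
    """Return a list of (azimuth_deg, (x, y, z)) for a circular orbit.

    Hypothetical helper: the object is assumed to sit at the origin,
    with z pointing up. All default values are illustrative only.
    """
    elev = math.radians(elevation_deg)
    poses = []
    for i in range(num_views):
        # Evenly spaced azimuths covering a full 360-degree turn.
        azim_deg = 360.0 * i / num_views
        azim = math.radians(azim_deg)
        # Spherical-to-Cartesian conversion at fixed radius and elevation.
        x = radius * math.cos(elev) * math.cos(azim)
        y = radius * math.cos(elev) * math.sin(azim)
        z = radius * math.sin(elev)
        poses.append((azim_deg, (x, y, z)))
    return poses


if __name__ == "__main__":
    for azim_deg, (x, y, z) in orbit_camera_positions(num_views=4):
        print(f"azimuth {azim_deg:6.1f} deg -> ({x:+.3f}, {y:+.3f}, {z:+.3f})")
```

Each returned position lies at the same distance from the origin, so a camera looking at the origin from these points traces a simple orbit. A real pipeline would also need a camera orientation and intrinsics for every view.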

TripoSR: Fast 3D Object Reconstruction from a Single Image

Mar 04, 2024

Dmitry Tochilkin, David Pankratz, Zexiang Liu, Zixuan Huang, Adam Letts, Yangguang Li, Ding Liang, Christian Laforte, Varun Jampani, Yan-Pei Cao

Abstract: This technical report introduces TripoSR, a 3D reconstruction model that leverages a transformer architecture for fast feed-forward 3D generation, producing a 3D mesh from a single image in under 0.5 seconds. Building upon the LRM network architecture, TripoSR integrates substantial improvements in data processing, model design, and training techniques. Evaluations on public datasets show that TripoSR exhibits superior performance, both quantitatively and qualitatively, compared to other open-source alternatives. Released under the MIT license, TripoSR is intended to empower researchers, developers, and creatives with the latest advancements in 3D generative AI.

* Model: https://huggingface.co/stabilityai/TripoSR
* Code: https://github.com/VAST-AI-Research/TripoSR
* Demo: https://huggingface.co/spaces/stabilityai/TripoSR
