Title: OccSora: 4D Occupancy Generation Models as World Simulators for Autonomous Driving

May 30, 2024

Lening Wang, Wenzhao Zheng, Yilong Ren, Han Jiang, Zhiyong Cui, Haiyang Yu, Jiwen Lu


Abstract: Understanding the evolution of 3D scenes is important for effective autonomous driving. While conventional methods model scene development with the motion of individual instances, world models emerge as a generative framework to describe the general scene dynamics. However, most existing methods adopt an autoregressive framework to perform next-token prediction, which suffers from inefficiency in modeling long-term temporal evolution. To address this, we propose a diffusion-based 4D occupancy generation model, OccSora, to simulate the development of the 3D world for autonomous driving. We employ a 4D scene tokenizer to obtain compact discrete spatial-temporal representations for 4D occupancy input and achieve high-quality reconstruction for long-sequence occupancy videos. We then learn a diffusion transformer on the spatial-temporal representations and generate 4D occupancy conditioned on a trajectory prompt. We conduct extensive experiments on the widely used nuScenes dataset with Occ3D occupancy annotations. OccSora can generate 16-second videos with authentic 3D layout and temporal consistency, demonstrating its ability to understand the spatial and temporal distributions of driving scenes. With trajectory-aware 4D generation, OccSora has the potential to serve as a world simulator for the decision-making of autonomous driving. Code is available at: https://github.com/wzzheng/OccSora.
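
The abstract mentions "compact discrete spatial-temporal representations" without saying how the discretization is done. As a rough illustration only, the sketch below shows nearest-neighbour vector quantization, one common way to turn continuous latent features into discrete tokens. It is an assumption that a scheme of this kind applies here; the function names `quantize` and `dequantize`, the toy codebook, and the feature values are all hypothetical and are not taken from the OccSora code.

```python
# Illustrative only: nearest-neighbour vector quantization, one common way to
# obtain discrete tokens. The abstract does not state that OccSora uses this
# exact scheme; all names and sizes here are hypothetical.

def quantize(features, codebook):
    """Map each feature vector to the index of its nearest codebook entry."""
    tokens = []
    for vec in features:
        # Squared Euclidean distance from this vector to every codebook entry.
        dists = [sum((a - b) ** 2 for a, b in zip(vec, code)) for code in codebook]
        tokens.append(dists.index(min(dists)))
    return tokens


def dequantize(tokens, codebook):
    """Replace each token index with its codebook vector (lossy reconstruction)."""
    return [codebook[t] for t in tokens]


# Toy codebook with three 2-D entries.
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
# Toy latent features, e.g. one vector per compressed space-time cell.
features = [[0.1, -0.1], [0.9, 0.2], [0.2, 0.8]]

tokens = quantize(features, codebook)
reconstruction = dequantize(tokens, codebook)
print(tokens)
```

In a setup like this, a generative model would operate on the compact token grid rather than on the raw occupancy volume, and the decoder side of the tokenizer would map generated tokens back to occupancy.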

