Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Zhiyuan Min

Advances in 4D Generation: A Survey

Mar 19, 2025

Qiaowei Miao, Kehan Li, Jinsheng Quan, Zhiyuan Min, Shaojie Ma, Yichao Xu, Yi Yang, Yawei Luo

Abstract:Generative artificial intelligence (AI) has made significant progress across various domains in recent years. Building on the rapid advancements in 2D, video, and 3D content generation fields, 4D generation has emerged as a novel and rapidly evolving research area, attracting growing attention. 4D generation focuses on creating dynamic 3D assets with spatiotemporal consistency based on user input, offering greater creative freedom and richer immersive experiences. This paper presents a comprehensive survey of the 4D generation field, systematically summarizing its core technologies, developmental trajectory, key challenges, and practical applications, while also exploring potential future research directions. The survey begins by introducing various fundamental 4D representation models, followed by a review of 4D generation frameworks built upon these representations and the key technologies that incorporate motion and geometry priors into 4D assets. We summarize five major challenges of 4D generation: consistency, controllability, diversity, efficiency, and fidelity, accompanied by an outline of existing solutions to address these issues. We systematically analyze applications of 4D generation, spanning dynamic object generation, scene generation, digital human synthesis, 4D editing, and autonomous driving. Finally, we provide an in-depth discussion of the obstacles currently hindering the development of the 4D generation. This survey offers a clear and comprehensive overview of 4D generation, aiming to stimulate further exploration and innovation in this rapidly evolving field. Our code is publicly available at: https://github.com/MiaoQiaowei/Awesome-4D.

Via

Access Paper or Ask Questions

Epipolar-Free 3D Gaussian Splatting for Generalizable Novel View Synthesis

Oct 31, 2024

Zhiyuan Min, Yawei Luo, Jianwen Sun, Yi Yang

Figure 1 for Epipolar-Free 3D Gaussian Splatting for Generalizable Novel View Synthesis

Figure 2 for Epipolar-Free 3D Gaussian Splatting for Generalizable Novel View Synthesis

Figure 3 for Epipolar-Free 3D Gaussian Splatting for Generalizable Novel View Synthesis

Figure 4 for Epipolar-Free 3D Gaussian Splatting for Generalizable Novel View Synthesis

Abstract:Generalizable 3D Gaussian splitting (3DGS) can reconstruct new scenes from sparse-view observations in a feed-forward inference manner, eliminating the need for scene-specific retraining required in conventional 3DGS. However, existing methods rely heavily on epipolar priors, which can be unreliable in complex realworld scenes, particularly in non-overlapping and occluded regions. In this paper, we propose eFreeSplat, an efficient feed-forward 3DGS-based model for generalizable novel view synthesis that operates independently of epipolar line constraints. To enhance multiview feature extraction with 3D perception, we employ a selfsupervised Vision Transformer (ViT) with cross-view completion pre-training on large-scale datasets. Additionally, we introduce an Iterative Cross-view Gaussians Alignment method to ensure consistent depth scales across different views. Our eFreeSplat represents an innovative approach for generalizable novel view synthesis. Different from the existing pure geometry-free methods, eFreeSplat focuses more on achieving epipolar-free feature matching and encoding by providing 3D priors through cross-view pretraining. We evaluate eFreeSplat on wide-baseline novel view synthesis tasks using the RealEstate10K and ACID datasets. Extensive experiments demonstrate that eFreeSplat surpasses state-of-the-art baselines that rely on epipolar priors, achieving superior geometry reconstruction and novel view synthesis quality. Project page: https://tatakai1.github.io/efreesplat/.

* NeurIPS 2024

Via

Access Paper or Ask Questions

Entangled View-Epipolar Information Aggregation for Generalizable Neural Radiance Fields

Nov 20, 2023

Zhiyuan Min, Yawei Luo, Wei Yang, Yuesong Wang, Yi Yang

Figure 1 for Entangled View-Epipolar Information Aggregation for Generalizable Neural Radiance Fields

Figure 2 for Entangled View-Epipolar Information Aggregation for Generalizable Neural Radiance Fields

Figure 3 for Entangled View-Epipolar Information Aggregation for Generalizable Neural Radiance Fields

Figure 4 for Entangled View-Epipolar Information Aggregation for Generalizable Neural Radiance Fields

Abstract:Generalizable NeRF can directly synthesize novel views across new scenes, eliminating the need for scene-specific retraining in vanilla NeRF. A critical enabling factor in these approaches is the extraction of a generalizable 3D representation by aggregating source-view features. In this paper, we propose an Entangled View-Epipolar Information Aggregation method dubbed EVE-NeRF. Different from existing methods that consider cross-view and along-epipolar information independently, EVE-NeRF conducts the view-epipolar feature aggregation in an entangled manner by injecting the scene-invariant appearance continuity and geometry consistency priors to the aggregation process. Our approach effectively mitigates the potential lack of inherent geometric and appearance constraint resulting from one-dimensional interactions, thus further boosting the 3D representation generalizablity. EVE-NeRF attains state-of-the-art performance across various evaluation scenarios. Extensive experiments demonstate that, compared to prevailing single-dimensional aggregation, the entangled network excels in the accuracy of 3D scene geometry and appearance reconstruction.Our project page is https://github.com/tatakai1/EVENeRF.

Via

Access Paper or Ask Questions