Abstract:Generative artificial intelligence (AI) has made significant progress across various domains in recent years. Building on the rapid advancements in 2D, video, and 3D content generation fields, 4D generation has emerged as a novel and rapidly evolving research area, attracting growing attention. 4D generation focuses on creating dynamic 3D assets with spatiotemporal consistency based on user input, offering greater creative freedom and richer immersive experiences. This paper presents a comprehensive survey of the 4D generation field, systematically summarizing its core technologies, developmental trajectory, key challenges, and practical applications, while also exploring potential future research directions. The survey begins by introducing various fundamental 4D representation models, followed by a review of 4D generation frameworks built upon these representations and the key technologies that incorporate motion and geometry priors into 4D assets. We summarize five major challenges of 4D generation: consistency, controllability, diversity, efficiency, and fidelity, accompanied by an outline of existing solutions to address these issues. We systematically analyze applications of 4D generation, spanning dynamic object generation, scene generation, digital human synthesis, 4D editing, and autonomous driving. Finally, we provide an in-depth discussion of the obstacles currently hindering the development of the 4D generation. This survey offers a clear and comprehensive overview of 4D generation, aiming to stimulate further exploration and innovation in this rapidly evolving field. Our code is publicly available at: https://github.com/MiaoQiaowei/Awesome-4D.
Abstract:The segmentation of transparent objects can be very useful in computer vision applications. However, because they borrow texture from their background and have a similar appearance to their surroundings, transparent objects are not handled well by regular image segmentation methods. We propose a method that overcomes these problems using the consistency and distortion properties of a light-field image. Graph-cut optimization is applied for the pixel labeling problem. The light-field linearity is used to estimate the likelihood of a pixel belonging to the transparent object or Lambertian background, and the occlusion detector is used to find the occlusion boundary. We acquire a light field dataset for the transparent object, and use this dataset to evaluate our method. The results demonstrate that the proposed method successfully segments transparent objects from the background.
Abstract:The light field camera is useful for computer graphics and vision applications. Calibration is an essential step for these applications. After calibration, we can rectify the captured image by using the calibrated camera parameters. However, the large camera array calibration method, which assumes that all cameras are on the same plane, ignores the orientation and intrinsic parameters. The multi-camera calibration technique usually assumes that the working volume and viewpoints are fixed. In this paper, we describe a calibration algorithm suitable for a mobile camera array based light field acquisition system. The algorithm performs in Zhang's style by moving a checkerboard, and computes the initial parameters in closed form. Global optimization is then applied to refine all the parameters simultaneously. Our implementation is rather flexible in that users can assign the number of viewpoints and refinement of intrinsic parameters is optional. Experiments on both simulated data and real data acquired by a commercial product show that our method yields good results. Digital refocusing application shows the calibrated light field can well focus to the target object we desired.