Abstract: Image composition is a complex task that requires substantial information about the scene for an accurate and realistic result, such as perspective, lighting, shadows, occlusions, and object interactions. Previous methods have predominantly relied on 2D information for image composition, neglecting the potential of 3D spatial information. In this work, we propose DepGAN, a Generative Adversarial Network that utilizes depth maps and alpha channels to rectify inaccurate occlusions and enhance transparency effects in image composition. Central to our network is a novel loss function, Depth Aware Loss, which quantifies the pixel-wise depth difference to accurately delineate occlusion boundaries when compositing objects at different depth levels. Furthermore, we enhance our network's learning process by utilizing opacity data, enabling it to effectively handle compositions involving transparent and semi-transparent objects. We evaluated our model against state-of-the-art image composition GANs on benchmark datasets (both real and synthetic). The results show that DepGAN significantly outperforms existing methods in terms of object placement semantics, transparency, and occlusion handling, both visually and quantitatively. Our code is available at https://amrtsg.github.io/DepGAN/.
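The abstract does not give the exact formulation of the Depth Aware Loss, so the following is only a minimal sketch of the idea of a pixel-wise depth-difference penalty for occlusion ordering. The tensor shapes, the binary thresholding of depths into an occlusion mask, the combination with an L1 term, and the function name `depth_aware_loss` are all assumptions for illustration, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

def depth_aware_loss(composite, fg_rgb, bg_rgb, fg_depth, bg_depth):
    """Illustrative pixel-wise depth-difference loss (not the paper's exact form).

    Where the foreground object is closer than the background
    (fg_depth < bg_depth), the composite should show the foreground;
    elsewhere it should show the background. The per-pixel depth
    difference weights the penalty, so pixels with a clear depth
    ordering contribute more strongly.
    Assumed shapes: RGB tensors (B, 3, H, W), depth maps (B, 1, H, W).
    """
    # Binary occlusion mask: 1 where the foreground occludes the background.
    occ_mask = (fg_depth < bg_depth).float()
    # Depth-ordered reference composite built from the mask.
    reference = occ_mask * fg_rgb + (1.0 - occ_mask) * bg_rgb
    # Pixel-wise depth difference used as a weighting term.
    weight = torch.abs(fg_depth - bg_depth)
    per_pixel = F.l1_loss(composite, reference, reduction="none")
    return (weight * per_pixel).mean()
```

In such a setup the loss would be added to the usual adversarial objective, encouraging the generator to respect depth ordering at occlusion boundaries while the discriminator handles overall realism.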
Abstract: Recent advancements in self-supervised learning in the point cloud domain have demonstrated significant potential. However, these methods often suffer from drawbacks such as lengthy pre-training time, the need for reconstruction in the input space, or reliance on additional modalities. To address these issues, we introduce Point-JEPA, a joint embedding predictive architecture designed specifically for point cloud data. To this end, we introduce a sequencer that orders point cloud tokens so that token proximity can be computed and utilized efficiently from their indices during target and context selection. The sequencer also allows the token-proximity computation to be shared between context and target selection, further improving efficiency. Experimentally, our method achieves results competitive with state-of-the-art methods while avoiding reconstruction in the input space and additional modalities.