Srinjay Sarkar

MonoPlace3D: Learning 3D-Aware Object Placement for 3D Monocular Detection

Apr 10, 2025

Rishubh Parihar, Srinjay Sarkar, Sarthak Vora, Jogendra Kundu, R. Venkatesh Babu

Abstract:Current monocular 3D detectors are held back by the limited diversity and scale of real-world datasets. While data augmentation certainly helps, it is particularly difficult to generate realistic scene-aware augmented data for outdoor settings. Most current approaches to synthetic data generation focus on realistic object appearance through improved rendering techniques. However, we show that where and how objects are positioned is just as crucial for training effective 3D monocular detectors. The key obstacle lies in automatically determining realistic object placement parameters (position, dimensions, and orientation) when introducing synthetic objects into real scenes. To address this, we introduce MonoPlace3D, a novel system that considers the 3D scene content to create realistic augmentations. Specifically, given a background scene, MonoPlace3D learns a distribution over plausible 3D bounding boxes. Subsequently, we render realistic objects and place them according to the locations sampled from the learned distribution. Our comprehensive evaluation on two standard datasets, KITTI and NuScenes, demonstrates that MonoPlace3D significantly improves the accuracy of multiple existing monocular 3D detectors while being highly data efficient.

* CVPR 2025 Camera Ready. Project page - https://rishubhpar.github.io/monoplace3D
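
The sampling step described in the abstract (draw plausible 3D boxes from a learned distribution, then place rendered objects there) can be illustrated with a minimal sketch. This is not the authors' model: the learned, scene-conditioned distribution is replaced here by a hand-written mixture of Gaussians, and `Box3D`, `PLACEMENT_MODES`, and `sample_boxes` are hypothetical names with made-up values chosen only for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Box3D:
    x: float       # lateral position in metres (camera frame assumed)
    y: float       # vertical position in metres
    z: float       # depth in metres
    length: float
    width: float
    height: float
    yaw: float     # heading angle in radians


# Hypothetical stand-in for a learned placement distribution:
# each mode is (weight, means, standard deviations) over the 7 box parameters.
PLACEMENT_MODES = [
    (0.7, (-2.0, 1.6, 15.0, 4.0, 1.7, 1.5, 0.0), (0.3, 0.05, 4.0, 0.3, 0.1, 0.1, 0.05)),
    (0.3, (3.0, 1.6, 25.0, 4.2, 1.8, 1.6, 3.14), (0.3, 0.05, 6.0, 0.3, 0.1, 0.1, 0.05)),
]


def sample_boxes(modes, n, rng):
    """Draw n boxes: pick a mode by weight, then sample each parameter."""
    weights = [weight for weight, _, _ in modes]
    boxes = []
    for _ in range(n):
        _, means, stds = rng.choices(modes, weights=weights, k=1)[0]
        values = [rng.gauss(m, s) for m, s in zip(means, stds)]
        # Keep the three box dimensions strictly positive.
        for i in (3, 4, 5):
            values[i] = max(values[i], 0.1)
        boxes.append(Box3D(*values))
    return boxes


boxes = sample_boxes(PLACEMENT_MODES, 5, random.Random(0))
```

In a real pipeline each sampled box would then be handed to a renderer that composites an object into the background image; that step is omitted here.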

Test-Time Augmentation for 3D Point Cloud Classification and Segmentation

Nov 22, 2023

Tuan-Anh Vu, Srinjay Sarkar, Zhiyuan Zhang, Binh-Son Hua, Sai-Kit Yeung

Abstract:Data augmentation is a powerful technique for enhancing the performance of deep learning models, but it has received less attention in 3D deep learning. It is well known that when 3D shapes are sparsely represented with low point density, the performance of downstream tasks drops significantly. This work explores test-time augmentation (TTA) for 3D point clouds. We are inspired by the recent revolution in learning implicit representations and point cloud upsampling, which can produce high-quality 3D surface reconstruction and proximity-to-surface, respectively. Our idea is to leverage implicit field reconstruction or point cloud upsampling techniques as a systematic way to augment point cloud data. Specifically, we test both strategies by sampling points from the reconstructed results and using the sampled point cloud as test-time augmented data. We show that both strategies are effective in improving accuracy. We observe that point cloud upsampling for test-time augmentation can lead to more significant performance improvement on downstream tasks such as object classification and segmentation on the ModelNet40, ShapeNet, ScanObjectNN, and SemanticKITTI datasets, especially for sparse point clouds.

* Accepted at 3DV 2024
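
The test-time augmentation idea in the abstract (densify the input cloud, then combine predictions on the original and the densified version) can be sketched as follows. This is only an illustration under stated assumptions: the learned upsampler is replaced by a naive nearest-neighbour midpoint rule, the classifier is a toy function, and `upsample_midpoints`, `tta_predict`, and `toy_classifier` are hypothetical names, not the paper's code.

```python
import math


def upsample_midpoints(points):
    """Naive stand-in for a learned upsampler: for every point, add the
    midpoint between it and its nearest neighbour (doubles the point count)."""
    dense = list(points)
    for i, p in enumerate(points):
        nearest = min(
            (q for j, q in enumerate(points) if j != i),
            key=lambda q: math.dist(p, q),
        )
        dense.append(tuple((a + b) / 2 for a, b in zip(p, nearest)))
    return dense


def tta_predict(classify, points, upsample=upsample_midpoints):
    """Average per-class scores over the original and the upsampled cloud."""
    views = [points, upsample(points)]
    scores = [classify(view) for view in views]
    n_classes = len(scores[0])
    return [sum(s[c] for s in scores) / len(scores) for c in range(n_classes)]


def toy_classifier(cloud):
    """Hypothetical two-class scorer: fraction of points above z = 0."""
    above = sum(1 for p in cloud if p[2] > 0) / len(cloud)
    return [above, 1.0 - above]


sparse_cloud = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, -1.0)]
averaged_scores = tta_predict(toy_classifier, sparse_cloud)
```

Averaging scores is one simple way to aggregate the views; the abstract does not specify the aggregation rule, so treat that choice as an assumption.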

CoRF : Colorizing Radiance Fields using Knowledge Distillation

Sep 14, 2023

Ankit Dhiman, R Srinath, Srinjay Sarkar, Lokesh R Boregowda, R Venkatesh Babu

Abstract:Neural radiance field (NeRF) based methods enable high-quality novel-view synthesis from multi-view images. This work presents a method for synthesizing colorized novel views from input grey-scale multi-view images. When we apply image- or video-based colorization methods to the generated grey-scale novel views, we observe artifacts due to inconsistency across views. Training a radiance field network on the colorized grey-scale image sequence also does not solve the 3D consistency issue. We propose a distillation-based method to transfer color knowledge from colorization networks trained on natural images to the radiance field network. Specifically, our method uses the radiance field network as a 3D representation and transfers knowledge from existing 2D colorization methods. The experimental results demonstrate that the proposed method produces better colorized novel views than the baselines for indoor and outdoor scenes while maintaining cross-view consistency. Further, we show the efficacy of our method on applications such as colorizing radiance field networks trained from (1) infra-red (IR) multi-view images and (2) old grey-scale multi-view image sequences.

* AI3DCC @ ICCV 2023

Clustering Plotted Data by Image Segmentation

Oct 06, 2021

Tarek Naous, Srinjay Sarkar, Abubakar Abid, James Zou

Abstract:Clustering algorithms are one of the main analytical methods to detect patterns in unlabeled data. Existing clustering methods typically treat samples in a dataset as points in a metric space and compute distances to group together similar points. In this paper, we present a wholly different way of clustering points in 2-dimensional space, inspired by how humans cluster data: by training neural networks to perform instance segmentation on plotted data. Our approach, Visual Clustering, has several advantages over traditional clustering algorithms: it is much faster than most existing clustering algorithms (making it suitable for very large datasets), it agrees strongly with human intuition for clusters, and it is by default hyperparameter free (although additional steps with hyperparameters can be introduced for more control of the algorithm). We describe the method and compare it to ten other clustering methods on synthetic data to illustrate its advantages and disadvantages. We then demonstrate how our approach can be extended to higher dimensional data and illustrate its performance on real-world data. The implementation of Visual Clustering is publicly available and can be applied to any dataset in a few lines of code.
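
The pipeline the abstract describes (plot the points, segment the plot, map segments back to points) can be sketched in a few lines. This is not the Visual Clustering implementation: the trained instance-segmentation network is replaced by plain 8-connected component labelling on a rasterized grid, and `rasterize`, `label_components`, and `visual_cluster_sketch` are hypothetical names used only for illustration.

```python
from collections import deque


def rasterize(points, size=32):
    """Plot 2D points onto a size x size boolean grid; return grid and cells."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    min_x, min_y = min(xs), min(ys)
    span_x = (max(xs) - min_x) or 1.0  # avoid division by zero
    span_y = (max(ys) - min_y) or 1.0
    grid = [[False] * size for _ in range(size)]
    cells = []
    for x, y in points:
        col = int((x - min_x) / span_x * (size - 1))
        row = int((y - min_y) / span_y * (size - 1))
        grid[row][col] = True
        cells.append((row, col))
    return grid, cells


def label_components(grid):
    """Stand-in for the segmentation network: 8-connected component labels."""
    size = len(grid)
    labels = [[-1] * size for _ in range(size)]
    next_label = 0
    for r in range(size):
        for c in range(size):
            if not grid[r][c] or labels[r][c] != -1:
                continue
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = cr + dr, cc + dc
                        if (0 <= nr < size and 0 <= nc < size
                                and grid[nr][nc] and labels[nr][nc] == -1):
                            labels[nr][nc] = next_label
                            queue.append((nr, nc))
            next_label += 1
    return labels


def visual_cluster_sketch(points, size=32):
    """Assign each point the label of the image region it was plotted into."""
    grid, cells = rasterize(points, size)
    labels = label_components(grid)
    return [labels[row][col] for row, col in cells]


points = [(0.0, 0.0), (0.1, 0.1), (0.05, 0.0), (10.0, 10.0), (10.1, 9.9)]
cluster_ids = visual_cluster_sketch(points)
```

Connected components only separate blobs that do not touch in the plot; a learned segmentation model, as in the paper, is what would let the approach handle overlapping or oddly shaped clusters.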
