Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Seema Kumari

FlowIID: Single-Step Intrinsic Image Decomposition via Latent Flow Matching

Jan 18, 2026

Mithlesh Singla, Seema Kumari, Shanmuganathan Raman

Abstract:Intrinsic Image Decomposition (IID) separates an image into albedo and shading components. It is a core step in many real-world applications, such as relighting and material editing. Existing IID models achieve good results, but often use a large number of parameters. This makes them costly to combine with other models in real-world settings. To address this problem, we propose a flow matching-based solution. For this, we design a novel architecture, FlowIID, based on latent flow matching. FlowIID combines a VAE-guided latent space with a flow matching module, enabling a stable decomposition of albedo and shading. FlowIID is not only parameter-efficient, but also produces results in a single inference step. Despite its compact design, FlowIID delivers competitive and superior results compared to existing models across various benchmarks. This makes it well-suited for deployment in resource-constrained and real-time vision applications.

Via

Access Paper or Ask Questions

PointResNet: Residual Network for 3D Point Cloud Segmentation and Classification

Nov 20, 2022

Aadesh Desai, Saagar Parikh, Seema Kumari, Shanmuganathan Raman

Abstract:Point cloud segmentation and classification are some of the primary tasks in 3D computer vision with applications ranging from augmented reality to robotics. However, processing point clouds using deep learning-based algorithms is quite challenging due to the irregular point formats. Voxelization or 3D grid-based representation are different ways of applying deep neural networks to this problem. In this paper, we propose PointResNet, a residual block-based approach. Our model directly processes the 3D points, using a deep neural network for the segmentation and classification tasks. The main components of the architecture are: 1) residual blocks and 2) multi-layered perceptron (MLP). We show that it preserves profound features and structural information, which are useful for segmentation and classification tasks. The experimental evaluations demonstrate that the proposed model produces the best results for segmentation and comparable results for classification in comparison to the conventional baselines.

* Paper Under Review at IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023

Via

Access Paper or Ask Questions