Danda Pani Paudel

GaussianVLM: Scene-centric 3D Vision-Language Models using Language-aligned Gaussian Splats for Embodied Reasoning and Beyond

Jul 01, 2025
Via arXiv

SceneSplat++: A Large Dataset and Comprehensive Benchmark for Language Gaussian Splatting

Jun 10, 2025
Via arXiv

Cross-View Multi-Modal Segmentation @ Ego-Exo4D Challenges 2025

Jun 06, 2025
Via arXiv

Domain-RAG: Retrieval-Guided Compositional Image Generation for Cross-Domain Few-Shot Object Detection

Jun 06, 2025
Via arXiv

StateSpaceDiffuser: Bringing Long Context to Diffusion World Models

May 28, 2025
Via arXiv

Manifold-aware Representation Learning for Degradation-agnostic Image Restoration

May 24, 2025
Via arXiv

MLLMs are Deeply Affected by Modality Bias

May 24, 2025
Via arXiv

Are Multimodal Large Language Models Ready for Omnidirectional Spatial Reasoning?

May 17, 2025
Via arXiv

Reducing Unimodal Bias in Multi-Modal Semantic Segmentation with Multi-Scale Functional Entropy Regularization

May 10, 2025
Via arXiv

Split Matching for Inductive Zero-shot Semantic Segmentation

May 08, 2025
Via arXiv