Kanchana Ranasinghe

Pixel Motion Diffusion is What We Need for Robot Control

Sep 26, 2025

Pixel Motion as Universal Representation for Robot Control

May 12, 2025

Test-Time Optimization for Domain Adaptive Open Vocabulary Segmentation

Jan 08, 2025

LatentCRF: Continuous CRF for Efficient Latent Diffusion

Dec 24, 2024

LLaRA: Supercharging Robot Learning Data for Vision-Language Policy

Jun 28, 2024

Too Many Frames, Not All Useful: Efficient Strategies for Long-Form Video QA

Jun 17, 2024

Learning to Localize Objects Improves Spatial Reasoning in Visual-LLMs

Apr 11, 2024

Understanding Long Videos in One Multimodal Language Model Pass

Mar 25, 2024

Language Repository for Long Video Understanding

Mar 21, 2024

Hierarchical Text-to-Vision Self Supervised Alignment for Improved Histopathology Representation Learning

Mar 21, 2024