Katerina Fragkiadaki

Unified Multimodal Discrete Diffusion

Mar 26, 2025
Via arXiv

Unifying 2D and 3D Vision-Language Understanding

Mar 13, 2025
Via arXiv

Video Depth without Video Models

Nov 28, 2024
Via arXiv

Deep Generative Models in Robotics: A Survey on Learning from Multimodal Demonstrations

Aug 08, 2024
Via arXiv

Video Diffusion Alignment via Reward Gradients

Jul 11, 2024
Via arXiv

ICAL: Continual Learning of Multimodal Agents by Transforming Trajectories into Actionable Insights

Jun 20, 2024
Via arXiv

DreamScene4D: Dynamic Multi-Object Scene Generation from Monocular Videos

May 03, 2024
Via arXiv

HELPER-X: A Unified Instructable Embodied Agent to Tackle Four Interactive Vision-Language Domains with Memory-Augmented Language Models

Apr 29, 2024
Via arXiv

Tractable Joint Prediction and Planning over Discrete Behavior Modes for Urban Driving

Mar 12, 2024
Via arXiv

3D Diffuser Actor: Policy Diffusion with 3D Scene Representations

Feb 16, 2024
Via arXiv