Jonathan Tremblay

VoLo: A Physical Orchestrator for Open-Vocabulary Long-Horizon Manipulation

Jun 05, 2026
Via arXiv

Effective Multi-sensor Conditioning for Street-view Novel-view Synthesis

Jun 01, 2026
Via arXiv

Why Far Looks Up: Probing Spatial Representation in Vision-Language Models

May 28, 2026
Via arXiv

RoboLab: A High-Fidelity Simulation Benchmark for Analysis of Task Generalist Policies

Apr 10, 2026
Via arXiv

MME-CoF-Pro: Evaluating Reasoning Coherence in Video Generative Models with Text and Visual Hints

Mar 20, 2026
Via arXiv

Tactile Modality Fusion for Vision-Language-Action Models

Mar 15, 2026
Via arXiv

DT-NVS: Diffusion Transformers for Novel View Synthesis

Nov 11, 2025
Via arXiv

3D-Generalist: Self-Improving Vision-Language-Action Models for Crafting 3D Worlds

Jul 09, 2025
Via arXiv

BOP Challenge 2024 on Model-Based and Model-Free 6D Object Pose Estimation

Apr 03, 2025
Via arXiv

RoboSpatial: Teaching Spatial Understanding to 2D and 3D Vision-Language Models for Robotics

Nov 25, 2024
Via arXiv