Alexandre Alahi

EPFL

Weak-for-Strong: Training Weak Meta-Agent to Harness Strong Executors

Apr 07, 2025

Boosting Omnidirectional Stereo Matching with a Pre-trained Depth Foundation Model

Mar 30, 2025

FG$^2$: Fine-Grained Cross-View Localization by Fine-Grained Feature Matching

Mar 24, 2025

Unified Human Localization and Trajectory Prediction with Monocular Vision

Mar 05, 2025

COARSE: Collaborative Pseudo-Labeling with Coarse Real Labels for Off-Road Semantic Segmentation

Mar 05, 2025

DRIVINGVQA: Analyzing Visual Chain-of-Thought Reasoning of Vision Language Models in Real-World Scenarios with Driving Theory Tests

Jan 08, 2025

Towards Generalizable Trajectory Prediction Using Dual-Level Representation Learning And Adaptive Prompting

Jan 08, 2025

Multi-Source Urban Traffic Flow Forecasting with Drone and Loop Detector Data

Jan 07, 2025

MotionMap: Representing Multimodality in Human Pose Forecasting

Dec 25, 2024

GEM: A Generalizable Ego-Vision Multimodal World Model for Fine-Grained Ego-Motion, Object Dynamics, and Scene Composition Control

Dec 15, 2024