Xianda Guo

InsightDrive: Insight Scene Representation for End-to-End Autonomous Driving

Mar 17, 2025

Bidirectional Prototype-Reward co-Evolution for Test-Time Adaptation of Vision-Language Models

Mar 12, 2025

Keeping Yourself is Important in Downstream Tuning Multimodal Large Language Model

Mar 06, 2025

WMNav: Integrating Vision-Language Models into World Models for Object Goal Navigation

Mar 04, 2025

Stereo Anything: Unifying Stereo Matching with Large-Scale Mixed Data

Nov 21, 2024

DriveMLLM: A Benchmark for Spatial Understanding with Multimodal Large Language Models in Autonomous Driving

Nov 20, 2024

AdvDiffuser: Generating Adversarial Safety-Critical Driving Scenarios via Guided Diffusion

Oct 11, 2024

LightStereo: Channel Boost Is All Your Need for Efficient 2D Cost Aggregation

Jun 28, 2024

Instruct Large Language Models to Drive like Humans

Jun 11, 2024

MaskFuser: Masked Fusion of Joint Multi-Modal Tokenization for End-to-End Autonomous Driving

May 13, 2024