Picture for Wenwen Tong

Wenwen Tong

How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites

Add code
Apr 29, 2024
Viaarxiv icon

DriveMLM: Aligning Multi-Modal Large Language Models with Behavioral Planning States for Autonomous Driving

Add code
Dec 25, 2023
Figure 1 for DriveMLM: Aligning Multi-Modal Large Language Models with Behavioral Planning States for Autonomous Driving
Figure 2 for DriveMLM: Aligning Multi-Modal Large Language Models with Behavioral Planning States for Autonomous Driving
Figure 3 for DriveMLM: Aligning Multi-Modal Large Language Models with Behavioral Planning States for Autonomous Driving
Figure 4 for DriveMLM: Aligning Multi-Modal Large Language Models with Behavioral Planning States for Autonomous Driving
Viaarxiv icon

Scene as Occupancy

Add code
Jun 06, 2023
Viaarxiv icon

3D Data Augmentation for Driving Scenes on Camera

Add code
Mar 18, 2023
Viaarxiv icon