Chunwei Wang

EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions

Sep 26, 2024
Via arXiv

UNIT: Unifying Image and Text Recognition in One Vision Encoder

Sep 06, 2024
Via arXiv

HiRes-LLaVA: Restoring Fragmentation Input in High-Resolution Large Vision-Language Models

Jul 11, 2024
Via arXiv

From Summary to Action: Enhancing Large Language Models for Complex Tasks with Open World APIs

Feb 28, 2024
Via arXiv

Reason2Drive: Towards Interpretable and Chain-based Reasoning for Autonomous Driving

Dec 06, 2023
Via arXiv

Gaining Wisdom from Setbacks: Aligning Large Language Models via Mistake Analysis

Oct 20, 2023
Via arXiv

PARTNER: Level up the Polar Representation for LiDAR 3D Object Detection

Aug 08, 2023
Via arXiv