
Zhaowei Li

Advancing Fine-Grained Visual Understanding with Multi-Scale Alignment in Multi-Modal Models

Nov 14, 2024

Understanding the Role of LLMs in Multimodal Evaluation Benchmarks

Oct 16, 2024

UnifiedMLLM: Enabling Unified Representation for Multi-modal Multi-tasks With Large Language Model

Aug 05, 2024

SpeechAlign: Aligning Speech Generation to Human Preferences

Apr 08, 2024

CMP: Cooperative Motion Prediction with Multi-Agent Communication

Mar 26, 2024

GroundingGPT: Language Enhanced Multi-modal Grounding Model

Jan 30, 2024

SpeechAgents: Human-Communication Simulation with Multi-Modal Multi-Agent Systems

Jan 08, 2024