Picture for Dongdong Chen

Dongdong Chen

Olympus: A Universal Task Router for Computer Vision Tasks

Add code
Dec 12, 2024
Viaarxiv icon

MageBench: Bridging Large Multimodal Models to Agents

Add code
Dec 05, 2024
Viaarxiv icon

Enhancing MMDiT-Based Text-to-Image Models for Similar Subject Generation

Add code
Nov 27, 2024
Viaarxiv icon

LLM2CLIP: Powerful Language Model Unlocks Richer Visual Representation

Add code
Nov 26, 2024
Figure 1 for LLM2CLIP: Powerful Language Model Unlocks Richer Visual Representation
Figure 2 for LLM2CLIP: Powerful Language Model Unlocks Richer Visual Representation
Figure 3 for LLM2CLIP: Powerful Language Model Unlocks Richer Visual Representation
Figure 4 for LLM2CLIP: Powerful Language Model Unlocks Richer Visual Representation
Viaarxiv icon

LLM2CLIP: Powerful Language Model Unlock Richer Visual Representation

Add code
Nov 07, 2024
Figure 1 for LLM2CLIP: Powerful Language Model Unlock Richer Visual Representation
Figure 2 for LLM2CLIP: Powerful Language Model Unlock Richer Visual Representation
Figure 3 for LLM2CLIP: Powerful Language Model Unlock Richer Visual Representation
Figure 4 for LLM2CLIP: Powerful Language Model Unlock Richer Visual Representation
Viaarxiv icon

ToolBridge: An Open-Source Dataset to Equip LLMs with External Tool Capabilities

Add code
Oct 08, 2024
Viaarxiv icon

SynChart: Synthesizing Charts from Language Models

Add code
Sep 25, 2024
Viaarxiv icon

Pluralistic Salient Object Detection

Add code
Sep 04, 2024
Viaarxiv icon

Chat2Layout: Interactive 3D Furniture Layout with a Multimodal LLM

Add code
Jul 31, 2024
Viaarxiv icon

Rethinking Visual Prompting for Multimodal Large Language Models with External Knowledge

Add code
Jul 05, 2024
Viaarxiv icon