Hanwang Zhang

Unified Generative and Discriminative Training for Multi-modal Large Language Models

Nov 01, 2024
Via arXiv

Enhancing Zero-Shot Vision Models by Label-Free Prompt Distribution Learning and Bias Correcting

Oct 25, 2024
Via arXiv

Few-shot NeRF by Adaptive Rendering Loss Regularization

Oct 23, 2024
Via arXiv

Towards Unified Multimodal Editing with Enhanced Knowledge Collaboration

Sep 30, 2024
Via arXiv

Instruction Tuning-free Visual Token Complement for Multimodal LLMs

Aug 09, 2024
Via arXiv

Selective Vision-Language Subspace Projection for Few-shot CLIP

Jul 26, 2024
Via arXiv

Visual Prompt Selection for In-Context Learning Segmentation

Jul 14, 2024
Via arXiv

ViD-GPT: Introducing GPT-style Autoregressive Generation in Video Diffusion Models

Jun 16, 2024
Via arXiv

EMMA: Your Text-to-Image Diffusion Model Can Secretly Accept Multi-Modal Prompts

Jun 13, 2024
Via arXiv

MVGamba: Unify 3D Content Generation as State Space Sequence Modeling

Jun 10, 2024
Via arXiv