Xiangyu Yue

Chimera: Improving Generalist Model with Domain-Specific Experts

Dec 08, 2024

AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand Audio-Visual Information?

Dec 03, 2024

Diffusion Autoencoders for Few-shot Image Generation in Hyperbolic Space

Nov 27, 2024

Enhancing Vision-Language Model Safety through Progressive Concept-Bottleneck-Driven Alignment

Nov 18, 2024

Vision Search Assistant: Empower Vision-Language Models as Multimodal Search Engines

Oct 28, 2024

BIFRÖST: 3D-Aware Image compositing with Language Instructions

Oct 24, 2024

Remember, Retrieve and Generate: Understanding Infinite Visual Concepts as Your Personalized Assistant

Oct 17, 2024

Customize Your Visual Autoregressive Recipe with Set Autoregressive Modeling

Oct 14, 2024

Scaling Up Your Kernels: Large Kernel Design in ConvNets towards Universal Representations

Oct 10, 2024

Fira: Can We Achieve Full-rank Training of LLMs Under Low-rank Constraint?

Oct 02, 2024