Yueze Wang

Emu3: Next-Token Prediction is All You Need

Sep 27, 2024
Via arXiv

OmniGen: Unified Image Generation

Sep 17, 2024
Via arXiv

DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception

Jul 11, 2024
Via arXiv

Unveiling Encoder-Free Vision-Language Models

Jun 17, 2024
Via arXiv

Seeing Clearly, Answering Incorrectly: A Multimodal Robustness Benchmark for Evaluating MLLMs on Leading Questions

Jun 15, 2024
Via arXiv

Efficient Multimodal Learning from Data-centric Perspective

Feb 18, 2024
Via arXiv

Universal Prompt Optimizer for Safe Text-to-Image Generation

Feb 16, 2024
Via arXiv

Generative Multimodal Models are In-Context Learners

Dec 20, 2023
Via arXiv

Generative Pretraining in Multimodality

Jul 11, 2023
Via arXiv

Fine-Grained Visual Prompting

Jun 07, 2023
Via arXiv