Keqin Chen

Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution

Sep 18, 2024

Qwen2 Technical Report

Jul 16, 2024

SPHINX: The Joint Mixing of Weights, Tasks, and Visual Embeddings for Multi-modal Large Language Models

Nov 13, 2023

Shikra: Unleashing Multimodal LLM's Referential Dialogue Magic

Jul 03, 2023