Picture for Zehuan Yuan

Zehuan Yuan

ChatSearch: a Dataset and a Generative Retrieval Model for General Conversational Image Retrieval

Add code
Oct 24, 2024
Figure 1 for ChatSearch: a Dataset and a Generative Retrieval Model for General Conversational Image Retrieval
Figure 2 for ChatSearch: a Dataset and a Generative Retrieval Model for General Conversational Image Retrieval
Figure 3 for ChatSearch: a Dataset and a Generative Retrieval Model for General Conversational Image Retrieval
Figure 4 for ChatSearch: a Dataset and a Generative Retrieval Model for General Conversational Image Retrieval
Viaarxiv icon

OmniTokenizer: A Joint Image-Video Tokenizer for Visual Generation

Add code
Jun 13, 2024
Figure 1 for OmniTokenizer: A Joint Image-Video Tokenizer for Visual Generation
Figure 2 for OmniTokenizer: A Joint Image-Video Tokenizer for Visual Generation
Figure 3 for OmniTokenizer: A Joint Image-Video Tokenizer for Visual Generation
Figure 4 for OmniTokenizer: A Joint Image-Video Tokenizer for Visual Generation
Viaarxiv icon

Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation

Add code
Jun 10, 2024
Viaarxiv icon

Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models

Add code
Apr 19, 2024
Viaarxiv icon

Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction

Add code
Apr 03, 2024
Viaarxiv icon

Generative Region-Language Pretraining for Open-Ended Object Detection

Add code
Mar 15, 2024
Viaarxiv icon

UniRef++: Segment Every Reference Object in Spatial and Temporal Spaces

Add code
Dec 25, 2023
Viaarxiv icon

General Object Foundation Model for Images and Videos at Scale

Add code
Dec 14, 2023
Viaarxiv icon

Recognize Any Regions

Add code
Nov 02, 2023
Viaarxiv icon

CoDet: Co-Occurrence Guided Region-Word Alignment for Open-Vocabulary Object Detection

Add code
Oct 25, 2023
Viaarxiv icon