Yufeng Cui

Emu3: Next-Token Prediction is All You Need

Sep 27, 2024

Unveiling Encoder-Free Vision-Language Models

Jun 17, 2024

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Feb 06, 2024

Generative Multimodal Models are In-Context Learners

Dec 20, 2023

CapsFusion: Rethinking Image-Text Data at Scale

Nov 02, 2023

Generative Pretraining in Multimodality

Jul 11, 2023

Fast-BEV: A Fast and Strong Bird's-Eye View Perception Baseline

Jan 29, 2023

Democratizing Contrastive Language-Image Pre-training: A CLIP Benchmark of Data, Model, and Supervision

Mar 11, 2022

Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm

Oct 11, 2021