Title: VAR-CLIP: Text-to-Image Generator with Visual Auto-Regressive Modeling

Aug 02, 2024

Qian Zhang, Xiangzi Dai, Ninghua Yang, Xiang An, Ziyong Feng, Xingyu Ren

Abstract: VAR is a new generation paradigm that employs 'next-scale prediction' as opposed to 'next-token prediction'. This change enables auto-regressive (AR) transformers to rapidly learn visual distributions and achieve robust generalization. However, the original VAR model is constrained to class-conditioned synthesis and cannot take textual captions as guidance. In this paper, we introduce VAR-CLIP, a novel text-to-image model that integrates Visual Auto-Regressive techniques with the capabilities of CLIP. The VAR-CLIP framework encodes captions into text embeddings, which are then used as textual conditions for image generation. To facilitate training on extensive datasets, such as ImageNet, we have constructed a substantial image-text dataset leveraging BLIP2. Furthermore, we examine the significance of word positioning within CLIP for the purpose of caption guidance. Extensive experiments confirm VAR-CLIP's proficiency in generating fantasy images with high fidelity, textual congruence, and aesthetic excellence. Our project page is https://github.com/daixiangzi/VAR-CLIP

* 10 pages total; code: https://github.com/daixiangzi/VAR-CLIP
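
The contrast the abstract draws between 'next-scale prediction' and 'next-token prediction' can be illustrated with a small step-counting sketch. It is not the authors' implementation: the scale schedule `SCALES` and all function names are hypothetical, chosen only to show that a next-scale model runs one autoregressive step per token map, while a raster-scan next-token model runs one step per token.

```python
# Illustrative sketch only: SCALES and every function name here are
# assumptions made for this example, not values from the paper or its code.

# Side lengths of the token maps, ordered coarse to fine (hypothetical).
SCALES = (1, 2, 4, 8, 16)


def next_token_steps(side: int) -> int:
    """Raster-scan AR: one step per token of the final side x side map."""
    return side * side


def next_scale_steps(scales) -> int:
    """Next-scale AR: one step per scale; a whole token map is emitted per step."""
    return len(scales)


def total_tokens(scales) -> int:
    """Total number of tokens produced across all scales."""
    return sum(s * s for s in scales)


def generation_plan(scales):
    """Return (side, tokens_emitted, coarser_tokens_seen) for each step.

    In a text-conditioned variant, every step would additionally be
    conditioned on the caption embedding; that part is omitted here.
    """
    plan = []
    seen = 0
    for side in scales:
        plan.append((side, side * side, seen))
        seen += side * side
    return plan


if __name__ == "__main__":
    for side, emitted, seen in generation_plan(SCALES):
        print(f"scale {side}x{side}: emit {emitted} tokens, given {seen} coarser tokens")
```

Under this assumed schedule, the next-scale model needs `len(SCALES)` steps where a raster-scan model over the finest map alone would need `16 * 16`, at the cost of producing extra coarse-scale tokens along the way.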
