Leigang Qu

SILMM: Self-Improving Large Multimodal Models for Compositional Text-to-Image Generation

Dec 08, 2024
Via arXiv

Revolutionizing Text-to-Image Retrieval as Autoregressive Token-to-Voken Generation

Jul 24, 2024
Via arXiv

Unified Text-to-Image Generation and Retrieval

Jun 09, 2024
Via arXiv

Video-Language Understanding: A Survey from Model Architecture, Model Training, and Data Perspectives

Jun 09, 2024
Via arXiv

Discriminative Probing and Tuning for Text-to-Image Generation

Mar 14, 2024
Via arXiv

Generative Cross-Modal Retrieval: Memorizing Images in Multimodal Language Models for Retrieval and Beyond

Feb 16, 2024
Via arXiv

NExT-GPT: Any-to-Any Multimodal LLM

Sep 13, 2023
Via arXiv

LayoutLLM-T2I: Eliciting Layout Guidance from LLM for Text-to-Image Generation

Aug 12, 2023
Via arXiv

Learnable Pillar-based Re-ranking for Image-Text Retrieval

Apr 25, 2023
Via arXiv

Composed Image Retrieval with Text Feedback via Multi-grained Uncertainty Regularization

Nov 14, 2022
Via arXiv