Leigang Qu

Automatic Pruning via Structured Lasso with Class-wise Information

Feb 13, 2025
Via arXiv

Generative Ghost: Investigating Ranking Bias Hidden in AI-Generated Videos

Feb 11, 2025
Via arXiv

SILMM: Self-Improving Large Multimodal Models for Compositional Text-to-Image Generation

Dec 08, 2024
Via arXiv

Revolutionizing Text-to-Image Retrieval as Autoregressive Token-to-Voken Generation

Jul 24, 2024
Via arXiv

Video-Language Understanding: A Survey from Model Architecture, Model Training, and Data Perspectives

Jun 09, 2024
Via arXiv

Unified Text-to-Image Generation and Retrieval

Jun 09, 2024
Via arXiv

Discriminative Probing and Tuning for Text-to-Image Generation

Mar 14, 2024
Via arXiv

Generative Cross-Modal Retrieval: Memorizing Images in Multimodal Language Models for Retrieval and Beyond

Feb 16, 2024
Via arXiv

NExT-GPT: Any-to-Any Multimodal LLM

Sep 13, 2023
Via arXiv

LayoutLLM-T2I: Eliciting Layout Guidance from LLM for Text-to-Image Generation

Aug 12, 2023
Via arXiv