Aoxue Li

GenArtist: Multimodal LLM as an Agent for Unified Image Generation and Editing

Jul 08, 2024, via arXiv

Towards Understanding the Working Mechanism of Text-to-Image Diffusion Model

May 24, 2024, via arXiv

Enhancing Text-to-Image Editing via Hybrid Mask-Informed Fusion

May 24, 2024, via arXiv

Efficient Transferability Assessment for Selection of Pre-trained Detectors

Mar 14, 2024, via arXiv

Open-Vocabulary Object Detection with Meta Prompt Representation and Instance Contrastive Optimization

Mar 14, 2024, via arXiv

Divide and Conquer: Language Models can Plan and Self-Correct for Compositional Text-to-Image Generation

Jan 30, 2024, via arXiv

CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects

Jan 18, 2024, via arXiv

Mixture of Cluster-conditional LoRA Experts for Vision-language Instruction Tuning

Dec 19, 2023, via arXiv

UniTR: A Unified and Efficient Multi-Modal Transformer for Bird's-Eye-View Representation

Aug 15, 2023, via arXiv

ContraNeRF: Generalizable Neural Radiance Fields for Synthetic-to-real Novel View Synthesis via Contrastive Learning

Mar 30, 2023, via arXiv