Huanzhang Dou

ChatDiT: A Training-Free Baseline for Task-Agnostic Free-Form Chatting with Diffusion Transformers

Dec 17, 2024

IDEA-Bench: How Far are Generative Models from Professional Designing?

Dec 16, 2024

In-Context LoRA for Diffusion Transformers

Oct 31, 2024

Group Diffusion Transformers are Unsupervised Multitask Learners

Oct 19, 2024

CLASH: Complementary Learning with Neural Architecture Search for Gait Recognition

Jul 04, 2024

GVDIFF: Grounded Text-to-Video Generation with Diffusion Models

Jul 02, 2024

ScanFormer: Referring Expression Comprehension by Iteratively Scanning

Jun 26, 2024

SemanticMIM: Marrying Masked Image Modeling with Semantics Compression for General Visual Representation

Jun 15, 2024

Language Adaptive Weight Generation for Multi-task Visual Grounding

Jun 06, 2023

GaitMPL: Gait Recognition with Memory-Augmented Progressive Learning

Jun 06, 2023