Dawei Leng

PlanGen: Towards Unified Layout Planning and Image Generation in Auto-Regressive Vision Language Models

Mar 13, 2025

NAMI: Efficient Image Generation via Progressive Rectified Flow Transformers

Mar 12, 2025

WISA: World Simulator Assistant for Physics-Aware Text-to-Video Generation

Mar 11, 2025

U-StyDiT: Ultra-high Quality Artistic Style Transfer Using Diffusion Transformers

Mar 11, 2025

Prompt as Knowledge Bank: Boost Vision-language model via Structural Representation for zero-shot medical detection

Feb 22, 2025

RelaCtrl: Relevance-Guided Efficient Control for Diffusion Transformers

Feb 21, 2025

HiCo: Hierarchical Controllable Diffusion Model for Layout-to-image Generation

Oct 18, 2024

Qihoo-T2X: An Efficiency-Focused Diffusion Transformer via Proxy Tokens for Text-to-Any-Task

Sep 06, 2024

IAA: Inner-Adaptor Architecture Empowers Frozen Large Language Model with Multimodal Capabilities

Aug 23, 2024

FancyVideo: Towards Dynamic and Consistent Video Generation via Cross-frame Textual Guidance

Aug 15, 2024