Yuhui Yin

HiCo: Hierarchical Controllable Diffusion Model for Layout-to-image Generation

Oct 18, 2024

Qihoo-T2X: An Efficiency-Focused Diffusion Transformer via Proxy Tokens for Text-to-Any-Task

Sep 06, 2024

IAA: Inner-Adaptor Architecture Empowers Frozen Large Language Model with Multimodal Capabilities

Aug 23, 2024

FancyVideo: Towards Dynamic and Consistent Video Generation via Cross-frame Textual Guidance

Aug 15, 2024

Bridge Diffusion Model: bridge non-English language-native text-to-image diffusion model with English communities

Sep 02, 2023