Qinglin Lu

Refer to the report for detailed contributions

HunyuanPortrait: Implicit Condition Control for Enhanced Portrait Animation

Mar 25, 2025
Via arXiv

FireEdit: Fine-grained Instruction-based Image Editing via Region-aware Vision Language Model

Mar 25, 2025
Via arXiv

HunyuanVideo: A Systematic Framework For Large Video Generative Models

Dec 03, 2024
Via arXiv

Sonic: Shifting Focus to Global Audio Perception in Portrait Animation

Nov 25, 2024
Via arXiv

Searching Priors Makes Text-to-Video Synthesis Better

Jun 05, 2024
Via arXiv

Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding

May 14, 2024
Via arXiv

LoRA-Composer: Leveraging Low-Rank Adaptation for Multi-Concept Customization in Training-Free Diffusion Models

Mar 18, 2024
Via arXiv

DialogGen: Multi-modal Interactive Dialogue System for Multi-turn Text-to-Image Generation

Mar 13, 2024
Via arXiv

Smooth Video Synthesis with Noise Constraints on Diffusion Models for One-shot Video Tuning

Nov 29, 2023
Via arXiv

Tencent AVS: A Holistic Ads Video Dataset for Multi-modal Scene Segmentation

Dec 09, 2022
Via arXiv