Daquan Zhou

Refer to the report for detailed contributions

MagicDistillation: Weak-to-Strong Video Distillation for Large-Scale Portrait Few-Step Synthesis

Mar 17, 2025

AR-1-to-3: Single Image to Consistent 3D Object Generation via Next-View Prediction

Mar 17, 2025

MagicInfinite: Generating Infinite Talking Videos with Your Words and Voice

Mar 07, 2025

Magic 1-For-1: Generating One Minute Video Clips within One Minute

Feb 11, 2025

SANA 1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer

Jan 30, 2025

Real-time One-Step Diffusion-based Expressive Portrait Videos Generation

Dec 18, 2024

MaskCLIP++: A Mask-Based CLIP Fine-tuning Framework for Open-Vocabulary Image Segmentation

Dec 16, 2024

HunyuanVideo: A Systematic Framework For Large Video Generative Models

Dec 03, 2024

LVD-2M: A Long-take Video Dataset with Temporally Dense Captions

Oct 14, 2024

Loong: Generating Minute-level Long Videos with Autoregressive Language Models

Oct 03, 2024