Zunnan Xu

Refer to the report for detailed contributions

HunyuanVideo: A Systematic Framework For Large Video Generative Models

Dec 03, 2024
Via arXiv

AToM: Aligning Text-to-Motion Model at Event-Level with GPT-4Vision Reward

Nov 27, 2024
Via arXiv

Alignment is All You Need: A Training-free Augmentation Strategy for Pose-guided Video Generation

Aug 29, 2024
Via arXiv

V3Det Challenge 2024 on Vast Vocabulary and Open Vocabulary Object Detection: Methods and Results

Jun 17, 2024
Via arXiv

REPARO: Compositional 3D Assets Generation with Differentiable 3D Layout Alignment

May 28, 2024
Via arXiv

Sparse-Tuning: Adapting Vision Transformers with Efficient Fine-tuning and Inference

May 23, 2024
Via arXiv

MambaTalk: Efficient Holistic Gesture Synthesis with Selective State Space Models

Mar 14, 2024
Via arXiv

BATON: Aligning Text-to-Audio Model with Human Preference Feedback

Feb 01, 2024
Via arXiv

Freetalker: Controllable Speech and Text-Driven Gesture Generation Based on Diffusion Models for Enhanced Speaker Naturalness

Jan 07, 2024
Via arXiv

Chain of Generation: Multi-Modal Gesture Synthesis via Cascaded Conditional Control

Dec 26, 2023
Via arXiv