Zunnan Xu

Alignment is All You Need: A Training-free Augmentation Strategy for Pose-guided Video Generation

Aug 29, 2024

V3Det Challenge 2024 on Vast Vocabulary and Open Vocabulary Object Detection: Methods and Results

Jun 17, 2024

REPARO: Compositional 3D Assets Generation with Differentiable 3D Layout Alignment

May 28, 2024

Sparse-Tuning: Adapting Vision Transformers with Efficient Fine-tuning and Inference

May 23, 2024

MambaTalk: Efficient Holistic Gesture Synthesis with Selective State Space Models

Mar 14, 2024

BATON: Aligning Text-to-Audio Model with Human Preference Feedback

Feb 01, 2024

Freetalker: Controllable Speech and Text-Driven Gesture Generation Based on Diffusion Models for Enhanced Speaker Naturalness

Jan 07, 2024

Chain of Generation: Multi-Modal Gesture Synthesis via Cascaded Conditional Control

Dec 26, 2023

Consistent123: One Image to Highly Consistent 3D Asset Using Case-Aware Diffusion Priors

Sep 29, 2023

Bridging Vision and Language Encoders: Parameter-Efficient Tuning for Referring Image Segmentation

Jul 21, 2023