Huan Liao

Metis: A Foundation Speech Generation Model with Masked Generative Pre-training

Feb 05, 2025
Via arXiv

Overview of the Amphion Toolkit (v0.2)

Jan 26, 2025
Via arXiv

AToM: Aligning Text-to-Motion Model at Event-Level with GPT-4Vision Reward

Nov 27, 2024
Via arXiv

Rhythmic Foley: A Framework For Seamless Audio-Visual Alignment In Video-to-Audio Synthesis

Sep 13, 2024
Via arXiv

REPARO: Compositional 3D Assets Generation with Differentiable 3D Layout Alignment

May 28, 2024
Via arXiv

BATON: Aligning Text-to-Audio Model with Human Preference Feedback

Feb 01, 2024
Via arXiv