Jinghui Xie

Text-to-Edit: Controllable End-to-End Video Ad Creation via Multimodal LLMs

Jan 10, 2025

LatentSync: Audio Conditioned Latent Diffusion Models for Lip Sync

Dec 12, 2024

Speech2Slot: An End-to-End Knowledge-based Slot Filling from Speech

May 10, 2021

Understanding Semantics from Speech Through Pre-training

Sep 24, 2019