
Title: Learning to Listen: Modeling Non-Deterministic Dyadic Facial Motion

Apr 18, 2022

Evonne Ng, Hanbyul Joo, Liwen Hu, Hao Li, Trevor Darrell, Angjoo Kanazawa, Shiry Ginosar


Abstract: We present a framework for modeling interactional communication in dyadic conversations: given multimodal inputs of a speaker, we autoregressively output multiple possibilities of corresponding listener motion. We combine the motion and speech audio of the speaker using a motion-audio cross attention transformer. Furthermore, we enable non-deterministic prediction by learning a discrete latent representation of realistic listener motion with a novel motion-encoding VQ-VAE. Our method organically captures the multimodal and non-deterministic nature of nonverbal dyadic interactions. Moreover, it produces realistic 3D listener facial motion synchronous with the speaker (see video). We demonstrate that our method outperforms baselines qualitatively and quantitatively via a rich suite of experiments. To facilitate this line of research, we introduce a novel and large in-the-wild dataset of dyadic conversations. Code, data, and videos are available at https://evonneng.github.io/learning2listen/.
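The discrete latent representation mentioned in the abstract rests on vector quantization, the core step of any VQ-VAE: each continuous encoder output is replaced by the index of its nearest entry in a learned codebook. The sketch below illustrates only that generic lookup in plain Python. It is not the authors' implementation; the function names, the toy codebook, and the example vectors are all hypothetical, and a real model would learn the codebook jointly with an encoder and decoder.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(frames: Sequence[Sequence[float]],
             codebook: Sequence[Sequence[float]]) -> List[int]:
    """Map each continuous vector to the index of its nearest codebook entry."""
    indices = []
    for frame in frames:
        distances = [squared_distance(frame, code) for code in codebook]
        indices.append(distances.index(min(distances)))
    return indices


def dequantize(indices: Sequence[int],
               codebook: Sequence[Sequence[float]]) -> List[List[float]]:
    """Look up the codebook vector for each discrete index."""
    return [list(codebook[i]) for i in indices]


# Hypothetical toy values: a 3-entry codebook of 2-D vectors and three
# "encoder outputs" standing in for encoded listener-motion segments.
codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
encoded = [[0.1, -0.2], [0.9, 1.2], [0.2, 1.7]]

indices = quantize(encoded, codebook)          # discrete tokens
reconstructed = dequantize(indices, codebook)  # quantized vectors
print(indices)
```

Because motion becomes a sequence of discrete indices, a predictor can output a probability distribution over codebook entries at each step and sample from it, which is one way to obtain multiple plausible listener responses for the same speaker input.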
