Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Title:JambaTalk: Speech-Driven 3D Talking Head Generation Based on Hybrid Transformer-Mamba Language Model

Aug 03, 2024

Farzaneh Jafari, Stefano Berretti, Anup Basu

Share this with someone who'll enjoy it:

Abstract:In recent years, talking head generation has become a focal point for researchers. Considerable effort is being made to refine lip-sync motion, capture expressive facial expressions, generate natural head poses, and achieve high video quality. However, no single model has yet achieved equivalence across all these metrics. This paper aims to animate a 3D face using Jamba, a hybrid Transformers-Mamba model. Mamba, a pioneering Structured State Space Model (SSM) architecture, was designed to address the constraints of the conventional Transformer architecture. Nevertheless, it has several drawbacks. Jamba merges the advantages of both Transformer and Mamba approaches, providing a holistic solution. Based on the foundational Jamba block, we present JambaTalk to enhance motion variety and speed through multimodal integration. Extensive experiments reveal that our method achieves performance comparable or superior to state-of-the-art models.

* 12 pages with 3 figures

View paper on

Share this with someone who'll enjoy it:

Title:JambaTalk: Speech-Driven 3D Talking Head Generation Based on Hybrid Transformer-Mamba Language Model

Paper and Code