Wenlin Zhuang

CSTalk: Correlation Supervised Speech-driven 3D Emotional Facial Animation Generation

Apr 29, 2024

Xiangyu Liang, Wenlin Zhuang, Tianyong Wang, Guangxing Geng, Guangyue Geng, Haifeng Xia, Siyu Xia


Abstract: Speech-driven 3D facial animation technology has been developed for years, but its practical application still falls short of expectations. The main challenges lie in data limitations, lip alignment, and the naturalness of facial expressions. Although lip alignment has been studied extensively, existing methods struggle to synthesize natural and realistic expressions, so the resulting facial animations look mechanical and stiff. Even though some work extracts emotional features from speech, the randomness of facial movements limits how effectively emotions are expressed. To address this issue, this paper proposes a method called CSTalk (Correlation Supervised) that models the correlations among different regions of facial movements and uses them to supervise the training of the generative model, so that it generates realistic expressions that conform to human facial motion patterns. To generate more intricate animations, we employ a rich set of control parameters based on the MetaHuman character model and capture a dataset covering five different emotions. We train a generative network with an autoencoder structure and feed it an emotion embedding vector to generate user-controllable expressions. Experimental results demonstrate that our method outperforms existing state-of-the-art methods.
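
The abstract does not say how the correlations are measured or how the supervision term is built. As a rough illustration only, the sketch below assumes Pearson correlation between per-frame control-parameter trajectories and penalizes the gap between the correlation matrix of a generated sequence and that of a reference sequence. The names `pearson`, `correlation_matrix`, and `correlation_loss` are hypothetical and are not taken from the paper.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def correlation_matrix(frames):
    """frames: list of per-frame parameter lists -> parameter-by-parameter correlations."""
    columns = list(zip(*frames))  # one trajectory per control parameter
    return [[pearson(a, b) for b in columns] for a in columns]

def correlation_loss(predicted, reference):
    """Assumed loss: mean squared difference between the two correlation matrices."""
    cp, cr = correlation_matrix(predicted), correlation_matrix(reference)
    k = len(cp)
    return sum((cp[i][j] - cr[i][j]) ** 2 for i in range(k) for j in range(k)) / (k * k)

# Toy data: two control parameters over three frames.
reference = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]   # parameters move together
mismatched = [[0.0, 2.0], [1.0, 1.0], [2.0, 0.0]]  # parameters move in opposition
print(correlation_loss(reference, reference))
print(correlation_loss(mismatched, reference))
```

Under this assumption, a generated sequence whose parameters co-vary like the reference incurs no penalty, while one whose facial regions move against the reference pattern is penalized; in training, such a term would be added to the usual reconstruction loss.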


Music2Dance: DanceNet for Music-driven Dance Generation

Mar 10, 2020

Wenlin Zhuang, Congyi Wang, Siyu Xia, Jinxiang Chai, Yangang Wang


Abstract: Synthesizing human motions from music, i.e., music to dance, is appealing and has attracted considerable research interest in recent years. It is challenging not only because dance requires realistic and complex human motions but, more importantly, because the synthesized motions must be consistent with the style, rhythm and melody of the music. In this paper, we propose a novel autoregressive generative model, DanceNet, which takes the style, rhythm and melody of music as control signals to generate 3D dance motions with high realism and diversity. To boost the performance of our proposed model, we capture several synchronized music-dance pairs performed by professional dancers and build a high-quality music-dance pair dataset. Experiments demonstrate that the proposed method achieves state-of-the-art results.

* Our results are shown at https://youtu.be/bTHSrfEHcG8
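
The abstract describes DanceNet only as an autoregressive model conditioned on music control signals. The sketch below shows the general shape of such a generation loop, not the DanceNet architecture: each new pose is computed from the previous pose and the music feature for that frame. The names `generate` and `toy_step` are hypothetical, and `toy_step` is a stand-in for a learned network.

```python
def generate(first_pose, music_features, step):
    """Autoregressive rollout: each pose depends on the previous pose and the current music feature."""
    poses = [first_pose]
    for feature in music_features:
        poses.append(step(poses[-1], feature))
    return poses

def toy_step(previous_pose, feature):
    """Placeholder for a learned model: shifts every joint value by the music feature."""
    return [value + feature for value in previous_pose]

# One-dimensional toy pose driven by three frames of a scalar music feature.
motion = generate([0.0], [1.0, 2.0, 3.0], toy_step)
print(motion)
```

Because each step consumes the model's own previous output, the music features act as control signals that steer the motion frame by frame.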
