Ruicong Zhi

PMMTalk: Speech-Driven 3D Facial Animation from Complementary Pseudo Multi-modal Features

Dec 05, 2023

Tianshun Han, Shengnan Gui, Yiqing Huang, Baihui Li, Lijian Liu, Benjia Zhou, Ning Jiang, Quan Lu, Ruicong Zhi, Yanyan Liang(+2 more)

Abstract: Speech-driven 3D facial animation has improved considerably in recent years, yet most related work relies on the acoustic modality alone and neglects visual and textual cues, leading to unsatisfactory precision and coherence. We argue that visual and textual cues carry non-trivial information. We therefore present a novel framework, PMMTalk, which uses complementary Pseudo Multi-Modal features to improve the accuracy of facial animation. The framework comprises three modules: the PMMTalk encoder, a cross-modal alignment module, and the PMMTalk decoder. Specifically, the PMMTalk encoder employs an off-the-shelf talking head generation architecture and speech recognition technology to extract visual and textual information from speech, respectively. The cross-modal alignment module then aligns the audio-image-text features at the temporal and semantic levels, and the PMMTalk decoder predicts lip-synced facial blendshape coefficients. In contrast to prior methods, PMMTalk requires only an additional random reference face image yet yields more accurate results. It is also artist-friendly: because it outputs facial blendshape coefficients, it integrates seamlessly into standard animation production workflows. Finally, given the scarcity of 3D talking face datasets, we introduce a large-scale 3D Chinese Audio-Visual Facial Animation (3D-CAVFA) dataset. Extensive experiments and user studies show that our approach outperforms the state of the art. We recommend watching the supplementary video.

Contrast Enhancement of Brightness-Distorted Images by Improved Adaptive Gamma Correction

Sep 13, 2017

Gang Cao, Lihui Huang, Huawei Tian, Xianglin Huang, Yongbin Wang, Ruicong Zhi

Abstract: As an efficient image contrast enhancement (CE) tool, adaptive gamma correction (AGC) was previously proposed by relating the gamma parameter to the cumulative distribution function (CDF) of the pixel gray levels within an image. AGC deals well with most dimmed images, but fails for globally bright images and for dimmed images with local bright regions. These two categories of brightness-distorted images are common in real scenarios, arising for example from improper exposure and white object regions. To mitigate these deficiencies, we propose an improved AGC algorithm. A novel negative-image strategy is used to realize CE of bright images, and gamma correction modulated by a truncated CDF is employed to enhance dimmed ones. As a result, local over-enhancement and structure distortion can be alleviated. Both qualitative and quantitative experimental results show that our proposed method yields consistently good CE results.
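
The two branches described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract gives no formulas, so the sketch assumes a per-level gamma of `1 - CDF`, a mean-brightness test to separate bright from dimmed images, and a fixed lower bound on gamma standing in for the "truncated CDF". The names `agc_lut`, `improved_agc`, `bright_threshold`, and `gamma_floor` are hypothetical, as are their default values.

```python
import numpy as np


def agc_lut(gray, gamma_floor=0.0):
    """Build a 256-entry lookup table for CDF-driven gamma correction.

    Assumption: gamma for gray level l is 1 - CDF(l), bounded below by
    gamma_floor. The floor is a stand-in for the "truncated CDF" idea.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    cdf = np.cumsum(hist) / hist.sum()
    gamma = np.maximum(gamma_floor, 1.0 - cdf)
    levels = np.arange(256) / 255.0
    lut = 255.0 * np.power(levels, gamma)
    lut[0] = 0.0  # keep black at black even when gamma is 0
    return np.round(lut).astype(np.uint8)


def improved_agc(gray, bright_threshold=0.5, gamma_floor=0.5):
    """Enhance an 8-bit grayscale image using one of two branches.

    Bright images (normalized mean above bright_threshold, an assumed
    criterion) are inverted, corrected, and inverted back. Dimmed images
    are corrected directly with the floored gamma.
    """
    gray = np.asarray(gray, dtype=np.uint8)
    if gray.mean() / 255.0 > bright_threshold:
        negative = 255 - gray
        enhanced = agc_lut(negative)[negative]
        return 255 - enhanced
    return agc_lut(gray, gamma_floor)[gray]


# Synthetic inputs: a dim ramp and its bright counterpart.
dim = np.tile(np.arange(0, 64, dtype=np.uint8), (4, 1))
bright = 255 - dim
dim_out = improved_agc(dim)
bright_out = improved_agc(bright)
```

Under these assumptions the dimmed branch never darkens a pixel, because every gamma lies in [0.5, 1], and the bright branch never brightens one, because the correction is applied to the negative image.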
