He Feng

One-Shot Pose-Driving Face Animation Platform

Jul 12, 2024

He Feng, Donglin Di, Yongjia Ma, Wei Chen, Tonghua Su

Abstract: The objective of face animation is to generate dynamic and expressive talking head videos from a single reference face, utilizing driving conditions derived from either video or audio inputs. Current approaches often require fine-tuning for specific identities and frequently fail to produce expressive videos due to the limited effectiveness of Wav2Pose modules. To facilitate the generation of one-shot and more consecutive talking head videos, we refine an existing Image2Video model by integrating a Face Locator and Motion Frame mechanism. We subsequently optimize the model using extensive human face video datasets, significantly enhancing its ability to produce high-quality and expressive talking head videos. Additionally, we develop a demo platform using the Gradio framework, which streamlines the process, enabling users to quickly create customized talking head videos.

DialogUSR: Complex Dialogue Utterance Splitting and Reformulation for Multiple Intent Detection

Oct 20, 2022

Haoran Meng, Zheng Xin, Tianyu Liu, Zizhen Wang, He Feng, Binghuai Lin, Xuemin Zhao, Yunbo Cao, Zhifang Sui

Abstract: While interacting with chatbots, users may express multiple intents in a single dialogue utterance. Instead of training a dedicated multi-intent detection model, we propose DialogUSR, a dialogue utterance splitting and reformulation task that first splits a multi-intent user query into several single-intent sub-queries and then recovers all the coreferred and omitted information in the sub-queries. DialogUSR can serve as a plug-in, domain-agnostic module that enables multi-intent detection for deployed chatbots with minimal effort. We collect a high-quality, naturally occurring dataset that covers 23 domains with a multi-step crowd-sourcing procedure. To benchmark the proposed dataset, we propose multiple action-based generative models that involve end-to-end and two-stage training, and conduct in-depth analyses of the pros and cons of the proposed baselines.

* Accepted to Findings of EMNLP 2022; the first three authors contributed equally
