Merel C. J. Scholman

Retrieving Semantics from the Deep: an RAG Solution for Gesture Synthesis

Dec 09, 2024

M. Hamza Mughal, Rishabh Dabral, Merel C. J. Scholman, Vera Demberg, Christian Theobalt

Abstract: Non-verbal communication often comprises semantically rich gestures that help convey the meaning of an utterance. Producing such semantic co-speech gestures has been a major challenge for existing neural systems, which can generate rhythmic beat gestures but struggle to produce semantically meaningful ones. Therefore, we present RAG-Gesture, a diffusion-based gesture generation approach that leverages Retrieval Augmented Generation (RAG) to produce natural-looking and semantically rich gestures. Our neuro-explicit gesture generation approach is designed to produce semantic gestures grounded in interpretable linguistic knowledge. We achieve this by using explicit domain knowledge to retrieve exemplar motions from a database of co-speech gestures. Once retrieved, these semantic exemplar gestures are injected into our diffusion-based gesture generation pipeline using DDIM inversion and retrieval guidance at inference time, without any need for training. Further, we propose a control paradigm for guidance that allows users to modulate the amount of influence each retrieval insertion has over the generated sequence. Our comparative evaluations demonstrate the validity of our approach against recent gesture generation approaches. The reader is urged to explore the results on our project page.

* Preprint. Project page: https://vcai.mpi-inf.mpg.de/projects/RAG-Gesture/

Design Choices for Crowdsourcing Implicit Discourse Relations: Revealing the Biases Introduced by Task Design

Apr 03, 2023

Valentina Pyatkin, Frances Yung, Merel C. J. Scholman, Reut Tsarfaty, Ido Dagan, Vera Demberg

Abstract: Disagreement in natural language annotation has mostly been studied from the perspective of biases introduced by the annotators and the annotation frameworks. Here, we propose to analyze another source of bias: task design bias, which has a particularly strong impact on crowdsourced linguistic annotations, where natural language is used to elicit the interpretation of lay annotators. For this purpose, we look at implicit discourse relation annotation, a task that has repeatedly been shown to be difficult due to the relations' ambiguity. We compare the annotations of 1,200 discourse relations obtained using two distinct annotation tasks and quantify the biases of both methods across four different domains. Both methods are natural language annotation tasks designed for crowdsourcing. We show that the task design can push annotators towards certain relations and that some discourse relation senses can be better elicited with one or the other annotation approach. We also conclude that this type of bias should be taken into account when training and testing models.

* Accepted to TACL, pre-MIT Press publication version
