João Paulo Carvalho

Dialogue Quality and Emotion Annotations for Customer Support Conversations

Nov 23, 2023

John Mendonça, Patrícia Pereira, Miguel Menezes, Vera Cabarrão, Ana C. Farinha, Helena Moniz, João Paulo Carvalho, Alon Lavie, Isabel Trancoso

Abstract: Task-oriented conversational datasets often lack topic variability and linguistic diversity. However, with the advent of Large Language Models (LLMs) pretrained on extensive, multilingual and diverse text data, these limitations seem overcome. Nevertheless, their generalisability to different languages and domains in dialogue applications remains uncertain without benchmarking datasets. This paper presents a holistic annotation approach for emotion and conversational quality in the context of bilingual customer support conversations. By performing annotations that take into consideration the complete instances that compose a conversation, one can form a broader perspective of the dialogue as a whole. Furthermore, it provides a unique and valuable resource for the development of text classification models. To this end, we present benchmarks for Emotion Recognition and Dialogue Quality Estimation and show that further research is needed to leverage these models in a production setting.

* Accepted at GEM (EMNLP Workshop)

Simple LLM Prompting is State-of-the-Art for Robust and Multilingual Dialogue Evaluation

Sep 08, 2023

John Mendonça, Patrícia Pereira, Helena Moniz, João Paulo Carvalho, Alon Lavie, Isabel Trancoso

Abstract: Despite significant research effort in the development of automatic dialogue evaluation metrics, little thought is given to evaluating dialogues other than in English. At the same time, ensuring metrics are invariant to semantically similar responses is also an overlooked topic. In order to achieve the desired properties of robustness and multilinguality for dialogue evaluation metrics, we propose a novel framework that takes advantage of the strengths of current evaluation models with the newly-established paradigm of prompting Large Language Models (LLMs). Empirical results show our framework achieves state-of-the-art results in terms of mean Spearman correlation scores across several benchmarks and ranks first on both the Robust and Multilingual tasks of the DSTC11 Track 4 "Automatic Evaluation Metrics for Open-Domain Dialogue Systems", proving the evaluation capabilities of prompted LLMs.

* DSTC11 best paper for Track 4
