Wenquan Wu

PLATO-K: Internal and External Knowledge Enhanced Dialogue Generation

Nov 02, 2022

Siqi Bao, Huang He, Jun Xu, Hua Lu, Fan Wang, Hua Wu, Han Zhou, Wenquan Wu, Zheng-Yu Niu, Haifeng Wang

Abstract: Recently, the practical deployment of open-domain dialogue systems has been plagued by the knowledge issue of information deficiency and factual inaccuracy. To this end, we introduce PLATO-K based on two-stage dialogic learning to strengthen internal knowledge memorization and external knowledge exploitation. In the first stage, PLATO-K learns through massive dialogue corpora and memorizes essential knowledge into model parameters. In the second stage, PLATO-K mimics human beings to search for external information and to leverage the knowledge in response generation. Extensive experiments reveal that the knowledge issue is alleviated significantly in PLATO-K with such comprehensive internal and external knowledge enhancement. Compared to the existing state-of-the-art Chinese dialogue model, the overall engagingness of PLATO-K is improved remarkably by 36.2% and 49.2% on chit-chat and knowledge-intensive conversations, respectively.

* First four authors contributed equally to this work

CDConv: A Benchmark for Contradiction Detection in Chinese Conversations

Oct 16, 2022

Chujie Zheng, Jinfeng Zhou, Yinhe Zheng, Libiao Peng, Zhen Guo, Wenquan Wu, Zhengyu Niu, Hua Wu, Minlie Huang

Abstract: Dialogue contradiction is a critical issue in open-domain dialogue systems. The contextual nature of conversations makes dialogue contradiction detection rather challenging. In this work, we propose a benchmark for Contradiction Detection in Chinese Conversations, namely CDConv. It contains 12K multi-turn conversations annotated with three typical contradiction categories: Intra-sentence Contradiction, Role Confusion, and History Contradiction. To efficiently construct the CDConv conversations, we devise a series of methods for automatic conversation generation, which simulate common user behaviors that trigger chatbots to make contradictions. We conduct careful manual quality screening of the constructed conversations and show that state-of-the-art Chinese chatbots can be easily goaded into making contradictions. Experiments on CDConv show that properly modeling contextual information is critical for dialogue contradiction detection, but there are still unresolved challenges that require future research.

* EMNLP 2022

SINC: Service Information Augmented Open-Domain Conversation

Jun 28, 2022

Han Zhou, Xinchao Xu, Wenquan Wu, Zhengyu Niu, Hua Wu, Siqi Bao, Fan Wang, Haifeng Wang

Abstract: Generative open-domain dialogue systems can benefit from external knowledge, but the lack of external knowledge resources and the difficulty in finding relevant knowledge limit the development of this technology. To this end, we propose a knowledge-driven dialogue task using dynamic service information. Specifically, we use a large number of service APIs that can provide high coverage and spatiotemporal sensitivity as external knowledge sources. The dialogue system generates queries to request external services along with user information, gets the relevant knowledge, and generates responses based on this knowledge. To implement this method, we collect and release the first open-domain Chinese service knowledge dialogue dataset, DuSinc. At the same time, we construct a baseline model, PLATO-SINC, which realizes the automatic utilization of service information for dialogue. Both automatic evaluation and human evaluation show that our proposed method can significantly improve the quality of open-domain conversation, and the session-level overall score in human evaluation is improved by 59.29% compared with the dialogue pre-training model PLATO-2. The dataset and benchmark model will be open sourced.

* 12 pages, 7 figures

Towards Multi-Turn Empathetic Dialogs with Positive Emotion Elicitation

Apr 22, 2022

Shihang Wang, Xinchao Xu, Wenquan Wu, Zheng-Yu Niu, Hua Wu, Haifeng Wang

Abstract: Emotional support is a crucial skill for many real-world scenarios, including caring for the elderly, mental health support, and customer service chats. This paper presents a novel task of empathetic dialog generation with positive emotion elicitation to promote users' positive emotions, similar to that of emotional support between humans. In this task, the agent produces empathetic responses while aiming to elicit the user's positive emotions in the multi-turn dialog. To facilitate the study of this task, we collect a large-scale emotional dialog dataset with positive emotion elicitation, called PosEmoDial (about 820K dialogs, 3M utterances). In these dialogs, the agent tries to guide the user from any possible initial emotional state, e.g., sadness, to a positive emotional state. Then we present a positive-emotion-guided dialog generation model with a novel loss function design. This loss function encourages the dialog model to not only elicit positive emotions from users but also ensure smooth emotional transitions across the whole dialog. Finally, we establish benchmark results on PosEmoDial, and we will release this dataset and related source code to facilitate future studies.

Long Time No See! Open-Domain Conversation with Long-Term Persona Memory

Mar 14, 2022

Xinchao Xu, Zhibin Gou, Wenquan Wu, Zheng-Yu Niu, Hua Wu, Haifeng Wang, Shihang Wang

Abstract: Most open-domain dialogue models tend to perform poorly in the setting of long-term human-bot conversations. A possible reason is that they lack the capability of understanding and memorizing long-term dialogue history information. To address this issue, we present a novel task of Long-term Memory Conversation (LeMon) and then build a new dialogue dataset, DuLeMon, and a dialogue generation framework with a Long-Term Memory (LTM) mechanism (called PLATO-LTM). This LTM mechanism enables our system to accurately extract and continuously update long-term persona memory without requiring multiple-session dialogue datasets for model training. To our knowledge, this is the first attempt to conduct real-time dynamic management of persona information of both parties, including the user and the bot. Results on DuLeMon indicate that PLATO-LTM can significantly outperform baselines in terms of long-term dialogue consistency, leading to better dialogue engagingness.

* Accepted by Findings of ACL 2022 (Camera-ready version)

PLATO-XL: Exploring the Large-scale Pre-training of Dialogue Generation

Sep 20, 2021

Siqi Bao, Huang He, Fan Wang, Hua Wu, Haifeng Wang, Wenquan Wu, Zhihua Wu, Zhen Guo, Hua Lu, Xinxian Huang (+4 more)

Abstract: To explore the limit of dialogue generation pre-training, we present the models of PLATO-XL with up to 11 billion parameters, trained on both Chinese and English social media conversations. To train such large models, we adopt the architecture of the unified transformer, which has high computation and parameter efficiency. In addition, we carry out multi-party aware pre-training to better distinguish the characteristic information in social media conversations. With such designs, PLATO-XL achieves superior performance compared to other approaches in both Chinese and English chitchat. We further explore the capacity of PLATO-XL on other conversational tasks, such as knowledge grounded dialogue and task-oriented conversation. The experimental results indicate that PLATO-XL obtains state-of-the-art results across multiple conversational tasks, verifying its potential as a foundation model of conversational AI.

* First four authors contributed equally to this work

A Unified Pre-training Framework for Conversational AI

May 27, 2021

Siqi Bao, Bingjin Chen, Huang He, Xin Tian, Han Zhou, Fan Wang, Hua Wu, Haifeng Wang, Wenquan Wu, Yingzhan Lin

Abstract: In this work, we explore the application of PLATO-2 on various dialogue systems, including open-domain conversation, knowledge grounded dialogue, and task-oriented conversation. PLATO-2 is initially designed as an open-domain chatbot, trained via two-stage curriculum learning. In the first stage, a coarse-grained response generation model is learned to fit the simplified one-to-one mapping relationship. This model is applied to the task-oriented conversation, given that the semantic mappings tend to be deterministic in task completion. In the second stage, another fine-grained generation model and an evaluation model are further learned for diverse response generation and coherence estimation, respectively. With superior capability in capturing one-to-many mapping, such models are suitable for the open-domain conversation and knowledge grounded dialogue. For the comprehensive evaluation of PLATO-2, we have participated in multiple tasks of DSTC9, including interactive evaluation of open-domain conversation (Track3-task2), static evaluation of knowledge grounded dialogue (Track3-task1), and end-to-end task-oriented conversation (Track2-task1). PLATO-2 has obtained 1st place in all three tasks, verifying its effectiveness as a unified framework for various dialogue systems.

* Presented at AAAI-21 DSTC9 Workshop. First five authors contributed equally to this work

PLATO-2: Towards Building an Open-Domain Chatbot via Curriculum Learning

Jul 13, 2020

Siqi Bao, Huang He, Fan Wang, Hua Wu, Haifeng Wang, Wenquan Wu, Zhen Guo, Zhibin Liu, Xinchao Xu

Abstract: To build a high-quality open-domain chatbot, we introduce the effective training process of PLATO-2 via curriculum learning. There are two stages involved in the learning process. In the first stage, a coarse-grained generation model is trained to learn response generation under the simplified framework of one-to-one mapping. In the second stage, a fine-grained generation model and an evaluation model are further trained to learn diverse response generation and response coherence estimation, respectively. PLATO-2 was trained on both Chinese and English data, and its effectiveness and superiority are verified through comprehensive evaluations, achieving new state-of-the-art results.

* First four authors contributed equally to this work

Proactive Human-Machine Conversation with Explicit Conversation Goals

Jun 13, 2019

Wenquan Wu, Zhen Guo, Xiangyang Zhou, Hua Wu, Xiyuan Zhang, Rongzhong Lian, Haifeng Wang

Abstract: Though great progress has been made in human-machine conversation, current dialogue systems are still in their infancy: they usually converse passively and utter words more as a matter of response, rather than on their own initiative. In this paper, we take a radical step towards building a human-like conversational agent: endowing it with the ability of proactively leading the conversation (introducing a new topic or maintaining the current topic). To facilitate the development of such conversation systems, we create a new dataset named DuConv, where one participant acts as the conversation leader and the other acts as the follower. The leader is provided with a knowledge graph and asked to sequentially change the discussion topics, following the given conversation goal, and meanwhile keep the dialogue as natural and engaging as possible. DuConv enables a very challenging task as the model needs to both understand dialogue and plan over the given knowledge graph. We establish baseline results on this dataset (about 270K utterances and 30K dialogues) using several state-of-the-art models. Experimental results show that dialogue models that plan over the knowledge graph can make full use of related knowledge to generate more diverse multi-turn conversations. The baseline systems along with the dataset are publicly available.

* Accepted by ACL 2019
