Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Seonghan Ryu

Ensemble-Based Deep Reinforcement Learning for Chatbots

Aug 27, 2019

Heriberto Cuayáhuitl, Donghyeon Lee, Seonghan Ryu, Yongjin Cho, Sungja Choi, Satish Indurthi, Seunghak Yu, Hyungtak Choi, Inchul Hwang, Jihie Kim

Figure 1 for Ensemble-Based Deep Reinforcement Learning for Chatbots

Figure 2 for Ensemble-Based Deep Reinforcement Learning for Chatbots

Figure 3 for Ensemble-Based Deep Reinforcement Learning for Chatbots

Figure 4 for Ensemble-Based Deep Reinforcement Learning for Chatbots

Abstract:Trainable chatbots that exhibit fluent and human-like conversations remain a big challenge in artificial intelligence. Deep Reinforcement Learning (DRL) is promising for addressing this challenge, but its successful application remains an open question. This article describes a novel ensemble-based approach applied to value-based DRL chatbots, which use finite action sets as a form of meaning representation. In our approach, while dialogue actions are derived from sentence clustering, the training datasets in our ensemble are derived from dialogue clustering. The latter aim to induce specialised agents that learn to interact in a particular style. In order to facilitate neural chatbot training using our proposed approach, we assume dialogue data in raw text only -- without any manually-labelled data. Experimental results using chitchat data reveal that (1) near human-like dialogue policies can be induced, (2) generalisation to unseen data is a difficult problem, and (3) training an ensemble of chatbot agents is essential for improved performance over using a single agent. In addition to evaluations using held-out data, our results are further supported by a human evaluation that rated dialogues in terms of fluency, engagingness and consistency -- which revealed that our proposed dialogue rewards strongly correlate with human judgements.

* arXiv admin note: text overlap with arXiv:1908.10331

Via

Access Paper or Ask Questions

Deep Reinforcement Learning for Chatbots Using Clustered Actions and Human-Likeness Rewards

Aug 27, 2019

Heriberto Cuayáhuitl, Donghyeon Lee, Seonghan Ryu, Sungja Choi, Inchul Hwang, Jihie Kim

Figure 1 for Deep Reinforcement Learning for Chatbots Using Clustered Actions and Human-Likeness Rewards

Figure 2 for Deep Reinforcement Learning for Chatbots Using Clustered Actions and Human-Likeness Rewards

Figure 3 for Deep Reinforcement Learning for Chatbots Using Clustered Actions and Human-Likeness Rewards

Figure 4 for Deep Reinforcement Learning for Chatbots Using Clustered Actions and Human-Likeness Rewards

Abstract:Training chatbots using the reinforcement learning paradigm is challenging due to high-dimensional states, infinite action spaces and the difficulty in specifying the reward function. We address such problems using clustered actions instead of infinite actions, and a simple but promising reward function based on human-likeness scores derived from human-human dialogue data. We train Deep Reinforcement Learning (DRL) agents using chitchat data in raw text---without any manual annotations. Experimental results using different splits of training data report the following. First, that our agents learn reasonable policies in the environments they get familiarised with, but their performance drops substantially when they are exposed to a test set of unseen dialogues. Second, that the choice of sentence embedding size between 100 and 300 dimensions is not significantly different on test data. Third, that our proposed human-likeness rewards are reasonable for training chatbots as long as they use lengthy dialogue histories of >=10 sentences.

* In International Joint Conference of Neural Networks (IJCNN), 2019

Via

Access Paper or Ask Questions

A Study on Dialogue Reward Prediction for Open-Ended Conversational Agents

Dec 02, 2018

Heriberto Cuayáhuitl, Seonghan Ryu, Donghyeon Lee, Jihie Kim

Figure 1 for A Study on Dialogue Reward Prediction for Open-Ended Conversational Agents

Figure 2 for A Study on Dialogue Reward Prediction for Open-Ended Conversational Agents

Figure 3 for A Study on Dialogue Reward Prediction for Open-Ended Conversational Agents

Figure 4 for A Study on Dialogue Reward Prediction for Open-Ended Conversational Agents

Abstract:The amount of dialogue history to include in a conversational agent is often underestimated and/or set in an empirical and thus possibly naive way. This suggests that principled investigations into optimal context windows are urgently needed given that the amount of dialogue history and corresponding representations can play an important role in the overall performance of a conversational system. This paper studies the amount of history required by conversational agents for reliably predicting dialogue rewards. The task of dialogue reward prediction is chosen for investigating the effects of varying amounts of dialogue history and their impact on system performance. Experimental results using a dataset of 18K human-human dialogues report that lengthy dialogue histories of at least 10 sentences are preferred (25 sentences being the best in our experiments) over short ones, and that lengthy histories are useful for training dialogue reward predictors with strong positive correlations between target dialogue rewards and predicted ones.

* In NeurIPS Workshop on Conversational AI: "Today's Practice and Tomorrow's Potential", December 2018

Via

Access Paper or Ask Questions

Neural Sentence Embedding using Only In-domain Sentences for Out-of-domain Sentence Detection in Dialog Systems

Jul 27, 2018

Seonghan Ryu, Seokhwan Kim, Junhwi Choi, Hwanjo Yu, Gary Geunbae Lee

Figure 1 for Neural Sentence Embedding using Only In-domain Sentences for Out-of-domain Sentence Detection in Dialog Systems

Figure 2 for Neural Sentence Embedding using Only In-domain Sentences for Out-of-domain Sentence Detection in Dialog Systems

Figure 3 for Neural Sentence Embedding using Only In-domain Sentences for Out-of-domain Sentence Detection in Dialog Systems

Figure 4 for Neural Sentence Embedding using Only In-domain Sentences for Out-of-domain Sentence Detection in Dialog Systems

Abstract:To ensure satisfactory user experience, dialog systems must be able to determine whether an input sentence is in-domain (ID) or out-of-domain (OOD). We assume that only ID sentences are available as training data because collecting enough OOD sentences in an unbiased way is a laborious and time-consuming job. This paper proposes a novel neural sentence embedding method that represents sentences in a low-dimensional continuous vector space that emphasizes aspects that distinguish ID cases from OOD cases. We first used a large set of unlabeled text to pre-train word representations that are used to initialize neural sentence embedding. Then we used domain-category analysis as an auxiliary task to train neural sentence embedding for OOD sentence detection. After the sentence representations were learned, we used them to train an autoencoder aimed at OOD sentence detection. We evaluated our method by experimentally comparing it to the state-of-the-art methods in an eight-domain dialog system; our proposed method achieved the highest accuracy in all tests.

* Published in Pattern Recognition Letters, 88:26-32, 2017

Via

Access Paper or Ask Questions