Title: Improving a sequence-to-sequence NLP model using a reinforcement learning policy algorithm

Dec 28, 2022

Jabri Ismail, Aboulbichr Ahmed, El ouaazizi Aziza

Abstract: Current neural network models of dialogue generation (chatbots) show great promise for generating answers for conversational agents. But they are short-sighted: they predict utterances one at a time and disregard the impact of each utterance on future outcomes. Modelling a dialogue's future direction is critical for generating coherent, interesting dialogues, a need that has led traditional NLP dialogue models to rely on reinforcement learning. In this article, we explain how to combine these objectives by using deep reinforcement learning to predict future rewards in chatbot dialogue. The model simulates conversations between two virtual agents, with policy gradient methods used to reward sequences that exhibit three useful conversational characteristics: the flow of informality, coherence, and simplicity of response (related to forward-looking function). We assess our model based on its diversity, length, and complexity with regard to humans. In dialogue simulation, evaluations demonstrated that the proposed model generates more interactive responses and encourages a more sustained, successful conversation. This work marks a preliminary step toward developing a neural conversational model based on the long-term success of dialogues.
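As a rough illustration of the policy gradient idea described in the abstract, the sketch below trains a toy softmax policy with a REINFORCE-style update. It is not the authors' implementation: the candidate replies, the per-characteristic scores, the weights, and every function name are hypothetical, and a single-turn choice among fixed replies stands in for the simulated two-agent dialogue and the sequence-to-sequence model.

```python
import math
import random

# Hypothetical candidate replies with hand-assigned scores in [0, 1] for the
# three characteristics named in the abstract. All numbers are illustrative.
CANDIDATES = {
    "i don't know": {"flow": 0.1, "coherence": 0.4, "simplicity": 0.1},
    "what do you do for work?": {"flow": 0.8, "coherence": 0.7, "simplicity": 0.9},
    "yes": {"flow": 0.2, "coherence": 0.5, "simplicity": 0.2},
}

# Assumed mixing weights; the abstract does not state how rewards are combined.
WEIGHTS = {"flow": 0.25, "coherence": 0.5, "simplicity": 0.25}


def reward(scores):
    """Weighted sum of the three per-characteristic scores."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train(steps=3000, lr=0.5, seed=0):
    """REINFORCE with a running-average baseline over a fixed reply set."""
    rng = random.Random(seed)
    replies = list(CANDIDATES)
    logits = [0.0] * len(replies)
    baseline = 0.0
    for _ in range(steps):
        probs = softmax(logits)
        # Sample a reply from the current policy and score it.
        i = rng.choices(range(len(replies)), weights=probs)[0]
        r = reward(CANDIDATES[replies[i]])
        advantage = r - baseline
        baseline += 0.05 * (r - baseline)
        # d log pi(i) / d logit_j = 1[j == i] - probs[j]
        for j in range(len(logits)):
            indicator = 1.0 if j == i else 0.0
            logits[j] += lr * advantage * (indicator - probs[j])
    return dict(zip(replies, softmax(logits)))


if __name__ == "__main__":
    for reply, prob in train().items():
        print(f"{prob:.3f}  {reply}")
```

In this toy setting the policy shifts probability toward the reply with the highest combined reward; a full system would instead score whole simulated conversations and backpropagate through the generator.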

* CS & IT - CSCP 2022, pp. 221-231, 2022
* Published in the Proceedings of the 12th International Conference on Artificial Intelligence, Soft Computing and Applications (AIAA 2022), December 22-24, 2022, Sydney, Australia. Volume Editors: David C. Wyld, Dhinaharan Nagamalai (Eds). ISBN: 978-1-925953-83-1
