Yuta Tsuboi

DQN-TAMER: Human-in-the-Loop Reinforcement Learning with Intractable Feedback

Oct 28, 2018

Riku Arakawa, Sosuke Kobayashi, Yuya Unno, Yuta Tsuboi, Shin-ichi Maeda

Abstract: Exploration has been one of the greatest challenges in reinforcement learning (RL), which is a large obstacle in the application of RL to robotics. Even with state-of-the-art RL algorithms, building a well-learned agent often requires too many trials, mainly due to the difficulty of matching its actions with rewards in the distant future. A remedy for this is to train an agent with real-time feedback from a human observer who immediately gives rewards for some actions. This study tackles a series of challenges for introducing such a human-in-the-loop RL scheme. The first contribution of this work is our experiments with a precisely modeled human observer: binary, delay, stochasticity, unsustainability, and natural reaction. We also propose an RL method called DQN-TAMER, which efficiently uses both human feedback and distant rewards. We find that DQN-TAMER agents outperform their baselines in Maze and Taxi simulated environments. Furthermore, we demonstrate a real-world human-in-the-loop RL application where a camera automatically recognizes a user's facial expressions as feedback to the agent while the agent explores a maze.

Addressee and Response Selection for Multilingual Conversation

Aug 12, 2018

Motoki Sato, Hiroki Ouchi, Yuta Tsuboi

Abstract: Developing conversational systems that can converse in many languages is an interesting challenge for natural language processing. In this paper, we introduce multilingual addressee and response selection. In this task, a conversational system predicts an appropriate addressee and response for an input message in multiple languages. A key to developing such multilingual responding systems is how to utilize high-resource language data to compensate for low-resource language data. We present several knowledge transfer methods for conversational systems. To evaluate our methods, we create a new multilingual conversation dataset. Experiments on the dataset demonstrate the effectiveness of our methods.

* COLING 2018

Interactively Picking Real-World Objects with Unconstrained Spoken Language Instructions

Mar 28, 2018

Jun Hatori, Yuta Kikuchi, Sosuke Kobayashi, Kuniyuki Takahashi, Yuta Tsuboi, Yuya Unno, Wilson Ko, Jethro Tan

Abstract: Comprehension of spoken natural language is an essential component for robots to communicate with humans effectively. However, handling unconstrained spoken instructions is challenging due to (1) complex structures including a wide variety of expressions used in spoken language and (2) inherent ambiguity in interpretation of human instructions. In this paper, we propose the first comprehensive system that can handle unconstrained spoken language and is able to effectively resolve ambiguity in spoken instructions. Specifically, we integrate deep-learning-based object detection together with natural language processing technologies to handle unconstrained spoken instructions, and propose a method for robots to resolve instruction ambiguity through dialogue. Through our experiments on both a simulated environment and a physical industrial robot arm, we demonstrate the ability of our system to understand natural instructions from human operators effectively, and how higher success rates of the object picking task can be achieved through an interactive clarification process.

* 9 pages. International Conference on Robotics and Automation (ICRA) 2018. Accompanying videos are available at the following links: https://youtu.be/_Uyv1XIUqhk (the system submitted to ICRA-2018) and http://youtu.be/DGJazkyw0Ws (with improvements after ICRA-2018 submission)
