Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yuya Unno

DQN-TAMER: Human-in-the-Loop Reinforcement Learning with Intractable Feedback

Oct 28, 2018

Riku Arakawa, Sosuke Kobayashi, Yuya Unno, Yuta Tsuboi, Shin-ichi Maeda

Figure 1 for DQN-TAMER: Human-in-the-Loop Reinforcement Learning with Intractable Feedback

Figure 2 for DQN-TAMER: Human-in-the-Loop Reinforcement Learning with Intractable Feedback

Figure 3 for DQN-TAMER: Human-in-the-Loop Reinforcement Learning with Intractable Feedback

Figure 4 for DQN-TAMER: Human-in-the-Loop Reinforcement Learning with Intractable Feedback

Abstract:Exploration has been one of the greatest challenges in reinforcement learning (RL), which is a large obstacle in the application of RL to robotics. Even with state-of-the-art RL algorithms, building a well-learned agent often requires too many trials, mainly due to the difficulty of matching its actions with rewards in the distant future. A remedy for this is to train an agent with real-time feedback from a human observer who immediately gives rewards for some actions. This study tackles a series of challenges for introducing such a human-in-the-loop RL scheme. The first contribution of this work is our experiments with a precisely modeled human observer: binary, delay, stochasticity, unsustainability, and natural reaction. We also propose an RL method called DQN-TAMER, which efficiently uses both human feedback and distant rewards. We find that DQN-TAMER agents outperform their baselines in Maze and Taxi simulated environments. Furthermore, we demonstrate a real-world human-in-the-loop RL application where a camera automatically recognizes a user's facial expressions as feedback to the agent while the agent explores a maze.

Via

Access Paper or Ask Questions

ESPnet: End-to-End Speech Processing Toolkit

Mar 30, 2018

Shinji Watanabe, Takaaki Hori, Shigeki Karita, Tomoki Hayashi, Jiro Nishitoba, Yuya Unno, Nelson Enrique Yalta Soplin, Jahn Heymann, Matthew Wiesner, Nanxin Chen(+2 more)

Figure 1 for ESPnet: End-to-End Speech Processing Toolkit

Figure 2 for ESPnet: End-to-End Speech Processing Toolkit

Figure 3 for ESPnet: End-to-End Speech Processing Toolkit

Figure 4 for ESPnet: End-to-End Speech Processing Toolkit

Abstract:This paper introduces a new open source platform for end-to-end speech processing named ESPnet. ESPnet mainly focuses on end-to-end automatic speech recognition (ASR), and adopts widely-used dynamic neural network toolkits, Chainer and PyTorch, as a main deep learning engine. ESPnet also follows the Kaldi ASR toolkit style for data processing, feature extraction/format, and recipes to provide a complete setup for speech recognition and other speech processing experiments. This paper explains a major architecture of this software platform, several important functionalities, which differentiate ESPnet from other open source ASR toolkits, and experimental results with major ASR benchmarks.

Via

Access Paper or Ask Questions

Interactively Picking Real-World Objects with Unconstrained Spoken Language Instructions

Mar 28, 2018

Jun Hatori, Yuta Kikuchi, Sosuke Kobayashi, Kuniyuki Takahashi, Yuta Tsuboi, Yuya Unno, Wilson Ko, Jethro Tan

Figure 1 for Interactively Picking Real-World Objects with Unconstrained Spoken Language Instructions

Figure 2 for Interactively Picking Real-World Objects with Unconstrained Spoken Language Instructions

Figure 3 for Interactively Picking Real-World Objects with Unconstrained Spoken Language Instructions

Figure 4 for Interactively Picking Real-World Objects with Unconstrained Spoken Language Instructions

Abstract:Comprehension of spoken natural language is an essential component for robots to communicate with human effectively. However, handling unconstrained spoken instructions is challenging due to (1) complex structures including a wide variety of expressions used in spoken language and (2) inherent ambiguity in interpretation of human instructions. In this paper, we propose the first comprehensive system that can handle unconstrained spoken language and is able to effectively resolve ambiguity in spoken instructions. Specifically, we integrate deep-learning-based object detection together with natural language processing technologies to handle unconstrained spoken instructions, and propose a method for robots to resolve instruction ambiguity through dialogue. Through our experiments on both a simulated environment as well as a physical industrial robot arm, we demonstrate the ability of our system to understand natural instructions from human operators effectively, and how higher success rates of the object picking task can be achieved through an interactive clarification process.

* 9 pages. International Conference on Robotics and Automation (ICRA) 2018. Accompanying videos are available at the following links: https://youtu.be/_Uyv1XIUqhk (the system submitted to ICRA-2018) and http://youtu.be/DGJazkyw0Ws (with improvements after ICRA-2018 submission)

Via

Access Paper or Ask Questions