William B. Shen

Interactive Gibson: A Benchmark for Interactive Navigation in Cluttered Environments

Oct 30, 2019

Fei Xia, William B. Shen, Chengshu Li, Priya Kasimbeg, Micael Tchapmi, Alexander Toshev, Roberto Martín-Martín, Silvio Savarese

Abstract: We present Interactive Gibson, the first comprehensive benchmark for training and evaluating Interactive Navigation: robot navigation strategies where physical interaction with objects is allowed and even encouraged to accomplish a task. For example, the robot can move objects if needed in order to clear a path leading to the goal location. Our benchmark comprises two novel elements: 1) a new experimental setup, the Interactive Gibson Environment, which simulates high-fidelity visuals of indoor scenes and high-fidelity physical dynamics of the robot and common objects found in these scenes; 2) a set of Interactive Navigation metrics that allows one to study the interplay between navigation and physical interaction. We present and evaluate multiple learning-based baselines in Interactive Gibson, and provide insights into regimes of navigation with different trade-offs between navigation path efficiency and disturbance of surrounding objects. We make our benchmark publicly available (https://sites.google.com/view/interactivegibsonenv) and encourage researchers from all disciplines in robotics (e.g., planning, learning, control) to propose, evaluate, and compare their Interactive Navigation solutions in Interactive Gibson.

Situational Fusion of Visual Representation for Visual Navigation

Aug 24, 2019

William B. Shen, Danfei Xu, Yuke Zhu, Leonidas J. Guibas, Li Fei-Fei, Silvio Savarese

Abstract: A complex visual navigation task puts an agent in different situations which call for a diverse range of visual perception abilities. For example, to "go to the nearest chair", the agent might need to identify a chair in a living room using semantics, follow along a hallway using vanishing point cues, and avoid obstacles using depth. Therefore, utilizing the appropriate visual perception abilities based on a situational understanding of the visual environment can empower these navigation models in unseen visual environments. We propose to train an agent to fuse a large set of visual representations that correspond to diverse visual perception abilities. To fully utilize each representation, we develop an action-level representation fusion scheme, which predicts an action candidate from each representation and adaptively consolidates these action candidates into the final action. Furthermore, we employ a data-driven inter-task affinity regularization to reduce redundancies and improve generalization. Our approach leads to significantly improved performance in novel environments over an ImageNet-pretrained baseline and other fusion methods.

Visual Forecasting by Imitating Dynamics in Natural Sequences

Aug 19, 2017

Kuo-Hao Zeng, William B. Shen, De-An Huang, Min Sun, Juan Carlos Niebles

Abstract: We introduce a general framework for visual forecasting, which directly imitates visual sequences without additional supervision. As a result, our model can be applied at several semantic levels and does not require any domain knowledge or handcrafted features. We achieve this by formulating visual forecasting as an inverse reinforcement learning (IRL) problem and directly imitating the dynamics in natural sequences from their raw pixel values. The key challenge is the high-dimensional and continuous state-action space that prohibits the application of previous IRL algorithms. We address this computational bottleneck by extending recent progress in model-free imitation with trainable deep feature representations, which (1) bypasses the exhaustive state-action pair visits in dynamic programming by using a dual formulation and (2) avoids explicit state sampling at gradient computation using a deep feature reparametrization. This allows us to apply IRL at scale and directly imitate the dynamics in high-dimensional continuous visual sequences from the raw pixel values. We evaluate our approach at three different levels of abstraction, from low-level pixels to higher-level semantics: future frame generation, action anticipation, and visual story forecasting. At all levels, our approach outperforms existing methods.

* 10 pages, 9 figures, accepted to ICCV 2017
