Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Seth Teller

Generalized Grounding Graphs: A Probabilistic Framework for Understanding Grounded Commands

Nov 29, 2017

Thomas Kollar, Stefanie Tellex, Matthew Walter, Albert Huang, Abraham Bachrach, Sachi Hemachandra, Emma Brunskill, Ashis Banerjee, Deb Roy, Seth Teller(+1 more)

Figure 1 for Generalized Grounding Graphs: A Probabilistic Framework for Understanding Grounded Commands

Figure 2 for Generalized Grounding Graphs: A Probabilistic Framework for Understanding Grounded Commands

Figure 3 for Generalized Grounding Graphs: A Probabilistic Framework for Understanding Grounded Commands

Figure 4 for Generalized Grounding Graphs: A Probabilistic Framework for Understanding Grounded Commands

Abstract:Many task domains require robots to interpret and act upon natural language commands which are given by people and which refer to the robot's physical surroundings. Such interpretation is known variously as the symbol grounding problem, grounded semantics and grounded language acquisition. This problem is challenging because people employ diverse vocabulary and grammar, and because robots have substantial uncertainty about the nature and contents of their surroundings, making it difficult to associate the constitutive language elements (principally noun phrases and spatial relations) of the command text to elements of those surroundings. Symbolic models capture linguistic structure but have not scaled successfully to handle the diverse language produced by untrained users. Existing statistical approaches can better handle diversity, but have not to date modeled complex linguistic structure, limiting achievable accuracy. Recent hybrid approaches have addressed limitations in scaling and complexity, but have not effectively associated linguistic and perceptual features. Our framework, called Generalized Grounding Graphs (G^3), addresses these issues by defining a probabilistic graphical model dynamically according to the linguistic parse structure of a natural language command. This approach scales effectively, handles linguistic diversity, and enables the system to associate parts of a command with the specific objects, places, and events in the external world to which they refer. We show that robots can learn word meanings and use those learned meanings to robustly follow natural language commands produced by untrained users. We demonstrate our approach for both mobility commands and mobile manipulation commands involving a variety of semi-autonomous robotic platforms, including a wheelchair, a micro-air vehicle, a forklift, and the Willow Garage PR2.

* Submitted to the Journal of Artificial Intelligence Research

Via

Access Paper or Ask Questions

Learning Articulated Motions From Visual Demonstration

Feb 05, 2015

Sudeep Pillai, Matthew R. Walter, Seth Teller

Figure 1 for Learning Articulated Motions From Visual Demonstration

Figure 2 for Learning Articulated Motions From Visual Demonstration

Figure 3 for Learning Articulated Motions From Visual Demonstration

Figure 4 for Learning Articulated Motions From Visual Demonstration

Abstract:Many functional elements of human homes and workplaces consist of rigid components which are connected through one or more sliding or rotating linkages. Examples include doors and drawers of cabinets and appliances; laptops; and swivel office chairs. A robotic mobile manipulator would benefit from the ability to acquire kinematic models of such objects from observation. This paper describes a method by which a robot can acquire an object model by capturing depth imagery of the object as a human moves it through its range of motion. We envision that in future, a machine newly introduced to an environment could be shown by its human user the articulated objects particular to that environment, inferring from these "visual demonstrations" enough information to actuate each object independently of the user. Our method employs sparse (markerless) feature tracking, motion segmentation, component pose estimation, and articulation learning; it does not require prior object models. Using the method, a robot can observe an object being exercised, infer a kinematic model incorporating rigid, prismatic and revolute joints, then use the model to predict the object's motion from a novel vantage point. We evaluate the method's performance, and compare it to that of a previously published technique, for a variety of household objects.

* Published in Robotics: Science and Systems X, Berkeley, CA. ISBN: 978-0-9923747-0-9

Via

Access Paper or Ask Questions