Abstract:In Embodied Question Answering (EQA), agents must explore and develop a semantic understanding of an unseen environment in order to answer a situated question with confidence. This remains a challenging problem in robotics, due to the difficulties in obtaining useful semantic representations, updating these representations online, and leveraging prior world knowledge for efficient exploration and planning. Aiming to address these limitations, we propose GraphEQA, a novel approach that utilizes real-time 3D metric-semantic scene graphs (3DSGs) and task relevant images as multi-modal memory for grounding Vision-Language Models (VLMs) to perform EQA tasks in unseen environments. We employ a hierarchical planning approach that exploits the hierarchical nature of 3DSGs for structured planning and semantic-guided exploration. Through experiments in simulation on the HM-EQA dataset and in the real world in home and office environments, we demonstrate that our method outperforms key baselines by completing EQA tasks with higher success rates and fewer planning steps.
Abstract:Robots often interact with the world via attached parts such as wheels, joints, or appendages. In many systems, these interactions, and the manner in which they lead to locomotion, can be understood using the machinery of geometric mechanics, explaining how inputs in the shape space of a robot affect motion in its configuration space and the configuration space of its environment. In this paper we consider an opposite type of locomotion, wherein robots are influenced actively by interactions with an externally forced ambient medium. We investigate two examples of externally actuated systems; one for which locomotion is governed by a principal connection, and is usually considered to possess no drift dynamics, and another for which no such connection exists, with drift inherent in its locomotion. For the driftless system, we develop geometric tools based on previously understood internally actuated versions of the system and demonstrate their use for motion planning under external actuation. For the system possessing drift, we employ nonholonomic reduction to obtain a reduced representation of the system dynamics, illustrate geometric features conducive to studying locomotion, and derive strategies for external actuation.