Vineet Kamat

OVAMOS: A Framework for Open-Vocabulary Multi-Object Search in Unknown Environments

Mar 03, 2025

Qianwei Wang, Yifan Xu, Vineet Kamat, Carol Menassa

Abstract: Object search is a fundamental task for robots deployed in indoor building environments, yet challenges arise due to observation instability, especially for open-vocabulary models. While foundation models (LLMs/VLMs) enable reasoning about object locations even without direct visibility, the ability to recover from failures and replan remains crucial. The Multi-Object Search (MOS) problem further increases complexity, requiring the tracking of multiple objects and thorough exploration in novel environments, making observation uncertainty a significant obstacle. To address these challenges, we propose a framework integrating VLM-based reasoning, frontier-based exploration, and a Partially Observable Markov Decision Process (POMDP) model to solve the MOS problem in novel environments. The VLM enhances search efficiency by inferring object-environment relationships, frontier-based exploration guides navigation in unknown spaces, and the POMDP models observation uncertainty, allowing recovery from failures in occluded and cluttered environments. We evaluate our framework on 120 simulated scenarios across several Habitat-Matterport3D (HM3D) scenes and a real-world robot experiment in a 50-square-meter office, demonstrating significant improvements in both efficiency and success rate over baseline methods.

* 7 pages, 4 Figures
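
To make two of the ingredients named in the abstract concrete, here is a minimal Python sketch of generic frontier detection on an occupancy grid and a Bayes belief update over candidate object locations. It is not the OVAMOS implementation: the grid encoding, the function names `find_frontiers` and `update_belief`, and the detector rates `p_hit` and `p_false` are all assumptions made for illustration.

```python
FREE, OCCUPIED, UNKNOWN = 0, 1, -1  # assumed grid encoding, not from the paper


def find_frontiers(grid):
    """Return free cells that touch at least one unknown cell (4-connectivity)."""
    rows, cols = len(grid), len(grid[0])
    frontiers = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != FREE:
                continue
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == UNKNOWN:
                    frontiers.append((r, c))
                    break
    return frontiers


def update_belief(belief, observed, detected, p_hit=0.8, p_false=0.1):
    """Bayes update of P(object at location) after looking at `observed`.

    p_hit   = assumed probability of a detection when the object is there
    p_false = assumed probability of a spurious detection when it is elsewhere
    """
    posterior = {}
    for loc, prior in belief.items():
        if loc == observed:
            likelihood = p_hit if detected else 1.0 - p_hit
        else:
            likelihood = p_false if detected else 1.0 - p_false
        posterior[loc] = prior * likelihood
    total = sum(posterior.values())
    return {loc: p / total for loc, p in posterior.items()}


grid = [
    [FREE, FREE, UNKNOWN],
    [FREE, OCCUPIED, UNKNOWN],
    [FREE, FREE, FREE],
]
frontiers = find_frontiers(grid)

# A missed detection lowers, but does not zero out, the belief at that location,
# which is what lets a planner come back to it after an occlusion.
belief = update_belief({"desk": 0.5, "shelf": 0.5}, observed="desk", detected=False)
```

Because a missed detection only scales the belief down by `1 - p_hit`, an occluded object can still be found on a later visit, which is the kind of failure recovery the abstract attributes to modeling observation uncertainty.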

CoNav Chair: Design of a ROS-based Smart Wheelchair for Shared Control Navigation in the Built Environment

Jan 16, 2025

Yifan Xu, Qianwei Wang, Jordan Lillie, Vineet Kamat, Carol Menassa

Abstract: With the number of people with disabilities (PWD) increasing worldwide each year, the demand for mobility support to enable independent living and social integration is also growing. Wheelchairs commonly support the mobility of PWD in both indoor and outdoor environments. However, current powered wheelchairs (PWC) often fail to meet the needs of PWD, who may find it difficult to operate them. Furthermore, existing research on robotic wheelchairs typically focuses either on full autonomy or enhanced manual control, which can lead to reduced efficiency and user trust. To address these issues, this paper proposes a Robot Operating System (ROS)-based smart wheelchair, called CoNav Chair, that incorporates a shared control navigation algorithm and obstacle avoidance to support PWD while fostering efficiency and trust between the robot and the user. Our design consists of hardware and software components. Experiments conducted in a typical indoor social environment demonstrate the performance and effectiveness of the smart wheelchair hardware and software design. This integrated design promotes trust and autonomy, which are crucial for the acceptance of assistive mobility technologies in the built environment.

* 8 pages, 9 figures

Seeing with Partial Certainty: Conformal Prediction for Robotic Scene Recognition in Built Environments

Jan 09, 2025

Yifan Xu, Vineet Kamat, Carol Menassa

Abstract: In assistive robotics serving people with disabilities (PWD), accurate place recognition in built environments is crucial to ensure that robots navigate and interact safely within diverse indoor spaces. Language interfaces, particularly those powered by Large Language Models (LLM) and Vision Language Models (VLM), hold significant promise in this context, as they can interpret visual scenes and correlate them with semantic information. However, such interfaces are also known for their hallucinated predictions. In addition, language instructions provided by humans can be ambiguous and lack precise details about specific locations, objects, or actions, exacerbating the hallucination issue. In this work, we introduce Seeing with Partial Certainty (SwPC), a framework designed to measure and align uncertainty in VLM-based place recognition, enabling the model to recognize when it lacks confidence and seek assistance when necessary. This framework is built on the theory of conformal prediction to provide statistical guarantees on place recognition while minimizing requests for human help in complex indoor environment settings. Through experiments on the widely used, richly annotated scene dataset Matterport3D, we show that SwPC significantly increases the success rate and decreases the amount of human intervention required relative to the prior art. SwPC can be used with any VLM directly without requiring model fine-tuning, offering a promising, lightweight approach to uncertainty modeling that complements and scales alongside the expanding capabilities of foundational models.

* 10 pages, 4 Figures
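
The abstract builds on conformal prediction, so a short sketch of the standard split conformal recipe may help readers unfamiliar with it. This is the generic textbook procedure, not the SwPC method itself: the nonconformity score `1 - p`, the function names, and the rule "ask for help unless the prediction set has exactly one label" are assumptions chosen for illustration.

```python
import math


def conformal_threshold(cal_scores, alpha):
    """Split conformal quantile: the ceil((n + 1) * (1 - alpha))-th smallest score."""
    n = len(cal_scores)
    k = math.ceil((n + 1) * (1 - alpha))  # 1-indexed rank
    if k > n:
        return float("inf")  # too few calibration points: keep every label
    return sorted(cal_scores)[k - 1]


def prediction_set(probs, qhat):
    """Keep each label whose nonconformity score 1 - p does not exceed qhat."""
    return {label for label, p in probs.items() if 1.0 - p <= qhat}


def needs_help(pred_set):
    """Assumed policy: ask a human unless exactly one label survives."""
    return len(pred_set) != 1


# Hypothetical calibration scores: 1 - (probability assigned to the true place)
cal_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
qhat = conformal_threshold(cal_scores, alpha=0.5)

confident = prediction_set({"kitchen": 0.7, "office": 0.2, "hallway": 0.1}, qhat)
ambiguous = prediction_set({"kitchen": 0.5, "office": 0.5}, qhat)
```

Under the usual exchangeability assumption, sets built this way contain the true label with probability at least `1 - alpha`; the set size then serves as a model-agnostic signal for when to request human input.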

Point2Graph: An End-to-end Point Cloud-based 3D Open-Vocabulary Scene Graph for Robot Navigation

Sep 16, 2024

Yifan Xu, Ziming Luo, Qianwei Wang, Vineet Kamat, Carol Menassa

Abstract: Current open-vocabulary scene graph generation algorithms rely heavily on both 3D scene point cloud data and posed RGB-D images and thus have limited applications in scenarios where RGB-D images or camera poses are not readily available. To solve this problem, we propose Point2Graph, a novel end-to-end point cloud-based 3D open-vocabulary scene graph generation framework in which the requirement of posed RGB-D image series is eliminated. This hierarchical framework contains room and object detection/segmentation and open-vocabulary classification. For the room layer, we merge a geometry-based border detection algorithm with learning-based region detection to segment rooms and create a "Snap-Lookup" framework for open-vocabulary room classification. In addition, we create an end-to-end pipeline for the object layer to detect and classify 3D objects based solely on 3D point cloud data. Our evaluation results show that our framework can outperform the current state-of-the-art (SOTA) open-vocabulary object and room segmentation and classification algorithms on widely used real-scene datasets.

* 8 pages, 9 figures

Socially-Aware Shared Control Navigation for Assistive Mobile Robots in the Built Environment

May 27, 2024

Yifan Xu, Qianwei Wang, Vineet Kamat, Carol Menassa

Abstract: As the number of Persons with Disabilities (PWD), particularly those with one or more physical impairments, grows, there is increasing demand for assistive robotic technologies that can support independent mobility in the built environment and reduce the burden on caregivers. Current assistive mobility platforms (e.g., robotic wheelchairs) often fail to incorporate user preferences and control, leading to reduced trust and efficiency. Existing shared control algorithms do not allow user control preferences to be incorporated into the navigation framework or the path planning algorithm. In addition, existing dynamic local planner algorithms for robotic wheelchairs do not take into account the social spaces of people, potentially leading such platforms to infringe upon these areas and cause discomfort. To address these concerns, this work introduces a novel socially-aware shared autonomy-based navigation system for assistive mobile robotic platforms. Our navigation framework comprises a Global Planner and a Local Planner. To implement the Global Planner, the proposed approach introduces a novel User Preference Field (UPF) theory within its global planning framework, explicitly accounting for user preferences to navigate away from congested areas. For the Local Planner, we propose a Socially-aware Shared Control-based Model Predictive Control with Dynamic Control Barrier Function (SS-MPC-DCBF) to adjust movements in real time, integrating user preferences for safer, more autonomous navigation. Evaluation results show that our Global Planner aligns more closely with user preferences than the baselines, and our Local Planner demonstrates enhanced safety and efficiency in dynamic and static scenarios. This integrated approach fosters trust and autonomy, crucial for the acceptance of assistive mobility technologies in the built environment.

* 42 pages, 14 figures
