Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Matthew Lisondra

Embodied AI with Foundation Models for Mobile Service Robots: A Systematic Review

May 26, 2025

Matthew Lisondra, Beno Benhabib, Goldie Nejat

Abstract:Rapid advancements in foundation models, including Large Language Models, Vision-Language Models, Multimodal Large Language Models, and Vision-Language-Action Models have opened new avenues for embodied AI in mobile service robotics. By combining foundation models with the principles of embodied AI, where intelligent systems perceive, reason, and act through physical interactions, robots can improve understanding, adapt to, and execute complex tasks in dynamic real-world environments. However, embodied AI in mobile service robots continues to face key challenges, including multimodal sensor fusion, real-time decision-making under uncertainty, task generalization, and effective human-robot interactions (HRI). In this paper, we present the first systematic review of the integration of foundation models in mobile service robotics, identifying key open challenges in embodied AI and examining how foundation models can address them. Namely, we explore the role of such models in enabling real-time sensor fusion, language-conditioned control, and adaptive task execution. Furthermore, we discuss real-world applications in the domestic assistance, healthcare, and service automation sectors, demonstrating the transformative impact of foundation models on service robotics. We also include potential future research directions, emphasizing the need for predictive scaling laws, autonomous long-term adaptation, and cross-embodiment generalization to enable scalable, efficient, and robust deployment of foundation models in human-centric robotic systems.

Via

Access Paper or Ask Questions

Inverse k-visibility for RSSI-based Indoor Geometric Mapping

Aug 14, 2024

Junseo Kim, Matthew Lisondra, Yeganeh Bahoo, Sajad Saeedi

Abstract:In recent years, the increased availability of WiFi in indoor environments has gained an interest in the robotics community to leverage WiFi signals for enhancing indoor SLAM (Simultaneous Localization and Mapping) systems. SLAM technology is widely used, especially for the navigation and control of autonomous robots. This paper discusses various works in developing WiFi-based localization and challenges in achieving high-accuracy geometric maps. This paper introduces the concept of inverse k-visibility developed from the k-visibility algorithm to identify the free space in an unknown environment for planning, navigation, and obstacle avoidance. Comprehensive experiments, including those utilizing single and multiple RSSI signals, were conducted in both simulated and real-world environments to demonstrate the robustness of the proposed algorithm. Additionally, a detailed analysis comparing the resulting maps with ground-truth Lidar-based maps is provided to highlight the algorithm's accuracy and reliability.

* This work has been submitted to the IEEE Sensors Journal for possible publication

Via

Access Paper or Ask Questions

Visual Inertial Odometry using Focal Plane Binary Features (BIT-VIO)

Mar 14, 2024

Matthew Lisondra, Junseo Kim, Riku Murai, Kourosh Zareinia, Sajad Saeedi

Figure 1 for Visual Inertial Odometry using Focal Plane Binary Features (BIT-VIO)

Figure 2 for Visual Inertial Odometry using Focal Plane Binary Features (BIT-VIO)

Figure 3 for Visual Inertial Odometry using Focal Plane Binary Features (BIT-VIO)

Figure 4 for Visual Inertial Odometry using Focal Plane Binary Features (BIT-VIO)

Abstract:Focal-Plane Sensor-Processor Arrays (FPSP)s are an emerging technology that can execute vision algorithms directly on the image sensor. Unlike conventional cameras, FPSPs perform computation on the image plane -- at individual pixels -- enabling high frame rate image processing while consuming low power, making them ideal for mobile robotics. FPSPs, such as the SCAMP-5, use parallel processing and are based on the Single Instruction Multiple Data (SIMD) paradigm. In this paper, we present BIT-VIO, the first Visual Inertial Odometry (VIO) which utilises SCAMP-5.BIT-VIO is a loosely-coupled iterated Extended Kalman Filter (iEKF) which fuses together the visual odometry running fast at 300 FPS with predictions from 400 Hz IMU measurements to provide accurate and smooth trajectories.

* Accepted for Presentation Yokohama, Japan for IEEE 2024 ICRA

Via

Access Paper or Ask Questions