Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Sarvapali D. Ramchurn

Safe and Socially Aware Multi-Robot Coordination in Multi-Human Social Care Settings

Jul 03, 2025

Ayodeji O. Abioye, Jayati Deshmukh, Athina Georgara, Dominic Price, Tuyen Nguyen, Aleksandra Landowska, Amel Bennaceur, Joel E. Fischer, Sarvapali D. Ramchurn

Abstract:This research investigates strategies for multi-robot coordination in multi-human environments. It proposes a multi-objective learning-based coordination approach to addressing the problem of path planning, navigation, task scheduling, task allocation, and human-robot interaction in multi-human multi-robot (MHMR) settings.

* 3 pages, 1 figure. Accepted for poster presentation at the UK AI Research Symposium (UKAIR) 2025, themed "A Festival of Ideas", being held in Newcastle from 8th - 9th September, 2025. https://www.ukairs.ac.uk/

Via

Access Paper or Ask Questions

TAT-VPR: Ternary Adaptive Transformer for Dynamic and Efficient Visual Place Recognition

May 22, 2025

Oliver Grainge, Michael Milford, Indu Bodala, Sarvapali D. Ramchurn, Shoaib Ehsan

Abstract:TAT-VPR is a ternary-quantized transformer that brings dynamic accuracy-efficiency trade-offs to visual SLAM loop-closure. By fusing ternary weights with a learned activation-sparsity gate, the model can control computation by up to 40% at run-time without degrading performance (Recall@1). The proposed two-stage distillation pipeline preserves descriptor quality, letting it run on micro-UAV and embedded SLAM stacks while matching state-of-the-art localization accuracy.

Via

Access Paper or Ask Questions

TeTRA-VPR: A Ternary Transformer Approach for Compact Visual Place Recognition

Mar 04, 2025

Oliver Grainge, Michael Milford, Indu Bodala, Sarvapali D. Ramchurn, Shoaib Ehsan

Figure 1 for TeTRA-VPR: A Ternary Transformer Approach for Compact Visual Place Recognition

Figure 2 for TeTRA-VPR: A Ternary Transformer Approach for Compact Visual Place Recognition

Figure 3 for TeTRA-VPR: A Ternary Transformer Approach for Compact Visual Place Recognition

Figure 4 for TeTRA-VPR: A Ternary Transformer Approach for Compact Visual Place Recognition

Abstract:Visual Place Recognition (VPR) localizes a query image by matching it against a database of geo-tagged reference images, making it essential for navigation and mapping in robotics. Although Vision Transformer (ViT) solutions deliver high accuracy, their large models often exceed the memory and compute budgets of resource-constrained platforms such as drones and mobile robots. To address this issue, we propose TeTRA, a ternary transformer approach that progressively quantizes the ViT backbone to 2-bit precision and binarizes its final embedding layer, offering substantial reductions in model size and latency. A carefully designed progressive distillation strategy preserves the representational power of a full-precision teacher, allowing TeTRA to retain or even surpass the accuracy of uncompressed convolutional counterparts, despite using fewer resources. Experiments on standard VPR benchmarks demonstrate that TeTRA reduces memory consumption by up to 69% compared to efficient baselines, while lowering inference latency by 35%, with either no loss or a slight improvement in recall@1. These gains enable high-accuracy VPR on power-constrained, memory-limited robotic platforms, making TeTRA an appealing solution for real-world deployment.

Via

Access Paper or Ask Questions

Position: Towards a Responsible LLM-empowered Multi-Agent Systems

Feb 03, 2025

Jinwei Hu, Yi Dong, Shuang Ao, Zhuoyun Li, Boxuan Wang, Lokesh Singh, Guangliang Cheng, Sarvapali D. Ramchurn, Xiaowei Huang

Figure 1 for Position: Towards a Responsible LLM-empowered Multi-Agent Systems

Figure 2 for Position: Towards a Responsible LLM-empowered Multi-Agent Systems

Figure 3 for Position: Towards a Responsible LLM-empowered Multi-Agent Systems

Figure 4 for Position: Towards a Responsible LLM-empowered Multi-Agent Systems

Abstract:The rise of Agent AI and Large Language Model-powered Multi-Agent Systems (LLM-MAS) has underscored the need for responsible and dependable system operation. Tools like LangChain and Retrieval-Augmented Generation have expanded LLM capabilities, enabling deeper integration into MAS through enhanced knowledge retrieval and reasoning. However, these advancements introduce critical challenges: LLM agents exhibit inherent unpredictability, and uncertainties in their outputs can compound across interactions, threatening system stability. To address these risks, a human-centered design approach with active dynamic moderation is essential. Such an approach enhances traditional passive oversight by facilitating coherent inter-agent communication and effective system governance, allowing MAS to achieve desired outcomes more efficiently.

* Under Review

Via

Access Paper or Ask Questions

Image-based Geo-localization for Robotics: Are Black-box Vision-Language Models there yet?

Jan 28, 2025

Sania Waheed, Bruno Ferrarini, Michael Milford, Sarvapali D. Ramchurn, Shoaib Ehsan

Figure 1 for Image-based Geo-localization for Robotics: Are Black-box Vision-Language Models there yet?

Figure 2 for Image-based Geo-localization for Robotics: Are Black-box Vision-Language Models there yet?

Figure 3 for Image-based Geo-localization for Robotics: Are Black-box Vision-Language Models there yet?

Figure 4 for Image-based Geo-localization for Robotics: Are Black-box Vision-Language Models there yet?

Abstract:The advances in Vision-Language models (VLMs) offer exciting opportunities for robotic applications involving image geo-localization, the problem of identifying the geo-coordinates of a place based on visual data only. Recent research works have focused on using a VLM as embeddings extractor for geo-localization, however, the most sophisticated VLMs may only be available as black boxes that are accessible through an API, and come with a number of limitations: there is no access to training data, model features and gradients; retraining is not possible; the number of predictions may be limited by the API; training on model outputs is often prohibited; and queries are open-ended. The utilization of a VLM as a stand-alone, zero-shot geo-localization system using a single text-based prompt is largely unexplored. To bridge this gap, this paper undertakes the first systematic study, to the best of our knowledge, to investigate the potential of some of the state-of-the-art VLMs as stand-alone, zero-shot geo-localization systems in a black-box setting with realistic constraints. We consider three main scenarios for this thorough investigation: a) fixed text-based prompt; b) semantically-equivalent text-based prompts; and c) semantically-equivalent query images. We also take into account the auto-regressive and probabilistic generation process of the VLMs when investigating their utility for geo-localization task by using model consistency as a metric in addition to traditional accuracy. Our work provides new insights in the capabilities of different VLMs for the above-mentioned scenarios.

* Submitted to IROS 2025

Via

Access Paper or Ask Questions

Structured Pruning for Efficient Visual Place Recognition

Sep 12, 2024

Oliver Grainge, Michael Milford, Indu Bodala, Sarvapali D. Ramchurn, Shoaib Ehsan

Figure 1 for Structured Pruning for Efficient Visual Place Recognition

Figure 2 for Structured Pruning for Efficient Visual Place Recognition

Figure 3 for Structured Pruning for Efficient Visual Place Recognition

Figure 4 for Structured Pruning for Efficient Visual Place Recognition

Abstract:Visual Place Recognition (VPR) is fundamental for the global re-localization of robots and devices, enabling them to recognize previously visited locations based on visual inputs. This capability is crucial for maintaining accurate mapping and localization over large areas. Given that VPR methods need to operate in real-time on embedded systems, it is critical to optimize these systems for minimal resource consumption. While the most efficient VPR approaches employ standard convolutional backbones with fixed descriptor dimensions, these often lead to redundancy in the embedding space as well as in the network architecture. Our work introduces a novel structured pruning method, to not only streamline common VPR architectures but also to strategically remove redundancies within the feature embedding space. This dual focus significantly enhances the efficiency of the system, reducing both map and model memory requirements and decreasing feature extraction and retrieval latencies. Our approach has reduced memory usage and latency by 21% and 16%, respectively, across models, while minimally impacting recall@1 accuracy by less than 1%. This significant improvement enhances real-time applications on edge devices with negligible accuracy loss.

Via

Access Paper or Ask Questions

Mapping Safe Zones for Co-located Human-UAV Interaction

Sep 03, 2024

Ayodeji O. Abioye, Lisa Bidgood, Sarvapali D. Ramchurn, Mohammad D. Soorati

Abstract:Recent advances in robotics bring us closer to the reality of living, co-habiting, and sharing personal spaces with robots. However, it is not clear how close a co-located robot can be to a human in a shared environment without making the human uncomfortable or anxious. This research aims to map safe and comfortable zones for co-located aerial robots. The objective is to identify the distances at which a drone causes discomfort to a co-located human and to create a map showing no-fly, moderate-fly, and safe-fly zones. We recruited a total of 18 participants and conducted two indoor laboratory experiments, one with a single drone and the other set with two drones. Our results show that multiple drones cause more discomfort when close to a co-located human than a single drone. We observed that distances below 200 cm caused discomfort, the moderate fly zone was 200 - 300 cm, and the safe-fly zone was any distance greater than 300 cm in single drone experiments. The safe zones were pushed further away by 100 cm for the multiple drone experiments. In this paper, we present the preliminary findings on safe-fly zones for multiple drones. Further work would investigate the impact of a higher number of aerial robots, the speed of approach, direction of travel, and noise level on co-located humans, and autonomously develop 3D models of trust zones and safe zones for co-located aerial swarms.

* This paper has been accepted for presentation at the Second International Symposium on Trustworthy Autonomous Systems (TAS '24), being held from September 16-18, 2024, at Austin, TX, USA. It consists of 10 pages, 9 figures, and 1 table

Via

Access Paper or Ask Questions

Trade-offs of Dynamic Control Structure in Human-swarm Systems

Aug 05, 2024

Thomas G. Kelly, Mohammad D. Soorati, Klaus-Peter Zauner, Sarvapali D. Ramchurn, and Danesh Tarapore

Abstract:Swarm robotics is a study of simple robots that exhibit complex behaviour only by interacting locally with other robots and their environment. The control in swarm robotics is mainly distributed whereas centralised control is widely used in other fields of robotics. Centralised and decentralised control strategies both pose a unique set of benefits and drawbacks for the control of multi-robot systems. While decentralised systems are more scalable and resilient, they are less efficient compared to the centralised systems and they lead to excessive data transmissions to the human operators causing cognitive overload. We examine the trade-offs of each of these approaches in a human-swarm system to perform an environmental monitoring task and propose a flexible hybrid approach, which combines elements of hierarchical and decentralised systems. We find that a flexible hybrid system can outperform a centralised system (in our environmental monitoring task by 19.2%) while reducing the number of messages sent to a human operator (here by 23.1%). We conclude that establishing centralisation for a system is not always optimal for performance and that utilising aspects of centralised and decentralised systems can keep the swarm from hindering its performance.

* The International Symposium on Distributed Autonomous Robotic Systems (DARS 2024)

Via

Access Paper or Ask Questions

Learning to Imitate Spatial Organization in Multi-robot Systems

Jul 16, 2024

Ayomide O. Agunloye, Sarvapali D. Ramchurn, Mohammad D. Soorati

Abstract:Understanding collective behavior and how it evolves is important to ensure that robot swarms can be trusted in a shared environment. One way to understand the behavior of the swarm is through collective behavior reconstruction using prior demonstrations. Existing approaches often require access to the swarm controller which may not be available. We reconstruct collective behaviors in distinct swarm scenarios involving shared environments without using swarm controller information. We achieve this by transforming prior demonstrations into features that sufficiently describe multi-agent interactions before behavior reconstruction with multi-agent generative adversarial imitation learning (MA-GAIL). We show that our approach outperforms existing algorithms in all investigated swarm scenarios, and can be used to observe and reconstruct a swarm's behavior for further analysis and testing, which might be impractical or undesirable on the original robot swarm.

* 6 pages, 4 figures. Accepted for presentation at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024)

Via

Access Paper or Ask Questions

A Survey of Language-Based Communication in Robotics

Jun 06, 2024

William Hunt, Sarvapali D. Ramchurn, Mohammad D. Soorati

Figure 1 for A Survey of Language-Based Communication in Robotics

Figure 2 for A Survey of Language-Based Communication in Robotics

Figure 3 for A Survey of Language-Based Communication in Robotics

Figure 4 for A Survey of Language-Based Communication in Robotics

Abstract:Embodied robots which can interact with their environment and neighbours are increasingly being used as a test case to develop Artificial Intelligence. This creates a need for multimodal robot controllers which can operate across different types of information including text. Large Language Models are able to process and generate textual as well as audiovisual data and, more recently, robot actions. Language Models are increasingly being applied to robotic systems; these Language-Based robots leverage the power of language models in a variety of ways. Additionally, the use of language opens up multiple forms of information exchange between members of a human-robot team. This survey motivates the use of language models in robotics, and then delineates works based on the part of the overall control flow in which language is incorporated. Language can be used by human to task a robot, by a robot to inform a human, between robots as a human-like communication medium, and internally for a robot's planning and control. Applications of language-based robots are explored, and finally numerous limitations and challenges are discussed to provide a summary of the development needed for language-based robotics moving forward. Links to each paper and, if available, source code are made available in the accompanying site at https://uos-haris.online/sooratilab/papers/WillSurvey/LangRobotSurvey.php

Via

Access Paper or Ask Questions