Abstract: Robot Foundation Models (RFMs) represent a promising approach to developing general-purpose home robots. Given the broad capabilities of RFMs, users will inevitably ask an RFM-based robot to perform tasks that the RFM was not trained or evaluated on. In these cases, it is crucial that users understand the risks associated with attempting novel tasks due to the relatively high cost of failure. Furthermore, an informed user who understands an RFM's capabilities will know what situations and tasks the robot can handle. In this paper, we study how non-roboticists interpret performance information from RFM evaluations. These evaluations typically report task success rate (TSR) as the primary performance metric. While TSR is intuitive to experts, it is necessary to validate whether novices also use this information as intended. Toward this end, we conducted a study in which users saw real evaluation data, including TSR, failure case descriptions, and videos from multiple published RFM research projects. The results highlight that non-experts not only use TSR in a manner consistent with expert expectations but also highly value other information types, such as failure cases, that are not often reported in RFM evaluations. Furthermore, we find that users want access to both real data from previous evaluations of the RFM and estimates from the robot about how well it will do on a novel task.

Abstract: It is crucial that users are empowered to use the functionalities of a robot to creatively solve problems on the fly. A user who has access to a Reinforcement Learning (RL) based robot may want to use the robot's autonomy and their knowledge of its behavior to complete new tasks. One way to do this is for the user to take control of some of the robot's action space through teleoperation while the RL policy simultaneously controls the rest. However, an out-of-the-box RL policy may not readily facilitate this. For example, a user's control may bring the robot into a failure state from the policy's perspective, causing it to act in a way the user is not familiar with, hindering the success of the user's desired task. In this work, we formalize this problem and present Imaginary Out-of-Distribution Actions (IODA), an initial algorithm for addressing this problem and empowering users to leverage their expectation of a robot's behavior to accomplish new tasks.
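As a minimal sketch of the shared-control setup this abstract describes (not the IODA algorithm itself, whose handling of out-of-distribution states is the paper's contribution), the snippet below blends a user's teleoperated command with an RL policy's action over a partitioned action space. The function name blend_actions, the user_mask argument, and the 4-D action space are illustrative assumptions.

import numpy as np

def blend_actions(policy_action, user_action, user_mask):
    """Combine a user's teleoperated command with an RL policy's output.

    The user controls the action dimensions selected by `user_mask`;
    the policy controls the rest. Illustrative only -- not the IODA
    algorithm described in the paper.
    """
    policy_action = np.asarray(policy_action, dtype=float)
    user_action = np.asarray(user_action, dtype=float)
    user_mask = np.asarray(user_mask, dtype=bool)
    return np.where(user_mask, user_action, policy_action)

# Example: a 4-D action space where the user takes over the last dimension
policy_a = np.array([0.2, -0.1, 0.05, 1.0])    # policy's proposed action
user_a   = np.array([0.0,  0.0, 0.0, -1.0])    # user's teleoperated command
mask     = np.array([False, False, False, True])
print(blend_actions(policy_a, user_a, mask))    # -> [ 0.2  -0.1   0.05 -1.  ]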

Abstract: Establishing common ground between an intelligent robot and a human requires communication of the robot's intention, behavior, and knowledge to the human to build trust and assure safety in a shared environment. This paper introduces SENSAR (Seeing Everything iN Situ with Augmented Reality), an augmented reality robotic system that enables robots to communicate their sensory and cognitive data in context over the real world with rendered graphics, allowing a user to understand, correct, and validate the robot's perception of the world. Our system aims to support human-robot interaction research by establishing common ground where the perceptions of the human and the robot align.
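As a rough illustration of the idea (not SENSAR's actual implementation, which renders the data in situ over the physical scene in AR), the sketch below overlays hypothetical object detections from a robot's perception pipeline onto a 2-D camera frame with OpenCV. The function name draw_detections and the detection tuple format are assumptions.

import cv2
import numpy as np

def draw_detections(frame, detections):
    """Overlay a robot's object detections on a camera frame.

    `detections` is a list of (label, confidence, (x, y, w, h)) tuples
    in pixel coordinates -- a generic 2-D stand-in for the kind of
    perception data SENSAR renders over the real world.
    """
    for label, conf, (x, y, w, h) in detections:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
        cv2.putText(frame, f"{label} {conf:.2f}", (x, y - 5),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 1)
    return frame

# Example with a blank frame and one hypothetical detection
frame = np.zeros((480, 640, 3), dtype=np.uint8)
annotated = draw_detections(frame, [("cup", 0.91, (200, 150, 80, 80))])
cv2.imwrite("overlay.png", annotated)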