Ysobel Sims

Exploring GPT-4 for Robotic Agent Strategy with Real-Time State Feedback and a Reactive Behaviour Framework

Mar 30, 2025

Thomas O'Brien, Ysobel Sims

Abstract: We explore the use of GPT-4 on a humanoid robot in simulation and the real world as proof of concept of a novel large language model (LLM) driven behaviour method. LLMs have shown the ability to perform various tasks, including robotic agent behaviour. The problem involves prompting the LLM with a goal, and the LLM outputs the sub-tasks to complete to achieve that goal. Previous works focus on the executability and correctness of the LLM's generated tasks. We propose a method that successfully addresses practical concerns around safety, transitions between tasks, time horizons of tasks and state feedback. In our experiments we have found that our approach produces output for feasible requests that can be executed every time, with smooth transitions. User requests are achieved most of the time across a range of goal time horizons.

* Australasian Conference on Robotics and Automation (2023)
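
As a rough illustration of what "executable" output could mean in this setting, the sketch below parses a numbered list of sub-tasks and rejects any plan containing a task the robot does not know. It is not the authors' method: the task names, the reply format, and the helpers `parse_subtasks` and `executable_plan` are all hypothetical, and the LLM reply is a hard-coded string rather than a real API call.

```python
from typing import List

# Hypothetical set of behaviours the robot can run (names are assumptions).
ALLOWED_TASKS = {"walk_to_ball", "kick", "look_around", "stand_still"}


def parse_subtasks(llm_output: str) -> List[str]:
    """Split a newline-separated reply into task names, dropping list numbering and blank lines."""
    tasks = []
    for line in llm_output.splitlines():
        name = line.strip().lstrip("0123456789. ").strip()
        if name:
            tasks.append(name)
    return tasks


def executable_plan(llm_output: str) -> List[str]:
    """Return the parsed plan, or raise ValueError if any task is unknown."""
    tasks = parse_subtasks(llm_output)
    rejected = [task for task in tasks if task not in ALLOWED_TASKS]
    if rejected:
        raise ValueError(f"unknown tasks: {rejected}")
    return tasks


# Stand-in for an LLM reply to the goal "score a goal" (assumed format).
reply = "1. look_around\n2. walk_to_ball\n3. kick"
plan = executable_plan(reply)
print(plan)
```

Rejecting the whole plan, rather than silently dropping unknown tasks, is one conservative design choice; a real system might instead ask the model again with feedback about the rejected tasks.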

Diffusion in Zero-Shot Learning for Environmental Audio

Dec 04, 2024

Ysobel Sims, Stephan Chalup, Alexandre Mendes

Abstract: Zero-shot learning enables models to generalize to unseen classes by leveraging semantic information, bridging the gap between training and testing sets with non-overlapping classes. While much research has focused on zero-shot learning in computer vision, the application of these methods to environmental audio remains underexplored, with poor performance in existing studies. Generative methods, which have demonstrated success in computer vision, are notably absent from environmental audio zero-shot learning, where classification-based approaches dominate. To address this gap, this work investigates generative methods for zero-shot learning in environmental audio. Two successful generative models from computer vision are adapted: a cross-aligned and distribution-aligned variational autoencoder (CADA-VAE) and a leveraging invariant side generative adversarial network (LisGAN). Additionally, a novel diffusion model conditioned on class auxiliary data is introduced. The diffusion model generates synthetic data for unseen classes, which is combined with seen-class data to train a classifier. Experiments are conducted on two environmental audio datasets, ESC-50 and FSC22. Results show that the diffusion model significantly outperforms all baseline methods, achieving more than 25% higher accuracy on the ESC-50 test partition. This work establishes the diffusion model as a promising generative approach for zero-shot learning and introduces the first benchmark of generative methods for environmental audio zero-shot learning, providing a foundation for future research in the field. Code is provided at https://github.com/ysims/ZeroDiffusion for the novel ZeroDiffusion method.

* This work has been submitted to the IEEE for possible publication
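
The generate-then-classify pipeline described in the abstract can be sketched with toy data. This is a minimal illustration under loud assumptions, not the ZeroDiffusion code: `fake_generator` is a hypothetical placeholder that adds Gaussian noise to a class auxiliary vector where a trained conditional diffusion model would go, the two-dimensional features and class names are made up, and a nearest-centroid rule stands in for the classifier.

```python
import random


def fake_generator(aux_vector, n_samples, rng, noise=0.1):
    """Hypothetical stand-in for a trained conditional generative model:
    returns synthetic features scattered around the class auxiliary vector."""
    return [[a + rng.gauss(0.0, noise) for a in aux_vector] for _ in range(n_samples)]


def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    return [sum(column) / len(rows) for column in zip(*rows)]


def train_nearest_centroid(features_by_class):
    """'Training' here is just storing one mean vector per class."""
    return {label: centroid(rows) for label, rows in features_by_class.items()}


def predict(centroids, x):
    """Return the label whose centroid is closest to x (squared Euclidean distance)."""
    def sq_dist(c):
        return sum((xi - ci) ** 2 for xi, ci in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


rng = random.Random(0)

# Real features exist only for seen classes (toy values).
seen = {"dog": [[1.0, 0.0], [0.9, 0.1]], "rain": [[0.0, 1.0], [0.1, 0.9]]}
# Unseen classes are known only through auxiliary (semantic) vectors.
unseen_aux = {"siren": [1.0, 1.0]}

# Generate synthetic features for unseen classes, then train on seen + synthetic.
synthetic = {label: fake_generator(aux, 20, rng) for label, aux in unseen_aux.items()}
centroids = train_nearest_centroid({**seen, **synthetic})

print(predict(centroids, [0.95, 1.05]))
```

The point of the sketch is the data flow: because the classifier is trained on synthetic examples of the unseen class, it can assign that label at test time even though no real example of it was available during training.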

The Director: A Composable Behaviour System with Soft Transitions

Sep 17, 2023

Ysobel Sims, Trent Houliston

Abstract: Software frameworks for behaviour are critical in robotics as they enable the correct and efficient execution of functions. While modern behaviour systems have improved their composability, they do not focus on smooth transitions and often lack functionality. In this work, we present the Director, a novel behaviour framework and algorithm that addresses these problems. It has functionality for soft transitions, multiple implementations of the same action chosen based on conditionals, and strict resource control. This system has shown success in the Humanoid Kid Size 2022/2023 Virtual Season and the Humanoid Kid Size RoboCup 2023 Bordeaux competition.
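
The idea of several implementations of one action, selected by a conditional, can be illustrated with a small sketch. It does not reproduce the Director's API or algorithm: the `Provider` class, the `select_provider` helper, the "get up" action, and the state keys are all hypothetical, and soft transitions and resource control are not modelled.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Provider:
    """One implementation of an action, guarded by a condition on robot state."""
    name: str
    condition: Callable[[Dict[str, bool]], bool]
    run: Callable[[], str]


def select_provider(providers: List[Provider], state: Dict[str, bool]) -> Optional[Provider]:
    """Return the first provider whose condition holds, or None if none applies."""
    for provider in providers:
        if provider.condition(state):
            return provider
    return None


# Two hypothetical implementations of the same "get up" action.
get_up = [
    Provider("get_up_from_front", lambda s: s["fallen"] and s["on_front"], lambda: "roll and push up"),
    Provider("get_up_from_back", lambda s: s["fallen"] and not s["on_front"], lambda: "sit up and stand"),
]

chosen = select_provider(get_up, {"fallen": True, "on_front": False})
print(chosen.name, "->", chosen.run())
```

Callers request the action without knowing which implementation will run; returning `None` when no condition holds leaves the decision about what to do next to the caller.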
