Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Martim Brandão

ChatbotManip: A Dataset to Facilitate Evaluation and Oversight of Manipulative Chatbot Behaviour

Jun 11, 2025

Jack Contro, Simrat Deol, Yulan He, Martim Brandão

Abstract:This paper introduces ChatbotManip, a novel dataset for studying manipulation in Chatbots. It contains simulated generated conversations between a chatbot and a (simulated) user, where the chatbot is explicitly asked to showcase manipulation tactics, persuade the user towards some goal, or simply be helpful. We consider a diverse set of chatbot manipulation contexts, from consumer and personal advice to citizen advice and controversial proposition argumentation. Each conversation is annotated by human annotators for both general manipulation and specific manipulation tactics. Our research reveals three key findings. First, Large Language Models (LLMs) can be manipulative when explicitly instructed, with annotators identifying manipulation in approximately 84\% of such conversations. Second, even when only instructed to be ``persuasive'' without explicit manipulation prompts, LLMs frequently default to controversial manipulative strategies, particularly gaslighting and fear enhancement. Third, small fine-tuned open source models, such as BERT+BiLSTM have a performance comparable to zero-shot classification with larger models like Gemini 2.5 pro in detecting manipulation, but are not yet reliable for real-world oversight. Our work provides important insights for AI safety research and highlights the need of addressing manipulation risks as LLMs are increasingly deployed in consumer-facing applications.

Via

Access Paper or Ask Questions

Seeing Soundscapes: Audio-Visual Generation and Separation from Soundscapes Using Audio-Visual Separator

Apr 25, 2025

Minjae Kang, Martim Brandão

Abstract:Recent audio-visual generative models have made substantial progress in generating images from audio. However, existing approaches focus on generating images from single-class audio and fail to generate images from mixed audio. To address this, we propose an Audio-Visual Generation and Separation model (AV-GAS) for generating images from soundscapes (mixed audio containing multiple classes). Our contribution is threefold: First, we propose a new challenge in the audio-visual generation task, which is to generate an image given a multi-class audio input, and we propose a method that solves this task using an audio-visual separator. Second, we introduce a new audio-visual separation task, which involves generating separate images for each class present in a mixed audio input. Lastly, we propose new evaluation metrics for the audio-visual generation task: Class Representation Score (CRS) and a modified R@K. Our model is trained and evaluated on the VGGSound dataset. We show that our method outperforms the state-of-the-art, achieving 7% higher CRS and 4% higher R@2* in generating plausible images with mixed audio.

* Originally submitted to CVPR 2025 on 2024-11-15 with paper ID 15808

Via

Access Paper or Ask Questions

LLM-Driven Robots Risk Enacting Discrimination, Violence, and Unlawful Actions

Jun 13, 2024

Rumaisa Azeem, Andrew Hundt, Masoumeh Mansouri, Martim Brandão

Abstract:Members of the Human-Robot Interaction (HRI) and Artificial Intelligence (AI) communities have proposed Large Language Models (LLMs) as a promising resource for robotics tasks such as natural language interactions, doing household and workplace tasks, approximating `common sense reasoning', and modeling humans. However, recent research has raised concerns about the potential for LLMs to produce discriminatory outcomes and unsafe behaviors in real-world robot experiments and applications. To address these concerns, we conduct an HRI-based evaluation of discrimination and safety criteria on several highly-rated LLMs. Our evaluation reveals that LLMs currently lack robustness when encountering people across a diverse range of protected identity characteristics (e.g., race, gender, disability status, nationality, religion, and their intersections), producing biased outputs consistent with directly discriminatory outcomes -- e.g. `gypsy' and `mute' people are labeled untrustworthy, but not `european' or `able-bodied' people. Furthermore, we test models in settings with unconstrained natural language (open vocabulary) inputs, and find they fail to act safely, generating responses that accept dangerous, violent, or unlawful instructions -- such as incident-causing misstatements, taking people's mobility aids, and sexual predation. Our results underscore the urgent need for systematic, routine, and comprehensive risk assessments and assurances to improve outcomes and ensure LLMs only operate on robots when it is safe, effective, and just to do so. Data and code will be made available.

* 40 pages (52 with references), 21 Figures, 6 Tables

Via

Access Paper or Ask Questions

Real-Time Volumetric-Semantic Exploration and Mapping: An Uncertainty-Aware Approach

Sep 03, 2021

Rui Pimentel de Figueiredo, Jonas le Fevre Sejersen, Jakob Grimm Hansen, Martim Brandão, Erdal Kayacan

Figure 1 for Real-Time Volumetric-Semantic Exploration and Mapping: An Uncertainty-Aware Approach

Figure 2 for Real-Time Volumetric-Semantic Exploration and Mapping: An Uncertainty-Aware Approach

Figure 3 for Real-Time Volumetric-Semantic Exploration and Mapping: An Uncertainty-Aware Approach

Figure 4 for Real-Time Volumetric-Semantic Exploration and Mapping: An Uncertainty-Aware Approach

Abstract:In this work we propose a holistic framework for autonomous aerial inspection tasks, using semantically-aware, yet, computationally efficient planning and mapping algorithms. The system leverages state-of-the-art receding horizon exploration techniques for next-best-view (NBV) planning with geometric and semantic segmentation information provided by state-of-the-art deep convolutional neural networks (DCNNs), with the goal of enriching environment representations. The contributions of this article are threefold, first we propose an efficient sensor observation model, and a reward function that encodes the expected information gains from the observations taken from specific view points. Second, we extend the reward function to incorporate not only geometric but also semantic probabilistic information, provided by a DCNN for semantic segmentation that operates in real-time. The incorporation of semantic information in the environment representation allows biasing exploration towards specific objects, while ignoring task-irrelevant ones during planning. Finally, we employ our approaches in an autonomous drone shipyard inspection task. A set of simulations in realistic scenarios demonstrate the efficacy and efficiency of the proposed framework when compared with the state-of-the-art.

* IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2021

Via

Access Paper or Ask Questions

On the Advantages of Multiple Stereo Vision Camera Designs for Autonomous Drone Navigation

May 26, 2021

Rui Pimentel de Figueiredo, Jakob Grimm Hansen, Jonas Le Fevre, Martim Brandão, Erdal Kayacan

Figure 1 for On the Advantages of Multiple Stereo Vision Camera Designs for Autonomous Drone Navigation

Figure 2 for On the Advantages of Multiple Stereo Vision Camera Designs for Autonomous Drone Navigation

Figure 3 for On the Advantages of Multiple Stereo Vision Camera Designs for Autonomous Drone Navigation

Figure 4 for On the Advantages of Multiple Stereo Vision Camera Designs for Autonomous Drone Navigation

Abstract:In this work we showcase the design and assessment of the performance of a multi-camera UAV, when coupled with state-of-the-art planning and mapping algorithms for autonomous navigation. The system leverages state-of-the-art receding horizon exploration techniques for Next-Best-View (NBV) planning with 3D and semantic information, provided by a reconfigurable multi stereo camera system. We employ our approaches in an autonomous drone-based inspection task and evaluate them in an autonomous exploration and mapping scenario. We discuss the advantages and limitations of using multi stereo camera flying systems, and the trade-off between number of cameras and mapping performance.

Via

Access Paper or Ask Questions