Pedro A. Santos

Playing Hex and Counter Wargames using Reinforcement Learning and Recurrent Neural Networks

Feb 19, 2025

Guilherme Palma, Pedro A. Santos, João Dias

Abstract:Hex and Counter Wargames are adversarial two-player simulations of real military conflicts requiring complex strategic decision-making. Unlike classical board games, these games feature intricate terrain/unit interactions, unit stacking, large maps of varying sizes, and simultaneous move and combat decisions involving hundreds of units. This paper introduces a novel system designed to address the strategic complexity of Hex and Counter Wargames by integrating cutting-edge advancements in Recurrent Neural Networks with AlphaZero, a reliable modern Reinforcement Learning algorithm. The system utilizes a new Neural Network architecture developed from existing research, incorporating innovative state and action representations tailored to these specific game environments. With minimal training, our solution has shown promising results in typical scenarios, demonstrating the ability to generalize across different terrain and tactical situations. Additionally, we explore the system's potential to scale to larger map sizes. The developed system is openly accessible, facilitating continued research and exploration within this challenging domain.

Multi-Bellman operator for convergence of $Q$-learning with linear function approximation

Sep 28, 2023

Diogo S. Carvalho, Pedro A. Santos, Francisco S. Melo

Abstract:We study the convergence of $Q$-learning with linear function approximation. Our key contribution is the introduction of a novel multi-Bellman operator that extends the traditional Bellman operator. By exploring the properties of this operator, we identify conditions under which the projected multi-Bellman operator becomes contractive, providing improved fixed-point guarantees compared to the Bellman operator. To leverage these insights, we propose the multi $Q$-learning algorithm with linear function approximation. We demonstrate that this algorithm converges to the fixed point of the projected multi-Bellman operator, yielding solutions of arbitrary accuracy. Finally, we validate our approach by applying it to well-known environments, showcasing the effectiveness and applicability of our findings.
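For context, here is a minimal sketch of standard Q-learning with linear function approximation, the baseline setting the abstract starts from. It is not the paper's multi Q-learning algorithm, and it does not implement the multi-Bellman operator. The one-hot features, the two-state toy MDP (chain_step), and all hyperparameters are illustrative assumptions.

```python
import random

def features(state, action, n_states, n_actions):
    """One-hot feature vector for a (state, action) pair (an illustrative choice)."""
    phi = [0.0] * (n_states * n_actions)
    phi[state * n_actions + action] = 1.0
    return phi

def q_value(w, phi):
    """Linear estimate Q(s, a) = w . phi(s, a)."""
    return sum(wi * pi for wi, pi in zip(w, phi))

def greedy_action(w, state, n_states, n_actions):
    """Action with the highest estimated value in the given state."""
    return max(range(n_actions),
               key=lambda a: q_value(w, features(state, a, n_states, n_actions)))

def q_learning_linear(step, n_states, n_actions, episodes=200, horizon=20,
                      alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Standard Q-learning update on the weight vector w (assumed hyperparameters)."""
    rng = random.Random(seed)
    w = [0.0] * (n_states * n_actions)
    for _ in range(episodes):
        s = rng.randrange(n_states)
        for _ in range(horizon):
            # Epsilon-greedy exploration.
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)
            else:
                a = greedy_action(w, s, n_states, n_actions)
            s_next, r = step(s, a)
            phi = features(s, a, n_states, n_actions)
            # Bootstrapped target: r + gamma * max_b Q(s', b).
            best_next = max(q_value(w, features(s_next, b, n_states, n_actions))
                            for b in range(n_actions))
            td_error = r + gamma * best_next - q_value(w, phi)
            # Semi-gradient step: w <- w + alpha * td_error * phi(s, a).
            w = [wi + alpha * td_error * pi for wi, pi in zip(w, phi)]
            s = s_next
    return w

def chain_step(state, action):
    """Hypothetical 2-state toy MDP: action 1 leads to state 1 and pays reward 1."""
    s_next = 1 if action == 1 else 0
    reward = 1.0 if s_next == 1 else 0.0
    return s_next, reward

if __name__ == "__main__":
    weights = q_learning_linear(chain_step, n_states=2, n_actions=2)
    print([greedy_action(weights, s, 2, 2) for s in range(2)])
```

With one-hot features this reduces to tabular Q-learning, which is why it behaves well on the toy problem. With general features the same update can diverge, which is the convergence problem the paper studies.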

Building Persuasive Robots with Social Power Strategies

Jul 12, 2023

Mojgan Hashemian, Marta Couto, Samuel Mascarenhas, Ana Paiva, Pedro A. Santos, Rui Prada

Abstract:Can social power endow social robots with the capacity to persuade? This paper represents our recent endeavor to design persuasive social robots. We have designed and run three different user studies to investigate the effectiveness of different bases of social power (inspired by French and Raven's theory) on people's compliance with the requests of social robots. The results show that robotic persuaders that exert social power (specifically from expert, reward, and coercion bases) demonstrate increased ability to influence humans. The first study provides a positive answer and shows that under the same circumstances, people with different personalities prefer robots using a specific social power base. In addition, social rewards can be useful in persuading individuals. The second study suggests that by employing social power, social robots are capable of persuading people objectively to select a less desirable choice among others. Finally, the third study shows that the effect of power on persuasion does not decay over time and might strengthen under specific circumstances. Moreover, exerting stronger social power does not necessarily lead to higher persuasion. Overall, we argue that the results of these studies are relevant for designing human-robot interaction scenarios, especially those aiming at behavioral change.

GAN-Based Content Generation of Maps for Strategy Games

Jan 07, 2023

Vasco Nunes, João Dias, Pedro A. Santos

Abstract:Maps are a very important component of strategy games, and creating them by hand is a time-consuming task. Maps generated by traditional PCG techniques such as Perlin noise or tile-based PCG techniques look unnatural and unappealing, thus not providing the best user experience for the players. However, it is possible to have a generator that can create realistic and natural images of maps, provided that it is trained to do so. We propose a model for the generation of maps based on Generative Adversarial Networks (GAN). In our implementation, we tested different variants of GAN-based networks on a dataset of heightmaps. We conducted extensive empirical evaluation to determine the advantages and properties of each approach. The results obtained are promising, showing that it is indeed possible to generate realistic-looking maps using this type of approach.

* Published in the Proceedings of GAME-ON'2022, pp. 20-31, ISBN 978-9-492859-22-8

Centralized Training with Hybrid Execution in Multi-Agent Reinforcement Learning

Oct 12, 2022

Pedro P. Santos, Diogo S. Carvalho, Miguel Vasco, Alberto Sardinha, Pedro A. Santos, Ana Paiva, Francisco S. Melo

Abstract:We introduce hybrid execution in multi-agent reinforcement learning (MARL), a new paradigm in which agents aim to successfully perform cooperative tasks with any communication level at execution time by taking advantage of information-sharing among the agents. Under hybrid execution, the communication level can range from a setting in which no communication is allowed between agents (fully decentralized), to a setting featuring full communication (fully centralized). To formalize our setting, we define a new class of multi-agent partially observable Markov decision processes (POMDPs) that we name hybrid-POMDPs, which explicitly models a communication process between the agents. We contribute MARO, an approach that combines an autoregressive predictive model to estimate missing agents' observations, and a dropout-based RL training scheme that simulates different communication levels during the centralized training phase. We evaluate MARO on standard scenarios and extensions of previous benchmarks tailored to emphasize the negative impact of partial observability in MARL. Experimental results show that our method consistently outperforms baselines, allowing agents to act with faulty communication while successfully exploiting shared information.
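The idea of simulating different communication levels during training can be illustrated with a small sketch. This is not MARO's implementation: the function name mask_shared_observations, the per-link drop probability, and the use of None for a missing observation are assumptions made for illustration, and the predictive model that would estimate the missing observations is omitted.

```python
import random

def mask_shared_observations(observations, p_comm, rng):
    """Build each agent's view of the joint observation.

    Every agent always keeps its own observation. Each other agent's
    observation is received with probability p_comm and is otherwise
    replaced by None (a hypothetical marker for "not communicated").
    """
    n_agents = len(observations)
    views = []
    for receiver in range(n_agents):
        view = []
        for sender in range(n_agents):
            if sender == receiver or rng.random() < p_comm:
                view.append(observations[sender])
            else:
                view.append(None)
        views.append(view)
    return views

if __name__ == "__main__":
    rng = random.Random(0)
    joint_obs = ["obs_a", "obs_b", "obs_c"]
    # p_comm = 0.0 is the fully decentralized end of the range and
    # p_comm = 1.0 the fully centralized end; values in between give
    # partial communication.
    for p_comm in (0.0, 0.5, 1.0):
        print(p_comm, mask_shared_observations(joint_obs, p_comm, rng))
```

Sampling a different p_comm for each training episode would expose the agents to the whole range of communication levels, from no sharing to full sharing.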

Emergent social NPC interactions in the Social NPCs Skyrim mod and beyond

Jul 27, 2022

Manuel Guimarães, Pedro A. Santos, Arnav Jhala

Abstract:This work presents an implementation of a social architecture model for authoring Non-Player Characters (NPCs) in open-world games, inspired by academic research on agent-based modeling. Believable NPC authoring is burdensome in terms of rich dialogue and responsive behaviors. We briefly present the characteristics and advantages of using a social agent architecture for this task and describe an implementation of a social agent architecture, CiF-CK, released as the mod Social NPCs for The Elder Scrolls V: Skyrim.

* Originally a chapter for Game AI Pro, contains 14 pages, 3 figures

Towards Explainable Social Agent Authoring tools: A case study on FAtiMA-Toolkit

Jun 07, 2022

Manuel Guimarães, Joana Campos, Pedro A. Santos, João Dias, Rui Prada

Abstract:The deployment of Socially Intelligent Agents (SIAs) in learning environments has proven to have several advantages in different areas of application. Social Agent Authoring Tools allow scenario designers to create tailored experiences with high control over SIAs' behaviour; however, this comes at a cost, as the complexity of the scenarios and their authoring can become overbearing. In this paper we introduce the concept of Explainable Social Agent Authoring Tools with the goal of analysing if authoring tools for social agents are understandable and interpretable. To this end we examine whether an authoring tool, FAtiMA-Toolkit, is understandable and its authoring steps interpretable, from the point of view of the author. We conducted two user studies to quantitatively assess the Interpretability, Comprehensibility and Transparency of FAtiMA-Toolkit from the perspective of a scenario designer. One of the key findings is that FAtiMA-Toolkit's conceptual model is, in general, understandable; however, the emotion-based concepts were not as easily understood and used by the authors. Although there are some positive aspects regarding the explainability of FAtiMA-Toolkit, there is still progress to be made to achieve a fully explainable social agent authoring tool. We provide a set of key concepts and possible solutions that can guide developers to build such tools.

* 24 Pages, 6 figures, in submission limbo

Semantic Norm Recognition and its application to Portuguese Law

Mar 10, 2022

Maria Duarte, Pedro A. Santos, João Dias, Jorge Baptista

Abstract:Being able to clearly interpret legal texts and fully understand our rights, obligations and other legal norms has become progressively more important in the digital society. However, simply giving citizens access to the laws is not enough, as there is a need to provide meaningful information that caters to their specific queries and needs. For this, it is necessary to extract the relevant semantic information present in legal texts. Thus, we introduce the SNR (Semantic Norm Recognition) system, an automatic semantic information extraction system trained on a domain-specific (legal) text corpus taken from Portuguese Consumer Law. The SNR system uses the Portuguese BERT (BERTimbau) and was trained on a legislative Portuguese corpus. We demonstrate how our system achieved good results (81.44% F1-score) on this domain-specific corpus, despite existing noise, and how it can be used to improve downstream tasks such as information retrieval.

Limited depth bandit-based strategy for Monte Carlo planning in continuous action spaces

Jun 29, 2021

Ricardo Quinteiro, Francisco S. Melo, Pedro A. Santos

Abstract:This paper addresses the problem of optimal control using search trees. We start by considering multi-armed bandit problems with continuous action spaces and propose LD-HOO, a limited depth variant of the hierarchical optimistic optimization (HOO) algorithm. We provide a regret analysis for LD-HOO and show that, asymptotically, our algorithm exhibits the same cumulative regret as the original HOO while being faster and more memory efficient. We then propose a Monte Carlo tree search algorithm based on LD-HOO for optimal control problems and illustrate the resulting approach's application in several optimal control problems.

FAtiMA Toolkit -- Toward an effective and accessible tool for the development of intelligent virtual agents and social robots

Mar 04, 2021

Samuel Mascarenhas, Manuel Guimarães, Pedro A. Santos, João Dias, Rui Prada, Ana Paiva

Abstract:More than a decade has passed since the development of FearNot!, an application designed to help children deal with bullying through role-playing with virtual characters. It was also the application that led to the creation of FAtiMA, an affective agent architecture for creating autonomous characters that can evoke empathic responses. In this paper, we describe FAtiMA Toolkit, a collection of open-source tools that is designed to help researchers, game developers and roboticists incorporate a computational model of emotion and decision-making in their work. The toolkit was developed with the goal of making FAtiMA more accessible, easier to incorporate into different projects and more flexible in its capabilities for human-agent interaction, based upon the experience gathered over the years across different virtual environments and human-robot interaction scenarios. As a result, this work makes several different contributions to the field of Agent-Based Architectures. More precisely, FAtiMA Toolkit's library-based design allows developers to easily integrate it with other frameworks, its meta-cognitive model affords different internal reasoners and affective components, and its explicit dialogue structure gives control to the author even within highly complex scenarios. To demonstrate the use of FAtiMA Toolkit, several different use cases where the toolkit was successfully applied are described and discussed.
