Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Leon Fröhling

Agent-Supported Foresight for AI Systemic Risks: AI Agents for Breadth, Experts for Judgment

Feb 09, 2026

Leon Fröhling, Alessandro Giaconia, Edyta Paulina Bogucka, Daniele Quercia

Abstract:AI impact assessments often stress near-term risks because human judgment degrades over longer horizons, exemplifying the Collingridge dilemma: foresight is most needed when knowledge is scarcest. To address long-term systemic risks, we introduce a scalable approach that simulates in-silico agents using the strategic foresight method of the Futures Wheel. We applied it to four AI uses spanning Technology Readiness Levels (TRLs): Chatbot Companion (TRL 9, mature), AI Toy (TRL 7, medium), Griefbot (TRL 5, low), and Death App (TRL 2, conceptual). Across 30 agent runs per use, agents produced 86-110 consequences, condensed into 27-47 unique risks. To benchmark the agent outputs against human perspectives, we collected evaluations from 290 domain experts and 7 leaders, and conducted Futures Wheel sessions with 42 experts and 42 laypeople. Agents generated many systemic consequences across runs. Compared with these outputs, experts identified fewer risks, typically less systemic but judged more likely, whereas laypeople surfaced more emotionally salient concerns that were generally less systemic. We propose a hybrid foresight workflow, wherein agents broaden systemic coverage, and humans provide contextual grounding. Our dataset is available at: https://social-dynamics.net/ai-risks/foresight.

* 48 pages, 15 figures

Via

Access Paper or Ask Questions

SubData: A Python Library to Collect and Combine Datasets for Evaluating LLM Alignment on Downstream Tasks

Dec 21, 2024

Leon Fröhling, Pietro Bernardelle, Gianluca Demartini

Abstract:With the release of ever more capable large language models (LLMs), researchers in NLP and related disciplines have started to explore the usability of LLMs for a wide variety of different annotation tasks. Very recently, a lot of this attention has shifted to tasks that are subjective in nature. Given that the latest generations of LLMs have digested and encoded extensive knowledge about different human subpopulations and individuals, the hope is that these models can be trained, tuned or prompted to align with a wide range of different human perspectives. While researchers already evaluate the success of this alignment via surveys and tests, there is a lack of resources to evaluate the alignment on what oftentimes matters the most in NLP; the actual downstream tasks. To fill this gap we present SubData, a Python library that offers researchers working on topics related to subjectivity in annotation tasks a convenient way of collecting, combining and using a range of suitable datasets.

* 4 pages, 1 figure

Via

Access Paper or Ask Questions

Mapping and Influencing the Political Ideology of Large Language Models using Synthetic Personas

Dec 19, 2024

Pietro Bernardelle, Leon Fröhling, Stefano Civelli, Riccardo Lunardi, Kevin Roiter, Gianluca Demartini

Figure 1 for Mapping and Influencing the Political Ideology of Large Language Models using Synthetic Personas

Figure 2 for Mapping and Influencing the Political Ideology of Large Language Models using Synthetic Personas

Figure 3 for Mapping and Influencing the Political Ideology of Large Language Models using Synthetic Personas

Figure 4 for Mapping and Influencing the Political Ideology of Large Language Models using Synthetic Personas

Abstract:The analysis of political biases in large language models (LLMs) has primarily examined these systems as single entities with fixed viewpoints. While various methods exist for measuring such biases, the impact of persona-based prompting on LLMs' political orientation remains unexplored. In this work we leverage PersonaHub, a collection of synthetic persona descriptions, to map the political distribution of persona-based prompted LLMs using the Political Compass Test (PCT). We then examine whether these initial compass distributions can be manipulated through explicit ideological prompting towards diametrically opposed political orientations: right-authoritarian and left-libertarian. Our experiments reveal that synthetic personas predominantly cluster in the left-libertarian quadrant, with models demonstrating varying degrees of responsiveness when prompted with explicit ideological descriptors. While all models demonstrate significant shifts towards right-authoritarian positions, they exhibit more limited shifts towards left-libertarian positions, suggesting an asymmetric response to ideological manipulation that may reflect inherent biases in model training.

* 4 pages, 2 figures, 2 tables

Via

Access Paper or Ask Questions

Personas with Attitudes: Controlling LLMs for Diverse Data Annotation

Oct 15, 2024

Leon Fröhling, Gianluca Demartini, Dennis Assenmacher

Figure 1 for Personas with Attitudes: Controlling LLMs for Diverse Data Annotation

Figure 2 for Personas with Attitudes: Controlling LLMs for Diverse Data Annotation

Figure 3 for Personas with Attitudes: Controlling LLMs for Diverse Data Annotation

Figure 4 for Personas with Attitudes: Controlling LLMs for Diverse Data Annotation

Abstract:We present a novel approach for enhancing diversity and control in data annotation tasks by personalizing large language models (LLMs). We investigate the impact of injecting diverse persona descriptions into LLM prompts across two studies, exploring whether personas increase annotation diversity and whether the impacts of individual personas on the resulting annotations are consistent and controllable. Our results show that persona-prompted LLMs produce more diverse annotations than LLMs prompted without personas and that these effects are both controllable and repeatable, making our approach a suitable tool for improving data annotation in subjective NLP tasks like toxicity detection.

* 21 pages, 13 figures

Via

Access Paper or Ask Questions

The Unseen Targets of Hate -- A Systematic Review of Hateful Communication Datasets

May 14, 2024

Zehui Yu, Indira Sen, Dennis Assenmacher, Mattia Samory, Leon Fröhling, Christina Dahn, Debora Nozza, Claudia Wagner

Abstract:Machine learning (ML)-based content moderation tools are essential to keep online spaces free from hateful communication. Yet, ML tools can only be as capable as the quality of the data they are trained on allows them. While there is increasing evidence that they underperform in detecting hateful communications directed towards specific identities and may discriminate against them, we know surprisingly little about the provenance of such bias. To fill this gap, we present a systematic review of the datasets for the automated detection of hateful communication introduced over the past decade, and unpack the quality of the datasets in terms of the identities that they embody: those of the targets of hateful communication that the data curators focused on, as well as those unintentionally included in the datasets. We find, overall, a skewed representation of selected target identities and mismatches between the targets that research conceptualizes and ultimately includes in datasets. Yet, by contextualizing these findings in the language and location of origin of the datasets, we highlight a positive trend towards the broadening and diversification of this research space.

* 20 pages, 14 figures

Via

Access Paper or Ask Questions