Jan Batzner

Agent Benchmarks Fail Public Sector Requirements

Jan 28, 2026

One Persona, Many Cues, Different Results: How Sociodemographic Cues Impact LLM Personalization

Jan 26, 2026

Who Evaluates AI's Social Impacts? Mapping Coverage and Gaps in First and Third Party Evaluations

Nov 06, 2025

Oversight Structures for Agentic AI in Public-Sector Organizations

Jun 05, 2025

GermanPartiesQA: Benchmarking Commercial Large Language Models for Political Bias and Sycophancy

Jul 25, 2024

Immunization against harmful fine-tuning attacks

Feb 26, 2024