Patrick Schramowski

No Safe Dose: How Training Data Drives Unsafe Image Generation

May 27, 2026
Via arXiv

Multilingual Steering by Design: Multilingual Sparse Autoencoders and Principled Layer Selection

May 21, 2026
Via arXiv

LLMs Gaming Verifiers: RLVR can Lead to Reward Hacking

Apr 16, 2026
Via arXiv

Breaking Up with Normatively Monolithic Agency with GRACE: A Reason-Based Neuro-Symbolic Architecture for Safe and Ethical AI Alignment

Jan 15, 2026
Via arXiv

CLaS-Bench: A Cross-Lingual Alignment and Steering Benchmark

Jan 13, 2026
Via arXiv

LIME: Making LLM Data More Efficient with Linguistic Metadata Embeddings

Dec 08, 2025
Via arXiv

CHRONOBERG: Capturing Language Evolution and Temporal Awareness in Foundation Models

Sep 26, 2025
Via arXiv

Measuring and Guiding Monosemanticity

Jun 24, 2025
Via arXiv

Judging Quality Across Languages: A Multilingual Approach to Pretraining Data Filtering with Language Models

May 28, 2025
Via arXiv

MSTS: A Multimodal Safety Test Suite for Vision-Language Models

Jan 17, 2025
Via arXiv