Maarten Sap

Shammie

Rejected Dialects: Biases Against African American Language in Reward Models

Feb 18, 2025
Via arXiv

Interactive Agents to Overcome Ambiguity in Software Engineering

Feb 18, 2025
Via arXiv

AutoPresent: Designing Structured Visuals from Scratch

Jan 01, 2025
Via arXiv

Multi-Attribute Constraint Satisfaction via Language Model Rewriting

Dec 26, 2024
Via arXiv

Minion: A Technology Probe for Resolving Value Conflicts through Expert-Driven and User-Driven Strategies in AI Companion Applications

Nov 11, 2024
Via arXiv

SafetyAnalyst: Interpretable, transparent, and steerable LLM safety moderation

Oct 22, 2024
Via arXiv

BIG5-CHAT: Shaping LLM Personalities Through Training on Human-Grounded Data

Oct 21, 2024
Via arXiv

Data Defenses Against Large Language Models

Oct 17, 2024
Via arXiv

HAICOSYSTEM: An Ecosystem for Sandboxing Safety Risks in Human-AI Interactions

Sep 26, 2024
Via arXiv

AI-LieDar: Examine the Trade-off Between Utility and Truthfulness in LLM Agents

Sep 13, 2024
Via arXiv