Jan-Philipp Fränken

MARPLE: A Benchmark for Long-Horizon Inference

Oct 02, 2024

Human-like Affective Cognition in Foundation Models

Sep 19, 2024

PERSONA: A Reproducible Testbed for Pluralistic Alignment

Jul 24, 2024

Self-Supervised Alignment with Mutual Information: Learning to Follow Principles without Preference Labels

Apr 22, 2024

Procedural Dilemma Generation for Evaluating Moral Reasoning in Humans and Language Models

Apr 17, 2024

STaR-GATE: Teaching Language Models to Ask Clarifying Questions

Mar 29, 2024

Social Contract AI: Aligning AI Assistants with Implicit Group Norms

Oct 26, 2023

Understanding Social Reasoning in Language Models with Language Models

Jun 21, 2023