Max Kleiman-Weiner

SafetyAnalyst: Interpretable, transparent, and steerable LLM safety moderation

Oct 22, 2024
Via arXiv

Value Internalization: Learning and Generalizing from Social Reward

Jul 19, 2024
Via arXiv

Multilingual Trolley Problems for Language Models

Jul 02, 2024
Via arXiv

Cooperate or Collapse: Emergence of Sustainability Behaviors in a Society of LLM Agents

Apr 25, 2024
Via arXiv

CLadder: A Benchmark to Assess Causal Reasoning Capabilities of Language Models

Dec 07, 2023
Via arXiv

Learning to Coordinate with Humans using Action Features

Jan 29, 2022
Via arXiv

When Is It Acceptable to Break the Rules? Knowledge Representation of Moral Judgement Based on Empirical Data

Jan 19, 2022
Via arXiv

Modeling Communication to Coordinate Perspectives in Cooperation

Jun 03, 2021
Via arXiv

Too many cooks: Coordinating multi-agent collaboration through inverse planning

Mar 26, 2020
Via arXiv

Finding Friend and Foe in Multi-Agent Games

Jun 05, 2019
Via arXiv