Erik Miehling

Granite Guardian

Dec 10, 2024

Evaluating the Prompt Steerability of Large Language Models

Nov 19, 2024

Programming Refusal with Conditional Activation Steering

Sep 06, 2024

CELL your Model: Contrastive Explanation Methods for Large Language Models

Jun 17, 2024

Language Models in Dialogue: Conversational Maxims for Human-AI Interactions

Mar 22, 2024

Detectors for Safe and Reliable LLMs: Implementations, Uses, and Limitations

Mar 09, 2024

Reinforcement Learning in Non-Stationary Discrete-Time Linear-Quadratic Mean-Field Games

Oct 01, 2020

Information State Embedding in Partially Observable Cooperative Multi-Agent Reinforcement Learning

Apr 18, 2020

Non-Cooperative Inverse Reinforcement Learning

Nov 03, 2019

Online Planning for Decentralized Stochastic Control with Partial History Sharing

Aug 06, 2019