Sam Toyer

Trading Inference-Time Compute for Adversarial Robustness

Jan 31, 2025

OpenAI o1 System Card

Dec 21, 2024

Deliberative Alignment: Reasoning Enables Safer Language Models

Dec 20, 2024

Human Action Anticipation: A Survey

Oct 17, 2024

Exploring and Addressing Reward Confusion in Offline Preference Learning

Jul 22, 2024

A StrongREJECT for Empty Jailbreaks

Feb 15, 2024

Tensor Trust: Interpretable Prompt Injection Attacks from an Online Game

Nov 02, 2023

imitation: Clean Imitation Learning Implementations

Nov 22, 2022

An Empirical Investigation of Representation Learning for Imitation

May 16, 2022

A Primer on Maximum Causal Entropy Inverse Reinforcement Learning

Mar 22, 2022