Itamar Pres

Competition Dynamics Shape Algorithmic Phases of In-Context Learning

Dec 01, 2024

Towards Reliable Evaluation of Behavior Steering Interventions in LLMs

Oct 22, 2024

A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity

Jan 03, 2024