Sumeet Ramesh Motwani

MALT: Improving Reasoning with Multi-Agent LLM Training

Dec 02, 2024
Via arXiv

Unelicitable Backdoors in Language Models via Cryptographic Transformer Circuits

Jun 03, 2024
Via arXiv

Secret Collusion Among Generative AI Agents

Feb 12, 2024
Via arXiv

STARC: A General Framework For Quantifying Differences Between Reward Functions

Sep 26, 2023
Via arXiv