Picture for Usman Anwar

Usman Anwar

The Reality of AI and Biorisk

Add code
Dec 02, 2024
Viaarxiv icon

Learning to Forget using Hypernetworks

Add code
Dec 01, 2024
Viaarxiv icon

Comparing Bottom-Up and Top-Down Steering Approaches on In-Context Learning Tasks

Add code
Nov 11, 2024
Viaarxiv icon

Adversarial Robustness of In-Context Learning in Transformers for Linear Regression

Add code
Nov 07, 2024
Viaarxiv icon

Noisy Zero-Shot Coordination: Breaking The Common Knowledge Assumption In Zero-Shot Coordination Games

Add code
Nov 07, 2024
Viaarxiv icon

IDs for AI Systems

Add code
Jun 17, 2024
Viaarxiv icon

Foundational Challenges in Assuring Alignment and Safety of Large Language Models

Add code
Apr 15, 2024
Figure 1 for Foundational Challenges in Assuring Alignment and Safety of Large Language Models
Figure 2 for Foundational Challenges in Assuring Alignment and Safety of Large Language Models
Figure 3 for Foundational Challenges in Assuring Alignment and Safety of Large Language Models
Figure 4 for Foundational Challenges in Assuring Alignment and Safety of Large Language Models
Viaarxiv icon

Reward Model Ensembles Help Mitigate Overoptimization

Add code
Oct 04, 2023
Viaarxiv icon

Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback

Add code
Jul 27, 2023
Figure 1 for Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
Figure 2 for Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
Figure 3 for Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
Figure 4 for Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
Viaarxiv icon

Domain Generalization for Robust Model-Based Offline Reinforcement Learning

Add code
Nov 27, 2022
Viaarxiv icon