Aleksander Madry

Optimizing ML Training with Metagradient Descent

Mar 17, 2025

Monitoring Reasoning Models for Misbehavior and the Risks of Promoting Obfuscation

Mar 14, 2025

Do Large Language Model Benchmarks Test Reliability?

Feb 05, 2025

OpenAI o1 System Card

Dec 21, 2024

Attribute-to-Delete: Machine Unlearning via Datamodel Matching

Oct 30, 2024

ContextCite: Attributing Model Generation to Context

Sep 01, 2024

Data Debiasing with Datamodels (D3M): Improving Subgroup Robustness via Data Selection

Jun 24, 2024

Measuring Strategization in Recommendation: Users Adapt Their Behavior to Shape Future Content

May 09, 2024

Decomposing and Editing Predictions by Modeling Model Computation

Apr 17, 2024

Ask Your Distribution Shift if Pre-Training is Right for You

Feb 29, 2024