Sarah Schwettmann

Establishing Best Practices for Building Rigorous Agentic Benchmarks

Jul 03, 2025
Via arXiv

The Singapore Consensus on Global AI Safety Research Priorities

Jun 25, 2025
Via arXiv

Line of Sight: On Linear Representations in VLLMs

Jun 05, 2025
Via arXiv

Eliciting Language Model Behaviors with Investigator Agents

Feb 03, 2025
Via arXiv

Nearest Neighbor Normalization Improves Multimodal Retrieval

Oct 31, 2024
Via arXiv

A Multimodal Automated Interpretability Agent

Apr 22, 2024
Via arXiv

Automatic Discovery of Visual Circuits

Apr 22, 2024
Via arXiv

A Function Interpretation Benchmark for Evaluating Interpretability Methods

Sep 07, 2023
Via arXiv

Multimodal Neurons in Pretrained Text-Only Transformers

Aug 03, 2023
Via arXiv

Natural Language Descriptions of Deep Visual Features

Jan 26, 2022
Via arXiv