Picture for Aaron Mueller

Aaron Mueller

Incremental Sentence Processing Mechanisms in Autoregressive Transformer Language Models

Add code
Dec 06, 2024
Viaarxiv icon

Findings of the Second BabyLM Challenge: Sample-Efficient Pretraining on Developmentally Plausible Corpora

Add code
Dec 06, 2024
Viaarxiv icon

Characterizing the Role of Similarity in the Property Inferences of Language Models

Add code
Oct 29, 2024
Viaarxiv icon

Arithmetic Without Algorithms: Language Models Solve Math With a Bag of Heuristics

Add code
Oct 28, 2024
Viaarxiv icon

The Quest for the Right Mediator: A History, Survey, and Theoretical Grounding of Causal Interpretability

Add code
Aug 02, 2024
Viaarxiv icon

NNsight and NDIF: Democratizing Access to Foundation Model Internals

Add code
Jul 18, 2024
Figure 1 for NNsight and NDIF: Democratizing Access to Foundation Model Internals
Figure 2 for NNsight and NDIF: Democratizing Access to Foundation Model Internals
Figure 3 for NNsight and NDIF: Democratizing Access to Foundation Model Internals
Figure 4 for NNsight and NDIF: Democratizing Access to Foundation Model Internals
Viaarxiv icon

Missed Causes and Ambiguous Effects: Counterfactuals Pose Challenges for Interpreting Neural Networks

Add code
Jul 05, 2024
Viaarxiv icon

[Call for Papers] The 2nd BabyLM Challenge: Sample-efficient pretraining on a developmentally plausible corpus

Add code
Apr 09, 2024
Figure 1 for [Call for Papers] The 2nd BabyLM Challenge: Sample-efficient pretraining on a developmentally plausible corpus
Viaarxiv icon

Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models

Add code
Mar 31, 2024
Figure 1 for Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models
Figure 2 for Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models
Figure 3 for Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models
Figure 4 for Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models
Viaarxiv icon

In-context Learning Generalizes, But Not Always Robustly: The Case of Syntax

Add code
Nov 13, 2023
Figure 1 for In-context Learning Generalizes, But Not Always Robustly: The Case of Syntax
Figure 2 for In-context Learning Generalizes, But Not Always Robustly: The Case of Syntax
Figure 3 for In-context Learning Generalizes, But Not Always Robustly: The Case of Syntax
Figure 4 for In-context Learning Generalizes, But Not Always Robustly: The Case of Syntax
Viaarxiv icon