David Bau

Machine Unlearning Doesn't Do What You Think: Lessons for Generative AI Policy, Research, and Practice

Dec 09, 2024
Via arXiv

Art-Free Generative Models: Art Creation Without Graphic Art Knowledge

Nov 29, 2024
Via arXiv

Erasing Conceptual Knowledge from Language Models

Oct 03, 2024
Via arXiv

The Quest for the Right Mediator: A History, Survey, and Theoretical Grounding of Causal Interpretability

Aug 02, 2024
Via arXiv

Measuring Progress in Dictionary Learning for Language Model Interpretability with Board Game Models

Jul 31, 2024
Via arXiv

NNsight and NDIF: Democratizing Access to Foundation Model Internals

Jul 18, 2024
Via arXiv

Token Erasure as a Footprint of Implicit Vocabulary Items in LLMs

Jun 28, 2024
Via arXiv

Customizing Text-to-Image Models with a Single Image Pair

May 02, 2024
Via arXiv

Locating and Editing Factual Associations in Mamba

Apr 04, 2024
Via arXiv

Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models

Mar 31, 2024
Via arXiv