Dan Friedman

Continual Memorization of Factoids in Large Language Models

Nov 11, 2024
Via arXiv

When a language model is optimized for reasoning, does it still show embers of autoregression? An analysis of OpenAI o1

Oct 02, 2024
Via arXiv

Representing Rule-based Chatbots with Transformers

Jul 15, 2024
Via arXiv

Finding Transformer Circuits with Edge Pruning

Jun 24, 2024
Via arXiv

The Heuristic Core: Understanding Subnetwork Generalization in Pretrained Language Models

Mar 06, 2024
Via arXiv

Interpretability Illusions in the Generalization of Simplified Models

Dec 06, 2023
Via arXiv

Embers of Autoregression: Understanding Large Language Models Through the Problem They are Trained to Solve

Sep 24, 2023
Via arXiv

Learning Transformer Programs

Jun 01, 2023
Via arXiv

Measuring Inductive Biases of In-Context Learning with Underspecified Demonstrations

May 22, 2023
Via arXiv

The Vendi Score: A Diversity Evaluation Metric for Machine Learning

Oct 05, 2022
Via arXiv