
William Merrill

2 OLMo 2 Furious

Dec 31, 2024
Via arXiv

Evaluating $n$-Gram Novelty of Language Models Using Rusty-DAWG

Jun 18, 2024
Via arXiv

Let's Think Dot by Dot: Hidden Computation in Transformer Language Models

Apr 24, 2024
Via arXiv

The Illusion of State in State-Space Models

Apr 12, 2024
Via arXiv

Can You Learn Semantics Through Next-Word Prediction? The Case of Entailment

Feb 29, 2024
Via arXiv

OLMo: Accelerating the Science of Language Models

Feb 07, 2024
Via arXiv

Transformers as Recognizers of Formal Languages: A Survey on Expressivity

Nov 01, 2023
Via arXiv

The Expressive Power of Transformers with Chain of Thought

Oct 18, 2023
Via arXiv

How Language Model Hallucinations Can Snowball

May 22, 2023
Via arXiv

A Tale of Two Circuits: Grokking as Competition of Sparse and Dense Subnetworks

Mar 21, 2023
Via arXiv