Sham M. Kakade

Self-Improving Language Models with Bidirectional Evolutionary Search

May 27, 2026

Evaluating Relational Reasoning in LLMs with REL

Apr 14, 2026

Matching Features, Not Tokens: Energy-Based Fine-Tuning of Language Models

Mar 12, 2026

The Role of Sparsity for Length Generalization in Transformers

Feb 24, 2025

Mixture of Parrots: Experts improve memorization more than reasoning

Oct 24, 2024

Multi-Agent Reinforcement Learning from Human Feedback: Data Coverage and Algorithmic Techniques

Sep 04, 2024

Eliminating Position Bias of Language Models: A Mechanistic Approach

Jul 01, 2024

Transcendence: Generative Models Can Outperform The Experts That Train Them

Jun 17, 2024

Scaling Laws in Linear Regression: Compute, Parameters, and Data

Jun 12, 2024

Superposed Decoding: Multiple Generations from a Single Autoregressive Inference Pass

May 29, 2024