Dale Schuurmans

University of Alberta

Toward Understanding In-context vs. In-weight Learning

Oct 30, 2024
Via arXiv

Faster WIND: Accelerating Iterative Best-of-$N$ Distillation for LLM Alignment

Oct 28, 2024
Via arXiv

Plastic Learning with Deep Fourier Features

Oct 27, 2024
Via arXiv

Autoregressive Large Language Models are Computationally Universal

Oct 04, 2024
Via arXiv

Generative Hierarchical Materials Search

Sep 10, 2024
Via arXiv

Exploring and Benchmarking the Planning Capabilities of Large Language Models

Jun 18, 2024
Via arXiv

Learning Continually by Spectral Regularization

Jun 10, 2024
Via arXiv

Target Networks and Over-parameterization Stabilize Off-policy Bootstrapping with Function Approximation

May 31, 2024
Via arXiv

Value-Incentivized Preference Optimization: A Unified Approach to Online and Offline RLHF

May 29, 2024
Via arXiv

Soft Preference Optimization: Aligning Language Models to Expert Distributions

Apr 30, 2024
Via arXiv