Aaron Courville

Université de Montréal

A Comedy of Estimators: On KL Regularization in RL Training of LLMs

Dec 26, 2025

Shape of Thought: When Distribution Matters More than Correctness in Reasoning Tasks

Dec 24, 2025

Asymmetric Proximal Policy Optimization: mini-critics boost LLM reasoning

Oct 02, 2025

Compositional Discrete Latent Code for High Fidelity, Productive Diffusion Models

Jul 16, 2025

Stable Gradients for Stable Learning at Scale in Deep Reinforcement Learning

Jun 18, 2025

Adaptive Accompaniment with ReaLchords

Jun 17, 2025

The Courage to Stop: Overcoming Sunk Cost Fallacy in Deep Reinforcement Learning

Jun 16, 2025

Bias Analysis in Unconditional Image Generative Models

Jun 10, 2025

Measure gradients, not activations! Enhancing neuronal activity in deep reinforcement learning

May 29, 2025

FLAM: Frame-Wise Language-Audio Modeling

May 08, 2025