Alexandre Piché

Self-Evolving Curriculum for LLM Reasoning

May 20, 2025
Via arXiv

LLMs can learn self-restraint through iterative self-reflection

May 15, 2024
Via arXiv

Causal Discovery with Language Models as Imperfect Experts

Jul 05, 2023
Via arXiv

Can large language models build causal graphs?

Mar 07, 2023
Via arXiv

Unsupervised Model-based Pre-training for Data-efficient Control from Pixels

Sep 24, 2022
Via arXiv

Beyond Target Networks: Improving Deep $Q$-learning with Functional Regularization

Jun 07, 2021
Via arXiv

Iterative Amortized Policy Optimization

Oct 20, 2020
Via arXiv

Adaptive Stochastic Dual Coordinate Ascent for Conditional Random Fields

Jul 10, 2018
Via arXiv

Reward Estimation for Variance Reduction in Deep Reinforcement Learning

May 09, 2018
Via arXiv

Bayesian Nonparametric Modeling of Heterogeneous Groups of Censored Data

Dec 02, 2016
Via arXiv