Alexandre Piché

LLMs can learn self-restraint through iterative self-reflection

May 15, 2024
Via arXiv

Causal Discovery with Language Models as Imperfect Experts

Jul 05, 2023
Via arXiv

Can large language models build causal graphs?

Mar 07, 2023
Via arXiv

Unsupervised Model-based Pre-training for Data-efficient Control from Pixels

Sep 24, 2022
Via arXiv

Beyond Target Networks: Improving Deep $Q$-learning with Functional Regularization

Jun 07, 2021
Via arXiv

Iterative Amortized Policy Optimization

Oct 20, 2020
Via arXiv

Adaptive Stochastic Dual Coordinate Ascent for Conditional Random Fields

Jul 10, 2018
Via arXiv

Reward Estimation for Variance Reduction in Deep Reinforcement Learning

May 09, 2018
Via arXiv

Bayesian Nonparametric Modeling of Heterogeneous Groups of Censored Data

Dec 02, 2016
Via arXiv