Assaf Hallak

PlaMo: Plan and Move in Rich 3D Physical Environments

Jun 26, 2024
Via arXiv

SoftTreeMax: Exponential Variance Reduction in Policy Gradient via Tree Search

Jan 30, 2023
Via arXiv

SoftTreeMax: Policy Gradient with Tree Search

Sep 28, 2022
Via arXiv

Reinforcement Learning with a Terminator

May 30, 2022
Via arXiv

Planning and Learning with Adaptive Lookahead

Jan 28, 2022
Via arXiv

On Covariate Shift of Latent Confounders in Imitation and Reinforcement Learning

Oct 13, 2021
Via arXiv

Improve Agents without Retraining: Parallel Tree Search with Off-Policy Correction

Jul 04, 2021
Via arXiv

Automatic Representation for Lifetime Value Recommender Systems

Feb 23, 2017
Via arXiv

Consistent On-Line Off-Policy Evaluation

Feb 23, 2017
Via arXiv

Generalized Emphatic Temporal Difference Learning: Bias-Variance Analysis

Nov 27, 2015
Via arXiv