Daniil Tiapkin

CMAP, LMO

Federated UCBVI: Communication-Efficient Federated Regret Minimization with Heterogeneous Agents

Oct 30, 2024

Optimizing Backward Policies in GFlowNets via Trajectory Likelihood Maximization

Oct 20, 2024

Narrowing the Gap between Adversarial and Stochastic MDPs via Policy Optimization

Jul 08, 2024

Improving GFlowNets with Monte Carlo Tree Search

Jun 19, 2024

Incentivized Learning in Principal-Agent Bandit Games

Mar 06, 2024

Model-free Posterior Sampling via Learning Rate Randomization

Oct 27, 2023

Demonstration-Regularized RL

Oct 26, 2023

Generative Flow Networks as Entropy-Regularized RL

Oct 23, 2023

Finite-Sample Analysis of the Temporal Difference Learning

Oct 22, 2023

Sharp Deviations Bounds for Dirichlet Weighted Sums with Application to analysis of Bayesian algorithms

Apr 06, 2023