Yichi Zhou

Stabilizing Policy Gradients for Stochastic Differential Equations via Consistency with Perturbation Process

Mar 07, 2024
Via arXiv

Simultaneously Learning Stochastic and Adversarial Bandits with General Graph Feedback

Jun 16, 2022
Via arXiv

Regularized OFU: an Efficient UCB Estimator for Non-linear Contextual Bandit

Jun 29, 2021
Via arXiv

Lazy-CFR: a fast regret minimization algorithm for extensive games with imperfect information

Oct 10, 2018
Via arXiv

Label Aggregation via Finding Consensus Between Models

Jul 19, 2018
Via arXiv

Racing Thompson: an Efficient Algorithm for Thompson Sampling with Non-conjugate Priors

Aug 16, 2017
Via arXiv