Zheqing Zhu

Exploiting Structure in Offline Multi-Agent RL: The Benefits of Low Interaction Rank

Oct 01, 2024

Uncertainty of Joint Neural Contextual Bandit

Jun 04, 2024

Pearl: A Production-ready Reinforcement Learning Agent

Dec 06, 2023

Non-Stationary Contextual Bandit Learning via Neural Predictive Ensemble Sampling

Oct 14, 2023

Offline Reinforcement Learning for Optimizing Production Bidding Policies

Oct 13, 2023

Scalable Neural Contextual Bandit for Recommender Systems

Jun 26, 2023

IQL-TD-MPC: Implicit Q-Learning for Hierarchical Model Predictive Control

Jun 01, 2023

Optimizing Long-term Value for Auction-Based Recommender Systems via On-Policy Reinforcement Learning

May 24, 2023

Optimism Based Exploration in Large-Scale Recommender Systems

Apr 05, 2023

Deep Exploration for Recommendation Systems

Sep 26, 2021