Ruohan Zhan

Distributionally Robust Policy Learning under Concept Drifts

Dec 18, 2024
Via arXiv

Adaptively Learning to Select-Rank in Online Platforms

Jun 07, 2024
Via arXiv

Statistical Properties of Robust Satisficing

May 30, 2024
Via arXiv

Proportional Response: Contextual Bandits for Simple and Cumulative Regret Minimization

Jul 05, 2023
Via arXiv

Post-Episodic Reinforcement Learning Inference

Feb 17, 2023
Via arXiv

Two-Stage Constrained Actor-Critic for Short Video Recommendation

Feb 06, 2023
Via arXiv

Deconfounding Duration Bias in Watch-time Prediction for Video Recommendation

Jun 13, 2022
Via arXiv

ResAct: Reinforcing Long-term Engagement in Sequential Recommendation with Residual Actor

Jun 01, 2022
Via arXiv

Constrained Reinforcement Learning for Short Video Recommendation

May 26, 2022
Via arXiv

Off-Policy Evaluation via Adaptive Weighting with Data from Contextual Bandits

Jun 10, 2021
Via arXiv