Wanqi Xue

Two-Stage Constrained Actor-Critic for Short Video Recommendation

Feb 06, 2023
Via arXiv

Reinforcement Learning from Diverse Human Preferences

Jan 30, 2023
Via arXiv

PrefRec: Preference-based Recommender Systems for Reinforcing Long-term User Engagement

Dec 06, 2022
Via arXiv

ResAct: Reinforcing Long-term Engagement in Sequential Recommendation with Residual Actor

Jun 01, 2022
Via arXiv

NSGZero: Efficiently Learning Non-Exploitable Policy in Large-Scale Network Security Games with Neural Monte Carlo Tree Search

Jan 17, 2022
Via arXiv

Mis-spoke or mis-lead: Achieving Robustness in Multi-Agent Communicative Reinforcement Learning

Aug 09, 2021
Via arXiv

Solving Large-Scale Extensive-Form Network Security Games via Neural Fictitious Self-Play

Jun 02, 2021
Via arXiv

CFR-MIX: Solving Imperfect Information Extensive-Form Games with Combinatorial Action Space

May 18, 2021
Via arXiv

One-Shot Image Classification by Learning to Restore Prototypes

May 04, 2020
Via arXiv