Yaozhong Gan

AdaMemento: Adaptive Memory-Assisted Policy Optimization for Reinforcement Learning

Oct 06, 2024
Via arXiv

The Exploration-Exploitation Dilemma Revisited: An Entropy Perspective

Aug 19, 2024
Via arXiv

Reflective Policy Optimization

Jun 06, 2024
Via arXiv

Transductive Off-policy Proximal Policy Optimization

Jun 06, 2024
Via arXiv

Smoothing Advantage Learning

Mar 20, 2022
Via arXiv

Robust Action Gap Increasing with Clipped Advantage Learning

Mar 20, 2022
Via arXiv

Stabilizing Q Learning Via Soft Mellowmax Operator

Dec 18, 2020
Via arXiv

Trust Region-Guided Proximal Policy Optimization

Jan 29, 2019
Via arXiv