Xiaoyang Tan

Transductive Off-policy Proximal Policy Optimization

Jun 06, 2024

Highway Reinforcement Learning

May 28, 2024

HiQA: A Hierarchical Contextual Augmentation RAG for Massive Documents QA

Feb 01, 2024

ProxyFormer: Proxy Alignment Assisted Point Cloud Completion with Missing Part Sensitive Transformer

Feb 28, 2023

Contextual Conservative Q-Learning for Offline Reinforcement Learning

Jan 16, 2023

Smoothing Advantage Learning

Mar 20, 2022

Robust Action Gap Increasing with Clipped Advantage Learning

Mar 20, 2022

A Cooperative-Competitive Multi-Agent Framework for Auto-bidding in Online Advertising

Jun 11, 2021

Greedy Multi-step Off-Policy Reinforcement Learning

Mar 07, 2021

Stabilizing Q Learning Via Soft Mellowmax Operator

Dec 18, 2020