Hongyao Tang

Can We Optimize Deep RL Policy Weights as Trajectory Modeling?

Mar 06, 2025

Dual Ensembled Multiagent Q-Learning with Hypernet Regularizer

Feb 04, 2025

Improving Deep Reinforcement Learning by Reducing the Chain Effect of Value and Policy Churn

Sep 07, 2024

MFE-ETP: A Comprehensive Evaluation Benchmark for Multi-modal Foundation Models on Embodied Task Planning

Jul 06, 2024

Bridging Evolutionary Algorithms and Reinforcement Learning: A Comprehensive Survey

Jan 22, 2024

The Ladder in Chaos: A Simple and Effective Improvement to General DRL Algorithms by Policy Path Trimming and Boosting

Mar 02, 2023

State-Aware Proximal Pessimistic Algorithms for Offline Reinforcement Learning

Nov 28, 2022

ERL-Re$^2$: Efficient Evolutionary Reinforcement Learning with Shared State Representation and Individual Policy Representation

Oct 26, 2022

Towards A Unified Policy Abstraction Theory and Representation Learning Approach in Markov Decision Processes

Sep 16, 2022

PAnDR: Fast Adaptation to New Environments from Offline Experiences via Decoupling Policy and Environment Representations

Apr 06, 2022