Yixiu Mao

Latent Reward: LLM-Empowered Credit Assignment in Episodic Reinforcement Learning

Dec 15, 2024
Via arXiv

Doubly Mild Generalization for Offline Reinforcement Learning

Nov 13, 2024
Via arXiv

Offline Reinforcement Learning with OOD State Correction and OOD Action Suppression

Oct 28, 2024
Via arXiv

Choices are More Important than Efforts: LLM Enables Efficient Multi-Agent Exploration

Oct 03, 2024
Via arXiv

Robust Fast Adaptation from Adversarially Explicit Task Distribution Generation

Jul 28, 2024
Via arXiv

Supported Trust Region Optimization for Offline Reinforcement Learning

Nov 15, 2023
Via arXiv

A Hypergradient Approach to Robust Regression without Correspondence

Nov 30, 2020
Via arXiv