Yanchao Sun

Statistical Guarantees for Lifelong Reinforcement Learning using PAC-Bayesian Theory

Nov 01, 2024
Via arXiv

TIS-DPO: Token-level Importance Sampling for Direct Preference Optimization With Estimated Weights

Oct 06, 2024
Via arXiv

Beyond Worst-case Attacks: Robust RL with Adaptive Defense via Non-dominated Policies

Feb 20, 2024
Via arXiv

Shadowcast: Stealthy Data Poisoning Attacks Against Vision-Language Models

Feb 05, 2024
Via arXiv

O3D: Offline Data-driven Discovery and Distillation for Sequential Decision-Making with Large Language Models

Oct 22, 2023
Via arXiv

Robustness to Multi-Modal Environment Uncertainty in MARL using Curriculum Learning

Oct 12, 2023
Via arXiv

COPlanner: Plan to Roll Out Conservatively but to Explore Optimistically for Model-Based RL

Oct 11, 2023
Via arXiv

Learning Generalizable Agents via Saliency-Guided Features Decorrelation

Oct 08, 2023
Via arXiv

Safe and Robust Multi-Agent Reinforcement Learning for Connected Autonomous Vehicles under State Perturbations

Sep 20, 2023
Via arXiv

Equal Long-term Benefit Rate: Adapting Static Fairness Notions to Sequential Decision Making

Sep 07, 2023
Via arXiv