Kaiqing Zhang

Last-Iterate Convergence of Payoff-Based Independent Learning in Zero-Sum Stochastic Games

Sep 02, 2024
Via arXiv

Principled RLHF from Heterogeneous Feedback via Personalization and Preference Aggregation

Apr 30, 2024
Via arXiv

Do LLM Agents Have Regret? A Case Study in Online Learning and Games

Mar 25, 2024
Via arXiv

Two-Timescale Q-Learning with Function Approximation in Zero-Sum Stochastic Games

Dec 08, 2023
Via arXiv

Fleet Policy Learning via Weight Merging and An Application to Robotic Tool-Use

Oct 02, 2023
Via arXiv

Partially Observable Multi-agent RL with (Quasi-)Efficiency: The Blessing of Information Sharing

Aug 16, 2023
Via arXiv

Tackling Combinatorial Distribution Shift: A Matrix Completion Perspective

Jul 28, 2023
Via arXiv

Multi-Player Zero-Sum Markov Games with Networked Separable Interactions

Jul 13, 2023
Via arXiv

Last-Iterate Convergent Policy Gradient Primal-Dual Methods for Constrained MDPs

Jun 20, 2023
Via arXiv

Self-Supervised Reinforcement Learning that Transfers using Random Features

May 26, 2023
Via arXiv