Chengshuai Shi

Transformers as Game Players: Provable In-context Game-playing Capabilities of Pre-trained Models

Oct 13, 2024

Building Math Agents with Multi-Turn Iterative Preference Learning

Sep 04, 2024

Best Arm Identification for Prompt Learning under a Limited Budget

Feb 20, 2024

Harnessing the Power of Federated Learning in Federated Contextual Bandits

Dec 26, 2023

Provably Efficient Offline Reinforcement Learning with Perturbed Data Sources

Jun 14, 2023

On High-dimensional and Low-rank Tensor Bandits

May 06, 2023

Reward Teaching for Federated Multi-armed Bandits

May 03, 2023

A Self-Play Posterior Sampling Algorithm for Zero-Sum Markov Games

Oct 04, 2022

Nearly Minimax Optimal Offline Reinforcement Learning with Linear Function Approximation: Single-Agent MDP and Markov Game

May 31, 2022

Heterogeneous Multi-player Multi-armed Bandits: Closing the Gap and Generalization

Oct 29, 2021