Soichiro Nishimori

A Batch Sequential Halving Algorithm without Performance Degradation

Jun 01, 2024
Via arXiv

Leveraging Domain-Unlabeled Data in Offline Reinforcement Learning across Two Domains

Apr 11, 2024
Via arXiv

A Policy Gradient Primal-Dual Algorithm for Constrained MDPs with Uniform PAC Guarantees

Feb 02, 2024
Via arXiv

End-to-End Policy Gradient Method for POMDPs and Explainable Agents

Apr 19, 2023
Via arXiv

Pgx: Hardware-accelerated parallel game simulation for reinforcement learning

Mar 29, 2023
Via arXiv