Yuki Ichihara

Evaluation of Best-of-N Sampling Strategies for Language Model Alignment

Feb 18, 2025
Via arXiv

Theoretical Guarantees for Minimum Bayes Risk Decoding

Feb 18, 2025
Via arXiv

A Policy Gradient Primal-Dual Algorithm for Constrained MDPs with Uniform PAC Guarantees

Feb 02, 2024
Via arXiv