Jiacai Liu

Skywork-Reward: Bag of Tricks for Reward Modeling in LLMs

Oct 24, 2024
Via arXiv

Elementary Analysis of Policy Gradient Methods

Apr 11, 2024
Via arXiv

On the Linear Convergence of Policy Gradient under Hadamard Parameterization

May 31, 2023
Via arXiv