Yazhe Niu

Empowering LLMs in Decision Games through Algorithmic Data Synthesis

Mar 18, 2025

Hierarchical Balance Packing: Towards Efficient Supervised Fine-tuning for Long-Context LLM

Mar 10, 2025

Revisiting Generative Policies: A Simpler Reinforcement Learning Algorithmic Perspective

Dec 02, 2024

Pretrained Reversible Generation as Unsupervised Visual Representation Learning

Nov 29, 2024

PsyDI: Towards a Personalized and Progressively In-depth Chatbot for Psychological Measurements

Jul 22, 2024

UniZero: Generalized and Efficient Planning with Scalable Latent World Models

Jun 15, 2024

ReZero: Boosting MCTS-based Algorithms by Just-in-Time and Speedy Reanalyze

Apr 28, 2024

A Perspective of Q-value Estimation on Offline-to-Online Reinforcement Learning

Dec 12, 2023

LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios

Oct 12, 2023

Theoretically Guaranteed Policy Improvement Distilled from Model-Based Planning

Jul 24, 2023