Yuheng Zhang

Statistical Tractability of Off-policy Evaluation of History-dependent Policies in POMDPs

Mar 03, 2025

Improving LLM General Preference Alignment via Optimistic Online Mirror Descent

Feb 24, 2025

Teaching LLMs to Refine with Tools

Dec 22, 2024

Noise Matters: Diffusion Model-based Urban Mobility Generation with Collaborative Noise Priors

Dec 06, 2024

Understanding World or Predicting Future? A Comprehensive Survey of World Models

Nov 21, 2024

Iterative Nash Policy Optimization: Aligning LLMs with General Preferences via No-Regret Learning

Jun 30, 2024

LCSim: A Large-Scale Controllable Traffic Simulator

Jun 28, 2024

Provably Efficient Interactive-Grounded Learning with Personalized Reward

May 31, 2024

On the Curses of Future and History in Future-dependent Value Functions for Off-policy Evaluation

Feb 22, 2024

Efficient Contextual Bandits with Uninformed Feedback Graphs

Feb 12, 2024