Shangzhe Li

Your Self-Play Algorithm is Secretly an Adversarial Imitator: Understanding LLM Self-Play through the Lens of Imitation Learning

Feb 01, 2026

Imitation from Observations with Trajectory-Level Generative Embeddings

Jan 01, 2026

Quantile Q-Learning: Revisiting Offline Extreme Q-Learning with Quantile Regression

Nov 15, 2025

Near-Optimal Second-Order Guarantees for Model-Based Adversarial Imitation Learning

Oct 10, 2025

Language Model Distillation: A Temporal Difference Imitation Learning Perspective

May 24, 2025

Coupled Distributional Random Expert Distillation for World Model Online Imitation Learning

May 04, 2025

Molecular Graph Contrastive Learning with Line Graph

Jan 15, 2025

Reward-free World Models for Online Imitation Learning

Oct 17, 2024

HILL: Hierarchy-aware Information Lossless Contrastive Learning for Hierarchical Text Classification

Mar 26, 2024

Distilling Conditional Diffusion Models for Offline Reinforcement Learning through Trajectory Stitching

Feb 01, 2024