Weinan Zhang

ReMA: Learning to Meta-think for LLMs with Multi-Agent Reinforcement Learning

Mar 12, 2025
Via arXiv

Adding Alignment Control to Language Models

Mar 07, 2025
Via arXiv

PALo: Learning Posture-Aware Locomotion for Quadruped Robots

Mar 06, 2025
Via arXiv

Humanoid Whole-Body Locomotion on Narrow Terrain via Dynamic Balance and Reinforcement Learning

Feb 24, 2025
Via arXiv

ThinkBench: Dynamic Out-of-Distribution Evaluation for Robust LLM Reasoning

Feb 22, 2025
Via arXiv

Retrieval-Augmented Process Reward Model for Generalizable Mathematical Reasoning

Feb 20, 2025
Via arXiv

Bursting Filter Bubble: Enhancing Serendipity Recommendations with Aligned Large Language Models

Feb 19, 2025
Via arXiv

RHINO: Learning Real-Time Humanoid-Human-Object Interaction from Human Demonstrations

Feb 18, 2025
Via arXiv

Boost, Disentangle, and Customize: A Robust System2-to-System1 Pipeline for Code Generation

Feb 18, 2025
Via arXiv

Score-Based Diffusion Policy Compatible with Reinforcement Learning via Optimal Transport

Feb 18, 2025
Via arXiv