Yongbin Li

RollArt: Scaling Agentic RL Training via Disaggregated Infrastructure

Dec 27, 2025
Via arXiv

Understanding Generalization in Role-Playing Models via Information Theory

Dec 19, 2025
Via arXiv

MOA: Multi-Objective Alignment for Role-Playing Agents

Dec 10, 2025
Via arXiv

Selective Weak-to-Strong Generalization

Nov 18, 2025
Via arXiv

CPO: Addressing Reward Ambiguity in Role-playing Dialogue via Comparative Policy Optimization

Aug 12, 2025
Via arXiv

RL-PLUS: Countering Capability Boundary Collapse of LLMs in Reinforcement Learning with Hybrid-policy Optimization

Jul 31, 2025
Via arXiv

EIFBENCH: Extremely Complex Instruction Following Benchmark for Large Language Models

Jun 10, 2025
Via arXiv

TimeHC-RL: Temporal-aware Hierarchical Cognitive Reinforcement Learning for Enhancing LLMs' Social Intelligence

May 30, 2025
Via arXiv

Socratic-PRMBench: Benchmarking Process Reward Models with Systematic Reasoning Patterns

May 29, 2025
Via arXiv

ChARM: Character-based Act-adaptive Reward Modeling for Advanced Role-Playing Language Agents

May 29, 2025
Via arXiv