Ran He

On-Policy Self-Distillation for Reasoning Compression

Mar 05, 2026
Via arXiv

Random Wins All: Rethinking Grouping Strategies for Vision Tokens

Feb 28, 2026
Via arXiv

Taming Momentum: Rethinking Optimizer States Through Low-Rank Approximation

Feb 27, 2026
Via arXiv

The Trinity of Consistency as a Defining Principle for General World Models

Feb 26, 2026
Via arXiv

Overconfident Errors Need Stronger Correction: Asymmetric Confidence Penalties for Reinforcement Learning

Feb 24, 2026
Via arXiv

How to Train Your Deep Research Agent? Prompt, Reward, and Policy Optimization in Search-R1

Feb 23, 2026
Via arXiv

Mitigating the Safety-utility Trade-off in LLM Alignment via Adaptive Safe Context Learning

Feb 14, 2026
Via arXiv

Stop Tracking Me! Proactive Defense Against Attribute Inference Attack in LLMs

Feb 12, 2026
Via arXiv

Do MLLMs Really Understand Space? A Mathematical Reasoning Evaluation

Feb 12, 2026
Via arXiv

Semantic Search At LinkedIn

Feb 07, 2026
Via arXiv