Wei Liu

Peter

Dr. Kernel: Reinforcement Learning Done Right for Triton Kernel Generations

Feb 05, 2026
Via arXiv

Towards a Science of Collective AI: LLM-based Multi-Agent Systems Need a Transition from Blind Trial-and-Error to Rigorous Science

Feb 05, 2026
Via arXiv

Entropy-Gated Selective Policy Optimization: Token-Level Gradient Allocation for Hybrid Training of Large Language Models

Feb 03, 2026
Via arXiv

Seeing Is Believing? A Benchmark for Multimodal Large Language Models on Visual Illusions and Anomalies

Feb 02, 2026
Via arXiv

Hunt Instead of Wait: Evaluating Deep Data Research on Large Language Models

Feb 02, 2026
Via arXiv

FutureMind: Equipping Small Language Models with Strategic Thinking-Pattern Priors via Adaptive Knowledge Distillation

Feb 01, 2026
Via arXiv

SP^2DPO: An LLM-assisted Semantic Per-Pair DPO Generalization

Jan 29, 2026
Via arXiv

MobileBench-OL: A Comprehensive Chinese Benchmark for Evaluating Mobile GUI Agents in Real-World Environment

Jan 29, 2026
Via arXiv

Youtu-VL: Unleashing Visual Potential via Unified Vision-Language Supervision

Jan 27, 2026
Via arXiv

R^3: Replay, Reflection, and Ranking Rewards for LLM Reinforcement Learning

Jan 27, 2026
Via arXiv