Picture for Guanhua Chen

Guanhua Chen

Anchored Policy Optimization: Mitigating Exploration Collapse Via Support-Constrained Rectification

Add code
Feb 05, 2026
Viaarxiv icon

Rethinking the Role of Entropy in Optimizing Tool-Use Behaviors for Large Language Model Agents

Add code
Feb 02, 2026
Viaarxiv icon

From Abstract to Contextual: What LLMs Still Cannot Do in Mathematics

Add code
Jan 30, 2026
Viaarxiv icon

InstructDiff: Domain-Adaptive Data Selection via Differential Entropy for Efficient LLM Fine-Tuning

Add code
Jan 30, 2026
Viaarxiv icon

Generation Enhances Understanding in Unified Multimodal Models via Multi-Representation Generation

Add code
Jan 29, 2026
Viaarxiv icon

Spurious Rewards Paradox: Mechanistically Understanding How RLVR Activates Memorization Shortcuts in LLMs

Add code
Jan 16, 2026
Viaarxiv icon

No More Stale Feedback: Co-Evolving Critics for Open-World Agent Learning

Add code
Jan 11, 2026
Viaarxiv icon

From Word to World: Can Large Language Models be Implicit Text-based World Models?

Add code
Dec 21, 2025
Viaarxiv icon

ConInstruct: Evaluating Large Language Models on Conflict Detection and Resolution in Instructions

Add code
Nov 19, 2025
Viaarxiv icon

MoMA: A Mixture-of-Multimodal-Agents Architecture for Enhancing Clinical Prediction Modelling

Add code
Aug 07, 2025
Viaarxiv icon