Yu Cheng

Benign in Isolation, Harmful in Composition: Security Risks in Agent Skill Ecosystems

Jun 13, 2026
Via arXiv

MaxProof: Scaling Mathematical Proof with Generative-Verifier RL and Population-Level Test-Time Scaling

Jun 11, 2026
Via arXiv

ComBench: A Benchmark for Rigorous Proof Reasoning and Constructive Realization in Olympiad-Level Combinatorics

Jun 09, 2026
Via arXiv

How Does Reasoning Flow? Tracing Attention-Induced Information Flow for Targeted RL in LLMs

Jun 09, 2026
Via arXiv

3D-CoS: A New 3D Reconstruction Paradigm Based on VLM Code Synthesis

Jun 09, 2026
Via arXiv

SubtleMemory: A Benchmark for Fine-Grained Relational Memory Discrimination in Long-Horizon AI Agents

Jun 04, 2026
Via arXiv

Agent Planning Benchmark: A Diagnostic Framework for Planning Capabilities in LLM Agents

Jun 03, 2026
Via arXiv

Annotations Are Not All You Need: A Cross-modal Knowledge Transfer Network for Unsupervised Temporal Sentence Grounding

May 29, 2026
Via arXiv

Not All Inputs Are Valid: Towards Open-Set Video Moment Retrieval Using Language

May 28, 2026
Via arXiv

Draft-OPD: On-Policy Distillation for Speculative Draft Models

May 28, 2026
Via arXiv