Shiyu Chang

CompliBench: Benchmarking LLM Judges for Compliance Violation Detection in Dialogue Systems

Apr 14, 2026
Via arXiv

How Well Do Agentic Skills Work in the Wild: Benchmarking LLM Skill Usage in Realistic Settings

Apr 06, 2026
Via arXiv

Chimera: Latency- and Performance-Aware Multi-agent Serving for Heterogeneous LLMs

Mar 23, 2026
Via arXiv

Ares: Adaptive Reasoning Effort Selection for Efficient LLM Agents

Mar 09, 2026
Via arXiv

RetouchIQ: MLLM Agents for Instruction-Based Image Retouching with Generalist Reward

Feb 19, 2026
Via arXiv

Learning from Online Videos at Inference Time for Computer-Use Agents

Nov 06, 2025
Via arXiv

Rethinking the Text-Vision Reasoning Imbalance in MLLMs through the Lens of Training Recipes

Oct 26, 2025
Via arXiv

A Hierarchical Probabilistic Framework for Incremental Knowledge Tracing in Classroom Settings

Jun 11, 2025
Via arXiv

Collision- and Reachability-Aware Multi-Robot Control with Grounded LLM Planners

May 26, 2025
Via arXiv

Defending LLM Watermarking Against Spoofing Attacks with Contrastive Representation Learning

Apr 10, 2025
Via arXiv