Zhiyuan Wang

ChartAct: A Benchmark for Dynamic Chart Understanding

May 28, 2026
Via arXiv

DisasterBench: Benchmarking LLM Planning under Typed Tool Interface Constraints

May 27, 2026
Via arXiv

MiRD: Reliable Set-Valued Prediction for Open-Ended Question Answering via Miscoverage Risk Decomposition

May 25, 2026
Via arXiv

IndusAgent: Reinforcing Open-Vocabulary Industrial Anomaly Detection with Agentic Tools

May 20, 2026
Via arXiv

BalanceRAG: Joint Risk Calibration for Cascaded Retrieval-Augmented Generation

May 19, 2026
Via arXiv

PSI: Shared State as the Missing Layer for Coherent AI-Generated Instruments in Personal AI Agents

Apr 09, 2026
Via arXiv

Set-Valued Prediction for Large Language Models with Feasibility-Aware Coverage Guarantees

Mar 24, 2026
Via arXiv

Towards Efficient and Robust Linguistic Emotion Diagnosis for Mental Health via Multi-Agent Instruction Refinement

Jan 20, 2026
Via arXiv

MAXS: Meta-Adaptive Exploration with LLM Agents

Jan 14, 2026
Via arXiv

$A^3$-Bench: Benchmarking Memory-Driven Scientific Reasoning via Anchor and Attractor Activation

Jan 14, 2026
Via arXiv