Jing Shao

Benchmarks for Trajectory Safety Evaluation and Diagnosis in OpenClaw and Codex: ATBench-Claw and ATBench-CodeX

Apr 16, 2026
Via arXiv

SEARL: Joint Optimization of Policy and Tool Graph Memory for Self-Evolving Agents

Apr 09, 2026
Via arXiv

Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability

Apr 08, 2026
Via arXiv

ATBench: A Diverse and Realistic Agent Trajectory Benchmark for Safety Evaluation and Diagnosis

Apr 08, 2026
Via arXiv

DARE: Diffusion Large Language Models Alignment and Reinforcement Executor

Apr 05, 2026
Via arXiv

Valence-Arousal Subspace in LLMs: Circular Emotion Geometry and Multi-Behavioral Control

Apr 03, 2026
Via arXiv

ATBench: A Diverse and Realistic Trajectory Benchmark for Long-Horizon Agent Safety

Apr 02, 2026
Via arXiv

Seeing with You: Perception-Reasoning Coevolution for Multimodal Reasoning

Mar 30, 2026
Via arXiv

TreeTeaming: Autonomous Red-Teaming of Vision-Language Models via Hierarchical Strategy Exploration

Mar 24, 2026
Via arXiv

HomeGuard: VLM-based Embodied Safeguard for Identifying Contextual Risk in Household Task

Mar 15, 2026
Via arXiv