Wenxuan Wang

AgentDoG: A Diagnostic Guardrail Framework for AI Agent Safety and Security

Jan 26, 2026
Via arXiv

Inference-Time Scaling of Verification: Self-Evolving Deep Research Agents via Test-Time Rubric-Guided Verification

Jan 22, 2026
Via arXiv

MMedExpert-R1: Strengthening Multimodal Medical Reasoning via Domain-Specific Adaptation and Clinical Guideline Reinforcement

Jan 16, 2026
Via arXiv

MedEinst: Benchmarking the Einstellung Effect in Medical LLMs through Counterfactual Differential Diagnosis

Jan 10, 2026
Via arXiv

Probing Multimodal Large Language Models on Cognitive Biases in Chinese Short-Video Misinformation

Jan 10, 2026
Via arXiv

AutoMonitor-Bench: Evaluating the Reliability of LLM-Based Misbehavior Monitor

Jan 09, 2026
Via arXiv

MOSS Transcribe Diarize: Accurate Transcription with Speaker Diarization

Jan 08, 2026
Via arXiv

ChartEditor: A Reinforcement Learning Framework for Robust Chart Editing

Nov 19, 2025
Via arXiv

Mem-PAL: Towards Memory-based Personalized Dialogue Assistants for Long-term User-Agent Interaction

Nov 17, 2025
Via arXiv

Emu3.5: Native Multimodal Models are World Learners

Oct 30, 2025
Via arXiv