Meng Fang

WebArXiv: Evaluating Multimodal Agents on Time-Invariant arXiv Tasks

Jul 01, 2025

MuBench: Assessment of Multilingual Capabilities of Large Language Models Across 61 Languages

Jun 24, 2025

MEAL: A Benchmark for Continual Multi-Agent Reinforcement Learning

Jun 17, 2025

Task Adaptation from Skills: Information Geometry, Disentanglement, and New Objectives for Unsupervised Reinforcement Learning

Jun 12, 2025

Revisiting Overthinking in Long Chain-of-Thought from the Perspective of Self-Doubt

May 29, 2025

Monte Carlo Planning with Large Language Model for Text-Based Game Agents

Apr 23, 2025

HASARD: A Benchmark for Vision-Based Safe Reinforcement Learning in Embodied Agents

Mar 11, 2025

ATLaS: Agent Tuning via Learning Critical Steps

Mar 04, 2025

Distill Not Only Data but Also Rewards: Can Smaller Language Models Surpass Larger Ones?

Feb 26, 2025

Geolocation with Real Human Gameplay Data: A Large-Scale Dataset and Human-Like Reasoning Framework

Feb 19, 2025