Bowen Zhou

BrowseComp-$V^3$: A Visual, Vertical, and Verifiable Benchmark for Multimodal Browsing Agents

Feb 13, 2026
Via arXiv

P1-VL: Bridging Visual Perception and Scientific Reasoning in Physics Olympiads

Feb 10, 2026
Via arXiv

Next Concept Prediction in Discrete Latent Space Leads to Stronger Language Models

Feb 09, 2026
Via arXiv

InternAgent-1.5: A Unified Agentic Framework for Long-Horizon Autonomous Scientific Discovery

Feb 09, 2026
Via arXiv

MARTI-MARS$^2$: Scaling Multi-Agent Self-Search via Reinforcement Learning for Code Generation

Feb 08, 2026
Via arXiv

CAF-Mamba: Mamba-Based Cross-Modal Adaptive Attention Fusion for Multimodal Depression Detection

Jan 29, 2026
Via arXiv

HIPPO: Accelerating Video Large Language Models Inference via Holistic-aware Parallel Speculative Decoding

Jan 13, 2026
Via arXiv

I2E: From Image Pixels to Actionable Interactive Environments for Text-Guided Image Editing

Jan 07, 2026
Via arXiv

Effective Online 3D Bin Packing with Lookahead Parcels Using Monte Carlo Tree Search

Jan 06, 2026
Via arXiv

InternVLA-A1: Unifying Understanding, Generation and Action for Robotic Manipulation

Jan 05, 2026
Via arXiv