Bei Chen

Steering Large Reasoning Models towards Concise Reasoning via Flow Matching

Feb 05, 2026

Understanding DeepResearch via Reports

Oct 09, 2025

A$^2$Search: Ambiguity-Aware Question Answering with Reinforcement Learning

Oct 09, 2025

Generative Frame Sampler for Long Video Understanding

Mar 12, 2025

ProBench: Judging Multimodal Foundation Models on Open-ended Multi-domain Expert Tasks

Mar 10, 2025

Aria-UI: Visual Grounding for GUI Instructions

Dec 20, 2024

Yi-Lightning Technical Report

Dec 03, 2024

HumanEval-V: Evaluating Visual Understanding and Reasoning Abilities of Large Multimodal Models Through Coding Tasks

Oct 16, 2024

Aria: An Open Multimodal Native Mixture-of-Experts Model

Oct 08, 2024

LongVideoBench: A Benchmark for Long-context Interleaved Video-Language Understanding

Jul 22, 2024