Ruochen Xu

VLM-R1: A Stable and Generalizable R1-style Large Vision-Language Model

Apr 10, 2025

Grasping by Spiraling: Reproducing Elephant Movements with Rigid-Soft Robot Synergy

Apr 02, 2025

The Self-Improvement Paradox: Can Language Models Bootstrap Reasoning Capabilities without External Scaffolding?

Feb 19, 2025

ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration

Nov 25, 2024

OmChat: A Recipe to Train Multimodal Language Models with Strong Long Context and Video Understanding

Jul 06, 2024

Preserving Knowledge in Large Language Model: A Model-Agnostic Self-Decompression Approach

Jun 17, 2024

Rho-1: Not All Tokens Are What You Need

Apr 11, 2024

ERBench: An Entity-Relationship based Automatically Verifiable Hallucination Benchmark for Large Language Models

Mar 08, 2024

DyVal 2: Dynamic Evaluation of Large Language Models by Meta Probing Agents

Feb 21, 2024

SciAgent: Tool-augmented Language Models for Scientific Reasoning

Feb 21, 2024