
Yao Hu

Alibaba Group

Vision-R1: Incentivizing Reasoning Capability in Multimodal Large Language Models

Mar 11, 2025

VLRMBench: A Comprehensive and Challenging Benchmark for Vision-Language Reward Models

Mar 10, 2025

Speculative Decoding for Multi-Sample Inference

Mar 07, 2025

Scalable Overload-Aware Graph-Based Index Construction for 10-Billion-Scale Vector Similarity Search

Feb 28, 2025

Revisiting Self-Consistency from Dynamic Distributional Alignment Perspective on Answer Aggregation

Feb 27, 2025

Beyond One-Size-Fits-All: Tailored Benchmarks for Efficient Evaluation

Feb 19, 2025

From Sub-Ability Diagnosis to Human-Aligned Generation: Bridging the Gap for Text Length Control via MARKERGEN

Feb 19, 2025

UniCBE: An Uniformity-driven Comparing Based Evaluation Framework with Unified Multi-Objective Optimization

Feb 17, 2025

InsBank: Evolving Instruction Subset for Ongoing Alignment

Feb 17, 2025

WorldSense: Evaluating Real-world Omnimodal Understanding for Multimodal LLMs

Feb 06, 2025