Rui Zhao

Department of Radiology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China

Video Reality Test: Can AI-Generated ASMR Videos fool VLMs and Humans?

Dec 18, 2025

What Happens Next? Next Scene Prediction with a Unified Video Model

Dec 15, 2025

Hyperbolic Hierarchical Alignment Reasoning Network for Text-3D Retrieval

Nov 14, 2025

VIDEOP2R: Video Understanding from Perception to Reasoning

Nov 14, 2025

Human-in-the-loop Online Rejection Sampling for Robotic Manipulation

Oct 30, 2025

GTR-Bench: Evaluating Geo-Temporal Reasoning in Vision-Language Models

Oct 09, 2025

Language-Instructed Reasoning for Group Activity Detection via Multimodal Large Language Model

Sep 19, 2025

Fishing for Answers: Exploring One-shot vs. Iterative Retrieval Strategies for Retrieval Augmented Generation

Sep 05, 2025

Reasoning RAG via System 1 or System 2: A Survey on Reasoning Agentic Retrieval-Augmented Generation for Industry Challenges

Jun 12, 2025

PHRASED: Phrase Dictionary Biasing for Speech Translation

Jun 10, 2025