Yu Zhou

National Laboratory of Pattern Recognition, Institute of Automation, CAS, Beijing, China; Fanyu AI Laboratory, Zhongke Fanyu Technology Co., Ltd, Beijing, China

STEP3-VL-10B Technical Report

Jan 15, 2026

PROMISE: Process Reward Models Unlock Test-Time Scaling Laws in Generative Recommendations

Jan 08, 2026

HiSciBench: A Hierarchical Multi-disciplinary Benchmark for Scientific Intelligence from Reading to Discovery

Dec 28, 2025

UniPercept: Towards Unified Perceptual-Level Image Understanding across Aesthetics, Quality, Structure, and Texture

Dec 25, 2025

Mirage Persistent Kernel: A Compiler and Runtime for Mega-Kernelizing Tensor Programs

Dec 22, 2025

EmoCaliber: Advancing Reliable Visual Emotion Comprehension via Confidence Verbalization and Calibration

Dec 17, 2025

MuSc-V2: Zero-Shot Multimodal Industrial Anomaly Classification and Segmentation with Mutual Scoring of Unlabeled Samples

Nov 13, 2025

SUGAR: Learning Skeleton Representation with Visual-Motion Knowledge for Action Recognition

Nov 13, 2025

When Eyes and Ears Disagree: Can MLLMs Discern Audio-Visual Confusion?

Nov 13, 2025

Task-Aware 3D Affordance Segmentation via 2D Guidance and Geometric Refinement

Nov 12, 2025