Xiangyu Zhang

Perception-R1: Pioneering Perception Policy with Reinforcement Learning

Apr 10, 2025

Perception in Reflection

Apr 09, 2025

Efficient Dynamic Clustering-Based Document Compression for Retrieval-Augmented-Generation

Apr 04, 2025

Ross3D: Reconstructive Visual Instruction Tuning with 3D-Awareness

Apr 02, 2025

$μ$KE: Matryoshka Unstructured Knowledge Editing of Large Language Models

Apr 01, 2025

Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model

Mar 31, 2025

M-DocSum: Do LVLMs Genuinely Comprehend Interleaved Image-Text in Document Summarization?

Mar 27, 2025

Step-Video-TI2V Technical Report: A State-of-the-Art Text-Driven Image-to-Video Generation Model

Mar 14, 2025

Why Pre-trained Models Fail: Feature Entanglement in Multi-modal Depression Detection

Mar 09, 2025

Predictable Scale: Part I -- Optimal Hyperparameter Scaling Law in Large Language Model Pretraining

Mar 06, 2025