Yu Zhou

National Laboratory of Pattern Recognition, Institute of Automation, CAS, Beijing, China; Fanyu AI Laboratory, Zhongke Fanyu Technology Co., Ltd, Beijing, China

Towards Real-World Document Parsing via Realistic Scene Synthesis and Document-Aware Training

Mar 25, 2026
Via arXiv

MMTIT-Bench: A Multilingual and Multi-Scenario Benchmark with Cognition-Perception-Reasoning Guided Text-Image Machine Translation

Mar 25, 2026
Via arXiv

Semantic Audio-Visual Navigation in Continuous Environments

Mar 20, 2026
Via arXiv

Integrated Channel Sounding and Communication: Requirements, Architecture, Challenges, and Key Technologies

Mar 16, 2026
Via arXiv

IMTBench: A Multi-Scenario Cross-Modal Collaborative Evaluation Benchmark for In-Image Machine Translation

Mar 11, 2026
Via arXiv

PromptDLA: A Domain-aware Prompt Document Layout Analysis Framework with Descriptive Knowledge as a Cue

Mar 10, 2026
Via arXiv

ICDAR 2025 Competition on End-to-End Document Image Machine Translation Towards Complex Layouts

Mar 10, 2026
Via arXiv

GEMs: Breaking the Long-Sequence Barrier in Generative Recommendation with a Multi-Stream Decoder

Feb 14, 2026
Via arXiv

Echo: Towards Advanced Audio Comprehension via Audio-Interleaved Reasoning

Feb 12, 2026
Via arXiv

Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters

Feb 11, 2026
Via arXiv