Yu Zhou

National Laboratory of Pattern Recognition, Institute of Automation, CAS, Beijing, China; Fanyu AI Laboratory, Zhongke Fanyu Technology Co., Ltd, Beijing, China

Beyond Flat Text: Dual Self-inherited Guidance for Visual Text Generation

Jan 10, 2025

Char-SAM: Turning Segment Anything Model into Scene Text Segmentation Annotator with Character-level Visual Prompts

Dec 27, 2024

Less is More: Towards Green Code Large Language Models via Unified Structural Pruning

Dec 20, 2024

LDP: Generalizing to Multilingual Visual Information Extraction by Language Decoupled Pretraining

Dec 19, 2024

SimADFuzz: Simulation-Feedback Fuzz Testing for Autonomous Driving Systems

Dec 18, 2024

Track the Answer: Extending TextVQA from Image to Video with Spatio-Temporal Clues

Dec 17, 2024

Arbitrary Reading Order Scene Text Spotter with Local Semantics Guidance

Dec 13, 2024

Falcon-UI: Understanding GUI Before Following User Instructions

Dec 12, 2024

A4-Unet: Deformable Multi-Scale Attention Network for Brain Tumor Segmentation

Dec 08, 2024

M3D: Dual-Stream Selective State Spaces and Depth-Driven Framework for High-Fidelity Single-View 3D Reconstruction

Nov 20, 2024