Zhe Li

Visual-Seeker: Towards Visual-Native Multimodal Agentic Search via Active Visual Reasoning

Jun 13, 2026
Via arXiv

Stain-Aware Wavelet Regularization for Instant Adversarial Purification in Histopathology

Jun 07, 2026
Via arXiv

3DThinkVLA: Endowing Vision-Language-Action Models with Latent 3D Priors via 3D-Thinking-Guided Co-training

Jun 03, 2026
Via arXiv

TiWeaver: Unified Temporal Dynamics Modeling via Contextual Patching

Jun 02, 2026
Via arXiv

Gemini Embedding 2: A Native Multimodal Embedding Model from Gemini

May 26, 2026
Via arXiv

Chronicles-OCR: A Cross-Temporal Perception Benchmark for the Evolutionary Trajectory of Chinese Characters

May 12, 2026
Via arXiv

Learning Dynamics of Zeroth-Order Optimization: A Kernel Perspective

May 05, 2026
Via arXiv

UNet-Based Fusion and Exponential Moving Average Adaptation for Noise-Robust Speaker Recognition

Apr 28, 2026
Via arXiv

Utilizing Improper Gaussian Signaling for Downlink Rate-Splitting Multiple Access with Imperfect Successive Interference Cancellation

Apr 16, 2026
Via arXiv

ClawGuard: A Runtime Security Framework for Tool-Augmented LLM Agents Against Indirect Prompt Injection

Apr 13, 2026
Via arXiv