Le Zhang

RiT: Vanilla Diffusion Transformers Suffice in Representation Space

May 21, 2026
Via arXiv

From Where Things Are to What They Are For: Benchmarking Spatial-Functional Intelligence in Multimodal LLMs

May 04, 2026
Via arXiv

MedFlowSeg: Flow Matching for Medical Image Segmentation with Frequency-Aware Attention

Apr 21, 2026
Via arXiv

Make It Up: Fake Images, Real Gains in Generalized Few-shot Semantic Segmentation

Mar 28, 2026
Via arXiv

End-to-End Dexterous Grasp Learning from Single-View Point Clouds via a Multi-Object Scene Dataset

Mar 16, 2026
Via arXiv

Spectrum Matching: a Unified Perspective for Superior Diffusability in Latent Diffusion

Mar 15, 2026
Via arXiv

Legal-DC: Benchmarking Retrieval-Augmented Generation for Legal Documents

Mar 12, 2026
Via arXiv

Chain of Flow: A Foundational Generative Framework for ECG-to-4D Cardiac Digital Twins

Feb 26, 2026
Via arXiv

AnimeAgent: Is the Multi-Agent via Image-to-Video models a Good Disney Storytelling Artist?

Feb 24, 2026
Via arXiv

YOLOv10-Based Multi-Task Framework for Hand Localization and Laterality Classification in Surgical Videos

Feb 21, 2026
Via arXiv