
Zhe Lin

LaViDa-R1: Advancing Reasoning for Unified Multimodal Diffusion Language Models

Feb 15, 2026

Rethinking Global Text Conditioning in Diffusion Transformers

Feb 09, 2026

Rolling Sink: Bridging Limited-Horizon Training and Open-Ended Testing in Autoregressive Video Diffusion

Feb 08, 2026

Controllable Layered Image Generation for Real-World Editing

Jan 21, 2026

Self-Evaluation Unlocks Any-Step Text-to-Image Generation

Dec 26, 2025

Both Semantics and Reconstruction Matter: Making Representation Encoders Ready for Text-to-Image Generation and Editing

Dec 19, 2025

Sparse-LaViDa: Sparse Multimodal Discrete Diffusion Language Models

Dec 16, 2025

UniSER: A Foundation Model for Unified Soft Effects Removal

Nov 18, 2025

Image Tokenizer Needs Post-Training

Sep 15, 2025

Cross-Border Legal Adaptation of Autonomous Vehicle Design based on Logic and Non-monotonic Reasoning

Jul 30, 2025