Shaobin Zhuang

UniWeTok: An Unified Binary Tokenizer with Codebook Size $\mathit{2^{128}}$ for Unified Multimodal Large Language Model

Feb 15, 2026

BitDance: Scaling Autoregressive Generative Models with Binary Tokens

Feb 15, 2026

WeTok: Powerful Discrete Tokenization for High-Fidelity Visual Reconstruction

Aug 07, 2025

Unleashing High-Quality Image Generation in Diffusion Sampling Using Second-Order Levenberg-Marquardt-Langevin

May 30, 2025

Efficiently Access Diffusion Fisher: Within the Outer Product Span Space

May 29, 2025

Video-GPT via Next Clip Diffusion

May 18, 2025

V-Stylist: Video Stylization via Collaboration and Reflection of MLLM Agents

Mar 15, 2025

TimeStep Master: Asymmetrical Mixture of Timestep LoRA Experts for Versatile and Efficient Diffusion Models in Vision

Mar 10, 2025

Get In Video: Add Anything You Want to the Video

Mar 08, 2025

WeGen: A Unified Model for Interactive Multimodal Generation as We Chat

Mar 03, 2025