Jun Zhu

Tsinghua University

Vidarc: Embodied Video Diffusion Model for Closed-loop Control

Dec 19, 2025

TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times

Dec 18, 2025

Motus: A Unified Latent Action World Model

Dec 15, 2025

Part-X-MLLM: Part-aware 3D Multimodal Large Language Model

Nov 17, 2025

Imagine in Space: Exploring the Frontier of Spatial Intelligence and Reasoning Efficiency in Vision Language Models

Nov 16, 2025

TetraJet-v2: Accurate NVFP4 Training for Large Language Models with Oscillation Suppression and Outlier Control

Oct 31, 2025

Effective and Stealthy One-Shot Jailbreaks on Deployed Mobile Vision-Language Agents

Oct 09, 2025

DiffusionNFT: Online Diffusion Reinforcement with Forward Process

Sep 19, 2025

Theoretical Analysis of Relative Errors in Gradient Computations for Adversarial Attacks with CE Loss

Jul 30, 2025

RCR-AF: Enhancing Model Generalization via Rademacher Complexity Reduction Activation Function

Jul 30, 2025