Jun Zhu

Tsinghua University

MSNeRV: Neural Video Representation with Multi-Scale Feature Fusion

Jun 18, 2025
Via arXiv

Understanding and Benchmarking the Trustworthiness in Multimodal LLMs for Video Understanding

Jun 14, 2025
Via arXiv

Teaching in adverse scenes: a statistically feedback-driven threshold and mask adjustment teacher-student framework for object detection in UAV images under adverse scenes

Jun 12, 2025
Via arXiv

D2AF: A Dual-Driven Annotation and Filtering Framework for Visual Grounding

May 30, 2025
Via arXiv

Versatile Cardiovascular Signal Generation with a Unified Diffusion Transformer

May 28, 2025
Via arXiv

SageAttention2++: A More Efficient Implementation of SageAttention2

May 28, 2025
Via arXiv

Bridging Supervised Learning and Reinforcement Learning in Math Reasoning

May 23, 2025
Via arXiv

Understanding Pre-training and Fine-tuning from Loss Landscape Perspectives

May 23, 2025
Via arXiv

Scaling Diffusion Transformers Efficiently via $μ$P

May 21, 2025
Via arXiv

SageAttention3: Microscaling FP4 Attention for Inference and An Exploration of 8-Bit Training

May 16, 2025
Via arXiv