Zhangyang Wang

Atlas

Data Efficient Any Transformer-to-Mamba Distillation via Attention Bridge

Oct 22, 2025
Via arXiv

RAPID^3: Tri-Level Reinforced Acceleration Policies for Diffusion Transformer

Sep 26, 2025
Via arXiv

Foundation Models for Logistics: Toward Certifiable, Conversational Planning Interfaces

Jul 15, 2025
Via arXiv

Martian World Models: Controllable Video Synthesis with Physically Accurate 3D Reconstructions

Jul 10, 2025
Via arXiv

Demystifying the Visual Quality Paradox in Multimodal Large Language Models

Jun 18, 2025
Via arXiv

LoX: Low-Rank Extrapolation Robustifies LLM Safety Against Fine-tuning

Jun 18, 2025
Via arXiv

On-the-Fly Adaptive Distillation of Transformer to Dual-State Linear Attention

Jun 12, 2025
Via arXiv

CXR-LT 2024: A MICCAI challenge on long-tailed, multi-label, and zero-shot disease classification from chest X-ray

Jun 09, 2025
Via arXiv

Graph-KV: Breaking Sequence via Injecting Structural Biases into Large Language Models

Jun 09, 2025
Via arXiv

HALoS: Hierarchical Asynchronous Local SGD over Slow Networks for Geo-Distributed Large Language Model Training

Jun 05, 2025
Via arXiv