Kai Wang

Refer to the report for detailed contributions

DyDiT++: Dynamic Diffusion Transformers for Efficient Visual Generation

Apr 09, 2025
Via arXiv

MDK12-Bench: A Multi-Discipline Benchmark for Evaluating Reasoning in Multimodal Large Language Models

Apr 08, 2025
Via arXiv

Dynamic Vision Mamba

Apr 07, 2025
Via arXiv

Slow-Fast Architecture for Video Multi-Modal Large Language Models

Apr 02, 2025
Via arXiv

ORAL: Prompting Your Large-Scale LoRAs via Conditional Recurrent Diffusion

Mar 31, 2025
Via arXiv

Free-Lunch Color-Texture Disentanglement for Stylized Image Generation

Mar 21, 2025
Via arXiv

Safety Evaluation and Enhancement of DeepSeek Models in Chinese Contexts

Mar 18, 2025
Via arXiv

AR-1-to-3: Single Image to Consistent 3D Object Generation via Next-View Prediction

Mar 17, 2025
Via arXiv

ProbDiffFlow: An Efficient Learning-Free Framework for Probabilistic Single-Image Optical Flow Estimation

Mar 16, 2025
Via arXiv

MPBench: A Comprehensive Multimodal Reasoning Benchmark for Process Errors Identification

Mar 16, 2025
Via arXiv