Dong Zhou

NutVLM: A Self-Adaptive Defense Framework against Full-Dimension Attacks for Vision Language Models in Autonomous Driving

Feb 09, 2026

Imitation learning-based spacecraft rendezvous and docking method with Expert Demonstration

Jan 19, 2026

E-MMDiT: Revisiting Multimodal Diffusion Transformer Design for Fast Image Synthesis under Limited Resources

Oct 31, 2025

Multimodal Spiking Neural Network for Space Robotic Manipulation

Aug 10, 2025

Megrez2 Technical Report

Jul 23, 2025

Language-Conditioned Open-Vocabulary Mobile Manipulation with Pretrained Models

Jul 23, 2025

AMD-Hummingbird: Towards an Efficient Text-to-Video Model

Mar 25, 2025

LLM-driven Effective Knowledge Tracing by Integrating Dual-channel Difficulty

Feb 27, 2025

Megrez-Omni Technical Report

Feb 19, 2025

Edit as You See: Image-guided Video Editing via Masked Motion Modeling

Jan 08, 2025