Zhen Dong

Can World Models Benefit VLMs for World Dynamics?

Oct 01, 2025

WHU-STree: A Multi-modal Benchmark Dataset for Street Tree Inventory

Sep 16, 2025

ButterflyQuant: Ultra-low-bit LLM Quantization through Learnable Orthogonal Butterfly Transforms

Sep 11, 2025

Aerial-ground Cross-modal Localization: Dataset, Ground-truth, and Benchmark

Sep 09, 2025

NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model

Aug 21, 2025

DrafterBench: Benchmarking Large Language Models for Tasks Automation in Civil Engineering

Jul 15, 2025

R-KV: Redundancy-aware KV Cache Compression for Training-Free Reasoning Models Acceleration

May 30, 2025

SpatialLLM: From Multi-modality Data to Urban Spatial Intelligence

May 19, 2025

ARMOR: Adaptive Meshing with Reinforcement Optimization for Real-time 3D Monitoring in Unexposed Scenes

Apr 28, 2025

Learning to Detect Objects from Multi-Agent LiDAR Scans without Manual Labels

Mar 13, 2025