Xu Zhao

LiViBench: An Omnimodal Benchmark for Interactive Livestream Video Understanding

Jan 21, 2026

STEP-LLM: Generating CAD STEP Models from Natural Language with Large Language Models

Jan 19, 2026

Learning to Feel the Future: DreamTacVLA for Contact-Rich Manipulation

Dec 29, 2025

DreamOmni3: Scribble-based Editing and Generation

Dec 27, 2025

Metasurfaces Enable Active-Like Passive Radar

Dec 09, 2025

Multi-Granularity Distribution Modeling for Video Watch Time Prediction via Exponential-Gaussian Mixture Network

Aug 18, 2025

Step-Audio 2 Technical Report

Jul 24, 2025

Step-Audio-AQAA: a Fully End-to-End Expressive Large Audio Language Model

Jun 10, 2025

Trajectory Entropy: Modeling Game State Stability from Multimodality Trajectory Prediction

Jun 06, 2025

DGOcc: Depth-aware Global Query-based Network for Monocular 3D Occupancy Prediction

Apr 10, 2025