Jason Li

Sandy

PerceptionDLM: Parallel Region Perception with Multimodal Diffusion Language Models

Jun 17, 2026, via arXiv

MagpieTTS-LF: Inference-Time Long-Form Speech Generation Without Training on Long-Form data

Jun 16, 2026, via arXiv

Watch, Remember, Reason: Human-View Video Understanding with MLLMs

Jun 05, 2026, via arXiv

Towards One-to-Many Temporal Grounding

Jun 04, 2026, via arXiv

Benchmarking and Evolving Reason-Reflect-Rectify for Reflective Visual Generation

May 19, 2026, via arXiv

Threshold-Guided Optimization for Visual Generative Models

May 06, 2026, via arXiv

Praxium: Diagnosing Cloud Anomalies with AI-based Telemetry and Dependency Analysis

Mar 25, 2026, via arXiv

Recover to Predict: Progressive Retrospective Learning for Variable-Length Trajectory Prediction

Mar 11, 2026, via arXiv

Superman: Unifying Skeleton and Vision for Human Motion Perception and Generation

Feb 02, 2026, via arXiv

UM-Text: A Unified Multimodal Model for Image Understanding

Jan 13, 2026, via arXiv