Qi Zhao

Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations

Jun 23, 2025
Via arXiv

VINCIE: Unlocking In-context Image Editing from Video

Jun 12, 2025
Via arXiv

Mastering Massive Multi-Task Reinforcement Learning via Mixture-of-Expert Decision Transformer

May 30, 2025
Via arXiv

Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model

Apr 11, 2025
Via arXiv

An Integrated AI-Enabled System Using One Class Twin Cross Learning (OCT-X) for Early Gastric Cancer Detection

Mar 31, 2025
Via arXiv

Synthetic Video Enhances Physical Fidelity in Video Synthesis

Mar 26, 2025
Via arXiv

ASP-VMUNet: Atrous Shifted Parallel Vision Mamba U-Net for Skin Lesion Segmentation

Mar 25, 2025
Via arXiv

DreamInsert: Zero-Shot Image-to-Video Object Insertion from A Single Image

Mar 13, 2025
Via arXiv

CameraCtrl II: Dynamic Scene Exploration via Camera-controlled Video Diffusion Models

Mar 13, 2025
Via arXiv

KnowPath: Knowledge-enhanced Reasoning via LLM-generated Inference Paths over Knowledge Graphs

Feb 17, 2025
Via arXiv