Zhonghua Zhai

Composable Visual Tokenizers with Generator-Free Diagnostics of Learnability

Feb 03, 2026
Via arXiv

Revisiting Multi-Task Visual Representation Learning

Jan 20, 2026
Via arXiv

Universal Video Temporal Grounding with Generative Multi-modal Large Language Models

Jun 23, 2025
Via arXiv

SeedEdit 3.0: Fast and High-Quality Generative Image Editing

Jun 06, 2025
Via arXiv

Seedream 3.0 Technical Report

Apr 16, 2025
Via arXiv

Seedream 2.0: A Native Chinese-English Bilingual Image Generation Foundation Model

Mar 10, 2025
Via arXiv

Tunnel Try-on: Excavating Spatial-temporal Tunnels for High-quality Virtual Try-on in Videos

Apr 26, 2024
Via arXiv

Cell Variational Information Bottleneck Network

Mar 29, 2024
Via arXiv

Wear-Any-Way: Manipulable Virtual Try-on via Sparse Correspondence Alignment

Mar 19, 2024
Via arXiv

Turbo: Informativity-Driven Acceleration Plug-In for Vision-Language Models

Dec 12, 2023
Via arXiv