Yuzhong Zhao

Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model

Nov 28, 2024
Via arXiv

Evaluation of Text-to-Video Generation Models: A Dynamics Perspective

Jul 01, 2024
Via arXiv

DynRefer: Delving into Region-level Multi-modality Tasks via Dynamic Resolution

May 25, 2024
Via arXiv

Controllable Dense Captioner with Multimodal Embedding Bridging

Feb 01, 2024
Via arXiv

VMamba: Visual State Space Model

Jan 18, 2024
Via arXiv

Continual Learning for Image Segmentation with Dynamic Query

Nov 29, 2023
Via arXiv

DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models

Aug 11, 2023
Via arXiv

Generative Prompt Model for Weakly Supervised Object Localization

Jul 19, 2023
Via arXiv

A Large Cross-Modal Video Retrieval Dataset with Reading Comprehension

May 05, 2023
Via arXiv

FlowText: Synthesizing Realistic Scene Text Video with Optical Flow Estimation

May 05, 2023
Via arXiv