Yuzhong Zhao

Evaluation of Text-to-Video Generation Models: A Dynamics Perspective

Jul 01, 2024

DynRefer: Delving into Region-level Multi-modality Tasks via Dynamic Resolution

May 25, 2024

Controllable Dense Captioner with Multimodal Embedding Bridging

Feb 01, 2024

VMamba: Visual State Space Model

Jan 18, 2024

Continual Learning for Image Segmentation with Dynamic Query

Nov 29, 2023

DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models

Aug 11, 2023

Generative Prompt Model for Weakly Supervised Object Localization

Jul 19, 2023

FlowText: Synthesizing Realistic Scene Text Video with Optical Flow Estimation

May 05, 2023

A Large Cross-Modal Video Retrieval Dataset with Reading Comprehension

May 05, 2023

ICDAR 2023 Video Text Reading Competition for Dense and Small Text

Apr 10, 2023