Ruizhi Qiao

Multimodal Label Relevance Ranking via Reinforcement Learning

Jul 18, 2024 (via arXiv)

Unified and Dynamic Graph for Temporal Character Grouping in Long Videos

Aug 29, 2023 (via arXiv)

Coarse-to-Fine: Learning Compact Discriminative Representation for Single-Stage Image Retrieval

Aug 08, 2023 (via arXiv)

D3G: Exploring Gaussian Prior for Temporal Sentence Grounding with Glance Annotation

Aug 08, 2023 (via arXiv)

Collaborative Noisy Label Cleaner: Learning Scene-aware Trailers for Multi-modal Highlight Detection in Movies

Mar 26, 2023 (via arXiv)

See Finer, See More: Implicit Modality Alignment for Text-based Person Retrieval

Aug 26, 2022 (via arXiv)

VLMAE: Vision-Language Masked Autoencoder

Aug 19, 2022 (via arXiv)

Exploiting Feature Diversity for Make-up Temporal Video Grounding

Aug 12, 2022 (via arXiv)

Open-Vocabulary Multi-Label Classification via Multi-modal Knowledge Transfer

Jul 05, 2022 (via arXiv)

Scene Consistency Representation Learning for Video Scene Segmentation

May 11, 2022 (via arXiv)